{ "info": { "author": "hiviah", "author_email": "hiviah@github.com", "bugtrack_url": null, "classifiers": [], "description": "HTTPS Everywhere Rule Checker\n=============================\n\nAuthor: Ondrej Mikle, CZ.NIC (ondrej.mikle \\|at\\_sign\\| nic.cz)\n\nInstallation and requirements\n-----------------------------\n\n::\n\n pip install https-everywhere-checker\n\nConfiguration\n-------------\n\nCopy ``checker.config.sample`` to ``checker.config`` and change the\n``rulesdir`` under ``[rulesets]`` to point to a directory with the XML\nfiles of HTTPS Everywhere rules (usually the\n``src/chrome/content/rules`` of locally checked out git tree of HTTPS\nEverywhere).\n\nRunning\n-------\n\nOnce you have modified the config, run:\n\n::\n\n check-https-rules checker.config\n\nOutput will be written to selected log file, infos/warnings/errors\ncontain the useful information.\n\nFeatures\n--------\n\n- Attempts to follow Firefox behavior as closely as possible (including\n rewriting HTTP redirects according to rules; well except for\n Javascript and meta-redirects)\n- IDN domain support\n- Currently two metrics on \"distance\" of two resources implemented, one\n is purely string-based, the other tries to measure \"similarity of the\n shape of DOM tree\"\n- Multi-threaded scanner\n- Support for various \"platforms\" (e.g. CAcert), i.e. 
sets of CA\n certificate sets which can be switched during following of redirects\n- set of used CA certificates can be statically restricted to one CA\n certificate set (see ``static_ca_path`` in config file)\n\nWhat errors in rulesets can be detected\n---------------------------------------\n\n- big difference in HTML page structure\n- error in ruleset - declared target that no rule rewrites, bad regexps\n (usually capture groups are wrong), incomplete FQDNs, non-existent\n domains\n- HTTP 200 in original page, while rewritten page returns 4xx/5xx\n- cycle detection in redirects\n- transvalid certificates (incomplete chains)\n- other invalid certificate detection (self-signed, expired, CN\n mismatch...)\n\nFalse positives and shortcomings\n--------------------------------\n\n- Some pages deliberately have different HTTP and HTTPS page, some for\n example redirect to different page under https\n- URLs to scan are naively guessed from target hosts, having test set\n of URLs in a ruleset would improve it (better coverage)\n\nKnown bugs\n----------\n\nCURL+NSS can't handle hosts with SNI sharing same IP address\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nPyCURL and NSS incorrectly handle the case when two FQDNs have identical\nIP address, use Server Name Indication and try to resume TLS session\nwith the same session ID. Even turning off SSL session cache via setting\n``pycurl.SSL_SESSIONID_CACHE`` to zero won't help (it's ignored by\nlibcurl/pycurl for some reason). 
PyCURL+NSS fail to see that the server\ndidn't acknowledge SNI in its response (see RFC 4366 reference below), so\nthe 'Host' header in HTTP and the SNI seen by the server differ, resulting in\nHTTP 404.\n\nThis issue was an especially insidious bug; many thanks to Pavel Jan\u00edk\nfor helping hunt it down.\n\nTestcase\n^^^^^^^^\n\nSee the ``curl_test_nss/curl_testcase_nss_sni.py`` script, which demonstrates\nthe bug.\n\nTechnical details\n^^^^^^^^^^^^^^^^^\n\nPyCURL sends a TLS handshake with SNI for the first host. This works.\nThe connection is then closed, but PyCURL+NSS remembers the SSL session ID.\nIt will attempt to use the same session ID when later connecting to a\nsecond host on the same IP.\n\nHowever, the server won't acknowledge what the client requested with the new\nSNI, because the client attempts to resume during the TLS handshake using the\nincorrect session ID. Thus the session is \"resumed\" to the first host's\nSNI.\n\nSide observation: When validation is turned off in PyCURL+NSS, it also\nturns off session resume as a side effect (the code is in curl's nss.c).\n\nWorkaround\n^^^^^^^^^^\n\nSet config to use SSLv3 instead of the default TLSv1 (option ``ssl_version``\nunder the ``http`` section).\n\nNormative reference\n^^^^^^^^^^^^^^^^^^^\n\nSee last four paragraphs of `RFC 4366, section\n3.1 `__. Contrast with\n`RFC 6066 section 3 `__,\nlast two paragraphs. In TLS 1.2 the logic is reversed - the server must not\nresume such a connection and must go through a full handshake again.\n\nAt most 9 capture groups in rule supported\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThis is a workaround for ambiguous rewrites in rules such as:\n\n::\n\n \n\nThe ``$101`` would actually mean the 101st group, so we assume that only the first digit after ``$``\ndenotes the group (which is how it seems to work in JavaScript).\n\nMay not work under Windows\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nAccording to `PyCURL\ndocumentation `__,\nusing CAPATH may not work under Windows. 
I'd guess it's due to openssl's\n``c_rehash`` utility that creates symlinks to PEM certificates.\nHypothetically it could work if the symlinks were replaced by regular\nfiles with identical names, but I haven't tried that.\n\nThreading bugs and workarounds\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThere are some race conditions with Python threads and OpenSSL/GnuTLS\nthat cause aborts due to SIGPIPE or SIGSEGV. While libcurl code seems to\nhave implemented the necessary callbacks, there's a bug somewhere :-)\n\nWorkaround: set ``fetch_in_subprocess`` under the ``http`` section in config\nto true when using multiple threads for fetching. Using subprocess is on\nby default.\n\nYou might have to set PYTHONPATH if the working dir is different from the code\ndir with the Python scripts.\n\nIf the underlying SSL library is NSS, threading looks fine.\n\nAs a side effect, the CURL+NSS SNI bug does not happen with subprocesses\n(SSL session ID cache is not kept among process invocations).\n\nIf the pure-threaded version starts eating too much memory (like 1 GB in a\nminute), turn on the ``fetch_in_subprocess`` option mentioned above. Some\ncombinations of CURL and SSL library versions do that. Spawning separate\nsubprocesses prevents any caches from building up and eating too much memory.\n\nUsing subprocess hypothetically might cause a deadlock due to\ninsufficient buffer size when exchanging data through stdin/stdout in\ncase of a large HTML page, but this hasn't happened for any of the rules\n(I've tried to run them on the complete batch of rulesets contained in\nHTTPS Everywhere Nov 2 2012 commit\nc343f230a49d960dba90424799c3bacc2325fc94). If a deadlock does\nhappen, increase the buffer size in the ``subprocess.Popen`` invocation in\n``http_client.py``.\n\nGeneric bugs/quirks of SSL libraries\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nEach of the three possible libraries (OpenSSL, GnuTLS, NSS) has\ndifferent set of quirks. 
GnuTLS seems to be the most strict one\nregarding relevant RFCs and will not, for instance, tolerate a certificate\nchain in the wrong order or forgive a server not sending the ``close_notify``\nalert.\n\nThus it's entirely possible that while a server chain and SSL/TLS\nhandshake seem OK when using one lib, they may break with another.\n\nTransvalid certificates (transitive closure of root and intermediate certs)\n---------------------------------------------------------------------------\n\nThe ``platform_certs/FF_transvalid.tar.bz2`` attempts to simulate common\nbrowser behavior of caching intermediate certs. The directory contains\nFF's builtin certs and all intermediate certs that validate from FF's\nbuiltin certs (a transitive closure).\n\nThe certs above are in a tarball (need to be unpacked and c\\_rehash'd\nfor use).\n\nThe script is in ``certs_transitive_closure/build_closure.sh`` and is\nrather crude; it definitely needs some double-checking of sanity (see\ncomments inside the script).\n\nQuick outline of the script's algorithm:\n\n1. IntermediateSet\\_0 := {trusted builtin certs from clean install of\n Firefox}\n2. Certs that have basic constraints CA=true or are X509 version 1 are\n exported from some DB like SSL Observatory\n3. Iterate over all exported certs; add new unique certificates not yet\n contained in IntermediateSet\\_n that validate against the latest\n IntermediateSet\\_n, forming IntermediateSet\\_{n+1}\n4. n += 1\n5. 
If any certs were added in step 3, goto 3, else end\n\nLast IntermediateSet is the closure.\n\n\n\n\nHistory\n-------\n\n0.1.0 (unreleased)\n++++++++++++++++++\n\n* First release on PyPI.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hiviah/https-everywhere-checker", "keywords": "https https-everywhere http security", "license": "GPL3", "maintainer": null, "maintainer_email": null, "name": "https-everywhere-checker", "package_url": "https://pypi.org/project/https-everywhere-checker/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/https-everywhere-checker/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/hiviah/https-everywhere-checker" }, "release_url": "https://pypi.org/project/https-everywhere-checker/0.1.0/", "requires_dist": null, "requires_python": null, "summary": "Rule checker for HTTPS Everywhere", "version": "0.1.0" }, "last_serial": 1220689, "releases": { "0.1.0": [] }, "urls": [] }