{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Programming Language :: Java", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Internet :: WWW/HTTP" ], "description": "urlcanon\n========\n\n.. image:: https://travis-ci.org/iipc/urlcanon.svg?branch=master\n :target: https://travis-ci.org/iipc/urlcanon\n :alt: build status\n\nA URL canonicalization (normalization) library for Python and Java.\n\nIt currently provides:\n\n* A URL parser which preserves the input bytes exactly\n* A precanned canonicalization ruleset that tries to match the normalization\n implicit in the `parsing rules used by browsers\n `_\n* An alternative URL serialization suitable for sorting and prefix-matching:\n `SSURT `_.\n\n**Status:** Stable and in production use for some time. But no API or output\nstability guarantees yet. There are differences in features between Java and\nPython versions.\n\nExamples\n--------\n\nPython\n~~~~~~\n\n.. code:: python\n\n >>> import urlcanon\n >>> input_url = \"http://///EXAMPLE.com:80/foo/../bar\"\n >>> parsed_url = urlcanon.parse_url(input_url)\n >>> print(parsed_url)\n http://///EXAMPLE.com:80/foo/../bar\n >>> urlcanon.whatwg(parsed_url)\n \n >>> print(parsed_url)\n http://example.com/bar\n >>> print(parsed_url.ssurt())\n b'com,example,//:http/bar'\n >>>\n >>> rule = urlcanon.MatchRule(ssurt=b'com,example,//:http/bar')\n >>> urlcanon.whatwg.rule_applies(rule, b'https://example..com/bar/baz')\n False\n >>> urlcanon.whatwg.rule_applies(rule, b'HTtp:////eXAMple.Com/bar//baz//..///quu')\n True\n\nPython releases are available in PyPI:\n\n.. code:: sh\n\n pip install urlcanon\n\nJava\n~~~~\n\n.. code:: java\n\n String inputUrl = \"http://///EXAMPLE.com:80/foo/../bar\";\n ParsedUrl parsedUrl = ParsedUrl.parseUrl(inputUrl);\n\n System.out.println(parsedUrl);\n // http://///EXAMPLE.com:80/foo/../bar\n\n Canonicalizer.WHATWG.canonicalize(parsedUrl);\n\n System.out.println(parsedUrl);\n // http://example.com/bar\n\n System.out.println(parsedUrl.ssurt());\n // \"com,example,//:http/bar\"\n\nJava releases are available in the Maven Central repository:\n\n.. code:: xml\n\n \n org.netpreserve\n urlcanon\n 0.1.1\n \n\nLicense\n-------\n\n* Copyright (C) 2016-2018 Internet Archive\n* Copyright (C) 2016-2017 National Library of Australia\n\nLicensed under the Apache License, Version 2.0 (the \"License\"); you may\nnot use this software except in compliance with the License. You may\nobtain a copy of the License at\n\n http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/iipc/urlcanon", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "urlcanon", "package_url": "https://pypi.org/project/urlcanon/", "platform": "", "project_url": "https://pypi.org/project/urlcanon/", "project_urls": { "Homepage": "https://github.com/iipc/urlcanon" }, "release_url": "https://pypi.org/project/urlcanon/0.3.1/", "requires_dist": null, "requires_python": "", "summary": "url canonicalization library for python and java", "version": "0.3.1" }, "last_serial": 5478807, "releases": { "0.1.dev11": [ { "comment_text": "", "digests": { "md5": "2828d230f7330646c0e00dfeeca41dc0", "sha256": "e51900a5ea35e5305960c7f0d58d7c7757a0b692d47aa48960fb625cdc424c0a" }, "downloads": -1, "filename": "urlcanon-0.1.dev11.tar.gz", "has_sig": false, "md5_digest": "2828d230f7330646c0e00dfeeca41dc0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24883, "upload_time": "2017-03-13T22:52:31", "url": "https://files.pythonhosted.org/packages/95/b5/2f7dafb85e66b13680006ad73fedac9471794aec002b4458d8762df0a291/urlcanon-0.1.dev11.tar.gz" } ], "0.1.dev12": [ { "comment_text": "", "digests": { "md5": "cc3152fc91ffa86603afcbfb3d41a41b", "sha256": "19601117a79a941e37946eb4a106ce48aec8ac2a010d0adb2fe41e6eb711fc51" }, "downloads": -1, "filename": "urlcanon-0.1.dev12.tar.gz", "has_sig": false, "md5_digest": "cc3152fc91ffa86603afcbfb3d41a41b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24983, "upload_time": "2017-03-14T00:14:25", "url": "https://files.pythonhosted.org/packages/62/d7/795ba665655d32bff6ed298ebc2d09cb7add4e80baa083744faba287e3fe/urlcanon-0.1.dev12.tar.gz" } ], "0.1.dev13": [ { "comment_text": "", "digests": { "md5": "309be7daa4223ac2695ffed833936b2e", "sha256": "c79c9115a3ffc1ebd590ebd345857f1a0c7e41918eaddd4da16ff9a2b86d938e" }, "downloads": -1, "filename": "urlcanon-0.1.dev13.tar.gz", "has_sig": false, "md5_digest": "309be7daa4223ac2695ffed833936b2e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25005, "upload_time": "2017-03-14T20:18:39", "url": "https://files.pythonhosted.org/packages/e3/59/d0fe0edcef40abca08f6774546d54857e1dfb5eb5e5776182871c847d86d/urlcanon-0.1.dev13.tar.gz" } ], "0.1.dev15": [ { "comment_text": "", "digests": { "md5": "4637499809452f0aac26fee60fbd474a", "sha256": "429a7138bf07531a369f2dad4c3e79980c9b505c12b1d120ba6376b09928f455" }, "downloads": -1, "filename": "urlcanon-0.1.dev15.tar.gz", "has_sig": false, "md5_digest": "4637499809452f0aac26fee60fbd474a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25115, "upload_time": "2017-03-15T02:04:00", "url": "https://files.pythonhosted.org/packages/91/50/a11cf168764cd24776f4d26c7f8b3fbeb7b01bea192cd00527e03bc0d71d/urlcanon-0.1.dev15.tar.gz" } ], "0.1.dev16": [ { "comment_text": "", "digests": { "md5": "05f90e3ae3010d2320fffc983af530fc", "sha256": "9370d9b7516d72ba14d79d423d7686b839b47136fd6c56790c680eb8b94cba64" }, "downloads": -1, "filename": "urlcanon-0.1.dev16.tar.gz", "has_sig": false, "md5_digest": "05f90e3ae3010d2320fffc983af530fc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25315, "upload_time": "2017-03-15T16:30:50", "url": "https://files.pythonhosted.org/packages/4e/30/4e73f2b3156097e0466a6118484f5a67c7b132bba8bb308ab841a4925305/urlcanon-0.1.dev16.tar.gz" } ], "0.1.dev20": [ { "comment_text": "", "digests": { "md5": "9df3ada5295404caa3058ffaf6574556", "sha256": "925b317ff53536cc254194b6997e20c584abd9cb79feb379cf8fba9f67fc019a" }, "downloads": -1, "filename": "urlcanon-0.1.dev20.tar.gz", "has_sig": false, "md5_digest": "9df3ada5295404caa3058ffaf6574556", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26672, "upload_time": "2018-05-01T22:36:31", "url": "https://files.pythonhosted.org/packages/1d/cb/5483f018b104fca68c09889ebe52b1c33db760b4ffd112f5b9e8a9f244cb/urlcanon-0.1.dev20.tar.gz" } ], "0.1.dev21": [ { "comment_text": "", "digests": { "md5": "c2fcd181d4209a2d7190d8e369c8961c", "sha256": "d469f20b0edca9151382b7dd0e51cefb5c63104af5eec619dbd7e626bbfb66f9" }, "downloads": -1, "filename": "urlcanon-0.1.dev21.tar.gz", "has_sig": false, "md5_digest": "c2fcd181d4209a2d7190d8e369c8961c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26720, "upload_time": "2018-05-02T21:22:04", "url": "https://files.pythonhosted.org/packages/ce/86/bc0ce3cb93e238bbbc794b878c9aab40c9768273608bb62e22c74c78fc45/urlcanon-0.1.dev21.tar.gz" } ], "0.1.dev23": [ { "comment_text": "", "digests": { "md5": "9688c1294f0bd19611874610e346fac4", "sha256": "a24cc11d7d8afa6eba19a39c1e4f1e4ba1ce43a6cabb1f44fa8f31a641262c48" }, "downloads": -1, "filename": "urlcanon-0.1.dev23.tar.gz", "has_sig": false, "md5_digest": "9688c1294f0bd19611874610e346fac4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26742, "upload_time": "2018-05-16T21:41:12", "url": "https://files.pythonhosted.org/packages/23/a0/11f10940835bc9f11abea9caa81dc92968e17593bd938149ed1f4cd4ad7b/urlcanon-0.1.dev23.tar.gz" } ], "0.1.dev9": [ { "comment_text": "", "digests": { "md5": "e5ff22b000dce181bfd72938763ee7ef", "sha256": "090e99196c9b794d378104f38f9e0a9f3abbcae6611fabe0e52b1841bc93a22c" }, "downloads": -1, "filename": "urlcanon-0.1.dev9.tar.gz", "has_sig": false, "md5_digest": "e5ff22b000dce181bfd72938763ee7ef", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 22904, "upload_time": "2017-03-13T22:20:10", "url": "https://files.pythonhosted.org/packages/47/ef/328dc51628c2e290fa5ce8e23cb2bfaaa6a70e6a72d4930fa58f5186b755/urlcanon-0.1.dev9.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "acac72ec469f202d498856669f79e7a2", "sha256": "c693e376ded540219b920805f6689db0372839b0140751024ef67da43f82af23" }, "downloads": -1, "filename": "urlcanon-0.2.tar.gz", "has_sig": false, "md5_digest": "acac72ec469f202d498856669f79e7a2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27225, "upload_time": "2018-08-16T23:06:46", "url": "https://files.pythonhosted.org/packages/ee/87/fe1b30ef9df2d7710863251a35136f7b870ef4320850296903d834530378/urlcanon-0.2.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "4f9186a5f60ae39ccc08540df8186f51", "sha256": "a2a25d579b7b2e978379d929d52e4c8f7cb7657d58dde7e20ee8a8377da3ff0f" }, "downloads": -1, "filename": "urlcanon-0.2.1.tar.gz", "has_sig": false, "md5_digest": "4f9186a5f60ae39ccc08540df8186f51", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27224, "upload_time": "2018-12-18T00:03:30", "url": "https://files.pythonhosted.org/packages/51/1b/ab6f410f54062ca8e2b28dbd455cb534813c9a2ca3ed4c7f103fd7db3ce7/urlcanon-0.2.1.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "2e86ae773371ecea0246b6cfb9797db0", "sha256": "8cf6a4c8154cf1fa8ba00cc25681f736eb6973ccc149507252727f7e38b84175" }, "downloads": -1, "filename": "urlcanon-0.3.0.tar.gz", "has_sig": false, "md5_digest": "2e86ae773371ecea0246b6cfb9797db0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27197, "upload_time": "2019-02-22T22:05:38", "url": "https://files.pythonhosted.org/packages/10/53/f5198200aa27504cbb9f00d587d348fc9da8686c6243202663120ae2f017/urlcanon-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "d961106a2e524ce5f59a34171a324188", "sha256": "30f5bf0e2e4a0feb6dd9ee139a4180a5d493117e8a1448569da3d73c18b92b62" }, "downloads": -1, "filename": "urlcanon-0.3.1.tar.gz", "has_sig": false, "md5_digest": "d961106a2e524ce5f59a34171a324188", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13516, "upload_time": "2019-07-02T23:46:43", "url": "https://files.pythonhosted.org/packages/cb/65/222a5733af4c6d728fa90b0dcead218b3b0460eacdae22bb9ecdea1bbe5d/urlcanon-0.3.1.tar.gz" } ], "0.3.dev29": [ { "comment_text": "", "digests": { "md5": "1913937ce5529c4f9cdf76b90b1f2eb2", "sha256": "3bbe28e6fbdc332f481937abadfc72527ee52512d5282d3dfb4c0fc6196e1c56" }, "downloads": -1, "filename": "urlcanon-0.3.dev29.tar.gz", "has_sig": false, "md5_digest": "1913937ce5529c4f9cdf76b90b1f2eb2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27225, "upload_time": "2019-02-22T08:03:16", "url": "https://files.pythonhosted.org/packages/e9/18/10806abf67c82238df10a20f739f09cc72d1fa1cc1c9bbeefc2a6ff1b302/urlcanon-0.3.dev29.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d961106a2e524ce5f59a34171a324188", "sha256": "30f5bf0e2e4a0feb6dd9ee139a4180a5d493117e8a1448569da3d73c18b92b62" }, "downloads": -1, "filename": "urlcanon-0.3.1.tar.gz", "has_sig": false, "md5_digest": "d961106a2e524ce5f59a34171a324188", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13516, "upload_time": "2019-07-02T23:46:43", "url": "https://files.pythonhosted.org/packages/cb/65/222a5733af4c6d728fa90b0dcead218b3b0460eacdae22bb9ecdea1bbe5d/urlcanon-0.3.1.tar.gz" } ] }