{ "info": { "author": "", "author_email": "dhking@wharton.upenn.edu", "bugtrack_url": null, "classifiers": [], "description": "[![Build Status](https://travis-ci.org/wharton/S3WebCache.svg?branch=master)](https://travis-ci.org/wharton/S3WebCache)\n[![PyPI version](https://badge.fury.io/py/S3WebCache.svg)](https://badge.fury.io/py/S3WebCache)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n# S3 Web Cache\n\nThis is a simple package for archiving web pages (HTML) to S3. It acts as a cache, returning the S3 copy of a page if one exists. If not, it fetches the URL through [Requests](http://docs.python-requests.org/en/master/) and archives the response in S3.\n\nOur use case: provide a reusable history of the pages included in a web scrape, that is, an archived version of a particular URL at a moment in time. Because the web is always changing, archiving lets different research questions be asked at a later date without losing the original content. Please only use it in this manner if you have obtained permission for the pages you are requesting.\n\n\n## Quickstart\n\n\n### Install\n\n`pip install s3webcache`\n\n\n### Usage\n\n```\nfrom s3webcache import S3WebCache\n\ns3wc = S3WebCache(\n bucket_name=,\n aws_access_key_id=,\n aws_secret_key=,\n aws_default_region=)\n\nrequest = s3wc.get(\"https://en.wikipedia.org/wiki/Whole_Earth_Catalog\")\n\nif request.success:\n html = request.message\n```\n\nIf the required AWS credentials are not given, it will fall back to using environment variables.\n\nThe `.get(url)` operation returns a namedtuple `Request`: (success: bool, message: str).\n\nFor successful operations, `.message` contains the URL data.\nFor unsuccessful operations, `.message` contains error information.\n\n\n### Options\n\n`S3WebCache()` takes the following arguments with these defaults:\n - bucket_name: str\n - path_prefix: str = None\\\n Subdirectories to store URLs. 
`path_prefix='ht'` will start archiving at path `s3://BUCKETNAME/ht/`\n - aws_access_key_id: str = None \n - aws_secret_key: str = None\n - aws_default_region: str = None \n - trim_website: bool = False\n Trim out the hostname. By default the hostname is stored with dots replaced by underscores: `https://github.com/wharton/S3WebCache` would be stored as `s3://BUCKETNAME/github_com/wharton/S3WebCache`.\\\n Set this to `True` and it will be stored as `s3://BUCKETNAME/wharton/S3WebCache`.\n - allow_forwarding: bool = True\n Follow HTTP 3xx redirects.\n\n\n## TODO\n\n - Add 'update S3 if file is older than...' behavior\n - Add transparent compression by default (gzip, lz4, etc.)\n - Add rate limiting\n\n\n## Reference\n\n[AWS S3 API documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html) \n\n\n## License\n\nMIT\n\n\n## Tests\n\nTests run through Travis CI.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "S3WebCache", "package_url": "https://pypi.org/project/S3WebCache/", "platform": "", "project_url": "https://pypi.org/project/S3WebCache/", "project_urls": null, "release_url": "https://pypi.org/project/S3WebCache/0.2.2/", "requires_dist": null, "requires_python": "", "summary": "", "version": "0.2.2" }, "last_serial": 5557821, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "43839dff5e9603cfedf25a940e7c5e2a", "sha256": "3fe1e8e811566589f81a44fb32b11161863c8386c12cb50f788858fbcb534258" }, "downloads": -1, "filename": "S3WebCache-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "43839dff5e9603cfedf25a940e7c5e2a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 2164, "upload_time": "2019-04-11T19:30:18", "url": 
"https://files.pythonhosted.org/packages/53/d9/8bbb1a7db9d54d6239a891d3d76ed0944aaa3cec1a32ad4c9c610f58a9da/S3WebCache-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cc1b0b0bd35a9ac7c13203489dd9cd7a", "sha256": "813ed52c42eb9345b594faaa8b132a51ab54ab9b6836c4a63d1d07d791de4f6f" }, "downloads": -1, "filename": "S3WebCache-0.0.1.tar.gz", "has_sig": false, "md5_digest": "cc1b0b0bd35a9ac7c13203489dd9cd7a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 932, "upload_time": "2019-04-11T19:30:23", "url": "https://files.pythonhosted.org/packages/6c/78/bf0ea0c72aa5c6fd83405918529d9f14eda9b7a7bdf2af76ebeb733d5677/S3WebCache-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "26beaa45276053cef00d343c728f1987", "sha256": "826cf4b332ac59212275d8dc9ea8f223ffb415e9e5ae71e2cec18f74b953ff26" }, "downloads": -1, "filename": "S3WebCache-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "26beaa45276053cef00d343c728f1987", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3475, "upload_time": "2019-04-11T19:30:19", "url": "https://files.pythonhosted.org/packages/70/72/8cdff2e102926efaefe1d0b520d3a2f713172d0fb1e8dec0277fad018a59/S3WebCache-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1311d585edcb26da87ee0f2d0374c869", "sha256": "3faded75434306e4c31cc74285603c075059774131c306aeaca05a77343dc327" }, "downloads": -1, "filename": "S3WebCache-0.0.2.tar.gz", "has_sig": false, "md5_digest": "1311d585edcb26da87ee0f2d0374c869", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2227, "upload_time": "2019-04-11T19:30:24", "url": "https://files.pythonhosted.org/packages/2a/6e/f48988139317b08bdc81fb4f79f96a051158ec0fe4715b07594c56179a35/S3WebCache-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "91ccd0010ec15aa7f582e44bb9b2146f", "sha256": 
"ec1f96e1dde6ab08b81343de28d716d70ded0e7f3b2e4cc27dcc9cf8d3279125" }, "downloads": -1, "filename": "S3WebCache-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "91ccd0010ec15aa7f582e44bb9b2146f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3501, "upload_time": "2019-04-11T19:30:21", "url": "https://files.pythonhosted.org/packages/18/b6/54284d4f941c9a81c18893e9163a90e2c11822cf12cdb244360ea02b2099/S3WebCache-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "534254046c7602e009b8145fe7af5fd6", "sha256": "134a00022cfd98395b6b49aa7e5c344c12096504d8bd9d1a321ab5394b1d0939" }, "downloads": -1, "filename": "S3WebCache-0.0.3.tar.gz", "has_sig": false, "md5_digest": "534254046c7602e009b8145fe7af5fd6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2279, "upload_time": "2019-04-11T19:30:25", "url": "https://files.pythonhosted.org/packages/05/ad/e436d31f06680017277ca6d2b558e7fec7979d5fdfb6e8a744bae0cce906/S3WebCache-0.0.3.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "7de7c05c34da8f6b5b61a945fb0af5c3", "sha256": "618c1e2c45247aa820bb0675269ff18499cc52616eb92711cd6cb35dbdf38494" }, "downloads": -1, "filename": "S3WebCache-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "7de7c05c34da8f6b5b61a945fb0af5c3", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3627, "upload_time": "2019-04-11T19:44:56", "url": "https://files.pythonhosted.org/packages/71/80/6dd50033aa126d8e4087650e65d1a837ca0dd3ba5f478b2b0fed0f2c55ea/S3WebCache-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c249c560f017dcb56eadecd506c49c9e", "sha256": "df1dd53b6f677e2c3fc2ef856286d01ec0c8f94639a65e5aaf31ac245ded7016" }, "downloads": -1, "filename": "S3WebCache-0.1.0.tar.gz", "has_sig": false, "md5_digest": "c249c560f017dcb56eadecd506c49c9e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 
2426, "upload_time": "2019-04-11T19:45:00", "url": "https://files.pythonhosted.org/packages/1e/71/645a6305a012e0be06a68bce0a303142234a7c52be65879add22ed3bb6a8/S3WebCache-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "3558cada74c9821ac5cc9f02f57b4374", "sha256": "b96b24d0ee5f2c4556d654262e89608e7daa595ad724718cd5984f432b665a18" }, "downloads": -1, "filename": "S3WebCache-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "3558cada74c9821ac5cc9f02f57b4374", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3626, "upload_time": "2019-04-11T19:46:00", "url": "https://files.pythonhosted.org/packages/25/cc/af17d1ce925eb267e2fff094221aa25a2be8dcd1d488d6c09c733ac2d244/S3WebCache-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c6d914fc8bae9471f356ba694b789635", "sha256": "3f44d561c90fba8716e68131c65e7246ceb2b331453dfe7e7cd099a47d130dba" }, "downloads": -1, "filename": "S3WebCache-0.1.1.tar.gz", "has_sig": false, "md5_digest": "c6d914fc8bae9471f356ba694b789635", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2420, "upload_time": "2019-04-11T19:46:03", "url": "https://files.pythonhosted.org/packages/18/62/ceccc373181260c0080f9ee83d286f94e3dbc3ccd0f1e57c1d038521041c/S3WebCache-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "4abf5c40b1ed02ac6e0d412fba8876cc", "sha256": "9c64a59ff383035da10324a2faad4d46852872f879e708d7e994a253752c6d3f" }, "downloads": -1, "filename": "S3WebCache-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4abf5c40b1ed02ac6e0d412fba8876cc", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3686, "upload_time": "2019-04-11T20:18:54", "url": "https://files.pythonhosted.org/packages/57/e4/ac81f989bff08f097302c165c37cecfff754157f9a1014f7d0e6bde61cff/S3WebCache-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "aebf32a54f7fe0bf2745b976e4d4bc43", 
"sha256": "ddfce37094d7e20befc23fab7630ecab22823a21b226f0818b9f2aa6ed0708e7" }, "downloads": -1, "filename": "S3WebCache-0.1.2.tar.gz", "has_sig": false, "md5_digest": "aebf32a54f7fe0bf2745b976e4d4bc43", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2472, "upload_time": "2019-04-11T20:18:57", "url": "https://files.pythonhosted.org/packages/0a/49/ad2e6879ec660a7c9de24c49c159767eeb6c98377f124f80127d4fb47952/S3WebCache-0.1.2.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "6ead60191eddf85ab977e5cf7d362de5", "sha256": "5a70867a50e5c33ea60eec4492c83f7c34b8b4c568a33eb3031ce8851d6e2aa0" }, "downloads": -1, "filename": "S3WebCache-0.1.4-py3-none-any.whl", "has_sig": false, "md5_digest": "6ead60191eddf85ab977e5cf7d362de5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3699, "upload_time": "2019-04-11T20:32:31", "url": "https://files.pythonhosted.org/packages/03/ff/8167d55d86fb95ccfe1bc00e63269f0a015867a442d080e41aa0fe3c20f7/S3WebCache-0.1.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f9a824a2a26b4c8a5f1f02fcd4396840", "sha256": "b99872ab9abae83b732709f5d13239e1292c233134aae748a9842a6c59339919" }, "downloads": -1, "filename": "S3WebCache-0.1.4.tar.gz", "has_sig": false, "md5_digest": "f9a824a2a26b4c8a5f1f02fcd4396840", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2497, "upload_time": "2019-04-11T20:32:35", "url": "https://files.pythonhosted.org/packages/4f/82/cc889f190a8299dc613c6957d68f500b3d9e31ef39c3c5b47e631ed18fa0/S3WebCache-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "573fca0b4754ed79d51c31cec2b6c8c5", "sha256": "1e8b1725f49f7a7e630d01b5675720b67dbc0b2f6784af3e99a97d55dec127e0" }, "downloads": -1, "filename": "S3WebCache-0.1.5-py3-none-any.whl", "has_sig": false, "md5_digest": "573fca0b4754ed79d51c31cec2b6c8c5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": 
null, "size": 4231, "upload_time": "2019-04-12T13:08:36", "url": "https://files.pythonhosted.org/packages/af/19/ae001b7d355ccba1e5d3cdc06463713bb6f7158c3754d6b83e5237cca510/S3WebCache-0.1.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ec6d72183e33562776402d1dd0cd5126", "sha256": "90b7e102d88753eedbf9f6219d8e20c7d173376b9fec2e648e28292f4a9d16bf" }, "downloads": -1, "filename": "S3WebCache-0.1.5.tar.gz", "has_sig": false, "md5_digest": "ec6d72183e33562776402d1dd0cd5126", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2757, "upload_time": "2019-04-12T13:08:38", "url": "https://files.pythonhosted.org/packages/6f/93/2323183b48d13dfb4adc465bb472121c7088ab68ce9f1b1bff82f56941bc/S3WebCache-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "28aa1046520b9f4c3ea28e334c442da0", "sha256": "1ef06b3f14a3818aefbc22ae608ac58d81fad7d529f5f3dab7685f7ca79ff2aa" }, "downloads": -1, "filename": "S3WebCache-0.1.6-py3-none-any.whl", "has_sig": false, "md5_digest": "28aa1046520b9f4c3ea28e334c442da0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4510, "upload_time": "2019-04-12T17:07:36", "url": "https://files.pythonhosted.org/packages/ee/64/51a0a516e71f3316a028bbe09b8eb6883a0cd36a571de713c4cedb403998/S3WebCache-0.1.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "713ba73debb9e8b977c715e386e8a343", "sha256": "9a778276e9f435124a07c81b6da142a2a0713a047a959fc2a9fbf6dd19da7d54" }, "downloads": -1, "filename": "S3WebCache-0.1.6.tar.gz", "has_sig": false, "md5_digest": "713ba73debb9e8b977c715e386e8a343", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2932, "upload_time": "2019-04-12T17:07:38", "url": "https://files.pythonhosted.org/packages/86/9c/a405d2e26de755e5232479c0b733df6e3ab1703b7b3df4ea1c28cfc02551/S3WebCache-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "6d4e6641b25be01c68c928ce1db79603", 
"sha256": "42e8e4a9cced8b06eca0464940f29bee6ca2d4d868de2201a4c95b4b38e3b2c3" }, "downloads": -1, "filename": "S3WebCache-0.1.7-py3-none-any.whl", "has_sig": false, "md5_digest": "6d4e6641b25be01c68c928ce1db79603", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4561, "upload_time": "2019-04-12T18:40:02", "url": "https://files.pythonhosted.org/packages/4d/da/fd8dce947d675a68f0a320529b0434e822b50e814be3ba0a1f32dafae9d7/S3WebCache-0.1.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "bb5c074223f8d211c17ae02913bd9371", "sha256": "f2fc355f3855910e7e0b537ef53bcae526c5132edca461efbf8e8664826bb1a0" }, "downloads": -1, "filename": "S3WebCache-0.1.7.tar.gz", "has_sig": false, "md5_digest": "bb5c074223f8d211c17ae02913bd9371", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2986, "upload_time": "2019-04-12T18:40:03", "url": "https://files.pythonhosted.org/packages/28/fe/28d0fba577ca8e09aadb28f8c2ed55b22984e658c77680e609aa9dbc27d4/S3WebCache-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "18b0d41dd2973ba888292f2035adc2c9", "sha256": "72dd25d65aa02c7c50ac6e9d11711dc0083db5d2290d90a87e0d37c4d71778e0" }, "downloads": -1, "filename": "S3WebCache-0.1.8.tar.gz", "has_sig": false, "md5_digest": "18b0d41dd2973ba888292f2035adc2c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4061, "upload_time": "2019-06-28T19:46:17", "url": "https://files.pythonhosted.org/packages/f9/98/9bdc4c4946c9be6e9fcb2e6eab255da129dfd7fe6396b2357f240f0c15fe/S3WebCache-0.1.8.tar.gz" } ], "0.1.9": [ { "comment_text": "", "digests": { "md5": "ad5e88c056a800f7ef7168c4545389b6", "sha256": "8be1127fbeb4ced698f901928efe2c0acb60cce96cfe4ef80931f5e843f77625" }, "downloads": -1, "filename": "S3WebCache-0.1.9.tar.gz", "has_sig": false, "md5_digest": "ad5e88c056a800f7ef7168c4545389b6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 
4052, "upload_time": "2019-06-28T20:37:00", "url": "https://files.pythonhosted.org/packages/ad/a7/df769e7500c607fc4252ca138764d4cad5be439c3b1825abe46fd73bfacf/S3WebCache-0.1.9.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "125d97dfd37678fb14467cf0b52ffffd", "sha256": "54cb044a6930e1ba8cc7d7e73443f95fb546f8b1ee3497e5e44e0d58d818059f" }, "downloads": -1, "filename": "S3WebCache-0.2.0.tar.gz", "has_sig": false, "md5_digest": "125d97dfd37678fb14467cf0b52ffffd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4055, "upload_time": "2019-06-28T20:40:43", "url": "https://files.pythonhosted.org/packages/54/dc/12c8f756629b7c89466fb76926aed3ca9745408677e47221c29e9166c214/S3WebCache-0.2.0.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "2fe89a525b9dd38ea92f8528367bafed", "sha256": "21a6ba81754b1f54aaec2cf1dda187cca3afdab65dfedcd1ce4a66cc17aaa8bb" }, "downloads": -1, "filename": "S3WebCache-0.2.2.tar.gz", "has_sig": false, "md5_digest": "2fe89a525b9dd38ea92f8528367bafed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4120, "upload_time": "2019-07-05T14:00:19", "url": "https://files.pythonhosted.org/packages/a5/46/ea7dfad1bf73ab6a179b387da3832a086d488e730f4194b20b041b95e6a7/S3WebCache-0.2.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2fe89a525b9dd38ea92f8528367bafed", "sha256": "21a6ba81754b1f54aaec2cf1dda187cca3afdab65dfedcd1ce4a66cc17aaa8bb" }, "downloads": -1, "filename": "S3WebCache-0.2.2.tar.gz", "has_sig": false, "md5_digest": "2fe89a525b9dd38ea92f8528367bafed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4120, "upload_time": "2019-07-05T14:00:19", "url": "https://files.pythonhosted.org/packages/a5/46/ea7dfad1bf73ab6a179b387da3832a086d488e730f4194b20b041b95e6a7/S3WebCache-0.2.2.tar.gz" } ] }