{ "info": { "author": "Elizabeth Seiver, Sebastian Bassi, M Pacer", "author_email": "eseiver@plos.org, sebastian.bassi@globant.com, mpacer@berkeley.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering" ], "description": ".. image:: https://api.travis-ci.org/PLOS/allofplos.svg?branch=master\n :target: https://travis-ci.org/PLOS/allofplos\n :alt: Build Status\n\nAll of Plos (allofplos)\n=======================\n\nCopyright (c) 2017, Public Library of Science. MIT License, see\nLICENSE.txt for more information.\n\nWhy allofplos?\n--------------\n\nThis is for downloading/updating/maintaining a repository of all PLOS\nXML article files. This can be used to have a copy of the PLOS text\ncorpus for further analysis. Use this program to download all PLOS XML\narticle files instead of doing web scraping.\n\n**NOTE**: This software is not stable, we consider it beta state and will\nbe in this stage until version 1.0. This means that programming interface\nmay change and after a new version a full corpus download may be required.\n\nInstallation instructions\n-------------------------\n\nThis program requires Python 3.4+.\n\nMake a virtual environment:\n\n``$ virtualenv allofplos``\n\nUsing pip:\n\n``(allofplos)$ pip install allofplos``\n\nThis should install *allofplos* and requirements. At this stage you are ready to go.\n\nIf you want to manually install from source (for example for development purposes), first clone the project repository:\n\n``(allofplos)$ git clone git@github.com:PLOS/allofplos.git``\n\nInstall Python dependencies inside the newly created virtual environment:\n\n``(allofplos)$ pip install -r requirements.txt``\n\nHow to run the program\n----------------------\n\nExecute the following command.\n\n``(allofplos)$ python -m allofplos.update``\n\nThe first time it runs it will download a >4.4 Gb zip file\n(**allofplos_xml.zip**) with all the XML files inside.\n**Note**: Make sure that you have enough space in your device for the\nzip file and for its content before running this command (at least 30Gb).\nAfter this file is downloaded, it will extract its contents into the\nallofplos\\_xml directory inside your installation of *allofplos*.\n\nIf you want to see the directory on your file system where this is installed run\n\n``python -c \"from allofplos import get_corpus_dir; print(get_corpus_dir())\"``\n\nIf you ever downloaded the corpus before, it will make an incremental\nupdate to the existing corpus. The script:\n\n- checks for and then downloads to a temporary folder individual new articles that have been published\n- of those new articles, checks whether they are corrections (and\n whether the linked corrected article has been updated)\n- checks whether there are VORs (Versions of Record) for uncorrected\n proofs in the main articles directory and downloads those\n- checks whether the newly downloaded articles are uncorrected proofs\n or not after all of these checks, it moves the new articles into the\n main articles folder.\n\nHere\u2019s what the print statements might look like on a typical run:\n\n::\n\n 147 new articles to download.\n 147 new articles downloaded.\n 3 amended articles found.\n 0 amended articles downloaded with new xml.\n Creating new text list of uncorrected proofs from scratch.\n No new VOR articles indexed in Solr.\n 17 VOR articles directly downloaded.\n 17 uncorrected proofs updated to version of record. 44 uncorrected proofs remaining in uncorrected proof list.\n 9 uncorrected proofs found. 53 total in list.\n Corpus started with 219792 articles.\n Moving new and updated files...\n 164 files moved. Corpus now has 219939 articles.\n\nHow to run the tests\n--------------------\n\nTo run the tests, you will need to install *allofplos* with its testing\ndependencies. These testing dependencies include ``pytest``, which we will use\nto run the tests.\n\nTo get all of the *allofplos* testing dependencies run:\n\n``(allofplos)$ pip install -U allofplos[test]``\n\nThen when you run:\n\n``(allofplos)$ pytest --pyargs allofplos``\n\nIt should return something like this:\n\n.. code::\n\n collected 20 items\n\n allofplos/tests/test_corpus.py ............ [ 60%]\n allofplos/tests/test_unittests.py ........ [100%]\n\n ==================== 20 passed in 0.36 seconds =========================\n\n\nCommunity guidelines\n--------------------\n\nIf you wish to contribute to this project please open a ticket in the\nGitHub repo at https://github.com/PLOS/allofplos/issues. For support\nrequests write to mining@plos.org", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/PLOS/allofplos", "keywords": "science PLOS publishing", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "allofplos", "package_url": "https://pypi.org/project/allofplos/", "platform": "", "project_url": "https://pypi.org/project/allofplos/", "project_urls": { "Homepage": "https://github.com/PLOS/allofplos" }, "release_url": "https://pypi.org/project/allofplos/0.12.0/", "requires_dist": null, "requires_python": ">=3.4", "summary": "Get and analyze all PLOS articles", "version": "0.12.0" }, "last_serial": 5621178, "releases": { "0.10.0": [ { "comment_text": "", "digests": { "md5": "4d0f39b9a0a4d89d28db72b5d7cccfca", "sha256": "e75b0271160ca5bdb5ab4f756ededcad68faf6fbbf8a24a8d9db8575415613c9" }, "downloads": -1, "filename": "allofplos-0.10.0-py3-none-any.whl", "has_sig": false, "md5_digest": "4d0f39b9a0a4d89d28db72b5d7cccfca", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.4", "size": 3463780, "upload_time": "2018-01-19T08:08:26", "url": "https://files.pythonhosted.org/packages/00/d2/ecd91ee7accf242e6bb49b962c6145007af2d81abd20ebd92da9f21bfd00/allofplos-0.10.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0189457daab0d2c61b87094149c4dad5", "sha256": "68c2f361650670c310330c238dfeaeb35c75e437fcd69787e89179371504de22" }, "downloads": -1, "filename": "allofplos-0.10.0.tar.gz", "has_sig": false, "md5_digest": "0189457daab0d2c61b87094149c4dad5", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3365969, "upload_time": "2018-01-19T08:08:30", "url": "https://files.pythonhosted.org/packages/5d/14/acb13aee786860cc142f2fe1514e7603952f05eb8aca4544cbb942fe74f0/allofplos-0.10.0.tar.gz" } ], "0.10.1": [ { "comment_text": "", "digests": { "md5": "174c3a37509eabbcfe9c3b92739570f0", "sha256": "5b7c53f37f878b692ccb145e022d1b4c924eab58633d99c5ce759c46c6a967a7" }, "downloads": -1, "filename": "allofplos-0.10.1-py3-none-any.whl", "has_sig": false, "md5_digest": "174c3a37509eabbcfe9c3b92739570f0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.4", "size": 3463789, "upload_time": "2018-01-19T22:58:19", "url": "https://files.pythonhosted.org/packages/e1/30/5cd995fab94072bedaf492bcf92ffd2e95001e7202caba33658574ef91b9/allofplos-0.10.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9ccaf9c33e140a3e9108130b337b622a", "sha256": "fcf43d9e69dd1a24c4877e79739f21fdbed5e0f8ba938c042aaef43012259a23" }, "downloads": -1, "filename": "allofplos-0.10.1.tar.gz", "has_sig": false, "md5_digest": "9ccaf9c33e140a3e9108130b337b622a", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3365905, "upload_time": "2018-01-19T22:58:24", "url": "https://files.pythonhosted.org/packages/df/ec/141bb44c8a79bf2d467b8f4f43e6797b32cb701c0fbfef8d07dd15615f1b/allofplos-0.10.1.tar.gz" } ], "0.10.2": [ { "comment_text": "", "digests": { "md5": "4f9dd042e4fb5a899cdc1e7aa0873528", "sha256": "e7973cb098b1880699914e0e1f97b41574a56b4621dc81a606a56555759fade8" }, "downloads": -1, "filename": "allofplos-0.10.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4f9dd042e4fb5a899cdc1e7aa0873528", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.4", "size": 3464426, "upload_time": "2018-02-02T19:39:58", "url": "https://files.pythonhosted.org/packages/4b/26/4d15892f58a1a41146c1a36281eb7f26aafac614c40d3aebf06911b66b36/allofplos-0.10.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e2e1665c50130ec0f32a27685a505ccb", "sha256": "e123c3e27fcc8df778bbec7ee7c8c62e504451feddfd3c85de540b5074900d7a" }, "downloads": -1, "filename": "allofplos-0.10.2.tar.gz", "has_sig": false, "md5_digest": "e2e1665c50130ec0f32a27685a505ccb", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3374491, "upload_time": "2018-02-02T19:40:02", "url": "https://files.pythonhosted.org/packages/e2/08/be783d3d65498e952f1ba0c9eea82e81dd1d972989588a83e670ecbab6f3/allofplos-0.10.2.tar.gz" } ], "0.11.0": [ { "comment_text": "", "digests": { "md5": "864a8283f36fc0d3203139e9dfb024f5", "sha256": "e81e663e07fc043b2f6d6b6cf94c9df7234b653364afa123b8c50af60bd204d1" }, "downloads": -1, "filename": "allofplos-0.11.0-py3-none-any.whl", "has_sig": false, "md5_digest": "864a8283f36fc0d3203139e9dfb024f5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.4", "size": 3520784, "upload_time": "2018-03-06T17:21:53", "url": "https://files.pythonhosted.org/packages/59/a5/2d1824d8ce474c08c9342b2b05992aaf90437c88dfd58cf9f4b9c01dddac/allofplos-0.11.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4867495f75f34fa8e4bb2b20ceca3b31", "sha256": "a4a40e2563c0ef5297067f5a14ec4c2ee2652148df5033e1da2b164c08940edd" }, "downloads": -1, "filename": "allofplos-0.11.0.tar.gz", "has_sig": false, "md5_digest": "4867495f75f34fa8e4bb2b20ceca3b31", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3428208, "upload_time": "2018-03-06T17:21:56", "url": "https://files.pythonhosted.org/packages/06/c8/dd7bdc8ea846ace7007039af6a96250a5cd6ea4231019fe16070b4bc4748/allofplos-0.11.0.tar.gz" } ], "0.11.1": [ { "comment_text": "", "digests": { "md5": "2cf0452920bc99d1035166066abf47d3", "sha256": "17149c52f089c561885be0eb4827edd6ee1dbaea4d17828c38d39d58b603e1f7" }, "downloads": -1, "filename": "allofplos-0.11.1-py3-none-any.whl", "has_sig": false, "md5_digest": "2cf0452920bc99d1035166066abf47d3", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.4", "size": 3520789, "upload_time": "2018-03-22T22:22:11", "url": "https://files.pythonhosted.org/packages/87/72/5717cd0c5a1bd21fdab3be77e3561bab773a455d73e31c1f709499572ccc/allofplos-0.11.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2ef907489d6e00226a259ea6962a8510", "sha256": "a51580b77563d168885b4e36f751e3d7f48195b7d8c886414ae850eccba218e0" }, "downloads": -1, "filename": "allofplos-0.11.1.tar.gz", "has_sig": false, "md5_digest": "2ef907489d6e00226a259ea6962a8510", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3419674, "upload_time": "2018-03-22T22:22:16", "url": "https://files.pythonhosted.org/packages/a1/5e/ec056dd07190504c653d69aed02ef8d8d232305fea563a079dda09330488/allofplos-0.11.1.tar.gz" } ], "0.12.0": [ { "comment_text": "", "digests": { "md5": "ef93c208e456efa3bf578ae24a1bc3a8", "sha256": "6d978478b4c4849c31ef46d9f05bfdb529e33700aa5ad25af367f8d8a6565f20" }, "downloads": -1, "filename": "allofplos-0.12.0.tar.gz", "has_sig": false, "md5_digest": "ef93c208e456efa3bf578ae24a1bc3a8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3426634, "upload_time": "2019-08-01T22:59:25", "url": "https://files.pythonhosted.org/packages/3d/6e/b981cc4a2bcfcb5f8aa0fd43867d2a2964b49acc4ac1c0a8a4a8eeb8d9c9/allofplos-0.12.0.tar.gz" } ], "0.8.0": [ { "comment_text": "", "digests": { "md5": "c667bb485a82a16a5d6575d9d2dda194", "sha256": "630af124c351489c3ee612d0b6ba76397be505053996735e105e003503a5abaa" }, "downloads": -1, "filename": "allofplos-0.8.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "c667bb485a82a16a5d6575d9d2dda194", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 18725, "upload_time": "2017-10-05T00:23:48", "url": "https://files.pythonhosted.org/packages/fe/f0/cc376d154404c9e6b12e6fd0875b5985113f575deff935c4323f6fd588ca/allofplos-0.8.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c77605ade4f9221dfbdfd491873cf85d", "sha256": "8f7457f9d6b0e92b117d571e1d5e1494a32f6275cdea7a8eec5303a0acfed6a6" }, "downloads": -1, "filename": "allofplos-0.8.0.tar.gz", "has_sig": false, "md5_digest": "c77605ade4f9221dfbdfd491873cf85d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17668, "upload_time": "2017-10-05T00:23:50", "url": "https://files.pythonhosted.org/packages/26/11/3855e10997b04cfbde6315d729e79d3a0c8d6c16751479f6a1816e37bf3b/allofplos-0.8.0.tar.gz" } ], "0.8.1": [ { "comment_text": "", "digests": { "md5": "90f3a0cbe59d1ad2dd2c7f7e5e22886b", "sha256": "2351e8c4a90897fecc24244a4b0ef3d822036b67a7d4afbef841deb79c2c5662" }, "downloads": -1, "filename": "allofplos-0.8.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "90f3a0cbe59d1ad2dd2c7f7e5e22886b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 19150, "upload_time": "2017-10-06T00:05:23", "url": "https://files.pythonhosted.org/packages/88/a9/1c0e4f8bffc137f4aad5097682d5fa23504f97d50b1333e3658800c1b0db/allofplos-0.8.1-py2.py3-none-any.whl" } ], "0.8.3": [ { "comment_text": "", "digests": { "md5": "62e1c54c34bf6c0f93b7e7d2fc2a9deb", "sha256": "641cad9d0db23e65aea4601be92bacd6bc0ff4c31e82efb7b1cc119d217d2f74" }, "downloads": -1, "filename": "allofplos-0.8.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "62e1c54c34bf6c0f93b7e7d2fc2a9deb", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4", "size": 19266, "upload_time": "2017-10-20T17:56:46", "url": "https://files.pythonhosted.org/packages/04/ee/33483152060b99cf98b3abbae255404f26a8f0c873a33b24fbbfe08f317d/allofplos-0.8.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "21550e1bd16d14f6bcf21ca711984f8c", "sha256": "5eeba7797401f49288da7efe4dce9f071a7e0be245beed2e2283a76b246c6791" }, "downloads": -1, "filename": "allofplos-0.8.3.tar.gz", "has_sig": false, "md5_digest": "21550e1bd16d14f6bcf21ca711984f8c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 17815, "upload_time": "2017-10-20T17:56:47", "url": "https://files.pythonhosted.org/packages/4c/fa/035cf5577dd1be444b02c8df63d8ee7c8820fed0beb9022309534b503a66/allofplos-0.8.3.tar.gz" } ], "0.8.4": [ { "comment_text": "", "digests": { "md5": "7ac9e8e2a983bde1e2450eabf889c63d", "sha256": "d9658c7722a0c8c00205cef03cd2017d57f9c8eba24595d225ab79d289f98626" }, "downloads": -1, "filename": "allofplos-0.8.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "7ac9e8e2a983bde1e2450eabf889c63d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4", "size": 19822, "upload_time": "2017-10-26T00:43:36", "url": "https://files.pythonhosted.org/packages/0d/9f/8ed61bcf1233c7fec04bdba1b9bb5729bfb8706daf6c47a8f03505b53d8f/allofplos-0.8.4-py2.py3-none-any.whl" } ], "0.9.0": [ { "comment_text": "", "digests": { "md5": "69b8dcae1c0fe1e8ecc6ac8d1dc34bd4", "sha256": "0be0af1e71ef7cb6ca77ad64898db7f3ff1807729b5bca5e3e3ef6d56601975f" }, "downloads": -1, "filename": "allofplos-0.9.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "69b8dcae1c0fe1e8ecc6ac8d1dc34bd4", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4", "size": 45106, "upload_time": "2017-11-29T07:41:03", "url": "https://files.pythonhosted.org/packages/34/3c/77337a51182e15b6238c910e9a4d01f0512165908d7b4600853b0c051678/allofplos-0.9.0-py2.py3-none-any.whl" } ], "0.9.5": [ { "comment_text": "", "digests": { "md5": "d3f6b23850580cdc270e38180a6e724d", "sha256": "9821164303598665a72a1fa5fc8542c8b4e582fb13279caaabe9e1587be53e57" }, "downloads": -1, "filename": "allofplos-0.9.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "d3f6b23850580cdc270e38180a6e724d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4", "size": 49277, "upload_time": "2017-12-15T06:20:25", "url": "https://files.pythonhosted.org/packages/05/5f/c9e78479d2402c8b6a46370a7be3a648c56732e37a2a5b7e7d9a48ada8dc/allofplos-0.9.5-py2.py3-none-any.whl" } ], "0.9.6": [ { "comment_text": "", "digests": { "md5": "2a4d42bdfdb1b7f69e1eb7608047a04a", "sha256": "6c68d9223345df9076ae28810dec93fa2f010642df1afdf412fefa55e20b4bf1" }, "downloads": -1, "filename": "allofplos-0.9.6-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "2a4d42bdfdb1b7f69e1eb7608047a04a", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.4", "size": 3460389, "upload_time": "2017-12-28T01:54:54", "url": "https://files.pythonhosted.org/packages/fe/e5/9b3c075e8ef944890b01f2564d294fd0d5627cbb6e6723776aa46ad59b7f/allofplos-0.9.6-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ef93c208e456efa3bf578ae24a1bc3a8", "sha256": "6d978478b4c4849c31ef46d9f05bfdb529e33700aa5ad25af367f8d8a6565f20" }, "downloads": -1, "filename": "allofplos-0.12.0.tar.gz", "has_sig": false, "md5_digest": "ef93c208e456efa3bf578ae24a1bc3a8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.4", "size": 3426634, "upload_time": "2019-08-01T22:59:25", "url": "https://files.pythonhosted.org/packages/3d/6e/b981cc4a2bcfcb5f8aa0fd43867d2a2964b49acc4ac1c0a8a4a8eeb8d9c9/allofplos-0.12.0.tar.gz" } ] }