{ "info": { "author": "John Harrison", "author_email": "john@bloomonkey.co.uk", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Information Technology", "License :: OSI Approved :: BSD License", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Text Processing :: Markup" ], "description": "OAI-PMH Harvest\n===============\n\n.. image:: https://travis-ci.org/bloomonkey/oai-harvest.svg?branch=master\n :target: https://travis-ci.org/bloomonkey/oai-harvest\n\n.. image:: https://img.shields.io/pypi/v/oaiharvest.svg\n :target: https://pypi.python.org/pypi/oaiharvest\n :alt: Latest Version\n\n.. image:: https://img.shields.io/pypi/l/oaiharvest.svg\n :target: LICENSE.rst\n :alt: license:BSD\n\n\nContents\n--------\n\n- `Description`_\n- `Author(s)`_\n- `Latest Version`_\n- `Documentation`_\n- `Requirements / Dependencies`_\n- `Installation`_\n- `Bugs, Feature requests etc.`_\n- `Copyright And Licensing`_\n- `Examples`_\n\n\nDescription\n-----------\n\nA harvester to collect records from an OAI-PMH enabled provider.\n\nThe harvester can be used to carry out one-time harvesting of all\nrecords from a particular OAI-PMH provider by giving its base URL. It\ncan also be used for selective harvesting, e.g. to harvest only records\nupdated after, or before specified dates.\n\nTo assist in regular harvesting from one or more OAI-PMH providers,\nthere's a provider registry. It is possible to associate a short\nmemorable name for a provider with its base URLs, destination directory\nfor harvested records, and the format (metadataPrefix) in which records\nshould be harvested. 
The registry will also record the date and time of\nthe most recent harvest, and automatically add this to subsequent\nrequests in order to avoid repeatedly harvesting unmodified records.\n\nThis could be used in conjunction with a scheduler (e.g. CRON) to\nmaintain a reasonably up-to-date copy of the records held by one or\nmore providers. `Examples`_ of how to accomplish these tasks are available\nbelow.\n\n\nAuthor(s)\n---------\n\nJohn Harrison at the `University of Liverpool`_ \n\n\nLatest Version\n--------------\n\nThe latest release version is available in the Python Package Index:\n\nhttps://pypi.python.org/pypi/oaiharvest\n\n.. image:: https://img.shields.io/pypi/v/oaiharvest.svg\n :target: https://pypi.python.org/pypi/oaiharvest\n :alt: Latest PyPI Version\n\n\nSource code is under version control and available from:\n\nhttp://github.com/bloomonkey/oai-harvest\n\n\nDocumentation\n-------------\n\nAll executable commands are self-documenting, i.e. you can get help on\nhow to use them with the ``-h`` or ``--help`` option.\n\nAt this time the only additional documentation that exists can be found\nin this README file!\n\n\nRequirements / Dependencies\n---------------------------\n\n- Python_ >= 2.7 or Python 3.x\n- pyoai_\n- lxml_\n- sqlite3_\n\nNote that Python 3.x support requires pyoai 2.4.6+.\n\nAs this release is not yet available on PyPI, use\n``pip3 install git+https://github.com/infrae/pyoai.git``\n\nPython 3 support is still in beta and might have some bugs.\n\nInstallation\n------------\n\nUsers\n~~~~~\n\n``pip install git+http://github.com/bloomonkey/oai-harvest.git#egg=oaiharvest``\n\n\nDevelopers\n~~~~~~~~~~\n\nI recommend that you use virtualenv_ to isolate your development\nenvironment from system Python_ and any packages that may be installed\nthere.\n\n1. In GitHub_, fork the repository\n\n2. Clone your fork::\n\n git clone git@github.com:/oai-harvest.git\n\n3. Set up a development virtualenv using tox::\n\n pip install tox\n tox -e dev\n\n4. 
Activate development virtualenv:\n\n -nix::\n\n source env/bin/activate\n\n Windows::\n\n env\\Scripts\\activate\n\n\nBugs, Feature requests etc.\n---------------------------\n\nBug reports and feature requests can be submitted to the GitHub issue\ntracker:\nhttp://github.com/bloomonkey/oai-harvest/issues\n\nIf you'd like to contribute code, patches etc. please email the author,\nor submit a pull request on GitHub.\n\n\nCopyright And Licensing\n-----------------------\n\n\n.. image:: https://img.shields.io/pypi/l/oaiharvest.svg\n :target: LICENSE.rst\n :alt: license:BSD\n\nCopyright (c) `University of Liverpool`_, 2013-2014\n\nSee `LICENSE.rst `_ for licensing details.\n\n\nExamples\n--------\n\nHarvesting records from an OAI-PMH provider URL\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nAll records\n'''''''''''\n\n::\n\n oai-harvest http://example.com/oai\n\n\nRecords modified since a certain date\n'''''''''''''''''''''''''''''''''''''\n\n::\n\n oai-harvest --from 2013-01-01 http://example.com/oai\n\n\nRecords from a named set\n''''''''''''''''''''''''\n\n::\n\n oai-harvest --set \"some:set\" http://example.com/oai\n\n\nLimiting the number of records to harvest\n'''''''''''''''''''''''''''''''''''''''''\n\n::\n\n oai-harvest --limit 50 http://example.com/oai\n\n\nGetting help on all available options\n'''''''''''''''''''''''''''''''''''''\n\n::\n\n oai-harvest --help\n\n\nOAI-PMH Provider Registry\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\nAdding a provider\n'''''''''''''''''\n\n::\n\n oai-reg add provider1 http://example.com/oai/1\n\n\nIf you don't supply ``--metadataPrefix`` and ``--directory`` options,\nyou will be interactively prompted to supply alternatives, or accept\nthe defaults.\n\n\nRemoving an existing provider\n'''''''''''''''''''''''''''''\n\n::\n\n oai-reg rm provider1 [provider2]\n\n\nListing existing providers\n''''''''''''''''''''''''''\n\n::\n\n oai-reg list\n\n\nHarvesting from OAI-PMH providers in the 
registry\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYou can harvest from one or more providers in the registry using the\nshort names that they were registered with::\n\n oai-harvest provider1 [provider2]\n\n\nBy default, this will harvest all records modified since the last\nharvest from each provider. You can override this behavior using the\n``--from`` and ``--until`` options.\n\nYou can also harvest from all providers in the registry::\n\n oai-harvest all\n\n\nScheduling Regular Harvesting\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIn order to maintain a reasonably up-to-date copy of all the\nrecords held by those providers, one could configure a scheduler to\nperiodically harvest from all registered providers. For example, to\ntell CRON to harvest from all providers at 2am every day, one might\nadd the following to crontab::\n\n 0 2 * * * oai-harvest all\n\n\n.. Links\n.. _Python: http://www.python.org/\n.. _pyoai: https://pypi.python.org/pypi/pyoai\n.. _PyPI: https://pypi.python.org/pypi\n.. _lxml: https://pypi.python.org/pypi/lxml\n.. _sqlite3: http://www.sqlite.org/\n.. _`University of Liverpool`: http://www.liv.ac.uk\n.. _GitHub: http://github.com\n.. 
_virtualenv: http://www.virtualenv.org/en/latest/\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/bloomonkey/oai-harvest", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "oaiharvest", "package_url": "https://pypi.org/project/oaiharvest/", "platform": "", "project_url": "https://pypi.org/project/oaiharvest/", "project_urls": { "Homepage": "http://github.com/bloomonkey/oai-harvest" }, "release_url": "https://pypi.org/project/oaiharvest/3.0.0/", "requires_dist": null, "requires_python": "", "summary": "A harvester to collect records from an OAI-PMH enabled provider.", "version": "3.0.0" }, "last_serial": 3295555, "releases": { "0.0": [], "1.0": [ { "comment_text": "", "digests": { "md5": "f3c252cda220c445a4eb5fc12426ca40", "sha256": "5a1b0a6691c777cbca2a754b3f591b164abe98b8233ef31afb07d167d02c549b" }, "downloads": -1, "filename": "oaiharvest-1.0.tar.gz", "has_sig": false, "md5_digest": "f3c252cda220c445a4eb5fc12426ca40", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16197, "upload_time": "2013-04-15T09:33:00", "url": "https://files.pythonhosted.org/packages/64/16/199d2a0517b5ced34e555ef5104efff5a0828f05ffff79c66dae5e4f4ee0/oaiharvest-1.0.tar.gz" } ], "1.0c1": [ { "comment_text": "", "digests": { "md5": "5ae8b30766af14fbfb657416e9375062", "sha256": "ffc6e46fb33298ba327d91e04347b07516ddb26f9d294617a0d79472624ee71b" }, "downloads": -1, "filename": "oaiharvest-1.0c1.tar.gz", "has_sig": false, "md5_digest": "5ae8b30766af14fbfb657416e9375062", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15850, "upload_time": "2013-03-27T17:07:56", "url": "https://files.pythonhosted.org/packages/db/56/9d734f33c967412ca9ba96ad77912bc11524777692340730552bcfc28dcf/oaiharvest-1.0c1.tar.gz" } ], "1.1": [ { "comment_text": "", "digests": { "md5": 
"0eafb7c3c7dbfce8cc07c6dff9ec4589", "sha256": "5d5e810c69a056b97f7a8f0c69b992d0a8bd2404e92407965484d3dd3702718e" }, "downloads": -1, "filename": "oaiharvest-1.1.tar.gz", "has_sig": false, "md5_digest": "0eafb7c3c7dbfce8cc07c6dff9ec4589", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17007, "upload_time": "2013-05-09T16:06:36", "url": "https://files.pythonhosted.org/packages/fd/0c/b7e002664855c393a68eed0fdd74d33e5bb2d8db6c58c033af5724c1a43d/oaiharvest-1.1.tar.gz" } ], "1.1.1": [ { "comment_text": "", "digests": { "md5": "04270005da2d9607d0383a188bf63592", "sha256": "9ad447bbdf7651be28066adbe26a5e3f1c64205ebb381a1ec1af032206858201" }, "downloads": -1, "filename": "oaiharvest-1.1.1.tar.gz", "has_sig": false, "md5_digest": "04270005da2d9607d0383a188bf63592", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17373, "upload_time": "2013-05-10T10:34:07", "url": "https://files.pythonhosted.org/packages/8e/65/c9b4226c6bf9b4615c6b23adee1a860f5ef35d9a52b326374c9129f63d81/oaiharvest-1.1.1.tar.gz" } ], "1.1.2": [ { "comment_text": "", "digests": { "md5": "8accb74490fc2dc5d3766476e721e1a2", "sha256": "5fe9163619a982dff0122e0d7daad5b96b56f596628571f014dae93c28cc93b1" }, "downloads": -1, "filename": "oaiharvest-1.1.2.tar.gz", "has_sig": false, "md5_digest": "8accb74490fc2dc5d3766476e721e1a2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17760, "upload_time": "2013-07-24T09:54:28", "url": "https://files.pythonhosted.org/packages/f2/fd/688390ae4c7bba110c1877c59381482060ccbaaaec245712116aced1c885/oaiharvest-1.1.2.tar.gz" } ], "1.1.3": [ { "comment_text": "", "digests": { "md5": "9a85bfcef17012cba7f2b624f376b064", "sha256": "5372981ecbe93dc5c921eed6c488089ca42add61a49b03caa2df1aaad4ae44a3" }, "downloads": -1, "filename": "oaiharvest-1.1.3.tar.gz", "has_sig": false, "md5_digest": "9a85bfcef17012cba7f2b624f376b064", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 18012, "upload_time": "2013-10-14T12:09:47", "url": "https://files.pythonhosted.org/packages/e9/cb/847e2da83861f82a11c48be05e16429728f48cf681c3359433b3e59db657/oaiharvest-1.1.3.tar.gz" } ], "1.1.4": [ { "comment_text": "", "digests": { "md5": "5c451b8770f49161d7b490a486788d8b", "sha256": "3f75cd1dc6d3d584f51c26201039cd3f70284c82f29d2a4a99cc58c33179ac2e" }, "downloads": -1, "filename": "oaiharvest-1.1.4.tar.gz", "has_sig": false, "md5_digest": "5c451b8770f49161d7b490a486788d8b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19573, "upload_time": "2014-04-04T09:00:42", "url": "https://files.pythonhosted.org/packages/ce/3b/2a313d4c5023e26373d38eda378cd2f893122b29a1ef8b916d596589c868/oaiharvest-1.1.4.tar.gz" } ], "1.1.5": [ { "comment_text": "", "digests": { "md5": "976c7d540d6d6ab4134426118f0dc498", "sha256": "af93c2ecd0167cc1318291c1b84d6d0ec4e802dcbedf79ccbe916dbc03f01755" }, "downloads": -1, "filename": "oaiharvest-1.1.5.tar.gz", "has_sig": false, "md5_digest": "976c7d540d6d6ab4134426118f0dc498", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19848, "upload_time": "2014-04-28T15:30:29", "url": "https://files.pythonhosted.org/packages/94/db/d85fb3f02efbb047dc1d9f5e83c75b91351f155a7fdc68fe17190b9093be/oaiharvest-1.1.5.tar.gz" } ], "1.1c1": [], "1.2.dev": [ { "comment_text": "", "digests": { "md5": "406bb9f85f594b766a9affd410c4a7fd", "sha256": "7031984f5fb11997d82c63ae84d6896472948a363951a08c43c587cce92a77de" }, "downloads": -1, "filename": "oaiharvest-1.2.dev.tar.gz", "has_sig": false, "md5_digest": "406bb9f85f594b766a9affd410c4a7fd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19783, "upload_time": "2014-04-28T15:19:37", "url": "https://files.pythonhosted.org/packages/cd/af/fc1fcdfa5d13a1ae6d326505e0c649f2d8f46d8d28dda233d6e9cadca5ca/oaiharvest-1.2.dev.tar.gz" } ], "2.0.0": [ { "comment_text": "", "digests": { "md5": 
"01e55e53f4e7fa65f3a2d96471ab29f9", "sha256": "042ad1d17a4f59c96bb550581fa8d71dc032baa7472fc2d328dbc848414c85b8" }, "downloads": -1, "filename": "oaiharvest-2.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "01e55e53f4e7fa65f3a2d96471ab29f9", "packagetype": "bdist_wheel", "python_version": "3.5", "requires_python": null, "size": 17485, "upload_time": "2017-04-19T15:41:08", "url": "https://files.pythonhosted.org/packages/ff/f0/31a352a83a1d6edce14f78efa0724cd94a7c85b0fe37f57c663df3fc99ae/oaiharvest-2.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b6200f01e81f47e79733cca4ae4fea37", "sha256": "724231763c2393ffb9b25d75c7064831a3b5f0895973a7fa61a2a29e5a59b5fe" }, "downloads": -1, "filename": "oaiharvest-2.0.0.tar.gz", "has_sig": false, "md5_digest": "b6200f01e81f47e79733cca4ae4fea37", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16327, "upload_time": "2017-04-19T15:41:06", "url": "https://files.pythonhosted.org/packages/95/b4/e7478ba201e6e3b08070232712f463cef4a086c136d93fadf2a38ce3c0f7/oaiharvest-2.0.0.tar.gz" } ], "3.0.0": [ { "comment_text": "", "digests": { "md5": "a05b2af3e77dae995c65c99c54ef9eb6", "sha256": "280d3643154027163392596f1aa10121f9a9e8324b5985750b1d3a0170f34c3d" }, "downloads": -1, "filename": "oaiharvest-3.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a05b2af3e77dae995c65c99c54ef9eb6", "packagetype": "bdist_wheel", "python_version": "3.5", "requires_python": null, "size": 19227, "upload_time": "2017-10-31T22:06:37", "url": "https://files.pythonhosted.org/packages/cc/69/4957f4d49e8ef67caca21063fc7202bb8629703fc87aae725fd0b0b19fc7/oaiharvest-3.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "98e46e5239ff1923270d2a7611a34f4f", "sha256": "ec8e1bee0b26f17ac1c2a422d4e007f385dbd9d4201b32a36ebf836d9fb1f7dc" }, "downloads": -1, "filename": "oaiharvest-3.0.0.tar.gz", "has_sig": false, "md5_digest": "98e46e5239ff1923270d2a7611a34f4f", "packagetype": 
"sdist", "python_version": "source", "requires_python": null, "size": 17691, "upload_time": "2017-10-31T22:06:34", "url": "https://files.pythonhosted.org/packages/02/75/f30cdf454fe4153d5d98fbbac4d3364b0972237ee45610206807299a2d7e/oaiharvest-3.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "a05b2af3e77dae995c65c99c54ef9eb6", "sha256": "280d3643154027163392596f1aa10121f9a9e8324b5985750b1d3a0170f34c3d" }, "downloads": -1, "filename": "oaiharvest-3.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a05b2af3e77dae995c65c99c54ef9eb6", "packagetype": "bdist_wheel", "python_version": "3.5", "requires_python": null, "size": 19227, "upload_time": "2017-10-31T22:06:37", "url": "https://files.pythonhosted.org/packages/cc/69/4957f4d49e8ef67caca21063fc7202bb8629703fc87aae725fd0b0b19fc7/oaiharvest-3.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "98e46e5239ff1923270d2a7611a34f4f", "sha256": "ec8e1bee0b26f17ac1c2a422d4e007f385dbd9d4201b32a36ebf836d9fb1f7dc" }, "downloads": -1, "filename": "oaiharvest-3.0.0.tar.gz", "has_sig": false, "md5_digest": "98e46e5239ff1923270d2a7611a34f4f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17691, "upload_time": "2017-10-31T22:06:34", "url": "https://files.pythonhosted.org/packages/02/75/f30cdf454fe4153d5d98fbbac4d3364b0972237ee45610206807299a2d7e/oaiharvest-3.0.0.tar.gz" } ] }