{ "info": { "author": "Datopian", "author_email": "contact@datopian.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Harvester Next Generation for CKAN\n\n## Install\n\n```\npip install ckan-harvester\n```\n\n\n### Use data.json sources\n\n```python\nfrom harvester.data_json import DataJSON\ndj = DataJSON()\ndj.url = 'https://data.iowa.gov/data.json'\nret, error = dj.download_data_json()\nprint(ret, error)\n# True None\n\nret, error = dj.load_data_json()\nprint(ret, error)\n# True None\n\nret, errors = dj.validate_json()\nprint(ret, errors)\n# False ['Error validating JsonSchema: \\'bureauCode\\' is a required property ...\n\n# full dict with the source\nprint(dj.data_json)\n\"\"\"\n{\n\t'@context': 'https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld',\n\t'@id': 'https://data.iowa.gov/data.json',\n\t'@type': 'dcat:Catalog',\n\t'conformsTo': 'https://project-open-data.cio.gov/v1.1/schema',\n\t'describedBy': 'https://project-open-data.cio.gov/v1.1/schema/catalog.json',\n\t'dataset': [{\n\t\t'accessLevel': 'public',\n\t\t'landingPage': 'https://data.iowa.gov/d/23jk-3uwr',\n\t\t'issued': '2017-01-30',\n\t\t'@type': 'dcat:Dataset',\n\n ... 
\n\"\"\"\n# just headers\nprint(dj.headers)\n\n\"\"\"\n{\n'@context': 'https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld',\n'@id': 'https://data.iowa.gov/data.json',\n'@type': 'dcat:Catalog',\n'conformsTo': 'https://project-open-data.cio.gov/v1.1/schema',\n'describedBy': 'https://project-open-data.cio.gov/v1.1/schema/catalog.json',\n}\n\"\"\"\n\n# print the title of each dataset\nfor dataset in dj.datasets:\n    print(dataset['title'])\n\n# Impaired Streams 2014\n# 2009-2010 Iowa Public School District Boundaries\n# 2015 - 2016 Iowa Public School District Boundaries\n# Impaired Streams 2010\n# Impaired Lakes 2014\n# 2007-2008 Iowa Public School District Boundaries\n# Impaired Streams 2012\n# 2011-2012 Iowa Public School District Boundaries\n# Active and Completed Watershed Projects - IDALS\n# 2012-2013 Iowa Public School District Boundaries\n# 2010-2011 Iowa Public School District Boundaries\n# 2016-2017 Iowa Public School District Boundaries\n# 2014 - 2015 Iowa Public School District Boundaries\n# Impaired Lakes 2008\n# 2008-2009 Iowa Public School District Boundaries\n# 2013-2014 Iowa Public School District Boundaries\n# Impaired Lakes 2010\n# Impaired Lakes 2012\n# Impaired Streams 2008\n\n```\n\n\n### Use CSW sources\n\n```python\nfrom harvester.csw import CSWSource\ncsw = CSWSource(url='http://data.nconemap.com/geoportal/csw?Request=GetCapabilities&Service=CSW&Version=2.0.2')\ncsw.connect_csw()\n# True\n\ncsw_info = csw.read_csw_info()\nprint('CSW title: {}'.format(csw_info['identification']['title']))\n# CSW title: ArcGIS Server Geoportal Extension 10 - OGC CSW 2.0.2 ISO AP\n```\n\n## Development\n\nTo set up a development environment, clone the repository and install the dependencies in a virtualenv:\n\n```\npip install -r requirements.txt\n```\n\nThis installs the library in development mode, along with the other libraries needed for tests. 
\n\n## Test\n\nRun the test suite with pytest:\n\n```\npytest\n```\n\nWe use [pytest-vcr](https://pytest-vcr.readthedocs.io/en/latest/), based on the wonderful [VCRpy](https://vcrpy.readthedocs.io/en/latest/), to mock HTTP requests. This way the tests don't need to hit the real internet (which is fragile and slow), because a mocked version of each response the tests need is stored in vcr's *cassettes* format. \n\nTo update these *cassettes*, run the following: \n\n```\npytest --vcr-record=all\n```\n\nTo actually hit the internet without using mocks, disable the plugin: \n\n```\npytest --disable-vcr\n```\n\nTests without a CKAN instance:\n\n```\npython -m pytest tests\n\n================ test session starts =============\nplatform linux -- Python 3.6.8, pytest-5.2.0, py-1.8.0, pluggy-0.13.0\nrootdir: /home/hudson/dev/datopian/ckan-ng-harvester-core\nplugins: vcr-1.0.2\ncollected 17 items \n\ntests/test_csw_dataset_adapter.py .... [ 23%]\ntests/test_data_json.py ....... [ 64%]\ntests/test_datajson_dataset_adapter.py .....[100%]\n\n=============== 17 passed in 17.52s ==============\n```\n\nTests with a CKAN instance: \nCopy the settings.py file to local_settings.py and fill in the required values. \nYou can use a local or remote CKAN instance. 
\n\n\n```\npython -m pytest tests_with_ckan/test_harvest.py\n```\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://gitlab.com/datopian/ckan-ng-harvester-core", "keywords": "harvester,CKAN", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "ckan-harvester", "package_url": "https://pypi.org/project/ckan-harvester/", "platform": "", "project_url": "https://pypi.org/project/ckan-harvester/", "project_urls": { "Homepage": "https://gitlab.com/datopian/ckan-ng-harvester-core" }, "release_url": "https://pypi.org/project/ckan-harvester/0.109/", "requires_dist": [ "python-slugify (>=3.0.0)", "requests (>=2.20.0)", "OWSLib (>=0.18.0)", "datapackage (>=1.6.2)", "jsonschema (>=3.0.2)", "rfc3987 (>=1.3.8)", "validate-email (>=1.3)", "Jinja2 (>=2.10.1)", "pathlib (>=1.0.1)", "importlib-resources (>=1.0.2)", "lxml (>=4.4.1)" ], "requires_python": ">=3.6", "summary": "Harvester Next Generation Core for CKAN", "version": "0.109" }, "last_serial": 5938824, "releases": { "0.101": [ { "comment_text": "", "digests": { "md5": "b0c7f89ba59b4c0b8a7001c5b6e7fdab", "sha256": "173480a4406613441976899ed6f9d70950599bb71bf8a6b0394a4adf9e7a7f61" }, "downloads": -1, "filename": "ckan_harvester-0.101-py3-none-any.whl", "has_sig": false, "md5_digest": "b0c7f89ba59b4c0b8a7001c5b6e7fdab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 22555, "upload_time": "2019-09-02T19:58:16", "url": "https://files.pythonhosted.org/packages/6a/f3/fa97dab6f8d7f5f1c981e762e8d93fa960ad38bf5e661757091dce4afddf/ckan_harvester-0.101-py3-none-any.whl" } ], "0.104": [ { "comment_text": "", "digests": { "md5": "2368ed051161850843b653328ef35c5f", "sha256": "25142478818003b8e11281668141f3fd826a681296ab60a68a59bcc27e343aa0" }, "downloads": -1, "filename": "ckan_harvester-0.104-py3-none-any.whl", "has_sig": false, "md5_digest": 
"2368ed051161850843b653328ef35c5f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23130, "upload_time": "2019-09-02T20:12:10", "url": "https://files.pythonhosted.org/packages/29/c4/e238c0d6ad2471445c27226064d7529bf01e26c9db04afe157db58a85a4e/ckan_harvester-0.104-py3-none-any.whl" } ], "0.105": [ { "comment_text": "", "digests": { "md5": "ba19b6db455874765f6f574fde78c469", "sha256": "ae5a6cc4a267d947176960ff538462929d3b091ecd7ba602347ef2d93d24d3fa" }, "downloads": -1, "filename": "ckan_harvester-0.105-py3-none-any.whl", "has_sig": false, "md5_digest": "ba19b6db455874765f6f574fde78c469", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23130, "upload_time": "2019-09-03T12:55:35", "url": "https://files.pythonhosted.org/packages/21/34/e191360da5d36f7d1e7a704b5b4c134bcf1ba23fcfa33f14640d072aa112/ckan_harvester-0.105-py3-none-any.whl" } ], "0.106": [ { "comment_text": "", "digests": { "md5": "155892184027cc57774282fa7f0159ba", "sha256": "dde2738238344f751fba72d6a5c10812f986c42e97dec813785eb22c3f133419" }, "downloads": -1, "filename": "ckan_harvester-0.106-py3-none-any.whl", "has_sig": false, "md5_digest": "155892184027cc57774282fa7f0159ba", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 31640, "upload_time": "2019-09-03T13:22:14", "url": "https://files.pythonhosted.org/packages/3c/3f/5c1767b13ccb8aad7b69fe51a25fd7f8c39c6080d1ddaff7a61bdec3cc4f/ckan_harvester-0.106-py3-none-any.whl" } ], "0.108": [ { "comment_text": "", "digests": { "md5": "aa8abc24f7e588f95ab2f5f8ec6ab6bf", "sha256": "f0ede7ad45c0e62b69bb9c958fa0127f39da995e75a3aee6d115212c117a6a82" }, "downloads": -1, "filename": "ckan_harvester-0.108-py3-none-any.whl", "has_sig": false, "md5_digest": "aa8abc24f7e588f95ab2f5f8ec6ab6bf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 46766, "upload_time": "2019-10-02T15:55:28", "url": 
"https://files.pythonhosted.org/packages/0c/20/ea823680e6849e519f651f8acad97081ddfabfc27e7f076c1c91379fcd1a/ckan_harvester-0.108-py3-none-any.whl" } ], "0.109": [ { "comment_text": "", "digests": { "md5": "81e25359358edd38a4568ac16a456e1e", "sha256": "6a98ff4316d8a649e233164941226945f1f4d36811517db7c846136db93df685" }, "downloads": -1, "filename": "ckan_harvester-0.109-py3-none-any.whl", "has_sig": false, "md5_digest": "81e25359358edd38a4568ac16a456e1e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 46809, "upload_time": "2019-10-07T13:51:37", "url": "https://files.pythonhosted.org/packages/0e/bc/60d6127dff6f50d9bd4ee133bf238422735b035fd194e3d16c62af75766e/ckan_harvester-0.109-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "81e25359358edd38a4568ac16a456e1e", "sha256": "6a98ff4316d8a649e233164941226945f1f4d36811517db7c846136db93df685" }, "downloads": -1, "filename": "ckan_harvester-0.109-py3-none-any.whl", "has_sig": false, "md5_digest": "81e25359358edd38a4568ac16a456e1e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 46809, "upload_time": "2019-10-07T13:51:37", "url": "https://files.pythonhosted.org/packages/0e/bc/60d6127dff6f50d9bd4ee133bf238422735b035fd194e3d16c62af75766e/ckan_harvester-0.109-py3-none-any.whl" } ] }