{ "info": { "author": "Kyle Wilcox", "author_email": "kwilcox@sasascience.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: POSIX :: Linux", "Programming Language :: Python", "Topic :: Scientific/Engineering" ], "description": "metadown\n========\n\nA programatic collector/downloader for IOOS like 19115-2 metadata written in Python.\n\nServices supported:\n\n* [THREDDS](http://www.unidata.ucar.edu/projects/THREDDS/) (ISO service must be enabled)\n* [GeoNetwork](http://geonetwork-opensource.org/)\n\n## Installation\n\nmetadown is available on pypi and is easiest installed using `pip`.\n\n```bash\npip install metadown\n```\n`lxml`, `requests`, and `thredds_crawler` will be installed automatically\n\n## Usage\n\n### THREDDS\n\nThe ThreddsCollector can take three optional arguments. Both are strongly suggested so you don't crawl an\nentire THREDDS server (unless that is what you want to do).\n\n* `debug` (bool) - Print what is happening to the console. Useful for debugging what to put into `selects` and `skips`.\n* `selects` (list) - Select datasets based on their THREDDS ID. Python regex is supported.\n* `skips` (list) - Skip datasets based on their name and catalogRefs based on their xlink:title. By default, the crawler uses some common regular expressions to skip lists of thousands upon thousands of individual files that are part of aggregations or FMRCs (they are below.) Setting the `skip` parameter to anything other than a superset of the defaults below runs the risk of having some angry system admins after you.\n\n * `.*files.*`\n * `.*Individual Files.*`\n * `.*File_Access.*`\n * `.*Forecast Model Run.*`\n * `.*Constant Forecast Offset.*`\n * `.*Constant Forecast Date.*`\n \nYou can access the default `skip` list through the ThreddsCollector.SKIPS class variable\n```python\n> from metadown.collectors.thredds import ThreddsCollector\n> print ThreddsCollector.SKIPS\n[\n '.*files.*',\n '.*Individual Files.*',\n '.*File_Access.*',\n '.*Forecast Model Run.*',\n '.*Constant Forecast Offset.*',\n '.*Constant Forecast Date.*'\n]\n```\n\nIf you need to remove or add a new `skip`, it is **strongly** encouraged you use the `SKIPS` class variable as a starting point!\n\n```python\n> from metadown.collectors.thredds import ThreddsCollector\n> skips = ThreddsCollector.SKIPS + [\".*-Day-Aggregation\"]\n> metadata_urls = ThreddsCollector(\"http://tds.maracoos.org/thredds/MODIS.xml\", selects=[\".*-Agg\"], skips=skips).run()\n> print metadata_urls\n[\n 'http://tds.maracoos.org/thredds/iso/MODIS-Agg.nc?dataset=MODIS-Agg&catalog=http://tds.maracoos.org/thredds/MODIS.xml',\n 'http://tds.maracoos.org/thredds/iso/MODIS-2009-Agg.nc?dataset=MODIS-2009-Agg&catalog=http://tds.maracoos.org/thredds/MODIS.xml',\n 'http://tds.maracoos.org/thredds/iso/MODIS-2010-Agg.nc?dataset=MODIS-2010-Agg&catalog=http://tds.maracoos.org/thredds/MODIS.xml',\n 'http://tds.maracoos.org/thredds/iso/MODIS-2011-Agg.nc?dataset=MODIS-2011-Agg&catalog=http://tds.maracoos.org/thredds/MODIS.xml',\n 'http://tds.maracoos.org/thredds/iso/MODIS-2012-Agg.nc?dataset=MODIS-2012-Agg&catalog=http://tds.maracoos.org/thredds/MODIS.xml',\n 'http://tds.maracoos.org/thredds/iso/MODIS-2013-Agg.nc?dataset=MODIS-2013-Agg&catalog=http://tds.maracoos.org/thredds/MODIS.xml'\n]\n```\n\n\n### GeoNetwork\n\n```python\nfrom metadown.collectors.geonetwork import GeoNetworkCollector\n\ngnc = GeoNetworkCollector(\"http://data.glos.us/metadata\")\nmetadata_urls = gnc.run()\nprint metadata_urls\n[\n ...\n 'http://data.glos.us/metadata/srv/en/iso19139.xml?id=39848', \n 'http://data.glos.us/metadata/srv/en/iso19139.xml?id=39846', \n 'http://data.glos.us/metadata/srv/en/iso19139.xml?id=39845'\n]\n```\n\n\n### Downloading resulting ISO files\n\nOnce you have the metadata urls for the data you want, do whatever you want!\nIf you would like to rename or modify the metadata files, there is a helper class called `XmlDownloader`\n\nXmlDownloader takes in three parameters:\n\n* `url_list` (required) - a list of URLs\n* `download_path` (required) - folder to download files to on your local machine\n* `namer` (optional) - a python function for renaming the metadata files before saving them to your local machine. It should take in a single url and return a string filename for the url to be saved as.\n \n Example `namer` function that renames GeoNetwork URLs\n ```python\n from urlparse import urlsplit\n def geonetwork_renamer(url, **kwargs):\n uid = urlsplit(url).query\n uid = uid[uid.index(\"=\")+1:]\n return \"GeoNetwork-\" + uid + \".xml\"\n ``` \n\n* `modifier` (optional) - a python function for full control over the metadata content. It should take in a single url and return a `str` representation of ISO19115-2.\n\n Example `modifier` function that translates GeoNetwork's ISO19139 to ISO19115-2\n ```python\n from metadown.utils.etree import etree\n def geonetwork_modifier(url, **kwargs):\n gmi_ns = \"http://www.isotc211.org/2005/gmi\"\n etree.register_namespace(\"gmi\",gmi_ns)\n new_root = etree.Element(\"{%s}MI_Metadata\" % gmi_ns)\n old_root = etree.parse(url).getroot()\n # carry over an attributes we need\n [new_root.set(k,v) for k,v in old_root.attrib.items()]\n # carry over children\n [new_root.append(e) for e in old_root]\n return etree.tostring(new_root, encoding=\"UTF-8\", pretty_print=True, xml_declaration=True)\n ```\n\n```python\nfrom metadown.downloader import XmlDownloader\nXmlDownloader.run(metadata_urls, download_directory, namer=geonetwork_renamer, modifier=geonetwork_modifier)\n```", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/asascience-open/metadown", "keywords": null, "license": "GPLv3", "maintainer": null, "maintainer_email": null, "name": "metadown", "package_url": "https://pypi.org/project/metadown/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/metadown/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/asascience-open/metadown" }, "release_url": "https://pypi.org/project/metadown/0.9/", "requires_dist": null, "requires_python": null, "summary": "A Python library for collecting Met/Ocean metadata from various services", "version": "0.9" }, "last_serial": 2867941, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "9d996ef750513f61c8e65d6dacc7a726", "sha256": "47950b0604c13c09d4589669438f707c7cff016e563e64fb53ca562de59dfe2a" }, "downloads": -1, "filename": "metadown-0.1.tar.gz", "has_sig": false, "md5_digest": "9d996ef750513f61c8e65d6dacc7a726", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4640, "upload_time": "2013-07-25T16:55:57", "url": "https://files.pythonhosted.org/packages/ad/6c/fdaa4812752cef212307efe6a3eca4321e415a8650d7811b8c1c53cbbbd3/metadown-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "011291f130522fbd11d442a947f72902", "sha256": "dbbd44ec70ad79beca102c76d357e195811f20c7ee31c14ba3a2880c703b7dcb" }, "downloads": -1, "filename": "metadown-0.2.tar.gz", "has_sig": false, "md5_digest": "011291f130522fbd11d442a947f72902", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5376, "upload_time": "2013-07-29T16:58:00", "url": "https://files.pythonhosted.org/packages/25/96/688c28120a7b10ae0d823d5366d706ca3168430c3041300c861757e0f5cd/metadown-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "d908b78be5cef01c8e99703eab037ed4", "sha256": "383a097701d6d9eae7738d180fd6a3f7d900572cf097b7384856340e5beb3e1f" }, "downloads": -1, "filename": "metadown-0.3.tar.gz", "has_sig": false, "md5_digest": "d908b78be5cef01c8e99703eab037ed4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5740, "upload_time": "2013-07-29T16:59:59", "url": "https://files.pythonhosted.org/packages/fb/9f/97184e674c8b4bfb4044706c3ffa241ec84f02c85b0b5f31a5f47b1517f6/metadown-0.3.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "b9fec2872bd673807d254f938faeed5f", "sha256": "f62416795202b897d69743789a23871a8d531d2f3fa604851762f053e35c1a2f" }, "downloads": -1, "filename": "metadown-0.4.tar.gz", "has_sig": false, "md5_digest": "b9fec2872bd673807d254f938faeed5f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5778, "upload_time": "2013-07-29T17:59:21", "url": "https://files.pythonhosted.org/packages/04/d0/03ff44ff815ce210e7864ead98af74f6ebee2b3a9ffa27da62893610fb05/metadown-0.4.tar.gz" } ], "0.5": [ { "comment_text": "", "digests": { "md5": "892df7fb90bb6e274b40db349ca25cbc", "sha256": "7d2563778904be7568db34244f2e45852979ece51cfb445d4f75fb81e0ffb91b" }, "downloads": -1, "filename": "metadown-0.5.tar.gz", "has_sig": false, "md5_digest": "892df7fb90bb6e274b40db349ca25cbc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5832, "upload_time": "2013-08-02T17:40:20", "url": "https://files.pythonhosted.org/packages/89/ee/5c20bd8ec5e3acacab4a0c8960e6e76f3f9c7e6a471ff1267bdc5a6f4bd2/metadown-0.5.tar.gz" } ], "0.6": [ { "comment_text": "", "digests": { "md5": "84120f9e64f7c367c5fddcbdfa9e73b7", "sha256": "92df31aa89a071ba35a4d84fd5a158deb3a408ad117930fe042fdf1576e0ed7a" }, "downloads": -1, "filename": "metadown-0.6.tar.gz", "has_sig": false, "md5_digest": "84120f9e64f7c367c5fddcbdfa9e73b7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5873, "upload_time": "2013-08-08T16:53:48", "url": "https://files.pythonhosted.org/packages/30/bd/0fed7702007b520af29291e2c2f45ae5d715d75cb04bf7a06b893dc26380/metadown-0.6.tar.gz" } ], "0.8": [ { "comment_text": "", "digests": { "md5": "af246feae8a565af618bacece4b782d0", "sha256": "066108963f355fbe9923e7b3eb5f131c763eb0958f78db64cfe9d856ed9c0122" }, "downloads": -1, "filename": "metadown-0.8.tar.gz", "has_sig": false, "md5_digest": "af246feae8a565af618bacece4b782d0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6532, "upload_time": "2014-04-02T21:26:28", "url": "https://files.pythonhosted.org/packages/b3/a0/e6b7962aaf95a8c40a4db6a12e67480a836e8c219fd28748d09e12420d5e/metadown-0.8.tar.gz" } ], "0.9": [] }, "urls": [] }