{ "info": { "author": "Nat Wilson", "author_email": "natw@fortyninemaps.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6" ], "description": "Dependency resolution for datasets\n==================================\n\n|Build Status|\n\n*depgraph* is a tiny (<500 LOC) Python library for expressing networks\nof datasets and their relationships. In this way, it is superficially\nsimilar to `Airflow `__ and\n`Luigi `__, although those tools\ncontain significantly more functionality.\n\nNetworks are declared in terms of the relationships (graph edges)\nbetween source and target datasets (graph nodes). Target datasets can\nthen report sets of precursor datasets in the correct order. This makes\nit simple to throw together a build script and construct dependencies,\nsequentially or with parallelization.\n\nTraditionally, each ``Dataset`` is designed to correspond to a file. A\n``DatasetGroup`` class handles cases where multiple files can be\nconsidered a single file (e.g. a binary data file and its XML metadata).\n\n Different kinds of resources, such as database tables, can be used\n as long as they can be queried to determine whether they exist (how\n how old they are, in order to tak advantage of age-based incremental\n building).\n\nWhen a ``Dataset`` requires a different dataset to be built to satisfy\nits dependencies, it provides a reason, such as:\n\n- the ``Dataset`` is missing, and so must be built\n- the ``Dataset`` is out of date\n\n*depgraph* is intended to be a reusable component for constructing\nscientific dataset build tools. Important considerations for such a\nbuild tool are that it must:\n\n- permit `reproducible\n analysis `__\n- be documenting so that `a workflow can be easily\n reported `__\n- perform fast rebuilds to enable experimentation\n\nBeyond the standard library, *depgraph* has no dependencies of its own,\nso it is easy to include in projects running on a laptop, on a large\ncluster, or in the cloud. *depgraph* supports modern Python\nimplementations (Python 2, Python 3, PyPy), and works on Linux, OS X,\nand Windows.\n\nImportant parts\n---------------\n\n``Dataset`` defines an individual data product, represented by a\nfilename, *name*. Additional keyword arguments may be provided in order\nto facilitate the build process.\n\nThe ancestors of a dataset can be retrieved with ``Dataset.parents(n)``,\nwhere *n* is the number of generations to include. *n=0* means include\nonly the direct parents, while *n=1* includes grandparents. *n=-1*\nincludes every ancestor. ``Dataset.roots()`` returns the top-level\nancestors, i.e. those with no additional parents.\n\nSimilarly, ``Dataset.children(n)`` yields the descendants of a dataset,\nif any.\n\nRelationships are defined with ``Dataset.dependson(obj)``, where *obj*\nis another ``Dataset`` instance. Relationships can be defined\nprogrammatically to construct large dependency graphs.\n\nA user defined ``build(dataset, reason)`` function (name unimportant)\ntakes a dataset and constructs it based on its ancestors and any other\nattributes of the ``Dataset``. The *reason* is a ``Reason`` object that\nspecifies the motivation for a build step.\n\nThe ``depgraph.buildall()`` function or ``Dataset.buildnext()`` method\ncan be used to obtain ancestor datasets and reason pairs to feed to the\n``build()`` function. Alternatively, the ``build()`` function can be\ndecorated with the ``buildmanager`` decorator, which creates a function\nthat automatically constructs a dataset by assembling its dependencies\nin order (see the examples below).\n\nComplex dependency graphs can be visualized by using the ``graphviz()``\nfunction, which returns a `DOT\nlanguage `__ string\nencoding the visual graph.\n\nExample\n-------\n\nDeclare a set of dependencies resembling the graph below:\n\n::\n\n R0 R1 R2 R3 [raw data]\n \\ / | |\n DA0 DA1 /\n \\ / \\ /\n DB0 DB1\n \\ / | \\\n \\ / | \\\n DC0 DC1 DC2 [products]\n\n.. code:: python\n\n from depgraph import Dataset, buildmanager\n\n # Define Datasets\n # Use an optional keyword `tool` to provide a key instructing our build tool\n # how to assemble this product. Here we've used strings, but another pattern\n # would be to provide a callback function\n R0 = Dataset(\"data/raw0\", tool=\"read_csv\")\n R1 = Dataset(\"data/raw1\", tool=\"read_csv\")\n R2 = Dataset(\"data/raw2\", tool=\"database_query\")\n R3 = Dataset(\"data/raw3\", tool=\"read_hdf\")\n\n DA0 = Dataset(\"step1/da0\", tool=\"merge_fish_counts\")\n DA1 = Dataset(\"step1/da1\", tool=\"process_filter\")\n\n DB0 = Dataset(\"step2/db0\", tool=\"join_counts\")\n DB1 = Dataset(\"step2/db1\", tool=\"join_by_date\")\n\n DC0 = Dataset(\"results/dc0\", tool=\"merge_model_obs\")\n DC1 = Dataset(\"results/dc1\", tool=\"compute_uncertainty\")\n DC2 = Dataset(\"results/dc2\", tool=\"make_plots\")\n\n # Declare dependency relationships so that depgraph and determine the order of\n # the build\n DA0.dependson(R0, R1)\n DA1.dependson(R2)\n DB0.dependson(DA0, DA1)\n DB1.dependson(DA1, R3)\n DC0.dependson(DB0, DB1)\n DC1.dependson(DB1)\n DC2.dependson(DB1)\n\n # Option 1:\n # Define a function that builds individual dependencies. The *buildmanager*\n # decorator transforms it into a loop that builds all dependencies above a\n # target\n @buildmanager\n def batchbuilder(dependency, reason):\n # [....]\n return exitcode\n\n batchbuilder(DC1)\n\n # Option 2:\n # Implement the build loop manually\n from depgraph import buildall\n\n def build(dependency, reason):\n # This may have the same logic as `batchbuilder` above, but we\n # will call it directly rather than wrapping it in @buildmanager\n # [....]\n return exitcode\n\n for stage in buildall(DC1):\n\n # A build stage is a list of dependencies whose own dependencies are met and\n # that are independent, i.e. they can be built in parallel\n\n for dep, reason in stage:\n\n # Each target is a dataset with a 'name' attribute and whatever\n # additional keyword arguments where defined with it.\n # The 'reason' is a depgraph.Reason object that codifies why a\n # particular target is necessary (e.g. it's out of date, it's missing\n # and required by a subsequent target, etc.)\n print(\"Building {0} with {1} because {2}\".format(dep.name, dep.tool,\n reason))\n\n # Call a function or start a subprocess that will result in the\n # target being built and saved to a file\n return_val = build(dep, reason)\n\n # Perform logging, clean-up, or error handling operations\n # [....]\n\nChanges\n-------\n\n0.4\n~~~\n\n- Performance improvements\n- ``buildall`` generator function, which is more efficient than\n repeatedly calling ``Dataset.buildnext()``\n\n0.3\n~~~\n\n- Cyclic graph detection\n- Graphviz export\n\n0.2\n~~~\n\n- Rewrite, dropping ``DependencyGraph`` and making ``Dataset`` the\n primary class\n\n0.1\n~~~\n\n- First version, copied from ``depchain`` module of asputil package\n\n.. |Build Status| image:: https://travis-ci.org/njwilson23/depgraph.svg?branch=master\n :target: https://travis-ci.org/njwilson23/depgraph", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/njwilson23/depgraph", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "data-depgraph", "package_url": "https://pypi.org/project/data-depgraph/", "platform": "", "project_url": "https://pypi.org/project/data-depgraph/", "project_urls": { "Homepage": "https://github.com/njwilson23/depgraph" }, "release_url": "https://pypi.org/project/data-depgraph/0.4.4/", "requires_dist": null, "requires_python": "", "summary": "Micro dependency fulfillment library for scientific datasets", "version": "0.4.4" }, "last_serial": 3318589, "releases": { "0.1.dev0": [ { "comment_text": "", "digests": { "md5": "15fdddf73c21ac63e5f9f8e1f0ffe809", "sha256": "37422d077869e5d111ae5708d62f7e78e24cbfa8c6e9a27f5b8296306f900a30" }, "downloads": -1, "filename": "data-depgraph-0.1.dev0.tar.gz", "has_sig": false, "md5_digest": "15fdddf73c21ac63e5f9f8e1f0ffe809", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6389, "upload_time": "2016-03-16T18:14:05", "url": "https://files.pythonhosted.org/packages/8f/80/a3e17af58dea3c583aadd0439db2751b48b8c41b93a23805740cecca5cde/data-depgraph-0.1.dev0.tar.gz" } ], "0.1.dev1": [ { "comment_text": "", "digests": { "md5": "5b9f8d740b79f312c42c4dc0c32eae45", "sha256": "12b08dabe8ab90407bfb042c49d87d3cb62aa2da93c8a56c2197a381c713db86" }, "downloads": -1, "filename": "data-depgraph-0.1.dev1.tar.gz", "has_sig": false, "md5_digest": "5b9f8d740b79f312c42c4dc0c32eae45", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6394, "upload_time": "2016-03-16T22:29:40", "url": "https://files.pythonhosted.org/packages/45/1c/89220bd861e956b8725774cfb844cf0bcd8164c385202018fd6c9ab98833/data-depgraph-0.1.dev1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "8cbee60241293b4db5086fba1a66d629", "sha256": "e89ecb425596115403f1d224b42c6288103378519badc6a526a3e084bbbd362a" }, "downloads": -1, "filename": "data-depgraph-0.2.tar.gz", "has_sig": false, "md5_digest": "8cbee60241293b4db5086fba1a66d629", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5139, "upload_time": "2016-03-16T22:45:01", "url": "https://files.pythonhosted.org/packages/a4/d1/c8afad6c4ed3a8c5636d7089873048fd559a592c6761c2900fbc86cca018/data-depgraph-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "96468f5ee9b83051ddbfa6415a6d75b5", "sha256": "0e38e62ece2492d72b3b34be6004915aab756729bfd4fa1c0eb19716a42f7d7a" }, "downloads": -1, "filename": "data-depgraph-0.3.tar.gz", "has_sig": false, "md5_digest": "96468f5ee9b83051ddbfa6415a6d75b5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6868, "upload_time": "2016-03-29T19:14:11", "url": "https://files.pythonhosted.org/packages/b6/2a/b24df84164cbd78281177094b7b06a335afc7f977bb9e5411dafb6df4996/data-depgraph-0.3.tar.gz" } ], "0.3.2": [ { "comment_text": "", "digests": { "md5": "2d075cc1dcc64ff84422497762a43e5a", "sha256": "6c4c2e1f4527ef44711545a6ee6e5db71e8844deabc436ad04f0605f107cf065" }, "downloads": -1, "filename": "data-depgraph-0.3.2.tar.gz", "has_sig": false, "md5_digest": "2d075cc1dcc64ff84422497762a43e5a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7309, "upload_time": "2016-04-07T23:30:45", "url": "https://files.pythonhosted.org/packages/3d/da/66e58d2db1f10eb6efe1d5fda3ca059bf18f312ce6c2b98685c069d06260/data-depgraph-0.3.2.tar.gz" } ], "0.3.3": [ { "comment_text": "", "digests": { "md5": "868a5fa5728a9d41eda83c98e6f31f51", "sha256": "896cf81c5fc1ee804893edc56da5edc2ded093b33e436168576661e9cd28f3fd" }, "downloads": -1, "filename": "data-depgraph-0.3.3.tar.gz", "has_sig": false, "md5_digest": "868a5fa5728a9d41eda83c98e6f31f51", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7457, "upload_time": "2016-04-08T05:32:19", "url": "https://files.pythonhosted.org/packages/5b/c9/d8be2c0da727cad5a442f2ef670305e370fc5c8bdae0008a8435bcdadb3c/data-depgraph-0.3.3.tar.gz" } ], "0.3.4": [ { "comment_text": "", "digests": { "md5": "b37423b93302cc5ca5993cb8d158ebb0", "sha256": "ff72f6fd3955be6156f49396cb0d690a0f3f58c4fc52a7b09e47f8b78526504b" }, "downloads": -1, "filename": "data-depgraph-0.3.4.tar.gz", "has_sig": false, "md5_digest": "b37423b93302cc5ca5993cb8d158ebb0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7545, "upload_time": "2016-04-10T17:33:06", "url": "https://files.pythonhosted.org/packages/bc/9d/f570add7b008014823a40e8319d757e3ceabc016e1e425a6a8b014809e31/data-depgraph-0.3.4.tar.gz" } ], "0.3.5": [ { "comment_text": "", "digests": { "md5": "7d2b340308bc01676d337b756f12670f", "sha256": "219167286b317db5dcfefd2c3519d7cb32c6ddc6d55bd2c9d952a1cc2c430fd8" }, "downloads": -1, "filename": "data-depgraph-0.3.5.tar.gz", "has_sig": false, "md5_digest": "7d2b340308bc01676d337b756f12670f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8011, "upload_time": "2016-10-26T18:11:15", "url": "https://files.pythonhosted.org/packages/41/8e/3b1e815363d57146131143406be1dab48feec551003339f01404816707b5/data-depgraph-0.3.5.tar.gz" } ], "0.3.dev0": [ { "comment_text": "", "digests": { "md5": "890f70c439eca3b1225e834d93449a55", "sha256": "2576854badcc230914d8d97f060c934db6fd94918bb9ec919bcb68ed0af810ff" }, "downloads": -1, "filename": "data-depgraph-0.3.dev0.tar.gz", "has_sig": false, "md5_digest": "890f70c439eca3b1225e834d93449a55", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6105, "upload_time": "2016-03-18T17:24:49", "url": "https://files.pythonhosted.org/packages/41/24/ce503240f07f6db87c92e9085d55b656c846b3d5253f51f7582774883875/data-depgraph-0.3.dev0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "d45e0cbe4bbfe44065b7d0e1d639e901", "sha256": "b886bfeb1d53aa6c763448f53d7c588aa0ccf62ef26fb805e905bb7a0028252a" }, "downloads": -1, "filename": "data-depgraph-0.4.0.tar.gz", "has_sig": false, "md5_digest": "d45e0cbe4bbfe44065b7d0e1d639e901", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8310, "upload_time": "2016-12-14T03:29:40", "url": "https://files.pythonhosted.org/packages/32/b9/90f4e79e91f0bfd28a4b4f76d54a1a09e2932804a472dbe31f17a9a1c60b/data-depgraph-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "06c167cd70d90b3948fed5568f3f6ba6", "sha256": "840bac73d61e1ed6444f0380410bea99dba45d6bfef223001173cb4a7ce9fc82" }, "downloads": -1, "filename": "data-depgraph-0.4.1.tar.gz", "has_sig": false, "md5_digest": "06c167cd70d90b3948fed5568f3f6ba6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9054, "upload_time": "2017-06-22T02:17:49", "url": "https://files.pythonhosted.org/packages/35/d2/ec7b2f6dc72d8cb6fbc7461916beea0b55201d04a9305cec4fb24119602e/data-depgraph-0.4.1.tar.gz" } ], "0.4.2": [ { "comment_text": "", "digests": { "md5": "b41350d0bd5e4e47bd20fe645b702910", "sha256": "27cc449da7b5e8dc1428c1f54a90440417b41425db91a95097468c0ffd9f374e" }, "downloads": -1, "filename": "data-depgraph-0.4.2.tar.gz", "has_sig": false, "md5_digest": "b41350d0bd5e4e47bd20fe645b702910", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9175, "upload_time": "2017-06-29T03:50:28", "url": "https://files.pythonhosted.org/packages/54/49/da7e4036aa02217d3854a22ed6a5cf4646339bb8774a0c8d6086a295e6bb/data-depgraph-0.4.2.tar.gz" } ], "0.4.3": [ { "comment_text": "", "digests": { "md5": "1729e584ea0daae94af7f1a2f35e42ac", "sha256": "9153495ebf7a42e373a5769850c1e5c22c2073b62d738ba7de2e158a354992d5" }, "downloads": -1, "filename": "data-depgraph-0.4.3.tar.gz", "has_sig": false, "md5_digest": "1729e584ea0daae94af7f1a2f35e42ac", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9067, "upload_time": "2017-07-21T01:24:31", "url": "https://files.pythonhosted.org/packages/5e/15/d6f65f4331b7418fae0c25c3bf463576dbbe150deb8e0ba6fd9045ac1060/data-depgraph-0.4.3.tar.gz" } ], "0.4.4": [ { "comment_text": "", "digests": { "md5": "7177a65015e5f6bc04b44cb624c0cb54", "sha256": "80538a9ac62eb3939e3452081c910c89395961713a8ec0c9ecd7dfc109993e0a" }, "downloads": -1, "filename": "data-depgraph-0.4.4.tar.gz", "has_sig": false, "md5_digest": "7177a65015e5f6bc04b44cb624c0cb54", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13104, "upload_time": "2017-11-09T05:38:49", "url": "https://files.pythonhosted.org/packages/e6/4a/0819deec7718204e1d6a3558ce1fe914b5843f1516765975012d9aff00ec/data-depgraph-0.4.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7177a65015e5f6bc04b44cb624c0cb54", "sha256": "80538a9ac62eb3939e3452081c910c89395961713a8ec0c9ecd7dfc109993e0a" }, "downloads": -1, "filename": "data-depgraph-0.4.4.tar.gz", "has_sig": false, "md5_digest": "7177a65015e5f6bc04b44cb624c0cb54", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13104, "upload_time": "2017-11-09T05:38:49", "url": "https://files.pythonhosted.org/packages/e6/4a/0819deec7718204e1d6a3558ce1fe914b5843f1516765975012d9aff00ec/data-depgraph-0.4.4.tar.gz" } ] }