{ "info": { "author": "Cedric Canovas", "author_email": "dev@canovas.me", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Overview\n\n`dataenforce` is a Python package used to enforce column names & types of pandas DataFrames using Python 3 type hinting.\n\nIt is a common issue in data analysis to pass dataframes into functions without a clear idea of which columns they contain; as columns are added to or removed from input data, code can break in unexpected ways. With `dataenforce`, you can provide a clear interface to your functions and ensure that the input dataframes will have the right format when your code is used.\n\n# How to install\n\nInstall with pip:\n```\npip install dataenforce\n```\n\nYou can also pip install it from the sources, or just import the `dataenforce` folder.\n\n# How to use\n\nThere are two parts in `dataenforce`: the type-hinting part, and the validation. You can use type-hinting with the provided class to indicate what shape the input dataframes should have, and the validation decorator to additionally ensure the format is respected in every function call.\n\n## Type-hinting: `Dataset`\n\nThe `Dataset` type indicates that we expect a `pandas.DataFrame`.\n\n### Column name checking\n\n```py\nfrom dataenforce import Dataset\n\ndef process_data(data: Dataset[\"id\", \"name\", \"location\"]):\n pass\n```\n\nThe code above specifies that `data` must be a DataFrame with exactly the three listed columns. 
If you only want to specify a required subset of columns, use an ellipsis:\n```py\ndef process_data(data: Dataset[\"id\", \"name\", \"location\", ...]):\n pass\n```\n\n### dtype checking\n\n```py\ndef process_data(data: Dataset[\"id\": int, \"name\": object, \"latitude\": float, \"longitude\": float]):\n pass\n```\n\nThe code above specifies the column names that must be present, along with their dtypes. Columns with and without dtypes can be mixed: `Dataset[\"id\": int, \"name\"]`.\n\n### Reusing dataframe formats\n\nAs you're likely to use the same column subsets several times in your code, you can define them once, then reuse and combine them later:\n```py\nDName = Dataset[\"id\", \"name\"]\nDLocation = Dataset[\"id\", \"latitude\", \"longitude\"]\n\n# Expects columns id, name\ndef process1(data: DName):\n pass\n\n# Expects columns id, name, latitude, longitude, timestamp\ndef process2(data: Dataset[DName, DLocation, \"timestamp\"]):\n pass\n```\n\n## Enforcing: `@validate`\n\nThe `@validate` decorator ensures that input `Dataset`s have the right format when the function is called, and raises a `TypeError` otherwise.\n\n```py\nfrom dataenforce import Dataset, validate\nimport pandas as pd\n\n@validate\ndef process_data(data: Dataset[\"id\", \"name\"]):\n pass\n\nprocess_data(pd.DataFrame(dict(id=[1,2], name=[\"Alice\", \"Bob\"]))) # Works\nprocess_data(pd.DataFrame(dict(id=[1,2]))) # Raises a TypeError: the name column is missing\n```\n\n# How to test\n\n`dataenforce` uses `pytest` as a testing library. 
If you have `pytest` installed, just run `pytest` in the command line while being in the root folder.\n\n# Notes\n\n* You can use `dataenforce` to type-hint the return value of a function, but it is not currently possible to `validate` it (it is not included in the checks)\n* `dataenforce` is released under the Apache License 2.0, meaning you can freely use the library and redistribute it, provided Copyright is kept\n* Dependencies: Pandas & Numpy\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/CedricFR/dataenforce", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "dataenforce", "package_url": "https://pypi.org/project/dataenforce/", "platform": "", "project_url": "https://pypi.org/project/dataenforce/", "project_urls": { "Homepage": "https://github.com/CedricFR/dataenforce" }, "release_url": "https://pypi.org/project/dataenforce/0.1.1/", "requires_dist": null, "requires_python": "", "summary": "Enforce column names & data types of pandas DataFrames", "version": "0.1.1" }, "last_serial": 4698994, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "420ab34498a3d58e638fe802e6e6f6ab", "sha256": "e33b8448c2418cd9f052357e8fbf2263e6dd07b6bc3f28cafb575922d3ac0050" }, "downloads": -1, "filename": "dataenforce-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "420ab34498a3d58e638fe802e6e6f6ab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 3428, "upload_time": "2018-12-21T21:06:40", "url": "https://files.pythonhosted.org/packages/83/71/c727cbd7a257c70f0797ca7cec59ff2ed366879bc9d325fa9d01e6543bfe/dataenforce-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6b17dfc812ec411549e4a97a4dfb9c13", "sha256": "df7faa60c0fa25b8e521ebafb41d58949c4c744b4d0fcf899a77d9f507db32a5" }, "downloads": -1, "filename": "dataenforce-0.0.1.tar.gz", 
"has_sig": false, "md5_digest": "6b17dfc812ec411549e4a97a4dfb9c13", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2152, "upload_time": "2018-12-21T21:06:42", "url": "https://files.pythonhosted.org/packages/2d/63/a126c38d5aa7a7fa9160a487b3fd4c15f8722d3c09b0fc77891790bbe38e/dataenforce-0.0.1.tar.gz" } ], "0.1": [ { "comment_text": "", "digests": { "md5": "44b85213522759d99b2793a5bc756a42", "sha256": "8409fefa21a49a51547c3d15e9a6f375f77f65c53f04f9294d23fa00f11aa8a9" }, "downloads": -1, "filename": "dataenforce-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "44b85213522759d99b2793a5bc756a42", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6198, "upload_time": "2018-12-25T18:37:52", "url": "https://files.pythonhosted.org/packages/f3/fe/99f816f3eefcda16679a3ded3f4542e3a8004888d5263f02a3236edc9442/dataenforce-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "db3c05283a4d6b3f6380a67e60bcb8ca", "sha256": "fb864ed0b199ee68c586e90fa9e107eeb41ebad52cb9200cc5411bd45dd5c9c3" }, "downloads": -1, "filename": "dataenforce-0.1.tar.gz", "has_sig": false, "md5_digest": "db3c05283a4d6b3f6380a67e60bcb8ca", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3899, "upload_time": "2018-12-25T18:37:54", "url": "https://files.pythonhosted.org/packages/f0/87/09fde22c9015f58f8956af35b0e0094b67538715c03816e2dcfc96bbc9be/dataenforce-0.1.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "f628152f34fc95c9b227ea8726d38bd7", "sha256": "5aa6331262bd23f03e998ec27844849a811f53d0485e2045cd8ffb853373ed27" }, "downloads": -1, "filename": "dataenforce-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "f628152f34fc95c9b227ea8726d38bd7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6274, "upload_time": "2019-01-15T13:58:55", "url": 
"https://files.pythonhosted.org/packages/24/b6/2ee08832bdce6f33f068e3378ca59a4276d5b55bf62566f65fb8b289cc8e/dataenforce-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "19a7317de868fde433e95eac09fb2536", "sha256": "0fdc0ee4bbc0929b0a5250b42368657f93c333e4ba3e5df66e0e836d89a5872a" }, "downloads": -1, "filename": "dataenforce-0.1.1.tar.gz", "has_sig": false, "md5_digest": "19a7317de868fde433e95eac09fb2536", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3939, "upload_time": "2019-01-15T13:58:57", "url": "https://files.pythonhosted.org/packages/c7/30/58919486c3fa85032649125d1be8a69a63cac45e049f74a5c57414352663/dataenforce-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f628152f34fc95c9b227ea8726d38bd7", "sha256": "5aa6331262bd23f03e998ec27844849a811f53d0485e2045cd8ffb853373ed27" }, "downloads": -1, "filename": "dataenforce-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "f628152f34fc95c9b227ea8726d38bd7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6274, "upload_time": "2019-01-15T13:58:55", "url": "https://files.pythonhosted.org/packages/24/b6/2ee08832bdce6f33f068e3378ca59a4276d5b55bf62566f65fb8b289cc8e/dataenforce-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "19a7317de868fde433e95eac09fb2536", "sha256": "0fdc0ee4bbc0929b0a5250b42368657f93c333e4ba3e5df66e0e836d89a5872a" }, "downloads": -1, "filename": "dataenforce-0.1.1.tar.gz", "has_sig": false, "md5_digest": "19a7317de868fde433e95eac09fb2536", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3939, "upload_time": "2019-01-15T13:58:57", "url": "https://files.pythonhosted.org/packages/c7/30/58919486c3fa85032649125d1be8a69a63cac45e049f74a5c57414352663/dataenforce-0.1.1.tar.gz" } ] }