{ "info": { "author": "Daniel van der Ende", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "[![Build Status](https://travis-ci.com/danielvdende/opulent-pandas.svg?token=km81qsbsLrgZWGfcfi7a&branch=master)](https://travis-ci.com/danielvdende/opulent-pandas)\n[![PyPI version](https://badge.fury.io/py/opulent-pandas.svg)](https://badge.fury.io/py/opulent-pandas)\n# Opulent-Pandas\nOpulent-Pandas is a schema validation packages aimed specifically at validating the schema of pandas dataframes. \nIt takes heavy inspiration from [voluptuous](https://github.com/alecthomas/voluptuous), and tries to stay as close as possible to the API defined in this package. Opulent-Pandas\nis different from voluptuous in that it heavily relies on [Pandas](https://pandas.pydata.org/) to perform the validation. This makes Opulent-Pandas considerably faster\nthan voluptuous on larger datasets. It does, however, mean that the input format is also a Pandas DataFrame, rather than a dict (as is the case for voluptuous)\nA performance comparison of voluptuous and Opulent-Pandas will be added to this readme soon!\n\n## Example\nDefining a schema in Opulent-Pandas is very similar to how you would in voluptuous. To make the similarities and differences clear, let's walk through the same example as is done in the voluptuous readme.\n\nTwitter's [user search API](https://dev.twitter.com/rest/reference/get/users/search) accepts\nquery URLs like:\n\n```\n$ curl 'https://api.twitter.com/1.1/users/search.json?q=python&per_page=20&page=1'\n```\n\nTo validate this we might use a schema like:\n\n```pycon\n>>> from opulent_pandas import Schema, TypeValidator, Required\n>>> schema = Schema({\n... Required('q'): [TypeValidator(str)],\n... Required('per_page'): [TypeValidator(int)],\n... Required('page'): [TypeValidator(int)],\n... })\n\n```\nComparing with voluptuous, you'll notice that the validators per field are always specified as a list. Other than that,\nit's very similar to how you would define the schema with voluptuous\n\nIf we look at the more complex schema, as defined in the readme of voluptuous, we see very similar schemas:\n\n```pycon\n>>> from opulent_pandas.validator import Required, RangeValidator, TypeValidator, ValueLengthValidator \n>>> schema = Schema({\n... Required('q'): [TypeValidator(str), ValueLengthValidator(min_length=1)],\n... Required('per_page'): [TypeValidator(int), RangeValidator(min=1, max=20)],\n... Required('page'): [TypeValidator(int), RangeValidator(min=0)],\n... })\n\n```\n\nOne difference between Opulent-Pandas and voluptuous is that Opulent-Pandas has a `validate` function that can be used\nto validate a given data structure rather tha voluptuous' approach of passing the data directly to your schema as a parameter. \n\nIf you pass data in that does not satisfy the requirements specified in your Opulent-Pandas schema, you'll get a corresponding error message. Walking\nthrough the examples provided in the voluptuous readme:\n\nThere are 3 required fields:\nTODO: this example should also tell you which columns are missing. Seems to be a bug.\n```pycon\n>>> from opulent_pandas import MissingColumnError\n>>> try:\n... schema.validate({})\n... raise AssertionError('MissingColumnError not raised')\n... except MissingColumnError as e:\n... exc = e\n>>> str(exc) == \"Columns missing\"\nTrue\n\n```\n\n`q` must be a string:\n\n```pycon\n>>> from opulent_pandas import InvalidTypeError\n>>> try:\n... schema.validate(pd.DataFrame({'q': [123], 'per_page':[10], 'page': [1]})\n... raise AssertionError('InvalidTypeError not raised')\n... except InvalidTypeError as e:\n... exc = e\n>>> str(exc) == \"Invalid data type found for column: q. Required: \"\nTrue\n\n```\n\n...and must be at least one character in length:\n\n```pycon\n>>> from opulent_pandas import ValueLengthError\n>>> try:\n... schema.validate(pd.DataFrame({'q': [''], 'per_page': 5, 'page': 12}))\n... raise AssertionError('ValueLengthError not raised')\n... except ValueLengthError as e:\n... exc = e\n>>> str(exc) == \"Value found with length smaller than enforced minimum length for column: q. Minimum Length: 1\"\nTrue\n\n```\n\n\"per\\_page\" is a positive integer no greater than 20:\n\n```pycon\n>>> from opulent_pandas import RangeError\n>>> try:\n... schema.validate(pd.DataFrame({'q': ['#topic'], 'per_page': [900], 'page': [12]}))\n... raise AssertionError('RangeError not raised')\n... except RangeError as e:\n... exc = e\n>>> str(exc) == \"Value found larger than enforced maximum for column: per_page. Required maximum: 20\"\nTrue\n\n>>> try:\n... schema.validate(pd.DataFrame({'q': ['#topic'], 'per_page': [-10], 'page': [12]}))\n... raise AssertionError('RangeError not raised')\n... except RangeError as e:\n... exc = e\n>>> str(exc) == \"Value found larger than enforced minimum for column: per_page. Required minimum: 1\"\nTrue\n\n```\n\n\"page\" is an integer \\>= 0:\n\n```pycon\n>>> try:\n... schema.validate(pd.DataFrame({'q': ['#topic'], 'per_page': ['one']})\n... raise AssertionError('InvalidTypeError not raised')\n... except InvalidTypeError as e:\n... exc = e\n>>> str(exc) == \"Invalid data type found for column: page. Required type: \"\nTrue\n\n```\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "opulent-pandas", "package_url": "https://pypi.org/project/opulent-pandas/", "platform": "", "project_url": "https://pypi.org/project/opulent-pandas/", "project_urls": null, "release_url": "https://pypi.org/project/opulent-pandas/0.0.4/", "requires_dist": [ "pandas (==0.23.4)", "flake8 (==3.5.0) ; extra == 'lint'", "pytest (==4.0.2) ; extra == 'test'" ], "requires_python": "", "summary": "A package to validate the schema of a pandas dataframe", "version": "0.0.4" }, "last_serial": 5288085, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "1ec4b0c4592a0d4ffd28e674d82479db", "sha256": "43773afad252b6dddfe0b9c57b59a2c6de8e5b7e1a4e5283c1387cbeea39b5cb" }, "downloads": -1, "filename": "opulent_pandas-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "1ec4b0c4592a0d4ffd28e674d82479db", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4440, "upload_time": "2018-12-20T12:40:38", "url": "https://files.pythonhosted.org/packages/67/89/7a9f278a4a469ce4fa1969d06160791fb258345c425516d166248cae40b6/opulent_pandas-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c91880e4e86d6baa53cdad0ba4e827b8", "sha256": "14a0db9d494c79f55a5a8b0abc30c6c9ce80a9a43f4752f7bab0a3a1dd271287" }, "downloads": -1, "filename": "opulent-pandas-0.0.1.tar.gz", "has_sig": false, "md5_digest": "c91880e4e86d6baa53cdad0ba4e827b8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2299, "upload_time": "2018-12-20T12:40:40", "url": "https://files.pythonhosted.org/packages/e1/a5/f937a5db6d30d7acf1df182916ca71a7f16660c29cf341f3c89817c6563c/opulent-pandas-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "d1769667b2a6e03d9c1da6167e284eb7", "sha256": "2f018ebcc17168a740b3fc7857bc66e7241a798acc02c887fc51771927efc031" }, "downloads": -1, "filename": "opulent_pandas-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "d1769667b2a6e03d9c1da6167e284eb7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5438, "upload_time": "2019-01-15T15:25:44", "url": "https://files.pythonhosted.org/packages/87/99/d5cd5e76acdb927424df63c9a629a6ed554ebd41aff853fb9d96b7d121d8/opulent_pandas-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1db305ed761ee2b3d2e553f5779b3a17", "sha256": "c024eadc70bc2d34b0ba4d3100562bd28d65a9c05c61af9d0f1ba79a8df72844" }, "downloads": -1, "filename": "opulent-pandas-0.0.2.tar.gz", "has_sig": false, "md5_digest": "1db305ed761ee2b3d2e553f5779b3a17", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3126, "upload_time": "2019-01-15T15:25:46", "url": "https://files.pythonhosted.org/packages/d3/4d/51cbd2f99a35d689037be8e6c9d6a831e191d4e99a59786f88a1e5ada843/opulent-pandas-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "505b9b8f78516914bad3047a0c9c25cd", "sha256": "ab37cc4f3c3f0504c78aeaea3b4c280333f81e60dae6338e7b5bb6dc72eb55c2" }, "downloads": -1, "filename": "opulent_pandas-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "505b9b8f78516914bad3047a0c9c25cd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5465, "upload_time": "2019-01-17T08:06:15", "url": "https://files.pythonhosted.org/packages/67/a5/379c29dbd7454d8d5ad92209d258b463673f242d0fedf4bc1712669156d7/opulent_pandas-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c769e20809244c394487f7cdcd16632c", "sha256": "05bfed4e6b8bb8a780ac17b60a1ac74732cae083571fcb5f6bb3b1d3b66aaf73" }, "downloads": -1, "filename": "opulent-pandas-0.0.3.tar.gz", "has_sig": false, "md5_digest": "c769e20809244c394487f7cdcd16632c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3132, "upload_time": "2019-01-17T08:06:17", "url": "https://files.pythonhosted.org/packages/5a/3c/7a4634c2767188a4af2f0c0cfdfd0e09262546e7ae9672eb05556b895faa/opulent-pandas-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "6bfeebe59431c235d416b70f2d65263f", "sha256": "f6babbd3fa1e8b2d9a12927017f620f80fdf35fed55fe8cb67ca3f37b2882697" }, "downloads": -1, "filename": "opulent_pandas-0.0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "6bfeebe59431c235d416b70f2d65263f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10040, "upload_time": "2019-05-19T12:01:00", "url": "https://files.pythonhosted.org/packages/b0/96/caf421e6ad9c8d964cf925d88a9a3995d098439f7eca3bbc35e2c0f26f3c/opulent_pandas-0.0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3cb181e91a394775ccd6212036260bc8", "sha256": "c8615b0827907b02625d8c250ad03e5e38a59b337ae86e30e90f19bfa801219a" }, "downloads": -1, "filename": "opulent-pandas-0.0.4.tar.gz", "has_sig": false, "md5_digest": "3cb181e91a394775ccd6212036260bc8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4836, "upload_time": "2019-05-19T12:01:02", "url": "https://files.pythonhosted.org/packages/eb/6e/f037eb2645b9bcd53c228ffb51b92ba2fb5efc822852b96a31100d6a51c6/opulent-pandas-0.0.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "6bfeebe59431c235d416b70f2d65263f", "sha256": "f6babbd3fa1e8b2d9a12927017f620f80fdf35fed55fe8cb67ca3f37b2882697" }, "downloads": -1, "filename": "opulent_pandas-0.0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "6bfeebe59431c235d416b70f2d65263f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10040, "upload_time": "2019-05-19T12:01:00", "url": "https://files.pythonhosted.org/packages/b0/96/caf421e6ad9c8d964cf925d88a9a3995d098439f7eca3bbc35e2c0f26f3c/opulent_pandas-0.0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3cb181e91a394775ccd6212036260bc8", "sha256": "c8615b0827907b02625d8c250ad03e5e38a59b337ae86e30e90f19bfa801219a" }, "downloads": -1, "filename": "opulent-pandas-0.0.4.tar.gz", "has_sig": false, "md5_digest": "3cb181e91a394775ccd6212036260bc8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4836, "upload_time": "2019-05-19T12:01:02", "url": "https://files.pythonhosted.org/packages/eb/6e/f037eb2645b9bcd53c228ffb51b92ba2fb5efc822852b96a31100d6a51c6/opulent-pandas-0.0.4.tar.gz" } ] }