{ "info": { "author": "Feature Labs, Inc.", "author_email": "support@featurelabs.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# AutoNormalize\n\n[![CircleCI](https://circleci.com/gh/FeatureLabs/autonormalize.svg?style=shield&circle-token=b890443ca669d7e88d62ad2fd712f92951550c4a)](https://circleci.com/gh/FeatureLabs/autonormalize)\n\nAutoNormalize is a Python library for automated datatable normalization. It allows you to build an `EntitySet` from a single denormalized table and generate features for machine learning using [Featuretools](https://github.com/Featuretools/featuretools).\n\n\n\n## Getting Started\n* [Install](#install) \n* [Demos](#demos) \n* [API Reference](#api-reference) \n\n\n## Install\n```shell\npip install featuretools[autonormalize]\n```\n#### Uninstall\n```shell\npip uninstall autonormalize\n```\n\n## Demos\n\n\n* [Blog Post](https://blog.featurelabs.com/automatic-dataset-normalization-for-feature-engineering-in-python/)\n* [Machine Learning Demo with Featuretools](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/AutoNormalize%20%2B%20FeatureTools%20Demo.ipynb)\n* [Kaggle Liquor Sales Dataset Demo](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/Kaggle%20Liquor%20Sales%20Dataset%20Demo.ipynb)\n* [Demo with Editing Dependencies](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/Editing%20Dependnecies%20Demo.ipynb)\n* [Kaggle Food Production Dataset Demo](https://github.com/FeatureLabs/autonormalize/blob/master/autonormalize/demos/Kaggle%20Food%20%20Dataset%20Demo.ipynb)\n\n\n## API Reference\n\n### `auto_entityset`\n```shell\nauto_entityset(df, accuracy=0.98, index=None, 
name=None, time_index=None)\n```\nCreates a normalized entityset from a dataframe.\n\n**Arguments:**\n\n* `df` (pd.DataFrame) : the dataframe containing data\n\n* `accuracy` (0 < float <= 1.00; default = 0.98) : the accuracy threshold required to conclude a dependency (i.e. with accuracy = 0.98, the dependency LHS --> RHS must hold in at least 98% of the rows)\n\n* `index` (str, optional) : name of the column intended as the index of df\n\n* `name` (str, optional) : name of the created EntitySet\n\n* `time_index` (str, optional) : name of the time column in the dataframe\n\n**Returns:**\n\n* `entityset` (ft.EntitySet) : created EntitySet\n\n### `find_dependencies`\n\n```shell\nfind_dependencies(df, accuracy=0.98, index=None)\n```\nFinds dependencies within a dataframe using the DFD search algorithm.\n\n**Returns:**\n\n* `dependencies` (Dependencies) : the dependencies found in the data within the constraints provided\n\n### `normalize_dataframe`\n\n```shell\nnormalize_dataframe(df, dependencies)\n```\nNormalizes a dataframe based on the dependencies given. Keys for the newly created DataFrames can only be columns that are strings, ints, or categories. Keys are chosen according to the following priority: \n1) shortest length \n2) has \"id\" in some form in the name of an attribute \n3) has the attribute furthest to the left in the table\n\n**Returns:**\n\n* `new_dfs` (list[pd.DataFrame]) : list of new dataframes\n\n
\n\n### `make_entityset`\n\n```shell\nmake_entityset(df, dependencies, name=None, time_index=None)\n```\nCreates a normalized EntitySet from a dataframe based on the dependencies given. Keys are chosen in the same fashion as for `normalize_dataframe`, and a new index is created if any key has more than one attribute.\n\n**Returns:**\n\n* `entityset` (ft.EntitySet) : created EntitySet\n\n
\n\n### `normalize_entity`\n\n```shell\nnormalize_entity(es, accuracy=0.98)\n```\nReturns a new normalized `EntitySet` from an `EntitySet` with a single entity.\n\n**Arguments:**\n\n* `es` (ft.EntitySet) : EntitySet with a single entity to normalize\n\n**Returns:**\n\n* `new_es` (ft.EntitySet) : new normalized EntitySet\n\n
\n\n## Feature Labs\n\nAutoNormalize is an open source project created by [Feature Labs](https://www.featurelabs.com/). To see the other open source projects we're working on, visit Feature Labs [Open Source](https://www.featurelabs.com/open). If building impactful data science pipelines is important to you or your business, please [get in touch](https://www.featurelabs.com/contact/).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/FeatureLabs/autonormalize", "keywords": "", "license": "BSD 3-clause", "maintainer": "", "maintainer_email": "", "name": "autonormalize", "package_url": "https://pypi.org/project/autonormalize/", "platform": "", "project_url": "https://pypi.org/project/autonormalize/", "project_urls": { "Homepage": "https://github.com/FeatureLabs/autonormalize" }, "release_url": "https://pypi.org/project/autonormalize/1.0.1/", "requires_dist": [ "numpy (>=1.13.3)", "pandas (>=0.23.0)", "tqdm (>=4.19.2)", "featuretools (>=0.0.0)" ], "requires_python": "", "summary": "a library for automated table normalization", "version": "1.0.1" }, "last_serial": 5689136, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "1ea1690f517e1369b116f014050e7a91", "sha256": "6ebbbb6ddc2216a23f9d9c578e97db7eaf6a7d4394ed7cde016d5508ecdf8c0c" }, "downloads": -1, "filename": "autonormalize-0.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "1ea1690f517e1369b116f014050e7a91", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 611869, "upload_time": "2019-08-08T15:07:21", "url": "https://files.pythonhosted.org/packages/30/ce/67d91a6c7600f5040009650f43a19f2e27b0b82106bc2e35cbe2c89b5479/autonormalize-0.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2a6de16038871068b65e555f81c80761", "sha256": 
"99bfeb0ea6b3a5fcc8cee4d905c468b07e1015132bf125e9e57d3aac7bb2db53" }, "downloads": -1, "filename": "autonormalize-0.0.0.tar.gz", "has_sig": false, "md5_digest": "2a6de16038871068b65e555f81c80761", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 585026, "upload_time": "2019-08-08T15:07:24", "url": "https://files.pythonhosted.org/packages/a2/11/1218f4998cc84a2a81467404577b4622276824aace2d6c1e67db1abf663e/autonormalize-0.0.0.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "499d24fd28236a3f044af3a763d438a6", "sha256": "4fa9aa522443c8cea6a92a279b9cd9a86edf40c9ae9dfc1bb062c0df871594c2" }, "downloads": -1, "filename": "autonormalize-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "499d24fd28236a3f044af3a763d438a6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 611971, "upload_time": "2019-08-14T15:51:48", "url": "https://files.pythonhosted.org/packages/04/ce/50344ed2083ca8e68e2ff0e16c0317e8b1e0527687919c25799fab98773d/autonormalize-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "dab14be0f3b18f3e0d606c7c93f3a0bf", "sha256": "8c33e631139d93a092d657acf9d528f8b41294fdb4abbe4f1c8b380751790232" }, "downloads": -1, "filename": "autonormalize-0.1.2.tar.gz", "has_sig": false, "md5_digest": "dab14be0f3b18f3e0d606c7c93f3a0bf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 585295, "upload_time": "2019-08-14T15:51:50", "url": "https://files.pythonhosted.org/packages/ae/f9/b5c6dbb3da526f0bd63a9190c7c97fb52a5855bf5c354ff02312f0a9728f/autonormalize-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "8ba0a990dc9f8fe2ec5bb337f38f258d", "sha256": "b852eb06a760d90ad25aea20df62b5c121a89f5c39416caf2e441189c45568cd" }, "downloads": -1, "filename": "autonormalize-0.1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "8ba0a990dc9f8fe2ec5bb337f38f258d", "packagetype": "bdist_wheel", "python_version": "py3", 
"requires_python": null, "size": 612230, "upload_time": "2019-08-15T16:06:53", "url": "https://files.pythonhosted.org/packages/44/79/bee6ca273c6f88d58331cdfffe15bc05e4cc09d5fbd8dd156ab6e819819a/autonormalize-0.1.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9ff739c544e4f6b1826540e122ceea13", "sha256": "7691c280c56c3a155e5e90d8fc839a41044e658aca41802f985c2cd962a566cc" }, "downloads": -1, "filename": "autonormalize-0.1.3.tar.gz", "has_sig": false, "md5_digest": "9ff739c544e4f6b1826540e122ceea13", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 585611, "upload_time": "2019-08-15T16:06:55", "url": "https://files.pythonhosted.org/packages/54/ab/255420c792d98a6fe827326c618bec2a17455949c1ca18711cda0b5b3a3f/autonormalize-0.1.3.tar.gz" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": "e14e2bd7bdba0a99a795135543baa1dc", "sha256": "03ee8804ddf43d01461ca4cc9b48c622aff596ab2c933e28c6d08d0acce48ba4" }, "downloads": -1, "filename": "autonormalize-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "e14e2bd7bdba0a99a795135543baa1dc", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 612249, "upload_time": "2019-08-15T18:26:58", "url": "https://files.pythonhosted.org/packages/6b/63/0d8a9e44d029d6339a6d6200db6bb7f40e4ae438b749da127014a75269f9/autonormalize-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2ff8a59bed40124b57f8915c5b5d9fb5", "sha256": "162ca1ce28622a1a1bc6ae32c217baf451744beaa5688b1d51beef0af8939608" }, "downloads": -1, "filename": "autonormalize-1.0.0.tar.gz", "has_sig": false, "md5_digest": "2ff8a59bed40124b57f8915c5b5d9fb5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 585714, "upload_time": "2019-08-15T18:27:00", "url": "https://files.pythonhosted.org/packages/6e/49/db5aa83d9590c2dfe7701679210ce2047df0ed007d512cd970261214b3e4/autonormalize-1.0.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", 
"digests": { "md5": "7fee6a555ad9c88e1da1159267f44ebe", "sha256": "f8add04081f7ffa2e29681488319016a7c550f3f6f88f103174f5210108d7469" }, "downloads": -1, "filename": "autonormalize-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "7fee6a555ad9c88e1da1159267f44ebe", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 612495, "upload_time": "2019-08-16T19:15:11", "url": "https://files.pythonhosted.org/packages/39/39/2a5166a7b4b5508942dfb0ec6f4aee6338ceba5456f69b08de12a7892f8e/autonormalize-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "777cae4b243f5d5e2886eb0f0e857dc3", "sha256": "cb6fb9cea4e6017fefcddb25b719f93b5e75d68b588b28dc4da116cd50eada30" }, "downloads": -1, "filename": "autonormalize-1.0.1.tar.gz", "has_sig": false, "md5_digest": "777cae4b243f5d5e2886eb0f0e857dc3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 586301, "upload_time": "2019-08-16T19:15:13", "url": "https://files.pythonhosted.org/packages/99/95/e4e1aa3e0f7da1edf6f837bde9848d1e6fe756e30b817947427f5ce9db24/autonormalize-1.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7fee6a555ad9c88e1da1159267f44ebe", "sha256": "f8add04081f7ffa2e29681488319016a7c550f3f6f88f103174f5210108d7469" }, "downloads": -1, "filename": "autonormalize-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "7fee6a555ad9c88e1da1159267f44ebe", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 612495, "upload_time": "2019-08-16T19:15:11", "url": "https://files.pythonhosted.org/packages/39/39/2a5166a7b4b5508942dfb0ec6f4aee6338ceba5456f69b08de12a7892f8e/autonormalize-1.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "777cae4b243f5d5e2886eb0f0e857dc3", "sha256": "cb6fb9cea4e6017fefcddb25b719f93b5e75d68b588b28dc4da116cd50eada30" }, "downloads": -1, "filename": "autonormalize-1.0.1.tar.gz", "has_sig": false, "md5_digest": 
"777cae4b243f5d5e2886eb0f0e857dc3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 586301, "upload_time": "2019-08-16T19:15:13", "url": "https://files.pythonhosted.org/packages/99/95/e4e1aa3e0f7da1edf6f837bde9848d1e6fe756e30b817947427f5ce9db24/autonormalize-1.0.1.tar.gz" } ] }