{ "info": { "author": "Ji Zhang", "author_email": "", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3" ], "description": "[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n# datacleanbot\nAutomated Data Cleaning Tool.\nThe main goal is to develop a Python tool, ``datacleanbot``, such that, given an arbitrary parsed raw dataset representing a supervised learning problem, the tool automatically identifies potential issues and reports the results and recommendations to the end user in an effective way.\n\n## Install\n\n```sh\n$ pip install datacleanbot\n```\n\n## QuickStart\n\nAcquire data from OpenML:\n\n >>> import openml as oml\n >>> data = oml.datasets.get_dataset(id) # id: openml dataset id\n >>> X, y, features = data.get_data(target=data.default_target_attribute, return_attribute_names=True)\n >>> Xy = data.get_data()\n\nAutoclean data with datacleanbot:\n\n >>> import datacleanbot.dataclean as dc\n >>> Xy = dc.autoclean(Xy, data.name, features)\n\n\n## Description\n\n``datacleanbot`` is equipped with the following capabilities:\n* Present an overview report of the given dataset\n * The most important features\n * Statistical information (e.g., mean, max, min)\n * **Data types of features**\n* Clean common data problems in the raw dataset\n * Duplicated records\n * Inconsistent column names\n * **Missing values**\n * **Outliers**\n\nThe three aspects ``datacleanbot`` meaningfully automates are marked in bold.\n\n## User's Guide\n\nThe user's guide can be found at [datacleanbot](https://datacleanbot.readthedocs.io/en/latest/).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Ji-Zhang/datacleanbot", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "",
"name": "datacleanbot", "package_url": "https://pypi.org/project/datacleanbot/", "platform": "", "project_url": "https://pypi.org/project/datacleanbot/", "project_urls": { "Homepage": "https://github.com/Ji-Zhang/datacleanbot" }, "release_url": "https://pypi.org/project/datacleanbot/0.4/", "requires_dist": [ "numpy (>=1.14.2)", "pandas", "scikit-learn (>=0.20.0)", "scipy (>=1.0.0)", "seaborn (>=0.8)", "matplotlib (>=2.2.2)", "missingno (>=0.4.0)", "fancyimpute", "numba (>=0.27)", "pystruct (>=0.2.4)", "cvxopt (>=1.1.9)", "pymc3 (>=3.4)", "pyro-ppl (>=0.2)", "rpy2 (==2.9.4)" ], "requires_python": "", "summary": "automated data cleaning tool", "version": "0.4" }, "last_serial": 4984939, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "f364906715886b42b90ae7e037891f4d", "sha256": "8c730ebcd37411bf75f74c9f675e57ff8cd194f3d93abd56a9720d44e796682b" }, "downloads": -1, "filename": "datacleanbot-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "f364906715886b42b90ae7e037891f4d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 197090, "upload_time": "2019-03-22T21:53:35", "url": "https://files.pythonhosted.org/packages/a7/41/92dba7c69091e26fa66de7892d3a0e184e84982b9494cd89113d65bb265f/datacleanbot-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "caa350f39dddd7100df9a4b8e6dff314", "sha256": "c3519900d2b6aaba3bb13ce5d9d378d68397ee4c52b781c3580193d2d2af3294" }, "downloads": -1, "filename": "datacleanbot-0.1.tar.gz", "has_sig": false, "md5_digest": "caa350f39dddd7100df9a4b8e6dff314", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 150816, "upload_time": "2019-03-22T21:53:37", "url": "https://files.pythonhosted.org/packages/7b/e6/8e0e546360073ab1dfda8653398617a255916e23d9a54b37ac5360051925/datacleanbot-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "e1199fb2ceeeaac2856ec5cd9c490732", "sha256": 
"635baf5a97752236675b5ade7c995afda92ad0cb752850deaa3c0177ee1c2536" }, "downloads": -1, "filename": "datacleanbot-0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "e1199fb2ceeeaac2856ec5cd9c490732", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 197068, "upload_time": "2019-03-22T23:36:23", "url": "https://files.pythonhosted.org/packages/ec/84/7586133fa0da821077ba62551ade88b86661ec522ceebd32634bdfc2ac9b/datacleanbot-0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3edec63f624a9444b3507f7e2740fee5", "sha256": "c0eb7897ca388bc393e94c5894849d99427082d54425119156db993132373fc9" }, "downloads": -1, "filename": "datacleanbot-0.2.tar.gz", "has_sig": false, "md5_digest": "3edec63f624a9444b3507f7e2740fee5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 151007, "upload_time": "2019-03-22T23:36:25", "url": "https://files.pythonhosted.org/packages/a5/b5/9ba669927f22655625fa144aac8f356ae2cf360044dcc1220412099cdc70/datacleanbot-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "1b36a37c4dd7a26080aeedf0592e11d9", "sha256": "e109027d887ad2f87f0d20ea1007d974d321b4834c4b66be553a41ad140d5f94" }, "downloads": -1, "filename": "datacleanbot-0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "1b36a37c4dd7a26080aeedf0592e11d9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 197896, "upload_time": "2019-03-24T13:43:46", "url": "https://files.pythonhosted.org/packages/93/c5/c74b44f675d21c4f75294ea70800c8c862babd9a5de2d81b32e2f07083b8/datacleanbot-0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0b6fdd65757d1be08b92d2222f6db2e5", "sha256": "1c5d475120714c324b5c01c9ca5766b092698f14f71a888b4e6289496bb12675" }, "downloads": -1, "filename": "datacleanbot-0.3.tar.gz", "has_sig": false, "md5_digest": "0b6fdd65757d1be08b92d2222f6db2e5", "packagetype": "sdist", "python_version": "source", "requires_python": null, 
"size": 151315, "upload_time": "2019-03-24T13:43:47", "url": "https://files.pythonhosted.org/packages/b7/ee/cb516cb7b476a1c8452a5072a8d0dd078cab03c71ab4d0d463cec5e6151d/datacleanbot-0.3.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "342e5ff05d893761a2c76b2bee8fad06", "sha256": "e2007c2591fe9c6eb0b0bcf1cf9533b79480974e92a76303096c2288c17984da" }, "downloads": -1, "filename": "datacleanbot-0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "342e5ff05d893761a2c76b2bee8fad06", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 197861, "upload_time": "2019-03-25T22:36:14", "url": "https://files.pythonhosted.org/packages/3f/1e/0ce3bdd6b7889cd0a7884bab83c8e13709f0794999afab08754eb69b33ff/datacleanbot-0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "194695949cc5d7b613cdea686acaea0e", "sha256": "c03636c24795d5b609059abb73d0be99650c93e504f569661614e41e92a1ef88" }, "downloads": -1, "filename": "datacleanbot-0.4.tar.gz", "has_sig": false, "md5_digest": "194695949cc5d7b613cdea686acaea0e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 151285, "upload_time": "2019-03-25T22:36:18", "url": "https://files.pythonhosted.org/packages/84/58/fd158c4bc83ab6435c507f8935fd10f4f83e59901a5424637765b0b6fa0d/datacleanbot-0.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "342e5ff05d893761a2c76b2bee8fad06", "sha256": "e2007c2591fe9c6eb0b0bcf1cf9533b79480974e92a76303096c2288c17984da" }, "downloads": -1, "filename": "datacleanbot-0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "342e5ff05d893761a2c76b2bee8fad06", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 197861, "upload_time": "2019-03-25T22:36:14", "url": "https://files.pythonhosted.org/packages/3f/1e/0ce3bdd6b7889cd0a7884bab83c8e13709f0794999afab08754eb69b33ff/datacleanbot-0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"194695949cc5d7b613cdea686acaea0e", "sha256": "c03636c24795d5b609059abb73d0be99650c93e504f569661614e41e92a1ef88" }, "downloads": -1, "filename": "datacleanbot-0.4.tar.gz", "has_sig": false, "md5_digest": "194695949cc5d7b613cdea686acaea0e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 151285, "upload_time": "2019-03-25T22:36:18", "url": "https://files.pythonhosted.org/packages/84/58/fd158c4bc83ab6435c507f8935fd10f4f83e59901a5424637765b0b6fa0d/datacleanbot-0.4.tar.gz" } ] }