{ "info": { "author": "Feature Labs, Inc.", "author_email": "support@featurelabs.com", "bugtrack_url": null, "classifiers": [], "description": "# categorical-encoding\n\n[![CircleCI](https://circleci.com/gh/FeatureLabs/categorical_encoding.svg?style=shield&circle-token=625a1d6124154059267ea8477324156b1d645fa9)](https://circleci.com/gh/FeatureLabs/categorical_encoding)\n\ncategorical-encoding is a Python library for encoding categorical data, intended for use with [Featuretools](https://github.com/Featuretools/featuretools). \ncategorical-encoding allows for easy encoding of data and integration into Featuretools pipeline for automated feature engineering within the machine learning pipeline.\n\n### Install\n```shell\npython -m pip install \"featuretools[categorical_encoding]\"\n```\n\n### Description\nFor more general questions regarding how to use categorical encoding in a machine learning pipeline, consult the guides located in the [categorical encoding github repository](https://github.com/FeatureLabs/categorical_encoding/tree/master/guides).\n\n```py\n>>> feature_matrix\n product_id purchased value countrycode\nid \n0 coke zero True 0.0 US\n1 coke zero True 5.0 US\n2 coke zero True 10.0 US\n3 car True 15.0 US\n4 car True 20.0 US\n5 toothpaste True 0.0 AL\n```\nIntegrates into standard procedure of train/test split within applied machine learning processes.\n```py\n>>> train_data = feature_matrix.iloc[[0, 1, 4, 5]]\n>>> train_data\n product_id purchased value countrycode\nid \n0 coke zero True 0.0 US\n1 coke zero True 5.0 US\n4 car True 20.0 US\n5 toothpaste True 0.0 AL\n>>> test_data = feature_matrix.iloc[[2, 3]]\n>>> test_data\n product_id purchased value countrycode\nid \n2 coke zero True 10.0 US\n3 car True 15.0 US\n```\n```py\n>>> import categorical_encoding as ce\n>>> encoder = ce.Encoder(method='leave_one_out')\n>>> train_enc = encoder.fit_transform(train_data, features, train_data['value'])\n>>> test_enc = encoder.transform(test_data)\n```\nEncoder fits and transforms to train data, and then transforms test data using its learned fitted encoding.\n```py\n>>> train_enc\n PRODUCT_ID_leave_one_out purchased value COUNTRYCODE_leave_one_out\nid \n0 5.00 True 0.0 12.50\n1 0.00 True 5.0 10.00\n4 6.25 True 20.0 2.50\n5 6.25 True 0.0 6.25\n>>> test_enc\n PRODUCT_ID_leave_one_out purchased value COUNTRYCODE_leave_one_out\nid \n2 2.50 True 10.0 8.333333\n3 6.25 True 15.0 8.333333\n```\nSupports easy integration into Featuretools through its support and use of features.\nFirst, learn features through fitting an encoder to data. Then, when new data comes in, easily prepare it for your trained machine learning model by using those features to seamlessly generate new tables of encoded data.\n```py\n>>> features = encoder.get_features()\n[,\n ,\n ,\n ]\n>>> features_encoded = enc.get_features()\n>>> fm2_encoded = ft.calculate_feature_matrix(features_encoded, es, instance_ids=[6,7])\n>>> fm2_encoded\n PRODUCT_ID_leave_one_out purchased value COUNTRYCODE_leave_one_out\nid \n6 6.25 True 1.0 6.25\n7 6.25 True 2.0 6.25\n```\n\n## Feature Labs\n\n \"Featuretools\"\n\n\ncategorical-encoding is an open source project created by [Feature Labs](https://www.featurelabs.com/). To see the other open source projects we're working on visit Feature Labs [Open Source](https://www.featurelabs.com/open). If building impactful data science pipelines is important to you or your business, please [get in touch](https://www.featurelabs.com/contact/).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://www.featurelabs.com/", "keywords": "feature engineering data science machine learning categorical encoding", "license": "BSD 3-clause", "maintainer": "", "maintainer_email": "", "name": "categorical-encoding", "package_url": "https://pypi.org/project/categorical-encoding/", "platform": "", "project_url": "https://pypi.org/project/categorical-encoding/", "project_urls": { "Homepage": "http://www.featurelabs.com/" }, "release_url": "https://pypi.org/project/categorical-encoding/0.3.0/", "requires_dist": [ "featuretools (>=0.9.0)", "jupyter (==1.0.0)", "nbconvert (==5.5.0)", "nbsphinx (==0.4.2)", "category-encoders (>=2.0.0)" ], "requires_python": "", "summary": "categorical encoding for featuretools", "version": "0.3.0" }, "last_serial": 5699157, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "52d3eae1e8261ebe6fc33c09877602ec", "sha256": "6f5ef9820e84f829aa186734f83dcdeef98dd7f297c41cb80f198d58e6723f4a" }, "downloads": -1, "filename": "categorical_encoding-0.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "52d3eae1e8261ebe6fc33c09877602ec", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11238, "upload_time": "2019-08-07T16:00:13", "url": "https://files.pythonhosted.org/packages/dd/36/5299bce63ba9d1c9039f0fd413f92f2a780aa00ff14e91875c35ddfbca10/categorical_encoding-0.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "905b319aa3e97224e989c8facc733e3f", "sha256": "27ab145d44fa0ad97cc89194799879bd5c685a2f4033d734de9b4d294958d72a" }, "downloads": -1, "filename": "categorical_encoding-0.0.0.tar.gz", "has_sig": false, "md5_digest": "905b319aa3e97224e989c8facc733e3f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6014, "upload_time": "2019-08-07T16:00:16", "url": "https://files.pythonhosted.org/packages/b0/58/89315c49d4292c0a71c9507e1a033644fa5645d37fa1c7a571ae92ab50dc/categorical_encoding-0.0.0.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "601a8ec15f4c157fe3b3595b163593b5", "sha256": "a32d785b159dd38ca83fc158c41810af39c648a71fc4605cbcc3791493b25617" }, "downloads": -1, "filename": "categorical_encoding-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "601a8ec15f4c157fe3b3595b163593b5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12355, "upload_time": "2019-08-07T19:06:17", "url": "https://files.pythonhosted.org/packages/7c/ee/1be63369ef56b1fd584a92e6e92a65fdc5bb5f56f7bb3494856824ed028a/categorical_encoding-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ed95c0c3794594ccedf6fde7b8d93739", "sha256": "c8c64a9fe10b84cf2c1af6c3e150dbdc49a6babe70831a1dabdbd3b4023106c8" }, "downloads": -1, "filename": "categorical_encoding-0.0.3.tar.gz", "has_sig": false, "md5_digest": "ed95c0c3794594ccedf6fde7b8d93739", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8189, "upload_time": "2019-08-07T19:06:19", "url": "https://files.pythonhosted.org/packages/8f/5a/67b6a19036d7d05af266d469e1d72613e99eda67345472e943bb1ba362d2/categorical_encoding-0.0.3.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "637a0e371d2ba1a8614d9bb16476fc9a", "sha256": "0c5e78009c9a4634fdaea1696d3e26b8b87536c5eb60992a9dbf34734bae88f4" }, "downloads": -1, "filename": "categorical_encoding-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "637a0e371d2ba1a8614d9bb16476fc9a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 14702, "upload_time": "2019-08-12T14:47:57", "url": "https://files.pythonhosted.org/packages/e7/74/75dd2c9232ad35277b31c563aa70ea63885f77474470e99ac25c8b24ba56/categorical_encoding-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3126da0d1d186a52b7985baec3ccfdd3", "sha256": "c9538389c7d231bd1c53c5b2ded19660e4c63b384f277127fb1afddc8f874179" }, "downloads": -1, "filename": "categorical_encoding-0.1.0.tar.gz", "has_sig": false, "md5_digest": "3126da0d1d186a52b7985baec3ccfdd3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9332, "upload_time": "2019-08-12T14:47:59", "url": "https://files.pythonhosted.org/packages/31/ea/8dd1a6fa203a0a92434c2521fbc94ed44eaea31ac12614ce76b1cbcce2f8/categorical_encoding-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "289c3868ad24f22c7f526ff63cd59f1d", "sha256": "2ae85b458436d5e95b5022e38f8145840475e075e756c9e74c9f60a58633e0e4" }, "downloads": -1, "filename": "categorical_encoding-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "289c3868ad24f22c7f526ff63cd59f1d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19129, "upload_time": "2019-08-13T16:14:11", "url": "https://files.pythonhosted.org/packages/05/1a/8c069611137540925080203bb83ae9a78ad49ccf61a59bbd64d3bf4e8da1/categorical_encoding-0.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1ff6ec4a4526602745a17e5f6a47ea16", "sha256": "3d9e162f3571c23f91183c5e49c7961047637864b208d871683dc5b61eb40877" }, "downloads": -1, "filename": "categorical_encoding-0.2.0.tar.gz", "has_sig": false, "md5_digest": "1ff6ec4a4526602745a17e5f6a47ea16", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10447, "upload_time": "2019-08-13T16:14:13", "url": "https://files.pythonhosted.org/packages/b2/e2/a622c93eff44bd7d971e30b10b1816e0b583de1620be25621ce60122eb60/categorical_encoding-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "21a7a82885364643aed1555f7d9f6e7a", "sha256": "66c77d5ca05724ccd23b8e3a43284494bd4fe23ef6828ead78ea0f1bbc72635d" }, "downloads": -1, "filename": "categorical_encoding-0.3.0-py3-none-any.whl", "has_sig": false, "md5_digest": "21a7a82885364643aed1555f7d9f6e7a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19428, "upload_time": "2019-08-19T15:51:12", "url": "https://files.pythonhosted.org/packages/82/8b/99c230b6c539467f32b86370fed7754aedb4f5e6b2a9632a3751330d5246/categorical_encoding-0.3.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a1656af1658cfa1e2e25ff96d716a73", "sha256": "4626f550b944e5865151b2a85049a992d76b6f1f75c93058c354e39ae3176e10" }, "downloads": -1, "filename": "categorical_encoding-0.3.0.tar.gz", "has_sig": false, "md5_digest": "1a1656af1658cfa1e2e25ff96d716a73", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11006, "upload_time": "2019-08-19T15:51:14", "url": "https://files.pythonhosted.org/packages/a6/b3/a1929a491000d6039329eb940395b9c39eb49c4024ca50a1561439725c5e/categorical_encoding-0.3.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "21a7a82885364643aed1555f7d9f6e7a", "sha256": "66c77d5ca05724ccd23b8e3a43284494bd4fe23ef6828ead78ea0f1bbc72635d" }, "downloads": -1, "filename": "categorical_encoding-0.3.0-py3-none-any.whl", "has_sig": false, "md5_digest": "21a7a82885364643aed1555f7d9f6e7a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19428, "upload_time": "2019-08-19T15:51:12", "url": "https://files.pythonhosted.org/packages/82/8b/99c230b6c539467f32b86370fed7754aedb4f5e6b2a9632a3751330d5246/categorical_encoding-0.3.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a1656af1658cfa1e2e25ff96d716a73", "sha256": "4626f550b944e5865151b2a85049a992d76b6f1f75c93058c354e39ae3176e10" }, "downloads": -1, "filename": "categorical_encoding-0.3.0.tar.gz", "has_sig": false, "md5_digest": "1a1656af1658cfa1e2e25ff96d716a73", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11006, "upload_time": "2019-08-19T15:51:14", "url": "https://files.pythonhosted.org/packages/a6/b3/a1929a491000d6039329eb940395b9c39eb49c4024ca50a1561439725c5e/categorical_encoding-0.3.0.tar.gz" } ] }