{ "info": { "author": "Kacper Kubara", "author_email": "kacper.kubara.ai@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# distython\nImplementation of state-of-the-art distance metrics from research papers which can handle mixed-type data and missing values.\nAt the moment, HEOM, HVDM and VDM are tested and working. VDM and HVDM has been released recently so please report bugs, if there are any.\nPlease feel free to help and contribute to the project as there is a lack of existing implementations of hetergeneous distance metrics.\n# Installation\nClone the repository with `git clone`.\nInstall the necessary packages with `pipenv install`\n\n# Example - HEOM\n```python\n# Example code of how the HEOM metric can be used together with Scikit-Learn\nimport numpy as np\nfrom sklearn.neighbors import NearestNeighbors\nfrom sklearn.datasets import load_boston\n# Importing a custom metric class\nfrom HEOM import HEOM\n\n# Load the dataset from sklearn\nboston = load_boston()\nboston_data = boston[\"data\"]\n# Categorical variables in the data\ncategorical_ix = [3, 8]\n# The problem here is that NearestNeighbors can't handle np.nan\n# So we have to set up the NaN equivalent\nnan_eqv = 12345\n\n# Introduce some missingness to the data for the purpose of the example\nrow_cnt, col_cnt = boston_data.shape\nfor i in range(row_cnt):\n for j in range(col_cnt):\n rand_val = np.random.randint(20, size=1)\n if rand_val == 10:\n boston_data[i, j] = nan_eqv\n\n# Declare the HEOM with a correct NaN equivalent value\nheom_metric = HEOM(boston_data, categorical_ix, nan_equivalents = [nan_eqv])\n\n# Declare NearestNeighbor and link the metric\nneighbor = NearestNeighbors(metric = heom_metric.heom)\n\n# Fit the model which uses the custom distance metric \nneighbor.fit(boston_data)\n\n# Return 5-Nearest Neighbors to the 1st instance (row 1)\nresult = neighbor.kneighbors(boston_data[0].reshape(1, -1), n_neighbors = 5)\nprint(result)\n```\n# Research Papers\nThe code have implemented based on the following literature:\nHEOM, VDM and HVDM: https://arxiv.org/pdf/cs/9701101.pdf\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/KacperKubara/distython", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "distython", "package_url": "https://pypi.org/project/distython/", "platform": "", "project_url": "https://pypi.org/project/distython/", "project_urls": { "Homepage": "https://github.com/KacperKubara/distython" }, "release_url": "https://pypi.org/project/distython/0.0.3/", "requires_dist": null, "requires_python": "", "summary": "Implementation of state-of-the-art distance metrics from research papers which can handle mixed-type data and missing values.", "version": "0.0.3" }, "last_serial": 5579707, "releases": { "0.0.3": [ { "comment_text": "", "digests": { "md5": "c06ae67838c4be25c7eb3685b4a4eb28", "sha256": "0f45014d21a821cbf495d4fedb96c4b21c2f82508f91cdd476e789609780500f" }, "downloads": -1, "filename": "distython-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "c06ae67838c4be25c7eb3685b4a4eb28", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 7974, "upload_time": "2019-07-24T20:53:44", "url": "https://files.pythonhosted.org/packages/56/fd/9f1569d7abdf7ad962884ff97c4215aa0e13f126d25b57b003d1a96faa99/distython-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0fd87fa8d89321bc028b7b1aab0374de", "sha256": "bc82a51554acd2f19f1bdbc25c5d9d349bbf2d085680a196358311cad6c71659" }, "downloads": -1, "filename": "distython-0.0.3.tar.gz", "has_sig": false, "md5_digest": "0fd87fa8d89321bc028b7b1aab0374de", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4296, "upload_time": "2019-07-24T20:53:46", "url": "https://files.pythonhosted.org/packages/26/28/14809dc5a22def53601c344e130d7ac9bccf01b2e078240258fbcb0befa5/distython-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c06ae67838c4be25c7eb3685b4a4eb28", "sha256": "0f45014d21a821cbf495d4fedb96c4b21c2f82508f91cdd476e789609780500f" }, "downloads": -1, "filename": "distython-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "c06ae67838c4be25c7eb3685b4a4eb28", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 7974, "upload_time": "2019-07-24T20:53:44", "url": "https://files.pythonhosted.org/packages/56/fd/9f1569d7abdf7ad962884ff97c4215aa0e13f126d25b57b003d1a96faa99/distython-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0fd87fa8d89321bc028b7b1aab0374de", "sha256": "bc82a51554acd2f19f1bdbc25c5d9d349bbf2d085680a196358311cad6c71659" }, "downloads": -1, "filename": "distython-0.0.3.tar.gz", "has_sig": false, "md5_digest": "0fd87fa8d89321bc028b7b1aab0374de", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4296, "upload_time": "2019-07-24T20:53:46", "url": "https://files.pythonhosted.org/packages/26/28/14809dc5a22def53601c344e130d7ac9bccf01b2e078240258fbcb0befa5/distython-0.0.3.tar.gz" } ] }