{ "info": { "author": "David Atienza", "author_email": "datienza@fi.upm.es", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3" ], "description": "This repository implements Gaussian [Kernel Density Estimation](https://en.wikipedia.org/wiki/Kernel_density_estimation) using \nOpenCL to achieve important performance gains.\n\n\nThe Python interface is based on the [Scipy's `gaussian_kde`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html) class, \nso it should be pretty easy to replace the CPU implementation of `gaussian_kde` with the\nOpenCL implementation in this repository `gaussian_kde_ocl`.\n\n\n# Example Code\n\n\n```python\nimport numpy as np\nfrom kde_ocl import gaussian_kde_ocl\n\n# Generate dummy training data (10000 instances of 2D data)\ntrain = np.random.multivariate_normal([0,0], [[1,0],[0,1]], 10000)\n# Generate dummy test data (10000 instances of 2D data)\ntest = np.random.multivariate_normal([0,0], [[1,0],[0,1]], 100)\n\n# Train the KDE model\nkde = gaussian_kde_ocl(train)\n\n# Get the pdf of each test point. This is equivalent to kde.pdf(test)\npdf = kde(test)\n\n# Get the logpdf of each test point. This is equivalent to kde.pdf(test)\nlogpdf = kde.logpdf(test)\n```\n\n**The interface is mostly the same as Scipy's `gaussian_kde`, but the axis order is changed**. For example, training a \nScipy's `gaussian_kde` with a numpy array of shape (10000, 2) is interpreted as two instances of 10000 dimensions. In\n`gaussian_kde_ocl`, this data is interpreted as 10000 instances of 2 dimensions. This change makes easier to work with\n`pandas` dataframes:\n\n```python\nimport pandas as pd\nimport numpy as np\nfrom kde_ocl import gaussian_kde_ocl\n\n# Create pandas dataframe \na = np.random.normal(0, 1, 5000)\nb = np.random.normal(3.2, np.sqrt(1.8), 5000)\ndata = pd.DataFrame({'a': a, 'b': b})\n\n# Train KDE model\nkde = gaussian_kde_ocl(data.values)\n\n# Evaluate one point\nlogpdf = kde.logpdf([1.1, 2.3])\n```\n\n# Performance\n\nThis is a comparison of the `gaussian_kde_ocl` and Scipy's `gaussian_kde` with 2D data and the following configuration:\n\n- CPU: Intel i7-6700K.\n- GPU: AMD RX 460.\n- Python 3.7.3\n- Ubuntu 16.04\n\n\n## ``pdf()`` method\n\nTraining instances / Test instances | `gaussian_kde_ocl.pdf()` | `gaussian_kde.pdf()` | Speedup |\n------------------------------------|-----------------------------| --------------------------------|-----------------|\n100,000 / 1,000 | 218.6474 ± 1.5901 ms | 1,911.0764 ± 50.8762 ms | 8.74x |\n1,000 / 10,000,000 | 18.8643 ± 0.07322 s | 237.3429 ± 1.1765 s | 12.58x |\n100 / 10,000 | 4.4533 ± 0.7297 ms | 18.0684 ± 0.3302 ms | 4.46x |\n\n## ``logpdf()`` method\n\n\nTraining instances / Test instances | `gaussian_kde_ocl.logpdf()` | `gaussian_kde.logpdf()` | Speedup |\n------------------------------------|-----------------------------|---------------------------------|---------|\n100,000 / 1,000 | 261.1466 ± 6.3932 ms | 6,798.4730 ± 420.2878 ms | 26.03x |\n1,000 / 10,000,000 | 36.3143 ± 0.02916 s | MemoryError | NA |\n100 / 10,000 | 8.827 ± 0.7442 ms | 34.1114 ± 1.3060 ms | 3.86x |\n\n\n# Current Limitations\n\n- Only C order (the default) numpy arrays can be used as traning/test datasets.\n- Only Gaussian kernels are implemented.\n- OpenCL device is selected automatically.\n\n# Dependencies\n\nThe library is Python 2/3 compatible. Currently, is tested in Ubuntu 16.04, but should be compatible with other operating systems where\nthere are OpenCL GPU support.\n\n## Python Dependencies\n\nThe project has the following Python dependencies:\n\n``\ncffi\nnumpy\nsix\n``\n\nYou can install them with:\n\n``\npip install cffi numpy six\n``\n\n## Rust\n\nThe [Rust](https://www.rust-lang.org/) compiler must be installed in the system. Check out [https://www.rust-lang.org/tools/install](https://www.rust-lang.org/tools/install) for more information.\n\nThe default Rust toolchain is used to compile the library, so **make sure to install a Rust toolchain (32 vs 64 bits) compatible with the Python interpreter version (32 vs 64 bits).**\n\n## OpenCL\n\nThe GPU drivers that enable OpenCL should be installed.\n\n# Installation\n\nUse pip:\n\n``\npip install kde_ocl\n``\n\nAlternatively, clone the repository and use the setup script:\n\n``\npython setup.py install\n``\n\n# Testing\n\nTests are run using pytest and requires `scipy` to compare `gaussian_kde_ocl` with Scipy's `gaussian_kde`. Install them:\n\n``\npip pytest scipy\n``\n\nRun the tests with:\n\n``\npytest\n``\n\n## Benchmarks\n\nTo run the benchmarks, `pytest-benchmark` is needed:\n\n``\npip pytest-benchmark\n``\n\nThen, execute the tests with benchmarks enabled:\n\n``\npytest --times\n``\n\nTo run only the OpenCL benchmarks:\n\n``\npytest --times-ocl\n``\n\nTo run only the Scipy's `gaussian_kde` benchmarks:\n\n\n``\npytest --times-scipy\n``\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/davenza/kde_ocl", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "kde-ocl", "package_url": "https://pypi.org/project/kde-ocl/", "platform": "any", "project_url": "https://pypi.org/project/kde-ocl/", "project_urls": { "Homepage": "https://github.com/davenza/kde_ocl" }, "release_url": "https://pypi.org/project/kde-ocl/0.1.0/", "requires_dist": [ "milksnake", "numpy", "six" ], "requires_python": "", "summary": "An OpenCL implementation of Kernel Density Estimation", "version": "0.1.0" }, "last_serial": 5927518, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "1ec076a4d422e4a443ce5e45e163dc00", "sha256": "3cbf7ee5ff399fcd073caa5a15df75c209bbbba0c428dfe1e59d9e33c25f92b2" }, "downloads": -1, "filename": "kde_ocl-0.1.0-py2.py3-none-win_amd64.whl", "has_sig": false, "md5_digest": "1ec076a4d422e4a443ce5e45e163dc00", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 254522, "upload_time": "2019-10-04T09:33:58", "url": "https://files.pythonhosted.org/packages/ec/79/877f22e170e1667ade263e64294849524df5b5b58a31f3e292d80055c5e0/kde_ocl-0.1.0-py2.py3-none-win_amd64.whl" }, { "comment_text": "", "digests": { "md5": "f74c9c61cc360fa796a47f7a90858f78", "sha256": "54284e78b5a2cb1e09d2372bae19174c4a7542573e2883ef2fec90db8a81fb79" }, "downloads": -1, "filename": "kde_ocl-0.1.0.tar.gz", "has_sig": false, "md5_digest": "f74c9c61cc360fa796a47f7a90858f78", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18755, "upload_time": "2019-10-04T09:34:01", "url": "https://files.pythonhosted.org/packages/bb/7b/8d8108063010ce5e8db7f4b61b403a4cd0bb16bb2fc2d75d1b13c129d3e6/kde_ocl-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1ec076a4d422e4a443ce5e45e163dc00", "sha256": "3cbf7ee5ff399fcd073caa5a15df75c209bbbba0c428dfe1e59d9e33c25f92b2" }, "downloads": -1, "filename": "kde_ocl-0.1.0-py2.py3-none-win_amd64.whl", "has_sig": false, "md5_digest": "1ec076a4d422e4a443ce5e45e163dc00", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 254522, "upload_time": "2019-10-04T09:33:58", "url": "https://files.pythonhosted.org/packages/ec/79/877f22e170e1667ade263e64294849524df5b5b58a31f3e292d80055c5e0/kde_ocl-0.1.0-py2.py3-none-win_amd64.whl" }, { "comment_text": "", "digests": { "md5": "f74c9c61cc360fa796a47f7a90858f78", "sha256": "54284e78b5a2cb1e09d2372bae19174c4a7542573e2883ef2fec90db8a81fb79" }, "downloads": -1, "filename": "kde_ocl-0.1.0.tar.gz", "has_sig": false, "md5_digest": "f74c9c61cc360fa796a47f7a90858f78", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18755, "upload_time": "2019-10-04T09:34:01", "url": "https://files.pythonhosted.org/packages/bb/7b/8d8108063010ce5e8db7f4b61b403a4cd0bb16bb2fc2d75d1b13c129d3e6/kde_ocl-0.1.0.tar.gz" } ] }