{ "info": { "author": "Koji Makiyama, Ameya Daigavane", "author_email": "hoxo.smile@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering :: Artificial Intelligence" ], "description": "\n\n[![Build\nStatus](https://travis-ci.org/hoxo-m/densratio_py.svg?branch=master)](https://travis-ci.org/hoxo-m/densratio_py)\n[![PyPI](https://img.shields.io/pypi/v/densratio.svg)](https://pypi.python.org/pypi/densratio)\n[![PyPI](https://img.shields.io/pypi/dm/densratio.svg)](https://pypi.python.org/pypi/densratio)\n[![Coverage\nStatus](https://coveralls.io/repos/github/hoxo-m/densratio_py/badge.svg?branch=master)](https://coveralls.io/github/hoxo-m/densratio_py?branch=master)\n\n## 1\\. Overview\n\n**Density ratio estimation** is the following problem: given two data\nsamples `x1` and `x2` drawn from unknown distributions `p(x)` and `q(x)`\nrespectively, estimate `w(x) = p(x) / q(x)`, where the points in `x1` and\n`x2` are d-dimensional real vectors.\n\nThe estimated density ratio function `w(x)` can be used in many\napplications, such as inlier-based outlier detection \\[1\\] and\ncovariate shift adaptation \\[2\\]. Other applications of density\nratio estimation are summarized by Sugiyama et al. (2012) \\[3\\].\n\nThe package **densratio** provides a function `densratio()` that returns\nan object whose method `compute_density_ratio()` evaluates the estimated\ndensity ratio.\n\nFurther, the alpha-relative density ratio `p(x)/(alpha * p(x) + (1 -\nalpha) * q(x))` (where alpha is in the range \\[0, 1\\]) can also be\nestimated. When alpha is 0, this reduces to the ordinary density ratio\n`w(x)`. 
The alpha-relative PE-divergence and KL-divergence between\n`p(x)` and `q(x)` are also computed.\n\n![](https://raw.githubusercontent.com/hoxo-m/densratio_py/master/README_files/figure-gfm/compare-true-estimate-1.png)\n\nFor example,\n\n``` python\nimport numpy as np\nfrom scipy.stats import norm\nfrom densratio import densratio\n\nnp.random.seed(1)\nx = norm.rvs(size=500, loc=0, scale=1./8)\ny = norm.rvs(size=500, loc=0, scale=1./2)\nalpha = 0.1\ndensratio_obj = densratio(x, y, alpha=alpha)\nprint(densratio_obj)\n```\n\ngives the following output:\n\n    #> Method: RuLSIF\n    #> \n    #> Alpha: 0.1\n    #> \n    #> Kernel Information:\n    #>   Kernel type: Gaussian\n    #>   Number of kernels: 100\n    #>   Bandwidth(sigma): 0.1\n    #>   Centers: matrix([[-0.09591373],..\n    #> \n    #> Kernel Weights (theta):\n    #>   array([0.04990797, 0.0550548 , 0.04784736, 0.04951904, 0.04840418,..\n    #> \n    #> Regularization Parameter (lambda): 0.1\n    #> \n    #> Alpha-Relative PE-Divergence: 0.6187941335987046\n    #> \n    #> Alpha-Relative KL-Divergence: 0.7037648129307482\n    #> \n    #> Function to Estimate Density Ratio:\n    #>   compute_density_ratio(x)\n    #> \n\nIn this case, the true alpha-relative density ratio is known, so we can\ncompare it with the estimated density ratio `w-hat(x)`. 
The code below gives\nthe plot shown above.\n\n``` python\n# Continues from the example above: reuses norm, alpha and densratio_obj.\nimport numpy as np\nfrom matplotlib import pyplot as plt\n\ndef true_alpha_density_ratio(sample):\n    return norm.pdf(sample, 0, 1./8) / (alpha * norm.pdf(sample, 0, 1./8) + (1 - alpha) * norm.pdf(sample, 0, 1./2))\n\ndef estimated_alpha_density_ratio(sample):\n    return densratio_obj.compute_density_ratio(sample)\n\nsample_points = np.linspace(-1, 3, 400)\nplt.plot(sample_points, true_alpha_density_ratio(sample_points), 'b-', label='True Alpha-Relative Density Ratio')\nplt.plot(sample_points, estimated_alpha_density_ratio(sample_points), 'r-', label='Estimated Alpha-Relative Density Ratio')\nplt.title(\"Alpha-Relative Density Ratio - Normal Random Variables (alpha={:03.2f})\".format(alpha))\nplt.legend()\nplt.show()\n```\n\n## 2\\. Installation\n\nYou can install the package from\n[PyPI](https://pypi.org/project/densratio/).\n\n``` sh\n$ pip install densratio\n```\n\nYou can also install the package from\n[GitHub](https://github.com/hoxo-m/densratio_py).\n\n``` sh\n$ pip install git+https://github.com/hoxo-m/densratio_py.git\n```\n\nThe source code for the **densratio** package is available on GitHub at\n[hoxo-m/densratio_py](https://github.com/hoxo-m/densratio_py).\n\n## 3\\. Details\n\n### 3.1. Basics\n\nThe package provides `densratio()`. 
The function returns an object that\nhas a method to compute the estimated density ratio.\n\nFor data samples `x` and `y`,\n\n``` python\nfrom scipy.stats import norm\nfrom densratio import densratio\n\nx = norm.rvs(size=200, loc=1, scale=1./8)\ny = norm.rvs(size=200, loc=1, scale=1./2)\nresult = densratio(x, y)\n```\n\nHere, `result.compute_density_ratio()` evaluates the estimated\ndensity ratio at the given points.\n\n``` python\nfrom matplotlib import pyplot as plt\n\n# Evaluate the estimated ratio at the sample points of y.\ndensity_ratio = result.compute_density_ratio(y)\n\nplt.plot(y, density_ratio, \"o\")\nplt.xlabel(\"x\")\nplt.ylabel(\"Density Ratio\")\nplt.show()\n```\n\n![](https://raw.githubusercontent.com/hoxo-m/densratio_py/master/README_files/figure-gfm/plot-estimated-density-ratio-1.png)\n\n### 3.2. The Method\n\nThe package estimates the density ratio with the RuLSIF method.\n\n**RuLSIF** (Relative unconstrained Least-Squares Importance Fitting)\nestimates the alpha-relative density ratio by minimizing the squared\nloss between the true and estimated alpha-relative ratios. You can find\nmore information in Hido et al. (2011) \\[1\\] and Liu et al. (2013) \\[4\\].\n\nThe method assumes that the alpha-relative density ratio is represented\nby a linear kernel model:\n\n`w(x) = theta1 * K(x, c1) + theta2 * K(x, c2) + ... + thetab * K(x, cb)`\nwhere `K(x, c) = exp(- ||x - c||^2 / (2 * sigma ^ 2))` is the Gaussian\nRBF kernel.\n\n`densratio()` performs the following:\n\n  - Decides the kernel parameter `sigma` by cross-validation.\n  - Optimizes the kernel weights `theta`.\n  - Computes the alpha-relative PE-divergence and KL-divergence from the\n    learned alpha-relative ratio.\n\nAs a result, you obtain `compute_density_ratio()`, which\ncomputes the alpha-relative density ratio at the points passed to it.\n\n### 3.3. 
Result and Parameter Settings\n\nPrinting the object returned by `densratio()` gives output like the following:\n\n    #> Method: RuLSIF\n    #> \n    #> Alpha: 0\n    #> \n    #> Kernel Information:\n    #>   Kernel type: Gaussian\n    #>   Number of kernels: 100\n    #>   Bandwidth(sigma): 0.1\n    #>   Centers: matrix([[0.92113356],..\n    #> \n    #> Kernel Weights (theta):\n    #>   array([0.08848922, 0.03377533, 0.0753727 , 0.06141277, 0.02543963,..\n    #> \n    #> Regularization Parameter (lambda): 1.0\n    #> \n    #> Alpha-Relative PE-Divergence: 0.9635169300831035\n    #> \n    #> Alpha-Relative KL-Divergence: 0.8388266265473269\n    #> \n    #> Function to Estimate Density Ratio:\n    #>   compute_density_ratio(x)\n    #> \n\n  - **Method** is fixed as RuLSIF.\n  - **Kernel type** is fixed as Gaussian RBF.\n  - **Number of kernels** is the number of kernels in the linear model.\n    You can change it with the `kernel_num` parameter. By default,\n    `kernel_num = 100`.\n  - **Bandwidth(sigma)** is the Gaussian kernel bandwidth. By default,\n    `sigma = \"auto\"` and the algorithm automatically selects an optimal\n    value by cross-validation. If you set `sigma` to a number, that value\n    is used. If you set `sigma` to a numeric array, the algorithm selects\n    the best value among them by cross-validation (the example in Section 4\n    passes the candidate values as `sigma_range`).\n  - **Centers** are the centers of the Gaussian kernels in the linear model.\n    They are selected at random from the data sample `x` drawn from the\n    numerator distribution `p(x)`. You can find all of the values in\n    `result.kernel_info.centers`.\n  - **Kernel weights(theta)** are the theta parameters in the linear kernel\n    model. You can find these values in `result.theta`.\n  - **The function to estimate the alpha-relative density ratio** is\n    named `compute_density_ratio()`.\n\n## 4\\. 
Multidimensional Data Samples\n\nSo far, we have dealt with one-dimensional data samples `x` and `y`.\n`densratio()` also accepts multidimensional data samples as\n`numpy.ndarray` or `numpy.matrix`, as long as both samples have the same\ndimensionality.\n\nFor example,\n\n``` python\nimport numpy as np\nfrom scipy.stats import multivariate_normal\nfrom densratio import densratio\n\nnp.random.seed(1)\nx = multivariate_normal.rvs(size=3000, mean=[1, 1], cov=[[1. / 8, 0], [0, 1. / 8]])\ny = multivariate_normal.rvs(size=3000, mean=[1, 1], cov=[[1. / 2, 0], [0, 1. / 2]])\nalpha = 0\ndensratio_obj = densratio(x, y, alpha=alpha, sigma_range=[0.1, 0.3, 0.5, 0.7, 1], lambda_range=[0.01, 0.02, 0.03, 0.04, 0.05])\nprint(densratio_obj)\n```\n\ngives the following output:\n\n    #> Method: RuLSIF\n    #> \n    #> Alpha: 0\n    #> \n    #> Kernel Information:\n    #>   Kernel type: Gaussian\n    #>   Number of kernels: 100\n    #>   Bandwidth(sigma): 0.3\n    #>   Centers: matrix([[1.01477443, 1.38864061],..\n    #> \n    #> Kernel Weights (theta):\n    #>   array([0.06151164, 0.08012094, 0.10467369, 0.13868176, 0.14917063,..\n    #> \n    #> Regularization Parameter (lambda): 0.04\n    #> \n    #> Alpha-Relative PE-Divergence: 0.653615870855595\n    #> \n    #> Alpha-Relative KL-Divergence: 0.6214285743087549\n    #> \n    #> Function to Estimate Density Ratio:\n    #>   compute_density_ratio(x)\n    #> \n\nIn this case as well, we can compare the true density ratio with the\nestimated density ratio.\n\n``` python\nimport numpy as np\nfrom matplotlib import pyplot as plt\n\ndef true_alpha_density_ratio(x):\n    return multivariate_normal.pdf(x, [1., 1.], [[1. / 8, 0], [0, 1. / 8]]) / \\\n        (alpha * multivariate_normal.pdf(x, [1., 1.], [[1. / 8, 0], [0, 1. / 8]]) + (1 - alpha) * multivariate_normal.pdf(x, [1., 1.], [[1. / 2, 0], [0, 1. 
/ 2]]))\n\ndef estimated_alpha_density_ratio(x):\n    return densratio_obj.compute_density_ratio(x)\n\nrange_ = np.linspace(0, 2, 200)\ngrid = np.concatenate(np.dstack(np.meshgrid(range_, range_)))\nlevels = [0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4.5]\n\nplt.figure(figsize=(10, 4))\nplt.subplot(1, 2, 1)\nplt.contourf(range_, range_, true_alpha_density_ratio(grid).reshape(200, 200), levels)\nplt.colorbar()\nplt.title(\"True Alpha-Relative Density Ratio\")\nplt.subplot(1, 2, 2)\nplt.contourf(range_, range_, estimated_alpha_density_ratio(grid).reshape(200, 200), levels)\nplt.colorbar()\nplt.title(\"Estimated Alpha-Relative Density Ratio\")\nplt.show()\n```\n\n![](https://raw.githubusercontent.com/hoxo-m/densratio_py/master/README_files/figure-gfm/compare-2d-1.png)\n\n## 5\\. References\n\n\\[1\\] Hido, S., Tsuboi, Y., Kashima, H., Sugiyama, M., & Kanamori, T.\n**Statistical outlier detection using direct density ratio estimation.**\nKnowledge and Information Systems 2011.\n\n\\[2\\] Sugiyama, M., Nakajima, S., Kashima, H., von B\u00fcnau, P. & Kawanabe,\nM. **Direct importance estimation with model selection and its\napplication to covariate shift adaptation.** NIPS 2007.\n\n\\[3\\] Sugiyama, M., Suzuki, T. & Kanamori, T. **Density Ratio Estimation\nin Machine Learning.** Cambridge University Press 2012.\n\n\\[4\\] Liu, S., Yamada, M., Collier, N., & Sugiyama, M. 
**Change-Point\nDetection in Time-Series Data by Relative Density-Ratio Estimation**\nNeural Networks, 2013.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/hoxo-m/densratio_py", "keywords": "density ratio,anomaly detection,change point detection,covariate shift", "license": "MIT + file LICENSE", "maintainer": "", "maintainer_email": "", "name": "densratio", "package_url": "https://pypi.org/project/densratio/", "platform": "", "project_url": "https://pypi.org/project/densratio/", "project_urls": { "Bug Reports": "https://github.com/hoxo-m/densratio_py/issues", "Homepage": "https://github.com/hoxo-m/densratio_py", "Say Thanks!": "https://saythanks.io/to/hoxo-m", "Source": "https://github.com/hoxo-m/densratio_py" }, "release_url": "https://pypi.org/project/densratio/0.2.2/", "requires_dist": [ "numpy" ], "requires_python": "", "summary": "A Python Package for Density Ratio Estimation", "version": "0.2.2" }, "last_serial": 5250357, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "9e7197c7441eac74c63a0e5e120eaa19", "sha256": "7818773f729a6d49e12d67ea30bf2a8b6a308b9581c36a8dc5ef1e6ddbd447ef" }, "downloads": -1, "filename": "densratio-0.1.0.tar.gz", "has_sig": false, "md5_digest": "9e7197c7441eac74c63a0e5e120eaa19", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3496, "upload_time": "2016-12-09T13:29:17", "url": "https://files.pythonhosted.org/packages/67/c2/5cc3c876726b6e1d17f751d0b5e5767f912d020b7f36834bcacaad5cf3de/densratio-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "9634cd007f30fcc37a0dfc29cbbddc98", "sha256": "a70854967a06e9fe1d4195abcac96156dd2e855ecde829f9d94ecd204b93bc9d" }, "downloads": -1, "filename": "densratio-0.1.1.tar.gz", "has_sig": false, "md5_digest": "9634cd007f30fcc37a0dfc29cbbddc98", "packagetype": "sdist", "python_version": 
"source", "requires_python": null, "size": 3474, "upload_time": "2016-12-09T13:37:45", "url": "https://files.pythonhosted.org/packages/8b/80/635059fc967e1e7a7e65e072a8dcf181de13d35c13c698f2d33f04c5b1c9/densratio-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "5f9e7e2ae2afea40896d6d39a6ceb4b0", "sha256": "e0e26b76a60ba9cd0de20ffbf29d7b83220ce9e2dc3fbef14d7877d7674f7a82" }, "downloads": -1, "filename": "densratio-0.1.2.tar.gz", "has_sig": false, "md5_digest": "5f9e7e2ae2afea40896d6d39a6ceb4b0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3472, "upload_time": "2016-12-09T23:35:52", "url": "https://files.pythonhosted.org/packages/3a/8d/f9790c9c8411052da46658e7033ce468cc5555d02fdfd0cc3f583fce3e67/densratio-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "7f4e8f21c8d281fe564e6ec463a849df", "sha256": "5214a26876f2f41b1e75acdf1b58fe6a6c941f2404ecc899d15e82af2a2a5d61" }, "downloads": -1, "filename": "densratio-0.1.3.tar.gz", "has_sig": false, "md5_digest": "7f4e8f21c8d281fe564e6ec463a849df", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3424, "upload_time": "2016-12-09T23:38:35", "url": "https://files.pythonhosted.org/packages/8f/35/2dfa80ee7fdcd2bff2c2073077c7cba4c31038638a81ad78ac073eee4ec6/densratio-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "9ef5c5ef2183654c4c71a6367f80fa15", "sha256": "a6414c9a57f7de393008ffc7db89b742125429d740db6056addffb9d267823bf" }, "downloads": -1, "filename": "densratio-0.1.4.tar.gz", "has_sig": false, "md5_digest": "9ef5c5ef2183654c4c71a6367f80fa15", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3480, "upload_time": "2016-12-11T05:05:42", "url": "https://files.pythonhosted.org/packages/70/41/6b582e9e264a0fad258e89fc9061ed689ae766232e2ef719ec2afc0a3a4e/densratio-0.1.4.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": 
"c537cf5381f211c4861dc8d4d750bc41", "sha256": "64115e5b8b16edeb9bebc0238103d1a927f1c70f6798351d9033ae2d46ffd449" }, "downloads": -1, "filename": "densratio-0.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "c537cf5381f211c4861dc8d4d750bc41", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9831, "upload_time": "2019-05-10T03:10:33", "url": "https://files.pythonhosted.org/packages/5a/4d/834ef1457da2038eca9c148a141820d39a0dbd6a7566ddfcf0f472d32b21/densratio-0.2.2-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c537cf5381f211c4861dc8d4d750bc41", "sha256": "64115e5b8b16edeb9bebc0238103d1a927f1c70f6798351d9033ae2d46ffd449" }, "downloads": -1, "filename": "densratio-0.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "c537cf5381f211c4861dc8d4d750bc41", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9831, "upload_time": "2019-05-10T03:10:33", "url": "https://files.pythonhosted.org/packages/5a/4d/834ef1457da2038eca9c148a141820d39a0dbd6a7566ddfcf0f472d32b21/densratio-0.2.2-py3-none-any.whl" } ] }