{ "info": { "author": "R\u00e9my Frenoy", "author_email": "rfrenoy@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# Introduction\n\nThis is a python implementation of [this paper](https://www.cse.wustl.edu/~jain/papers/ftp/psqr.pdf), \nwhich proposes a heuristic algorithm for dynamic calculation of the median and other percentiles.\nIt has the advantage of running in O(1) at each iteration and is hence particularly useful when \ndealing with continuously (and fastly) incoming data.\n\n# Install\n\n1. From sources\n ```bash\n git clone git@gitlab.octo.com:bdalab/psquare.git\n cd psquare\n python setup.py install\n ```\n\n2. Using pip\n ```bash\n pip install psquare\n ```\n\n# Example of use\n\nIn the following example, we will estimate the value of the 95th percentile of a N(0,1) distribution\nusing p square algorithm. We will compare our estimates as we continuously draw from the distribution,\nby comparing with the truth value given by the ````numpy.percentile```` function. At the end, we will plot\nthe residuals, and the execution time using psquare and ```numpy.percentile```.\n\n```python\nimport numpy as np\nimport matplotlib\nimport matplotlib.pyplot as plt\nimport time\n\nfrom psquare.psquare import PSquare\n\nNB_ITERATIONS = 30000\n\n\ndef random_generator():\n return np.random.normal(0, 1, 1)\n\n\ndef exact_value_for_quantile(values, quantile):\n return np.percentile(values, quantile)\n\n\ndef main():\n values = [random_generator() for _ in range(5)]\n quantile_to_estimate = 95\n\n psquare = PSquare(quantile_to_estimate)\n exact_quantiles = []\n estimated_quantiles = []\n psquare_exec_time = []\n numpy_exec_time = []\n\n for val in values: # p square algorithm necessitates 5 values to start\n psquare.update(val)\n\n for _ in range(NB_ITERATIONS):\n new_val = random_generator()\n values.append(new_val)\n\n psquare_start = time.time()\n psquare.update(new_val)\n estimated_quantiles.append(psquare.p_estimate())\n psquare_end = time.time()\n\n numpy_start = time.time()\n exact_quantiles.append(exact_value_for_quantile(values, quantile_to_estimate))\n numpy_end = time.time()\n\n psquare_exec_time.append(psquare_end - psquare_start)\n numpy_exec_time.append(numpy_end - numpy_start)\n\n matplotlib.rc('figure', figsize=(10, 5))\n errors = np.abs(np.array(estimated_quantiles) - np.array(exact_quantiles))\n plt.plot(errors)\n plt.title('Absolute error between p-square predicted value and exact percentile value')\n plt.ylabel('Difference between exact percentile value and p-square estimation')\n plt.xlabel('Size of the dataset')\n plt.rcParams[\"figure.figsize\"] = (10, 5)\n plt.show()\n\n plt.plot(psquare_exec_time[1:], label='p-square')\n plt.plot(numpy_exec_time[1:], label='numpy percentile')\n plt.title('Execution time to compute percentile on a growing dataset')\n plt.ylabel('Execution time (in seconds)')\n plt.xlabel('Size of the dataset')\n plt.legend()\n plt.rcParams[\"figure.figsize\"] = (10, 5)\n plt.show()\n\n\nif __name__ == '__main__':\n main()\n```\n\n## Error between estimated and exact value\n\nUsing previous example we obtain the following figures:\n\n\n* Errors between exact and predicted percentile value\n ![](./img/errors.png)\n\n* Execution time between p-square estimations and numpy _percentile_ function:\n ![](./img/execution_times.png)\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://gitlab.octo.com/bdalab/psquare", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "psquare", "package_url": "https://pypi.org/project/psquare/", "platform": "", "project_url": "https://pypi.org/project/psquare/", "project_urls": { "Homepage": "https://gitlab.octo.com/bdalab/psquare" }, "release_url": "https://pypi.org/project/psquare/1.0/", "requires_dist": [ "numpy", "matplotlib ; extra == 'example'" ], "requires_python": "", "summary": "This is a Python implementation of the p-square algorithm", "version": "1.0" }, "last_serial": 4625734, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "005add7ea83e73a342a2cb2f8201ed8b", "sha256": "50002ff25cece5896da63a6756fc2edb9302366c5dd31bcd1a40b0c023461cad" }, "downloads": -1, "filename": "psquare-1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "005add7ea83e73a342a2cb2f8201ed8b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 4889, "upload_time": "2018-12-21T16:41:27", "url": "https://files.pythonhosted.org/packages/2f/4e/f745d180d45234f50aa7c5e53bf90e0734ad2b901f523b43bc4d6d2e4c8b/psquare-1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4f7818f99d8d0acaee1273dbddcc5fd6", "sha256": "7c0a23f083d409b4605a69f9e64a1dd4d81442e1040f99f0d0103ab08cfbd9e7" }, "downloads": -1, "filename": "psquare-1.0.tar.gz", "has_sig": false, "md5_digest": "4f7818f99d8d0acaee1273dbddcc5fd6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3891, "upload_time": "2018-12-21T16:41:29", "url": "https://files.pythonhosted.org/packages/00/fe/2cf92bba70b336da67c461fd451aa7c64ab0d64c06bcfb23dc479cc06a72/psquare-1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "005add7ea83e73a342a2cb2f8201ed8b", "sha256": "50002ff25cece5896da63a6756fc2edb9302366c5dd31bcd1a40b0c023461cad" }, "downloads": -1, "filename": "psquare-1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "005add7ea83e73a342a2cb2f8201ed8b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 4889, "upload_time": "2018-12-21T16:41:27", "url": "https://files.pythonhosted.org/packages/2f/4e/f745d180d45234f50aa7c5e53bf90e0734ad2b901f523b43bc4d6d2e4c8b/psquare-1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4f7818f99d8d0acaee1273dbddcc5fd6", "sha256": "7c0a23f083d409b4605a69f9e64a1dd4d81442e1040f99f0d0103ab08cfbd9e7" }, "downloads": -1, "filename": "psquare-1.0.tar.gz", "has_sig": false, "md5_digest": "4f7818f99d8d0acaee1273dbddcc5fd6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3891, "upload_time": "2018-12-21T16:41:29", "url": "https://files.pythonhosted.org/packages/00/fe/2cf92bba70b336da67c461fd451aa7c64ab0d64c06bcfb23dc479cc06a72/psquare-1.0.tar.gz" } ] }