{ "info": { "author": "Cam Davidson-pilon", "author_email": "cam.davidson.pilon@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "License :: OSI Approved :: MIT License", "Programming Language :: Python", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering" ], "description": "# tdigest\n### Efficient percentile estimation of streaming or distributed data\n[![PyPI version](https://badge.fury.io/py/tdigest.svg)](https://badge.fury.io/py/tdigest)\n[![Build Status](https://travis-ci.org/CamDavidsonPilon/tdigest.svg?branch=master)](https://travis-ci.org/CamDavidsonPilon/tdigest)\n\n\nThis is a Python implementation of Ted Dunning's [t-digest](https://github.com/tdunning/t-digest) data structure. The t-digest data structure is designed around computing accurate estimates from either streaming data, or distributed data. These estimates are percentiles, quantiles, trimmed means, etc. Two t-digests can be added, making the data structure ideal for map-reduce settings, and can be serialized into much less than 10kB (instead of storing the entire list of data).\n\nSee a blog post about it here: [Percentile and Quantile Estimation of Big Data: The t-Digest](http://dataorigami.net/blogs/napkin-folding/19055451-percentile-and-quantile-estimation-of-big-data-the-t-digest)\n\n\n### Installation\n*tdigest* is compatible with both Python 2 and Python 3. 
\n\n```\npip install tdigest\n```\n\n### Usage\n\n#### Update the digest sequentially\n\n```\nfrom tdigest import TDigest\nfrom numpy.random import random\n\ndigest = TDigest()\nfor x in range(5000):\n    digest.update(random())\n\nprint(digest.percentile(15)) # about 0.15, as 0.15 is the 15th percentile of the Uniform(0,1) distribution\n```\n\n#### Update the digest in batches\n\n```\nanother_digest = TDigest()\nanother_digest.batch_update(random(5000))\nprint(another_digest.percentile(15))\n```\n\n#### Sum two digests to create a new digest\n\n```\nsum_digest = digest + another_digest\nsum_digest.percentile(30) # about 0.3\n```\n\n#### Converting to a dict or serializing a digest with JSON\n\nYou can use the `to_dict()` method to turn a TDigest object into a standard Python dictionary.\n```\ndigest = TDigest()\ndigest.update(1)\ndigest.update(2)\ndigest.update(3)\nprint(digest.to_dict())\n```\nOr you can get only a list of Centroids with `centroids_to_list()`.\n```\ndigest.centroids_to_list()\n```\n\nSimilarly, you can restore a Python dict of digest values with `update_from_dict()`. 
Centroids are merged with any existing ones in the digest.\nFor example, make a fresh digest and restore values from a Python dictionary.\n```\ndigest = TDigest()\ndigest.update_from_dict({'K': 25, 'delta': 0.01, 'centroids': [{'c': 1.0, 'm': 1.0}, {'c': 1.0, 'm': 2.0}, {'c': 1.0, 'm': 3.0}]})\n```\n\nThe `K` and `delta` values are optional. Alternatively, you can provide only a list of centroids with `update_centroids_from_list()`.\n```\ndigest = TDigest()\ndigest.update_centroids_from_list([{'c': 1.0, 'm': 1.0}, {'c': 1.0, 'm': 2.0}, {'c': 1.0, 'm': 3.0}])\n```\n\nIf you want to serialize with other tools like JSON, you can first convert with `to_dict()`.\n```\nimport json\njson.dumps(digest.to_dict())\n```\n\nAlternatively, write a custom encoder function to pass as `default` to the standard `json` module.\n```\ndef encoder(digest_obj):\n    return digest_obj.to_dict()\n```\nThen pass the encoder function as the `default` parameter.\n```\njson.dumps(digest, default=encoder)\n```\n\n\n### API\n\n`TDigest.`\n\n - `update(x, w=1)`: update the tdigest with value `x` and weight `w`.\n - `batch_update(x, w=1)`: update the tdigest with values in array `x` and weight `w`.\n - `compress()`: compress the underlying data structure to shrink its memory footprint without hurting accuracy. Good to perform after adding many values.\n - `percentile(p)`: return the `p`th percentile. Example: `p=50` is the median.\n - `cdf(x)`: return the value of the CDF at `x`.\n - `trimmed_mean(p1, p2)`: return the mean of the data set, excluding values below the `p1` percentile and above the `p2` percentile. 
\n - `to_dict()`: return a Python dictionary of the TDigest and internal Centroid values.\n - `update_from_dict(dict_values)`: update from serialized dictionary values into the TDigest object.\n - `centroids_to_list()`: return a Python list of the TDigest object's internal Centroid values.\n - `update_centroids_from_list(list_values)`: update Centroids from a python list.\n\n\n\n\n\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/CamDavidsonPilon/tdigest", "keywords": "percentile,median,probabilistic data structure,quantile,distributed,qdigest,tdigest,streaming,pyspark", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "tdigest", "package_url": "https://pypi.org/project/tdigest/", "platform": "", "project_url": "https://pypi.org/project/tdigest/", "project_urls": { "Homepage": "https://github.com/CamDavidsonPilon/tdigest" }, "release_url": "https://pypi.org/project/tdigest/0.5.2.2/", "requires_dist": [ "accumulation-tree", "pyudorandom", "pytest ; extra == 'tests'", "pytest-timeout ; extra == 'tests'", "pytest-cov ; extra == 'tests'", "numpy ; extra == 'tests'" ], "requires_python": "", "summary": "T-Digest data structure", "version": "0.5.2.2" }, "last_serial": 5239473, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "9bf920190be94c471be8ca9fa9863295", "sha256": "4167cadb797b1287104b07072447f67b33cc0de4740b002959952c1f6171cd7c" }, "downloads": -1, "filename": "tdigest-0.1.0.tar.gz", "has_sig": false, "md5_digest": "9bf920190be94c471be8ca9fa9863295", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3233, "upload_time": "2015-05-10T14:17:54", "url": "https://files.pythonhosted.org/packages/53/66/1e8164b3fe37c46b445aa65dd546789cc51daf986146c4efc69bc0021153/tdigest-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": 
"5ee487f0c02db4be8636aef6444fddad", "sha256": "87cf1636ea31b7d015a2aeeffc1d24b31ea92a7c7db9840559f13ec0c8a789b1" }, "downloads": -1, "filename": "tdigest-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5ee487f0c02db4be8636aef6444fddad", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4192, "upload_time": "2015-05-10T17:38:49", "url": "https://files.pythonhosted.org/packages/6d/ff/99649b1cad439f80dbb6bfe0cfc9fa16df06d839efba26f58447d13c9404/tdigest-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "212b57e54aec8b4fe4098fdead518abc", "sha256": "b3a8e46c8756dadb53b80aaa07a301b8c080cbc055c71745022bf08c85037df6" }, "downloads": -1, "filename": "tdigest-0.1.2.tar.gz", "has_sig": false, "md5_digest": "212b57e54aec8b4fe4098fdead518abc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4194, "upload_time": "2015-05-31T16:20:27", "url": "https://files.pythonhosted.org/packages/92/1e/9fa8a5d4a5c8cf3daff3292148731c80a49c5ead3a717de05697d58e015b/tdigest-0.1.2.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "e8adb1885eae8c1105019d474f69c008", "sha256": "4cca9def0356e32f0178faf996b1c899446c70c08b096e92c339af93942fbce6" }, "downloads": -1, "filename": "tdigest-0.2.0.tar.gz", "has_sig": false, "md5_digest": "e8adb1885eae8c1105019d474f69c008", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4828, "upload_time": "2015-06-09T03:39:52", "url": "https://files.pythonhosted.org/packages/34/06/a0fb53624218ce35ab6de3ceb8713a18c20828cdb8ea299660281dbc13e1/tdigest-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "c2858dc043ad9bd367227b2b98a804cc", "sha256": "41cbb1c5b7cd5f56f75375784c5bf944b191fdcdaee65fcccb8ad3d67b328769" }, "downloads": -1, "filename": "tdigest-0.3.0.tar.gz", "has_sig": false, "md5_digest": "c2858dc043ad9bd367227b2b98a804cc", "packagetype": "sdist", "python_version": "source", "requires_python": null, 
"size": 4828, "upload_time": "2015-07-01T16:38:11", "url": "https://files.pythonhosted.org/packages/f5/49/63ad8cea535e3848b1eb701b131a35f498bea182afb5860c3a463fc97c38/tdigest-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "d3308380dd96631540048a6d23f35bc5", "sha256": "873e013f60d0b56d4563890a88fcab7c767715c5ec1ed331628a4190610dab03" }, "downloads": -1, "filename": "tdigest-0.4.0.tar.gz", "has_sig": false, "md5_digest": "d3308380dd96631540048a6d23f35bc5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4849, "upload_time": "2015-07-20T02:01:28", "url": "https://files.pythonhosted.org/packages/db/17/cba7ac5a2ccdd437529d6274614b698722bccd7eac164df03ae32363572b/tdigest-0.4.0.tar.gz" } ], "0.4.0.1": [ { "comment_text": "", "digests": { "md5": "df54f358a007c9659d9291766da5ad7b", "sha256": "c8c29fb7c98f07f52b420a0bd92dadc582b9731b75b4e02aa53ef0900fd24699" }, "downloads": -1, "filename": "tdigest-0.4.0.1.tar.gz", "has_sig": false, "md5_digest": "df54f358a007c9659d9291766da5ad7b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4855, "upload_time": "2015-10-10T03:47:49", "url": "https://files.pythonhosted.org/packages/21/c9/76a30b19aecddadf5f1929c435e89fc5848307103a491bb3f459e8619fb7/tdigest-0.4.0.1.tar.gz" } ], "0.4.0.2": [ { "comment_text": "", "digests": { "md5": "3b07a1090348d28ffb98e88861cbc35b", "sha256": "8296a9caf1b3ddbe5e651b7b0d3afae015de7c4476c122d99e211d4df0560e7f" }, "downloads": -1, "filename": "tdigest-0.4.0.2.tar.gz", "has_sig": false, "md5_digest": "3b07a1090348d28ffb98e88861cbc35b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4948, "upload_time": "2016-07-21T17:59:56", "url": "https://files.pythonhosted.org/packages/d5/fe/19db841259000e1ab7d82a0c3f40f668cfb9eca7e0c52c27ee6ac31965cb/tdigest-0.4.0.2.tar.gz" } ], "0.4.1.0": [ { "comment_text": "", "digests": { "md5": "c19a15fc4d6cec2541858223231cd321", "sha256": 
"26551ce8870e44d15b0b4e827462de96fae54d8b69eaa37433118237179baf23" }, "downloads": -1, "filename": "tdigest-0.4.1.0.tar.gz", "has_sig": false, "md5_digest": "c19a15fc4d6cec2541858223231cd321", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5022, "upload_time": "2016-08-27T01:30:12", "url": "https://files.pythonhosted.org/packages/6c/5a/d2f6dd8e5dadf2b2685e0e8d76b3176b47c2f3374213fd215f100676c641/tdigest-0.4.1.0.tar.gz" } ], "0.5.0.0": [ { "comment_text": "", "digests": { "md5": "af9bb9ef6309e0fc30904776a4fedb0c", "sha256": "23b13868b3a0f8fdbd11425ca103929ac76870a5d470a48655cdd44071cecdf5" }, "downloads": -1, "filename": "tdigest-0.5.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "af9bb9ef6309e0fc30904776a4fedb0c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 9749, "upload_time": "2017-12-27T04:49:56", "url": "https://files.pythonhosted.org/packages/5b/34/708ac0f9c65d080e2d9f32e6e24ee7cfebdf04627dbe1314bb3b713922da/tdigest-0.5.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "75b8973535f0dc66ba227e3ed7ffd658", "sha256": "7aa665116b28d628b25a37e6640d461660d34f2502cd143370c36713b66183f9" }, "downloads": -1, "filename": "tdigest-0.5.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "75b8973535f0dc66ba227e3ed7ffd658", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9745, "upload_time": "2017-12-27T04:46:36", "url": "https://files.pythonhosted.org/packages/c7/2a/b8d383790023f3184b0bc0978bdea2203f8044ad4b13e1e70bad2cb3537a/tdigest-0.5.0.0-py3-none-any.whl" } ], "0.5.1.0": [ { "comment_text": "", "digests": { "md5": "055991a8e0b277151827250e41d5658b", "sha256": "01fb1e02d9ecb8e9c4810405827c0dc84a9afc600d1cba232c9d77834e9a3691" }, "downloads": -1, "filename": "tdigest-0.5.1.0-py2-none-any.whl", "has_sig": false, "md5_digest": "055991a8e0b277151827250e41d5658b", "packagetype": "bdist_wheel", "python_version": 
"py2", "requires_python": null, "size": 9711, "upload_time": "2018-02-14T15:00:57", "url": "https://files.pythonhosted.org/packages/51/bf/b115637cdc037a31771c628d84cdacf79c40da593e6c81c2372efedf5632/tdigest-0.5.1.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "fd2ee13e12c27722a91350b025b12fe1", "sha256": "f42099001442d461df17f28f3c47bc11bcf6f0fa4716bec718e8c97acd2dcf95" }, "downloads": -1, "filename": "tdigest-0.5.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "fd2ee13e12c27722a91350b025b12fe1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9670, "upload_time": "2018-02-09T16:40:01", "url": "https://files.pythonhosted.org/packages/c5/85/0cb268e3efa0532146d0c96a23f2574cdb0ba4123286f82821362e51d524/tdigest-0.5.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5494e0c7e4f7c3df4450feb0558e630c", "sha256": "5546c32c0e7c18f6873a00637afa9b524290d47f0f0de6964c1247787e62bf8a" }, "downloads": -1, "filename": "tdigest-0.5.1.0.tar.gz", "has_sig": false, "md5_digest": "5494e0c7e4f7c3df4450feb0558e630c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5765, "upload_time": "2018-02-14T15:02:25", "url": "https://files.pythonhosted.org/packages/6f/05/678ce3837a02f4a9dbef8cb88ef2bbc38be2127ba6dda4ef0ed365f788eb/tdigest-0.5.1.0.tar.gz" } ], "0.5.2.0": [ { "comment_text": "", "digests": { "md5": "b8c5da1be4cfd2a395238b3aedd2ed3e", "sha256": "94d08df0a17035cdf7ab904c590f23c882811b50c415e3e866d154c16ce11a42" }, "downloads": -1, "filename": "tdigest-0.5.2.0-py2-none-any.whl", "has_sig": false, "md5_digest": "b8c5da1be4cfd2a395238b3aedd2ed3e", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 12107, "upload_time": "2018-03-12T15:01:53", "url": "https://files.pythonhosted.org/packages/6e/53/073e8639f0dbeb760e627864a085e83f0f8f244e1ad9253b1bb19732fa2f/tdigest-0.5.2.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"81fa191371ae09cdc4a98fc808c02526", "sha256": "8ff243f138e7702803d221aeb6526525ad384a8e0cf356133b87238c03d1788c" }, "downloads": -1, "filename": "tdigest-0.5.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "81fa191371ae09cdc4a98fc808c02526", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12069, "upload_time": "2018-03-12T15:01:54", "url": "https://files.pythonhosted.org/packages/b3/9c/79805580305ffa4c8aaf99a13219fb6950067088976f97bf551a61d4de76/tdigest-0.5.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3f2257fe4eb61eca00826413e4a6c6ef", "sha256": "080825c2c6a8c0a494774e7e69fbd5ba850f9777b812558268614b12ead73d78" }, "downloads": -1, "filename": "tdigest-0.5.2.0.tar.gz", "has_sig": false, "md5_digest": "3f2257fe4eb61eca00826413e4a6c6ef", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7043, "upload_time": "2018-03-12T15:01:55", "url": "https://files.pythonhosted.org/packages/d9/e3/f509a40b3b3d31cf0318524c994ef07b1e3aeb2e7fc7da2fb89bb04d2842/tdigest-0.5.2.0.tar.gz" } ], "0.5.2.1": [ { "comment_text": "", "digests": { "md5": "8b988c655b3989e54696dbb4d53626a6", "sha256": "e0febed55737e1b5cf67e530a38beaaadad68c4e0c37229d3d8f662b99c914d7" }, "downloads": -1, "filename": "tdigest-0.5.2.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "8b988c655b3989e54696dbb4d53626a6", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 18615, "upload_time": "2018-05-05T13:25:38", "url": "https://files.pythonhosted.org/packages/27/41/b714941a6dba3760ddf2c2604daabbb578bcd6063f57ecdbe2c1d8ce4a79/tdigest-0.5.2.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e6943c24fc8385aee7972a8e0f23c8be", "sha256": "8b1e554356dcc176a2e15265b2d2c39f67a388524b5dd121bd03e929b1c52e7b" }, "downloads": -1, "filename": "tdigest-0.5.2.1.tar.gz", "has_sig": false, "md5_digest": "e6943c24fc8385aee7972a8e0f23c8be", "packagetype": "sdist", 
"python_version": "source", "requires_python": null, "size": 7101, "upload_time": "2018-05-05T13:25:40", "url": "https://files.pythonhosted.org/packages/fb/61/30ce52974601199c1f5d38c03989aa6c913d27892b5e7b4f035166f561c5/tdigest-0.5.2.1.tar.gz" } ], "0.5.2.2": [ { "comment_text": "", "digests": { "md5": "0be092d4caf62c7e54c27380664de896", "sha256": "e32ff6ab62e4defdb93b816c831080d94dfa1efb68a9fa1e7976c237fa9375cb" }, "downloads": -1, "filename": "tdigest-0.5.2.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "0be092d4caf62c7e54c27380664de896", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 9445, "upload_time": "2019-05-07T18:57:37", "url": "https://files.pythonhosted.org/packages/32/72/f420480118cbdd18eb761b9936f0a927957130659a638449575b4a4f0aa7/tdigest-0.5.2.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8655b11bc115465cf53acab1be3e0b11", "sha256": "dd25f8d6e6be002192bba9e4b8c16491d36c10b389f50637818603d1f67c6fb2" }, "downloads": -1, "filename": "tdigest-0.5.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "8655b11bc115465cf53acab1be3e0b11", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9440, "upload_time": "2019-05-07T18:57:38", "url": "https://files.pythonhosted.org/packages/b4/94/fd3853b98f39d10206b08f2737d2ec2dc6f46a42dc7b7e05f4f0162d13ee/tdigest-0.5.2.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "07637824cb88ef904bb5dade8e7408d1", "sha256": "8deffc8bac024761786f43d9444e3b6c91008cd690323e051f068820a7364d0e" }, "downloads": -1, "filename": "tdigest-0.5.2.2.tar.gz", "has_sig": false, "md5_digest": "07637824cb88ef904bb5dade8e7408d1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6549, "upload_time": "2019-05-07T18:57:40", "url": "https://files.pythonhosted.org/packages/dd/34/7e2f78d1ed0af7d0039ab2cff45b6bf8512234b9f178bb21713084a1f2f0/tdigest-0.5.2.2.tar.gz" } ] }, "urls": [ { 
"comment_text": "", "digests": { "md5": "0be092d4caf62c7e54c27380664de896", "sha256": "e32ff6ab62e4defdb93b816c831080d94dfa1efb68a9fa1e7976c237fa9375cb" }, "downloads": -1, "filename": "tdigest-0.5.2.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "0be092d4caf62c7e54c27380664de896", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 9445, "upload_time": "2019-05-07T18:57:37", "url": "https://files.pythonhosted.org/packages/32/72/f420480118cbdd18eb761b9936f0a927957130659a638449575b4a4f0aa7/tdigest-0.5.2.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8655b11bc115465cf53acab1be3e0b11", "sha256": "dd25f8d6e6be002192bba9e4b8c16491d36c10b389f50637818603d1f67c6fb2" }, "downloads": -1, "filename": "tdigest-0.5.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "8655b11bc115465cf53acab1be3e0b11", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9440, "upload_time": "2019-05-07T18:57:38", "url": "https://files.pythonhosted.org/packages/b4/94/fd3853b98f39d10206b08f2737d2ec2dc6f46a42dc7b7e05f4f0162d13ee/tdigest-0.5.2.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "07637824cb88ef904bb5dade8e7408d1", "sha256": "8deffc8bac024761786f43d9444e3b6c91008cd690323e051f068820a7364d0e" }, "downloads": -1, "filename": "tdigest-0.5.2.2.tar.gz", "has_sig": false, "md5_digest": "07637824cb88ef904bb5dade8e7408d1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6549, "upload_time": "2019-05-07T18:57:40", "url": "https://files.pythonhosted.org/packages/dd/34/7e2f78d1ed0af7d0039ab2cff45b6bf8512234b9f178bb21713084a1f2f0/tdigest-0.5.2.2.tar.gz" } ] }