{
"info": {
"author": "",
"author_email": "",
"bugtrack_url": null,
"classifiers": [],
"description": "## Less Hash Bloom Filter\n[](https://travis-ci.com/garawalid/LH-BloomFilter)\n\nLess Hash Bloom Filter is fast bloom filter suitable for Big Data. \n\nThe computation of hash functions and checking the existence of an element is a major computation overhead. \nAlso, bloom filter requires multiple independent hash functions, and well-designed hash functions are computation-intensive like MD5, SHA-1 [1]. \n\nIn this implementation, we use a different technique to generate the k hash functions from only two. Therefore, the bloom filter is fast.\n\n### Install\n\nInstall Less Hash Bloom Filter with pip as follows:\n```\n$ pip install LessHash-BloomFilter\n```\n\n### Usage\n\nLHBF needs to know the size of bloom filter `m` and number of hash functions `k`.\n\n**Note:** You should use high `m` to avoid the collision of hash functions. The probability of two random strings colliding is ~ 1/m\n\n\n\n```python\nfrom lhbf import BloomFilter\n\n# Create a bloom filter \nbf = BloomFilter(m=200, k=2)\n\n# Add an element\nbf.add(\"a\")\n\n# Check if element exists\nbf.might_contain(\"a\")\n\n# Estimate flase positive probability \nbf.estimate_fpp()\n\n# Combine two bloom filters\nbf2 = BloomFilter(m=200, k=2)\nbf.combine(bf2)\n\n```\n\n### Details\n\n+ Hash functions used: \n + For integer, we use **Knuth multiplicative hash** [2]\n + For string, we use **polynomial rolling hash function** [3]\n\n+ k hash functions: \n\n Using two hash functions, we calculate the k hash functions as follows: \n \n *gi(x) = h1(x) + i x h2(x) mod m, where 0 ≤ i ≤ k-1* \n \n It has been proved that using this method does not increase the asymptotic false positive probability [4].\n\n\n### Contributing\nYou're welcome to submit pull requests with any changes for this repository at any time. I'll be very glad to see any contributions.\n\n\n### References\n\n+ [1] Luo, Lailong, et al. Optimizing bloom filter: challenges, solutions, and comparisons. IEEE Communications Surveys & Tutorials (2018). \n+ [2] Knuth, Donald Ervin. The art of computer programming: sorting and searching. Vol. 3. Pearson Education, 1997.\n+ [3] Karp, Richard M., and Michael O. Rabin. Efficient randomized pattern-matching algorithms. IBM journal of research and development 31.2 (1987): 249-260. \n+ [4] Kirsch, Adam, and Michael Mitzenmacher. Less hashing, same performance: building a better bloom filter. European Symposium on Algorithms. Springer, Berlin, Heidelberg, 2006.",
"description_content_type": "text/markdown",
"docs_url": null,
"download_url": "https://github.com/garawalid/LH-BloomFilter/archive/v0.0.4.tar.gz",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/garawalid/LH-BloomFilter",
"keywords": "",
"license": "Apache License v2",
"maintainer": "Walid Gara",
"maintainer_email": "",
"name": "LessHash-BloomFilter",
"package_url": "https://pypi.org/project/LessHash-BloomFilter/",
"platform": "",
"project_url": "https://pypi.org/project/LessHash-BloomFilter/",
"project_urls": {
"Download": "https://github.com/garawalid/LH-BloomFilter/archive/v0.0.4.tar.gz",
"Homepage": "https://github.com/garawalid/LH-BloomFilter"
},
"release_url": "https://pypi.org/project/LessHash-BloomFilter/0.0.5/",
"requires_dist": null,
"requires_python": "",
"summary": "Fast Bloom Filter",
"version": "0.0.5"
},
"last_serial": 5437315,
"releases": {
"0.0.2": [
{
"comment_text": "",
"digests": {
"md5": "352569f2bfa7d6ecceef99cd0a995a95",
"sha256": "c765ffc67104ecec4fcceebaeb82c72baae0e17c3263d5a281c6a4e2136f5d9a"
},
"downloads": -1,
"filename": "LessHash-BloomFilter-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "352569f2bfa7d6ecceef99cd0a995a95",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3712,
"upload_time": "2019-06-07T19:27:54",
"url": "https://files.pythonhosted.org/packages/27/a4/707de323d0b5ccadaa5ae7e68f4e67763f3dc161b684477a9f96d25f3dc5/LessHash-BloomFilter-0.0.2.tar.gz"
}
],
"0.0.3": [
{
"comment_text": "",
"digests": {
"md5": "07d251c783bbe36a4a7991ee1d6ff142",
"sha256": "d032e8f198a3598843ec80d0f1940c213fdcefaaca2160e4ec7235e8c5c06858"
},
"downloads": -1,
"filename": "LessHash-BloomFilter-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "07d251c783bbe36a4a7991ee1d6ff142",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3885,
"upload_time": "2019-06-07T19:36:35",
"url": "https://files.pythonhosted.org/packages/92/03/32e6762159c5b71ff39e082e9555c337fb2fbf62dad4d593b113ab46b3ff/LessHash-BloomFilter-0.0.3.tar.gz"
}
],
"0.0.4": [
{
"comment_text": "",
"digests": {
"md5": "c13e15dc3cb068ec2bead18d4b225f2b",
"sha256": "8f9199cd6e360bf2bd300d32b2c08267ef4b347e1f820b09fb491413f1c1a150"
},
"downloads": -1,
"filename": "LessHash-BloomFilter-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "c13e15dc3cb068ec2bead18d4b225f2b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3908,
"upload_time": "2019-06-07T19:49:05",
"url": "https://files.pythonhosted.org/packages/c9/4f/6af543aca46e838c2dd5a775025aaf2cc4ca319fd1969148744782c1ca0f/LessHash-BloomFilter-0.0.4.tar.gz"
}
],
"0.0.5": [
{
"comment_text": "",
"digests": {
"md5": "0416ad2e322e36317a6c800967ed6813",
"sha256": "a2ff6906ec78dbe3106e803ba2f2a724cb80e578cf3aaaf94ee8362ab59cefeb"
},
"downloads": -1,
"filename": "LessHash-BloomFilter-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "0416ad2e322e36317a6c800967ed6813",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 4218,
"upload_time": "2019-06-23T13:14:53",
"url": "https://files.pythonhosted.org/packages/07/8c/c47a53b548b4ea3099ccc951faead48f00f54224c4350630ae208f80afea/LessHash-BloomFilter-0.0.5.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "0416ad2e322e36317a6c800967ed6813",
"sha256": "a2ff6906ec78dbe3106e803ba2f2a724cb80e578cf3aaaf94ee8362ab59cefeb"
},
"downloads": -1,
"filename": "LessHash-BloomFilter-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "0416ad2e322e36317a6c800967ed6813",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 4218,
"upload_time": "2019-06-23T13:14:53",
"url": "https://files.pythonhosted.org/packages/07/8c/c47a53b548b4ea3099ccc951faead48f00f54224c4350630ae208f80afea/LessHash-BloomFilter-0.0.5.tar.gz"
}
]
}