{ "info": { "author": "Sam Crochet", "author_email": "samuel.d.crochet@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "\n#### Bloom Filter\n\\\\bloom **fil**-ter\\\\\n_**Noun**_\n1. Space-efficient probabilistic data structure that is used to test whether an element is a member of a set. \n2. A fun side project!\n\n### About bloomy\nBloomy is a bloom filter module designed to be both lightweight and scalable. Bloomy does not use any external libraries, and scales internally to match scenarios. Bloomy also designed to reduce the rate of false positives. This is done my applying md5 hashing and bit rotation to generate k unique 8 byte hash values, and then using golden ratio compression for even distribution. \n\n### Advantages of bloom filters \n- Uses bit arrays to store the presence of items. This significantly reduces the in memory storage. \n- Implements binary and bit masking operations. This reduces the runtime of add and contains operations to practically O(1). \n- Raw types can be used. \n\n---\n### Operations \t\n\n#### `___init__(m,[array of additional hash functions])`\n\t\t\t params: (optional) size of bit array, (optional) additional hash functions \n\t\t\t return: bloomFilter object \nCreate a bit array `bits` with m bits cleared to 0, and saves k number of hash functions. M has a default of 1000 and the filter comes with five unique default hash functions. \n\n#### `add(value)`\n\t\t\t params: object to add to the bloomFilter set\n\t\t\t return: None\nTakes in a value and hashes it k times to obtain a list of indexes. For each index obtained by a given hash function, set the value in the bit array `bits` to 1. \n\n#### `__contains__(value)` \n\t\t\t params: object to search for in the set \n\t\t\t return: True/False indicating presence in the set\nTakes in a value and hashes it k times to obtain a list of unique indexes. Create a bit mask with the initial value 0. For each index obtained by a hash function, set that index in the mask 1. Perform the following binary operation `bits AND mask`, and if the value performed by that operation equals the mask, we can conclude that the value _**probably**_ exists in the bloom filter. \n\n\n#### `addHashFunction(fn)`\n\t\t\t params: additional hash function to add\n\t\t\t return: None\nTakes in a function as a parameter, and adds this function to the existing list of hash functions used by the bloom filter. The function `fn` has to take the form `fn(value)` where `value` is the item to hash, and the function must return an integer between 0 and the upper index bound. All other following parameters must have default values. \n\nThe function also has to be able to provide consistent values for the given input. There is no randomization involved, rather this is a pure key transformation function. Any compression function may be used, as long as it limits the key space within the upper bound of the bit array. \n\n#### `fingerprint(value)`\n\t\t\t params: value to inspect fingerprint\n\t\t\t return: array (len = k) of hashed values \nTakes in a value and returns an array of the hashed values created by the list of hash functions. Used to debug and inspect the behavior of the hashing mechanism. \n\n#### `mask([ints])`\n\t\t\t params: integer values to use to create bit mask\n\t\t\t return: integer bit mask\nCreates a bit masked integer with 1 in the indexes specified by the parameter integer array. \n\n---\n### Example (k = 3, m = 12)\n```\n0. constructor \n\nbit array: \n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ...\n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n\n1. add(\"Hello world\") => 00, 03, 04\n\nbit array: \n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ...\n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n\n2. contains(\"Hello World\") => 00,03,04 \n\nbit array: \n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ...\n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n\t || || ||\n\t ||==========||==||=================> 1,1,1 = True \n\n3. contains(\"java is the best\") => 02,03,07 \n\nbit array: \n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ...\n\t--- --- --- --- --- --- --- --- --- --- --- --- --- \n\t\t\t\t\t || || ||\n\t\t\t\t\t ||==||==============||=====> 0,1,0 = False\n```\n\n### Thank you!! \nThank you for checking this out! Please feel free to reach out via email [sdcroche@ncsu.edu](mailto:sdcroche@ncsu.edu), or twitter [@shmam_](https://twitter.com/shmam_)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/shmam/bloomy", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "bloomy", "package_url": "https://pypi.org/project/bloomy/", "platform": "", "project_url": "https://pypi.org/project/bloomy/", "project_urls": { "Homepage": "https://github.com/shmam/bloomy" }, "release_url": "https://pypi.org/project/bloomy/0.0.2/", "requires_dist": null, "requires_python": "", "summary": "An efficient and scalable bloom filter module built in pure python.", "version": "0.0.2" }, "last_serial": 4643164, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "6642635422e73e84c5734f74da5396e9", "sha256": "7cc9aa7baf342520271aff592d68000f977136667aabc3020eb286350c3db559" }, "downloads": -1, "filename": "bloomy-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "6642635422e73e84c5734f74da5396e9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5718, "upload_time": "2018-12-28T23:14:59", "url": "https://files.pythonhosted.org/packages/2f/8f/51cb47e05ef634f355c3a39084a4093482eff60918a7bd3f23cf2455e403/bloomy-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d5061d23b48f3cabc205a6f260f234a5", "sha256": "6d32e8025f724afa87a8a8eb008c9db776757e4bdad858db93ab4b6022be8c12" }, "downloads": -1, "filename": "bloomy-0.0.1.tar.gz", "has_sig": false, "md5_digest": "d5061d23b48f3cabc205a6f260f234a5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4729, "upload_time": "2018-12-28T23:15:01", "url": "https://files.pythonhosted.org/packages/cd/34/3af6aea36643733d143f426256b16cc41c592e037df85199e7fb1ec20ce7/bloomy-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "a55f0aeb08e28ccf8cf49d668236c1a6", "sha256": "2276d470d5d81e7d69f6a1df3a995b7b1d846ee2c4a55de1ebdbe1cc5159869c" }, "downloads": -1, "filename": "bloomy-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "a55f0aeb08e28ccf8cf49d668236c1a6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5449, "upload_time": "2018-12-29T03:45:56", "url": "https://files.pythonhosted.org/packages/e0/ae/70edd7a50953f420757bd3554885c4374a7ba824dd4cc684a9dc6ffc65c4/bloomy-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "37aeb1733741cdf520a92ad476ea3991", "sha256": "61528e4ec428f81c05408dea6a3ebdc79a51d55f58da671f0ed31322eb064786" }, "downloads": -1, "filename": "bloomy-0.0.2.tar.gz", "has_sig": false, "md5_digest": "37aeb1733741cdf520a92ad476ea3991", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4772, "upload_time": "2018-12-29T03:45:57", "url": "https://files.pythonhosted.org/packages/8a/5f/cb0f3ee594d5b0e8e504d08416407e44a4fae2d289d75b1224f627d5c2aa/bloomy-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "a55f0aeb08e28ccf8cf49d668236c1a6", "sha256": "2276d470d5d81e7d69f6a1df3a995b7b1d846ee2c4a55de1ebdbe1cc5159869c" }, "downloads": -1, "filename": "bloomy-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "a55f0aeb08e28ccf8cf49d668236c1a6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5449, "upload_time": "2018-12-29T03:45:56", "url": "https://files.pythonhosted.org/packages/e0/ae/70edd7a50953f420757bd3554885c4374a7ba824dd4cc684a9dc6ffc65c4/bloomy-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "37aeb1733741cdf520a92ad476ea3991", "sha256": "61528e4ec428f81c05408dea6a3ebdc79a51d55f58da671f0ed31322eb064786" }, "downloads": -1, "filename": "bloomy-0.0.2.tar.gz", "has_sig": false, "md5_digest": "37aeb1733741cdf520a92ad476ea3991", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4772, "upload_time": "2018-12-29T03:45:57", "url": "https://files.pythonhosted.org/packages/8a/5f/cb0f3ee594d5b0e8e504d08416407e44a4fae2d289d75b1224f627d5c2aa/bloomy-0.0.2.tar.gz" } ] }