{ "info": { "author": "Brian Kopp", "author_email": "briankopp.usa@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# fewerbytes\nA numpy-based compression library to make your data require fewerbytes\n\n## Quick Start\n```pip install fewerbytes```\n\n```python\nimport fewerbytes as fb\nimport numpy as np\narr = np.array([50000, 55000, 60000, 65000, 70000], dtype=np.uint32)\narr.nbytes # 20 Bytes\nnew_arr, new_arr_type, details = fb.integer_minimize_compression(arr)\nnew_arr # [0, 5000, 10000, 15000, 20000] dtype=uint16\nnew_arr_type # NumpyType with kind UNSIGNED and size SHORT (16bit)\ndetails # IntegerMinimizeTransformation with reference_value 50000\ndecomp_arr = fb.integer_decompression_from_transform(new_arr, details) # decompressed array\ndecomp_arr # [50000, 55000, 60000, 65000, 70000]\n```\n\n## Integer Compression\n\nfewerbytes offers four types of integer compression\n\n### Simple Downcast\n\nThis technique simply calculates the minimum and maximum values of the array,\nand then attempts to down-cast the integers to a smaller storage size, e.g. from \n64-bit integers to 16-bit integers\n\n```python\nimport fewerbytes as fb\nimport numpy as np\narr = np.full(10, fill_value=1, dtype=np.int64)\nnew_arr, new_arr_type = fb.downcast_integers(arr) # values are stored in 8-bit unsigned integers\n```\n\n### Minimize\n\nThis compression technique calculates the minimum array value and subtracts\nthe value from each array element. This allows switching from signed integers to\nunsigned integers, which opens an extra bit, doubling the range of absolute\nvalues that can be stored in each class of integers (8-bit, 16-bit, etc.).\n\nThis technique is effective when values in an array are large, but when the\ndifference between the min and max are not as large.\n\n```python\nimport fewerbytes as fb\nimport numpy as np\narr = np.full(10, fill_value=1000000)\nfb.integer_minimize_compression(arr) # (array, NumpyType) - array is all 0's\n```\n\n### Derivative or Element-Wise\n\nThis compression technique calculates the element-wise difference of the array.\nThe returned array has 1 fewer element than the original array.\n\nThis technique is effective when differences in values in an array are similar.\nFor example, unix timestamps from some regularly occurring measurement.\n\n```python\nimport fewerbytes as fb\nimport numpy as np\narr = np.array([1, 2, 3, 4, 5])\nfb.integer_derivative_compression(arr) # (array, numpy compressed array is [1, 1, 1, 1]\n```\n\n### Hash Set\n\nThis compression technique calculates a unique set of values in the array. Then,\nfor each element in the array, it stores the index of its value in the unique set.\nThe transformation details contain the unique values.\n\nThis technique is effective when the original array has a relatively small set\nof unique values. Due to the compression overhead of doing key-lookups, this\ntechnique is abandoned if the byte-storage efficiency is improved by less than 20%.\n\n```python\nimport fewerbytes as fb\nimport numpy as np\narr = np.array([1000000, 1000001, 1000000, 1000000, 1000000], dtype=np.uint32)\narr.nbytes # 20 bytes\nkeys, keys_type, transform = fb.integer_hash_compression(arr)\nkeys # array [0, 1, 0, 0, 0]\nkeys_type # UNSIGNED BYTE\ntransform.key_values # array [1000000, 1000001]\ntransform.key_values_type # UNSIGNED SINGLE (32-bit)\n```\n\n## Integer Decompression\n\nInteger decompression can be achieved using any of the following functions?\n\n```python\nimport fewerbytes as fb\n# arr and transform obtained from compression\nfb.integer_decompression_from_transform(arr, transform)\nfb.integer_minimize_decompression(arr, transform)\nfb.integer_derivative_decompression(arr, transform)\nfb.integer_hash_decompression(arr, transform)\nfb.integer_decompression_from_transforms(arr, list_of_transforms)\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/briankopp/fewerbytes", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "fewerbytes", "package_url": "https://pypi.org/project/fewerbytes/", "platform": "", "project_url": "https://pypi.org/project/fewerbytes/", "project_urls": { "Homepage": "https://github.com/briankopp/fewerbytes" }, "release_url": "https://pypi.org/project/fewerbytes/0.0.1/", "requires_dist": [ "numpy" ], "requires_python": "", "summary": "Compression techniques for numpy arrays", "version": "0.0.1" }, "last_serial": 4166306, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "ffb7231f37f0544ba6c651cef82ebbd4", "sha256": "82a8bd17f854b6ffb84f3385f466c74691da71575a918867317a057a128f1488" }, "downloads": -1, "filename": "fewerbytes-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "ffb7231f37f0544ba6c651cef82ebbd4", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9411, "upload_time": "2018-08-13T17:11:07", "url": "https://files.pythonhosted.org/packages/a4/01/69d7e542a47c4af06ae3caaa1e3c467d1934eecfff13043a3d7c37975e35/fewerbytes-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d0c811abf2c462e26eb272a9d3a55ba8", "sha256": "789bd421fd1787aba8e13e5649f986cd4fe0accb9444180f781653a34c78038d" }, "downloads": -1, "filename": "fewerbytes-0.0.1.tar.gz", "has_sig": false, "md5_digest": "d0c811abf2c462e26eb272a9d3a55ba8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8950, "upload_time": "2018-08-13T17:11:09", "url": "https://files.pythonhosted.org/packages/be/d1/4da7283f967b93268bf854ea599ff34c7dfa3e6681cc39e59d7b9212241d/fewerbytes-0.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ffb7231f37f0544ba6c651cef82ebbd4", "sha256": "82a8bd17f854b6ffb84f3385f466c74691da71575a918867317a057a128f1488" }, "downloads": -1, "filename": "fewerbytes-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "ffb7231f37f0544ba6c651cef82ebbd4", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9411, "upload_time": "2018-08-13T17:11:07", "url": "https://files.pythonhosted.org/packages/a4/01/69d7e542a47c4af06ae3caaa1e3c467d1934eecfff13043a3d7c37975e35/fewerbytes-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d0c811abf2c462e26eb272a9d3a55ba8", "sha256": "789bd421fd1787aba8e13e5649f986cd4fe0accb9444180f781653a34c78038d" }, "downloads": -1, "filename": "fewerbytes-0.0.1.tar.gz", "has_sig": false, "md5_digest": "d0c811abf2c462e26eb272a9d3a55ba8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8950, "upload_time": "2018-08-13T17:11:09", "url": "https://files.pythonhosted.org/packages/be/d1/4da7283f967b93268bf854ea599ff34c7dfa3e6681cc39e59d7b9212241d/fewerbytes-0.0.1.tar.gz" } ] }