{ "info": { "author": "Yusuke Matsui", "author_email": "matsui528@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "\n\n\n[![Build Status](https://travis-ci.org/matsui528/rii.svg?branch=master)](https://travis-ci.org/matsui528/rii)\n[![Documentation Status](https://readthedocs.org/projects/rii/badge/?version=latest)](https://rii.readthedocs.io/en/latest/?badge=latest)\n[![PyPI version](https://badge.fury.io/py/rii.svg)](https://badge.fury.io/py/rii)\n[![Downloads](https://pepy.tech/badge/rii)](https://pepy.tech/project/rii)\n\n\n\n\n\nReconfigurable Inverted Index (Rii): IVFPQ-based fast and memory efficient approximate nearest neighbor search method\nwith a subset-search functionality.\n\nReference:\n- [Y. Matsui](http://yusukematsui.me/), [R. Hinami](http://www.satoh-lab.nii.ac.jp/member/hinami/), and [S. Satoh](http://research.nii.ac.jp/~satoh/index.html), \"**Reconfigurable Inverted Index**\", ACM Multimedia 2018 (oral). [**[paper](https://dl.acm.org/ft_gateway.cfm?id=3240630)**] [**[project](http://yusukematsui.me/project/rii/rii.html)**]\n\n## Summary of features\n![](http://yusukematsui.me/project/rii/img/teaser1.png) | ![](http://yusukematsui.me/project/rii/img/teaser2.png)\n:---:|:---:\nThe search can be operated for a subset of a database. | Rii remains fast even after many new items are added.\n- Fast and memory efficient ANN. Rii enables you to run billion-scale search in less than 10 ms.\n- You can run the search over a **subset** of the whole database\n- Rii Remains fast even after many vectors are newly added (i.e., the data structure can be **reconfigured**)\n\n\n## Installing\nYou can install the package via pip. 
This library works with Python 3.5+ on Linux.\n```\npip install rii\n```\n\n## [Documentation](https://rii.readthedocs.io/en/latest/index.html)\n- [Tutorial](https://rii.readthedocs.io/en/latest/source/tutorial.html)\n- [Tips](https://rii.readthedocs.io/en/latest/source/tips.html)\n- [API](https://rii.readthedocs.io/en/latest/source/api.html)\n\n## Usage\n\n### Basic ANN\n\n```python\nimport rii\nimport nanopq\nimport numpy as np\n\nN, Nt, D = 10000, 1000, 128\nX = np.random.random((N, D)).astype(np.float32) # 10,000 128-dim vectors to be searched\nXt = np.random.random((Nt, D)).astype(np.float32) # 1,000 128-dim vectors for training\nq = np.random.random((D,)).astype(np.float32) # a 128-dim vector\n\n# Prepare a PQ/OPQ codec with M=32 subspaces\ncodec = nanopq.PQ(M=32).fit(vecs=Xt) # Trained using Xt\n\n# Instantiate a Rii class with the codec\ne = rii.Rii(fine_quantizer=codec)\n\n# Add vectors\ne.add_configure(vecs=X)\n\n# Search\nids, dists = e.query(q=q, topk=3)\nprint(ids, dists) # e.g., [7484 8173 1556] [15.06257439 15.38533878 16.16935158]\n```\nNote that you can construct a PQ codec and instantiate the Rii class at the same time if you want.\n```python\ne = rii.Rii(fine_quantizer=nanopq.PQ(M=32).fit(vecs=Xt))\ne.add_configure(vecs=X)\n```\nFurthermore, you can even write them in one line by chaining the method calls.\n```python\ne = rii.Rii(fine_quantizer=nanopq.PQ(M=32).fit(vecs=Xt)).add_configure(vecs=X)\n```\n\n### Subset search\n\n```python\n# The search can be conducted over a subset of the database\ntarget_ids = np.array([85, 132, 236, 551, 694, 728, 992, 1234]) # Specified by IDs\nids, dists = e.query(q=q, topk=3, target_ids=target_ids)\nprint(ids, dists) # e.g., [728 85 132] [14.80522156 15.92787838 16.28690338]\n```\n\n### Data addition and reconfiguration\n\n```python\n# Add new vectors\nX2 = np.random.random((1000, D)).astype(np.float32)\ne.add(vecs=X2) # Now N is 11000\ne.query(q=q) # Ok. 
(0.12 msec / query)\n\n# However, if you add quite a lot of vectors, the search might become slower\n# because the data structure has been optimized for the initial item size (N=10000)\nX3 = np.random.random((1000000, D)).astype(np.float32)\ne.add(vecs=X3) # A lot. Now N is 1011000\ne.query(q=q) # Slower (0.96 msec / query)\n\n# In such a case, run the reconfigure function. That updates the data structure\ne.reconfigure()\ne.query(q=q) # Ok. (0.21 msec / query)\n```\n\n### I/O by pickling\n```python\nimport pickle\nwith open('rii.pkl', 'wb') as f:\n    pickle.dump(e, f)\nwith open('rii.pkl', 'rb') as f:\n    e_dumped = pickle.load(f) # e_dumped is identical to e\n```\n\n### Util functions\n```python\n# Print the current parameters\ne.print_params()\n\n# Delete all PQ-codes and posting lists. fine_quantizer is kept.\ne.clear()\n\n# You can switch the verbose flag\ne.verbose = False\n\n# You can merge two Rii instances if they have the same fine_quantizer\n# (X1 and X2 are float32 arrays of D-dim vectors)\ne1 = rii.Rii(fine_quantizer=codec)\ne2 = rii.Rii(fine_quantizer=codec)\ne1.add_configure(vecs=X1)\ne2.add_configure(vecs=X2)\ne1.merge(e2) # Now e1 contains both X1 and X2\n\n```\n\n## [Examples](./examples)\n- [Simple tag search](./examples/tag_search/simple_tag_search.ipynb)\n- [Benchmark](./examples/benchmark/)\n\n## Author\n- [Yusuke Matsui](http://yusukematsui.me)\n\n## Credits\n- The logo is designed by [@richardbmx](https://github.com/richardbmx) ([#4](https://github.com/matsui528/rii/issues/4))", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/matsui528/rii", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "rii", "package_url": "https://pypi.org/project/rii/", "platform": "", "project_url": "https://pypi.org/project/rii/", "project_urls": { "Homepage": "https://github.com/matsui528/rii" }, "release_url": "https://pypi.org/project/rii/0.2.6/", 
"requires_dist": null, "requires_python": "", "summary": "Fast and memory-efficient ANN with a subset-search functionality", "version": "0.2.6" }, "last_serial": 5614352, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "ace8f4db26ed63c4cd958a599ffb8476", "sha256": "caaffbe8f80fd14c1e42f43635b863f8932d3dd5ad3647a37f9496f93cd735fc" }, "downloads": -1, "filename": "rii-0.1.0.tar.gz", "has_sig": false, "md5_digest": "ace8f4db26ed63c4cd958a599ffb8476", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25748, "upload_time": "2018-08-12T02:52:17", "url": "https://files.pythonhosted.org/packages/a6/88/9bd8e3e9b3a0f9c545d01002e24b6348a467db6e60e2735cf55a8c678241/rii-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "4d47d116c54ec8cc35a51186d3cfbc99", "sha256": "48d04ce445536f413cd7c5a6525ee8f94e5f746ef87b339fe8ce47c423afd665" }, "downloads": -1, "filename": "rii-0.2.0.tar.gz", "has_sig": false, "md5_digest": "4d47d116c54ec8cc35a51186d3cfbc99", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26210, "upload_time": "2018-08-21T08:05:16", "url": "https://files.pythonhosted.org/packages/1a/ea/e6658a7e3c98fed0434bf8929e46185d5dd77e82f2201a9f8baed539c385/rii-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "52649171eeeb4d1f1a14ac00cf20f489", "sha256": "65dbd868c4cfaecaa1df835beced5dfbdbde9bf8cc6737e37d62968c5527bc5a" }, "downloads": -1, "filename": "rii-0.2.1.tar.gz", "has_sig": false, "md5_digest": "52649171eeeb4d1f1a14ac00cf20f489", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26563, "upload_time": "2018-08-23T16:35:41", "url": "https://files.pythonhosted.org/packages/45/0e/fff085fbf8a01fbf88de3d1817b2a57695adb04b63d7a87db56cb02011d1/rii-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "54c9fcaa6529d6c2a1ad9f10c7d89550", "sha256": 
"2fcaaba4431c68bf176c54bbba3aaeedb378c885b3e5b93e65ac42f00c40d8b0" }, "downloads": -1, "filename": "rii-0.2.2.tar.gz", "has_sig": false, "md5_digest": "54c9fcaa6529d6c2a1ad9f10c7d89550", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 29287, "upload_time": "2018-08-31T02:24:56", "url": "https://files.pythonhosted.org/packages/2a/1a/3a935181f88c62bcff453df4a8916936d05c2f349b74a6368ac91afce5c4/rii-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "4ada4a505cd4efd2c729566493f55a9a", "sha256": "a535342603b237cadd1ca8342f27111c36131917b2f5cc1ffeb9b14588ffdc01" }, "downloads": -1, "filename": "rii-0.2.3.tar.gz", "has_sig": false, "md5_digest": "4ada4a505cd4efd2c729566493f55a9a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31138, "upload_time": "2018-09-13T15:28:26", "url": "https://files.pythonhosted.org/packages/37/79/379308df392bc07d1c51b288884c2e85ef193c67f7a6908181392778a4e0/rii-0.2.3.tar.gz" } ], "0.2.4": [ { "comment_text": "", "digests": { "md5": "20c359f3a3d54886ec5d3b8db3c36d81", "sha256": "0a2f92973881187197f590010547a55aab15d7f6b8807b712d703a7e0b305509" }, "downloads": -1, "filename": "rii-0.2.4.tar.gz", "has_sig": false, "md5_digest": "20c359f3a3d54886ec5d3b8db3c36d81", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31233, "upload_time": "2018-11-24T16:29:05", "url": "https://files.pythonhosted.org/packages/3d/b3/3b34703a1546c45a09ea455d9546e025b92917b32290553e02af94d83394/rii-0.2.4.tar.gz" } ], "0.2.5": [ { "comment_text": "", "digests": { "md5": "22be6f308c3ca56a2ff278ee10d0479a", "sha256": "411962dc59d1d33d3d5ce67bafc63cd4ff0c9e32045bb8f31d43656a376e3fb8" }, "downloads": -1, "filename": "rii-0.2.5.tar.gz", "has_sig": false, "md5_digest": "22be6f308c3ca56a2ff278ee10d0479a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24292, "upload_time": "2019-07-31T15:18:20", "url": 
"https://files.pythonhosted.org/packages/51/8f/943fec0fe6d821310d92569702d84b7c7b230eb8412b6373456a2a6dbaf7/rii-0.2.5.tar.gz" } ], "0.2.6": [ { "comment_text": "", "digests": { "md5": "fcb1a4ad7edc6e9148dbe0a18bb33a04", "sha256": "a6b0e3ec191a6eda57f03eea6bac30d11af9fbcb16e7ef6da70c0b96a01b019f" }, "downloads": -1, "filename": "rii-0.2.6.tar.gz", "has_sig": false, "md5_digest": "fcb1a4ad7edc6e9148dbe0a18bb33a04", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24377, "upload_time": "2019-07-31T16:10:26", "url": "https://files.pythonhosted.org/packages/0a/d5/9e7a32e612b7414c272c0277add9494065e022353d1500963ec2ff0cc7f5/rii-0.2.6.tar.gz" } ], "0.2.6.dev1": [ { "comment_text": "", "digests": { "md5": "e3c02dc42886fe38bac12ac563743323", "sha256": "ab8482447ce5bc2edc85d620d37f85c286714bd8c753fcbfbec459a46882d007" }, "downloads": -1, "filename": "rii-0.2.6.dev1.tar.gz", "has_sig": false, "md5_digest": "e3c02dc42886fe38bac12ac563743323", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24309, "upload_time": "2019-07-31T15:57:00", "url": "https://files.pythonhosted.org/packages/a5/e0/dba2a54f341cdfd050b6be47a00fd580809e1e87f37b0432a4d6b90bce59/rii-0.2.6.dev1.tar.gz" } ], "0.2.6.dev2": [ { "comment_text": "", "digests": { "md5": "299f8425f83640d9590d5352d348e219", "sha256": "b2e2074e81d362bd00fac68bc6d0e2d60b7e960e4fed5be8d710150d4822ccf3" }, "downloads": -1, "filename": "rii-0.2.6.dev2.tar.gz", "has_sig": false, "md5_digest": "299f8425f83640d9590d5352d348e219", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24397, "upload_time": "2019-07-31T16:03:44", "url": "https://files.pythonhosted.org/packages/cc/98/8274afe594f690448e158cb4a760960c5b846777d16af149f7d5bec27dc6/rii-0.2.6.dev2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "fcb1a4ad7edc6e9148dbe0a18bb33a04", "sha256": "a6b0e3ec191a6eda57f03eea6bac30d11af9fbcb16e7ef6da70c0b96a01b019f" }, 
"downloads": -1, "filename": "rii-0.2.6.tar.gz", "has_sig": false, "md5_digest": "fcb1a4ad7edc6e9148dbe0a18bb33a04", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24377, "upload_time": "2019-07-31T16:10:26", "url": "https://files.pythonhosted.org/packages/0a/d5/9e7a32e612b7414c272c0277add9494065e022353d1500963ec2ff0cc7f5/rii-0.2.6.tar.gz" } ] }