{ "info": { "author": "Yukino Ikegami", "author_email": "yknikgm@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: C++", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Text Processing :: Indexing", "Topic :: Text Processing :: Linguistic" ], "description": "shellinford\n===========\n|travis| |coveralls| |pyversion| |version| |license|\n\nShellinford is an implementation of a Wavelet Matrix/Tree succinct data structure for document retrieval.\n\nIt is based on the `shellinford`_ C++ library.\n\n.. _shellinford: https://github.com/echizentm/shellinford\n\nNOTE: This module requires a C++11 compiler.\n\nInstallation\n============\n\n::\n\n $ pip install shellinford\n\n\nUsage\n=====\n\nCreate a new FM-index instance\n-------------------------------\n\n.. code:: python\n\n >>> import shellinford\n >>> fm = shellinford.FMIndex()\n\n\n- shellinford.FMIndex([use_wavelet_tree=True, filename=None])\n\n - When given a filename, Shellinford loads FM-index data from the file\n\n\nBuild FM-index\n-----------------------------\n\n.. code:: python\n\n >>> fm.build(['Milky Holmes', 'Sherlock \"Sheryl\" Shellingford', 'Milky'], 'milky.fm')\n\n- build([docs, filename])\n\n - When given a filename, Shellinford stores FM-index data to the file\n\n\nSearch word from FM-index\n---------------------------------\n\n.. 
code:: python\n\n >>> for doc in fm.search('Milky'):\n ...     print('doc_id:', doc.doc_id)\n ...     print('count:', doc.count)\n ...     print('text:', doc.text)\n doc_id: 0\n count: [1]\n text: Milky Holmes\n doc_id: 2\n count: [1]\n text: Milky\n\n >>> for doc in fm.search(['Milky', 'Holmes']):\n ...     print('doc_id:', doc.doc_id)\n ...     print('count:', doc.count)\n ...     print('text:', doc.text)\n doc_id: 0\n count: [1]\n text: Milky Holmes\n\n- search(query, [_or=False, ignores=[]])\n\n - If `_or` is True, an \"OR\" search is executed; otherwise an \"AND\" search is executed\n - If `ignores` is given, a \"NOT\" search is also executed\n - NOTE: The search function is available only after the FM-index is built or loaded\n\n\nCount word from FM-index\n---------------------------------\n\n.. code:: python\n\n >>> fm.count('Milky')\n 2\n\n >>> fm.count(['Milky', 'Holmes'])\n 1\n\n- count(query, [_or=False])\n\n - If `_or` is True, an \"OR\" search is executed; otherwise an \"AND\" search is executed\n - NOTE: The count function is available only after the FM-index is built or loaded\n - This function is slightly faster than the search function\n\n\n\nAdd a document\n---------------------------------\n\n.. code:: python\n\n >>> fm.push_back('Baritsu')\n\n- push_back(doc)\n\n - NOTE: A document added by this method cannot be searched until the FM-index is built\n\n\nRead FM-index from a binary file\n---------------------------------\n\n.. code:: python\n\n >>> fm.read('milky_holmes.fm')\n\n- read(path)\n\n\nWrite FM-index binary to a file\n---------------------------------\n\n.. code:: python\n\n >>> fm.write('milky_holmes.fm')\n\n- write(path)\n\n\nCheck Whether FM-Index contains string\n---------------------------------------\n\n.. code:: python\n\n >>> 'baritsu' in fm\n\n\nLicense\n=========\n- Wrapper code is licensed under the New BSD License.\n- Bundled `shellinford`_ C++ library (c) 2012 echizen_tm is licensed under the New BSD License.\n\n\n.. 
|travis| image:: https://travis-ci.org/ikegami-yukino/shellinford-python.svg?branch=master\n :target: https://travis-ci.org/ikegami-yukino/shellinford-python\n :alt: travis-ci.org\n\n.. |coveralls| image:: https://coveralls.io/repos/ikegami-yukino/shellinford-python/badge.svg?branch=master&service=github\n :target: https://coveralls.io/github/ikegami-yukino/shellinford-python?branch=master\n :alt: coveralls.io\n\n.. |pyversion| image:: https://img.shields.io/pypi/pyversions/shellinford.svg\n\n.. |version| image:: https://img.shields.io/pypi/v/shellinford.svg\n :target: http://pypi.python.org/pypi/shellinford/\n :alt: latest version\n\n.. |license| image:: https://img.shields.io/pypi/l/shellinford.svg\n :target: http://pypi.python.org/pypi/shellinford/\n :alt: license\n\n\nCHANGES\n=======\n\n0.4.1 (2019-02-08)\n------------------\n\n- Make the \"in\" operator faster\n\n0.4.0 (2018-09-30)\n------------------\n\n- `FMIndex.count()` is added\n- Python 2.6 is no longer supported\n- bug fix\n\n0.3.5 (2018-09-05)\n------------------\n\n- `FMIndex.build()` and `FMIndex.push_back()` ignore empty strings\n- `FMIndex` supports the \"in\" operator (e.g., 'a' in fm)\n- Support Python 3.5, 3.6 and 3.7\n\n0.3.4 (2016-10-28)\n------------------\n\n- `FMIndex.search()` returns a list\n\n0.3 (2014-11-24)\n----------------\n\n- \"OR\" search and \"NOT\" search are available in `FMIndex.search()`.\n- `FMIndex.size` and `FMIndex.docsize` are available as properties\n\n0.2 (2014-03-28)\n----------------\n\n\"AND\" search is available by giving a sequence (list, tuple, etc.) to 
`FMIndex.search()`\n\n0.1 (2014-03-11)\n----------------\n\nFirst release.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ikegami-yukino/shellinford-python", "keywords": "full text search,FM-index,Wavelet Matrix", "license": "", "maintainer": "", "maintainer_email": "", "name": "shellinford", "package_url": "https://pypi.org/project/shellinford/", "platform": "", "project_url": "https://pypi.org/project/shellinford/", "project_urls": { "Homepage": "https://github.com/ikegami-yukino/shellinford-python" }, "release_url": "https://pypi.org/project/shellinford/0.4.1/", "requires_dist": null, "requires_python": "", "summary": "Wavelet Matrix/Tree succinct data structure for full text search (using shellinford C++ library)", "version": "0.4.1" }, "last_serial": 4795603, "releases": { "0.3.1": [ { "comment_text": "", "digests": { "md5": "3baad78e59f96193c413c9647b1e7e67", "sha256": "05c55b8bab63ad7b9dc60ef5f60918a5d6d26e0ecadae78504d81395f7178a49" }, "downloads": -1, "filename": "shellinford-0.3.1.tar.gz", "has_sig": false, "md5_digest": "3baad78e59f96193c413c9647b1e7e67", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 65295, "upload_time": "2014-11-23T15:22:48", "url": "https://files.pythonhosted.org/packages/7b/f3/3e8abdf18c12e8d5a3e5cc8b67663515f6acf0f0b5e447cb43561ecc4bf3/shellinford-0.3.1.tar.gz" } ], "0.3.4": [ { "comment_text": "", "digests": { "md5": "19173d751819b27dd8b6174dcba61d0b", "sha256": "97f1b3e256012870457a66697f68a3d0ba56f48402ac537734b80837fb80ce4a" }, "downloads": -1, "filename": "shellinford-0.3.4.tar.gz", "has_sig": false, "md5_digest": "19173d751819b27dd8b6174dcba61d0b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 61133, "upload_time": "2016-10-29T06:50:25", "url": 
"https://files.pythonhosted.org/packages/56/ad/c3914b0ba1bf8da4187ec64a114178f5d593aefcb73dcbca3f2f7c5a50e7/shellinford-0.3.4.tar.gz" } ], "0.3.5": [ { "comment_text": "", "digests": { "md5": "fcfe0fa5456519f360cd1eaf222d203c", "sha256": "e9a45c30db15cfa0e9fede38ec23ec1c771656a6e3f73cb7668c02bd8fa7cdc5" }, "downloads": -1, "filename": "shellinford-0.3.5.tar.gz", "has_sig": false, "md5_digest": "fcfe0fa5456519f360cd1eaf222d203c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 62875, "upload_time": "2018-09-04T22:24:53", "url": "https://files.pythonhosted.org/packages/20/2e/2217b7afede772c6f9ea40c8ece3c5657f3e86da2c49c22685cb8461ad20/shellinford-0.3.5.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "466639acb95ada2a58a720908597263a", "sha256": "7311a203b8f6b2b6f96e616859ae00bec6edc2f7ed54d385766a0603ac20d5c4" }, "downloads": -1, "filename": "shellinford-0.4.0.tar.gz", "has_sig": false, "md5_digest": "466639acb95ada2a58a720908597263a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64329, "upload_time": "2018-09-30T00:56:46", "url": "https://files.pythonhosted.org/packages/f4/db/aeab3393e085917eddfd859a072a90f60676e53d688dba4248797c71d4fb/shellinford-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "d485d6483ace46aca6b6662bea346877", "sha256": "c19f125a9d22d9676dbec64c0490ddd2d95d2449363052ddc2f4a588a52b04b3" }, "downloads": -1, "filename": "shellinford-0.4.1.tar.gz", "has_sig": false, "md5_digest": "d485d6483ace46aca6b6662bea346877", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64999, "upload_time": "2019-02-08T13:56:24", "url": "https://files.pythonhosted.org/packages/fd/d7/717cc007043e951cccc6f384b25df4161cb54391b69f93c5b1b29cf9b924/shellinford-0.4.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d485d6483ace46aca6b6662bea346877", "sha256": 
"c19f125a9d22d9676dbec64c0490ddd2d95d2449363052ddc2f4a588a52b04b3" }, "downloads": -1, "filename": "shellinford-0.4.1.tar.gz", "has_sig": false, "md5_digest": "d485d6483ace46aca6b6662bea346877", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64999, "upload_time": "2019-02-08T13:56:24", "url": "https://files.pythonhosted.org/packages/fd/d7/717cc007043e951cccc6f384b25df4161cb54391b69f93c5b1b29cf9b924/shellinford-0.4.1.tar.gz" } ] }