{
"info": {
"author": "KodexLab, Pierre Magistry, Korantin Auguste, Emmanuel Navarro",
"author_email": "contact@kodexlab.com",
"bugtrack_url": null,
"classifiers": [
"Development Status :: 4 - Beta",
"License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
"Natural Language :: French",
"Operating System :: OS Independent",
"Programming Language :: Python",
"Programming Language :: Python :: 3.2",
"Programming Language :: Python :: 3.3",
"Programming Language :: Python :: 3.4",
"Topic :: Scientific/Engineering"
],
"description": "What is ELeVE ?\n===============\n\nELeVE is a library intended for computing an \"autonomy estimate\" score for substrings (all n-grams) in a corpus of text.\n\nThe autonomy score is based on normalised variation of branching entropies (nVBE) of strings, See [MagistrySagot2012]_ for a definiton of these terms \n\nIt was developed mainly for unsupervised segmentation of mandarin Chinese, but is language independant and was successfully used in research on other tasks like keyphrase extraction.\n\nFull documentation is available on `http://pythonhosted.org/eleve/ `_.\n\nIn a nutshell\n==============\n\nHere is a simple \"getting started\". First you have to train a model::\n\n >>> from eleve import MemoryStorage\n >>>\n >>> storage = MemoryStorage()\n >>>\n >>> # Then the training itself:\n >>> storage.add_sentence([\"I\", \"like\", \"New\", \"York\", \"city\"])\n >>> storage.add_sentence([\"I\", \"like\", \"potatoes\"])\n >>> storage.add_sentence([\"potatoes\", \"are\", \"fine\"])\n >>> storage.add_sentence([\"New\", \"York\", \"is\", \"a\", \"fine\", \"city\"])\n\nAnd then you cat query it::\n\n >>> storage.query_autonomy([\"New\", \"York\"])\n 2.0369977951049805\n >>> storage.query_autonomy([\"like\", \"potatoes\"])\n -0.3227022886276245\n\nEleve also store n-gram's occurence count::\n\n >>> storage.query_count([\"New\", \"York\"])\n 2\n >>> storage.query_count([\"New\", \"potatoes\"])\n 0\n >>> storage.query_count([\"I\", \"like\", \"potatoes\"])\n 1\n >>> storage.query_count([\"potatoes\"])\n 2\n\nThen, you can use it for segmentation, using an algorigthm that look for the solution which maximize nVBE of resulting words::\n\n >>> from eleve import Segmenter\n >>> s = Segmenter(storage)\n >>> # segment up to 4-grams, if we used the same storage as before.\n >>>\n >>> s.segment([\"What\", \"do\", \"you\", \"know\", \"about\", \"New\", \"York\"])\n [['What'], ['do'], ['you'], ['know'], ['about'], ['New', 'York']]\n\n\n\nInstallation\n============\n\nYou will need some dependencies. On Ubuntu::\n\n $ sudo apt-get install python3-dev libboost-python-dev libboost-filesystem-dev libleveldb-dev\n\nThen to install eleve::\n\n $ pip install eleve\n\nor if you have a local clone of source folder::\n\n $ python setup.py install\n\n\nGet the source\n==============\n\nSource are stored on `github `_::\n\n $ git clone https://github.com/kodexlab/eleve\n\n\n\nContribute\n==========\n\nInstall the development environment::\n\n $ git clone https://github.com/kodexlab/eleve\n $ cd eleve\n $ virtualenv ENV -p /usr/bin/python3\n $ source ENV/bin/activate\n $ pip install -r requirements.txt\n $ pip install -r requirements.dev.txt\n\nPull requests are welcome!\n\nTo run tests::\n\n $ make testall\n\nTo build the doc::\n\n $ make doc\n\nthen open: ``docs/_build/html/index.html``\n\n\n**Warning**: You need to have ``eleve`` accesible in the python path to run tests (and to build doc).\nFor that you can install ``eleve`` as a link in local virtualenv::\n\n $ pip install -e .\n\n(Note: this is indicated in pytest `good practice `_ )\n\n\nReferences\n===========\n\nIf you use ``eleve`` for an academic publication, please cite this paper:\n\n.. [MagistrySagot2012] Magistry, P., & Sagot, B. (2012, July). Unsupervized word segmentation: the case for mandarin chinese. In Proceedings of the 50th Annual Meeting of the ACL: Short Papers-Volume 2 (pp. 383-387). http://www.aclweb.org/anthology/P12-2075\n\n\n\nCopyright, license and authors\n==============================\n\nCopyright (C) 2014-2015 Kodex\u22c5Lab.\n\n``eleve`` is available under the `LGPL Version 3 `_ license.\n\n``eleve`` was originaly designed and prototyped by `Pierre Magistry `_ during its PhD. It then has been completly rewriten by `Korantin Auguste `_ and `Emmanuel Navarro `_ (with the help of Pierre).",
"description_content_type": "",
"docs_url": "https://pythonhosted.org/eleve/",
"download_url": "",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/kodexlab/eleve",
"keywords": "",
"license": "",
"maintainer": "",
"maintainer_email": "",
"name": "eleve",
"package_url": "https://pypi.org/project/eleve/",
"platform": "",
"project_url": "https://pypi.org/project/eleve/",
"project_urls": {
"Homepage": "https://github.com/kodexlab/eleve"
},
"release_url": "https://pypi.org/project/eleve/19.2/",
"requires_dist": null,
"requires_python": "",
"summary": "Extraction de LExique par Variation d'Entropie - Lexicon extraction based on the variation of entropy",
"version": "19.2"
},
"last_serial": 4841269,
"releases": {
"15.10.r1": [
{
"comment_text": "",
"digests": {
"md5": "3db82163ac78dda47c1a31a64b759ae5",
"sha256": "22abd8e15fba97f98b007e43dd421df3015d782c1ed79a6a1cfe6e06655f207a"
},
"downloads": -1,
"filename": "eleve-15.10.r1.tar.gz",
"has_sig": false,
"md5_digest": "3db82163ac78dda47c1a31a64b759ae5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 23420,
"upload_time": "2015-10-30T21:17:45",
"url": "https://files.pythonhosted.org/packages/81/6f/393f33285d19ac487ba957533bc555c3349531f46c199e91d74243a49a29/eleve-15.10.r1.tar.gz"
}
],
"15.10.r2": [
{
"comment_text": "",
"digests": {
"md5": "e77dd636235d8fc09ae9396f39cf5e27",
"sha256": "750fe0dc6eaef19da3523a426fde7ab1ea3bebdadc924f9069ffb5d249bfe00c"
},
"downloads": -1,
"filename": "eleve-15.10.r2.tar.gz",
"has_sig": false,
"md5_digest": "e77dd636235d8fc09ae9396f39cf5e27",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 24488,
"upload_time": "2015-10-31T09:50:06",
"url": "https://files.pythonhosted.org/packages/c3/5d/2e5e6de43e052f90f86861ca308975f77e97db658dbfc6f5a73e64ceb8e5/eleve-15.10.r2.tar.gz"
}
],
"19.2": [
{
"comment_text": "",
"digests": {
"md5": "5aa489ab8c4af0f0253862de49759a7d",
"sha256": "736ff072f43859b53e76fb10d7a74d0e75fb28ffc7fd5d17da310f02e6a92dd3"
},
"downloads": -1,
"filename": "eleve-19.2.tar.gz",
"has_sig": false,
"md5_digest": "5aa489ab8c4af0f0253862de49759a7d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 23035,
"upload_time": "2019-02-19T19:02:33",
"url": "https://files.pythonhosted.org/packages/56/0f/1e2f1fd45c26770ee8a42cde51936ab233e02d4fe9a79509dc67574880df/eleve-19.2.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "5aa489ab8c4af0f0253862de49759a7d",
"sha256": "736ff072f43859b53e76fb10d7a74d0e75fb28ffc7fd5d17da310f02e6a92dd3"
},
"downloads": -1,
"filename": "eleve-19.2.tar.gz",
"has_sig": false,
"md5_digest": "5aa489ab8c4af0f0253862de49759a7d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 23035,
"upload_time": "2019-02-19T19:02:33",
"url": "https://files.pythonhosted.org/packages/56/0f/1e2f1fd45c26770ee8a42cde51936ab233e02d4fe9a79509dc67574880df/eleve-19.2.tar.gz"
}
]
}