{ "info": { "author": "Maarten van Gompel, Iris Hendrickx", "author_email": "proycon@anaproy.nl", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: POSIX", "Programming Language :: Python :: 3", "Topic :: Text Processing :: Linguistic" ], "description": "BabelEnte: Entity extractioN, Translation and implicit Evaluation using BabelFy\n===================================================================================\n\nThis is an entity extractor, translator and evaluator that uses `BabelFy `_ . It was initially developed\nfor the TraMOOC project and is written in Python 3.\n\n.. image:: https://github.com/proycon/babelente/blob/master/logo.jpg?raw=true\n :align: center\n\nInstallation\n---------------\n\n::\n\n pip3 install babelente\n\nAlternatively, clone this GitHub repository and run ``python3 setup.py install``. Optionally prepend the commands with ``sudo`` for a\nglobal installation.\n\nUsage\n-------\n\nYou will need a BabelFy API key; get it from `BabelNet.org `_ .\n\nSee ``babelente -h`` for extensive usage instructions, explaining all the options.\n\nFor simple entity recognition/linking on plain text documents, invoke BabelEnte as follows. This will produce JSON output with all entities found:\n\n``$ babelente -k \"YOUR-API-KEY\" -s en -S sentences.en.txt > output.json``\n\nBabelEnte comes with `FoLiA `_ support, allowing you to read FoLiA documents and\nproduce enriched FoLiA documents that include the detected/linked entities. To this end, simply specify the language\nof your FoLiA document(s) and pass them to ``babelente`` as follows (multiple documents are allowed):\n\n``$ babelente -k \"YOUR-API-KEY\" -s en yourdocument.folia.xml``\n\nEach FoLiA document will be written to a new file, which includes all the entities. Entities will be explicitly linked to BabelNet\nand DBpedia where possible. 
At the same time, the ``stdout`` output again consists of a JSON object containing all found\nentities.\n\nNote that this method does not currently translate entities (I'm open to feature requests\nif you want this).\n\nIf you start from plain text but want to produce FoLiA output, first use a tool such as `ucto\n`_ to tokenise your document and convert it to FoLiA, prior to passing it to\nBabelEnte.\n\n\nUsage for TraMOOC\n--------------------\n\nThis software can be used for implicit evaluation of translations, as it was designed in the scope of the TraMOOC\nproject.\n\nTo evaluate a translation (English to Portuguese in this example), run the following; the output will be JSON written to stdout:\n\n``$ babelente -k \"YOUR-API-KEY\" -s en -t pt -S sentences.en.txt -T sentences.pt.txt > output.json``\n\nTo re-evaluate:\n\n``$ babelente --evalfile output.json -S sentences.en.txt -T sentences.pt.txt > newoutput.json``\n\n\n\nEvaluation\n~~~~~~~~~~~~~\n\nThe evaluation produces several metrics.\n\n* **source coverage** number of characters covered by found source entities divided by the total number of characters in the source text\n* **target coverage** number of characters covered by found target entities divided by the total number of characters in the target text\n\nPrecision and Recall\n~~~~~~~~~~~~~~~~~~~~~~\n\nIn the standard scoring method we count each entity occurrence and compute the scores.\nWe also implemented the option to compute the scores over entity sets, in which each entity is counted only once.\n\n\n* **micro precision** sum of found equivalent entities in target and source texts divided by the total sum of found entities in target language\n* **macro precision** sum of found equivalent entities in target and source texts divided by the number of target sentences\n* **micro recall** sum of found equivalent entities in target and source divided by the total sum of found entities in source language for which an equivalent link existed in the target language. 
In other words, how many of the hypothetically possible matches were found?\n  Note that this is computationally intensive and must be enabled explicitly with the command line parameter ``--recall``.\n* **macro recall** sum of found equivalent entities in target and source texts divided by the number of source sentences.\n\n**Computing recall and precision over entity sets**\n\nInstead of counting every occurring entity (\u201ctokens\u201d), we can also count each entity once (\u201ctypes\u201d or \u201csets\u201d). This can be a more useful indicator of performance when the input texts contain many repetitions or slight variations of the same sentences.\nThis option is activated with the parameter ``--nodup`` (no duplicates).\n\n\n\nLicense\n-----------\n\nGNU - GPL 3.0\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/proycon/babelente", "keywords": "nlp computational_linguistics entities wikipedia dbpedia linguistics tramooc", "license": "GPL", "maintainer": "", "maintainer_email": "", "name": "BabelEnte", "package_url": "https://pypi.org/project/BabelEnte/", "platform": "", "project_url": "https://pypi.org/project/BabelEnte/", "project_urls": { "Homepage": "https://github.com/proycon/babelente" }, "release_url": "https://pypi.org/project/BabelEnte/0.4.3/", "requires_dist": null, "requires_python": "", "summary": "Entity extractioN, Translation and Evaluation using BabelFy", "version": "0.4.3" }, "last_serial": 3827945, "releases": { "0.3.0": [ { "comment_text": "", "digests": { "md5": "ec0a2c436af5a57ad0a4f748bf601f93", "sha256": "e3663de34b1f7c4881d8ae557d3bbc69dcdcb3163af9d245b64a3549e61eb6bd" }, "downloads": -1, "filename": "BabelEnte-0.3.0.tar.gz", "has_sig": false, "md5_digest": "ec0a2c436af5a57ad0a4f748bf601f93", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8927, "upload_time": 
"2018-01-03T11:20:18", "url": "https://files.pythonhosted.org/packages/e7/71/157e4db622bfc4eedf3977fad60c1485091477b80ee10aff2c2780f472b3/BabelEnte-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "5464162eaea0bb679eef70f780562108", "sha256": "f522e2c73910b4af5d0ad0657782a0e37cac4f2316f0f6e2eeab71eeaa8e97af" }, "downloads": -1, "filename": "BabelEnte-0.4.0.tar.gz", "has_sig": false, "md5_digest": "5464162eaea0bb679eef70f780562108", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12369, "upload_time": "2018-01-09T13:24:21", "url": "https://files.pythonhosted.org/packages/e1/ff/a087b6f33499edca86f53de32d725e88dd6c99e6a02df19a3575b9f92cb7/BabelEnte-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "629ece94e3d4034b3823ebe95136bb93", "sha256": "ed92b92cbfb1308e435cd9e422b2f2b1d97f0ae7e8c6671c4b3b41406495c5c4" }, "downloads": -1, "filename": "BabelEnte-0.4.1.tar.gz", "has_sig": false, "md5_digest": "629ece94e3d4034b3823ebe95136bb93", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12385, "upload_time": "2018-01-27T15:06:59", "url": "https://files.pythonhosted.org/packages/aa/6f/ad6ce6f5ca2e583ff0c2d3ba5ab16b83c23935bd1b24e6b5ac722b21d6fb/BabelEnte-0.4.1.tar.gz" } ], "0.4.2": [ { "comment_text": "", "digests": { "md5": "16867414075055e8cbc5d3937240e6d3", "sha256": "6eabfac641e9e4020fd8d68ee65b5d027a4cc3c5efc958ea6a99053131de6950" }, "downloads": -1, "filename": "BabelEnte-0.4.2.tar.gz", "has_sig": false, "md5_digest": "16867414075055e8cbc5d3937240e6d3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19537, "upload_time": "2018-02-21T14:05:55", "url": "https://files.pythonhosted.org/packages/8a/28/27becd4ce023f4bf950a8c0be38c45e909abe6257230e3bc6cf64ecc8e69/BabelEnte-0.4.2.tar.gz" } ], "0.4.3": [ { "comment_text": "", "digests": { "md5": "3a9952067fb6c8893285ea2898ef9fa9", "sha256": 
"803d0cdb8951840307e4a1ec5599f7f8f8f455e0094939893d21b4fd80cd0e44" }, "downloads": -1, "filename": "BabelEnte-0.4.3.tar.gz", "has_sig": false, "md5_digest": "3a9952067fb6c8893285ea2898ef9fa9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18199, "upload_time": "2018-05-02T17:52:52", "url": "https://files.pythonhosted.org/packages/bc/23/22ac8216e5c31371aaf51c216a6a3a07029f9634e70ab8ce81a37c1a0e34/BabelEnte-0.4.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3a9952067fb6c8893285ea2898ef9fa9", "sha256": "803d0cdb8951840307e4a1ec5599f7f8f8f455e0094939893d21b4fd80cd0e44" }, "downloads": -1, "filename": "BabelEnte-0.4.3.tar.gz", "has_sig": false, "md5_digest": "3a9952067fb6c8893285ea2898ef9fa9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18199, "upload_time": "2018-05-02T17:52:52", "url": "https://files.pythonhosted.org/packages/bc/23/22ac8216e5c31371aaf51c216a6a3a07029f9634e70ab8ce81a37c1a0e34/BabelEnte-0.4.3.tar.gz" } ] }