{ "info": { "author": "Aleksandar Savkov", "author_email": "aleksandar@savkov.eu", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)" ], "description": "CRFSuiteTagger\n==============\n\n_CRFSuiteTagger_ is a sequence tagger based on the [pycrfsuite](https://github.com/tpeng/python-crfsuite \"pycrfsuite\") Python wrapper for [CRFSuite](http://www.chokkan.org/software/crfsuite/ \"CRFSuite\"). It is built for chunking, NER, and other BIO (also referred to as IOB) based text annotation tasks.\n\n### Why would you need this?\n\n_CRFSuiteTagger_ has a wide selection of common features, and the capability to easily integrate additional ones. The features are controlled using a simple string-based feature template. Additional features can be easily added through new _feature generating functions_ (see `crfsuitetagger.ftex`) passed to the `CRFSuiteTagger` constructor.\n\n### Installation\n\nYou should be able to install _CRFSuiteTagger_ like any other Python package:\n\n python setup.py install\n\n### Dependencies\n\nYou will need the following Python packages and one of my other libraries:\n\n* [pycrfsuite](https://github.com/tpeng/python-crfsuite \"pycrfsuite\") - Python wrapper for CRFSuite\n* [numpy](http://www.numpy.org/ \"NumPy\") - you should have it\n* [bioeval](https://github.com/savkov/bioeval \"bioeval\") - my library for evaluating BIO style annotation, which replaces the Perl script from [CoNLL-2000](http://ilk.uvt.nl/team/sabine/chunklink/chunklink_2-2-2000_for_conll.pl)\n\n### TODO\n\n* command line interface\n* migrate data structure to [pandas](http://pandas.pydata.org/ \"pandas\")\n* more examples\n\n### See Also\n\nIf you are interested in other sequence taggers, you might want to look at:\n\n* [Stanford NLP](http://nlp.stanford.edu/software/lex-parser.shtml) -- POS tagger\n* [ARK](http://www.ark.cs.cmu.edu/TweetNLP/) -- POS tagger for tweets\n* 
[YamCha](http://chasen.org/~taku/software/yamcha/) -- BIO tagger/chunker\n* [CRF++](http://taku910.github.io/crfpp/) -- BIO tagger/chunker\n* [Wapiti](https://wapiti.limsi.fr/) -- POS & BIO tagger/chunker", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/savkov/CRFSuiteTagger", "keywords": "CRF CRFSuite sequence tagging POS chunking NER", "license": "GPLv3", "maintainer": null, "maintainer_email": null, "name": "crfst", "package_url": "https://pypi.org/project/crfst/", "platform": "Unix,MacOS", "project_url": "https://pypi.org/project/crfst/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/savkov/CRFSuiteTagger" }, "release_url": "https://pypi.org/project/crfst/0.2/", "requires_dist": null, "requires_python": null, "summary": "A multi-purpose sequential tagger wrapped around CRFSuite", "version": "0.2" }, "last_serial": 3858761, "releases": { "0.2": [ { "comment_text": "", "digests": { "md5": "f5a95b2308ad77abd706877343b50521", "sha256": "57b6063b588458c990d74e61d6bf41ba385395bb099b0fcde7cda6f22baf7259" }, "downloads": -1, "filename": "crfst-0.2.tar.gz", "has_sig": false, "md5_digest": "f5a95b2308ad77abd706877343b50521", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 22369, "upload_time": "2016-09-10T16:20:46", "url": "https://files.pythonhosted.org/packages/29/6d/d6cb76f25a892a3ff4d2fc051a54e841c2f6f4f578536a0ac437d62f4920/crfst-0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f5a95b2308ad77abd706877343b50521", "sha256": "57b6063b588458c990d74e61d6bf41ba385395bb099b0fcde7cda6f22baf7259" }, "downloads": -1, "filename": "crfst-0.2.tar.gz", "has_sig": false, "md5_digest": "f5a95b2308ad77abd706877343b50521", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 22369, "upload_time": "2016-09-10T16:20:46", "url": 
"https://files.pythonhosted.org/packages/29/6d/d6cb76f25a892a3ff4d2fc051a54e841c2f6f4f578536a0ac437d62f4920/crfst-0.2.tar.gz" } ] }