{ "info": { "author": "David Lukes", "author_email": "dafydd.lukes@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "=====\nCorPy\n=====\n\n.. image:: https://readthedocs.org/projects/corpy/badge/?version=stable\n :target: https://corpy.readthedocs.io/en/stable/?badge=stable\n :alt: Documentation status\n\n.. image:: https://badge.fury.io/py/corpy.svg\n :target: https://badge.fury.io/py/corpy\n :alt: PyPI package\n\n.. image:: https://img.shields.io/badge/code%20style-black-000000.svg\n :target: https://github.com/python/black\n :alt: Code style\n\nInstallation\n============\n\n.. code:: bash\n\n $ pip3 install corpy\n\nOnly recent versions of Python 3 (3.6+) are supported by design.\n\nWhat is CorPy?\n==============\n\nA fancy plural for *corpus* ;) Also, a collection of handy but not especially\nmutually integrated tools for dealing with linguistic data. It abstracts away\nfunctionality which is often needed in practice for teaching and/or day to day\nwork at the `Czech National Corpus `__, without aspiring to\nbe a fully featured or consistent NLP framework.\n\nThe short URL to the docs is: https://corpy.rtfd.io/\n\nHere's an idea of what you can do with CorPy:\n\n- add linguistic annotation to raw textual data using either `UDPipe\n `__ or `MorphoDiTa\n `__\n\n.. note::\n\n **Should I pick UDPipe or MorphoDiTa?**\n\n UDPipe_ is the successor to MorphoDiTa_, extending and improving upon the\n original codebase. It has more features at the cost of being somewhat more\n complex: it does both `morphological tagging (including lemmatization) and\n syntactic parsing `__,\n and it handles a number of different input and output formats. You can also\n download `pre-trained models `__ for\n many different languages.\n\n By contrast, MorphoDiTa_ only has `pre-trained models for Czech and English\n `__, and only performs\n `morphological tagging (including lemmatization)\n `__. However, its\n output is more straightforward -- it just splits your text into tokens and\n annotates them, whereas UDPipe can (depending on the model) introduce\n additional tokens necessary for a more explicit analysis, add multi-word\n tokens etc. This is because UDPipe is tailored to the type of linguistic\n analysis conducted within the UniversalDependencies_ project, using the\n CoNLL-U_ data format.\n\n MorphoDiTa can also help you if you just want to tokenize text and don't have\n a language model available.\n\n.. _UDPipe: http://ufal.mff.cuni.cz/udpipe\n.. _MorphoDiTa: http://ufal.mff.cuni.cz/morphodita\n.. _UniversalDependencies: https://universaldependencies.org\n.. _CoNLL-U: https://universaldependencies.org/format.html\n\n- `easily generate word clouds\n `__\n- `generate phonetic transcripts of Czech texts\n `__\n- `wrangle corpora in the vertical format\n `__ devised originally\n for `CWB `__, used also by `(No)SketchEngine\n `__\n- plus some `command line utilities\n `__\n\n.. development-marker\n\nDevelopment\n===========\n\nDependencies and building the docs\n----------------------------------\n\nThe canonical dependency requirements are listed in ``pyproject.toml`` and\nfrozen in ``poetry.lock``. However, in order to use ``autodoc`` to build the API\ndocs, the package has to be installed, and ``corpy`` has dependencies that are\ntoo resource-intensive to build on ReadTheDocs.\n\nThe solution is to use a dummy ``setup.py`` which lists *only* the dependencies\nneeded to build the docs properly, and mock all other dependencies by listing\nthem in ``autodoc_mock_imports`` in ``docs/conf.py``. This dummy ``setup.py`` is\nused to install ``corpy`` *only* on ReadTheDocs (via the appropriate config\noption in ``.readthedocs.yml``). The same goes for the ``MANIFEST.in`` file,\nwhich duplicates the ``tool.poetry.include`` entries in ``pyproject.toml`` for\nthe sole benefit of ReadTheDocs.\n\n.. license-marker\n\nLicense\n=======\n\nCopyright \u00a9 2016--present `\u00da\u010cNK `__/David Luke\u0161\n\nDistributed under the `GNU General Public License v3\n`__.\n", "description_content_type": "text/x-rst", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/dlukes/corpy", "keywords": "corpus,linguistics,NLP", "license": "GPL-3.0+", "maintainer": "", "maintainer_email": "", "name": "corpy", "package_url": "https://pypi.org/project/corpy/", "platform": "", "project_url": "https://pypi.org/project/corpy/", "project_urls": { "Homepage": "https://github.com/dlukes/corpy", "Repository": "https://github.com/dlukes/corpy" }, "release_url": "https://pypi.org/project/corpy/0.2.3/", "requires_dist": [ "regex (>=2019.4,<2020.0)", "lazy (>=1.4,<2.0)", "lxml (>=4.3,<5.0)", "matplotlib (>=3.1,<4.0)", "wordcloud (>=1.5,<2.0)", "ufal.morphodita (>=1.9,<2.0)", "ufal.udpipe (>=1.2,<2.0)", "numpy (>=1.16,<2.0)", "click (>=7.0,<8.0)" ], "requires_python": ">=3.6,<4.0", "summary": "Tools for processing language data.", "version": "0.2.3" }, "last_serial": 5703915, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "c17fef9f0533faadade6dc1ba4c23724", "sha256": "28d0305ca650506239fdd0578b88f88043a526250e67c5694432a89bb673458c" }, "downloads": -1, "filename": "corpy-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "c17fef9f0533faadade6dc1ba4c23724", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 74819, "upload_time": "2019-05-23T13:23:19", "url": "https://files.pythonhosted.org/packages/84/4b/72ff96ad185435aba36481527520860815530dc09342cf45cc670e9569c2/corpy-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "58e42abffc4e369f782b1254892c3350", "sha256": "d2f593bac9414f215e1821050270b8e37ca26815360919632f45ea35d93c2891" }, "downloads": -1, "filename": "corpy-0.1.1.tar.gz", "has_sig": false, "md5_digest": "58e42abffc4e369f782b1254892c3350", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 26346, "upload_time": "2019-05-23T13:23:21", "url": "https://files.pythonhosted.org/packages/c0/87/4910ab01f786520176a6f74fe96c9ed1aa973f51ef9fdb670c308b9ccf58/corpy-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "5993fefa3cae6e9b33befa61b117e418", "sha256": "1438a42e281a7aeb05e0903155f394db36dd35169f1a4a4988de4131cd437818" }, "downloads": -1, "filename": "corpy-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "5993fefa3cae6e9b33befa61b117e418", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 75513, "upload_time": "2019-05-23T21:17:20", "url": "https://files.pythonhosted.org/packages/b9/35/a183ca7f214cd2293deb1ee5711dac38de548faa8f317c3cfeb9a0b99da4/corpy-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e3919e5a8d7281be19db65b15fce430f", "sha256": "96ec8943fff81fc277ad99f4929a2df41a69ca027693154b6b648f0ebc9611f2" }, "downloads": -1, "filename": "corpy-0.1.2.tar.gz", "has_sig": false, "md5_digest": "e3919e5a8d7281be19db65b15fce430f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 27560, "upload_time": "2019-05-23T21:17:22", "url": "https://files.pythonhosted.org/packages/a2/0e/ecf253af6e3fadb16c5d1af36e545a66110653d642c388d0a364ffff0468/corpy-0.1.2.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "41adab9c4987cef1c5e82887151e6db6", "sha256": "9453901130f6577329bf2898091ffecf1740ab180dbe89b3690a816a38fe62da" }, "downloads": -1, "filename": "corpy-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "41adab9c4987cef1c5e82887151e6db6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 63375, "upload_time": "2019-05-27T16:02:18", "url": "https://files.pythonhosted.org/packages/08/36/3a1cac8a54419333da9a23c94da0650b293dd0057ddb7e5f4755cb10a3b7/corpy-0.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5fa28e74e90539528e93d64ff8d8c832", "sha256": "dc02c903ce2b662c9ec34da75c17ea0437428001e587506511df173ba017336e" }, "downloads": -1, "filename": "corpy-0.2.0.tar.gz", "has_sig": false, "md5_digest": "5fa28e74e90539528e93d64ff8d8c832", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 24425, "upload_time": "2019-05-27T16:02:20", "url": "https://files.pythonhosted.org/packages/d4/ac/2ebb3954fc08ca5afc8d28e508ac3ef6598985d3f42e9020aa46421361ff/corpy-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "5499507e74cec627f72f0f0fece6e05d", "sha256": "0bbe7a0482c94e5338afbf643782cffb2c7b781b67d877e1c5ed5be136d817dc" }, "downloads": -1, "filename": "corpy-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "5499507e74cec627f72f0f0fece6e05d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 31261, "upload_time": "2019-06-14T19:38:33", "url": "https://files.pythonhosted.org/packages/2b/50/a88adbf865a06bb7dc859fafbc02b81e664a4dc440a115cb7dad6c671839/corpy-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cdb5b62b34492616ae0d94a2f50b2914", "sha256": "33e217f86aa25d9f34b3ea4101ec4ceaecf5aff82166450c833c2a8094bafc12" }, "downloads": -1, "filename": "corpy-0.2.1.tar.gz", "has_sig": false, "md5_digest": "cdb5b62b34492616ae0d94a2f50b2914", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 29375, "upload_time": "2019-06-14T19:38:35", "url": "https://files.pythonhosted.org/packages/2c/6f/4a3cf1961b9a71ab1022b56587996fb3609e3a97508234ca26da7285ef41/corpy-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "9a2f1793649b8918c9b8dfd0134a7d13", "sha256": "c61c7d5b5b90de3ab9fd02faec15663d25830aa33321e89040b93e19ad15a4f5" }, "downloads": -1, "filename": "corpy-0.2.2-py3-none-any.whl", "has_sig": false, "md5_digest": "9a2f1793649b8918c9b8dfd0134a7d13", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 31949, "upload_time": "2019-06-19T17:27:08", "url": "https://files.pythonhosted.org/packages/ad/27/e8dca2bc7c5776743c0d8233f1119c69d19ead6d7efdf3058d7950188965/corpy-0.2.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1f1b97c991826d28ecbbc9ad96ffc9dc", "sha256": "c6820e2f87fc056e351922de6f67823ed3c2006e942c0912a8bb2c78b225c41a" }, "downloads": -1, "filename": "corpy-0.2.2.tar.gz", "has_sig": false, "md5_digest": "1f1b97c991826d28ecbbc9ad96ffc9dc", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 29960, "upload_time": "2019-06-19T17:27:11", "url": "https://files.pythonhosted.org/packages/29/ee/2b037abb12301947fb891364520e0b37bbf93e3ac78cb3bc0f141497c723/corpy-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "71e1c1db46592e1fac1b28250d40ebb9", "sha256": "40548db9ed0f6857222a6a55d21cecb002a5944832e4c687469d76d4ae5f1e21" }, "downloads": -1, "filename": "corpy-0.2.3-py3-none-any.whl", "has_sig": false, "md5_digest": "71e1c1db46592e1fac1b28250d40ebb9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 31951, "upload_time": "2019-08-20T15:17:36", "url": "https://files.pythonhosted.org/packages/a5/8b/eca75efcd32157f1ba49f6205996b62ca308c2ebfd1634d6862549fb4bc6/corpy-0.2.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c0c06faafa51fe7a9c103a3838ee8474", "sha256": "c36f526670365fb1ee3c3d9753de1bb1bf5e4bd31aa5abbe09e8bd33d5b2b283" }, "downloads": -1, "filename": "corpy-0.2.3.tar.gz", "has_sig": false, "md5_digest": "c0c06faafa51fe7a9c103a3838ee8474", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 30029, "upload_time": "2019-08-20T15:17:37", "url": "https://files.pythonhosted.org/packages/38/31/7938ca6942714d8cc3179ded344448f861603e35d63a13c50b7e1b766ee0/corpy-0.2.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "71e1c1db46592e1fac1b28250d40ebb9", "sha256": "40548db9ed0f6857222a6a55d21cecb002a5944832e4c687469d76d4ae5f1e21" }, "downloads": -1, "filename": "corpy-0.2.3-py3-none-any.whl", "has_sig": false, "md5_digest": "71e1c1db46592e1fac1b28250d40ebb9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 31951, "upload_time": "2019-08-20T15:17:36", "url": "https://files.pythonhosted.org/packages/a5/8b/eca75efcd32157f1ba49f6205996b62ca308c2ebfd1634d6862549fb4bc6/corpy-0.2.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c0c06faafa51fe7a9c103a3838ee8474", "sha256": "c36f526670365fb1ee3c3d9753de1bb1bf5e4bd31aa5abbe09e8bd33d5b2b283" }, "downloads": -1, "filename": "corpy-0.2.3.tar.gz", "has_sig": false, "md5_digest": "c0c06faafa51fe7a9c103a3838ee8474", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 30029, "upload_time": "2019-08-20T15:17:37", "url": "https://files.pythonhosted.org/packages/38/31/7938ca6942714d8cc3179ded344448f861603e35d63a13c50b7e1b766ee0/corpy-0.2.3.tar.gz" } ] }