{ "info": { "author": "Soren Lind Kristiansen", "author_email": "sorenlind@mac.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: Danish", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.6", "Topic :: Text Processing :: Linguistic" ], "description": "# \ud83e\udd18 Lemmy\n\nLemmy is a lemmatizer for Danish \ud83c\udde9\ud83c\uddf0 and Swedish \ud83c\uddf8\ud83c\uddea. It comes ready for use. The\nDanish model is trained on Dansk Sprogn\u00e6vn's (DSN) word list (\u2018fuldformliste\u2019) and the\n[Danish Universal Dependencies](https://github.com/UniversalDependencies/UD_Danish-DDT).\nThe Swedish model is trained on the [SALDO's\nmorphology](https://spraakbanken.gu.se/eng/resource/saldom) dataset and the Swedish\n[Universal Dependencies\n(Talbanken)](https://github.com/UniversalDependencies/UD_Swedish-Talbanken). Lemmy also\nsupports training on your own dataset.\n\nThe models included in Lemmy were evaluated on the respective Universal Dependencies dev\ndatasets. The Danish model scored > 99% accuracy, while the Swedish model scored > 97%.\nAll reported scores were obtained when supplying Lemmy with POS tags.\n\nYou can use Lemmy as a spaCy extension, more specifcally a spaCy pipeline component.\nThis is highly recommended and makes the lemmas easily accessible from the spaCy tokens.\nLemmy makes use of POS tags to predict the lemmas. When wired up to the spaCy pipeline,\nLemmy has the benefit of using spaCy\u2019s builtin POS tagger.\n\nLemmy can also by used without spaCy, as a standalone lemmatizer. In that case, you will\nhave to provide the POS tags. Alternatively, you can use Lemmy without POS tags, though\nmost likely the accuracy will suffer. Currrently, only the Danish Lemmy model comes with\na model trained for use without POS tags. That is, if you want to use Lemmy on Swedish\ntext without POS tags, you must train your own Lemmy model.\n\nLemmy is heavily inspired by the [CST Lemmatizer for\nDanish](https://cst.dk/online/lemmatiser/).\n\n## Install\n\n```bash\npip install lemmy\n```\n\n## Basic Usage Without POS tags\n\n```python\nimport lemmy\n\n# Create an instance of the standalone lemmatizer.\nlemmatizer = lemmy.load(\"da\")\n\n# Find lemma for the word 'akvariernes'. First argument is an empty POS tag.\nlemmatizer.lemmatize(\"\", \"akvariernes\")\n```\n\n## Basic Usage With POS tags\n\n```python\nimport lemmy\n\n# Create an instance of the standalone lemmatizer.\n# Replace 'da' with 'sv' for the Swedish lemmatizer.\nlemmatizer = lemmy.load(\"da\")\n\n# Find lemma for the word 'akvariernes'. First argument is the user-provided POS tag.\nlemmatizer.lemmatize(\"NOUN\", \"akvariernes\")\n```\n\n## Usage with spaCy Model\n\n```python\nimport da_custom_model as da # replace da_custom_model with name of your spaCy model\nimport lemmy.pipe\nnlp = da.load()\n\n# Create an instance of Lemmy's pipeline component for spaCy.\n# Replace 'da' with 'sv' for the Swedish lemmatizer.\npipe = lemmy.pipe.load('da')\n\n# Add the component to the spaCy pipeline.\nnlp.add_pipe(pipe, after='tagger')\n\n# Lemmas can now be accessed using the `._.lemmas` attribute on the tokens.\nnlp(\"akvariernes\")[0]._.lemmas\n```\n\n## Training\n\nThe ``notebooks`` folder contains examples showing how to train your own model using\nLemmy.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/sorenlind/lemmy/", "keywords": "nlp lemma lemmatizer lemmatiser danish spacy", "license": "", "maintainer": "", "maintainer_email": "", "name": "lemmy", "package_url": "https://pypi.org/project/lemmy/", "platform": "", "project_url": "https://pypi.org/project/lemmy/", "project_urls": { "Homepage": "https://github.com/sorenlind/lemmy/" }, "release_url": "https://pypi.org/project/lemmy/2.1.0/", "requires_dist": [ "pandas ; extra == 'dev'", "jupyter ; extra == 'dev'", "unicodecsv ; extra == 'dev'", "bs4 ; extra == 'dev'", "tqdm ; extra == 'dev'", "regex ; extra == 'dev'", "spacy ; extra == 'dev'", "pylint ; extra == 'dev'", "pycodestyle ; extra == 'dev'", "pydocstyle ; extra == 'dev'", "yapf ; extra == 'dev'", "pytest ; extra == 'dev'", "tox ; extra == 'dev'", "pandas ; extra == 'notebooks'", "jupyter ; extra == 'notebooks'", "unicodecsv ; extra == 'notebooks'", "bs4 ; extra == 'notebooks'", "tqdm ; extra == 'notebooks'", "regex ; extra == 'notebooks'", "spacy ; extra == 'notebooks'", "pytest ; extra == 'test'", "tox ; extra == 'test'" ], "requires_python": "", "summary": "Lemmatizer for Danish", "version": "2.1.0" }, "last_serial": 5184622, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "ecdf00b151ba4a2840b5958db2b6b5dc", "sha256": "0ddf1bb04c42411866b2726e8d8172979a8b8d48d509be0d0b64468245418544" }, "downloads": -1, "filename": "lemmy-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "ecdf00b151ba4a2840b5958db2b6b5dc", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 244064, "upload_time": "2018-02-11T12:25:27", "url": "https://files.pythonhosted.org/packages/f0/bb/c2c485fa2459cbb824344479af3037d1bddf5a278599f5f5c12f4da59577/lemmy-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e46384f27c6b8358b739bc8bfbf5c71b", "sha256": "eccc2e847ea17f01390dc41049a2a4af432c6aa854de13801732d2800734540e" }, "downloads": -1, "filename": "lemmy-0.1.0.tar.gz", "has_sig": false, "md5_digest": "e46384f27c6b8358b739bc8bfbf5c71b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 234245, "upload_time": "2018-02-11T12:25:29", "url": "https://files.pythonhosted.org/packages/11/9d/2dadde6f3dee137529a86d325ed777c6e9a8b29ec2187ae9fde4391206b7/lemmy-0.1.0.tar.gz" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": "67d042dd22a7b568524d32d4099e528f", "sha256": "de062737bc9c0a1d0d493238afa2f5752e1bec80b71c7762f15c5292262fa27d" }, "downloads": -1, "filename": "lemmy-1.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "67d042dd22a7b568524d32d4099e528f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 242952, "upload_time": "2019-02-15T17:34:50", "url": "https://files.pythonhosted.org/packages/7b/fb/2740242edd7d2a8b6d402c213389c1e1f19f18715021bd968cafd5ad7a44/lemmy-1.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d862ac7f9af9d6c851b2ad777d5dfcf7", "sha256": "cbb189eb2f5fe921d1dc19a91015d72987a140e123f6ebd9296eee053502b783" }, "downloads": -1, "filename": "lemmy-1.0.0.tar.gz", "has_sig": false, "md5_digest": "d862ac7f9af9d6c851b2ad777d5dfcf7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 234262, "upload_time": "2019-02-15T17:34:52", "url": "https://files.pythonhosted.org/packages/15/6e/f2b86660d2509eaee8e4ca60c7ab16301bff975128553bdb22b5569c9f79/lemmy-1.0.0.tar.gz" } ], "2.0.0": [ { "comment_text": "", "digests": { "md5": "ce47133f982753a1622cdfddd4c042c4", "sha256": "0ec45e8b09bfece6e4677303d46b939f4c9b6b8278975abb6c4ab0fa517923e1" }, "downloads": -1, "filename": "lemmy-2.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "ce47133f982753a1622cdfddd4c042c4", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 654942, "upload_time": "2019-03-19T21:47:24", "url": "https://files.pythonhosted.org/packages/94/e5/c00c86421c79d8fd64ed0270fd3752a34fea5ebad3a845c119b01a327959/lemmy-2.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ffd70fa8981f994d2a3084f542b57c8c", "sha256": "63f3500434c5323077353cbf4cc79402c9fcf3e41fc5232f917f1ae46c09a433" }, "downloads": -1, "filename": "lemmy-2.0.0.tar.gz", "has_sig": false, "md5_digest": "ffd70fa8981f994d2a3084f542b57c8c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 628803, "upload_time": "2019-03-19T21:47:26", "url": "https://files.pythonhosted.org/packages/aa/80/a45dac1a79fdd0f833b8725a5dce749bf1388505e946c05ff351775a75cd/lemmy-2.0.0.tar.gz" } ], "2.1.0": [ { "comment_text": "", "digests": { "md5": "eda4fa2db951d202d445c4c6a83d5af9", "sha256": "0ed5fc3030f3858ebdc5c41b0c909963b5166121c855343912003b1b1c6b1603" }, "downloads": -1, "filename": "lemmy-2.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "eda4fa2db951d202d445c4c6a83d5af9", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 1087514, "upload_time": "2019-04-24T21:28:18", "url": "https://files.pythonhosted.org/packages/79/e3/66f7cf52608ee094fb411a3af4958cb3ba31af167dc03eb269568ccd0248/lemmy-2.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7f23b86de4e2a22990b5633e1ae23316", "sha256": "06f970c4e54b614cf740a7228778fbc62f01c366a544f7c86fecc7f1d2324b63" }, "downloads": -1, "filename": "lemmy-2.1.0.tar.gz", "has_sig": false, "md5_digest": "7f23b86de4e2a22990b5633e1ae23316", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1044628, "upload_time": "2019-04-24T21:28:21", "url": "https://files.pythonhosted.org/packages/18/98/01f75fe58c4c67114c99502788cc9d16d2d7d871c6ab70f1d0d360eb16f0/lemmy-2.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "eda4fa2db951d202d445c4c6a83d5af9", "sha256": "0ed5fc3030f3858ebdc5c41b0c909963b5166121c855343912003b1b1c6b1603" }, "downloads": -1, "filename": "lemmy-2.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "eda4fa2db951d202d445c4c6a83d5af9", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 1087514, "upload_time": "2019-04-24T21:28:18", "url": "https://files.pythonhosted.org/packages/79/e3/66f7cf52608ee094fb411a3af4958cb3ba31af167dc03eb269568ccd0248/lemmy-2.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7f23b86de4e2a22990b5633e1ae23316", "sha256": "06f970c4e54b614cf740a7228778fbc62f01c366a544f7c86fecc7f1d2324b63" }, "downloads": -1, "filename": "lemmy-2.1.0.tar.gz", "has_sig": false, "md5_digest": "7f23b86de4e2a22990b5633e1ae23316", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1044628, "upload_time": "2019-04-24T21:28:21", "url": "https://files.pythonhosted.org/packages/18/98/01f75fe58c4c67114c99502788cc9d16d2d7d871c6ab70f1d0d360eb16f0/lemmy-2.1.0.tar.gz" } ] }