{ "info": { "author": "Taishi Ikeda", "author_email": "taishi.ikeda.0323@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Natural Language :: Japanese", "Operating System :: Microsoft :: Windows", "Operating System :: Unix", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Text Processing :: Linguistic" ], "description": "|Codacy Badge| |Build Status| |Build status| |Coverage Status|\n|Documentation Status| |PyPI|\n\nNagisa is a Python module for Japanese word segmentation/POS-tagging. It\nis designed to be a simple and easy-to-use tool.\n\nThis tool has the following features.\n\n- Based on recurrent neural networks.\n- The word segmentation model uses character- and word-level features\n `[\u6c60\u7530+] `__.\n- The POS-tagging model uses tag dictionary information\n `[Inoue+] `__.\n\nFor more details, refer to the following links.\n\n- The article in Japanese is available `here `__.\n- The documentation is available `here `__.\n\nInstallation\n============\n\nPython 2.7.x or 3.5+ is required. This tool uses\n`DyNet `__ (the Dynamic Neural Network\nToolkit) to calculate neural networks. You can install nagisa by using\nthe following command.\n\n.. code:: bash\n\n pip install nagisa\n\nFor Windows users, please run it with Python 3.6+ (64-bit).\n\nBasic usage\n===========\n\nSample of word segmentation and POS-tagging for Japanese.\n\n.. 
code:: python\n\n import nagisa\n\n text = 'Python\u3067\u7c21\u5358\u306b\u4f7f\u3048\u308b\u30c4\u30fc\u30eb\u3067\u3059'\n words = nagisa.tagging(text)\n print(words)\n #=> Python/\u540d\u8a5e \u3067/\u52a9\u8a5e \u7c21\u5358/\u5f62\u72b6\u8a5e \u306b/\u52a9\u52d5\u8a5e \u4f7f\u3048\u308b/\u52d5\u8a5e \u30c4\u30fc\u30eb/\u540d\u8a5e \u3067\u3059/\u52a9\u52d5\u8a5e\n\n # Get a list of words\n print(words.words)\n #=> ['Python', '\u3067', '\u7c21\u5358', '\u306b', '\u4f7f\u3048\u308b', '\u30c4\u30fc\u30eb', '\u3067\u3059']\n\n # Get a list of POS-tags\n print(words.postags)\n #=> ['\u540d\u8a5e', '\u52a9\u8a5e', '\u5f62\u72b6\u8a5e', '\u52a9\u52d5\u8a5e', '\u52d5\u8a5e', '\u540d\u8a5e', '\u52a9\u52d5\u8a5e']\n\nPost-processing functions\n=========================\n\nFilter and extract words by specific POS tags.\n\n.. code:: python\n\n # Filter out the words with the specified POS tags.\n words = nagisa.filter(text, filter_postags=['\u52a9\u8a5e', '\u52a9\u52d5\u8a5e'])\n print(words)\n #=> Python/\u540d\u8a5e \u7c21\u5358/\u5f62\u72b6\u8a5e \u4f7f\u3048\u308b/\u52d5\u8a5e \u30c4\u30fc\u30eb/\u540d\u8a5e\n\n # Extract only nouns.\n words = nagisa.extract(text, extract_postags=['\u540d\u8a5e'])\n print(words)\n #=> Python/\u540d\u8a5e \u30c4\u30fc\u30eb/\u540d\u8a5e\n\n # This is a list of available POS-tags in nagisa.\n print(nagisa.tagger.postags)\n #=> ['\u88dc\u52a9\u8a18\u53f7', '\u540d\u8a5e', ... , 'URL']\n\nAdd a user dictionary in an easy way.\n\n.. 
code:: python\n\n # default\n text = \"3\u6708\u306b\u898b\u305f\u300c3\u6708\u306e\u30e9\u30a4\u30aa\u30f3\u300d\"\n print(nagisa.tagging(text))\n #=> 3/\u540d\u8a5e \u6708/\u540d\u8a5e \u306b/\u52a9\u8a5e \u898b/\u52d5\u8a5e \u305f/\u52a9\u52d5\u8a5e \u300c/\u88dc\u52a9\u8a18\u53f7 3/\u540d\u8a5e \u6708/\u540d\u8a5e \u306e/\u52a9\u8a5e \u30e9\u30a4\u30aa\u30f3/\u540d\u8a5e \u300d/\u88dc\u52a9\u8a18\u53f7\n\n # If a word (\"3\u6708\u306e\u30e9\u30a4\u30aa\u30f3\") is included in the single_word_list, it is recognized as a single word.\n new_tagger = nagisa.Tagger(single_word_list=['3\u6708\u306e\u30e9\u30a4\u30aa\u30f3'])\n print(new_tagger.tagging(text))\n #=> 3/\u540d\u8a5e \u6708/\u540d\u8a5e \u306b/\u52a9\u8a5e \u898b/\u52d5\u8a5e \u305f/\u52a9\u52d5\u8a5e \u300c/\u88dc\u52a9\u8a18\u53f7 3\u6708\u306e\u30e9\u30a4\u30aa\u30f3/\u540d\u8a5e \u300d/\u88dc\u52a9\u8a18\u53f7\n\nTrain a model\n=============\n\nNagisa (v0.2.0+) provides a simple train method for a joint word\nsegmentation and sequence labeling (e.g., POS-tagging, NER) model.\n\nThe train/dev/test files are in TSV format. Each line holds one ``word``\nand its ``tag``, separated by a tab. Note that you must put EOS between\nsentences. Refer to `sample\ndatasets `__ and `tutorial (Train a model\nfor Universal\nDependencies) `__.\n\n::\n\n $ cat sample.train\n \u552f\u4e00 NOUN\n \u306e ADP\n \u8da3\u5473 NOUN\n \u306f ADP\n \u6599\u7406 NOUN\n EOS\n \u3068\u3066\u3082 ADV\n \u304a\u3044\u3057\u304b\u3063 ADJ\n \u305f AUX\n \u3067\u3059 AUX\n \u3002 PUNCT\n EOS\n \u30c9\u30eb NOUN\n \u306f ADP\n \u4e3b\u8981 ADJ\n \u901a\u8ca8 NOUN\n EOS\n\n.. 
code:: python\n\n # After training finishes, the three model files (*.vocabs, *.params, *.hp) are saved.\n nagisa.fit(train_file=\"sample.train\", dev_file=\"sample.dev\", test_file=\"sample.test\", model_name=\"sample\")\n\n # Build the tagger by loading the trained model files.\n sample_tagger = nagisa.Tagger(vocabs='sample.vocabs', params='sample.params', hp='sample.hp')\n\n text = \"\u798f\u5ca1\u30fb\u535a\u591a\u306e\u89b3\u5149\u60c5\u5831\"\n words = sample_tagger.tagging(text)\n print(words)\n #=> \u798f\u5ca1/PROPN \u30fb/SYM \u535a\u591a/PROPN \u306e/ADP \u89b3\u5149/NOUN \u60c5\u5831/NOUN\n\n.. |Codacy Badge| image:: https://api.codacy.com/project/badge/Grade/769dd003c7184d4d81dad74fd8a322a1\n :target: https://app.codacy.com/app/taishi-i/nagisa?utm_source=github.com&utm_medium=referral&utm_content=taishi-i/nagisa&utm_campaign=Badge_Grade_Dashboard\n.. |Build Status| image:: https://travis-ci.org/taishi-i/nagisa.svg?branch=master\n :target: https://travis-ci.org/taishi-i/nagisa\n.. |Build status| image:: https://ci.appveyor.com/api/projects/status/6k35hmxl1juf1hqf?svg=true\n :target: https://ci.appveyor.com/project/taishi-i/nagisa\n.. |Coverage Status| image:: https://coveralls.io/repos/github/taishi-i/nagisa/badge.svg?branch=master\n :target: https://coveralls.io/github/taishi-i/nagisa?branch=master\n.. |Documentation Status| image:: https://readthedocs.org/projects/nagisa/badge/?version=latest\n :target: https://nagisa.readthedocs.io/en/latest/?badge=latest\n.. 
|PyPI| image:: https://img.shields.io/pypi/v/nagisa.svg\n :target: https://pypi.python.org/pypi/nagisa", "description_content_type": "", "docs_url": null, "download_url": "https://github.com/taishi-i/nagisa/archive/0.2.4.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/taishi-i/nagisa", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "nagisa", "package_url": "https://pypi.org/project/nagisa/", "platform": "Unix", "project_url": "https://pypi.org/project/nagisa/", "project_urls": { "Download": "https://github.com/taishi-i/nagisa/archive/0.2.4.tar.gz", "Homepage": "https://github.com/taishi-i/nagisa" }, "release_url": "https://pypi.org/project/nagisa/0.2.4/", "requires_dist": null, "requires_python": "", "summary": "A Japanese tokenizer based on recurrent neural networks", "version": "0.2.4" }, "last_serial": 5634996, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "06ab1caf6fa4b547c475010e5484d4f7", "sha256": "5e93424f3ab95cebdf61f5ebfcd54319a3948abd12b66c11aad813df575dbcc3" }, "downloads": -1, "filename": "nagisa-0.0.1.tar.gz", "has_sig": false, "md5_digest": "06ab1caf6fa4b547c475010e5484d4f7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20641752, "upload_time": "2018-02-15T13:42:41", "url": "https://files.pythonhosted.org/packages/b3/f3/ac074f8db6e0c3da01ab2d7dfa4ab1328fc9930966d4566131a60718bbb6/nagisa-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "be0f56f36a361bfff7c2b7f291c730c4", "sha256": "b84379d91c5b5f7cd6c30d0e413186c565f23533523713d077aaf9f67d74488c" }, "downloads": -1, "filename": "nagisa-0.0.2.tar.gz", "has_sig": false, "md5_digest": "be0f56f36a361bfff7c2b7f291c730c4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20647851, "upload_time": "2018-02-22T09:01:20", "url": 
"https://files.pythonhosted.org/packages/69/3e/c4b065b9816c5e1af313d607d908a4dda8db0fc17a4846f12781eddef258/nagisa-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "83308e6995d819cecc541aa60b12c312", "sha256": "14f50b4fae7f7e365da34f4880cb13fcf47b32dc04593a3edea89ff7f1c5cf3b" }, "downloads": -1, "filename": "nagisa-0.0.3.tar.gz", "has_sig": false, "md5_digest": "83308e6995d819cecc541aa60b12c312", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20647141, "upload_time": "2018-02-25T06:27:34", "url": "https://files.pythonhosted.org/packages/0c/c2/c63711d3c11a0a427b4ac70db48a754bc4807692e810f7e3831c3cc1e3b8/nagisa-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "7d1c2af665bf1539ec1b217270830598", "sha256": "a98724d2899d254cc72745e94014376e69d2088b6d00a28a736a9a5f2588abed" }, "downloads": -1, "filename": "nagisa-0.0.4.tar.gz", "has_sig": false, "md5_digest": "7d1c2af665bf1539ec1b217270830598", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20648814, "upload_time": "2018-02-25T06:47:13", "url": "https://files.pythonhosted.org/packages/7f/e3/c6a0b1c4d939f3e2e485da0a2d6f2914ea6e48155fae9b6c117c2155f3a1/nagisa-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "93e0c3cba6f6056749a2ca45b64b3a8e", "sha256": "e61fd0981f5aa5a2995df82cea1836e6c8b919acff2c5ddfe4ac1c51b52bed28" }, "downloads": -1, "filename": "nagisa-0.0.5.tar.gz", "has_sig": false, "md5_digest": "93e0c3cba6f6056749a2ca45b64b3a8e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20648815, "upload_time": "2018-02-25T07:26:13", "url": "https://files.pythonhosted.org/packages/8d/a0/45d8d196b80f0a585f56e148e7af4488c559ee3ab87e8bac810f853c571f/nagisa-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "b038023970b3eec2f0aba2d60070ea4d", "sha256": "bf87ab957ff620cc0732f7af22c2f39f866fee0649dab3de6c33f092c92be70d" }, 
"downloads": -1, "filename": "nagisa-0.0.6.tar.gz", "has_sig": false, "md5_digest": "b038023970b3eec2f0aba2d60070ea4d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20838122, "upload_time": "2018-03-19T15:31:07", "url": "https://files.pythonhosted.org/packages/ce/32/a6b953171a922d848de6a996afedc18dc88e3ab69db5f36ea1d71b72e431/nagisa-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "e6d564ea6363e71325fb7df3bdf80926", "sha256": "e7108a8c55ebb6653639012d63ee2856283b73cc89dfbfb7238b7ee087bd4ba2" }, "downloads": -1, "filename": "nagisa-0.0.7.tar.gz", "has_sig": false, "md5_digest": "e6d564ea6363e71325fb7df3bdf80926", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20839885, "upload_time": "2018-05-17T03:12:14", "url": "https://files.pythonhosted.org/packages/2d/f0/a6e2ab5d161768c3460404204d44035660b68f25ad20c5946d44ea6db677/nagisa-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "0973c21b99ac9fdd8a1858e736af13fb", "sha256": "0f00cb006c8130e9df6af3703c37d77d73f8049c34c9ac1c6c9062225a41cc2f" }, "downloads": -1, "filename": "nagisa-0.0.8.tar.gz", "has_sig": false, "md5_digest": "0973c21b99ac9fdd8a1858e736af13fb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20837992, "upload_time": "2018-05-22T15:00:56", "url": "https://files.pythonhosted.org/packages/f4/7b/fe6aceb3d236b6551daa97b01ea084fbbb88c7bb0b7ad8654e9a4af4ce9d/nagisa-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "461c8f34c40d57a8f44b842b871391cd", "sha256": "9caff3c399d69c00961c39ea96cd22c702e68552498e8ec329384e499b07dae1" }, "downloads": -1, "filename": "nagisa-0.0.9.tar.gz", "has_sig": false, "md5_digest": "461c8f34c40d57a8f44b842b871391cd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20841233, "upload_time": "2018-06-27T16:32:59", "url": 
"https://files.pythonhosted.org/packages/90/8d/a3c91b4762f7b65ffcbef8f1fbad62332fcf901d32b7bd697b79329a94a0/nagisa-0.0.9.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "dcdf7e19baf66ad61b560e298a16bdd8", "sha256": "a0ad541106afd96218b2e9dcc68caefd2ab2a400590d9f26aa56e775e5c2be15" }, "downloads": -1, "filename": "nagisa-0.1.0.tar.gz", "has_sig": false, "md5_digest": "dcdf7e19baf66ad61b560e298a16bdd8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20841289, "upload_time": "2018-09-02T07:02:07", "url": "https://files.pythonhosted.org/packages/a4/a3/7b6c2208f3c1db5aabafeee052d0cdbb5df57e60881730f78544ad262835/nagisa-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "6416af11cc4c7fee72d36ff1bd288c4a", "sha256": "442ea19ffae679f4ab93e09780d5cd283fb4ff1dbd961a30c67bd9d74c17bb18" }, "downloads": -1, "filename": "nagisa-0.1.1.tar.gz", "has_sig": false, "md5_digest": "6416af11cc4c7fee72d36ff1bd288c4a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20841464, "upload_time": "2018-09-21T12:29:29", "url": "https://files.pythonhosted.org/packages/a1/40/a94f7944ee5d6a4d44eadcc966fe0d46b5155fb139d7b4d708e439617df1/nagisa-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "482b0f2d8e77689eb3552573917cd6ca", "sha256": "2d8454acd379905b88f13dc020aa9c769b91706eb1b0afcdd8913a6da815175d" }, "downloads": -1, "filename": "nagisa-0.1.2.tar.gz", "has_sig": false, "md5_digest": "482b0f2d8e77689eb3552573917cd6ca", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20842822, "upload_time": "2018-12-25T15:08:41", "url": "https://files.pythonhosted.org/packages/a6/19/b3bb95389b45f719528f5c6d32d46886be276fcac439932746ed758c5839/nagisa-0.1.2.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "e9897d8c9edb0937b2fe2d1b409cadb8", "sha256": "a3a4e601b57db193b51db5febc7e1f76d2ff8a3fb24944f4021244aadb022e0f" }, 
"downloads": -1, "filename": "nagisa-0.2.0.tar.gz", "has_sig": false, "md5_digest": "e9897d8c9edb0937b2fe2d1b409cadb8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20888885, "upload_time": "2019-01-09T17:34:35", "url": "https://files.pythonhosted.org/packages/d9/66/0f6fdea8e341ce39fb4a4effb0a9c7707449ac5e8bc9e22d120d18f58202/nagisa-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "9c4aaa04bfd402232baed1006b99af6b", "sha256": "f570acc620d76bf6908d23eef96645e52c6d79b6187c52d1269cfa3b429e8a76" }, "downloads": -1, "filename": "nagisa-0.2.1.tar.gz", "has_sig": false, "md5_digest": "9c4aaa04bfd402232baed1006b99af6b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20890640, "upload_time": "2019-03-03T16:50:19", "url": "https://files.pythonhosted.org/packages/39/d6/1a5cf5cf1abaa97101bcaff6e72d42a09de66bfbb4a37eae47adab634b80/nagisa-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "c2864512c259c640f5413c1699e4f0a0", "sha256": "9590080c97e24985fe83315ecfd3cbd59214ecbcdf94b15e4b62863863daeb4c" }, "downloads": -1, "filename": "nagisa-0.2.2.tar.gz", "has_sig": false, "md5_digest": "c2864512c259c640f5413c1699e4f0a0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20891040, "upload_time": "2019-05-03T16:54:58", "url": "https://files.pythonhosted.org/packages/11/17/faa25ceb7964d758e593f753ec4dc7c2be13ec94d948868d9761c81f732f/nagisa-0.2.2.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "73e5dc31dad5c9e2082b78923a6bf243", "sha256": "5023215f1a63b42bfcab535a15e3a7b9c23652ae237103b9206d113ca84377cc" }, "downloads": -1, "filename": "nagisa-0.2.3.tar.gz", "has_sig": false, "md5_digest": "73e5dc31dad5c9e2082b78923a6bf243", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20891227, "upload_time": "2019-05-19T06:51:29", "url": 
"https://files.pythonhosted.org/packages/d4/c5/dd56d380f037334b75bcdd6a3e41f1990c07fa0712b754fbc9fa878199f4/nagisa-0.2.3.tar.gz" } ], "0.2.4": [ { "comment_text": "", "digests": { "md5": "4bbb17c01acd4a6a4f63f63559686947", "sha256": "a5b7a86bed38767848ab22340bc7eb1709a92fb64a53f1d19458113b08c1b8ea" }, "downloads": -1, "filename": "nagisa-0.2.4.tar.gz", "has_sig": false, "md5_digest": "4bbb17c01acd4a6a4f63f63559686947", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20895333, "upload_time": "2019-08-05T15:34:00", "url": "https://files.pythonhosted.org/packages/29/25/f8a7916c541c79eb59c3a30f80ab2055ff26330518bdffa3e38ee4d76edf/nagisa-0.2.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4bbb17c01acd4a6a4f63f63559686947", "sha256": "a5b7a86bed38767848ab22340bc7eb1709a92fb64a53f1d19458113b08c1b8ea" }, "downloads": -1, "filename": "nagisa-0.2.4.tar.gz", "has_sig": false, "md5_digest": "4bbb17c01acd4a6a4f63f63559686947", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20895333, "upload_time": "2019-08-05T15:34:00", "url": "https://files.pythonhosted.org/packages/29/25/f8a7916c541c79eb59c3a30f80ab2055ff26330518bdffa3e38ee4d76edf/nagisa-0.2.4.tar.gz" } ] }