{ "info": { "author": "THUNLP", "author_email": "i@dozbear.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# OpenHowNet API\n\nThis project contains the core data of HowNet and the OpenHowNet API developed by THUNLP, which provides a convenient way to search information in HowNet, display sememe trees, calculate word similarity via sememes, etc. If you would like to learn more about OpenHowNet, please visit our [website](https://openhownet.thunlp.org).\n\nIf you use any data or API provided by OpenHowNet, please cite the following papers:\n\n\t@article{qi2019openhownet,\n\t title={OpenHowNet: An Open Sememe-based Lexical Knowledge Base},\n\t author={Qi, Fanchao and Yang, Chenghao and Liu, Zhiyuan and Dong, Qiang and Sun, Maosong and Dong, Zhendong},\n\t journal={arXiv preprint arXiv:1901.09957},\n\t year={2019},\n\t}\n\n\t@inproceedings{dong2003hownet,\n\t title={HowNet-a hybrid language and knowledge resource},\n\t author={Dong, Zhendong and Dong, Qiang},\n\t booktitle={Proceedings of NLP-KE},\n\t year={2003},\n\t}\n\n## Requirements\n\n* Python==3.6\n* anytree==2.4.3\n* tqdm==4.31.1\n* requests==2.22.0\n\n## Installation\n\nFirst, run `pip install OpenHowNet`\n\n```python\nimport OpenHowNet\nhownet_dict = OpenHowNet.HowNetDict()\n```\n\nAn error will occur if you haven't downloaded the HowNet data; run `OpenHowNet.download()` first to make the package functional.\n\n## Interfaces\n\n|interfaces|description|params|\n|---|-------|-------|\nget(self, word, language=None)|Search all information annotated with a word. | \"word\" is the target word. \"language\" is 'en'(English) or 'zh'(Chinese), searching in both languages by default.\nget\\_sememes\\_by\\_word(self, word, structured=False, lang='zh', merge=False, expanded_layer=-1) | Search sememes of the target word. 
You can choose whether multiple senses in the result are merged, whether the result is structured, and how many layers of the tree are expanded. | \"word\" is the target word. \"lang\" is 'en'(English) or 'zh'(Chinese). \"structured\" is whether the result is structured. \"merge\" is whether the result is merged. \"expanded_layer\" is the number of layers to expand; -1 expands all layers.\ninitialize\\_sememe\\_similarity\\_calculation(self)| Initialize the implementation of advanced feature #1 to calculate sememe similarity. May take some time to read the necessary files.|\ncalculate\\_word\\_similarity(self, word0, word1)|Calculate the similarity of two words. You need to run initialize\\_sememe\\_similarity\\_calculation before calling this function.|\"word0\" and \"word1\" are the words to compare.\nget\\_nearest\\_words\\_via\\_sememes(self, word, K=10)|Get the nearest K words to the target word with similarity calculated via sememes.|\"word\" is the target word, \"K\" is the number of nearest neighbors to return.\nget\\_sememe\\_relation(self, x, y)|Get the relationship between sememe x and sememe y.|\"x\" and \"y\" are the sememes to query.\nget\\_sememe\\_via\\_relation(self, x, relation, lang='zh')|Get all sememes that have the specified relation with sememe x.|\"x\" is the target sememe, \"relation\" is the target relation, \"lang\" is 'en'(English) or 'zh'(Chinese)\n\n## Usage\n\n### Get word annotations in HowNet\n\nBy default, the API searches for the target word in both the English and Chinese annotations in HowNet, which adds significant search overhead. 
Note that if the target word does not exist in the HowNet annotations, this API simply returns an empty list.\n\n```python\n>>> result_list = hownet_dict.get(\"\u82f9\u679c\")\n>>> print(\"Number of results:\",len(result_list))\n>>> print(\"Example result:\",result_list[0])\nNumber of results: 6\nExample result: {'Def': '{computer|\u7535\u8111:modifier={PatternValue|\u6837\u5f0f\u503c:CoEvent={able|\u80fd:scope={bring|\u643a\u5e26:patient={$}}}}{SpeBrand|\u7279\u5b9a\u724c\u5b50}}', 'en_grammar': 'noun', 'zh_grammar': 'noun', 'No': '127151', 'syn': [{'id': '004024', 'text': 'IBM'}, {'id': '041684', 'text': '\u6234\u5c14'}, {'id': '049006', 'text': '\u4e1c\u829d'}, {'id': '106795', 'text': '\u8054\u60f3'}, {'id': '156029', 'text': '\u7d22\u5c3c'}, {'id': '004203', 'text': 'iPad'}, {'id': '019457', 'text': '\u7b14\u8bb0\u672c'}, {'id': '019458', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019459', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019460', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019461', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019463', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'}, {'id': '019464', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'}, {'id': '020567', 'text': '\u4fbf\u643a\u5f0f\u7535\u8111'}, {'id': '020568', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'}, {'id': '020569', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'}, {'id': '127224', 'text': '\u5e73\u677f\u7535\u8111'}, {'id': '127225', 'text': '\u5e73\u677f\u7535\u8111'}, {'id': '172264', 'text': '\u819d\u4e0a\u578b\u7535\u8111'}, {'id': '172265', 'text': '\u819d\u4e0a\u578b\u7535\u8111'}], 'zh_word': '\u82f9\u679c', 'en_word': 'apple'}\n\n>>> hownet_dict.get(\"test_for_non_exist_word\")\n[]\n```\n\nYou can visualize the retrieved HowNet structured annotations (\"sememe trees\") of the target word as follows:\n(K=2 means only 2 sememe trees are displayed)\n\n```python\n>>> 
hownet_dict.visualize_sememe_trees(\"\u82f9\u679c\", K=2)\nFind 6 result(s)\nDisplay #0 sememe tree\n[sense]\u82f9\u679c\n\u2514\u2500\u2500 [None]computer|\u7535\u8111\n    \u251c\u2500\u2500 [modifier]PatternValue|\u6837\u5f0f\u503c\n    \u2502   \u2514\u2500\u2500 [CoEvent]able|\u80fd\n    \u2502       \u2514\u2500\u2500 [scope]bring|\u643a\u5e26\n    \u2502           \u2514\u2500\u2500 [patient]$\n    \u2514\u2500\u2500 [patient]SpeBrand|\u7279\u5b9a\u724c\u5b50\nDisplay #1 sememe tree\n[sense]\u82f9\u679c\n\u2514\u2500\u2500 [None]fruit|\u6c34\u679c\n```\n\nTo speed up the search, you can specify the language of the target word as follows.\n\n```python\n>>> result_list = hownet_dict.get(\"\u82f9\u679c\", language=\"zh\")\n>>> print(\"Monolingual result count:\",len(result_list))\n>>> print(\"Monolingual example result:\",result_list[0])\n>>> print(\"-------Bilingual mixed search test---------\")\n>>> print(\"Mixed search result count:\",len(hownet_dict.get(\"X\")))\n>>> print(\"Chinese search result count:\",len(hownet_dict.get(\"X\",language=\"zh\")))\n>>> print(\"English search result count:\",len(hownet_dict.get(\"X\",language=\"en\")))\nMonolingual result count: 6\nMonolingual example result: {'Def': '{computer|\u7535\u8111:modifier={PatternValue|\u6837\u5f0f\u503c:CoEvent={able|\u80fd:scope={bring|\u643a\u5e26:patient={$}}}}{SpeBrand|\u7279\u5b9a\u724c\u5b50}}', 'en_grammar': 'noun', 'zh_grammar': 'noun', 'No': '127151', 'syn': [{'id': '004024', 'text': 'IBM'}, {'id': '041684', 'text': '\u6234\u5c14'}, {'id': '049006', 'text': '\u4e1c\u829d'}, {'id': '106795', 'text': '\u8054\u60f3'}, {'id': '156029', 'text': '\u7d22\u5c3c'}, {'id': '004203', 'text': 'iPad'}, {'id': '019457', 'text': '\u7b14\u8bb0\u672c'}, {'id': '019458', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019459', 'text': 
'\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019460', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019461', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'}, {'id': '019463', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'}, {'id': '019464', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'}, {'id': '020567', 'text': '\u4fbf\u643a\u5f0f\u7535\u8111'}, {'id': '020568', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'}, {'id': '020569', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'}, {'id': '127224', 'text': '\u5e73\u677f\u7535\u8111'}, {'id': '127225', 'text': '\u5e73\u677f\u7535\u8111'}, {'id': '172264', 'text': '\u819d\u4e0a\u578b\u7535\u8111'}, {'id': '172265', 'text': '\u819d\u4e0a\u578b\u7535\u8111'}], 'zh_word': '\u82f9\u679c', 'en_word': 'apple'}\n-------Bilingual mixed search test---------\nMixed search result count: 5\nChinese search result count: 3\nEnglish search result count: 2\n\n>>> hownet_dict.get(\"\u82f9\u679c\", language=\"en\")\n[]\n```\n\n### Get All Words annotated in HowNet\n\n```python\n>>> zh_word_list = hownet_dict.get_zh_words()\n>>> en_word_list = hownet_dict.get_en_words()\n>>> print(zh_word_list[:30])\n>>> print(en_word_list[:30])\n['', '\"', '#', '#\u53f7\u6807\u7b7e', '$', '%', \"'\", '(', ')', '*', '+', '-', '--', '...', '...\u51fa\u4ec0\u4e48\u95ee\u9898', '...\u5e95', '...\u5e95\u4e0b', '...\u53d1\u751f\u6545\u969c', '...\u53d1\u751f\u4e86\u4ec0\u4e48', '...\u4f55\u5982', '...\u5bb6\u91cc\u6709\u51e0\u53e3\u4eba', '...\u68c0\u6d4b\u5448\u9633\u6027', '...\u68c0\u6d4b\u5448\u9634\u6027', '...\u6765', '...\u5185', '...\u4e3a\u6b62', '...\u4e5f\u540c\u6837\u4f7f\u7136', '...\u4ee5\u6765', '...\u4ee5\u5185', '...\u4ee5\u4e0a']\n['A', 'An', 'Frenchmen', 'Frenchwomen', 'Ottomans', 'a', 'aardwolves', 'abaci', 'abandoned', 'abbreviated', 'abode', 'aboideaux', 'aboiteaux', 'abscissae', 'absorbed', 'acanthi', 'acari', 'accepted', 'acciaccature', 'acclaimed', 
'accommodating', 'accompanied', 'accounting', 'accused', 'acetabula', 'acetified', 'aching', 'acicula', 'acini', 'acquired']\n```\n\n### Get Flattened Sememe Trees for a certain word or all words in HowNet\n\nCaution: the parameters \"lang\", \"merge\" and \"expanded_layer\" only take effect when \"structured = False\". The reason is that there are multiple ways to interpret these parameters when dealing with structured data, so we leave that choice to the end user. A later section shows how to use the structured data.\n\nDetailed explanations of the parameters will be provided in our documentation.\n\n#### Get the full merged sememe list from multi-sense words\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=False,lang=\"zh\",merge=True)\n{'\u7535\u8111', '\u4ea4\u6d41', '\u7528\u5177', '\u6c34\u679c', '\u7279\u5b9a\u724c\u5b50', '\u6837\u5f0f\u503c', '\u80fd', '\u6811', '\u751f\u6b96', '\u643a\u5e26'}\n\n>>> hownet_dict.get_sememes_by_word(\"apple\",structured=False,lang=\"en\",merge=True)\n{'communicate', 'able', 'reproduce', 'SpeBrand', 'computer', 'bring', 'tool', 'PatternValue', 'tree', '$', 'fruit'}\n```\n\nEven if the specified language does not match the language of the target word, the API still works. 
It returns all word entries in the language you specified.\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=False,lang=\"en\",merge=True)\n{'apple': {'communicate', 'able', 'reproduce', 'SpeBrand', 'computer', 'bring', 'tool', 'PatternValue', 'tree', '$', 'fruit'}, 'malus pumila': {'reproduce', 'fruit', 'tree'}, 'orchard apple tree': {'reproduce', 'fruit', 'tree'}}\n```\n\nNote that, in the latest version, if there is only one word entry, the API simply returns the set of sememes for convenience.\n\nYou can specify the number of expanded layers as follows:\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=False,merge=True,expanded_layer=2)\n{'\u7535\u8111', '\u6811', '\u7528\u5177', '\u6c34\u679c'}\n```\n\nYou can also get the flattened sememe trees for all words, optionally specifying the number of expanded layers:\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"*\",structured=False,merge=True)\n# the result is too large, just try it yourself.\n```\n\nIf you would like to see the sememe lists for the different senses of a particular word in HowNet, just set the parameter \"merge\" to False.\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=False,lang=\"zh\",merge=False)\n[{'word': '\u82f9\u679c', 'sememes': {'\u7279\u5b9a\u724c\u5b50', '\u6837\u5f0f\u503c', '\u7535\u8111', '\u80fd', '\u643a\u5e26'}},\n{'word': '\u82f9\u679c', 'sememes': {'\u6c34\u679c'}},\n{'word': '\u82f9\u679c', 'sememes': {'\u7279\u5b9a\u724c\u5b50', '\u6837\u5f0f\u503c', '\u80fd', '\u4ea4\u6d41', '\u7528\u5177', '\u643a\u5e26'}},\n{'word': '\u82f9\u679c', 'sememes': {'\u6811', '\u751f\u6b96', '\u6c34\u679c'}},\n{'word': '\u82f9\u679c', 'sememes': {'\u6811', '\u751f\u6b96', '\u6c34\u679c'}},\n{'word': '\u82f9\u679c', 'sememes': {'\u6811', '\u751f\u6b96', '\u6c34\u679c'}}]\n\n>>> 
hownet_dict.get_sememes_by_word(\"apple\",structured=False,lang=\"en\",merge=False)\n[{'word': 'apple', 'sememes': {'able', 'computer', 'bring', 'SpeBrand', 'PatternValue', '$'}},\n{'word': 'apple', 'sememes': {'fruit'}},\n{'word': 'apple', 'sememes': {'communicate', 'able', 'bring', 'tool', 'SpeBrand', 'PatternValue', '$'}},\n{'word': 'apple', 'sememes': {'reproduce', 'fruit', 'tree'}},\n{'word': 'apple', 'sememes': {'communicate', 'able', 'bring', 'tool', 'SpeBrand', 'PatternValue', '$'}},\n{'word': 'apple', 'sememes': {'reproduce', 'fruit', 'tree'}},\n{'word': 'apple', 'sememes': {'fruit'}},\n{'word': 'apple', 'sememes': {'fruit'}}]\n```\n\n### Get Structured Sememe Trees for certain words in HowNet\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=True)[0][\"tree\"]\n{'role': 'sense', 'name': '\u82f9\u679c','children': [\n {'role': 'None', 'name': 'computer|\u7535\u8111', 'children': [\n {'role': 'modifier', 'name': 'PatternValue|\u6837\u5f0f\u503c', 'children': [\n {'role': 'CoEvent', 'name': 'able|\u80fd', 'children': [\n {'role': 'scope', 'name': 'bring|\u643a\u5e26', 'children': [\n {'role': 'patient', 'name': '$'}\n ]}\n ]}\n ]},\n {'role': 'patient', 'name': 'SpeBrand|\u7279\u5b9a\u724c\u5b50'}\n ]}\n]}\n```\n\nTwo ways to see the corresponding annotation data\n\n```python\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=True)[0][\"tree\"] # or\n>>> hownet_dict.get_sememes_by_word(\"\u82f9\u679c\",structured=True)[0][\"word\"]\n>>> # two results are the same, only displaying one\n{'Def': '{computer|\u7535\u8111:modifier={PatternValue|\u6837\u5f0f\u503c:CoEvent={able|\u80fd:scope={bring|\u643a\u5e26:patient={$}}}}{SpeBrand|\u7279\u5b9a\u724c\u5b50}}',\n'en_grammar': 'noun',\n'zh_grammar': 'noun',\n'No': '127151',\n'syn': [\n {'id': '004024', 'text': 'IBM'},\n {'id': '041684', 'text': '\u6234\u5c14'},\n {'id': '049006', 'text': '\u4e1c\u829d'},\n {'id': '106795', 'text': '\u8054\u60f3'},\n {'id': '156029', 'text': 
'\u7d22\u5c3c'},\n {'id': '004203', 'text': 'iPad'},\n {'id': '019457', 'text': '\u7b14\u8bb0\u672c'},\n {'id': '019458', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019459', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019460', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019461', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019463', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'},\n {'id': '019464', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'},\n {'id': '020567', 'text': '\u4fbf\u643a\u5f0f\u7535\u8111'},\n {'id': '020568', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'},\n {'id': '020569', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'},\n {'id': '127224', 'text': '\u5e73\u677f\u7535\u8111'},\n {'id': '127225', 'text': '\u5e73\u677f\u7535\u8111'},\n {'id': '172264', 'text': '\u819d\u4e0a\u578b\u7535\u8111'},\n {'id': '172265', 'text': '\u819d\u4e0a\u578b\u7535\u8111'}\n],\n'zh_word': '\u82f9\u679c',\n'en_word': 'apple'}\n```\n\n### Get the static synonyms of the certain word\n\nThe similarity metrics are based on HowNet.\n\n```python\n>>> hownet_dict[\"\u82f9\u679c\"][0][\"syn\"]\n[{'id': '004024', 'text': 'IBM'},\n {'id': '041684', 'text': '\u6234\u5c14'},\n {'id': '049006', 'text': '\u4e1c\u829d'},\n {'id': '106795', 'text': '\u8054\u60f3'},\n {'id': '156029', 'text': '\u7d22\u5c3c'},\n {'id': '004203', 'text': 'iPad'},\n {'id': '019457', 'text': '\u7b14\u8bb0\u672c'},\n {'id': '019458', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019459', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019460', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019461', 'text': '\u7b14\u8bb0\u672c\u7535\u8111'},\n {'id': '019463', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'},\n {'id': '019464', 'text': '\u7b14\u8bb0\u7c3f\u7535\u8111'},\n {'id': '020567', 'text': '\u4fbf\u643a\u5f0f\u7535\u8111'},\n {'id': '020568', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'},\n {'id': '020569', 'text': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a'},\n {'id': 
'127224', 'text': '\u5e73\u677f\u7535\u8111'},\n {'id': '127225', 'text': '\u5e73\u677f\u7535\u8111'},\n {'id': '172264', 'text': '\u819d\u4e0a\u578b\u7535\u8111'},\n {'id': '172265', 'text': '\u819d\u4e0a\u578b\u7535\u8111'}]\n```\n\n### Access a word by ID\n\n```python\n>>> hownet_dict[\"004024\"]\n['Def', 'en_grammar', 'zh_grammar', 'No', 'syn', 'zh_word', 'en_word']\n```\n\n### Get all sememes\n\n```python\n>>> len(hownet_dict.get_all_sememes())\n2187\n```\n\n### Get the relationship between two sememes\n\nThe input sememes can be given in either Chinese or English.\n\n```python\n>>> hownet_dict.get_sememe_relation(\"\u97f3\u91cf\u503c\", \"\u5c16\u58f0\")\n>>> hownet_dict.get_sememe_relation(\"\u97f3\u91cf\u503c\", \"shrill\")\n>>> hownet_dict.get_sememe_relation(\"\u5c16\u58f0\", \"SoundVolumeValue\")\n>>> hownet_dict.get_sememe_relation(\"shrill\", \"SoundVolumeValue\")\n'hyponym'\n'hyponym'\n'hypernym'\n'hypernym'\n```\n\nThe output is one of hypernym, hyponym, antonym or converse.\n\n### Get sememes via another sememe and a relation\n\nThe input sememe can be given in either Chinese or English, but the relation must be in lowercase. You can specify the language of the result; it defaults to Chinese.\n\n```python\n>>> hownet_dict.get_sememe_via_relation(\"\u97f3\u91cf\u503c\", \"hyponym\")\n>>> hownet_dict.get_sememe_via_relation(\"\u97f3\u91cf\u503c\", \"hyponym\", lang=\"en\")\n>>> hownet_dict.get_sememe_via_relation(\"SoundVolumeValue\", \"hyponym\", lang=\"en\")\n['\u9ad8\u58f0', '\u4f4e\u58f0', '\u5c16\u58f0', '\u6c99\u54d1', '\u65e0\u58f0', '\u6709\u58f0']\n['loud', 'LowVoice', 'shrill', 'hoarse', 'silent', 'talking']\n['loud', 'LowVoice', 'shrill', 'hoarse', 'silent', 'talking']\n```\n\n## Advanced Feature #1: Word Similarity Calculation via Sememes\n\nThe following parts were mainly implemented by Jun Yan and integrated by Chenghao Yang. Our implementation is based on the paper:\n\n> Jiangming Liu, Jinan Xu, Yujie Zhang. 
An Approach of Hybrid Hierarchical Structure for Word Similarity Computing by HowNet. In Proceedings of IJCNLP\n\n### Extra Initialization\n\nBecause similarity calculation requires loading some extra files, initialization takes longer than before. To begin with, you can initialize the hownet_dict object as follows:\n\n```python\n>>> hownet_dict_advanced = OpenHowNet.HowNetDict(use_sim=True)\n```\n\nYou can also postpone the initialization for similarity calculation until it is needed. The following code serves as an example; the return value indicates whether the extra initialization succeeded.\n\n```python\n>>> hownet_dict.initialize_sememe_similarity_calculation()\nTrue\n```\n\n### Get Top-K Nearest Words for the Given Word\n\nIf the given word does not exist in the HowNet annotations, this function returns an empty list.\n\n```python\n>>> query_result = hownet_dict_advanced.get_nearest_words_via_sememes(\"\u82f9\u679c\",20)\n>>> example = query_result[0]\n>>> print(\"word_name:\",example[\"word\"])\n>>> print(\"id:\",example[\"id\"])\n>>> print(\"synset and corresponding word&id&score:\")\n>>> print(example[\"synset\"])\nword_name: \u82f9\u679c\nid: 127151\nsynset and corresponding word&id&score:\n[{'id': 4024, 'word': 'IBM', 'score': 1.0},\n {'id': 41684, 'word': '\u6234\u5c14', 'score': 1.0},\n {'id': 49006, 'word': '\u4e1c\u829d', 'score': 1.0},\n {'id': 106795, 'word': '\u8054\u60f3', 'score': 1.0},\n {'id': 156029, 'word': '\u7d22\u5c3c', 'score': 1.0},\n {'id': 4203, 'word': 'iPad', 'score': 0.865},\n {'id': 19457, 'word': '\u7b14\u8bb0\u672c', 'score': 0.865},\n {'id': 19458, 'word': '\u7b14\u8bb0\u672c\u7535\u8111', 'score': 0.865},\n {'id': 19459, 'word': '\u7b14\u8bb0\u672c\u7535\u8111', 'score': 0.865},\n {'id': 19460, 'word': '\u7b14\u8bb0\u672c\u7535\u8111', 'score': 0.865},\n {'id': 19461, 'word': '\u7b14\u8bb0\u672c\u7535\u8111', 'score': 0.865},\n {'id': 19463, 'word': 
'\u7b14\u8bb0\u7c3f\u7535\u8111', 'score': 0.865},\n {'id': 19464, 'word': '\u7b14\u8bb0\u7c3f\u7535\u8111', 'score': 0.865},\n {'id': 20567, 'word': '\u4fbf\u643a\u5f0f\u7535\u8111', 'score': 0.865},\n {'id': 20568, 'word': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a', 'score': 0.865},\n {'id': 20569, 'word': '\u4fbf\u643a\u5f0f\u8ba1\u7b97\u673a', 'score': 0.865},\n {'id': 127224, 'word': '\u5e73\u677f\u7535\u8111', 'score': 0.865},\n {'id': 127225, 'word': '\u5e73\u677f\u7535\u8111', 'score': 0.865},\n {'id': 172264, 'word': '\u819d\u4e0a\u578b\u7535\u8111', 'score': 0.865},\n {'id': 172265, 'word': '\u819d\u4e0a\u578b\u7535\u8111', 'score': 0.865}]\n```\n\n### Calculate the Similarity for the Given Two Words\n\nIf any of the given words does not exist in HowNet annotations, this function will return 0.\n\n```python\n>>> hownet_dict_advanced.calculate_word_similarity(\"\u82f9\u679c\", \"\u68a8\")\n1.0\n```\n\n## License\n\nMIT\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/thunlp/OpenHowNet-API/", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "OpenHowNet", "package_url": "https://pypi.org/project/OpenHowNet/", "platform": "", "project_url": "https://pypi.org/project/OpenHowNet/", "project_urls": { "Homepage": "https://github.com/thunlp/OpenHowNet-API/" }, "release_url": "https://pypi.org/project/OpenHowNet/0.0.1a3/", "requires_dist": [ "anytree", "tqdm", "requests" ], "requires_python": ">=3.6", "summary": "OpenHowNet-API", "version": "0.0.1a3" }, "last_serial": 5993582, "releases": { "0.0.1a0": [ { "comment_text": "", "digests": { "md5": "2c3988e281af0550c12ac82e93516ae9", "sha256": "43b9b12e6b05d692ac432bbd58ad176ff5015b6e4fc898227d46ddd087d4c681" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a0-py3-none-any.whl", "has_sig": false, "md5_digest": "2c3988e281af0550c12ac82e93516ae9", 
"packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 18454, "upload_time": "2019-10-17T14:27:12", "url": "https://files.pythonhosted.org/packages/88/6b/143556d5d9e403b08d90fb451cad33332b132d39308f59170058867d347e/OpenHowNet-0.0.1a0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8c025fb6e6a0708e09cb28bfe928a8c1", "sha256": "afc53ad1d01c0628affff17a57401ee8a5ba29abeae5108c7819c92474610a1b" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a0.tar.gz", "has_sig": false, "md5_digest": "8c025fb6e6a0708e09cb28bfe928a8c1", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 48629, "upload_time": "2019-10-17T14:27:17", "url": "https://files.pythonhosted.org/packages/5c/d4/79cc450359b390f0c633336618410cf4621fa6bc35d736749b8d3c9eed5a/OpenHowNet-0.0.1a0.tar.gz" } ], "0.0.1a1": [ { "comment_text": "", "digests": { "md5": "ba576a6a23a3d2a10b699d628502e62c", "sha256": "48c129454f3986a404343ff8b23e8b8a00a5163b018b98258f924ce20202887a" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a1-py3-none-any.whl", "has_sig": false, "md5_digest": "ba576a6a23a3d2a10b699d628502e62c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 18456, "upload_time": "2019-10-18T02:58:07", "url": "https://files.pythonhosted.org/packages/a3/32/545c05562cd2860edf3afd0719eafd97881618e0fc0d1a3d083767b80632/OpenHowNet-0.0.1a1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "27097f9173a0cc22fcd5619e4e9cbacf", "sha256": "38e8d887cff80bb82c03ed3349af6e8beb6d05e750b33144e601b8ca43060422" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a1.tar.gz", "has_sig": false, "md5_digest": "27097f9173a0cc22fcd5619e4e9cbacf", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 48630, "upload_time": "2019-10-18T02:58:12", "url": 
"https://files.pythonhosted.org/packages/c9/49/f534455503fd59e9d0b82265fc2ca04777bb1564585a715288a6344fb5d5/OpenHowNet-0.0.1a1.tar.gz" } ], "0.0.1a2": [ { "comment_text": "", "digests": { "md5": "2faaae3738d067426979f548ac611d8b", "sha256": "b898b3262b1aa14d8560cbdf85faf2a2a78bcf032e9808973c310afa4e9ce732" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a2-py3-none-any.whl", "has_sig": false, "md5_digest": "2faaae3738d067426979f548ac611d8b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 18454, "upload_time": "2019-10-18T04:13:54", "url": "https://files.pythonhosted.org/packages/06/94/f1d9026a73c598ee6ee4b05bef198d46dbeeb88dc98339b0460fd6f74cd3/OpenHowNet-0.0.1a2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "bda1f5aa1614237f8d8f7134313527d9", "sha256": "4fbcb290a60a4ecd223835803bbbc9ddb6b3e65178077e837fe2171f14bb7287" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a2.tar.gz", "has_sig": false, "md5_digest": "bda1f5aa1614237f8d8f7134313527d9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 48629, "upload_time": "2019-10-18T04:13:58", "url": "https://files.pythonhosted.org/packages/9a/b8/996de2a4b0d1e688923fbc96bcd59f5302faf5aea158f563eda445197f6b/OpenHowNet-0.0.1a2.tar.gz" } ], "0.0.1a3": [ { "comment_text": "", "digests": { "md5": "7706cfa94c8cbf8268a822535be88f83", "sha256": "eb22d81cd138ac8ffe12d0fb0c95b0898152ee2432e4171e6b91a95f05acd421" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a3-py3-none-any.whl", "has_sig": false, "md5_digest": "7706cfa94c8cbf8268a822535be88f83", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 18455, "upload_time": "2019-10-18T04:17:12", "url": "https://files.pythonhosted.org/packages/29/e7/65bc92341929e46b2239429de81ef5e0c5c1efca059213da7916df3c7b85/OpenHowNet-0.0.1a3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1e6c204945dd4d66f1376422fe910230", "sha256": 
"50aaf338daabd66b20bc6452c864b64af0589ef1c7245d255a477c45c0a15ac6" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a3.tar.gz", "has_sig": false, "md5_digest": "1e6c204945dd4d66f1376422fe910230", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 48632, "upload_time": "2019-10-18T04:17:15", "url": "https://files.pythonhosted.org/packages/dc/c9/1f6ba66b16034e35efd8368fa00a93b8ed4e463bcea3007e4696a22c8964/OpenHowNet-0.0.1a3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7706cfa94c8cbf8268a822535be88f83", "sha256": "eb22d81cd138ac8ffe12d0fb0c95b0898152ee2432e4171e6b91a95f05acd421" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a3-py3-none-any.whl", "has_sig": false, "md5_digest": "7706cfa94c8cbf8268a822535be88f83", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 18455, "upload_time": "2019-10-18T04:17:12", "url": "https://files.pythonhosted.org/packages/29/e7/65bc92341929e46b2239429de81ef5e0c5c1efca059213da7916df3c7b85/OpenHowNet-0.0.1a3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1e6c204945dd4d66f1376422fe910230", "sha256": "50aaf338daabd66b20bc6452c864b64af0589ef1c7245d255a477c45c0a15ac6" }, "downloads": -1, "filename": "OpenHowNet-0.0.1a3.tar.gz", "has_sig": false, "md5_digest": "1e6c204945dd4d66f1376422fe910230", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 48632, "upload_time": "2019-10-18T04:17:15", "url": "https://files.pythonhosted.org/packages/dc/c9/1f6ba66b16034e35efd8368fa00a93b8ed4e463bcea3007e4696a22c8964/OpenHowNet-0.0.1a3.tar.gz" } ] }