{ "info": { "author": "fennuDetudou", "author_email": "upczyxl@163.com", "bugtrack_url": null, "classifiers": [], "description": "# Overview\n\n1. A Chinese natural language processing toolkit based on BERT\n2. Includes sentiment analysis, Chinese word segmentation, part-of-speech tagging, and named entity recognition\n3. Provides a training interface: specify the input and output and a downloaded pre-trained model from Google to train your own model. The training task is determined by the task_name parameter; the tasks currently provided mainly include sentence matching, text classification, named entity recognition, and sequence labeling\n4. Install with `pip install tudou`\n5. The pre-trained models must be downloaded; the model address is at the bottom\n\n# Usage example\npredict_test.ipynb contains a demonstration of the prediction code. The results are better than those of the vast majority of open-source Chinese NLP libraries, but prediction is slow (most of the time is spent loading the model parameters), **so it is recommended to pass in a list of several sentences at once**\n\n# Requirements\n## Dependencies\n\n`python >=3.6\ntensorflow >= 1.12.0`\n## Hardware\n1. Prediction and general use can run on an ordinary CPU machine\n2. 
Training must be run on a GPU machine. When memory is insufficient, reduce batch_size rather than max_sequence_len, since this has less effect on accuracy\n\n# Features\n\nThree interfaces are provided: prediction, common utilities, and training models with BERT\n\n# Usage\n\n1. Download the models before use; the download address is given at the end. Download the model that matches your task\n1. You can also use a model you have trained yourself\n1. Training a model requires first downloading the BERT pre-trained model provided by Google (this project also provides it, under pre_trained_model)\n\n## train function\n\nCreate an instance\n\n`trainer=tudouNLP.models.train.train(*params)`\n\n### help\n\nThe help function for train; it describes some of the parameters and the file formats\n\n`trainer.help()`\n\n### Training\n\n`trainer()`\n\n### Prediction\n\n`results=trainer.predict()`\n\n#### Notes\n\n1. Training returns no value; the training task that runs depends on the task_name parameter\n\n1. Supported training tasks include text classification, sequence labeling, and sentence matching\n\n1. For prediction, the parameters must be the same as those used for training (mainly `label_list` and `label_dict`), and the output directory must be the same as well\n\n1. 
Parameters\n\n ```\n :param task_name: task name; currently one of ner (entity recognition), tag (sequence labeling), classify (sentence classification), pair (sentence pairing)\n :param label_list: list of labels for the task; for sequence labeling tasks, add [CLS] and [SEP]\n :param label_dict: for sequence labeling tasks, the name of the dictionary mapping labels to IDs\n :param data_dir: data files\n :param model_dir: model files\n :param output_dir: output files\n :param eval: whether to run validation\n :param max_seq_length:\n :param learning_rate:\n :param batch_size:\n ```\n\n1. Required file formats\n\n ```\n 1. Sequence labeling task file format: word tag\n 2. Text classification task file format: sentence label\n 3. Sentence pairing task file format: index text1 text2 label, where index is not a required column and the fields are separated by \\t\n 4. The files are in data_dir; the training file is named train.txt and the validation file is named dev.txt\n ```\n\n## predict\n\n**Create an instance before use**\n\n\n### sentence function\n\n`predictor=tudouNLP.models.predict.sentence(model_dir)  # the argument is the directory containing the model`\n\n#### Sentiment analysis\n\n`result=predictor.sentiment(document,full_msg)`\n\n1. Returns the sentiment analysis results; when full_msg is True, the full analysis results are returned\n1. document is the list of sentences to analyze\n1. 
**Note: even a single sentence must be passed in as a list or tuple**\n\n#### Sentence matching\n\n`result=predictor.pair(document,full_msg,model_name)`\n\nSame as sentiment analysis\n\n### tagger function\n\n`predictor=tudouNLP.models.predict.tagger(model_dir)`\n\n#### Word segmentation\n\n`result=predictor.cut(document,mode='cut')`\n\n1. Returns a list of segmentation results\n1. document is the list of sentences to analyze\n1. Note: even a single sentence must be passed in as a list or tuple\n\n#### Part-of-speech tagging\n\n`result=predictor.cut(document,mode='posseg')`\n\nSame as word segmentation; returns the list of segmentation results\n\n## utils functions\n\n**Create an instance before use**\n\n`tool=tudouNLP.tools.utils.tools()`\n\n#### Sequence labeling dataset conversion\n\n`tool.posseg_data(input_dir,output_file)`\n\n1. 
Converts a sequence labeling dataset into a format the BERT model can read\n\n#### Model compression\n\n`tool.compress_model(input_file,output_file)`\n\nCompresses the parameters of a trained model\n\n# Model download\n\nLink: https://pan.baidu.com/s/1_dBX3-mjY3-Dedm96XNY2g Extraction code: tjqe", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "bert nlp ner NER named entity recognition tensorflow machine learning sentence encoding embedding pos tag sentiment judge", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "tudou", "package_url": "https://pypi.org/project/tudou/", "platform": "", "project_url": "https://pypi.org/project/tudou/", "project_urls": null, "release_url": "https://pypi.org/project/tudou/2.0.1/", "requires_dist": null, "requires_python": "", "summary": "A Chinese NLP tools based Google-bert", "version": "2.0.1" }, "last_serial": 4934595, "releases": { "2.0.1": [ { "comment_text": "", "digests": { "md5": "e863cf88ffcf7de93f7f38065b91b05e", "sha256": "7ea77a79c906d679008392ab513cf762f451abe8766a3488f6416dbdcdb69b21" }, "downloads": -1, "filename": "tudou-2.0.1.tar.gz", "has_sig": false, "md5_digest": "e863cf88ffcf7de93f7f38065b91b05e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 116702, "upload_time": "2019-03-13T13:23:34", "url": "https://files.pythonhosted.org/packages/e6/ff/d4e9270c60d4b1c5814eb1c279986a4486667dadac3ca6dbce4e628b06ca/tudou-2.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e863cf88ffcf7de93f7f38065b91b05e", "sha256": "7ea77a79c906d679008392ab513cf762f451abe8766a3488f6416dbdcdb69b21" }, "downloads": -1, "filename": "tudou-2.0.1.tar.gz", "has_sig": false, "md5_digest": "e863cf88ffcf7de93f7f38065b91b05e", 
"packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 116702, "upload_time": "2019-03-13T13:23:34", "url": "https://files.pythonhosted.org/packages/e6/ff/d4e9270c60d4b1c5814eb1c279986a4486667dadac3ca6dbce4e628b06ca/tudou-2.0.1.tar.gz" } ] }