{ "info": { "author": "Sean Lee", "author_email": "xmlee@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "License :: OSI Approved :: Apache Software License", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering :: Artificial Intelligence" ], "description": "# MatchZoo-Lite\n\n\u57fa\u4e8e [MatchZoo 2.0.0](https://github.com/NTMC-Community/MatchZoo/tree/v2.0) \u5f00\u53d1\uff0c\u5e76\u505a\u4e86\u7b80\u5316\n\n## \u4e3b\u8981\u4fee\u6539\n\n### \u589e\n\u589e\u52a0\u6570\u636e\u52a0\u8f7d\u5668 dataloader \u53ef\u65b9\u4fbf\u8fdb\u884c\u6570\u636e\u7684\u52a0\u8f7d\uff0c\u8bad\u7ec3\u6570\u636e\u548c\u6d4b\u8bd5\u6570\u636e\u7684\u6587\u4ef6\u683c\u5f0f\u7edf\u4e00\u4e3a json \u6587\u4ef6\uff0c\u683c\u5f0f\u4e3a\uff1a\n\n```json\n{\"text_left\": \"xxx xxx xx\", \"text_right\": \"xxx xxx xxx\", \"label\": 1}\n```\n\n\u5176\u4e2d `text_left` \u548c `text_right` \u4e3a\u7a7a\u683c\u5206\u5272\u7684\u5206\u8bcd\u6587\u672c\n\n### \u5220\u6539\n- \u53bb\u9664 nltk \u76f8\u5173\u8bed\u6599\u5e93\u7684\u8c03\u7528\uff08\u5982\u505c\u7528\u8bcd\uff09\n- \u53bb\u9664\u9884\u63d0\u4f9b\u7684 datasets\n- \u66f4\u6362\u4e86\u6d4b\u8bd5 tests\n- \u53bb\u9664\u90e8\u5206\u6a21\u578b\uff0c\u53ea\u4fdd\u7559\u4ee5\u4e0b\u6a21\u578b\uff1a\n - arci\n - arcii\n - dssm\n - cdssm\n - conv_highway\n - duet\n - match_pyramid\n - mvlstm\n\n## Install\n\nMatchZoo is dependent on [Keras](https://github.com/keras-team/keras), please install one of its backend engines: TensorFlow, Theano, or CNTK. We recommend the TensorFlow backend. Two ways to install MatchZoo:\n\n### Install matchzoo-lite from the Github source\n\n```\ngit clone http://gitlab.alipay-inc.com/niming.lxm/matchzoo-lite.git\ncd matchzoo-lite\npython setup.py install\n```\n### Docker\n\n```\ndocker pull seanlee97/matchzoo-lite:latest\n```\n\n\n## Train your model\n- [train_duet.py](http://gitlab.alipay-inc.com/niming.lxm/matchzoo-lite/blob/master/examples/train_duet.py)\n- [train_arcii.py](http://gitlab.alipay-inc.com/niming.lxm/matchzoo-lite/blob/master/examples/train_arcii.py)\n- [train_dssm.py](http://gitlab.alipay-inc.com/niming.lxm/matchzoo-lite/blob/master/examples/train_dssm.py)\n- [train_mvlstm.py](http://gitlab.alipay-inc.com/niming.lxm/matchzoo-lite/blob/master/examples/train_mvlstm.py)\n\n## Get Started in 60 Seconds\n\nTo train a [Deep Semantic Structured Model](https://www.microsoft.com/en-us/research/project/dssm/), import matchzoo and prepare input data.\n\n```python\nimport matchzoo as mz\n\ntrain_pack = mz.datasets.wiki_qa.load_data('train', task='ranking')\nvalid_pack = mz.datasets.wiki_qa.load_data('dev', task='ranking')\npredict_pack = mz.datasets.wiki_qa.load_data('test', task='ranking')\n```\n\nPreprocess your input data in three lines of code, keep track parameters to be passed into the model.\n\n```python\npreprocessor = mz.preprocessors.DSSMPreprocessor()\ntrain_pack_processed = preprocessor.fit_transform(train_pack)\nvalid_pack_processed = preprocessor.transform(valid_pack)\npredict_pack_processed = preprocessor.transform(predict_pack)\n```\n\nMake use of MatchZoo customized loss functions and evaluation metrics:\n\n```python\nranking_task = mz.tasks.Ranking(loss=mz.losses.RankCrossEntropyLoss(num_neg=4))\nranking_task.metrics = [\n mz.metrics.NormalizedDiscountedCumulativeGain(k=3),\n mz.metrics.NormalizedDiscountedCumulativeGain(k=5),\n mz.metrics.MeanAveragePrecision()\n]\n```\n\nInitialize the model, fine-tune the hyper-parameters.\n\n```python\nmodel = mz.models.DSSM()\nmodel.params['input_shapes'] = preprocessor.context['input_shapes']\nmodel.params['task'] = ranking_task\nmodel.params['mlp_num_layers'] = 3\nmodel.params['mlp_num_units'] = 300\nmodel.params['mlp_num_fan_out'] = 128\nmodel.params['mlp_activation_func'] = 'relu'\nmodel.guess_and_fill_missing_params()\nmodel.build()\nmodel.compile()\n```\n\nGenerate pair-wise training data on-the-fly, evaluate model performance using customized callbacks on prediction data.\n\n```python\ntrain_generator = mz.PairDataGenerator(train_pack_processed, num_dup=1, num_neg=4, batch_size=64, shuffle=True)\n\npred_x, pred_y = predict_pack_processed.unpack()\nevaluate = mz.callbacks.EvaluateAllMetrics(model, x=pred_x, y=pred_y, batch_size=len(pred_x))\n\nhistory = model.fit_generator(train_generator, epochs=20, callbacks=[evaluate], workers=5, use_multiprocessing=False)\n```\n\n## References\n[MatchZoo](https://github.com/NTMC-Community/MatchZoo)\n\n[Tutorials](https://github.com/NTMC-Community/MatchZoo/tree/master/tutorials)\n\n[English Documentation](https://matchzoo.readthedocs.io/en/master/)\n\n[\u4e2d\u6587\u6587\u6863](https://matchzoo.readthedocs.io/zh/latest/)\n\nIf you're interested in the cutting-edge research progress, please take a look at [awaresome neural models for semantic match](https://github.com/NTMC-Community/awaresome-neural-models-for-semantic-match).\n\n\n## License\n\n[Apache-2.0](https://opensource.org/licenses/Apache-2.0)\n\n## MatchZoo License\n\n[Apache-2.0](https://opensource.org/licenses/Apache-2.0)\nCopyright (c) 2015-present, Yixing Fan (faneshion)", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/4AI/matchzoo-lite", "keywords": "text matching models based on https://github.com/NTMC-Community/MatchZoo", "license": "Apache 2.0", "maintainer": "", "maintainer_email": "", "name": "matchzoo-lite", "package_url": "https://pypi.org/project/matchzoo-lite/", "platform": "", "project_url": "https://pypi.org/project/matchzoo-lite/", "project_urls": { "Homepage": "https://github.com/4AI/matchzoo-lite" }, "release_url": "https://pypi.org/project/matchzoo-lite/0.1.4/", "requires_dist": null, "requires_python": "", "summary": "Facilitating the design, comparison and sharing of deep text matching models. Based on MatchZoo", "version": "0.1.4" }, "last_serial": 4997075, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "096fd950c7e3a51093b160403b461b95", "sha256": "6b6e96195b4ff156834a25c22d5810ebbdf4780790e56bae9efcbe43f6dbea53" }, "downloads": -1, "filename": "matchzoo-lite-0.0.1.tar.gz", "has_sig": false, "md5_digest": "096fd950c7e3a51093b160403b461b95", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52123, "upload_time": "2019-03-20T03:31:39", "url": "https://files.pythonhosted.org/packages/a6/7c/1558e4852dd9602d48c260bcbf40b0254bec2314ab250f47c123e304aec7/matchzoo-lite-0.0.1.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "56f80ef94dfc6bf38267180c9a9f29a0", "sha256": "1762276cc6624446616e599eecc544647d4dd809abc175bda164306957187809" }, "downloads": -1, "filename": "matchzoo-lite-0.1.0.tar.gz", "has_sig": false, "md5_digest": "56f80ef94dfc6bf38267180c9a9f29a0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52140, "upload_time": "2019-03-20T08:50:53", "url": "https://files.pythonhosted.org/packages/f0/02/ba4042e621fa9e248bf25ede72620bb6ec77bddd228f89c36f6bcb27266e/matchzoo-lite-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "1d3867d5fb19ac7681a48c3e37132964", "sha256": "22e90cd5a03cee5d99c1c56f396a36746bb0a7def0a5c95c11d14271f66aa83d" }, "downloads": -1, "filename": "matchzoo-lite-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1d3867d5fb19ac7681a48c3e37132964", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52171, "upload_time": "2019-03-25T03:23:26", "url": "https://files.pythonhosted.org/packages/87/a5/48f04dfd2c6db165285ef0c039afca6561f6c594035374d401bd7bd2fa12/matchzoo-lite-0.1.1.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "37416ab0bb9df876ad1c5547e3bbc552", "sha256": "a6f9dcbd4de2d29e8af1a92855aab14b6a7745593bfe6eeb94e177d5c84e0c15" }, "downloads": -1, "filename": "matchzoo-lite-0.1.4.tar.gz", "has_sig": false, "md5_digest": "37416ab0bb9df876ad1c5547e3bbc552", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52170, "upload_time": "2019-03-28T10:34:09", "url": "https://files.pythonhosted.org/packages/a6/3f/0e9f14e734d739c4f1c3ec936b5ca07da30d5608c23d4c78563a68952ea4/matchzoo-lite-0.1.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "37416ab0bb9df876ad1c5547e3bbc552", "sha256": "a6f9dcbd4de2d29e8af1a92855aab14b6a7745593bfe6eeb94e177d5c84e0c15" }, "downloads": -1, "filename": "matchzoo-lite-0.1.4.tar.gz", "has_sig": false, "md5_digest": "37416ab0bb9df876ad1c5547e3bbc552", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52170, "upload_time": "2019-03-28T10:34:09", "url": "https://files.pythonhosted.org/packages/a6/3f/0e9f14e734d739c4f1c3ec936b5ca07da30d5608c23d4c78563a68952ea4/matchzoo-lite-0.1.4.tar.gz" } ] }