{ "info": { "author": "Yixing Fan, Bo Wang, Zeyi Wang, Liang Pang, Liu Yang, Qinghua Wang, etc.", "author_email": "fanyixing@ict.ac.cn", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "License :: OSI Approved :: Apache Software License", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering :: Artificial Intelligence" ], "description": "
\n\"logo\"\n
\n\n# MatchZoo [![Tweet](https://img.shields.io/twitter/url/http/shields.io.svg?style=social)](https://twitter.com/intent/tweet?text=MatchZoo:%20deep%20learning%20for%20semantic%20matching&url=https://github.com/NTMC-Community/MatchZoo)\n\n> Facilitating the design, comparison and sharing of deep text matching models.
\n> MatchZoo \u662f\u4e00\u4e2a\u901a\u7528\u7684\u6587\u672c\u5339\u914d\u5de5\u5177\u5305\uff0c\u5b83\u65e8\u5728\u65b9\u4fbf\u5927\u5bb6\u5feb\u901f\u7684\u5b9e\u73b0\u3001\u6bd4\u8f83\u3001\u4ee5\u53ca\u5206\u4eab\u6700\u65b0\u7684\u6df1\u5ea6\u6587\u672c\u5339\u914d\u6a21\u578b\u3002\n\n[![Python 3.6](https://img.shields.io/badge/python-3.6%20%7C%203.7-blue.svg)](https://www.python.org/downloads/release/python-360/)\n[![Pypi Downloads](https://img.shields.io/pypi/dm/matchzoo.svg?label=pypi)](https://pypi.org/project/MatchZoo/)\n[![Documentation Status](https://readthedocs.org/projects/matchzoo/badge/?version=master)](https://matchzoo.readthedocs.io/en/master/?badge=master)\n[![Build Status](https://travis-ci.org/NTMC-Community/MatchZoo.svg?branch=master)](https://travis-ci.org/NTMC-Community/MatchZoo/)\n[![codecov](https://codecov.io/gh/NTMC-Community/MatchZoo/branch/master/graph/badge.svg)](https://codecov.io/gh/NTMC-Community/MatchZoo)\n[![License](https://img.shields.io/badge/License-Apache%202.0-yellowgreen.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Requirements Status](https://requires.io/github/NTMC-Community/MatchZoo/requirements.svg?branch=master)](https://requires.io/github/NTMC-Community/MatchZoo/requirements/?branch=master)\n---\n\ud83d\udd25**News: [MatchZoo-py](https://github.com/NTMC-Community/MatchZoo-py) (PyTorch version of MatchZoo) is ready now.**\n\nThe goal of MatchZoo is to provide a high-quality codebase for deep text matching research, such as document retrieval, question answering, conversational response ranking, and paraphrase identification. With the unified data processing pipeline, simplified model configuration and automatic hyper-parameters tunning features equipped, MatchZoo is flexible and easy to use.\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
TasksText 1Text 2Objective
Paraphrase Indentification string 1 string 2 classification
Textual Entailment text hypothesis classification
Question Answer question answer classification/ranking
Conversation dialog response classification/ranking
Information Retrieval query document ranking
\n\n## Get Started in 60 Seconds\n\nTo train a [Deep Semantic Structured Model](https://www.microsoft.com/en-us/research/project/dssm/), import matchzoo and prepare input data.\n\n```python\nimport matchzoo as mz\n\ntrain_pack = mz.datasets.wiki_qa.load_data('train', task='ranking')\nvalid_pack = mz.datasets.wiki_qa.load_data('dev', task='ranking')\n```\n\nPreprocess your input data in three lines of code, keep track parameters to be passed into the model.\n\n```python\npreprocessor = mz.preprocessors.DSSMPreprocessor()\ntrain_processed = preprocessor.fit_transform(train_pack)\nvalid_processed = preprocessor.transform(valid_pack)\n```\n\nMake use of MatchZoo customized loss functions and evaluation metrics:\n\n```python\nranking_task = mz.tasks.Ranking(loss=mz.losses.RankCrossEntropyLoss(num_neg=4))\nranking_task.metrics = [\n mz.metrics.NormalizedDiscountedCumulativeGain(k=3),\n mz.metrics.MeanAveragePrecision()\n]\n```\n\nInitialize the model, fine-tune the hyper-parameters.\n\n```python\nmodel = mz.models.DSSM()\nmodel.params['input_shapes'] = preprocessor.context['input_shapes']\nmodel.params['task'] = ranking_task\nmodel.guess_and_fill_missing_params()\nmodel.build()\nmodel.compile()\n```\n\nGenerate pair-wise training data on-the-fly, evaluate model performance using customized callbacks on validation data.\n\n```python\ntrain_generator = mz.PairDataGenerator(train_processed, num_dup=1, num_neg=4, batch_size=64, shuffle=True)\nvalid_x, valid_y = valid_processed.unpack()\nevaluate = mz.callbacks.EvaluateAllMetrics(model, x=valid_x, y=valid_y, batch_size=len(valid_x))\nhistory = model.fit_generator(train_generator, epochs=20, callbacks=[evaluate], workers=5, use_multiprocessing=False)\n```\n\n## References\n[Tutorials](https://github.com/NTMC-Community/MatchZoo/tree/master/tutorials)\n\n[English Documentation](https://matchzoo.readthedocs.io/en/master/)\n\n[\u4e2d\u6587\u6587\u6863](https://matchzoo.readthedocs.io/zh/latest/)\n\nIf you're interested in the cutting-edge research progress, please take a look at [awaresome neural models for semantic match](https://github.com/NTMC-Community/awaresome-neural-models-for-semantic-match).\n\n## Install\n\nMatchZoo is dependent on [Keras](https://github.com/keras-team/keras) and [Tensorflow](https://github.com/tensorflow/tensorflow). Two ways to install MatchZoo:\n\n**Install MatchZoo from Pypi:**\n\n```python\npip install matchzoo\n```\n\n**Install MatchZoo from the Github source:**\n\n```\ngit clone https://github.com/NTMC-Community/MatchZoo.git\ncd MatchZoo\npython setup.py install\n```\n\n\n## Models\n\n1. [DRMM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/drmm.py): this model is an implementation of A Deep Relevance Matching Model for Ad-hoc Retrieval.\n\n2. [MatchPyramid](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/match_pyramid.py): this model is an implementation of Text Matching as Image Recognition\n\n3. [ARC-I](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/arci.py): this model is an implementation of Convolutional Neural Network Architectures for Matching Natural Language Sentences\n\n4. [DSSM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/dssm.py): this model is an implementation of Learning Deep Structured Semantic Models for Web Search using Clickthrough Data\n\n5. [CDSSM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/cdssm.py): this model is an implementation of Learning Semantic Representations Using Convolutional Neural Networks for Web Search\n\n6. [ARC-II](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/arcii.py): this model is an implementation of Convolutional Neural Network Architectures for Matching Natural Language Sentences\n\n7. [MV-LSTM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/mvlstm.py):this model is an implementation of A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations\n\n8. [aNMM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/anmm.py): this model is an implementation of aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model\n\n9. [DUET](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/duet.py): this model is an implementation of Learning to Match Using Local and Distributed Representations of Text for Web Search\n\n10. [K-NRM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/knrm.py): this model is an implementation of End-to-End Neural Ad-hoc Ranking with Kernel Pooling\n\n11. [CONV-KNRM](https://github.com/NTMC-Community/MatchZoo/tree/master/matchzoo/models/conv_knrm.py): this model is an implementation of Convolutional neural networks for soft-matching n-grams in ad-hoc search\n\n12. models under development: Match-SRNN, DeepRank, BiMPM .... \n\n\n## Citation\n\nIf you use MatchZoo in your research, please use the following BibTex entry.\n\n```\n@inproceedings{Guo:2019:MLP:3331184.3331403,\n author = {Guo, Jiafeng and Fan, Yixing and Ji, Xiang and Cheng, Xueqi},\n title = {MatchZoo: A Learning, Practicing, and Developing System for Neural Text Matching},\n booktitle = {Proceedings of the 42Nd International ACM SIGIR Conference on Research and Development in Information Retrieval},\n series = {SIGIR'19},\n year = {2019},\n isbn = {978-1-4503-6172-9},\n location = {Paris, France},\n pages = {1297--1300},\n numpages = {4},\n url = {http://doi.acm.org/10.1145/3331184.3331403},\n doi = {10.1145/3331184.3331403},\n acmid = {3331403},\n publisher = {ACM},\n address = {New York, NY, USA},\n keywords = {matchzoo, neural network, text matching},\n} \n```\n\n\n## Development Team\n\n \u200b \u200b \u200b \u200b\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
\n \u200b \"faneshion\"
\n \u200b Fan Yixing \u200b\n

Core Dev
\n ASST PROF, ICT

\u200b\n
\n \"bwanglzu\"
\n Wang Bo \u200b\n

Core Dev
M.S. TU Delft

\u200b\n
\n \u200b \"uduse\"
\n Wang Zeyi\n

Core Dev
B.S. UC Davis

\u200b\n
\n \u200b \"pl8787\"
\n \u200b Pang Liang\n

Core Dev
\n ASST PROF, ICT

\u200b\n
\n \u200b \"yangliuy\"
\n \u200b Yang Liu\n

Core Dev
\n PhD. UMASS

\u200b\n
\n \u200b \"wqh17101\"
\n \u200b Wang Qinghua \u200b\n

Documentation
\n B.S. Shandong Univ.

\u200b\n
\n \u200b \"ZizhenWang\"
\n \u200b Wang Zizhen \u200b\n

Dev
\n M.S. UCAS

\u200b\n
\n \u200b \"lixinsu\"
\n \u200b Su Lixin\n

Dev
\n PhD. UCAS

\u200b\n
\n \u200b \"zhouzhouyang520\"
\n \u200b Yang Zhou \u200b\n

Dev
\n M.S. CQUT

\u200b\n
\n \u200b \"rgtjf\"
\n \u200b Tian Junfeng \u200b\n

Dev
\n M.S. ECNU

\u200b\n
\n\n\n\n## Contribution\n\nPlease make sure to read the [Contributing Guide](./CONTRIBUTING.md) before creating a pull request. If you have a MatchZoo-related paper/project/compnent/tool, send a pull request to [this awesome list](https://github.com/NTMC-Community/awaresome-neural-models-for-semantic-match)!\n\nThank you to all the people who already contributed to MatchZoo!\n\n[Jianpeng Hou](https://github.com/HouJP), [Lijuan Chen](https://github.com/githubclj), [Yukun Zheng](https://github.com/zhengyk11), [Niuguo Cheng](https://github.com/niuox), [Dai Zhuyun](https://github.com/AdeDZY), [Aneesh Joshi](https://github.com/aneesh-joshi), [Zeno Gantner](https://github.com/zenogantner), [Kai Huang](https://github.com/hkvision), [stanpcf](https://github.com/stanpcf), [ChangQF](https://github.com/ChangQF), [Mike Kellogg\n](https://github.com/wordreference)\n\n\n\n\n## Project Organizers\n\n- Jiafeng Guo\n * Institute of Computing Technology, Chinese Academy of Sciences\n * [Homepage](http://www.bigdatalab.ac.cn/~gjf/)\n- Yanyan Lan\n * Institute of Computing Technology, Chinese Academy of Sciences\n * [Homepage](http://www.bigdatalab.ac.cn/~lanyanyan/)\n- Xueqi Cheng\n * Institute of Computing Technology, Chinese Academy of Sciences\n * [Homepage](http://www.bigdatalab.ac.cn/~cxq/)\n\n\n## License\n\n[Apache-2.0](https://opensource.org/licenses/Apache-2.0)\n\nCopyright (c) 2015-present, Yixing Fan (faneshion)", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/NTMC-Community/MatchZoo", "keywords": "text matching models", "license": "Apache 2.0", "maintainer": "", "maintainer_email": "", "name": "MatchZoo", "package_url": "https://pypi.org/project/MatchZoo/", "platform": "", "project_url": "https://pypi.org/project/MatchZoo/", "project_urls": { "Homepage": "https://github.com/NTMC-Community/MatchZoo" }, "release_url": "https://pypi.org/project/MatchZoo/2.2.0/", "requires_dist": null, "requires_python": "", "summary": "Facilitating the design, comparison and sharing of deep text matching models.", "version": "2.2.0", "yanked": false, "yanked_reason": null }, "last_serial": 6023725, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "3d1bcce1334b64852ec7e0286b05a483", "sha256": "4d8038459fa2f7cff3b49671f396f87e34683b3e56b242d7502231953ad4161d" }, "downloads": -1, "filename": "MatchZoo-1.0.0-py2-none-any.whl", "has_sig": false, "md5_digest": "3d1bcce1334b64852ec7e0286b05a483", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 144939, "upload_time": "2019-01-12T08:21:22", "upload_time_iso_8601": "2019-01-12T08:21:22.736701Z", "url": "https://files.pythonhosted.org/packages/52/fd/981affb46f4e71905ffc7e90d175798b2f400d08018983edb4c24aae87a4/MatchZoo-1.0.0-py2-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "21a467f620f19e6684acda770721da35", "sha256": "a2be1249889760586f5a17b7e89a3f8af0cad4db7a076a0bf5afa79d5d6796d9" }, "downloads": -1, "filename": "MatchZoo-1.0.0.tar.gz", "has_sig": false, "md5_digest": "21a467f620f19e6684acda770721da35", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 40954, "upload_time": "2019-01-12T08:21:25", "upload_time_iso_8601": "2019-01-12T08:21:25.842001Z", "url": "https://files.pythonhosted.org/packages/f2/2b/5b366766c818b55ae77f2ddf25a5e6eb5ba84ab150bea6484e30320561d9/MatchZoo-1.0.0.tar.gz", "yanked": false, "yanked_reason": null } ], "2.0.0": [ { "comment_text": "", "digests": { "md5": "2afb839fe6e5b64b96b1ccf9862376ec", "sha256": "c10f2ad383e3a65ee047d74486ad01fac2c9b7123ba7edc586ad73cd00dc7239" }, "downloads": -1, "filename": "MatchZoo-2.0.0.tar.gz", "has_sig": false, "md5_digest": "2afb839fe6e5b64b96b1ccf9862376ec", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 62506, "upload_time": "2019-01-12T08:16:00", "upload_time_iso_8601": "2019-01-12T08:16:00.075161Z", "url": "https://files.pythonhosted.org/packages/ed/20/a6997f62ab892769448f4b11b22053703a2840d2a7fbf356202ce747d5e8/MatchZoo-2.0.0.tar.gz", "yanked": false, "yanked_reason": null } ], "2.1.0": [ { "comment_text": "", "digests": { "md5": "d6104816231672c76a035e3398b5b9d8", "sha256": "edd491334f0c07d75291836d7da863a551c32acca2f9e18c35a78771994c4a40" }, "downloads": -1, "filename": "MatchZoo-2.1.0.tar.gz", "has_sig": false, "md5_digest": "d6104816231672c76a035e3398b5b9d8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 76451, "upload_time": "2019-04-13T12:14:59", "upload_time_iso_8601": "2019-04-13T12:14:59.775272Z", "url": "https://files.pythonhosted.org/packages/7d/f7/f9bbe0ea45c2e77d00df976b597b976d38e73233c2b0b8f2c506e3bc62da/MatchZoo-2.1.0.tar.gz", "yanked": false, "yanked_reason": null } ], "2.2.0": [ { "comment_text": "", "digests": { "md5": "5b144301ce5bbdeff0c11acbc66b2269", "sha256": "547a94b10eed69afe7af896e3a70f357eee48032840a7b441893040233311f10" }, "downloads": -1, "filename": "MatchZoo-2.2.0.tar.gz", "has_sig": false, "md5_digest": "5b144301ce5bbdeff0c11acbc66b2269", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 91459, "upload_time": "2019-10-24T13:09:11", "upload_time_iso_8601": "2019-10-24T13:09:11.740252Z", "url": "https://files.pythonhosted.org/packages/0f/7a/6d4547d19dcc75dfd9602ebb5a893ade85920a3264fda976b2ea61e7cf27/MatchZoo-2.2.0.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5b144301ce5bbdeff0c11acbc66b2269", "sha256": "547a94b10eed69afe7af896e3a70f357eee48032840a7b441893040233311f10" }, "downloads": -1, "filename": "MatchZoo-2.2.0.tar.gz", "has_sig": false, "md5_digest": "5b144301ce5bbdeff0c11acbc66b2269", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 91459, "upload_time": "2019-10-24T13:09:11", "upload_time_iso_8601": "2019-10-24T13:09:11.740252Z", "url": "https://files.pythonhosted.org/packages/0f/7a/6d4547d19dcc75dfd9602ebb5a893ade85920a3264fda976b2ea61e7cf27/MatchZoo-2.2.0.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }