{ "info": { "author": "ZhouYang Luo", "author_email": "zhouyang.luo@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3.6" ], "description": "# datasets\nA dataset utils repository. **For tensorflow>=2.0.0b only!**\n\n## Requirements\n\n* python 3.6\n* tensorflow>=2.0.0b\n\n## Installation\n\n```bash\npip install nlp-datasets\n```\n\n## Contents\n\n* Build dataset for seq2seq models. [seq2seq_dataset.py](nlp_datasets/seq2seq/seq2seq_dataset.py)\n* Build dataset for NMT. [nmt_dataset.py](nlp_datasets/nmt/nmt_dataset.py)\n* Build dataset for DSSM. [dssm_dataset.py](nlp_datasets/dssm/dssm_dataset.py)\n* Build dataset for MatchPyramid. [matchpyramid_dataset.py](nlp_datasets/matchpyramid/match_pyramid_dataset.py)\n\n## Usage\n\n### For NMT task\n\n```python\nfrom nlp_datasets import NMTSameFileDataset\n\no = NMTSameFileDataset(config=None, logger_name=None)\ntrain_files = [] # your files\n# train_dataset is an instance of tf.data.Dataset\ntrain_dataset = o.build_train_dataset(train_files)\n\n```\n\n```python\nfrom nlp_datasets import NMTSeparateFileDataset\n\no = NMTSeparateFileDataset(config=None, logger_name=None)\nfeature_files = [] # your files\nlabel_files = []\ntrain_dataset = o.build_train_dataset(feature_files,label_files)\n```\n\n### For DSSM task\n\n```python\nfrom nlp_datasets import DSSMSameFileDataset\n\no = DSSMSameFileDataset(config=None, logger_name=None)\ntrain_dataset = o.build_train_dataset(train_files=[])\n\n```\n\n```python\nfrom nlp_datasets import DSSMSeparateFileDataset\n\no = DSSMSeparateFileDataset(config=None, logger_name=None)\nquery_files = []\ndoc_files = []\nlabel_files = []\ntrain_dataset = o.build_train_dataset(query_files, doc_files, label_files)\n\n```\n\n### For MatchPyramid task\n\n```python\nfrom nlp_datasets import MatchPyramidSameFileDataset\n\no = MatchPyramidSameFileDataset(config=None, logger_name=None)\ntrain_dataset = o.build_train_dataset(train_files=[])\n\n```\n\n```python\nfrom nlp_datasets import MatchPyramidSeparateFilesDataset\n\no = MatchPyramidSeparateFilesDataset(config=None, logger_name=None)\nquery_files = []\ndoc_files = []\nlabel_files = []\ntrain_dataset = o.build_train_dataset(query_files, doc_files, label_files)\n\n```\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/naivenmt/datasets", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "naivenmt-datasets", "package_url": "https://pypi.org/project/naivenmt-datasets/", "platform": "", "project_url": "https://pypi.org/project/naivenmt-datasets/", "project_urls": { "Homepage": "https://github.com/naivenmt/datasets" }, "release_url": "https://pypi.org/project/naivenmt-datasets/0.0.7/", "requires_dist": [ "deprecated (>=1.2.5)" ], "requires_python": "", "summary": "A dataset utils repository. For tensorflow 2.x only!", "version": "0.0.7" }, "last_serial": 5854682, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "b9c31e15e20c019ce179f02b499daf07", "sha256": "94b5730b22996b021f7816e8f95abf5dd7ef72aed725ada77a8e6f22e1369937" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "b9c31e15e20c019ce179f02b499daf07", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 14388, "upload_time": "2019-04-16T14:41:08", "url": "https://files.pythonhosted.org/packages/c8/fe/38ebfa65c56d38ef360a54707763187178e297f5348110c33c6ac12f1e7e/naivenmt_datasets-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1f4e151406f994c86668dbbf234ccfae", "sha256": "851464f8fa1f1529bfd4db96674aca50f95407d7a017e7665188c8924b59e064" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.1.tar.gz", "has_sig": false, "md5_digest": "1f4e151406f994c86668dbbf234ccfae", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6788, "upload_time": "2019-04-16T14:41:10", "url": "https://files.pythonhosted.org/packages/01/3f/9ee940902e7a9eef9d1c59e5238213de808bef6fe445c09163803686dd5e/naivenmt_datasets-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "d0d88239323cd54d5610b4beb05b71e5", "sha256": "5770c6a0ee9c956c43d3bd39b5041ce7c4a45bed47c27876f28497c3d8d8c845" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "d0d88239323cd54d5610b4beb05b71e5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18292, "upload_time": "2019-04-19T07:58:37", "url": "https://files.pythonhosted.org/packages/e5/a3/15fb3b6a7e185d91cc4b100bd923011d9653c4296d924d8e4ff21d64aada/naivenmt_datasets-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05af3fe8a9af59a8901f5e805b56c9cf", "sha256": "49c92e3e56b4228e3790de02a5e6d0e461113d19d6f3c5328583bcf3562f22b2" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.2.tar.gz", "has_sig": false, "md5_digest": "05af3fe8a9af59a8901f5e805b56c9cf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8976, "upload_time": "2019-04-19T07:58:39", "url": "https://files.pythonhosted.org/packages/9b/67/3fc31c80408aba3a27e166513ba275e5d13d9580b31c134e0a73c515fa8f/naivenmt_datasets-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "66b1f48c2afcbbafbd05aa9e4c342956", "sha256": "1351a27709e66a44abc9b868a2f3f75f6b9e1826970115946bb5bf2ea6bd802b" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "66b1f48c2afcbbafbd05aa9e4c342956", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18308, "upload_time": "2019-04-19T08:03:46", "url": "https://files.pythonhosted.org/packages/c4/28/6f39e84f5431ab48aeec1b8ed5232bf44775b07d149ef949606d668f4e83/naivenmt_datasets-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8394323a20d0c2767ff2581210c44996", "sha256": "dfb26d7e70b91eb08b44bc378a6b5e5ac676043868248797ac87e1d8ced03f1e" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.3.tar.gz", "has_sig": false, "md5_digest": "8394323a20d0c2767ff2581210c44996", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8990, "upload_time": "2019-04-19T08:03:47", "url": "https://files.pythonhosted.org/packages/da/6b/43d16b73be53ad660c51f0696c6cd044ce81fe4201c2e0231908fe5af11e/naivenmt_datasets-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "13b53d54babbf82c033c0a5b1ee46b16", "sha256": "ad1980ca8981be7b3bbd7224f6a62bcc808eef74b1eaeedb11da1aacaa9b7193" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "13b53d54babbf82c033c0a5b1ee46b16", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18309, "upload_time": "2019-04-19T09:38:48", "url": "https://files.pythonhosted.org/packages/2f/ba/603a1b1787db437b8d8f062182c0f6cf2fba7f522738ffeb3df2472dba83/naivenmt_datasets-0.0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3231447d749bda128be87c463dbb4bc6", "sha256": "a0bb8db5cf0f792928ed31af4216759c7105278cc9d91703a8ef3df2057331ff" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.4.tar.gz", "has_sig": false, "md5_digest": "3231447d749bda128be87c463dbb4bc6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8992, "upload_time": "2019-04-19T09:38:50", "url": "https://files.pythonhosted.org/packages/ff/5e/49aa8f5d8dcf81e5e49057c79420c1edb2d2c651f0821c237c17ad4d4a6a/naivenmt_datasets-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "d54dea6131bf7dbcda74a2f0a9049c08", "sha256": "84cbdb1f4294684f1b5c33ea67a896ca6f45eb12ed2e9c88c529dbfb792028ef" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.5-py3-none-any.whl", "has_sig": false, "md5_digest": "d54dea6131bf7dbcda74a2f0a9049c08", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18707, "upload_time": "2019-04-22T08:09:40", "url": "https://files.pythonhosted.org/packages/06/bd/716fbca92d24af66fbec25b2e90c45b79c378b26e26df999d5d9da998e14/naivenmt_datasets-0.0.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3e960d083fde5d0de974887e7c24d486", "sha256": "5b0b23d064f34af397466136efeb5900bf55f32bd1ee201203f661106657103e" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.5.tar.gz", "has_sig": false, "md5_digest": "3e960d083fde5d0de974887e7c24d486", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9059, "upload_time": "2019-04-22T08:09:42", "url": "https://files.pythonhosted.org/packages/fb/45/8fe2e6d92611700b1d5a3afe06aa6cd1a3c565ef87e71e561943ead58221/naivenmt_datasets-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "b544064de664312306f89519f98dea3e", "sha256": "749a984bab60b9e0fe99ed157a14e447ce7855cce7eba94fa7e1e4ee69641292" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "b544064de664312306f89519f98dea3e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 27007, "upload_time": "2019-06-27T07:56:57", "url": "https://files.pythonhosted.org/packages/3c/e3/4408f54e66db3469b08f60951d5a91d8d22e45a6b83c1229bd0f0495775c/naivenmt_datasets-0.0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8b4bc1b7873caa6622570f9a4a8f66f9", "sha256": "7b79d529dbe3a5ece15c07d058118d5ca6ccafc6455d44708e0e959f76edf23d" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.6.tar.gz", "has_sig": false, "md5_digest": "8b4bc1b7873caa6622570f9a4a8f66f9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12669, "upload_time": "2019-06-27T07:56:59", "url": "https://files.pythonhosted.org/packages/1a/64/dd109c3ceb7f0634d693c4bb1313db5e815e7751ef65575b080a7ff2f6b5/naivenmt_datasets-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "59bfc9a4fc1cca8e060986111d4bd8a2", "sha256": "9ae339aab797486eb8285469e5c2c5d34f9599777783e1f6f075a53f4f1d8947" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "59bfc9a4fc1cca8e060986111d4bd8a2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 27302, "upload_time": "2019-09-19T08:07:54", "url": "https://files.pythonhosted.org/packages/e8/da/5a6fd3efdc41e7c3ab9e62b79819a03b21d82998ac6e3b899e83b5d5bcd1/naivenmt_datasets-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8c406e280363e660078dc1d569542253", "sha256": "704debe74263a3e42ea7d71dabcc5134002638e933d299eca4ce3e45c7547266" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.7.tar.gz", "has_sig": false, "md5_digest": "8c406e280363e660078dc1d569542253", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11993, "upload_time": "2019-09-19T08:07:56", "url": "https://files.pythonhosted.org/packages/81/ab/aec3f7623a547a27ac1af7ad1da0fa1fb059e8f171e613ef8a1332873a70/naivenmt_datasets-0.0.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "59bfc9a4fc1cca8e060986111d4bd8a2", "sha256": "9ae339aab797486eb8285469e5c2c5d34f9599777783e1f6f075a53f4f1d8947" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "59bfc9a4fc1cca8e060986111d4bd8a2", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 27302, "upload_time": "2019-09-19T08:07:54", "url": "https://files.pythonhosted.org/packages/e8/da/5a6fd3efdc41e7c3ab9e62b79819a03b21d82998ac6e3b899e83b5d5bcd1/naivenmt_datasets-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8c406e280363e660078dc1d569542253", "sha256": "704debe74263a3e42ea7d71dabcc5134002638e933d299eca4ce3e45c7547266" }, "downloads": -1, "filename": "naivenmt_datasets-0.0.7.tar.gz", "has_sig": false, "md5_digest": "8c406e280363e660078dc1d569542253", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11993, "upload_time": "2019-09-19T08:07:56", "url": "https://files.pythonhosted.org/packages/81/ab/aec3f7623a547a27ac1af7ad1da0fa1fb059e8f171e613ef8a1332873a70/naivenmt_datasets-0.0.7.tar.gz" } ] }