{ "info": { "author": "PyTorch core devs and James Bradbury", "author_email": "jekbradbury@gmail.com", "bugtrack_url": null, "classifiers": [], "description": ".. image:: https://travis-ci.org/pytorch/text.svg?branch=master\n :target: https://travis-ci.org/pytorch/text\n\n.. image:: https://codecov.io/gh/pytorch/text/branch/master/graph/badge.svg\n :target: https://codecov.io/gh/pytorch/text\n\n.. image:: http://readthedocs.org/projects/torchtext/badge/?version=latest\n :target: http://torchtext.readthedocs.io/en/latest/?badge=latest\n\ntorchtext\n+++++++++\n\nThis repository consists of:\n\n* `torchtext.data <#data>`_: Generic data loaders, abstractions, and iterators for text (including vocabulary and word vectors)\n* `torchtext.datasets <#datasets>`_: Pre-built loaders for common NLP datasets\n\nInstallation\n============\n\n\nMake sure you have Python 2.7 or 3.5+ and PyTorch 0.4.0 or newer. You can then install torchtext using pip::\n\n pip install torchtext\n\nFor PyTorch versions before 0.4.0, please use `pip install torchtext==0.2.3`.\n\nOptional requirements\n---------------------\n\nIf you want to use English tokenizer from `SpaCy `_, you need to install SpaCy and download its English model::\n\n pip install spacy\n python -m spacy download en\n\nAlternatively, you might want to use the `Moses `_ tokenizer port in `SacreMoses `_ (split from `NLTK `_). You have to install SacreMoses::\n\n pip install sacremoses\n\nDocumentation\n=============\n\nFind the documentation `here `_.\n\nData\n====\n\nThe data module provides the following:\n\n* Ability to describe declaratively how to load a custom NLP dataset that's in a \"normal\" format:\n\n .. code-block:: python\n\n >>> pos = data.TabularDataset(\n ... path='data/pos/pos_wsj_train.tsv', format='tsv',\n ... fields=[('text', data.Field()),\n ... ('labels', data.Field())])\n ...\n >>> sentiment = data.TabularDataset(\n ... path='data/sentiment/train.json', format='json',\n ... 
fields={'sentence_tokenized': ('text', data.Field(sequential=True)),\n ... 'sentiment_gold': ('labels', data.Field(sequential=False))})\n\n* Ability to define a preprocessing pipeline:\n\n .. code-block:: python\n\n >>> src = data.Field(tokenize=my_custom_tokenizer)\n >>> trg = data.Field(tokenize=my_custom_tokenizer)\n >>> mt_train = datasets.TranslationDataset(\n ... path='data/mt/wmt16-ende.train', exts=('.en', '.de'),\n ... fields=(src, trg))\n\n* Batching, padding, and numericalizing (including building a vocabulary object):\n\n .. code-block:: python\n\n >>> # continuing from above\n >>> mt_dev = datasets.TranslationDataset(\n ... path='data/mt/newstest2014', exts=('.en', '.de'),\n ... fields=(src, trg))\n >>> src.build_vocab(mt_train, max_size=80000)\n >>> trg.build_vocab(mt_train, max_size=40000)\n >>> # mt_dev shares the fields, so it shares their vocab objects\n >>>\n >>> train_iter = data.BucketIterator(\n ... dataset=mt_train, batch_size=32,\n ... sort_key=lambda x: data.interleave_keys(len(x.src), len(x.trg)))\n >>> # usage\n >>> next(iter(train_iter))\n \n\n* Wrapper for dataset splits (train, validation, test):\n\n .. code-block:: python\n\n >>> TEXT = data.Field()\n >>> LABELS = data.Field()\n >>>\n >>> train, val, test = data.TabularDataset.splits(\n ... path='/data/pos_wsj/pos_wsj', train='_train.tsv',\n ... validation='_dev.tsv', test='_test.tsv', format='tsv',\n ... fields=[('text', TEXT), ('labels', LABELS)])\n >>>\n >>> train_iter, val_iter, test_iter = data.BucketIterator.splits(\n ... 
(train, val, test), batch_sizes=(16, 256, 256),\n ... sort_key=lambda x: len(x.text), device=0)\n >>>\n >>> TEXT.build_vocab(train)\n >>> LABELS.build_vocab(train)\n\nDatasets\n========\n\nThe datasets module currently contains:\n\n* Sentiment analysis: SST and IMDb\n* Question classification: TREC\n* Entailment: SNLI, MultiNLI\n* Language modeling: abstract class + WikiText-2, WikiText103, PennTreebank\n* Machine translation: abstract class + Multi30k, IWSLT, WMT14\n* Sequence tagging (e.g. POS/NER): abstract class + UDPOS, CoNLL2000Chunking\n* Question answering: 20 QA bAbI tasks\n* Text classification: AG_NEWS, SogouNews, DBpedia, YelpReviewPolarity, YelpReviewFull, YahooAnswers, AmazonReviewPolarity, AmazonReviewFull\n\nOthers are planned or a work in progress:\n\n* Question answering: SQuAD\n\nSee the ``test`` directory for examples of dataset usage.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pytorch/text", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "torchtext", "package_url": "https://pypi.org/project/torchtext/", "platform": "", "project_url": "https://pypi.org/project/torchtext/", "project_urls": { "Homepage": "https://github.com/pytorch/text" }, "release_url": "https://pypi.org/project/torchtext/0.4.0/", "requires_dist": [ "tqdm", "requests", "torch", "numpy", "six" ], "requires_python": "", "summary": "Text utilities and datasets for PyTorch", "version": "0.4.0" }, "last_serial": 5648006, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "db69ae12dce43f8c326e60fb379fc52d", "sha256": "4dce00ae4876c998b9032f0aa380228c9b3b3fd02de99f14ed50c8f7bcced786" }, "downloads": -1, "filename": "torchtext-0.1.1-py3.6.egg", "has_sig": false, "md5_digest": "db69ae12dce43f8c326e60fb379fc52d", "packagetype": "bdist_egg", "python_version": "3.6", "requires_python": null, "size": 
55617, "upload_time": "2017-08-14T20:35:27", "url": "https://files.pythonhosted.org/packages/2f/3b/3bb23f70bb87f6c9e56450859ca6730a4f613c69172a544723b335f0119c/torchtext-0.1.1-py3.6.egg" }, { "comment_text": "", "digests": { "md5": "65392d7d4f82ea4c748962ff140b7baf", "sha256": "4136b36cfcaee203496bb581e170241c3580ead36d82958774ddb50dae6ffc61" }, "downloads": -1, "filename": "torchtext-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "65392d7d4f82ea4c748962ff140b7baf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 24010, "upload_time": "2017-08-14T20:35:24", "url": "https://files.pythonhosted.org/packages/b2/e7/4f01aa7348feff083bf7475a6a0aca2d63760a8f79bd4d67cee9f3d5cdcb/torchtext-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4d4fdfb5e47f41b45d5067d9760c1210", "sha256": "e3086ecfb2bf6377843aa7629c92caa96ab04e49f49b405c8c9947501f2a521d" }, "downloads": -1, "filename": "torchtext-0.1.1.tar.gz", "has_sig": false, "md5_digest": "4d4fdfb5e47f41b45d5067d9760c1210", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15169, "upload_time": "2017-08-14T20:35:29", "url": "https://files.pythonhosted.org/packages/cd/93/82d7e195c060c364c1cf81958453589085dddefb30b08f3f30f3f9e25235/torchtext-0.1.1.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "2a192c7e2973d1568b8f8c13cf5dacaf", "sha256": "0ee2d5f5c7f773ed171291f7b23c4cce3dc653de03fa4522d7d275dd73ef566f" }, "downloads": -1, "filename": "torchtext-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "2a192c7e2973d1568b8f8c13cf5dacaf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 40692, "upload_time": "2017-10-20T04:17:15", "url": "https://files.pythonhosted.org/packages/3c/73/ac7461744aad1685595e112958555e1c8bc460e01d11047467b23521eb43/torchtext-0.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cf722bbf6a597bfa171462b7ac4be112", "sha256": 
"c6844b4b0fb95004bb8d299941bc71361b0b73ecf6f2cd300729678816e601f2" }, "downloads": -1, "filename": "torchtext-0.2.0.tar.gz", "has_sig": false, "md5_digest": "cf722bbf6a597bfa171462b7ac4be112", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30678, "upload_time": "2017-10-20T04:17:16", "url": "https://files.pythonhosted.org/packages/12/3b/9fcd440832e5ef6d16d04e91515ebfacd02a6037f018a7da33c7b7d1b602/torchtext-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "33388d43114768b675cce55ab32ffe32", "sha256": "815471add270dce34c25899083e1b0ae57056fab93145a54fb872e932857a76e" }, "downloads": -1, "filename": "torchtext-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "33388d43114768b675cce55ab32ffe32", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 41993, "upload_time": "2017-12-28T23:25:37", "url": "https://files.pythonhosted.org/packages/d8/65/0e9370754790ed97f76ac4d357ee4fad6b5e093bcfd08e331d7b1b6828c3/torchtext-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3d0d890e4ebe40fca1dafe59eae7d24e", "sha256": "9deaa110f7f9383131cf4a1eee1784f2ebe7ebd382c9dd0a6450a4a61b186b0c" }, "downloads": -1, "filename": "torchtext-0.2.1.tar.gz", "has_sig": false, "md5_digest": "3d0d890e4ebe40fca1dafe59eae7d24e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30732, "upload_time": "2017-12-28T23:25:39", "url": "https://files.pythonhosted.org/packages/ef/d3/c55a49e18e18b6f752ce05e90f1264cfdcb17fa930c00d0d89628e27fff2/torchtext-0.2.1.tar.gz" } ], "0.2.3": [ { "comment_text": "", "digests": { "md5": "b7bf31efcdc2bb2e70d2c3063756e4c6", "sha256": "268157efa287daa7fa78cc94e41d6e624dc1362dd85791df49ab86b888836de6" }, "downloads": -1, "filename": "torchtext-0.2.3.tar.gz", "has_sig": false, "md5_digest": "b7bf31efcdc2bb2e70d2c3063756e4c6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 42698, 
"upload_time": "2018-04-09T17:15:26", "url": "https://files.pythonhosted.org/packages/78/90/474d5944d43001a6e72b9aaed5c3e4f77516fbef2317002da2096fd8b5ea/torchtext-0.2.3.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "a383dc6aab13276f3559b4a6badec30f", "sha256": "963160f97cf449edad1183e95d2dd0b4694225b7060a1a8b23e71bccb08022e0" }, "downloads": -1, "filename": "torchtext-0.3.1-py2-none-any.whl", "has_sig": false, "md5_digest": "a383dc6aab13276f3559b4a6badec30f", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 62399, "upload_time": "2018-10-11T14:15:29", "url": "https://files.pythonhosted.org/packages/26/b5/2022b596796eceba0143df5a18be2c17c9ecda95bdeab133225e0d46fae8/torchtext-0.3.1-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "042343d90f8c1319f18b37c3a6d45f42", "sha256": "7b5bc7af67d9c3892bdf6f4895734768f2836c13156a783c96597168176ce2d5" }, "downloads": -1, "filename": "torchtext-0.3.1-py3-none-any.whl", "has_sig": false, "md5_digest": "042343d90f8c1319f18b37c3a6d45f42", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 62398, "upload_time": "2018-10-11T14:15:30", "url": "https://files.pythonhosted.org/packages/c6/bc/b28b9efb4653c03e597ed207264eea45862b5260f48e9f010b5068d64db1/torchtext-0.3.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "471324b9c8ebf92d5cb8005e9ad59e7f", "sha256": "869e0860917b5a8660ebaa468f3cd3104a7acf3941a1f86e8e9a8ea61e78113d" }, "downloads": -1, "filename": "torchtext-0.3.1.tar.gz", "has_sig": false, "md5_digest": "471324b9c8ebf92d5cb8005e9ad59e7f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 50077, "upload_time": "2018-10-11T14:15:32", "url": "https://files.pythonhosted.org/packages/ff/67/b1b27c3318772cf75f3bf204bdb3a1b2008ae35564852d18a43a1605ae6e/torchtext-0.3.1.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "79b2a7a89ea41fb562b7b41e8767798c", "sha256": 
"094520d9cd0af6a05368d9023fdc91dc038232bd9d128c7b548ec2200dba53ec" }, "downloads": -1, "filename": "torchtext-0.4.0-py3-none-any.whl", "has_sig": false, "md5_digest": "79b2a7a89ea41fb562b7b41e8767798c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 53123, "upload_time": "2019-08-08T03:46:35", "url": "https://files.pythonhosted.org/packages/43/94/929d6bd236a4fb5c435982a7eb9730b78dcd8659acf328fd2ef9de85f483/torchtext-0.4.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "470597c5588c26ad7430486695c0343c", "sha256": "e04ca965fb1d74161fd1f4b5222ee4fa1ad6c02f1e7df213495883384f2fa408" }, "downloads": -1, "filename": "torchtext-0.4.0.tar.gz", "has_sig": false, "md5_digest": "470597c5588c26ad7430486695c0343c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 45844, "upload_time": "2019-08-08T03:46:37", "url": "https://files.pythonhosted.org/packages/31/80/1cde2a940fe42d5572487e47533f4b08302a0dd2c64bbd04116731cd7109/torchtext-0.4.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "79b2a7a89ea41fb562b7b41e8767798c", "sha256": "094520d9cd0af6a05368d9023fdc91dc038232bd9d128c7b548ec2200dba53ec" }, "downloads": -1, "filename": "torchtext-0.4.0-py3-none-any.whl", "has_sig": false, "md5_digest": "79b2a7a89ea41fb562b7b41e8767798c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 53123, "upload_time": "2019-08-08T03:46:35", "url": "https://files.pythonhosted.org/packages/43/94/929d6bd236a4fb5c435982a7eb9730b78dcd8659acf328fd2ef9de85f483/torchtext-0.4.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "470597c5588c26ad7430486695c0343c", "sha256": "e04ca965fb1d74161fd1f4b5222ee4fa1ad6c02f1e7df213495883384f2fa408" }, "downloads": -1, "filename": "torchtext-0.4.0.tar.gz", "has_sig": false, "md5_digest": "470597c5588c26ad7430486695c0343c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 
45844, "upload_time": "2019-08-08T03:46:37", "url": "https://files.pythonhosted.org/packages/31/80/1cde2a940fe42d5572487e47533f4b08302a0dd2c64bbd04116731cd7109/torchtext-0.4.0.tar.gz" } ] }