{ "info": { "author": "Szymon Maszke", "author_email": "szymon.maszke@protonmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Software Development :: Libraries", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "![torchdata Logo](https://github.com/szymonmaszke/torchdata/blob/master/assets/banner.png)\n\n--------------------------------------------------------------------------------\n\n| Version | Docs | Tests | Coverage | Style | PyPI | Python | PyTorch | Docker | Roadmap |\n|---------|------|-------|----------|-------|------|--------|---------|--------|---------|\n| [![Version](https://img.shields.io/static/v1?label=&message=0.1.1&color=377EF0&style=for-the-badge)](https://github.com/szymonmaszke/torchdata/releases) | [![Documentation](https://img.shields.io/static/v1?label=&message=docs&color=EE4C2C&style=for-the-badge)](https://szymonmaszke.github.io/torchdata/) | ![Tests](https://github.com/szymonmaszke/torchdata/workflows/test/badge.svg) | ![Coverage](https://img.shields.io/codecov/c/github/szymonmaszke/torchdata?label=%20&logo=codecov&style=for-the-badge) | [![codebeat](https://img.shields.io/static/v1?label=&message=CB&color=27A8E0&style=for-the-badge)](https://codebeat.co/projects/github-com-szymonmaszke-torchdata-master) | [![PyPI](https://img.shields.io/static/v1?label=&message=PyPI&color=377EF0&style=for-the-badge)](https://pypi.org/project/torchdata/) | [![Python](https://img.shields.io/static/v1?label=&message=3.7&color=377EF0&style=for-the-badge&logo=python&logoColor=F8C63D)](https://www.python.org/) | 
[![PyTorch](https://img.shields.io/static/v1?label=&message=1.2.0&color=EE4C2C&style=for-the-badge)](https://pytorch.org/) | [![Docker](https://img.shields.io/static/v1?label=&message=docker&color=309cef&style=for-the-badge)](https://cloud.docker.com/u/szymonmaszke/repository/docker/szymonmaszke/torchdata) | [![Roadmap](https://img.shields.io/static/v1?label=&message=roadmap&color=009688&style=for-the-badge)](https://github.com/szymonmaszke/torchdata/blob/master/ROADMAP.md) |\n\n[__torchdata__](https://szymonmaszke.github.io/torchdata/) is a [PyTorch](https://pytorch.org/)-oriented library focused on data processing and input pipelines in general.\n\nIt extends `torch.utils.data.Dataset` and equips it with\nfunctionalities known from [tensorflow.data](https://www.tensorflow.org/api_docs/python/tf/data/Dataset)\nlike `map` or `cache` (with some additions unavailable in the latter).\n\nAll of that with minimal interference (a single call to `super().__init__()`) in the original\nPyTorch datasets.\n\n### Functionalities overview:\n\n* Use `map`, `apply`, `reduce` or `filter`\n* `cache` data in RAM or on disk (even partially, say the first `20%`)\n* Full support for PyTorch's [`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) and [`IterableDataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.IterableDataset) (including [`torchvision`](https://pytorch.org/docs/stable/torchvision/index.html))\n* General `torchdata.maps` like `Flatten` or `Select`\n* Concrete `torchdata.datasets` designed for file reading and other general tasks\n\n\n# Quick examples\n\n- Create an image dataset, convert it to Tensors, cache it and concatenate it with smoothed labels:\n\n```python\nimport pathlib\n\nimport torchdata\nimport torchvision\nfrom PIL import Image\n\n\nclass Images(torchdata.Dataset):  # Different inheritance\n    def __init__(self, path: str):\n        super().__init__()  # This is the only change\n        self.files = [file for file in pathlib.Path(path).glob(\"*\")]\n\n    def __getitem__(self, index):\n        return Image.open(self.files[index])\n\n    def __len__(self):\n        return len(self.files)\n\n\nimages = Images(\"./data\").map(torchvision.transforms.ToTensor()).cache()\n```\n\nYou can concatenate the above dataset with another (say `labels`) and iterate over them as usual:\n\n```python\nfor data, label in images | labels:\n    # Do whatever you want with your data\n    ...\n```\n\n- Cache the first `1000` samples in memory and save the rest on disk in the folder `./cache`:\n\n```python\nimages = (\n    ImageDataset.from_folder(\"./data\").map(torchvision.transforms.ToTensor())\n    # First 1000 samples in memory\n    .cache(torchdata.modifiers.UpToIndex(1000, torchdata.cachers.Memory()))\n    # Samples from index 1000 to the end are saved with Pickle on disk\n    .cache(torchdata.modifiers.FromIndex(1000, torchdata.cachers.Pickle(\"./cache\")))\n    # You can define your own cachers and modifiers, see docs\n)\n```\nTo see what else you can do, please check the [**torchdata documentation**](https://szymonmaszke.github.io/torchdata/).\n\n# Installation\n\n## [pip](https://pypi.org/project/torchdata/)\n\n### Latest release:\n\n```shell\npip install --user torchdata\n```\n\n### Nightly:\n\n```shell\npip install --user torchdata-nightly\n```\n\n## [Docker](https://cloud.docker.com/repository/docker/szymonmaszke/torchdata)\n\n__CPU standalone__ and various versions of __GPU enabled__ images are available\nat [dockerhub](https://cloud.docker.com/repository/docker/szymonmaszke/torchdata).\n\nFor a CPU quickstart, issue:\n\n```shell\ndocker pull szymonmaszke/torchdata:18.04\n```\n\nNightly builds are also available; just prefix the tag with `nightly_`. If you are going for a `GPU` image, make sure you have\n[nvidia/docker](https://github.com/NVIDIA/nvidia-docker) installed and its runtime set.\n\n# Contributing\n\nIf you find any issue or you think some functionality may be useful to others and fits this library, please [open a new Issue](https://help.github.com/en/articles/creating-an-issue) or [create a Pull Request](https://help.github.com/en/articles/creating-a-pull-request-from-a-fork).\n\nTo get an overview of things one can do to help this project, see the [Roadmap](https://github.com/szymonmaszke/torchdata/blob/master/ROADMAP.md).\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pypa/torchdata", "keywords": "pytorch torch data datasets map cache memory disk apply database", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "torchdata", "package_url": "https://pypi.org/project/torchdata/", "platform": "", "project_url": "https://pypi.org/project/torchdata/", "project_urls": { "Documentation": "https://szymonmaszke.github.io/torchdata/#torchdata", "Homepage": "https://github.com/pypa/torchdata", "Issues": "https://github.com/szymonmaszke/torchdata/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc", "Website": "https://szymonmaszke.github.io/torchdata" }, "release_url": "https://pypi.org/project/torchdata/0.1.2/", "requires_dist": [ "torch (>=1.2.0)" ], "requires_python": ">=3.7", "summary": "PyTorch based library focused on data processing and input pipelines in general.", "version": "0.1.2" }, "last_serial": 5863258, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "45b97defb329d1aaafd38dc47a2116dd", "sha256": "6d9ff338075848e7f6d02997dcec2f7415c5b946be7ca9a7966028f075ee73a3" }, "downloads": -1, "filename": "torchdata-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "45b97defb329d1aaafd38dc47a2116dd", "packagetype": "bdist_wheel", 
"python_version": "py3", "requires_python": ">=3.7", "size": 19352, "upload_time": "2019-09-16T01:07:39", "url": "https://files.pythonhosted.org/packages/9b/d6/d41838a6c8bc198827b7231d444a34fc41918ce97ab0e1ab2cb068d884d4/torchdata-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8a896ec52664d3f1073bce5a548a4561", "sha256": "31fe1c54fb5946fb359be5b8a38224375a3e09f7fae62e87b5a1f16241cee946" }, "downloads": -1, "filename": "torchdata-0.1.0.tar.gz", "has_sig": false, "md5_digest": "8a896ec52664d3f1073bce5a548a4561", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.7", "size": 17411, "upload_time": "2019-09-16T01:07:41", "url": "https://files.pythonhosted.org/packages/b0/b4/2a2a54ec2389138df3333be937e68127515a0bc990fff46c1398aa72390f/torchdata-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "b38530fc17456e172c46e0ef53451b67", "sha256": "128c4b222c5502d5f8352aeeb7dbff8291b2d4cb6eb59bf70001fbb98592e296" }, "downloads": -1, "filename": "torchdata-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "b38530fc17456e172c46e0ef53451b67", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.7", "size": 23328, "upload_time": "2019-09-19T16:27:04", "url": "https://files.pythonhosted.org/packages/a1/e6/de3c44b7eb029714a418e284c9d46e21d236da3c5988c9258ca9c47c8867/torchdata-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9576736d0baf54b50a156a2f19f445d9", "sha256": "1e5d871b9cb9a95b2966ede7e6d8ede2a19430ac6663cf97aa94780b8717bfdd" }, "downloads": -1, "filename": "torchdata-0.1.1.tar.gz", "has_sig": false, "md5_digest": "9576736d0baf54b50a156a2f19f445d9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.7", "size": 19878, "upload_time": "2019-09-19T16:27:06", "url": "https://files.pythonhosted.org/packages/4d/e6/544d41902e59f07126e6033035811987977548606e59e51882a4bff991a6/torchdata-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", 
"digests": { "md5": "b93a9bb5fcd2c9e97eaf8d85cc1827de", "sha256": "eecff607cf962f2bad0817653470f6f6b334f5fe367c6745220fc06c3dacf233" }, "downloads": -1, "filename": "torchdata-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "b93a9bb5fcd2c9e97eaf8d85cc1827de", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.7", "size": 26451, "upload_time": "2019-09-20T16:58:37", "url": "https://files.pythonhosted.org/packages/d6/04/3244c60336750099af3aeaaa18efad4f1978f1e184d489a9ba1121dc80ab/torchdata-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ba0b3bd91e1bd46df71dc886aea3a383", "sha256": "d947ee220ebfc863cb0363e4867bbbdc4c5a56b9eaa02f9f87f57c060576f548" }, "downloads": -1, "filename": "torchdata-0.1.2.tar.gz", "has_sig": false, "md5_digest": "ba0b3bd91e1bd46df71dc886aea3a383", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.7", "size": 21664, "upload_time": "2019-09-20T16:58:38", "url": "https://files.pythonhosted.org/packages/17/c1/1434bfbfb46752bda6cedbb6c054103cf5e37933f825c7c76d08be68f252/torchdata-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b93a9bb5fcd2c9e97eaf8d85cc1827de", "sha256": "eecff607cf962f2bad0817653470f6f6b334f5fe367c6745220fc06c3dacf233" }, "downloads": -1, "filename": "torchdata-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "b93a9bb5fcd2c9e97eaf8d85cc1827de", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.7", "size": 26451, "upload_time": "2019-09-20T16:58:37", "url": "https://files.pythonhosted.org/packages/d6/04/3244c60336750099af3aeaaa18efad4f1978f1e184d489a9ba1121dc80ab/torchdata-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ba0b3bd91e1bd46df71dc886aea3a383", "sha256": "d947ee220ebfc863cb0363e4867bbbdc4c5a56b9eaa02f9f87f57c060576f548" }, "downloads": -1, "filename": "torchdata-0.1.2.tar.gz", "has_sig": false, "md5_digest": "ba0b3bd91e1bd46df71dc886aea3a383", 
"packagetype": "sdist", "python_version": "source", "requires_python": ">=3.7", "size": 21664, "upload_time": "2019-09-20T16:58:38", "url": "https://files.pythonhosted.org/packages/17/c1/1434bfbfb46752bda6cedbb6c054103cf5e37933f825c7c76d08be68f252/torchdata-0.1.2.tar.gz" } ] }