{ "info": { "author": "jimmybot", "author_email": "jimmybot@jimmybot.com", "bugtrack_url": null, "classifiers": [ "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# typedtsv\nTyped TSV: A simple format for typing TSVs with an implementation in Python 3.\n\nAvailable on PyPI: https://pypi.org/project/typedtsv/\n\nInstall with: `pip install typedtsv`\n\nSee code and leave feedback here: https://github.com/jimmybot/typedtsv\n\n## Why?\nJSON, YAML, TOML and other simple formats aren't built for list- or table-like sets of data.\n\nYAML is particularly slow due to its expansive feature set, and JSON, being designed for single objects rather than collections, is not chunkable. I once stored all PyPI package info in a YAML file and reading it back out was going to take half a day. Using a dead-simple newline-delimited JSON format made parsing take seconds.\n\nNewline-delimited JSON is convenient, with little chance of making mistakes in parsing and good performance. The downsides are that the supported types are a bit too limited (no int vs float) and that it is not easily human-readable or editable.\n\nTOML is targeted at configuration files and similarly parses into a single dictionary object rather than a collection.\n\nCSV/TSV formats have too much ambiguity, resulting in repetitive custom parsing logic contained outside the file itself. CSV quote escaping can also lead to poor parsing performance.\n\n## Goals\n- Be simple\n- Be fast\n- Be easily parallelized\n- Be a better alternative to CSV/TSV/JSON and simple uses of YAML\n- Support open data and data sharing/archival. 
Push information about a dataset into the data file itself for future reproducibility\n\n### Use Cases in Mind\n- Database-agnostic, program-agnostic simple file format for open data\n- A quick go-to serialization format for sharing reproducible data science datasets\n- Easily-created, easily-editable, easily-understood database fixtures for tests\n\n## Non-Goals\n- Unlimited extensibility a la YAML\n- Config files. Focus is on lists of objects/tabular data\n\n## Format\nThe format is a normal TSV, except that the header row uses a colon format to annotate each column's type:\n\n`<name>:<type>\\t<name>:<type>...`\n\nFor example:\n\n```\n# I'm a comment and will be ignored\nurl:str n_times:int score:float\nhttps://www.example.com 5 1.6\nhttps://archive.org 99 9.9\n```\n\nThe initial pass is centered on Python's basic types plus JSON. Current valid types are:\n\n| Type | Notes |\n|----------|------------------------------------------------------\n| int | |\n| float | |\n| bool | Valid values: true, false, t, f, yes, no, y, n, 1, 0|\n| str | Newlines, tabs, \\\\, and # must be escaped |\n| datetime | '2011-01-01 00:00:00'; assumes UTC when no timezone is given |\n| json | |\n| | |\n| null | All types are nullable with value 'null'. To get the literal string 'null', use '\\\\null'|\n\nComments are supported; just prefix them with #. Escape an actual # in a string with a single backslash: '\\\\#'.\n\nRow separators use `'\\n'` only. Windows line breaks (`'\\r\\n'`) are not valid.\n\nWe'll never allow quoted `'\\n'` because this would make the file difficult to chunk and thus difficult to read in parallel.\n\n**Gotchas**:\n- In Python, you need to be careful about opening files that may contain Windows newlines:\n```py\ninfile = open('data.ttsv', 'r', newline='\\n') # must set newline='\\n'; the default (newline=None) treats '\\n', '\\r', and '\\r\\n' all as line endings\n```\n- typedtsv.dumps can infer column types from the first row of your data but not if there are any ```null```'s. 
In that case, use the regular OrderedDict method to define column names and types\n\n## TODO:\n- ~~Add a boolean type~~\n- ~~Add nulls~~\n- ~~Add a datetime/date/time type: need to avoid ambiguity yet support common uses~~\n- ~~Ergonomics: optionally read and dump single lists of data rather than dealing with a list of lists~~\n- Support unit annotations such as degrees F or meters/second, using the same syntax as F#: https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/units-of-measure\n- Maybe: extend format to support column comments / other common metadata\n- Maybe: support array and map types for compatibility with Postgres\n- Maybe: Support date, time, and/or timeinterval types\n\n## Developing\n\nMake sure you have Poetry installed: https://github.com/sdispater/poetry\n\n```bash\ngit clone git@github.com:jimmybot/typedtsv.git\ncd typedtsv\npoetry install\npoetry shell\npytest\n```\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "", "maintainer": "jimmybot", "maintainer_email": "jimmybot@jimmybot.com", "name": "typedtsv", "package_url": "https://pypi.org/project/typedtsv/", "platform": "", "project_url": "https://pypi.org/project/typedtsv/", "project_urls": null, "release_url": "https://pypi.org/project/typedtsv/0.9.1/", "requires_dist": null, "requires_python": ">=3.6", "summary": "A simple format for typing TSVs with an implementation in Python 3", "version": "0.9.1" }, "last_serial": 4518333, "releases": { "0.6.0": [ { "comment_text": "", "digests": { "md5": "ae9b3e3c34757303f8a14de3adc2190a", "sha256": "b1434e25127a216463235c0d798d79f2c6e34bbb0dbae64cee2a1a25e3f62364" }, "downloads": -1, "filename": "typedtsv-0.6.0-py3-none-any.whl", "has_sig": false, "md5_digest": "ae9b3e3c34757303f8a14de3adc2190a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 
7627, "upload_time": "2018-07-17T07:36:11", "url": "https://files.pythonhosted.org/packages/a4/f2/ca3f8adb9e131b89b18fab6cb3895d3536093bbcc8ae4a0afd70867563e0/typedtsv-0.6.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3f5973e591eaff1d2f6d772d90b20960", "sha256": "8f630b77bbf669a0fd800393604b93e1e731dcfebc67a0d876df1cfd81030b49" }, "downloads": -1, "filename": "typedtsv-0.6.0.tar.gz", "has_sig": false, "md5_digest": "3f5973e591eaff1d2f6d772d90b20960", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7253, "upload_time": "2018-07-17T07:36:13", "url": "https://files.pythonhosted.org/packages/86/d9/206108ad91b2d404b9b72e0f16853c2fd243293cc9bf862f30644eda3dc1/typedtsv-0.6.0.tar.gz" } ], "0.9.0": [ { "comment_text": "", "digests": { "md5": "f06cd3454350432c59d377a4c835cf32", "sha256": "59b66d2ce884cd397ff05bba9b4b56188962643a03b35f064e36c25c7a5ab53c" }, "downloads": -1, "filename": "typedtsv-0.9.0-py3-none-any.whl", "has_sig": false, "md5_digest": "f06cd3454350432c59d377a4c835cf32", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 22369, "upload_time": "2018-11-21T22:58:57", "url": "https://files.pythonhosted.org/packages/1f/e5/4e222038029927373b1c089c70d2cee490a3d348b925ed917fafabd91a45/typedtsv-0.9.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4ed0221f8c49e64849ff3fd10a1e785b", "sha256": "bc906669911552596f5b49fa5779e7a93cfb925aadd51abd581f1fa37ed2fc5c" }, "downloads": -1, "filename": "typedtsv-0.9.0.tar.gz", "has_sig": false, "md5_digest": "4ed0221f8c49e64849ff3fd10a1e785b", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 9389, "upload_time": "2018-11-21T22:58:59", "url": "https://files.pythonhosted.org/packages/5d/7a/09f103ecaeea57968cad707b6dd8074454dd45cae05191dd9432a30ff016/typedtsv-0.9.0.tar.gz" } ], "0.9.1": [ { "comment_text": "", "digests": { "md5": "9d6aae4a315c82e30fd7a62d19b297b7", "sha256": 
"434b90fb0b851d060e2c5dab3cab01f3f04d6f648eae55ee9f9452ed80a7f957" }, "downloads": -1, "filename": "typedtsv-0.9.1-py3-none-any.whl", "has_sig": false, "md5_digest": "9d6aae4a315c82e30fd7a62d19b297b7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 22996, "upload_time": "2018-11-22T21:13:54", "url": "https://files.pythonhosted.org/packages/5d/da/02808a0d791b17aee12850b108614c87e0f279cfa2bbf99ffab33c6ba614/typedtsv-0.9.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "91bca01a7b5ff9f62ae085c39de88743", "sha256": "10a1ecabe12c42d33c8fb57c757d4e14c16b7b377fd717ca2370a5f73f1a6a1d" }, "downloads": -1, "filename": "typedtsv-0.9.1.tar.gz", "has_sig": false, "md5_digest": "91bca01a7b5ff9f62ae085c39de88743", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 9634, "upload_time": "2018-11-22T21:13:55", "url": "https://files.pythonhosted.org/packages/19/d9/df32e7855997d37ce7680682b046d33d8be9a3d7026bae72155888d0fe78/typedtsv-0.9.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9d6aae4a315c82e30fd7a62d19b297b7", "sha256": "434b90fb0b851d060e2c5dab3cab01f3f04d6f648eae55ee9f9452ed80a7f957" }, "downloads": -1, "filename": "typedtsv-0.9.1-py3-none-any.whl", "has_sig": false, "md5_digest": "9d6aae4a315c82e30fd7a62d19b297b7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 22996, "upload_time": "2018-11-22T21:13:54", "url": "https://files.pythonhosted.org/packages/5d/da/02808a0d791b17aee12850b108614c87e0f279cfa2bbf99ffab33c6ba614/typedtsv-0.9.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "91bca01a7b5ff9f62ae085c39de88743", "sha256": "10a1ecabe12c42d33c8fb57c757d4e14c16b7b377fd717ca2370a5f73f1a6a1d" }, "downloads": -1, "filename": "typedtsv-0.9.1.tar.gz", "has_sig": false, "md5_digest": "91bca01a7b5ff9f62ae085c39de88743", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", 
"size": 9634, "upload_time": "2018-11-22T21:13:55", "url": "https://files.pythonhosted.org/packages/19/d9/df32e7855997d37ce7680682b046d33d8be9a3d7026bae72155888d0fe78/typedtsv-0.9.1.tar.gz" } ] }