{ "info": { "author": "Florian Diesch", "author_email": "devel@florian-diesch.de", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License (GPL)", "Natural Language :: English", "Natural Language :: German", "Operating System :: OS Independent", "Programming Language :: Python", "Topic :: Other/Nonlisted Topic", "Topic :: Software Development :: Libraries" ], "description": "=================\nWhat is shipyard?\n=================\n\nShipyard is a module to process data in a format inspired by email \nheaders (RFC 2822).\n\nThe goal of shipyard is to have a simple, human readable and human writable\nreplacement for CSV that works better for long data and many rows and\ndoesn't need difficult escaping rules for special characters.\n\nIt's called ``shipyard`` because that word contains ``py`` and doesn't\nseem to be taken yet.\n\n\n\n===========\nFile format\n===========\n\n\nCharacter encoding\n==================\n\nA character encoding can be specified similar to :pep:`0263` using::\n\n # -*- coding: -*-\n\nin the first line. ``#`` is replaced with the actual `comment`_ mark.\n\nMore precisely, the first line must match the regular\nexpression::\n\n ^#.*coding[:=]\\s*([-\\w.]+)\n\nAgain ``#`` is replaced by the actual `comment`_ mark. The first group\nof this expression is then interpreted as encoding name.\n\n\nData set\n========\n\nA *data set* consists of zero or more `records <#record>`__ separated \nby one or more empty lines.\n\nComment\n=======\nLines starting with the *comment mark* (default: ``#``) are \nignored. 
Comments can be used in or between `records <#record>`__.\n\n\nRecord\n======\nA *record* consists of one or more `fields <#field>`__.\n\nField\n=====\n\nA *field* is a line that has the form::\n\n key: value\n\n*key* is a string that\n - doesn't contain a colon\n - doesn't start with the `comment`_ mark\n - doesn't start with the `continuation`_ mark\n \n*value* is an arbitrary string. It can span multiple lines using\n`continuation`_ marks.\n\n\nContinuation\n============\nIf a line starts with the *continuation mark* (default: \" \" [one blank])\nit gets appended to the preceding line, with the \ncontinuation mark removed.\n\n\n\n\n=====\nUsage\n=====\n\nObviously we need to import shipyard:\n >>> import shipyard\n\nFirst we open the file:\n >>> input = open('nobel.sy')\n\nThen we create a parser object:\n >>> reader = shipyard.Parser(keep_linebreaks=False,\n ... keys=['id', 'discipline', 'year',\n ... 'name', 'country', 'rationale'])\n \nFor every record the given keys are initialized with None.\n\nNow we can iterate through the records:\n\n >>> for record in reader.parse(input): # doctest:+ELLIPSIS\n ... 
print record['country']\n United States\n Japan\n United States\n ...\n \nInstead of iterating we may want to get a list of dicts:\n >>> input.seek(0)\n >>> lod = reader.get_list(input)\n >>> print lod # doctest:+ELLIPSIS\n [{u'discipline': u'Chemistry', u'name': u'Martin Chalfie', ...}, {u'discipline': u'Chemistry', u'name': u'Osamu Shimomura', ...}, ...]\n\nSometimes we need a dict of dicts (using the 'id' field as key):\n >>> input.seek(0)\n >>> dod = reader.get_dict(input, key='id')\n >>> print dod.keys()\n [u'11', u'10', u'1', u'0', u'3', u'2', u'5', u'4', u'7', u'6', u'9', u'8']\n >>> print dod[u'5'][u'rationale']\n for the discovery of the mechanism of spontaneous brokensymmetry in subatomic physics\n\n\nIf we don't want dicts we can use the 'factory' parameter:\n >>> input.seek(0)\n >>> los = reader.get_list(input, factory = lambda **keys: ', '.join(keys.values()))\n >>> print los[0]\n Chemistry, Martin Chalfie, United States, for the discovery and development of the green fluorescentprotein, GFP, 2008, 0\n\nOf course a class works as a factory, too:\n >>> input.seek(0)\n >>> class Laureate(object):\n ... def __init__(self, id, discipline, year, name, country, rationale):\n ... self.name = name\n >>> doo = reader.get_dict(input, key='id', factory = Laureate)\n >>> print doo[u'2'] # doctest:+ELLIPSIS\n \n >>> print doo[u'2'].name\n Roger Y. 
Tsien\n\nNow let's write a Shipyard file.\n\nFirst we create a StringIO (any other file-like object will do, too):\n >>> import StringIO\n >>> output = StringIO.StringIO()\n\nNext we need a Writer object:\n >>> writer = shipyard.Writer(keys=('foo', 'bar'), coding='utf-8')\n\nNow we can use write() to write a single record:\n >>> writer.write(output, {'foo': 1, 'bar': 2})\n >>> print output.getvalue()\n foo: 1\n bar: 2\n \n \n\n\nUsing write_many() we can write a list of records:\n >>> output = StringIO.StringIO()\n >>> d = [dict((('foo', i), ('bar', 2*i))) for i in range(3)]\n >>> writer.write_many(output, d)\n >>> print output.getvalue()\n foo: 0\n bar: 0\n \n foo: 1\n bar: 2\n \n foo: 2\n bar: 4\n \n \n\n\nTo get an encoding line we use write_coding():\n >>> output = StringIO.StringIO()\n >>> writer.write_coding(output)\n >>> print output.getvalue()\n #-*- coding: utf-8 -*-\n \n \n\nNow let's do everything at once using write_full():\n >>> output = StringIO.StringIO()\n >>> writer.write_full(output, d)\n >>> print output.getvalue()\n #-*- coding: utf-8 -*-\n \n foo: 0\n bar: 0\n \n foo: 1\n bar: 2\n \n foo: 2\n bar: 4\n \n ", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://www.florian-diesch.de/software/shipyard/", "keywords": "data format storage human readable RFC-2822 CSV", "license": "GPL", "maintainer": null, "maintainer_email": null, "name": "shipyard", "package_url": "https://pypi.org/project/shipyard/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/shipyard/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://www.florian-diesch.de/software/shipyard/" }, "release_url": "https://pypi.org/project/shipyard/0.02/", "requires_dist": null, "requires_python": null, "summary": "process data in a format inspired by email headers", "version": "0.02" }, "last_serial": 72982, "releases": { "0.01": [], "0.02": [] }, "urls": 
[] }