{ "info": { "author": "Nexedi", "author_email": "erp5-dev@erp5.org", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License (GPL)", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy" ], "description": ".. contents::\n\nxfw is an eXtensible Fixed-Width file handling module.\n\nFeatures\n========\n\n- field types (integers, strings, dates) are declared independently of\n file structure, and can be extended through subclassing. (BaseField\n subclasses)\n\n- multi-field structure declaration (FieldList class)\n\n- non-homogeneous file file structure declaration (FieldListFile)\n\n- checksum/hash computation helpers (ChecksumedFile subclasses)\n\n- does not depend on line notion (file may not contain CR/LF chars at all\n between successive field sets)\n\nMissing features / bugs\n=======================\n\n- string trucating is multi-byte (UTF-8, ...) agnostic, and will mindlessly cut\n in the middle of any entity\n If your fields are defined in number of characters of some encoding, just use\n provide xfw with unicode objects, and do the transcoding outside it. See\n `codecs` standard module.\n\n- proper interface declaration\n\n- fields (IntegerField, DateTimeField) should cast by default when parsing\n\n- FieldList total length should be made optional, and only used to\n auto-generate annonymous padding at record end when longer that the sum of\n individual fields lengths.\n\nExample\n=======\n\nDislaimer: give file format is purely hypothetical, does not some from any spec\nI know of, should not be taken as a guideline but just as a showcase of xfw\ncapabilities.\n\nLet's assume a file composed of a general header, containing some\nconstant-value 5-char identifier, a 3-char integer giving the number of records\ncontained, and an optional 20-char comment. It is followed by records, composed\nof a header itself composed of a date (YYYYMMDD), a row type (2-char integer)\nand number of rows (2-char integer), and followed by rows. Row types all start\nwith a time (HHMMSS), followed by fields which depend on row type:\n\n- type 1: a 10-char string\n\n- type 2: a 2-char integer, 8 chars of padding, a 1-char integer\n\nTo run the following code as a doctest, run::\n\n python -m doctest README.rst\n\nDeclare all file structures::\n\n >>> import xfw\n >>> ROOT_HEADER = xfw.FieldList([\n ... (xfw.StringField(5), True, 'header_id'),\n ... (xfw.IntegerField(3, cast=True), True, 'block_count'),\n ... (xfw.StringField(15), False, 'comment'),\n ... ], 23, fixed_value_dict={\n ... 'header_id': 'HEAD1',\n ... })\n >>> BLOCK_HEADER = xfw.FieldList([\n ... (xfw.DateTimeField('%Y%m%d', cast=True), True, 'date'),\n ... (xfw.IntegerField(2, cast=True), True, 'row_type'),\n ... (xfw.IntegerField(2, cast=True), True, 'row_count'),\n ... ], 12)\n >>> ROW_BASE = xfw.FieldList([\n ... (xfw.DateTimeField('%H%M%S', cast=True), True, 'time'),\n ... ], 6)\n >>> ROW_TYPE_DICT = {\n ... 1: xfw.FieldList([\n ... ROW_BASE,\n ... (xfw.StringField(10), True, 'description'),\n ... ], 16),\n ... 2: xfw.FieldList([\n ... ROW_BASE,\n ... (xfw.IntegerField(2, cast=True), True, 'some_value'),\n ... (xfw.StringField(8), False, None), # annonymous padding\n ... (xfw.IntegerField(1, cast=True), True, 'another_value'),\n ... ], 17),\n ... }\n >>> def blockCallback(head, item_list=None):\n ... if item_list is None:\n ... row_count = head['row_count']\n ... else:\n ... row_count = len(item_list)\n ... return row_count, ROW_TYPE_DICT[head['row_type']]\n >>> FILE_STRUCTURE = xfw.ConstItemTypeFile(\n ... ROOT_HEADER,\n ... 'block_count',\n ... xfw.FieldListFile(\n ... BLOCK_HEADER,\n ... blockCallback,\n ... separator='\\n',\n ... ),\n ... separator='\\n',\n ... )\n\nParse sample file through a hash helper wrapper (SHA1)::\n\n >>> from cStringIO import StringIO\n >>> sample_file = StringIO(\n ... 'HEAD1002blah \\n'\n ... '201112260101\\n'\n ... '115500other str \\n'\n ... '201112260201\\n'\n ... '11550099 8'\n ... )\n >>> from datetime import datetime\n >>> checksumed_wrapper = xfw.SHA1ChecksumedFile(sample_file)\n >>> parsed_file = FILE_STRUCTURE.parseStream(checksumed_wrapper)\n >>> parsed_file == \\\n ... (\n ... {\n ... 'header_id': 'HEAD1',\n ... 'block_count': 2,\n ... 'comment': 'blah',\n ... },\n ... [\n ... (\n ... {\n ... 'date': datetime(2011, 12, 26, 0, 0),\n ... 'row_type': 1,\n ... 'row_count': 1,\n ... },\n ... [\n ... {\n ... 'time': datetime(1900, 1, 1, 11, 55),\n ... 'description': 'other str',\n ... },\n ... ]\n ... ),\n ... (\n ... {\n ... 'date': datetime(2011, 12, 26, 0, 0),\n ... 'row_type': 2,\n ... 'row_count': 1,\n ... },\n ... [\n ... {\n ... 'time': datetime(1900, 1, 1, 11, 55),\n ... 'some_value': 99,\n ... 'another_value': 8,\n ... },\n ... ]\n ... ),\n ... ],\n ... )\n True\n\nVerify SHA1 was properly accumulated::\n\n >>> import hashlib\n >>> hashlib.sha1(sample_file.getvalue()).hexdigest() == checksumed_wrapper.getHexDigest()\n True\n\nGenerate a file from parsed data (as it was verified correct above)::\n\n >>> generated_stream = StringIO()\n >>> FILE_STRUCTURE.generateStream(generated_stream, parsed_file)\n >>> generated_stream.getvalue() == sample_file.getvalue()\n True\n\nLikewise, using unicode objects and producing streams of different binary\nlength, although containing the same number of entities. Note that\nfixed-values defined in format declaration are optional (ex: `header_id`),\nand dependent values are automaticaly computed (ex: `block_count`).\n\nGenerate with unicode chars fitting in single UTF-8-encoded bytes::\n\n >>> import codecs\n >>> encoded_writer = codecs.getwriter('UTF-8')\n >>> input_data = (\n ... {\n ... 'comment': u'Just ASCII',\n ... },\n ... [],\n ... )\n >>> sample_file = StringIO()\n >>> FILE_STRUCTURE.generateStream(encoded_writer(sample_file), input_data)\n >>> sample_file.getvalue()\n 'HEAD1000Just ASCII '\n >>> len(sample_file.getvalue())\n 23\n\nGenerate again, with chars needing more bytes when encoded, and demonstrating\nchecksum generation::\n\n >>> wide_input_data = (\n ... {\n ... 'comment': u'\\u3042\\u3044\\u3046\\u3048\\u304a\\u304b\\u304d\\u304f\\u3051\\u3053\\u3055\\u3057\\u3059\\u305b\\u305d',\n ... },\n ... [],\n ... )\n >>> wide_sample_file = StringIO()\n >>> checksumed_wrapper = xfw.SHA1ChecksumedFile(wide_sample_file)\n >>> FILE_STRUCTURE.generateStream(encoded_writer(checksumed_wrapper), wide_input_data)\n >>> wide_sample_file.getvalue()\n 'HEAD1000\\xe3\\x81\\x82\\xe3\\x81\\x84\\xe3\\x81\\x86\\xe3\\x81\\x88\\xe3\\x81\\x8a\\xe3\\x81\\x8b\\xe3\\x81\\x8d\\xe3\\x81\\x8f\\xe3\\x81\\x91\\xe3\\x81\\x93\\xe3\\x81\\x95\\xe3\\x81\\x97\\xe3\\x81\\x99\\xe3\\x81\\x9b\\xe3\\x81\\x9d'\n >>> len(wide_sample_file.getvalue())\n 53\n >>> hashlib.sha1(wide_sample_file.getvalue()).hexdigest() == checksumed_wrapper.getHexDigest()\n True\n\nStill, both parse to their respective original data::\n\n >>> encoded_reader = codecs.getreader('UTF-8')\n >>> FILE_STRUCTURE.parseStream(encoded_reader(StringIO(sample_file.getvalue())))[0]['comment']\n u'Just ASCII'\n >>> FILE_STRUCTURE.parseStream(encoded_reader(StringIO(wide_sample_file.getvalue())))[0]['comment']\n u'\\u3042\\u3044\\u3046\\u3048\\u304a\\u304b\\u304d\\u304f\\u3051\\u3053\\u3055\\u3057\\u3059\\u305b\\u305d'", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://git.erp5.org/gitweb/xfw.git", "keywords": "fixed width field file", "license": "GPL 2+", "maintainer": null, "maintainer_email": null, "name": "xfw", "package_url": "https://pypi.org/project/xfw/", "platform": "any", "project_url": "https://pypi.org/project/xfw/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://git.erp5.org/gitweb/xfw.git" }, "release_url": "https://pypi.org/project/xfw/0.10/", "requires_dist": null, "requires_python": null, "summary": "eXtensible Fixed-Width file handling module", "version": "0.10" }, "last_serial": 4265603, "releases": { "0.10": [ { "comment_text": "", "digests": { "md5": "2f82e71bb2a55537ecbac6ddd262966b", "sha256": "35563498e51701a3a600f916f9db7b4b0773ff5b6b6ef1809b2c95c2d5692a62" }, "downloads": -1, "filename": "xfw-0.10.tar.gz", "has_sig": false, "md5_digest": "2f82e71bb2a55537ecbac6ddd262966b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9839, "upload_time": "2013-11-26T13:09:56", "url": "https://files.pythonhosted.org/packages/6a/89/4e1d45fbce5c8bafcbdc6245e3c0a4c94d67b87c92c20b7b94cd6897ed09/xfw-0.10.tar.gz" } ], "0.9": [ { "comment_text": "", "digests": { "md5": "066e8340463e40b81660024de94fe50d", "sha256": "5e015537c8d1901ad558b6516c53a34cb251afbb39d0f2c84e2ccc6748d7b408" }, "downloads": -1, "filename": "xfw-0.9.tar.gz", "has_sig": false, "md5_digest": "066e8340463e40b81660024de94fe50d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7967, "upload_time": "2011-12-26T14:46:06", "url": "https://files.pythonhosted.org/packages/42/f2/f93f3b44b1017c21f6a1396acf317cf18310ee0779a6e992f1b87f6c9624/xfw-0.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2f82e71bb2a55537ecbac6ddd262966b", "sha256": "35563498e51701a3a600f916f9db7b4b0773ff5b6b6ef1809b2c95c2d5692a62" }, "downloads": -1, "filename": "xfw-0.10.tar.gz", "has_sig": false, "md5_digest": "2f82e71bb2a55537ecbac6ddd262966b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9839, "upload_time": "2013-11-26T13:09:56", "url": "https://files.pythonhosted.org/packages/6a/89/4e1d45fbce5c8bafcbdc6245e3c0a4c94d67b87c92c20b7b94cd6897ed09/xfw-0.10.tar.gz" } ] }