{ "info": { "author": "Ali Ghaffaari", "author_email": "ali.ghaffaari@mpi-inf.mpg.de", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Text Processing :: General" ], "description": "parse2csv\n=========\nparse2csv is a command-line tool to parse multiple files for named patterns in\norder to extract structured data from each file and then write the values in CSV\nformat. For each input file, there would be a row in the generated CSV file\ncontaining the field values extracted from that file. The CSV header would be\nthe names specified for each pattern.\n\n\nThe main motivation for writing this script is parsing the output of other tools\nand extract required information and put them in a CSV file for further analysis.\n\nInstallation\n------------\nUsing `pip`:\n\n pip install parse2csv\n\nUsing setup script:\n\n python setup.py install\n\nUsage\n-----\nThe first step is preparing a configuration file. The config file should be in\nYAML format specifying the patterns for which the input files should be searched.\n\nThe patterns can be specified in the config file under `patterns` entry as a\nlist. parse2csv uses [parse](https://github.com/r1chardj0n3s/parse) package to\nextract data. Therefore, all patterns should comply with its format syntax (see\n[format syntax](https://github.com/r1chardj0n3s/parse#format-syntax)). Since the\noutput file is in CSV format, all fields in the patterns should be **named**;\notherwise it cannot be determined which parsed value belongs to which column in\nthe CSV file.\n\nApart from patterns, the order of the fields by which they should appear in the\nCSV file should also be specified in the config file under `fields` entry. All\nfield names must be the same as the name used in the patterns. The `fields`\nentry does not require to include all named fields in the patterns list.\n\nThe entry `missing_value` in the config file indicates the value to be used in\nthe output CSV in case a field cannot be found in the context. The default value\nis 'NA' in case it is not provided in the config file.\n\nHere, is a sample config file:\n\n```yaml\n---\nmissing_value: '-'\nfields:\n - 'date'\n - 'first'\n - 'last'\n - 'address'\n - 'age'\npatterns:\n - 'Date: {date:tg}'\n - 'Age: {age:d}'\n - 'Name: {first:w}{:s}{last:w}'\n - 'Name: {last:w},{:s}{first:w}'\n - 'Address: \"{address}\"'\n...\n```\n\nAssume, there are two files:\n\n $ cat file1\n Date: 1/2/2011 11:00 PM\n Name: Sherlock Holmes\n Age: 38\n Address: \"221B Baker Street\"\n\n $ cat file2\n Date: 6/1/2018 12:00 AM\n Age: 42\n Name: Watson, John\n\nThe output CSV file would be:\n\n date,first,last,age\n 2011-02-01 23:00:00,Sherlock,Holmes,221B Baker Street,38\n 2018-01-06 12:00:00,John,Watson,-,42\n\nIn some cases, a field can be occurred multiple times in the context. These\nvalues can be reduced to one by specifying the reduce function in the config\nfile under `reduce` entry as a mapping between field name and reduce function:\n\n reduce:\n income: 'avg'\n children: 'count'\n\nThe above example maps the `avg` and `count` functions to 'income' and 'children'\nfields, respectively. In case, the income occurs more than once in the context,\nthe average of them will be reported and for 'children' the number of the\noccurrences will be put in the generated CSV file.\n\nThe reduce functions can be one of these:\n- `'first'`: use the first value.\n- `'last'`: use the last value.\n- `'avg'`: use the average of values (the values should be numerical).\n- `'avg_tp'`: the same as `'avg'` but preserves the original type (the values\n should be numerical).\n- `'count'`: use the count of occurrences.\n- `'min'`: use the minimum value.\n- `'max'`: use the maximum value.\n- `'sum'`: use the sum of values.\n- `'concat'`: use the concatenation of the values (field should be `str`).\n\nOnce the configuration file is ready, using parse2csv is quite straightforward\nby providing the input and configuration files:\n\n parse2csv -c config.yaml -o output.csv file1 file2...\n\nThe flag `--help` reveals more details about program usage:\n\n $ python parse2csv.py --help\n Usage: parse2csv.py [OPTIONS] [INPUTS]...\n\n Parse the input files for named patterns and dump their values to a file\n in CSV format.\n\n Options:\n -o, --output FILENAME Write to this file instead of stdout.\n -c, --configfile FILENAME Use this configuration file. [required]\n -d, --dialect [unix|excel|excel-tab]\n Use this CSV dialect. [default: unix]\n --help Show this message and exit.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/cartoonist/parse2csv/tarball/0.1.4", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/cartoonist/parse2csv", "keywords": "parse text convert csv comma separated values", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "parse2csv", "package_url": "https://pypi.org/project/parse2csv/", "platform": "", "project_url": "https://pypi.org/project/parse2csv/", "project_urls": { "Download": "https://github.com/cartoonist/parse2csv/tarball/0.1.4", "Homepage": "https://github.com/cartoonist/parse2csv" }, "release_url": "https://pypi.org/project/parse2csv/0.1.4/", "requires_dist": [ "click (>=6.7)", "parse (>=1.8.2)", "PyYAML (>=3.12)", "nose (>=1.0); extra == 'test'", "coverage; extra == 'test'" ], "requires_python": "", "summary": "A command-line tool to parse multiple files for named", "version": "0.1.4" }, "last_serial": 3933007, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "b07e18a96252411e2145c69aeb97f7d0", "sha256": "fa3e4f1fafbdf5f97e8246d29573fc4fa0507ddb97689bb8f3fd8e81a3af5afa" }, "downloads": -1, "filename": "parse2csv-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "b07e18a96252411e2145c69aeb97f7d0", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7733, "upload_time": "2018-05-18T12:27:49", "url": "https://files.pythonhosted.org/packages/16/68/1dd2d460d294c51607bcb975a7ba87923c0325f38a14d3e4db6145b3d5b5/parse2csv-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "42997645f76d15731bb0dee17e807aaa", "sha256": "09820e652561250213187fa788463857c5fcee0ded9606755bfd37fd253a7e67" }, "downloads": -1, "filename": "parse2csv-0.1.0.tar.gz", "has_sig": false, "md5_digest": "42997645f76d15731bb0dee17e807aaa", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6695, "upload_time": "2018-05-18T12:27:51", "url": "https://files.pythonhosted.org/packages/bf/f6/81a3d0de045343cacefa8137cd7505387444282cce80bf40368fb98ecedb/parse2csv-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "197fc99bd96616a23083c8fd3ad7aba2", "sha256": "fa75a6372a3c637209b1ee374a968429f9545a1b76c079da4f9c3250d5ba83b1" }, "downloads": -1, "filename": "parse2csv-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "197fc99bd96616a23083c8fd3ad7aba2", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7758, "upload_time": "2018-05-18T13:55:28", "url": "https://files.pythonhosted.org/packages/b4/9d/f660dcc74958fd9e03fe41d654a6bb3f57aa260a3d92d5331c826ac4b768/parse2csv-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3264956bb12004618f04a9945376c683", "sha256": "9207bdd32fdfb4b1079f526f12ccc4f8171a4933565f58efdfe7bdd1ebff565d" }, "downloads": -1, "filename": "parse2csv-0.1.1.tar.gz", "has_sig": false, "md5_digest": "3264956bb12004618f04a9945376c683", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6780, "upload_time": "2018-05-18T13:55:29", "url": "https://files.pythonhosted.org/packages/c7/55/7529e931d2499348a8b8a0e8286d0e8c87b9003652341800ede38ae6fae0/parse2csv-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "7fd636e8d8e2e0eab9a05d01a9bbd083", "sha256": "dab461ba5ba14793eb9a6cdd63d386aad8023f8b28e63f91a19a411937bf1b26" }, "downloads": -1, "filename": "parse2csv-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "7fd636e8d8e2e0eab9a05d01a9bbd083", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 7808, "upload_time": "2018-05-21T07:26:55", "url": "https://files.pythonhosted.org/packages/2d/70/db2e814f8efe02e5e6ae4dd3a049ffdad21997a2df234833022e64ae04f6/parse2csv-0.1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b0e939e02825ab7f68e92c69f523bc34", "sha256": "3fbc02df5cc586895f2f229f27dfabc465c4bf218faa59a90c2607f1d3a659bb" }, "downloads": -1, "filename": "parse2csv-0.1.2.tar.gz", "has_sig": false, "md5_digest": "b0e939e02825ab7f68e92c69f523bc34", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6877, "upload_time": "2018-05-21T07:26:56", "url": "https://files.pythonhosted.org/packages/e4/2e/90cba6b18672b91a3a4cac917fd4e7a9d79df369cdd95dbca37ce1d22ce6/parse2csv-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "c9b873da463f63ebc7e0b38cfa356667", "sha256": "8e10085a2a07385e13cc22c8df82e21d6d1488dd778c0a289f47979ba392dd80" }, "downloads": -1, "filename": "parse2csv-0.1.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "c9b873da463f63ebc7e0b38cfa356667", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8737, "upload_time": "2018-05-28T12:23:32", "url": "https://files.pythonhosted.org/packages/3e/eb/57ccab487ac1f5b6b4d3b0ec13e08de297adc9d63d263610439a8d70795c/parse2csv-0.1.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "31954f8a6de70225bc2663b987703608", "sha256": "8a674fbb2ad79172597e9c5fabb9f9ee332ed05e41d70ee24b5152d59be716fd" }, "downloads": -1, "filename": "parse2csv-0.1.3.tar.gz", "has_sig": false, "md5_digest": "31954f8a6de70225bc2663b987703608", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7339, "upload_time": "2018-05-28T12:23:33", "url": "https://files.pythonhosted.org/packages/dc/be/8370d8864d4700003c10dc43fefdc9039bf9508bc269677fce96f4e5f95f/parse2csv-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "4fa31407d1426bd1b8136afc7e9727cc", "sha256": "19aa5620d82434538a16ef15a62b9a4b2e55e8103cee775f5b34b88811f0a459" }, "downloads": -1, "filename": "parse2csv-0.1.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "4fa31407d1426bd1b8136afc7e9727cc", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8864, "upload_time": "2018-06-05T17:07:06", "url": "https://files.pythonhosted.org/packages/be/a1/944384ac25fc9c15519d1e43a47684b1bd5e8f9cdcdd735f87d5eb3d2078/parse2csv-0.1.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3ed0940259fe99b6bf4860d7d55c4ade", "sha256": "9c2e69963e90b05d9e9c4bbc71139e20dc4f73cd33a9460362be06b8ab95cad1" }, "downloads": -1, "filename": "parse2csv-0.1.4.tar.gz", "has_sig": false, "md5_digest": "3ed0940259fe99b6bf4860d7d55c4ade", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7460, "upload_time": "2018-06-05T17:07:07", "url": "https://files.pythonhosted.org/packages/78/97/458f5661f55e09b16e9c43eef9fb9578c1202839f5fcf2d9b409e9a4a379/parse2csv-0.1.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4fa31407d1426bd1b8136afc7e9727cc", "sha256": "19aa5620d82434538a16ef15a62b9a4b2e55e8103cee775f5b34b88811f0a459" }, "downloads": -1, "filename": "parse2csv-0.1.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "4fa31407d1426bd1b8136afc7e9727cc", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8864, "upload_time": "2018-06-05T17:07:06", "url": "https://files.pythonhosted.org/packages/be/a1/944384ac25fc9c15519d1e43a47684b1bd5e8f9cdcdd735f87d5eb3d2078/parse2csv-0.1.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3ed0940259fe99b6bf4860d7d55c4ade", "sha256": "9c2e69963e90b05d9e9c4bbc71139e20dc4f73cd33a9460362be06b8ab95cad1" }, "downloads": -1, "filename": "parse2csv-0.1.4.tar.gz", "has_sig": false, "md5_digest": "3ed0940259fe99b6bf4860d7d55c4ade", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7460, "upload_time": "2018-06-05T17:07:07", "url": "https://files.pythonhosted.org/packages/78/97/458f5661f55e09b16e9c43eef9fb9578c1202839f5fcf2d9b409e9a4a379/parse2csv-0.1.4.tar.gz" } ] }