{ "info": { "author": "Gavin McQuillan", "author_email": "gavin.mcquillan@buildingenergy.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable" ], "description": "mcm-core\n========\n\nCore MCM - Map Clean Merge\n\n\n\n\nOverview\n-----------\n\nMCM has two main pieces, a reader and a mapper.\n\n- Reader\n * Reads CSV files, returns a generator of DictCSVReader parsed rows.\n * Optionally chunks the rows into groupings of specified sizes.\n- Mapper\n * Can build a probabilistic column mapping given a schema and some raw data.\n * Will substitute saved values for suggested mapping (e.g. pulling a previous mapping from DB).\n * Totally flexible: you pass a callable which takes the raw data and returns a mapping.\n * Will clean data based on a Cleaner object for a given type. Type is inferred from the mapping schema.\n * Ability to set \"initial_data\"\n * If you always need to set some information in the object that you're mapping data into, this is useful.\n * Concatenate rows together with a specified delimiter character.\n * Data which doesn't match a given schema's mapping is still saved. It's put in a dictionary called ``extra_data``.\n\n \nInstalling\n----------\n\nOnce it's hosted on PyPI:\n```bash\n pip install mcm-core\n```\n\nIntegration\n-----------\n\n```python\nfrom mcm import cleaners, mapper, reader\n\n# Here our mapping is just a dictionary where our keys are raw data representations\n# and our values are our normalized attributes that we're mapping to.\nmapping = {'Thing': 'thing_1', 'Other thing': 'thing_2'}\n\n# model_class can be any type of object.\nmodel_class = object\n\n# Reading and mapping from a CSV file, simple case.\nparser = reader.MCMParser(csv_file_handle)\nmapped_objs = [m for m in parser.map_rows(mapping, model_class)]\n```\n\n\nDeveloping\n----------\n\n1. Clone.\n2. Create a virtualenv; if you use virtualenv wrapper you'll need to\n 1. 
Run ``python setup.py develop`` to hardlink your files into your env.\n\n\nTesting\n-------\n\nUnfortunately, there are some directory path issues still baked in.\nTo run tests you have to be in the ``tests`` directory:\n\n```bash\n cd tests && nosetests\n```\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "buildingenergy.com", "keywords": "mapping, data,", "license": "Apache2", "maintainer": "Gavin McQuillan", "maintainer_email": "", "name": "mcm", "package_url": "https://pypi.org/project/mcm/", "platform": "", "project_url": "https://pypi.org/project/mcm/", "project_urls": { "Homepage": "buildingenergy.com" }, "release_url": "https://pypi.org/project/mcm/0.0.1/", "requires_dist": null, "requires_python": null, "summary": "Map, Clean, Merge. For mapping data from one schema to another.", "version": "0.0.1" }, "last_serial": 1090105, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "248f4950908c761c6ff21ce75b9cba58", "sha256": "88602b0651debca8d38c4c85d001791434eeec8736c85946cbf6d89332095cfc" }, "downloads": -1, "filename": "mcm-0.0.1.tar.gz", "has_sig": false, "md5_digest": "248f4950908c761c6ff21ce75b9cba58", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27479, "upload_time": "2014-05-12T21:23:29", "url": "https://files.pythonhosted.org/packages/02/23/80fdcb5942c3dfb7a617a3660de1f56753afb69f89e3222fb92c58066f79/mcm-0.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "248f4950908c761c6ff21ce75b9cba58", "sha256": "88602b0651debca8d38c4c85d001791434eeec8736c85946cbf6d89332095cfc" }, "downloads": -1, "filename": "mcm-0.0.1.tar.gz", "has_sig": false, "md5_digest": "248f4950908c761c6ff21ce75b9cba58", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27479, "upload_time": "2014-05-12T21:23:29", "url": 
"https://files.pythonhosted.org/packages/02/23/80fdcb5942c3dfb7a617a3660de1f56753afb69f89e3222fb92c58066f79/mcm-0.0.1.tar.gz" } ] }