{ "info": { "author": "Jaidev Deshpande", "author_email": "deshpande.jaidev@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Framework :: Sphinx", "Framework :: Sphinx :: Extension", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Financial and Insurance Industry", "Intended Audience :: Healthcare Industry", "Intended Audience :: Information Technology", "Intended Audience :: Manufacturing", "Intended Audience :: Other Audience", "Intended Audience :: Science/Research", "Intended Audience :: System Administrators", "Intended Audience :: Telecommunications Industry", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Topic :: Database", "Topic :: Scientific/Engineering" ], "description": ".. -*- mode: rst -*-\r\n\r\n|Travis|_ |Coveralls|_ |Landscape|_ |RTFD|_\r\n\r\n.. |Travis| image:: https://travis-ci.org/motherbox/pysemantic.svg?branch=master\r\n.. _Travis: https://travis-ci.org/motherbox/pysemantic\r\n\r\n.. |Coveralls| image:: https://coveralls.io/repos/motherbox/pysemantic/badge.svg?branch=master\r\n.. _Coveralls: https://coveralls.io/r/motherbox/pysemantic?branch=master\r\n\r\n.. |Landscape| image:: https://landscape.io/github/motherbox/pysemantic/master/landscape.svg?style=flat\r\n.. _Landscape: https://landscape.io/github/motherbox/pysemantic/master\r\n\r\n.. |RTFD| image:: https://readthedocs.org/projects/pysemantic/badge/?version=latest\r\n.. _RTFD: https://readthedocs.org/projects/pysemantic/?badge=latest\r\n\r\n.. image:: docs/_static/logo.png\r\n\r\npysemantic\r\n==========\r\nA traits based data validation and data cleaning module for pandas data structures.\r\n\r\nDependencies\r\n------------\r\n* Traits\r\n* PyYaml\r\n* pandas\r\n* docopt\r\n\r\nQuick Start\r\n-----------\r\n\r\nInstalling with pip\r\n+++++++++++++++++++\r\n\r\nRun::\r\n\r\n $ pip install pysemantic\r\n\r\nInstalling from source\r\n++++++++++++++++++++++\r\n\r\nYou can install pysemantic by cloning this repository, installing the\r\ndependencies and running::\r\n\r\n $ python setup.py install\r\n\r\nin the root directory of your local clone.\r\n\r\nUsage\r\n+++++\r\n\r\nCreate an empty file named ``pysemantic.conf`` in your home directory. This can be as simple as running::\r\n\r\n$ touch ~/pysemantic.conf\r\n\r\nAfter installing pysemantic, you should have a command line script called\r\n``semantic``. Try it out by running::\r\n\r\n$ semantic list\r\n\r\nThis should do nothing. This means that you don't have any projects regiestered\r\nunder pysemantic. A _project_ in pysemantic is just a collection of _datasets_.\r\npysemantic manages your datasets like an IDE manages source code files in that\r\nit groups them under different projects, and each project has it's own tree\r\nstructure, build toolchains, requirements, etc. Similarly, different\r\npysemantic projects group under them a set of datasets, and manages them\r\ndepending on their respective user-defined specifications. Projects are\r\nuniquely identified by their names.\r\n\r\nFor now, let's add and configure a demo project called, simply,\r\n\"pysemantic_demo\". You can create a project and register it with pysemantic\r\nusing the ``add`` subcommand of the ``semantic`` script as follows::\r\n\r\n$ semantic add pysemantic_demo\r\n\r\nAs you can see, this does not fit the supported usage of the ``add`` subcommand.\r\nWe additionally need a file containing the specifications for this project.\r\n(Note that this file, containing the specifications, is referred to throughout\r\nthe documentation interchangeably as a *specfile* or a *data dictionary*.)\r\nBefore we create this file, let's download the well known Fisher iris datset,\r\nwhich we will use as the sample dataset for this demo. You can download it\r\n`here `_.\r\n\r\nOnce the dataset is downloaded, fire up your favourite text editor and create a\r\nfile named ``demo_specs.yaml``. Fill it up with the following content.\r\n\r\n.. code-block:: yaml\r\n\r\n iris:\r\n path: /absolute/path/to/iris.csv\r\n\r\nNow we can use this file as the data dictionary of the ``pysemantic_demo``\r\nproject. Let's tell pysemantic that we want to do so, by running the following\r\ncommand::\r\n\r\n$ semantic add pysemantic_demo /path/to/demo_specs.yaml\r\n\r\nWe're all set. To see how we did, start a Python interpreter and type the\r\nfollowing statements::\r\n\r\n>>> from pysemantic import Project\r\n>>> demo = Project(\"pysemantic_demo\")\r\n>>> iris = demo.load_dataset(\"iris\")\r\n\r\nVoila! The Python object named ``iris`` is actually a pandas DataFrame containing\r\nthe iris dataset! Well, nothing really remarkable so far. In fact, we cloned\r\nand installed a module, wrote two seemingly unnecessary files, and typed three\r\nlines of Python code to do something that could have been achieved by simply\r\nwriting::\r\n\r\n>>> iris = pandas.read_csv(\"/path/to/iris.csv\")\r\n\r\nMost datasets, however, are not as well behaved as this one. In fact they can\r\nbe a nightmare to deal with. Pysemantic can be far more intricate and far\r\nsmarter than this when dealing with mangled, badly encoded, ugly data with\r\ninconsistent data types. Check the IPython notebooks in the examples to see how to use Pysemantic for\r\nsuch data.", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/motherbox/pysemantic/archive/0.1.1.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://pysemantic.readthedocs.org", "keywords": "pandas, data, traits, numpy", "license": "New BSD License\r\n\r\nCopyright (c) 2014\u20132015 The DataCulture Analytics Company.\r\nAll rights reserved.\r\n\r\n\r\nRedistribution and use in source and binary forms, with or without\r\nmodification, are permitted provided that the following conditions are met:\r\n\r\n a. Redistributions of source code must retain the above copyright notice,\r\n this list of conditions and the following disclaimer.\r\n b. Redistributions in binary form must reproduce the above copyright\r\n notice, this list of conditions and the following disclaimer in the\r\n documentation and/or other materials provided with the distribution.\r\n c. Neither the name of the DataCulture Analytics Company, nor the names of\r\n the PySemantic Developers, nor the names of its contributors may be used\r\n to endorse or promote products derived from this software without\r\n specific prior written permission. \r\n\r\n\r\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\r\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\r\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE\r\nARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR\r\nANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\r\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\r\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\r\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT\r\nLIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY\r\nOUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\r\nDAMAGE.", "maintainer": "Jaidev Deshpande", "maintainer_email": "deshpande.jaidev@gmail.com", "name": "pysemantic", "package_url": "https://pypi.org/project/pysemantic/", "platform": "", "project_url": "https://pypi.org/project/pysemantic/", "project_urls": { "Download": "https://github.com/motherbox/pysemantic/archive/0.1.1.tar.gz", "Homepage": "http://pysemantic.readthedocs.org" }, "release_url": "https://pypi.org/project/pysemantic/v0.1.1/", "requires_dist": null, "requires_python": null, "summary": "A traits based data validation module for pandas data structures.", "version": "v0.1.1" }, "last_serial": 1614948, "releases": { "v0.1": [ { "comment_text": "", "digests": { "md5": "4eda9d7a4c0c789ad8ba68ec8b932046", "sha256": "c35f291a4c8535c6f002ecde94e82245cff7c26c4da355ddfe7be4c182736ae0" }, "downloads": -1, "filename": "pysemantic-0.1.tar.gz", "has_sig": false, "md5_digest": "4eda9d7a4c0c789ad8ba68ec8b932046", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 370483, "upload_time": "2015-07-01T13:02:19", "url": "https://files.pythonhosted.org/packages/ea/5f/c144bd6cf1907c81adfa5bd65bd6f97e4747220eadc08d1bdbfae9ecf9c6/pysemantic-0.1.tar.gz" } ], "v0.1.1": [ { "comment_text": "", "digests": { "md5": "330ce56736e14ccb19bba764c2a3588d", "sha256": "b3bf42b9e66afd1d04fd84ca9dfab2dd806b250d9bc8e623bfab422bf3484bc7" }, "downloads": -1, "filename": "pysemantic-0.1.1.tar.gz", "has_sig": false, "md5_digest": "330ce56736e14ccb19bba764c2a3588d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 370767, "upload_time": "2015-07-01T14:23:21", "url": "https://files.pythonhosted.org/packages/ce/72/1661c0b85d22875478051016c4cca39a38fa65e7445b4514e5d07fe81a99/pysemantic-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "330ce56736e14ccb19bba764c2a3588d", "sha256": "b3bf42b9e66afd1d04fd84ca9dfab2dd806b250d9bc8e623bfab422bf3484bc7" }, "downloads": -1, "filename": "pysemantic-0.1.1.tar.gz", "has_sig": false, "md5_digest": "330ce56736e14ccb19bba764c2a3588d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 370767, "upload_time": "2015-07-01T14:23:21", "url": "https://files.pythonhosted.org/packages/ce/72/1661c0b85d22875478051016c4cca39a38fa65e7445b4514e5d07fe81a99/pysemantic-0.1.1.tar.gz" } ] }