{ "info": { "author": "Pierluigi Failla", "author_email": "{name}.{surname}@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 1 - Planning", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: Implementation :: PyPy", "Topic :: Documentation", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: System :: Installation/Setup", "Topic :: System :: Software Distribution" ], "description": "# `pipesnake`\n\n*a pandas sklearn-inspired pipeline data processor*\n\n`pipesnake` is a data processing pipeline able to handle Pandas Dataframes. In many cases\nDataframes are used to clean-up data, pre-processing it and to perform feature engineering, \n`pipesnake` tries to simplify these steps, creating complex pipelines.\n\n[documentation](docs/source/index.rst); [examples](examples/README.md);\n\n## Why?\n\nTwo easy reasons:\n* in many cases Pandas DataFrame is super easy to build _feature extractor_ or _data preocessors_\n* in many cases it is useful to have a pipeline that can process both `x` and `y` at the same time\n\n# How can you use `pipesnake` ?\n\n## Install\n\nThe easy way:\n\n`pip install --upgrade https://github.com/pierluigi-failla/pipesnake/tarball/master`\n\nto get the latest version available on GitHub, or:\n\n`pip install pipesnake` \n\nto install the latest stable version on [PyPi](https://pypi.python.org).\n\n## Coding\n\nYou can build your own pipelines combining `SeriesPipe` and `ParallelPipe`, both of them can handle list \nof `Transformer`. \n\nAn inherited `Transformer` object is a class which implements the abstract \n`base.Transformer` methods:\n\n```python\nfrom pipesnake.base import Transformer\n\nclass MyTransformer(Transformer):\n def __init__(self, name=None, ):\n Transformer.__init__(self, name=name, ...)\n\n def fit_x(self, x):\n \n\n def fit_y(self, y):\n \n\n def transform_x(self, x):\n \n\n def transform_y(self, y):\n \n\n def inverse_transform_x(self, x):\n \n\n def inverse_transform_y(self, y):\n \n```\n\nYou can find some `Transformers` already implemented in `pipesnake.transformers`. \n\nOnce you have all the needed `Transformers` you can create pipelines for feature engineering or data \nprocessing using `SeriesPipe` or `ParallelPipe`:\n\n```python\nfrom pipesnake.pipe import ParallelPipe\nfrom pipesnake.pipe import SeriesPipe\n\npipe = SeriesPipe(transformers=[\n ParallelPipe(transformers=[\n MyTransformer1(),\n MyTransformer2(),\n ]),\n MyTransformer3(),\n])\n```\n\nMore info in the [documentation]() and in the [examples](examples/README.md).\n\n# Batteries included\n\n`pipesnake` comes with several transformers included:\n\nModule | Name | Short Description\n--- | --- | ---\n`pipenskae.transformers.combiner` | `Combiner` | Apply user function to a column or a set of columns\n`pipenskae.transformers.combiner` | `Roller` | Apply the provided function rolling within a given window\n`pipenskae.transformers.converter` | `Category2Number` | Convert categorical to number\n`pipenskae.transformers.deeplearning` | `LSTMPacker` | Pack rows in order to be used as input for LSTM networks\n`pipenskae.transformers.dropper` | `DropDuplicates` | Drop duplicated rows and/or cols\n`pipenskae.transformers.dropper` | `DropNanCols` | Drop cols with nans\n`pipenskae.transformers.dropper` | `DropNanRows` | Drop rows with nans\n`pipenskae.transformers.financial` | `ToReturn` | Convert columns to `financial return`: r_t = (x_t - x_{t-1}) / x_{t-1}\n`pipenskae.transformers.imputer` | `ReplaceImputer` | Impute NaNs replacing them\n`pipenskae.transformers.imputer` | `KnnImputer` | Impute NaNs using K-nearest neighbors\n`pipenskae.transformers.misc` | `ToNumpy` | Convert `x` and `y` to a particular numpy type\n`pipenskae.transformers.misc` | `ColumnRenamer` | Rename `x` and `y` columns\n`pipenskae.transformers.misc` | `Copycat` | Copy the datasets forward\n`pipenskae.transformers.scaler` | `MinMaxScaler` | Min max scaler\n`pipenskae.transformers.scaler` | `StdScaler` | Standard deviation scaler\n`pipenskae.transformers.scaler` | `MadScaler` | Median absolute deviation scaler\n`pipenskae.transformers.scaler` | `UnitLenghtScaler` | Scale the feature vector to have norm 1.0\n`pipenskae.transformers.selector` | `ColumnSelector` | Select a given list of column names to keep\n`pipenskae.transformers.stats` | `ToSymbolProbability` | Convert values in columns to their probabilities\n\n# How can you contribute to `pipesnake` ?\n\nFirst of all grab a copy of the repository: \n\n`git clone https://github.com/scikit-learn/scikit-learn.git`\n\nyou can run tests just running `run_tests.py`. \n\nThere is a bunch of things you can contribute as far as `pipesnake` is at its early stages:\n\n* **improvements**: make the library bugfixed, faster, parallel, nicer, cleaner...;\n* **documentation**: this library uses Sphinx to generate documentation, so feel free to enrich it;\n* **samples**: create examples about using the library;\n* **transformers**: develop new-general-purpose transformers to share with the community;\n* **tests**: code better tests to extend the coverage and reduce code regression;\n\nor whatever you may thing is relevant to make `pipesnake` better.\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pierluigi-failla/pipesnake", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "pipesnake", "package_url": "https://pypi.org/project/pipesnake/", "platform": "", "project_url": "https://pypi.org/project/pipesnake/", "project_urls": { "Homepage": "https://github.com/pierluigi-failla/pipesnake" }, "release_url": "https://pypi.org/project/pipesnake/0.1/", "requires_dist": null, "requires_python": "", "summary": "Feature extractor and data processing pipelines for Pandas inspired by Scikit-Learn.", "version": "0.1" }, "last_serial": 3524903, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "35b9f057458b099ffa4d88ca66793486", "sha256": "52d21ed117a01edf46307869b10fb2e5151bfd20fce7069ae15915ce2ec33896" }, "downloads": -1, "filename": "pipesnake-0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "35b9f057458b099ffa4d88ca66793486", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 32305, "upload_time": "2018-01-26T18:18:15", "url": "https://files.pythonhosted.org/packages/db/90/f7f25668e5005ca8c62911b52b8a641cdda5ce8c3847ae07e77d0191ceab/pipesnake-0.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9e1338424b94b13153b70244a48fae42", "sha256": "5bc3b2103fe2afa99ac733c7e9b090f16a144bf4f5a525cceb13919391aebbab" }, "downloads": -1, "filename": "pipesnake-0.1.tar.gz", "has_sig": false, "md5_digest": "9e1338424b94b13153b70244a48fae42", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30915, "upload_time": "2018-01-26T18:18:18", "url": "https://files.pythonhosted.org/packages/d3/21/6271046e0f0efd64bbace4bc70f35c86893e5315a920a9fde40c13e8ffa4/pipesnake-0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "35b9f057458b099ffa4d88ca66793486", "sha256": "52d21ed117a01edf46307869b10fb2e5151bfd20fce7069ae15915ce2ec33896" }, "downloads": -1, "filename": "pipesnake-0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "35b9f057458b099ffa4d88ca66793486", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 32305, "upload_time": "2018-01-26T18:18:15", "url": "https://files.pythonhosted.org/packages/db/90/f7f25668e5005ca8c62911b52b8a641cdda5ce8c3847ae07e77d0191ceab/pipesnake-0.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9e1338424b94b13153b70244a48fae42", "sha256": "5bc3b2103fe2afa99ac733c7e9b090f16a144bf4f5a525cceb13919391aebbab" }, "downloads": -1, "filename": "pipesnake-0.1.tar.gz", "has_sig": false, "md5_digest": "9e1338424b94b13153b70244a48fae42", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30915, "upload_time": "2018-01-26T18:18:18", "url": "https://files.pythonhosted.org/packages/d3/21/6271046e0f0efd64bbace4bc70f35c86893e5315a920a9fde40c13e8ffa4/pipesnake-0.1.tar.gz" } ] }