{ "info": { "author": "Daoud Clarke", "author_email": "", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3.5" ], "description": "PyPastry - the opinionated machine learning experimentation framework\n=====================================================================\n\nPyPastry is a framework for developers and data scientists to run\nmachine learning experiments. We enable you to:\n\n - Iterate quickly. The more experiments you do, the more likely you\n are to find something that works well.\n - Experiment correctly and consistently. Anything else is not really\n an experiment, is it?\n - Make experiments reproducible. That means keeping track of your\n code state and results.\n - Experiment locally. None of that Spark rubbish.\n - Use standard tools. Everything is based on Scikit-learn, Pandas and Git.\n\nQuick start\n-----------\n\nPyPastry requires Python 3.5 or greater.\n\n > git clone https://github.com/datapastry/pypastry.git\n > pip install -e pypastry\n > pastry init pastry-test\n > cd pastry-test\n > pastry run -m \"First experiment\"\n Got dataset with 10 rows\n Git hash Dataset hash Run start Model Score Duration (s)\n 0 aa87ce62 71e8f4fd 2019-08-28 06:39:07 DecisionTreeClassifier 0.933 \u00b1 0.067 0.03\n\nThe command `pastry init` creates a file called `pie.py` in the `pastry-test` directory. If you open\nthat up, you should see some code. 
The important bit is:\n\n def get_experiment():\n dataset = pd.DataFrame({\n 'feature': [1, 0, 1, 1, 0, 0, 1, 1, 0, 1],\n 'class': [True, False, True, True, False, False, True, True, False, False],\n })\n predictor = DecisionTreeClassifier()\n cross_validator = StratifiedKFold(n_splits=5)\n scorer = make_scorer(f1_score)\n label_column = 'class'\n return Experiment(dataset, label_column, predictor, cross_validator, scorer)\n\nThis returns an `Experiment` instance that specifies how the experiment should be run. An experiment\nconsists of:\n - `dataset`: a Pandas `DataFrame` where each row is an instance to be used in the experiment.\n - `label_column`: the name of the column in `dataset` that contains the label we wish to predict.\n - `predictor`: a Scikit-learn predictor, e.g. a classifier, regressor or `Pipeline` object.\n - `cross_validator`: a Scikit-learn cross validator that specifies how the data should be split\n up when running the experiment.\n - `scorer`: a Scikit-learn scorer that will be used as an indication of how well the predictor has\n learnt to generate predictions.\n\nWhen you type `pastry run`, PyPastry does this:\n - Splits `dataset` into one or more train and test sets.\n - For each train and test set, trains the `predictor` on the train set, generates predictions\n on the test set, and computes the score on the test set using the `scorer`.\n - Generates a results file in JSON format and stores it in a folder called `results`.\n - Adds the new file and any modified files to git staging and runs a git commit.\n - Outputs the results of the experiment.\n\nThe results include:\n - Git hash: the commit identifier from git that allows you to return to this version of the code\n at any later point in time.\n - Dataset hash: a hash generated from the dataset that will change if the dataset changes.\n - Run start: the time that the experiment run started.\n - Model: the name of the `predictor` class used.\n - Score: the mean \u00b1 the standard 
error in the mean, computed over the different folds generated\n by the `cross_validator`.\n - Duration: how long the experiment took to run, in seconds.\n\nContributing\n------------\n\nPyPastry is at an early stage so there's plenty to do and we'd love to have your contribution.\n\nCheck out the issues for a list of things that need doing and post a comment if you'd like to take\nsomething on.\n\nIf you have an idea for something you'd like to do, create an issue.\n\nThanks for using PyPastry!\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/datapastry/pypastry", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "pypastry", "package_url": "https://pypi.org/project/pypastry/", "platform": "", "project_url": "https://pypi.org/project/pypastry/", "project_urls": { "Homepage": "https://github.com/datapastry/pypastry" }, "release_url": "https://pypi.org/project/pypastry/0.0.1/", "requires_dist": [ "tomlkit", "pandas", "scikit-learn", "pyarrow", "gitpython" ], "requires_python": ">=3.5", "summary": "PyPastry machine learning experimentation framework", "version": "0.0.1" }, "last_serial": 5874563, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c83f7de587a722c261a10eac2fc9c390", "sha256": "7c7aa0b9c3da5fde482da03aebeda3eff896bf20b833ee0ab2ea33a818d8baa3" }, "downloads": -1, "filename": "pypastry-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "c83f7de587a722c261a10eac2fc9c390", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 4315, "upload_time": "2019-09-23T15:12:28", "url": "https://files.pythonhosted.org/packages/1b/33/4ab31dc0a86fad607ba145395afb8e8ba3758454c689d32599107467c376/pypastry-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f1ebd03623b1cff9243f5bdd18b98869", "sha256": 
"a1c81cd0e01fe69637a9c6393dd6e97901d5c5fb8270553c9b6a61978fe0c063" }, "downloads": -1, "filename": "pypastry-0.0.1.tar.gz", "has_sig": false, "md5_digest": "f1ebd03623b1cff9243f5bdd18b98869", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 9031, "upload_time": "2019-09-23T15:12:34", "url": "https://files.pythonhosted.org/packages/0f/74/9d9217e5e0772c5d152b387f08d61102fa7426058ac886c6a633385a053f/pypastry-0.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c83f7de587a722c261a10eac2fc9c390", "sha256": "7c7aa0b9c3da5fde482da03aebeda3eff896bf20b833ee0ab2ea33a818d8baa3" }, "downloads": -1, "filename": "pypastry-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "c83f7de587a722c261a10eac2fc9c390", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 4315, "upload_time": "2019-09-23T15:12:28", "url": "https://files.pythonhosted.org/packages/1b/33/4ab31dc0a86fad607ba145395afb8e8ba3758454c689d32599107467c376/pypastry-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f1ebd03623b1cff9243f5bdd18b98869", "sha256": "a1c81cd0e01fe69637a9c6393dd6e97901d5c5fb8270553c9b6a61978fe0c063" }, "downloads": -1, "filename": "pypastry-0.0.1.tar.gz", "has_sig": false, "md5_digest": "f1ebd03623b1cff9243f5bdd18b98869", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 9031, "upload_time": "2019-09-23T15:12:34", "url": "https://files.pythonhosted.org/packages/0f/74/9d9217e5e0772c5d152b387f08d61102fa7426058ac886c6a633385a053f/pypastry-0.0.1.tar.gz" } ] }