{ "info": { "author": "Ken Van Haren", "author_email": "kvh@science.io", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: BSD License" ], "description": "Ramp - Rapid Machine Learning Prototyping\n=========================================\n\nRamp is a python module for rapid prototyping of machine learning\nsolutions. It is essentially a [pandas](http://pandas.pydata.org)\nwrapper around various python machine learning and statistics libraries\n([scikit-learn](http://scikit-learn.org), [rpy2](http://rpy.sourceforge.net/rpy2.html), etc.),\nproviding a simple, declarative syntax for\nexploring features, algorithms and transformations quickly and\nefficiently.\n\nDocumentation: http://ramp.readthedocs.org\n\n**Why Ramp?**\n\n * **Clean, declarative syntax**\n \n No more hackish one-off spaghetti scripts!\n\n * **Complex feature transformations**\n\n Chain and combine features:\n```python\nNormalize(Log('x'))\nInteractions([Log('x1'), (F('x2') + F('x3')) / 2])\n```\n Reduce feature dimension:\n```python\nDimensionReduction([F('x%d'%i) for i in range(100)], decomposer=PCA(n_components=3))\n```\n Incorporate residuals or predictions to blend with other models:\n```python\nResiduals(config_model1) + Predictions(config_model2)\n```\n Any feature that uses the target (\"y\") variable will automatically respect the\n current training and test sets.\n\n\n * **Caching**\n\n Ramp caches and stores on disk in fast HDF5 format (or elsewhere if you want) all features and models it\n computes, so nothing is recomputed unnecessarily. Results are stored \n and can be retrieved, compared, blended, and reused between runs.\n\n * **Easy extensibility**\n\n Ramp has a simple API, allowing you to plug in estimators from\n scikit-learn, rpy2 and elsewhere, or easily build your own feature\n transformations, metrics, feature selectors, reporters, or estimators.\n\n\n## Quick start\n[Getting started with Ramp: Classifying insults](http://www.kenvanharen.com/2012/11/getting-started-with-ramp-detecting.html)\n\nOr, the quintessential Iris example:\n\n```python\n import pandas\n from ramp import *\n import urllib2\n import sklearn\n from sklearn import decomposition\n\n\n # fetch and clean iris data from UCI\n data = pandas.read_csv(urllib2.urlopen(\n \"http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data\"))\n data = data.drop([149]) # bad line\n columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']\n data.columns = columns\n\n\n # all features\n features = [FillMissing(f, 0) for f in columns[:-1]]\n\n # features, log transformed features, and interaction terms\n expanded_features = (\n features +\n [Log(F(f) + 1) for f in features] +\n [\n F('sepal_width') ** 2,\n combo.Interactions(features),\n ]\n )\n\n\n # Define several models and feature sets to explore,\n # run 5 fold cross-validation on each and print the results.\n # We define 2 models and 4 feature sets, so this will be\n # 4 * 2 = 8 models tested.\n shortcuts.cv_factory(\n data=data,\n\n target=[AsFactor('class')],\n metrics=[[metrics.GeneralizedMCC()]],\n\n # Try out two algorithms\n model=[\n sklearn.ensemble.RandomForestClassifier(n_estimators=20),\n sklearn.linear_model.LogisticRegression(),\n ],\n\n # and 4 feature sets\n features=[\n expanded_features,\n\n # Feature selection\n [trained.FeatureSelector(\n expanded_features,\n # use random forest's importance to trim\n selectors.RandomForestSelector(classifier=True),\n target=AsFactor('class'), # target to use\n n_keep=5, # keep top 5 features\n )],\n\n # Reduce feature dimension (pointless on this dataset)\n [combo.DimensionReduction(expanded_features,\n decomposer=decomposition.PCA(n_components=4))],\n\n # Normalized features\n [Normalize(f) for f in expanded_features],\n ]\n )\n```\n\n## Status\nRamp is very alpha currently, so expect bugs, bug fixes and API changes.\n\n## Requirements\n * Numpy\n * Scipy \n * Pandas\n * PyTables\n * Sci-kit Learn\n\n## Author\nKen Van Haren. Email with feedback/questions: kvh@science.io", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/kvh/ramp", "keywords": "machine learning data analysis statistics mining", "license": "BSD", "maintainer": null, "maintainer_email": null, "name": "ramp", "package_url": "https://pypi.org/project/ramp/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/ramp/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://github.com/kvh/ramp" }, "release_url": "https://pypi.org/project/ramp/0.1.4/", "requires_dist": null, "requires_python": null, "summary": "Rapid machine learning prototyping", "version": "0.1.4" }, "last_serial": 798460, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "92af8f92875fa2af0b62573e452b9fe9", "sha256": "e3de20833463040ef672748147976898395f0aabc99b77a947e3fe6ec0584a91" }, "downloads": -1, "filename": "ramp-0.1.tar.gz", "has_sig": false, "md5_digest": "92af8f92875fa2af0b62573e452b9fe9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26154, "upload_time": "2012-11-18T21:34:31", "url": "https://files.pythonhosted.org/packages/ec/b2/c8b5c3bff275bfd90d70f265d63d0331fe248b1f453af2e3c314c60df7df/ramp-0.1.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "036322a0d4d119d43c793b4b58bb6f9e", "sha256": "242f15a7159840062da8329291fd79391b5a7cc49141852990152e6ca75555f0" }, "downloads": -1, "filename": "ramp-0.1.3.tar.gz", "has_sig": false, "md5_digest": "036322a0d4d119d43c793b4b58bb6f9e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27110, "upload_time": "2012-11-18T22:12:59", "url": "https://files.pythonhosted.org/packages/db/72/7d5abf9e085d3d4ec15b641edb1dbfc173bdd9857fc3b9ad6f70e1d975f4/ramp-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "7a0a8a619ef7e2bc40dd615e1eb5ccf3", "sha256": "568d98769829ad3b9dc97ffbd36d923c9048903b5417580711caea1c392881f7" }, "downloads": -1, "filename": "ramp-0.1.4.tar.gz", "has_sig": false, "md5_digest": "7a0a8a619ef7e2bc40dd615e1eb5ccf3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31632, "upload_time": "2013-01-31T20:41:57", "url": "https://files.pythonhosted.org/packages/13/24/443008a924ad21dd10923dba19788c4144282e9a360dac2fea6bf0ea3ebc/ramp-0.1.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7a0a8a619ef7e2bc40dd615e1eb5ccf3", "sha256": "568d98769829ad3b9dc97ffbd36d923c9048903b5417580711caea1c392881f7" }, "downloads": -1, "filename": "ramp-0.1.4.tar.gz", "has_sig": false, "md5_digest": "7a0a8a619ef7e2bc40dd615e1eb5ccf3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31632, "upload_time": "2013-01-31T20:41:57", "url": "https://files.pythonhosted.org/packages/13/24/443008a924ad21dd10923dba19788c4144282e9a360dac2fea6bf0ea3ebc/ramp-0.1.4.tar.gz" } ] }