{ "info": { "author": "Yngve Mardal Moe", "author_email": "yngve.m.moe@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "===========\nGroup Lasso\n===========\n\n.. image:: https://travis-ci.org/yngvem/group-lasso.svg?branch=master\n :target: https://github.com/yngvem/group-lasso\n\n.. image:: https://coveralls.io/repos/github/yngvem/group-lasso/badge.svg\n :target: https://coveralls.io/github/yngvem/group-lasso\n\n.. image:: https://readthedocs.org/projects/group-lasso/badge/?version=latest\n :target: https://group-lasso.readthedocs.io/en/latest/?badge=latest\n\n.. image:: https://img.shields.io/pypi/l/group-lasso.svg\n :target: https://github.com/yngvem/group-lasso/blob/master/LICENSE\n\n.. image:: https://img.shields.io/badge/code%20style-black-000000.svg\n :target: https://github.com/python/black\n\nThe group lasso [1]_ regulariser is a well-known method to achieve structured \nsparsity in machine learning and statistics. The idea is to create \nnon-overlapping groups of covariates, and recover regression weights in which \nonly a sparse set of these covariate groups has non-zero components.\n\nThere are several reasons why this might be a good idea. Say, for example, \nthat we have a set of sensors and each of these sensors generates five \nmeasurements. We don't want to maintain an unnecessary number of sensors. \nIf we try normal LASSO regression, then we will get sparse components. \nHowever, these sparse components might not correspond to a sparse set of \nsensors, since each sensor generates five measurements. If we instead use group \nLASSO with measurements grouped by the sensor that produced them, then\nwe will get a sparse set of sensors.\n\nAn extension of the group lasso regulariser is the sparse group lasso\nregulariser [2]_, which imposes both group-wise sparsity and coefficient-wise\nsparsity. This is done by combining the group lasso penalty with the\ntraditional lasso penalty. 
In this library, I have implemented an efficient\nsparse group lasso solver that is fully scikit-learn API compliant.\n\n------------------\nAbout this project\n------------------\nThis project is developed by Yngve Mardal Moe and released under an MIT \nlicense. I am still working out a few things, so changes might come rapidly.\n\n------------------\nInstallation guide\n------------------\nGroup-lasso requires Python 3.5+, numpy and scikit-learn. \nTo install group-lasso via ``pip``, simply run the command::\n\n pip install group-lasso\n\nAlternatively, you can manually clone this repository and run the\n``setup.py`` file::\n\n git clone https://github.com/yngvem/group-lasso.git\n cd group-lasso\n python setup.py install\n\n-------------\nDocumentation\n-------------\n\nYou can read the full documentation on \n`readthedocs <https://group-lasso.readthedocs.io/en/latest/>`_.\n\n--------\nExamples\n--------\n\nGroup lasso regression\n======================\n\nThe group lasso regulariser is implemented following the scikit-learn API,\nmaking it easy to use for those familiar with the Python ML ecosystem.\n\n.. 
code-block:: python\n\n import numpy as np\n from group_lasso import GroupLasso\n\n # Dataset parameters\n num_data_points = 10_000\n num_features = 500\n num_groups = 25\n assert num_features % num_groups == 0\n\n # Generate data matrix\n X = np.random.standard_normal((num_data_points, num_features))\n\n # Generate coefficients and intercept\n w = np.random.standard_normal((500, 1))\n intercept = 2\n\n # Generate groups and randomly set coefficients to zero\n groups = np.array([[group]*20 for group in range(25)]).ravel()\n for group in range(num_groups):\n w[groups == group] *= np.random.random() < 0.8\n\n # Generate target vector:\n y = X@w + intercept\n noise = np.random.standard_normal(y.shape)\n noise /= np.linalg.norm(noise)\n noise *= 0.3*np.linalg.norm(y)\n y += noise\n\n # Generate group lasso object and fit the model\n gl = GroupLasso(groups=groups, reg=.05)\n gl.fit(X, y)\n estimated_w = gl.coef_\n estimated_intercept = gl.intercept_[0]\n\n # Evaluate the model\n coef_correlation = np.corrcoef(w.ravel(), estimated_w.ravel())[0, 1]\n print(\n \"True intercept: {intercept:.2f}. Estimated intercept: {estimated_intercept:.2f}\".format(\n intercept=intercept,\n estimated_intercept=estimated_intercept\n )\n )\n print(\n \"Correlation between true and estimated coefficients: {coef_correlation:.2f}\".format(\n coef_correlation=coef_correlation\n )\n )\n\n.. code-block::\n\n True intercept: 2.00. Estimated intercept: 1.53\n Correlation between true and estimated coefficients: 0.98\n\n\nGroup lasso as a transformer\n============================\n\nGroup lasso regression can also be used as a transformer\n\n.. 
code-block:: python\n\n import numpy as np\n from sklearn.pipeline import Pipeline\n from sklearn.linear_model import Ridge\n from group_lasso import GroupLasso\n\n # Dataset parameters\n num_data_points = 10_000\n num_features = 500\n num_groups = 25\n assert num_features % num_groups == 0\n\n # Generate data matrix\n X = np.random.standard_normal((num_data_points, num_features))\n\n # Generate coefficients and intercept\n w = np.random.standard_normal((500, 1))\n intercept = 2\n\n # Generate groups and randomly set coefficients to zero\n groups = np.array([[group]*20 for group in range(25)]).ravel()\n for group in range(num_groups):\n w[groups == group] *= np.random.random() < 0.8\n\n # Generate target vector:\n y = X@w + intercept\n noise = np.random.standard_normal(y.shape)\n noise /= np.linalg.norm(noise)\n noise *= 0.3*np.linalg.norm(y)\n y += noise\n\n # Generate group lasso object and fit the model\n # We use an artificially high regularisation coefficient since\n # we want to use group lasso as a variable selection algorithm.\n gl = GroupLasso(groups=groups, group_reg=0.1, l1_reg=0.05)\n gl.fit(X, y)\n new_X = gl.transform(X)\n\n\n # Evaluate the model\n predicted_y = gl.predict(X)\n R_squared = 1 - np.sum((y - predicted_y)**2)/np.sum(y**2)\n\n print(\"The rows with zero-valued coefficients have now been removed from the dataset.\")\n print(\"The new shape is:\", new_X.shape)\n print(\"The R^2 statistic for the group lasso model is: {R_squared:.2f}\".format(R_squared=R_squared))\n print(\"This is very low since the regularisation is so high.\")\n\n # Use group lasso in a scikit-learn pipeline\n pipe = Pipeline(\n memory=None,\n steps=[\n ('variable_selection', GroupLasso(groups=groups, reg=.1)),\n ('regressor', Ridge(alpha=0.1))\n ]\n )\n pipe.fit(X, y)\n predicted_y = pipe.predict(X)\n R_squared = 1 - np.sum((y - predicted_y)**2)/np.sum(y**2)\n\n print(\"The R^2 statistic for the pipeline is: {R_squared:.2f}\".format(R_squared=R_squared))\n\n\n.. 
code-block::\n\n The rows with zero-valued coefficients have now been removed from the dataset.\n The new shape is: (10000, 280)\n The R^2 statistic for the group lasso model is: 0.17\n This is very low since the regularisation is so high.\n The R^2 statistic for the pipeline is: 0.72\n\n------------\nFurther work\n------------\nThe todos are, in decreasing order of importance:\n\n1. Python 3.5 compatibility\n\n----------------------\nImplementation details\n----------------------\nThe problem is solved using the FISTA optimiser [4]_ with a gradient-based \nadaptive restarting scheme [5]_. No line search is currently implemented, but \nI hope to look at that later.\n\nAlthough fast, the FISTA optimiser does not achieve as low loss values as the \nsignificantly slower second-order interior point methods. This might, at \nfirst glance, seem like a problem. However, it does recover the sparsity \npatterns of the data, which can be used to train a new model with the given \nsubset of the features.\n\nAlso, even though the FISTA optimiser is not meant for stochastic \noptimisation, it has, in my experience, not suffered a large fall in \nperformance when the mini-batch was large enough. I have therefore \nimplemented mini-batch optimisation using FISTA, and have thus been able to fit \nmodels based on data with ~500 columns and 10 000 000 rows on my moderately \npriced laptop.\n\nFinally, we note that since FISTA uses Nesterov acceleration, it is not a \ndescent algorithm. We can therefore not expect the loss to decrease \nmonotonically.\n\n----------\nReferences\n----------\n\n.. [1] Yuan, M. and Lin, Y. (2006), Model selection and estimation in\n regression with grouped variables. Journal of the Royal Statistical\n Society: Series B (Statistical Methodology), 68: 49-67.\n doi:10.1111/j.1467-9868.2005.00532.x\n\n.. [2] Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2013).\n A sparse-group lasso. Journal of Computational and Graphical\n Statistics, 22(2), 231-245.\n\n.. 
[3] Yuan L, Liu J, Ye J. (2011), Efficient methods for overlapping\n group lasso. Advances in Neural Information Processing Systems\n (pp. 352-360).\n\n.. [4] Beck, A. and Teboulle, M. (2009), A Fast Iterative \n Shrinkage-Thresholding Algorithm for Linear Inverse Problems.\n SIAM Journal on Imaging Sciences 2009 2:1, 183-202.\n doi:10.1137/080716542 \n\n.. [5] O\u2019Donoghue, B. & Cand\u00e8s, E. (2015), Adaptive Restart for\n Accelerated Gradient Schemes. Found Comput Math 15: 715.\n doi:10.1007/s10208-013-9150-\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "group-lasso", "package_url": "https://pypi.org/project/group-lasso/", "platform": "", "project_url": "https://pypi.org/project/group-lasso/", "project_urls": null, "release_url": "https://pypi.org/project/group-lasso/1.1.1/", "requires_dist": [ "numpy", "scikit-learn" ], "requires_python": "", "summary": "Fast group lasso regularised linear models in a sklearn-style API.", "version": "1.1.1" }, "last_serial": 5786029, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "31c5e0092d06011a035db94ca0ba6326", "sha256": "5a24b485354697cbf2e9c5c05616ab00c0244b019f6f1c7b20ec1883219d7fdf" }, "downloads": -1, "filename": "group_lasso-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "31c5e0092d06011a035db94ca0ba6326", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9911, "upload_time": "2019-07-14T10:43:55", "url": "https://files.pythonhosted.org/packages/27/89/0323bf39d6e000732f8fe7b7894fb1ae892ade7fbd2854b35571f92e5597/group_lasso-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "287e514adb36cc282075c6e61aa62222", "sha256": "5b664877cbd75fdd2f1024eccb5bcc3215161c6e102ff50d84411a7c0049eff3" }, "downloads": -1, "filename": 
"group-lasso-0.1.0.tar.gz", "has_sig": false, "md5_digest": "287e514adb36cc282075c6e61aa62222", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12132, "upload_time": "2019-07-14T10:43:57", "url": "https://files.pythonhosted.org/packages/ac/85/ce88f95cf24f39c2ffa6ce91a1f1eb2a47f6034aa48028f3f3786836f385/group-lasso-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "fd10a239e65ce8e8842ca8d24657efd1", "sha256": "d7ab6a8d1275653c201ee5ef683fcfbb2a343ef2c81d978ccf3bb098399ecd13" }, "downloads": -1, "filename": "group_lasso-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "fd10a239e65ce8e8842ca8d24657efd1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9877, "upload_time": "2019-07-14T11:54:32", "url": "https://files.pythonhosted.org/packages/de/d3/c191c85513a8f255e943a20f7dbe6e844ee37362ae3545215924446a8535/group_lasso-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1fa3cc96906f19e9c761d95855fc8937", "sha256": "e03385a6f01d03b9e5794a1049ae69084eb169ff6975e1c5b9d97465b23c50b8" }, "downloads": -1, "filename": "group-lasso-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1fa3cc96906f19e9c761d95855fc8937", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12193, "upload_time": "2019-07-14T11:54:34", "url": "https://files.pythonhosted.org/packages/4f/5e/0c2b37eb4aba93cc1312f2da86c92407d4073a7ee0d86d94962c6abbad01/group-lasso-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "86e89dd944c3bddcda201be71fb58177", "sha256": "66313549c34c5b661e6f0fb42c0e52a66ffb152ade3b82c48ad4f8dc29b7dfd1" }, "downloads": -1, "filename": "group_lasso-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "86e89dd944c3bddcda201be71fb58177", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13130, "upload_time": "2019-07-15T15:22:23", "url": 
"https://files.pythonhosted.org/packages/64/dd/8100ea7ba79f1a46ffa1372c1a55b19611f2c99f2cc9d7fea6d343a26185/group_lasso-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "564d5b8717e0ba8f31ee61bda2271d8a", "sha256": "a67ad3d3a965466e532c6de32c873a70ac02319486ee0ae599da3c3cb7f774ed" }, "downloads": -1, "filename": "group-lasso-0.1.2.tar.gz", "has_sig": false, "md5_digest": "564d5b8717e0ba8f31ee61bda2271d8a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16474, "upload_time": "2019-07-15T15:22:24", "url": "https://files.pythonhosted.org/packages/c7/a9/ffb44ed101663c35de9081fcc169e7398e61f320f61e513ef34cebc673c4/group-lasso-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "28dee915f84ec64b04b92dff96fc0758", "sha256": "bf080ecb214e73f110d450e021de67e5994a8a1fffd87b3f6b24a77999d233d5" }, "downloads": -1, "filename": "group_lasso-0.1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "28dee915f84ec64b04b92dff96fc0758", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13095, "upload_time": "2019-07-16T16:12:43", "url": "https://files.pythonhosted.org/packages/bf/7e/e1590c7b4c98631890cf4680ef76fc411a7ee9e8c5678b176951815cba6a/group_lasso-0.1.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3efea2e25d42612a8e28bc2c6f604a90", "sha256": "01d42102ae54f211fb32058a90e32e0a8fdfc1a5fd6ee534febe2cee39701956" }, "downloads": -1, "filename": "group-lasso-0.1.3.tar.gz", "has_sig": false, "md5_digest": "3efea2e25d42612a8e28bc2c6f604a90", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16418, "upload_time": "2019-07-16T16:12:45", "url": "https://files.pythonhosted.org/packages/c3/31/cb36cb4ade6f5e22845955c54aa6f9c3bd90d115fcb32481413735e1e264/group-lasso-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "b5dd2f5d5449fd7455e486ddb12e31a6", "sha256": 
"66ce3aca16065e3b500a53f1c176835f4fd1c53ce01e3f2e7b12b1abc5d015ef" }, "downloads": -1, "filename": "group_lasso-0.1.4-py3-none-any.whl", "has_sig": false, "md5_digest": "b5dd2f5d5449fd7455e486ddb12e31a6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13241, "upload_time": "2019-07-19T11:41:41", "url": "https://files.pythonhosted.org/packages/c1/eb/9ef5a7830b678fb800f8daaec28094cc2ea14f848ab29c5496df8ffac56f/group_lasso-0.1.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "27af83e4a792b93e7782c0da62977739", "sha256": "9dafd91f799853e3bcdc383e86ed785aa6d3e84394cf9e5ea891b673855c6cc4" }, "downloads": -1, "filename": "group-lasso-0.1.4.tar.gz", "has_sig": false, "md5_digest": "27af83e4a792b93e7782c0da62977739", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17064, "upload_time": "2019-07-19T11:41:43", "url": "https://files.pythonhosted.org/packages/98/6b/c2652728a35e44a657c82c2b9b54d63bfbe77f5bdb4e2ceabfa36836f66d/group-lasso-0.1.4.tar.gz" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": "65c2eeb31de3858b3d45380a58d43fd3", "sha256": "3bea06e37000ea28d4e4debc7abcc6b6ae19dc29d5f4ab5ab8e96606ffaece1e" }, "downloads": -1, "filename": "group_lasso-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "65c2eeb31de3858b3d45380a58d43fd3", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13668, "upload_time": "2019-07-27T18:51:04", "url": "https://files.pythonhosted.org/packages/ea/64/d5d558eb1fe72e1cc2937514fcab74cf30324a12ab732094013d2c88a351/group_lasso-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "020701f0ea5ea34e473db9079687bbe5", "sha256": "d47cb6b33391b9939ecfa5d7aea90f42311671e7ab6bc973bd7e5f7bc867e327" }, "downloads": -1, "filename": "group-lasso-1.0.0.tar.gz", "has_sig": false, "md5_digest": "020701f0ea5ea34e473db9079687bbe5", "packagetype": "sdist", "python_version": "source", "requires_python": 
null, "size": 17362, "upload_time": "2019-07-27T18:51:07", "url": "https://files.pythonhosted.org/packages/60/8b/70fe46f0e17f9667a56e3097a8214f6f83e0c8e2234e9c8377edf844313d/group-lasso-1.0.0.tar.gz" } ], "1.1.1": [ { "comment_text": "", "digests": { "md5": "48ab6386e6ef1e4adc38872c18e00205", "sha256": "d549153767960f34a2b457f84dfe5d79e5ee1d3ec5f2a9f354dfa6dec12fd476" }, "downloads": -1, "filename": "group_lasso-1.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "48ab6386e6ef1e4adc38872c18e00205", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13767, "upload_time": "2019-09-05T11:25:37", "url": "https://files.pythonhosted.org/packages/57/15/fa202708fde6607f5f825c94578308d92cde2fd7872e72990b5b8f633c9f/group_lasso-1.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cad3fb6efc3872d6e48ccf6cf2578acd", "sha256": "ddec3979f4a38809100e918541bf000aceb118b93b471659b2a8aa03dac6d310" }, "downloads": -1, "filename": "group-lasso-1.1.1.tar.gz", "has_sig": false, "md5_digest": "cad3fb6efc3872d6e48ccf6cf2578acd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5638067, "upload_time": "2019-09-05T11:25:40", "url": "https://files.pythonhosted.org/packages/0b/cf/03389b6ce50a1989543a986e3cc6014b23af84406cad0693b6dd1767b418/group-lasso-1.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "48ab6386e6ef1e4adc38872c18e00205", "sha256": "d549153767960f34a2b457f84dfe5d79e5ee1d3ec5f2a9f354dfa6dec12fd476" }, "downloads": -1, "filename": "group_lasso-1.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "48ab6386e6ef1e4adc38872c18e00205", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 13767, "upload_time": "2019-09-05T11:25:37", "url": "https://files.pythonhosted.org/packages/57/15/fa202708fde6607f5f825c94578308d92cde2fd7872e72990b5b8f633c9f/group_lasso-1.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"cad3fb6efc3872d6e48ccf6cf2578acd", "sha256": "ddec3979f4a38809100e918541bf000aceb118b93b471659b2a8aa03dac6d310" }, "downloads": -1, "filename": "group-lasso-1.1.1.tar.gz", "has_sig": false, "md5_digest": "cad3fb6efc3872d6e48ccf6cf2578acd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5638067, "upload_time": "2019-09-05T11:25:40", "url": "https://files.pythonhosted.org/packages/0b/cf/03389b6ce50a1989543a986e3cc6014b23af84406cad0693b6dd1767b418/group-lasso-1.1.1.tar.gz" } ] }