{ "info": { "author": "Ulf Hamster", "author_email": "554c46@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "[![Build Status](https://travis-ci.org/kmedian/binsel.svg?branch=master)](https://travis-ci.org/kmedian/binsel)\n[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/kmedian/binsel/master?urlpath=lab)\n\n# binsel\nFeature selection for Hard Voting classifier.\n\n\n## Table of Contents\n* [Installation](#installation)\n* [Usage](#usage)\n* [Algorithm](#algorithm)\n* [Commands](#commands)\n* [Support](#support)\n* [Contributing](#contributing)\n\n\n## Installation\nThe `binsel` [git repo](http://github.com/kmedian/binsel) is available as a [PyPI package](https://pypi.org/project/binsel).\n\n```\npip install binsel\n```\n\n\n## Usage\nSee the [`binsel_hardvote` example](http://github.com/kmedian/binsel/examples/binsel_hardvote.ipynb) notebook in the examples folder.\n\n\n## Algorithm\nThe task is to select a small number of binary features (e.g. `n_select=3`) from a large pool of binary features.\nThese binary features might be the predictions of binary classifiers.\nThe selected binary features are then combined into one hard-voting classifier.\n\nA voting classifier should have the following properties:\n\n* each voter (a binary feature) should be highly correlated with the target variable\n* the selected binary features should be uncorrelated with each other\n\nThe algorithm works as follows:\n\n1. Generate multiple correlation matrices by bootstrapping (see [`korr.bootcorr`](https://github.com/kmedian/korr/blob/master/korr/bootcorr.py)). This includes computing `corr(X_i, X_j)` as well as `corr(Y, X_i)`. Also store the out-of-bag (OOB) samples for evaluation.\n2. For each correlation matrix:\n a. Preselect the features `i*` with the highest `abs(corr(Y, X_i))` estimates (e.g. pick the `n_pre=?` highest absolute correlations).\n b. Slice the correlation matrix `corr(X_i*, X_j*)` and find the least correlated combination of `n_select=?` features 
(see [`korr.mincorr`](https://github.com/kmedian/korr/blob/master/korr/mincorr.py))\n c. Compute the out-of-bag (OOB) performance (see step 1) of the hard-voter with the selected `n_select=?` binary features\n3. Select the binary feature combination with the best OOB performance as the final model.\n\n\n## Commands\n* Check syntax: `flake8 --ignore=F401`\n* Run unit tests: `python -W ignore -m unittest discover`\n* Remove `.pyc` files: `find . -type f -name \"*.pyc\" | xargs rm`\n* Remove `__pycache__` folders: `find . -type d -name \"__pycache__\" | xargs rm -rf`\n* Upload to PyPI with twine: `python setup.py sdist && twine upload -r pypi dist/*`\n\n\n## Support\nPlease [open an issue](https://github.com/kmedian/binsel/issues/new) for support.\n\n\n## Contributing\nPlease contribute using [GitHub Flow](https://guides.github.com/introduction/flow/). Create a branch, add commits, and [open a pull request](https://github.com/kmedian/binsel/compare/).", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/kmedian/binsel", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "binsel", "package_url": "https://pypi.org/project/binsel/", "platform": "", "project_url": "https://pypi.org/project/binsel/", "project_urls": { "Homepage": "http://github.com/kmedian/binsel" }, "release_url": "https://pypi.org/project/binsel/0.2.1/", "requires_dist": null, "requires_python": ">=3.6", "summary": "Feature selection for Hard Voting classifier", "version": "0.2.1" }, "last_serial": 5719233, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "37e1c96f96ff44701d3c7869f16ad80e", "sha256": "3ade199552c1bb9a01543d7f180ed8a448862bb17da5569bc00331def576e38d" }, "downloads": -1, "filename": "binsel-0.1.0.tar.gz", "has_sig": false, "md5_digest": "37e1c96f96ff44701d3c7869f16ad80e", "packagetype": "sdist", "python_version": 
"source", "requires_python": ">=3.6", "size": 9230, "upload_time": "2019-07-01T13:34:38", "url": "https://files.pythonhosted.org/packages/fb/62/b09b6b17dd9b9887e35c653bc2ac3d48b17e28e0650007e09f53e24fef32/binsel-0.1.0.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "bd9475e454df2899ca95922e34465371", "sha256": "85c628d6a451263058c9cd5d13a62b1ba7bc1642eac3407f433f081fe63ffc45" }, "downloads": -1, "filename": "binsel-0.1.2.tar.gz", "has_sig": false, "md5_digest": "bd9475e454df2899ca95922e34465371", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 18742, "upload_time": "2019-08-15T09:36:25", "url": "https://files.pythonhosted.org/packages/87/74/8f3af1fb92aadd6ae681be4c1662c66af4c688d46561f767bb960ab5a8e3/binsel-0.1.2.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "4cac7d1708c899d1dac1c2ac3709b4c8", "sha256": "45ee5204c183aa39785f6272f685197e6e4566585cd5fe519fe92e8bcc6ec77a" }, "downloads": -1, "filename": "binsel-0.2.0.tar.gz", "has_sig": false, "md5_digest": "4cac7d1708c899d1dac1c2ac3709b4c8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 398675, "upload_time": "2019-08-23T06:55:24", "url": "https://files.pythonhosted.org/packages/db/14/d5ba52ed1164b6843bede9e050e1e5590ee7b936e031fb8924c3f2d896be/binsel-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "58f578e63e9fd8d9269c60e51dd318ab", "sha256": "d06f32d0f67c2f544a7ae37bb87265efe00277f5733c21bcfbef86a0256d630b" }, "downloads": -1, "filename": "binsel-0.2.1.tar.gz", "has_sig": false, "md5_digest": "58f578e63e9fd8d9269c60e51dd318ab", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 398736, "upload_time": "2019-08-23T07:06:43", "url": "https://files.pythonhosted.org/packages/e4/13/d676a2e003bd723271c67a9bc91e91524df623c03f7591ef2d43383d802d/binsel-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": 
"58f578e63e9fd8d9269c60e51dd318ab", "sha256": "d06f32d0f67c2f544a7ae37bb87265efe00277f5733c21bcfbef86a0256d630b" }, "downloads": -1, "filename": "binsel-0.2.1.tar.gz", "has_sig": false, "md5_digest": "58f578e63e9fd8d9269c60e51dd318ab", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 398736, "upload_time": "2019-08-23T07:06:43", "url": "https://files.pythonhosted.org/packages/e4/13/d676a2e003bd723271c67a9bc91e91524df623c03f7591ef2d43383d802d/binsel-0.2.1.tar.gz" } ] }