{ "info": { "author": "Edesio Alcoba\u00e7a, Felipe Alves Siqueira, Luis Paulo Faina Garcia", "author_email": "edesio@usp.br, felipe.siqueira@usp.br, lpgarcia@icmc.usp.br", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering", "Topic :: Software Development" ], "description": "# pymfe: Python Meta-Feature Extractor\n[![Build Status](https://travis-ci.org/ealcobaca/pymfe.svg?branch=master)](https://travis-ci.org/ealcobaca/pymfe)\n[![codecov](https://codecov.io/gh/ealcobaca/pymfe/branch/master/graph/badge.svg)](https://codecov.io/gh/ealcobaca/pymfe)\n[![Documentation Status](https://readthedocs.org/projects/pymfe/badge/?version=latest)](https://pymfe.readthedocs.io/en/latest/?badge=latest)\n[![PythonVersion](https://img.shields.io/pypi/pyversions/pymfe.svg)](https://www.python.org/downloads/release/python-370/)\n[![Pypi](https://badge.fury.io/py/pymfe.svg)](https://badge.fury.io/py/pymfe)\n\n\nExtracts meta-features from datasets to support the design of recommendation systems based on Meta-Learning (MtL). The meta-features are able to characterize the complexity of datasets and to provide estimates of algorithm performance. The package contains not only the standard characterization measures, but also more recent ones. By making available a large set of meta-feature extraction functions, this package allows comprehensive data characterization, deep data exploration and a large number of MtL-based data analyses.\n\n## Measures\n\nIn MtL, meta-features are designed to extract general properties able to characterize datasets. 
The meta-feature values should provide relevant evidence about the performance of algorithms, allowing the design of MtL-based recommendation systems. Thus, these measures must be able to predict, with a low computational cost, the performance of the algorithms under evaluation. In this package, the meta-feature measures are divided into six groups:\n\n* **General**: General information related to the dataset, also known as simple measures, such as the number of instances, attributes and classes.\n* **Statistical**: Standard statistical measures to describe the numerical properties of a distribution of data.\n* **Information-theoretic**: Particularly appropriate to describe discrete (categorical) attributes and their relationship with the classes.\n* **Model-based**: Measures designed to extract characteristics like the depth, the shape and the size of a Decision Tree (DT) model induced from a dataset.\n* **Landmarking**: Represents the performance of simple and efficient learning algorithms. Includes the subsampling and relative strategies, which decrease the computational cost and enrich the relations between these meta-features.\n* **Clustering**: Clustering measures extract information about the dataset based on external validation indexes.\n\n## Dependencies\n\nThe main `pymfe` requirement is:\n* Python (>= 3.6)\n\n## Installation\n\nThe installation process is similar to other packages available on pip:\n\n```\npip install -U pymfe\n```\n\nIt is possible to install the development version using:\n\n```\npip install -U git+https://github.com/ealcobaca/pymfe\n```\n\nor\n\n```\ngit clone https://github.com/ealcobaca/pymfe.git\ncd pymfe\npython3 setup.py install\n```\n\n## Example of use\n\nThe simplest way to extract meta-features is to instantiate the `MFE` class. The parameters are the measures, the groups of measures and the summarization functions to be used. 
By default, all measures are extracted. The `fit` function can be called by passing `X` and `y`. The `extract` function is used to extract the related measures. A simple example is given next:\n\n```python\n# Load a dataset\nfrom sklearn.datasets import load_iris\nfrom pymfe.mfe import MFE\n\ndata = load_iris()\ny = data.target\nX = data.data\n\n# Extract all measures\nmfe = MFE()\nmfe.fit(X, y)\nft = mfe.extract()\nprint(ft)\n\n# Extract general, statistical and information-theoretic measures\nmfe = MFE(groups=[\"general\", \"statistical\", \"info-theory\"])\nmfe.fit(X, y)\nft = mfe.extract()\nprint(ft)\n```\n\nSeveral measures return more than one value. To aggregate the returned values, summarization functions can be used. These functions include `min`, `max`, `mean`, `median`, `kurtosis` and `standard deviation`, among others. The defaults are `mean` and `sd`. The following example shows how to use them:\n\n```python\n# Extract all measures using min, median and max\nmfe = MFE(summary=[\"min\", \"median\", \"max\"])\nmfe.fit(X, y)\nft = mfe.extract()\nprint(ft)\n\n# Extract all measures using quantiles\nmfe = MFE(summary=[\"quantiles\"])\nmfe.fit(X, y)\nft = mfe.extract()\nprint(ft)\n```\n\nIt is possible to pass custom arguments to every metafeature using the keyword arguments of the MFE `extract` method. The keywords must be the target metafeature name, and the value must be a dictionary in the format {`argument`: `value`}, i.e., each key in the dictionary is a target argument with its respective value. 
In the example below, the extraction of metafeatures `min` and `max` happens as usual, but the metafeatures `sd`, `nr_norm` and `nr_cor_attr` receive custom argument values from the user, which affect the result of each of these metafeatures.\n\n```python\n# Extract measures with custom user arguments\nmfe = MFE(features=[\"sd\", \"nr_norm\", \"nr_cor_attr\", \"min\", \"max\"])\nmfe.fit(X, y)\nft = mfe.extract(\n    sd={\"ddof\": 0},\n    nr_norm={\"method\": \"all\", \"failure\": \"hard\", \"threshold\": 0.025},\n    nr_cor_attr={\"threshold\": 0.6},\n)\nprint(ft)\n```\n\n## Documentation\nWe provide documentation to guide you on how to use the pymfe library. You can find it at this [link](https://pymfe.readthedocs.io/en/latest/?badge=latest).\nUseful pages in the documentation include:\n* [Getting started](https://pymfe.readthedocs.io/en/latest/install.html)\n* [API documentation](https://pymfe.readthedocs.io/en/latest/api.html)\n* [Examples](https://pymfe.readthedocs.io/en/latest/auto_examples/index.html)\n* [News about pymfe](https://pymfe.readthedocs.io/en/latest/new.html)\n\n## Developer notes\n\n* We are glad to accept any contributions; please check [Contributing](CONTRIBUTING.md) and the [Documentation](https://pymfe.readthedocs.io/en/latest/?badge=latest).\n* To submit bugs and feature requests, report at [project issues](https://github.com/ealcobaca/pymfe/issues).\n* In the current version, the meta-feature extractor supports only classification problems. The authors plan to extend the package to add clustering and regression measures and to support MtL evaluation measures. For more specific information on how to extract each group of measures, please refer to the function documentation page and the examples contained therein. 
For a general overview of the `pymfe` package, please have a look at the associated documentation.\n\n## License\n\nThis project is licensed under the MIT License - see the [License](LICENSE) file for details.\n\n## Cite Us\n\nIf you use `pymfe` or [`mfe`](https://github.com/rivolli/mfe) in a scientific publication, we would appreciate citations to the following paper:\n```\n@article{}\n```\n\n## Acknowledgments\nWe would like to thank every [Contributor](https://github.com/ealcobaca/pymfe/graphs/contributors) who has directly or indirectly helped make this project happen. Thank you all.\n\n## References\n\n1. Rivolli, A., Garcia, L. P. F., Soares, C., Vanschoren, J., & de Carvalho, A. C. P. L. F. (2018). Towards Reproducible Empirical Research in Meta-Learning. arXiv:1808.10406.\n2. Pinto, F., Soares, C., & Mendes-Moreira, J. (2016, April). Towards automatic generation of metafeatures. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 215-226). Springer, Cham.\n3. Brazdil, P., Carrier, C. G., Soares, C., & Vilalta, R. (2008). Metalearning: Applications to data mining. 
Springer Science & Business Media.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/ealcobaca/pymfe/releases", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ealcobaca/pymfe", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "pymfe", "package_url": "https://pypi.org/project/pymfe/", "platform": "", "project_url": "https://pypi.org/project/pymfe/", "project_urls": { "Download": "https://github.com/ealcobaca/pymfe/releases", "Homepage": "https://github.com/ealcobaca/pymfe" }, "release_url": "https://pypi.org/project/pymfe/0.1.0/", "requires_dist": [ "numpy", "scipy", "sklearn", "patsy", "pandas", "statsmodels", "pytest ; extra == 'code-check'", "mypy ; extra == 'code-check'", "liac-arff ; extra == 'code-check'", "flake8 ; extra == 'code-check'", "pylint ; extra == 'code-check'", "sphinx ; extra == 'docs'", "sphinx-gallery ; extra == 'docs'", "sphinx-rtd-theme ; extra == 'docs'", "numpydoc ; extra == 'docs'", "liac-arff ; extra == 'docs'", "pytest ; extra == 'tests'", "pytest-cov ; extra == 'tests'", "liac-arff ; extra == 'tests'" ], "requires_python": "", "summary": "Meta-feature Extractor", "version": "0.1.0" }, "last_serial": 5793854, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "2b836dfa32a08bcc0721f9fe93639f05", "sha256": "bf6bf1330aa40d7c9339562b23820447891ee6913855dcdb23a7608d5557944b" }, "downloads": -1, "filename": "pymfe-0.0.1.tar.gz", "has_sig": false, "md5_digest": "2b836dfa32a08bcc0721f9fe93639f05", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 55728, "upload_time": "2019-04-05T19:42:44", "url": "https://files.pythonhosted.org/packages/cd/4b/02cb2024f07b12e13ec2bd2341b27a65a6c927143eb6edba35a0f518d0c1/pymfe-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "2c4026b37974f99397446bb88f389384", "sha256": 
"66d0ccd138cf9fc146cf7635d8c1f14c00c3049a96747475fe6f62133b7afdf5" }, "downloads": -1, "filename": "pymfe-0.0.2.tar.gz", "has_sig": false, "md5_digest": "2c4026b37974f99397446bb88f389384", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 55647, "upload_time": "2019-04-05T20:19:06", "url": "https://files.pythonhosted.org/packages/a4/23/d3898177978abcac463fe0c121650329a7249abd9f3f8c8804ba2cd2abe5/pymfe-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "400d57659d4e3277260c4138796221ba", "sha256": "6dc9964cb038a12c523ba1b92225970488ed35ac3622dfbd2d4c13cff760aea2" }, "downloads": -1, "filename": "pymfe-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "400d57659d4e3277260c4138796221ba", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 64536, "upload_time": "2019-05-02T20:44:59", "url": "https://files.pythonhosted.org/packages/6f/f8/3def2324872a07a83ac15d555130b09ba9bf6cca2e1f58eb00c89aa2d6cf/pymfe-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ea431e383ed09018215c56081f704c61", "sha256": "11f460400d5029c0bcb41bb9c16e0815b97378a71b42b3b81a7c8376778679cd" }, "downloads": -1, "filename": "pymfe-0.0.3.tar.gz", "has_sig": false, "md5_digest": "ea431e383ed09018215c56081f704c61", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 55726, "upload_time": "2019-05-02T20:45:01", "url": "https://files.pythonhosted.org/packages/03/94/6bc4f49db9770fa5b31b86b775a662e88d7a42ef42471b3cc10066558495/pymfe-0.0.3.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "26883fb99270da656b872725429a2206", "sha256": "f9b9620f43bdbf73f23ccd1b8fa22540dbd31de0bbcd2f9e194cef867f8a071c" }, "downloads": -1, "filename": "pymfe-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "26883fb99270da656b872725429a2206", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 93089, "upload_time": 
"2019-09-06T20:41:40", "url": "https://files.pythonhosted.org/packages/3e/26/42d309b18cf2108352985358eef57fb83ff545dcd9c7b5772689b7fe2cf9/pymfe-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2b69dc20f44e2173b3363d2bc6f2ea35", "sha256": "4d073fad430f94bc842bd700ebd5968b8d92d6ef503f5d58ac5cfa59c4a4a2d6" }, "downloads": -1, "filename": "pymfe-0.1.0.tar.gz", "has_sig": false, "md5_digest": "2b69dc20f44e2173b3363d2bc6f2ea35", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 79528, "upload_time": "2019-09-06T20:41:42", "url": "https://files.pythonhosted.org/packages/a8/67/98de950f0bb70d4b0c511a0829bfbe25810181c0f927905af4e9f8f72e52/pymfe-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "26883fb99270da656b872725429a2206", "sha256": "f9b9620f43bdbf73f23ccd1b8fa22540dbd31de0bbcd2f9e194cef867f8a071c" }, "downloads": -1, "filename": "pymfe-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "26883fb99270da656b872725429a2206", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 93089, "upload_time": "2019-09-06T20:41:40", "url": "https://files.pythonhosted.org/packages/3e/26/42d309b18cf2108352985358eef57fb83ff545dcd9c7b5772689b7fe2cf9/pymfe-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2b69dc20f44e2173b3363d2bc6f2ea35", "sha256": "4d073fad430f94bc842bd700ebd5968b8d92d6ef503f5d58ac5cfa59c4a4a2d6" }, "downloads": -1, "filename": "pymfe-0.1.0.tar.gz", "has_sig": false, "md5_digest": "2b69dc20f44e2173b3363d2bc6f2ea35", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 79528, "upload_time": "2019-09-06T20:41:42", "url": "https://files.pythonhosted.org/packages/a8/67/98de950f0bb70d4b0c511a0829bfbe25810181c0f927905af4e9f8f72e52/pymfe-0.1.0.tar.gz" } ] }