{ "info": { "author": "pymetrics Data Team", "author_email": "data@pymetrics.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: Financial and Insurance Industry", "Intended Audience :: Healthcare Industry", "Intended Audience :: Legal Industry", "Intended Audience :: Other Audience", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 3", "Topic :: Scientific/Engineering", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Scientific/Engineering :: Visualization" ], "description": "# audit-AI\n\n\n\nOpen Sourced Bias Testing for Generalized Machine Learning Applications\n\n`audit-AI` is a Python library built on top of `pandas` and `sklearn` that\nimplements fairness-aware machine learning algorithms. `audit-AI` was developed\nby the Data Science team at [pymetrics](https://www.pymetrics.com/).\n\n# Bias Testing for Generalized Machine Learning Applications\n\n`audit-AI` is a tool to measure and mitigate the effects of discriminatory\npatterns in training data and the predictions made by machine learning\nalgorithms trained for the purposes of socially sensitive decision processes.\n\nThe overall goal of this research is to come up with a reasonable way to think\nabout how to make machine learning algorithms more fair. 
While identifying\npotential bias in training datasets and by consequence the machine learning\nalgorithms trained on them is not sufficient to solve the problem of\ndiscrimination, in a world where more and more decisions are being automated\nby Artificial Intelligence, our ability to understand and identify the degree\nto which an algorithm is fair or biased is a step in the right direction.\n\n# Regulatory Compliance and Checks for Practical and Statistical Bias\n\nAccording to the Uniform Guidelines on Employee Selection Procedures (UGESP;\nEEOC et al., 1978), all assessment tools should comply with a fair standard of\ntreatment for all protected groups. Audit-ai extends this to machine learning\nmethods. Let's say we build a model that makes some prediction about people.\nThis model could theoretically be anything -- a prediction of credit scores,\nthe likelihood of prison recidivism, the cost of a home loan, etc. Audit-ai\ntakes data from a known population (e.g., credit information from people of\nmultiple genders and ethnicities), and runs them through the model in question.\nThe proportional pass rates of the highest-passing demographic group are compared\nto the lowest-passing group for each demographic category (gender and\nethnicity). This proportion is known as the bias ratio.\n\nAudit-ai determines whether groups are different according to a standard of\nstatistical significance (whether a difference falls outside the margin of\nerror) or practical significance (whether a difference is large enough to matter\non a practical level). The exact threshold of statistical and practical significance\ndepends on the field and use-case. Within the hiring space, the EEOC often\nuses a statistical significance of p < .05 to determine bias, and a bias ratio\nbelow the 4/5ths rule to demonstrate practical significance.\n\nThe 4/5ths rule effectively states that the pass rate of the lowest-passing\ngroup has to be at least 4/5ths of the pass rate of the highest-passing group. 
Consider an example\nwith 4,000 users, 1,000 of each of the following groups: Asian, Black,\nHispanic/Latino, and White, who pass at a frequency of 250, 270, 240 and 260\nusers, respectively. The highest and lowest passing groups are Black (27%) and\nHispanic/Latino (24%), respectively. The bias ratio is therefore 24/27 or .889.\nAs this ratio is greater than .80 (4/5ths), the legal requirement enforced by\nthe EEOC, the model would pass the check for practical significance. Likewise,\na chi-squared test (a common statistical test for count data) would report that\nthese groups are above the p = .05 threshold, and therefore pass the check for\nstatistical significance.\n\nAudit-ai also offers tools to check for differences over time or across\ndifferent regions, using the Cochran-Mantel-Haenszel test, a common test in\nregulatory circles. To our knowledge this is the first implementation of this\nmeasure in an open-source Python format.\n\n# Features\n\nHere are a few of the bias testing and algorithm auditing techniques\nthat this library implements.\n\n### Classification tasks\n\n- 4/5th, fisher, z-test, bayes factor, chi squared\n- sim_beta_ratio, classifier_posterior_probabilities\n\n### Regression tasks\n\n- anova\n- 4/5th, fisher, z-test, bayes factor, chi squared\n- group proportions at different thresholds\n\n# Installation\n\nThe source code is currently hosted on GitHub: https://github.com/pymetrics/audit-ai\n\nYou can install the latest released version with `pip`.\n\n```\n# pip\npip install audit-AI\n```\n\nIf you install with pip, you'll need to install scikit-learn, numpy, and pandas\nwith either pip or conda. 
Version requirements:\n\n- numpy\n- scipy\n- pandas\n\nFor visualization:\n- matplotlib\n- seaborn\n\n# How to use this package:\n\n```python\n\nfrom auditai.misc import bias_test_check\n\nX = df.loc[:,features]\ny_pred = clf.predict_proba(X)\n\n# test for bias\nbias_test_check(labels=df['gender'], results=y_pred, category='Gender')\n\n# Output:\n# Gender passes 4/5 test, Fisher p-value, Chi-Squared p-value, z-test p-value and Bayes Factor at 50.00\n\n```\nTo get a plot of the different tests at different thresholds:\n\n```python\n\nfrom auditai.viz import plot_threshold_tests\n\nX = df.loc[:,features]\ny_pred = clf.predict_proba(X)\n\n# plot the bias tests across thresholds\nplot_threshold_tests(labels=df['gender'], results=y_pred, category='Gender')\n\n```\n\n# Example Datasets\n\n- [german-credit](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data))\n- [student-performance](https://archive.ics.uci.edu/ml/datasets/student+performance)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pymetrics/audit-ai", "keywords": "audit,adverse impact,artificial intelligence,machine learning,fairness,bias,accountability,transparency,discrimination", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "audit-AI", "package_url": "https://pypi.org/project/audit-AI/", "platform": "", "project_url": "https://pypi.org/project/audit-AI/", "project_urls": { "Company": "https://www.pymetrics.com/science/", "Homepage": "https://github.com/pymetrics/audit-ai" }, "release_url": "https://pypi.org/project/audit-AI/0.0.3/", "requires_dist": [ "numpy", "scipy", "pandas", "matplotlib", "statsmodels", "nose; extra == 'dev'", "coverage; extra == 'dev'", "flake8; extra == 'dev'", "detox; extra == 'dev'" ], "requires_python": "", "summary": "audit-AI detects demographic differences in the output of machine learning models or other assessments", 
"version": "0.0.3" }, "last_serial": 5381478, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "d25861feaf008479e94a286ae6dff38b", "sha256": "a5a40ba383b2184792508ef2b4396b839f626d0a4887b664c1800c6602531ef7" }, "downloads": -1, "filename": "audit_AI-0.0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "d25861feaf008479e94a286ae6dff38b", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 23087, "upload_time": "2018-05-18T20:29:10", "url": "https://files.pythonhosted.org/packages/06/70/dec9ffba6826e82a05ac48a74b845dbdfc22382e32c4d1da8eac09a0178e/audit_AI-0.0.1-py2.py3-none-any.whl" } ], "0.0.1a1": [ { "comment_text": "", "digests": { "md5": "f1ed4be4ab93c6f70816382b9fcb9df7", "sha256": "9c50ae34f5f322440affad01f753d899e2230e24ea2e5d25edc19da0d5654371" }, "downloads": -1, "filename": "audit_AI-0.0.1a1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "f1ed4be4ab93c6f70816382b9fcb9df7", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 23116, "upload_time": "2018-05-18T17:52:34", "url": "https://files.pythonhosted.org/packages/f8/71/8063187367abf31d1e358e22f24cb4625f9dfd6c9bdfdfa1fcebd1c2115a/audit_AI-0.0.1a1-py2.py3-none-any.whl" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "824c751aa14802397f3de3e9e2c6deb2", "sha256": "5bc193361cabb3a1f6804a3d4f5fa1812a860d9b3b1ac81325ae78a334623067" }, "downloads": -1, "filename": "audit_AI-0.0.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "824c751aa14802397f3de3e9e2c6deb2", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 28715, "upload_time": "2019-06-10T15:22:26", "url": "https://files.pythonhosted.org/packages/a3/74/ab201ec9c5f670422e2d1cb11bfa9109aad93946cb68c6e421dc1d17f67e/audit_AI-0.0.3-py2.py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "824c751aa14802397f3de3e9e2c6deb2", "sha256": 
"5bc193361cabb3a1f6804a3d4f5fa1812a860d9b3b1ac81325ae78a334623067" }, "downloads": -1, "filename": "audit_AI-0.0.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "824c751aa14802397f3de3e9e2c6deb2", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 28715, "upload_time": "2019-06-10T15:22:26", "url": "https://files.pythonhosted.org/packages/a3/74/ab201ec9c5f670422e2d1cb11bfa9109aad93946cb68c6e421dc1d17f67e/audit_AI-0.0.3-py2.py3-none-any.whl" } ] }