{ "info": { "author": "DiegoSong, TianjinMouth", "author_email": "ssyshenn@gmail.com, tjszgaosan@sina.cn", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "[![Contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CONTRIBUTING.md)\n[![GitHub top language](https://img.shields.io/github/languages/top/TianjinMouth/Tea)](https://img.shields.io/github/languages/top/TianjinMouth/TeaML)\n[![GitHub Issues](https://img.shields.io/github/issues/TianjinMouth/Tea.svg)](https://github.com/TianjinMouth/TeaML/issues)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/TianjinMouth/TeaML/blob/master/LICENSE)\n\n# **TeaML - Automated Modeling in Financial Domain**\n\n\ud83c\udf89\ud83c\udf89\ud83c\udf89 We are proud to announce that we have designed an automated modeling robot for the `financial risk control field`! \ud83c\udf89\ud83c\udf89\ud83c\udf89\n\n## Table of Contents\n- [Overview](#overview)\n- [Our Goal](#our-goal)\n- [Performance](#performance)\n- [Quick start](#quick-start)\n- [Support](#support)\n\n## \ud83d\udce3 Overview\n\nTeaML is a simple, design-friendly automated modeling framework.\nIt can build a model automatically from beginning to end, and it then outputs a report about that model.\n\n- **Human-friendly**. TeaML's code is straightforward, well documented, and tested, which makes it easy to understand and modify.\n- **Built for financial risk control**. TeaML has built-in support for techniques used in financial risk control, including WOE (weight of evidence), so it fits this scenario well.\n- **Flexible**. TeaML provides a variety of variable selection methods, each of which can be customized. 
You can also assemble these algorithms in a different order. \n- **Final Report**. TeaML produces a final model report, so you can inspect the details of your model. \n\n## \u2728 Our Goal\n\n- **Automation** In the near future, we will update and add more algorithms, including but not limited to variable generation (VariableCluster is already available as an experimental feature).\n- **Common Use** All algorithm engineers, including model analysts, can use it to increase efficiency, as long as they have some algorithmic knowledge.\n- **Wonderful thing** We hope to add many more wonderful things. This version has no optimization algorithm or parallel strategy yet. We will try to add them in later iterations, hopefully before too long.\n\n## \u23f3 Performance\n\n| Task | Strategy | Dataset | Score | Detail |\n| ------------------------------------------- | -------- | ------------------------- | ---------------- | -------------------------- |\n| Predicting the Delay Rate of Financial Risk | TeaML | Financial Risk Data | **0.6894** (AUC) | WOE(Monotonic) + STEPWISE |\n| Predicting the Delay Rate of Financial Risk | LightGBM | Financial Risk Data | **0.6773** (AUC) | LightGBM |\n\n\n## \ud83d\udcdd Quick start\n\n### Requirements and Installation\n\n\nThe project is based on **Python 3.7**. **Python 3.6** may also work, but it is not fully tested, so not all functions are guaranteed to work.\n\nIf you haven't installed **lightgbm**, you need to install it yourself.\n\n```bash\npip install TeaML\n```\n\n### Example Usage\n\nLet's run a simple example.\n\n```python\n# Explicit imports for names used below, in case the wildcard imports do not re-export them\nimport pandas as pd\nfrom sklearn.linear_model import LogisticRegression\n\nfrom TeaML.utils.tea_encoder import *\nfrom TeaML.utils.tea_filter import *\nfrom TeaML.utils.tea_utils import *\nfrom TeaML.utils.auto_bin_woe import *\nimport TeaML\n\ndata = pd.read_csv(\"TeaML/examples.csv\")\n\n# encoder\nct = TeaBadRateEncoder(num=1)\nme = TeaMeanEncoder(categorical_features=['city'])\nt = TeaOneHotEncoder()\nencoder = 
[me]\n\n# woe & feature selection\nwoe = TeaML.WOE(bins=10, bad_rate_merge=True, bad_rate_sim_threshold=0.05, psi_threshold=0.1, iv_threshold=None)\niv = FilterIV(200, 100)\nvif = FilterVif(50)\nmod = FilterModel('lr', 70)\nnova = FilterANOVA(40, 30)\ncoline = FilterCoLine({'penalty': 'l2', 'C': 0.01, 'fit_intercept': True})\nfshap = FilterSHAP(70)\noutlier = OutlierTransform()\nfiltercor = FilterCorr(20)\nstepwise = FilterStepWise(method='p_value')\nmethod = [woe, stepwise]\n\n# main\ntea = TeaML.Tea(['core_lend_request_id', 'lend_customer_id', 'customer_sex',\n 'data_center_id', 'trace_back_time', 'mobile', 'user_id', 'id_no', 'task_id', 'id',\n 'id_district_name', 'id_province_name', 'id_city_name', 'pass_time'],\n 'is_overdue_M0',\n datetime_feature='pass_time',\n split_method='oot',\n file_path='report.xlsx')\ntea.wash(data, null_drop_rate=0.8, zero_drop_rate=0.9)\ntea.cook(encoder)\ntea.select(method)\ntea.drink(LogisticRegression(penalty='l2', C=1, class_weight='balanced'))\ntea.sleep(woe.bins)\n\n\n'''\nPreliminary screening...\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 19/19 [00:00<00:00, 29.19it/s]\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 19/19 [00:00<00:00, 50.03it/s]\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 19/19 [00:00<00:00, 55.00it/s]\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 19/19 [00:00<00:00, 104.02it/s]\n 0%| | 0/19 [00:003.6", "summary": "Automated Modeling in Financial Domain. 
TeaML is a simple and design friendly automatic modeling learning framework.", "version": "0.1.0" }, "last_serial": 5942949, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "d344f70be99ad200dbbb338a3bf7aebc", "sha256": "020fc7308f7adf959a94a31fb1d406e36b345f8972c52e232ff19a9354f2545e" }, "downloads": -1, "filename": "TeaML-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "d344f70be99ad200dbbb338a3bf7aebc", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">3.6", "size": 8792028, "upload_time": "2019-10-08T05:33:53", "url": "https://files.pythonhosted.org/packages/97/31/e37020ce0826731983ac402f9c82fd3c9167756c3cb227bccb6bb36c7b8f/TeaML-0.1.0-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d344f70be99ad200dbbb338a3bf7aebc", "sha256": "020fc7308f7adf959a94a31fb1d406e36b345f8972c52e232ff19a9354f2545e" }, "downloads": -1, "filename": "TeaML-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "d344f70be99ad200dbbb338a3bf7aebc", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">3.6", "size": 8792028, "upload_time": "2019-10-08T05:33:53", "url": "https://files.pythonhosted.org/packages/97/31/e37020ce0826731983ac402f9c82fd3c9167756c3cb227bccb6bb36c7b8f/TeaML-0.1.0-py3-none-any.whl" } ] }