{ "info": { "author": "Siyuan Li", "author_email": "siyuan.li@fitchratings.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.7" ], "description": "Function List:\n update_progress(job_title, progress):\n show status bar of long process. Operation purpose\n\n SummarizeDefault(raw_data,default_flag_column,by_vars,plot=False,continuous_var=False,n_bin=10):\n Summarize default data by different variables\n\n raw_data: pandas.DataFrame. The dataset contains the default column and groupby variable\n default_flag_column: String. column name of the default flag in dataset\n by_vars: String. Column name of the groupby variable\n continuous_var: Boolean. Default as False. If select as True, function will bucket the variable to do groupby.\n n_bin: int. If contunuous_var is set as True, function will bucket the variable into \"n_bin\" bins\n plot: Boolean. Default as False. If select True, function will generate graph\n\n return a DataFrame\n\n CalcROCPrep(s, default_flag_data):\n Preparation for ROC calculation. Not directly into result\n\n CalcROCStats(s,default_flag_data):\n Calculation of ROC stats\n s: pandas.DataFrame. Score generated from regression model\n default_flag_data: pandas.Dataframe. Default flag data from real transactions.\n\n return a dictionary of 3 stats\n\n CalcPredAccuracy(s,default_flag_data,n_bin=10,plot=False,is_default_rate=True):\n Compare the generated default probability with real default flag\n s: pandas.DataFrame. Score generated from regression model\n default_flag_data: pandas.Dataframe. Default flag data from real transactions.\n n_bin: int. default flag will be grouped into \"n_bin\" bins\n plot: Boolean. Default as False. If select True, function will generate graph\n is_default_rate. Boolean. Default as True. Select False only if this function is used to compare two series that are NOT between (0,1)\n\n return dataframe of bucket score VS actual default\n\n CalcModelROC(raw_data,response_var,explanatory_var_1,explanatory_var_2=\"\",explanatory_var_3=\"\",explanatory_var_4=\"\",explanatory_var_5=\"\",explanatory_var_6=\"\",explanatory_var_7=\"\",explanatory_var_8=\"\",explanatory_var_9=\"\",explanatory_var_10=\"\"):\n Directly generate ROC stats from a given model. Not suggested to use because of explanatory variable number limitation.\n raw_data: pandas.DataFrame. The dataset contains all variables used in model generation.\n response_var: string. Column name of response var.\n explanatory_var_1:string. Column name of response var.\n explanatory_var_2-10:string.Optional. Column name of response var.\n\n return dataframe\n\n CalcWOE_IV(data,default_flag_column,categorical_column,event_list=[1,\"Yes\"],continuous_variable=False,n_bucket=10):\n Calculate IV of each variable\n Calculation refer to: https://www.listendata.com/2015/03/weight-of-evidence-woe-and-information.html\n data: pandas.DataFrame. dataset that contains all variables\n default_flag_column: String. column name of the default flag in dataset\n categorical_column:String. column name of the categorical column\n continuous_var: Boolean. Default as False. If select as True, function will bucket the variable to do groupby.\n n_bin: int. If contunuous_var is set as True, function will bucket the variable into \"n_bin\" bins\n event_list: list. The list contains all possibility than can be considered as \"event\"\n\n return [IV value of certain variable, table of calculation]\n\n PerformRandomCV(data,response_var,explanatory_var_list=[],k=20,plot=True,sampling_as_default_percentage=False):\n Do Random Cross Validation of model to check model stability. \n Return standard deviation of each coefficients.\n raw_data: pandas.DataFrame. The dataset contains all variables used in model generation.\n response_var: string. Column name of response var.\n explanatory_var_list: list. List of all names of explanatory variables\n k: int. 1/k samples will be eliminated as train set.\n plot: Boolean. Default as True. Plot std. dev of each coefficient.\n sampling_as_default_percentage: Boolean. Default as False. If selected as True, sampling will be done as the default rate.\n\n return std.dev of coefficients during k-time regression\n\n PerformK_Fold(data,response_var, explanatory_var_list=[],k=10):\n Perform k-fold cross validation\n data: pandas.DataFrame. dataset that contains all variables.\n response_var: string. column name of response variable\n explanatory_var_list: list of strings. list that contains all explanatory variables\n\n return array of each fold score\n\n BestSubsetSelection(data,response_var,explanatory_var_list=[],plot=True):\n Run best subset selection\n data: pandas.DataFrame. raw dataset\n response_var: string. Column name of response var\n explanatory_var_list: list of string. Names of columns of all explanatory candidates.\n plot: If selected as True, it will plot the selection steps graph.\n\n return all combinations' RSS and R_squared\n\n ForwardStepwiseSelection(data,response_var,explanatory_var_list=[],plot=True):\n Do forward stepwise selection\n data: pandas.DataFrame. raw dataset\n response_var: string. Column name of response var\n explanatory_var_list: list of string. Names of columns of all explanatory candidates.\n plot: if selected as True, it will plot the selection steps graph\n\n return dataframe with AIC,BIC,R_squared of each variable combination\n\n CalcSpearmanCorrelation(data,var_1,var_2):\n Calculate spearman Correlation\n data: pandas.DataFrame. Contains var_1 and var_2\n var_1,var_2: string. column name of var_1 and var_2\n\n return float\n\n CalcKendallTau(data,var_1,var_2):\n Calculate KendallTau\n data: pandas.DataFrame. Contains var_1 and var_2\n var_1,var_2: string. column name of var_1 and var_2\n\n return float\n\n CalcSamplingGoodmanKruskalGamma(data,var_1, var_2,n_sample=1000):\n Calculate GoodmanKruskalGamma estimator with sampling data. Not suggest to use into report as this is just an estimator.\n data: pandas.DataFrame. Contains var_1 and var_2\n var_1,var_2: string. column name of var_1 and var_2\n\n return float\n\n CalcSomersD(data,var_1,var_2):\n Calculate Somers' D\n data: pandas.DataFrame. Contains var_1 and var_2\n var_1,var_2: string. column name of var_1 and var_2\n\n return float\n\n GenerateRankCorrelation(data,var_1,var_2):\n Calculate all 4 ranked correlation.\n data: pandas.DataFrame. Contains var_1 and var_2\n var_1,var_2: string. column name of var_1 and var_2. The order of var_1 and var_2 can not change.\n\n return float\n\n\n\tOLSForwardStepwise(data,response_var,explanatory_var_list=[],plot=True):\n\t\tPerform forward stepwise selection for OLS regression\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/siyuan-columbia/FitchValidationTool", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "ValidationTool", "package_url": "https://pypi.org/project/ValidationTool/", "platform": "", "project_url": "https://pypi.org/project/ValidationTool/", "project_urls": { "Homepage": "https://github.com/siyuan-columbia/FitchValidationTool" }, "release_url": "https://pypi.org/project/ValidationTool/1.0.0/", "requires_dist": null, "requires_python": "", "summary": "Fitch Ratings MVG Validation Tool for criteria models", "version": "1.0.0" }, "last_serial": 5784122, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "ccf981355ab65400837812929044d128", "sha256": "4ca38f82b88efc5ee15066f0eef1527cc9e228f9ffc4279a9bb12aa5eace38bc" }, "downloads": -1, "filename": "ValidationTool-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "ccf981355ab65400837812929044d128", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4026, "upload_time": "2019-09-05T01:24:19", "url": "https://files.pythonhosted.org/packages/67/96/872329d5aaa098fe24bf209b06578d17cf86e9148182f56927bed844c46f/ValidationTool-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cd08fab2b15e6e75112bb1ad6a59e9bc", "sha256": "8f676885d34a37d72422705e61bc32a5708856e86aaca9a7d852894e01731de7" }, "downloads": -1, "filename": "ValidationTool-1.0.0.tar.gz", "has_sig": false, "md5_digest": "cd08fab2b15e6e75112bb1ad6a59e9bc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3302, "upload_time": "2019-09-05T01:24:22", "url": "https://files.pythonhosted.org/packages/f6/0b/f0489eb3dbeaa0e2f8f79bbb011d922e314aba37dc99e49bc11232493ed4/ValidationTool-1.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ccf981355ab65400837812929044d128", "sha256": "4ca38f82b88efc5ee15066f0eef1527cc9e228f9ffc4279a9bb12aa5eace38bc" }, "downloads": -1, "filename": "ValidationTool-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "ccf981355ab65400837812929044d128", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4026, "upload_time": "2019-09-05T01:24:19", "url": "https://files.pythonhosted.org/packages/67/96/872329d5aaa098fe24bf209b06578d17cf86e9148182f56927bed844c46f/ValidationTool-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cd08fab2b15e6e75112bb1ad6a59e9bc", "sha256": "8f676885d34a37d72422705e61bc32a5708856e86aaca9a7d852894e01731de7" }, "downloads": -1, "filename": "ValidationTool-1.0.0.tar.gz", "has_sig": false, "md5_digest": "cd08fab2b15e6e75112bb1ad6a59e9bc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3302, "upload_time": "2019-09-05T01:24:22", "url": "https://files.pythonhosted.org/packages/f6/0b/f0489eb3dbeaa0e2f8f79bbb011d922e314aba37dc99e49bc11232493ed4/ValidationTool-1.0.0.tar.gz" } ] }