{ "info": { "author": "olcaytaner", "author_email": "olcay.yildiz@ozyegin.edu.tr", "bugtrack_url": null, "classifiers": [], "description": "Machine Learning\n============\n\nMachine learning is about optimization. In learning via optimization, we define an error function (loss function) on the model and try to optimize it, i.e., try to find the optimal parameters of the model by optimization techniques borrowed from optimization literature. \n\nMachine learning is about algorithms. In decision/regression trees, we need to write a recursive-learning algorithm to classify the data arriving at each decision node, and we also need to write a recursive code to generate the decision tree structure.\n\nMachine learning is about statistics. We usually assume Gaussian noise on the data, we assume multivariate normal distribution on class covariance matrices in quadratic discriminant analysis, and we use cross-validation or bootstrapping to generate multiple training sets. \n\nMachine learning is about models. In decision trees, the data structure is a binary/L-ary tree depending on the type of the features and the decision function used.\n\nMachine learning is about performance metrics. In classification, we use accuracy if we want to get a crude estimate on the performance of the classifier. If we need more details on pairwise classes, confusion matrix comes in handy. If the dataset has two classes, more metrics follow: precision, recall, true positive rate, false positive rate, F-measure, etc. \n\nWhat is machine learning about? Like defining an elephant, we need the combination of all of these topics to define machine learning: we start machine learning by assuming a certain model on the data, use algorithmic and/or optimization and/or statistical techniques to learn the parameters of that model (which is sometimes defined as curve fitting), and use performance metrics to evaluate our model/algorithm/classifier.\n\n# Algorithms\n\n## Nearest Neighbor\n\nThe most commonly used representative of nonparametric algorithms is the nearest neighbor. The assumption of nearest neighbor is simple, the world does not change much, i.e., similar things perform similarly. Therefore, we only need to store the dataset itself and make the decision on the test instance based on the similarity of it to the instances in the dataset. In other words, the class label of an instance is strongly influenced by its nearby instances.\n\n## Parametric Classification\n\nIf the class distributions are assumed to follow Gaussian density, we obtain our first parametric classifier, namely, quadratic discriminant. The number of parameters, i.e., the model complexity is Kd + Kd (d + 1) / 2, the first part is for class means and the second part for class covariance matrices.\n\nWe can assume a single shared covariance matrix for all classes. In this case, simplifying the function reduces to our second classifier, namely, linear discriminant. The model complexity is Kd + d (d + 1) / 2, where the first part is for class means and the second part for shared covariance matrix.\n\nWhen we assume all off-diagonals of the shared covariance matrix are zero; we get the naive Bayes classifier. The model complexity is Kd + d, where the first part is for class means and the second part for the diagonal of the shared covariance matrix.\n\nWe further reduce by taking priors equal and a single covariance value s. In this case, we get the nearest mean classifier and the model complexity is only Kd + 1.\n\n## Decision Trees\n\nDecision trees have a tree-based structure where each non-leaf node m implements a decision function, fm(x), and each leaf node corresponds to a class decision. Second, they are one of most interpretable learning algorithms available. When written as a set of IF-THEN rules, the decision tree can be transformed into a human-readable format, which then can be modified and/or validated by human experts in their corresponding domains.\n\n## Kernel Machines\n\nKernel machines, in other words, support vector machines, are maximum margin methods, where the model is written as a weighted sum of support vectors. Kernel machines are discriminative methods, i.e., they are only interested in the instances across the class boundaries in classification, or instances across the regressor in regression. For obtaining the optimal separating hyperplane, kernel machines try to maximize separability, or margin, and write the problem as a quadratic optimization problem, whose solution gives us support vectors.\n\n## Neural Networks\n\nArtificial neural networks (ANN) take their inspiration from the brain. The brain consists of billions of neurons and these neurons are interconnected and work in parallel, which makes the brain a powerful computing machine. Each neuron is connected through synapses to thousands of neurons and the firing of a neuron depends on those synapses.\n\nThere are three types of neurons (units) in ANN. Each unit except the input unit takes an input and calculates an output. Input units represent a single input feature xi or the bias = +1. Hidden units calculate an intermediate output from its inputs. They first combine their inputs linearly and then use nonlinear activation functions to map that linear combination to a nonlinear space. Output units calculate the output of the ANN.\n\nFor Developers\n============\n\nYou can also see [Cython](https://github.com/starlangsoftware/Classification-Cy), [Java](https://github.com/starlangsoftware/Classification), [C++](https://github.com/starlangsoftware/Classification-CPP), [Swift](https://github.com/starlangsoftware/Classification-Swift), or [C#](https://github.com/starlangsoftware/Classification-CS) repository.\n\n## Requirements\n\n* [Python 3.7 or higher](#python)\n* [Git](#git)\n\n### Python \n\nTo check if you have a compatible version of Python installed, use the following command:\n\n python -V\n \nYou can find the latest version of Python [here](https://www.python.org/downloads/).\n\n### Git\n\nInstall the [latest version of Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).\n\n## Pip Install\n\n\tpip3 install NlpToolkit-Classification\n\t\n## Download Code\n\nIn order to work on code, create a fork from GitHub page. \nUse Git for cloning the code to your local or below line for Ubuntu:\n\n\tgit clone \n\nA directory called Classification will be created. Or you can use below link for exploring the code:\n\n\tgit clone https://github.com/starlangsoftware/Classification-Py.git\n\n## Open project with Pycharm IDE\n\nSteps for opening the cloned project:\n\n* Start IDE\n* Select **File | Open** from main menu\n* Choose `Classification-Py` file\n* Select open as project option\n* Couple of seconds, dependencies will be downloaded. \n\nDetailed Description\n============\n\n+ [Classification Algorithms](#classification-algorithms)\n+ [Sampling Strategies](#sampling-strategies)\n+ [Feature Selection](#feature-selection)\n+ [Statistical Tests](#statistical-tests)\n\n## Classification Algorithms\n\nAlgoritmalar\u0131 e\u011fitmek i\u00e7in\n\n\ttrain(self, trainSet: InstanceList, parameters: Parameter)\n\nE\u011fitilen modeli bir veri \u00f6rne\u011fi \u00fcst\u00fcnde s\u0131namak i\u00e7in\n\n\tpredict(self, instance: Instance) -> str\n\nKarar a\u011fac\u0131 algoritmas\u0131 C45 s\u0131n\u0131f\u0131nda\n\nBagging algoritmas\u0131 Bagging s\u0131n\u0131f\u0131nda\n\nDerin \u00f6\u011frenme algoritmas\u0131 DeepNetwork s\u0131n\u0131f\u0131nda\n\nKMeans algoritmas\u0131 KMeans s\u0131n\u0131f\u0131nda\n\nDo\u011frusal ve do\u011frusal olmayan \u00e7ok katmanl\u0131 perceptron LinearPerceptron ve \nMultiLayerPerceptron s\u0131n\u0131flar\u0131nda\n\nNaive Bayes algoritmas\u0131 NaiveBayes s\u0131n\u0131f\u0131nda\n\nK en yak\u0131n kom\u015fu algoritmas\u0131 Knn s\u0131n\u0131f\u0131nda\n\nDo\u011frusal kesme analizi algoritmas\u0131 Lda s\u0131n\u0131f\u0131nda\n\n\u0130kinci derece kesme analizi algoritmas\u0131 Qda s\u0131n\u0131f\u0131nda\n\nDestek vekt\u00f6r makineleri algoritmas\u0131 Svm s\u0131n\u0131f\u0131nda\n\nRandomForest a\u011fa\u00e7 tabanl\u0131 ensemble algoritmas\u0131 RandomForest s\u0131n\u0131f\u0131nda\n\nBasit dummy ve rasgele s\u0131n\u0131fland\u0131r\u0131c\u0131 gibi temel iki s\u0131n\u0131fland\u0131r\u0131c\u0131 Dummy ve \nRandomClassifier s\u0131n\u0131flar\u0131nda\n\n## Sampling Strategies\n\nK katl\u0131 \u00e7apraz ge\u00e7erleme deneyi yapmak i\u00e7in KFoldRun, KFoldRunSeparateTest, \nStratifiedKFoldRun, StratifiedKFoldRunSeparateTest\n\nM tane K katl\u0131 \u00e7apraz ge\u00e7erleme deneyi yapmak i\u00e7in MxKFoldRun, MxKFoldRunSeparateTest,\nStratifiedMxKFoldRun, StratifiedMxKFoldRunSeparateTest\n\nBootstrap tipi deney yapmak i\u00e7in BootstrapRun\n\n## Feature Selection\n\nPca tabanl\u0131 boyut azaltma i\u00e7in Pca s\u0131n\u0131f\u0131\n\nDiscrete de\u011fi\u015fkenleri Continuous de\u011fi\u015fkenlere \u00e7evirmek i\u00e7in DiscreteToContinuous s\u0131n\u0131f\u0131\n\nDiscrete de\u011fi\u015fkenleri binary de\u011fi\u015fkenlere de\u011fi\u015ftirmek i\u00e7in LaryToBinary s\u0131n\u0131f\u0131\n\n## Statistical Tests\n\n\u0130statistiksel testler i\u00e7in Combined5x2F, Combined5x2t, Paired5x2t, Pairedt, Sign s\u0131n\u0131flar\u0131", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/StarlangSoftware/Classification-Py", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "NlpToolkit-Classification", "package_url": "https://pypi.org/project/NlpToolkit-Classification/", "platform": "", "project_url": "https://pypi.org/project/NlpToolkit-Classification/", "project_urls": { "Homepage": "https://github.com/StarlangSoftware/Classification-Py" }, "release_url": "https://pypi.org/project/NlpToolkit-Classification/1.0.11/", "requires_dist": null, "requires_python": "", "summary": "Classification library", "version": "1.0.11", "yanked": false, "yanked_reason": null }, "last_serial": 11878929, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "633c1e06933e4a516420c1fefd67ff06", "sha256": "2fab9b7a53a3b6fbd367ffad0b6e14a2ded74c1a3c17e8cfc3e1580eba57c4fe" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.0.tar.gz", "has_sig": false, "md5_digest": "633c1e06933e4a516420c1fefd67ff06", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37086, "upload_time": "2019-10-21T19:57:08", "upload_time_iso_8601": "2019-10-21T19:57:08.982551Z", "url": "https://files.pythonhosted.org/packages/64/aa/eab19b0721618e599c9a56b3d859ec0da0fcbad3ff73287f262a7f866201/NlpToolkit-Classification-1.0.0.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "6142abb729c9abb66131812cbd96041f", "sha256": "d748bb5b8fab19b639be23bdd3e3918341ff8d8ae424ace09d5a0bf8b9925414" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.1.tar.gz", "has_sig": false, "md5_digest": "6142abb729c9abb66131812cbd96041f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37596, "upload_time": "2019-11-10T18:13:52", "upload_time_iso_8601": "2019-11-10T18:13:52.127822Z", "url": "https://files.pythonhosted.org/packages/79/0c/a7da574422cfacc22c7d1937b45a7076f18e668a819e1128d686470673b3/NlpToolkit-Classification-1.0.1.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.10": [ { "comment_text": "", "digests": { "md5": "813aa8a8b116876aa1494606084219c0", "sha256": "9ae63999f02ace7b7d97e28d063e2a5dfcbc96933369bbd3c9f2741ef754905a" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.10.tar.gz", "has_sig": false, "md5_digest": "813aa8a8b116876aa1494606084219c0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41207, "upload_time": "2021-08-11T19:58:30", "upload_time_iso_8601": "2021-08-11T19:58:30.320275Z", "url": "https://files.pythonhosted.org/packages/5f/07/034293507c701eeb104e4db735e46b5ef57c104145bb483b4062ccee295f/NlpToolkit-Classification-1.0.10.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.11": [ { "comment_text": "", "digests": { "md5": "ccbef49deadecfc929df23bc3b983459", "sha256": "c9668d20682f1e5ef84e89924b41c46cf5a42d56f52fa023b99fd3aeb4eb7de7" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.11.tar.gz", "has_sig": false, "md5_digest": "ccbef49deadecfc929df23bc3b983459", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41807, "upload_time": "2021-10-30T16:38:07", "upload_time_iso_8601": "2021-10-30T16:38:07.220314Z", "url": "https://files.pythonhosted.org/packages/f2/0d/58bc662a456c7bcb38688a65e239d5c103970c5c586fc1876a835eca4cf8/NlpToolkit-Classification-1.0.11.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "f7fb0e56a6aa479720054aef6e33d02c", "sha256": "e0d042af7b85b72c9ae850df7ebd73051a40f66f43e12dad774293c18e9616f0" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.2.tar.gz", "has_sig": false, "md5_digest": "f7fb0e56a6aa479720054aef6e33d02c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37583, "upload_time": "2019-11-19T09:20:13", "upload_time_iso_8601": "2019-11-19T09:20:13.554780Z", "url": "https://files.pythonhosted.org/packages/fe/aa/557cd97337cca203ad5436b805a5e15eaf5677fa3334658a1a300b7dc765/NlpToolkit-Classification-1.0.2.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.3": [ { "comment_text": "", "digests": { "md5": "941884a382ea2fb34282d46a8b2bc8fd", "sha256": "472db27e773c6404c0b085a02d951c3aa9ed1311e123c5ac0b3d411ecba7e52c" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.3.tar.gz", "has_sig": false, "md5_digest": "941884a382ea2fb34282d46a8b2bc8fd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37669, "upload_time": "2019-12-21T08:31:59", "upload_time_iso_8601": "2019-12-21T08:31:59.056835Z", "url": "https://files.pythonhosted.org/packages/cc/a7/3be9b251e5af78b6400c57d3fca5f245e17c3d321e50098822ac9c3a7ed3/NlpToolkit-Classification-1.0.3.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.4": [ { "comment_text": "", "digests": { "md5": "5ebd920cd55309d1b97089d9010e2f3d", "sha256": "81c0fb88674ddf0c83dc2825ec76b706955166e538faf79dfb1f9f3d67a8181d" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.4.tar.gz", "has_sig": false, "md5_digest": "5ebd920cd55309d1b97089d9010e2f3d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37825, "upload_time": "2020-02-07T20:40:09", "upload_time_iso_8601": "2020-02-07T20:40:09.305088Z", "url": "https://files.pythonhosted.org/packages/ca/11/91dd7c1bdcff16c73391f68c563724a229c7f0050683599f91d401ee3a91/NlpToolkit-Classification-1.0.4.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.5": [ { "comment_text": "", "digests": { "md5": "3e169a421c5d8279b05c49103043339a", "sha256": "86fd8eee67f0a8c82cef919163964db43387fb95bc0baac6e2614c332109e6a0" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.5.tar.gz", "has_sig": false, "md5_digest": "3e169a421c5d8279b05c49103043339a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37708, "upload_time": "2020-02-18T04:58:34", "upload_time_iso_8601": "2020-02-18T04:58:34.985516Z", "url": "https://files.pythonhosted.org/packages/89/80/13dbe87e1730c79a65eead57f8a45797a251f42e0c3a5c3d7b68cc9ae7e0/NlpToolkit-Classification-1.0.5.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.6": [ { "comment_text": "", "digests": { "md5": "6f85c752838a5971708a9265a205a279", "sha256": "30c756bc24acee61ade013012a7a5c2768e01522e0b485066ae7cce250a8acb9" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.6.tar.gz", "has_sig": false, "md5_digest": "6f85c752838a5971708a9265a205a279", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37756, "upload_time": "2020-06-29T14:32:46", "upload_time_iso_8601": "2020-06-29T14:32:46.513155Z", "url": "https://files.pythonhosted.org/packages/1b/8f/19cae58c2f06c2ae80fce2fbf709527081a7056996b7bc3dda04834dd46a/NlpToolkit-Classification-1.0.6.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.7": [ { "comment_text": "", "digests": { "md5": "bd7644d12a28f8828bc92aa0d3b7ac59", "sha256": "af8e4c83e1fb5d47685a5e4c509aee2c2a0cc7a85c44df441c2d1a8c697737ab" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.7.tar.gz", "has_sig": false, "md5_digest": "bd7644d12a28f8828bc92aa0d3b7ac59", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 37806, "upload_time": "2020-10-15T09:23:23", "upload_time_iso_8601": "2020-10-15T09:23:23.576445Z", "url": "https://files.pythonhosted.org/packages/d0/93/de3ae5b7752cb80dae86c8c0045f022f2eab49376e31d56ac74e6fe0e724/NlpToolkit-Classification-1.0.7.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.8": [ { "comment_text": "", "digests": { "md5": "e97a72b5db3768044ee9c977bdc5f3db", "sha256": "ecd0b07c4cf3d49f248db02b062cbc73e0abb40db952d107afb91e1161a1b82f" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.8.tar.gz", "has_sig": false, "md5_digest": "e97a72b5db3768044ee9c977bdc5f3db", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 40742, "upload_time": "2021-04-25T14:50:22", "upload_time_iso_8601": "2021-04-25T14:50:22.572066Z", "url": "https://files.pythonhosted.org/packages/bc/32/4277b3c2be19a528d3e3ccebf1411b2e23ae14f3c92902d71314b3f89d15/NlpToolkit-Classification-1.0.8.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.9": [ { "comment_text": "", "digests": { "md5": "c1018d90c4968db525b2e7b3806b385d", "sha256": "3202bef3ddad5991f35bf745fb7e0fcef5106d53250627d80642022636dca89e" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.9.tar.gz", "has_sig": false, "md5_digest": "c1018d90c4968db525b2e7b3806b385d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 40752, "upload_time": "2021-04-25T15:05:56", "upload_time_iso_8601": "2021-04-25T15:05:56.178519Z", "url": "https://files.pythonhosted.org/packages/e4/9e/915313a58c76ee2361afe0453303354d6a7a7054d55d0b972a53be884e6e/NlpToolkit-Classification-1.0.9.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ccbef49deadecfc929df23bc3b983459", "sha256": "c9668d20682f1e5ef84e89924b41c46cf5a42d56f52fa023b99fd3aeb4eb7de7" }, "downloads": -1, "filename": "NlpToolkit-Classification-1.0.11.tar.gz", "has_sig": false, "md5_digest": "ccbef49deadecfc929df23bc3b983459", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41807, "upload_time": "2021-10-30T16:38:07", "upload_time_iso_8601": "2021-10-30T16:38:07.220314Z", "url": "https://files.pythonhosted.org/packages/f2/0d/58bc662a456c7bcb38688a65e239d5c103970c5c586fc1876a835eca4cf8/NlpToolkit-Classification-1.0.11.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }