{ "info": { "author": "Data Analytics at Texas A&M (DATA) Lab, Yuening Li", "author_email": "yuehningli@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# PyODDS\n[![Build Status](https://travis-ci.com/datamllab/PyODDS.svg?branch=master)](https://travis-ci.com/datamllab/PyODDS)\n[![Documentation Status](https://readthedocs.org/projects/pyodds-handbook/badge/?version=latest)](https://pyodds.github.io/)\n[![Codacy Badge](https://api.codacy.com/project/badge/Grade/3456033f37744ae2a5a69da448ee430d)](https://www.codacy.com/manual/pyodds/PyODDS?utm_source=github.com&utm_medium=referral&utm_content=pyodds/PyODDS&utm_campaign=Badge_Grade)\n[![Known Vulnerabilities](https://snyk.io//test/github/pyodds/PyODDS/badge.svg?targetFile=requirements.txt)](https://snyk.io//test/github/pyodds/PyODDS?targetFile=requirements.txt)\n[![PyPI version](https://badge.fury.io/py/pyodds.svg)](https://badge.fury.io/py/pyodds)\n\nOfficial Website: [`pyodds.github.io`](https://pyodds.github.io/)\n\n##\n\n**PyODDS** is an end-to-end **Python** system for **outlier** **detection** with **database** **support**. PyODDS provides outlier detection algorithms that meet the needs of users in different fields, with or without a data science or machine learning background. PyODDS can execute machine learning algorithms in-database, without moving data out of the database server or over the network. 
It also provides access to a wide range of outlier detection algorithms, including statistical analysis and more recent deep learning based approaches.\n\nPyODDS features:\n\n - **Full Stack Service** which supports operation and maintenance from a light-weight SQL-based database to back-end machine learning algorithms and improves throughput;\n\n - **State-of-the-art Anomaly Detection Approaches** including **Statistical/Machine Learning/Deep Learning** models with unified APIs and detailed documentation;\n\n - **Powerful Data Analysis Mechanism** which supports both **static and time-series data** analysis with flexible time-slice (sliding-window) segmentation.\n\nThe full API reference can be found in the [`handbook`](https://pyodds.github.io/).\n\n## API Demo:\n\n\n```python\nfrom utils.import_algorithm import algorithm_selection\nfrom utils.utilities import output_performance,connect_server,query_data\n\n# connect to the database\nconn,cursor = connect_server(host, user, password)\n\n# query data from a specific time range\ndata = query_data(conn,cursor,database_name,table_name,start_time,end_time)\n\n# train the anomaly detection algorithm\nclf = algorithm_selection(algorithm_name)\nclf.fit(X_train)\n\n# get outlier result and scores\nprediction_result = clf.predict(X_test)\noutlierness_score = clf.decision_function(X_test)\n\n# visualize the prediction_result\nvisualize_distribution(X_test,prediction_result,outlierness_score)\n\n```\n\n## Cite this work\n\n\nYuening Li, Daochen Zha, Na Zou, Xia Hu. 
\"PyODDS: An End-to-End Outlier Detection System\" ([Download](https://arxiv.org/pdf/1910.02575.pdf))\n\nBiblatex entry:\n\n @article{li2019pyodds,\n author = {Li, Yuening and Zha, Daochen and Zou, Na and Hu, Xia},\n title = {PyODDS: An End-to-End Outlier Detection System},\n year = {2019},\n eprint = {arXiv:1910.02575},\n }\n\n\n\n## Quick Start\n```sh\npython demo.py --ground_truth --visualize_distribution\n```\n\n### Results are shown as\n```sh\nconnect to TDengine success\nLoad dataset and table\nLoading cost: 0.151061 seconds\nLoad data successful\nStart processing:\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 10/10 [00:00<00:00, 14.02it/s]\n==============================\nResults in Algorithm dagmm are:\naccuracy_score: 0.98\nprecision_score: 0.99\nrecall_score: 0.99\nf1_score: 0.99\nroc_auc_score: 0.99\nprocessing time: 15.330137 seconds\n==============================\nconnection is closed\n\n```\n\n\n## Installation\n\nTo install the package, use [`pip`](https://pip.pypa.io/en/stable/installing/) as follows:\n\n```sh\n# install the released package from PyPI\npip install pyodds\n# or install the latest source from GitHub\npip install git+https://github.com/datamllab/PyODDS.git\n```\n**Note:** PyODDS is only compatible with **Python 3.6** and above.\n\n### Required Dependencies\n\n```sh\n- pandas>=0.25.0\n- taos==1.4.15\n- tensorflow==2.0.0b1\n- numpy>=1.16.4\n- seaborn>=0.9.0\n- torch>=1.1.0\n- luminol==0.4\n- tqdm>=4.35.0\n- matplotlib>=3.1.1\n- scikit_learn>=0.21.3\n```\nTo compile and package the JDBC driver source code, you need Java JDK 8 or higher and Apache Maven 2.7 or higher installed. 
To install openjdk-8 on Ubuntu:\n\n```sh\nsudo apt-get install openjdk-8-jdk\n```\n\nTo install Apache Maven on Ubuntu:\n\n```sh\nsudo apt-get install maven\n```\nTo install TDengine as the back-end database service, please refer to [this instruction](https://www.taosdata.com/en/getting-started/#Install-from-Package).\n\nTo enable the Python client APIs for TDengine, please follow [this handbook](https://www.taosdata.com/en/documentation/connector/#Python-Connector). \n\nTo ensure the locale in the config file is valid:\n\n```sh\nsudo locale-gen \"en_US.UTF-8\"\nexport LC_ALL=\"en_US.UTF-8\"\nlocale\n\n```\nTo start the service after installation, run the following in a terminal:\n```sh\ntaosd\n```\n\n## Implemented Algorithms\n### Statistical Based Methods\nMethods | Algorithm | Class API\n------------ | -------------|-------------\nCBLOF | Clustering-Based Local Outlier Factor | :class:`algo.cblof.CBLOF`\nHBOS | Histogram-based Outlier Score | :class:`algo.hbos.HBOS`\nIFOREST | Isolation Forest | :class:`algo.iforest.IFOREST`\nKNN | k-Nearest Neighbors | :class:`algo.knn.KNN`\nLOF | Local Outlier Factor | :class:`algo.lof.LOF`\nOCSVM | One-Class Support Vector Machines | :class:`algo.ocsvm.OCSVM`\nPCA | Principal Component Analysis | :class:`algo.pca.PCA`\nRobustCovariance | Robust Covariance | :class:`algo.robustcovariance.RCOV`\nSOD | Subspace Outlier Detection | :class:`algo.sod.SOD`\n\n### Deep Learning Based Methods\nMethods | Algorithm | Class API\n------------ | -------------|-------------\nautoencoder | Outlier detection using replicator neural networks | :class:`algo.autoencoder.AUTOENCODER`\ndagmm | Deep autoencoding gaussian mixture model for unsupervised anomaly detection | :class:`algo.dagmm.DAGMM`\n\n### Time Series Methods\nMethods | Algorithm | Class API\n------------ | -------------|-------------\nlstmad | Long short term memory networks for anomaly detection in time series | :class:`algo.lstm_ad.LSTMAD`\nlstmencdec | LSTM-based encoder-decoder for 
multi-sensor anomaly detection | :class:`algo.lstm_enc_dec_axl.LSTMED`\nluminol | LinkedIn's luminol | :class:`algo.luminol.LUMINOL`\n\n## APIs Cheatsheet\n\nThe full API reference can be found in the [`handbook`](https://pyodds.github.io/).\n\n - **connect_server(hostname,username,password)**: Connect to the back-end TDengine service.\n\n - **query_data(connection,cursor,database_name,table_name,start_time,end_time)**: Query data from table *table_name* in database *database_name* within a given time range.\n\n - **algorithm_selection(algorithm_name,contamination)**: Select an algorithm as the detector.\n\n - **fit(X)**: Fit the detector on *X*.\n\n - **predict(X)**: Predict whether each instance in *X* is an outlier.\n\n - **decision_function(X)**: Output the anomaly score of each instance in *X*.\n\n - **output_performance(algorithm_name,ground_truth,prediction_result,outlierness_score)**: Report the prediction result as evaluation metrics: *Accuracy*, *Precision*, *Recall*, *F1 Score*, *ROC-AUC Score*, *Cost time*.\n\n - **visualize_distribution(X,prediction_result,outlierness_score)**: Visualize the detection result with the data distribution.\n\n - **visualize_outlierscore(outlierness_score,prediction_result,contamination)**: Visualize the detection result with the outlier score.\n\n\n## License\n\n\nYou may use this software under the MIT License.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/datamllab/PyODDS", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "pyodds", "package_url": "https://pypi.org/project/pyodds/", "platform": "", "project_url": "https://pypi.org/project/pyodds/", "project_urls": { "Homepage": "https://github.com/datamllab/PyODDS" }, "release_url": "https://pypi.org/project/pyodds/1.0.0rc1/", "requires_dist": [ "pandas (>=0.24.0)", "tensorflow (==2.0.0b1)", "numpy (>=1.16.4)", 
"seaborn (>=0.9.0)", "torch (>=1.1.0)", "luminol (==0.4)", "tqdm (>=4.35.0)", "matplotlib (>=3.0.0)", "scikit-learn (>=0.21.3)" ], "requires_python": ">=3.5", "summary": "An end-to-end anomaly detection system", "version": "1.0.0rc1" }, "last_serial": 5950226, "releases": { "1.0.0b1": [ { "comment_text": "", "digests": { "md5": "8b891700120891b2d07463169136f277", "sha256": "c700b4cf157b6bee50c42f106a77a8b548f3f1a02462e69775f6489ced0fef21" }, "downloads": -1, "filename": "pyodds-1.0.0b1-py3-none-any.whl", "has_sig": false, "md5_digest": "8b891700120891b2d07463169136f277", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 5150, "upload_time": "2019-10-09T01:44:20", "url": "https://files.pythonhosted.org/packages/ca/b5/d13ec046e0b7c92e94bf3bf7484f054d2b3579c9f5509cf214666aedd4bf/pyodds-1.0.0b1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a1e7d7e3fa095aa7974fc01652207907", "sha256": "8c6f9b84da4bd3ce48e2b474cc91589bd69fbb0dbdcac8a3b1f9c61f7312abeb" }, "downloads": -1, "filename": "pyodds-1.0.0b1.tar.gz", "has_sig": false, "md5_digest": "a1e7d7e3fa095aa7974fc01652207907", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 4808, "upload_time": "2019-10-09T01:44:22", "url": "https://files.pythonhosted.org/packages/b0/3d/7a68a186ec9bf618fbe0ba65f258f248c48ba6639e51537e4c95ee526b44/pyodds-1.0.0b1.tar.gz" } ], "1.0.0rc1": [ { "comment_text": "", "digests": { "md5": "7f78a644aab8a5573898683f9cce028a", "sha256": "bdde470c935c3a13e69e5e83a8ed5dfd0e87c1a744cd08fcd19987be38c930f1" }, "downloads": -1, "filename": "pyodds-1.0.0rc1-py3-none-any.whl", "has_sig": false, "md5_digest": "7f78a644aab8a5573898683f9cce028a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 55954, "upload_time": "2019-10-09T14:34:46", "url": 
"https://files.pythonhosted.org/packages/e3/ff/1eb1f7f05a10223b57d9a8b3ecd3af0a3a126c40c544a621bc2a0a0111d6/pyodds-1.0.0rc1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a6ddf5d98c42381df1175355a8e60bbd", "sha256": "9878cf3b9087dafd742190471ce703ebb369d9ad7d1005ee7e574dd3a3305f9b" }, "downloads": -1, "filename": "pyodds-1.0.0rc1.tar.gz", "has_sig": false, "md5_digest": "a6ddf5d98c42381df1175355a8e60bbd", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 37581, "upload_time": "2019-10-09T14:34:51", "url": "https://files.pythonhosted.org/packages/01/76/43743b4c3c5ed60392e9478d73f556d95fe69559cfefc887fa376875356b/pyodds-1.0.0rc1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7f78a644aab8a5573898683f9cce028a", "sha256": "bdde470c935c3a13e69e5e83a8ed5dfd0e87c1a744cd08fcd19987be38c930f1" }, "downloads": -1, "filename": "pyodds-1.0.0rc1-py3-none-any.whl", "has_sig": false, "md5_digest": "7f78a644aab8a5573898683f9cce028a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 55954, "upload_time": "2019-10-09T14:34:46", "url": "https://files.pythonhosted.org/packages/e3/ff/1eb1f7f05a10223b57d9a8b3ecd3af0a3a126c40c544a621bc2a0a0111d6/pyodds-1.0.0rc1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a6ddf5d98c42381df1175355a8e60bbd", "sha256": "9878cf3b9087dafd742190471ce703ebb369d9ad7d1005ee7e574dd3a3305f9b" }, "downloads": -1, "filename": "pyodds-1.0.0rc1.tar.gz", "has_sig": false, "md5_digest": "a6ddf5d98c42381df1175355a8e60bbd", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 37581, "upload_time": "2019-10-09T14:34:51", "url": "https://files.pythonhosted.org/packages/01/76/43743b4c3c5ed60392e9478d73f556d95fe69559cfefc887fa376875356b/pyodds-1.0.0rc1.tar.gz" } ] }