{ "info": { "author": "Cahya Wirawan", "author_email": "Cahya.Wirawan@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.5", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Text Processing :: Filters" ], "description": "Open Text Classification (OpenTC)\n=================================\n\nOpenTC is a text classification engine using machine learning. It is\ndesigned as a client-server architecture and uses the Python libraries\nscikit-learn and tensorflow for its machine learning algorithms.\nCurrently the following algorithms are supported:\n\n- Naive Bayes\n- Support Vector Machine\n- Convolutional Neural Network\n\nIn the future it will also support FastText from Facebookresearch.\n\nThe engine runs as a server that listens for commands and text to be\nclassified. By default it listens on localhost port 3333, but this can be\nchanged in the YAML configuration file.\n\nOpenTC can be used, for example, for text classification (a demo website\nfor this purpose is available online `OpenTC\ndemo `__), or for other purposes such\nas Data Leak Prevention (DLP). An example DLP implementation\nhas been created as an ICAP server:\n`opentc-icap `__\n\nRequirements\n------------\n\n- Python 3.x\n- numpy\n- pyparsing\n- PyYAML\n- scikit-learn\n- scipy\n- tensorflow 1.x\n\nHow to use\n----------\n\nInstallation\n~~~~~~~~~~~~\n\nInstall the module using pip:\n\n::\n\n $ pip install opentc\n\nor clone the repository\n\n::\n\n $ git clone https://github.com/cahya-wirawan/opentc.git\n $ cd opentc\n $ python setup.py install\n\nopentc\n~~~~~~\n\nsynopsis\n^^^^^^^^\n\nopentc\n\nDescription\n^^^^^^^^^^^\n\nThe command-line tool for training the application on the datasets defined\nin the configuration file. 
The result of the training (pre-trained data)\ncan be used for the opentcd server.\n\nUsage\n^^^^^\n\n::\n\n $ python opentc -h\n usage: opentc [-h] [-c CLASSIFIER] [-C CONFIGURATION_FILE] [-d DATASET]\n [-l LOG_CONFIGURATION_FILE]\n\n optional arguments:\n -h, --help show this help message and exit\n -c CLASSIFIER, --classifier CLASSIFIER\n set classifier to use for the training (support\n currently bayesian, svm or cnn)\n -C CONFIGURATION_FILE, --configuration_file CONFIGURATION_FILE\n set the configuration file\n -d DATASET, --dataset DATASET\n set dataset to use for the training\n -l LOG_CONFIGURATION_FILE, --log_configuration_file LOG_CONFIGURATION_FILE\n set the log configuration file\n\nopentcd\n~~~~~~~\n\nsynopsis\n^^^^^^^^\n\nopentcd\n\nDescription\n^^^^^^^^^^^\n\nThe daemon listens for incoming connections on a TCP port (default\n3333) and classifies files or text strings on demand. It reads a\nconfiguration file in the following order: ./opentc.yml,\n~/.opentc/opentc.yml or /etc/opentc/opentc.yml.\n\nUsage\n^^^^^\n\nOpentcd uses the configuration file opentc.yml to define almost all\nof its configuration. 
Only a few settings can be overridden with command-line\noptions.\n\nList of arguments:\n\n::\n\n $ python opentcd -h\n usage: opentcd [-h] [-a ADDRESS] [-C CONFIGURATION_FILE]\n [-l LOG_CONFIGURATION_FILE] [-p PORT] [-t TIMEOUT]\n\n optional arguments:\n -h, --help show this help message and exit\n -a ADDRESS, --address ADDRESS\n define the address for the server\n -C CONFIGURATION_FILE, --configuration_file CONFIGURATION_FILE\n set the configuration file\n -l LOG_CONFIGURATION_FILE, --log_configuration_file LOG_CONFIGURATION_FILE\n set the log configuration file\n -p PORT, --port PORT define the port number which the server uses to listen\n -t TIMEOUT, --timeout TIMEOUT\n define the time out\n\nRun it as a background application:\n\n::\n\n $ python opentcd&\n 2017-05-02 13:33:22,276 - opentc.core.classifier.cnn_text - DEBUG - Load the checkpoint: \n data/input/cnn_twenty_newsgroup_20170301_090000-all/checkpoints/model-2210\n INFO:tensorflow:Restoring parameters from data/input/cnn_twenty_newsgroup_20170301_090000-all/checkpoints/model-2210\n 2017-05-02 13:33:23,899 - tensorflow - INFO - Restoring parameters \n from data/input/cnn_twenty_newsgroup_20170301_090000-all/checkpoints/model-2210\n 2017-05-02 13:33:27,375 - __main__ - INFO - Server start\n 2017-05-02 13:33:28,019 - opentc.core.server - INFO - Server loop running in thread: Thread-1\n\ndatasets and pre-trained data\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nThe configuration file defines the paths to the datasets and pre-trained\ndata. Pre-trained data for testing purposes can be downloaded from\n`data `__; it is around 1.4GB. Just\nuncompress it and change the path to the pre-trained data in the opentc.yml\nfile accordingly.\n\nCommands\n^^^^^^^^\n\nCommands are delimited by a newline character. 
If opentcd\ndoesn't recognize the command, or the command doesn't follow the\nrequirements specified below, it replies with an error message but\nstill waits for the next command (this behaviour may change in the\nfuture).\n\nPING\n''''\n\nCheck the server's state. It should reply with \"PONG\".\n\nVERSION\n'''''''\n\nPrint the program version.\n\nRELOAD\n''''''\n\nReload the engine.\n\nLIST\\_CLASSIFIER\n''''''''''''''''\n\nList the supported classifiers (at the moment there are three\nclassifiers supported: Bayesian, Support Vector Machine and\nConvolutional Neural Network). It also shows the status of each classifier,\neither True (enabled) or False (disabled).\n\nSET\\_CLASSIFIER\n'''''''''''''''\n\nEnable or disable a specific classifier.\n\nPREDICT\\_STREAM\n'''''''''''''''\n\nClassify a text stream. It uses a newline character as the delimiter between\nsentences.\n\nPREDICT\\_FILE\n'''''''''''''\n\nClassify a file. It uses a newline character as the delimiter between\nsentences.\n\nCLOSE\n'''''\n\nClose the connection.\n\nTodo\n----\n\n- Multilabel classification\n- Include FastText from Facebookresearch\n- Use pyzmq and Google's protobuf to improve the protocol and\n network communication\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/cahya-wirawan/opentc", "keywords": "machine learning cnn svm bayesian", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "opentc", "package_url": "https://pypi.org/project/opentc/", "platform": "", "project_url": "https://pypi.org/project/opentc/", "project_urls": { "Homepage": "https://github.com/cahya-wirawan/opentc" }, "release_url": "https://pypi.org/project/opentc/0.5.0/", "requires_dist": null, "requires_python": "", "summary": "A text classification engine using machine learning and designed as client-server architecture", "version": "0.5.0" }, "last_serial": 2903227, 
"releases": { "0.3.11": [ { "comment_text": "", "digests": { "md5": "9046d6120e15c24045e534cef4bb8212", "sha256": "e7584a5c7560d313b2635f8d993664896fe831a5a55d31ea9b23f742a8d7bd40" }, "downloads": -1, "filename": "opentc-0.3.11.tar.gz", "has_sig": false, "md5_digest": "9046d6120e15c24045e534cef4bb8212", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17104, "upload_time": "2017-04-04T18:13:46", "url": "https://files.pythonhosted.org/packages/d4/2d/06308eb61f04650e710a98520cb494feb57fd86e550bebaef3260cdfb926/opentc-0.3.11.tar.gz" } ], "0.3.12": [ { "comment_text": "", "digests": { "md5": "27929d200e57e8afcfc41910142e61ed", "sha256": "0aebe03cf3a0de012601ed50e69d1b504300af7478efbe6b0f0ab435be1c625a" }, "downloads": -1, "filename": "opentc-0.3.12.tar.gz", "has_sig": false, "md5_digest": "27929d200e57e8afcfc41910142e61ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17174, "upload_time": "2017-04-04T19:59:19", "url": "https://files.pythonhosted.org/packages/27/4b/db4f6a3c002b3b406c343a352b697bbc46dc3e1a98d1fd24eedb82e5522a/opentc-0.3.12.tar.gz" } ], "0.3.13": [ { "comment_text": "", "digests": { "md5": "6542642000b8b483ca163acf398d11fe", "sha256": "69a5cf0fb8378828aa521f536c5dfbdb28499190f2dc859c6de4521f4c333078" }, "downloads": -1, "filename": "opentc-0.3.13.tar.gz", "has_sig": false, "md5_digest": "6542642000b8b483ca163acf398d11fe", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17133, "upload_time": "2017-04-04T20:01:31", "url": "https://files.pythonhosted.org/packages/5f/df/c81e74ecfb866f295f93e6a7d41fec9c4be9fbb16b95791952aabe23acf0/opentc-0.3.13.tar.gz" } ], "0.3.4": [ { "comment_text": "", "digests": { "md5": "cb04a86879585338203f088be381c5ab", "sha256": "29db66accbb44a89a77d610746a58e667e45439c680183e47cbdf8475c10847e" }, "downloads": -1, "filename": "opentc-0.3.4.tar.gz", "has_sig": false, "md5_digest": "cb04a86879585338203f088be381c5ab", 
"packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16855, "upload_time": "2017-04-03T22:00:23", "url": "https://files.pythonhosted.org/packages/47/3c/f5d0ce68ad11f50aabf6793bbb167b94ccabe1df54a2daefadecfa475812/opentc-0.3.4.tar.gz" } ], "0.3.9": [ { "comment_text": "", "digests": { "md5": "3af26f4bf1cef7b8f4047bf1d15bf4b8", "sha256": "30d05b09b782e899e28f37796a80685427ca1f12fe85c516f62e4c5abd008c8b" }, "downloads": -1, "filename": "opentc-0.3.9.tar.gz", "has_sig": false, "md5_digest": "3af26f4bf1cef7b8f4047bf1d15bf4b8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16864, "upload_time": "2017-04-04T16:06:29", "url": "https://files.pythonhosted.org/packages/d3/65/c9c3aeb6838fcb7025240aa232b10ce329ae03ab5a99a39981cb2fe10347/opentc-0.3.9.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "b49315426f5773f99e0b3aa333afa0e6", "sha256": "848b9a66383cf38772d61af9a521b404cd6fa7a23b4a9fd9d998cb4326db0e71" }, "downloads": -1, "filename": "opentc-0.4.0.tar.gz", "has_sig": false, "md5_digest": "b49315426f5773f99e0b3aa333afa0e6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17026, "upload_time": "2017-04-24T15:23:45", "url": "https://files.pythonhosted.org/packages/10/4c/368d54e7903f3f1bc6b7c569435bf98458d2507d9996c6f5d1e69dd1ba72/opentc-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "630a6ae7176ab6204e4fe4e5a2beda43", "sha256": "80fbbbb6f719a9ff281e3fd7663239b3b5579604eed8841accb5be8293f7452e" }, "downloads": -1, "filename": "opentc-0.4.1.tar.gz", "has_sig": false, "md5_digest": "630a6ae7176ab6204e4fe4e5a2beda43", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18651, "upload_time": "2017-05-02T14:17:03", "url": "https://files.pythonhosted.org/packages/20/5c/d438671391963b9d8ae6d74ee7ede8386382fa38b8429d8288af14822256/opentc-0.4.1.tar.gz" } ], "0.4.2": [ { "comment_text": "", "digests": { 
"md5": "81b561faabb9b5b9d0663418483cc49c", "sha256": "3bb146c89eb6b38b04f2834d47604aa47f5dfb4a7935be0273a5d2ba7548d1a4" }, "downloads": -1, "filename": "opentc-0.4.2.tar.gz", "has_sig": false, "md5_digest": "81b561faabb9b5b9d0663418483cc49c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17960, "upload_time": "2017-05-10T14:42:31", "url": "https://files.pythonhosted.org/packages/15/71/cfb9b578bdc42c559b1609ae199814076e88bc23c6bebac6e05746955401/opentc-0.4.2.tar.gz" } ], "0.4.3": [ { "comment_text": "", "digests": { "md5": "d8d4c9736e994288858f61a0e540a7e3", "sha256": "ab8dd71718f554d0ca6369866481c1ecc630873bcee3b061f245d5af1dcb6f75" }, "downloads": -1, "filename": "opentc-0.4.3.tar.gz", "has_sig": false, "md5_digest": "d8d4c9736e994288858f61a0e540a7e3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18028, "upload_time": "2017-05-15T13:28:34", "url": "https://files.pythonhosted.org/packages/04/13/7112afd471f865cb5d63616d102ea4ba2a9b4a674e2f526d06b79a254c5f/opentc-0.4.3.tar.gz" } ], "0.4.4": [ { "comment_text": "", "digests": { "md5": "2507c526683cd84a401c64a05a17ed5d", "sha256": "e3ffac812aa120b7f088e0816d479df10f0a300a69760eb14ba1ad2c58271d19" }, "downloads": -1, "filename": "opentc-0.4.4.tar.gz", "has_sig": false, "md5_digest": "2507c526683cd84a401c64a05a17ed5d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18049, "upload_time": "2017-05-16T15:32:29", "url": "https://files.pythonhosted.org/packages/c9/c1/162540e2375ecffbbefd83cad27d09a4c331c0923485b50ef5dfd571a221/opentc-0.4.4.tar.gz" } ], "0.4.5": [ { "comment_text": "", "digests": { "md5": "a3c139201aca0a4f2f54c0c5840ebe8d", "sha256": "346705636f5a2eabc6048902a56fb0e69f5a4955a1503748519b18ef98a35317" }, "downloads": -1, "filename": "opentc-0.4.5.tar.gz", "has_sig": false, "md5_digest": "a3c139201aca0a4f2f54c0c5840ebe8d", "packagetype": "sdist", "python_version": "source", "requires_python": null, 
"size": 18121, "upload_time": "2017-05-21T10:15:34", "url": "https://files.pythonhosted.org/packages/f0/1a/1aef9268ea92474ed74035d7bbd34eedb7e7b9a8a62e354d7a6d21f5ab78/opentc-0.4.5.tar.gz" } ], "0.5.0": [ { "comment_text": "", "digests": { "md5": "2c5cb8d13144b18da21475f6f232cdb7", "sha256": "e553832fa3025e91320fe5a83cc53bb845aa3c9127790c92ff1369b78a8ae234" }, "downloads": -1, "filename": "opentc-0.5.0.tar.gz", "has_sig": false, "md5_digest": "2c5cb8d13144b18da21475f6f232cdb7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18508, "upload_time": "2017-05-27T12:38:01", "url": "https://files.pythonhosted.org/packages/69/9a/18c1c06f81523af4863108373764b0266b296b424688b1079f21a6c66018/opentc-0.5.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2c5cb8d13144b18da21475f6f232cdb7", "sha256": "e553832fa3025e91320fe5a83cc53bb845aa3c9127790c92ff1369b78a8ae234" }, "downloads": -1, "filename": "opentc-0.5.0.tar.gz", "has_sig": false, "md5_digest": "2c5cb8d13144b18da21475f6f232cdb7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18508, "upload_time": "2017-05-27T12:38:01", "url": "https://files.pythonhosted.org/packages/69/9a/18c1c06f81523af4863108373764b0266b296b424688b1079f21a6c66018/opentc-0.5.0.tar.gz" } ] }