{ "info": { "author": "Anis Ayari", "author_email": "anis.ayari.pro@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# Smarties :art:\n\n\nSmarties is a Text Classifier using an innovative approach based on Wikipedia auto-learning to classify \nany documents/text. We use a Machine Learning and Doc2Vec algorithms.\n\n

\n \n

\nDid you see this scene in Avengers: Age of Ultron ? Our learnable programs work kind similar (without the vilain part, we hope... :smile: ).\nOur Ai use the most advanced open knowledge base to learn for identifing new categories, and our solution can always learn new related topics. \n\n## Getting Started :beginner:\n\nThese instructions will get you a copy of the project up and running on your local machine \nfor development and testing purposes. See deployment for notes on how to deploy the project \non a live system.\n\n### Prerequisites\n\nYou need to have python 3.+ install on your machine. \nPlease follow this link to install \n- Pywikibot : https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation\n- mwviews https://github.com/mediawiki-utilities/python-mwviews\n- nltk Stopword\n### Installing\n```\npip install Smarties\n```\n\n### Example :round_pushpin:\n```\nwiki_dico_path = \"wiki_dico.json\"\n\nif __name__ == '__main__':\n sm.set_lang('english')\n #construct the knowledge base to a wiki_dico.json file\n title_theme_list = []\n title_theme_list.append(('Soccer', 'Soccer'))\n title_theme_list.append(('Baseball', 'Baseball'))\n title_theme_list.append(('Golf', 'Golf'))\n title_theme_list.append(('Basketball', 'Basketball'))\n title_theme_list.append(('Judo', 'Judo'))\n\n sm.construct_wiki_dico(wiki_dico_path, title_theme_list, init=True, max_article_links=30, find_links=True)\n sm.construct_database_from_knwoledge_base(wiki_dico_path, database_file_ouput=\"database_file_custom_name.csv\")\n\n df = sm.import_database(database_file =\"database_file_custom_name.csv\", sampling=True)\n df = sm.sort_keyword_from_database(df, stoppath=stoppath, number_of_character_per_word_at_least=5,\n number_of_words_at_most_per_phrase=20,\n number_of_keywords_appears_in_the_text_at_least=10)\n\n classifier,df = sm.model_from_database(df)\n\n sentence_to_predict = \"The French soccer team is perhaps one of the best team around the world.\"\n sm.predict(classifier, df, sentence_to_predict)\n```\nLanguage Supported :\n- arabic\n- danish\n- german\n- english\n- spanish\n- french\n- indonesian\n- italian\n- japanese\n- dutch\n- portuguese\n- brasilian\n- romanian\n- russian\n\n### TODO :grin:\nPlease feel free to contribute ! Report any bugs in the issue section. \n\n- [ ] Complete Code documentation and README\n- [x] Build a basis code\n- [ ] Multi Langage identification and support\n- [ ] File format support\n- [ ] Text Granularity classification\n- [x] Text processing (Clean, tokenizing,...)\n- [x] Wikipedia auto-search by class\n- [x] Database creation from a json (Wikipedia learning, Datastructure, sampling..)\n- [x] Create text classification algorithm (RAKE, GridSearch, RFC, Doc2Vec)\n- [x] Input example and return classification\n- [ ] Multiple output classification by granularity\n\n\n### LICENSE\n\nPlease be aware that this code is under the GPL3 license. \nYou must report each utilisation of this code to the author of this code (anisayari). \nPlease push your code using this API on a forked Github repo public.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/anisayari/Smarties", "keywords": "", "license": "GPL3", "maintainer": "", "maintainer_email": "", "name": "Smarties", "package_url": "https://pypi.org/project/Smarties/", "platform": "", "project_url": "https://pypi.org/project/Smarties/", "project_urls": { "Homepage": "https://github.com/anisayari/Smarties" }, "release_url": "https://pypi.org/project/Smarties/0.11/", "requires_dist": null, "requires_python": "", "summary": "A Smart AI Text Learner and Classifier", "version": "0.11" }, "last_serial": 4100299, "releases": { "0.11": [ { "comment_text": "", "digests": { "md5": "bf48f4529913d7b0e49975cbcd77c84e", "sha256": "b661d8e6652a5bd475327cbfa292ecfc116bacaeaec98ab976800e7b2074c88c" }, "downloads": -1, "filename": "Smarties-0.11.tar.gz", "has_sig": false, "md5_digest": "bf48f4529913d7b0e49975cbcd77c84e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14414, "upload_time": "2018-07-25T11:53:14", "url": "https://files.pythonhosted.org/packages/38/0a/13a5bf604fffadcce0d24072d4c43f9a3516fc4f27426a95e01ddce6ead8/Smarties-0.11.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "bf48f4529913d7b0e49975cbcd77c84e", "sha256": "b661d8e6652a5bd475327cbfa292ecfc116bacaeaec98ab976800e7b2074c88c" }, "downloads": -1, "filename": "Smarties-0.11.tar.gz", "has_sig": false, "md5_digest": "bf48f4529913d7b0e49975cbcd77c84e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14414, "upload_time": "2018-07-25T11:53:14", "url": "https://files.pythonhosted.org/packages/38/0a/13a5bf604fffadcce0d24072d4c43f9a3516fc4f27426a95e01ddce6ead8/Smarties-0.11.tar.gz" } ] }