{ "info": { "author": "Antoine Simoulin", "author_email": "antoine.simoulin@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Titulus\n\nText visualization python package\n\n```python\nfrom titulus import color, print_\n\ntest = \"Nous sommes le 12/24/2018 aujourd'hui. Mon num\u00e9ro de tel est le (301)227-1340\"\ntokens = test.split()\nweights = np.random.randint(low=0, high=10, size=len(tokens))\n\nprint_(' '.join(color(tokens, weights, n=10)))\n```\n\n![alt text](./imgs/demo.png \"color demo\")\n\n```python\nfrom sklearn.datasets import fetch_20newsgroups\nfrom sklearn.feature_extraction.text import TfidfVectorizer\nfrom sklearn.linear_model import SGDClassifier\nfrom sklearn.pipeline import Pipeline\n\ncategories = ['alt.atheism', 'talk.religion.misc']\nnewsgroups_train = fetch_20newsgroups(subset='train',\n categories=categories)\nnewsgroups_test = fetch_20newsgroups(subset='test',\n categories=categories)\n\nX_train, X_test = newsgroups_train.data, newsgroups_test.data\ny_train, y_test = newsgroups_train.target, newsgroups_test.target\n```\n\n```python\nidx = np.random.randint(len(X_vec_list))\n\ntokens = tokenizer(X_train[idx])\ntoken_idx = [voc.index(t) if t in voc else -1 for t in tokens]\n\nweights = [X_vec_arr[idx, :][i] if i>0 else 0 for i in token_idx]\n\nprint_(' '.join(color(tokens, weights, start_hex=\"#FEFEFE\", finish_hex=\"#00a4e4\", n=20)))\n```\n\n![alt text](./imgs/demo2.png \"color demo\")\n\n\n```python\ntext_clf = Pipeline([('vect', vectorizer),\n ('clf', SGDClassifier(loss='hinge', penalty='l2', tol=0.2,\n alpha=1e-3, max_iter=15, random_state=42)),\n\n ])\n\n_ = text_clf.fit(X_train, y_train)\n\nX_vec = vectorizer.transform(X_train)\nX_vec_arr = X_vec.toarray()\nX_vec_list = [list(x) for x in X_vec_arr]\nvoc = vectorizer.get_feature_names()\n\nidx = np.random.randint(len(X_vec_list))\n\ntokens = tokenizer(X_train[idx])\ntoken_idx = [voc.index(t) if t in voc else -1 for t in tokens]\n\nweights_ = np.multiply(X_vec_arr[idx, :], text_clf.named_steps['clf'].coef_[0, :])\nweights = [weights_[i] if i>0 else 0 for i in token_idx]\n\nprint_(' '.join(color(tokens, weights, start_hex=\"#34BF49\", finish_hex=\"#BE0027\", middle_hex=\"#FEFEFE\", n=20)))\n```\n\n![alt text](./imgs/demo3.png \"color demo\")\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/AntoineSimoulin/titulus", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "titulus", "package_url": "https://pypi.org/project/titulus/", "platform": "", "project_url": "https://pypi.org/project/titulus/", "project_urls": { "Homepage": "https://github.com/AntoineSimoulin/titulus" }, "release_url": "https://pypi.org/project/titulus/0.0.1/", "requires_dist": null, "requires_python": "", "summary": "Text visualization python package", "version": "0.0.1" }, "last_serial": 5568192, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "e6b85a6292803fd43295484ea31bd15f", "sha256": "c180b962ceac733d713d03e331e0e87fe3c43b24beb3fc2ae3e73ad00f0c2963" }, "downloads": -1, "filename": "titulus-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "e6b85a6292803fd43295484ea31bd15f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 7858, "upload_time": "2019-07-22T16:43:39", "url": "https://files.pythonhosted.org/packages/f3/88/5a107573e1a02129f53633e775a3084127dc72dc8effe72571c6c971b85c/titulus-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "dbf260f19aa888e93417f8d009a49121", "sha256": "11028e4997d7bd9491ea1bea80e989ceb0c0975b648c2f01297cb7703e074591" }, "downloads": -1, "filename": "titulus-0.0.1.tar.gz", "has_sig": false, "md5_digest": "dbf260f19aa888e93417f8d009a49121", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3405, "upload_time": "2019-07-22T16:43:42", "url": "https://files.pythonhosted.org/packages/1f/c3/63a4ab52a5d93208585df889d9f271355702544abfcc8aae9f1c3606d3fe/titulus-0.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e6b85a6292803fd43295484ea31bd15f", "sha256": "c180b962ceac733d713d03e331e0e87fe3c43b24beb3fc2ae3e73ad00f0c2963" }, "downloads": -1, "filename": "titulus-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "e6b85a6292803fd43295484ea31bd15f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 7858, "upload_time": "2019-07-22T16:43:39", "url": "https://files.pythonhosted.org/packages/f3/88/5a107573e1a02129f53633e775a3084127dc72dc8effe72571c6c971b85c/titulus-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "dbf260f19aa888e93417f8d009a49121", "sha256": "11028e4997d7bd9491ea1bea80e989ceb0c0975b648c2f01297cb7703e074591" }, "downloads": -1, "filename": "titulus-0.0.1.tar.gz", "has_sig": false, "md5_digest": "dbf260f19aa888e93417f8d009a49121", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3405, "upload_time": "2019-07-22T16:43:42", "url": "https://files.pythonhosted.org/packages/1f/c3/63a4ab52a5d93208585df889d9f271355702544abfcc8aae9f1c3606d3fe/titulus-0.0.1.tar.gz" } ] }