{ "info": { "author": "Anurag Kumar", "author_email": "anuragkumarak95@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Programming Language :: Python", "Topic :: Utilities" ], "description": "# WordNet\n\n[![Build Status](https://travis-ci.org/anuragkumarak95/wordnet.svg?branch=master)](https://travis-ci.org/anuragkumarak95/wordnet)\n[![codecov](https://codecov.io/gh/anuragkumarak95/wordnet/branch/master/graph/badge.svg)](https://codecov.io/gh/anuragkumarak95/wordnet)\n[![Requirements Status](https://requires.io/github/anuragkumarak95/wordnet/requirements.svg?branch=master)](https://requires.io/github/anuragkumarak95/wordnet/requirements/?branch=master)\n\nCreate a simple **network of words** related to each other using the **Twitter Streaming API**.\n\n![Made with python-3.5](http://forthebadge.com/images/badges/made-with-python.svg)\n\nMajor parts of this project:\n\n* `Streamer` : ~/twitter_streaming.py\n* `TF-IDF` Gene : ~/wordnet/tf_idf_generator.py\n* `NN` words Gene : ~/wordnet/nn_words.py\n* `NETWORK` Gene : ~/wordnet/word_net.py\n\n## Using Streamer Functionality\n\n1. `Clone this repo` and run `$pip install -r requirements.txt` in bash from the root directory.\n\n1. In the root directory (~), create a `config.py` file with the details below:\n ```python\n # Variables that contain the user credentials to access the Twitter Streaming API\n # This link will help you: http://socialmedia-class.org/twittertutorial.html\n access_token = \"xxx-xx-xxxx\"\n access_token_secret = \"xxxxx\"\n consumer_key = \"xxxxxx\"\n consumer_secret = \"xxxxxxxx\"\n ```\n1. Run `Streamer` with a list of filter words for the tweets you want to fetch, e.g. `$python twitter_streaming.py hello hi hallo namaste > data_file.txt`. This saves the words from the matching tweets, line by line, in `data_file.txt`.\n\n## Using WordNet Module\n\n1. 
`Clone this repo` and install the wordnet module with this command:\n\n $python setup.py install\n\n1. To create a `TF-IDF` structure file for every doc, use:\n\n ```python\n from wordnet import find_tf_idf\n\n df, tf_idf = find_tf_idf(\n file_names=['file/path1','file/path2',..], # paths of files to be processed (create them using twitter_streaming.py)\n prev_file_path='prev/tf/idf/file/path.tfidfpkl', # previous TF-IDF file to build on; standard format is .tfidfpkl. default = None\n dump_path='path/to/dump/file.tfidfpkl' # where to dump the tf-idf, if needed; standard format is .tfidfpkl. default = None\n )\n\n '''\n If no file is passed as prev_file_path, a new TF-IDF file is generated; otherwise the\n TF-IDF values are combined with the previous file. The result is dumped at dump_path if given;\n if not, the function only returns the new tf-idf list of dictionaries and the df dictionary.\n '''\n ```\n1. To use the `NN` Word Gene of this module, use `wordnet.find_knn`:\n\n ```python\n from wordnet import find_knn\n\n words = find_knn(\n tf_idf=tf_idf, # this tf_idf is returned by find_tf_idf() above.\n input_word='german', # a word for which k nearest neighbours are required.\n k=10, # k = number of neighbours required, default=10\n rand_on=True # rand_on = whether to randomly skip a few words or show the initial k words, default=True\n )\n\n '''\n This function returns a list of words closely related to the provided input_word, based on the\n tf_idf var passed to it. Either use find_tf_idf() to obtain this var or pickle.load() a dump\n file written by the same function at your chosen directory. The file contains 2 lists in format\n (idf, tf_idf).\n '''\n ```\n\n1. 
To create a Word `Network`, use:\n\n ```python\n from wordnet import generate_net\n\n word_net = generate_net(\n df=df, # this df is returned by find_tf_idf() above.\n tf_idf=tf_idf, # this tf_idf is returned by find_tf_idf() above.\n dump_path='path/to/dump.wrnt' # dump_path = path to dump the generated files, standard format is .wrnt. default=None\n )\n\n '''\n This function returns a dict of Word entities, with word as key.\n '''\n ```\n\n1. To retrieve a Word `Network`, use:\n\n ```python\n from wordnet import retrieve_net\n\n word_net = retrieve_net(\n 'path/to/network.wrnt' # path to network file, standard format is .wrnt.\n )\n '''\n This function returns a dictionary of Word entities, with word as key.\n '''\n ```\n\n1. To retrieve a list of words that are at some depth from a root word in the network, use:\n\n ```python\n from wordnet import return_net\n\n words = return_net(\n word, # root word in this process.\n word_net, # word network generated from generate_net()\n depth=1 # depth to which you wish this word collector to traverse.\n )\n '''\n This function returns a list of words that are at the provided depth from the root word in the\n network provided.\n '''\n ```\n\n### Test Run\n\nTo run the tests, run `python test.py`; the script returns **0** if everything worked as expected.\n\ntest.py uses sample data provided [here](https://github.com/anuragkumarak95/wordnet/tree/master/test) and executes unittest on `find_tf_idf()`, `find_knn()` & `generate_net()`.\n\n> `Streamer` functionality is not included in the distribution of this code. 
That is just a script independent of the module.\n\n#### Contributions are welcome\n\n![BUILT WITH LOVE](http://forthebadge.com/images/badges/built-with-love.svg)\n\nby [@Anurag](https://github.com/anuragkumarak95)\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://anuragkumarak95.github.io/wordnet/", "keywords": "word network python twitter streaming data", "license": "GNU General Public License v3", "maintainer": "", "maintainer_email": "", "name": "wordnet", "package_url": "https://pypi.org/project/wordnet/", "platform": "", "project_url": "https://pypi.org/project/wordnet/", "project_urls": { "Homepage": "https://anuragkumarak95.github.io/wordnet/" }, "release_url": "https://pypi.org/project/wordnet/0.0.1b2/", "requires_dist": null, "requires_python": "", "summary": "A module to create a network of words on the basis of relative sense within a corpus of documents.", "version": "0.0.1b2" }, "last_serial": 3195600, "releases": { "0.0.1b2": [ { "comment_text": "", "digests": { "md5": "29d275f6fdd5b2e7b3a101cd8e474eb9", "sha256": "0ec876cbab8fc997eaebe1056e6ccef5678811fe5e75734156b27482e761dcc9" }, "downloads": -1, "filename": "wordnet-0.0.1b2.tar.gz", "has_sig": false, "md5_digest": "29d275f6fdd5b2e7b3a101cd8e474eb9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8752, "upload_time": "2017-09-22T19:45:39", "url": "https://files.pythonhosted.org/packages/e5/c9/93f89fc3613db301ff92be67aa67a5f9e4b5e212081ce3569e84a9e57304/wordnet-0.0.1b2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "29d275f6fdd5b2e7b3a101cd8e474eb9", "sha256": "0ec876cbab8fc997eaebe1056e6ccef5678811fe5e75734156b27482e761dcc9" }, "downloads": -1, "filename": "wordnet-0.0.1b2.tar.gz", "has_sig": false, "md5_digest": "29d275f6fdd5b2e7b3a101cd8e474eb9", "packagetype": "sdist", "python_version": "source", "requires_python": null, 
"size": 8752, "upload_time": "2017-09-22T19:45:39", "url": "https://files.pythonhosted.org/packages/e5/c9/93f89fc3613db301ff92be67aa67a5f9e4b5e212081ce3569e84a9e57304/wordnet-0.0.1b2.tar.gz" } ] }