{ "info": { "author": "Rob Abbott", "author_email": "abbott@soe.ucsc.edu", "bugtrack_url": null, "classifiers": [], "description": "Internet Argument Corpus\n========================\nThe **Internet Argument Corpus (IAC)** version 2 is a collection of corpora for research in political debate on internet forums. The data is provided in a MySQL database (`download `_). There is also Python code for accessing/creating the database (`here `_).\n\nDependencies\n------------\nData:\n * `MySQL `_ (Or `MariaDB `_)\n (Server for hosting, client for access)\n\nCode:\n * `Python 3 `_\n * Python libraries (pip3 install ):\n\n * sqlalchemy\n * inflect\n * mysqlclient (or other interface such as oursql)\n\nInstall (Code)\n--------------\nEither clone the git repository::\n\n git clone git@bitbucket.org:nlds_iac/internet-argument-corpus-2.git\n\nOr install via pip::\n\n pip3 install InternetArgumentCorpus\n\nInstall (Data)\n--------------\nRestoring from a sql dump::\n\n mysql --user=root -p createdebate < createdebate_20xx_xx_xx.sql\n\nNote that you may need to create the database first::\n\n drop database createdebate;\n SET GLOBAL innodb_file_format=Barracuda; # in case it isn't already\n CREATE SCHEMA createdebate DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_bin;\n\n\nBacking up::\n\n mysqldump createdebate -r createdebate_$(date +%Y_%m_%d).sql\n\nOr potentially faster but much more complicated (How *I* do it)::\n\n dir=$(date \"+%Y-%m-%d_%Hh%Mm\");\n mkdir -m 777 -p /tmp/$dir\n date\n for db in convinceme fourforums createdebate createdebate_released; do \n echo $db; \n mkdir -m 777 /tmp/$dir/$db; \n mysqldump --tab=/tmp/$dir/$db $db; \n rm /tmp/$dir/$db/*.sql; \n mysqldump --no-data $db -r /tmp/$dir/$db/$db.sql;\n echo \"compressing\";\n tar -czf /tmp/$dir/\"$db\"_$(date +%Y_%m_%d).tgz -C /tmp/$dir/ $db;\n rm -rf /tmp/$dir/$db;\n done; mv /tmp/$dir .; date;\n\n cd $dir\n date\n for db in convinceme fourforums createdebate createdebate_released; do \n echo $db; \n mysql -u root -p -e \"drop database $db; CREATE SCHEMA $db DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_bin; SET GLOBAL foreign_key_checks=0\"; \n mysql -u root -p $db < $db/$db.sql;\n mysqlimport -u root -p --use-threads=4 --local $db $db/*.txt; \n mysql -u root -p -e \"SET GLOBAL foreign_key_checks=1\"; \n done;date;\n\n\nUse\n---\nPython code:\n\n.. code-block:: python\n\n from iacorpus import load_dataset\n\n dataset = load_dataset('fourforums')\n print(dataset.dataset_metadata)\n for discussion in dataset:\n print(discussion)\n for post in discussion:\n print(post)\n exit()\n\nContributing\n------------\nI welcome suggestions, pull requests, bug reports, etc.!\n", "description_content_type": null, "docs_url": null, "download_url": "https://bitbucket.org/nlds_iac/internet-argument-corpus-2/get/1.0.1.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/nlds_iac/internet-argument-corpus-2", "keywords": "corpus", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "InternetArgumentCorpus", "package_url": "https://pypi.org/project/InternetArgumentCorpus/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/InternetArgumentCorpus/", "project_urls": { "Download": "https://bitbucket.org/nlds_iac/internet-argument-corpus-2/get/1.0.1.tar.gz", "Homepage": "https://bitbucket.org/nlds_iac/internet-argument-corpus-2" }, "release_url": "https://pypi.org/project/InternetArgumentCorpus/1.0.1/", "requires_dist": null, "requires_python": null, "summary": "The Internet Argument Corpus (IAC) version 2 is a collection of corpora for research in political debate on internet forums.", "version": "1.0.1" }, "last_serial": 2150873, "releases": { "1.0.1": [ { "comment_text": "", "digests": { "md5": "3e621a584b97e35c27f7fc472b501677", "sha256": "90f9c590a7b053f6371e89a1d6aacf78d4210814f75a9724e326ee20599fdfb5" }, "downloads": -1, "filename": "InternetArgumentCorpus-1.0.1.tar.gz", "has_sig": false, "md5_digest": "3e621a584b97e35c27f7fc472b501677", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 36844, "upload_time": "2016-06-04T19:19:43", "url": "https://files.pythonhosted.org/packages/51/3e/b131095d40b44f059d85034ad5747b9ada3aaf37bb405b832e08e06059df/InternetArgumentCorpus-1.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3e621a584b97e35c27f7fc472b501677", "sha256": "90f9c590a7b053f6371e89a1d6aacf78d4210814f75a9724e326ee20599fdfb5" }, "downloads": -1, "filename": "InternetArgumentCorpus-1.0.1.tar.gz", "has_sig": false, "md5_digest": "3e621a584b97e35c27f7fc472b501677", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 36844, "upload_time": "2016-06-04T19:19:43", "url": "https://files.pythonhosted.org/packages/51/3e/b131095d40b44f059d85034ad5747b9ada3aaf37bb405b832e08e06059df/InternetArgumentCorpus-1.0.1.tar.gz" } ] }