{
"info": {
"author": "Sosuke Kato",
"author_email": "snoopies.drum@gmail.com",
"bugtrack_url": null,
"classifiers": [],
"description": "cornel-movie-dialogs-corpus-storm\n=================================\n\nA set of python modules for cornel movie-dialogs corpus with storm.\n\nAbstract\n--------\n\nThis module include some classes extending\n`storm `__ ORM for `cornel movie-dialogs\ncorpus `__\ndata.\n\nInstall\n-------\n\n::\n\n pip install storm # if you not\n pip install cornel-movie-dialogs-corpus-storm\n\nSetup\n-----\n\n1. download corpus and unzip\n2. generate database and insert with ``generate-mdcorpus-database.py``\n\nfor example:\n\n::\n\n generate-mdcorpus-database.py --corpus-dir \"cornell movie-dialogs corpus\" corpus.db\n\nUsage\n-----\n\n::\n\n from mdcorpus.orm import *\n from mdcorpus.parser import *\n\n ...\n\nClass List\n----------\n\n- MovieTitlesMetadata\n- Genre\n- MovieGenreLine\n- MovieCharactersMetadata\n- MovieConversation\n- MovieLine\n- RawScriptUrl\n\nCorpus Problem\n--------------\n\nThis is memo when I dealt with corpus problems.\n\nmovie\\_titles\\_metadata.txt\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n- I ignored an **alphabet** following year.\n\n - for example, line 34, ``1989/I``\n\n- I ignored **duplication** for genre data.\n\n - line 58, ``['horror', 'mystery', 'mystery', 'sci-fi', 'sci-fi']``\n\nCode Problem\n------------\n\nI use ``Python2.7`` and I don't know how to use ``codecs``\nmodule.(\\ `Unicode HOWTO \u2014 Python 2.7ja1\ndocumentation `__)\n\nmime\n~~~~\n\nconvert text-code to ``utf-8`` with `Mi `__\n\nbefore\n^^^^^^\n\n::\n\n cornell movie-dialogs corpus$ file --mime {(ls)}\n README.txt: text/plain; charset=iso-8859-1\n chameleons.pdf: application/pdf; charset=binary\n movie_characters_metadata.txt: text/plain; charset=iso-8859-1\n movie_conversations.txt: text/plain; charset=us-ascii\n movie_lines.txt: text/plain; charset=us-ascii\n movie_titles_metadata.txt: text/plain; charset=iso-8859-1\n raw_script_urls.txt: text/plain; charset=iso-8859-1\n\nafter\n^^^^^\n\n::\n\n cornell movie-dialogs corpus$ file --mime {(ls)}\n README.txt: text/plain; charset=utf-8\n chameleons.pdf: application/pdf; charset=binary\n movie_characters_metadata.txt: text/plain; charset=utf-8\n movie_conversations.txt: text/plain; charset=us-ascii\n movie_lines.txt: text/plain; charset=us-ascii\n movie_titles_metadata.txt: text/plain; charset=utf-8\n raw_script_urls.txt: text/plain; charset=utf-8\n\nmovie\\_titles\\_metadata.txt\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n- line 115, ``l\u00e9on``\n\nmovie\\_characters\\_metadata.txt\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n- line 1727 - 1736, ``l\u00e9on``\n\nresult\n~~~~~~\n\n::\n\n sqlite> select * from movie_titles_metadata where title = 'l\u00e9on';\n sqlite> select * from movie_titles_metadata where title = 'l\u99een';\n 114|l\u99een|1994|8.6|204901",
"description_content_type": null,
"docs_url": null,
"download_url": "UNKNOWN",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/sosuke-k/cornel-movie-dialogs-corpus-storm",
"keywords": null,
"license": "MIT",
"maintainer": null,
"maintainer_email": null,
"name": "cornel-movie-dialogs-corpus-storm",
"package_url": "https://pypi.org/project/cornel-movie-dialogs-corpus-storm/",
"platform": "UNKNOWN",
"project_url": "https://pypi.org/project/cornel-movie-dialogs-corpus-storm/",
"project_urls": {
"Download": "UNKNOWN",
"Homepage": "https://github.com/sosuke-k/cornel-movie-dialogs-corpus-storm"
},
"release_url": "https://pypi.org/project/cornel-movie-dialogs-corpus-storm/0.1.1/",
"requires_dist": null,
"requires_python": null,
"summary": "A set of python modules for cornel movie-dialogs corpus with storm",
"version": "0.1.1"
},
"last_serial": 1795391,
"releases": {
"0.1.0": [
{
"comment_text": "",
"digests": {
"md5": "f85bd7cd4db91eb7e9979c2b0cbd600a",
"sha256": "6513bb5b7e3bd15b0086e84fa392b1ee5bc4d12e3d2735217ac29b8ad7c313ec"
},
"downloads": -1,
"filename": "cornel-movie-dialogs-corpus-storm-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "f85bd7cd4db91eb7e9979c2b0cbd600a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3626,
"upload_time": "2015-10-30T19:25:47",
"url": "https://files.pythonhosted.org/packages/d1/59/6e7d97aefc6575217842f1936eeae39b9a052c646372ec08b012cfa23733/cornel-movie-dialogs-corpus-storm-0.1.0.tar.gz"
}
],
"0.1.1": [
{
"comment_text": "",
"digests": {
"md5": "5350f41b48e9eaa7859d23b026f8f5ce",
"sha256": "0a365eca49ca32faf294073c9258525b6e441a4558f8af8d7ac7844ce0d01d59"
},
"downloads": -1,
"filename": "cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "5350f41b48e9eaa7859d23b026f8f5ce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5440,
"upload_time": "2015-10-31T20:43:18",
"url": "https://files.pythonhosted.org/packages/f9/2e/573329b79419ee5c8feb82e8e7f3aa8132c978c847aa9f5ca0b75bbdb36b/cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "5350f41b48e9eaa7859d23b026f8f5ce",
"sha256": "0a365eca49ca32faf294073c9258525b6e441a4558f8af8d7ac7844ce0d01d59"
},
"downloads": -1,
"filename": "cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "5350f41b48e9eaa7859d23b026f8f5ce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5440,
"upload_time": "2015-10-31T20:43:18",
"url": "https://files.pythonhosted.org/packages/f9/2e/573329b79419ee5c8feb82e8e7f3aa8132c978c847aa9f5ca0b75bbdb36b/cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz"
}
]
}