{ "info": { "author": "Sosuke Kato", "author_email": "snoopies.drum@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "cornel-movie-dialogs-corpus-storm\n=================================\n\nA set of Python modules for the Cornell movie-dialogs corpus with storm.\n\nAbstract\n--------\n\nThis module includes some classes extending the\n`storm `__ ORM for the `Cornell movie-dialogs\ncorpus `__\ndata.\n\nInstall\n-------\n\n::\n\n pip install storm # if not already installed\n pip install cornel-movie-dialogs-corpus-storm\n\nSetup\n-----\n\n1. Download the corpus and unzip it.\n2. Generate the database and insert the data with ``generate-mdcorpus-database.py``.\n\nFor example:\n\n::\n\n generate-mdcorpus-database.py --corpus-dir \"cornell movie-dialogs corpus\" corpus.db\n\nUsage\n-----\n\n::\n\n from mdcorpus.orm import *\n from mdcorpus.parser import *\n\n ...\n\nClass List\n----------\n\n- MovieTitlesMetadata\n- Genre\n- MovieGenreLine\n- MovieCharactersMetadata\n- MovieConversation\n- MovieLine\n- RawScriptUrl\n\nCorpus Problem\n--------------\n\nThese are notes on the corpus problems I dealt with.\n\nmovie\\_titles\\_metadata.txt\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n- I ignored any **alphabetic suffix** following the year.\n\n - for example, line 34, ``1989/I``\n\n- I ignored **duplicates** in the genre data.\n\n - line 58, ``['horror', 'mystery', 'mystery', 'sci-fi', 'sci-fi']``\n\nCode Problem\n------------\n\nI use ``Python2.7`` and I don't know how to use the ``codecs``\nmodule (\\ `Unicode HOWTO \u2014 Python 2.7ja1\ndocumentation `__).\n\nmime\n~~~~\n\nConvert the text encoding to ``utf-8`` with `Mi `__\n\nbefore\n^^^^^^\n\n::\n\n cornell movie-dialogs corpus$ file --mime {(ls)}\n README.txt: text/plain; charset=iso-8859-1\n chameleons.pdf: application/pdf; charset=binary\n movie_characters_metadata.txt: text/plain; charset=iso-8859-1\n movie_conversations.txt: text/plain; charset=us-ascii\n movie_lines.txt: text/plain; charset=us-ascii\n movie_titles_metadata.txt: text/plain; charset=iso-8859-1\n raw_script_urls.txt: 
text/plain; charset=iso-8859-1\n\nafter\n^^^^^\n\n::\n\n cornell movie-dialogs corpus$ file --mime {(ls)}\n README.txt: text/plain; charset=utf-8\n chameleons.pdf: application/pdf; charset=binary\n movie_characters_metadata.txt: text/plain; charset=utf-8\n movie_conversations.txt: text/plain; charset=us-ascii\n movie_lines.txt: text/plain; charset=us-ascii\n movie_titles_metadata.txt: text/plain; charset=utf-8\n raw_script_urls.txt: text/plain; charset=utf-8\n\nmovie\\_titles\\_metadata.txt\n~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n- line 115, ``l\u00e9on``\n\nmovie\\_characters\\_metadata.txt\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n- line 1727 - 1736, ``l\u00e9on``\n\nresult\n~~~~~~\n\n::\n\n sqlite> select * from movie_titles_metadata where title = 'l\u00e9on';\n sqlite> select * from movie_titles_metadata where title = 'l\u99een';\n 114|l\u99een|1994|8.6|204901", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/sosuke-k/cornel-movie-dialogs-corpus-storm", "keywords": null, "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "cornel-movie-dialogs-corpus-storm", "package_url": "https://pypi.org/project/cornel-movie-dialogs-corpus-storm/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/cornel-movie-dialogs-corpus-storm/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/sosuke-k/cornel-movie-dialogs-corpus-storm" }, "release_url": "https://pypi.org/project/cornel-movie-dialogs-corpus-storm/0.1.1/", "requires_dist": null, "requires_python": null, "summary": "A set of python modules for cornel movie-dialogs corpus with storm", "version": "0.1.1" }, "last_serial": 1795391, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "f85bd7cd4db91eb7e9979c2b0cbd600a", "sha256": "6513bb5b7e3bd15b0086e84fa392b1ee5bc4d12e3d2735217ac29b8ad7c313ec" }, "downloads": -1, "filename": 
"cornel-movie-dialogs-corpus-storm-0.1.0.tar.gz", "has_sig": false, "md5_digest": "f85bd7cd4db91eb7e9979c2b0cbd600a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3626, "upload_time": "2015-10-30T19:25:47", "url": "https://files.pythonhosted.org/packages/d1/59/6e7d97aefc6575217842f1936eeae39b9a052c646372ec08b012cfa23733/cornel-movie-dialogs-corpus-storm-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "5350f41b48e9eaa7859d23b026f8f5ce", "sha256": "0a365eca49ca32faf294073c9258525b6e441a4558f8af8d7ac7844ce0d01d59" }, "downloads": -1, "filename": "cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5350f41b48e9eaa7859d23b026f8f5ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5440, "upload_time": "2015-10-31T20:43:18", "url": "https://files.pythonhosted.org/packages/f9/2e/573329b79419ee5c8feb82e8e7f3aa8132c978c847aa9f5ca0b75bbdb36b/cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5350f41b48e9eaa7859d23b026f8f5ce", "sha256": "0a365eca49ca32faf294073c9258525b6e441a4558f8af8d7ac7844ce0d01d59" }, "downloads": -1, "filename": "cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5350f41b48e9eaa7859d23b026f8f5ce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5440, "upload_time": "2015-10-31T20:43:18", "url": "https://files.pythonhosted.org/packages/f9/2e/573329b79419ee5c8feb82e8e7f3aa8132c978c847aa9f5ca0b75bbdb36b/cornel-movie-dialogs-corpus-storm-0.1.1.tar.gz" } ] }