{ "info": { "author": "Ryan Heuser", "author_email": "heuser@stanford.edu", "bugtrack_url": null, "classifiers": [], "description": "# llp\n\nLiterary Language Processing (LLP): corpora, models, and tools for the digital humanities.\n\n## Quickstart\n\n1) Install:\n\n```\npip install llp # install with pip in terminal\n```\n\n2) Download an existing corpus...\n\n```\nllp status # show which corpora/data are available\nllp download ECCO_TCP # download a corpus\n```\n\n...or import your own:\n\n```\nllp import # use the \"import\" command \\\n -path_txt mycorpus/txts # a folder of txt files (use -path_xml for xml) \\\n -path_metadata mycorpus/meta.xls # a metadata csv/tsv/xls about those txt files \\\n -col_fn filename # filename in the metadata corresponding to the .txt filename\n```\n\n...or start a new one:\n\n```\nllp create # then follow the interactive prompt\n```\n\n3) Then you can load the corpus in Python:\n\n```python\nimport llp # import llp as a python module\ncorpus = llp.load('ECCO_TCP') # load the corpus by name or ID\n```\n\n...and play with convenient Corpus objects...\n\n```python\ndf = corpus.metadata # get corpus metadata as a pandas dataframe\nsmpl=df.query('1740 < year < 1780') # do a quick query on the metadata\n\ntexts = corpus.texts() # get a convenient Text object for each text\ntexts_smpl = corpus.texts(smpl.id) # get Text objects for a specific list of IDs\n```\n\n...and Text objects:\n\n```python\nfor text in texts_smpl: # loop over Text objects\n text_meta = text.meta # get text metadata as dictionary\n author = text.author # get common metadata as attributes \n\n txt = text.txt # get plain text as string\n xml = text.xml # get xml as string\n\n tokens = text.tokens # get list of words (incl punct)\n words = text.words # get list of words (excl punct)\n counts = text.word_counts # get word counts as dictionary (from JSON if saved)\n ocracc = text.ocr_accuracy # get estimate of ocr accuracy\n \n spacy_obj = text.spacy # get a spacy text object\n nltk_obj = text.nltk # get an nltk text object\n blob_obj = text.blob # get a textblob object\n```\n\n## Corpus magic\n\nEach corpus object can generate data about itself:\n\n```python\ncorpus.save_metadata() # save metadata from xml files (if possible)\ncorpus.save_plain_text() # save plain text from xml (if possible)\ncorpus.save_mfw() # save list of all words in corpus and their total count\ncorpus.save_freqs() # save counts as JSON files\ncorpus.save_dtm() # save a document-term matrix with top N words\n```\n\nYou can also run these commands in the terminal:\n\n```\nllp install my_corpus # this is equivalent to python above\nllp install my_corpus -parallel 4 # but can access parallel processing with MPI/Slingshot\nllp install my_corpus dtm # run a specific step\n```\n\nGenerating this kind of data allows for easier access to things like:\n\n```python\nmfw = corpus.mfw(n=10000) # get the 10K most frequent words\ndtm = corpus.freqs(words=mfw) # get a document-term matrix as a pandas dataframe\n```\n\nYou can also build word2vec models:\n\n```python\nw2v_model = corpus.word2vec() # get an llp word2vec model object\nw2v_model.model() # run the modeling process\nw2v_model.save() # save the model somewhere\ngensim_model = w2v_model.gensim # get the original gensim object\n```", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/quadrismegistus/llp", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "llp", "package_url": "https://pypi.org/project/llp/", "platform": "", "project_url": "https://pypi.org/project/llp/", "project_urls": { "Homepage": "https://github.com/quadrismegistus/llp" }, "release_url": "https://pypi.org/project/llp/0.2.2/", "requires_dist": null, "requires_python": "", "summary": "Literary Language Processing (LLP): corpora, models, and tools for the digital humanities", "version": "0.2.2" }, "last_serial": 5699580, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "4b71181d57fc4790dd9db0691eedf2cb", "sha256": "8ef6467cd504f3395def9dd69a55cc4d27e15aca21923893f4cb833b9bb2ace8" }, "downloads": -1, "filename": "llp-0.1.1.tar.gz", "has_sig": false, "md5_digest": "4b71181d57fc4790dd9db0691eedf2cb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 108117, "upload_time": "2019-08-10T23:48:27", "url": "https://files.pythonhosted.org/packages/bc/65/4c971c1691bc79035c245ba0bb959664b93121025672fdf712541b98fc3b/llp-0.1.1.tar.gz" } ], "0.1.10": [ { "comment_text": "", "digests": { "md5": "59067aca1a39784611de192f7907031d", "sha256": "8d449aba187461383b8f43ce3dff142441c71c939b287eb9bdb0253ccd606026" }, "downloads": -1, "filename": "llp-0.1.10.tar.gz", "has_sig": false, "md5_digest": "59067aca1a39784611de192f7907031d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5408089, "upload_time": "2019-08-11T08:35:01", "url": "https://files.pythonhosted.org/packages/7b/f1/46b2c123533ae077ba2d15ae5aefa1dd74c14f20ee6e623e44ab68d9677d/llp-0.1.10.tar.gz" } ], "0.1.11": [ { "comment_text": "", "digests": { "md5": "685872618135bd8cf195db47b95abc08", "sha256": "3faab065243190cbb7489a37ba54810dd66e9ce1806517459a3c5023555d9cf6" }, "downloads": -1, "filename": "llp-0.1.11.tar.gz", "has_sig": false, "md5_digest": "685872618135bd8cf195db47b95abc08", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5408116, "upload_time": "2019-08-11T08:47:31", "url": "https://files.pythonhosted.org/packages/31/ca/36498bd402896334f15f8614aa56fe397d4bd9bab2b2243a214b127d1127/llp-0.1.11.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "8dac4c643aadca47cb35d1ecef8232b7", "sha256": "b1c9a9c7f95e8762196f2e3f5bd9c7eedae824c012f952a809131876f6cf6462" }, "downloads": -1, "filename": "llp-0.1.2.tar.gz", "has_sig": false, "md5_digest": "8dac4c643aadca47cb35d1ecef8232b7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3542, "upload_time": "2019-08-11T04:34:29", "url": "https://files.pythonhosted.org/packages/5f/2d/1f5083a7edb26ae0095bd41d39a0abb7cd51a3266b32571d52cce297e0cb/llp-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "69cf49db265b864fa100d7ee675e5ca6", "sha256": "95a50debf5aa5efe574cf35fa3ab58ce8fbf7c71438a2c62af024254b2109797" }, "downloads": -1, "filename": "llp-0.1.3.tar.gz", "has_sig": false, "md5_digest": "69cf49db265b864fa100d7ee675e5ca6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3957, "upload_time": "2019-08-11T04:43:02", "url": "https://files.pythonhosted.org/packages/4f/1e/b6d5ce0fe30e18d3b8e8862f40313edd2f1c6aa19bbc6de053594adc9266/llp-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "3ced9fea8587d8bdd883a2702684370b", "sha256": "4f09178a7fe48768c8dd7b6d9fd097ef105dda7af2c58eb7aaeb2079f6d3e951" }, "downloads": -1, "filename": "llp-0.1.4.tar.gz", "has_sig": false, "md5_digest": "3ced9fea8587d8bdd883a2702684370b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 108320, "upload_time": "2019-08-11T04:50:57", "url": "https://files.pythonhosted.org/packages/97/68/e288f5ed8bd249027436e227851e0f1db6a4817b5e0759b37d257ee8fca9/llp-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "c2813f5e32f57661f2eb1e69ae3aa22e", "sha256": "6b365f6658807ef8c648d62bef14c239a972f05920d341be1b1cbf9b255a0238" }, "downloads": -1, "filename": "llp-0.1.5.tar.gz", "has_sig": false, "md5_digest": "c2813f5e32f57661f2eb1e69ae3aa22e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 109226, "upload_time": "2019-08-11T06:47:10", "url": "https://files.pythonhosted.org/packages/8d/0a/c02bd353ed8fb891690e5899b60da8bb4375b9bfbd2796e99b53c3b22be7/llp-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "f1ee8cb84a1c37a067557ce9fe1b4c32", "sha256": "957c38545f40d0a71cd3a9f5cbf7e41ab15b439023f195c25ff1185198607ce4" }, "downloads": -1, "filename": "llp-0.1.6.tar.gz", "has_sig": false, "md5_digest": "f1ee8cb84a1c37a067557ce9fe1b4c32", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 109185, "upload_time": "2019-08-11T07:07:17", "url": "https://files.pythonhosted.org/packages/09/f5/bb1331842f4ecf655bc78529f308070d1284c801c9b04eaa88cd83b2ad5f/llp-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "54d124f42822f15b83dd23669459d565", "sha256": "9f7b00fc7e3b5a8a4dc41fe496c83b47a001b0bda68f18a5ae65fe8689c57502" }, "downloads": -1, "filename": "llp-0.1.7.tar.gz", "has_sig": false, "md5_digest": "54d124f42822f15b83dd23669459d565", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5407661, "upload_time": "2019-08-11T07:18:21", "url": "https://files.pythonhosted.org/packages/0f/fb/f9fee00a55b4b81c603ca3268508844e950f047aaa6945f5f2b01633b314/llp-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "792edbcfd6e25672219fb5bf03129ab8", "sha256": "484e0b061a9b28daeaeeb8889973e53b7876b94bfa7a3fd8290c2a637f0ae6ba" }, "downloads": -1, "filename": "llp-0.1.8.tar.gz", "has_sig": false, "md5_digest": "792edbcfd6e25672219fb5bf03129ab8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5408080, "upload_time": "2019-08-11T08:03:11", "url": "https://files.pythonhosted.org/packages/01/af/483fbeebe301334e7e364f6f5cef60d3101946b0d191365687fb25927904/llp-0.1.8.tar.gz" } ], "0.1.9": [ { "comment_text": "", "digests": { "md5": "37186c1a99275c7f355c5b997547e578", "sha256": "08e875b198dc010da026ff26347f3055ed203897d188097727358a39e9bceb16" }, "downloads": -1, "filename": "llp-0.1.9.tar.gz", "has_sig": false, "md5_digest": "37186c1a99275c7f355c5b997547e578", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5408079, "upload_time": "2019-08-11T08:07:04", "url": "https://files.pythonhosted.org/packages/12/aa/1865d304190a1e13d615c2f5889bc892687ef73a698c9b5f5aad64d59a33/llp-0.1.9.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "c34a2271b172c3ddf678f776971691a0", "sha256": "cdec383bdb3831b64c203321def9521589855fe36e4d585f1da3163e58a5457e" }, "downloads": -1, "filename": "llp-0.2.0.tar.gz", "has_sig": false, "md5_digest": "c34a2271b172c3ddf678f776971691a0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5415610, "upload_time": "2019-08-12T23:02:15", "url": "https://files.pythonhosted.org/packages/3d/50/f767cd6189ae647dd73fce24b010d8e818c9e1b2098ed5842475a6eaae82/llp-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "d44a99939b786ef9481d9772b658a406", "sha256": "00522c443503eff34eec9d3a004e77573f3c05d3e4171cb6ea78b815fe71c713" }, "downloads": -1, "filename": "llp-0.2.1.tar.gz", "has_sig": false, "md5_digest": "d44a99939b786ef9481d9772b658a406", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5416444, "upload_time": "2019-08-14T19:23:49", "url": "https://files.pythonhosted.org/packages/6a/40/d3e2e13eecbc9f1437d29ca03cd94acdd171550b3a47c4b50506d08f411d/llp-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "8e3ba0eaf4e615d70d5624dab7ce23a3", "sha256": "3540909c754a0f740d951073bb38e0544ab44c61718328db25832b9c1d8bfd0f" }, "downloads": -1, "filename": "llp-0.2.2.tar.gz", "has_sig": false, "md5_digest": "8e3ba0eaf4e615d70d5624dab7ce23a3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5417665, "upload_time": "2019-08-19T17:34:51", "url": "https://files.pythonhosted.org/packages/9d/0b/9d0f00336a402bf622e40978b401e202d418b1ea122d1e762dea30b50ac7/llp-0.2.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8e3ba0eaf4e615d70d5624dab7ce23a3", "sha256": "3540909c754a0f740d951073bb38e0544ab44c61718328db25832b9c1d8bfd0f" }, "downloads": -1, "filename": "llp-0.2.2.tar.gz", "has_sig": false, "md5_digest": "8e3ba0eaf4e615d70d5624dab7ce23a3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5417665, "upload_time": "2019-08-19T17:34:51", "url": "https://files.pythonhosted.org/packages/9d/0b/9d0f00336a402bf622e40978b401e202d418b1ea122d1e762dea30b50ac7/llp-0.2.2.tar.gz" } ] }