{
    "info": {
        "author": "Jai Juneja",
        "author_email": "jai.juneja@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 3 - Alpha",
            "Intended Audience :: Developers",
            "License :: OSI Approved :: BSD License",
            "Operating System :: OS Independent",
            "Programming Language :: Python",
            "Topic :: Scientific/Engineering :: Artificial Intelligence",
            "Topic :: Scientific/Engineering :: Information Analysis",
            "Topic :: Text Processing :: Filters",
            "Topic :: Text Processing :: Linguistic"
        ],
        "description": "PyTLDR: Automatic Text Summarization in Python\n==============================================\n\n|Build Status| |PyPI version|\n\nA Python module to perform automatic summarization of articles, text\nfiles and web pages.\n\nLicense\n-------\n\nCopyright 2014 Jai Juneja.\n\nThis program is free software: you can redistribute it and/or modify it\nunder the terms of the GNU General Public License as published by the\nFree Software Foundation, either version 3 of the License, or (at your\noption) any later version.\n\nThis program is distributed in the hope that it will be useful, but\nWITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General\nPublic License for more details.\n\nYou should have received a copy of the GNU General Public License along\nwith this program. If not, see http://www.gnu.org/licenses/.\n\nInstallation\n------------\n\nUsing pip or easy\\_install\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYou can download the latest release version using ``pip`` or\n``easy_install``:\n\n::\n\n    pip install pytldr\n\nLatest development version\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYou can alternatively download the latest development version directly\nfrom GitHub:\n\n::\n\n    git clone https://github.com/jaijuneja/PyTLDR.git\n\nChange into the root directory:\n\n::\n\n    cd pytldr\n\nThen install the package:\n\n::\n\n    python setup.py install\n\nUsage\n-----\n\nA simple sample program using the PyTLDR module can be found at\n``https://github.com/jaijuneja/PyTLDR/blob/master/example.py``\n\nIn its current form, this module contains three distinct implementations\nof automatic text summarization:\n\n-  Using the TextRank algorithm (based on PageRank)\n-  Using Latent Semantic Analysis\n-  Using a sentence relevance score\n\nNote that all three of the above implementations are extractive - that\nis, they simply extract and display the most relevant sentences from the\ninput text. They do not formulate their own sentences (such algorithms\nare known as \"abstractive\", and are still at a primitive stage).\n\nSentence tokenization\n~~~~~~~~~~~~~~~~~~~~~\n\nPyTLDR comes with a built-in sentence tokenizer that is used for\nsummarization. The tokenizer performs stemming in several languages as\nwell as stop-word removal. You may also specify your own list of\nstop-words.\n\n.. code:: python\n\n    from pytldr.nlp.tokenizer import Tokenizer\n\n    tokenizer = Tokenizer(language='english', stopwords=None, stemming=True)\n    # Note that if stopwords=None then the tokenizer loads stopwords from a bundled data-set\n    # You can alternatively specify a text file or provide a list of words\n\nNote that the tokenizer is the only input required to initialize a\nsummarizer object, as shown below.\n\nTextRank Summarization\n~~~~~~~~~~~~~~~~~~~~~~\n\nRanks sentences using the PageRank algorithm, where \"votes\" or\n\"in-links\" are represented by words shared between sentences.\n\n.. code:: python\n\n    from pytldr.summarize.textrank import TextRankSummarizer\n    from pytldr.nlp.tokenizer import Tokenizer\n\n    tokenizer = Tokenizer('english')\n    summarizer = TextRankSummarizer(tokenizer)\n\n    # If you don't specify a tokenizer when intiializing a summarizer then the\n    #\u00a0English tokenizer will be used by default\n    summarizer = TextRankSummarizer()\u00a0 # English tokenizer used\n\n    # This object creates a summary using the summarize method:\n    #\u00a0e.g. summarizer.summarize(text, length=5, weighting='frequency', norm=None)\n\n    # The length parameter specifies the length of the summary, either as a\n    # number of sentences, or a percentage of the original text\n\n    # The summarizer can take as input...\n    # 1. A string:\n    summary = summarizer.summarize(\"Some long article bla bla...\", length=4)\n\n    # 2. A text file:\n    summary = summarizer.summarize(\"/path/to/file.txt\", length=0.25)\n    #\u00a0Above summary is a quarter of the length of the original text\n\n    # 3. A URL (must start with http://):\n    summary = summarizer.summarize(\"http://newsite.com/some_article\")\n\nLatent Semantic Analysis (LSA) Summarization\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nReduces the dimensionality of the article into several \"topic\" clusters\nusing singular value decomposition, and selects the sentences that are\nmost relevant to these topics. This is a rather more abstract\nsummarization algorithm.\n\nThis module comes packaged with two distinct implementations of the LSA\nalgorithm, as described in two academic papers:\n\n-  J. Steinberger and K. Jezek (2004). Using latent semantic analysis in\n   text summarization and summary evaluation.\n-  Ozsoy, M., Alpaslan, F., and Cicekli, I. (2011). Text summarization\n   using latent semantic analysis.\n\nThe more recent Ozsoy et al. implentation is called by default, but both\nclasses have the same interface.\n\n.. code:: python\n\n    from pytldr.summarize.lsa import LsaSummarizer, LsaOzsoy, LsaSteinberger\n\n    summarizer = LsaOzsoy()\n    summarizer = LsaSteinberger()\n    summarizer = LsaSummarizer()  # This is identical to the LsaOzsoy object\n\n    summary = summarizer.summarize(\n        text, topics=4, length=5, binary_matrix=True, topic_sigma_threshold=0.5\n    )\n\n    # topics specifies the number of topics to cluster the article into.\n    # topic_sigma_threshold removes all topics with a singular value less than a given\n    # percentage of the largest singular value.\n\nRelevance Score Summarization\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThis method computes and ranks the cosine similarity between each\nsentence vector and the overall document, removing the most relevant\nsentence at each iteration. It closely follows the approach described in\nthe paper:\n\n-  Y. Gong and X. Liu (2001). Generic text summarization using relevance\n   measure and latent semantic analysis.\n\n.. code:: python\n\n    from pytldr.summarize.relevance import RelevanceSummarizer\n\n    summarizer = RelevanceSummarizer()\n    summary = summarizer.summarize(text, length=5, binary_matrix=True):\n\nMore help\n~~~~~~~~~\n\nYou can read the documentation for each of the above implementations by\ntyping the following into your python console:\n\n.. code:: python\n\n    help(TextRankSummarizer)\n    help(LsaSummarizer)\n    help(RelevanceSummarizer)\n\nContact\n-------\n\nIf you have any questions or have encountered an error, feel free to\ncontact me at ``jai -dot- juneja -at- gmail -dot- com``.\n\n.. |Build Status| image:: https://travis-ci.org/jaijuneja/PyTLDR.svg?branch=master\n   :target: https://travis-ci.org/jaijuneja/PyTLDR\n.. |PyPI version| image:: https://badge.fury.io/py/pytldr.svg\n   :target: https://pypi.python.org/pypi/pytldr",
        "description_content_type": null,
        "docs_url": null,
        "download_url": "UNKNOWN",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/jaijuneja/PyTLDR",
        "keywords": "summarizer,summarization,natural language processing,nlp,machine learning,data mining,latent semantic analysis,lsa",
        "license": "BSD",
        "maintainer": null,
        "maintainer_email": null,
        "name": "PyTLDR",
        "package_url": "https://pypi.org/project/PyTLDR/",
        "platform": "UNKNOWN",
        "project_url": "https://pypi.org/project/PyTLDR/",
        "project_urls": {
            "Download": "UNKNOWN",
            "Homepage": "https://github.com/jaijuneja/PyTLDR"
        },
        "release_url": "https://pypi.org/project/PyTLDR/0.1.4/",
        "requires_dist": null,
        "requires_python": null,
        "summary": "A module to perform automatic article summarization.",
        "version": "0.1.4"
    },
    "last_serial": 1454862,
    "releases": {
        "0.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "54b2fdddd79f8e6d93c5f48b1df491b1",
                    "sha256": "17be7770e573cd0b4a28abfae7fab30f715819829a5e685d3041ef5b33a617af"
                },
                "downloads": -1,
                "filename": "PyTLDR-0.1.tar.gz",
                "has_sig": false,
                "md5_digest": "54b2fdddd79f8e6d93c5f48b1df491b1",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 17207,
                "upload_time": "2015-03-06T04:08:12",
                "url": "https://files.pythonhosted.org/packages/46/0d/54e378ab8fc6771bf43095fe9d4571a63fd9eaecdb0e2a6b044227084e6d/PyTLDR-0.1.tar.gz"
            }
        ],
        "0.1.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "bd5905bba9dcdc69b8120ad7a2f81f51",
                    "sha256": "bd23b191842f44c00a653453d853197ce068a1a9056b42d28dd422c1838f61b6"
                },
                "downloads": -1,
                "filename": "PyTLDR-0.1.1.tar.gz",
                "has_sig": false,
                "md5_digest": "bd5905bba9dcdc69b8120ad7a2f81f51",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 17800,
                "upload_time": "2015-03-06T05:49:59",
                "url": "https://files.pythonhosted.org/packages/e4/81/ee0c52f40c867fa34d695929393931e4c1e19642b17efc52d4e64697a345/PyTLDR-0.1.1.tar.gz"
            }
        ],
        "0.1.2": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "3461a0e43a96c80c44d93370f41c85f8",
                    "sha256": "823aaced6b46e5053887c0e6aca5133a3a6f8f32c49e6768a084ffdb47140e71"
                },
                "downloads": -1,
                "filename": "PyTLDR-0.1.2.tar.gz",
                "has_sig": false,
                "md5_digest": "3461a0e43a96c80c44d93370f41c85f8",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 17969,
                "upload_time": "2015-03-07T03:50:08",
                "url": "https://files.pythonhosted.org/packages/16/25/a09a39be1d053c504077c00173c07fc4985ab3084a78e98f2607d9670b8e/PyTLDR-0.1.2.tar.gz"
            }
        ],
        "0.1.3": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "66a2d3f5870db02fb532ef285dd04e08",
                    "sha256": "b3a8cdca419f041944db0689356797675bfc42f3b69f586745292702542a60aa"
                },
                "downloads": -1,
                "filename": "PyTLDR-0.1.3.tar.gz",
                "has_sig": false,
                "md5_digest": "66a2d3f5870db02fb532ef285dd04e08",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 18109,
                "upload_time": "2015-03-09T15:13:16",
                "url": "https://files.pythonhosted.org/packages/49/7a/c606250e94a6ce679a1ab9cbbe7fe5ed0e058bc5849766247394f96e8609/PyTLDR-0.1.3.tar.gz"
            }
        ],
        "0.1.4": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "649ca803854bd2a182b08d981552204e",
                    "sha256": "d00624749438c5e5f6fbf4a39ca97797ff80407a86c1fadd2bf4663c18cd5ffa"
                },
                "downloads": -1,
                "filename": "PyTLDR-0.1.4.tar.gz",
                "has_sig": false,
                "md5_digest": "649ca803854bd2a182b08d981552204e",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 18134,
                "upload_time": "2015-03-09T22:32:23",
                "url": "https://files.pythonhosted.org/packages/70/09/02ed27061159e5f6d35abad4ec9ef3cac8e220093d61a2f7a42f53c9cb22/PyTLDR-0.1.4.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "649ca803854bd2a182b08d981552204e",
                "sha256": "d00624749438c5e5f6fbf4a39ca97797ff80407a86c1fadd2bf4663c18cd5ffa"
            },
            "downloads": -1,
            "filename": "PyTLDR-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "649ca803854bd2a182b08d981552204e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 18134,
            "upload_time": "2015-03-09T22:32:23",
            "url": "https://files.pythonhosted.org/packages/70/09/02ed27061159e5f6d35abad4ec9ef3cac8e220093d61a2f7a42f53c9cb22/PyTLDR-0.1.4.tar.gz"
        }
    ]
}