{ "info": { "author": "Stanford NLP Group", "author_email": "chaganty@cs.stanford.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Topic :: Software Development :: Object Brokering" ], "description": "Stanford CoreNLP Python Interface\n=================================\n\n.. image:: https://travis-ci.org/stanfordnlp/python-stanford-corenlp.svg?branch=master\n :target: https://travis-ci.org/stanfordnlp/python-stanford-corenlp\n\nThis package contains a python interface for `Stanford CoreNLP\n`_ that contains a reference\nimplementation to interface with the `Stanford CoreNLP server\n`_.\nThe package also contains a base class to expose a python-based annotation\nprovider (e.g. 
your favorite neural NER system) to the CoreNLP\npipeline via a lightweight service.\n\nTo use the package, first download the `official java CoreNLP release \n`_, unzip it, and define an environment\nvariable :code:`$CORENLP_HOME` that points to the unzipped directory.\n\nYou can also install this package from `PyPI `_ using :code:`pip install stanford-corenlp` \n\n----\n\nCommand Line Usage\n------------------\nProbably the easiest way to use this package is through the `annotate` command-line utility::\n\n usage: annotate [-h] [-i INPUT] [-o OUTPUT] [-f {json}]\n [-a ANNOTATORS [ANNOTATORS ...]] [-s] [-v] [-m MEMORY]\n [-p PROPS [PROPS ...]]\n\n Annotate data\n\n optional arguments:\n -h, --help show this help message and exit\n -i INPUT, --input INPUT\n Input file to process; each line contains one document\n (default: stdin)\n -o OUTPUT, --output OUTPUT\n File to write annotations to (default: stdout)\n -f {json}, --format {json}\n Output format\n -a ANNOTATORS [ANNOTATORS ...], --annotators ANNOTATORS [ANNOTATORS ...]\n A list of annotators\n -s, --sentence-mode Assume each line of input is a sentence.\n -v, --verbose-server Server is made verbose\n -m MEMORY, --memory MEMORY\n Memory to use for the server\n -p PROPS [PROPS ...], --props PROPS [PROPS ...]\n Properties as a list of key=value pairs\n\n\nWe recommend using `annotate` in conjunction with the wonderful `jq`\ncommand to process the output. As an example, given a file with a\nsentence on each line, the following command produces the equivalent\nspace-separated tokens::\n\n cat file.txt | annotate -s -a tokenize | jq '[.tokens[].originalText]' > tokenized.txt\n\n\nAnnotation Server Usage\n-----------------------\n\n.. 
code-block:: python\n\n import corenlp\n\n text = \"Chris wrote a simple sentence that he parsed with Stanford CoreNLP.\"\n\n # We assume that you've downloaded Stanford CoreNLP and defined an environment\n # variable $CORENLP_HOME that points to the unzipped directory.\n # The code below will launch StanfordCoreNLPServer in the background\n # and communicate with the server to annotate the sentence.\n with corenlp.CoreNLPClient(annotators=\"tokenize ssplit pos lemma ner depparse\".split()) as client:\n ann = client.annotate(text)\n\n # You can access annotations using ann.\n sentence = ann.sentence[0]\n\n # The corenlp.to_text function is a helper function that\n # reconstructs a sentence from tokens.\n assert corenlp.to_text(sentence) == text\n\n # You can access any property within a sentence.\n print(sentence.text)\n\n # Likewise for tokens\n token = sentence.token[0]\n print(token.lemma)\n\n # Use tokensregex patterns to find who wrote a sentence.\n pattern = '([ner: PERSON]+) /wrote/ /an?/ []{0,3} /sentence|article/'\n matches = client.tokensregex(text, pattern)\n # sentences contains a list with matches for each sentence.\n assert len(matches[\"sentences\"]) == 1\n # length tells you whether or not there are any matches in this sentence.\n assert matches[\"sentences\"][0][\"length\"] == 1\n # You can access matches like most regex groups.\n matches[\"sentences\"][0][\"0\"][\"text\"] == \"Chris wrote a simple sentence\"\n matches[\"sentences\"][0][\"0\"][\"1\"][\"text\"] == \"Chris\"\n\n # Use semgrex patterns to directly find who wrote what.\n pattern = '{word:wrote} >nsubj {}=subject >dobj {}=object'\n matches = client.semgrex(text, pattern)\n # sentences contains a list with matches for each sentence.\n assert len(matches[\"sentences\"]) == 1\n # length tells you whether or not there are any matches in this sentence.\n assert matches[\"sentences\"][0][\"length\"] == 1\n # You can access matches like most regex groups.\n matches[\"sentences\"][0][\"0\"][\"text\"] == 
\"wrote\"\n matches[\"sentences\"][1][\"0\"][\"$subject\"][\"text\"] == \"Chris\"\n matches[\"sentences\"][1][\"0\"][\"$object\"][\"text\"] == \"sentence\"\n\nSee `test_client.py` and `test_protobuf.py` for more examples. Props to\n@dan-zheng for tokensregex/semgrex support.\n\n\nAnnotation Service Usage\n------------------------\n\n*NOTE*: The annotation service allows users to provide a custom\nannotator to be used by the CoreNLP pipeline. Unfortunately, it relies\non experimental code internal to the Stanford CoreNLP project is not yet\navailable for public use.\n\n.. code-block:: python\n\n import corenlp\n from .happyfuntokenizer import Tokenizer\n\n class HappyFunTokenizer(Tokenizer, corenlp.Annotator):\n def __init__(self, preserve_case=False):\n Tokenizer.__init__(self, preserve_case)\n corenlp.Annotator.__init__(self)\n\n @property\n def name(self):\n \"\"\"\n Name of the annotator (used by CoreNLP)\n \"\"\"\n return \"happyfun\"\n\n @property\n def requires(self):\n \"\"\"\n Requires has to specify all the annotations required before we\n are called.\n \"\"\"\n return []\n\n @property\n def provides(self):\n \"\"\"\n The set of annotations guaranteed to be provided when we are done.\n NOTE: that these annotations are either fully qualified Java\n class names or refer to nested classes of\n edu.stanford.nlp.ling.CoreAnnotations (as is the case below).\n \"\"\"\n return [\"TextAnnotation\",\n \"TokensAnnotation\",\n \"TokenBeginAnnotation\",\n \"TokenEndAnnotation\",\n \"CharacterOffsetBeginAnnotation\",\n \"CharacterOffsetEndAnnotation\",\n ]\n\n def annotate(self, ann):\n \"\"\"\n @ann: is a protobuf annotation object.\n Actually populate @ann with tokens.\n \"\"\"\n buf, beg_idx, end_idx = ann.text.lower(), 0, 0\n for i, word in enumerate(self.tokenize(ann.text)):\n token = ann.sentencelessToken.add()\n # These are the bare minimum required for the TokenAnnotation\n token.word = word\n token.tokenBeginIndex = i\n token.tokenEndIndex = i+1\n\n # Seek into 
the text until you can find this word.\n try:\n # Try to update beginning index\n beg_idx = buf.index(word, beg_idx)\n except ValueError:\n # Give up -- this will be something random\n pass\n end_idx = beg_idx + len(word)\n\n token.beginChar = beg_idx\n token.endChar = end_idx\n\n beg_idx = end_idx\n\n annotator = HappyFunTokenizer()\n # Calling .start() will launch the annotator as a service running on\n # port 8432 by default.\n annotator.start()\n\n # annotator.properties contains all the right properties for\n # Stanford CoreNLP to use this annotator.\n with corenlp.CoreNLPClient(properties=annotator.properties, annotators=\"happyfun ssplit pos\".split()) as client:\n ann = client.annotate(\"RT @ #happyfuncoding: this is a typical Twitter tweet :-)\")\n\n tokens = [t.word for t in ann.sentence[0].token]\n print(tokens)\n\n\nSee `test_annotator.py` for more examples.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/stanfordnlp/python-stanford-corenlp", "keywords": "corenlp natural-language-processing nlp", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "stanford-corenlp", "package_url": "https://pypi.org/project/stanford-corenlp/", "platform": "", "project_url": "https://pypi.org/project/stanford-corenlp/", "project_urls": { "Homepage": "https://github.com/stanfordnlp/python-stanford-corenlp" }, "release_url": "https://pypi.org/project/stanford-corenlp/3.9.2/", "requires_dist": [ "corenlp-protobuf (>=3.8.0)", "requests (>=2.10.0)", "six (>=1.9)", "check-manifest; extra == 'dev'", "coverage; extra == 'test'" ], "requires_python": "", "summary": "Official python interface for Stanford CoreNLP", "version": "3.9.2" }, "last_serial": 4692189, "releases": { "3.7.0": [ { "comment_text": "", "digests": { "md5": "d0b4cbaad54ff343584924829a7406b2", "sha256": 
"ba633e186427fe725e40207c810eea2b40a36ba3329ecdaaebb4dc68045ba865" }, "downloads": -1, "filename": "stanford-corenlp-3.7.0.tar.gz", "has_sig": false, "md5_digest": "d0b4cbaad54ff343584924829a7406b2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12766, "upload_time": "2017-06-10T03:33:24", "url": "https://files.pythonhosted.org/packages/eb/c2/9dae16864dc322e130ce5133d47d49744dec03b538ae64ba6c583bcfb376/stanford-corenlp-3.7.0.tar.gz" } ], "3.7.1": [ { "comment_text": "", "digests": { "md5": "4dc9706f1851430686e65a3d25f0e17a", "sha256": "d49b11f394070e9353eaa283188b2c4a531ca4337c6025c2a01266edeb848197" }, "downloads": -1, "filename": "stanford-corenlp-3.7.1.tar.gz", "has_sig": false, "md5_digest": "4dc9706f1851430686e65a3d25f0e17a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12838, "upload_time": "2017-06-10T19:28:51", "url": "https://files.pythonhosted.org/packages/5a/94/3aafea9951b440bec011023a1154d620036d225ce2264bbfb42f0eb3d1f9/stanford-corenlp-3.7.1.tar.gz" } ], "3.8.0": [ { "comment_text": "", "digests": { "md5": "712e29e37dfc69b1630379cf6dc08446", "sha256": "baee23f6deff0bd91a43a1ab035a0dec3d5bc302d448075daddeffdadc5f176d" }, "downloads": -1, "filename": "stanford_corenlp-3.8.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "712e29e37dfc69b1630379cf6dc08446", "packagetype": "bdist_wheel", "python_version": "2.7", "requires_python": null, "size": 11509, "upload_time": "2017-11-18T23:15:16", "url": "https://files.pythonhosted.org/packages/5d/64/440e3daa30bb48fe6c08df40c8251326268c34d9402fb64b85c0a69ac09d/stanford_corenlp-3.8.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "89c43a89820ea46ba47c5581ca5a3f21", "sha256": "bd0530f1828b70e4e785f1265fd065b25501826c000ef1820886e919fc71a02c" }, "downloads": -1, "filename": "stanford-corenlp-3.8.0.tar.gz", "has_sig": false, "md5_digest": "89c43a89820ea46ba47c5581ca5a3f21", "packagetype": "sdist", "python_version": 
"source", "requires_python": null, "size": 14196, "upload_time": "2017-11-18T23:15:15", "url": "https://files.pythonhosted.org/packages/19/12/c98ba238da577f5ae2ab35ebe4f877de8a1e2dede32d054c93b181632014/stanford-corenlp-3.8.0.tar.gz" } ], "3.9.2": [ { "comment_text": "", "digests": { "md5": "2769e25407f7d0cde21b77074337c128", "sha256": "43e94ab1d78aac8ccfc6d02290620ec856b5a6fed2a5564938292966db806baa" }, "downloads": -1, "filename": "stanford_corenlp-3.9.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "2769e25407f7d0cde21b77074337c128", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 11160, "upload_time": "2019-01-14T00:26:10", "url": "https://files.pythonhosted.org/packages/31/a9/695357743b55c08e74e46fe72579ca2a2559fa9b196d9f2035339af89b94/stanford_corenlp-3.9.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e30b1a2618717406712cd91b417c6533", "sha256": "598a18653779cbd5d5fcbf7d2b6ccf3f7372de9a5f7ecd8c9e4cbbb65ba46d64" }, "downloads": -1, "filename": "stanford-corenlp-3.9.2.tar.gz", "has_sig": false, "md5_digest": "e30b1a2618717406712cd91b417c6533", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18317, "upload_time": "2019-01-14T00:26:11", "url": "https://files.pythonhosted.org/packages/20/0f/7e5675963ceec1fec931b3c0ccde88bca94698f29dae431dc9e57794e67b/stanford-corenlp-3.9.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2769e25407f7d0cde21b77074337c128", "sha256": "43e94ab1d78aac8ccfc6d02290620ec856b5a6fed2a5564938292966db806baa" }, "downloads": -1, "filename": "stanford_corenlp-3.9.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "2769e25407f7d0cde21b77074337c128", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 11160, "upload_time": "2019-01-14T00:26:10", "url": 
"https://files.pythonhosted.org/packages/31/a9/695357743b55c08e74e46fe72579ca2a2559fa9b196d9f2035339af89b94/stanford_corenlp-3.9.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e30b1a2618717406712cd91b417c6533", "sha256": "598a18653779cbd5d5fcbf7d2b6ccf3f7372de9a5f7ecd8c9e4cbbb65ba46d64" }, "downloads": -1, "filename": "stanford-corenlp-3.9.2.tar.gz", "has_sig": false, "md5_digest": "e30b1a2618717406712cd91b417c6533", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18317, "upload_time": "2019-01-14T00:26:11", "url": "https://files.pythonhosted.org/packages/20/0f/7e5675963ceec1fec931b3c0ccde88bca94698f29dae431dc9e57794e67b/stanford-corenlp-3.9.2.tar.gz" } ] }