{
    "info": {
        "author": "John K. Von Seggern",
        "author_email": "vonseg@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 2 - Pre-Alpha",
            "License :: OSI Approved :: MIT License",
            "Programming Language :: Python",
            "Topic :: Software Development :: Compilers",
            "Topic :: Software Development :: Interpreters"
        ],
        "description": "sourcer\n=======\n\nSimple parsing library for Python.\n\nThere's not much documentation yet, and the performance is probably pretty\nbad, but if you want to give it a try, go for it!\n\nFeel free to send me your feedback at vonseg@gmail.com. Or use the github\n`issue tracker <https://github.com/jvs/sourcer/issues>`_.\n\n.. contents::\n\n\nInstallation\n------------\n\nTo install sourcer::\n\n    pip install sourcer\n\nIf pip is not installed, use easy_install::\n\n    easy_install sourcer\n\nOr download the source from `github <https://github.com/jvs/sourcer>`_\nand install with::\n\n    python setup.py install\n\n\nExamples\n--------\n\n\nExample 1: Hello, World!\n~~~~~~~~~~~~~~~~~~~~~~~~\n\nLet's parse the string \"Hello, World!\" (just to make sure the basics work):\n\n.. code:: python\n\n    from sourcer import *\n\n    # Let's parse strings like \"Hello, foo!\", and just keep the \"foo\" part.\n    greeting = 'Hello' >> Opt(',') >> ' ' >> Pattern(r'\\w+') << '!'\n\n    # Let's try it on the string \"Hello, World!\"\n    person1 = parse(greeting, 'Hello, World!')\n    assert person1 == 'World'\n\n    # Now let's try omitting the comma, since we made it optional (with \"Opt\").\n    person2 = parse(greeting, 'Hello Chief!')\n    assert person2 == 'Chief'\n\nSome notes about this example:\n\n* The ``>>`` operator means \"Discard the result from the left operand. Just\n  return the result from the right operand.\"\n* The ``<<`` operator similarly means \"Just return the result from the result\n  from the left operand and discard the result from the right operand.\"\n* ``Opt`` means \"This term is optional. Parse it if it's there, otherwise just\n  keep going.\"\n* ``Pattern`` means \"Parse strings that match this regular expression.\"\n\n\nExample 2: Parsing Arithmetic Expressions\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nHere's a quick example showing how to use operator precedence parsing:\n\n.. code:: python\n\n    from sourcer import *\n\n    Int = Pattern(r'\\d+') * int\n    Parens = '(' >> ForwardRef(lambda: Expr) << ')'\n    Expr = OperatorPrecedence(\n        Int | Parens,\n        InfixRight('^'),\n        Prefix('+', '-'),\n        Postfix('%'),\n        InfixLeft('*', '/'),\n        InfixLeft('+', '-'),\n    )\n\n    # Now let's try parsing an expression.\n    t1 = parse(Expr, '1+2^3/4')\n    assert t1 == Operation(1, '+', Operation(Operation(2, '^', 3), '/', 4))\n\n    # Let's try putting some parentheses in the next one.\n    t2 = parse(Expr, '1*(2+3)')\n    assert t2 == Operation(1, '*', Operation(2, '+', 3))\n\n    # Finally, let's try using a unary operator in our expression.\n    t3 = parse(Expr, '-1*2')\n    assert t3 == Operation(Operation(None, '-', 1), '*', 2)\n\nSome notes about this example:\n\n* The ``*`` operator means \"Take the result from the left operand and then\n  apply the function on the right.\"\n* In this case, the function is simply ``int``.\n* So in our example, the ``Int`` rule matches any string of digit characters\n  and produces the corresponding ``int`` value.\n* So the ``Parens`` rule in our example parses an expression in parentheses,\n  discarding the parentheses.\n* The ``ForwardRef`` term is necessary because the ``Parens`` rule wants to\n  refer to the ``Expr`` rule, but ``Expr`` hasn't been defined by that point.\n* The ``OperatorPrecedence`` rule constructs the operator precedence table.\n  It parses operations and returns ``Operation`` objects.\n\n\nExample 3: Building an Abstract Syntax Tree\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nLet's try building a simple AST for the\n`lambda calculus <http://en.wikipedia.org/wiki/Lambda_calculus>`_. We can use\n``Struct`` classes to define the AST and the parser at the same time:\n\n.. code:: python\n\n    from sourcer import *\n\n    class Identifier(Struct):\n        def parse(self):\n            self.name = Word\n\n    class Abstraction(Struct):\n        def parse(self):\n            self.parameter = '\\\\' >> Word\n            self.body = '. ' >> Expr\n\n    class Application(LeftAssoc):\n        def parse(self):\n            self.left = Operand\n            self.operator = ' '\n            self.right = Operand\n\n    Word = Pattern(r'\\w+')\n    Parens = '(' >> ForwardRef(lambda: Expr) << ')'\n    Operand = Parens | Abstraction | Identifier\n    Expr = Application | Operand\n\n    t1 = parse(Expr, r'(\\x. x) y')\n    assert isinstance(t1, Application)\n    assert isinstance(t1.left, Abstraction)\n    assert isinstance(t1.right, Identifier)\n    assert t1.left.parameter == 'x'\n    assert t1.left.body.name == 'x'\n    assert t1.right.name == 'y'\n\n    t2 = parse(Expr, 'x y z')\n    assert isinstance(t2, Application)\n    assert isinstance(t2.left, Application)\n    assert isinstance(t2.right, Identifier)\n    assert t2.left.left.name == 'x'\n    assert t2.left.right.name == 'y'\n    assert t2.right.name == 'z'\n\n\nExample 4: Tokenizing\n~~~~~~~~~~~~~~~~~~~~~\n\nIt's often useful to tokenize your input before parsing it. Let's create a\ntokenizer for the lambda calculus.\n\n.. code:: python\n\n    from sourcer import *\n\n    class LambdaTokens(TokenSyntax):\n        def __init__(self):\n            self.Word = r'\\w+'\n            self.Symbol = AnyChar(r'(\\.)')\n            self.Space = Skip(r'\\s+')\n\n    # Run the tokenizer on a lambda term with a bunch of random whitespace.\n    Tokens = LambdaTokens()\n    ans1 = tokenize(Tokens, '\\n (   x  y\\n\\t) ')\n\n    # Assert that we didn't get any space tokens.\n    assert len(ans1) == 4\n    (t1, t2, t3, t4) = ans1\n    assert isinstance(t1, Tokens.Symbol) and t1.content == '('\n    assert isinstance(t2, Tokens.Word) and t2.content == 'x'\n    assert isinstance(t3, Tokens.Word) and t3.content == 'y'\n    assert isinstance(t4, Tokens.Symbol) and t4.content == ')'\n\n    # Let's use the tokenizer with a simple grammar, just to show how that\n    # works.\n    Sentence = Some(Tokens.Word) << '.'\n    ans2 = tokenize_and_parse(Tokens, Sentence, 'This is a test.')\n\n    # Assert that we got a list of Word tokens.\n    assert all(isinstance(i, Tokens.Word) for i in ans2)\n\n    # Assert that the tokens have the expected content.\n    contents = [i.content for i in ans2]\n    assert contents == ['This', 'is', 'a', 'test']\n\n\nIn this example, the ``Skip`` term tells the tokenizer that we want to ignore\nwhitespace. The ``AnyChar`` term tell the tokenizer that a symbol can be any\none of the characters ``(``, ``\\``, ``.``, ``)``. Alternatively, we could have\nused:\n\n.. code:: python\n\n    Symbol = r'[(\\\\.)]'\n\n\nExample 5: Parsing Significant Indentation\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWe can use sourcer to parse languages with significant indentation. Here's a\nbare-bones example to demonstrate one possible approach.\n\n.. code:: python\n\n    from sourcer import *\n\n    class TestTokens(TokenSyntax):\n        def __init__(self):\n            # Let's just use words, newlines, and spaces in this example.\n            self.Word = r'\\w+'\n            self.Newline = r'\\n'\n            # In this case, we'll say that an indent is a newline followed by\n            # some spaces, followed by a word.\n            self.Indent = r'(?<=\\n) +(?=\\w)'\n            # And let's just throw out all other space characters.\n            self.Space = Skip(' +')\n\n    # All our token classes are attributes of this ``Tokens`` object. It's\n    # essentially a namespace for our token classes.\n    Tokens = TestTokens()\n\n    class InlineStatement(Struct):\n        def parse(self):\n            # Let's say an inline-statement is just some word tokens. We'll use\n            # ``Content`` to get the string content of each token (since in this\n            # case, we don't care about the tokens themselves).\n            self.words = Some(Content(Tokens.Word))\n\n        def __repr__(self):\n            # We'll define a ``repr`` method so that we can easily check the\n            # parse results. We'll just put a semicolon after each statement.\n            return '%s;' % ' '.join(self.words)\n\n    class Block(Struct):\n        def parse(self, indent=''):\n            # A block is a bunch of statements at the same indentation,\n            # all separated by some newline tokens.\n            self.statements = Statement(indent) // Some(Tokens.Newline)\n\n        def __repr__(self):\n            # In this case, we'll put a space between each statement and enclose\n            # the whole block in curly braces. This will make it easy for us to\n            # tell if our parse results look right.\n            return '{%s}' % ' '.join(repr(i) for i in self.statements)\n\n    def Statement(indent):\n        # Let's say there are two ways to get a statement:\n        # - Get an inline-statement with the current indentation.\n        # - Get a block that is indented farther than the current indentation.\n        return (CurrentIndent(indent) >> InlineStatement\n            | IncreaseIndent(indent) ** Block)\n\n    def CurrentIndent(indent):\n        # The point of this function is to return a parsing expression that\n        # matches the current indent (which is provided as an argument).\n        return Return('') if indent == '' else indent\n\n    def IncreaseIndent(current):\n        # To see if the next indentation is more than the current indentation,\n        # we peek at the next token, using ``Expect``, and we get its string\n        # content using ``Content``. The ``^`` operator means \"require\". In this\n        # case, we require that the next indentation is longer than the current\n        # indentation.\n        token = Expect(Content(Tokens.Indent))\n        return token ^ (lambda token: len(current) < len(token))\n\n    # Let's say that a program is a block, optionally surrounded by newlines.\n    # (The ``>>`` and ``<<`` operators discard the newlines in this case.)\n    OptNewlines = List(Tokens.Newline)\n    Program = OptNewlines >> Block << OptNewlines\n\n    test = '''\n    print foo\n    while true\n        print bar\n        if baz\n            then break\n    exit\n    '''\n\n    # Let's parse the test case and then use ``repr`` to make sure that we get\n    # back what we expect.\n    ans = tokenize_and_parse(Tokens, Program, test)\n    expect = '{print foo; while true; {print bar; if baz; {then break;}} exit;}'\n    assert repr(ans) == expect\n\n\nMore Examples\n-------------\nParsing `Excel formula <https://github.com/jvs/sourcer/tree/master/examples>`_\nand some corresponding\n`test cases <https://github.com/jvs/sourcer/blob/master/tests/test_excel.py>`_.",
        "description_content_type": null,
        "docs_url": null,
        "download_url": "UNKNOWN",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/jvs/sourcer",
        "keywords": "packrat,parser,peg",
        "license": "MIT License",
        "maintainer": null,
        "maintainer_email": null,
        "name": "sourcer",
        "package_url": "https://pypi.org/project/sourcer/",
        "platform": "any",
        "project_url": "https://pypi.org/project/sourcer/",
        "project_urls": {
            "Download": "UNKNOWN",
            "Homepage": "https://github.com/jvs/sourcer"
        },
        "release_url": "https://pypi.org/project/sourcer/0.1.3/",
        "requires_dist": null,
        "requires_python": null,
        "summary": "simple parsing library",
        "version": "0.1.3"
    },
    "last_serial": 1442541,
    "releases": {
        "0.1.1": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "dd3c8d605c374a38e2cd2a70190ab86b",
                    "sha256": "42d0ce9a77bd2aa7b5e2d9d428348e4a9344e1b4342059e3e2ac20803f2f9374"
                },
                "downloads": -1,
                "filename": "sourcer-0.1.1.tar.gz",
                "has_sig": false,
                "md5_digest": "dd3c8d605c374a38e2cd2a70190ab86b",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 4877,
                "upload_time": "2015-02-19T07:39:06",
                "url": "https://files.pythonhosted.org/packages/77/66/48b0a515537d885950d74b51a266b70143b37ae1db4fd781cd0ecc0ee9cb/sourcer-0.1.1.tar.gz"
            }
        ],
        "0.1.2": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "5b99715347ac429d5f52ce9a0c2568c6",
                    "sha256": "61b78a8a4412fff8aef109cc143c08953bb85154bb865e61cb54682f6375fd31"
                },
                "downloads": -1,
                "filename": "sourcer-0.1.2.tar.gz",
                "has_sig": false,
                "md5_digest": "5b99715347ac429d5f52ce9a0c2568c6",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 7420,
                "upload_time": "2015-02-21T23:58:24",
                "url": "https://files.pythonhosted.org/packages/24/cf/f82de97ea7522527177451c7f20f00c3b7dc0065f1f960de031f270c8162/sourcer-0.1.2.tar.gz"
            }
        ],
        "0.1.3": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "1efb5675cb7338b8e8493b0a81e0d71b",
                    "sha256": "bdae3b196960acd6171009b9e865420f9d664aaa8f6b7fb1e07ca8495bdb5e0d"
                },
                "downloads": -1,
                "filename": "sourcer-0.1.3.tar.gz",
                "has_sig": false,
                "md5_digest": "1efb5675cb7338b8e8493b0a81e0d71b",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 10952,
                "upload_time": "2015-02-28T19:21:54",
                "url": "https://files.pythonhosted.org/packages/83/0e/b8ccbb6eeebe3ac0db24b089d5bbf4dbdd12470eef6269916f32641e05e6/sourcer-0.1.3.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "1efb5675cb7338b8e8493b0a81e0d71b",
                "sha256": "bdae3b196960acd6171009b9e865420f9d664aaa8f6b7fb1e07ca8495bdb5e0d"
            },
            "downloads": -1,
            "filename": "sourcer-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "1efb5675cb7338b8e8493b0a81e0d71b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 10952,
            "upload_time": "2015-02-28T19:21:54",
            "url": "https://files.pythonhosted.org/packages/83/0e/b8ccbb6eeebe3ac0db24b089d5bbf4dbdd12470eef6269916f32641e05e6/sourcer-0.1.3.tar.gz"
        }
    ]
}