{ "info": { "author": "Ruslan Spivak", "author_email": "ruslan.spivak@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: Unix", "Programming Language :: Python", "Topic :: Software Development :: Interpreters" ], "description": "TinyPie - Tree-Based Interpreter, Compiler, and VM for TinyPie language\n=======================================================================\n\nOverview\n--------\n\nTinyPie is a tree-based interpreter for a simple programming\nlanguage with a Python-like syntax.\n\nIt's based on Pie language from [Language Implementation Patterns](http://pragprog.com/titles/tpdsl/language-implementation-patterns) Ch.9\n\nQuote from [the book](http://pragprog.com/titles/tpdsl/language-implementation-patterns): \"A tree-based interpreter is like a compiler front\nend with an interpreter grafted onto the end instead of a code generator\"\n\nTinyPie also includes Bytecode Assembler and Register-Based Virtual Machine.\n\nGoals of the project\n--------------------\n\n1. Self-education\n\n2. To serve as an example for people interested in crafting\n their own interpreter in Python for a simple programming language or DSL\n\n\nInstallation\n------------\n\n1. Using `buildout`\n\n $ git clone git://github.com/rspivak/tinypie.git\n $ cd tinypie\n $ python bootstrap.py\n $ bin/buildout\n\n Run the interpreter\n\n $ bin/tinypie factorial.tp\n\n2. Using `pip` or `easy_install` (no need for sudo if using `virtualenv`)\n\n $ sudo pip install tinypie\n\n (or run alternatively easy_install: $ sudo easy_install tinypie)\n\n Run the interpreter\n\n $ tinypie factorial.tp\n\n\nMain interpreter features\n-------------------------\n\n- Implemented in Python\n\n- Regexp-based lexer\n\n- LL(k) recursive-descent parser\n\n- Parser constructs homogeneous Abstract Syntax Tree (AST)\n\n- Static / lexical scope support.\n Interpreter builds complete scope tree during AST construction.\n\n- Interpeter manages global memory space and function space stack\n\n- Interpreter implements external AST visitor\n\n- Forward references support\n\n\nHigh-level overview:\n\n | +------------------------------------+\n | source code | Symbol Table / Scope Tree |\n | | |\n \\|/ | +-----------------------+ |\n +--------------X---------------+ | | GlobalScope | |\n | | | +----/-------------\\----+ |\n | Regexp-based lexer | | / \\ |\n | | |+--------/-----+ +-------\\-------+|\n +--------------+---------------+ ||FunctionSymbol| |FunctionSymbol ||\n | |+------X-------+ +-------X-------+|\n | token stream | /|\\ /|\\ |\n \\|/ | | | |\n +--------------X---------------+ |+------+-------+ +-------+-------+|\n | | || LocalScope | | LocalScope ||\n |LL(k) recursive-descent parser| |+--------------+ +---------------+|\n | | +------------------------------------+\n +--------------+---------------+\n | +------------------------------------+\n | AST | Memory Space |\n \\|/ | |\n +--------------X-------------+ |Function Space Global Memory |\n | Interpreter | |Stack Space |\n | with external tree visitor | |+-------------+ +----------------+|\n | | ||FunctionSpace| | MemorySpace ||\n +--------------+-------------+ |+-------------+ +----------------+|\n | |+-------------+ |\n | output ||FunctionSpace| |\n \\|/ |+-------------+ |\n X +------------------------------------+\n\n\n\nThis Tree-Based Interpreter is similar to [Syntax-Directed Interpreter](http://github.com/rspivak/nososql) but instead of directly executing\nsource code during parsing the interpreter executes the source code by\nconstructing AST and then walking the tree.\n\nAdvantages of tree-based interperter over syntax-directed one:\n\n1. Symbol definition and symbol resolution happen at different stages:\n - symbol definition is performed during parsing\n - symbol resolution is performed during interpretation\n\n This allows forward references and simplifies implementation -\n parser deals with scope tree only and interpreter deals with memory\n spaces.\n\n2. More flexible because of AST as an intermediate representation.\n\n\nLanguage grammar\n----------------\n\n program -> (function_definition | statement)+ EOF\n function_definition -> 'def' ID '(' (ID (',' ID)*)? ')' slist\n slist -> ':' NL statement+ '.' NL\n | statement\n statement -> 'print' expr NL\n | 'return' expr NL\n | call NL\n | assign NL\n | 'if' expr slist ('else' slist)?\n | 'while' expr slist\n | NL\n assign -> ID '=' expr\n expr -> add_expr (('<' | '==') add_expr)?\n add_expr -> mult_expr (('+' | '-') mult_expr)*\n mult_expr -> atom ('*' atom)*\n atom -> ID | INT | STRING | call | '(' expr ')'\n call -> ID '(' (expr (',' expr)*)? ')'\n\n\nShort language reference\n------------------------\n\n- statements: `print`, `return`, `if`, `while`, and function call\n- literals: integers and strings\n- operators: <, ==, +, -, *\n\nExpressions may contain identifiers.\n\nCode examples\n-------------\n\n1. Factorial\n\n print factorial(6) # forward reference\n\n def factorial(x): # function definition\n if x < 2 return 1 # if statement\n return x * factorial(x - 1) # return statement\n . # DOT marks the block end\n\n2. WHILE loop\n\n index = 0\n\n while index < 10:\n print index\n index = index + 1\n . # DOT marks the block end\n\n3. IF ELSE\n\n x = 10\n if x == 10 print 'Yes'\n else print 'No'\n\n4. Variable lookup in different scopes\n\n x = 1 # global variable\n def foo(x): # 'foo' is defined in global memory space\n print x # prints 3 (parameter value)\n . # DOT marks the block end\n def bar(): # 'bar' is defined in global memory space\n x = 7 # modify global variable\n . # DOT marks the block end\n\n foo(3) # prints 3\n bar()\n print x # prints 7 ('bar' modified global variable)\n\n\nAST Examples\n------------\n\nTrees in these examples are represented as S-expressions for tersness.\n\nFunction call:\n\n foo(5, 7)\n\n (CALL ID INT INT) means CALL is the root with children ID(foo), INT(5), and INT(7)\n\nFunction definition:\n\n def foo(x, y):\n z = x + y\n return z\n .\n\n (FUNC_DEF ID ID ID (BLOCK (ASSIGN ID (SUB ID ID)) (RETURN ID)))\n\n\nDevelopment\n-----------\n\nInstall 'enscript' utility (optional).\nIf you are on Ubuntu:\n\n $ sudo apt-get install enscript\n\nBoostrap the buildout and run it:\n\n $ cd tinypie\n $ python bootstrap.py\n $ bin/buildout\n\nRun tests, test coverage and produce coverage reports:\n\n $ bin/test\n $ bin/coverage-test\n $ bin/coveragereport\n\n Check ./var/report/tinypie.html out for coverage report.\n\nRun pep8 and pylint to check code style and search for potential bugs:\n\n $ bin/pep8\n $ bin/pylint\n\nAST visualizer\n--------------\n\nTo see how TinyPie AST for different language constructs looks\nlike you can use `gendot` command line utility that is generated\nby buildout. This utility generates DOT file than can be further\nprocessed by [dot](http://www.graphviz.org/) program to draw nice graphs.\n\n $ echo -n 'foo(3)' | bin/gendot\n digraph astgraph {\n node [shape=plaintext, fontsize=12, fontname=\"Courier\", height=.1];\n ranksep=.3;\n edge [arrowsize=.5]\n\n node1 [label=\"BLOCK\"];\n node2 [label=\"CALL\"];\n node3 [label=\"ID (foo)\"];\n node4 [label=\"INT (3)\"];\n\n node2 -> node3\n node2 -> node4\n node1 -> node2\n }\n\n To draw graph and save it as a PNG image:\n\n $ echo -n 'foo(3)' | bin/gendot > funcall.dot\n $ dot -Tpng -o funcall.png funcall.dot\n\n\nBytecode Assembler\n------------------\n\nConverts assembly program into binary bytecodes.\nThe bytecode is further interpreted by the TinyPie\nRegister-Based Bytecode Interpreter / Virtual Machine.\n\nTinyPie Assembly language grammar:\n\n program -> globals? (label | function_definition | instruction | NL)+\n globals -> '.globals' INT NL\n label -> ID ':' NL\n function_definition -> '.def' ID ':' 'args' '=' INT ',' 'locals' '=' INT NL\n instruction -> ID NL\n | ID operand NL\n | ID operand ',' operand NL\n | ID operand ',' operand NL ',' operand NL\n | 'call' ID ',' operand NL\n | 'loadk' REG ',' (INT | STRING) NL\n operand -> REG | ID | STRING | INT\n\n\nAssembler yields the following components:\n\n1. *Code memory*: This is a `bytearray` containing\n bytecode instructions and their operands derived from\n the assembly source code.\n\n2. *Global data memory size*: The number of slots allocated\n in global memory for use with GSTORE and GLOAD assembly commands.\n\n3. *Program entry point*: An address of main function `.def main: ...`\n\n4. *Constant pool*: A list of objects (integers, strings, function symbols)\n that are not part of the code memory. Bytecode instructions refer\n to those objects via integer index.\n\n\nHere is a factorial function in TinyPie language:\n\n def factorial(x):\n if x < 2 return 1\n return x * factorial(x - 1)\n .\n\n print factorial(5)\n\n\nHere is an equivalent TinyPie assembly code:\n\n .def factorial: args=1, locals=3\n # r1 holds argument 'n'\n loadk r2, 2\n lt r3, r1, r2 # n < 2 ?\n brf r3, cont # if n >= 2 jump to 'cont'\n loadk r0, 1 # else return 1\n ret\n cont:\n loadk r2, 1 # r2 = 1\n move r3, r1 # r3 = n\n sub r1, r1, r2 # r1 = n - 1\n call factorial, r1 # factorial(n - 1)\n mul r0, r3, r0 # n = n * result of factorial(n - 1)\n ret\n\n .def main: args=0, locals=1\n loadk r1, 5\n call factorial, r1 # factorial(5)\n print r0 # 120\n halt\n\nHere are the resulting elements produced by the Bytecode Assembler\nby translating the above assembly code:\n\n Constant pool:\n 0000: \n 0001: 2\n 0002: 1\n 0003: \n 0004: 3\n\n Code memory:\n 0000: 6 0 0 0 2 0 0 0\n 0008: 1 4 0 0 0 3 0 0\n 0016: 0 1 0 0 0 2 13 0\n 0024: 0 0 3 0 0 0 41 6\n 0032: 0 0 0 0 0 0 0 2\n 0040: 9 6 0 0 0 2 0 0\n 0048: 0 2 14 0 0 0 3 0\n 0056: 0 0 1 2 0 0 0 1\n 0064: 0 0 0 1 0 0 0 2\n 0072: 16 0 0 0 0 0 0 0\n 0080: 1 3 0 0 0 0 0 0\n 0088: 0 3 0 0 0 0 9 6\n 0096: 0 0 0 1 0 0 0 4\n 0104: 16 0 0 0 0 0 0 0\n 0112: 1 15 0 0 0 0 10\n\nBytecode instructions are described in the following section.\n\n\nRegister-Based Bytecode Interpreter / Virtual Machine\n-----------------------------------------------------\n\nVirtual Machine architecture:\n\n1. **IP**: Instruction pointer register that points into the *code memory*\n at the next instruction to execute.\n\n2. **CPU**: Instruction dispatcher that simulates *fetch-decode-execute*\n cycle with a *switch* (if elif) statement in a loop - reads bytecode\n at IP, decodes its operands and executes corresponding operation.\n\n3. **Global data memory**: a list of global objects. Contents is\n accessed via integer index.\n\n4. **Code memory**: Holds bytecode instructions and their operands.\n\n5. **Call stack**: Holds StackFrame objects with function return address,\n parameters, and local variables.\n\n6. **Stack frame**: A StackFrame object that holds all required information\n to invoke a function:\n - function symbol\n - function return address\n - registers hold return value, arguments, locals, and temporary values\n\n7. **FP**: Frame pointer - a special-purpose register that points to\n the top of the function `call stack`\n\n8. **Constant pool**: Integers, strings, and function symbols all go into\n constant pool. Instructions refer to constant pool values via an integer\n index.\n\n\nHigh-level overview:\n\n |\n TinyPie | assembly\n |\n V\n +-----------------------+\n | |\n | Bytecode Assembler |\n | |\n +----------+------------+\n |\n code memory(bytecode) | constant pool\n V\n +--------------------------------------------------------------------------+\n | |\n | Register-Based Bytecode Interpreter / Virtual Machine |\n | |\n | +------------------------+ +------------------------------------+ |\n | | | | Function Call Stack | |\n | | Constant Pool +----+ | | |\n | | (integer, string, | | | | |\n | | function symbol) | | | +--------------------------------+ | |\n | +------------------------+ | | | Stack Frame | | |\n | | | | | | |\n | | | |function symbol: FS('main') | | |\n | | | |return address: 0 | | |\n | +------------------------+ | | | ret| args|locals | | |\n | | | | | |registers: r0|r1 r2|r3 r4 r5 | | |\n | | Global data memory | | | +--------------------------------+ | |\n | | +-| | | | |\n | | | | | | | |\n | +------------------------+ | | | +--------------------------------+ | |\n | | | | | Stack Frame | | |\n | | | | | | | |\n | | | | |function symbol: FS('fact') | | |\n | +------------------------+ | | | |return address: 17 | | |\n | | | | | | | ret| args |locals | | |\n | | Code memory | | | | |registers: r0|r1 r2 r3|r4 | | |\n | | (bytecode) | | | | +--------------------------------+ | |\n | | | | | | | |\n | +----------+-------------+ | | | | |\n | | | | +------------------------------------+ |\n | V | | |\n + +------------------------+ | | |\n | | | | | |\n | | CPU <-+ | |\n | | (fetch-decode-execute) | | |\n | | <----+ |\n | +------------------------+ |\n | |\n +--------------------------------------------------------------------------+\n\n\nBytecode instructions for TinyPie VM:\n\n # Index serves as an opcode\n INSTRUCTIONS = [\n None,\n Instruction('add', REG, REG, REG), # A B C R(A) = R(B) + R(C)\n Instruction('sub', REG, REG, REG), # A B C R(A) = R(B) - R(C)\n Instruction('mul', REG, REG, REG), # A B C R(A) = R(B) * R(C)\n Instruction('lt', REG, REG, REG), # A B C R(A) = R(B) < R(C)\n Instruction('eq', REG, REG, REG), # A B C R(A) = R(B) == R(C)\n Instruction('loadk', REG, POOL), # A B R(A) = CONST_POOL[B]\n Instruction('gload', REG, POOL), # A B R(A) = GLOBALS[B]\n Instruction('gstore', POOL, REG), # A B GLOBALS[A] = R(B)\n Instruction('ret'),\n Instruction('halt'),\n Instruction('br', INT), # A branch to A\n Instruction('brt', REG, INT), # A B R(A) is True -> branch to B\n Instruction('brf', REG, INT), # A B R(A) is False -> branch to B\n Instruction('move', REG, REG), # A B R(A) = R(B)\n Instruction('print', REG), # A print R(A)\n Instruction('call', FUNC, REG), # A B call A, R(B)\n ]\n\nTinyPie VM comes with a `tpvm` command line utility:\n\n $ bin/tpvm -h\n Usage: tpvm [input file]\n\n If no input file is provided STDIN is used by default.\n\n\n Options:\n -h, --help show this help message and exit\n -i FILE, --input=FILE\n Input file. Defaults to standard input.\n -c, --coredump Print coredump to standard output.\n -d, --disasm Print disassembled code to standard output.\n -t, --trace Print execution trace.\n\nExample output:\n\n fact.tps\n .def fact: args=1, locals=3\n # r1 holds argument 'n'\n loadk r2, 2\n lt r3, r1, r2 # n < 2 ?\n brf r3, cont # if n >= 2 jump to 'cont'\n loadk r0, 1 # else return 1\n ret\n cont:\n loadk r2, 1 # r2 = 1\n move r3, r1 # r3 = n\n sub r1, r1, r2 # r1 = n - 1\n call fact, r1 # fact(n - 1)\n mul r0, r3, r0 # n = n * result of fact(n - 1)\n ret\n\n .def main: args=0, locals=1\n loadk r1, 3\n call fact, r1 # fact(3)\n print r0 # 6\n halt\n\n $ bin/tpvm --coredump --disasm --trace -i /tmp/fact.tps\n 0095: LOADK r1, #4:3 main.registers=[? | ?] calls=[main]\n 0104: CALL #0:fact@0, r1 main.registers=[? | 3] calls=[main]\n 0000: LOADK r2, #1:2 fact.registers=[? | 3 | ? ? ?] calls=[main fact]\n 0009: LT r3, r1, r2 fact.registers=[? | 3 | 2 ? ?] calls=[main fact]\n 0022: BRF r3, 41 fact.registers=[? | 3 | 2 0 ?] calls=[main fact]\n 0041: LOADK r2, #2:1 fact.registers=[? | 3 | 2 0 ?] calls=[main fact]\n 0050: MOVE r3, r1 fact.registers=[? | 3 | 1 0 ?] calls=[main fact]\n 0059: SUB r1, r1, r2 fact.registers=[? | 3 | 1 3 ?] calls=[main fact]\n 0072: CALL #0:fact@0, r1 fact.registers=[? | 2 | 1 3 ?] calls=[main fact]\n 0000: LOADK r2, #1:2 fact.registers=[? | 2 | ? ? ?] calls=[main fact fact]\n 0009: LT r3, r1, r2 fact.registers=[? | 2 | 2 ? ?] calls=[main fact fact]\n 0022: BRF r3, 41 fact.registers=[? | 2 | 2 0 ?] calls=[main fact fact]\n 0041: LOADK r2, #2:1 fact.registers=[? | 2 | 2 0 ?] calls=[main fact fact]\n 0050: MOVE r3, r1 fact.registers=[? | 2 | 1 0 ?] calls=[main fact fact]\n 0059: SUB r1, r1, r2 fact.registers=[? | 2 | 1 2 ?] calls=[main fact fact]\n 0072: CALL #0:fact@0, r1 fact.registers=[? | 1 | 1 2 ?] calls=[main fact fact]\n 0000: LOADK r2, #1:2 fact.registers=[? | 1 | ? ? ?] calls=[main fact fact fact]\n 0009: LT r3, r1, r2 fact.registers=[? | 1 | 2 ? ?] calls=[main fact fact fact]\n 0022: BRF r3, 41 fact.registers=[? | 1 | 2 1 ?] calls=[main fact fact fact]\n 0031: LOADK r0, #2:1 fact.registers=[? | 1 | 2 1 ?] calls=[main fact fact fact]\n 0040: RET fact.registers=[1 | 1 | 2 1 ?] calls=[main fact fact fact]\n 0081: MUL r0, r3, r0 fact.registers=[1 | 1 | 1 2 ?] calls=[main fact fact]\n 0094: RET fact.registers=[2 | 1 | 1 2 ?] calls=[main fact fact]\n 0081: MUL r0, r3, r0 fact.registers=[2 | 2 | 1 3 ?] calls=[main fact]\n 0094: RET fact.registers=[6 | 2 | 1 3 ?] calls=[main fact]\n 0113: PRINT r0 main.registers=[6 | 3] calls=[main]\n 6\n Constant pool:\n 0000: \n 0001: 2\n 0002: 1\n 0003: \n 0004: 3\n\n Code memory:\n 0000: 6 0 0 0 2 0 0 0\n 0008: 1 4 0 0 0 3 0 0\n 0016: 0 1 0 0 0 2 13 0\n 0024: 0 0 3 0 0 0 41 6\n 0032: 0 0 0 0 0 0 0 2\n 0040: 9 6 0 0 0 2 0 0\n 0048: 0 2 14 0 0 0 3 0\n 0056: 0 0 1 2 0 0 0 1\n 0064: 0 0 0 1 0 0 0 2\n 0072: 16 0 0 0 0 0 0 0\n 0080: 1 3 0 0 0 0 0 0\n 0088: 0 3 0 0 0 0 9 6\n 0096: 0 0 0 1 0 0 0 4\n 0104: 16 0 0 0 0 0 0 0\n 0112: 1 15 0 0 0 0 10\n\n Disassembly:\n 0000: LOADK r2, #1:2\n 0009: LT r3, r1, r2\n 0022: BRF r3, 41\n 0031: LOADK r0, #2:1\n 0040: RET\n 0041: LOADK r2, #2:1\n 0050: MOVE r3, r1\n 0059: SUB r1, r1, r2\n 0072: CALL #0:fact@0, r1\n 0081: MUL r0, r3, r0\n 0094: RET\n 0095: LOADK r1, #4:3\n 0104: CALL #0:fact@0, r1\n 0113: PRINT r0\n 0118: HALT\n\n\n\n=======\nCHANGES\n=======\n\n0.2 (2011-03-03)\n----------------\n\n- Added Bytecode Assembler\n- Added Register-Based Virtual Machine\n\n0.1 (initial release)\n---------------------\n\n- Initial release of TinyPie interpreter", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/rspivak/tinypie", "keywords": null, "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "tinypie", "package_url": "https://pypi.org/project/tinypie/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/tinypie/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://github.com/rspivak/tinypie" }, "release_url": "https://pypi.org/project/tinypie/0.2/", "requires_dist": null, "requires_python": null, "summary": "Tree-Based Interpreter, Compiler and VM for TinyPie language", "version": "0.2" }, "last_serial": 800706, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "7420865a3eca3f6468eac6d3f0dba41c", "sha256": "c3eb41a8699f8a72e8e7990b073a4d464614169d1dc5d5ae26c4ffe7fe4ea351" }, "downloads": -1, "filename": "tinypie-0.1.tar.gz", "has_sig": false, "md5_digest": "7420865a3eca3f6468eac6d3f0dba41c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13366, "upload_time": "2011-01-21T12:29:29", "url": "https://files.pythonhosted.org/packages/f5/6f/26be9d70b957d2b448a92dcd03d9bb4c81fcdc23ba977b32dba92c21dfbc/tinypie-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "bfa44b8c67097bf5c77b15d6e248aa86", "sha256": "36a470f2cab64db87cb11f287e542c2b3c19fe30d2823e762aeaa8b454dd125f" }, "downloads": -1, "filename": "tinypie-0.2.tar.gz", "has_sig": false, "md5_digest": "bfa44b8c67097bf5c77b15d6e248aa86", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31008, "upload_time": "2011-03-03T07:17:00", "url": "https://files.pythonhosted.org/packages/e7/8b/bfafdf8249c77929428c201a440944ea0285d11b4fe6910b23e944465de3/tinypie-0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "bfa44b8c67097bf5c77b15d6e248aa86", "sha256": "36a470f2cab64db87cb11f287e542c2b3c19fe30d2823e762aeaa8b454dd125f" }, "downloads": -1, "filename": "tinypie-0.2.tar.gz", "has_sig": false, "md5_digest": "bfa44b8c67097bf5c77b15d6e248aa86", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 31008, "upload_time": "2011-03-03T07:17:00", "url": "https://files.pythonhosted.org/packages/e7/8b/bfafdf8249c77929428c201a440944ea0285d11b4fe6910b23e944465de3/tinypie-0.2.tar.gz" } ] }