{ "info": { "author": "Ryan Rozanski", "author_email": "", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Intended Audience :: Developers", "Intended Audience :: Education", "Intended Audience :: End Users/Desktop", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.0", "Programming Language :: Python :: 3.1", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Scientific/Engineering", "Topic :: Software Development", "Topic :: Software Development :: Build Tools", "Topic :: Software Development :: Code Generators", "Topic :: Software Development :: Compilers", "Topic :: Text Processing", "Topic :: Text Processing :: Linguistic" ], "description": "SPaG (Scanner, Parser, and Generator)\n================================================================================\n[![MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/rrozansk/SPaG/blob/master/LICENSE.txt) ![COV](https://img.shields.io/badge/Coverage-99%25-green.svg) ![VER](https://img.shields.io/badge/Version-1.0.0a0-yellow.svg)\n\n - [Introduction](#introduction)\n - [Installation](#installation)\n - [Source](#source)\n - [Pip](#pip)\n - [Overview](#overview)\n - [Scanner](#scanner)\n - [Parser](#parser)\n - [Generator](#generator)\n - [Generators](#generators)\n - [Status](#status)\n - [Script](#script)\n - [Development](#development)\n\n# Introduction\n\nSPaG is an extensible purely Python framework, with no external dependencies,\ncapable of compiling input specifications of scanners (regular grammars) and/or\nparsers (LL(1) BNF context free grammars) into\n[DFA](https://en.wikipedia.org/wiki/Deterministic_finite_automaton)'s\nand [LL(1) parse table](https://en.wikipedia.org/wiki/LL_parser)'s. These\nresults can then be converted into valid scanner/parser programs following the\noptions configured within the framework during generation. The processes used to\ntransform the input specifications into DFA's and LL(1) parse tables are very\nformal and well defined. Thorough table driven testing is also utilized to\nensure correctness of implementation. The program generator(s) comprise the\nextensible portion of the framework allowing new targeted languages to be easily\n[pluged-in](#generators). Included in the framework is a script which lightly\nwraps the core library functionality to accepts user input file(s) to help drive\nthe compilation/generation process from the command line. The\n[overview](#overview) below describes in detail what each component sets out to\ndo, how it accomplished those intended goals, and the accepted input.\n\n# Installation\n\nInstallation is supported through multiple methods, all of which are listed\nbelow. Any required dependencies along with the steps needed for a proper install\nare listed. Installation installs the module containing scanner\n(RegularGrammar), parser (ContextFreeGrammar), and generator (Generator) objects\nas well as a command line script for easily generating scanners and/or parsers\nfrom input specification(s) and optional configuration file. It also installs\nany supported core language [generators](#status) which may grow over time.\n\n## Source\n\nInstall from source for the latest up to date code. This may include unreleased\nbug fixes, feature extensions, or newly supported language child generators.\nSince the code is going to built from source it is generally a good idea to also\ntest it before installation or package distribution. For this prupose a Makefile\nis included to automate testing, linting, virtual environment\ninstallation/cleanup, etc. While the framework is pure, dependent free, Python\nthe testing is not and some prerequisites are required. All required python\npackages are listed in the\n[requirments](https://github.com/rrozansk/SPaG/blob/master/requirements.txt).\nHowever, since these packages are installed in a virtual environment with the\nhelp of Make the only requirements the user needs to worry are those which must\nbe installed on the machine itself and include:\n * git (>= v2.1.3)\n * make (>= v4.1)\n * Python (>= v2.7)\n * Pip (>= v9.0.3)\n * setuptools (>= v40.4.3)\n * virtualenv (>= v16.0.0)\n\n```sh\n# Obtain the source code.\n$ git clone https://github.com/rrozansk/SPaG.git\n\n# Run sanity.\n# 1. Constructs a virtual environment with SPaG installed for testing.\n# 2. Enter the newly built virtual environment.\n# 3. Lint the code to ensure standards are followed.\n# 4. Test the code and generate a report to ensure a working package.\n# 5. Leave the virtual environment.\n# 6. Clean up.\n$ make sanity\n\n# Open the generated report in a web-browser and inspect the results.\n$ chrome test_report.html\n\n# If the results look good then install SPaG from the source.\n$ make install\n```\n## Pip\n\nInstall a specific prebuilt pacakge distribution from the PyPI repository.\nSince the distributions are thoroughly tested and linted before publishing\nthere is no need for that step in this method of installation, unlike the source\nmethod. The only requirements needed on the host machine for this method to work\ninclude:\n * Python (>= v2.7)\n * Pip (>= v9.0.3)\n\n```sh\n# Only step required!\n$ pip install SPaG\n```\n\n# Overview\n\nThe below sections provides a quick overview of each individual core component.\nIt briefly describes what the component consists of and how it accomplished it's\ntasks and goals.\n\n## Scanner\n\nAn object exposing only initialazation and read only properties. The scanner\nattempts to transform a collection of named patterns (regular expression as\ntoken/type pairs) into a unique minimal DFA accepting any input specified while\nalso containing an errors sink state for all invalid input (if required). The\ntransformation begins by first checking the expression for validity while\ninternalizing it. Next, the use of an augmented version of Dijkstra's [shunting\nyard](https://en.wikipedia.org/wiki/Shunting-yard_algorithm) algorithm\ntransforms the expression into\n[prefix notation](https://en.wikipedia.org/wiki/Polish_notation). From here\n[Thompson's construction](https://en.wikipedia.org/wiki/Thompson%27s_construction)\nis utilized to produce an\n[NFA](https://en.wikipedia.org/wiki/Nondeterministic_finite_automaton) with\nepsilon productions. The NFA is then directly converted into a\n[minimal DFA](https://en.wikipedia.org/wiki/DFA_minimization) with respect to\nreachable states using e-closure conversions which are cached. Finally, the\nminimal DFA is made\n[total](https://en.wikipedia.org/wiki/Partial_function#Total_function), if not\nalready, so it can be further minimized with respect to nondistinguishable\nstates using\n[Hopcroft's algorithm](https://en.wikipedia.org/wiki/DFA_minimization#Nondistinguishable_states).\nThis results in the smallest possible total DFA which recognizes the given\ninput. The input itself (regular expressions) must be specified following these\nguidelines:\n\n * only printable ascii characters (uppercase, lowercase, digits, punctuation,\n whitespace) are supported.\n * supported core operators (and extensions) include:\n * '|' (union -> choice -> either or)\n * '.' (concatenation -> combine)\n * '?' (question -> choice -> 1 or none)\n * '*' (kleene star -> repitition >= 0)\n * '+' (kleene plus -> repitition >= 1)\n * () (grouping -> disambiguation -> any expression)\n * [ab] (character class -> choice -> any specified character)\n * [a-c] (character range -> choice -> any char between the two)\n * [^ab] (character negation -> choice -> all but the specified)\n * other things to keep in mind (potential gotcha's):\n * character classes and ranges can be combined in the same set of brackets,\n possibly multiple times (e.g. [abc-pqrs-z]).\n * character ranges can be specified as forward or backwards\n (e.g. [a-c] or [c-a]).\n * '^' is required to come first after the bracket for negation to properly\n work. In fact, '^' directly following a bracket is always interpreted as\n negation.\n * '^' may be used with character classes or ranges.\n * if '^' is alone in the brackets (e.g. [^]) it is translated as any\n possible input character (i.e. a 'wildcard').\n * if a literal '-' if required in a character class or range it must be\n specified last so as to not be interpreted as a range.\n * for literal right brackets inside character ranges or classes it must be\n escaped.\n * concatenation can be either implicit or explicit in the given input\n expression(s).\n * operator literals (i.e. '|.?*+()[]') can be specified through escape\n sequences using a backslash (i.e. '\\').\n\n## Parser\n\nThe parser attempts to transform a collection of\n[BNF](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form) production rules\ninto a parse table. The first step in constructing the resulting table is\ndetermining the terminal and nonterminal sets, which is very trivial. From here,\nthe sets are used to construct the first and follow sets which identify what\ncharacters can be expected when parsing corresponding production rules.\nSubsequently, the first and follow sets are used to construct the predict sets\nwhich are in turn used in the table construction. Finally, the table is\nverified by checking for conflicts. While any BNF is accepted only\n[LL(1) grammars](https://en.wikipedia.org/wiki/LL_grammar) will produce a valid\nparse table containing no conflicts. Furthermore, only left factored LL(1)\ngrammars will produce a parse table with a single member per entry. To\nproperly specify a LL(1) grammar the following requirements must be met:\n\n * No left recursion (direct or indirect)\n * Must be left factored\n * No ambiguity\n\nIf any of the above requirements are violated, or the grammar specified is not\nLL(1), there will be a conflict in the table. This means with the given set of\ninput productions and with the aid of a single look ahead token it cannot be\nknown what rule to choose in order to successfully produce a parse without\nbacktracking.\n\n## Generator\n\nThe base generator is the object which all generators must inherit from. It is\nresponsible for allowing easy getting and setting of the same set of options\nacross all generators as well as providing basic protections against bad\noption combinations, program input, and generator output. It also allows easy\nreuse of language generators to compile many specifications into the same output\nlanguage while allowing easy configuration option changes between generation.\n\n# Generators\n\nThe generators are wrappers on top of the scanner/parser objects and are\nresponsible for translating/compiling the intermediate representations from the\noutput of the scanner/parser into whatever context desired with the given\noptions configured at the time of generation. Most of the time this will be a\nuseable program in some choosen programming language, which should obey the\nscanner's function type as an input character stream to a token stream while the\nparser should take that same token stream to an abstract syntax tree (AST).\nHowever, this is not always the case since it is possible to write other\ngenerators, say for example [cowsay](https://en.wikipedia.org/wiki/Cowsay). This\nillustrates the fact that the generators comprise the extensible portion of the\npackage. This plug-in segment of SPaG will grow over time to subsume more\ngenerators into the core implementation. These implementation will be for\nprogramming languages only. While the newly subsumed generators will be\nimmediately available if building from source, they will only be only available\nin the pip repository as a newer release of the SPaG package. Below is the\nstatus of all the generators currently tracked in the repository, linked to\nthere implementation, and any applicable notes which may apply.\n\n## Status\n\n| Generator | Status | Notes |\n|:--------------------------------------------------------------------------------:|:----------:|:-----------------------------------------------------:|\n| [C](https://github.com/rrozansk/SPaG/blob/master/spag/generators/c.py) | DEVELOPING | Direct scanner generator complete; Parser in progress |\n| [Go](https://github.com/rrozansk/SPaG/blob/master/spag/generators/go.py) | PLANNED | |\n| [Python](https://github.com/rrozansk/SPaG/blob/master/spag/generators/python.py) | PLANNED | |\n\n## Script\n\nA script is included in the package installation to allow easy generation of\nscanners and/or parsers (and possibly some combination of the two) into any\nnumber of output programming languages from the command line. It is a very light\nweight wrapper on the core functionality in the SPaG package. It also allows\neasy configuration of genration options and input specifications from command\nline or configuration file. Simply stated the CLI program drives the generation\nof the cross product of langauges x scanners x parser. Below shows various\nexamples for the generator to produce scanner's and/or parser's for different\ntoken sets and LL(1) languages which may be found under the\n[examples](https://github.com/rrozansk/SPaG/tree/master/examples) directory.\nShown below are some basic invocation's for help, scanner, parser, and\nconfiguration file generation.\n\n```sh\n# Generate your scanner and/or parser! ...but first ask for help to see all\n# available command line options.\n$ spag_cli --help\n\n# Generate any number of scanners (or parsers) for any set of languages at once!\n$ spag_cli -s examples/INI/scanner.txt examples/JSON/scanner.txt -g c go\n$ spag_cli -p examples/INI/parser.txt examples/JSON/parser.txt -g c go\n\n# Generate a scanner/parser combo (possibly for a languages front-end reader).\n$ spag_cli -s examples/INI/scanner.txt -p examples/INI/parser.txt -g c\n\n# Generate a default configuration file for easier runtime configuration.\n$ spag_cli --generate-rcfile .spagrc\n\n# Control the generation [in/out]put with the specified configuration values.\n$ spag_cli -c .spagrc\n\n# NOTE: It is possible to override configuration file values with command line\n# flag values if they are specified AFTER the configuration flag.\n$ spag_cli -c .spagrc # Provide flag(s) for overwriting values here.\n```\n\n## Development\n\nCreating a new generator is extremely simple, as it only requires the addition\nof a single file. This file must contain a class definition following the proper\nnaming conventions set forth for the new output language to be picked up by the\npacakge script. Furthermore, all python package prerequisites are listed in the\n[requirments](https://github.com/rrozansk/SPaG/blob/master/requirements.txt)\nfile. The below code block serves as a template for createing new generators\nwhich should be placed under spag/generators. The filename (denoted between\ncurly brackets in the template below) should be named after the language being\ncompiled to. Notice should also be taken between the capital and lowercase 'F'.\n\n```python\n\"\"\"\nA scanner/parser generator targeting {filename}.\n\"\"\"\nfrom spag import Generator\n\n\nclass {Filename}(Generator):\n \"\"\"\n A simple object for compiling scanner's and/or parser's to {filename} programs.\n \"\"\"\n\n def _translate(self):\n \"\"\"\n Override this method in subclasses which should translate the (IR)\n intermediate representation of the given scanner and/or parser to the\n targeted language. It should return the files and there content in a\n dictionary allowing multiple files to be returned for a given language.\n This is the only required function a child generators must implement.\n\n Return:\n dict[str, str]: Generated file names/paths and associated content.\n \"\"\"\n raise NotImplementedError('{filename} generator in development')\n```\n\nOnce you have completed implementing the new generator it needs to be added to\nthe list of\n[supported generators](https://github.com/rrozansk/SPaG/blob/master/spag/generators/__init__.py).\nThe tests should also be expanded to ensure the new generator can be properly\nimported as well. Additionally, the code must be linted, ensuring it follows\nthe proper conventions set forth and therefore produces a similiar look and feel\nas the rest of the code in the repository. Linting, testing and installing can\nbe automated through the use of the Makefile as shown below.\n\n```sh\n# Add the new generator to the __all__ list using your favorite editor.\n$ vim spag/generators/__init__.py\n\n# Ask the Makefile what it can do to help in linting, testing and installing.\n$ make help\n\n# Run sanity, to ensure no breakage occurs, which lints and tests the code\n# against an installed version of the source in a virtual environment.\n$ make sanity\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/rrozansk/SPaG", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/rrozansk/SPaG", "keywords": "abstract-syntax-tree ast AST Backus-Naur-form Backus-normal-form bnf BNF command-line-interface cli CLI code-generation compiler context-free-grammar deterministic-finite-automaton dfa DFA dfa-minimization direct-encoding finite-state-machine first-set follow-set fsm FSM lexeme lexer lexer-generator lexical-analysis ll1 LL1 LL(1) longest-match maximal-munch nfa NFA nondeterministic-finite-automaton nonterminal(s) parser parser-generator production-rule(s) regular-expression(s) regular-grammar scanner scanner-generator script state-transition-table table-encoding terminal(s) tokenization tokenizer tokens", "license": "MIT", "maintainer": "Ryan Rozanski", "maintainer_email": "", "name": "SPaG", "package_url": "https://pypi.org/project/SPaG/", "platform": "", "project_url": "https://pypi.org/project/SPaG/", "project_urls": { "Download": "https://github.com/rrozansk/SPaG", "Homepage": "https://github.com/rrozansk/SPaG" }, "release_url": "https://pypi.org/project/SPaG/1.0.0a0/", "requires_dist": null, "requires_python": ">= 2.7.0", "summary": "A module containing scanner (regular expression) and parser (BNF) compilers as well as a base generator, which provides protection and validation, from which all target language generators must inherit from. A script is also included which reads the respective specification(s) from file and outputs the resulting code to disk.", "version": "1.0.0a0" }, "last_serial": 4384722, "releases": { "1.0.0a0": [ { "comment_text": "", "digests": { "md5": "959861cacbc754e0b21579ce3923260a", "sha256": "48e143615721e7be1cc06efb90472c0db1f36198b923b6cbec50354ae4d54213" }, "downloads": -1, "filename": "SPaG-1.0.0a0-py2-none-any.whl", "has_sig": false, "md5_digest": "959861cacbc754e0b21579ce3923260a", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": ">= 2.7.0", "size": 31658, "upload_time": "2018-10-17T02:40:43", "url": "https://files.pythonhosted.org/packages/47/4d/2134f011f7ed0318372dc88aef0602e36a7d653ab5241eb4fa84fd17db15/SPaG-1.0.0a0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6872c8d71834688525bf503792a5514c", "sha256": "19d6f5f6f726b5a77012582ec249ba9c3841b2b005dd880be03497e89a11a012" }, "downloads": -1, "filename": "SPaG-1.0.0a0.tar.gz", "has_sig": false, "md5_digest": "6872c8d71834688525bf503792a5514c", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 2.7.0", "size": 35499, "upload_time": "2018-10-17T02:40:44", "url": "https://files.pythonhosted.org/packages/62/ab/b285a39ed431716b74e004f387e9f9cd154d725947f8d516094863fa817d/SPaG-1.0.0a0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "959861cacbc754e0b21579ce3923260a", "sha256": "48e143615721e7be1cc06efb90472c0db1f36198b923b6cbec50354ae4d54213" }, "downloads": -1, "filename": "SPaG-1.0.0a0-py2-none-any.whl", "has_sig": false, "md5_digest": "959861cacbc754e0b21579ce3923260a", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": ">= 2.7.0", "size": 31658, "upload_time": "2018-10-17T02:40:43", "url": "https://files.pythonhosted.org/packages/47/4d/2134f011f7ed0318372dc88aef0602e36a7d653ab5241eb4fa84fd17db15/SPaG-1.0.0a0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6872c8d71834688525bf503792a5514c", "sha256": "19d6f5f6f726b5a77012582ec249ba9c3841b2b005dd880be03497e89a11a012" }, "downloads": -1, "filename": "SPaG-1.0.0a0.tar.gz", "has_sig": false, "md5_digest": "6872c8d71834688525bf503792a5514c", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 2.7.0", "size": 35499, "upload_time": "2018-10-17T02:40:44", "url": "https://files.pythonhosted.org/packages/62/ab/b285a39ed431716b74e004f387e9f9cd154d725947f8d516094863fa817d/SPaG-1.0.0a0.tar.gz" } ] }