{ "info": { "author": "Dan Borufka", "author_email": "danborufka@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# TSL \u00e2\u20ac\u201c Text Scraping Language\nPython package for processing of a scraping language in pseudo-code\n\n*The TSL Python package allows you to write and execute pseudo-code style language to process text files with Regular expressions and simple logic. This gives an easy entry to data mining to non-programmers.*\n\nYou can either run it as a CLI using `python TSL.py myScript.tsl` or use the **TSLEngine class** like this:\n\n```python\nfrom TSLEngine import TSLEngine\n\nTSL = TSLEngine('myScript.tsl')\n\nif TSL.task:\n TSL.run()\n```\n\n## Example:\n![Sublime Text Syntax Highlighting](https://raw.githubusercontent.com/polygoat/TSL/master/preview.png \"Sublime Text Syntax Highlighting\")\n\n... This will read all lines from `stats/milestones.csv`, take all lines, splits them into columns, selects the second column and saves the corresponding row into a file labeled by said column (e.g. `stats/31-03-2019.txt`).\n\n---\n# Index\n### [How does it work?](#how-does-it-work)\n### [Setup](#setup)\n### [Available TSL Commands](#available-tsl-commands)\n### [Templating](#templating)\n---\n# How does it work?\nTSL runs through the script line by line and executes corresponding Python code in the background. File handling, complex data types, and templating are built-in for rapid prototyping. Every line starts with a command followed by a space and space-separated arguments. \nMost commands support optional clauses like `as ...` (storage variable) or `in ...` (file handle) to supply further information.\n\nA command's inputs and outputs can be **strings** or **collections of strings**. In ladder case, TSL iterates over a collection's strings and applies the command to each of them. The commands `as`, `remember`, `split`, and `for every` loops change the context to the provided variable. This means you can omit `as` clauses in the following commands, always automatically referring to the context. To reference variables rather than strings use square brackets. `log something` will log the string \"something\", while `log [something]` will log the content of the variable called _something_.\n\n---\n# Setup\nUse `pip install tsl` to install the package.\n\n---\n# Available TSL Commands\n\n## File & system operations\n\n### bash `` as ``\n*Runs a bash command and saves the returned output to a variable.*\n\n**Example:**\n```fortran\n bash git branch as branches\n```\n\n### empty `[]`\n*Opens up a file and deletes all its content.*\n\n**Example:**\n```fortran\n in wordbag.txt\n \tempty\n```\n\n### in *``*\n*Opens up a file and reads all its lines. You can log the lines using `log line`\nAll future file operations are refering to this one until your next \"in\" statement.\nYou'll usually see this followed by a `take` or `find all` command*\n\n**Example:**\n```fortran\n in stats/01092019.txt\n```\n\n### in *``*\n*Creates the nested directory structure if it doesn't exist. Otherwise, the path will be used as context for future operations.*\n\n**Example:**\n```fortran\n in \"/Sublime Text/Packages\"\n count files as fileCount\n log [fileCount]\n```\n\n### save `[as ]`\n*Saves the latest collection in the given filename.*\n\n**Example:**\n```fortran\n save as runner/cleaned_userinputs.txt\n```\n\n### write `[]`\n*Writes given variable (or the results of the last `find all`) into the last file opened with `in`*\n\n**Example:**\n```fortran\n write [userIds]\n```\n\n### add *``* [to ``]\n*Appends content to a file different from the currently open one*\n\n**Example:**\n```fortran\n add [libraries] to libs.txt\n```\n\n---\n## Selections\n\n### select nth [of `[input]`]\n*Selects a specific item of a collection, given its index.*\n\n**Example:**\n```fortran\n in bigrams.txt \n select 4th\n```\n\n\n### select words [of `[input]`][as ``]\n*Selects all words found in the last opened file.*\n\n**Example:**\n```fortran\n in utterances.txt\t\n \tselect words\n```\n\n### select [from *``*] [to *``*]\n*Selects the range from the indicated string/RegEX/number until the indicated string or regular expression or number. Note that we start counting with 1 to keep it natural*\n\n**Example:**\n```fortran\n select from \"accessibilityApp\" to \"[v:\"\n select from \\s to \\s\n select from 1 to \"[v:samsung.tvSearchAndPlay.Genres:drama]\"\n select two of [bigrams]\n```\n\n### select from ``\n*Selects the range from the indicated string / regular expression / number until the end of the line*\n\n****Example:**\n```fortran\n select from \"dateTime\"\n select from \\d\\d\\d\n select from 122 \n```\n\n### select to ``\n*Selects the range from the beginning of the line to the indicated string / regular expression / number.*\n\n**Example:**\n```fortran\n select to \"dateTime\"\n select to \\W\n select to 5th \n select to 370 \n```\n\n---\n## Debugging & calculations\n\n### be ``\n*Sets one of the following properties of TSL to true:*\n\n`verbose` | `active`\n\n### calculate `operation` as ``\n*Calculates mathematical operations*\n\n**Example:**\n```fortran\n calculate (5 * 4) / 2 as ratio\n```\n\n### log *``*\n*Prints to the console. Use strings with template tags (e.g. \"here is: [varName]\") for variables*\n\n### count `` as ``\n*Stores the count of lines in a selection.*\n\n**Example:**\n```fortran\n count [entries-per-day] as frequency\n log [frequency]\n```\n\n### count *``* in `` as ``\n*Stores the count of files or folders in a directory.*\n\n**Example:**\n```fortran\n count files in \"C:\\Windows\" as systemFiles\n log \"Exactly [systemFiles] system files found.\"\n```\n\n---\n## Manipulation\n\n### change `` to ``\n*Iterates over a collection and changes all entries according to the template tag. Use brackets to tag variables, like so: `[varName]`*\n\n**Example:**\n```fortran\n change [salute] to \"Hi, [salute] #[i]\"\n```\n*will e.g. change \"my name is Dan\" to \"Hi, my name is Dan #1\"*\n\n### combine `` with `` as ``\n*Merges two sets and stores it in a new variable.*\n\n**Example:**\n```fortran\n combine [vowels] with [consonants] as letters\n```\n\n### find all *``* [in ``] [as ``]\n*Finds all occurrences of a string or regular expression in the lines of the currently open file or a stored collection. The results of this search are automatically stored in a variable `found`*\n\nExample:\n```fortran\n in corpus_de.txt\n \ttake lines as utterances\n \tfind all [aeiou]+ in [utterances]\n \tlog [found]\n```\n\n### remove lines\n*Removes the last selected lines (e.g. the ones found using a `find all`)*\n\n### replace *``* by `` [in ``]\n*Replaces given string or regular expression by another string, optionally in a particular collection.*\n\n**Example:**\n```fortran\n replace \\W+ by \"_\"\n```\n\n### sort [``]\n\n*Sorts either the supplied or last referenced collection alphanumerically (in ascending order).*\n\n### split *``* by `` as ``\nSplits a string into a collection using delimiter.\n\n**Example:**\n```fortran\n split apples;bananas;oranges by ; as fruits\n log [fruits]\n```\n\n### unique lines\n*Removes all duplicate lines from the last referenced collection.*\n\n## Memory\n\n### remember *``* as ``\n*Stores a string or variable in a new variable.*\n\n### take *``* [as ``]\n*Changes the selected collection to whole lines (`take lines as ...`), results of a `find all` directive, or to the files found in a folder specified with a preceding `in ` directive.*\n\n**Example:**\n```fortran\n in source.txt\n \tfind all <[^>]+>\n \ttake lines as htmlLines\n \tlog [htmlLines]\n\n in libraries/de\n \ttake files as germanLibs\n \tlog [germanLibs]\n```\n\n---\n## Flow\n\n### for every ``\n### ---\n\n*Loops through a collection, populating the variable `i` with the current index. From within the loop, the item of the collection can be accessed using the variable name in singular (books -> book, babies -> baby).*\n\n*If a collection is empty, the for-loop is skipped. This becomes useful to create conditional flows.*\n\n*Always terminate a loop with three consecutive hyphens in a separte line.*\n\n**Example:**\n```fortran\n in corpus.txt\n \tfind all [^\\b]+\\b[^\\b]+ as bigrams\n \tfor every [bigram]\n \t\tlog \"#[i]: [bigram]\"\n \t---\n```\n\n### run `path/to/script.tsl`\n*Runs another TSL file*\n\nThe external TSL file will receive the same scope as inlined code.\n\n---\n# Templating\n\nTemplates are enclosed in square brackets and can appear in quoted strings, file paths, and even within regular expressions:\n```fortran\n{\n remember \"\\CommNetwork\" as domain\n in user.txt\n find all \\b[domain][^:]: as user\n for every [user]\n select from 0 to -1\n in \"/users/[user]/credentials.txt\"\n change [user] to \"[user]:pleaseresetme\"\n add [user]\n ---\n}\n```\nIf the variables can not be found, the template tags remain untouched, including square brackets. This allows us to easily mix them in with regular expressions.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/polygoat/TSL", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "tsl-compiler", "package_url": "https://pypi.org/project/tsl-compiler/", "platform": "", "project_url": "https://pypi.org/project/tsl-compiler/", "project_urls": { "Homepage": "https://github.com/polygoat/TSL" }, "release_url": "https://pypi.org/project/tsl-compiler/0.0.1/", "requires_dist": null, "requires_python": "", "summary": "Text Scraping Language package", "version": "0.0.1" }, "last_serial": 5210683, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "d4c4f58bff0c03453e082db64db11073", "sha256": "544827eb46dcfa22e948b226d8e1db722743d42b7640469c5671f86ab1d8fe1b" }, "downloads": -1, "filename": "tsl_compiler-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "d4c4f58bff0c03453e082db64db11073", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 26691, "upload_time": "2019-05-01T00:10:21", "url": "https://files.pythonhosted.org/packages/0b/7f/4d870b1aaeeb63ef389285598169839a98540df3e303f3ffe3c9d933e009/tsl_compiler-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6bf0debeef9cd0aed347507ec255247a", "sha256": "eceb97e35eed940aaccf5d0b738e1aa24b091148981f6d63f8c5390127416d39" }, "downloads": -1, "filename": "tsl-compiler-0.0.1.tar.gz", "has_sig": false, "md5_digest": "6bf0debeef9cd0aed347507ec255247a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17883, "upload_time": "2019-05-01T00:10:28", "url": "https://files.pythonhosted.org/packages/9b/1c/8eb192d6e062f2c21423c14d405e29df36bc90746215cdcffa45248d41d9/tsl-compiler-0.0.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d4c4f58bff0c03453e082db64db11073", "sha256": "544827eb46dcfa22e948b226d8e1db722743d42b7640469c5671f86ab1d8fe1b" }, "downloads": -1, "filename": "tsl_compiler-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "d4c4f58bff0c03453e082db64db11073", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 26691, "upload_time": "2019-05-01T00:10:21", "url": "https://files.pythonhosted.org/packages/0b/7f/4d870b1aaeeb63ef389285598169839a98540df3e303f3ffe3c9d933e009/tsl_compiler-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6bf0debeef9cd0aed347507ec255247a", "sha256": "eceb97e35eed940aaccf5d0b738e1aa24b091148981f6d63f8c5390127416d39" }, "downloads": -1, "filename": "tsl-compiler-0.0.1.tar.gz", "has_sig": false, "md5_digest": "6bf0debeef9cd0aed347507ec255247a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17883, "upload_time": "2019-05-01T00:10:28", "url": "https://files.pythonhosted.org/packages/9b/1c/8eb192d6e062f2c21423c14d405e29df36bc90746215cdcffa45248d41d9/tsl-compiler-0.0.1.tar.gz" } ] }