{ "info": { "author": "Pahaz Blinov", "author_email": "pahaz.white@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Topic :: Utilities" ], "description": "**Author**: `Pahaz White`_\n\n**Repo**: https://github.com/pahaz/py3line/\n\nPyline is a UNIX command-line tool for bash one-liner scripts.\nIt's python line alternative to `grep`, `sed`, and `awk`.\n\nThis project inspired by: `pyfil`_, `piep`_, `pysed`_, `pyline`_, `pyp`_ and\n`Jacob+Mark recipe `_\n\n**WHY I MAKE IT?**\n\nSometimes, I have to use `sed` / `awk` / `grep`. Usually for simple text\nprocessing. Find some pattern inside the text file using Python regexp,\nor comment/uncomment some config line by bash one line command.\n\nI always forget the necessary options and `sed` / `awk` DSL.\nBut I now python, I like it, and I want use it for this simple bash tasks.\nDefault `python -c` is not enough to write readable bash one-liners.\n\nWhy not a `pyline`?\n * Don`t support python3\n * Have many options\n * Don`t support command chaining\n\n**PRINCIPLES**\n\n * AS MUCH SIMPLE TO UNDERSTAND BASH ONE LINER SCRIPT AS POSSIBLE\n * LESS SCRIPT ARGUMENTS\n * AS MUCH EASY TO INSTALL AS POSSIBLE (CONTAINER FRIENDLY ???)\n * SMALL CODEBASE (less 500 loc)\n * LAZY AND EFFECTIVE AS POSSIBLE\n\nInstallation\n============\n\n`py3line`_ is on PyPI, so simply run:\n\n::\n\n pip install py3line\n\nor ::\n\n sudo curl -L \"https://61-63976011-gh.circle-artifacts.com/0/py3line-$(uname -s)-$(uname -m)\" -o /usr/local/bin/py3line\n sudo chmod +x /usr/local/bin/py3line\n\nto have it installed in your environment.\n\nFor installing from source, clone the\n`repo `_ and run::\n\n python setup.py install\n\nTutorial\n========\n\nLets start with examples, we want to evaluate a number of words in each line:\n\n.. code-block:: bash\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"x = len(line.split(' ')); print(x, line)\"\n 2 Here are\n 1 some\n 3 words for you.\n\n\nPy3line process input stream by python code line by line.\n\n * **echo -e \"Here are\\nsome\\nwords for you.\"** -- create an input stream consists of three lines\n * **|** -- pipeline input stream to py3line\n * **\"x = len(line.split()); print(x, line)\"** -- define 2 actions: \"x = len(line.split(' '))\" evaluate number of words in each line, then \"print(x, line)\" print the result. Each action apply to the input stream step by step.\n\nThe example above can be represented as the following python code::\n\n import sys\n\n def process(stream):\n for line in stream:\n x = len(line.split(' ')) # action 1\n print(x, line) # action 2\n yield line\n\n stream = (line.rstrip(\"\\r\\n\") for line in sys.stdin if line)\n stream = process(stream)\n for line in stream: pass\n\nYou can also get the executed python code by ``--pycode`` argument.\n\n.. code-block:: bash\n\n $ ./py3line.py \"x = len(line.split(' ')); print(x, line)\" --pycode #skipbashtest\n ...\n\nStream transform\n----------------\n\nLets try more complex example, we want to to evaluate the number of words in the whole file. \nThis value is easy to calculate if you convert the input stream from a stream of lines \nto a number of words in line stream. Just override ``line`` variable ::\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"line = len(line.split()); print(sum(stream))\"\n 6\n\nHere we have a stream transformation action **\"print(sum(stream))\"**.\n\nThe example above can be represented as the following python code::\n\n import sys\n\n def process(stram):\n for line in stream:\n line = len(line.split()) # action 1\n yield line\n\n def transform(stream):\n print(sum(stream)) # action 2\n return stream\n\n stream = (line.rstrip(\"\\r\\n\") for line in sys.stdin if line)\n stream = transform(process(stream))\n for line in stream: pass\n\nYou can also get the executed python code by ``--pycode`` argument.\n\n.. code-block:: bash\n\n $ ./py3line.py \"line = len(line.split()); print(sum(stream))\" --pycode #skipbashtest\n ...\n\nLazy as possible\n----------------\n\nPy3line does calculations only when necessary by the use of python generators.\nThis means that the input stream does not fit into memory and you can easy process more data than your RAM allows.\n\nBut it also imposes limitations on the ability to work with the data flow. \nYou cannot use multiple aggregation functions at the same time. For example, \nif we want to calculate the maximum number of words in a line and the total number of words in a whole file at the same time.::\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"line = len(line.split()); print(sum(stream)); print(max(stream))\" #skipbashtest\n 6\n 2019-05-05 14:55:09,353 | ERROR | Traceback (most recent call last):\n File \"\", line 15, in \n stream = transform2(process1(stream))\n File \"\", line 10, in transform2\n print(max(stream))\n ValueError: max() arg is an empty sequence\n\nWe can see the ``empty sequence`` error. It throws because our ``stream`` generator is already empty. \nAnd we can't find any max value on empty stream.\n\nstream memorization\n~~~~~~~~~~~~~~~~~~~\n\nWe can solve it by converting the ``stream`` generator to a list of values in memory using python ``list(stream)`` function. ::\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"line = len(line.split()); stream = list(stream); print(sum(stream), max(stream))\"\n 6 3\n\nThe example above can be represented as the following python code::\n\n import sys\n\n def process(stram):\n for line in stream:\n line = len(line.split()) # action 1\n yield line\n\n def transform(stream):\n stream = list(stream) # action 2\n print(sum(stream), max(stream)) # action 3\n return stream\n\n stream = (line.rstrip(\"\\r\\n\") for line in sys.stdin if line)\n stream = transform(process(stream))\n for line in stream: pass\n\nevaluate on the fly\n~~~~~~~~~~~~~~~~~~~\n\nWe can also solve it without putting the stream into memory. Just use the auxiliary variables where \nwe will place the calculated result in the process of processing the stream. ::\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"s = 0; m = 0; num_of_words = len(line.split()); s += num_of_words; m = max(m, num_of_words); print(s, m)\"\n 2 2\n 3 2\n 6 3\n\nThe example above can be represented as the following python code::\n\n import sys\n\n def process(stram):\n s = 0 # action 1\n m = 0 # action 2\n for line in stream:\n num_of_words = len(line.split()) # action 3\n s += num_of_words # action 4\n m = max(m, num_of_words) # action 5\n print(s, m) # action 6\n yield line\n\n stream = (line.rstrip(\"\\r\\n\") for line in sys.stdin if line)\n stream = process(stream)\n for line in stream: pass\n\nBut we want only the last result. We don't want to see intermediate results.\nTo do this, you can add a loop over all elements of the stream before printing \nby ``for line in stream: pass``. Don't worry, this loop doesn't add unnecessary calculations \nas we use Python language generators. The loop will simply force the stream \nto be iterated before the print function called. ::\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"s = 0; m = 0; num_of_words = len(line.split()); s += num_of_words; m = max(m, num_of_words); for line in stream: pass; print(s, m)\"\n 6 3\n\nThe example above can be represented as the following python code::\n\n import sys\n\n def process(stram):\n global s, m\n s = 0 # action 1\n m = 0 # action 2\n for line in stream:\n num_of_words = len(line.split()) # action 3\n s += num_of_words # action 4\n m = max(m, num_of_words) # action 5\n yield line\n\n def transform(stream):\n global s, m\n for line in stream: pass # action 6\n print(s, m) # action 7\n return stream\n\n stream = (line.rstrip(\"\\r\\n\") for line in sys.stdin if line)\n stream = transform(process(stream))\n for line in stream: pass\n\npython generator laziness\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\nLet's check python generator laziness. \nJust run ``for line in stream: print(1);`` \ntwice in a row::\n\n $ echo -e \"Here are\\nsome\\nwords for you.\" | ./py3line.py \"for line in stream: print(1); for line in stream: print(1)\"\n 1\n 1\n 1\n\nAs we can see, it only one-time iteration over the python generator items. \nAnd all subsequent iterations will work with an empty generator, \nwhich is equivalent to a cycle through an empty list.\n\nThe example above can be represented as the following python code::\n\n import sys\n\n def transform(stream):\n for line in stream: pass # action 1 (3 iterations)\n for line in stream: pass # action 2 (0 iterations)\n return stream\n\n stream = (line.rstrip(\"\\r\\n\") for line in sys.stdin if line)\n stream = transform(stream)\n for line in stream: pass # (0 iterations)\n\nwork with a part of stream\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTODO ....\n\nDetails\n=======\n\nLet us define some terminology. **py3line \"action1; action2; action3**\n\nWe have actions: action1, action2 and action3.\nEach action have type. It may be ``element processing`` or ``stream transformation``.\n\nWe can understand the type of action based on the variables used in it. \nWe have two variables: ``line`` and ``stream``. \nThey are markers that define the type of action.\n\nLets look at some types from examples abow::\n\n x = line.split() -- element processing\n print(x, line) -- element processing\n print(sum(stream)) -- stream transformation\n stream = list(stream) -- stream transformation\n print(sum(stream), max(stream)) -- stream transformation\n s = 0 -- unidentified\n m = 0 -- unidentified\n num_of_words = len(line.split()) -- element processing\n s += num_of_words -- unidentified\n m = max(m, num_of_words) -- unidentified\n print(s, m) -- unidentified\n for line in stream: pass -- stream transformation\n\n**[rule1]** If an action has an undefined type, it inherits its type from the previous action.\n**[rule2]** If there is no previous action, then the action is considered a stream transformation.\n\nExamples::\n\n s = 0 -- stream transformation (because of [rule2])\n num_of_words = len(line.split()) -- element processing (because of `line` marker)\n s += num_of_words -- element processing (because of [rule1])\n print(s) -- element processing (because of [rule1])\n\nAnd if we want to do ``print`` at the and, we should have some `stream` marker in actions before. \n\n::\n\n s = 0 -- stream transformation (because of [rule2])\n num_of_words = len(line.split()) -- element processing (because of `line` marker)\n s += num_of_words -- element processing (because of [rule1])\n stream -- stream transformation (because of `stream` marker)\n print(s) -- stream transformation (because of [rule1])\n\nUnfortunately, it is not so clearly to people who are not familiar with the the implementation.\nTherefore, it is better to use a more explicit to readers actions like ``for line in stream: pass``.\n\n::\n\n s = 0 -- stream transformation (because of [rule2])\n num_of_words = len(line.split()) -- element processing (because of `line` marker)\n s += num_of_words -- element processing (because of [rule1])\n for line in stream: pass -- stream transformation (because of `stream` marker)\n print(s) -- stream transformation (because of [rule1])\n\n\nSome examples\n=============\n\n.. code-block:: bash\n\n # Print every line (null transform)\n $ cat ./testsuit/test.txt | ./py3line.py \"print(line)\"\n This is my cat,\n whose name is Betty.\n This is my dog,\n whose name is Frank.\n This is my fish,\n whose name is George.\n This is my goat,\n whose name is Adam.\n\n.. code-block:: bash\n\n # Number every line\n $ cat ./testsuit/test.txt | ./py3line.py \"stream = enumerate(stream); print(line)\"\n (0, 'This is my cat,')\n (1, ' whose name is Betty.')\n (2, 'This is my dog,')\n (3, ' whose name is Frank.')\n (4, 'This is my fish,')\n (5, ' whose name is George.')\n (6, 'This is my goat,')\n (7, ' whose name is Adam.')\n\n.. code-block:: bash\n\n # Number every line\n $ cat ./testsuit/test.txt | ./py3line.py \"stream = enumerate(stream); print(line[0], line[1])\"\n 0 This is my cat,\n 1 whose name is Betty.\n 2 This is my dog,\n 3 whose name is Frank.\n 4 This is my fish,\n 5 whose name is George.\n 6 This is my goat,\n 7 whose name is Adam.\n\nOr just ``cat ./testsuit/test.txt | ./py3line.py \"stream = enumerate(stream); print(*line)\"``\n\n.. code-block:: bash\n\n # Print every first and last word\n $ cat ./testsuit/test.txt | ./py3line.py \"s = line.split(); print(s[0], s[-1])\"\n This cat,\n whose Betty.\n This dog,\n whose Frank.\n This fish,\n whose George.\n This goat,\n whose Adam.\n\n.. code-block:: bash\n\n # Split into words and print as list (strip al non word char like comma, dot, etc)\n $ cat ./testsuit/test.txt | ./py3line.py \"print(re.findall(r'\\w+', line))\"\n ['This', 'is', 'my', 'cat']\n ['whose', 'name', 'is', 'Betty']\n ['This', 'is', 'my', 'dog']\n ['whose', 'name', 'is', 'Frank']\n ['This', 'is', 'my', 'fish']\n ['whose', 'name', 'is', 'George']\n ['This', 'is', 'my', 'goat']\n ['whose', 'name', 'is', 'Adam']\n\n.. code-block:: bash\n\n # Split into words (strip al non word char like comma, dot, etc)\n $ cat ./testsuit/test.txt | ./py3line.py \"print(*re.findall(r'\\w+', line))\"\n This is my cat\n whose name is Betty\n This is my dog\n whose name is Frank\n This is my fish\n whose name is George\n This is my goat\n whose name is Adam\n\n.. code-block:: bash\n\n # Find all three letter words\n $ cat ./testsuit/test.txt | ./py3line.py \"print(re.findall(r'\\b\\w\\w\\w\\b', line))\"\n ['cat']\n []\n ['dog']\n []\n []\n []\n []\n []\n\n.. code-block:: bash\n\n # Find all three letter words + skip empty lists\n cat ./testsuit/test.txt | ./py3line.py \"line = re.findall(r'\\b\\w\\w\\w\\b', line); if not line: continue; print(line)\"\n ['cat']\n ['dog']\n\n.. code-block:: bash\n\n # Regex matching with groups\n $ cat ./testsuit/test.txt | ./py3line.py \"line = re.findall(r' is ([A-Z]\\w*)', line); if not line: continue; print(*line)\"\n Betty\n Frank\n George\n Adam\n\n.. code-block:: bash\n\n # cat ./testsuit/test.txt | ./py3line.py \"line = re.search(r' is ([A-Z]\\w*)', line); if not line: continue; line.group(1)\"\n $ cat ./testsuit/test.txt | ./py3line.py \"rgx = re.compile(r' is ([A-Z]\\w*)'); line = rgx.search(line); if not line: continue; print(line.group(1))\"\n Betty\n Frank\n George\n Adam\n\n.. code-block:: bash\n\n # head -n 2\n # cat ./testsuit/test.txt | ./py3line.py \"stream = enumerate(stream); if line[0] >= 2: break; print(line[1])\"\n $ cat ./testsuit/test.txt | ./py3line.py \"stream = list(stream)[:2]; print(line)\"\n This is my cat,\n whose name is Betty.\n\n.. code-block:: bash\n\n # Print just the URLs in the access log\n $ cat ./testsuit/nginx.log | ./py3line.py \"print(shlex.split(line)[13])\"\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n GET /admin/moktoring/session/add/ HTTP/1.1\n GET /admin/jsi18n/ HTTP/1.1\n GET /static/admin/img/icon-calendar.svg HTTP/1.1\n GET /static/admin/img/icon-clock.svg HTTP/1.1\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n HEAD / HTTP/1.0\n GET /logout/?reason=startApplication HTTP/1.1\n GET / HTTP/1.1\n GET /login/?next=/ HTTP/1.1\n POST /admin/customauth/user/?q=%D0%9F%D0%B0%D1%81%D0%B5%D1%87%D0%BD%D0%B8%D0%BA HTTP/1.1\n\n.. code-block:: bash\n\n # Print most common accessed urls and filter accessed more then 5 times\n $ cat ./testsuit/nginx.log | ./py3line.py \"line = shlex.split(line)[13]; stream = collections.Counter(stream).most_common(); if line[1] < 5: continue; print(line)\"\n ('HEAD / HTTP/1.0', 10)\n\nComplex examples\n----------------\n\n.. code-block:: bash\n\n # create directory tree\n echo -e \"y1\\nx2\\nz3\" | ./py3line.py \"pathlib.Path('/DATA/' + line +'/db-backup/').mkdir(parents=True, exist_ok=True)\"\n\n group by 3 lines ... (https://askubuntu.com/questions/1052622/separate-log-text-according-to-paragraph)\n\nHELP\n----\n\n::\n\n $ ./py3line.py --help\n usage: py3line.py [-h] [-v] [-q] [--version] [--pycode]\n [expression [expression ...]]\n\n Py3line is a UNIX command-line tool for a simple text stream processing by the\n Python one-liner scripts. Like grep, sed and awk.\n\n positional arguments:\n expression python comma separated expressions\n\n optional arguments:\n -h, --help show this help message and exit\n -v, --verbose\n -q, --quiet\n --version print the version string\n --pycode show generated python code\n\n::\n\n $ ./py3line.py --version\n 0.3.1\n\n.. _Pahaz White: https://github.com/pahaz/\n.. _py3line: https://pypi.python.org/pypi/py3line/\n.. _pyp: https://pypi.python.org/pypi/pyp/\n.. _piep: https://github.com/timbertson/piep/tree/master/piep/\n.. _pysed: https://github.com/dslackw/pysed/blob/master/pysed/main.py\n.. _pyline: https://github.com/westurner/pyline/blob/master/pyline/pyline.py\n.. _pyfil: https://github.com/ninjaaron/pyfil", "description_content_type": "", "docs_url": null, "download_url": "https://pypi.python.org/packages/source/s/py3line/py3line-0.3.1.zip", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/pahaz/py3line", "keywords": "google spreadsheet api util helper", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "py3line", "package_url": "https://pypi.org/project/py3line/", "platform": "unix", "project_url": "https://pypi.org/project/py3line/", "project_urls": { "Download": "https://pypi.python.org/packages/source/s/py3line/py3line-0.3.1.zip", "Homepage": "https://github.com/pahaz/py3line" }, "release_url": "https://pypi.org/project/py3line/0.3.1/", "requires_dist": null, "requires_python": "", "summary": "UNIX command-line tool for python line-based stream processing", "version": "0.3.1" }, "last_serial": 5259780, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c7e8b091aee0b6917919a6234ab49426", "sha256": "c598d951c8daf070032454ad4944f9fbcadba414bd5105a1f4f722676ecafd73" }, "downloads": -1, "filename": "py3line-0.0.1.tar.gz", "has_sig": false, "md5_digest": "c7e8b091aee0b6917919a6234ab49426", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6060, "upload_time": "2016-07-22T18:52:36", "url": "https://files.pythonhosted.org/packages/28/cd/5fe6a799ead4aa1430a0f42ea6e6c3ac116b73902140950ad85e8d91cafc/py3line-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "e301258576e9bc994c565e4dabe514ed", "sha256": "cb4942d231f65d0316f17f70aeb81c5b7d737066fa66ac039ae6ef97b3e56ba7" }, "downloads": -1, "filename": "py3line-0.0.2.tar.gz", "has_sig": false, "md5_digest": "e301258576e9bc994c565e4dabe514ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9211, "upload_time": "2016-07-26T07:20:35", "url": "https://files.pythonhosted.org/packages/3a/82/a36c224e7f4251840e86d9f85d6ea40f7d9250e126817a9dbcd302c13e50/py3line-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "9477f140c118bbca75cf0eea5cde9c1a", "sha256": "a66f78a211adf6b9d5c562e1b450e805d14172bb0dc0989303ec8238a077ad90" }, "downloads": -1, "filename": "py3line-0.0.3.tar.gz", "has_sig": false, "md5_digest": "9477f140c118bbca75cf0eea5cde9c1a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9420, "upload_time": "2016-09-25T10:25:38", "url": "https://files.pythonhosted.org/packages/dd/90/c863a1559cc073d7aabe9974c641caa469c42115754de1c07373c3c0821d/py3line-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "0cf7c813d05f71406fd3f8b182326e8e", "sha256": "cbbea7a8dbb9a072f97084081d96baa4a133ad242b80256f39bed68b8d52d22b" }, "downloads": -1, "filename": "py3line-0.0.4.tar.gz", "has_sig": false, "md5_digest": "0cf7c813d05f71406fd3f8b182326e8e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9434, "upload_time": "2018-02-03T04:21:54", "url": "https://files.pythonhosted.org/packages/44/3a/6f3a5693cd315a0bf9156d1108ba65b2c4f6b45f6923ac2b2e3cd7a5c8ee/py3line-0.0.4.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "4f1837b79ac115981aee6f7b7fa2d38b", "sha256": "56c7a08a060e89cb8487c4b72fda0b73bbb9a0c27d73a5f4dcc5b9975b895b29" }, "downloads": -1, "filename": "py3line-0.3.0.tar.gz", "has_sig": false, "md5_digest": "4f1837b79ac115981aee6f7b7fa2d38b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20261, "upload_time": "2019-05-12T19:59:20", "url": "https://files.pythonhosted.org/packages/93/5c/c3527fa83a9e9ac17dd54c5e0d266666de02a76283d0c41f5732f5043459/py3line-0.3.0.tar.gz" } ], "0.3.1": [ { "comment_text": "", "digests": { "md5": "d5a50c27fb7164b2f0f8cc747cd9451d", "sha256": "1f133d2ccbbcb286602494c3d3f548ec80b780de6cd90754f028d7c2c3970715" }, "downloads": -1, "filename": "py3line-0.3.1.tar.gz", "has_sig": false, "md5_digest": "d5a50c27fb7164b2f0f8cc747cd9451d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20298, "upload_time": "2019-05-12T20:05:47", "url": "https://files.pythonhosted.org/packages/c2/b7/6424925d630d647c751d2feeb3c917a8d97213857d6270edd2ee56539727/py3line-0.3.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d5a50c27fb7164b2f0f8cc747cd9451d", "sha256": "1f133d2ccbbcb286602494c3d3f548ec80b780de6cd90754f028d7c2c3970715" }, "downloads": -1, "filename": "py3line-0.3.1.tar.gz", "has_sig": false, "md5_digest": "d5a50c27fb7164b2f0f8cc747cd9451d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20298, "upload_time": "2019-05-12T20:05:47", "url": "https://files.pythonhosted.org/packages/c2/b7/6424925d630d647c751d2feeb3c917a8d97213857d6270edd2ee56539727/py3line-0.3.1.tar.gz" } ] }