{ "info": { "author": "Povilas Balciunas", "author_email": "balciunas90@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "=====\nAbout\n=====\n\n.. image:: https://travis-ci.org/povilasb/pycollection-pipelines.svg?branch=master\n.. image:: https://www.quantifiedcode.com/api/v1/project/7913d23626d3406fa334a88e962d8529/badge.svg\n :target: https://www.quantifiedcode.com/app/project/7913d23626d3406fa334a88e962d8529\n :alt: Code issues\n\nExperimental `collection pipeline `_\npattern implementation in python.\n\n.. code-block:: python\n\n cat('/tmp/file.txt') | filter('some line') | filter('some line 2') | out()\n\n.. contents:: :local:\n\n.. image:: basic_samples.gif\n\n.. note::\n\n Library only works on Python 3. There are no plans to support previous\n python versions.\n\nUsage\n=====\n\nEvery pipeline has a data source (**cat()**, **http()**, etc.) and an optional\ndata transformation/filtering and output processors.\n\n::\n\n Source Transformations Output\n +---------------+----------------------+-------+\n | | | |\n v v v V\n echo('1.2.3.4') | split('.') | count() | out()\n\nYou can save a partial pipe and reuse it later.\n\n.. code-block:: python\n\n from collection_pipelines import *\n\n word_list = echo('word1 word2 word3') | words()\n word_list | out() # will print the words to stdout\n\n word_list | filter(word2) | freq() | bar() # will draw a bar chart for word frequencies\n\nReference\n=========\n\nSources\n-------\n\n**cat(file_name)**\n Reads the specified file. Sends items to the pipe line by line.\n\n**echo(text)**\n Sends the specified text to the pipe.\n\n**http(url)**\n Sends HTTP GET method to the specified URL and puts the response body to pipe.\n\nTransformers, Filters\n---------------------\n\n**filter(value)**\n Filters out the items that match the specified value.\n\n**head(N)**\n Passes only the first N items through the pipe.\n\n**tail(N)**\n Passes only the last N items through the pipe.\n\n**count()**\n Calculates the incoming items. When the pipeline source signals the end\n of the items, *count()* sends a single item - numbers of items, to the\n pipe.\n\n**freq()**\n Calculates how many times each unique item appears on the pipe.\n When the source signals the end of the items, every unique item is\n sent together with it's repetition count.\n *freq()* outputs tuples: *('item_x', 8)*.\n\n**unique()**\n Filters out the duplicate items.\n\n**json(path)**\n Parses incoming items as json strings, extracts elements with the\n specified json path and sends them trough the pipe.\n E.g. let's say we have json '{\"name\": {\"first\": \"Bob\"}}' and we want to\n extract the first name.\n Then *path* would be \"name.first\".\n\n**split(delimiter)**\n Splits the incoming items by the specified delimiter.\n Sends the resulting items through the pipeline one by one.\n\n**words(text)**\n Splits the incoming text items into words and sends each word through\n the pipe.\n\nOutput Processors\n-----------------\n\n**out()**\n Outputs items to stdandard output.\n\n**value()**\n Returns the collected items rather than outputting them somewhere.\n Useful when you want to store resulting pipeline items to variable.\n If more than one item passes the pipeline, the array of those items is\n returned.\n\n**line()**\n Collects all items and draws a line chart.\n Items must be tuples where first item is X axis value, and second item\n is Y axis value.\n Chart is plotted using matplotlib.\n\n**bar()**\n Collects all items and draws a bar chart.\n Items must be tuples where first item is X axis value, and second item\n is Y axis value.\n Chart is plotted using matplotlib.\n\n**wordcloud()**\n Collects all text items and draws a word cloud.\n See: https://github.com/amueller/word_cloud\n\nDevelopment\n===========\n\nIf you want to write your own sources, transformers or outputs there's\ncouple of base classes you should get familiar with.\n\nLet's implement a very basic filter that forwards only even numbers.\n\n.. code-block:: python\n\n from collection_pipelines import *\n\n class even(CollectionPipelineProcessor):\n def process(self, item):\n if isinstance(item, int):\n if item % 2 == 0:\n self.receiver.send(item)\n\n echo([1, 2, 3]) | even() | out()\n\nSources\n-------\n\nEvery source object must extend the *CollectionPipelineSource* class and\nimplement the *on_begin()* method.\n\nE.g. this source will send random integer to a pipeline:\n\n.. code-block:: python\n\n import random\n\n class rand_int(CollectionPipelineSource):\n def on_begin(self):\n self.receiver.send(random.randint(0, 1000))\n self.receiver.close()\n\nTransformers, Filters\n---------------------\n\nEvery transformer and filter is a python object that instantiates a class\nthat extends *CollectionPipelineProcessor* class.\nAll the work is done in *process()* method.\nThis methods receives an item passing the pipeline.\n\nYou might either ignore, transform or simply pass forward the items.\nTo send item further to the pipe use *self.receiver.send(item)*.\n\nE.g. if you wanted to multiply all items, you could implement the method\nlike this\n\n.. code-block:: python\n\n def process(self, item):\n self.receiver.send(item * 2)\n\nOutput processors\n-----------------\n\nPipeline output processors must extend the *CollectionPipelineOutput* class.\nOutput processors are special in a way that they don't forwards the items\nany further. They trigger the pipeline execution.\n\nImplementing an output processor is very similar to implementing a transformer.\n\n.. code-block:: python\n\n class stdout(CollectionPipelineOutput):\n def process(self, item):\n print(item)\n\nSuch processor would print an item as soon as it received one.\nThere's also a special method *on_done()*, which is called when all items\nin the pipeline are processed.\n\nE.g. if you wanted an output processor to print items only when you received\nall of them, the class would look like\n\n.. code-block:: python\n\n class stdout(CollectionPipelineOutput):\n def __init__(self):\n self.items = []\n\n def process(self, item):\n self.items.append(item)\n\n def on_done(self):\n for item in self.items:\n print(item)\n\nMore Samples\n============\n\nBar Chart\n---------\n\n.. code-block:: python\n\n echo([('apples', 2), ('bananas', 5), ('oranges', 3)]) | bar()\n\n.. image:: bar.png\n\nLine Chart\n----------\n\n.. code-block:: python\n\n echo([(1, 10), (2, 7), (3, 5), (4, 5), (5, 8)]) | line()\n\n.. image:: line.png\n\nWord Cloud\n----------\n\n.. code-block:: python\n\n cat('README.rst') | wordcloud()\n\n.. image:: wordcloud.png", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/povilasb/pycollection-pipelines", "keywords": null, "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "collection-pipelines", "package_url": "https://pypi.org/project/collection-pipelines/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/collection-pipelines/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/povilasb/pycollection-pipelines" }, "release_url": "https://pypi.org/project/collection-pipelines/0.1.3/", "requires_dist": null, "requires_python": null, "summary": "Framework to implement collection pipelines in python.", "version": "0.1.3" }, "last_serial": 2208535, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "74212a6e1ac9bc8c2c986a466e1369c1", "sha256": "be02c083aa87010ab21cee1ba4142555f74f104986f0c5f7ba5f54dd4b09e383" }, "downloads": -1, "filename": "collection-pipelines-0.1.0.tar.gz", "has_sig": false, "md5_digest": "74212a6e1ac9bc8c2c986a466e1369c1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7277, "upload_time": "2016-07-07T17:58:35", "url": "https://files.pythonhosted.org/packages/57/49/3392bcd6645ba10370300bbad35eb999b4a3bc75e61cf476daeae1d3e199/collection-pipelines-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "6a7104ecbf4593f713fc64e809116665", "sha256": "6a3346d41bbeb1a445ada934acf5da121f6068b10ba121ca210745c9d3c7a7fe" }, "downloads": -1, "filename": "collection-pipelines-0.1.1.tar.gz", "has_sig": false, "md5_digest": "6a7104ecbf4593f713fc64e809116665", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7394, "upload_time": "2016-07-07T18:23:01", "url": "https://files.pythonhosted.org/packages/06/a3/e5fc39e8f5de241ef04418e417f629d1f222529d1fc83420a1ddad4f3d01/collection-pipelines-0.1.1.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "8071df226cf7b6bac723d63fdcb66cbd", "sha256": "5393b9af818f280ce8867f99f15c1922f479844d227aa282cdd69d2fc04d3699" }, "downloads": -1, "filename": "collection-pipelines-0.1.3.tar.gz", "has_sig": false, "md5_digest": "8071df226cf7b6bac723d63fdcb66cbd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7333, "upload_time": "2016-07-07T18:51:30", "url": "https://files.pythonhosted.org/packages/39/fc/491c907a6b5f80a1c673ee395749235363a69883836a35623cf3172218f9/collection-pipelines-0.1.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8071df226cf7b6bac723d63fdcb66cbd", "sha256": "5393b9af818f280ce8867f99f15c1922f479844d227aa282cdd69d2fc04d3699" }, "downloads": -1, "filename": "collection-pipelines-0.1.3.tar.gz", "has_sig": false, "md5_digest": "8071df226cf7b6bac723d63fdcb66cbd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7333, "upload_time": "2016-07-07T18:51:30", "url": "https://files.pythonhosted.org/packages/39/fc/491c907a6b5f80a1c673ee395749235363a69883836a35623cf3172218f9/collection-pipelines-0.1.3.tar.gz" } ] }