{ "info": { "author": "Zhan Haoxun", "author_email": "programmer.zhx@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 1 - Planning", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.4", "Topic :: Documentation", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: System :: Installation/Setup", "Topic :: System :: Software Distribution" ], "description": "a small and simple language within the project\n`sblog `__.\n\nInstall\n=======\n\n::\n\n pip install sdataflow\n\nConcepts\n========\n\n``sdataflow`` provides:\n\n- A small and simple language to define the relations between entities. An\n  ``entity`` is a logic unit defined by the user (e.g. a data processing\n  function). It generates some kind of ``outcome`` in response to some\n  kind of input ``outcome``\\ (which might be generated by another\n  entity). The relations between entities form a ``dataflow``.\n- A command-line program, ``sdataflow``, that generates an HTML file for\n  debugging.\n- A scheduler that automatically runs entities and ships each outcome to its\n  destination.\n\nLanguage\n========\n\nTutorial\n--------\n\nLet's start with the simplest case (a **one-to-one** relation):\n\n::\n\n A --> B\n\nwhere entity ``B`` accepts the outcome of ``A`` as its input.\n\nTo define a **one-to-more** or **more-to-one** relation:\n\n::\n\n # one-to-more\n A --> B\n A --> C\n A --> D\n\n # more-to-one\n B --> A\n C --> A\n D --> A\n\nIn the **one-to-more** case, copies of the outcome of ``A`` are\npassed to ``B``, ``C`` and ``D``. In the **more-to-one** case, the outcomes\nof ``B``, ``C`` and ``D`` are passed to ``A``.\n\nNext is the form of **outcome dispatching**, that is, a mechanism for\nsending different kinds of outcome of an entity to different\ndestinations. 
For instance, entity ``A`` generates two kinds of outcome,\nsay ``[type1]`` and ``[type2]``, and passes outcomes of ``[type1]`` to\n``B`` and outcomes of ``[type2]`` to ``C``:\n\n::\n\n # one way.\n A --> [type1]\n A --> [type2]\n [type1] --> B\n [type2] --> C\n\n # another way.\n A --[type1]--> B\n A --[type2]--> C\n\nwhere an identifier enclosed in brackets (e.g. ``[type1]``) represents the\nname of an outcome. In contrast to the form of outcome dispatching,\n``A --> B`` simply passes the outcome of ``A``, with the default name\n``A``\\ (the name of the entity that generates the outcome), to entity ``B``.\nEssentially, the form above (a statement containing brackets) overrides the name\nof the outcome and acts like a filter for outcome dispatching.\n\nOutcomes can be used to define **one-to-more** and **more-to-one**\nrelations as well, in the same way as discussed above:\n\n::\n\n # one-to-more example.\n A --> [type1]\n A --> [type2]\n [type1] --> B\n [type1] --> C\n [type2] --> D\n [type2] --> E\n\n # more-to-one example.\n A --> [type1]\n B --> [type1]\n [type1] --> C\n\nAfter loading all user-defined dataflows, two steps\nof analysis are applied:\n\n1. Build a DAG of the dataflow. Abort if an error occurs (e.g. a syntax error or a\n   cyclic path).\n2. 
Apply a topological sort to the DAG to get the linear ordering of entity\n   invocation.\n\nLexical Rules\n-------------\n\n::\n\n ARROW : re.escape('-->')\n DOUBLE_HYPHENS : re.escape('--')\n BRACKET_LEFT : re.escape('[')\n BRACKET_RIGHT : re.escape(']')\n ID : r'\\w+'\n\nThese rules behave as if they were passed\nto Python's ``re`` module with the ``UNICODE`` flag set.\n\nCFGs\n----\n\n::\n\n start : stats\n\n stats : stats single_stat\n       | empty\n\n single_stat : entity_to_entity\n             | entity_to_outcome\n             | outcome_to_entity\n\n entity_to_entity : ID general_arrow ID\n\n general_arrow : ARROW\n               | DOUBLE_HYPHENS outcome ARROW\n\n outcome : BRACKET_LEFT ID BRACKET_RIGHT\n\n entity_to_outcome : ID ARROW outcome\n\n outcome_to_entity : outcome ARROW ID\n\nCommand-line program\n====================\n\nAfter installing ``sdataflow`` through ``pip``, the user can invoke the\ncommand-line program ``sdataflow``. Its synopsis is simple:\n\n::\n\n Usage:\n sdataflow \n\nPass the file path of a dataflow definition to ``sdataflow``;\nthe program parses the file, analyses the dataflow and finally\ngenerates an HTML file. Use a browser to open that HTML file (based on the\nproject `mermaid `__) to\nget a graphical representation of your dataflow!\n\nAn example is given for illustration:\n\n::\n\n $ cat example.sd\n A --[odd]--> B\n A --[even]--> C\n B --> D\n C --> D\n $ sdataflow example.sd\n $ ls\n example.html example.sd\n\nUse a browser to open ``example.html``:\n\n.. 
figure:: https://cloud.githubusercontent.com/assets/5213906/7351794/03ade3b2-ed3a-11e4-9032-e859458857dd.png\n   :alt: screen shot 2015-04-28 at 12 02 58 am\n\n   screen shot 2015-04-28 at 12 02 58 am\n\nAPI\n===\n\nForm of Callback\n----------------\n\nAs mentioned above, an entity stands for a user-defined logic unit.\nHence, after defining the relations of entities in the language\ndiscussed above, the user should define a set of callbacks, one\nfor each entity in the definition.\n\nA callback is a **callable**\\ (function, generator, bound method) that\nreturns either ``None``\\ (e.g. a function with no ``return`` statement) or an\niterable whose elements are (key, value) tuples, with the key\nas the name of the outcome and the value as a user-defined object. The argument list\nof such a callable can be:\n\n1. An empty list, meaning that the callback accepts no data.\n2. A one-element list.\n\nCode fragment for illustration:\n\n.. code:: python\n\n # normal function returning `None`, with empty argument list.\n def func1():\n     pass\n\n\n # normal function returning `None`, with one-element argument list.\n def func2(items):\n     for name_of_outcome, obj in items:\n         # do something.\n         pass\n\n\n # normal function returning elements, with one-element argument list.\n def func3(items):\n     # ignore `items`\n     data = [('some outcome name', i) for i in range(10)]\n     return data\n\n\n # generator yielding elements, with one-element argument list.\n def gen1(items):\n     # ignore `items`\n     for i in range(10):\n         yield 'some outcome name', i\n\n\n class ExampleClass(object):\n\n     @classmethod\n     def method1(cls):\n         pass\n\n     @classmethod\n     def method2(cls, items):\n         pass\n\n     def method3(self):\n         pass\n\n     def method4(self, items):\n         pass\n\n\n # class bound method, with empty argument list.\n ExampleClass.method1\n # class bound method, with one-element argument list.\n ExampleClass.method2\n\n example_instance = ExampleClass()\n # instance bound method, with empty argument list.\n example_instance.method3\n # 
instance bound method, with one-element argument list.\n example_instance.method4\n\nNote that the name of an outcome is the string enclosed in\nbrackets (**not** including the brackets).\n\nAll In One Interface\n--------------------\n\n``sdataflow`` provides a class ``sdataflow.DataflowHandler`` to parse\n``doc``\\ (a string representing the relations of entities), register\ncallbacks and schedule the execution of callbacks.\n\n::\n\n class DataflowHandler\n     __init__(self, doc, name_callback_mapping=None)\n         `doc`: unicode or utf-8 encoded binary data.\n         `name_callback_mapping`: a dict of (`name`, `callback`) pairs. `name`\n         can be unicode or utf-8 encoded binary data. `callback` is a function\n         or generator. `name_callback_mapping` can be `None`, since callbacks\n         can be registered by function decorator (see next section).\n\n     run(self)\n         Automatically execute all registered callbacks.\n\nExample:\n\n.. code:: python\n\n from sdataflow import DataflowHandler\n from sdataflow.callback import create_data_wrapper\n\n doc = ('A --[odd]--> B '\n        'A --[even]--> C '\n        'B --> D '\n        'C --> D ')\n\n def a():\n     odd = create_data_wrapper('odd')\n     even = create_data_wrapper('even')\n     for i in range(1, 10):\n         if i % 2 == 0:\n             yield even(i)\n         else:\n             yield odd(i)\n\n def b(items):\n     default = create_data_wrapper('B')\n     # remove 1.\n     for outcome_name, number in items:\n         if number == 1:\n             continue\n         yield default(number)\n\n def c(items):\n     default = create_data_wrapper('C')\n     # remove 2.\n     for outcome_name, number in items:\n         if number == 2:\n             continue\n         yield default(number)\n\n def d(items):\n     numbers = {i for _, i in items}\n     assert set(range(3, 10)) == numbers\n\n name_callback_mapping = {\n     'A': a,\n     'B': b,\n     'C': c,\n     'D': d,\n }\n\n # parse `doc`, register `a`, `b`, `c`, `d`.\n handler = DataflowHandler(doc, name_callback_mapping)\n\n # execute callbacks.\n handler.run()\n\nIn the example above, ``A`` generates numbers in the range of 1 to 9, of\nwhich the odd numbers (1, 3, 5, 
7, 9) are sent to ``B`` and the even\nnumbers (2, 4, 6, 8) are sent to ``C``. Then ``B`` removes the number 1 and\nsends the rest (3, 5, 7, 9) to ``D``, while ``C`` removes the number 2 and\nsends the rest (4, 6, 8) to ``D``. Finally, ``D`` receives the outcomes of\nboth ``B`` and ``C``, and makes sure that they are equal to\n``set(range(3, 10))``.\n\nUse Decorator To Register Normal Function\n-----------------------------------------\n\n``sdataflow.callback.register_callback`` is a function decorator with\nsignature:\n\n::\n\n register_callback(entity_name, *outcome_names)\n\nwhere ``entity_name`` can be a unicode or utf-8 encoded binary\nstring, indicating the entity to which the function should be\nregistered. If ``outcome_names`` is given, the decorator injects\none data wrapper per name, each generated by\n``sdataflow.callback.create_data_wrapper``, into the function being decorated.\n\nExample:\n\n.. code:: python\n\n from sdataflow import DataflowHandler\n from sdataflow.callback import register_callback\n\n @register_callback('A')\n def zero_arg():\n     return 0\n\n @register_callback('C')\n def one_arg(items):\n     return 1\n\n # `doc` is a dataflow definition, as in the previous example.\n DataflowHandler(doc)\n\nwhere ``zero_arg`` is registered to entity ``A`` and ``one_arg`` is\nregistered to entity ``C``. Note that, as mentioned above, the second\nparameter of ``DataflowHandler`` can be omitted.\n\nWhen the name of a decorator-registered callback conflicts with a name in\n``name_callback_mapping``, the second parameter of ``DataflowHandler``,\nthe callback in ``name_callback_mapping`` is accepted and the callback\nregistered by the function decorator is discarded. For example:\n\n.. code:: python\n\n @register_callback('A')\n def zero_arg():\n     return 0\n\n @register_callback('C')\n def should_not_be_registered(items):\n     return 1\n\n def one_arg(items):\n     return 42\n\n DataflowHandler(doc, {'C': one_arg})\n\nwhere ``one_arg`` will be registered instead of\n``should_not_be_registered``.\n\nExample of function injection:\n\n.. 
code:: python\n\n @register_callback('A', 'type1', 'type2')\n def func():\n     return func.type1(1), func.type2(2)\n\n assert (\n     ('type1', 1),\n     ('type2', 2),\n ) == func()\n\nBe careful when applying ``register_callback`` to things other than a plain\n``function``. For example, suppose you want to register a class method:\n\n.. code:: python\n\n class Example(object):\n\n     # wrong, `classmethod` is not bound.\n     @register_callback('A')\n     @classmethod\n     def func(cls):\n         pass\n\n\n # try the following code instead.\n register_callback('A')(Example.func)\n\nPure Interface of ``sdataflow`` Language\n----------------------------------------\n\n``sdataflow.lang.parse`` can be used to parse the definition of a\ndataflow:\n\n::\n\n parse(doc)\n     input: `doc` with type of six.binary_type or six.text_type.\n     output: linear ordering and root nodes of dataflow.\n\n``parse`` returns a 2-tuple: the first element is a list giving the linear\nordering of the dataflow, and the second element is a list of the root nodes of\nthe forest.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/haoxun/sdataflow", "keywords": null, "license": "UNKNOWN", "maintainer": null, "maintainer_email": null, "name": "sdataflow", "package_url": "https://pypi.org/project/sdataflow/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/sdataflow/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/haoxun/sdataflow" }, "release_url": "https://pypi.org/project/sdataflow/0.3/", "requires_dist": null, "requires_python": null, "summary": "A simple language to describe dataflow between entries, implemented in Python.", "version": "0.3" }, "last_serial": 1590968, "releases": { "0.3": [ { "comment_text": "", "digests": { "md5": "7c243278d43f195ffd803a3e619a84f5", "sha256": "6d610195a0ce882f63a835939bca9a4d1579791548d9ad4f3726919e16ec8634" }, "downloads": -1, "filename": 
"sdataflow-0.3.tar.gz", "has_sig": false, "md5_digest": "7c243278d43f195ffd803a3e619a84f5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 259062, "upload_time": "2015-04-27T16:57:05", "url": "https://files.pythonhosted.org/packages/35/bd/6915da863cb8fafa86ef25ae92a945e0e78f1b407a237f2c924c4dd5ba65/sdataflow-0.3.tar.gz" } ], "0.3.2-rc-1": [ { "comment_text": "", "digests": { "md5": "bf9ec09325ba4e344a5c410798694ddc", "sha256": "be2561f7fc2f2a17a8d6a24538d9e66733a716df54e40f6479fee3cbda42dab7" }, "downloads": -1, "filename": "sdataflow-0.3.2-rc-1.tar.gz", "has_sig": false, "md5_digest": "bf9ec09325ba4e344a5c410798694ddc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 265210, "upload_time": "2015-06-13T18:17:44", "url": "https://files.pythonhosted.org/packages/9f/a0/02d8cf2bed261ae8159706c4d2b5b1d513ee1a442718ee7a701010b4999c/sdataflow-0.3.2-rc-1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7c243278d43f195ffd803a3e619a84f5", "sha256": "6d610195a0ce882f63a835939bca9a4d1579791548d9ad4f3726919e16ec8634" }, "downloads": -1, "filename": "sdataflow-0.3.tar.gz", "has_sig": false, "md5_digest": "7c243278d43f195ffd803a3e619a84f5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 259062, "upload_time": "2015-04-27T16:57:05", "url": "https://files.pythonhosted.org/packages/35/bd/6915da863cb8fafa86ef25ae92a945e0e78f1b407a237f2c924c4dd5ba65/sdataflow-0.3.tar.gz" } ] }