{ "info": { "author": "Manish Tomar", "author_email": "manishtomar.public@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Topic :: Internet :: WWW/HTTP" ], "description": "bloc\n====\n\n.. image:: https://img.shields.io/pypi/v/bloc.svg\n :target: https://pypi.org/project/bloc\n :alt: PyPI package\n\n.. image:: https://travis-ci.org/manishtomar/bloc.svg?branch=master\n :target: https://travis-ci.org/manishtomar/bloc\n :alt: CI Status\n\n.. image:: https://codecov.io/github/manishtomar/bloc/branch/master/graph/badge.svg\n :target: https://codecov.io/github/manishtomar/bloc\n :alt: Test Coverage\n\nSimple single-master group membership framework based on Twisted that helps in partitioning workloads or\nstateless data among multiple nodes. It consists of 2 components: \n\n1) Standalone TCP server provided as a twisted plugin that keeps track of the group. Currently the protocol\n is HTTP but it may change in future.\n2) Twisted based client library talking to the above server (Other language libraries \u00e7an be implemented on demand)\n\nIt provides failure detection based on heartbeats. However, since it is single master the server is\na single point of failure. But since the server is completely stateless it can be easily restarted without any issues.\n\nIt works on Python 2.7 and 3.6.\n\nInstallation\n------------\n``pip install bloc`` on both server and client nodes. \n\nUsage\n-----\nOn server run ``twist -t 4 -s 6`` where 4 is client heartbeat timeout and 6 is settling timeout (explained below).\nThis will start HTTP server on port 8989 by default. One can give different port via ``-l tcp:port`` option.\n\nOn client, to equally partition ``items`` among multiple nodes, create ``BlocClient`` and call ``get_index_total``\non regular basis. Following is sample code:\n\n.. code-block:: python\n\n from functools import partial\n from twisted.internet import task\n from twisted.internet.defer import inlineCallbacks, gatherResults\n\n @inlineCallbacks\n def do_stuff(bc):\n \"\"\" Process items based on index and total got from BlocClient \"\"\"\n # get_index_total returns this node's index and total number of nodes in the group\n index_total = bc.get_index_total()\n if index_total is None:\n return\n index, total = index_total\n items = yield get_items_to_process()\n my_items = filter(partial(is_my_item, index, total), items)\n yield gatherResults([process_item(item) for item in my_items])\n\n def is_my_item(index, total, item):\n \"\"\" Can I process this item? \"\"\"\n return hash(item) % total + 1 == index\n\n @task.react\n def main(reactor):\n bc = BlocClient(reactor, \"server_ip:8989\", 3)\n bc.startService()\n # Call do_stuff every 2 seconds\n return task.LoopingCall(do_stuff, bc).start(2)\n\nHere, the important function is ``is_my_item`` which decides whether the item can be processed by\nthis node based on the index and total. It works based on item's hash. Needless to say, it is important\nto have stable hash function implemented for your item. Ideally, there shouldn't be any necessity for item\nto be anything other than some kind of key (string). This function will guarantee that only one node\nwill process a particular item provided bloc server provides a unique index to each node which it does.\n\nAs an example, say node A and B are running above code talking to same bloc server and items is following\nlist of userids being processed::\n\n 1. 365f54e9-5de8-4509-be16-38e0c37f5fe9\n 2. f6a6a396-d0bf-428a-b63b-830f98874b6c\n 3. 6bec3551-163d-4eb8-b2d8-1f2c4b106d33\n 4. b6691e16-1d95-42de-8ad6-7aa0c81fe080\n\nIf node A's ``get_item_index`` returns ``(1, 2)`` then ``is_my_item`` will return ``True`` for userid 1 and 3\nand in node B's ``get_item_index`` returns ``(2, 2)`` and ``is_my_item`` will return ``True`` for userid 2 and 4.\nThis way you can partition the user ids among multiple nodes.\n\nThe choice of hash function and keyspace may decide how equally the workload is distributed across the nodes.\n\nThe above code assumes that ``items`` is dynamic which will be true if it is based on your application\ndata like users. However, there are situations where it can be a fixed number if your data is already\npartitioned among fixed number of buckets in which case you can use bloc to assign buckets to each node.\nAn example of this is `otter's scheduling feature `_\nwhich partitions events to be executed among a fixed set of 10 buckets and distributes the buckets\nwithin < 10 nodes. Another example is kafka's partitioned topic. Each node can consume a particular\npartition based on index and total provided.\n\n``get_index_total`` returns ``None`` when there is no index assigned which can happen when nodes are added/removed\nor when the client cannot talk to the server due to any networking issues. The client must stop doing its work\nwhen this happens because next time the node could have different index assigned. This is why the\nclient's processing based on the index must be stateless.\n\nindex and total are internally updated on interval provided when creating ``BlocClient``. They can change \nover time but only after ``get_index_total`` returns ``None`` for settling period (provided when starting server).\nHence, ``get_index_total`` must be called at least once during the settling period to always have the latest value\nand not accidentally work with incorrect index.\n\nYou would have noticed ``bc.startService`` in above code which is required to be called before calling\n``get_index_total``. If you are setting up twisted server using service hierarchy then it is best\nto add ``BlocClient`` object as a child service. This way Twisted will start and stop the service when required.\n\nHow does it work:\n-----------------\n\nThe server at any time remains in either of the two states: SETTLING or SETTLED. It starts of in\nSETTLING and remains in that state when nodes start to join or leave. When the nodes stop having\nactivity (no more joins / leaving) for configurable time (called settling time given when starting server),\nit then transitions to SETTLED state at which time it assigns each node an index and informs them about it.\nThe settling time is provided with ``-s`` option when starting the server and should generally be few seconds\ngreater than heartbeat interval. This way the server avoids unnecessarily assigning indexes when\nmultiple nodes are joining/leaving at close times.\n\nClient hearbeats to the server at interval provided when creating ``BlocClient``. The server keeps\ntrack of clients based on this heartbeat and removes any client that does not heartbeat in configured\ntime. This time is provided as ``-t`` option when starting the server. The heartbeat timeout provided\nin server should be a little more than the heartbeat interval provided in client to take into account\nlatency or temporary network glitches. In example above, server times out after 4 seconds and client\nheartbeats every 3 seconds. This hearbeat mechanism provides failure detection. If any of the nodes\nis bad that node will just stop processing work.\n\nSome things to know:\n--------------------\n\n* **No security**: Currently the server does not authenticate the client and accepts from any client.\n The connection is also not encrypted. Depending on demand I am planning to add mutual TLS authentication\n* **No benchmarks done**. However, since its all in memory and Twisted it should easily scale to\n few hundred clients. I'll do some testing and update later.\n* By default ``twist`` logging is at info level and due to heartbeats in HTTP every request is logged.\n You can give ``--log-level=warn`` option to avoid it.\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/manishtomar/bloc", "keywords": "group membership distributed systems", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "bloc", "package_url": "https://pypi.org/project/bloc/", "platform": "", "project_url": "https://pypi.org/project/bloc/", "project_urls": { "Homepage": "https://github.com/manishtomar/bloc" }, "release_url": "https://pypi.org/project/bloc/0.1.2/", "requires_dist": [ "klein (>=15.0.0)", "treq (>=15.1.0)", "twisted (>=16.5.0)" ], "requires_python": "", "summary": "Single-master group membership framework", "version": "0.1.2" }, "last_serial": 2852075, "releases": { "0.0.1": [], "0.1.0": [ { "comment_text": "", "digests": { "md5": "fcf554c5e67b7cdcbd794b4573b67c5f", "sha256": "c4a83eaf192c2fe2eaf532a438dc6174290866941287ddc1065902458ecb1a4b" }, "downloads": -1, "filename": "bloc-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "fcf554c5e67b7cdcbd794b4573b67c5f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 12732, "upload_time": "2017-05-03T06:32:09", "url": "https://files.pythonhosted.org/packages/79/5d/e81fb7a177d9778ae8bdb8b3907711fdd45f82d4c017609b6c6cc4b7cd32/bloc-0.1.0-py2.py3-none-any.whl" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "c2a92f054974d932cd2783bfc69cb24d", "sha256": "336c3e2a45b06a69475da3c591303b7feef70063f9aba08aa02bd0c938bb734c" }, "downloads": -1, "filename": "bloc-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "c2a92f054974d932cd2783bfc69cb24d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 12737, "upload_time": "2017-05-03T06:39:16", "url": "https://files.pythonhosted.org/packages/59/74/22d7ce30f33d8a8be8244e5af6f1e2f52b0e688fe2f8bd4456ae08450c56/bloc-0.1.1-py2.py3-none-any.whl" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "576e2276b6fcf42a2dfa415fbed80f97", "sha256": "361d8f98cec055bc3193832adfebb894d7dad689cbe46de23eea80a23e9e460b" }, "downloads": -1, "filename": "bloc-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "576e2276b6fcf42a2dfa415fbed80f97", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13366, "upload_time": "2017-05-04T19:01:37", "url": "https://files.pythonhosted.org/packages/99/ed/df234b3a250148e6e0520ea0b2d96e5c3fec2dce52b7e0d9f509465ab16c/bloc-0.1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7b1d2c9646f14ccf55631bf54bd960fb", "sha256": "33850ced7a99c4598562510dbed0da207fb32cd00adb60353fbee56b81f2200f" }, "downloads": -1, "filename": "bloc-0.1.2.tar.gz", "has_sig": false, "md5_digest": "7b1d2c9646f14ccf55631bf54bd960fb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9116, "upload_time": "2017-05-04T19:01:39", "url": "https://files.pythonhosted.org/packages/61/b0/b889ff20dd89e91d0116d4c8a155d6a9d3dfa44b656077d97a08ddf6dddf/bloc-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "576e2276b6fcf42a2dfa415fbed80f97", "sha256": "361d8f98cec055bc3193832adfebb894d7dad689cbe46de23eea80a23e9e460b" }, "downloads": -1, "filename": "bloc-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "576e2276b6fcf42a2dfa415fbed80f97", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 13366, "upload_time": "2017-05-04T19:01:37", "url": "https://files.pythonhosted.org/packages/99/ed/df234b3a250148e6e0520ea0b2d96e5c3fec2dce52b7e0d9f509465ab16c/bloc-0.1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "7b1d2c9646f14ccf55631bf54bd960fb", "sha256": "33850ced7a99c4598562510dbed0da207fb32cd00adb60353fbee56b81f2200f" }, "downloads": -1, "filename": "bloc-0.1.2.tar.gz", "has_sig": false, "md5_digest": "7b1d2c9646f14ccf55631bf54bd960fb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9116, "upload_time": "2017-05-04T19:01:39", "url": "https://files.pythonhosted.org/packages/61/b0/b889ff20dd89e91d0116d4c8a155d6a9d3dfa44b656077d97a08ddf6dddf/bloc-0.1.2.tar.gz" } ] }