{ "info": { "author": "Justin Cunningham", "author_email": "bam@yelp.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Intended Audience :: Developers", "Natural Language :: English", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: Implementation :: PyPy" ], "description": "# Data Pipeline Clientlib\n\n\nWhat is it?\n-----------\nData Pipeline Clientlib provides an interface to tail and publish to data pipeline topics.\n\n[Read More](https://engineeringblog.yelp.com/2016/07/billions-of-messages-a-day-yelps-real-time-data-pipeline.html)\n\n\nHow to download\n---------------\n```\ngit clone git@github.com:Yelp/data_pipeline.git\n```\n\n\nTests\n-----\nRunning unit tests\n```\nmake -f Makefile-opensource test\n```\n\n\nConfiguration\n-------------\nInclude the `data_pipeline` namespace in your `module_env_config` of `config.yaml`\nand configure following values for `kafka_ip`, `zk_ip` and `schematizer_ip`\n\n```\nmodule_env_config:\n\t...\n - namespace: data_pipeline\n config:\n kafka_broker_list:\n - :9092\n kafka_zookeeper: :2181\n schematizer_host_and_port: :8888\n ...\n```\n\n\nUsage\n-----\nRegistering a simple schema with the Schematizer service.\n```\nfrom data_pipeline.schematizer_clientlib.schematizer import get_schematizer\ntest_avro_schema_json = {\n \"type\": \"record\",\n \"namespace\": \"test_namespace\",\n \"source\": \"test_source\",\n \"name\": \"test_name\",\n \"doc\": \"test_doc\",\n \"fields\": [\n {\"type\": \"string\", \"doc\": \"test_doc1\", \"name\": \"key1\"},\n {\"type\": \"string\", \"doc\": \"test_doc2\", \"name\": \"key2\"}\n ]\n}\nschema_info = get_schematizer().register_schema_from_schema_json(\n namespace=\"test_namespace\",\n source=\"test_source\",\n schema_json=test_avro_schema_json,\n 
source_owner_email=\"test@test.com\",\n contains_pii=False\n)\n```\n\nCreating a simple Data Pipeline Message from payload data.\n```\nfrom data_pipeline.message import Message\nmessage = Message(\n    schema_id=schema_info.schema_id,\n    payload_data={\n        'key1': 'value1',\n        'key2': 'value2'\n    }\n)\n```\n\nStarting a Producer and publishing messages with it:\n```\nfrom data_pipeline.producer import Producer\nwith Producer() as producer:\n    producer.publish(message)\n```\n\nStarting a Consumer with name `my_consumer` that listens for\nmessages in all topics within the `test_namespace` and `test_source`.\nIn this example, the consumer repeatedly fetches a message, processes it, and\ncommits its offset.\n```\nfrom data_pipeline.consumer import Consumer\nfrom data_pipeline.consumer_source import TopicInSource\nconsumer_source = TopicInSource(\"test_namespace\", \"test_source\")\nwith Consumer(\n    consumer_name='my_consumer',\n    team_name='bam',\n    expected_frequency_seconds=12345,\n    consumer_source=consumer_source\n) as consumer:\n    while True:\n        message = consumer.get_message()\n        if message is not None:\n            # ... do stuff with message ...\n            consumer.commit_message(message)\n```\n\n\nDisclaimer\n----------\nWe're still in the process of setting up this package as a stand-alone. 
There may be additional work required to run Producers/Consumers and integrate with other applications.\n\n\nLicense\n-------\nData Pipeline Clientlib is licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0\n\n\nContributing\n------------\nEveryone is encouraged to contribute to Data Pipeline Clientlib by forking the Github repository and making a pull request or opening an issue.\n\n\n\nDocumentation\n-------------\n\nThe full documentation is at TODO (DATAPIPE-2031|abrar): Link to public servicedocs.\n\n\n\nHistory\n-------\n\n0.1.4 (2015-08-12)\n++++++++++++++++++\n\n* Defined consumer/producer registration API\n\n0.1.3 (2015-08-10)\n++++++++++++++++++\n\n* Added keys kwargs to data pipeline messages\n\n0.1.0 (2015-03-01)\n++++++++++++++++++\n\n* First release.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Yelp/data_pipeline", "keywords": "data_pipeline", "license": "UNKNOWN", "maintainer": null, "maintainer_email": null, "name": "data_pipeline", "package_url": "https://pypi.org/project/data_pipeline/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/data_pipeline/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/Yelp/data_pipeline" }, "release_url": "https://pypi.org/project/data_pipeline/0.9.12/", "requires_dist": null, "requires_python": null, "summary": "Provides an interface to consume and publish to data pipeline topics.", "version": "0.9.12" }, "last_serial": 2492761, "releases": { "0.9.12": [ { "comment_text": "", "digests": { "md5": "e10ebdaf19715de98a06db7d44fe1555", "sha256": "5fb536d573887689eb98cdbca83d1d741e63a38a20155bdcb855824749497afb" }, "downloads": -1, "filename": "data_pipeline-0.9.12-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e10ebdaf19715de98a06db7d44fe1555", "packagetype": "bdist_wheel", "python_version": 
"py2.py3", "requires_python": null, "size": 222359, "upload_time": "2016-11-30T22:51:37", "url": "https://files.pythonhosted.org/packages/1c/29/c6611efd53e50f69c1064fc0a75da79f1707a1b6a170e29deb0764189ba4/data_pipeline-0.9.12-py2.py3-none-any.whl" } ], "0.9.5": [ { "comment_text": "", "digests": { "md5": "043c9b00db6edcce76440396f242a639", "sha256": "053de6a9ef21c2d656c48e6471ea3f7549181343b5d463c2f6d288ad11ac2c7a" }, "downloads": -1, "filename": "data_pipeline-0.9.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "043c9b00db6edcce76440396f242a639", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 214808, "upload_time": "2016-11-23T06:21:44", "url": "https://files.pythonhosted.org/packages/f6/a3/5eadc752216a6b4383b98e406f53284422c49665f8ab52fba82521e9aa79/data_pipeline-0.9.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d56a12f58e788c45f921cf04500c9ad7", "sha256": "63778205679dc92726dc5daf9fbec95d0f7d2f0b16537276baae5ec6e90ad156" }, "downloads": -1, "filename": "data_pipeline-0.9.5.tar.gz", "has_sig": false, "md5_digest": "d56a12f58e788c45f921cf04500c9ad7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 131312, "upload_time": "2016-11-23T06:21:47", "url": "https://files.pythonhosted.org/packages/4b/de/82046042a57edc9e549d26223b5f9432c63cbb2377f2c25e3185441c7538/data_pipeline-0.9.5.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e10ebdaf19715de98a06db7d44fe1555", "sha256": "5fb536d573887689eb98cdbca83d1d741e63a38a20155bdcb855824749497afb" }, "downloads": -1, "filename": "data_pipeline-0.9.12-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "e10ebdaf19715de98a06db7d44fe1555", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 222359, "upload_time": "2016-11-30T22:51:37", "url": 
"https://files.pythonhosted.org/packages/1c/29/c6611efd53e50f69c1064fc0a75da79f1707a1b6a170e29deb0764189ba4/data_pipeline-0.9.12-py2.py3-none-any.whl" } ] }