{ "info": { "author": "Kyle Wilcox", "author_email": "kyle@axds.co", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Operating System :: POSIX :: Linux", "Programming Language :: Python", "Topic :: Scientific/Engineering" ], "description": "## ncreplayer\n\n[![Build Status](https://travis-ci.org/axiom-data-science/ncreplayer.svg?branch=master)](https://travis-ci.org/axiom-data-science/ncreplayer)\n[![License](http://img.shields.io/:license-mit-blue.svg)](http://doge.mit-license.org)\n\n\nSmall little utility designed to load a CF DSG compliant netCDF file and replay it back onto a Kafka topic in either `batch` mode or `stream` mode. Optionally control the timestamps and timedeltas in the original file using configuration parameters.\n\nData is formatted as described in the AVRO schema file `schema.avsc`. You can choose to serialize the data as `avro`, `msgpack` or the default `json`.\n\n\n## WHY?\n\nThis utility is designed to make my life easier. It isn't inteded to be used by many people but has been useful for:\n\n* Playing back data in a netCDF file with different times and intervals than what is defined in the file.\n* Setting up a quick stream of data for load testing.\n* Transitioning parts of large systems that rely on static files to stream processing.\n* Accepting a standardized data format from collaborators (netCDF/CF/DSG) and being able to stream it into larger systems.\n\n\n## Data Format\n\n**Simplest example**\n```json\n{\n \"uid\": \"1\",\n \"time\": \"2019-04-01T00:00:00Z\",\n \"lat\": 30.5,\n \"lon\": -76.5,\n}\n```\n\n**Full Example**\n```yaml\n{\n \"uid\": \"1\",\n \"gid\": null,\n \"time\": \"2019-04-01T00:00:00Z\",\n \"lat\": 30.5,\n \"lon\": -76.5,\n \"z\": null,\n \"values\": {\n \"salinity\": 30.2,\n \"temperature\": 46.5\n },\n \"meta\": \"\"\n}\n```\n\n* `values` objects are optional and are a multi-type AVRO `map`.\n* `meta` is optional and open-ended. It is intended to carry along metadata to describe the `values`. I recommend using `nco-json`. This is useful if the system listening to these messages needs some context about the data, like streaming data to a website. YMMV.\n\n\n## Configuration\n\nThis program uses [`Click`](https://click.palletsprojects.com/) for the CLI interface. I probably spent 50% of the time making this utility just playing around with the CLI interface. I have no idea if what I came up with is mind-blowingly awesome or a bunch of crap. It works. Comments are welcome.\n\n```sh\n$ ncreplay\nUsage: ncreplay [OPTIONS] FILENAME COMMAND [ARGS]...\n\nOptions:\n --brokers TEXT Kafka broker string (comman separated)\n [required]\n --topic TEXT Kafka topic to send the data to [required]\n --packing [json|avro|msgpack] The data packing algorithm to use\n --registry TEXT URL to a Schema Registry if avro packing is\n requested\n --uid TEXT Variable name, global attribute, or value to\n use for the uid values\n --gid TEXT Variable name, global attribute, or value to\n use for the gid values\n --meta / --no-meta Include the `nco-json` metadata in each\n message?\n --help Show this message and exit.\n\nCommands:\n batch Batch process a netCDF file in chunks, pausing every [chunk]...\n stream Streams each unique timestep in the netCDF file every [delta]...\n```\n\n### batch\n\n```sh\n$ ncreplay /path/to/file.nc batch --help\nUsage: ncreplay batch [OPTIONS]\n\n Batch process a netCDF file in chunks, pausing every [chunk] records\n for [delta] seconds. Optionally change the [starting] time of the file\n and/or change each timedelta using the [factor] and [offset] parameters.\n\nOptions:\n -s, --starting [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]\n -f, --factor FLOAT\n -o, --offset FLOAT\n -d, --delta FLOAT\n -c, --chunk INTEGER\n --help Show this message and exit.\n```\n\n### stream\n\n```sh\n$ ncreplay /path/to/file.nc stream --help\nUsage: ncreplay stream [OPTIONS]\n\n Streams each unqiue timestep in the netCDF file every [delta] seconds.\n Optionally you can control the [starting] point of the file and this will\n re-calculate all of the timestamps to match the original timedeltas.\n\nOptions:\n -s, --starting [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S]\n -d, --delta FLOAT\n --help Show this message and exit\n```\n\n## Development / Testing\n\nThere are no tests (yet) but you can play around with the options using files included in this repository\n\nFirst setup a Kafka ecosystem\n\n```bash\n$ docker run -d --net=host \\\n -e ZK_PORT=50000 \\\n -e BROKER_PORT=4001 \\\n -e REGISTRY_PORT=4002 \\\n -e REST_PORT=4003 \\\n -e CONNECT_PORT=4004 \\\n -e WEB_PORT=4005 \\\n -e RUNTESTS=0 \\\n -e DISABLE=elastic,hbase \\\n -e DISABLE_JMX=1 \\\n -e RUNTESTS=0 \\\n -e FORWARDLOGS=0 \\\n -e SAMPLEDATA=0 \\\n --name ncreplayer-testing \\\n landoop/fast-data-dev:1.0.1\n```\n\nThen setup a listener\n\n```bash\n$ docker run -it --rm --net=host \\\n landoop/fast-data-dev:1.0.1 \\\n kafka-console-consumer \\\n --bootstrap-server localhost:4001 \\\n --topic axds-ncreplayer-data\n```\n\nNow batch or stream a file:\n\n```bash\n# Batch\n$ ncreplay tests/data/gda_example.nc batch -d 10 -c 10\n\n# Stream\n$ ncreplay tests/data/gda_example.nc stream -d 10\n```\n\nTo test the `avro` packing, setup a listener that will unpack the data automatically:\n\n```bash\n$ docker run -it --rm --net=host \\\n landoop/fast-data-dev:1.0.1 \\\n kafka-avro-console-consumer \\\n --bootstrap-server localhost:4001 \\\n --property schema.registry.url=http://localhost:4002 \\\n --topic axds-ncreplayer-data\n```\n\nAnd use `avro` packing\n\n```bash\n$ ncreplay --packing avro tests/data/gda_example.nc batch -d 10 -c 10\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/axiom-data-science/ncreplayer", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "ncreplayer", "package_url": "https://pypi.org/project/ncreplayer/", "platform": "", "project_url": "https://pypi.org/project/ncreplayer/", "project_urls": { "Homepage": "https://github.com/axiom-data-science/ncreplayer" }, "release_url": "https://pypi.org/project/ncreplayer/1.0.2/", "requires_dist": [ "click", "easyavro (>=2.5.0)", "msgpack-python", "numpy", "pandas", "pocean-core" ], "requires_python": "", "summary": "Playback netCDF CF DSG files onto a Kafka topic", "version": "1.0.2" }, "last_serial": 5268990, "releases": { "1.0.2": [ { "comment_text": "", "digests": { "md5": "b87b11f1092d6eeaddc92232e5e10c43", "sha256": "4ea8ca92e70d5d83eed1c27045e2ea30d7a6349cb8bf608584b5700b664e6a41" }, "downloads": -1, "filename": "ncreplayer-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "b87b11f1092d6eeaddc92232e5e10c43", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11574, "upload_time": "2019-05-14T19:17:45", "url": "https://files.pythonhosted.org/packages/58/a2/2ab3ce44d60b56df37d9f1d3331295e52476c30cd3fbaec3bccbe6527346/ncreplayer-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ee8a5618b8c46d84b1af58940855c37a", "sha256": "b3182e18c2ca7842c32e014b05ae8e38a68cc3a6da0fb8701569364c751b26e6" }, "downloads": -1, "filename": "ncreplayer-1.0.2.tar.gz", "has_sig": false, "md5_digest": "ee8a5618b8c46d84b1af58940855c37a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7881, "upload_time": "2019-05-14T19:17:47", "url": "https://files.pythonhosted.org/packages/09/12/c9a9f8b5d678a600b96ba80c6dbb210cee049b2f8ccb4fe9127a6167a9cc/ncreplayer-1.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b87b11f1092d6eeaddc92232e5e10c43", "sha256": "4ea8ca92e70d5d83eed1c27045e2ea30d7a6349cb8bf608584b5700b664e6a41" }, "downloads": -1, "filename": "ncreplayer-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "b87b11f1092d6eeaddc92232e5e10c43", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11574, "upload_time": "2019-05-14T19:17:45", "url": "https://files.pythonhosted.org/packages/58/a2/2ab3ce44d60b56df37d9f1d3331295e52476c30cd3fbaec3bccbe6527346/ncreplayer-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ee8a5618b8c46d84b1af58940855c37a", "sha256": "b3182e18c2ca7842c32e014b05ae8e38a68cc3a6da0fb8701569364c751b26e6" }, "downloads": -1, "filename": "ncreplayer-1.0.2.tar.gz", "has_sig": false, "md5_digest": "ee8a5618b8c46d84b1af58940855c37a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7881, "upload_time": "2019-05-14T19:17:47", "url": "https://files.pythonhosted.org/packages/09/12/c9a9f8b5d678a600b96ba80c6dbb210cee049b2f8ccb4fe9127a6167a9cc/ncreplayer-1.0.2.tar.gz" } ] }