{ "info": { "author": "Gabes Jean", "author_email": "naparuba@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Environment :: Console", "Intended Audience :: System Administrators", "License :: OSI Approved :: MIT License", "Operating System :: MacOS :: MacOS X", "Operating System :: Microsoft :: Windows", "Operating System :: POSIX", "Programming Language :: Python", "Programming Language :: Python :: 2 :: Only", "Topic :: System :: Distributed Computing", "Topic :: System :: Monitoring", "Topic :: System :: Networking :: Monitoring" ], "description": "\r\n\r\n\r\nThis is a first release of the opsbro project, a service discovery / monitoring / light cfg management / command execution tool.\r\n\r\n[![Build Status](https://travis-ci.org/naparuba/opsbro.svg)](https://travis-ci.org/naparuba/opsbro)\r\n\r\n\r\n## **OpsBro**: monitoring and service discovery\r\n\r\n![Agent](images/agent.png)\r\n\r\n\r\n\r\n## Installation\r\n\r\n#### Prerequisites\r\nYou will need:\r\n\r\n * python 2.6 or 2.7 (python 3 is not currently supported)\r\n\r\nIt will automatically install:\r\n * python-leveldb\r\n * python-requests\r\n * python-jinja2 \r\n * python-cherrypy3\r\n\r\n\r\nTo monitor linux:\r\n * sysstat\r\n\r\nTo monitor a mongodb server:\r\n * python-pymongo\r\n\r\n\r\n\r\n\r\n#### Installation\r\n\r\nJust launch:\r\n\r\n python setup.py install\r\n\r\n\r\n## Run your daemon, and join the opsbro cluster\r\n\r\n#### Start OpsBro\r\n\r\nYou can start opsbro as a daemon with:\r\n \r\n /etc/init.d/opsbro start\r\n\r\nYou can also launch it in the foreground:\r\n\r\n opsbro agent start\r\n\r\n\r\n#### Stop opsbro daemon\r\nJust launch:\r\n \r\n opsbro agent stop\r\n\r\nOr use the init.d script:\r\n\r\n /etc/init.d/opsbro stop\r\n\r\n\r\n\r\n#### Display opsbro information\r\nJust launch:\r\n\r\n opsbro agent info\r\n\r\nYou will get information about the current opsbro agent state:\r\n\r\n \r\n![Agent](images/info.png) 
\r\n\r\n\r\n#### Agent cluster membership\r\n\r\n##### Add your local node to the node cluster\r\n\r\nFirst you need to install and launch the node on another server.\r\n\r\nThen on this other server you can launch:\r\n \r\n opsbro gossip join OTHER-IP\r\n\r\n##### Auto discover LAN nodes (UDP broadcast detection)\r\n\r\nIf your nodes are on the same LAN, you can use the UDP auto-detection to list other nodes on your network. It will send a UDP broadcast packet that other nodes will answer.\r\n\r\nNOTE: if you are using an encryption key (recommended) then you must already have set it. If not, the other nodes won't answer your query.\r\n\r\n\r\n opsbro gossip detect\r\n\r\nIf other nodes are present, they will be listed by the command.\r\n\r\n\r\nIf you want to auto-join the other node cluster, then you can use the --auto-join parameter:\r\n\r\n opsbro gossip detect --auto-join\r\n\r\nIt will choose the node to join as follows:\r\n * first try a proxy node, if one is present\r\n * if no proxy node is present, then use the first other node\r\n\r\n\r\n##### List your opsbro cluster members\r\nYou can list the cluster members on all nodes with:\r\n\r\n opsbro gossip members\r\n\r\n![Agent](images/members.png) \r\n\r\nYou will also see the new node in the UI if you enable it.\r\n\r\n\r\n\r\n\r\n\r\n## Discover your server (os, apps, location, ...)\r\n\r\nDetectors are rules that are executed by the agent to detect your server properties, such as:\r\n\r\n * OS (linux, redhat, centos, debian, windows, ...)\r\n * Applications (mongodb, redis, mysql, apache, ...)\r\n * Location (city, GPS Lat/Long)\r\n\r\nYou should declare a json object like:\r\n\r\n detector:\r\n interval: 10s\r\n apply_if: \"grep('centos', '/etc/redhat-release')\"\r\n groups: [\"linux\", \"centos\"]\r\n\r\n\r\n * Execute every 10 seconds\r\n * If the string centos is found in the file /etc/redhat-release\r\n * Then add the groups \"linux\" and \"centos\" to the local agent\r\n\r\n\r\n## Collect your server metrics (cpu, 
kernel, database metrics, etc)\r\n\r\nCollectors are code executed by the agent to grok and store local os or application metrics. \r\n\r\nYou can list available collectors with the command:\r\n\r\n opsbro collectors list\r\n \r\n \r\n![Agent](images/collectors-list.png) \r\n\r\n * enabled: it's running well\r\n * disabled: it's missing a library it needs to run\r\n\r\n## Execute checks\r\n\r\nYou can execute checks on your agent in two ways:\r\n * Use the collectors data and evaluate a check rule on it\r\n * Execute a nagios-like plugin\r\n\r\n### Common check parameters for evaluated and nagios plugins based checks\r\n\r\nSome parameters are common to the two check types you can define.\r\n\r\n * interval: how often the check is scheduled\r\n * if_group: if present, will declare and execute the check only if the agent group is present\r\n\r\n\r\n### Evaluate check rule on collectors data\r\n\r\nEvaluated checks use collectors data and should be defined with:\r\n * ok_output: python expression that creates a string that will be shown to the user if the state is OK\r\n * critical_if: python expression that tries to detect a CRITICAL state\r\n * critical_output: python expression that creates a string that will be shown to the user if the state is CRITICAL\r\n * warning_if: python expression that tries to detect a WARNING state\r\n * warning_output: python expression that creates a string that will be shown to the user if the state is WARNING\r\n * thresholds: [optional] a dict of thresholds that you can access from your check rule with \"configuration.thresholds.XXX\"\r\n \r\nThe evaluation is done like this:\r\n * if the critical expression is True => go CRITICAL\r\n * else if the warning expression is True => go WARNING\r\n * else go OK\r\n \r\nFor example, here is a cpu check on a linux server:\r\n\r\n check:\r\n interval: 10s\r\n if_group: linux\r\n \r\n ok_output: \"'OK: cpu is great: %s%%' % (100-{collector.cpustats.cpuall.%idle})\"\r\n \r\n 
critical_if: \"{collector.cpustats.cpuall.%idle} < {configuration.thresholds.cpuidle.critical}\"\r\n critical_output: \"'Critical: cpu is too high: %s%%' % (100-{collector.cpustats.cpuall.%idle})\"\r\n\r\n warning_if: \"{collector.cpustats.cpuall.%idle} < {configuration.thresholds.cpuidle.warning}\"\r\n warning_output: \"'Warning: cpu is very high: %s%%' % (100-{collector.cpustats.cpuall.%idle})\"\r\n \r\n thresholds :\r\n cpuidle :\r\n warning: 5\r\n critical: 1\r\n\r\n\r\n\r\n### Use Nagios plugins\r\n\r\nNagios based checks will use Nagios plugins and run them. Use them if you don't have access to the information you need in the collectors.\r\n\r\nThe parameter for this is:\r\n * script: the command line to execute your plugin\r\n \r\nHere is an example:\r\n \r\n check:\r\n if_group: linux\r\n script: \"$nagiosplugins$/check_mailq -w $mailq.warning$ -c $mailq.critical$\"\r\n interval: 60s\r\n \r\n mailq:\r\n warning: 1\r\n critical: 2\r\n\r\n\r\nNOTE: the $$ evaluation does not match the syntax of the previous checks. We will fix it in a future version, but that will break configurations written for the current version.\r\n\r\n\r\n## Notify check/node state change with emails\r\n\r\nYou can be notified about check state changes with handlers. 
Currently two are managed: email and slack.\r\n\r\nYou must define it in your local configuration:\r\n\r\n handler:\r\n type: mail\r\n severities:\r\n - ok\r\n - warning\r\n - critical\r\n - unknown\r\n contacts:\r\n - \"admin@mydomain.com\"\r\n addr_from: \"opsbro@mydomain.com\"\r\n smtp_server: localhost\r\n subject_template: \"email.subject.tpl\"\r\n text_template: \"email.text.tpl\"\r\n\r\n * type: email\r\n * severities: raise this handler only for these new states\r\n * contacts: who to notify\r\n * addr_from: the address your email is sent from\r\n * smtp_server: which SMTP server to use to send your notification\r\n * subject_template: jinja2 template for the email subject, from the directory /templates\r\n * text_template: jinja2 template for the email content, from the directory /templates\r\n\r\nThen your handler must be registered into your checks, in the \"handlers\" list.\r\n\r\n## Notify check/node state change into slack\r\n\r\nYou can be notified about check state changes with handlers. Currently two are managed: email and slack.\r\n\r\nYou must define it in your local configuration:\r\n\r\n handler:\r\n id: slack\r\n type: slack\r\n severities:\r\n - ok\r\n - warning\r\n - critical\r\n - unknown\r\n\r\n token: ''\r\n channel: '#alerts'\r\n\r\n * type: slack\r\n * severities: raise this handler only for these new states\r\n * token: your slack token. Get one at https://api.slack.com/custom-integrations/legacy-tokens\r\n * channel: the channel the alerts should go to. If the channel does not exist, it will try to create it\r\n\r\n\r\nThen your handler must be registered into your checks, in the \"handlers\" list.\r\n\r\n## Export your nodes and check states into Shinken\r\n\r\nYou can export all your node information (new, deleted or changed nodes) into your Shinken installation. 
It will automatically:\r\n * create a new host when you start a new node\r\n * change the host configuration (host templates) when a group is added to or removed from your agent\r\n * remove your host when you delete your agent (by terminating your Cloud instance for example)\r\n\r\nYou must add the following local configuration to the agent installed on your shinken arbiter daemon:\r\n\r\n shinken:\r\n cfg_path: \"/etc/shinken/agent\"\r\n\r\n * cfg_path: a directory where all your nodes will be synced as shinken hosts configuration (cfg files)\r\n\r\nCurrently it also uses hard-coded paths to manage your shinken communication:\r\n * the unix socket */var/lib/shinken/nagios.cmd* should be created by your shinken arbiter/receiver [named-pipe](http://shinken.io/package/named-pipe) module.\r\n * it calls the \"/etc/init.d/shinken reload\" command when a node configuration is changed (new, removed or group/template change) \r\n\r\n\r\n## Access your node information by DNS\r\n\r\nIf you enable the DNS interface for your agent, it will start an internal DNS server that will answer queries. 
So your applications will be able to easily reach the valid nodes that are still alive or launched.\r\n\r\nYou must define a dns object in your local configuration to enable this interface:\r\n\r\n dns:\r\n enabled : true\r\n\t port : 6766\r\n\t domain : \".opsbro\"\r\n\r\n\r\n * enabled: whether to start the listener\r\n * port: UDP port to listen on for requests\r\n * domain: allowed domain to request, should match a specific domain name to be redirected to this \r\n\r\nYou can test it with dig:\r\n\r\n $dig -p 6766 @localhost redis.group.dc1.opsbro +short\r\n 192.168.56.103\r\n 192.168.56.105\r\n\r\nIt lists all available nodes in the \"redis\" group.\r\n\r\n\r\nTODO: document the local dns redirect to link into resolv.conf\r\n\r\n\r\n## Export and store your application telemetry into the agent metric system \r\n\r\n### Real time application performance metrics\r\n\r\nThe statsd protocol is a great way to extract performance statistics from your application into your monitoring system. 
Your application will extract small timing metrics (like function execution time) and send them in a non-blocking way (over UDP).\r\n\r\nThe statsd daemon part will aggregate counters for 10s and will then export the min/max/average/99th percentile to a graphite server so you can store and graph them.\r\n\r\nIn order to enable the statsd listener, you must define a statsd object in your local configuration:\r\n\r\n statsd:\r\n enabled : true\r\n port : 8125\r\n interval : 10\r\n\r\n * enabled: whether to launch the statsd listener\r\n * port: UDP port to listen on\r\n * interval: store metrics in memory for X seconds, then export them into graphite for storage\r\n\r\n**TODO**: change the ts group that enable this feature to real *role*/addons\r\n\r\n\r\n### Store your metrics for long term into Graphite\r\n\r\nYou can store your metrics in a graphite-like system that will automatically aggregate your data.\r\n\r\nIn order to enable the graphite system, you must declare a graphite object in your local configuration:\r\n \r\n graphite:\r\n enabled : true\r\n port : 2003\r\n udp : true\r\n tcp : true\r\n\r\n * enabled: whether to launch the graphite listener\r\n * port: TCP and/or UDP port to listen on for metrics\r\n * udp: listen in UDP mode\r\n * tcp: listen in TCP mode\r\n\r\n**TODO**: finish the graphite part, to show storing and forwarding mode.\r\n\r\n## Get access to your graphite data with an external UI like Grafana\r\n\r\n**TODO**: test and show graphite /render interface of the agent into grafana\r\n\r\n\r\n## Get notified when there is a node change (websocket)\r\n\r\nYou can get notified about a node change (new node, deleted node, new group or check state change) with a websocket interface on your local agent.\r\n\r\nAll you need is to enable it in your local node configuration:\r\n\r\n websocket:\r\n\t enabled : true\r\n\t port : 6769\r\n\r\n * enabled: whether to start the websocket listener\r\n * port: which TCP port to listen on for websocket connections\r\n \r\n**Note**: the 
HTTP UI needs the websocket listener to be launched to access node states.\r\n\r\n## Store your data/configuration into the cluster (KV store)\r\n\r\nThe agent exposes a KV store system. It will automatically dispatch your data to 3 nodes allowed to store raw data.\r\n\r\nWrite and read queries can be done on any node you want; it doesn't have to be a KV storing node. The agent will forward your read/write query to the node that manages your key, and that node will synchronize the data with 2 other nodes after it has answered the requester.\r\n\r\nThe key dispatching between nodes is based on SHA1 consistent hashing.\r\n\r\nThe API is:\r\n\r\n * **/kv/** *(GET)* : list all keys stored on the local node\r\n * **/kv/KEY-NAME** *(GET)*: get the key value with base64 encoding\r\n * **/kv/KEY-NAME** *(PUT)* : store a key value\r\n * **/kv/KEY-NAME** *(DELETE)* : delete a key\r\n \r\n\r\n**TODO**: change the KV store from group to a role in the configuration\r\n\r\n## Use your node states and data/configuration (KV) to keep your application configuration up-to-date\r\n\r\n// Generators\r\n\r\n\r\n## How to see collected data? (metrology)\r\n\r\nBy default, the opsbro agent collects a lot of metrology data from your OS and applications. This is done by \"collectors\" objects. 
You can easily list them and look at the collected data by launching:\r\n\r\n opsbro collectors show\r\n\r\n\r\n**TODO** Allow to export into json format\r\n\r\n\r\n## How to install applications on your system thanks to OpsBro\r\n\r\n**TODO** Document installers, and rename them in fact (asserters?)\r\n\r\n## How to see docker performance information?\r\n\r\nIf docker is launched on your server, OpsBro will get data from it, like collectors, images and performances.\r\n\r\nTo list all of this just launch:\r\n\r\n opsbro docker show\r\n\r\n\r\n## Get quorum strong master/slave elections thanks to RAFT\r\n \r\nYou can ask your node cluster system to elect a node for a specific task or application thanks to the RAFT implementation inside the agents.\r\n\r\n**TODO**: currently in alpha stage in the code \r\n\r\n\r\n## Access the agent API documentation\r\n\r\n**TODO**: this must be developed, in order to access your node API available in text or HTTP\r\n\r\n\r\n## Is there a UI available?\r\n\r\nYes. 
There is a UI available on the opsbro.io website (SaaS).", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://opsbro.io", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "opsbro", "package_url": "https://pypi.org/project/opsbro/", "platform": "", "project_url": "https://pypi.org/project/opsbro/", "project_urls": { "Homepage": "http://opsbro.io" }, "release_url": "https://pypi.org/project/opsbro/0.3/", "requires_dist": null, "requires_python": "", "summary": "OpsBro is a service discovery tool", "version": "0.3" }, "last_serial": 3151414, "releases": { "0.1b1": [ { "comment_text": "", "digests": { "md5": "a93c4d26e15f09f5479623e6bed9c1d9", "sha256": "9354daaf7fc23e23ebdadf4961ba33f0e0ce352512089c5392759b17f5ad87f4" }, "downloads": -1, "filename": "opsbro-0.1b1.tar.gz", "has_sig": false, "md5_digest": "a93c4d26e15f09f5479623e6bed9c1d9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 863626, "upload_time": "2017-09-04T08:02:18", "url": "https://files.pythonhosted.org/packages/8c/0e/054fc26a2235932342422846e667a243638e5b21b1569a1e28ea2e19a2e6/opsbro-0.1b1.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "599ebcd57fba9019f13cb28cb23faae9", "sha256": "3010c02375c806e02e1cffb9959a5850f103c4454c254a9dccc6e7313b1975af" }, "downloads": -1, "filename": "opsbro-0.3.tar.gz", "has_sig": false, "md5_digest": "599ebcd57fba9019f13cb28cb23faae9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 872597, "upload_time": "2017-09-05T20:01:33", "url": "https://files.pythonhosted.org/packages/3c/7b/4b115b16a89c1a1aa1912233a94bbb2accabc3dc5fcfa55785ee7dc2efa6/opsbro-0.3.tar.gz" } ], "0.3b1": [ { "comment_text": "", "digests": { "md5": "74adba16545ef42e4f6b7462588bfec9", "sha256": "0a0e59b32e4134a410c42733437600800ecf76db23c8007e0c1a891dfc10c504" }, "downloads": 
-1, "filename": "opsbro-0.3b1.tar.gz", "has_sig": false, "md5_digest": "74adba16545ef42e4f6b7462588bfec9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 868800, "upload_time": "2017-09-05T19:33:55", "url": "https://files.pythonhosted.org/packages/05/58/4c0f7a9936689f3b3a0677d40ed37cc76c83874eccf7132552b9d29cb46d/opsbro-0.3b1.tar.gz" } ], "0.3b2": [ { "comment_text": "", "digests": { "md5": "995a13f5a33c45c8df4e7dc75582958c", "sha256": "0000be3944d238bf61a4058cb0a8f83b337fefaf7b7549464e34594f6562d75c" }, "downloads": -1, "filename": "opsbro-0.3b2.tar.gz", "has_sig": false, "md5_digest": "995a13f5a33c45c8df4e7dc75582958c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 872812, "upload_time": "2017-09-05T19:50:54", "url": "https://files.pythonhosted.org/packages/4e/dc/25525ee2bd29dbbe39b88919bfc31b9e45c9ef269840d72632e9ef95b2bc/opsbro-0.3b2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "599ebcd57fba9019f13cb28cb23faae9", "sha256": "3010c02375c806e02e1cffb9959a5850f103c4454c254a9dccc6e7313b1975af" }, "downloads": -1, "filename": "opsbro-0.3.tar.gz", "has_sig": false, "md5_digest": "599ebcd57fba9019f13cb28cb23faae9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 872597, "upload_time": "2017-09-05T20:01:33", "url": "https://files.pythonhosted.org/packages/3c/7b/4b115b16a89c1a1aa1912233a94bbb2accabc3dc5fcfa55785ee7dc2efa6/opsbro-0.3.tar.gz" } ] }