{ "info": { "author": "Charlie Lewis", "author_email": "charliel@lab41.org", "bugtrack_url": null, "classifiers": [ "Environment :: Other Environment", "Operating System :: POSIX :: Linux", "Programming Language :: Python", "Topic :: Software Development :: Libraries :: Application Frameworks" ], "description": "Hemlock\n=======\n[![PyPI version](https://badge.fury.io/py/hemlock.png)](http://badge.fury.io/py/hemlock) [![Build Status](https://travis-ci.org/Lab41/Hemlock.png?branch=master)](https://travis-ci.org/Lab41/Hemlock) [![downloads](https://pypip.in/d/hemlock/badge.png)](http://crate.io/packages/hemlock/) [![Coverage Status](https://coveralls.io/repos/Lab41/Hemlock/badge.png?branch=master)](https://coveralls.io/r/Lab41/Hemlock?branch=master)\n\nHemlock is an open-source project exploring ways to create a common data access\nlayer that eliminates the need to understand underlying data topologies but\nstill preserving the requirements of each data source such as access control,\nperformance, and formats.\n\n![Hemlock L](https://raw.github.com/Lab41/Hemlock/master/docs/images/overview_hemlock.png \"Hemlock\")\n\nInstall instructions\n====================\n\nOption A, install using pip:\n\n```bash\nsudo pip install hemlock\n```\n\nOption B, build from source:\n\n```bash\ngit clone https://github.com/Lab41/Hemlock.git\ncd Hemlock\nsudo python setup.py install\n```\n\nRequired Dependencies\n---------------------\n\nPython modules:\n- [MySQLdb](http://mysql-python.sourceforge.net/MySQLdb.html)\n- [texttable](https://pypi.python.org/pypi/texttable)\n- [couchbase](http://www.couchbase.com/communities/python/getting-started) >= 1.0\n- [APScheduler](https://pypi.python.org/pypi/APScheduler)\n\nBuild a server running [MySQL](http://www.mysql.com/) to store user accounts, tenants, and registered \nsystems.\n\n\nBuild a [Couchbase 2.0](http://www.couchbase.com/) cluster to store metadata and data of registered systems.\n\nBuild an [ElasticSearch 0.90.2](http://www.elasticsearch.org/) cluster to store the index of Couchbase.\n\nAdd XDCR one-way replication from Couchbase to ElasticSearch using this [plugin](https://github.com/couchbaselabs/elasticsearch-transport-couchbase) (Note, grab version 1.1.0).\n\nOnce the plugin is installed, be sure and update the couchbase_template.json under plugins/transport-couchbase/ to have the following:\n\n```json\n{\n \"template\" : \"*\",\n \"order\" : 10,\n \"mappings\" : {\n \"couchbaseCheckpoint\" : {\n \"_source\" : {\n \"includes\" : [\"doc.*\"]\n },\n \"date_detection\" : false,\n \"dynamic_templates\": [\n {\n \"store_no_index\": {\n \"match\": \"*\",\n \"mapping\": {\n \"store\" : \"no\",\n \"index\" : \"no\",\n \"include_in_all\" : false\n }\n }\n }\n ]\n },\n \"_default_\" : {\n \"_source\" : {\n \"includes\" : [\"meta.*\"]\n },\n \"date_detection\" : false,\n \"properties\" : {\n \"meta\" : {\n \"type\" : \"object\",\n \"include_in_all\" : false\n }\n }\n }\n }\n}\n```\n\nOnce that is added, start up ElasticSearch with ``bin/elasticsearch`` and then perform the following the first time:\n\n```bash\ncurl -XPUT http://localhost:9200/_template/couchbase -d @plugins/transport-couchbase/couchbase_template.json\n```\n\nInstalling required databases\n-----------------------------\n\n1. Create database ``hemlock`` in [MySQL](http://www.mysql.com/).\n2. Create bucket ``hemlock`` in [Couchbase](http://www.couchbase.com/).\n3. Create index ``hemlock`` in [ElasticSearch](http://www.elasticsearch.org/).\n\n\nGetting started\n----------------\n\n1. Create Hemlock credentials (see 'Credential files')\n ```bash\n HEMLOCK_MYSQL_SERVER=192.168.1.10\n HEMLOCK_MYSQL_USERNAME=user\n HEMLOCK_MYSQL_DB=hemlock\n HEMLOCK_MYSQL_PW=pass\n HEMLOCK_COUCHBASE_SERVER=192.168.1.20\n HEMLOCK_COUCHBASE_BUCKET=hemlock\n HEMLOCK_COUCHBASE_USERNAME=hemlock\n HEMLOCK_COUCHBASE_PW=pass\n HEMLOCK_ELASTICSEARCH_ENDPOINT=192.168.1.30\n ```\n\n (if you'd like these to persist, consider adding export before each line and performing ``source`` on the file)\n2. Create a tenant, role, user, and data source system\n ```bash\n hemlock tenant-create --name Project1\n\n hemlock tenant-list\n \n hemlock role-create --name User\n \n hemlock role-list\n \n hemlock user-create --name User1 \\\n --username Username1 \\\n --email user1@email.com \\\n --rold_id 42ba73f9-0ab6-4a50-908c-1585955754f4 \\\n --tenant_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6\n \n hemlock user-list\n \n hemlock register-local-system --name System1 \\\n --data_type csv \\\n --description \"description\" \\\n --tenant_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6 \\\n --hostname system1.fqdn \\\n --endpoint http://hemlock.server/ \\\n --poc_name user1 \\\n --poc_email user1@email.com\n \n hemlock system-list\n ```\n3. Add credentials for data source system, for example: mysql_creds\n ```bash\n MYSQL_SERVER=192.168.1.30\n MYSQL_DB=db1\n #MYSQL_TABLE=table1\n MYSQL_USERNAME=user\n MYSQL_PW=pass\n ```\n4. Store a client\n ```bash\n hemlock client-store --name mysql_client_1 --type mysql --system_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6 --credential_file /path/to/mysql_creds \n\n hemlock client-list\n ```\n5. Add credentials for hemlock\n ```bash\n hemlock hemlock-server-store --credential_file /path/to/hemlock_creds\n ```\n6. Create a schedule server (optional)\n ```bash\n hemlock schedule-server-create --name schedule_server_1\n\n hemlock schedule-server-list\n ```\n7. Add a schedule for the data source system to run (optional)\n ```bash\n hemlock client-schedule --name schedule1 \\\n --minute \"54\" \\\n --hour \"12\" \\\n --day_of_month \"*\" \\\n --month \"*\" \\\n --day_of_week \"*\" \\\n --client_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6\n --schedule_server_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6\n\n hemlock schedule-list\n ```\n8. Perform a test run for pulling data from the data source system\n ```bash\n hemlock client-run --uuid 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6\n ```\n9. Search for data that has been loaded into Hemlock\n ```bash\n hemlock query-data --user 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6 --query foo\n ```\n or\n ```bash\n Direct with elasticsearch:\n\n http://elasticsearch.fqdn:9200/hemlock/_search?q=foo\n\n Which returns something the following:\n \n {\n \"took\": 14,\n \"timed_out\": false,\n \"_shards\": {\n \"total\": 20,\n \"successful\": 20,\n \"failed\": 0\n },\n \"hits\": {\n \"total\": 1,\n \"max_score\": 3.6582048,\n \"hits\": [\n {\n \"_index\": \"hemlock\",\n \"_type\": \"couchbaseDocument\",\n \"_id\": \"865f458b4421ae5fd758e3c81aca9f8d8b4696b6\",\n \"_score\": 3.6582048,\n \"_source\": {\n \"meta\": {\n \"id\": \"865f458b4421ae5fd758e3c81aca9f8d8b4696b6\",\n \"rev\": \"1-0010f1ac6045ccf40000000000000000\",\n \"flags\": 0,\n \"expiration\": 0\n }\n }\n }\n ]\n }\n }\n\n Now we can feed the 'id' into Couchbase to return the full document:\n \n http://couchbase.fqdn:8092/hemlock/865f458b4421ae5fd758e3c81aca9f8d8b4696b6\n \n Which returns something like the following:\n \n {\n \"hemlock-system\": \"a50b86c2-59f7-42a3-aa67-3367579189fe\",\n \"hemlock-date\": \"2013-09-03 16:10:20\",\n \"stream\": \"DOYLIE\"\n }\n ```\n \nCredential files\n----------------\n\n1. Create a ``hemlock_creds`` file (see hemlock_creds_sample for an example): \n\n ```bash\n HEMLOCK_MYSQL_SERVER=192.168.1.10\n HEMLOCK_MYSQL_USERNAME=user\n HEMLOCK_MYSQL_DB=hemlock\n HEMLOCK_MYSQL_PW=pass\n HEMLOCK_COUCHBASE_SERVER=192.168.1.20\n HEMLOCK_COUCHBASE_BUCKET=hemlock\n HEMLOCK_COUCHBASE_USERNAME=hemlock\n HEMLOCK_COUCHBASE_PW=pass\n ```\n \n2. Create credential files for each client you intend to use ([examples](https://github.com/Lab41/Hemlock/tree/master/hemlock/clients/)).\n\n\nCurrently supported data sources\n================================\n\nTechnology | Parameter | Python Module Dependencies\n---------- | --------- | ------------\nMySQL | mysql | MySQLdb\nMongoDB | mongo | pymongo\nRedis | redis | redis\nLocal FileSystem | fs | magic, pdfminer, xmltodict\nRESTful API | rest | \nStreams | stream_odd |\n\n\nAdding a new data source type\n-----------------------------\n\nCreate a new class under the clients folder for each new data source type. Most\nclasses will need two methods defined: ``connect_client`` and ``get_data``.\n\nThe following is a template that can be used to work from:\n\n```python\nclass HMyclient:\n def connect_client(self, client_dict):\n # return a handle that can be used to get data from the data source\n return c_server\n def get_data(self, client_dict, c_server, h_server, client_uuid):\n # data_list is an array of arrays to contain the data\n data_list = [[]]\n # desc_list is an array that contains the schema (if exists or known)\n desc_list = []\n return data_list, desc_list\n```\n\nUsage examples\n==============\n\n- Create a tenant\n\n ```bash\n hemlock tenant-create --name Project1\n ```\n- Create a role\n\n ```bash\n hemlock role-create --name User\n ```\n- Create a user\n\n ```bash\n hemlock user-create --name User1 \\\n --username Username1 \\\n --email user1@email.com \\\n --rold_id 42ba73f9-0ab6-4a50-908c-1585955754f4 \\\n --tenant_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6\n ```\n- Register a local system\n\n ```bash\n hemlock register-local-system --name System1 \\\n --data_type csv \\\n --description \"description\" \\\n --tenant_id 7d0f6b0d-334a-4d89-bd1a-70e8e1c04aa6 \\\n --hostname system1.fqdn \\\n --endpoint http://hemlock.server/ \\\n --poc_name user1 \\\n --poc_email user1@email.com\n ```\n- List registered systems\n\n ```bash\n hemlock system-list\n ```\n- List created users\n\n ```bash\n hemlock user-list\n ```\n- Lists created tenants\n\n ```bash\n hemlock tenant-list\n ```\n- [Connecting to a client](https://github.com/Lab41/Hemlock/tree/master/hemlock/clients/)\n- [Full CLI API list](https://github.com/Lab41/Hemlock/blob/master/docs/CLI.md)\n\n\nRelated repositories\n====================\n\n - [Hemlock-REST](http://lab41.github.io/Hemlock-REST/)\n - [Hemlock-Frontend](http://lab41.github.io/Hemlock-Frontend/)\n\nDocumentation\n=============\n\n - [Docs](http://lab41.github.io/Hemlock/docs/_build/html/index.html)\n\nTests\n=====\n\nThe tests for this project use [py.test](http://pytest.org/latest/)\n\n\nContributing to Hemlock\n=======================\n\nWhat to contribute? Awesome! Issue a pull request or see more details [here](https://github.com/Lab41/Hemlock/blob/master/CONTRIBUTING.md).", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://lab41.github.io/Hemlock", "keywords": "hemlock,metadata,cache,heterogeneous", "license": "LICENSE.txt", "maintainer": null, "maintainer_email": null, "name": "hemlock", "package_url": "https://pypi.org/project/hemlock/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/hemlock/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://lab41.github.io/Hemlock" }, "release_url": "https://pypi.org/project/hemlock/0.1.6/", "requires_dist": null, "requires_python": null, "summary": "Hemlock is a way of providing a common data access layer.", "version": "0.1.6" }, "last_serial": 899873, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "3a9c9c9052b023b7f7724f8fde20a79c", "sha256": "ad4c439e28255d4d1a3ed539175a3f805c5263c64f25fc807c04b68509c2d8e2" }, "downloads": -1, "filename": "hemlock-0.1.0.tar.gz", "has_sig": false, "md5_digest": "3a9c9c9052b023b7f7724f8fde20a79c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 152098, "upload_time": "2013-08-20T16:16:37", "url": "https://files.pythonhosted.org/packages/59/d2/13f052d83d5c7ecd88e29f11c5e0dfd5ccb439fe342933a0fb095965a417/hemlock-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "1c30fc5b858e7b10e8f12c70e0c60a33", "sha256": "3808c5ecb590c5b90ddb9348138675624090cc5933578e8d2c878e861919b3f6" }, "downloads": -1, "filename": "hemlock-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1c30fc5b858e7b10e8f12c70e0c60a33", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 667548, "upload_time": "2013-09-03T21:22:47", "url": "https://files.pythonhosted.org/packages/8e/70/7ccf9ddc84bfb6d0fc176d958be1af44b9c750a03fca93caf11cc5797ea5/hemlock-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "c952c8be5b069150be3d346506e4c9f4", "sha256": "36ab9edc099484355c458a545c6b3319e9ccf32cdcfbc5fcc1f807f5fa30a685" }, "downloads": -1, "filename": "hemlock-0.1.2.tar.gz", "has_sig": false, "md5_digest": "c952c8be5b069150be3d346506e4c9f4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 669610, "upload_time": "2013-09-04T03:52:12", "url": "https://files.pythonhosted.org/packages/57/4d/3bdda29cdcc900c9f68873f738d783dfb7df18fde2c8d25b8ccebc649f00/hemlock-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "97f5b517b4744b56aae7cab12d32aa06", "sha256": "9f212935147f18ea064197e80529ce9039aca73fddccd542010682917af69948" }, "downloads": -1, "filename": "hemlock-0.1.3.tar.gz", "has_sig": false, "md5_digest": "97f5b517b4744b56aae7cab12d32aa06", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1008292, "upload_time": "2013-09-16T19:47:54", "url": "https://files.pythonhosted.org/packages/2e/18/5a554d4d6a4fb62701deca2018b1a03f371451ab60fc26500e3cc0492b91/hemlock-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "428973f90bbcf414ef700c2cfd258abf", "sha256": "221d7b5a909dfce37d58538381b9301eab4bf3acd2d0712ed88c3938d986f1ea" }, "downloads": -1, "filename": "hemlock-0.1.4.tar.gz", "has_sig": false, "md5_digest": "428973f90bbcf414ef700c2cfd258abf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1037379, "upload_time": "2013-09-23T22:42:47", "url": "https://files.pythonhosted.org/packages/a5/84/37db392e9ea7803a5f41b3bd7e5da5c93217fd186f3a7ed527a974ef51c6/hemlock-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "e7a3765adea0af7e01a29174d644548a", "sha256": "a4b0542159a9f7328bac37b4f9beba70e8ae7a5494da0335877e20a1cfe31115" }, "downloads": -1, "filename": "hemlock-0.1.5.tar.gz", "has_sig": false, "md5_digest": "e7a3765adea0af7e01a29174d644548a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1066262, "upload_time": "2013-10-17T17:33:25", "url": "https://files.pythonhosted.org/packages/06/b1/f3d945349923249f675b010f07786ecfaf2aa5486c3f3d5ebd150e33a62b/hemlock-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "9a5ff30ef80eb3586b5de22e86ff1f6c", "sha256": "60c6f909bdab813b45255474e7a1489448063d36f4d63b3f7071ffe790d26d03" }, "downloads": -1, "filename": "hemlock-0.1.6.tar.gz", "has_sig": false, "md5_digest": "9a5ff30ef80eb3586b5de22e86ff1f6c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1066470, "upload_time": "2013-10-21T16:50:58", "url": "https://files.pythonhosted.org/packages/08/bf/84fe2cc76a6af4f69cfedbcd82026c0d7d7cdc9dbbf19a9d9d377b08bfb3/hemlock-0.1.6.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9a5ff30ef80eb3586b5de22e86ff1f6c", "sha256": "60c6f909bdab813b45255474e7a1489448063d36f4d63b3f7071ffe790d26d03" }, "downloads": -1, "filename": "hemlock-0.1.6.tar.gz", "has_sig": false, "md5_digest": "9a5ff30ef80eb3586b5de22e86ff1f6c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 1066470, "upload_time": "2013-10-21T16:50:58", "url": "https://files.pythonhosted.org/packages/08/bf/84fe2cc76a6af4f69cfedbcd82026c0d7d7cdc9dbbf19a9d9d377b08bfb3/hemlock-0.1.6.tar.gz" } ] }