{ "info": { "author": "Jim Crist", "author_email": "jiminy.crist@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "Hadoop Test Clusters\n====================\n\n|pypi| |cdh5| |cdh6|\n\n.. |pypi| image:: https://img.shields.io/pypi/v/hadoop-test-cluster.svg\n :target: https://pypi.org/project/hadoop-test-cluster/\n.. |cdh5| image:: https://img.shields.io/docker/pulls/jcrist/hadoop-testing-cdh5.svg\n :target: https://hub.docker.com/r/jcrist/hadoop-testing-cdh5/\n.. |cdh6| image:: https://img.shields.io/docker/pulls/jcrist/hadoop-testing-cdh6.svg\n :target: https://hub.docker.com/r/jcrist/hadoop-testing-cdh6/\n\nA dockerized setup for testing code on a hadoop cluster.\n\nInstallation\n------------\n\n``hadoop-test-cluster`` is available on PyPI:\n\n.. code-block:: console\n\n $ pip install hadoop-test-cluster\n\nYou can also install from source on github:\n\n.. code-block:: console\n\n $ pip install git+https://github.com/jcrist/hadoop-test-cluster.git\n\nOverview\n--------\n\nFor testing purposes, infrastructure for setting up a mini hadoop cluster using\ndocker is provided here. 
Two base images are provided:\n\n- ``cdh5``: provides a CDH5 installation of Hadoop 2.6\n- ``cdh6``: provides a CDH6 installation of Hadoop 3.0\n\nBoth images can be run with 2 different configurations:\n\n- ``simple``: uses ``simple`` authentication (unix user permissions)\n- ``kerberos``: uses ``kerberos`` for authentication\n\nEach cluster has three containers:\n\n- One ``master`` node running the ``hdfs-namenode`` and\n ``yarn-resourcemanager``, as well as the kerberos daemons.\n- One ``worker`` node running the ``hdfs-datanode`` and ``yarn-nodemanager``\n- One ``edge`` node for interacting with the cluster\n\nOne user account has also been created for testing purposes:\n\n- Login: ``testuser``\n- Password: ``testpass``\n\nFor the ``kerberos`` setup, a keytab for this user has been put at\n``/home/testuser/testuser.keytab``, so you can kinit easily like ``kinit -kt\n/home/testuser/testuser.keytab testuser``\n\nAn admin kerberos principal has also been created for use with ``kadmin``:\n\n- Login: ``admin/admin``\n- Password: ``adminpass``\n\nPorts\n-----\n\nThe full address is dependent on the IP address of your docker-machine driver,\nwhich can be found with:\n\n.. code-block:: console\n\n $ docker-machine inspect --format {{.Driver.IPAddress}}\n\n- NameNode RPC: 9000\n- NameNode Webui: 50070\n- ResourceManager Webui: 8088\n- Kerberos KDC: 88\n- Kerberos Kadmin: 749\n- DataNode Webui: 50075\n- NodeManager Webui: 8042\n\nThe ``htcluster`` commandline tool\n----------------------------------\n\nTo work with either cluster, please use the ``htcluster`` tool. This is a thin\nwrapper around ``docker-compose``, with utilities for quickly doing most common\nactions.\n\n.. 
code-block:: console\n\n $ htcluster --help\n usage: htcluster [--help] [--version] command ...\n\n Manage hadoop test clusters\n\n positional arguments:\n command\n startup Start up a hadoop cluster.\n login Login to a node in the cluster.\n exec Execute a command on the node as a user\n shutdown Shutdown the cluster and remove the containers.\n compose Forward commands to docker-compose\n kerbenv Output environment variables to setup kerberos locally. Intended\n use is to eval the output in bash: eval $(htcluster kerbenv)\n\n optional arguments:\n --help, -h Show this help message then exit\n --version Show version then exit\n\nStarting a cluster\n~~~~~~~~~~~~~~~~~~\n\nStart a CDH5 cluster with simple authentication:\n\n.. code-block:: console\n\n $ htcluster startup --image cdh5 --config simple\n\nStart a CDH6 cluster with kerberos authentication:\n\n.. code-block:: console\n\n $ htcluster startup --image cdh6 --config kerberos\n\nStarting a cluster, mounting the current directory to ~/workdir\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code-block:: console\n\n $ htcluster startup --image cdh5 --mount .:workdir\n\nLogin to the edge node\n~~~~~~~~~~~~~~~~~~~~~~\n\n.. code-block:: console\n\n $ htcluster login\n\nRun a command as the user on the edge node\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code-block:: console\n\n $ htcluster exec -- myscript.sh some other args\n\nShutdown the cluster\n~~~~~~~~~~~~~~~~~~~~\n\n.. code-block:: console\n\n $ htcluster shutdown\n\n\nAuthenticating with Kerberos from outside Docker\n------------------------------------------------\n\nIn the kerberized cluster, the web UIs are secured by kerberos, and so won't be\naccessible from your browser unless you configure kerberos properly. This is\ndoable, but takes a few steps:\n\n1. 
Kerberos/SPNEGO requires that the requested url matches the host's domain.\n The easiest way to do this is to modify your ``/etc/hosts`` and add a line for\n ``master.example.com``:\n\n .. code-block:: console\n\n # Add a line to /etc/hosts pointing master.example.com to your docker-machine\n # driver ip address.\n # Note that you probably need to run this command as a super user.\n $ echo \"$(docker-machine inspect --format {{.Driver.IPAddress}}) master.example.com\" >> /etc/hosts\n\n2. You must have ``kinit`` installed locally. You may already have it; otherwise\n it's available through most package managers.\n\n3. You need to tell kerberos where the ``krb5.conf`` is for this domain. This is\n done with an environment variable. To make this easy, ``htcluster`` has a\n command to do this:\n\n .. code-block:: console\n\n $ eval $(htcluster kerbenv)\n\n4. At this point you should be able to kinit as testuser:\n\n .. code-block:: console\n\n $ kinit testuser@EXAMPLE.COM\n\n5. To access kerberos secured pages in your browser you'll need to do a bit of\n (simple) configuration. See `this documentation from\n Cloudera <https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_sg_browser_access_kerberos_protected_url.html>`_\n for information on what's needed for your browser.\n\n6. Since environment variables are only available for processes started in the\n environment, you have three options here:\n\n - Restart your browser from the shell in which you added the environment\n variables\n\n - Manually get a ticket for the ``HTTP/master.example.com`` principal. Note\n that this will delete your other tickets, but works fine if you just want\n to see the webpage\n\n .. code-block:: console\n\n $ kinit -S HTTP/master.example.com testuser\n\n - Use ``curl`` to authenticate the first time, at which point you'll already\n have the proper tickets in your cache, and the browser authentication will\n just work. 
Note that your version of curl must support the GSS-API.\n\n .. code-block:: console\n\n $ curl -V # Check your version of curl supports GSS-API\n curl 7.59.0 (x86_64-apple-darwin17.2.0) libcurl/7.59.0 SecureTransport zlib/1.2.11\n Release-Date: 2018-03-14\n Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp\n Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz UnixSockets\n\n $ curl --negotiate -u : http://master.example.com:50070 # get a HTTP ticket for master.example.com\n\n After doing one of these, you should be able to access any of the pages from\n your browser.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/jcrist/hadoop-test-cluster", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "hadoop-test-cluster", "package_url": "https://pypi.org/project/hadoop-test-cluster/", "platform": "", "project_url": "https://pypi.org/project/hadoop-test-cluster/", "project_urls": { "Homepage": "https://github.com/jcrist/hadoop-test-cluster" }, "release_url": "https://pypi.org/project/hadoop-test-cluster/0.1.0/", "requires_dist": null, "requires_python": "", "summary": "A CLI for managing hadoop clusters for testing", "version": "0.1.0" }, "last_serial": 5135916, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "2805b710020fac12d70ba3f92b73049d", "sha256": "7530a41dc15bae5057dbc1ee0fa813e78f76d8c9b005ae462cf9142524436f13" }, "downloads": -1, "filename": "hadoop_test_cluster-0.0.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "2805b710020fac12d70ba3f92b73049d", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 10829, "upload_time": "2018-06-22T02:54:30", "url": 
"https://files.pythonhosted.org/packages/df/55/94ab71771a254857140428e1d064cbda7953f4689c6f2ea117fee1e979b4/hadoop_test_cluster-0.0.1-py2.py3-none-any.whl" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "3b85ff80b9b9c177156893bad69928f2", "sha256": "1746cc7f2bce2222bc05816e00aefc6fe32f80205d05fc84430ddb1b76d0d3c2" }, "downloads": -1, "filename": "hadoop_test_cluster-0.0.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "3b85ff80b9b9c177156893bad69928f2", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 10970, "upload_time": "2018-06-22T03:13:49", "url": "https://files.pythonhosted.org/packages/44/5c/dd48a542e4e40129373b04f03f1d49a2ba84373110c8d0f55083134146d0/hadoop_test_cluster-0.0.2-py2.py3-none-any.whl" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "6bc0cfa235f75620d4b3cb3802d507a4", "sha256": "f429499e4944b0056615270ec6268ec202b87b36778f13575200a64570f09574" }, "downloads": -1, "filename": "hadoop_test_cluster-0.0.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "6bc0cfa235f75620d4b3cb3802d507a4", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 11409, "upload_time": "2018-06-22T03:57:00", "url": "https://files.pythonhosted.org/packages/f3/aa/d32162db0399c0e979ade3408172ed2d7bce0961aacb7a0d80b721b84477/hadoop_test_cluster-0.0.3-py2.py3-none-any.whl" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "91ab34e4b6ba0ae1e1d0129003773fb4", "sha256": "ecc7dab141ccdf9b3bbf85f884f1d614171dd1d7331e695bed5734b7272ff488" }, "downloads": -1, "filename": "hadoop_test_cluster-0.0.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "91ab34e4b6ba0ae1e1d0129003773fb4", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8045, "upload_time": "2018-11-14T22:38:07", "url": 
"https://files.pythonhosted.org/packages/bd/0d/5cd8a51c30304a53c6aa357af70e831f10b6e16bfcae9344a540ec146baa/hadoop_test_cluster-0.0.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cbd0b20d24e717fd6c982ec2e0be132c", "sha256": "3f0094789a8f4fd06bea548eb73196d1e3df5931eb63b66a40b6e3f235c3af6a" }, "downloads": -1, "filename": "hadoop-test-cluster-0.0.4.tar.gz", "has_sig": false, "md5_digest": "cbd0b20d24e717fd6c982ec2e0be132c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7367, "upload_time": "2018-11-14T22:38:09", "url": "https://files.pythonhosted.org/packages/7a/c4/315ac7b11d6e8404f149d8ac2b7016bf0253b49b1f4eda5b29d2aaf5de63/hadoop-test-cluster-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "62a0eb95130c7863262a05236dbc3b43", "sha256": "b05f6299e05c0e362cb6796cc6ee26dc32f04623c45be5026bc7e27a0d28497f" }, "downloads": -1, "filename": "hadoop_test_cluster-0.0.5-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "62a0eb95130c7863262a05236dbc3b43", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8081, "upload_time": "2019-04-09T21:29:40", "url": "https://files.pythonhosted.org/packages/a7/81/deb61ad457b71b40c7091d858795cbfc264bbc57a8745e9de865c10be262/hadoop_test_cluster-0.0.5-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e5748da25bb289bda5f383df9b298d39", "sha256": "8a04b81b7bc694b5da8a4ff480ddadd43271cbe913d6c4a194b6631db9b16704" }, "downloads": -1, "filename": "hadoop-test-cluster-0.0.5.tar.gz", "has_sig": false, "md5_digest": "e5748da25bb289bda5f383df9b298d39", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7404, "upload_time": "2019-04-09T21:29:42", "url": "https://files.pythonhosted.org/packages/eb/f8/39b23619303e0b3e6caafbe4266a35b20040f735ac7df01adbc667605609/hadoop-test-cluster-0.0.5.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": 
"da2b95ded520a6807b7ead531d76143c", "sha256": "91f272e237e35d2328a435e49357e146f29ac5641663f158445786f10ddee52b" }, "downloads": -1, "filename": "hadoop_test_cluster-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "da2b95ded520a6807b7ead531d76143c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8465, "upload_time": "2019-04-12T22:17:55", "url": "https://files.pythonhosted.org/packages/35/da/b00b1afc5f343cac01eef31171c228e829e25c85aee8f21a6193165abb49/hadoop_test_cluster-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e40c2935f0c77675b042840c5a5a0198", "sha256": "b72948f296c0537b9800f8b7befb2806bbbd99de3b54656ccb8bdb7be99b4424" }, "downloads": -1, "filename": "hadoop-test-cluster-0.1.0.tar.gz", "has_sig": false, "md5_digest": "e40c2935f0c77675b042840c5a5a0198", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7837, "upload_time": "2019-04-12T22:17:57", "url": "https://files.pythonhosted.org/packages/c3/47/7cd71ef91b90776b2af0061b5937469b609f1e5c9eb254a470b9876d84b3/hadoop-test-cluster-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "da2b95ded520a6807b7ead531d76143c", "sha256": "91f272e237e35d2328a435e49357e146f29ac5641663f158445786f10ddee52b" }, "downloads": -1, "filename": "hadoop_test_cluster-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "da2b95ded520a6807b7ead531d76143c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 8465, "upload_time": "2019-04-12T22:17:55", "url": "https://files.pythonhosted.org/packages/35/da/b00b1afc5f343cac01eef31171c228e829e25c85aee8f21a6193165abb49/hadoop_test_cluster-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e40c2935f0c77675b042840c5a5a0198", "sha256": "b72948f296c0537b9800f8b7befb2806bbbd99de3b54656ccb8bdb7be99b4424" }, "downloads": -1, "filename": "hadoop-test-cluster-0.1.0.tar.gz", "has_sig": false, 
"md5_digest": "e40c2935f0c77675b042840c5a5a0198", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7837, "upload_time": "2019-04-12T22:17:57", "url": "https://files.pythonhosted.org/packages/c3/47/7cd71ef91b90776b2af0061b5937469b609f1e5c9eb254a470b9876d84b3/hadoop-test-cluster-0.1.0.tar.gz" } ] }