{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6" ], "description": "# Isilon Hadoop Tools\n\nTools for Using Hadoop with OneFS\n\n- `isilon_create_users` creates identities needed by Hadoop distributions compatible with OneFS.\n- `isilon_create_directories` creates a directory structure with appropriate ownership and permissions in HDFS on OneFS.\n\n![IHT Demo](demo.gif)\n\n## Installation\n\nIsilon Hadoop Tools (IHT) currently requires Python 3.5+ and supports OneFS 8+.\n\n- Python support schedules can be found [in the Python Developer's Guide](https://devguide.python.org/#status-of-python-branches).\n- OneFS support schedules can be found in the [Isilon Product Availability Guide](https://support.emc.com/docu45445_Isilon-Product-Availability.pdf).\n\n### Option 1: Install as a stand-alone command line tool.\n\n
\nUse `pipx` to install IHT.\n
\n\n> _`pipx` requires Python 3.6 or later. For other versions or **offline installations**, see Option 2._\n\n1. [Install `pipx`:](https://pipxproject.github.io/pipx/installation/)\n\n ``` sh\n python3 -m pip install --user pipx\n ```\n\n - Tip: Newer versions of some Linux distributions (e.g. [Debian 10](https://packages.debian.org/buster/pipx), [Ubuntu 19.04](https://packages.ubuntu.com/disco/pipx), etc.) offer native packages for `pipx`.\n\n
\n\n ``` sh\n python3 -m pipx ensurepath\n ```\n\n - Note: You may need to restart your terminal for the `$PATH` updates to take effect.\n\n2. Use `pipx` to install [`isilon_hadoop_tools`](https://pypi.org/project/isilon_hadoop_tools/):\n\n ``` sh\n pipx install isilon_hadoop_tools\n ```\n\n3. Test the installation:\n\n ``` sh\n isilon_create_users --help\n isilon_create_directories --help\n ```\n\n- Use `pipx` to uninstall at any time:\n\n ``` sh\n pipx uninstall isilon_hadoop_tools\n ```\n\nSee Python's [Installing stand alone command line tools](https://packaging.python.org/guides/installing-stand-alone-command-line-tools/) guide for more information.\n
\n\n### Option 2: Create an ephemeral installation.\n\n
\nUse `pip` to install IHT in a virtual environment.\n
\n\n> Python \"Virtual Environments\" allow Python packages to be installed in an isolated location for a particular application, rather than being installed globally.\n\n1. Use the built-in [`venv`](https://docs.python.org/3/library/venv.html) module to create a virtual environment:\n\n ``` sh\n python3 -m venv ./iht\n ```\n\n2. Install [`isilon_hadoop_tools`](https://pypi.org/project/isilon_hadoop_tools/) into the virtual environment:\n\n ``` sh\n iht/bin/pip install isilon_hadoop_tools\n ```\n\n - Note: This requires access to an up-to-date Python Package Index (PyPI, usually https://pypi.org/).\n For offline installations, necessary resources can be downloaded to a USB flash drive which can be used instead:\n\n ``` sh\n pip3 download --dest /media/usb/iht-dists isilon_hadoop_tools\n ```\n ``` sh\n iht/bin/pip install --no-index --find-links /media/usb/iht-dists isilon_hadoop_tools\n ```\n\n3. Test the installation:\n\n ``` sh\n iht/bin/isilon_create_users --help\n ```\n\n - Tip: Some users find it more convenient to \"activate\" the virtual environment (which prepends the virtual environment's `bin/` to `$PATH`):\n\n ``` sh\n source iht/bin/activate\n isilon_create_users --help\n isilon_create_directories --help\n deactivate\n ```\n\n- Remove the virtual environment to uninstall at any time:\n\n ``` sh\n rm --recursive iht/\n ```\n\nSee Python's [Installing Packages](https://packaging.python.org/tutorials/installing-packages/) tutorial for more information.\n
\n\n## Usage\n\n- Tip: `--help` can be used with any IHT script to see extended usage information.\n\nTo use IHT, you will need the following:\n\n- `$onefs`, an IP address, hostname, or SmartConnect name associated with the OneFS System zone\n - Unfortunately, Zone-specific Role-Based Access Control (ZRBAC) is not fully supported by OneFS's RESTful Access to Namespace (RAN) service yet, which is required by `isilon_create_directories`.\n- `$iht_user`, a OneFS System zone user with the following privileges:\n - `ISI_PRIV_LOGIN_PAPI`\n - `ISI_PRIV_AUTH`\n - `ISI_PRIV_HDFS`\n - `ISI_PRIV_IFS_BACKUP` (only needed by `isilon_create_directories`)\n - `ISI_PRIV_IFS_RESTORE` (only needed by `isilon_create_directories`)\n- `$zone`, the name of the access zone on OneFS that will host HDFS\n - The System zone should **NOT** be used for HDFS.\n- `$dist`, the distribution of Hadoop that will be deployed with OneFS (e.g. CDH, HDP, etc.)\n- `$cluster_name`, the name of the Hadoop cluster\n\n### Connecting to OneFS via HTTPS\n\nOneFS ships with a self-signed SSL/TLS certificate by default, and such a certificate will not be verifiable by any well-known certificate authority. If you encounter `CERTIFICATE_VERIFY_FAILED` errors while using IHT, it may be because OneFS is still using the default certificate. To remedy the issue, consider encouraging your OneFS administrator to install a verifiable certificate instead. Alternatively, you may choose to skip certificate verification by using the `--no-verify` option, but do so at your own risk!\n\n### Preparing OneFS for Hadoop Deployment\n\n_Note: This is not meant to be a complete guide to setting up Hadoop with OneFS. If you stumbled upon this page or have not otherwise consulted the appropriate install guide for your distribution, please do so at https://community.emc.com/docs/DOC-61379._\n\nThere are 2 tools in IHT that are meant to assist with the setup of OneFS as HDFS for a Hadoop cluster:\n1. 
`isilon_create_users`, which creates users and groups that must exist on all hosts in the Hadoop cluster, including OneFS\n2. `isilon_create_directories`, which sets the correct ownership and permissions on directories in HDFS on OneFS\n\nThese tools must be used _in order_ since a user/group must exist before it can own a directory.\n\n#### `isilon_create_users`\n\nUsing the information from above, an invocation of `isilon_create_users` could look like this:\n``` sh\nisilon_create_users --dry \\\n --onefs-user \"$iht_user\" \\\n --zone \"$zone\" \\\n --dist \"$dist\" \\\n --append-cluster-name \"$cluster_name\" \\\n \"$onefs\"\n```\n- Note: `--dry` causes the script to log without executing. Use it to ensure the script will do what you intend before actually doing it.\n\nIf anything goes wrong (e.g. the script stopped because you forgot to give `$iht_user` the `ISI_PRIV_HDFS` privilege), you can safely rerun with the same options. IHT should figure out that some of its job has been done already and work with what it finds.\n- If a particular user/group already exists with a particular UID/GID, the ID it already has will be used.\n- If a particular UID/GID is already in use by another user/group, IHT will try again with a different, higher ID.\n- IHT may **NOT** detect previous runs that used different options.\n\n##### Generated Shell Script\n\nAfter running `isilon_create_users`, you will find a new file in `$PWD` named like so:\n``` sh\n$unix_timestamp-$zone-$dist-$cluster_name.sh\n```\n\nThis script should be copied to and run on all the other hosts in the Hadoop cluster (excluding OneFS).\nIt will create the same users/groups with the same UIDs/GIDs and memberships as on OneFS using LSB utilities such as `groupadd`, `useradd`, and `usermod`.\n\n#### `isilon_create_directories`\n\nUsing the information from above, an invocation of `isilon_create_directories` could look like this:\n``` sh\nisilon_create_directories --dry \\\n --onefs-user \"$iht_user\" \\\n 
--zone \"$zone\" \\\n --dist \"$dist\" \\\n --append-cluster-name \"$cluster_name\" \\\n \"$onefs\"\n```\n- Note: `--dry` causes the script to log without executing. Use it to ensure the script will do what you intend before actually doing it.\n\nIf anything goes wrong (e.g. the script stopped because you forgot to run `isilon_create_users` first), you can safely rerun with the same options. IHT should figure out that some of its job has been done already and work with what it finds.\n- If a particular directory already exists but does not have the correct ownership or permissions, IHT will correct it.\n- If a user/group has been deleted and re-created with a new UID/GID, IHT will adjust ownership accordingly.\n- IHT may **NOT** detect previous runs that used different options.\n\n## Development\n\nSee [the Contributing Guidelines](CONTRIBUTING.md) for information on project development.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/isilon/isilon_hadoop_tools", "keywords": "", "license": "MIT", "maintainer": "Isilon", "maintainer_email": "support@isilon.com", "name": "isilon-hadoop-tools", "package_url": "https://pypi.org/project/isilon-hadoop-tools/", "platform": "", "project_url": "https://pypi.org/project/isilon-hadoop-tools/", "project_urls": { "Homepage": "https://github.com/isilon/isilon_hadoop_tools" }, "release_url": "https://pypi.org/project/isilon-hadoop-tools/4.0.0/", "requires_dist": [ "future (~=0.17.1)", "isi-sdk-7-2 (~=0.2.7)", "isi-sdk-8-0 (~=0.2.7)", "isi-sdk-8-0-1 (~=0.2.7)", "isi-sdk-8-1-0 (~=0.2.7)", "isi-sdk-8-1-1 (~=0.2.7)", "isi-sdk-8-2-0 (~=0.2.7)", "requests (>=2.20.0)", "setuptools (>=41.0.0)", "enum34 (>=1.1.6) ; python_version < \"3.4\"" ], "requires_python": ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*", "summary": "Tools for Using Hadoop with OneFS", "version": "4.0.0" }, "last_serial": 
5980380, "releases": { "4.0.0": [ { "comment_text": "", "digests": { "md5": "f8697287d3447f1e9d9ece88a897e503", "sha256": "5369f7c35ae0ea496c766473b6788c693ba1543473b4aee7184f958b65f2b25f" }, "downloads": -1, "filename": "isilon_hadoop_tools-4.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "f8697287d3447f1e9d9ece88a897e503", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*", "size": 26053, "upload_time": "2019-10-15T23:34:44", "url": "https://files.pythonhosted.org/packages/e6/69/24365279aa067fe545d893c94e6d2bcb7a803a698b19381dd9aa1a552639/isilon_hadoop_tools-4.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5d5b9fc882a9c735fc52b081be75f726", "sha256": "000a9a3fb95e8321b36a09c71fa21f029121efcb32083a51e1793a584a051c7b" }, "downloads": -1, "filename": "isilon_hadoop_tools-4.0.0.tar.gz", "has_sig": false, "md5_digest": "5d5b9fc882a9c735fc52b081be75f726", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*", "size": 11589049, "upload_time": "2019-10-15T23:34:54", "url": "https://files.pythonhosted.org/packages/e3/c2/9450bc600234df4fd08933f447f744783f55727d524e90cd718fca732322/isilon_hadoop_tools-4.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f8697287d3447f1e9d9ece88a897e503", "sha256": "5369f7c35ae0ea496c766473b6788c693ba1543473b4aee7184f958b65f2b25f" }, "downloads": -1, "filename": "isilon_hadoop_tools-4.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "f8697287d3447f1e9d9ece88a897e503", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*", "size": 26053, "upload_time": "2019-10-15T23:34:44", "url": "https://files.pythonhosted.org/packages/e6/69/24365279aa067fe545d893c94e6d2bcb7a803a698b19381dd9aa1a552639/isilon_hadoop_tools-4.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"5d5b9fc882a9c735fc52b081be75f726", "sha256": "000a9a3fb95e8321b36a09c71fa21f029121efcb32083a51e1793a584a051c7b" }, "downloads": -1, "filename": "isilon_hadoop_tools-4.0.0.tar.gz", "has_sig": false, "md5_digest": "5d5b9fc882a9c735fc52b081be75f726", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*", "size": 11589049, "upload_time": "2019-10-15T23:34:54", "url": "https://files.pythonhosted.org/packages/e3/c2/9450bc600234df4fd08933f447f744783f55727d524e90cd718fca732322/isilon_hadoop_tools-4.0.0.tar.gz" } ] }