{
    "info": {
        "author": "Swarvanu Sengupta (s8sg)",
        "author_email": "swarvanusg@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "Development Status :: 3 - Alpha",
            "Intended Audience :: Developers",
            "License :: OSI Approved :: MIT License",
            "Programming Language :: Python :: 2",
            "Programming Language :: Python :: 2.6",
            "Programming Language :: Python :: 2.7",
            "Programming Language :: Python :: 3",
            "Programming Language :: Python :: 3.3",
            "Programming Language :: Python :: 3.4",
            "Programming Language :: Python :: 3.5",
            "Topic :: Software Development :: Build Tools"
        ],
        "description": "Spark Py Submit\n===============\n\nA Python library that submits Spark jobs to a Spark YARN cluster using\nthe REST API\n\n| **Note: It currently supports CDH (5.6.1) and\n  HDP (2.3.2.0-2950, 2.4.0.0-169)**\n| The library is inspired by:\n  ``github.com/bernhard-42/spark-yarn-rest-api``\n\nGetting Started:\n~~~~~~~~~~~~~~~~\n\nUse the library\n^^^^^^^^^^^^^^^\n\n.. code:: python\n\n    import logging\n\n    # Import the SparkJobHandler\n    from spark_job_handler import SparkJobHandler\n\n    ...\n\n    logger = logging.getLogger('TestLocalJobSubmit')\n    # Create a Spark job\n    # job_name:          name of the Spark job\n    # jar:               location of the jar (local/hdfs)\n    # run_class:         entry class of the application\n    # hadoop_rm:         hadoop resource manager host ip\n    # hadoop_web_hdfs:   hadoop web hdfs ip\n    # hadoop_nn:         hadoop name node ip (normally the same as web_hdfs)\n    # env_type:          env type, either CDH or HDP\n    # local_jar:         flag that marks the jar as local (a local jar gets uploaded to hdfs)\n    # spark_properties:  custom properties that need to be set\n    sparkJob = SparkJobHandler(logger=logger, job_name=\"test_local_job_submit\",\n                jar=\"./simple-project/target/scala-2.10/simple-project_2.10-1.0.jar\",\n                run_class=\"IrisApp\", hadoop_rm='rma', hadoop_web_hdfs='nn', hadoop_nn='nn',\n                env_type=\"CDH\", local_jar=True, spark_properties=None)\n    trackingUrl = sparkJob.run()\n    print(\"Job Tracking URL: %s\" % trackingUrl)\n\n| The above code starts a Spark application using the local jar\n  (simple-project/target/scala-2.10/simple-project\\_2.10-1.0.jar)\n| For more examples, see\n  `test\\_spark\\_job\\_handler.py <https://github.com/s8sg/spark-py-submit/blob/master/test_spark_job_handler.py>`__\n\nBuild the simple-project\n^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: bash\n\n      $ cd simple-project\n      $ sbt package; cd ..\n\nThe above steps create the target jar at:\n``./simple-project/target/scala-2.10/simple-project_2.10-1.0.jar``\n\nUpdate the node IPs in test:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n| Add the node IPs for the Hadoop resource manager and name node in the\n  test\\_cases:\n| \\* rm: Resource Manager\n| \\* nn: Name Node\n\nLoad the data and make it available to HDFS:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code:: bash\n\n      $ wget https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data\n\nUpload the data to HDFS:\n\n.. code:: bash\n\n      $ python upload_to_hdfs.py <name_node_ip> iris.data /tmp/iris.data\n\nRun the test cases:\n^^^^^^^^^^^^^^^^^^^\n\nMake the simple-project jar available in HDFS to test a remote jar:\n\n.. code:: bash\n\n      $ python upload_to_hdfs.py <name_node_ip> simple-project/target/scala-2.10/simple-project_2.10-1.0.jar /tmp/test_data/simple-project_2.10-1.0.jar\n\nRun the test:\n\n.. code:: bash\n\n      $ python test_spark_job_handler.py\n\nUtility:\n~~~~~~~~\n\n-  upload\\_to\\_hdfs.py: uploads a local file to the HDFS file system\n\nNotes:\n~~~~~~\n\n| The library is still at an early stage and needs testing, bug fixing, and\n  documentation\n| Before running, follow the steps below:\n| \\* Update the ResourceManager, NameNode, and WebHDFS ports in\n  settings.py if required\n| \\* Make the Spark jar available in HDFS as:\n  ``hdfs:/user/spark/share/lib/spark-assembly.jar``\n| To contribute, please create an issue and a corresponding PR\n\n\n",
        "description_content_type": null,
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/s8sg/spark-py-submit",
        "keywords": "spark yarn submit bigdata hadoop",
        "license": "MIT",
        "maintainer": "",
        "maintainer_email": "",
        "name": "spark-yarn-submit",
        "package_url": "https://pypi.org/project/spark-yarn-submit/",
        "platform": "",
        "project_url": "https://pypi.org/project/spark-yarn-submit/",
        "project_urls": {
            "Homepage": "https://github.com/s8sg/spark-py-submit"
        },
        "release_url": "https://pypi.org/project/spark-yarn-submit/1.0.0/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Library to handle Spark job submission to a YARN cluster in different environments",
        "version": "1.0.0"
    },
    "last_serial": 2630297,
    "releases": {
        "1.0.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "fa51e6c96cfa71c5657a3a521438cd8c",
                    "sha256": "b757b2d7b3a47997dc803609eb40df3ae93026618c83febdf3e66ada3bd15fcd"
                },
                "downloads": -1,
                "filename": "spark_yarn_submit-1.0.0-py2.py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "fa51e6c96cfa71c5657a3a521438cd8c",
                "packagetype": "bdist_wheel",
                "python_version": "py2.py3",
                "requires_python": null,
                "size": 12252,
                "upload_time": "2017-02-09T09:41:34",
                "url": "https://files.pythonhosted.org/packages/d2/5d/f8b9747498ebbcf36c82e6d17f0c8e3964e2a5eb238588c077360e54c8f9/spark_yarn_submit-1.0.0-py2.py3-none-any.whl"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "fa51e6c96cfa71c5657a3a521438cd8c",
                "sha256": "b757b2d7b3a47997dc803609eb40df3ae93026618c83febdf3e66ada3bd15fcd"
            },
            "downloads": -1,
            "filename": "spark_yarn_submit-1.0.0-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fa51e6c96cfa71c5657a3a521438cd8c",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 12252,
            "upload_time": "2017-02-09T09:41:34",
            "url": "https://files.pythonhosted.org/packages/d2/5d/f8b9747498ebbcf36c82e6d17f0c8e3964e2a5eb238588c077360e54c8f9/spark_yarn_submit-1.0.0-py2.py3-none-any.whl"
        }
    ]
}