{ "info": { "author": "Civis Analytics Inc", "author_email": "opensource@civisanalytics.com", "bugtrack_url": null, "classifiers": [], "description": "Maintenance\n===========\n\n\u2757 This project is released as-is, and is not actively maintained by Civis Analytics.\n\ncivis-compute\n=============\n\n.. image:: https://www.travis-ci.org/civisanalytics/civis-compute.svg?branch=master\n :target: https://www.travis-ci.org/civisanalytics/civis-compute\n\nBatch computing in the cloud with Civis Platform.\n\n.. contents:: :local:\n\nInstallation\n------------\n\nInstall from pip like this::\n\n pip install civis-compute\n\n\nQuick Start Example\n-------------------\n\nSuppose we have a Python script that fits a Random Forest to the Iris dataset and pickles the estimator::\n\n import os\n import pickle\n import numpy as np\n from sklearn.datasets import load_iris\n from sklearn.ensemble import RandomForestClassifier\n\n # Civis Platform container configuration.\n #CIVIS name=my iris example\n #CIVIS required_resources={'cpu': 1024, 'memory': 8192, 'disk_space': 10.0}\n\n # Load and shuffle data.\n iris = load_iris()\n X = iris.data\n y = iris.target\n\n # Shuffle the data.\n idx = np.arange(X.shape[0])\n np.random.seed(45687)\n np.random.shuffle(idx)\n X = X[idx]\n y = y[idx]\n\n # Fit and score.\n rf = RandomForestClassifier(n_estimators=10)\n clf = rf.fit(X, y)\n score = clf.score(X, y)\n print(\"score:\", score)\n\n # Now lets save the results.\n # Just write the data to the location given by the environment\n # variable CIVIS_JOB_DATA\n with open(os.path.expandvars(\n os.path.join('${CIVIS_JOB_DATA}', 'iris.pkl')), 'wb') as fp:\n pickle.dump(rf, fp)\n\n # This data will get tar gziped, put in the files endpoint and then attached to\n # the job state. You can get it by running civis-compute get {scriptid}.\n\nThis script fits a Random Forest to the Iris dataset and pickles the estimator.\n\nYou can submit this script to Civis Platform via ``civis-compute submit``::\n\n civis-compute submit iris.py\n # STDOUT: script_id\n\nThe ``civis-compute submit`` command prints out the ID of the created script/container\nto ``STDOUT``.\n\nThe log and the output data are attached to the script in the outputs field,\nvisible in Civis Platform under ``Run History``.\n\nYou can also download the outputs via ``civis-compute get``::\n\n civis-compute get SCRIPTID\n # default path is the current working directory\n # STDOUT: /path/to/archive.tar.gz\n\nYou can unpack the gzipped tar archive with::\n\n tar -xzvf /path/to/archive.tar.gz\n\nFinally, you can recover the estimator like this::\n\n with open('/path/to/archive/iris.pkl', 'rb') as fp:\n rf = pickle.load(fp)\n\nNote that any data written to the directory ``${CIVIS_JOB_DATA}`` in the job will be put into a\ngzipped tar archive, put on the files endpoint and attached as an output to the script. This\nbehavior means that you can write any type of file, including CSV, pickled python objects, plots,\nimages, etc., and possibly more than one file to this directory and easily pull the results back\nto your local machine via ``civis-compute get``.\n\nBring Your Own Container\n------------------------\n\nTo use the CLI, you must have the Civis Python API client pre-installed in the container.\nYou can get it via ``pip install civis`` or from https://github.com/civisanalytics/civis-python.\n\nSupport for Jupyter Notebooks\n-----------------------------\n\nThe CLI can execute jupyter notebooks on Civis Platform. Locally, your notebook is converted to a\npython script and then executed via ``ipython`` in a container script. This allows you to use and execute\nipython magics (e.g., ``%timeit``, etc.) in your notebooks. IPython magics that are jupyter specific\n(i.e., ``%matplotlib inline`` and ``%matplotlib notebook``) are replaced with ``pass`` before\nexecuting the notebook.\n\nSupport for R\n-------------\n\nWe have installed the Python API client into our ``datascience-r`` container. This container\ncan be used to execute R scripts.\n\nUse ``snake_case``, not ``CamelCase`` for Input Parameters\n----------------------------------------------------------\n\nAll input parameters in comments (like ``#CIVIS required_resources=...`` above)\nand the CLI are in ``snake_case``. This includes parameters not at the top level\n(e.g., the ``disk_space`` option for ``required_resources``).\n\nFor the command line, ``required_resources`` is written as ``required-resources`` in keeping with\n\\*nix conventions.\n\nUse YAML to Specify API Parameters That Require Lists or Hash Maps\n------------------------------------------------------------------\n\nFor example, in a comment in a script use::\n\n #CIVIS required_resources={'cpu': 1024}\n\nor on the command line use::\n\n civis-compute submit --required-resources=\"{'cpu': 1024}\" \n\nfor the ``required_resources`` hash map.\n\nAvailable CLI Utilities\n-----------------------\n\n``civis-compute submit``\n------------------------\n\nTo submit a local bash, python script, R script or jupyter notebook to Civis Platform, you can simply type::\n\n civis-compute submit SCRIPT [ARGS]\n\nThis command uploads the script to Civis Platform using the files endpoint and then executes it in a\ncontainer using a default setup (which gives you 1024 CPUs, 8192 MB of RAM, 16 GB of disk space, and\nuses the latest version of the ``datascience-python`` or the ``datascience-r`` docker image). You\ncan pass arguments to the script and they will be reproduced on Civis Platform. Any arguments which\nare files are automatically uploaded to the files endpoint.\n\nNote that you can also execute bash on Civis Platform directly by simply putting the commands right after\n``civis-compute submit``. For example::\n\n civis-compute submit sleep 3600\n\nwould make a container script execute ``sleep 3600``.\n\nIf you want to adjust these defaults or set any other parameters that can be set via the API,\nyou can simply add comments to your script that look like this::\n\n #CIVIS name=iris\n\nThis command would set the name of the custom script to 'iris'. Parameters can also be set from\nthe command line as options to ``civis-compute submit``. See the rest of the parameters that can be set here\nhttps://platform.civisanalytics.com/api#v1_post_scripts_containers.\n\nNote that special keys can be added to these comments or the command line for civis-compute CLI specific behavior\n\n- **Run a Shell Command Before the Script**\n\n You can run a shell command via::\n\n #CIVIS shell_cmd=pip install -q tqdm\n\n This shell command will execute after all data has been uploaded to the container\n script but before any python packages are installed.\n\n- **Upload Additional Files**\n\n To upload additional files, put them in a comment like this::\n\n #CIVIS files=data.csv,module.py\n\n These files will be put in the container job at the same relative path they are to the\n script that is uploaded.\n\n- **Caching File Uploads**\n\n The civis-compute CLI can maintain a local cache of MD5 checksums and file IDs on the Civis files\n endpoint. When you specify a file dependency, this local cache is checked first. If a file\n will not expire for at least two weeks and has the same checksum, then the already uploaded\n file is used. To turn on caching, you can specify a comment like this::\n\n #CIVIS use_file_cache=True\n\n- **Custom Repo Installs**\n\n If you specify a Git repo via the ``repo_http_uri`` option, then the ``repo_cmd`` option\n will determine how the repo is handled. By default, it is set to ``python setup.py install``.\n You can change this via::\n\n #CIVIS repo_cmd=python setup.py develop\n\n- **Adding AWS Credentials**\n\n You can pass AWS credentials (which are stored on Civis Platform) into your job by default using::\n\n #CIVIS add_aws_creds=True\n\n You can specify your AWS credential ID from Civis Platform like this::\n\n #CIVIS aws_cred_id=ID\n\n If you do not give a credential ID, the first one found in your list of AWS credentials in\n Civis Platform is used.\n\nFinally, any thing that can be set in the comments can be passed as a command line argument to\n``civis-compute submit``. Command line arguments override anything set in the script via the\ncomments.\n\nYou can do a dry run of a script via the command line via::\n\n civis-compute submit --dry-run\n\nThis command prints out the container config and command to be run. This feature can be used\nto help debug scripts before they run on Civis Platform.\n\n``civis-compute get``\n---------------------\n\nTo get the outputs of a script which has finished::\n\n civis-compute get SCRIPTID\n\nwhere ``SCRIPTID`` is the ID of the Civis Platform script, printed to STDOUT by ``civis-compute submit``.\nThis command will pull the outputs from the latest run. You can specify a specific run with the\n``--run-id=RUNID`` option.\n\nTo change the output directory::\n\n civis-compute get SCRIPTID path/to/output\n\nTo specify a specific run::\n\n civis-compute get SCRIPTID --run-id=RUNID\n\n``civis-compute status``\n------------------------\n\nTo view scripts that are running (and you have permissions to view)::\n\n civis-compute status\n\nTo see just your scripts::\n\n civis-compute status --mine\n\nTo see info about the most recent run of a specific container::\n\n civis-compute status SCRIPTID\n\nwhere ``SCRIPTID`` is the ID of the Civis Platform script, printed to STDOUT\nby ``civis-compute submit``.\n\nNote that only container scripts are listed by ``civis-compute status``, up\nto ~50 scripts.\n\n``civis-compute cancel``\n------------------------\n\nTo cancel a script running on Civis Platform::\n\n civis-compute cancel SCRIPTID\n\nwhere ``SCRIPTID`` is the ID of the Civis Platform script, printed to STDOUT\nby ``civis-compute submit``.\n\nNote that only containers which you are running (i.e., ``running_as`` is set you) can be canceled. This\ncommand will cancel both hidden and non-hidden scripts.\n\n``civis-compute cache``\n-----------------------\n\nThe civis-compute CLI can cache the MD5 checksums and files endpoint IDs of your files to avoid uploading\nthem more than once.\n\nTo see the files in your local cache::\n\n civis-compute cache list\n\nTo clear the local cache::\n\n civis-compute cache clear\n\nThe actual cache is a simple sqlite database stored at ``~/.civiscompute/fileidcache.db``.\n\nTo turn on this feature, either set ``use_file_cache: True`` in your ``~/.civiscompute/config.yml``, or pass\nthis argument to your script via the command line or a configuration comment.\n\n\nChanging the Default Script Submission Parameters\n-------------------------------------------------\n\nYou can change the default script submission parameters and turn on the file\ncache by default by editing your ``~/.civiscompute/config.yml`` file.\n\nHere is an example::\n\n # my civis-compute CLI config\n use_file_cache: False\n required_resources:\n cpu: 256\n memory: 1024\n disk_space: 1.0\n docker_image_name:\n python: civisanalytics/datascience-python\n r: civisanalytics/datascience-r\n repo_cmd:\n python: 'python setup.py install'\n add_aws_creds: False\n # put a default AWS credential ID here\n # aws_cred_id:\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://www.civisanalytics.com", "keywords": "", "license": "BSD-3", "maintainer": "", "maintainer_email": "", "name": "civis-compute", "package_url": "https://pypi.org/project/civis-compute/", "platform": "", "project_url": "https://pypi.org/project/civis-compute/", "project_urls": { "Homepage": "https://www.civisanalytics.com" }, "release_url": "https://pypi.org/project/civis-compute/0.2.0/", "requires_dist": [ "click (~=6.7)", "nbformat (~=4.4)", "nbconvert (~=5.3)", "civis (~=1.6)", "pyyaml (>=5.1)", "python-dateutil (~=2.6)", "ipython (~=5.4) ; python_version >= \"2.7\" and python_version < \"3.0\"", "ipython (~=6.2) ; python_version >= \"3.0\"" ], "requires_python": "", "summary": "Batch computing in the cloud with Civis Platform.", "version": "0.2.0" }, "last_serial": 5358273, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "1bc34c9678a9722472776ed03a2e03da", "sha256": "f7f36eb1de6cd85ee289d78cbafd7a4666f709c8fcf6c11ce6607dbaacb85015" }, "downloads": -1, "filename": "civis-compute-0.1.0.tar.gz", "has_sig": false, "md5_digest": "1bc34c9678a9722472776ed03a2e03da", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 30898, "upload_time": "2017-10-06T14:10:07", "url": "https://files.pythonhosted.org/packages/6c/a4/408b982b0eb5aaa325cf4455c21289cdc19c28ace2d4a6907e9763875c91/civis-compute-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "5d7eb021f4c76c16a71d10b0d4d99ed9", "sha256": "7c2b9df0cf249a070b3ebc7e2008f99769d5ef9b98dadfc8b895c60c5844386a" }, "downloads": -1, "filename": "civis_compute-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "5d7eb021f4c76c16a71d10b0d4d99ed9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 36247, "upload_time": "2019-06-04T15:35:01", "url": "https://files.pythonhosted.org/packages/29/bf/e32fe08d020921eeb3c718ac0956032cbd884b58f190a2ba8230298d54c1/civis_compute-0.2.0-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5d7eb021f4c76c16a71d10b0d4d99ed9", "sha256": "7c2b9df0cf249a070b3ebc7e2008f99769d5ef9b98dadfc8b895c60c5844386a" }, "downloads": -1, "filename": "civis_compute-0.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "5d7eb021f4c76c16a71d10b0d4d99ed9", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 36247, "upload_time": "2019-06-04T15:35:01", "url": "https://files.pythonhosted.org/packages/29/bf/e32fe08d020921eeb3c718ac0956032cbd884b58f190a2ba8230298d54c1/civis_compute-0.2.0-py3-none-any.whl" } ] }