{ "info": { "author": "Jesse Brennan", "author_email": "brennan@ucsc.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: Apache Software License", "Natural Language :: English", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering :: Bio-Informatics" ], "description": "# cgp-dss-data-loader\nSimple data loader for CGP HCA Data Store\n\n## Common Setup\n1. **(optional)** We recommend using a Python 3\n [virtual environment](https://docs.python.org/3/tutorial/venv.html).\n\n1. Run:\n\n `pip3 install cgp-dss-data-loader`\n\n## Setup for Development\n1. Clone the repo:\n\n `git clone https://github.com/DataBiosphere/cgp-dss-data-loader.git`\n\n1. Go to the root directory of the cloned project:\n\n `cd cgp-dss-data-loader`\n\n1. Make sure you are on the branch `develop`.\n\n1. Run (ideally in a new [virtual environment](https://docs.python.org/3/tutorial/venv.html)):\n\n `make develop`\n\n## Cloud Credentials Setup\nBecause this program uses Amazon Web Services and Google Cloud Platform, you will need to set up credentials\nfor both of these before you can run the program.\n\n### AWS credentials\n1. If you haven't already you will need to make an IAM user and create a new access key. Instructions are\n [here](https://docs.aws.amazon.com/general/latest/gr/managing-aws-access-keys.html).\n\n1. Next you will need to store your credentials so that Boto can access them. Instructions are\n [here](https://boto3.readthedocs.io/en/latest/guide/configuration.html).\n\n### GCP credentials\n1. Follow the steps [here](https://cloud.google.com/docs/authentication/getting-started) to set up your Google\n Credentials.\n\n## Running Tests\nRun:\n\n`make test`\n\n## Getting Data from Gen3 and Loading it\n\n1. The first step is to extract the Gen3 data you want using the\n [sheepdog exporter](https://github.com/david4096/sheepdog-exporter). The TopMed public data extracted\n from sheepdog is available [on the release page](https://github.com/david4096/sheepdog-exporter/releases/tag/0.3.1)\n under Assets. Assuming you use this data, you will now have a file called `topmed-public.json`\n\n1. Make sure you are running the virtual environment you set up in the **Setup** instructions.\n\n1. Now you will need to transform the data into the 'standard' loader format. Do this using the\n [newt-transformer](https://github.com/jessebrennan/newt-transformer).\n You can follow the [common setup](https://github.com/DataBiosphere/newt-transformer#common-setup), then the\n section for [transforming data from sheepdog](https://github.com/jessebrennan/newt-transformer#transforming-data-from-sheepdog-exporter).\n\n1. Now that we have our new transformed output we can run it with the loader.\n\n If you used the standard transformer use the command:\n\n ```\n dssload --no-dry-run --dss-endpoint MY_DSS_ENDPOINT --staging-bucket NAME_OF_MY_S3_BUCKET transformed-topmed-public.json\n ```\n\n1. You did it!", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/DataBiosphere/cgp-dss-data-loader", "keywords": "genomics,metadata,loading,NIHDataCommons", "license": "Apache License 2.0", "maintainer": "", "maintainer_email": "", "name": "cgp-dss-data-loader", "package_url": "https://pypi.org/project/cgp-dss-data-loader/", "platform": "", "project_url": "https://pypi.org/project/cgp-dss-data-loader/", "project_urls": { "Homepage": "https://github.com/DataBiosphere/cgp-dss-data-loader" }, "release_url": "https://pypi.org/project/cgp-dss-data-loader/1.1.0/", "requires_dist": null, "requires_python": "", "summary": "Simple data loader for CGP HCA Data Store", "version": "1.1.0" }, "last_serial": 4310549, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "ce0666f4b4d8c79448dc291cfd829f71", "sha256": "1f3bd3fa8fb338f13b384640267f5d2d94d4ad848606d92b5de62aa4ff39110f" }, "downloads": -1, "filename": "cgp-dss-data-loader-0.0.1.tar.gz", "has_sig": false, "md5_digest": "ce0666f4b4d8c79448dc291cfd829f71", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14320, "upload_time": "2018-06-29T21:39:56", "url": "https://files.pythonhosted.org/packages/34/aa/4851d0f5d35216912ceb418a8bb704382d4cc8da3b0bc207f0318ebce9fc/cgp-dss-data-loader-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "c8d0037bdb42bad65a132ccc46155a7c", "sha256": "285ee366143fd4626a63d9eebe1301328fa9d20bb1182835c701f16d1fe5cebc" }, "downloads": -1, "filename": "cgp-dss-data-loader-0.0.2.tar.gz", "has_sig": false, "md5_digest": "c8d0037bdb42bad65a132ccc46155a7c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14316, "upload_time": "2018-06-29T21:45:00", "url": "https://files.pythonhosted.org/packages/59/56/81efc4333edf6bf89e756fd235a0bda6ab456ec59408cfd7d8d940f563a6/cgp-dss-data-loader-0.0.2.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": "2742e9aa6aee1143eef1a2b374d1f864", "sha256": "4ea4978cf44a8d80725208768c853f06e12c2f79d0950422a0ef6b0b5e5cd6e7" }, "downloads": -1, "filename": "cgp-dss-data-loader-0.1.0.tar.gz", "has_sig": false, "md5_digest": "2742e9aa6aee1143eef1a2b374d1f864", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14877, "upload_time": "2018-07-12T19:50:53", "url": "https://files.pythonhosted.org/packages/ce/6e/957ca4761a5dfc58bb8fab9361642713f7af7497d7cd13fa667edcf95d3a/cgp-dss-data-loader-0.1.0.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "1010730468564d023cd28a179b78b30b", "sha256": "d67433cd93a1bb9336cf7505e9ad8a68ec111994d9ae5388b7bf764d0201a275" }, "downloads": -1, "filename": "cgp-dss-data-loader-1.1.0.tar.gz", "has_sig": false, "md5_digest": "1010730468564d023cd28a179b78b30b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16084, "upload_time": "2018-09-25T22:45:23", "url": "https://files.pythonhosted.org/packages/99/af/c5d322e3f6ff620ab5f8e4de6423e6ff64d9453c87c934b7e8545e4c1c3e/cgp-dss-data-loader-1.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1010730468564d023cd28a179b78b30b", "sha256": "d67433cd93a1bb9336cf7505e9ad8a68ec111994d9ae5388b7bf764d0201a275" }, "downloads": -1, "filename": "cgp-dss-data-loader-1.1.0.tar.gz", "has_sig": false, "md5_digest": "1010730468564d023cd28a179b78b30b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16084, "upload_time": "2018-09-25T22:45:23", "url": "https://files.pythonhosted.org/packages/99/af/c5d322e3f6ff620ab5f8e4de6423e6ff64d9453c87c934b7e8545e4c1c3e/cgp-dss-data-loader-1.1.0.tar.gz" } ] }