{ "info": { "author": "Jiachen Yao", "author_email": "", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3.6" ], "description": "Red Panda \ud83d\udc3c\ud83d\ude0a\n================\n\nData science on the cloud without frustration.\n\nCaveat\n------\n\nThis package only works with Python >= 3.6 because it relies heavily on f-strings.\n\n\nFeatures\n--------\n\n- Move DataFrames and files to and from S3 and Redshift.\n- Run queries on Redshift in Python.\n- Use built-in Redshift admin queries, such as checking running queries.\n- Use Redshift utility functions to easily accomplish common tasks such as creating tables.\n- Manage files on S3.\n- Offers a CLI (not yet feature-complete).\n\n\nInstallation\n------------\n\n.. code-block:: console\n\n $ pip install red-panda\n\n\nUsing red-panda\n---------------\n\nImport ``red_panda`` and create an instance of ``RedPanda``. If you create the instance with ``debug`` on (i.e. ``rp = RedPanda(redshift_conf, s3_conf, debug=True)``), ``red-panda`` will print the planned queries instead of executing them.\n\n.. code-block:: python\n\n from red_panda import RedPanda\n\n redshift_conf = {\n 'user': 'awesome-developer',\n 'password': 'strong-password',\n 'host': 'awesome-domain.us-east-1.redshift.amazonaws.com',\n 'port': 5432,\n 'dbname': 'awesome-db',\n }\n\n s3_conf = {\n 'aws_access_key_id': 'your-aws-access-key-id',\n 'aws_secret_access_key': 'your-aws-secret-access-key',\n # 'aws_session_token': 'temporary-token-if-you-have-one',\n }\n\n rp = RedPanda(redshift_conf, s3_conf)\n\n\nLoad your pandas DataFrame into Redshift as a new table.\n\n.. 
code-block:: python\n\n import pandas as pd\n\n df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})\n\n s3_bucket = 's3-bucket-name'\n s3_path = 'parent-folder/child-folder' # optional, if you don't have any sub folders\n s3_file_name = 'test.csv' # optional, randomly generated if not provided\n rp.df_to_redshift(df, 'test_table', bucket=s3_bucket, path=s3_path, append=False)\n\n\nIt is also possible to:\n\n- Upload a DataFrame or flat file to S3\n- Delete files from S3\n- Load S3 data into Redshift\n- Unload a Redshift query result to S3\n- Obtain a Redshift query result as a DataFrame\n- Run queries on Redshift\n- Download an S3 file to local disk\n- Read an S3 file in memory as a DataFrame\n- Run built-in Redshift admin queries, such as getting running query information\n- Use utility functions such as ``create_table`` to quickly create tables in Redshift\n- Separate concerns by using ``RedshiftUtils`` or ``S3Utils``\n\n\n.. code-block:: python\n\n s3_key = s3_path + '/' + s3_file_name\n\n # Upload a DataFrame to S3\n rp.df_to_s3(df, s3_bucket, s3_key)\n\n # Delete a file on S3\n rp.delete_from_s3(s3_bucket, s3_key)\n\n # Upload a local file to S3\n df.to_csv('test_data.csv', index=False)\n rp.file_to_s3('test_data.csv', s3_bucket, s3_key)\n\n # Populate a Redshift table from S3 files\n # Use a dictionary for column definition, here we minimally define only data_type\n redshift_column_definition = {\n 'col1': {'data_type': 'int'},\n 'col2': {'data_type': 'int'},\n }\n rp.s3_to_redshift(\n s3_bucket, s3_key, 'test_table', column_definition=redshift_column_definition\n )\n\n # Unload Redshift query result to S3\n sql = 'select * from test_table'\n rp.redshift_to_s3(sql, s3_bucket, s3_path+'/unload', prefix='unloadtest_')\n\n # Obtain Redshift query result as a DataFrame\n df = rp.redshift_to_df('select * from test_table')\n\n # Run queries on Redshift\n rp.run_query('create table test_table_copy as select * from test_table')\n\n # Download S3 file to local\n 
rp.s3_to_file(s3_bucket, s3_key, 'local_file_name.csv')\n\n # Read S3 file in memory as DataFrame\n df = rp.s3_to_df(s3_bucket, s3_key, delimiter=',') # csv file in this example\n\n # To use only the Redshift functionality, use RedshiftUtils directly\n from red_panda.red_panda import RedshiftUtils\n ru = RedshiftUtils(redshift_conf)\n\n # Run built-in Redshift admin queries, such as getting running query information\n load_errors = ru.get_load_error(as_df=True)\n\n # Use utility functions such as create_table to quickly create tables in Redshift\n ru.create_table('test_table', redshift_column_definition, sortkey=['col2'], drop_first=True)\n\n\nFor API documentation, visit https://red-panda.readthedocs.io/en/latest/.\n\n\nTODO\n----\n\nIn no particular order:\n\n- Support more data formats for COPY. Currently only delimited files are supported.\n- Support more data formats for S3 to DataFrame. Currently only delimited files are supported.\n- Improve tests and docs.\n- Better ways of inferring Redshift data types from a DataFrame.\n- Explore using ``S3 Transfer Manager``'s ``upload_fileobj`` for ``df_to_s3`` to take advantage of automatic multipart upload.\n- Add COPY from S3 manifest file, in addition to COPY from S3 source path.\n- Build a CLI to manage data outside of Python.\n\nIn progress:\n\n- Support Alibaba Cloud (\u963f\u91cc\u4e91) and GCP\n- EMR create cluster from a config file\n- Take advantage of Redshift slices for parallel processing. Split files for COPY.\n\nDone:\n\n- Unload from Redshift to S3.\n- Handle the case where the DataFrame index is an implicit column. Currently the index is dropped automatically.\n- Add encryption options for files uploaded to S3. 
*By adding support for all kwargs for s3 put_object/upload_file methods.*\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/yaojiach/red-panda", "keywords": "", "license": "MIT", "maintainer": "Jiachen Yao", "maintainer_email": "", "name": "red-panda", "package_url": "https://pypi.org/project/red-panda/", "platform": "", "project_url": "https://pypi.org/project/red-panda/", "project_urls": { "Code": "https://github.com/yaojiach/red-panda", "Homepage": "https://github.com/yaojiach/red-panda", "Issue tracker": "https://github.com/yaojiach/red-panda/issues" }, "release_url": "https://pypi.org/project/red-panda/0.1.9/", "requires_dist": [ "pandas", "psycopg2-binary", "boto3", "awscli", "oss2", "click", "python-dotenv", "pytest ; extra == 'dev'", "tox ; extra == 'dev'" ], "requires_python": ">=3.6", "summary": "Data science on the cloud", "version": "0.1.9" }, "last_serial": 5209406, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "6f65e9bc7cd14894006d267570adcef5", "sha256": "30ebce76f6bfa7996f3418c218c8810212ea7bb082b45dde484553a772d20f16" }, "downloads": -1, "filename": "red_panda-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "6f65e9bc7cd14894006d267570adcef5", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 6324, "upload_time": "2018-06-16T19:05:00", "url": "https://files.pythonhosted.org/packages/fe/f4/04b7a98d6e5554ccf0085bb64a0552272bd853b888eded7618fefb4f95ef/red_panda-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "99d684917c001feb63b2e5b5cdb7e121", "sha256": "3918187fb777d115ee99444128af7b0ca3a4f24e387fe90440724da63b7247a6" }, "downloads": -1, "filename": "red-panda-0.1.0.tar.gz", "has_sig": false, "md5_digest": "99d684917c001feb63b2e5b5cdb7e121", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", 
"size": 6453, "upload_time": "2018-06-16T19:05:01", "url": "https://files.pythonhosted.org/packages/3c/3a/de50598dc4619d70253611b1064a032473177fa7ddf0da2c30d27ff68421/red-panda-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "31d76a60db6facfc8f4b932592fb1b70", "sha256": "53907113000e8812484b88d768a8eb47ca6fda40176b0fa2282576466206771a" }, "downloads": -1, "filename": "red_panda-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "31d76a60db6facfc8f4b932592fb1b70", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 6905, "upload_time": "2018-06-16T22:56:25", "url": "https://files.pythonhosted.org/packages/78/b2/47c63129fc7bc3d85fe862cb0cbafa3093e07e8b7407cd6ef8c342686a07/red_panda-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "775e66518ad1d5aae0dbcc489b384eb7", "sha256": "256747f8dc39c447ec2c19b1475dddaa71367b25fe090eda08d48bb225c3800a" }, "downloads": -1, "filename": "red-panda-0.1.1.tar.gz", "has_sig": false, "md5_digest": "775e66518ad1d5aae0dbcc489b384eb7", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7096, "upload_time": "2018-06-16T22:56:27", "url": "https://files.pythonhosted.org/packages/3d/82/d5b49089fa84d50cdac632ebe6114df2075e0f5fc52ef90c024a91fbe785/red-panda-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "8345c8b39448cae539e7fcf412f93c85", "sha256": "83fea4a6cb780765770b68941e7dda1a7f5a38e07806ea2df88dd2ed24ff4742" }, "downloads": -1, "filename": "red_panda-0.1.2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "8345c8b39448cae539e7fcf412f93c85", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 7458, "upload_time": "2018-06-20T05:07:51", "url": "https://files.pythonhosted.org/packages/2d/35/763eae2121f0cb5cd13245a2bf6f2dcb65d65e128e7e360b136a03df9f9c/red_panda-0.1.2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"556e0c0e3efebe0ed36b98e976e35713", "sha256": "f0831c109dc366570de34e4efd5d72d7d619dd99d35ee97446f7ba62c9f60dbe" }, "downloads": -1, "filename": "red-panda-0.1.2.tar.gz", "has_sig": false, "md5_digest": "556e0c0e3efebe0ed36b98e976e35713", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7742, "upload_time": "2018-06-20T05:07:53", "url": "https://files.pythonhosted.org/packages/01/ee/fe81ef114ad472ed64d5b14b5c8bdb7842687e91ef008ad7b9f9e4d9557b/red-panda-0.1.2.tar.gz" } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "717202e0db9a97069906affa8e2387b8", "sha256": "8e928d5d52e6e436368f8b3db553591f46527e1c21f05af01673ab1db2e881c3" }, "downloads": -1, "filename": "red_panda-0.1.3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "717202e0db9a97069906affa8e2387b8", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 7461, "upload_time": "2018-06-20T05:29:14", "url": "https://files.pythonhosted.org/packages/4c/c4/4d7fa716028aec90412aa944e10990d3533ccefa16ddbe3ac8b4c9b4ed76/red_panda-0.1.3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "29eae91722d0bbbae5408edba23e6247", "sha256": "48d4d174ae0b0c5dee989ddf9044e2636726306250b23aefb75e772791cb2f74" }, "downloads": -1, "filename": "red-panda-0.1.3.tar.gz", "has_sig": false, "md5_digest": "29eae91722d0bbbae5408edba23e6247", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7744, "upload_time": "2018-06-20T05:29:16", "url": "https://files.pythonhosted.org/packages/5a/b6/21ca3f3b8f5306322dce0fd29f77475bf8ac3e15a9504eb6be76017d1901/red-panda-0.1.3.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "d54e4c2f97ff2bf11fd4ed9599b3d4c1", "sha256": "bfc6e062546d9b147033e84ffc9526bbd0ae2556d3b81dea052c97c15cbcc94f" }, "downloads": -1, "filename": "red_panda-0.1.4-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "d54e4c2f97ff2bf11fd4ed9599b3d4c1", "packagetype": 
"bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 7466, "upload_time": "2018-06-20T05:33:12", "url": "https://files.pythonhosted.org/packages/7a/f3/d056d39f693cf350cd0c16d8253fc31338ed4fe65d34e024e56c354bfeef/red_panda-0.1.4-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e179ed8ac6b1ddf1ed474bb29c01ddc7", "sha256": "1d562e54b2e6ea82532d9ae459032abf473ebc73c93ed4a54a44be3e0b594abe" }, "downloads": -1, "filename": "red-panda-0.1.4.tar.gz", "has_sig": false, "md5_digest": "e179ed8ac6b1ddf1ed474bb29c01ddc7", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7750, "upload_time": "2018-06-20T05:33:13", "url": "https://files.pythonhosted.org/packages/a2/f8/9c240dd72c4545f2f1710e03e9dac4cce76181e526b0d4b11740b631e0d5/red-panda-0.1.4.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "c7421bc7823227d5f6b0e003aa1332e0", "sha256": "26ee8fd2ecd878c72165988fc6bf530fdbe51bde206f2f2a30c00d23d0e59a01" }, "downloads": -1, "filename": "red_panda-0.1.6-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "c7421bc7823227d5f6b0e003aa1332e0", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 9495, "upload_time": "2018-06-25T21:28:34", "url": "https://files.pythonhosted.org/packages/0e/36/0c95e2436d7d481cfa318f49d6bb76ae7b2a69f287a9c20eb36755edc5fd/red_panda-0.1.6-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "26eab419fb7c0c6bdee769e7504316ca", "sha256": "f09d5bb7236c79562a4b35d19f594ffd13d786a347506ca39de95775d291d89e" }, "downloads": -1, "filename": "red-panda-0.1.6.tar.gz", "has_sig": false, "md5_digest": "26eab419fb7c0c6bdee769e7504316ca", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 11133, "upload_time": "2018-06-25T21:28:35", "url": "https://files.pythonhosted.org/packages/0f/f7/85f8f97ad8801a141af420162bf4fafb0794220b80efb07fff03ae112b91/red-panda-0.1.6.tar.gz" } ], 
"0.1.7": [ { "comment_text": "", "digests": { "md5": "437b5257522df52903af51f5fcf5a115", "sha256": "6673bf04a58e0013483d55866107086f6aa396b003ea5b1c252a5ea110a78b0b" }, "downloads": -1, "filename": "red_panda-0.1.7-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "437b5257522df52903af51f5fcf5a115", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 14953, "upload_time": "2018-07-09T23:10:43", "url": "https://files.pythonhosted.org/packages/d6/2c/4b85ba0d06383440bff6a64e9b18293d9e039d73a5df6d883a444931a4d4/red_panda-0.1.7-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a38a178f7466bb9b2c4fe7c972e0d9b8", "sha256": "c09f4d2026d54193f9c7269c23dae0a840af26c0af5002012d846f2f64b70f2a" }, "downloads": -1, "filename": "red-panda-0.1.7.tar.gz", "has_sig": false, "md5_digest": "a38a178f7466bb9b2c4fe7c972e0d9b8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 17079, "upload_time": "2018-07-09T23:10:45", "url": "https://files.pythonhosted.org/packages/7d/ec/5d281304c38ea248c1e0e7909daa3990e1b99555740d91ddfde2461158c3/red-panda-0.1.7.tar.gz" } ], "0.1.8": [ { "comment_text": "", "digests": { "md5": "a643440d57dffb80cb7f7362a031bcc8", "sha256": "993f4050a3f495247390e36a045c935182d670cb1d18553969ab920e29446e28" }, "downloads": -1, "filename": "red_panda-0.1.8-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a643440d57dffb80cb7f7362a031bcc8", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 24775, "upload_time": "2019-01-30T21:59:36", "url": "https://files.pythonhosted.org/packages/22/66/216273a015ce1112dc3ecdef4e4f64c4209f7a0bb2021cadb284acac0ac6/red_panda-0.1.8-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2a529afbed9d6d4f5bb454cbfc36386a", "sha256": "93770275ae34eabdfda01344958a18b471b517dfad9c5128a28ee8fcfbc05de1" }, "downloads": -1, "filename": "red-panda-0.1.8.tar.gz", "has_sig": false, 
"md5_digest": "2a529afbed9d6d4f5bb454cbfc36386a", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 23325, "upload_time": "2019-01-30T21:59:38", "url": "https://files.pythonhosted.org/packages/7a/42/e35c23aefcb1338171ba1b36a1a9edd1717c71dfb19e4cc31618577d17fb/red-panda-0.1.8.tar.gz" } ], "0.1.9": [ { "comment_text": "", "digests": { "md5": "abd718f69ec8c673accd2160b1b1518c", "sha256": "6ed2420a05b26cab75724f00ec281beb7ed43a8b01489855b5420be81fd2c45d" }, "downloads": -1, "filename": "red_panda-0.1.9-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "abd718f69ec8c673accd2160b1b1518c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 24049, "upload_time": "2019-04-30T17:46:22", "url": "https://files.pythonhosted.org/packages/54/54/029fa4eec865c11e41fb245e0f81b26cd1b753d90b7b57682a2bcea41177/red_panda-0.1.9-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2ce90c391cc1b5bd553ce1f99d80dac0", "sha256": "519d099d39f3a1ec8a807adcf06f72b36891f21fc514969e408e5f95d0a86506" }, "downloads": -1, "filename": "red-panda-0.1.9.tar.gz", "has_sig": false, "md5_digest": "2ce90c391cc1b5bd553ce1f99d80dac0", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 23531, "upload_time": "2019-04-30T17:46:25", "url": "https://files.pythonhosted.org/packages/4d/96/5099714237a37139078a4c1ab5559acba3760d6a2a47e8fc6fb9c7b903d6/red-panda-0.1.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "abd718f69ec8c673accd2160b1b1518c", "sha256": "6ed2420a05b26cab75724f00ec281beb7ed43a8b01489855b5420be81fd2c45d" }, "downloads": -1, "filename": "red_panda-0.1.9-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "abd718f69ec8c673accd2160b1b1518c", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 24049, "upload_time": "2019-04-30T17:46:22", "url": 
"https://files.pythonhosted.org/packages/54/54/029fa4eec865c11e41fb245e0f81b26cd1b753d90b7b57682a2bcea41177/red_panda-0.1.9-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2ce90c391cc1b5bd553ce1f99d80dac0", "sha256": "519d099d39f3a1ec8a807adcf06f72b36891f21fc514969e408e5f95d0a86506" }, "downloads": -1, "filename": "red-panda-0.1.9.tar.gz", "has_sig": false, "md5_digest": "2ce90c391cc1b5bd553ce1f99d80dac0", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 23531, "upload_time": "2019-04-30T17:46:25", "url": "https://files.pythonhosted.org/packages/4d/96/5099714237a37139078a4c1ab5559acba3760d6a2a47e8fc6fb9c7b903d6/red-panda-0.1.9.tar.gz" } ] }