{ "info": { "author": "Xiaowen Zhang", "author_email": "seanxwzhang@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# WRDS (or SAS) to PostgreSQL\nThis software has two functions:\n- Download tables from [WRDS](https://wrds-web.wharton.upenn.edu/wrds/) and feeds them to a PostgreSQL database. (Requires access to WRDS and to the data in question.)\n- Import a SAS file (`*.sas7dbat`) into a PostgreSQL database.\n\n## Requirements\n\n### 1. Python\nThe software uses Python 3 and depends on Pandas, SQLAlchemy and Paramiko. In addition, the Python scripts generally interact with PostgreSQL using SQLAlchemy and the [Psycopg](http://initd.org/psycopg/) library. \nThese dependencies are installed when you use Pip (see instructions below).\n\n### 2. A WRDS ID\nTo use public-key authentication to access WRDS, follow hints taken from [here](https://debian-administration.org/article/152/Password-less_logins_with_OpenSSH) to set up a public key.\nCopy that key to the WRDS server from the terminal on my computer. \n(Note that this code assumes you have a directory `.ssh` in your home directory. If not, log into WRDS via SSH, then type `mkdir ~/.ssh` to create this.) \nHere's code to create the key and send it to WRDS (for me):\n```\nssh-keygen -t rsa\ncat ~/.ssh/id_rsa.pub | ssh iangow@wrds-cloud.wharton.upenn.edu \"cat >> ~/.ssh/authorized_keys\"\n```\nUse an empty passphrase in setting up the key so that the scripts can run without user intervention.\n\n### 3. PostgreSQL\nYou should have a PostgreSQL database to store the data.\n\n### 4. Environment variables\n\nEnvironment variables that the code can use include:\n\n- `PGDATABASE`: The name of the PostgreSQL database you use.\n- `PGUSER`: Your username on the PostgreSQL database.\n- `PGHOST`: Where the PostgreSQL database is to be found (this will be `localhost` if its on the same machine as you're running the code on)\n- `WRDS_ID`: Your [WRDS](https://wrds-web.wharton.upenn.edu/wrds/) ID.\n\nI set these environment variables in `~/.profile`:\n\n```\nexport PGHOST=\"localhost\"\nexport PGDATABASE=\"crsp\"\nexport WRDS_ID=\"iangow\"\nexport PGUSER=\"igow\"\n```\n\n## Using the function `wrds_update`.\n\nTwo arguments `table_name` and `schema` are required.\n\n### 1. WRDS Settings\nSet `WRDS_ID` using either `wrds_id=your_wrds_id` in the function call or the environment variable `WRDS_ID`.\n\n### 2. PG Settings\nThe software will use the environment variables `PGHOST`, `PGDATABASE`, and `PGUSER` if you If you have set them. Otherwise, you need to provide values as arguments to `wrds_udpate()`. Default `PGPORT` is`5432`.\n\n### 3. Table Settings\nTo tailor your request, specify the following arguments:\n\n- `fix_missing`: set to `True` to fix missing values. This addresses special missign values, which SAS's `PROC EXPORT` dumps as strings. Default value is `False`. \n- `fix_cr`: set to `True` to fix characters. Default value is `False`.\n- `drop`: add column names to be dropped (e.g., `drop=\"id name\"` will drop columns `id` and `name`).\n- `obs`: add maxium number of observations (e.g., `obs=10` will import the first 10 rows from the table on WRDS).\n- `rename`: rename columns (e.g., `rename=\"fee=mngt_fee\"` renames `fee` to `mngt_fee`).\n- `force`: set to `True` to force update. Default value is `False`.\n\n## Importing SAS data into PostgreSQL\nThe software can also upload SAS file directly to PostgreSQL. \nYou need to have local SAS in order to use this function.\nUse `fpath` to specify the path to the file to be imported\n\n### Examples\nHere are some examples.\n\nIf you are at the home directory of this git repo, you can import and use the software as shown below.\n\nTo install it from Github:\n\n```\nsudo -H pip3 install git+https://github.com/iangow/wrds2pg --upgrade\n```\n\nThis software is also available from PyPI. To install it from [PyPI](https://pypi.org/project/wrds2pg/):\n```\npip3 install wrds2pg\n```\nExample usage:\n```py\nfrom wrds2pg.wrds2pg import wrds_update\n\n# 1. Download crsp.mcti from wrds and upload to pg as crps.mcti\n# Simplest version\nwrds_update(table_name=\"mcti\", schema=\"crsp\")\n\n# Tailored arguments \nwrds_update(table_name=\"mcti\", schema=\"crsp\", host=your_pghost, \n\tdbname=your_pg_database, \n\tfix_missing=True, fix_cr=True, drop=\"b30ret b30ind\", obs=10, \n\trename=\"caldt=calendar_date\", force=True)\n\n# 2. Upload test.sas7dbat to pg as crsp.mcti\nwrds_update(table_name=\"mcti\", schema=\"crsp\", fpath=\"your_path/test.sas7dbat\")\n```\n\n### Report Bugs\nAuthor: Ian Gow, \n\nContributor: Jingyu Zhang, ", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/seanxwzhang/wrds2pg/", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "wrds2postgres", "package_url": "https://pypi.org/project/wrds2postgres/", "platform": "", "project_url": "https://pypi.org/project/wrds2postgres/", "project_urls": { "Homepage": "https://github.com/seanxwzhang/wrds2pg/" }, "release_url": "https://pypi.org/project/wrds2postgres/0.0.6/", "requires_dist": null, "requires_python": ">=3", "summary": "Import WRDS tables or SAS data into PostgreSQL. Forked from https://github.com/iangow/wrds2pg.", "version": "0.0.6" }, "last_serial": 5863488, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "eaecacb4e99487d47dcad9e660e12aab", "sha256": "16bc2dd1bbd2ae0ee93f6badc519b8e9062a9e04287053518b7d7c652da831bc" }, "downloads": -1, "filename": "wrds2postgres-0.0.1.tar.gz", "has_sig": false, "md5_digest": "eaecacb4e99487d47dcad9e660e12aab", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 8292, "upload_time": "2019-09-07T05:06:50", "url": "https://files.pythonhosted.org/packages/7f/73/aa15e3e7ffde3c1c4d90274bfd037132a6429e4e39e760ac4a6645e51294/wrds2postgres-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "bb8efe19201bf0171ba895324f55328d", "sha256": "a3a74e283216389cc5f91a41fe19d1dfc73e76ef9056ed640109d79beb5368c1" }, "downloads": -1, "filename": "wrds2postgres-0.0.2.tar.gz", "has_sig": false, "md5_digest": "bb8efe19201bf0171ba895324f55328d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 7748, "upload_time": "2019-09-16T18:53:16", "url": "https://files.pythonhosted.org/packages/50/7d/f261569fe9e2ce1e33c2878ffae21bc40032c40e561eaed70cd8faef5acd/wrds2postgres-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "22ab637edcc815bfbef0e93ed41a0055", "sha256": "cc4b4d415101412e350649f694f6721944282ad76f0374f80d9b4daf2c14d887" }, "downloads": -1, "filename": "wrds2postgres-0.0.3.tar.gz", "has_sig": false, "md5_digest": "22ab637edcc815bfbef0e93ed41a0055", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 7760, "upload_time": "2019-09-16T19:02:49", "url": "https://files.pythonhosted.org/packages/1b/45/9236da8e2f071a71737433a49de67357db1ccfbe82fc12aad02fe07ed290/wrds2postgres-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "420e95e64741206d0952b5ba1a35dfcf", "sha256": "ce4da13dc03848786c18c61b259f20fff82e4163abc09ffd7c97bbe055947a7e" }, "downloads": -1, "filename": "wrds2postgres-0.0.4.tar.gz", "has_sig": false, "md5_digest": "420e95e64741206d0952b5ba1a35dfcf", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 8364, "upload_time": "2019-09-19T06:10:11", "url": "https://files.pythonhosted.org/packages/e8/e1/b24b30b392fc863d24c3160e709d846102cf7d8de7e037a84f41b43ea5e9/wrds2postgres-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "de36aa6f76335cc2b77dffc13f5c12a8", "sha256": "09e1a0c81c61a909e957c55800e161b0d47302e43cc25627fdb3acd7fdfc25e6" }, "downloads": -1, "filename": "wrds2postgres-0.0.5.tar.gz", "has_sig": false, "md5_digest": "de36aa6f76335cc2b77dffc13f5c12a8", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 8686, "upload_time": "2019-09-19T21:22:46", "url": "https://files.pythonhosted.org/packages/21/59/525e830b16692b6c8b83a5409d6ca43733ba5c38cfb4bef63617fc86fd07/wrds2postgres-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "cd8aeb8fd949169ab0e102b014f25b26", "sha256": "2925f770a1a13136537325f625e840e55c00e0db7866af7e1eeaef93b58750db" }, "downloads": -1, "filename": "wrds2postgres-0.0.6.tar.gz", "has_sig": false, "md5_digest": "cd8aeb8fd949169ab0e102b014f25b26", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 8702, "upload_time": "2019-09-20T18:00:52", "url": "https://files.pythonhosted.org/packages/ee/b8/0fe830519fe41ee82e901a76e350a7391b87d198dc53bfdffaa0d7271c85/wrds2postgres-0.0.6.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cd8aeb8fd949169ab0e102b014f25b26", "sha256": "2925f770a1a13136537325f625e840e55c00e0db7866af7e1eeaef93b58750db" }, "downloads": -1, "filename": "wrds2postgres-0.0.6.tar.gz", "has_sig": false, "md5_digest": "cd8aeb8fd949169ab0e102b014f25b26", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 8702, "upload_time": "2019-09-20T18:00:52", "url": "https://files.pythonhosted.org/packages/ee/b8/0fe830519fe41ee82e901a76e350a7391b87d198dc53bfdffaa0d7271c85/wrds2postgres-0.0.6.tar.gz" } ] }