{ "info": { "author": "University of North Texas Libraries", "author_email": "lauren.ko@unt.edu", "bugtrack_url": null, "classifiers": [ "Intended Audience :: System Administrators", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Programming Language :: Python", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Communications :: File Sharing" ], "description": "# py-wasapi-client [![Build Status](https://travis-ci.org/unt-libraries/py-wasapi-client.svg)](https://travis-ci.org/unt-libraries/py-wasapi-client)\nA client for the WASAPI Data Transfer API. Initially developed according to the\n[Archive-It specification](https://github.com/WASAPI-Community/data-transfer-apis/tree/master/ait-specification), the client now additionally supports [Webrecorder.io](https://webrecorder.io/).\n\n## Requirements\n\n* Python 3.4-3.7\n\n## Installation\n\nTo run the latest code, the WASAPI client may be downloaded or cloned\nfrom [GitHub](https://github.com/unt-libraries/py-wasapi-client). 
From inside the top-level of the py-wasapi-client directory,\ninstall with:\n\n```\n $ python setup.py install\n```\n\nAlternatively, the most recent release (not guaranteed to be the latest\ncode) may be installed from [PyPi](https://pypi.org/project/py-wasapi-client/):\n\n```\n $ pip install py-wasapi-client\n```\n\nOnce installed, run the client at the command line with:\n\n```\n $ wasapi-client --help\n```\n\nThat gives you usage instructions:\n\n```\nusage: wasapi-client [-h] [-b BASE_URI] [-d DESTINATION] [-l LOG] [-n] [-v]\n [--profile PROFILE | -u USER | -t TOKEN]\n [-c | -m | -p PROCESSES | -s | -r]\n [--collection COLLECTION [COLLECTION ...]]\n [--filename FILENAME] [--crawl CRAWL]\n [--crawl-time-after CRAWL_TIME_AFTER]\n [--crawl-time-before CRAWL_TIME_BEFORE]\n [--crawl-start-after CRAWL_START_AFTER]\n [--crawl-start-before CRAWL_START_BEFORE]\n\n Download WARC files from a WASAPI access point.\n\n Acceptable date/time formats are:\n 2017-01-01\n 2017-01-01T12:34:56\n 2017-01-01 12:34:56\n 2017-01-01T12:34:56Z\n 2017-01-01 12:34:56-0700\n 2017\n 2017-01\n\noptional arguments:\n -h, --help show this help message and exit\n -b BASE_URI, --base-uri BASE_URI\n base URI for WASAPI access; default:\n https://partner.archive-it.org/wasapi/v1/webdata\n -d DESTINATION, --destination DESTINATION\n location for storing downloaded files\n -l LOG, --log LOG file to which logging should be written\n -n, --no-manifest do not generate checksum files (ignored when used in\n combination with --manifest)\n -v, --verbose log verbosely; -v is INFO, -vv is DEBUG\n --profile PROFILE profile to use for API authentication\n -u USER, --user USER username for API authentication\n -t TOKEN, --token TOKEN\n token for API authentication\n -c, --count print number of files for download and exit\n -m, --manifest generate checksum files only and exit\n -p PROCESSES, --processes PROCESSES\n number of WARC downloading processes\n -s, --size print count and total size of files and exit\n 
-r, --urls list URLs for downloadable files only and exit\n\nquery parameters:\n parameters for webdata request\n\n --collection COLLECTION [COLLECTION ...]\n collection identifier\n --filename FILENAME exact webdata filename to download\n --crawl CRAWL crawl job identifier\n --crawl-time-after CRAWL_TIME_AFTER\n request files created on or after this date/time\n --crawl-time-before CRAWL_TIME_BEFORE\n request files created before this date/time\n --crawl-start-after CRAWL_START_AFTER\n request files from crawl jobs starting on or after\n this date/time\n --crawl-start-before CRAWL_START_BEFORE\n request files from crawl jobs starting before this\n date/time\n```\n\n## Configuration\n\nWhen you are using the tool to query an Archive-It or Webrecorder WASAPI\nendpoint, you will need to supply a username and password for the API. You have\nthree options to provide these credentials.\n\n1. Supply a username with `-u`, and you will be prompted for a password.\n2. Set an environment variable called 'WASAPI_USER' to supply a username\nand a variable called 'WASAPI_PASS' to supply a password.\n3. Supply a profile `--profile` defined in a configuration\nfile. The configuration file should be at `~/.wasapi-client`.\n\nAn example profile:\n\n```\n[unt]\nusername = exampleUser\npassword = examplePassword\n```\n\nOrder of precedence is command line, environment, config file.\n\n## Example Usage\n\nThe following command downloads the WARC files available from a crawl\nwith `crawl id` 256119 and logs program output to a file named\n`out.log`. The program will prompt the user to enter the password for\nuser `myusername`. 
Downloads are carried out by one process.\n\n```\n $ wasapi-client -u myusername --crawl 256119 --log /tmp/out.log -p 1\n```\n\nThe following command downloads similarly, but user credentials are\nsupplied by a configuration file.\n\n```\n $ wasapi-client --profile unt --crawl 256119 --log out.log -p 1\n```\n\nYou may supply an API token instead of user credentials.\n\n```\n $ wasapi-client --token thisistheAPItokenIwasgiven --crawl 256119 --log out.log -p 1\n```\n\nThe following command downloads the WARC files available from crawls\nthat occurred in the specified time range. Verbose logging is being\nwritten to a file named out.log. Downloads are happening via four\nprocesses and written to a directory at /tmp/wasapi_warcs/.\n\n```\n $ wasapi-client --profile unt --crawl-start-after 2016-12-22T13:01:00 --crawl-start-before 2016-12-22T15:11:00 -vv --log out.log -p 4 -d /tmp/wasapi_warcs/\n\n```\n\nThe following command produces the size and file count of all content\navailable to the user.\n\n```\n $ wasapi-client --profile unt -s \n```\n\nThe following command gives the user the number of files available by\nthe given query parameters.\n\n```\n $ wasapi-client --profile unt --crawl 256119 -c \n```\n\nThe following command downloads the file called example.warc.gz to\nthe current working directory.\n\n```\n$ wasapi-client --profile unt --filename example.warc.gz\n```\n\nBy default, manifest files are generated to provide checksums for the\nfiles to be downloaded. One manifest file is generated for each hash algorithm\nprovided by the WASAPI access point. The manifest files are written to the\ndownload destination. 
If you don't want manifest files, use the --no-manifest\nflag.\n\n```\n$ wasapi-client --profile unt --crawl 256119 --log out.log --no-manifest\n```\n\nIf you want to generate manifest files for your available webdata files\nwithout actually downloading the webdata files, use the --manifest flag.\n\n```\n$ wasapi-client --profile unt --crawl 256119 --manifest\n```\n\nIf you would like to produce a list of URLs where your webdata files can\nlater be downloaded by another tool (such as wget) rather than having\nwasapi-client do the downloading, use the --urls flag.\n\n```\n$ wasapi-client --profile unt --crawl 256119 --urls\n```\n\nTo use the client with Webrecorder (not all query parameters may be supported),\nsupply the base URL with -b.\n\n```\n$ wasapi-client -b https://webrecorder.io/api/v1/download/webdata --profile webrecorder --collection my_collection -d warcs\n```\n\n## Run the Tests\n\n```\n$ python setup.py test\n```\n\nor\n\n```\n$ pip install tox\n$ tox\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/unt-libraries/py-wasapi-client", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "py-wasapi-client", "package_url": "https://pypi.org/project/py-wasapi-client/", "platform": "", "project_url": "https://pypi.org/project/py-wasapi-client/", "project_urls": { "Homepage": "https://github.com/unt-libraries/py-wasapi-client" }, "release_url": "https://pypi.org/project/py-wasapi-client/1.1.0/", "requires_dist": [ "requests (>=2.18.1)" ], "requires_python": "", "summary": "A client for the Archive-It and Webrecorder WASAPI Data Transfer API", "version": "1.1.0" }, "last_serial": 5997616, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "6c9e6cc2b52ce1e3102aaacbd403b50b", "sha256": "78970e31455854f9ac42cf50eee531e49f6f0df1aa3f7c97809ed49e98e3b83e" }, "downloads": -1, 
"filename": "py_wasapi_client-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "6c9e6cc2b52ce1e3102aaacbd403b50b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 17894, "upload_time": "2019-08-14T20:16:26", "url": "https://files.pythonhosted.org/packages/f0/e8/f49343eb1b90b395b374d2fb54d92f4ff2214466f83800614cae097aaacb/py_wasapi_client-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ce4fbd58a9cb04debfb4adc70de0fb34", "sha256": "0e550aafb586484523dabb186b53f244f8ccd19629e722301f2bc0b5760c068d" }, "downloads": -1, "filename": "py-wasapi-client-1.0.0.tar.gz", "has_sig": false, "md5_digest": "ce4fbd58a9cb04debfb4adc70de0fb34", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10199, "upload_time": "2019-08-14T20:16:28", "url": "https://files.pythonhosted.org/packages/fa/bc/f5d45ea80d0c2c374cf70b56b0be05d73a05b09575dbbf1469f1872ee8a6/py-wasapi-client-1.0.0.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "58e4ba270afd8a1db95d4f99a9eda128", "sha256": "377bbfc0c24ba860222b52b23778a3281ff0b549a66865fc6f539c5b5988aa0f" }, "downloads": -1, "filename": "py_wasapi_client-1.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "58e4ba270afd8a1db95d4f99a9eda128", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19017, "upload_time": "2019-10-18T21:15:55", "url": "https://files.pythonhosted.org/packages/ca/a8/15ed2c6f496917ff8b36b20e6042b847fcd567947fd8daeb9227c47572f8/py_wasapi_client-1.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9d1b4b17e13e7150b53fb4e94704965b", "sha256": "9f677deaf735854ac054d6425e9e98f8b5cc30e039310e12789b00ebfa6efafc" }, "downloads": -1, "filename": "py-wasapi-client-1.1.0.tar.gz", "has_sig": false, "md5_digest": "9d1b4b17e13e7150b53fb4e94704965b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10925, "upload_time": 
"2019-10-18T21:15:57", "url": "https://files.pythonhosted.org/packages/34/bc/ac8d33dfc8889173022c523517fb621b6eee2c51d4488c1d8b00c2848bd1/py-wasapi-client-1.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "58e4ba270afd8a1db95d4f99a9eda128", "sha256": "377bbfc0c24ba860222b52b23778a3281ff0b549a66865fc6f539c5b5988aa0f" }, "downloads": -1, "filename": "py_wasapi_client-1.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "58e4ba270afd8a1db95d4f99a9eda128", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 19017, "upload_time": "2019-10-18T21:15:55", "url": "https://files.pythonhosted.org/packages/ca/a8/15ed2c6f496917ff8b36b20e6042b847fcd567947fd8daeb9227c47572f8/py_wasapi_client-1.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9d1b4b17e13e7150b53fb4e94704965b", "sha256": "9f677deaf735854ac054d6425e9e98f8b5cc30e039310e12789b00ebfa6efafc" }, "downloads": -1, "filename": "py-wasapi-client-1.1.0.tar.gz", "has_sig": false, "md5_digest": "9d1b4b17e13e7150b53fb4e94704965b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10925, "upload_time": "2019-10-18T21:15:57", "url": "https://files.pythonhosted.org/packages/34/bc/ac8d33dfc8889173022c523517fb621b6eee2c51d4488c1d8b00c2848bd1/py-wasapi-client-1.1.0.tar.gz" } ] }