{ "info": { "author": "Sean Hammond", "author_email": "deadoralive@seanh.cc", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Other Audience", "License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Topic :: Internet :: WWW/HTTP :: Site Management :: Link Checking" ], "description": "[![Build Status](https://travis-ci.org/ckan/deadoralive.svg)](https://travis-ci.org/ckan/deadoralive)\n[![Coverage Status](https://img.shields.io/coveralls/ckan/deadoralive.svg)](https://coveralls.io/r/ckan/deadoralive)\n\nDead or Alive\n=============\n\nDead or Alive is a simple dead link checker service: it's a command-line script\nor cron job that checks your site for broken links and posts the results back\nto your site using an API (defined below) that your site must implement.\n\nFor [CKAN](http://ckan.org/) sites you can install\n[ckanext-deadoralive](https://github.com/ckan/ckanext-deadoralive) to add Dead\nor Alive support.\n\nIn the future we'd like to make this into a web service rather than just a\ncron job. This will enable _ad-hoc_ link checking in response to user\ninteractions (i.e. the user clicks on a \"check this link/these links now\"\nbutton on the client website, or checking a new resource as soon as a user\ncreates it) as well as the currently implemented automatic hourly checks.\nSee .\n\n\nRequirements\n------------\n\nPython 2.6 or 2.7.\n\n\nInstallation\n------------\n\nTo install run:\n\n pip install deadoralive\n\nIf you want to check a CKAN site for broken links you also need to install\n[ckanext-deadoralive](https://github.com/ckan/ckanext-deadoralive) on your\nCKAN site.\n\nTo install for development, create and activate a Python virtual environment\nthen do:\n\n git clone https://github.com/ckan/deadoralive.git\n cd deadoralive\n python setup.py develop\n\n\nUsage\n-----\n\nTo check a site for broken links run:\n\n deadoralive --url --apikey \n\nReplace `` with the URL of the CKAN or other client\nsite you want to check (e.g. `http://demo.ckan.org`).\n\nReplace `` with the API key of the site user that you want the\nlink checker to run as.\n\nIf the site is a CKAN site the API key should be the CKAN API key of one of the\nusers listed in the `ckanext.deadoralive.authorized_users` config setting in\nyour CKAN config file (see\n[ckanext-deadoralive's docs](https://github.com/ckan/ckanext-deadoralive) for\ndetails).\n\n\nAdding a Cron Job\n-----------------\n\nTo setup the link checker to run automatically you can add a cron job for it.\nOn most UNIX systems you can add a cron job by running `crontab -e` to edit\nyour crontab file. Add a line like the following to the file and save it:\n\n @hourly deadoralive --url '' --apikey \n\nAs before replace `` with the URL of the site you want to check,\nand `` with an API key from the site.\n\nYou can also use `@daily` or `@weekly` instead of `@hourly` if you want link\nchecking to happen less often.\n\n\nRunning the Link Checker on a Different Machine\n-----------------------------------------------\n\nAlthough the ckanext-deadoralive extension has to be installed and activated on\nthe CKAN site that you want to check the links of, the link checker script\nitself does not need to be run from the same machine. Because it does all\ncommunication with the CKAN site via CKAN's API, you can run it from a\ndifferent server or from your laptop: just install and run deadoralive\nas described above.\n\n\nRunning Multiple Link Checkers at Once\n--------------------------------------\n\nIt's perfectly fine to run multiple instances of the `deadoralive` link\nchecker script at once. For example:\n\n* Have a cron job setup on the server to run the link checker hourly, but\n occassionally run another copy of the link checker manually on your laptop.\n\n* Have multiple cron jobs on multiple servers running link checkers against\n the same CKAN site.\n\nCKAN will hand out different resources to each link checker, and won't let two\ncheckers check the same resource at the same time.\n\n\n### Running Multiple Link Checkers on the Same Machine\n\nBy default deadoralive uses a socket to prevent two instances of the\nscript from running at the same time on the same machine. This is to prevent\nlink checker processes from piling up when the link checker is being run as a\ncron job and doesn't finish checking all the links before cron runs it again.\n\nIf you _want_ to run multiple instances on the same machine at the same time,\njust use the `--port` option to specify a different port for each.\nFor example:\n\n deadoralive --url '' --apikey --port 4567\n deadoralive --url '' --apikey --port 4568\n deadoralive --url '' --apikey --port 4569\n\n(deadoralive doesn't actually do anything with the port, it just binds a\nsocket to it to prevent any other deadoralive processes with the same port\nfrom running.)\n\n\nChecking Multiple Sites\n-----------------------\n\nYou can use a single instance of the link checker to check multiple sites:\njust pass it different `--url` and `--apikey` arguments.\n\nFor example you might setup three cron jobs to check three different sites,\ngiving each job a different port so that they can run simultaneously:\n\n @hourly deadoralive --url '' --apikey --port 4567\n @hourly deadoralive --url '' --apikey --port 4568\n @hourly deadoralive --url '' --apikey --port 4569\n\n\nAPI\n---\n\nThe API that a site must implement to be compatible with deadoralive is as\nfollows:\n\n* All request and response bodies are in JSON-formatted text.\n\n* deadoralive sends the API key (from its `--apikey` option) in all HTTP\n requests as an Authorization header.\n\n For example `deadoralive --url --apikey foo` will send requests with\n header `Authorization: foo`.\n\n* `GET /deadoralive/get_resources_to_check`\n\n Return a list of IDs of resources to be checked. An ID can be any string that\n uniquely identifies a resource that has a URL.\n\n deadoralive will request this endpoint once at the start of each run.\n\n Example output: `[\"id_1\", \"id_2\", ...]`.\n\n* `GET /deadoralive/get_url_for_resource?resource_id=`\n\n Return the URL of the link to be checked for the given resource ID.\n\n deadoralive will request this endpoint once for each of the resource IDs\n it was given by the previous request.\n\n Example output: `{\"url\": \"http://demo.ckan.org/data.csv\"}`\n\n* `POST /deadoralive/upsert`\n\n deadoralive will post to this endpoint to report the result of each link\n check that it does.\n\n Example post body:\n\n {\"resource_id\": \"id_1\",\n \"alive\": false,\n \"status\": 500,\n \"reason\": \"Internal Server Error\",\n \"url\": \"http://demo.ckan.org/data.csv\"\n }\n\nSee [ckanext-deadoralive](https://github.com/ckan/ckanext-deadoralive) for a\nreference implementation.\n\n\nRunning the Tests\n-----------------\n\nFirst activate your virtualenv then install the dev requirements:\n\n pip install -r dev-requirements.txt\n\nThen to run the tests do:\n\n nosetests\n\nTo run the tests and produce a test coverage report do:\n\n nosetests --with-coverage --cover-inclusive --cover-erase --cover-tests", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ckan/deadoralive", "keywords": "ckan", "license": "AGPL", "maintainer": null, "maintainer_email": null, "name": "deadoralive", "package_url": "https://pypi.org/project/deadoralive/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/deadoralive/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/ckan/deadoralive" }, "release_url": "https://pypi.org/project/deadoralive/0.1.1/", "requires_dist": null, "requires_python": null, "summary": "A broken link checker service", "version": "0.1.1" }, "last_serial": 1381096, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "32ea51a9b6f72a619a24c280ebdbd9fd", "sha256": "12ff9bef4517650aa31ba849da10bb93c1136b4294efd6e10cb2f23223f77a67" }, "downloads": -1, "filename": "deadoralive-0.1.0.tar.gz", "has_sig": false, "md5_digest": "32ea51a9b6f72a619a24c280ebdbd9fd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8042, "upload_time": "2014-10-01T19:33:45", "url": "https://files.pythonhosted.org/packages/f4/f4/d64b1992db70a7e48062e27d8dbb67bb2cdb306e0f296a48e12548c57aba/deadoralive-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "73a3938a830714723602182533858cfd", "sha256": "f581d2a029ad98b2d33cf8176326a6df4364eba45b0e579668a573836d3619a8" }, "downloads": -1, "filename": "deadoralive-0.1.1.tar.gz", "has_sig": false, "md5_digest": "73a3938a830714723602182533858cfd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9941, "upload_time": "2014-10-13T20:52:22", "url": "https://files.pythonhosted.org/packages/fd/f8/9397b6ba75b0f17e4d514dd0469e8a94b17f377a346d9e4a578fd9d3c255/deadoralive-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "73a3938a830714723602182533858cfd", "sha256": "f581d2a029ad98b2d33cf8176326a6df4364eba45b0e579668a573836d3619a8" }, "downloads": -1, "filename": "deadoralive-0.1.1.tar.gz", "has_sig": false, "md5_digest": "73a3938a830714723602182533858cfd", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9941, "upload_time": "2014-10-13T20:52:22", "url": "https://files.pythonhosted.org/packages/fd/f8/9397b6ba75b0f17e4d514dd0469e8a94b17f377a346d9e4a578fd9d3c255/deadoralive-0.1.1.tar.gz" } ] }