{ "info": { "author": "LIU Yu", "author_email": "liuyu@opencps.net", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Topic :: Internet :: WWW/HTTP :: Dynamic Content", "Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries", "Topic :: Internet :: WWW/HTTP :: WSGI", "Topic :: Internet :: WWW/HTTP :: WSGI :: Application", "Topic :: Internet :: WWW/HTTP :: WSGI :: Middleware", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": ".. snapsearch-client-python document\n :noindex:\n\n========================\nSnapSearch Client Python\n========================\n\n.. image:: https://travis-ci.org/SnapSearch/SnapSearch-Client-Python.png?branch=master\n :target: https://travis-ci.org/SnapSearch/SnapSearch-Client-Python\n\n.. image:: https://badge.fury.io/py/snapsearch-client-python.svg\n :target: http://badge.fury.io/py/snapsearch-client-python\n\nSnapSearch Client Python is a Python based framework agnostic HTTP client\nlibrary for SnapSearch (https://snapsearch.io/).\n\nSnapsearch is a search engine optimisation (SEO) and robot proxy for complex\nfront-end javascript & AJAX enabled (potentially realtime) HTML5 web applications.\n\nSearch engines like Google's crawler and dumb HTTP clients such as Facebook's\nimage extraction robot cannot execute complex javascript applications. Complex\njavascript applications include websites that utilise AngularJS, EmberJS, KnockoutJS,\nDojo, Backbone.js, Ext.js, jQuery, JavascriptMVC, Meteor, SailsJS, Derby, RequireJS\nand much more. Basically any website that utilises javascript in order to bring in\ncontent and resources asynchronously after the page has been loaded, or utilises\njavascript to manipulate the page's content while the user is viewing them such\nas animation.\n\nSnapsearch intercepts any requests made by search engines or robots and sends its\nown javascript enabled robot to extract your page's content and creates a cached\nsnapshot. This snapshot is then passed through your own web application back to\nthe search engine, robot or browser.\n\nSnapsearch's robot is an automated load balanced Firefox browser. This Firefox\nbrowser is kept up to date with the nightly versions, so we'll always be able\nto serve the latest in HTML5 technology. Our load balancer ensures your requests\nwon't be hampered by other user's requests.\n\nFor more details on how this works and the benefits of usage see\nhttps://snapsearch.io/\n\nSnapSearch provides similar libraries in other languages:\nhttps://github.com/SnapSearch/Snapsearch-Clients\n\nInstallation\n============\n\nThe Pythonic ``snapsearch-client`` requires a dependable HTTP library that can\nverify SSL certificates for HTTPS requests. Normally, this means you need to\nhave either `requests`_ or `PycURL`_.\n\n.. _`PycURL`: http://pycurl.sourceforge.net/\n.. _`requests`: http://python-requests.org/\n\nTo install with ``pip``, simply:\n\n.. code-block:: bash\n\n $ pip install snapsearch-client-python\n\nOr, if you prefer ``easy_install``:\n\n.. code-block:: bash\n\n $ easy_install snapsearch-client-python\n\n\nUsage\n=====\n\nThe `Pythonic SnapSearch Client`_ provides `WSGI`_ and `CGI`_ middlewares for\nintegrating SnapSearch with respective Python Web applications. There are also\nframework agnostic core objects that can be used independently.\n\n.. _`Pythonic SnapSearch Client`: https://github.com/SnapSearch/SnapSearch-Client-Python\n.. _`WSGI`: http://legacy.python.org/dev/peps/pep-3333/\n.. _`CGI`: http://docs.python.org/library/cgi.html\n\nThe following examples include step-by-step instructions on the context of\nusing the `Pythonic SnapSearch Client`_ in your Python web applications.\n\n- `Integration with WSGI Applications I (Flask) `_\n- `Integration with WSGI Applications II (Django) `_\n- `Integration with Python CGI Scripts `_\n\nFor full documentation on the API and API request parameters see:\nhttps://snapsearch.io/documentation\n\n**By the way, you need to blacklist non-html resources such as `sitemap.xml`. This is explained in https://snapsearch.io/documentation#notes**\n\nBasic Usage\n-----------\n\nThe below instructions is an abridged version of the Flask_ example. The\nfollowing python script serves a simple ``\"Hello World\"`` page through any of\nthe public IP address(es) of the runner machine.\n\n.. _Flask: http://flask.pocoo.org/\n\n.. code:: python\n\n from flask import Flask\n app = Flask(__name__)\n\n @app.route('/')\n def hello_world():\n return \"Hello World!\\r\\n\"\n\n if __name__ == '__main__':\n app.run(host=\"0.0.0.0\", port=5000)\n\nTo start the application,\n\n.. code-block:: bash\n\n $ pip install Flask\n $ pip install snapsearch-client-python\n $ python main.py\n * Running on http://0.0.0.0:5000/\n\nTo enable SnapSearch-based interception for this application,\n\n1. initialize an ``Interceptor``.\n\n.. code-block:: python\n\n from SnapSearch import Client, Detector, Interceptor\n interceptor = Interceptor(Client(api_email, api_key), Detector())\n\n\n2. deploy the ``Interceptor``.\n\n.. code-block:: python\n\n from SnapSearch.wsgi import InterceptorMiddleware\n app.wsgi_app = InterceptorMiddleware(app.wsgi_app, interceptor)\n\n\n3. putting it all together.\n\n.. code-block:: python\n\n from flask import Flask\n app = Flask(__name__)\n\n @app.route('/')\n def hello_world():\n return \"Hello World!\\r\\n\"\n\n if __name__ == '__main__':\n # API credentials\n api_email = \"\" # change this to the registered email\n api_key = \"\" # change this to the real api credential\n\n # initialize the interceptor\n from SnapSearch import Client, Detector, Interceptor\n interceptor = Interceptor(Client(api_email, api_key), Detector())\n\n # deploy the interceptor\n from SnapSearch.wsgi import InterceptorMiddleware\n app.wsgi_app = InterceptorMiddleware(app.wsgi_app, interceptor)\n\n # start servicing\n app.run(host=\"0.0.0.0\", port=5000)\n\n\nAdvanced Topics\n---------------\n\nCustomizing the ``Detector``\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThe ``Detector`` class can take ``ignored_routes`` and ``matched_routes`` as\noptional arguments to its constructor and perform interception detection in a\nper-route basis. For example, the following ``detector`` will bypass\ninterception for any access to ``http:///ignored.*``, and enforce\ninterception for any access to ``http:///matched.*``.\n\n.. code-block:: python\n\n from SnapSearch import Detector\n detector = Detector(ignored_routes=[\"^\\/ignored\", ],\n matched_routes=[\"^\\/matched\", ])\n\nThe ``Detector`` class can take external ``robots.json`` and ``extensions.json``\nfiles as optional arguments to its constructor. Namely,\n\n.. code-block:: python\n\n from SnapSearch import Detector\n detector = Detector(robots_json=\"path/to/external/robots.json\",\n extensions_json=\"path/to/external/extensions.json\")\n\nYou can also modify the lists of robots and extension through the ``robots``\nand ``extensions`` properties of the ``detector`` object. For example,\nthe following customization will bypass interception for ``Googlebot``.\n\n.. code-block:: python\n\n from SnapSearch import Detector\n detector = Detector(robots_json=\"path/to/external/robots.json\",\n extensions_json=\"path/to/external/extensions.json\")\n detector.robots['ignore'].append(\"Googlebot\")\n\n\nCustomizing the ``Client``\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThe ``Client`` class can take an optional ``dict`` of ``request_parameters``\nthat contains additional parameters defined in \nhttps://snapsearch.io/documentation#parameters . Note that the ``url`` parameter\nis always overwritten by the ``Interceptor`` with the encoded URL from the\nassociated ``Detector`` object. It can also take optional ``api_url`` and\n``ca_path`` to communicate with an alternative backend service.\n\n\nCustomizing the ``Interceptor``\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nThe ``Interceptor`` class can take two optional callback functions, namely\n``before_intercept()`` and ``after_intercept()``.\n\nAt the presence of ``before_intercept()``, the ``Interceptor`` object will\nbypass any communication with the backend service of SnapSearch, and return\nthe ``result`` of ``before_intercept()`` as if it were returned by the\nassociated ``Client`` object.\n\n.. code-block:: python\n\n def before_intercept(url):\n ...\n return result\n\nAs for ``after_intercept()``, the ``Interceptor`` will provide the response\nfrom the ``Client`` object to ``after_intercept()`` which can perform, say,\ndata extraction or logging as appropriate.\n\n.. code-block:: python\n\n def after_intercept(url, response):\n ...\n return None\n\nThe return value of ``after_response()`` is ignored by the ``Interceptor`` and\nit does not affect the interception process.\n\n\nDevelopers' Resources\n=====================\n\n- `Official Documentation of SnapSearch `_\n- `Future Development of the Pythonic Client Package `_", "description_content_type": null, "docs_url": "https://pythonhosted.org/snapsearch-client-python/", "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/liuyu81/SnapSearch-Client-Python", "keywords": "SnapSearch,client,SEO", "license": "MIT", "maintainer": null, "maintainer_email": null, "name": "snapsearch-client-python", "package_url": "https://pypi.org/project/snapsearch-client-python/", "platform": "All", "project_url": "https://pypi.org/project/snapsearch-client-python/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/liuyu81/SnapSearch-Client-Python" }, "release_url": "https://pypi.org/project/snapsearch-client-python/0.0.9/", "requires_dist": null, "requires_python": null, "summary": "Pythonic HTTP Client and Middleware Library for SnapSearch", "version": "0.0.9" }, "last_serial": 2803249, "releases": { "0.0.7": [ { "comment_text": "", "digests": { "md5": "e2dae06ef86e95260eb019d0c0d5af35", "sha256": "5c371b9e0b9e52ea0db651ae8ee15882c93274ca4eb7fe36f7f446ea28d5f15a" }, "downloads": -1, "filename": "snapsearch-client-python-0.0.7.tar.gz", "has_sig": false, "md5_digest": "e2dae06ef86e95260eb019d0c0d5af35", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 311572, "upload_time": "2014-03-14T20:23:00", "url": "https://files.pythonhosted.org/packages/a3/da/a326ab8a8c16ad10c3192afb70920a90b930c5b0cc3453250d2f21254003/snapsearch-client-python-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "1acd520b8b064bc6bdad3b927322cb4a", "sha256": "80648f619baef00bd6a260dd450272859317d79a1d72e0ef2e28ba2fc8d0b8fb" }, "downloads": -1, "filename": "snapsearch-client-python-0.0.8.tar.gz", "has_sig": false, "md5_digest": "1acd520b8b064bc6bdad3b927322cb4a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 308662, "upload_time": "2015-04-30T13:10:54", "url": "https://files.pythonhosted.org/packages/f3/7a/1e2a1cdceceb250fa8b77e80f36e3e1dcb04a128aeb01b1162ee83c78300/snapsearch-client-python-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "cc6016f543c9c5af0c55bd19d736f98f", "sha256": "7253024e3446a55d5f202a33571316dcd8992c2a53a3b1385d2389eda662ffc1" }, "downloads": -1, "filename": "snapsearch-client-python-0.0.9.tar.gz", "has_sig": false, "md5_digest": "cc6016f543c9c5af0c55bd19d736f98f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 308647, "upload_time": "2017-04-14T07:52:51", "url": "https://files.pythonhosted.org/packages/00/3f/e6ec18ce200f963a9db3c2ce87a1b2051feef64065c2e55054c02d68bddd/snapsearch-client-python-0.0.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cc6016f543c9c5af0c55bd19d736f98f", "sha256": "7253024e3446a55d5f202a33571316dcd8992c2a53a3b1385d2389eda662ffc1" }, "downloads": -1, "filename": "snapsearch-client-python-0.0.9.tar.gz", "has_sig": false, "md5_digest": "cc6016f543c9c5af0c55bd19d736f98f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 308647, "upload_time": "2017-04-14T07:52:51", "url": "https://files.pythonhosted.org/packages/00/3f/e6ec18ce200f963a9db3c2ce87a1b2051feef64065c2e55054c02d68bddd/snapsearch-client-python-0.0.9.tar.gz" } ] }