{ "info": { "author": "Scrapinghub", "author_email": "info@scrapinghub.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5" ], "description": "==================\nscrapy-magicfields\n==================\n\n.. image:: https://travis-ci.org/scrapy-plugins/scrapy-magicfields.svg?branch=master\n :target: https://travis-ci.org/scrapy-plugins/scrapy-magicfields\n\n.. image:: https://codecov.io/gh/scrapy-plugins/scrapy-magicfields/branch/master/graph/badge.svg\n :target: https://codecov.io/gh/scrapy-plugins/scrapy-magicfields\n\nThis is a Scrapy spider middleware to add extra fields to items,\nbased on the configuration settings ``MAGIC_FIELDS`` and ``MAGIC_FIELDS_OVERRIDE``.\n\n\nInstallation\n============\n\nInstall scrapy-magicfields using ``pip``::\n\n $ pip install scrapy-magicfields\n\n\nConfiguration\n=============\n\n1. Add MagicFieldsMiddleware by including it in ``SPIDER_MIDDLEWARES``\n in your ``settings.py`` file::\n\n SPIDER_MIDDLEWARES = {\n 'scrapy_magicfields.MagicFieldsMiddleware': 100,\n }\n\n Here, priority ``100`` is just an example.\n Set its value depending on other middlewares you may have enabled already.\n\n2. 
Enable the middleware using ``MAGIC_FIELDS`` (and optionally ``MAGIC_FIELDS_OVERRIDE``)\n in your ``settings.py``.\n\n\nUsage\n=====\n\nBoth settings ``MAGIC_FIELDS`` and ``MAGIC_FIELDS_OVERRIDE`` are dicts:\n\n* the keys are the destination field names,\n* their value is a string which accepts **magic variables**,\n identified by a leading ``$`` (dollar sign),\n which will be substituted by a corresponding value at runtime.\n\nSome magic variables also accept arguments, which are specified after the magic name,\nusing a ``:`` (colon) as separator.\n\n\nYou can set project-global magics with ``MAGIC_FIELDS``,\nand tune them for a specific spider using ``MAGIC_FIELDS_OVERRIDE``.\n\nIf there is more than one argument, they must be separated by ``,`` (comma).\nSo the generic magic format is::\n\n $<magic name>[:arg1,arg2,...]\n\n\nSupported magic variables\n-------------------------\n\n``$time``\n the UTC timestamp at which the item was scraped, in format ``'%Y-%m-%d %H:%M:%S'``.\n\n``$unixtime``\n the unixtime (number of seconds since the Epoch, i.e. ``time.time()``)\n at which the item was scraped.\n\n``$isotime``\n the UTC timestamp at which the item was scraped, in format ``'%Y-%m-%dT%H:%M:%S'``.\n\n``$spider``\n must be followed by an argument,\n which is the name of an attribute of the spider (like an argument passed to it).\n\n``$env``\n the value of an environment variable.\n It accepts as argument the name of the variable.\n\n``$jobid``\n the job id (shortcut for ``$env:SCRAPY_JOB``)\n\n``$jobtime``\n the UTC timestamp at which the job started, in format ``'%Y-%m-%d %H:%M:%S'``.\n\n``$response``\n Access to some response properties.\n\n ``$response:url``\n The URL from which the item was extracted.\n\n ``$response:status``\n Response HTTP status.\n\n ``$response:headers``\n Response HTTP headers.\n\n``$setting``\n Access the given Scrapy setting. 
It accepts one argument: the name of the setting.\n\n``$field``\n Copies the value of one field to another.\n Its argument is the source field.\n Effects are unpredictable if the source field is itself filled\n using magic fields.\n\n\nExamples\n--------\n\nThe following configuration will add two fields to each scraped item:\n\n- ``'timestamp'``, which will be filled with the string ``'item scraped at <scraped timestamp>'``,\n- and ``'spider'``, which will contain the spider name\n\n::\n\n MAGIC_FIELDS = {\n \"timestamp\": \"item scraped at $time\",\n \"spider\": \"$spider:name\"\n }\n\nThe following configuration will copy the url to the field sku::\n\n MAGIC_FIELDS = {\n \"sku\": \"$field:url\"\n }\n\nMagics also accept a regular expression argument, which lets you extract\nand assign only part of the value generated by the magic.\nYou have to specify it using the ``r''`` notation.\n\nLet's pretend that the urls of your items look like ``'http://www.example.com/product.html?item_no=345'``\nand you want to assign to the ``sku`` field only the item number.\n\nThe following example, similar to the previous one but with a second regular expression argument,\nwill do the task::\n\n MAGIC_FIELDS = {\n \"sku\": \"$field:url,r'item_no=(\\d+)'\"\n }\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/scrapy-plugins/scrapy-magicfields", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "scrapy-magicfields", "package_url": "https://pypi.org/project/scrapy-magicfields/", "platform": "Any", "project_url": "https://pypi.org/project/scrapy-magicfields/", "project_urls": { "Homepage": "http://github.com/scrapy-plugins/scrapy-magicfields" }, "release_url": "https://pypi.org/project/scrapy-magicfields/1.1.0/", "requires_dist": [ "scrapy" ], "requires_python": "", "summary": "Scrapy middleware to add extra \"magic\" fields to 
items", "version": "1.1.0" }, "last_serial": 2195426, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "71973a8e9062b558a0e3b1423bc9d4ff", "sha256": "52f26f41628cf5be80d500865d041ad2fee1fed2a8d530085a8b6845fe5aeb13" }, "downloads": -1, "filename": "scrapy_magicfields-1.0.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "71973a8e9062b558a0e3b1423bc9d4ff", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 3907, "upload_time": "2016-06-29T19:28:04", "url": "https://files.pythonhosted.org/packages/b5/14/30b44e6f5a66fe7bd5ccb1e367e1aa4e9acba63af77539436504a2ccd8c5/scrapy_magicfields-1.0.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f431f1b8728748f40caa5b13f3d2cc8c", "sha256": "18fe0cba1411dacbad2d6b85b337fd7edd540adeb247dbf7dbad40635e554913" }, "downloads": -1, "filename": "scrapy-magicfields-1.0.0.tar.gz", "has_sig": false, "md5_digest": "f431f1b8728748f40caa5b13f3d2cc8c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3929, "upload_time": "2016-06-29T19:28:08", "url": "https://files.pythonhosted.org/packages/c8/a8/3c54cabd3e0bace74b4be3f4db3dad8ab67f1ca44d4fd3d8da7d7e29de21/scrapy-magicfields-1.0.0.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "9fa6a1be5c050ad75427f5ed4f115211", "sha256": "56538546df1c8f8edf334f40b1a14904fbfce88763cfb4e8f15e7b7a7ce49158" }, "downloads": -1, "filename": "scrapy_magicfields-1.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "9fa6a1be5c050ad75427f5ed4f115211", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 3913, "upload_time": "2016-06-30T08:37:55", "url": "https://files.pythonhosted.org/packages/53/65/e9766e89031dd1a6ed4f2fae055e2c5ac07b681f5b1548afed125f487809/scrapy_magicfields-1.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b14513d5a51439972d13859cc65fa990", "sha256": 
"e2a3edec49be246e410c4de2a31d271710e0c8e945f6476de7e1c9bbdb5e045f" }, "downloads": -1, "filename": "scrapy-magicfields-1.1.0.tar.gz", "has_sig": false, "md5_digest": "b14513d5a51439972d13859cc65fa990", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3938, "upload_time": "2016-06-30T08:37:58", "url": "https://files.pythonhosted.org/packages/9a/55/78044b09b40eb909e0e3922bb4fd30914a1a3f634055ec9850d1880c113f/scrapy-magicfields-1.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9fa6a1be5c050ad75427f5ed4f115211", "sha256": "56538546df1c8f8edf334f40b1a14904fbfce88763cfb4e8f15e7b7a7ce49158" }, "downloads": -1, "filename": "scrapy_magicfields-1.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "9fa6a1be5c050ad75427f5ed4f115211", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 3913, "upload_time": "2016-06-30T08:37:55", "url": "https://files.pythonhosted.org/packages/53/65/e9766e89031dd1a6ed4f2fae055e2c5ac07b681f5b1548afed125f487809/scrapy_magicfields-1.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b14513d5a51439972d13859cc65fa990", "sha256": "e2a3edec49be246e410c4de2a31d271710e0c8e945f6476de7e1c9bbdb5e045f" }, "downloads": -1, "filename": "scrapy-magicfields-1.1.0.tar.gz", "has_sig": false, "md5_digest": "b14513d5a51439972d13859cc65fa990", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3938, "upload_time": "2016-06-30T08:37:58", "url": "https://files.pythonhosted.org/packages/9a/55/78044b09b40eb909e0e3922bb4fd30914a1a3f634055ec9850d1880c113f/scrapy-magicfields-1.1.0.tar.gz" } ] }