{ "info": { "author": "Grammy Jiang", "author_email": "grammy.jiang@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 2 - Pre-Alpha", "Environment :: Plugins", "Framework :: Scrapy", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Topic :: Internet :: WWW/HTTP", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "=======================\nScrapy-Pipeline-MongoDB\n=======================\n\n.. image:: https://img.shields.io/pypi/v/scrapy-pipeline-mongodb.svg\n :target: https://pypi.python.org/pypi/scrapy-pipeline-mongodb\n :alt: PyPI Version\n\n.. image:: https://img.shields.io/travis/grammy-jiang/scrapy-pipeline-mongodb/master.svg\n :target: http://travis-ci.org/grammy-jiang/scrapy-pipeline-mongodb\n :alt: Build Status\n\n.. image:: https://img.shields.io/badge/wheel-yes-brightgreen.svg\n :target: https://pypi.python.org/pypi/scrapy-pipeline-mongodb\n :alt: Wheel Status\n\n.. image:: https://img.shields.io/codecov/c/github/grammy-jiang/scrapy-pipeline-mongodb/master.svg\n :target: http://codecov.io/github/grammy-jiang/scrapy-pipeline-mongodb?branch=master\n :alt: Coverage report\n\nOverview\n========\n\nScrapy is a great framework for web crawling. This package provides two\npipelines of saving items into MongoDB in both async and sync ways for scrapy.\nAlso it provides a highly customized way to interact with MongoDB in both async\nand sync ways:\n\n* Save an item and get Object ID with this pipeline\n\n* Update an item and get Object ID with this pipeline\n\nRequirements\n============\n\n* Txmongo, a async MongoDB driver with Twisted\n\n* Tests on Python 3.5\n\n* Tests on Linux, but it's a pure python module, should work on other platforms\n with official python and Twisted supported\n\nInstallation\n============\n\nThe quick way::\n\n pip install scrapy-pipeline-mongodb\n\nOr put this middleware just beside the scrapy project.\n\nDocumentation\n=============\n\nSet Block Inspector in ``ITEMPIPELINES`` in ``settings.py``, for example::\n\n from txmongo.filter import ASCENDING\n from txmongo.filter import DESCENDING\n\n # -----------------------------------------------------------------------------\n # PIPELINE MONGODB ASYNC\n # -----------------------------------------------------------------------------\n\n ITEM_PIPELINES.update({\n 'scrapy_pipeline_mongodb.pipelines.mongodb_async.PipelineMongoDBAsync': 500,\n })\n\n MONGODB_USERNAME = 'user'\n MONGODB_PASSWORD = 'password'\n MONGODB_HOST = 'localhost'\n MONGODB_PORT = 27017\n MONGODB_DATABASE = 'test_mongodb_async_db'\n MONGODB_COLLECTION = 'test_mongodb_async_coll'\n\n # MONGODB_OPTIONS_ = 'MONGODB_OPTIONS_'\n\n MONGODB_INDEXES = [('field_0', ASCENDING, {'unique': True}),\n (('field_0', 'field_1'), ASCENDING),\n (('field_0', ASCENDING), ('field_0', DESCENDING))]\n\n MONGODB_PROCESS_ITEM = 'scrapy_pipeline_mongodb.utils.process_item.process_item'\n\n\nSettings Reference\n==================\n\nMONGODB_USERNAME\n----------------\n\nA string of the username of the database.\n\nMONGODB_PASSWORD\n----------------\n\nA string of the password of the database.\n\nMONGODB_HOST\n------------\n\nA string of the ip address or the domain of the database.\n\nMONGODB_PORT\n------------\n\nA int of the port of the database.\n\nMONGODB_DATABASE\n----------------\n\nA string of the name of the database.\n\nMONGODB_COLLECTION\n------------------\n\nA list of the indexes to create on the collection.\n\nMONGODB_OPTIONS_*\n-----------------\n\nOptions can be attached when the pipeline start to connect to MongoBD.\n\nIf any options are needed, the option can be set with the prefix\n``MONGODB_OPTIONS_``, the pipeline will parse it.\n\nFor example:\n\n+---------------+-------------------------------+\n| option name | in ``settings.py`` |\n+---------------+-------------------------------+\n| authMechanism | MONGODB_OPTIONS_authMechanism |\n+---------------+-------------------------------+\n\nFor more options, please refer to the page:\n\n`Connection String URI Format \u2014 MongoDB Manual 3.4`_\n\n.. _`Connection String URI Format \u2014 MongoDB Manual 3.4`: https://docs.mongodb.com/manual/reference/connection-string/#connections-standard-connection-string-format\n\nMONGODB_INDEXES\n---------------\n\nA list of the indexes defined in this setting will be created when the spider is\nopen.\n\nIf the index has already existed, the warning or error will be suspended.\n\nMONGODB_PROCESS_ITEM\n--------------------\n\nThis pipeline provides a setting to define the function ``process_item``, which\ncan help to customize the way to interact with MongoDB. With this package there\nis one default function provided: calling the method ``insert_one`` of\n``collection`` to save the item into MongoDB, then return the item.\n\nIf a customize method is provided to replace the default one, please note the\nbehavior should follow the requirement which is clearly written in the scrapy\ndocuments:\n\n`Item Pipeline \u2014 Scrapy 1.4.0 documentation`_\n\n.. _`Item Pipeline \u2014 Scrapy 1.4.0 documentation`: https://doc.scrapy.org/en/latest/topics/item-pipeline.html#writing-your-own-item-pipelin\n\nBuilt-in Functions For Processing Item\n======================================\n\nscrapy_pipeline_mongodb.utils.process_item.process_item\n-------------------------------------------------------\n\nThis is a built-in function to call the method ``insert_one`` of ``collection``,\nand return the item.\n\nTo use this function, in ``settings.py``::\n\n MONGODB_PROCESS_ITEM = 'scrapy_pipeline_mongodb.utils.process_item.process_item'\n\nNOTE\n====\n\nThe database drivers may have different api for the same operation, this\npipeline adopts txmongo as the async driver for MongoDB. Please read the\nrelative documents to make sure the customized method can run fluently in this\npipeline.\n\n* `Welcome to TxMongo\u2019s documentation!`_\n\n.. _`Welcome to TxMongo\u2019s documentation!`: https://txmongo.readthedocs.io/en/latest/\n\n* `pymongo \u2013 Python driver for MongoDB`_\n\n.. _`pymongo \u2013 Python driver for MongoDB`: http://api.mongodb.com/python/current/api/pymongo/\n\nTODO\n====\n* Add a unit test for the index created function\n\n* Add a sync pipeline\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/grammy-jiang/scrapy-pipeline-mongodb", "keywords": "", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "scrapy-pipeline-mongodb", "package_url": "https://pypi.org/project/scrapy-pipeline-mongodb/", "platform": "", "project_url": "https://pypi.org/project/scrapy-pipeline-mongodb/", "project_urls": { "Homepage": "https://github.com/grammy-jiang/scrapy-pipeline-mongodb" }, "release_url": "https://pypi.org/project/scrapy-pipeline-mongodb/0.0.7/", "requires_dist": [ "txmongo", "scrapy (>=1.4.0)" ], "requires_python": "", "summary": "", "version": "0.0.7" }, "last_serial": 3438657, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "813752bb8c67cf0bd5ad8a29de027b5c", "sha256": "ebf5618507fb3f317280f1cd8432b49e4da1b5bc29138ddd4619ff09f4170f90" }, "downloads": -1, "filename": "scrapy_pipeline_mongodb-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "813752bb8c67cf0bd5ad8a29de027b5c", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11137, "upload_time": "2017-10-27T03:33:33", "url": "https://files.pythonhosted.org/packages/08/5a/dc90c44c87202b7ff1132da2e5a2f1b76ba3708b3f1b808f43560d71db49/scrapy_pipeline_mongodb-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b57e267b30a484ca3622c77e25750d41", "sha256": "0e97fe3a23628eb914581df17e778580d50b4c5e9de0ab123deae6e868bb25ba" }, "downloads": -1, "filename": "scrapy-pipeline-mongodb-0.0.1.tar.gz", "has_sig": false, "md5_digest": "b57e267b30a484ca3622c77e25750d41", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23899, "upload_time": "2017-10-27T03:33:49", "url": "https://files.pythonhosted.org/packages/07/07/c5a44c7b4f54ad014abb7a3025d0271fd257c6883b90063f9a03532d63c5/scrapy-pipeline-mongodb-0.0.1.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "5407251cf5e355224bb067adc2c0faf8", "sha256": "27b5eb49c63e92c86a2bda92206f04c9a4bfa2d48419dd2a997549457dd68c8f" }, "downloads": -1, "filename": "scrapy_pipeline_mongodb-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "5407251cf5e355224bb067adc2c0faf8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11187, "upload_time": "2017-10-27T09:15:52", "url": "https://files.pythonhosted.org/packages/e6/3a/95a64731325497c4c8bc4209a3092de62335c018941ab3b0c965026c99d5/scrapy_pipeline_mongodb-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0c2d2bacfdc494ced8b0d36dba985a63", "sha256": "73856e2c245133a7e2640842b2edc3cdc55a4e1465cd40bc8f06170befec46a4" }, "downloads": -1, "filename": "scrapy-pipeline-mongodb-0.0.3.tar.gz", "has_sig": false, "md5_digest": "0c2d2bacfdc494ced8b0d36dba985a63", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23947, "upload_time": "2017-10-27T09:15:55", "url": "https://files.pythonhosted.org/packages/91/e2/3951970b7c5cf7ed7d36f88e623d0c0982f6327e78e4c699a1fe60730cb4/scrapy-pipeline-mongodb-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "7575ac4e7c1975463be56a29bb89fcb7", "sha256": "82ec76e17e638e2eb80a99a8fe153fef22271340f54287ffb8985eff8837f163" }, "downloads": -1, "filename": "scrapy_pipeline_mongodb-0.0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "7575ac4e7c1975463be56a29bb89fcb7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 11180, "upload_time": "2017-10-30T03:18:57", "url": "https://files.pythonhosted.org/packages/b4/5d/959f3b59c77c408ccbfdf8c2b1d234130f561523256af3b056a69c3ce859/scrapy_pipeline_mongodb-0.0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "517ff1a404672ff441b8b77a4dbcb9cf", "sha256": "14e273a7af183ce47afca2fa26c07216b271e25feedddc6f85ff3884752b713f" }, "downloads": -1, "filename": "scrapy-pipeline-mongodb-0.0.4.tar.gz", "has_sig": false, "md5_digest": "517ff1a404672ff441b8b77a4dbcb9cf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23949, "upload_time": "2017-10-30T03:18:58", "url": "https://files.pythonhosted.org/packages/5f/a1/b1a09496f612270adf3121e6ddcc98b95f68f56aa36d2a0744edb1e70893/scrapy-pipeline-mongodb-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "258d9f64bad90d1ca65dbf94d1057810", "sha256": "8cfbfe5add63ba47a9cee476a300eec1485a62d8e92e9440210a4f0ca33e8c13" }, "downloads": -1, "filename": "scrapy_pipeline_mongodb-0.0.5-py3-none-any.whl", "has_sig": false, "md5_digest": "258d9f64bad90d1ca65dbf94d1057810", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10997, "upload_time": "2017-11-01T01:49:07", "url": "https://files.pythonhosted.org/packages/5c/69/0b1304986389359ed0024ee7def1228d98dfd3ff659e72946721dbabf378/scrapy_pipeline_mongodb-0.0.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3efac12fe06b73c52054205d228c1856", "sha256": "38b61cfe71e740a3197302c279856d9cee395899ef0b7bccfb4994069cbdbc49" }, "downloads": -1, "filename": "scrapy-pipeline-mongodb-0.0.5.tar.gz", "has_sig": false, "md5_digest": "3efac12fe06b73c52054205d228c1856", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23959, "upload_time": "2017-11-01T01:49:08", "url": "https://files.pythonhosted.org/packages/c8/85/be15c0f56ec41378519c4d91f5187c322948ece0b9ad750747428d7a425f/scrapy-pipeline-mongodb-0.0.5.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "20cb6519b076c1374ff553dc24568adb", "sha256": "361adf1239091aed439ac9064ee2443245e3b03213ad07d4509b443d20453581" }, "downloads": -1, "filename": "scrapy_pipeline_mongodb-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "20cb6519b076c1374ff553dc24568adb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10990, "upload_time": "2017-12-23T06:54:39", "url": "https://files.pythonhosted.org/packages/8f/0d/042688838e70247e4fb582b2306b9e08fcc558b329a430780a76d86be55f/scrapy_pipeline_mongodb-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "59209415f93de793b63f1854cdc1c17a", "sha256": "e8bce330aa6cf94d13d1077c56bdb1961b74e60f83789d62ecb32ef1720af444" }, "downloads": -1, "filename": "scrapy-pipeline-mongodb-0.0.7.tar.gz", "has_sig": false, "md5_digest": "59209415f93de793b63f1854cdc1c17a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23957, "upload_time": "2017-12-23T06:54:40", "url": "https://files.pythonhosted.org/packages/e2/e8/b878bb56b3e462d0cd08b238e7907343db6ddee7174ac930fd556f96a0f3/scrapy-pipeline-mongodb-0.0.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "20cb6519b076c1374ff553dc24568adb", "sha256": "361adf1239091aed439ac9064ee2443245e3b03213ad07d4509b443d20453581" }, "downloads": -1, "filename": "scrapy_pipeline_mongodb-0.0.7-py3-none-any.whl", "has_sig": false, "md5_digest": "20cb6519b076c1374ff553dc24568adb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10990, "upload_time": "2017-12-23T06:54:39", "url": "https://files.pythonhosted.org/packages/8f/0d/042688838e70247e4fb582b2306b9e08fcc558b329a430780a76d86be55f/scrapy_pipeline_mongodb-0.0.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "59209415f93de793b63f1854cdc1c17a", "sha256": "e8bce330aa6cf94d13d1077c56bdb1961b74e60f83789d62ecb32ef1720af444" }, "downloads": -1, "filename": "scrapy-pipeline-mongodb-0.0.7.tar.gz", "has_sig": false, "md5_digest": "59209415f93de793b63f1854cdc1c17a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 23957, "upload_time": "2017-12-23T06:54:40", "url": "https://files.pythonhosted.org/packages/e2/e8/b878bb56b3e462d0cd08b238e7907343db6ddee7174ac930fd556f96a0f3/scrapy-pipeline-mongodb-0.0.7.tar.gz" } ] }