{ "info": { "author": "Prayson Wilfred Daniel", "author_email": "praysonwilfred@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "# TrustPilotReader\nUnofficial TrustPilot reviews collector. For Academic Use Only. READ: [TrustPilot Terms of Use](https://legal.trustpilot.com/end-user-terms-and-conditions)\n\n# Disclamer:\nYou, and you alone, are responsible for following TrustPilot terms and using this tool to gather their data. Respect\ntheir servers and be thoughtful when gathering large amount of data. \n\n\n# Unmatured Documetation :)\n\nThis code implements basic data scraping of TrustPilot [default:Danish] Reviews .\n\nIt is a prototype to be used for academic reasons only.\nTrustPilot offers APIs to gather their data\n \n# Get it from PyPI\n```bash\npip install trustpilotreviews\n```\n\n\n# How to use it:\n\nImport package\n\n```python\nfrom trustpilotreviews import GetReviews\n```\n\n# 1. Initiat Class \n\nInitiate the class with either (a) passing a dictionary of companies as keys\nand companies TrustPilot id as items or (b) adding them with dictionary syntax.\n\ne.g.\n```python\n# way a: Using dictionary with business ids\nid_dict = {'Skat':'470bce96000064000501e32d','DR':'4690598c00006400050003ee'}\nd = GetReviews(id_dict)\n\n# ids dictionary can be loaded from text files e.g.\nlines = np.genfromtxt('data/business_ids.csv', delimiter=',',\n dtype=str,skip_header=1) #skipped header\ncsv_dict = {key:item for key, item in lines}\nd = GetReviews(csv_dict)\n\n# way b: Using dictionary assignment \nd = GetReviews()\nd['Skat'] = '470bce96000064000501e32d'\n```\n \nNo business ids, no problem:\n\n```python\nfrom trustpilotreviews import GetReviews\n\n# Initiate it. Language will be required\nt = GetReviews()\n\n# Pass in web-page address as it appears in trustpilot.com\nmate_id = t.get_id('www.mate.bike')\n\n# Check if everything is ok\nif mate_id.ok:\n print(mate.business_id)\n\n# Gather data from that id\ndata = t.get_reviews() \n \n ```\n \n Having multiple websites, well, no problem:\n \n ```python\nfrom trustpilotreviews import GetReviews\n\nt = GetReviews()\n\n# pass multiple web-pages as a list\nids = t.get_ids(['www.ford.dk','www.mate.bike'])\n\nprint(ids) # same as print(t) as ids are added to que\n\n# gather data for those ids \ndata = t.get_reviews() \n ```\n\nWant to save it on a database instead of Pandas, done:\n\n ```python\nfrom trustpilotreviews import GetReviews\n\nt = GetReviews()\nids = t.get_ids(['www.ford.dk','www.mate.bike'])\n\n# mine data for those ids \nt.get_reviews()\n\n# send them to in memory database\nt.send_db('../data/','reviews') \n ```\n \n \n## 2. Reading Data\n\n\n```python\ndf = pd.DataFrame(t.dictData)\n```\nor from stored source\n\n```python\ndf = pd.read_pickle('TrustPilotData.pkl')\n```\n# A full example:\n\n```python\nimport numpy as np\nfrom trustpilotreviews import GetReviews\n\n\n# Dictionary from Data \nlines = np.genfromtxt('companies_ids.csv', delimiter=',',\n dtype=str, skip_header=1)\ncsv_dict = {key: item for key, item in lines}\n\nd = GetReviews(csv_dict) # Select no for Norwegian Reviews\nd.gather_data()\n\n# Saves as pandas dataframe pickle\nd.save_data(file_name='NoTrustPilotData')\n```\n\n# TODOs:\n * Allow different saving formats e.g. df.to_XXX\n * Split page_review funciton into connection and data parsing (better way to handler bad requests)\n * Add more features\n * Write a better documetation\n\n## [TrustPilot Terms of Use](https://legal.trustpilot.com/end-user-terms-and-conditions)\n\n![c091684c-879c-4d6e-90c6-92fbc53cb676](https://user-images.githubusercontent.com/14926709/43354373-980e2882-924b-11e8-8b85-237f3e4b1dde.jpeg)\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Proteusiq/TrustPilotReader", "keywords": "", "license": "", "maintainer": "Prayson Wilfred Daniel", "maintainer_email": "praysonwilfred@gmail.com", "name": "trustpilotreviews", "package_url": "https://pypi.org/project/trustpilotreviews/", "platform": "", "project_url": "https://pypi.org/project/trustpilotreviews/", "project_urls": { "Homepage": "https://github.com/Proteusiq/TrustPilotReader", "Repository": "https://github.com/Proteusiq/TrustPilotReader" }, "release_url": "https://pypi.org/project/trustpilotreviews/0.1.2/", "requires_dist": [ "requests (>=2.20,<3.0)", "pandas (>=0.23.0,<0.24.0)", "numpy (>=1.15,<2.0)", "dataset (>=1.1,<2.0)", "stuf (>=0.9.16,<0.10.0)", "tqdm (>=4.32,<5.0)" ], "requires_python": ">=3.6,<4.0", "summary": "Unoffice TrustPilot API to download reviews scores and contents", "version": "0.1.2" }, "last_serial": 5405952, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "44cccc40b1e00407822964f93aaae3d1", "sha256": "bdf49e6e38e5d9dd2bf7ef1d05848a7ebec7bf6bc929a2ecac60bc4148133943" }, "downloads": -1, "filename": "trustpilotreviews-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "44cccc40b1e00407822964f93aaae3d1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 6973, "upload_time": "2019-06-15T18:01:38", "url": "https://files.pythonhosted.org/packages/73/f0/31e7367b1a549d6fb07b52cb4c8f94b208339529c9482f019eb46f359d52/trustpilotreviews-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "98006c91914850214e3e14d666e9753d", "sha256": "5bf7f5b28ebbf824601eb56d81b20df933fe81d3fda3fa2d1dd4db2ad08f8171" }, "downloads": -1, "filename": "trustpilotreviews-0.1.0.tar.gz", "has_sig": false, "md5_digest": "98006c91914850214e3e14d666e9753d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 5997, "upload_time": "2019-06-15T18:01:40", "url": "https://files.pythonhosted.org/packages/e0/63/8e8effcd9df481c0f79e8b1f510c4fd686238511c264aa153c89ef561022/trustpilotreviews-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "577d020e4230d78e3c68370b06bb7ac4", "sha256": "69e30fce8169bf92ef916bbdd1435409fb6c42f748f3187681cdcd5c2c78e576" }, "downloads": -1, "filename": "trustpilotreviews-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "577d020e4230d78e3c68370b06bb7ac4", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 8410, "upload_time": "2019-06-15T18:12:53", "url": "https://files.pythonhosted.org/packages/d9/80/66c8dbbef0662b64ddcee8ba8990806cddc57a7348c850d2981e8f3de9dd/trustpilotreviews-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "5e52acead624c57ef8286e169fd93c03", "sha256": "a5418b022ac9823ff5a2c21c86260fba62b780b2f5b30d710ba7cfbda58ebd3c" }, "downloads": -1, "filename": "trustpilotreviews-0.1.1.tar.gz", "has_sig": false, "md5_digest": "5e52acead624c57ef8286e169fd93c03", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 7785, "upload_time": "2019-06-15T18:12:54", "url": "https://files.pythonhosted.org/packages/79/be/53f22230c12007eb5ab7440ae75abf4b18c17a90cdd33715ba36c72029b8/trustpilotreviews-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "c1a851e117fa7e570a1231ef39771aad", "sha256": "8a831dfc334e1fd2af6473adff32278ddd46366341ce626b637fb665615c7ca6" }, "downloads": -1, "filename": "trustpilotreviews-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "c1a851e117fa7e570a1231ef39771aad", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 8556, "upload_time": "2019-06-16T08:36:01", "url": "https://files.pythonhosted.org/packages/23/bb/6d158741041a15b22f5ed0e5a4a5087eab23301801ede8852f22aae44615/trustpilotreviews-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1bfbd8de1611b5e5d256d0fdeec203ee", "sha256": "522449da496d0a556a35a10779581532b3abbada525746584be6db761a1bdaef" }, "downloads": -1, "filename": "trustpilotreviews-0.1.2.tar.gz", "has_sig": false, "md5_digest": "1bfbd8de1611b5e5d256d0fdeec203ee", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 8013, "upload_time": "2019-06-16T08:36:03", "url": "https://files.pythonhosted.org/packages/f4/e8/486ba44de22a8348939a387274da71d9067dfaf191499a0d40f83db6967e/trustpilotreviews-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c1a851e117fa7e570a1231ef39771aad", "sha256": "8a831dfc334e1fd2af6473adff32278ddd46366341ce626b637fb665615c7ca6" }, "downloads": -1, "filename": "trustpilotreviews-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "c1a851e117fa7e570a1231ef39771aad", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6,<4.0", "size": 8556, "upload_time": "2019-06-16T08:36:01", "url": "https://files.pythonhosted.org/packages/23/bb/6d158741041a15b22f5ed0e5a4a5087eab23301801ede8852f22aae44615/trustpilotreviews-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1bfbd8de1611b5e5d256d0fdeec203ee", "sha256": "522449da496d0a556a35a10779581532b3abbada525746584be6db761a1bdaef" }, "downloads": -1, "filename": "trustpilotreviews-0.1.2.tar.gz", "has_sig": false, "md5_digest": "1bfbd8de1611b5e5d256d0fdeec203ee", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6,<4.0", "size": 8013, "upload_time": "2019-06-16T08:36:03", "url": "https://files.pythonhosted.org/packages/f4/e8/486ba44de22a8348939a387274da71d9067dfaf191499a0d40f83db6967e/trustpilotreviews-0.1.2.tar.gz" } ] }