{ "info": { "author": "Junpei Kawamoto", "author_email": "kawamoto.junpei@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Natural Language :: English", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.6", "Topic :: Scientific/Engineering :: Information Analysis", "Topic :: Software Development :: Libraries" ], "description": "Amazon Dataset Loader\n=====================\n\n|GPLv3| |Build Status| |Release| |PyPi|\n\n|Logo|\n\nFor the `Review Graph Mining project `__,\nthis package provides a loader of the `Six Categories of Amazon Product\nReviews dataset `__ provided by\n`Dr. Wang `__.\n\nInstallation\n------------\n\nUse pip to install this package.\n\n.. code:: shell\n\n $ pip install --upgrade rgmining-amazon-dataset\n\nNote that this installation will download a big data file from the\n`original web site `__.\n\nUsage\n-----\n\nThis package provides module ``amazon`` and this module provides\nfunction ``load``. The ``load`` function takes a graph object which\nimplements the `graph\ninterface `__\ndefined in `Review Graph Mining\nproject `__. The funciton ``load`` also\ntakes an optional argument, a list of categories. If this argument is\ngiven, only reviews for products which belong to the given categories\nwill be loaded.\n\nFor example, the following code constructs a graph object provides the\n`FRAUDAR `__\nalgorithm, loads the Amazon dataset, runs the algorithm, and then\noutputs names of anomalous reviewers. Since this dataset consists of\nhuge reviews, loading may take long time.\n\n.. code:: py\n\n import fraudar\n import amazon\n\n # Construct a graph and load the dataset.\n graph = fraudar.ReviewGraph()\n amazon.load(graph)\n\n # Run the analyzing algorithm.\n graph.update()\n\n # Print names of reviewers who are judged as anomalous.\n for r in graph.reviewers:\n if r.anomalous_score == 1:\n print r.name\n\n # The number of reviewers the dataset has: -> 634295.\n len(graph.reviewers)\n\n # The number of reviewers judged as anomalous: -> 91.\n len([r for r in graph.reviewers if r.anomalous_score == 1])\n\nNote that you may need to install the FRAUDAR algorithm for the Review\nMining Project by ``pip install rgmining-fraudar``.\n\nLicense\n-------\n\nThis software is released under The GNU General Public License Version\n3, see\n`COPYING `__ for\nmore detail.\n\nThe authors of the Trip Advisor dataset, which this software imports,\nrequires to cite the following papers when you publish research papers\nusing this package:\n\n- `Hongning Wang `__, `Yue\n Lu `__, and `ChengXiang\n Zhai `__, \"`Latent Aspect Rating\n Analysis without Aspect Keyword\n Supervision `__,\"\n In Proc. of the 17th ACM SIGKDD Conference on Knowledge Discovery and\n Data Mining (KDD'2011), pp.618-626, 2011;\n- `Hongning Wang `__, `Yue\n Lu `__, and `Chengxiang\n Zhai `__, \"`Latent Aspect Rating\n Analysis on Review Text Data: A Rating Regression\n Approach `__,\"\n In Proc. of the 16th ACM SIGKDD Conference on Knowledge Discovery and\n Data Mining (KDD'2010), pp.783-792, 2010.\n\n.. |GPLv3| image:: https://img.shields.io/badge/license-GPLv3-blue.svg\n :target: https://www.gnu.org/copyleft/gpl.html\n.. |Build Status| image:: https://travis-ci.org/rgmining/amazon.svg?branch=master\n :target: https://travis-ci.org/rgmining/amazon\n.. |Release| image:: https://img.shields.io/badge/release-0.5.1-brightgreen.svg\n :target: https://github.com/rgmining/amazon/releases/tag/v0.5.1\n.. |PyPi| image:: https://img.shields.io/badge/pypi-0.5.1-brightgreen.svg\n :target: https://pypi.python.org/pypi/rgmining-tripadvisor-dataset\n.. |Logo| image:: https://rgmining.github.io/amazon/_static/image.png\n :target: https://rgmining.github.io/amazon/", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/rgmining/amazon", "keywords": "", "license": "GPLv3", "maintainer": "", "maintainer_email": "", "name": "rgmining-amazon-dataset", "package_url": "https://pypi.org/project/rgmining-amazon-dataset/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/rgmining-amazon-dataset/", "project_urls": { "Homepage": "https://github.com/rgmining/amazon" }, "release_url": "https://pypi.org/project/rgmining-amazon-dataset/0.5.1/", "requires_dist": null, "requires_python": "", "summary": "An Amazon dataset for Review Graph Mining Project", "version": "0.5.1" }, "last_serial": 2975012, "releases": { "0.5.0": [ { "comment_text": "", "digests": { "md5": "8bd65dbec2a2a6f3055d376837a5329d", "sha256": "e3265caee889aebddc23fac2c1d69851a6834bfcd079732d860423d42aca88e9" }, "downloads": -1, "filename": "rgmining-amazon-dataset-0.5.0.tar.gz", "has_sig": false, "md5_digest": "8bd65dbec2a2a6f3055d376837a5329d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17428, "upload_time": "2017-01-22T07:50:53", "url": "https://files.pythonhosted.org/packages/b8/3e/de261242725fc17d2eae29126cd83e537387769357f757bae37f1a011db6/rgmining-amazon-dataset-0.5.0.tar.gz" } ], "0.5.0.post1": [ { "comment_text": "", "digests": { "md5": "69f287810705eb8083280900a6d49531", "sha256": "90f5584f922438a128b4301d16ff2c550420fd1248d5e5afe10c04616f158b82" }, "downloads": -1, "filename": "rgmining-amazon-dataset-0.5.0.post1.tar.gz", "has_sig": false, "md5_digest": "69f287810705eb8083280900a6d49531", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 54885, "upload_time": "2017-01-26T19:10:00", "url": "https://files.pythonhosted.org/packages/dc/93/ae7358b073a44e030db4d42e02664869680a50187680b5dff3b9a0a90fbb/rgmining-amazon-dataset-0.5.0.post1.tar.gz" } ], "0.5.1": [ { "comment_text": "", "digests": { "md5": "1542d55f386fb4f165b9bd5e4b75577f", "sha256": "9d2fc5bf16816f4838cba437b4cae0c623472c3fab75aa09326463b15269647c" }, "downloads": -1, "filename": "rgmining-amazon-dataset-0.5.1.tar.gz", "has_sig": false, "md5_digest": "1542d55f386fb4f165b9bd5e4b75577f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 56171, "upload_time": "2017-06-24T04:24:41", "url": "https://files.pythonhosted.org/packages/02/6d/82602dcf7f0281385e89b2236f05c665fb4ab693124c582d35f02a3b0609/rgmining-amazon-dataset-0.5.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1542d55f386fb4f165b9bd5e4b75577f", "sha256": "9d2fc5bf16816f4838cba437b4cae0c623472c3fab75aa09326463b15269647c" }, "downloads": -1, "filename": "rgmining-amazon-dataset-0.5.1.tar.gz", "has_sig": false, "md5_digest": "1542d55f386fb4f165b9bd5e4b75577f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 56171, "upload_time": "2017-06-24T04:24:41", "url": "https://files.pythonhosted.org/packages/02/6d/82602dcf7f0281385e89b2236f05c665fb4ab693124c582d35f02a3b0609/rgmining-amazon-dataset-0.5.1.tar.gz" } ] }