{ "info": { "author": "Kenneth Hau", "author_email": "kitchun0402@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# WebScraping_Instagram_igpicker \"latest\n\n\n**Date for record: 6th October 2019 by Kenneth Hau**\n\n**Renamed the package from igenemy to igpicker**\n\n**Updated: Hashtag combination, Starting post**\n## About\nThis is used to scrape images / videos from Instagam by using chrome driver.\n\nBy setting those parameters, you can easily scrape either images or videos or both as well as select your designated path to save them.\n\nIt will automatically create folders in a location where you stated in 'save_to_path'. Those folders are named by each username and hashtag.\n\nBefore scraping, you will be informed to login your IG account in order to smoothen the scraping process. Don't worry, it wouldn't store your username and password.\n## Please follow the instructions below to install chrome driver on Colab\n```\n!apt install chromium-chromedriver\n!cp /usr/lib/chromium-browser/chromedriver /usr/bin\n!apt-get update\nset chromedriver_path = 'chromedriver'\nset chrome_headless = True\n```\n## Install\n```\npip install igpicker\n```\n## Upgrade\n```\npip install igpicker --upgrade\n```\n## Library used\nselenium, bs4, time, getpass, IPython, urllib, os, re, tqdm, wget, ssl\n## Reminder\nSometimes it may not run properly after an intensive scraping. Please wait for a while and start your scraping journey again.\n## Limitations\n- Only allows scraping either by 'username' or 'hashtag' at the same time (but you can easily change 'target_is_hashtag' parameter after finishing your first scraping)\n- Only allows chromedriver\n- Only allows to set the total number of posts you want\n## Possible function that can be created in the future\nStore each post's information (e.g. like, post time, post location, post description, users' number of followers, etc.) into dataframes, or even consolidate them into databases. Therefore, they can be used to do descriptive analysis, train up machine learning models or build up a recommendation system. \n## Parameters & Attributes\n**(1) target** : A list of string(s), default: []\n - either target username(s) or hashtag(s), if they are hashtags, 'target_is_hashtag' must be set to True\n\n**(2) target_is_hashtag** : Boolean, default: False\n - True: you want to scrape by using hashtags\n\n**(3) chromedriver_path** : String, default: './chromedriver'\n - a path of your chrome driver, you should name your driver as 'chromedrive'\n\n**(4) save_to_path** : String, default: '.'\n - a path where the image(s) / video(s) will be saved into\n\n**(5) chromedriver_autoquit** : Boolean, default: True\n - True: automatically quit the driver after finishing the scraping\n - if you don't want it, you can quit the driver manually by using a build-in function called 'close_driver'\n\n**(6) chrome_headless** : Boolean, default: True\n - True: run chrome driver in the backend\n - if you want to see how the chrome driver works, you can set it to False\n\n**(7) save_img** : Boolean, default: True\n - True: save images\n\n**(8) save_video** : Boolean, default: False\n - True: save videos \n\n**(9) enable_gpu** : Boolean, default: False\n - True: enable gpu in chrome driver\n\n**(10) ipython_display_image** : Boolean, default: False\n - True: display images, only works in notebook but not terminal\n - if you set True when using terminal to display, it will fail to scrape images\n## Methods\n**(1) login** : no parameter is required, return chrome_driver\n - used to access Instagram\n\n**(2) scraper** : two parameters (chrome_driver, num_post), return a list of all targeted url\n\n```\n(a) chrome_driver : Selenium Webdriver\n - used for web scraping\n\n(b) num_post : int, default: 10\n - the total number of posts you want to scrape\n - if this number is beyond the actual number of posts, it will stop scraping automatically\n\n(c) start_from: unsigned int, default: 1\n - the start post of scraping\n\n(d) hashtag_combination: list, default: []\n - only scrap posts matching all designated hashtags\n```\n\n**(3) close_driver** : one parameters (chrome_driver)\n - manually close the web driver\n## Import Library\n```\nfrom igpicker import IGpicker\n```\n## Example 1 (Normal flow):\n```\nigpicker = IGpicker(target = ['hkfoodtalk', 'sportscenter'], target_is_hashtag = False, chromedriver_path= './chromedriver',\n save_to_path = './', chromedriver_autoquit = False,\n chrome_headless= True, save_img=True, save_video=False, enable_gpu = False, \n ipython_display_image = True)\n\nchrome_driver = igpicker.login()\n\nall_target = igpicker.scraper(chrome_driver = chrome_driver, num_post = 10, start_from = 1)\n\nigpicker.close_driver(chrome_driver) #manually close if 'chromedriver_autoquit' is False\n```\n## Example 2 (Hashtag Combination):\n```\nigpicker = IGpicker(target = ['pizza'], target_is_hashtag = True, chromedriver_path= 'chromedriver',\n save_to_path = './', chromedriver_autoquit = False,\n chrome_headless= True, save_img=True, save_video=True, enable_gpu = False, \n ipython_display_image = False)\n\nchrome_driver = igpicker.login()\n\nall_target = igpicker.scraper(chrome_driver = chrome_driver, num_post = 10, start_from = 10, hashtag_combination= ['cheesy', 'cheese'])\n```\n## Example 3 (Change attributes):\n```\nigpicker.save_to_path = '../' #change path\n\nigpicker.target = ['burger', 'hkmusic','pasta'] #change target\n\nigpicker.target_is_hashtag = True #is it \"hashtag\" page?\n\nigpicker.save_video = True\n\nigpicker.save_img = True\n\nall_target = igpicker.scraper(chrome_driver = chrome_driver, num_post = 10) \n```\n**You can run 'scraper' again after you change the parameters if you haven't closed the chrome driver.**\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/kitchun0402/WebScraping_Instagram_igpicker", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "igpicker", "package_url": "https://pypi.org/project/igpicker/", "platform": "", "project_url": "https://pypi.org/project/igpicker/", "project_urls": { "Homepage": "https://github.com/kitchun0402/WebScraping_Instagram_igpicker" }, "release_url": "https://pypi.org/project/igpicker/0.4.4/", "requires_dist": [ "selenium (>=3.141.0)", "bs4 (>=0.0.1)", "ipython (>=7.8.0)", "wget (>=3.2)", "tqdm (>=4.36.1)" ], "requires_python": ">= 3", "summary": "WebScraping_Instagram", "version": "0.4.4" }, "last_serial": 5960607, "releases": { "0.4.0": [ { "comment_text": "", "digests": { "md5": "2b4635715ebb5e70e9660b3d7617afcd", "sha256": "a5cb3fca4b5de64b51f9f7162c13908e24fb0b12ef2c147b661f1b67d5889e48" }, "downloads": -1, "filename": "igpicker-0.4.0-py3-none-any.whl", "has_sig": false, "md5_digest": "2b4635715ebb5e70e9660b3d7617afcd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">= 3", "size": 10647, "upload_time": "2019-10-06T14:47:52", "url": "https://files.pythonhosted.org/packages/fa/14/c0a5c84ebc88f3415bf3c1932a1414bcfe439200275731d1e7412950fda6/igpicker-0.4.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b5c7345a485f317268adaee3f62211eb", "sha256": "56777a302d49aaa08972da284ea02888b52e34d4806e932ff3b947ac7369b081" }, "downloads": -1, "filename": "igpicker-0.4.0.tar.gz", "has_sig": false, "md5_digest": "b5c7345a485f317268adaee3f62211eb", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 9008, "upload_time": "2019-10-06T14:47:54", "url": "https://files.pythonhosted.org/packages/50/2c/e9b3cde03d143f463930b3e2d05a8572a73d2f029d0cdb087b53e405289b/igpicker-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "6c6aed73419837254fc5536cf228d1cd", "sha256": "f82598d10776961a359575d6f11f0954d536ed0e1b1af93549b7969a2cf1884e" }, "downloads": -1, "filename": "igpicker-0.4.1-py3-none-any.whl", "has_sig": false, "md5_digest": "6c6aed73419837254fc5536cf228d1cd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">= 3", "size": 10655, "upload_time": "2019-10-06T15:58:34", "url": "https://files.pythonhosted.org/packages/23/57/1800c04e9b67a386bea710d83e6e039d7ddfa8d3734fb4d7b1a5a234ca1b/igpicker-0.4.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "966108eeb81b852834cd59c77be04e85", "sha256": "26fbd586a0c878558243dfb687b725c65b7c9dd41dc9424554ecf3fdabe01ba8" }, "downloads": -1, "filename": "igpicker-0.4.1.tar.gz", "has_sig": false, "md5_digest": "966108eeb81b852834cd59c77be04e85", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 9019, "upload_time": "2019-10-06T15:58:37", "url": "https://files.pythonhosted.org/packages/71/ce/604a399c8944e3043383a611948b0af4f53add95a5d8872e3349c7d5feee/igpicker-0.4.1.tar.gz" } ], "0.4.2": [ { "comment_text": "", "digests": { "md5": "2c65e5f51459086218f0ac8fbcbfb86b", "sha256": "921006d40b7071539a66f890d21fb3b6e633ac4fb50d253fdf0977216e2ce45f" }, "downloads": -1, "filename": "igpicker-0.4.2-py3-none-any.whl", "has_sig": false, "md5_digest": "2c65e5f51459086218f0ac8fbcbfb86b", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">= 3", "size": 10744, "upload_time": "2019-10-06T17:05:16", "url": "https://files.pythonhosted.org/packages/09/08/697a79a7c3f8cf4f2ed8a343678d12eb627aa156a745ffd3b727ca529025/igpicker-0.4.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "24c62c216c5784bf650b2905279ffef0", "sha256": "4c1373b142d2d498e7d6dbd6a7b9f1759c5b3c5a99cfc08731b563011e8304a0" }, "downloads": -1, "filename": "igpicker-0.4.2.tar.gz", "has_sig": false, "md5_digest": "24c62c216c5784bf650b2905279ffef0", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 9195, "upload_time": "2019-10-06T17:05:19", "url": "https://files.pythonhosted.org/packages/ca/ee/cab2b07895d7f0b629a0a187b83e9c773b2c153ddc3123c2cd16bca29cf2/igpicker-0.4.2.tar.gz" } ], "0.4.3": [ { "comment_text": "", "digests": { "md5": "ff4399a135a5b2b479e277d8503377d6", "sha256": "e83f67db488dc4bdd67623b344bd2c2b70757e7d6880044afcc10afaaccac721" }, "downloads": -1, "filename": "igpicker-0.4.3-py3-none-any.whl", "has_sig": false, "md5_digest": "ff4399a135a5b2b479e277d8503377d6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">= 3", "size": 10869, "upload_time": "2019-10-07T16:11:43", "url": "https://files.pythonhosted.org/packages/c5/ee/85a72451cebc258db110e209f9f6b59702ba2b320e96e4ee0a4e208a4d02/igpicker-0.4.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a5d565625003506e65b6606bf4ee6502", "sha256": "fa760e735a4c34809d4fe43eb7a98726dfb009b8af560d1998f745df820d9d2a" }, "downloads": -1, "filename": "igpicker-0.4.3.tar.gz", "has_sig": false, "md5_digest": "a5d565625003506e65b6606bf4ee6502", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 9405, "upload_time": "2019-10-07T16:11:49", "url": "https://files.pythonhosted.org/packages/f7/60/5ffc5326a7101977f13471e67b93f46e5c4e77247daf0ae574f4bd0bb6ae/igpicker-0.4.3.tar.gz" } ], "0.4.4": [ { "comment_text": "", "digests": { "md5": "e3658d9ea6dbfd57c881507811d55e6d", "sha256": "ff35e232d53f30a58cb21072e854c4be1d20e127c85bd7cfa1a532b2ea6a50d4" }, "downloads": -1, "filename": "igpicker-0.4.4-py3-none-any.whl", "has_sig": false, "md5_digest": "e3658d9ea6dbfd57c881507811d55e6d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">= 3", "size": 10865, "upload_time": "2019-10-11T14:46:13", "url": "https://files.pythonhosted.org/packages/9d/3a/40b2f52e9e6e80776ea489cef71fc1a531601dc5c6ed7e43701a02507e32/igpicker-0.4.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "24b10deaa5595b2e8d6eff29cea5c6b7", "sha256": "cf724c9501b5fbb1d9556f525d21b87e38252479ab2bd611550bcaf9575ba526" }, "downloads": -1, "filename": "igpicker-0.4.4.tar.gz", "has_sig": false, "md5_digest": "24b10deaa5595b2e8d6eff29cea5c6b7", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 9406, "upload_time": "2019-10-11T14:46:19", "url": "https://files.pythonhosted.org/packages/fe/1c/69c6616362dad7b03d9f47b652c19b16d1bf32576d4ea753dd5538b63d26/igpicker-0.4.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e3658d9ea6dbfd57c881507811d55e6d", "sha256": "ff35e232d53f30a58cb21072e854c4be1d20e127c85bd7cfa1a532b2ea6a50d4" }, "downloads": -1, "filename": "igpicker-0.4.4-py3-none-any.whl", "has_sig": false, "md5_digest": "e3658d9ea6dbfd57c881507811d55e6d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">= 3", "size": 10865, "upload_time": "2019-10-11T14:46:13", "url": "https://files.pythonhosted.org/packages/9d/3a/40b2f52e9e6e80776ea489cef71fc1a531601dc5c6ed7e43701a02507e32/igpicker-0.4.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "24b10deaa5595b2e8d6eff29cea5c6b7", "sha256": "cf724c9501b5fbb1d9556f525d21b87e38252479ab2bd611550bcaf9575ba526" }, "downloads": -1, "filename": "igpicker-0.4.4.tar.gz", "has_sig": false, "md5_digest": "24b10deaa5595b2e8d6eff29cea5c6b7", "packagetype": "sdist", "python_version": "source", "requires_python": ">= 3", "size": 9406, "upload_time": "2019-10-11T14:46:19", "url": "https://files.pythonhosted.org/packages/fe/1c/69c6616362dad7b03d9f47b652c19b16d1bf32576d4ea753dd5538b63d26/igpicker-0.4.4.tar.gz" } ] }