{ "info": { "author": "Hrissimir", "author_email": "hrisimir.dakov@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Programming Language :: Python" ], "description": "===========\nscrape_jobs\n===========\n\n\nCLI jobs scraper targeting multiple sites\n\n\nDescription\n===========\n\n\nCurrently supported sites\n-------------------------\n\n- linkedin.com\n- seek.com.au\n\n\nInstallation\n------------\n\n`pip install -U scrape_jobs`\n\n\nShort instructions:\n-------------------\n\n- Ensure your machine has available chromedriver in path\n- Prepare upload spreadsheet (detailed instructions bellow)\n- Open cmd / terminal\n- Install / Update from pip with `pip install -U scrape_jobs`\n- Call `scrape-jobs-init-config` to init sample .ini file in the current dir\n- Edit the config file and save\n- Call `scrape-jobs SITE scrape-jobs.ini` and let it roll\n- Check your spreadsheet after execution completes\n\n\nLong Instructions:\n------------------\n\n- Prepare the spreadsheet and the spreadsheet's secrets .json (instructions at the bottom)\n\n - In seek.com.au results sheet, set the first row values (the header) to:\n\n 'date', 'location', 'title', 'company', 'classification', 'url', 'is_featured', 'salary'\n\n- Open a cmd/terminal\n\n- Navigate to some work folder of your choice (e.g \"c:\\job_scrape\", referred to later as CWD)\n\n- Init empty config file by calling `scrape-jobs-init-config`\n\n- Edit the newly created `CWD\\\\scrape-jobs.ini` with params of your choice\n\n - set the path to the CREDENTIALS SECRETS .JSON\n\n - set the proper spreadsheet name\n\n - set the index of the worksheet where you want the results to be uploaded (0-based index)\n\n- To trigger a seek.com.au scrape (currently this is the only supported site):\n\n - run `scrape-jobs seek.com.au scrape-jobs.ini`\n\n - you will see output in the console, but a scrape-jobs.log will be created too\n\n - to have more detailed output call `scrape-jobs -vv seek.com.au 
scrape-jobs.ini` instead\n\n- After the scrape completes, you should see the newly discovered jobs in your spreadsheet\n\n- Alternatively, you can init a config at a known place and just pass its path:\n\n `scrape-jobs seek.com.au abs\path\to\your_config.ini`\n\n\nNote\n====\n\nYou need to prepare a secrets .json file in advance; it is used for authentication with Google Sheets\n\nThe term 'Spreadsheet' refers to a single document that is shown on the Google Sheets landing page\n\nA single 'Spreadsheet' can contain one or more 'worksheets'\n\nUsually a newly created 'Spreadsheet' contains a single 'worksheet' named 'Sheet1'\n\n\nInstructions for preparing a shared Google Spreadsheet CREDENTIALS SECRETS .JSON:\n---------------------------------------------------------------------------------\n\n 1. Go to https://console.developers.google.com/\n\n 2. Log in with the Google account that is to own the 'Spreadsheet'.\n\n 3. At the top-left corner, there is a drop-down right next to the \"Google APIs\" text\n\n 4. Click the drop-down and a modal dialog will appear, then click \"NEW PROJECT\" at its top-right\n\n 5. Name the project according to how the sheet will be used, don't select 'Location*', just press 'CREATE'\n\n 6. Open the newly created project from the same drop-down as in step 3.\n\n 7. There should be an 'APIs' area with a \"-> Go to APIs overview\" link at its bottom - click it\n\n 8. A new page will load with an '+ ENABLE APIS AND SERVICES' button at the top center - click it\n\n 9. A new page will load with a 'Search for APIs & Services' input - use it to find and open 'Google Drive API'\n\n 10. In the 'Google Drive API' page click \"ENABLE\" - you'll be redirected back to the project's page\n\n 11. There will be a new 'CREATE CREDENTIALS' button at the top - click it\n\n 12. Set up the new credentials as follows:\n\n - Which API are you using? -> 'Google Drive API'\n\n - Where will you be calling the API from? -> 'Web server (e.g. 
node.js, Tomcat)'\n\n - What data will you be accessing? -> 'Application data'\n\n - Are you planning to use this API with App Engine or Compute Engine? -> No, I'm not using them.\n\n 13. Click the blue 'What credentials do I need?' button - it will take you to the 'Add credentials to your project' page\n\n 14. Set up the credentials as follows:\n\n - Service account name: {whatever name you type is OK, as long as the input accepts it}\n\n - Role: Project->Editor\n\n - Key type: JSON\n\n 15. Press the blue 'Continue' button, and a download of the CREDENTIALS SECRETS .JSON file will begin (store it safely)\n\n 16. Close the modal and go back to the project 'Dashboard' using the left-side navigation panel\n\n 17. Repeat step 8.\n\n 18. Search for 'Google Sheets API', then open the result and click the blue 'ENABLE' button\n\n 19. Open the downloaded secrets .json file and copy the value of 'client_email'\n\n 20. Using the same Google account as in step 2, go to the regular Google Sheets and create & open the 'Spreadsheet'\n\n - give the spreadsheet its final name now, to avoid config issues later\n\n 21. 
'Share' the document with the email copied in step 19, giving it 'Edit' permissions\n\n - you might want to un-tick 'Notify people' before clicking 'Send', as it's a service email you're sharing with\n\n - 'Send' will change to 'OK' when un-ticked - just click it.\n\n You are now ready to retrieve the 'Spreadsheet' handle in code!\n\n\n", "description_content_type": "text/x-rst; charset=UTF-8", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Hrissimir/scrape_jobs", "keywords": "", "license": "mit", "maintainer": "", "maintainer_email": "", "name": "scrape-jobs", "package_url": "https://pypi.org/project/scrape-jobs/", "platform": "any", "project_url": "https://pypi.org/project/scrape-jobs/", "project_urls": { "Documentation": "https://pyscaffold.org/", "Homepage": "https://github.com/Hrissimir/scrape_jobs" }, "release_url": "https://pypi.org/project/scrape-jobs/2.1.1/", "requires_dist": [ "hed-utils (==1.6.0)", "pytest ; extra == 'testing'", "pytest-cov ; extra == 'testing'", "flake8 ; extra == 'testing'" ], "requires_python": ">=3.6.5", "summary": "CLI jobs scraper targeting multiple sites", "version": "2.1.1" }, "last_serial": 5821823, "releases": { "2.1.1": [ { "comment_text": "", "digests": { "md5": "857c1ddba041b2bdaddaf0552b8fcd76", "sha256": "4881416894ee1eeb50b127daae26eab56a12d0a0244a5a881b5e542d766a9f1f" }, "downloads": -1, "filename": "scrape_jobs-2.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "857c1ddba041b2bdaddaf0552b8fcd76", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6.5", "size": 35260, "upload_time": "2019-09-12T18:09:22", "url": "https://files.pythonhosted.org/packages/20/50/3736ebc6689ef66f71649332cd80861d1cd96e81a3631f38281bb52df386/scrape_jobs-2.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "48239dbdbe618d1f7b3bc62eb530bbb6", 
"sha256": "fd81aec15b706163cda03c191a59dff8161b410bf931678d20d17183dd533323" }, "downloads": -1, "filename": "scrape_jobs-2.1.1.tar.gz", "has_sig": false, "md5_digest": "48239dbdbe618d1f7b3bc62eb530bbb6", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6.5", "size": 1258696, "upload_time": "2019-09-12T18:09:25", "url": "https://files.pythonhosted.org/packages/43/83/d7c72578246bbfe1e89ac0dae7a319c2e45b3077c522dadcbff967b6d39b/scrape_jobs-2.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "857c1ddba041b2bdaddaf0552b8fcd76", "sha256": "4881416894ee1eeb50b127daae26eab56a12d0a0244a5a881b5e542d766a9f1f" }, "downloads": -1, "filename": "scrape_jobs-2.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "857c1ddba041b2bdaddaf0552b8fcd76", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6.5", "size": 35260, "upload_time": "2019-09-12T18:09:22", "url": "https://files.pythonhosted.org/packages/20/50/3736ebc6689ef66f71649332cd80861d1cd96e81a3631f38281bb52df386/scrape_jobs-2.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "48239dbdbe618d1f7b3bc62eb530bbb6", "sha256": "fd81aec15b706163cda03c191a59dff8161b410bf931678d20d17183dd533323" }, "downloads": -1, "filename": "scrape_jobs-2.1.1.tar.gz", "has_sig": false, "md5_digest": "48239dbdbe618d1f7b3bc62eb530bbb6", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6.5", "size": 1258696, "upload_time": "2019-09-12T18:09:25", "url": "https://files.pythonhosted.org/packages/43/83/d7c72578246bbfe1e89ac0dae7a319c2e45b3077c522dadcbff967b6d39b/scrape_jobs-2.1.1.tar.gz" } ] }