{ "info": { "author": "onpositive", "author_email": "vasily@onpositive.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Workspace Puller\n\n**A cli tool for building your python workspace and upload results to the Google Drive or Kaggle.com**\n\n## Description\n\nThis tools allows to: \n * download workspace files or archives and extract(bztar, gztar, tar, xztar, zip) it if needed.\n * clone git repos\n * execute a python files\n * upload result to the Google Drive\n * upload result to the Kaggle.com\n\n## Installation:\n\n`pip install workspace-puller`\n\n## Usage\n\n`workspace-puller --config_url {url to the config file} --bot_token {telegram bot token}`\n\n**config_url** parameter specifies the URL to the configuration file (Required). Enclose the value in quotes (Recomended).\n\n**bot_token** parameter sets the Telegram bot token (Optional). [See: Creating a new bot](https://core.telegram.org/bots#6-botfather)\n\n#### Authorization\n\nThe `workspace-puller` has two Google Drive authorization methods:\n\n* local CLI. Just add [Google Drive folders IDs](#Google-drive-uploading) in config. .\n\n* remote via telegram chat bot. You need to add [Google Drive folders IDs](#Google-drive-uploading) in config, [telegram chat IDs](#Telegram-channels) and specify CLI param `--bot_token`.\nThe Telegram bot will send a authorization link, the user must visit link and send the Google token in the response message.\n\nKaggle credentials should be set inside configuration file in the [Kaggle](#Kaggle-uploading) settings section.\n\nAfter successful authorization Workspace Puller saves credentials inside workspace.\n\n## Configuartion\n\nConfiguration file has a YAML format. The file must be accessible through HTTP. \n\n#### Dataset list\n\nList of files to download. \nBztar, gztar, tar, xztar, zip files are unzipped by default. If it is not necessary, specify the node `extract_archives:` with a value `false`. (Default value: true)\n\n*Example:*\n```\ndataset_list:\n - https://tlk.s3.yandex.net/dataset/TlkAgg2.zip\n - https://tlk.s3.yandex.net/dataset/TlkPersonaChatRus.zip\n - https://tlk.s3.yandex.net/dataset/TlkAgg5.zip\n - https://tlk.s3.yandex.net/dataset/LRWC.zip\n - https://tlk.s3.yandex.net/dataset/WordSenseRus.zip\nextract_archives: false \n```\n\n#### Git repos\n\nRepos list for cloning. The repository will clone into a folder with the name specified as the record key.\nTo add a repository, add a record key with a child node `url: {repo url}`. \n\nIf you need a specific branch, specify its name in the child node `branch:` (Optional).\n\n*Example:*\n\n```\nrepos:\n downloader: \n url: https://github.com/bantonj/downloader\n pyreadstat: \n url: https://github.com/Roche/pyreadstat\n branch: dev\n```\n\n#### Executing python code\n\nTo run the code, specify a list of files in the root node `script_file:` \n\n*Example:*\n```\nscript_file:\n - example_file.py\n```\n\n#### Output folders\n\nList of folders to upload.\n\n*Example:*\n```\noutput_folder:\n - ./results\n - ./example\n```\n\n#### Google drive uploading\n\nList of google drive folder IDs to load results into.\n\n*Example:*\n```\ngoogle_drive:\n - 1MOUoYor2N2TEbJFDKg-4HDKqF-a9AKRq\n - 1MmhIYor2N2CVRJFDKg-4QACKF-a9AUGq\n - 1ATAGart2N2TEbJFDKg-4QfACN-a9NRGq\n```\n\n#### Kaggle uploading\n\nKaggle new dataset settings\n\n- `username:` Kaggle user name (Required)\n- `key:` Kaggel api token (Required). See: https://github.com/Kaggle/kaggle-api#api-credentials\n- `slug:` The URL slug of the dataset you want to create. (Default value: `unknown_slug`)\n- `license:` \n You can specify the following licenses for your dataset (Optional. Default value: `CC0-1.0`):\n\n * `CC0-1.0:` CC0 Public Domain\n * `CC-BY-SA-3.0:` CC BY-SA 3.0\n * `CC-BY-SA-4.0:` CC BY-SA 4.0\n * `CC-BY-NC-SA-4.0:` CC BY-NC-SA 4.0\n * `GPL-2.0:` GPL 2\n * `ODbL-1.0:` Database: Open Database, Contents: \u00a9 Original Authors\n * `DbCL-1.0:` Database: Open Database, Contents: Database Contents\n * `copyright-authors:` Data files \u00a9 Original Authors\n * `other:` Other (specified in description)\n * `unknown:` Unknown\n\n- `dataset_title:` Dataset title (Default value: `Unknown title`)\n- `isPrivate:` Create publicly (Optional. Default value: `true`)\n- `convertToCsv:` Convert tabular files to CSV (Optional. Default value: `true`)\n- `dirmode:` What to do with directories: \"skip\" - ignore; \"zip\" - compressed upload; \"tar\" - uncompressed upload (Optional. Default value: `skip`)\n\n*Example:*\n\n```\nkaggle:\n username: user_name\n key: api_key\n slug: slg_value \n license: CC0-1.0\n dataset_title: dataset_title\n isPrivate: true\n convertToCsv: true\n dirmode: tar\n```\n\n#### Telegram channels\n\nTelegram chats list for remote Google Drive authorization. The tool sends a message to these chats after the end of work.\n\n*Example:*\n\n```\ntelegram_channels:\n - 750423741\n - 750423742\n - 750423743\n``` \n\n### Config example:\n\n```\ndataset_list:\n - https://tlk.s3.yandex.net/dataset/TlkAgg2.zip\n - https://tlk.s3.yandex.net/dataset/TlkPersonaChatRus.zip\n - https://tlk.s3.yandex.net/dataset/TlkAgg5.zip\n - https://tlk.s3.yandex.net/dataset/LRWC.zip\n - https://tlk.s3.yandex.net/dataset/WordSenseRus.zip\nextract_archives: false \nrepos:\n downloader: \n url: https://github.com/bantonj/downloader\n pyreadstat: \n url: https://github.com/Roche/pyreadstat\n branch: dev\nscript_file:\n - file.py\noutput_folder:\n - ./results \ngoogle_drive:\n - NTHDIXBMWNCGFIXBM-NTbmTHD-TGFIT\nkaggle:\n username: OEKOIXBT\n key: THDIDHTR4209YDHTR4209ydhcDHT\n slug: wp_test_slg \n license: CC0-1.0\n dataset_title: wsp_test_dataset\n isPrivate: true\n convertToCsv: true\n dirmode: tar\ntelegram_channels:\n - 750423741\n```", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/VasiliyLysokobylko/workspace-puller", "keywords": "workspace,pull,puller", "license": "", "maintainer": "", "maintainer_email": "", "name": "workspace-puller", "package_url": "https://pypi.org/project/workspace-puller/", "platform": "", "project_url": "https://pypi.org/project/workspace-puller/", "project_urls": { "Homepage": "https://github.com/VasiliyLysokobylko/workspace-puller" }, "release_url": "https://pypi.org/project/workspace-puller/0.0.22/", "requires_dist": null, "requires_python": "", "summary": "Workspace build tool.", "version": "0.0.22" }, "last_serial": 5735482, "releases": { "0.0.22": [ { "comment_text": "", "digests": { "md5": "5c175db0ebc6b343354ae3ff0a2143ed", "sha256": "aee82b8b2405cbd03e509e75fc36c9c2eff19c9ff4c141b0f33aedde9a3a6d86" }, "downloads": -1, "filename": "workspace-puller-0.0.22.tar.gz", "has_sig": false, "md5_digest": "5c175db0ebc6b343354ae3ff0a2143ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10354, "upload_time": "2019-08-27T08:31:01", "url": "https://files.pythonhosted.org/packages/4e/c6/e84cc9a6ff5901563a5c17b47b088a11cbdde3e6aebad073fb48335ab0e2/workspace-puller-0.0.22.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5c175db0ebc6b343354ae3ff0a2143ed", "sha256": "aee82b8b2405cbd03e509e75fc36c9c2eff19c9ff4c141b0f33aedde9a3a6d86" }, "downloads": -1, "filename": "workspace-puller-0.0.22.tar.gz", "has_sig": false, "md5_digest": "5c175db0ebc6b343354ae3ff0a2143ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10354, "upload_time": "2019-08-27T08:31:01", "url": "https://files.pythonhosted.org/packages/4e/c6/e84cc9a6ff5901563a5c17b47b088a11cbdde3e6aebad073fb48335ab0e2/workspace-puller-0.0.22.tar.gz" } ] }