{ "info": { "author": "Daniel Perez", "author_email": "tuvistavie@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# bigcode-fetcher\n\nA utility to search and fetch code from GitHub.\nThis tool was built to easily create datasets for repository analysis.\n\nThe tool works in two phases: `search` finds repositories using the GitHub API\nand saves the result in a JSON file, and `download` fetches all the repositories\nlisted in the JSON file.\n\n## Install\n\nThis tool can be installed by running\n\n```\npip install bigcode-fetcher\n```\n\nor by fetching this repository and running\n\n```\npip install .\n```\n\nin this directory.\n\n## Usage\n\n### `search` command\n\nBy default, the utility searches for repositories fulfilling the following conditions\n\n* `size` between 1M and 100M\n* `stars` count > 10\n* non-viral `license` (MIT,Apache-2.0,MPL-2.0,BSD-2-Clause,BSD-3-Clause,BSD-4-Clause,MS-PL)\n\nand retrieves the first 100 projects, ordered by number of stars.\n\nTo avoid API rate limiting, an access token can be provided either with the `--token`\nCLI argument or with the `GITHUB_TOKEN` environment variable.\n\nSee the help for all the options:\n\n```\nbigcode-fetcher search -h\n```\n\n#### Example\n\nSearch for all Apache commons projects written in Java\n\n```\nmkdir -p apache-common-projects\nbigcode-fetcher search --language Java --user apache --stars '>0' --keyword commons --max-repos 500 -o apache-common-projects/apache-commons.json\n```\n\n### `download` command\n\nThis command will simply `git clone` all the repositories in the\n`JSON` file generated by the `search` command.\n\nTo reduce the download size, only the latest revision is fetched by default (i.e. `git clone --depth 1`). 
This can be disabled by passing the `--full` flag.\n\n`USERNAME/REPO` will be fetched into `OUTPUT_DIR/USERNAME/REPO`, where\n`OUTPUT_DIR` is set by the `--output` option.\n\nThe command skips a project if its directory already exists,\nso running the command multiple times is safe, and is recommended to make\nsure all repositories have been fetched.\n\nSee the help for more information:\n\n```\nbigcode-fetcher download -h\n```\n\n#### Example\n\nDownload all the Apache commons projects found by the search above\n\n```\nmkdir -p apache-common-projects/repositories\nbigcode-fetcher download -i apache-common-projects/apache-commons.json -o apache-common-projects/repositories\n```\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "https://github.com/tuvistavie/bigcode-tools/archive/master.zip", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/tuvistavie/bigcode-tools/tree/master/code-fetcher", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "bigcode-fetcher", "package_url": "https://pypi.org/project/bigcode-fetcher/", "platform": "", "project_url": "https://pypi.org/project/bigcode-fetcher/", "project_urls": { "Download": "https://github.com/tuvistavie/bigcode-tools/archive/master.zip", "Homepage": "https://github.com/tuvistavie/bigcode-tools/tree/master/code-fetcher" }, "release_url": "https://pypi.org/project/bigcode-fetcher/0.1.2/", "requires_dist": [ "requests", "grequests", "nose; extra == 'test'", "requests-mock; extra == 'test'" ], "requires_python": "", "summary": "Tool to search and fetch code from GitHub", "version": "0.1.2" }, "last_serial": 3424055, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "8014e5928325461e3c273562af1e450a", "sha256": "c0227a6a7307fb0842c9e920c282a6a96089e4ecf5c726454dfc230337485b55" }, "downloads": -1, "filename": "bigcode_fetcher-0.1.1-py3-none-any.whl", "has_sig": true, "md5_digest": 
"8014e5928325461e3c273562af1e450a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9424, "upload_time": "2017-10-12T05:17:32", "url": "https://files.pythonhosted.org/packages/38/9a/ae5e183c7a88ffb3ca17a88faa435ba17b52dc48e9c42890807ddc52b05f/bigcode_fetcher-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2258cfb9d930aa804414195e6cc03456", "sha256": "e592a609946b5d4b58e46cb3b9bcae06ec2872dbff0f7537e0c7b0e44a19a88c" }, "downloads": -1, "filename": "bigcode-fetcher-0.1.1.tar.gz", "has_sig": true, "md5_digest": "2258cfb9d930aa804414195e6cc03456", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5573, "upload_time": "2017-10-12T05:17:34", "url": "https://files.pythonhosted.org/packages/9c/8b/a15a8b107d48b4c99fdc1b1d40bd3bb7ffdf8c8b08ec25ccaf014730aa62/bigcode-fetcher-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "b38e3b230424bb402307cb7e078f97a0", "sha256": "97d929d59d68a39fd59dbd37acd2163357eb3343af2dd19c64ad5ec01cc900ad" }, "downloads": -1, "filename": "bigcode_fetcher-0.1.2-py3-none-any.whl", "has_sig": true, "md5_digest": "b38e3b230424bb402307cb7e078f97a0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9467, "upload_time": "2017-12-18T03:30:19", "url": "https://files.pythonhosted.org/packages/c3/ff/b7e4d79f7eb0c02cb1675e3173d109d447ff36f8b2298cd5f50f837d50f2/bigcode_fetcher-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05f22a4be3b1f401a497cc5eaeb7ca46", "sha256": "3c24fd921cc86d3b327ad9ab99faa10eafd40b948368d76f3abff9a40dbf1524" }, "downloads": -1, "filename": "bigcode-fetcher-0.1.2.tar.gz", "has_sig": true, "md5_digest": "05f22a4be3b1f401a497cc5eaeb7ca46", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5625, "upload_time": "2017-12-18T03:30:20", "url": 
"https://files.pythonhosted.org/packages/97/ba/16e36d081a5c03ce21e411a98999375c5eb959d92d6e7405838fbcf9cd76/bigcode-fetcher-0.1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b38e3b230424bb402307cb7e078f97a0", "sha256": "97d929d59d68a39fd59dbd37acd2163357eb3343af2dd19c64ad5ec01cc900ad" }, "downloads": -1, "filename": "bigcode_fetcher-0.1.2-py3-none-any.whl", "has_sig": true, "md5_digest": "b38e3b230424bb402307cb7e078f97a0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 9467, "upload_time": "2017-12-18T03:30:19", "url": "https://files.pythonhosted.org/packages/c3/ff/b7e4d79f7eb0c02cb1675e3173d109d447ff36f8b2298cd5f50f837d50f2/bigcode_fetcher-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "05f22a4be3b1f401a497cc5eaeb7ca46", "sha256": "3c24fd921cc86d3b327ad9ab99faa10eafd40b948368d76f3abff9a40dbf1524" }, "downloads": -1, "filename": "bigcode-fetcher-0.1.2.tar.gz", "has_sig": true, "md5_digest": "05f22a4be3b1f401a497cc5eaeb7ca46", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5625, "upload_time": "2017-12-18T03:30:20", "url": "https://files.pythonhosted.org/packages/97/ba/16e36d081a5c03ce21e411a98999375c5eb959d92d6e7405838fbcf9cd76/bigcode-fetcher-0.1.2.tar.gz" } ] }