{ "info": { "author": "Noah Gift", "author_email": "consulting@noahgift.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python", "Programming Language :: Python :: 3.6", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "|Codacy Badge| |CircleCI|\n\ndevml\n=====\n\nMachine Learning, Statistics and Utilities around Developer Productivity\n\nA few handy bits of functionality:\n\n- Can checkout all repositories in Github\n- Converts a tree of checked out repositories on disk into a pandas\n dataframe\n- Statistics on combined DataFrames\n\nInstallation\n------------\n\n::\n\n pip install devml\n\nThis pip install installs a command-line tool: dml (which is referenced\nin the documentation below). And also library devml, which is referenced\nbelow as well.\n\nGet environment setup\n---------------------\n\nCode is written to support Python 3.6 or greater. You can get that here:\nhttps://www.python.org/downloads/release/python-360/.\n\nAn easy way to run the project locally is to check the repo out and in the root of the repo run:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n make setup\n\nThis create a virtualenv in ~/.devml\n\nNext, source that virtualenv:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n source ~/.devml/bin/activate\n\nRun Make All (installs, lints and tests)\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n make all\n\n # #Example output\n #(.devml) \u279c devml git:(master) make all \n #pip install -r requirements.txt\n #Requirement already satisfied: pytest in /Users/noahgift/.devml/lib/python3.6/site-packages (from -r requirements.txt (line #1)\n ---------- coverage: platform darwin, python 3.6.2-final-0 -----------\n Name Stmts Miss Cover\n ----------------------------------------------\n devml/__init__.py 1 0 100%\n devml/author_stats.py 6 6 0%\n devml/fetch_repo.py 54 42 22%\n devml/mkdata.py 84 21 75%\n devml/org_stats.py 76 55 28%\n devml/post_processing.py 50 35 30%\n devml/state.py 29 9 69%\n devml/stats.py 55 43 22%\n devml/ts.py 29 14 52%\n devml/util.py 12 4 67%\n dml.py 111 66 41%\n ----------------------------------------------\n TOTAL 507 295 42%\n ....\n\nYou don't use virtualenv or don't want to use it. No problem, just run\n``make all`` it should probably work if you have python 3.6 installed.\n\n::\n\n\n make all\n\nExplore Jupyter Notebooks on Github Organizations\n-------------------------------------------------\n\nYou can explore combined datasets here using this example as a starter:\n\nhttps://github.com/noahgift/devml/blob/master/notebooks/github\\_data\\_exploration.ipynb\n\n.. figure:: https://user-images.githubusercontent.com/58792/31581904-66ee7fc0-b12a-11e7-804a-7b0f1728f30a.png\n :alt: Pallets Project\n\n Pallets Project\n\nExplore Jupyter Notebooks on Repository Churn\n---------------------------------------------\n\nYou can explore File Metadata exploration example here:\n\nhttps://github.com/noahgift/devml/blob/master/notebooks/repo\\_file\\_exploration.ipynb\n\nAll Files Churned by type:\n^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. figure:: https://user-images.githubusercontent.com/58792/31587879-59d9724e-b19e-11e7-942e-999c02d7b566.png\n :alt: Pallets Project Relative Churn by file type\n\n Pallets Project Relative Churn by file type\n\nSummary Churn Statistics by type:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. figure:: https://user-images.githubusercontent.com/58792/31587931-5d79199e-b19f-11e7-89c2-98185fdef909.png\n :alt: Pallets Project by file type Churn statistics\n\n Pallets Project by file type Churn statistics\n\nExpected Configuration\n----------------------\n\nThe command-line tools expects for you to create a project directory\nwith a config.json file. Inside the config.json file, you will need to\nprovide an oath token. You can find information about how to do that\nhere:\nhttps://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/.\n\nAlternately, you can pass these values in via the python API or via the\ncommand-line as options. They stand for the following:\n\n- org: Github Organization (To clone entire tree of repos)\n- checkout\\_dir: place to checkout\n- oath: personal oath token generated from Github\n\n::\n\n \u279c devml git:(master) \u2717 cat project/config.json \n {\n \"project\" : \n {\n \"org\":\"pallets\",\n \"checkout_dir\": \"/tmp/checkout\",\n \"oath\": \"\"\n }\n \n }\n\nBasic command-line Usage\n------------------------\n\nYou can find out stats for a checkout or a directory full of checkout as\nfollows\n\n.. code:: bash\n\n\n dml gstats author --path ~/src/mycompanyrepo(s)\n Top Commits By Author: author_name commits\n 0 John Smith 3059\n 1 Sally Joe 2995\n 2 Greg Mathews 2194\n 3 Jim Mayflower 1448\n\nBasic API Usage (Converting a tree of repo(s) into a pandas DataFrame)\n----------------------------------------------------------------------\n\n::\n\n In [1]: from devml import (mkdata, stats)\n\n In [2]: org_df = mkdata.create_org_df(path=/src/mycompanyrepo(s)\")\n In [3]: author_counts = stats.author_commit_count(org_df)\n\n In [4]: author_counts.head()\n Out[4]: \n author_name commits\n 0 John Smith 3059\n 1 Sally Joe 2995\n 2 Greg Mathews 2194\n 3 Jim Mayflower 1448\n 4 Truck Pritter 1441\n\nClone all repos in Github using API\n-----------------------------------\n\n::\n\n In [1]: from devml import (mkdata, stats, state, fetch_repo)\n\n In [2]: dest, token, org = state.get_project_metadata(\"../project/config.json\")\n In [3]: fetch_repo.clone_org_repos(token, org, \n dest, branch=\"master\")\n 017-10-14 17:11:36,590 - devml - INFO - Creating Checkout Root: /tmp/checkout\n 2017-10-14 17:11:37,346 - devml - INFO - Found Repo # 1 REPO NAME: flask , URL: git@github.com:pallets/flask.git \n 2017-10-14 17:11:37,347 - devml - INFO - Found Repo # 2 REPO NAME: pallets-sphinx-themes , URL: git@github.com:pallets/pallets-sphinx-themes.git \n 2017-10-14 17:11:37,347 - devml - INFO - Found Repo # 3 REPO NAME: markupsafe , URL: git@github.com:pallets/markupsafe.git \n 2017-10-14 17:11:37,348 - devml - INFO - Found Repo # 4 REPO NAME: jinja , URL: git@github.com:pallets/jinja.git \n 2017-10-14 17:11:37,349 - devml - INFO - Found Repo # 5 REPO NAME: werkzeug , URL: git@githu\n In [4]: !ls -l /tmp/checkout\n total 0\n drwxr-xr-x 21 noahgift wheel 672 Oct 14 17:11 click\n drwxr-xr-x 25 noahgift wheel 800 Oct 14 17:11 flask\n drwxr-xr-x 11 noahgift wheel 352 Oct 14 17:11 flask-docs\n drwxr-xr-x 12 noahgift wheel 384 Oct 14 17:11 flask-ext-migrate\n drwxr-xr-x 8 noahgift wheel 256 Oct 14 17:11 flask-snippets\n drwxr-xr-x 14 noahgift wheel 448 Oct 14 17:11 flask-website\n drwxr-xr-x 18 noahgift wheel 576 Oct 14 17:11 itsdangerous\n drwxr-xr-x 23 noahgift wheel 736 Oct 14 17:11 jinja\n drwxr-xr-x 18 noahgift wheel 576 Oct 14 17:11 markupsafe\n drwxr-xr-x 4 noahgift wheel 128 Oct 14 17:11 meta\n drwxr-xr-x 10 noahgift wheel 320 Oct 14 17:11 pallets-sphinx-themes\n drwxr-xr-x 9 noahgift wheel 288 Oct 14 17:11 pocoo-sphinx-themes\n drwxr-xr-x 15 noahgift wheel 480 Oct 14 17:11 website\n drwxr-xr-x 25 noahgift wheel 800 Oct 14 17:11 werkzeug\n\nAdvanced CLI-Author: Get Activity Statistics for a Tree of Checkouts or a Checkout and sort\n-------------------------------------------------------------------------------------------\n\n::\n\n \u279c devml git:(master) \u2717 dml gstats activity --path /tmp/checkout --sort active_days \n\n Top Unique Active Days: author_name active_days active_duration active_ratio\n 86 Armin Ronacher 989 3817 days 0.260000\n 501 Markus Unterwaditzer 342 1820 days 0.190000\n 216 David Lord 129 712 days 0.180000\n 664 Ron DuPlain 78 854 days 0.090000\n 444 Kenneth Reitz 68 2566 days 0.030000\n 197 Daniel Neuh\u00e4user 42 1457 days 0.030000\n 297 Georg Brandl 41 1337 days 0.030000\n 196 Daniel Neuha\u0308user 36 435 days 0.080000\n 450 Keyan Pishdadian 28 885 days 0.030000\n 169 Christopher Grebs 28 1515 days 0.020000\n 666 Ronny Pfannschmidt 27 3060 days 0.010000\n 712 Simon Sapin 22 793 days 0.030000\n 372 Jeff Widman 19 840 days 0.020000\n 427 Julen Ruiz Aizpuru 16 36 days 0.440000\n 21 Adrian 16 1935 days 0.010000\n 569 Nicholas Wiles 14 197 days 0.070000\n 912 lord63 14 692 days 0.020000\n 756 ThiefMaster 12 1287 days 0.010000\n 763 Thomas Waldmann 11 1560 days 0.010000\n 628 Priit Laes 10 1567 days 0.010000\n 23 Adrian Moennich 10 521 days 0.020000\n 391 Jochen Kupperschmidt 10 3060 days 0.000000\n\nAdvanced CLI-Churn: Get churn by file type\n------------------------------------------\n\nGet the top ten files sorted by churn count with the extension .py:\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n \u2717 dml gstats churn --path /Users/noahgift/src/flask --limit 10 --ext .py\n 2017-10-15 12:10:55,783 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/flask]\n files churn_count line_count extension \\\n 1 b'flask/app.py' 316 2183.0 .py \n 3 b'flask/helpers.py' 176 1019.0 .py \n 5 b'tests/flask_tests.py' 127 NaN .py \n 7 b'flask.py' 104 NaN .py \n 8 b'setup.py' 80 112.0 .py \n 10 b'flask/cli.py' 75 759.0 .py \n 11 b'flask/wrappers.py' 70 194.0 .py \n 12 b'flask/__init__.py' 65 49.0 .py \n 13 b'flask/ctx.py' 62 415.0 .py \n 14 b'tests/test_helpers.py' 62 888.0 .py \n\n relative_churn \n 1 0.14 \n 3 0.17 \n 5 NaN \n 7 NaN \n 8 0.71 \n 10 0.10 \n 11 0.36 \n 12 1.33 \n 13 0.15 \n 14 0.07 \n\nGet descriptive statistics for extension .py and compare to another repository\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nIn this example, flask, this repo and cpython are all compared to see\nhow the median churn is.\n\n::\n\n (.devml) \u279c devml git:(master) dml gstats metachurn --path /Users/noahgift/src/flask --ext .py --statistic median \n 2017-10-15 12:39:44,781 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/flask]\n MEDIAN Statistics:\n\n churn_count line_count relative_churn\n extension \n .py 2 85.0 0.13\n (.devml) \u279c devml git:(master) dml gstats metachurn --path /Users/noahgift/src/devml --ext .py --statistic median\n 2017-10-15 12:40:10,999 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/devml]\n MEDIAN Statistics:\n\n churn_count line_count relative_churn\n extension \n .py 1 62.5 0.02\n\n (.devml) \u279c devml git:(master) dml gstats metachurn --path /Users/noahgift/src/cpython --ext .py --statistic median\n 2017-10-15 12:42:19,260 - devml.post_processing - INFO - Running churn cmd: [git log --name-only --pretty=format:] at path [/Users/noahgift/src/cpython]\n MEDIAN Statistics:\n\n churn_count line_count relative_churn\n extension \n .py 7 169.5 0.1\n\nGet Relative Churn for an Author\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n\n dml gstats authorchurnmeta --author \"Armin Ronacher\" --path /tmp/checkout/flask --ext .py\n\n #He has 6.5% median relative churn...very good.\n\n count 193.000000\n mean 0.331860\n std 0.625431\n min 0.001000\n 25% 0.030000\n 50% 0.065000\n 75% 0.250000\n max 3.000000\n Name: author_rel_churn, dtype: float64\n\nCompare CPython Active Ratio with Linux Active Ratio\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n # Linux Development Active Ratio\n dml gstats activity --path /Users/noahgift/src/linux --sort active_days\n\n author_name active_days active_duration active_ratio\n 14541 Takashi Iwai 1677 4590 days 0.370000\n 4382 Eric Dumazet 1460 4504 days 0.320000\n 3641 David S. Miller 1428 4513 days 0.320000\n 7216 Johannes Berg 1329 4328 days 0.310000\n 8717 Linus Torvalds 1281 4565 days 0.280000\n 275 Al Viro 1249 4562 days 0.270000\n 9915 Mauro Carvalho Chehab 1227 4464 days 0.270000\n 9375 Mark Brown 1198 4187 days 0.290000\n 3172 Dan Carpenter 1158 3972 days 0.290000\n 12979 Russell King 1141 4602 days 0.250000\n 1683 Axel Lin 1040 2720 days 0.380000\n 400 Alex Deucher 1036 3497 days 0.300000\n\n\n # CPython Development Active Ratio\n\n author_name active_days active_duration active_ratio\n 146 Guido van Rossum 2256 9673 days 0.230000\n 301 Raymond Hettinger 1361 5635 days 0.240000\n 128 Fred Drake 1239 5335 days 0.230000\n 47 Benjamin Peterson 1234 3494 days 0.350000\n 132 Georg Brandl 1080 4091 days 0.260000\n 375 Victor Stinner 980 2818 days 0.350000\n 235 Martin v. L\u00f6wis 958 5266 days 0.180000\n 36 Antoine Pitrou 883 3376 days 0.260000\n 362 Tim Peters 869 5060 days 0.170000\n 164 Jack Jansen 800 4998 days 0.160000\n 24 Andrew M. Kuchling 743 4632 days 0.160000\n 330 Serhiy Storchaka 720 1759 days 0.410000\n 44 Barry Warsaw 696 8485 days 0.080000\n 52 Brett Cannon 681 5278 days 0.130000\n 262 Neal Norwitz 559 2573 days 0.220000\n\n In this analysis, Guido of Python has a 23% probability of working on a given day, and Linux has a 28% chance.\n\nDeletion Statistics\n-------------------\n\nFind all delete files from repository\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n::\n\n dml gstats deleted --path /Users/noahgift/src/flask\n\n DELETION STATISTICS\n\n files ext\n 0 b'tests/test_deprecations.py' .py\n 1 b'scripts/flask-07-upgrade.py' .py\n 2 b'flask/ext/__init__.py' .py\n 3 b'flask/exthook.py' .py\n 4 b'scripts/flaskext_compat.py' .py\n 5 b'tests/test_ext.py' .py\n\nFAQ\n---\n\nWhat is Churn and Why Do I Care?\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\nCode churn is the amount of times a file has been modified. Relative\nchurn is the amount of times it has been modified relative to lines of\ncode. Research into defects in software has shown that relative code\nchurn is highly predictive of defects, i.e., the greater the relative\nchurn number the higher the amount of defects.\n\n\"Increase in relative code churn measures is accompanied by an increase\nin system defect density; \"\n\nYou can read the entire study here:\nhttps://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/icse05churn.pdf\n\n.. |Codacy Badge| image:: https://api.codacy.com/project/badge/Grade/3e382eedf6424c1282aab4dd13e54c26\n :target: https://www.codacy.com/app/noahgift/devml?utm_source=github.com&utm_medium=referral&utm_content=noahgift/devml&utm_campaign=badger\n.. |CircleCI| image:: https://circleci.com/gh/noahgift/devml.svg?style=svg\n :target: https://circleci.com/gh/noahgift/devml", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/noahgift/devml", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "devml", "package_url": "https://pypi.org/project/devml/", "platform": "any", "project_url": "https://pypi.org/project/devml/", "project_urls": { "Homepage": "https://github.com/noahgift/devml" }, "release_url": "https://pypi.org/project/devml/0.5.1/", "requires_dist": null, "requires_python": "", "summary": "Machine Learning, Statistics and Utilities around Developer Productivity, Company Productivity and Project Productivity", "version": "0.5.1" }, "last_serial": 3270855, "releases": { "0.2": [ { "comment_text": "", "digests": { "md5": "6003f45eba4ba1c54311588ed274c046", "sha256": "219caaca453aa0d5256df2beeed68683da4b17551978ae30c8eecf1bab4b4dc5" }, "downloads": -1, "filename": "devml-0.2.tar.gz", "has_sig": false, "md5_digest": "6003f45eba4ba1c54311588ed274c046", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8976, "upload_time": "2017-10-21T20:27:20", "url": "https://files.pythonhosted.org/packages/f8/dd/eff92a6af07787aaabf55f4281f250bff68198da32648aaac789f1784187/devml-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "7f60e2e40bd271c7344f514f8496c64f", "sha256": "50e39924b56a6ced540842e975075010989bae3a36d94220818ea58965422f00" }, "downloads": -1, "filename": "devml-0.3.tar.gz", "has_sig": false, "md5_digest": "7f60e2e40bd271c7344f514f8496c64f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14354, "upload_time": "2017-10-21T20:42:04", "url": "https://files.pythonhosted.org/packages/ff/96/b0935b376806bc3ae68896a0c8646c57d75c376fe218db7c8c2562bea06a/devml-0.3.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "0dc0fdfb5ca964a49cfc8cbd864e1ff1", "sha256": "b31eaade63ecc4c000c190d6cb60dbf316643abe7ec286b0f40c5c9ccbdacba6" }, "downloads": -1, "filename": "devml-0.4.tar.gz", "has_sig": false, "md5_digest": "0dc0fdfb5ca964a49cfc8cbd864e1ff1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16408, "upload_time": "2017-10-21T21:18:12", "url": "https://files.pythonhosted.org/packages/ed/08/60ca5fe92b4cf1c421e9c7095770b3d02db06604a361c75ce490c4ae3087/devml-0.4.tar.gz" } ], "0.5": [ { "comment_text": "", "digests": { "md5": "fa74e3a9888412f71f923eb84e5039b7", "sha256": "0ecf2cef1e96340e1ea80562ac7fcba01123f48d3ad18126f6e5ebed5e4d79b8" }, "downloads": -1, "filename": "devml-0.5.tar.gz", "has_sig": false, "md5_digest": "fa74e3a9888412f71f923eb84e5039b7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16746, "upload_time": "2017-10-23T00:34:57", "url": "https://files.pythonhosted.org/packages/56/a5/97224debf31c776ac42d41e47771f7dd63e26df414ba81a59169414e4bdc/devml-0.5.tar.gz" } ], "0.5.1": [ { "comment_text": "", "digests": { "md5": "4d6fccd00648df0634310d6c87ab410a", "sha256": "fd5d92ffe31f01dde01626ca5703db999749e1d6e88e44d59b70c477ee6fa6f3" }, "downloads": -1, "filename": "devml-0.5.1.tar.gz", "has_sig": false, "md5_digest": "4d6fccd00648df0634310d6c87ab410a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 22585, "upload_time": "2017-10-23T02:20:17", "url": "https://files.pythonhosted.org/packages/fc/b7/7fc5c02b9741ba3ba5966347f1231fc0e6f384f3311a3a00f514ac282165/devml-0.5.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4d6fccd00648df0634310d6c87ab410a", "sha256": "fd5d92ffe31f01dde01626ca5703db999749e1d6e88e44d59b70c477ee6fa6f3" }, "downloads": -1, "filename": "devml-0.5.1.tar.gz", "has_sig": false, "md5_digest": "4d6fccd00648df0634310d6c87ab410a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 22585, "upload_time": "2017-10-23T02:20:17", "url": "https://files.pythonhosted.org/packages/fc/b7/7fc5c02b9741ba3ba5966347f1231fc0e6f384f3311a3a00f514ac282165/devml-0.5.1.tar.gz" } ] }