{ "info": { "author": "Luca Giacomel", "author_email": "luca.giacomel@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Bulldog\n\nThe guardian dog that prevents you from writing poor code when doing data analysis in Python.\n\n## Installation\n\nSimply run:\n\n`pip install bulldog`\n\n## Philosophy\n\nBulldog is a library for writing better code in your analysis that largely borrows from the state management libraries for application development ([Redux](https://github.com/reduxjs/redux), [Flux](https://github.com/facebook/flux), [Vuex](https://github.com/vuejs/vuex), [Katana](https://github.com/BendingSpoons/katana-swift)..).\n\nBulldog models are composed of five main building blocks:\n1) `data`, which is our model initial data\n2) `data_modifiers`, which are special function whose main task is to modify the model's data\n3) `business_logic`, which are function whose main task is to execute the business logic and invoke `data_modifiers`\n4) `analyses`, which subscribe to change on the model's `data`\n5) `history`, which is a backlog of all the operations that have occurred and the corresponding state of the model `data`\n\nThe philosophy behind bulldog is to **separate** data transformations, business logic and analyses, in order to make\nclarity, testing and debugging easier to achieve.\n\n## Working with bulldog\n\nTo create a bulldog model simply run:\n```python\nfrom bulldog.model import Model\nimport pandas as pd\n\nmodel = Model(data={\n 'df': pd.DataFrame(pd.np.ones((100, 100))),\n 'other_data': [1, 2, 3]\n})\n```\n\nAll the data is stored in our `model.data` and it's not directly modifiable. In fact, whenever we access `model.data` we are actually accessing a copy of the original model data.\n\nIn order to alter/modify our data we need to create some special functions called `data_modifiers`.\n\n```python\n@model.data_modifier\ndef data_step(data, factor):\n df = data['df']\n df *= factor\n return data # this will modify the data\n```\n\nAs we can see data modifiers are just simple, pure functions that take our model data as input and perform some kind of alteration on it \nand return the altered data. The signature of a business model is `function(data, *args, **kwargs)` and it needs to be\ndecorated with the `@model.data_modifier` decorator (where `model` is your instance of Bulldog's `Model`).\n\nIf we want to execute a `data_modifier`, rather than calling it directly we need to ask the model to commit it:\n\n```python\nmodel.commit('data_step', factor=9) # 'data_step' is the name of our `data_modifier`\n```\n\nNote that any other way of calling the function will result in an error. E.g.:\n\n```python\ndata_step(data=model.data, factor=9) # wrong; this will throw an error\n```\n\nGreat! but what if we need to run some business logic to conditionally modify our dataset?\nMaybe we need to download some data and based on that perform some actions that will eventually \nlead us to modify our data. In this case we should use a `business_logic` function.\n\n```python\n@model.business_logic\n@model.checkpoint\ndef action1(data, commit):\n data['df'] /= 8000 # this has no effect whatsoever on our data, remember? We are modifying a copy\n if max(data['df']) < 0.38:\n commit(\"data_step\", 9) # but this will actually modify our data\n```\n\nAs we can see `business_logic` are function with the signature `function(data, commit, *args, **kwargs)` which take as input the data\nand have the possibility of committing `data_modifier` functions to our original model\n\nYou might have noted the additional `@model.checkpoint` decorator (which can also be applied to `data_modifiers`). It will basically tell our model to store the current state data after computing\nthis function (and store it in `model.history`), allowing us to restore it or inspect it at a later stage, which is very convenient for debugging.\n\nSimilarly to `data_modifiers`, also `business_logic` cannot be execute directly, and have to be dispatched through the model in this way:\n\n```python\nmodel.dispatch('action1')\n```\n\nNow, you might wonder how to run analyses on the model's data. That's fairly simple!\n\n```python\n@model.analysis\n@model.parallelizable\ndef analysis(data, history):\n df = data['df']\n time.sleep(3)\n print('fast 1', list(history.keys())[-1].name, pd.np.mean(df.values))\n```\n\nAnalyses are functions with signature `function(data, history)` that are run automatically every time a checkpoint step of our model is executed.\nOptionally analyses can be run in parallel (if you use the `@model.parallelizable` decorator, as above). This is particularly convenient\nin case we are computing a large number of metrics and want to leverage our CPU as much as possible.\nNote that only analyses can be parallelized in Bulldog.\n\n### Custom checkpoints management\n\nOut of the box, Bulldog doesn't implement any custom diffing logic for the model `data` (since it's a generic dictionary which could contain anything),\nbut you can provide your own functions to checkpoint & restore your data. For example you might want to write/read:\n\n1) from a database\n2) from a pickled file on disk\n3) from h5df\n3) diffs from custom diffing tools (or generic ones like [csv-diff](https://github.com/aswinkarthik/csvdiff))\n\nIf you want to provide some custom save/load logic to handle checkpoint save & restore, pass these two functions to the Model initializer:\n\n1) `on_checkpoint_save(data, version_key, history)`: this function is responsible for saving the `data` (or a diff of it which you can compute by comparing it with your model `history`, holding every other checkpoint data)\n2) `on_checkpoint_restore(version_key, history)`: this function is responsible for restoring data from a previous checkpoint\n\nFor example if you want to read from disk pickled objects you might do:\n\n```python\ndef on_checkpoint_save(data, key, history):\n file_name = 'data_{}.pkl'.format(key.step)\n pickle.dump(data, open('data_{}.pkl'.format(key.step), 'wb'))\n return file_name # only the file name will be saved in memory\n\n\ndef on_checkpoint_restore(key, history):\n file_name = history[key]\n return pickle.load(open(file_name, 'rb')) # store this in model.data\n\n\nmodel = Model(\n data={\n 'df': pd.DataFrame(pd.np.ones((100, 100))),\n },\n on_checkpoint_save=on_checkpoint_save,\n on_checkpoint_restore=on_checkpoint_restore\n)\n```\n\n### Advanced usage\n\nBulldog has a few nice features for people that use interactive editors (like `ipython` or `jupyter notebook`).\n\n1) You can prevent the same `business_logic` from running multiple times by setting `unique_bl_step=True` in `Model`. This will prevent your state from being modified multiple times if you re-run cells in a notebook.\n2) You can restore the version model data at a previous checkpoint by running either `rollback(n_steps)` or `revert_version(Version)`. This is useful both for reproducibility/debugging and for jupyter users who don't want to re-run a whole lengthy analysis after a wrong alteration of the model data.\n3) *Testing:* still to be developed. Ideally bulldog will allow you to test every single component in a much easier way and possibly also with mocked data.\n\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/luke14free/bulldog", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "bulldog", "package_url": "https://pypi.org/project/bulldog/", "platform": "", "project_url": "https://pypi.org/project/bulldog/", "project_urls": { "Homepage": "https://github.com/luke14free/bulldog" }, "release_url": "https://pypi.org/project/bulldog/1.0.3/", "requires_dist": [ "pathos" ], "requires_python": ">=3.6", "summary": "State management for Data Science & Analytics", "version": "1.0.3", "yanked": false, "yanked_reason": null }, "last_serial": 6006852, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "00e682a5f03ae5dc621a2cd4cdf02e5e", "sha256": "4ea260019c7d66e18bdad5a631483cf8860b9cb83d383f0eff0d2412908f3242" }, "downloads": -1, "filename": "bulldog-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "00e682a5f03ae5dc621a2cd4cdf02e5e", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 3988, "upload_time": "2019-10-21T09:11:47", "upload_time_iso_8601": "2019-10-21T09:11:47.122782Z", "url": "https://files.pythonhosted.org/packages/86/ed/fe97964972abed9d1b5f6bf99b743c15b4d7a5db87a568d0d6e558fbd757/bulldog-1.0.0-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "5c9fa3776e50209fce7c3562a42d035c", "sha256": "aa7540834104f00cca31da9fd12ea963cc5c771107877703bbcb8ce177513fa3" }, "downloads": -1, "filename": "bulldog-1.0.0.tar.gz", "has_sig": false, "md5_digest": "5c9fa3776e50209fce7c3562a42d035c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 4392, "upload_time": "2019-10-21T09:11:50", "upload_time_iso_8601": "2019-10-21T09:11:50.878778Z", "url": "https://files.pythonhosted.org/packages/d7/47/5946f3b3f5635ac4de967c4460a3f83532556ac38a822aa003960810fb6d/bulldog-1.0.0.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "834f4b3c177d68930597d946a8474cab", "sha256": "958c50149903705df0a5ce5339c46adb46acc81cf4d3db4a14ec707528da4f6c" }, "downloads": -1, "filename": "bulldog-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "834f4b3c177d68930597d946a8474cab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 5914, "upload_time": "2019-10-21T09:17:06", "upload_time_iso_8601": "2019-10-21T09:17:06.390782Z", "url": "https://files.pythonhosted.org/packages/0b/f6/eb07bbdc3f8e48a46d49e711fe0d01ad67000e8c46340610c3a4116f56f0/bulldog-1.0.1-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "887d043c2cd7906d1c9897f00251b33f", "sha256": "8e9733a3dbe5c5d23272951692d539aea61b5d8b6548d431ca1d3a6da6857cf7" }, "downloads": -1, "filename": "bulldog-1.0.1.tar.gz", "has_sig": false, "md5_digest": "887d043c2cd7906d1c9897f00251b33f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 5986, "upload_time": "2019-10-21T09:17:10", "upload_time_iso_8601": "2019-10-21T09:17:10.442783Z", "url": "https://files.pythonhosted.org/packages/ce/7c/0270a48ca20222a30d8b5344455aa653ff1a8dab964fce9b95de8f7ba382/bulldog-1.0.1.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "3136968ff0c5ad1abfb32c1322da20cd", "sha256": "db7b8aa1c6f06a72bc70095d7be0ca4ca59549fa05d11bdbadaabce4a5860b3c" }, "downloads": -1, "filename": "bulldog-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "3136968ff0c5ad1abfb32c1322da20cd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 5918, "upload_time": "2019-10-21T09:20:18", "upload_time_iso_8601": "2019-10-21T09:20:18.114836Z", "url": "https://files.pythonhosted.org/packages/82/83/048f924f4593c0a1970e7e3a1760c022f35b276c899e6e6fc974012ef8c9/bulldog-1.0.2-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "5b92d19574e85676e4068ea7bf8ce864", "sha256": "ab7f62e1ea17eff13aa8c20a027b582c67e57119900e5ded393706406a390e10" }, "downloads": -1, "filename": "bulldog-1.0.2.tar.gz", "has_sig": false, "md5_digest": "5b92d19574e85676e4068ea7bf8ce864", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 5988, "upload_time": "2019-10-21T09:20:22", "upload_time_iso_8601": "2019-10-21T09:20:22.906117Z", "url": "https://files.pythonhosted.org/packages/77/2b/d9b0ea6b103ca7888eabf8494e7c19b900d72e79a04198c7be998ed53c51/bulldog-1.0.2.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.3": [ { "comment_text": "", "digests": { "md5": "5a0c506457db08b178a6dc5ff968b618", "sha256": "f716b06b066cd2243ebceef782f5deef85c879debdba4d7817cbc1e5f330e290" }, "downloads": -1, "filename": "bulldog-1.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "5a0c506457db08b178a6dc5ff968b618", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 5928, "upload_time": "2019-10-21T11:24:06", "upload_time_iso_8601": "2019-10-21T11:24:06.770920Z", "url": "https://files.pythonhosted.org/packages/45/70/fd4dabc619db2f9fea5c17a02133cfc3352f0791c57e84ac2b5db639eb76/bulldog-1.0.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "cafaed8e1f0ab53028029374455f51ff", "sha256": "232a044c7de58b798d9bbb2f9366a17fba90a16d2b2d8964fe8f07f1da86bb8e" }, "downloads": -1, "filename": "bulldog-1.0.3.tar.gz", "has_sig": false, "md5_digest": "cafaed8e1f0ab53028029374455f51ff", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 6029, "upload_time": "2019-10-21T11:24:12", "upload_time_iso_8601": "2019-10-21T11:24:12.474476Z", "url": "https://files.pythonhosted.org/packages/5b/3d/71965de81e56e9c617f1cbc9373f13a222a6a38d1fd3788ed358a46fee4f/bulldog-1.0.3.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5a0c506457db08b178a6dc5ff968b618", "sha256": "f716b06b066cd2243ebceef782f5deef85c879debdba4d7817cbc1e5f330e290" }, "downloads": -1, "filename": "bulldog-1.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "5a0c506457db08b178a6dc5ff968b618", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 5928, "upload_time": "2019-10-21T11:24:06", "upload_time_iso_8601": "2019-10-21T11:24:06.770920Z", "url": "https://files.pythonhosted.org/packages/45/70/fd4dabc619db2f9fea5c17a02133cfc3352f0791c57e84ac2b5db639eb76/bulldog-1.0.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "cafaed8e1f0ab53028029374455f51ff", "sha256": "232a044c7de58b798d9bbb2f9366a17fba90a16d2b2d8964fe8f07f1da86bb8e" }, "downloads": -1, "filename": "bulldog-1.0.3.tar.gz", "has_sig": false, "md5_digest": "cafaed8e1f0ab53028029374455f51ff", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 6029, "upload_time": "2019-10-21T11:24:12", "upload_time_iso_8601": "2019-10-21T11:24:12.474476Z", "url": "https://files.pythonhosted.org/packages/5b/3d/71965de81e56e9c617f1cbc9373f13a222a6a38d1fd3788ed358a46fee4f/bulldog-1.0.3.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }