{ "info": { "author": "Pavan Mallapragada", "author_email": "UNKNOWN", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Natural Language :: English", "Programming Language :: Python", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Topic :: Utilities" ], "description": "ediblepickle\n=========================\n.. image:: https://travis-ci.org/mpavan/ediblepickle.png?branch=master\n :target: https://travis-ci.org/mpavan/ediblepickle\n\nediblepickle is an Apache v 2.0 licensed `checkpointing `__ utility.\nThe simplest use case is to checkpoint an expensive computation that need not be repeated every time the program is\nexecuted, as in:\n\n.. code-block:: python\n\n import string\n import time\n from ediblepickle import checkpoint\n\n # A checkpointed expensive function\n @checkpoint(key=string.Template('m{0}_n{1}_${iterations}_$stride.csv'), work_dir='/tmp/intermediate_results', refresh=True)\n def expensive_computation(m, n, iterations=4, stride=1):\n for i in range(iterations):\n time.sleep(1)\n return range(m, n, stride)\n\n # First call, evaluates the function and saves the results\n begin = time.time()\n expensive_computation(-100, 200, iterations=4, stride=2)\n time_taken = time.time() - begin\n\n print time_taken\n\n # Second call, since the checkpoint exists, the result is loaded from that file and returned.\n begin = time.time()\n expensive_computation(-100, 200, iterations=4, stride=2)\n time_taken = time.time() - begin\n\n print time_taken\n\nFeatures\n--------\n\n- Generic Decorator API\n- Checkpoint expensive functions to avoid having to re-compute while developing\n- Configurable computation cache storage format (i.e use human friendly keys and data, instead of pickle binary data)\n- Specify refresh to flush the cache and recompute\n- Specify your own serialize/de-serialize (save/load) functions\n- Python logging, just define your own logger to activate it\n\n\nInstallation\n------------\n\nTo install ediblepickle, simply:\n\n.. code-block:: bash\n\n $ pip install ediblepickle\n\nOr:\n\n.. code-block:: bash\n\n $ easy_install ediblepickle\n\n\nExamples\n----------\n\nAnother nice feature is the ability to define your own serializers and deserializers\nsuch that they are human readable. For instance, you can use numpy/scipy utils to\nsave matrices or csv files to debug:\n\n.. code-block:: python\n\n import string\n import time\n from ediblepickle import checkpoint\n from similarity.utils import dict_config\n\n def my_pickler(integers, f):\n print integers\n for i in integers:\n f.write(str(i))\n f.write('\\n')\n\n def my_unpickler(f):\n return f.read().split('\\n')\n\n @checkpoint(key=string.Template('m{0}_n{1}_${iterations}_$stride.csv'),\n pickler=my_pickler,\n unpickler=my_unpickler,\n refresh=False)\n def expensive_computation(m, n, iterations=4, stride=1):\n for i in range(iterations):\n time.sleep(1)\n return range(m, n, stride)\n\n begin = time.time()\n print expensive_computation(-100, 200, iterations=4, stride=2)\n time_taken = time.time() - begin\n\n print time_taken\n\n begin = time.time()\n print expensive_computation(-100, 200, iterations=4, stride=2)\n time_taken = time.time() - begin\n\n print time_taken\n\nKey Specification\n------------------\nThe key to cache the function output can be specified in 4 different ways.\n\n1. **a python string**: A key specified using a python str object is taken as is. The output of the function decorated is saved\n in a file with that file name.\n\n2. **string.Template object**: The args and kwargs sent to the function are used to generate a name using the string.Template object.\n For instance, for a function f(a, b, arg3=10, arg4=9), (a, b) are the arguments and (arg3, arg4) are the keyword arguments.\n Non-keyword arguments are represented using their position. That is {0} gets converted to the value of the parameter a. \n Keyword arguments are represented using the standard Template notation. For instance, ${arg3} will take the value of arg3.\n\n For instance: \n \n.. code-block:: python\n \n @checkpoint(key=string.Template('{0}_bvalue_{1}_${arg3}_${arg4}_output.txt'))\n def f(a, b, arg3=8, arg4='subtract'):\n # do something with the args\n result = (a - b)/arg3\n return result\n.. end\n\n On a call: f(3, 4, arg3=19, arg4='add')\n Generates: '3_bvalue_4_19_add_output.txt'\n\n3. **lambda function**: Any lambda function of the form lambda args, kwargs: ... is suitable as a key generator. The non-keyword arguments\n are sent in as a tuple in place of 'args', and the keyword arguments are sent as a dictionary in the place of 'kwargs'. You may use\n them to write any complex function to generate and return a key name.\n\n For instance either of the two belowmentioned options, on a call: f(3, 4, arg3=19, arg4='add'), generates: '3_bvalue_4_19_add_output.txt'\n\n.. code-block:: python\n \n @checkpoint(key= lambda args, kwargs: '_'.join(map(str, [args[0], 'bvalue', args[1], kwargs['arg3'], kwargs['arg4'], 'output.txt')))\n def f(a, b, arg3=8, arg4='subtract'):\n # do something with the args\n result = (a - b)/arg3\n return result\n.. end \n\n.. code-block:: python\n\n @checkpoint(key= lambda args, kwargs string.Template('{0}_bvalue_{1}_${arg3}_${arg4}_output.txt').substitute(kwargs).format(args))\n def f(a, b, arg3=8, arg4='subtract'):\n # do something with the args\n result = (a - b)/arg3\n return result\n.. end\n\n\n4. **function object**: This is similar to the lambda object, but key can take in a named function as well. The function that returns a key should\n accept arguments of the form namer(args, kwargs), where args is a tuple containing all the non-keyword arguments, and kwargs is a dictionary\n containing the keywords and their values. For a call: f(3, 4, arg3=19, arg4='add'), this generates the key name to be '3_bvalue_4_19_add_output.txt'\n\n The advantage of this approach is that if you are dealing with arguments that cannot be directly used in the template, you can convert \n them to something that is addable to a name.\n\n.. code-block:: python\n\n def key_namer(args, kwargs):\n return '_'.join(map(str, [args[0], 'bvalue', args[1], kwargs['arg3'], kwargs['arg4'], 'output.txt'))\n\n @checkpoint(key=key_namer)\n def f(a, b, arg3=8, arg4='subtract'):\n # do something with the args\n result = (a - b)/arg3\n return result\n\n\n**Imporatant Note**: When you checkpoint a function, remember to send non-keyword args as non-keyword and key-word args as keyword based on your\ntemplate specification. Although the third argument arg3 can be sent without saying arg3=19, the template will not pick up arg3 since we\nrely on the keyword name matching to that in the template.\n\n\nPicklers/Unpicklers\n--------------------\n\nA pickler must have the following definition:\n\n.. code-block:: python\n\n def my_pickler(f, object):\n # f is an open file descriptor\n save_to_f(object)\n pass\n\n\nAn unpickler must have the following definition:\n\n\n.. code-block:: python\n\n def my_unpickler(f):\n # f is an open file descriptor\n objec = load_object(f)\n return object\n\nThese can be wrappers around numpy.loadtxt, pandas.DataFrame.to_csv,\npandas.DataFrame.from_csv, and many more such serializing functions. Using them\nwith those utility functions to load/save numpy/pandas objects is one of the\nmost important use cases for expensive numerical computations.\n\n\nThe refresh Option\n-------------------\nThe keyword argument 'refresh' ignores the cache if it is set to True and recomputes the function. In a process with multiple steps, this can be used to\nrefresh only those things that need to be refreshed. While you may specify True/False directly, a more convenient approach could be to collect all the\nrefresh values for different functions into a single file, and set them there.\n\nThe refresh can be passed as a boolean option, or as a function.\n\nWhen collecting refresh values together for better managing when and which functions to refresh, one needs to use the\nfunction argument for refresh for several reasons explained below.\n\nFor instance, if I have a process that runs on input x, as a sequence of steps, that give you y1 = f1(x1), y2 = f2(y1), and yn = fn(yn-1). The checkpoint\ndecoration could be of the form:\n\n.. code-block:: python\n\n import defs\n\n @checkpoint(key=key_namer, refresh=defs.TASK1_REFRESH)\n def f1(x1):\n y1 = do_something(x1)\n return y1 \n\n\n @checkpoint(key=key_namer, refresh=defs.TASK2_REFRESH)\n def f2(y1):\n y2 = do_something(y1)\n return y2 \n\n\n @checkpoint(key=key_namer, refresh=defs.TASK3_REFRESH)\n def f3(y2):\n y3 = do_something(y2)\n return y3 \n\n.. end\n\nThese functions can now be independently controlled using these definitions elsewhere, say in defs.py, or from main.py:\n\n.. code-block:: python\n\n # defs.py\n import os\n\n # Caveat: defs.py should contain a mutable object like a dict or a list.\n refresh_dict = {}\n refresh_dict['task1'] = True\n refresh_dict['task2'] = os.environ['TASK2_REFRESH_OPTION']\n refresh_dict['task3'] = True\n\n\n # main.py\n import defs\n if sys.argv[1] == 'task1':\n defs.refresh_dict['task1'] = False\n if sys.argv[1] == 'task3':\n defs.TASK1_REFRESH['task3'] = True\n\n\n**Caveat 1**\n\nWhen collecting refresh options in a python module (say defs.py) using immutable variables like REFRESH = True,\none needs to be cautious if there is a need to change them during runtime:\n\n.. code-block:: python\n\n import defs # NOT from defs import REFRESH\n defs.REFRESH = True # NOT REFRESH = True\n\nIn python modules are objects, and doing `import defs` will give a reference to the module, and the variable REFRESH\ncan be changed in the module using `defs.REFRESH = True`. However, `from defs import REFRESH` gives us a reference\nto the immutable, a local copy of which is made when changed, without altering the module variable.\n\n**Caveat 2**\n\nWhen changing the refresh option through command line options, or the like, it is better to use a lambda function as\n\n.. code-block:: python\n\n # module.py\n import defs\n @checkpoint(..., refresh=lambda: defs.REFRESH)\n def myfunc():\n pass\n\nSince the default values are evaluated at the definition time and are bound to the argument, using a lambda,\n(or something mutable) we ensure that we are taking the current value of REFRESH.\n\nContribute\n----------\n\n#. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.\n#. Fork `the repository`_ on GitHub to start making your changes to the **master** branch (or branch off of it).\n#. Write a test which shows that the bug was fixed or that the feature works as expected.\n#. Send a pull request and bug the maintainer until it gets merged and published. :) Make sure to add yourself to AUTHORS_.\n\n.. _`the repository`: http://github.com/mpavan/ediblepickle\n.. _AUTHORS: https://github.com/mpavan/ediblepickle/blob/master/AUTHORS.rst", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/mpavan/ediblepickle", "keywords": "decorator,checkpoint,intermediate results,serialization,deserialization", "license": "Apache 2.0", "maintainer": null, "maintainer_email": null, "name": "ediblepickle", "package_url": "https://pypi.org/project/ediblepickle/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/ediblepickle/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/mpavan/ediblepickle" }, "release_url": "https://pypi.org/project/ediblepickle/1.1.3/", "requires_dist": null, "requires_python": null, "summary": "Checkpoint", "version": "1.1.3" }, "last_serial": 865189, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "01cf3f30976a9c815932cfb07c8a6321", "sha256": "4fad6a71bd59d7bb32294cd8184ee82e382d10c3e95436196de675fcea56a40e" }, "downloads": -1, "filename": "ediblepickle-1.0.tar.gz", "has_sig": false, "md5_digest": "01cf3f30976a9c815932cfb07c8a6321", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5116, "upload_time": "2013-07-16T00:02:49", "url": "https://files.pythonhosted.org/packages/ae/da/f081574b98da07f6dde507f320dd774526757d54b72a30d9f7cfc37bd1b2/ediblepickle-1.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "bc7e768f6c04b907f6d94438ba4e6b17", "sha256": "c1e0295c3de84253366c8a1714c59f28ffffb500d8ec2220d4bcb3b2706f7e33" }, "downloads": -1, "filename": "ediblepickle-1.0.1.tar.gz", "has_sig": false, "md5_digest": "bc7e768f6c04b907f6d94438ba4e6b17", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5143, "upload_time": "2013-07-16T05:54:25", "url": "https://files.pythonhosted.org/packages/11/f2/5d65dffe35db2b665a65a9c3bad08198a0afc386095e3b82c28d0f62b9b9/ediblepickle-1.0.1.tar.gz" } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "cb0d699aad64e5fd2552da0c068bf616", "sha256": "c0153053a59316e526c3af5442bef76f5bda7a104c7351052bc1da991d9bcb36" }, "downloads": -1, "filename": "ediblepickle-1.0.2.tar.gz", "has_sig": false, "md5_digest": "cb0d699aad64e5fd2552da0c068bf616", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7266, "upload_time": "2013-07-16T23:29:40", "url": "https://files.pythonhosted.org/packages/b1/b1/b113865899f4b91dd7aa509d3481e9ddfd2ae0418bf7180e5512465f4e4a/ediblepickle-1.0.2.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "03b755f63a9fd45b70213df5d17a7672", "sha256": "7f02fc6ca51ac2d638c71f5582225b83ac386aaa7065b7ce93132e49ec53195a" }, "downloads": -1, "filename": "ediblepickle-1.1.0.tar.gz", "has_sig": false, "md5_digest": "03b755f63a9fd45b70213df5d17a7672", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7401, "upload_time": "2013-09-13T19:25:51", "url": "https://files.pythonhosted.org/packages/3d/00/aebd5490202410116f9fa5f1ab441f9f96a4a989db0f5640801abd99a0b4/ediblepickle-1.1.0.tar.gz" } ], "1.1.1": [ { "comment_text": "", "digests": { "md5": "a89b8c35370883e2b81e4e0cd25924c9", "sha256": "945dda0cc9e46465405fb20762aaf96569b66cde131806eb4c10dffb2b76f064" }, "downloads": -1, "filename": "ediblepickle-1.1.1.tar.gz", "has_sig": false, "md5_digest": "a89b8c35370883e2b81e4e0cd25924c9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7943, "upload_time": "2013-09-13T20:56:32", "url": "https://files.pythonhosted.org/packages/b3/87/8da1eed48ce16ad67fd0f8403cba382e802e0f25c230ba6935e9433fed48/ediblepickle-1.1.1.tar.gz" } ], "1.1.2": [ { "comment_text": "", "digests": { "md5": "a9a1517e3445529f0a62629ec38e0218", "sha256": "3696fe6dbe8f9446697ad7d24737f84b856feb53f3d5a97dbc82d6ea1f2f8925" }, "downloads": -1, "filename": "ediblepickle-1.1.2.tar.gz", "has_sig": false, "md5_digest": "a9a1517e3445529f0a62629ec38e0218", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7995, "upload_time": "2013-09-13T21:00:23", "url": "https://files.pythonhosted.org/packages/85/0e/94db6294703de49c35f74c1a7b0b55a2e6868dbe2a2d8305d7fbcf605ce2/ediblepickle-1.1.2.tar.gz" } ], "1.1.3": [ { "comment_text": "", "digests": { "md5": "c0ec625856dd670243c7a5d563bb1d24", "sha256": "daa4546c772dc2f7291cf6581d8c4e6525243d47e5eb9eb21df03984a7904df7" }, "downloads": -1, "filename": "ediblepickle-1.1.3.tar.gz", "has_sig": false, "md5_digest": "c0ec625856dd670243c7a5d563bb1d24", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8000, "upload_time": "2013-09-13T21:03:17", "url": "https://files.pythonhosted.org/packages/3f/e9/9a91273827d7900a8c591945087a3911e2a2390895876920bcc2a46bc1a0/ediblepickle-1.1.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c0ec625856dd670243c7a5d563bb1d24", "sha256": "daa4546c772dc2f7291cf6581d8c4e6525243d47e5eb9eb21df03984a7904df7" }, "downloads": -1, "filename": "ediblepickle-1.1.3.tar.gz", "has_sig": false, "md5_digest": "c0ec625856dd670243c7a5d563bb1d24", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8000, "upload_time": "2013-09-13T21:03:17", "url": "https://files.pythonhosted.org/packages/3f/e9/9a91273827d7900a8c591945087a3911e2a2390895876920bcc2a46bc1a0/ediblepickle-1.1.3.tar.gz" } ] }