{ "info": { "author": "nteract contributors", "author_email": "nteract@googlegroups.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "Intended Audience :: System Administrators", "License :: OSI Approved :: BSD License", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3" ], "description": "\n\n[![Build Status](https://travis-ci.org/nteract/scrapbook.svg?branch=master)](https://travis-ci.org/nteract/scrapbook)\n[![image](https://codecov.io/github/nteract/scrapbook/coverage.svg?branch=master)](https://codecov.io/github/nteract/scrapbook=master)\n[![Documentation Status](https://readthedocs.org/projects/nteract-scrapbook/badge/?version=latest)](https://nteract-scrapbook.readthedocs.io/en/latest/?badge=latest)\n[![badge](https://tinyurl.com/ybk8qa3j)](https://mybinder.org/v2/gh/nteract/scrapbook/master?filepath=binder%2FResultsDemo.ipynb)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)\n\n# scrapbook\n\n**scrapbook** is a library for recording a notebook\u2019s data values (scraps) and\ngenerated visual content (snaps). 
These recorded scraps and snaps can be read\nat a future time.\n\nTwo new names for information are introduced in scrapbook:\n\n- **scraps**: serializable data values such as strings, lists of objects, pandas\n dataframes, or data table references.\n- **snaps**: named displays of information such as a generated image, plot,\n or UI message which encapsulate information but do not store the underlying\n data.\n\n## Use Case\n\nNotebook users may wish to record data produced during a notebook execution.\nThis recorded data can then be read to be used at a later time or be passed to\nanother notebook as input.\n\nNamely, scrapbook lets you:\n\n- **persist** data (scraps) in a notebook\n- **sketch** named displays (snaps) in notebooks\n- **recall** any persisted scrap of data or displayed snap\n- **summarize collections** of notebooks\n\n## API Calls\n\nScrapbook adds a few basic API commands that enable saving and retrieving data.\n\n### `glue` to persist scraps\n\nRecords a `scrap` (data value) in the given notebook cell.\n\nThe `scrap` (recorded value) can be retrieved during later inspection of the\noutput notebook.\n\n```python\nsb.glue(\"hello\", \"world\")\nsb.glue(\"number\", 123)\nsb.glue(\"some_list\", [1, 3, 5])\nsb.glue(\"some_dict\", {\"a\": 1, \"b\": 2})\nsb.glue(\"non_json\", df, 'arrow')\n```\n\nThe scrapbook library can be used later to recover scraps (recorded values)\nfrom the output notebook:\n\n```python\nnb = sb.read_notebook('notebook.ipynb')\nnb.scraps\n```\n\n**scrapbook** will infer the storage format from the value type using the registered\ndata translators. Alternatively, the inferred storage format can be overridden by\nsetting the `storage` argument to the registered name (e.g. `\"json\"`) of a\nparticular translator.\n\nThis data is persisted by generating a display output with a special media type\nidentifying the content storage format and data. These outputs are not visible in\nnotebook rendering but still exist in the document. 
Scrapbook can then rehydrate\nthe data associated with the notebook in the future by reading these cell outputs.\n\n### `sketch` to save _display output_\n\nDisplay a named snap (visible display output) in a retrievable manner.\n\nUnlike `glue`, `sketch` is intended to generate a visible display output\nfor notebook interfaces to render.\n\n```python\n# record an image highlight\nsb.sketch(\"sharable_png\", IPython.display.Image(filename=get_fixture_path(\"sharable.png\")))\n# record a UI message highlight\nsb.sketch(\"hello\", \"Hello World\")\n```\n\nLike scraps, these can be retrieved at a later time. Unlike scraps, snaps\ndo not carry any actual underlying data, keeping just the display result of some\nobject.\n\n```python\nnb = sb.read_notebook('notebook.ipynb')\n# Returns the dict of name -> snap pairs saved in `nb`\nnb.snaps\n```\n\nMore usefully, you can copy snaps from earlier notebook executions to re-display\nthe object in the current notebook.\n\n```python\nnb = sb.read_notebook('notebook.ipynb')\nnb.copy_highlight(\"sharable_png\")\n```\n\n### `read_notebook` reads one notebook\n\nReturns a Notebook object loaded from the location specified by `path`.\nYou've already seen how this function is used in the above API call examples,\nbut essentially this provides a thin wrapper over an `nbformat` notebook object\nwith the ability to extract scrapbook scraps and snaps.\n\n```python\nnb = sb.read_notebook('notebook.ipynb')\n```\n\nThe abstraction makes saved content available as a dataframe referencing each\nkey and source. More of these methods will be made available in later versions.\n\n```python\n# Produces a data frame with [\"name\", \"value\", \"type\", \"filename\"] as columns\nnb.scrap_dataframe\n```\n\nThe Notebook object also has a few legacy functions for backwards compatibility\nwith papermill's Notebook object model. 
As a result, it can be used to read\npapermill execution statistics as well as scrapbook abstractions:\n\n```python\nnb.cell_timing # List of cell execution timings in cell order\nnb.execution_counts # List of cell execution counts in cell order\nnb.papermill_metrics # Dataframe of cell execution counts and times\nnb.parameter_dataframe # Dataframe of notebook parameters\nnb.papermill_dataframe # Dataframe of notebook parameters and cell scraps\n```\n\nThe notebook reader relies on [papermill's registered iorw](https://papermill.readthedocs.io/en/latest/reference/papermill-io.html)\nto enable access to a variety of sources such as -- but not limited to -- S3,\nAzure, and Google Cloud.\n\n### `read_notebooks` reads many notebooks\n\nReads all notebooks located in a given `path` into a Scrapbook object.\n\n```python\n# create a scrapbook named `book`\nbook = sb.read_notebooks('path/to/notebook/collection/')\n# get the underlying notebooks as a list\nbook.sorted_notebooks\n```\n\nThe Scrapbook (`book` in this example) can be used to recall all scraps across\nthe collection of notebooks:\n\n```python\nbook.scraps # Map of {notebook -> {name -> scrap}}\nbook.flat_scraps # Map of {name -> scrap}\n```\n\nOr to collect snaps:\n\n```python\nbook.snaps # Map of {notebook -> {name -> snap}}\nbook.flat_highlights # Map of {name -> snap}\n```\n\nThe Scrapbook collection can also be used to `display` all the snaps from the\ncollection as markdown-structured output.\n\n```python\nbook.display()\n```\n\nThis display can filter on snap names and keys, as well as enable or disable\nan overall header for the display.\n\nFinally, the Scrapbook has two backwards-compatible features for deprecated\n`papermill` capabilities:\n\n```python\nbook.papermill_dataframe\nbook.papermill_metrics\n```\n\nThese functions also rely on [papermill's registered `iorw`](https://papermill.readthedocs.io/en/latest/reference/papermill-io.html)\nto list and read files from various sources.\n\n## 
Storage Formats\n\nStorage formats are accessible by key names to Translator objects registered\nagainst the `translators.registry` object. To register a new data\ntranslator / loader, simply call:\n\n```python\n# add translator to the registry\nregistry.register(\"custom_store_name\", MyCustomTranslator())\n```\n\nThe store class must implement two methods, `translate` and `load`:\n\n```python\nclass MyCustomTranslator(object):\n def translate(self, scrap):\n pass # TODO: Implement\n\n def load(self, scrap):\n pass # TODO: Implement\n```\n\nThese methods transform scraps into a string representing their contents or\nlocation and load those strings back into the original data objects.\n\n### `unicode`\n\nA basic string storage format that saves data as Python strings.\n\n```python\nsb.glue(\"hello\", \"world\", \"unicode\")\n```\n\n### `json`\n\n```python\nsb.glue(\"foo_json\", {\"foo\": \"bar\", \"baz\": 1}, \"json\")\n```\n\n### `arrow`\n\nImplementation Pending!\n\n## papermill's deprecated `record` feature\n\n**scrapbook** provides a robust and flexible recording schema. 
This library is\nintended to replace [papermill](https://papermill.readthedocs.io)'s existing\n`record` functionality.\n\n[Documentation for papermill record](https://papermill.readthedocs.io/en/latest/usage.html#recording-values-to-the-notebook)\nIn brief:\n\n`pm.record(name, value)`: enabled users to record values to be saved\nwith the notebook [[API documentation]](https://papermill.readthedocs.io/en/latest/reference/papermill.html#papermill.api.record)\n\n```python\npm.record(\"hello\", \"world\")\npm.record(\"number\", 123)\npm.record(\"some_list\", [1, 3, 5])\npm.record(\"some_dict\", {\"a\": 1, \"b\": 2})\n```\n\n`pm.read_notebook(notebook)`: pandas could be used later to recover recorded\nvalues by reading the output notebook into a dataframe.\n\n```python\nnb = pm.read_notebook('notebook.ipynb')\nnb.dataframe\n```\n\n### Limitations and challenges\n\n- The `record` function didn't follow papermill's pattern of linear execution\n of a notebook codebase. (It was awkward to describe `record` as an additional\n feature of papermill. It really felt like describing a second, less\n developed library.)\n- Recording / reading required data translation to JSON for everything. 
This is\n a tedious, painful process for dataframes.\n- Reading recorded values into a dataframe would result in unintuitive dataframe\n shapes.\n- Less modularity and flexibility than other papermill components where custom\n operators can be registered.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/nteract/scrapbook", "keywords": "jupyter mapreduce nteract pipeline notebook", "license": "BSD", "maintainer": "", "maintainer_email": "", "name": "scrapbook-beta", "package_url": "https://pypi.org/project/scrapbook-beta/", "platform": "", "project_url": "https://pypi.org/project/scrapbook-beta/", "project_urls": { "Funding": "https://nteract.io", "Homepage": "https://github.com/nteract/scrapbook", "Source": "https://github.com/nteract/scrapbook/", "Tracker": "https://github.com/nteract/scrapbook/issues" }, "release_url": "https://pypi.org/project/scrapbook-beta/0.1.0/", "requires_dist": [ "pandas", "six", "papermill", "future", "ipython (>=5.0)", "requests (>=2.21.0)", "futures ; python_version < \"3.0\"", "bumpversion ; extra == 'dev'", "wheel (>=0.31.0) ; extra == 'dev'", "setuptools (>=38.6.0) ; extra == 'dev'", "twine (>=1.11.0) ; extra == 'dev'", "flake8 ; extra == 'dev'", "tox ; extra == 'dev'", "mock ; extra == 'dev'", "pytest (>=4.1) ; extra == 'dev'", "pytest-cov (>=2.6.1) ; extra == 'dev'", "pytest-mock (>=1.10) ; extra == 'dev'", "pytest-env (>=0.6.2) ; extra == 'dev'", "codecov ; extra == 'dev'", "coverage ; extra == 'dev'", "bumpversion ; extra == 'test'", "wheel (>=0.31.0) ; extra == 'test'", "setuptools (>=38.6.0) ; extra == 'test'", "twine (>=1.11.0) ; extra == 'test'", "flake8 ; extra == 'test'", "tox ; extra == 'test'", "mock ; extra == 'test'", "pytest (>=4.1) ; extra == 'test'", "pytest-cov (>=2.6.1) ; extra == 'test'", "pytest-mock (>=1.10) ; extra == 'test'", "pytest-env (>=0.6.2) ; extra == 'test'", 
"codecov ; extra == 'test'", "coverage ; extra == 'test'" ], "requires_python": "", "summary": "A library for recording and reading data in Jupyter and nteract Notebooks", "version": "0.1.0" }, "last_serial": 4776226, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "8be17e9b7d90de852b100b631c1efbfe", "sha256": "1ac0950680981a1017cd030099a9d76548a1a8e3b150ac417094fbe0ecda7fa1" }, "downloads": -1, "filename": "scrapbook_beta-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "8be17e9b7d90de852b100b631c1efbfe", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 12973, "upload_time": "2019-02-04T01:57:14", "url": "https://files.pythonhosted.org/packages/ae/a3/1e9212e7660654d1d29372342574fdd821d60f0ad1ef2214754f01c11b28/scrapbook_beta-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b13b8f9624c28682336b822cc2cb0f94", "sha256": "11768595e437483c5f71b2abc3bbfb903fb60e6bf243d30927f81fe09c072905" }, "downloads": -1, "filename": "scrapbook-beta-0.1.0.tar.gz", "has_sig": false, "md5_digest": "b13b8f9624c28682336b822cc2cb0f94", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 35272, "upload_time": "2019-02-04T01:57:16", "url": "https://files.pythonhosted.org/packages/e7/8f/eaf6b9b525943a4598e05029ac4a1abfa3b4ef0b915fdde30f1a573478bd/scrapbook-beta-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8be17e9b7d90de852b100b631c1efbfe", "sha256": "1ac0950680981a1017cd030099a9d76548a1a8e3b150ac417094fbe0ecda7fa1" }, "downloads": -1, "filename": "scrapbook_beta-0.1.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "8be17e9b7d90de852b100b631c1efbfe", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 12973, "upload_time": "2019-02-04T01:57:14", "url": 
"https://files.pythonhosted.org/packages/ae/a3/1e9212e7660654d1d29372342574fdd821d60f0ad1ef2214754f01c11b28/scrapbook_beta-0.1.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b13b8f9624c28682336b822cc2cb0f94", "sha256": "11768595e437483c5f71b2abc3bbfb903fb60e6bf243d30927f81fe09c072905" }, "downloads": -1, "filename": "scrapbook-beta-0.1.0.tar.gz", "has_sig": false, "md5_digest": "b13b8f9624c28682336b822cc2cb0f94", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 35272, "upload_time": "2019-02-04T01:57:16", "url": "https://files.pythonhosted.org/packages/e7/8f/eaf6b9b525943a4598e05029ac4a1abfa3b4ef0b915fdde30f1a573478bd/scrapbook-beta-0.1.0.tar.gz" } ] }