{ "info": { "author": "Michael Wooley", "author_email": "michael.wooley@us.gt.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Operating System :: OS Independent", "Programming Language :: Python :: 3.7" ], "description": "# `yapij-python`: Python-side of `yapij` package.\n\n## Implementation Details\n\nThere are two key challenges that one faces when implementing an interpreter emulator:\n\n1. Catching and processing program output.\n1. Interrupting code before it is completed.\n\nAt the same time, we'd also like:\n\n1. The ability to send other commands to the python process _while code is running._ For example, save a workspace while code is running.\n1. Check on the health of the process with a heartbeat.\n\nThe main ingredients of the solution are:\n\n1. Multi-threading for a main interface, the interpreter, and a heartbeat.\n1. `asyncio` scheduling off the main loop.\n1. Context managers that overwrite `sys.stdout` and `sys.stderr` with an emulator. _Appropriate placement of the context managers are key!_\n\n### Misc. Details\n\n#### Placement of context managers\n\nA context manager that is called within a thread will \"bubble up\" to parent threads so long as it is running. See the [appendix example](#context-managers-in-a-thread). (However, there is no \"bubbling down\" from parent to child threads.)\n\nThis is problematic in our context because the threads will run as long as the context. (Therefore, `Thread.join` is not an option.)\n\nTherefore, we place `catch_output` - the main context manager that formats `print` statements, exceptions, and `sys.stdout` in general - in the child thread `ExecSession`. This thread executes commands sent to the editor.\n\nAll such similar statements in the main thread are also handled by `catch_output` due to the \"bubbling up\" behavior.\n\n### Rejected Alternatives\n\n- Use standard `exec` and `runpy` to excute input:\n - Built-in module `code` provides `InteractiveInterpreter` and `InteractiveConsole` classes for doing just this.\n - Running code on instances of these objects still blocks. Therefore, does nothing for the `KeyboardInterrupt` problem.\n - Also do nothing for the last line print.\n\n## Known Issues and Limitations\n\n### Execution\n\n#### Threading\n\n- The session interpreter runs on its own thread. Therefore, certain applications may not run as expected.\n- For example, the `signal` module cannot run on a non-main thread.\n- Consider flipping around so that \"main\" thread is `ExecSession`.\n\n#### `sys.stdout` and `sys.stderr`\n\n- In order to communicate with the node process, `sys.stdout` and `sys.stderr` are overridden with an instance of a custom class `ZMQIOWrapper`.\n- The custom class is built to emulate the classes underlying `sys.stdout`. In particular, it inherits class `io.TextIOWrapper`.\n- However, full equivelance is not gauranteed at this time.\n- Moreover, attempts to re-route `sys.stdout` from within the interpreter may not work as expected or may fail to revert as expected.\n\n### Security\n\nThe point of this module is to permit _arbitrary code execution_. It is by no means secure.\n\n#### Workspace Manager\n\n- The workspace manager currently saves objects using the `dill` module, which is based on `pickle`\n- We use `dill` because it allows us to preserve the state of a huge range of objects.\n- The problem is that, if it is possible to pickle anything, then it will also be possible to pickle malicious code.\n - See the useful articles by [Nicolas Lara](https://lincolnloop.com/blog/playing-pickle-security/) and [Kevin London](https://www.kevinlondon.com/2015/08/15/dangerous-python-functions-pt2.html).\n - An example of a malicious `dill` exploit can be found in the [appendix](#a-dill-exploit)\n- The current approach is to add a key to each file following the approach outlined [here](https://www.synopsys.com/blogs/software-security/python-pickling/).\n - This will raise a flag and fail to load if the generated key does not match the data.\n - **_It cannot protect in cases where someone malicious correctly decodes then re-encodes a file (or puts malicious code in the file to start)._**\n - Thus, this is best thought of as a way of being protected from code that might be naively injected into the pickled workspace when it is transferred between two known users (i.e. via a poorly-executed [man-in-the-middle attack](https://en.wikipedia.org/wiki/Man-in-the-middle_attack).)\n- Further refinements might included using `pickletools.dis` to inspect files for red flags. (See the [example](#a-dill-exploit) code for what that spits out.)\n - This will still never be completely secure.\n- Jupyter Notebook stores keys in a separate `db`.\n - [Docs](https://jupyter-notebook.readthedocs.io/en/stable/security.html#the-details-of-trust)\n - [Some code references](https://github.com/jupyter/jupyter_core/blob/f1e18b8a52cd526c0cd1402b6041778dd60f20dc/jupyter_core/migrate.py#L16)\n - Where would it be stored on this module? How is db started?\n\n#### A \"Safe Mode\"?\n\n- It is really hard to do anything like a sandbox for python.\n- In Python 2.3 [`rexec`](https://docs.python.org/2/library/restricted.html) was disabled due to \"various known and not readily fixable security holes.\"\n- Therefore, we take the stance that - instead of trying to offer security some of the time - we will always allow arbitrary execution in the hopes that this keeps users vigilant.\n\n#### Security Best Practices\n\nBest practices for yapij are identical to best practices for running any python code:\n\n- Never load a workspace from someone that you do not know and trust.\n- Never install a python package that you do not know or trust.\n\n## Packaging\n\n- Packaging is carried out with [PyPRI](https://www.python-private-package-index.com/).\n- A new version is compiled by a job (using `.gitlab-ci.yaml`) every time that the a new commit is pushed with version (I think it depends on a tag being added.)\n- Go to CLI to see the jobs.\n- Use `pipreqs yapij` to make `requirements.txt`\n\n### Dependencies\n\nThe main non-standard dependencies are:\n\n- **[`pyzmq`/`zmq`](https://github.com/zeromq/pyzmq)**: \"\u00c3\u02dcMQ is a lightweight and fast messaging implementation.\"\n- **[`msgpack_python`/`msgpack`](https://github.com/msgpack/msgpack-python)**: \"MessagePack is an efficient binary serialization format. It lets you exchange data among multiple languages like JSON. But it's faster and smaller.\"\n- **[`dill`](https://pypi.org/project/dill/)**: \"dill extends python\u00e2\u20ac\u2122s pickle module for serializing and de-serializing python objects to the majority of the built-in python types.\"\n\nWe also provide custom serialization for `NumPy` arrays and `Pandas` dataframes. Thus, these become dependencies as well.\n\n## About\n\n### Contact\n\nMichael Wooley\n\n[michael.wooley@us.gt.com](mailto:michael.wooley@us.gt.com)\n\n[michaelwooley.github.io](michaelwooley.github.io)\n\n### License\n\nUNLICENSED\n\n(Sorry, not my choice.)\n\n## Appendix\n\n### A `dill` Exploit\n\nDrawn from [Kevin London's _Dangerous Python Functions, Part 2_](https://www.kevinlondon.com/2015/08/15/dangerous-python-functions-pt2.html)\n\n```python\nimport os\nimport dill\nimport pickletools\n\n# Exploit that we want the target to unpickle\nclass Exploit(object):\n def __reduce__(self):\n # Note: this will only list files in your directory.\n # It is a proof of concept.\n return (os.system, ('dir',))\n\n\ndef serialize_exploit():\n shellcode = dill.dumps({'e': Exploit(), 's': dill.dumps})\n return shellcode\n\n\ndef insecure_deserialize(exploit_code):\n dill.loads(exploit_code)\n\n\nif __name__ == '__main__':\n shellcode = serialize_exploit()\n print('~'*80,'IF I CAN SEE YOUR FILES I CAN USUALLY DELETE THEM AS WELL', '~'*80, sep='\\n')\n insecure_deserialize(shellcode)\n\n print('~'*80,'WHAT IF WE MADE USE OF SHELL CODE TO LOOK FOR RED FLAGS LIKE \"REDUCE\"?', '~'*80, sep='\\n')\n pickletools.dis(shellcode)\n```\n\n### Context managers in a thread\n\n```python\nimport threading\nimport os\nimport sys\nimport contextlib\nimport copy\n\n# Original\nprint_original = copy.copy(__builtins__.print)\n\ndef print_modified(*objects, sep=' ', end='\\n', file=sys.stdout, flush=True):\n return print_original('[Context]', *objects, sep=sep, end=end, file=file, flush=flush)\n\n@contextlib.contextmanager\ndef catch_output():\n try:\n __builtins__.print = print_modified\n yield\n finally:\n __builtins__.print = print_original\n\nclass WorkerThread(threading.Thread):\n\n def run(self):\n with catch_output(False):\n time.sleep(3)\n print('Inside Context')\n time.sleep(3)\n print('Outside Context')\n\nw = WorkerThread()\nw.start()\nprint('Yep')\n```\n\nWill return something like:\n\n```\n[Context] Yep\n[Context] Inside Context\nOutside Context\n```\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/michaelwooley/yapij", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "yapij-py", "package_url": "https://pypi.org/project/yapij-py/", "platform": "", "project_url": "https://pypi.org/project/yapij-py/", "project_urls": { "Homepage": "https://github.com/michaelwooley/yapij" }, "release_url": "https://pypi.org/project/yapij-py/999/", "requires_dist": [ "dill", "traitlets", "typing", "msgpack-python", "pyzmq" ], "requires_python": ">=3.5", "summary": "Python-side of YAPIJ js-to-python interpreter.", "version": "999" }, "last_serial": 4573332, "releases": { "999": [ { "comment_text": "", "digests": { "md5": "0f5494bbdd95d689b886d17f49f3e4b0", "sha256": "0d9ec4601dd23c4360dd495f8f05417460a9b5237abda6045ab197505f17d8fb" }, "downloads": -1, "filename": "yapij_py-999-py3-none-any.whl", "has_sig": false, "md5_digest": "0f5494bbdd95d689b886d17f49f3e4b0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 27748, "upload_time": "2018-12-07T21:02:36", "url": "https://files.pythonhosted.org/packages/1a/dc/414d8232d9fc9b99c4c243034719e27b510c7c6b762fdfb04a72173fb5c1/yapij_py-999-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1c482af3f680ce4a7cc3ec1630bc3f3d", "sha256": "14ed3751adbfb48379e15e1729fb781ac1bd56788373526ae9b9325d1ccfb527" }, "downloads": -1, "filename": "yapij-py-999.tar.gz", "has_sig": false, "md5_digest": "1c482af3f680ce4a7cc3ec1630bc3f3d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 26349, "upload_time": "2018-12-07T21:02:38", "url": "https://files.pythonhosted.org/packages/32/64/18b20c6cd6be2d54d8e11b92cde88597c26e91e5d5ded9684d9d88162b20/yapij-py-999.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "0f5494bbdd95d689b886d17f49f3e4b0", "sha256": "0d9ec4601dd23c4360dd495f8f05417460a9b5237abda6045ab197505f17d8fb" }, "downloads": -1, "filename": "yapij_py-999-py3-none-any.whl", "has_sig": false, "md5_digest": "0f5494bbdd95d689b886d17f49f3e4b0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 27748, "upload_time": "2018-12-07T21:02:36", "url": "https://files.pythonhosted.org/packages/1a/dc/414d8232d9fc9b99c4c243034719e27b510c7c6b762fdfb04a72173fb5c1/yapij_py-999-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1c482af3f680ce4a7cc3ec1630bc3f3d", "sha256": "14ed3751adbfb48379e15e1729fb781ac1bd56788373526ae9b9325d1ccfb527" }, "downloads": -1, "filename": "yapij-py-999.tar.gz", "has_sig": false, "md5_digest": "1c482af3f680ce4a7cc3ec1630bc3f3d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 26349, "upload_time": "2018-12-07T21:02:38", "url": "https://files.pythonhosted.org/packages/32/64/18b20c6cd6be2d54d8e11b92cde88597c26e91e5d5ded9684d9d88162b20/yapij-py-999.tar.gz" } ] }