{ "info": { "author": "Jon\u00e1\u0161 Kulh\u00e1nek", "author_email": "jonas.kulhanek@live.com", "bugtrack_url": null, "classifiers": [], "description": "# Deep RL PyTorch\n[![https://www.singularity-hub.org/static/img/hosted-singularity--hub-%23e32929.svg](https://www.singularity-hub.org/static/img/hosted-singularity--hub-%23e32929.svg)](https://singularity-hub.org/collections/2581)\n\nThis repo contains implementations of popular Deep RL algorithms. It also provides a unified interface for training and evaluation, with unified model saving and visualization. It can be used as a good starting point when implementing a new RL algorithm in PyTorch.\n\n## Getting started\nIf you want to base your algorithm on this repository, start by installing it as a package:\n```\npip install git+https://github.com/jkulhanek/deep-rl-pytorch.git\n```\n\nIf you want to run the attached experiments yourself, feel free to clone this repository:\n```\ngit clone https://github.com/jkulhanek/deep-rl-pytorch.git\n```\n\nAll dependencies are prepared in a Docker container. If you have nvidia-docker enabled, you can use this image. To pull and start the image, just run:\n\n```\ndocker run --runtime=nvidia --net=host -it kulhanek/deep-rl-pytorch:latest bash\n```\n\nFrom there, you can either clone your own repository containing your experiments or clone this one.\n\n## Concepts\nAll algorithms are implemented as base classes. In your experiment, you need to subclass those base classes. The `deep_rl.core.AbstractTrainer` class is the base for all trainers, and all algorithms inherit from it. Each trainer can be wrapped in several wrappers (classes extending `deep_rl.core.AbstractWrapper`). Those wrappers are used for saving, logging, terminating the experiment, and so on. All experiments should be registered using the `@deep_rl.register_trainer` decorator. This decorator then wraps the trainer with the default wrappers. This can be controlled by passing arguments to the decorator. 
All registered trainers (experiments) can be run by calling `deep_rl.make_trainer(<>).run()`.\n\n## Implemented algorithms\n### A2C\nA2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C) [2] which, according to OpenAI [1], achieves equal performance. It is, however, more efficient in its GPU utilization.\n\nStart your experiment by subclassing `deep_rl.a2c.A2CTrainer`.\nSeveral models are included in `deep_rl.a2c.model`. You may want to use at least some of the helper modules contained in this package when designing your own experiment.\n\nIn most of the models, initialization is done according to [3].\n\n### Asynchronous Advantage Actor Critic (A3C) [2]\nThis implementation uses multiprocessing. It comes with two optimizers: RMSprop and Adam.\n\n### Actor Critic using Kronecker-Factored Trust Region (ACKTR) [1]\nThis is an improvement over A2C described in [1].\n\n## Experiments\n> Coming soon\n\n## Requirements\nThese packages must be installed before using the framework for your own algorithm:\n- OpenAI baselines (can be installed by running `pip install git+https://github.com/openai/baselines.git`)\n- PyTorch\n- Visdom (`pip install visdom`)\n- Gym (`pip install gym`)\n- Matplotlib\n\nThese packages must be installed prior to running the experiments:\n- DeepMind Lab\n- Gym[atari]\n\n## Sources\nThis repository is based on the work of several other authors. We would like to express our thanks.\n- https://github.com/openai/baselines/tree/master/baselines\n- https://github.com/ikostrikov/pytorch-a2c-ppo-acktr/tree/master/a2c_ppo_acktr\n- https://github.com/miyosuda/unreal\n- https://github.com/openai/gym\n\n## References\n[1] Wu, Y., Mansimov, E., Grosse, R.B., Liao, S. and Ba, J., 2017. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. In Advances in neural information processing systems (pp. 5279-5288).\n\n[2] Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D. 
and Kavukcuoglu, K., 2016, June. Asynchronous methods for deep reinforcement learning. In International conference on machine learning (pp. 1928-1937).\n\n[3] Saxe, A.M., McClelland, J.L. and Ganguli, S., 2013. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120.\n\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "deep-rl", "package_url": "https://pypi.org/project/deep-rl/", "platform": "", "project_url": "https://pypi.org/project/deep-rl/", "project_urls": null, "release_url": "https://pypi.org/project/deep-rl/0.2.7/", "requires_dist": [ "gym" ], "requires_python": "", "summary": "", "version": "0.2.7" }, "last_serial": 5851151, "releases": { "0.2.7": [ { "comment_text": "", "digests": { "md5": "7e017bff3f2e27537460aea8fb157edd", "sha256": "d309984a3583b8d9b491143b45b8a18b09f3159fba997781fa5242728f09387f" }, "downloads": -1, "filename": "deep_rl-0.2.7-py3-none-any.whl", "has_sig": false, "md5_digest": "7e017bff3f2e27537460aea8fb157edd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 72235, "upload_time": "2019-09-18T15:58:43", "url": "https://files.pythonhosted.org/packages/47/a7/d265612b40c44c58c8fa5ecd195b314418ccf96e6576d768bccf7945eafb/deep_rl-0.2.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8bdfffe214ab1abde1a2deed465279c5", "sha256": "e92cf7dd1432d48fd8a078a969be1d87f73e7e8dc26a66e2ba29180b8d954f0c" }, "downloads": -1, "filename": "deep_rl-0.2.7.tar.gz", "has_sig": false, "md5_digest": "8bdfffe214ab1abde1a2deed465279c5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52163, "upload_time": "2019-09-18T15:58:45", "url": 
"https://files.pythonhosted.org/packages/37/cb/671ab02f899670c1d28430d8286c38b0c233884d13d80e3adac4de61b578/deep_rl-0.2.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7e017bff3f2e27537460aea8fb157edd", "sha256": "d309984a3583b8d9b491143b45b8a18b09f3159fba997781fa5242728f09387f" }, "downloads": -1, "filename": "deep_rl-0.2.7-py3-none-any.whl", "has_sig": false, "md5_digest": "7e017bff3f2e27537460aea8fb157edd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 72235, "upload_time": "2019-09-18T15:58:43", "url": "https://files.pythonhosted.org/packages/47/a7/d265612b40c44c58c8fa5ecd195b314418ccf96e6576d768bccf7945eafb/deep_rl-0.2.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8bdfffe214ab1abde1a2deed465279c5", "sha256": "e92cf7dd1432d48fd8a078a969be1d87f73e7e8dc26a66e2ba29180b8d954f0c" }, "downloads": -1, "filename": "deep_rl-0.2.7.tar.gz", "has_sig": false, "md5_digest": "8bdfffe214ab1abde1a2deed465279c5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 52163, "upload_time": "2019-09-18T15:58:45", "url": "https://files.pythonhosted.org/packages/37/cb/671ab02f899670c1d28430d8286c38b0c233884d13d80e3adac4de61b578/deep_rl-0.2.7.tar.gz" } ] }