{ "info": { "author": "Yasuhiro Fujita", "author_email": "fujita@preferred.jp", "bugtrack_url": null, "classifiers": [], "description": "
\n\n# ChainerRL\n[![Build Status](https://travis-ci.org/chainer/chainerrl.svg?branch=master)](https://travis-ci.org/chainer/chainerrl)\n[![Coverage Status](https://coveralls.io/repos/github/chainer/chainerrl/badge.svg?branch=master)](https://coveralls.io/github/chainer/chainerrl?branch=master)\n[![Documentation Status](https://readthedocs.org/projects/chainerrl/badge/?version=latest)](http://chainerrl.readthedocs.io/en/latest/?badge=latest)\n[![PyPI](https://img.shields.io/pypi/v/chainerrl.svg)](https://pypi.python.org/pypi/chainerrl)\n\nChainerRL is a deep reinforcement learning library that implements various state-of-the-art deep reinforcement learning algorithms in Python using [Chainer](https://github.com/chainer/chainer), a flexible deep learning framework.\n\n![Breakout](assets/breakout.gif)\n![Humanoid](assets/humanoid.gif)\n![Grasping](assets/grasping.gif)\n\n## Installation\n\nChainerRL is tested with Python 2.7+ and 3.5.1+. For other requirements, see [requirements.txt](requirements.txt).\n\nChainerRL can be installed via PyPI:\n```\npip install chainerrl\n```\n\nIt can also be installed from the source code:\n```\npython setup.py install\n```\n\nRefer to [Installation](http://chainerrl.readthedocs.io/en/latest/install.html) for more information on installation.\n\n## Getting started\n\nYou can try the [ChainerRL Quickstart Guide](examples/quickstart/quickstart.ipynb) first, or check the [examples](examples) ready for Atari 2600 and OpenAI Gym.\n\nFor more information, you can refer to [ChainerRL's documentation](http://chainerrl.readthedocs.io/en/latest/index.html).\n\n## Algorithms\n\n| Algorithm | Discrete Action | Continuous Action | Recurrent Model | CPU Async Training |\n|:----------|:---------------:|:-----------------:|:---------------:|:------------------:|\n| DQN (including Double DQN etc.) 
| \u2713 | \u2713 (NAF) | \u2713 | x |\n| Categorical DQN | \u2713 | x | \u2713 | x |\n| Rainbow | \u2713 | x | \u2713 | x |\n| IQN | \u2713 | x | x | x |\n| DDPG | x | \u2713 | \u2713 | x |\n| A3C | \u2713 | \u2713 | \u2713 | \u2713 |\n| ACER | \u2713 | \u2713 | \u2713 | \u2713 |\n| NSQ (N-step Q-learning) | \u2713 | \u2713 (NAF) | \u2713 | \u2713 |\n| PCL (Path Consistency Learning) | \u2713 | \u2713 | \u2713 | \u2713 |\n| PPO | \u2713 | \u2713 | \u2713 | x |\n| TRPO | \u2713 | \u2713 | x | x |\n| TD3 | x | \u2713 | x | x |\n\nThe following algorithms have been implemented in ChainerRL:\n- A3C (Asynchronous Advantage Actor-Critic)\n- ACER (Actor-Critic with Experience Replay)\n- Asynchronous N-step Q-learning\n- Rainbow\n- Categorical DQN\n- IQN\n- DQN (including Double DQN, Persistent Advantage Learning (PAL), Double PAL, Dynamic Policy Programming (DPP))\n- DDPG (Deep Deterministic Policy Gradients) (including SVG(0))\n- PGT (Policy Gradient Theorem)\n- PCL (Path Consistency Learning)\n- PPO (Proximal Policy Optimization)\n- TRPO (Trust Region Policy Optimization)\n- TD3 (Twin Delayed Deep Deterministic policy gradient algorithm)\n\nQ-function-based algorithms such as DQN can utilize a Normalized Advantage Function (NAF) to tackle continuous-action problems as well as DQN-like discrete output networks.\n\n## Paper Implementations\nThe following papers have been implemented in ChainerRL:\n- [Playing Atari with Deep Reinforcement Learning](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)\n- [Human-level control through Deep Reinforcement Learning](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf)\n- [Deep Reinforcement Learning with Double Q-learning](https://arxiv.org/abs/1509.06461)\n- [Prioritized Experience Replay](https://arxiv.org/abs/1511.05952)\n- [Dueling Network Architectures for Deep Reinforcement Learning](https://arxiv.org/abs/1511.06581)\n- [Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783)\n- [A 
Distributional Perspective on Reinforcement Learning](https://arxiv.org/abs/1707.06887)\n- [Implicit Quantile Networks for Distributional Reinforcement Learning](https://arxiv.org/abs/1806.06923)\n- [Rainbow: Combining Improvements in Deep Reinforcement Learning](https://arxiv.org/abs/1710.02298)\n- [Increasing the Action Gap: New Operators for Reinforcement Learning](https://arxiv.org/abs/1512.04860)\n- [Noisy Networks for Exploration](https://arxiv.org/abs/1706.10295)\n- [Continuous control with deep reinforcement learning](https://arxiv.org/abs/1509.02971)\n- [Proximal Policy Optimization Algorithms](https://arxiv.org/abs/1707.06347)\n- [Trust Region Policy Optimization](https://arxiv.org/abs/1502.05477)\n- [Sample Efficient Actor-Critic with Experience Replay](https://arxiv.org/abs/1611.01224)\n- [Bridging the Gap Between Value and Policy Based Reinforcement Learning](https://arxiv.org/abs/1702.08892)\n- [Addressing Function Approximation Error in Actor-Critic Methods](https://arxiv.org/abs/1802.09477)\n\n\n## Visualization\n\nChainerRL has a set of accompanying [visualization tools](https://github.com/chainer/chainerrl-visualizer) to help developers understand and debug their RL agents. With these tools, the behavior of ChainerRL agents can be easily inspected from a browser UI.\n\n\n## Environments\n\nEnvironments that support a subset of OpenAI Gym's interface (`reset` and `step` methods) can be used.\n\n## Contributing\n\nAny kind of contribution to ChainerRL would be highly appreciated! 
If you are interested in contributing to ChainerRL, please read [CONTRIBUTING.md](CONTRIBUTING.md).\n\n## License\n\n[MIT License](LICENSE).", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "chainerrl", "package_url": "https://pypi.org/project/chainerrl/", "platform": "", "project_url": "https://pypi.org/project/chainerrl/", "project_urls": null, "release_url": "https://pypi.org/project/chainerrl/0.7.0/", "requires_dist": null, "requires_python": "", "summary": "ChainerRL, a deep reinforcement learning library", "version": "0.7.0" }, "last_serial": 5460958, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "e9a57221a77dcbcc951e7a6291eb458b", "sha256": "7b730810cae3e040d64df8e77f3859dda69967e533b06990163e3821ac9a1db0" }, "downloads": -1, "filename": "chainerrl-0.0.1.tar.gz", "has_sig": false, "md5_digest": "e9a57221a77dcbcc951e7a6291eb458b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 666, "upload_time": "2017-02-01T10:47:02", "url": "https://files.pythonhosted.org/packages/a5/04/47c39325bc51ded4f8d1ecdbe96a9efbe2446c1b63010d1b669e81d2303b/chainerrl-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "058d8443840f5cd7e68e94b52d7b1ca2", "sha256": "2031317267e358578fdecdd43d00a608b004014ae7ad6aaea6d4a291d815d9ee" }, "downloads": -1, "filename": "chainerrl-0.0.2.tar.gz", "has_sig": false, "md5_digest": "058d8443840f5cd7e68e94b52d7b1ca2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 41312, "upload_time": "2017-02-20T09:19:44", "url": "https://files.pythonhosted.org/packages/08/e5/0523ae5a0f1c44e86304530fa0e13299ca698fdc63129c24993803babefe/chainerrl-0.0.2.tar.gz" } ], "0.1.0": [ { "comment_text": "", "digests": { "md5": 
"01ef7b16cf85b9e74b4fa79136155d97", "sha256": "7df6fdcf5cf366a35b6f15c46eab828595d936bea8777e0f29271e259da437a2" }, "downloads": -1, "filename": "chainerrl-0.1.0.tar.gz", "has_sig": false, "md5_digest": "01ef7b16cf85b9e74b4fa79136155d97", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 51938, "upload_time": "2017-03-27T03:00:23", "url": "https://files.pythonhosted.org/packages/6f/55/4352d2ef454934e26fcf15683cc62fa8205c9c6d145bc08090a80842d107/chainerrl-0.1.0.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "5f3ea712b04a9db1e2806ca27d64b75d", "sha256": "b1c470e6c8cf3a8d698592117cc4b80293038de98e46a416ca89da2ad69b730f" }, "downloads": -1, "filename": "chainerrl-0.2.0.tar.gz", "has_sig": false, "md5_digest": "5f3ea712b04a9db1e2806ca27d64b75d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 56434, "upload_time": "2017-06-08T06:56:27", "url": "https://files.pythonhosted.org/packages/cf/f1/82979944600911dd941db66530a9ce9c81eddf4957bb9e7ae174821dc548/chainerrl-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "9f83adda9478a660fd0e46ffb248c6a4", "sha256": "5c698d5978fc827fcc94d904967c9f96504af50ba59e165ff66384010dd54c9d" }, "downloads": -1, "filename": "chainerrl-0.3.0.tar.gz", "has_sig": false, "md5_digest": "9f83adda9478a660fd0e46ffb248c6a4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 64452, "upload_time": "2017-12-08T09:39:06", "url": "https://files.pythonhosted.org/packages/a7/d2/e71626685001985cf02b76f5942c1fd1634291187cd77d1a038d54436d94/chainerrl-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "a752daa4882ccb044dbb5f338bfc3c41", "sha256": "8d7eea218872c35b171e65dd76f2eef0ef0791ba5259793e4780692010f55105" }, "downloads": -1, "filename": "chainerrl-0.4.0.tar.gz", "has_sig": false, "md5_digest": "a752daa4882ccb044dbb5f338bfc3c41", "packagetype": "sdist", "python_version": "source", 
"requires_python": null, "size": 75951, "upload_time": "2018-07-23T13:45:15", "url": "https://files.pythonhosted.org/packages/e9/0a/ef1b1522c8d18da8e9cda5a59486d9b62864806454410da2e283295ed9fe/chainerrl-0.4.0.tar.gz" } ], "0.5.0": [ { "comment_text": "", "digests": { "md5": "2cf1952f316a73fe0824c55df8b31f80", "sha256": "cac2880bcd587d2e472d14d4a3772ce22aff188d8a109dced54536491014db1f" }, "downloads": -1, "filename": "chainerrl-0.5.0.tar.gz", "has_sig": false, "md5_digest": "2cf1952f316a73fe0824c55df8b31f80", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 87757, "upload_time": "2018-11-15T08:18:28", "url": "https://files.pythonhosted.org/packages/e0/ce/d1d1a5a86a847407f3dad2315ca9e63de081e1072799fc9ea9771e0b4f85/chainerrl-0.5.0.tar.gz" } ], "0.6.0": [ { "comment_text": "", "digests": { "md5": "f1ecdb2a0b5a6f246d328687d748303c", "sha256": "0354e372021520f193928a3cc5034eb441ffa068c5875c01f0ce21aa56135642" }, "downloads": -1, "filename": "chainerrl-0.6.0.tar.gz", "has_sig": false, "md5_digest": "f1ecdb2a0b5a6f246d328687d748303c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 94794, "upload_time": "2019-02-28T08:55:13", "url": "https://files.pythonhosted.org/packages/66/bd/d418838af29161a30e3600d232397d49518b559ff3eb5ed816ea1c0ddda0/chainerrl-0.6.0.tar.gz" } ], "0.7.0": [ { "comment_text": "", "digests": { "md5": "f9e3f24dcdc55738ec0e5ef474896dd7", "sha256": "a0bc67348ac5795aec86af60c62d94467011aae8c0c4de7f649e0b5698af222a" }, "downloads": -1, "filename": "chainerrl-0.7.0.tar.gz", "has_sig": false, "md5_digest": "f9e3f24dcdc55738ec0e5ef474896dd7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 109375, "upload_time": "2019-06-28T10:09:44", "url": "https://files.pythonhosted.org/packages/2d/d8/d03b4ecc859c06c757fd0ee40c66414bd64872e8900d292fdebcb6449cd7/chainerrl-0.7.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": 
"f9e3f24dcdc55738ec0e5ef474896dd7", "sha256": "a0bc67348ac5795aec86af60c62d94467011aae8c0c4de7f649e0b5698af222a" }, "downloads": -1, "filename": "chainerrl-0.7.0.tar.gz", "has_sig": false, "md5_digest": "f9e3f24dcdc55738ec0e5ef474896dd7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 109375, "upload_time": "2019-06-28T10:09:44", "url": "https://files.pythonhosted.org/packages/2d/d8/d03b4ecc859c06c757fd0ee40c66414bd64872e8900d292fdebcb6449cd7/chainerrl-0.7.0.tar.gz" } ] }