{
    "info": {
        "author": "Seungjae (Ryan) Lee",
        "author_email": "seungjaeryanlee@gmail.com",
        "bugtrack_url": null,
        "classifiers": [
            "License :: OSI Approved :: MIT License",
            "Operating System :: OS Independent",
            "Programming Language :: Python :: 3"
        ],
        "description": "<img src=\"docs/rldb.png\" align=\"right\" width=\"20%\"/>\n\n# rldb\n\n[![Build Status](https://travis-ci.com/seungjaeryanlee/rldb.svg?branch=master)](https://travis-ci.com/seungjaeryanlee/rldb)\n\n![Environments tracked in rldb](https://img.shields.io/badge/environments-114-blue.svg)\n![Papers tracked in rldb](https://img.shields.io/badge/papers-22-blue.svg)\n![Repos tracked in rldb](https://img.shields.io/badge/repos-2-blue.svg)\n![Algorithms tracked in rldb](https://img.shields.io/badge/algorithms-59-blue.svg)\n![Entries tracked in rldb](https://img.shields.io/badge/entries-3266-blue.svg)\n\nDatabase of RL algorithms\n\n| Atari Space Invaders Scores | MuJoCo Walker2d Scores |\n|:-:|:-:|\n| ![Atari Space Invaders Scores](/docs/atari-space-invaders.png) | ![MuJoCo Walker2d Scores](/docs/mujoco-walker2d.png) |\n\n## Examples\n\nYou can use `rldb.find_all({})` to retrieve all existing entries in `rldb`.\n\n```python\nimport rldb\n\n\nall_entries = rldb.find_all({})\n```\n\nYou can also filter entries by specifying key-value pairs that the entry must match:\n\n```python\nimport rldb\n\n\ndqn_entries = rldb.find_all({'algo-nickname': 'DQN'})\nbreakout_noop_entries = rldb.find_all({\n    'env-title': 'atari-breakout',\n    'env-variant': 'No-op start',\n})\n```\n\nYou can also use `rldbl.find_one(filter_dict)` to find one entry that matches the key-value pair specified in `filter_dict`:\n\n```python\nimport rldb\nimport pprint\n\n\nentry = rldb.find_one({\n    'env-title': 'atari-pong',\n    'algo-title': 'Human',\n})\npprint.pprint(entry)\n```\n\n\n<details><summary>Output</summary>\n<p>\n\n```python\n{\n    'algo-nickname': 'Human',\n    'algo-title': 'Human',\n    'env-title': 'atari-pong',\n    'env-variant': 'No-op start',\n    'score': 14.6,\n    'source-arxiv-id': '1511.06581',\n    'source-arxiv-version': 3,\n    'source-authors': [   'Ziyu Wang',\n                          'Tom Schaul',\n                          'Matteo Hessel',\n                          'Hado van Hasselt',\n                          'Marc Lanctot',\n                          'Nando de Freitas'],\n    'source-bibtex': '@article{DBLP:journals/corr/WangFL15,\\n'\n                     '    author    = {Ziyu Wang and\\n'\n                     '                 Nando de Freitas and\\n'\n                     '                 Marc Lanctot},\\n'\n                     '    title     = {Dueling Network Architectures for Deep '\n                     'Reinforcement Learning},\\n'\n                     '    journal   = {CoRR},\\n'\n                     '    volume    = {abs/1511.06581},\\n'\n                     '    year      = {2015},\\n'\n                     '    url       = {http://arxiv.org/abs/1511.06581},\\n'\n                     '    archivePrefix = {arXiv},\\n'\n                     '    eprint    = {1511.06581},\\n'\n                     '    timestamp = {Mon, 13 Aug 2018 16:48:17 +0200},\\n'\n                     '    biburl    = '\n                     '{https://dblp.org/rec/bib/journals/corr/WangFL15},\\n'\n                     '    bibsource = {dblp computer science bibliography, '\n                     'https://dblp.org}\\n'\n                     '}',\n    'source-nickname': 'DuDQN',\n    'source-title': 'Dueling Network Architectures for Deep Reinforcement '\n                    'Learning'\n}\n```\n\n</p>\n</details>\n\n## Entry Structure\n\nHere is the format of every entry:\n\n```python3\n{\n    # BASICS\n    \"source-title\": \"\",\n    \"source-nickname\": \"\",\n    \"source-authors\": [],\n\n    # MISC.\n    \"source-bibtex\": \"\",\n\n    # ALGORITHM\n    \"algo-title\": \"\",\n    \"algo-nickname\": \"\",\n    \"algo-source-title\": \"\",\n\n    # SCORE\n    \"env-title\": \"\",\n    \"score\": 0,\n}\n```\n\n- `source-title` is the full title of the source of the score: it can be the title of the paper or GitHub repository title. `source-nickname` is a popular nickname or acronym for that title if it exists, otherwise it is the same as `source-title`. \n- `source-authors` are a list of authors or contributors.\n- `source-bibtex` is a BibTeX-format citation.\n- `algo-title` is the full title of the algorithm used. `algo-nickname` is the nickname or acronym for that algorithm if it exists, otherwise it is the same as `algo-nickname`.\n- `algo-source-title` is the title of the source of the **algorithm**. It can and often is different from `source-title`.\n\nFor example, the **Space Invaders** score of **Asynchronous Advantage Actor Critic (A3C)** algorithm in the **Noisy Networks for Exploration (NoisyNet)** paper is represented by the following entry:\n\n```python3\n{\n    #  BASICS\n    \"source-title\": \"Noisy Networks for Exploration\",\n    \"source-nickname\": \"NoisyNet\",\n    \"source-authors\": [\n        \"Meire Fortunato\",\n        \"Mohammad Gheshlaghi Azar\",\n        \"Bilal Piot\",\n        \"Jacob Menick\",\n        \"Ian Osband\",\n        \"Alex Graves\",\n        \"Vlad Mnih\",\n        \"Remi Munos\",\n        \"Demis Hassabis\",\n        \"Olivier Pietquin\",\n        \"Charles Blundell\",\n        \"Shane Legg\",\n    ],\n\n    #  ARXIV\n    \"source-arxiv-id\": \"1706.10295\",\n    \"source-arxiv-version\": 2,\n\n    #  MISC.\n    \"source-bibtex\": \"\"\"\n@article{DBLP:journals/corr/FortunatoAPMOGM17,\n    author    = {Meire Fortunato and\n                 Mohammad Gheshlaghi Azar and\n                 Bilal Piot and\n                 Jacob Menick and\n                 Ian Osband and\n                 Alex Graves and\n                 Vlad Mnih and\n                 R{\\'{e}}mi Munos and\n                 Demis Hassabis and\n                 Olivier Pietquin and\n                 Charles Blundell and\n                 Shane Legg},\n    title     = {Noisy Networks for Exploration},\n    journal   = {CoRR},\n    volume    = {abs/1706.10295},\n    year      = {2017},\n    url       = {http://arxiv.org/abs/1706.10295},\n    archivePrefix = {arXiv},\n    eprint    = {1706.10295},\n    timestamp = {Mon, 13 Aug 2018 16:46:11 +0200},\n    biburl    = {https://dblp.org/rec/bib/journals/corr/FortunatoAPMOGM17},\n    bibsource = {dblp computer science bibliography, https://dblp.org}\n}\"\"\",\n\n    # ALGORITHM\n    \"algo-title\": \"Asynchronous Advantage Actor Critic\",\n    \"algo-nickname\": \"A3C\",\n    \"algo-source-title\": \"Asynchronous Methods for Deep Reinforcement Learning\",\n\n    # HYPERPARAMETERS\n    \"algo-frames\": 320 * 1000 * 1000,  # Number of frames\n\n    # SCORE\n    \"env-title\": \"atari-space-invaders\",\n    \"env-variant\": \"No-op start\",\n    \"score\": 1034,\n    \"stddev\": 49,\n}\n```\n\nNote that, as shown here, the entry can contain additional information.\n\n## Sources\n\n### Papers\n\n#### Deep Q-Networks\n\n- [x] [Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013)](https://arxiv.org/abs/1312.5602)\n- [x] [Human-level control through deep reinforcement learning (Mnih et al., 2015)](https://deepmind.com/research/dqn/)\n- [x] [Deep Recurrent Q-Learning for Partially Observable MDPs (Hausknecht and Stone, 2015)](https://arxiv.org/abs/1507.06527)\n- [x] [Massively Parallel Methods for Deep Reinforcement Learning (Nair et al., 2015)](https://arxiv.org/abs/1507.04296)\n- [x] [Deep Reinforcement Learning with Double Q-learning (Hasselt et al., 2015)](https://arxiv.org/abs/1509.06461)\n- [x] [Prioritized Experience Replay (Schaul et al., 2015)](https://arxiv.org/abs/1511.05952)\n- [x] [Dueling Network Architectures for Deep Reinforcement Learning (Wang et al., 2015)](https://arxiv.org/abs/1511.06581)\n- [x] [Noisy Networks for Exploration (Fortunato et al., 2017)](https://arxiv.org/abs/1706.10295)\n- [x] [A Distributional Perspective on Reinforcement Learning (Bellemare et al., 2017)](https://arxiv.org/abs/1707.06887)\n- [x] [Rainbow: Combining Improvements in Deep Reinforcement Learning (Hessel et al., 2017)](https://arxiv.org/abs/1710.02298)\n- [x] [Distributional Reinforcement Learning with Quantile Regression (Dabney et al., 2017)](https://arxiv.org/abs/1710.10044)\n- [x] [Implicit Quantile Networks for Distributional Reinforcement Learning (Dabney et al., 2018)](https://arxiv.org/abs/1806.06923)\n\n#### Policy Gradients\n\n- [x] [Asynchronous Methods for Deep Reinforcement Learning (Mnih et al., 2016)](https://arxiv.org/abs/1602.01783)\n- [x] [Trust Region Policy Optimization (Schulman et al., 2015)](https://arxiv.org/abs/1502.05477)\n- [x] [Proximal Policy Optimization Algorithms (Schulman et al., 2017)](https://arxiv.org/abs/1707.06347)\n- [x] [Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (Wu et al., 2017)](https://arxiv.org/abs/1708.05144)\n- [x] [Addressing Function Approximation Error in Actor-Critic Methods (Fujimoto et al., 2018)](https://arxiv.org/abs/1802.09477)\n- [x] [IMPALA: Importance Weighted Actor-Learner Architectures (Espeholt et al., 2018)](https://arxiv.org/abs/1802.01561)\n- [x] [The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning (Gruslys et al., 2017)](https://arxiv.org/abs/1704.04651)\n\n#### Exploration\n\n- [x] [Exploration by Random Network Distillation (Burda et al., 2018)](https://arxiv.org/abs/1810.12894)\n\n#### Misc.\n\n- [x] [Trust-PCL: An Off-Policy Trust Region Method for Continuous Control (Nachum et al., 2017)](https://arxiv.org/abs/1707.01891)\n\n### Repositories\n\n- [x] [OpenAI Baselines](https://github.com/openai/baselines)\n- [x] [RL Baselines Zoo](https://github.com/araffin/rl-baselines-zoo)\n\n\n",
        "description_content_type": "text/markdown",
        "docs_url": null,
        "download_url": "",
        "downloads": {
            "last_day": -1,
            "last_month": -1,
            "last_week": -1
        },
        "home_page": "https://github.com/seungjaeryanlee/rldb",
        "keywords": "",
        "license": "",
        "maintainer": "",
        "maintainer_email": "",
        "name": "rldb",
        "package_url": "https://pypi.org/project/rldb/",
        "platform": "",
        "project_url": "https://pypi.org/project/rldb/",
        "project_urls": {
            "Homepage": "https://github.com/seungjaeryanlee/rldb"
        },
        "release_url": "https://pypi.org/project/rldb/0.0.0/",
        "requires_dist": null,
        "requires_python": "",
        "summary": "Performances of Reinforcement Learning Agents",
        "version": "0.0.0"
    },
    "last_serial": 5275040,
    "releases": {
        "0.0.0": [
            {
                "comment_text": "",
                "digests": {
                    "md5": "3c33243225adc13b22171d58b7a51343",
                    "sha256": "ec2ebf98140759edf515c7f2274b802ea4c12483dd109f4a814ba04aa9a51cfb"
                },
                "downloads": -1,
                "filename": "rldb-0.0.0-py3-none-any.whl",
                "has_sig": false,
                "md5_digest": "3c33243225adc13b22171d58b7a51343",
                "packagetype": "bdist_wheel",
                "python_version": "py3",
                "requires_python": null,
                "size": 146571,
                "upload_time": "2019-05-16T02:06:01",
                "url": "https://files.pythonhosted.org/packages/c7/ab/95db1d80c06b53b232569953cf12baa4ba3f3f60666124ae98583b713be2/rldb-0.0.0-py3-none-any.whl"
            },
            {
                "comment_text": "",
                "digests": {
                    "md5": "f662d95c7f10bf62184972bd8acfbbb7",
                    "sha256": "39a7499a04882800c2421a30af2efebb2854c69b75629ff5058084c4766d0817"
                },
                "downloads": -1,
                "filename": "rldb-0.0.0.tar.gz",
                "has_sig": false,
                "md5_digest": "f662d95c7f10bf62184972bd8acfbbb7",
                "packagetype": "sdist",
                "python_version": "source",
                "requires_python": null,
                "size": 59185,
                "upload_time": "2019-05-16T02:06:05",
                "url": "https://files.pythonhosted.org/packages/6d/59/ece9626bc9e433562490fefe956c09f097bd4b142b37475c21e7af3dd85e/rldb-0.0.0.tar.gz"
            }
        ]
    },
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "3c33243225adc13b22171d58b7a51343",
                "sha256": "ec2ebf98140759edf515c7f2274b802ea4c12483dd109f4a814ba04aa9a51cfb"
            },
            "downloads": -1,
            "filename": "rldb-0.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3c33243225adc13b22171d58b7a51343",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 146571,
            "upload_time": "2019-05-16T02:06:01",
            "url": "https://files.pythonhosted.org/packages/c7/ab/95db1d80c06b53b232569953cf12baa4ba3f3f60666124ae98583b713be2/rldb-0.0.0-py3-none-any.whl"
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "f662d95c7f10bf62184972bd8acfbbb7",
                "sha256": "39a7499a04882800c2421a30af2efebb2854c69b75629ff5058084c4766d0817"
            },
            "downloads": -1,
            "filename": "rldb-0.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "f662d95c7f10bf62184972bd8acfbbb7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 59185,
            "upload_time": "2019-05-16T02:06:05",
            "url": "https://files.pythonhosted.org/packages/6d/59/ece9626bc9e433562490fefe956c09f097bd4b142b37475c21e7af3dd85e/rldb-0.0.0.tar.gz"
        }
    ]
}