{ "info": { "author": "Nestor Sanchez", "author_email": "nestor.sag@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# rlmodels: a reinforcement learning library\n\nThis project is a collection of some popular optimisation algorithms for reinforcement learning problem. At the moment the available models are:\n\n* DQN\n* DDPG\n* CMAES\n* AC\n\nwith some more going to be added in the future.\n\nIt works with Pytorch models and environment classes like the OpenAI gym ones. Any environment class wrapper that mimic their basic functionality should be fine, but more on that below.\n\n## Getting Started\n\n### Prerequisites\n\nThe project uses ```python 3.6``` and ```torch 1.1.0```.\n\n### Installing\n\nIt can be installed directly from pip like \n```bash\npip install rlmodels\n```\n\n## Usage\n\nBelow is a summary of how the program works. **To see the full documentation click [here](https://nestorsag.github.io/rlmodels/index.html#package)**\n\n### Initialization\n\nThe following is an example with the popular CartPole environment using a double Q network. First the setup. \n\n```python\nimport numpy as np\nimport torch\nimport torch.optim as optim\nimport gym\n\nfrom rlmodels.models.DQN import *\nfrom rlmodels.nets import VanillaNet\n\nimport logging\n\n#logger parameters\nFORMAT = '%(asctime)-15s: %(message)s'\nlogging.basicConfig(level=logging.INFO,format=FORMAT,filename=\"model_fit.log\",filemode=\"a\")\n\nmax_ep_ts = 200\n\nenv = gym.make('CartPole-v0')\nenv._max_episode_steps = max_ep_ts\n\nenv.seed(1)\nnp.random.seed(1)\ntorch.manual_seed(1)\n```\nthe episode and timepstep numbers as well as the average reward trace is logged to the file ```model_fit.log```. Setting the logging level to ```logging.DEBUG``` will also log information about gradient descent steps.\n\nThe library also has a basic network definition, VanillaNet, to which we only need to specify number and size of hidden layer, input and output sizes, and last activation function. It uses ReLU everywhere else by default.\n\nlet's create the basic objects \n\n```python\ndqn_scheduler = DQNScheduler(\n\tbatch_size = lambda t: 200, #constant\n\texploration_rate = lambda t: max(0.01,0.05 - 0.01*int(t/2500)), #decrease exploration down to 1% after 10,000 steps\n\tPER_alpha = lambda t: 1, #constant\n\tPER_beta = lambda t: 1, #constant\n\ttau = lambda t: 100, #constant\n\tagent_lr_scheduler_fn = lambda t: 1.25**(-int(t/1000)), #decrease step size every 2,500 steps,\n\tsteps_per_update = lambda t: 1) #constant\n\nagent_lr = 0.5 #initial learning rate\nagent_model = VanillaNet([60],4,2,None)\nagent_opt = optim.SGD(agent_model.parameters(),lr=agent_lr,weight_decay = 0, momentum = 0)\n\nagent = Agent(agent_model,agent_opt)\n```\n\nthe models take a scheduler object as argument which allows parameters to be changed at runtime accordint to user-defined rules. For example reducing learning rate and exploration rate after a certain number of iterations, as above. Finally, all gradient-based algorithms receive as input an ```Agent``` instance that contains the network deffinition and optimisation algorithm. Once all this is setup we are good to go.\n\n\n```python\ndqn = DQN(agent,env,dqn_scheduler)\n\ndqn.fit(\n\tn_episodes=170,\n\tmax_ts_by_episode=max_ep_ts,\n\tmax_memory_size=2000,\n\ttd_steps=1)\n\n\n```\n\nOnce the agent is trained we can visualize the reward trace. If we are using an environment with a render method (like OpenAI ones) we can also visualise the trained agent. We can also use the trained model using the ```forward``` method of the ```ddq``` object or simply extract it with ```ddq.agent```\n\n```python\ndqn.plot() #plot reward traces\ndqn.play(n=200) #observe the agent play\n```\n\nsee the ```example``` folder for an analogous use of the other algorithms.\n\n### Environment\nFor custom environments or custom rewards, its possible to make a wrapper tha mimics te behavior of the step() and reset() function of gym's environemnts\n```python\nclass MyCustomEnv(object):\n\tdef __init__(self,env):\n\t\tself.env = env\n\tdef step(self,action):\n\t\t## get next state s, reward, termination flag (boolean) and any additional info\n\t\treturn s,r, terminated, info #need to output these 4 things (info can be None)\n\tdef reset(self):\n\t\t#something\n\tdef seed(self):\n\t\t#something\n```", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/nestorSag/rlmodels/archive/1.0.7.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/nestorsag/rlmodels", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "rlmodels", "package_url": "https://pypi.org/project/rlmodels/", "platform": "", "project_url": "https://pypi.org/project/rlmodels/", "project_urls": { "Download": "https://github.com/nestorSag/rlmodels/archive/1.0.7.tar.gz", "Homepage": "https://github.com/nestorsag/rlmodels" }, "release_url": "https://pypi.org/project/rlmodels/1.0.7/", "requires_dist": null, "requires_python": "", "summary": "Implementation of some popular reinforcement learning models", "version": "1.0.7" }, "last_serial": 5660044, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "a7c2ed1e46685f97fcee64f067a24ea5", "sha256": "984632987dd1923657c297538ecc93f459cbd44933c4c49aab0cb4966acf885a" }, "downloads": -1, "filename": "rlmodels-1.0.tar.gz", "has_sig": false, "md5_digest": "a7c2ed1e46685f97fcee64f067a24ea5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3031, "upload_time": "2019-07-27T10:53:13", "url": "https://files.pythonhosted.org/packages/54/71/16e9a88b28bf04efbd5ba22d550aee24c86706cf0b5e45e0f09aef0d3842/rlmodels-1.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "ef8d62474ac14132b626cf7e2e930f7c", "sha256": "6d30da9a54fe7ce286554464134f99a4d5dce488c23dbd2e9ef6bf62cbd3b071" }, "downloads": -1, "filename": "rlmodels-1.0.1.tar.gz", "has_sig": false, "md5_digest": "ef8d62474ac14132b626cf7e2e930f7c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3035, "upload_time": "2019-07-27T12:06:14", "url": "https://files.pythonhosted.org/packages/9d/fb/28b9cfe4ad06e1d4f2da78ba96e0f91995d703ea028d3e0e9a752e080eb5/rlmodels-1.0.1.tar.gz" } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "0e5d0d7bd95e148189342d1b3a452c20", "sha256": "0559cfa9011dd5ab79b777abc0ccea62366343875378d0f54414521c82418b44" }, "downloads": -1, "filename": "rlmodels-1.0.2.tar.gz", "has_sig": false, "md5_digest": "0e5d0d7bd95e148189342d1b3a452c20", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3036, "upload_time": "2019-07-27T12:45:38", "url": "https://files.pythonhosted.org/packages/d7/a4/d2e981c7c0814960165d62b3506e39b4693580b187749b5142b031ac6c93/rlmodels-1.0.2.tar.gz" } ], "1.0.3": [ { "comment_text": "", "digests": { "md5": "2d2378cb2403645f464ddc6673e64fb7", "sha256": "a787ca0c3d050916976bd2de56eec7294ad416cb1e5c1505b2409a2fe46c8a9f" }, "downloads": -1, "filename": "rlmodels-1.0.3.tar.gz", "has_sig": false, "md5_digest": "2d2378cb2403645f464ddc6673e64fb7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9686, "upload_time": "2019-07-27T13:32:37", "url": "https://files.pythonhosted.org/packages/0f/d1/2c6fcae1feadabaf96faa926e4a2b9cd0e1085fe38e0b80ed0f15045ac98/rlmodels-1.0.3.tar.gz" } ], "1.0.4": [ { "comment_text": "", "digests": { "md5": "2d9183b1c3304642246114c048238b24", "sha256": "bd795e540de90c9f87dd39629c3238f4bbad57584485c70998b3ac273d66c82f" }, "downloads": -1, "filename": "rlmodels-1.0.4.tar.gz", "has_sig": false, "md5_digest": "2d9183b1c3304642246114c048238b24", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9825, "upload_time": "2019-07-28T18:10:21", "url": "https://files.pythonhosted.org/packages/e0/13/29b8f6e6ce3403469812dd53c846d34736c655d9c02a32753aeab81dc99c/rlmodels-1.0.4.tar.gz" } ], "1.0.5": [ { "comment_text": "", "digests": { "md5": "bbb1cb4a3237091d967418b96b362501", "sha256": "53c6bfa749b8af13fdb3aafde0a93d1103f9aaf10668222bf4e54dfddeff3392" }, "downloads": -1, "filename": "rlmodels-1.0.5.tar.gz", "has_sig": false, "md5_digest": "bbb1cb4a3237091d967418b96b362501", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11523, "upload_time": "2019-07-31T16:59:09", "url": "https://files.pythonhosted.org/packages/d6/d3/dfc8552d8fff812e061fb6716c85b2bb782fa5ae09747d2f79f045c5c5b2/rlmodels-1.0.5.tar.gz" } ], "1.0.6": [ { "comment_text": "", "digests": { "md5": "3cb5b527ffe63fef5b395917590ddbce", "sha256": "2b586c45a265fc5f625a195c9761f9e120f85c3146ddc9277c53381668e255d7" }, "downloads": -1, "filename": "rlmodels-1.0.6.tar.gz", "has_sig": false, "md5_digest": "3cb5b527ffe63fef5b395917590ddbce", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11865, "upload_time": "2019-08-02T10:26:58", "url": "https://files.pythonhosted.org/packages/60/70/cca9c1fa4aa315ad21e85c719473866573b48e08dad35273d06900c835b7/rlmodels-1.0.6.tar.gz" } ], "1.0.7": [ { "comment_text": "", "digests": { "md5": "be05825b7d6b87bdfd56dcc8ab2d5610", "sha256": "7e44a51f1b43a88b177f4378cbc646e243b71de11b74a91d14c3a3c073053c68" }, "downloads": -1, "filename": "rlmodels-1.0.7.tar.gz", "has_sig": false, "md5_digest": "be05825b7d6b87bdfd56dcc8ab2d5610", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13859, "upload_time": "2019-08-10T18:47:50", "url": "https://files.pythonhosted.org/packages/4a/83/f9526731b45cc2850fb5a1f0decb90bee5c2bf43ead075714fb88400fa23/rlmodels-1.0.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "be05825b7d6b87bdfd56dcc8ab2d5610", "sha256": "7e44a51f1b43a88b177f4378cbc646e243b71de11b74a91d14c3a3c073053c68" }, "downloads": -1, "filename": "rlmodels-1.0.7.tar.gz", "has_sig": false, "md5_digest": "be05825b7d6b87bdfd56dcc8ab2d5610", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13859, "upload_time": "2019-08-10T18:47:50", "url": "https://files.pythonhosted.org/packages/4a/83/f9526731b45cc2850fb5a1f0decb90bee5c2bf43ead075714fb88400fa23/rlmodels-1.0.7.tar.gz" } ] }