{ "info": { "author": "heron", "author_email": "adept@heronsystems.com", "bugtrack_url": null, "classifiers": [], "description": "# adept\n\nadept is a library designed to accelerate reinforcement learning research by providing:\n* baseline reinforcement learning models and algorithms for PyTorch\n* multi-GPU compute options\n* access to various environments\n* built-in tensorboard logging, model saving, reloading, evaluation, and rendering\n* abstractions for building custom networks, agents, execution modes, and experiments\n* proven hyperparameter defaults\n\nThis code is alpha; expect rough edges.\n\n## Features\nAgents / Networks\n* Actor Critic with Generalized Advantage Estimation\n* Stateful networks (i.e. LSTMs)\n* Batch Normalization for reinforcement learning\n\nExecution Modes\n* Local (Single-GPU, A2C)\n* Towered (Multi-GPU, A3C-variant)\n* Importance Weighted Actor Learner Architectures, [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) (Faster Multi-GPU)\n\nEnvironments\n* OpenAI Gym\n* StarCraft 2 (alpha, IMPALA mode does not work with SC2 yet)\n\nWe designed this library to be flexible and extensible. 
Plugging in novel research ideas should be doable.\n\n## Major Dependencies\n* gym\n* PyTorch 0.4.x (excluding 0.4.1 due to an [unbind bug](https://github.com/pytorch/pytorch/pull/9995))\n* Python 3.5+\n\n## Installation\n* Follow instructions for [PyTorch](https://pytorch.org/) \n* (Optional) Follow instructions for [StarCraft 2](https://github.com/Blizzard/s2client-proto#downloads)\n\n```\n# Remove mpi, sc2, profiler if you don't plan on using these features:\npip install adeptRL[mpi,sc2,profiler]\n```\n\n## Performance\n* Used to win a [Doom competition](http://vizdoom.cs.put.edu.pl/competition-cig-2018/competition-results) (Ben Bell / Marv2in)\n* ~2500 training frames per second single-GPU performance on a Dell XPS 15\" laptop (Geforce 1050Ti)\n* Will post Atari/SC2 baseline scores here at some point\n\n## Examples\nIf you write your own scripts, you can provide your own agents or networks, but we have some presets you can run out of the box.\nLogs go to `/tmp/adept_logs/` by default.\nThe log directory contains the tensorboard file, saved models, and other metadata.\n\n```\n# Local Mode (A2C)\n# We recommend 4GB+ GPU memory, 8GB+ RAM, 4+ Cores\npython -m adept.scripts.local --env-id BeamRiderNoFrameskip-v4\n\n# Towered Mode (A3C Variant, requires mpi4py)\n# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores\npython -m adept.scripts.towered --env-id BeamRiderNoFrameskip-v4\n\n# IMPALA (requires mpi4py and is resource intensive)\n# We recommend 2+ GPUs, 8GB+ GPU memory, 32GB+ RAM, 4+ Cores\nmpiexec -n 3 python -m adept.scripts.impala --env-id BeamRiderNoFrameskip-v4\n\n# StarCraft 2 (IMPALA not supported yet)\n# Warning: much more resource intensive than Atari\npython -m adept.scripts.local --env-id CollectMineralShards\n\n# To see a full list of options:\npython -m adept.scripts.local -h\npython -m adept.scripts.towered -h\npython -m adept.scripts.impala -h\n```\n\n## API Reference\n![architecture](images/architecture.png)\n### Agents\nAn Agent acts on 
and observes the environment.\nCurrently only ActorCritic is supported. Other agents, such as DQN or ACER, may be added later.\n### Containers\nContainers hold all of the application state. Each subprocess gets a container in Towered and IMPALA modes.\n### Environments\nEnvironments run in subprocesses and send their observations, rewards, terminals, and infos to the host process.\nThey work much the same way as OpenAI's code.\n### Experience Caches\nAn Experience Cache is a Rollout or Experience Replay that is written to after stepping and read before learning.\n### Modules\nModules are generally useful PyTorch modules used in Networks.\n### Networks\nNetworks are not PyTorch modules; instead, they must implement our abstract NetworkInterface or ModularNetwork classes.\nA ModularNetwork consists of a trunk, body, and head.\nThe Trunk can consist of multiple networks for vision or discrete data. It flattens these into an embedding.\nThe Body network operates on the flattened embedding and would typically be an LSTM, Linear layer, or a combination.\nThe Head depends on the Environment and Agent and is created accordingly.\n\n## Acknowledgements\nWe borrow pieces of OpenAI's [gym](https://github.com/openai/gym) and [baselines](https://github.com/openai/baselines) code.\nWe indicate where this is done.\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/heronsystems/adeptRL", "keywords": "", "license": "GNU", "maintainer": "", "maintainer_email": "", "name": "adeptRL", "package_url": "https://pypi.org/project/adeptRL/", "platform": "", "project_url": "https://pypi.org/project/adeptRL/", "project_urls": { "Homepage": "https://github.com/heronsystems/adeptRL" }, "release_url": "https://pypi.org/project/adeptRL/0.1.1/", "requires_dist": [ "tensorboardX (>=1.2)", "cloudpickle (>=0.5)", "opencv-python (>=3.4)", "mpi4py (>=3.0); extra == 
'mpi'", "pyinstrument (>=2.0); extra == 'profiler'", "pysc2 (>=2.0); extra == 'sc2'", "numpy (>=1.14)", "gym[atari] (>=0.10)", "absl-py (>=0.2)" ], "requires_python": ">=3.5.0", "summary": "Reinforcement Learning Framework", "version": "0.1.1" }, "last_serial": 4215255, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "9b5432e28539a7aa86a12ed0a0dd6fa7", "sha256": "8b1db429029149009097ba1bf9e51f48d9c41a36ad3e9fa150eea69855a26c8b" }, "downloads": -1, "filename": "adeptRL-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "9b5432e28539a7aa86a12ed0a0dd6fa7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5.0", "size": 93078, "upload_time": "2018-08-28T15:03:02", "url": "https://files.pythonhosted.org/packages/d4/be/20b489139f7471e98a6a021d25a766e0dd0c7eadd108bebc062eaf88a218/adeptRL-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b5ad655a9592aa7eb2f699d0880df5ea", "sha256": "0001c87dc1f707cd8877035e9524852509df99c02ef8607f453f5dc145f1671e" }, "downloads": -1, "filename": "adeptRL-0.1.1.tar.gz", "has_sig": false, "md5_digest": "b5ad655a9592aa7eb2f699d0880df5ea", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0", "size": 48866, "upload_time": "2018-08-28T15:03:04", "url": "https://files.pythonhosted.org/packages/19/99/d994495f282af042c75f382184d946aef3ec0918dd9d8dd749bd6926084a/adeptRL-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9b5432e28539a7aa86a12ed0a0dd6fa7", "sha256": "8b1db429029149009097ba1bf9e51f48d9c41a36ad3e9fa150eea69855a26c8b" }, "downloads": -1, "filename": "adeptRL-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "9b5432e28539a7aa86a12ed0a0dd6fa7", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5.0", "size": 93078, "upload_time": "2018-08-28T15:03:02", "url": 
"https://files.pythonhosted.org/packages/d4/be/20b489139f7471e98a6a021d25a766e0dd0c7eadd108bebc062eaf88a218/adeptRL-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "b5ad655a9592aa7eb2f699d0880df5ea", "sha256": "0001c87dc1f707cd8877035e9524852509df99c02ef8607f453f5dc145f1671e" }, "downloads": -1, "filename": "adeptRL-0.1.1.tar.gz", "has_sig": false, "md5_digest": "b5ad655a9592aa7eb2f699d0880df5ea", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0", "size": 48866, "upload_time": "2018-08-28T15:03:04", "url": "https://files.pythonhosted.org/packages/19/99/d994495f282af042c75f382184d946aef3ec0918dd9d8dd749bd6926084a/adeptRL-0.1.1.tar.gz" } ] }