{ "info": { "author": "Reza Sherafat", "author_email": "sherafat.us@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Tic Tac Toe Game in OpenAI Gym\nThe 3D version of Tic Tac Toe is implemented as an OpenAI Gym environment. The [`learning`](./learning) folder includes several Jupyter notebooks with deep neural network models used to implement a computer-based player.\n\n## Complexity\nTraditional (2D) Tic Tac Toe has a very small state space: with 9 cells that can each be empty, `x`, or `o`, there are at most 3^9 = 19,683 board configurations. In comparison, the 3D version in this repo has a much larger space, on the order of 3^27 (roughly 7.6 trillion) configurations. This makes computer-based players that search and prune the game space prohibitively expensive.\n\nInstead, the current learning models are based on policy gradients and deep Q-learning. The [DQN model](learning/TicTacToe-RL-DQN-TF-v2.ipynb) has produced very promising results. Feel free to experiment on your own and contribute if interested. 
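The gap in scale between the two boards can be sanity-checked with plain arithmetic; this sketch is illustrative only and does not use the package:

```python
# Upper bounds on board configurations: each cell is empty, x, or o.
cells_2d = 3 * 3        # classic 3x3 board
cells_3d = 3 * 3 * 3    # 3x3x3 board used in this repo

space_2d = 3 ** cells_2d
space_3d = 3 ** cells_3d

print(space_2d)  # 19683
print(space_3d)  # 7625597484987 (about 7.6 trillion)
```

These are loose upper bounds, since many of the counted configurations are unreachable in legal play.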
The [PG-based model](learning/TicTacToe-RL-PG-TF.ipynb) needs more work :)\n\n## Contributions\nThe repo is open to pull requests and collaboration, both on game development and on the learning models.\n\n## Dependencies\n- Base dependency: `gym`.\n- Plot-rendering dependencies: `numpy`, `matplotlib`.\n- DQN learning dependencies: `tensorflow`, `numpy`.\n\n## Installation\nTo install, run:\n```console\n# In your virtual environment\npip install gym-tictactoe\n```\n\n## Usage\nCurrently, two types of environments with different rendering modes are supported.\n\n### Textual rendering\nTo use textual rendering, create the environment as `tictactoe-v0`:\n```python\nimport gym\nimport gym_tictactoe\n\ndef play_game(actions, step_fn=input):\n    env = gym.make('tictactoe-v0')\n    env.reset()\n\n    # Play each action in the action profile\n    for action in actions:\n        print(env.step(action))\n        env.render()\n        if step_fn:\n            step_fn()\n    return env\n\nactions = ['1021', '2111', '1221', '2222', '1121']\n_ = play_game(actions, None)\n```\nThe output produced is:\n\n```\nStep 1:\n- - - - - - - - - \n- - x - - - - - - \n- - - - - - - - - \n\nStep 2:\n- - - - - - - - - \n- - x - o - - - - \n- - - - - - - - - \n\nStep 3:\n- - - - - - - - - \n- - x - o - - - x \n- - - - - - - - - \n\nStep 4:\n- - - - - - - - - \n- - x - o - - - x \n- - - - - - - - o \n\nStep 5:\n- - - - - - - - - \n- - X - o X - - X \n- - - - - - - - o \n```\nThe winning sequence after gameplay: `(0,2,1), (1,2,1), (2,2,1)`.\n\n### Plotted rendering\nTo use plotted rendering, create the environment as `tictactoe-plt-v0`:\n```python\nimport gym\nimport gym_tictactoe\n\ndef play_game(actions, step_fn=input):\n    env = gym.make('tictactoe-plt-v0')\n    env.reset()\n\n    # Play each action in the action profile\n    for action in actions:\n        print(env.step(action))\n        env.render()\n        if step_fn:\n            step_fn()\n    return env\n\nactions = ['1021', '2111', '1221', '2222', '1121']\n_ = play_game(actions, None)\n```\nThis renders one `matplotlib` 3D plot per step (steps 1 through 5). *(The plot images are omitted from this description; see the repository README.)*
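Comparing the action strings above with the reported winning sequence, each four-character action appears to encode a player digit followed by three board coordinates (e.g. `'1021'` plays player 1 at `(0,2,1)`). A hypothetical decoder sketch, inferred from the README examples only (it is not part of the package API; `env.step()` takes the raw string):

```python
def decode_action(action: str):
    """Split a 4-character action like '1021' into (player, (z, y, x)).

    Hypothetical helper inferred from the examples above: the first digit
    appears to select the player (1 = x, 2 = o) and the remaining three
    digits the cell coordinates.
    """
    player = int(action[0])
    coords = tuple(int(c) for c in action[1:])
    return player, coords

print(decode_action('1021'))  # (1, (0, 2, 1))
print(decode_action('2222'))  # (2, (2, 2, 2))
```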
\n\n\n## DQN Learning\nThe current models are in the [`learning`](./learning) folder. See this [Jupyter notebook](./learning/TicTacToe-RL-DQN-TF-v2-eval.ipynb) for DQN learning with a two-layer neural network and an actor-critic technique.\n\nSample game plays produced by the trained model (the winning sequence is `(0,0,0), (1,0,0), (2,0,0)`). *(The gameplay plots are omitted from this description; see the repository README.)*
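The DQN notebook replaces a Q-table with a TensorFlow network, but the update it approximates is the standard tabular Q-learning rule. A minimal, package-independent sketch of that rule (the `alpha` and `gamma` values here are illustrative assumptions, not taken from the notebook):

```python
# Tabular Q-learning update that a DQN approximates with a neural network.
from collections import defaultdict

Q = defaultdict(float)       # maps (state, action) -> estimated value
alpha, gamma = 0.1, 0.9      # learning rate and discount (assumed values)

def q_update(state, action, reward, next_state, next_actions):
    # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# One update from an empty table: 0 + 0.1 * (1.0 + 0.9 * 0 - 0) = 0.1
q_update('s0', '1021', 1.0, 's1', ['2111', '2222'])
print(Q[('s0', '1021')])  # 0.1
```

A DQN trades the table for a function approximator so that the 3^27-sized state space never has to be enumerated.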
\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/shkreza/gym-tictactoe", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "gym-tictactoe", "package_url": "https://pypi.org/project/gym-tictactoe/", "platform": "", "project_url": "https://pypi.org/project/gym-tictactoe/", "project_urls": { "Homepage": "https://github.com/shkreza/gym-tictactoe" }, "release_url": "https://pypi.org/project/gym-tictactoe/0.30/", "requires_dist": [ "gym", "matplotlib", "numpy" ], "requires_python": "", "summary": "Tic-Tac-Toe environment in OpenAI gym", "version": "0.30" }, "last_serial": 4667123, "releases": { "0.30": [ { "comment_text": "", "digests": { "md5": "7ef844626c8d534fc9ab349744a8f527", "sha256": "389e9078990a1f1419dfe09f3881e75d0d8405cbc79f34ab7e470d3bdca91c0c" }, "downloads": -1, "filename": "gym_tictactoe-0.30-py3-none-any.whl", "has_sig": false, "md5_digest": "7ef844626c8d534fc9ab349744a8f527", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5800, "upload_time": "2019-01-06T19:16:10", "url": "https://files.pythonhosted.org/packages/6e/29/368a5dc8abc95ced695c458fa5bf8175f5941ed8404b2f0337bc331506d2/gym_tictactoe-0.30-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "7ef844626c8d534fc9ab349744a8f527", "sha256": "389e9078990a1f1419dfe09f3881e75d0d8405cbc79f34ab7e470d3bdca91c0c" }, "downloads": -1, "filename": "gym_tictactoe-0.30-py3-none-any.whl", "has_sig": false, "md5_digest": "7ef844626c8d534fc9ab349744a8f527", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5800, "upload_time": "2019-01-06T19:16:10", "url": "https://files.pythonhosted.org/packages/6e/29/368a5dc8abc95ced695c458fa5bf8175f5941ed8404b2f0337bc331506d2/gym_tictactoe-0.30-py3-none-any.whl" } ] }