{ "info": { "author": "Reza Sherafat", "author_email": "sherafat.us@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Tic Tac Toe Game in OpenAI Gym\nThe 3D version of Tic Tac Toe is implemented as an OpenAI's Gym environment. The [`learning`](./learning) folder includes several Jupyter notebooks for deep neural network models used to implement a computer-based player.\n\n## Complexity\nThe traditional (2D) Tic Tac Toe has a very small game space (9^3). In comparison, the 3D version in this repo has a much larger space which is in the order of 81^3. This makes computer-based players using search and pruning techniques of the game space prohibitively expensive.\n\nRather, the current learning models are based on policy gradient and deep Q-learning. The [DQN model](learning/TicTacToe-RL-DQN-TF-v2.ipynb) has produced very promising results. Feel free to experience on your own and contribute if interested. The [PG-based model](learning/TicTacToe-RL-PG-TF.ipynb) needs more work :)\n\n## Contributions\nThe repo is also open for pull requests and collaborations both in game development as well as learning.\n\n## Dependencies\n- Base dependency: `gym`.\n- Plot-rendering dependencies: `numpy`, `matplotlib`.\n- DQN learning dependencies: `tensorflow`, `numpy`.\n\n## Installation\nTo install run:\n```console\n# In your virtual environment\npip install gym-tictactoe\n```\n\n## Usage\nCurrently 2 types of environments with different rendering modes are supported.\n\n### Textual rendering\nTo use textual rendering create environment as `tictactoe-v0` like so:\n```python\nimport gym\nimport gym_tictactoe\n\ndef play_game(actions, step_fn=input):\n env = gym.make('tictactoe-v0')\n env.reset()\n\n # Play actions in action profile\n for action in actions:\n print(env.step(action))\n env.render()\n if step_fn:\n step_fn()\n return env\n\nactions = ['1021', '2111', '1221', '2222', '1121']\n_ = play_game(actions, None)\n```\nThe output produced is:\n\n```\nStep 1:\n- - - - - - - - - \n- - x - - - - - - \n- - - - - - - - - \n\nStep 2:\n- - - - - - - - - \n- - x - o - - - - \n- - - - - - - - - \n\nStep 3:\n- - - - - - - - - \n- - x - o - - - x \n- - - - - - - - - \n\nStep 4:\n- - - - - - - - - \n- - x - o - - - x \n- - - - - - - - o \n\nStep 5:\n- - - - - - - - - \n- - X - o X - - X \n- - - - - - - - o \n```\nThe winning sequence after gameplay: `(0,2,1), (1,2,1), (2,2,1)`.\n\n### Plotted rendering\nTo use textual rendering create environment as `tictactoe-plt-v0` like so:\n```python\nimport gym\nimport gym_tictactoe\n\ndef play_game(actions, step_fn=input):\n env = gym.make('tictactoe-plt-v0')\n env.reset()\n\n # Play actions in action profile\n for action in actions:\n print(env.step(action))\n env.render()\n if step_fn:\n step_fn()\n return env\n\nactions = ['1021', '2111', '1221', '2222', '1121']\n_ = play_game(actions, None)\n```\nThis produces the following gameplay:\n\nStep 1:\n