GitHub - fumin/tdttt: Temporal difference reinforcement learning for the game of Tic-Tac-Toe

Reinforcement learning AI for the game of tic-tac-toe

This project implements the SARSA and Q-learning algorithms. Training is bootstrapped by having two random agents play against each other. During game play, the agents neither know nor need the rules and heuristics of tic-tac-toe. For example, these agents do not know that a state with only one move left is equivalent to an end game. In fact, throughout the training process the only factors affecting the agents are:

the definition of winning and losing in tic-tac-toe, i.e. the existence of a horizontal, vertical, or diagonal line
and the hyperparameters of the learning algorithm
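
The two factors above can be sketched in a few lines of Python. This is a minimal illustration, not the code in train.py or tictactoe.py: the board encoding, the function names, and the hyperparameter values (ALPHA, GAMMA, EPSILON) are all assumptions made for the example. It shows a win check based only on lines of three, plus the standard tabular SARSA and Q-learning updates, which differ only in how the next state's value is estimated.

```python
import random
from collections import defaultdict

# The eight lines that decide a game: 3 rows, 3 columns, 2 diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that mark fills a line, else None.

    `board` is a sequence of 9 cells holding 'X', 'O', or ' '
    (a hypothetical encoding chosen for this sketch).
    """
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

# Hypothetical hyperparameters: learning rate, discount, exploration rate.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def epsilon_greedy(q, state, actions, rng=random):
    """Pick a random action with probability EPSILON, else the greedy one."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def sarsa_update(q, s, a, reward, s_next, a_next):
    """On-policy: bootstrap from the action actually chosen in s_next."""
    target = reward + GAMMA * q[(s_next, a_next)]
    q[(s, a)] += ALPHA * (target - q[(s, a)])

def q_learning_update(q, s, a, reward, s_next, next_actions):
    """Off-policy: bootstrap from the best action available in s_next."""
    # A terminal state has no next actions, so its value is taken as 0.
    best = max((q[(s_next, a2)] for a2 in next_actions), default=0.0)
    target = reward + GAMMA * best
    q[(s, a)] += ALPHA * (target - q[(s, a)])

# Usage: q maps (state, action) pairs to values, defaulting to 0.
q = defaultdict(float)
```

Note that nothing here encodes strategy: the agent only sees a reward when winner reports a completed line, and everything else is learned through the updates.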

The following graph shows the reward per episode of both the SARSA and Q-learning algorithms during training:

[Figure: reward per episode during training, SARSA and Q-learning]
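
Per-episode rewards from self-play tend to be noisy, so curves of this kind are often smoothed before plotting. Whether the original graph used any smoothing is not stated. The following is a minimal sketch of a moving average over episode rewards, assuming a hypothetical reward scheme of +1 for a win, 0 for a draw, and -1 for a loss; the function name and window size are illustrative only.

```python
def moving_average(rewards, window):
    """Average each run of `window` consecutive episode rewards."""
    if window <= 0 or window > len(rewards):
        raise ValueError("window must be between 1 and len(rewards)")
    averages = []
    running = sum(rewards[:window])  # sum of the first window
    averages.append(running / window)
    for i in range(window, len(rewards)):
        # Slide the window: add the newest reward, drop the oldest.
        running += rewards[i] - rewards[i - window]
        averages.append(running / window)
    return averages

# Hypothetical rewards for four episodes: win, draw, loss, win.
smoothed = moving_average([1, 0, -1, 1], window=2)
```

A larger window gives a smoother curve at the cost of hiding short-term changes in the agent's performance.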

Running this code

To run this code, run python main.py --algo sarsa to start a SARSA training session, or python main.py --algo qlearning for a Q-learning one. In both cases, a terminal interface is provided after training so that you can play against the trained agents.

Run tests

To run the tests of this code, run python testing.py.

Folders and files

readme_static
.gitignore
README.md
game.py
main.py
testing.py
tictactoe.py
tictactoe_test.py
train.py
train_test.py
