This project uses a Q-learning algorithm to train a game-playing agent that learns to play 3x3 Dots and Boxes optimally against an automated player.
The starting state is an empty grid of dots (16 dots for a 3x3 board). The two players take turns making a move; a move consists of adding a horizontal or vertical line between two unjoined adjacent dots. If a move completes a 1x1 box, the player who made that move wins the box (that is, scores a point) and also retains their turn. The game ends when no moves remain. The player with the most points wins the game.
Deciding how to store and represent the game is a bit tricky, since both the dots and the edges between them are part of the game state. However, representing both dots and edges is not feasible, since doing so requires either multiple lists or nested lists, neither of which is viable as input to the neural network.
One can, however, observe that the dots are the same in every state. Hence, a game state can be represented solely by its edges. All edges in the game are stored in a single list (of length 24, since a 3x3 game has 24 edges), with 0 denoting that an edge has not been drawn and 1 denoting that it has.
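The representation described above can be sketched in a few lines of plain Python. This is only an illustration, not the project's actual code: the helper names (`new_state`, `available_moves`, `apply_move`) are hypothetical, and the real implementation may well use NumPy arrays instead of lists.

```python
N_EDGES = 24  # 12 horizontal + 12 vertical edges on a 3x3 board


def new_state():
    """Return the empty board: no edge has been drawn yet."""
    return [0] * N_EDGES


def available_moves(state):
    """Return the indices of all edges that are still undrawn."""
    return [i for i, edge in enumerate(state) if edge == 0]


def apply_move(state, edge):
    """Return a copy of the state with the given edge drawn."""
    if state[edge] != 0:
        raise ValueError(f"edge {edge} is already drawn")
    next_state = list(state)
    next_state[edge] = 1
    return next_state


state = apply_move(new_state(), 5)
print(state)
print(len(available_moves(state)))
```

Because the state is a flat list of 0s and 1s with a fixed length, it can be fed to a network as a single input vector without further reshaping.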
- Python 3
- NumPy
- TensorFlow 1.1
- Anaconda
Clone the repository
Open a terminal
Run python3 "dots&boxes_Play.py" (quote the filename, since an unquoted & is treated as a shell operator)
This command starts a console-based game of Dots and Boxes in which you play against the trained model on a 3x3 board.
You can also choose to play as player 2 by passing the optional --player-choice parameter:
python3 "dots&boxes_Play.py" --player-choice 2
Edges are indexed in the following order (horizontal edges are 0 to 11, vertical edges are 12 to 23):
· 0 · 1 · 2 ·
12 13 14 15
· 3 · 4 · 5 ·
16 17 18 19
· 6 · 7 · 8 ·
20 21 22 23
· 9 · 10 · 11 ·
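Given this ordering, the four edges around each box follow from simple arithmetic, which makes it easy to check whether a move completed a box. The sketch below is a hedged illustration: `box_edges` and `boxes_completed` are hypothetical names, and the index formulas are derived from the diagram above rather than taken from the project's code.

```python
def box_edges(row, col):
    """Return the (top, bottom, left, right) edge indices of a box.

    Rows and columns are 0-based. Horizontal edges are numbered 0-11
    (three per row of dots), vertical edges 12-23 (four per row of boxes).
    """
    top = 3 * row + col
    bottom = 3 * (row + 1) + col
    left = 12 + 4 * row + col
    right = left + 1
    return (top, bottom, left, right)


# All nine boxes of the 3x3 board, in row-major order.
BOXES = [box_edges(r, c) for r in range(3) for c in range(3)]


def boxes_completed(state, edge):
    """Count the boxes that contain `edge` and are fully drawn in `state`.

    Call this on the state *after* the move; a non-zero result means the
    mover scores that many points and keeps the turn.
    """
    return sum(
        1 for box in BOXES
        if edge in box and all(state[e] == 1 for e in box)
    )


# Draw three sides of the top-left box, then close it with edge 13.
state = [0] * 24
for e in (0, 3, 12, 13):
    state[e] = 1
print(box_edges(0, 0))
print(boxes_completed(state, 13))
```

An interior edge such as 13 borders two boxes, so a single move can complete up to two boxes at once.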
Open Anaconda Navigator
Activate your virtual environment
Open a Jupyter Notebook
Run dots&boxes_train.ipynb
Currently, this file is hard-coded to halt at 10000 iterations.
Since the current model has already been trained for 10000 iterations,
you must either delete the previous model AND log file (located in the models/size3 subdirectory) or
manually change the iteration limit in the Train_Dots_Boxes.ipynb main program if you wish to run a training simulation.
The iteration limit is set by the n_games parameter.