Implementation And Experimentation of Coherent Inference in Game Tree
Implementation in C++, forked from Charles-Lau-/thesis.
Usage of this program
You can either use the Makefile or the run script, or compile manually:
```bash
# Compile the program
g++ -std=c++11 -o playGame *.cpp algorithm/*.cpp

# Run it (and keep its output)
./playGame | tee -a ./playGame.log
```
You can have a look at the (long) playGame.log file for an example of its output.
:bug: Limitation?
The implementation is generic enough to play higher-dimensional Tic-Tac-Toe, not only the usual 3x3 game.
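For instance, a size-generic board state could be as small as the following sketch (purely illustrative: the repository's actual `Board` class is organized differently):

```cpp
#include <array>

// Minimal sketch of an N x N Tic-Tac-Toe board state; purely illustrative,
// the repository's real Board class is not templated like this.
template <int N>
struct TicTacToeBoard {
    std::array<char, N * N> cells{};  // 0 = empty cell, 'X' or 'O' otherwise
    char& at(int row, int col) { return cells[row * N + col]; }
};

// The 4x4 example below would then simply be a TicTacToeBoard<4>.
```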
Example with a 4x4 Tic-Tac-Toe
When the player X has to move, it examines the possible moves and computes the learned Gaussian value function for each position. For instance, imagine that X has to play on this board state:
```
O O X O
_ _ _ X
O _ X X
_ X X O
```
Then it will explore all the possible moves (6 in our example):
```
V value: Gaussian N(-0.224098, 0.862775)...
O O X O
_ _ _ X
O _ X X
_ X X O

... 2 other moves

V value: Gaussian N(0.939144, 0.521763)...  <-- The best one!
O O X O
_ _ X _
O _ X X
_ X X O

V value: Gaussian N(-1.40412, 0.9321)...
O O X O
_ _ _ _
O X X X
_ X X O

V value: Gaussian N(-0.658714, 0.932628)...
O O X O
_ _ _ _
O _ X X
X X X O
```
Finally, the Coherent Inference algorithm samples from each of these Gaussian distributions (one associated with each child of the current node), and selects the move corresponding to the highest sample (usually from the most optimistic Gaussian value function V).
In the example shown above, N(0.93, 0.52) has a mean of 0.93 (much higher than any other), so with high probability the sampling will select this action.
It is the only move that directly brings X to a winning position:
```
O O X O
_ _ X _
O _ X X
_ X X O
```
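In code, this selection step boils down to Thompson sampling over the children's Gaussian value functions. Here is a minimal self-contained sketch of the idea (the `Gaussian` struct and the `selectMove` name are illustrative, not the repository's actual types):

```cpp
#include <random>
#include <vector>

// Illustrative stand-in for a child's learned value function V ~ N(mean, stddev^2).
struct Gaussian { double mean, stddev; };

// Thompson sampling: draw one sample from each child's Gaussian V value and
// play the move with the highest sample, so that optimistic (high mean) or
// very uncertain (high stddev) moves are the ones picked most of the time.
int selectMove(const std::vector<Gaussian>& children, std::mt19937& rng) {
    int best = 0;
    double bestSample = -1e300;
    for (int i = 0; i < (int)children.size(); ++i) {
        std::normal_distribution<double> sample(children[i].mean, children[i].stddev);
        double s = sample(rng);
        if (s > bestSample) { bestSample = s; best = i; }
    }
    return best;  // index of the selected child, i.e. of the chosen move
}
```

With the six distributions printed above, this procedure would pick the move valued N(0.939, 0.522) with high probability, since its mean is far above the others relative to the standard deviations.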
This was only a very rough explanation of the algorithm. For all the details, please refer to the original paper, or to the slides and report of my project.
In the end, the coherent Gaussian inference player (X) wins 72% of the time, which for 4x4 Tic-Tac-Toe is an extremely satisfactory result! :tada:
Overview of the layout of the program
`Board.main()` (in the `code` folder)

Entry point of the program:
- In this function, the two computer players are initialized, and so is the `Board`.
- Then a variable `rounds` is set, which determines how many rounds are played between these two players (a rough sketch of this flow follows this list).
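As a hedged sketch (the stubs below only stand in for the real classes of the `code` folder, whose interfaces may differ), the flow is roughly:

```cpp
// Stubs standing in for the real classes of the code/ folder.
struct Board { void reset() {} };
struct ComputerPlayer { int algorithmPlay(Board&) { return 0; } };

int main() {
    ComputerPlayer playerX, playerO;  // the two computer players are initialized
    Board board;                      // ... and so is the Board
    int rounds = 100;                 // how many rounds the two players play
    for (int r = 0; r < rounds; ++r) {
        board.reset();
        // ... alternate playerX and playerO moves, using the identifier
        //     returned by each play to detect a win or a draw ...
        (void)playerX.algorithmPlay(board);
        (void)playerO.algorithmPlay(board);
    }
    return 0;
}
```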
`ComputerPlayer` (in the `code` folder)

Models a computer player:
- This module provides two functions, `randomPlay` and `algorithmPlay`. Both change the current board state, and both return an identifier that lets `Board.main()` determine whether the game should continue or has been won by a player.
- In `randomPlay`, the player just plays randomly, while in `algorithmPlay` the player plays according to the algorithm.
- In `algorithmPlay`, just three things are done: first, set up the parameters, like the number of descents and the root of the descent tree; second, call `Ep.descent` with this root as parameter; third, determine which move is the best one (a sketch of these steps follows this list).
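A hedged sketch of these three steps, with stub types standing in for the real ones (the helper `bestMoveOf` and the descent budget are hypothetical, not the repository's API):

```cpp
// Illustrative stubs; the real Node and Ep live in the algorithm/ folder.
struct Node { };
struct Ep { void descent(Node*) {} };
int bestMoveOf(Node*) { return 0; }  // hypothetical helper, not the repository's API

// Hedged sketch of the three steps of algorithmPlay described above.
int algorithmPlay() {
    const int nbDescents = 100;  // 1. set up the parameters: the descent budget
    Node root;                   //    (value assumed) and the root of the tree
    Ep ep;
    for (int i = 0; i < nbDescents; ++i)
        ep.descent(&root);       // 2. call Ep.descent with the root as parameter
    return bestMoveOf(&root);    // 3. determine which move is the best one
}
```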
`Ep` (Expectation Propagation) (in the `algorithm` folder)

Core of the Expectation Propagation algorithm:
- This module implements the algorithm itself, together with all the related message-passing methods.
- The `descent` method is the backbone; the other message-passing methods serve it (a rough sketch follows this list).
- Additionally, all the computations involving Gaussian distributions are carried out in the `Distribution` module.
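A very rough, illustrative sketch of what one descent does (this is not the repository's `Ep` code, whose message passing is much more detailed):

```cpp
#include <vector>

// Stub node so the sketch compiles; the real Node is described below.
struct Node {
    std::vector<Node*> children;
    bool isLeaf() const { return children.empty(); }
    void evaluate() {}               // leaf: evaluate the board position
    Node* pickChildBySampling() { return children.front(); }  // cf. the sampling step above
    void refreshFromChildren() {}    // message passing: combine the children's Gaussians
};

// Sketch of one descent: walk down the tree by sampling, then send messages
// back up so that each node's Gaussian V value stays coherent with its children.
void descent(Node* node) {
    if (node->isLeaf()) { node->evaluate(); return; }
    Node* child = node->pickChildBySampling();
    descent(child);
    node->refreshFromChildren();
}
```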
`Distribution` (in the `algorithm` folder)

- Does all the dirty work of computing with Gaussian distributions and correlation coefficients: all the messy distribution and correlation-coefficient calculations are implemented here.
- Particular attention should be paid to the min/max computation of multiple dependent Gaussian distributions (see the example below).
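For instance, the mean and variance of the max of two correlated Gaussians have a well-known closed form (Clark, 1961), which is exactly the kind of computation this module needs. The snippet below is a textbook implementation of that formula, not the repository's exact code:

```cpp
#include <cmath>

struct Gaussian { double mean, var; };

static double normPdf(double x) {
    static const double INV_SQRT_2PI = 0.3989422804014327;  // 1 / sqrt(2*pi)
    return INV_SQRT_2PI * std::exp(-0.5 * x * x);
}
static double normCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Moment-matched Gaussian approximation of max(X1, X2) for two Gaussians
// with correlation rho (Clark, 1961). Note that min(X1, X2) = -max(-X1, -X2).
Gaussian maxOfGaussians(const Gaussian& x1, const Gaussian& x2, double rho) {
    double a = std::sqrt(x1.var + x2.var - 2.0 * rho * std::sqrt(x1.var * x2.var));
    double alpha = (x1.mean - x2.mean) / a;  // assumes a > 0
    double mean = x1.mean * normCdf(alpha) + x2.mean * normCdf(-alpha)
                + a * normPdf(alpha);
    double second = (x1.mean * x1.mean + x1.var) * normCdf(alpha)
                  + (x2.mean * x2.mean + x2.var) * normCdf(-alpha)
                  + (x1.mean + x2.mean) * a * normPdf(alpha);
    return { mean, second - mean * mean };  // matched first and second moments
}
```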
`Node` (in the `algorithm` folder)

Supports building up the game tree:
- It is an entity module, which implements a game node in the game tree.
- Every node holds a pointer to its parent as well as pointers to its children.
- The bound values and distributions are also stored as its properties (a field-level sketch follows this list).
- Notice: the `getChild()` function varies with the `Board`: for a different game board, the strategy used to get the valid children of a given node changes.
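An illustrative, field-level sketch of such a node (the member names are assumptions, not the repository's exact ones):

```cpp
#include <vector>

// Illustrative sketch of the Node entity described above.
struct Node {
    Node* parent = nullptr;          // pointer to the parent node
    std::vector<Node*> children;     // pointers to the children nodes
    double mean = 0.0, var = 1.0;    // the Gaussian V value bound to this node
    // getChild() would enumerate the legal moves of the current Board here,
    // so its implementation has to change when the game/board changes.
};
```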
Other files?
- The rest of the project consists of programs that are either easier to understand or less useful.
- See this other project for a simpler Reinforcement Learning agent for 3x3 Tic-Tac-Toe.
Authors
- Forked and cleaned up by Lilian Besson (Naereen), on 28/12/2015, for my MVA master project for the “Reinforcement Learning” course.
- Created by Pengkun Liu, April - June 2015.
:scroll: License?
MIT Licensed (file LICENSE).