Title: Move Prediction in Go with the Maximum Entropy Method
1. Move Prediction in Go with the Maximum Entropy Method
- Nobuo Araki
- The University of Tokyo
2. Contents
- About Go
- What is move prediction?
- Background
- Our Method
- Experiment
- Discussion and Conclusion
3. About Go
4. About Go
- Go is a board game played by two players, black and white.
- The two players place stones alternately (placing a stone is called a move).
5. About Go
- This figure shows a Go match midway through.
- A match typically ends after about 250 moves.
6. About Go
- The goal of Go is to surround as large an area as possible.
- Black 68, White 64.5: Black won.
7. About Go
- Unlike chess programs, Go programs are still very weak, probably weaker than mid-level amateur players.
- So, making strong Go programs is a hot topic in AI research.
8. What is move prediction?
9. What is move prediction?
- In Go, there are many legal moves.
- In this position, every empty intersection is a legal place for a stone.
10. What is move prediction?
- Unlike in chess, there are too many legal moves to apply brute-force game-tree search.
- Chess: an 8×8 board.
- Go: usually 19×19, a very large move space.
11. What is move prediction?
- But in Go, strong human players can select a small number of candidate moves instantly, without thinking deeply.
12. What is move prediction?
- Move prediction is having a computer program make such a selection.
13. What is move prediction?
- In Go programs, forward pruning by move prediction is necessary.
- In this research, we tried to attain more accurate move prediction than existing approaches.
14. Background
15.
- First, I will explain Stern et al.'s research, which is the basis of this research.
- Next, I will explain the Maximum Entropy Method, which we used in this research.
16. Background: Stern et al.'s work
- Stern et al.'s research (2005, 2006) is the basis of our research.
- They tried to predict moves by simple pattern matching of stone positions.
17. Background: Stern et al.'s work
- In Go, patterns of stone positions are very important in tactics.
- In Stern et al. (2006), they also used tactical features beyond stone positions in pattern matching, but we did not use them.
18. Background: Stern et al.'s work
- Preparing pattern templates
- Constructing pattern dictionary
- Scoring patterns
- Predicting moves
19. Background: Stern et al.'s work
- First, pattern templates were prepared.
- These templates defined the shape and range of the stone-position patterns.
20. Background: Stern et al.'s work
- Templates of various sizes were prepared, because using only large templates causes sparseness problems.
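Stern et al.'s templates are nested regions of several sizes around a candidate move. As a rough illustration (not their exact template shapes, which the slides do not specify), concentric square windows can be extracted like this, with off-board points marked so that patterns near the edge stay distinct:

```python
def extract_patterns(board, x, y, radii=(1, 2, 3)):
    """Extract nested square patterns centred on (x, y), one per template size.

    board: dict mapping (col, row) -> 'B' or 'W'; missing points are empty ('.').
    Off-board points are encoded as '#' so edge patterns hash differently.
    """
    size = 19
    patterns = []
    for r in radii:
        cells = []
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                p = (x + dx, y + dy)
                if 0 <= p[0] < size and 0 <= p[1] < size:
                    cells.append(board.get(p, '.'))
                else:
                    cells.append('#')  # off-board marker
        patterns.append(tuple(cells))
    return patterns

# Three nested patterns around a move at (3, 3), with one stone of each colour nearby:
pats = extract_patterns({(3, 3): 'B', (4, 3): 'W'}, 3, 3)
```

Each candidate move thus yields one pattern per template size, which mitigates the sparseness of the largest templates.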
21. Background: Stern et al.'s work
- Next, they constructed a pattern dictionary by extracting the patterns around actual expert moves.
22. Background: Stern et al.'s work
- With all the pattern templates, the patterns around the actual moves of expert games were extracted.
23. Background: Stern et al.'s work
- Then, they scored the patterns in the pattern dictionary:
- a pattern around the actual expert move is a positive example;
- a pattern around any other legal move is a negative example.
24. Background: Stern et al.'s work
- Using the pattern scores, they predicted moves:
- positive-example patterns are more likely to be selected than negative-example patterns.
25. Background: Stern et al.'s work
- When storing and comparing patterns, they used Zobrist hashing (1990).
- Zobrist hashing let them store and compare patterns quickly while limiting the storage space used.
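Zobrist hashing assigns a fixed random key to every (point, colour) pair and XORs together the keys of the stones in a pattern. A minimal sketch (the 64-bit key width and the data layout are assumptions for illustration, not details from the slides):

```python
import random

BOARD = 19
BLACK, WHITE = 1, 2

random.seed(0)  # fixed seed so the keys are reproducible in this sketch
# One random 64-bit key per (point, colour) pair.
KEYS = [[random.getrandbits(64), random.getrandbits(64)]
        for _ in range(BOARD * BOARD)]

def pattern_hash(stones):
    """Hash a pattern given as {point_index: colour}.

    XOR is commutative, so the hash is independent of insertion order,
    and adding or removing a stone is a single XOR.
    """
    h = 0
    for idx, colour in stones.items():
        h ^= KEYS[idx][colour - 1]
    return h
```

Because XOR is its own inverse, placing and then removing a stone restores the original hash, which is what makes incremental pattern lookups fast.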
26. Background: Stern et al.'s work
- 20,000 matches as training data, 500 matches as test data: 26% prediction accuracy (2005).
- 181,000 matches as training data, 4,400 matches as test data: 34% prediction accuracy (2006).
27. Background: Maximum Entropy Method
- We used the Maximum Entropy Method (MEM) (Berger et al., 1996) for re-ranking.
- MEM is a machine learning method.
28. Background: Maximum Entropy Method
- [Diagram: labeled samples train the MEM system; given an input sample, the trained system outputs P(label, sample) for each label.]
29. Background: Maximum Entropy Method
- In this research, we could get P(positive, sample) and P(negative, sample) from MEM.
- In other words, MEM was used as a binary classifier.
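For binary labels with indicator features, a maximum-entropy model coincides with logistic regression, so the classifier can be sketched with simple per-sample gradient updates. This toy trainer is an illustration only, not the authors' implementation (the feature indices, learning rate, and epoch count are made up):

```python
import math

def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Toy binary maximum-entropy model over sets of binary feature indices,
    trained by per-sample gradient ascent on the log-likelihood."""
    n_feats = 1 + max(f for s in samples for f in s)
    w = [0.0] * n_feats
    for _ in range(epochs):
        for s, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-sum(w[f] for f in s)))
            for f in s:
                w[f] += lr * (y - p)  # gradient of the log-likelihood
    return w

def p_positive(w, s):
    """P(positive | sample) under the trained model."""
    return 1.0 / (1.0 + math.exp(-sum(w[f] for f in s)))

# Invented toy data: feature 0 marks positive samples, feature 1 negative ones.
samples = [{0}, {0}, {1}, {1}]
labels = [1, 1, 0, 0]
w = train_maxent(samples, labels)
```

After training, `p_positive` plays the role of P(positive, sample) in the re-ranking step.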
30. Background: Maximum Entropy Method
- MEM can manage multiple overlapping features.
- MEM is often used in Natural Language Processing
(NLP) research.
31. Background: Maximum Entropy Method
- MEM consumes a lot of memory, so we could not apply it to all legal moves directly. Therefore, we applied MEM for re-ranking instead.
32. Our Method
33. Our Method: Overview
- [Diagram: legal moves are ranked by the relative-frequency module, the top of the ranking is re-ranked by the MEM module, and a ranked list is output; both modules are trained on labeled samples.]
34. Our Method: Features
- Strong human players often select moves that are consistent with the previous moves.
- So, we added information about the previous moves as features for machine learning, to attain more accurate move prediction. (MEM can manage multiple overlapping features.)
35. Our Method: Features
- We added
- the relative coordinates to the 4 previous moves
- the coordinates of the current move
- as features for machine learning.
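A hypothetical encoding of these features as strings (the exact feature representation is not given in the slides; the names `coord` and `rel-i` are invented):

```python
def move_features(move, previous_moves):
    """Features for a candidate move: its own coordinates plus the relative
    coordinates (offsets) to each of the up-to-4 most recent moves."""
    x, y = move
    feats = [f"coord:{x},{y}"]
    for i, (px, py) in enumerate(previous_moves[-4:], start=1):
        feats.append(f"rel-{i}:{x - px},{y - py}")
    return feats

move_features((10, 10), [(3, 3), (16, 4), (9, 10), (10, 11)])
# → ['coord:10,10', 'rel-1:7,7', 'rel-2:-6,6', 'rel-3:1,0', 'rel-4:0,-1']
```

Each string is treated as one binary feature, which is exactly the kind of overlapping feature set MEM can handle.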
36. Our Method: Training Our System
- First, we constructed a pattern dictionary in the same way as Stern et al.
- Then, we learned the relative frequencies of the stone-position patterns using half of the training data.
37. Our Method: Training Our System
- [Diagram: the relative-frequency module is trained first, on labeled samples.]
38. Our Method: Training Our System
- We prepared two counters, "used" and "unused", for each pattern.
- If a pattern appears around the actual expert move, its "used" counter is incremented.
- If a pattern appears around another legal move, its "unused" counter is incremented.
39. Our Method: Training Our System
- [Figure: around the actual expert move, "used" is incremented; around the other legal moves, "unused" is incremented.]
40. Our Method: Training Our System
- We calculated the relative frequencies using Laplace's law (add-one smoothing):
- freq(p) = (used(p) + 1) / (used(p) + unused(p) + 2)
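Putting the two counters and Laplace's law together, a minimal sketch (assuming the add-one form of Laplace's law; the pattern names are invented):

```python
from collections import defaultdict

used = defaultdict(int)    # pattern appeared around the actual expert move
unused = defaultdict(int)  # pattern appeared around some other legal move

def observe(expert_pattern, other_legal_patterns):
    """Update both counters for one position of a training game."""
    used[expert_pattern] += 1
    for p in other_legal_patterns:
        unused[p] += 1

def rel_freq(p):
    """Relative frequency smoothed with Laplace's law, so that a pattern
    never observed gets 1/2 instead of an undefined 0/0."""
    return (used[p] + 1) / (used[p] + unused[p] + 2)

# Hypothetical patterns: "p1" chosen twice by the expert, "p2" passed over twice.
observe("p1", ["p2"])
observe("p1", ["p2"])
```

With these counts, `rel_freq("p1")` is (2+1)/(2+0+2) = 0.75 and `rel_freq("p2")` is (0+1)/(0+2+2) = 0.25.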
41. Our Method: Training Our System
- Then, we trained our MEM system using the rest of the training data.
- First, we ranked all legal moves by their relative frequencies.
- Then, we trained our MEM system on the top n (20-80) moves in that ranking.
42. Our Method: Training Our System
- [Diagram: the MEM module is trained on labeled samples produced using the relative-frequency module's ranking.]
43. Our Method: Training Our System
- The features for machine learning were:
- A pattern around the move
- Coordinates of the move
- Relative coordinates to the 4 previous moves
44. Our Method: Training Our System
- The actual expert moves were labeled as positive.
- The other legal moves were labeled as negative.
45. Our Method: Training Our System
- [Figure: the actual expert move is labeled positive; the other legal moves are labeled negative.]
46. Our Method: Predicting Moves
- First, we ranked all legal moves by their relative frequencies.
- Then, we re-ranked the top n (20-80) with MEM.
47. Our Method: Predicting Moves
- [Diagram: legal moves are ranked by the relative-frequency module, then the top of the ranking is re-ranked by the MEM module to produce the final ranked list.]
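The two-stage prediction can be sketched as follows; `rel_freq` and `mem_p_positive` stand in for the two trained modules, and the toy scores are invented for illustration:

```python
def predict(legal_moves, rel_freq, mem_p_positive, n=20):
    """Rank every legal move by relative frequency, then re-rank only the
    top n candidates by the MEM classifier's positive probability."""
    ranked = sorted(legal_moves, key=rel_freq, reverse=True)
    head, tail = ranked[:n], ranked[n:]
    head.sort(key=mem_p_positive, reverse=True)  # MEM touches only the head
    return head + tail

# Invented scores standing in for the two trained modules:
freq = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}.get
mem = {"a": 0.2, "b": 0.6, "c": 0.9, "d": 0.99}.get
predict(["a", "b", "c", "d"], freq, mem, n=3)  # → ['c', 'b', 'a', 'd']
```

Note that a move outside the top n (here "d") keeps its frequency rank even though MEM would favour it, which is why the re-ranking amount matters in the experiments below.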
48. Experiment
49. Experiment: Data
- We conducted our experiment using the game records in the GoGoD database (T. Mark Hall et al.). This database contains human expert games.
50. Experiment: Changing the re-ranking amount
- We used 2,000 matches as the training set and 500 matches as the development set. We varied the amount used for re-ranking and evaluated the accuracy.
51. Experiment: Changing the re-ranking amount
- [Diagram: the same pipeline as before; the amount of the ranked list passed to the MEM module for re-ranking was varied.]
52. Experiment: Changing the re-ranking amount
- The results are as follows. Column 0 is the result without re-ranking. "The cumulative density at rank x is y" means that y% of all expert moves are within the top x of the ranking produced by our system.
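The cumulative-density measure can be stated in a few lines (the ranks below are invented for illustration):

```python
def cumulative_density(expert_ranks, x):
    """Fraction of expert moves whose rank in the system's output is <= x."""
    return sum(1 for r in expert_ranks if r <= x) / len(expert_ranks)

# Invented ranks our system might assign to five expert moves:
ranks = [1, 3, 7, 25, 2]
cumulative_density(ranks, 5)  # → 0.6 (3 of the 5 expert moves are in the top 5)
```

The value at rank 1 is plain prediction accuracy; larger x values show how useful the ranking is for forward pruning.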
53. Experiment: Changing the re-ranking amount
- At ranks 1, 5, and 10, the results with re-ranking were better than those without re-ranking.
54. Experiment: Changing the re-ranking amount
- However, at the other ranks, re-ranking did not yield better results than no re-ranking.
55. Experiment: Changing the re-ranking amount
- Increasing the amount used for re-ranking had a bad influence at rank 1,
56. Experiment: Changing the re-ranking amount
- but a good influence at ranks 10, 20, 40, and 60.
57. Experiment: Comparison with Stern et al.'s work
- We used 20,000 matches as the training set and 500 matches as the test set, with the re-ranking amount set to 20, and compared our system with Stern et al.'s work (2005, 2006).
58. Experiment: Comparison with Stern et al.'s work
- Compared with Stern et al.'s work (2005), our system yielded better cumulative density at ranks 1, 5, and 10.
59. Experiment: Comparison with Stern et al.'s work
- However, our system was outperformed at rank 20.
60. Experiment: Comparison with Stern et al.'s work
- We could not obtain better results than Stern et al.'s work (2006) with the current size of our training data.
61. Experiment: Match against GnuGo
- We had our system (move prediction only) play against GnuGo 3.6.
- Many of our system's moves were not bad, but some moves caused it to lose.
62. Discussion and Conclusion
63. Discussion and Conclusion
- We attained high accuracy with a relatively small amount of training data.
- Our system lost to GnuGo, but many of its moves were not bad.
- We may obtain better results than Stern et al.'s work (2006) if we use more training data.
64. Discussion and Conclusion
- There may be problems with using the previous moves as features for machine learning.
- In the training data, there are few situations in which the previous moves are bad moves.
- Situations in which the previous moves are bad moves can harm the move prediction.
65.
- Any questions?
- Please speak slowly and clearly.
66.
- In our paper, we said that we added patterns around the previous moves as features for machine learning.
- But they are meaningless, because the patterns around the previous moves are the same for all the legal moves.
- These patterns don't contribute to scoring.