Title: Move Prediction in Go with the Maximum Entropy Method
1. Move Prediction in Go with the Maximum Entropy Method
- Nobuo Araki
- The University of Tokyo
2. Contents
- About Go
- What is move prediction?
- Background
- Our Method
- Experiment
- Discussion and Conclusion
3. About Go
4. About Go
- Go is a board game played by two players, black and white.
- The two players place stones alternately (placing a stone is called a move).
5. About Go
- This figure shows a Go match midway through.
- A match typically ends after about 250 moves.
6. About Go
- The goal of Go is to surround as large an area as possible.
- Black 68, White 64.5: Black won.
7. About Go
- Unlike chess programs, Go programs are still very weak, probably weaker than mid-level amateur players.
- So, making strong Go programs is a hot topic in AI research.
8. What is move prediction?
9. What is move prediction?
- In Go, there are many legal moves.
- In this position, every empty intersection is a legal place for a stone.
10. What is move prediction?
- Unlike in chess, there are too many legal moves to apply brute-force game-tree search.
- Chess: an 8×8 board.
- Go: usually 19×19, a very large move space.
11. What is move prediction?
- But in Go, strong human players can select a small number of candidate moves instantly, without thinking deeply.
12. What is move prediction?
- Move prediction is having a computer program make such a selection.
13. What is move prediction?
- In Go programs, forward pruning by move prediction is necessary.
- In this research, we tried to attain more accurate move prediction than existing approaches.
14. Background
15.
- First, I will explain Stern et al.'s research, which is the basis of this research.
- Next, I will explain the Maximum Entropy Method, which we used in this research.
16. Background: Stern et al.'s work
- Stern et al.'s research (2005, 2006) is the basis of our research.
- They tried to predict moves by simple pattern matching of stone positions.
17. Background: Stern et al.'s work
- In Go, patterns of stone positions are very important in tactics.
- In Stern et al. (2006), they also used tactical features beyond stone positions in pattern matching, but we did not use them.
18. Background: Stern et al.'s work
- Preparing pattern templates
- Constructing pattern dictionary
- Scoring patterns
- Predicting moves
19. Background: Stern et al.'s work
- First, pattern templates were prepared.
- These templates defined the shape and range of the stone-position patterns.
20. Background: Stern et al.'s work
- Templates of various sizes were prepared, because using only large templates causes sparseness problems.
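Stern et al.'s templates are nested regions of several sizes around a candidate move. As a rough illustration (not their exact template shapes, which the slides do not specify), concentric square windows can be extracted like this, with off-board points marked so that patterns near the edge stay distinct:

```python
def extract_patterns(board, x, y, radii=(1, 2, 3)):
    """Extract nested square patterns centred on (x, y), one per template size.

    board: dict mapping (col, row) -> 'B' or 'W'; missing points are empty ('.').
    Off-board points are encoded as '#' so edge patterns hash differently.
    """
    size = 19
    patterns = []
    for r in radii:
        cells = []
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                p = (x + dx, y + dy)
                if 0 <= p[0] < size and 0 <= p[1] < size:
                    cells.append(board.get(p, '.'))
                else:
                    cells.append('#')  # off-board marker
        patterns.append(tuple(cells))
    return patterns

# Three nested patterns around a move at (3, 3), with one stone of each colour nearby:
pats = extract_patterns({(3, 3): 'B', (4, 3): 'W'}, 3, 3)
```

Each candidate move thus yields one pattern per template size, which mitigates the sparseness of the largest templates.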
21. Background: Stern et al.'s work
- Next, they constructed a pattern dictionary by extracting the patterns around actual expert moves.
22. Background: Stern et al.'s work
- With all the pattern templates, the patterns around the actual moves of expert games were extracted.
23. Background: Stern et al.'s work
- Then, they scored the patterns in the pattern dictionary:
- a pattern around the actual expert move is a positive example;
- a pattern around any other legal move is a negative example.
24. Background: Stern et al.'s work
- Using the pattern scores, they predicted moves:
- positive-example patterns are more likely to be selected than negative-example patterns.
25. Background: Stern et al.'s work
- When storing and comparing patterns, they used Zobrist hashing (1990).
- Zobrist hashing let them store and compare patterns quickly while limiting the storage space used.
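Zobrist hashing assigns a fixed random key to every (point, colour) pair and XORs together the keys of the stones in a pattern. A minimal sketch (the 64-bit key width and the data layout are assumptions for illustration, not details from the slides):

```python
import random

BOARD = 19
BLACK, WHITE = 1, 2

random.seed(0)  # fixed seed so the keys are reproducible in this sketch
# One random 64-bit key per (point, colour) pair.
KEYS = [[random.getrandbits(64), random.getrandbits(64)]
        for _ in range(BOARD * BOARD)]

def pattern_hash(stones):
    """Hash a pattern given as {point_index: colour}.

    XOR is commutative, so the hash is independent of insertion order,
    and adding or removing a stone is a single XOR.
    """
    h = 0
    for idx, colour in stones.items():
        h ^= KEYS[idx][colour - 1]
    return h
```

Because XOR is its own inverse, placing and then removing a stone restores the original hash, which is what makes incremental pattern lookups fast.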
26. Background: Stern et al.'s work
- 20,000 matches as training data, 500 matches as test data: 26% prediction accuracy (2005).
- 181,000 matches as training data, 4,400 matches as test data: 34% prediction accuracy (2006).
27. Background: Maximum Entropy Method
- We used the Maximum Entropy Method (MEM) (Berger et al., 1996) for re-ranking.
- MEM is a machine learning method.
28. Background: Maximum Entropy Method
- [Diagram: labeled samples train the MEM system; given an input sample, the trained system outputs P(label, sample) for each label.]
29. Background: Maximum Entropy Method
- In this research, we could get P(positive, sample) and P(negative, sample) from MEM.
- In other words, MEM was used as a binary classifier.
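For binary labels with indicator features, a maximum-entropy model coincides with logistic regression, so the classifier can be sketched with simple per-sample gradient updates. This toy trainer is an illustration only, not the authors' implementation (the feature indices, learning rate, and epoch count are made up):

```python
import math

def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Toy binary maximum-entropy model over sets of binary feature indices,
    trained by per-sample gradient ascent on the log-likelihood."""
    n_feats = 1 + max(f for s in samples for f in s)
    w = [0.0] * n_feats
    for _ in range(epochs):
        for s, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-sum(w[f] for f in s)))
            for f in s:
                w[f] += lr * (y - p)  # gradient of the log-likelihood
    return w

def p_positive(w, s):
    """P(positive | sample) under the trained model."""
    return 1.0 / (1.0 + math.exp(-sum(w[f] for f in s)))

# Invented toy data: feature 0 marks positive samples, feature 1 negative ones.
samples = [{0}, {0}, {1}, {1}]
labels = [1, 1, 0, 0]
w = train_maxent(samples, labels)
```

After training, `p_positive` plays the role of P(positive, sample) in the re-ranking step.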
30. Background: Maximum Entropy Method
- MEM can manage multiple overlapping features.
- MEM is often used in Natural Language Processing
(NLP) research.
31. Background: Maximum Entropy Method
- MEM consumes a lot of memory, so we could not apply it to all legal moves directly. Therefore, we applied MEM for re-ranking instead.
32. Our Method
33. Our Method: Overview
- [Diagram: legal moves are ranked by the relative-frequency module, the top of the ranking is re-ranked by the MEM module, and a ranked list is output; both modules are trained on labeled samples.]
34. Our Method: Features
- Strong human players often select moves that are consistent with the previous moves.
- So, we added information about the previous moves as features for machine learning, to attain more accurate move prediction. (MEM can manage multiple overlapping features.)
35. Our Method: Features
- We added
- the relative coordinates to the 4 previous moves
- the coordinates of the current move
- as features for machine learning.
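A hypothetical encoding of these features as strings (the exact feature representation is not given in the slides; the names `coord` and `rel-i` are invented):

```python
def move_features(move, previous_moves):
    """Features for a candidate move: its own coordinates plus the relative
    coordinates (offsets) to each of the up-to-4 most recent moves."""
    x, y = move
    feats = [f"coord:{x},{y}"]
    for i, (px, py) in enumerate(previous_moves[-4:], start=1):
        feats.append(f"rel-{i}:{x - px},{y - py}")
    return feats

move_features((10, 10), [(3, 3), (16, 4), (9, 10), (10, 11)])
# → ['coord:10,10', 'rel-1:7,7', 'rel-2:-6,6', 'rel-3:1,0', 'rel-4:0,-1']
```

Each string is treated as one binary feature, which is exactly the kind of overlapping feature set MEM can handle.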
36. Our Method: Training Our System
- First, we constructed a pattern dictionary in the same way as Stern et al.
- Then, we learned the relative frequencies of the stone-position patterns using half of the training data.
37. Our Method: Training Our System
- [Diagram: the relative-frequency module is trained first, on labeled samples.]
38. Our Method: Training Our System
- We prepared two counters, "used" and "unused", for each pattern.
- If a pattern appears around the actual expert move, its "used" counter is incremented.
- If a pattern appears around another legal move, its "unused" counter is incremented.
39. Our Method: Training Our System
- [Figure: around the actual expert move, "used" is incremented; around the other legal moves, "unused" is incremented.]
40. Our Method: Training Our System
- We calculated the relative frequencies using Laplace's law (add-one smoothing):
- freq(p) = (used(p) + 1) / (used(p) + unused(p) + 2)
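Putting the two counters and Laplace's law together, a minimal sketch (assuming the add-one form of Laplace's law; the pattern names are invented):

```python
from collections import defaultdict

used = defaultdict(int)    # pattern appeared around the actual expert move
unused = defaultdict(int)  # pattern appeared around some other legal move

def observe(expert_pattern, other_legal_patterns):
    """Update both counters for one position of a training game."""
    used[expert_pattern] += 1
    for p in other_legal_patterns:
        unused[p] += 1

def rel_freq(p):
    """Relative frequency smoothed with Laplace's law, so that a pattern
    never observed gets 1/2 instead of an undefined 0/0."""
    return (used[p] + 1) / (used[p] + unused[p] + 2)

# Hypothetical patterns: "p1" chosen twice by the expert, "p2" passed over twice.
observe("p1", ["p2"])
observe("p1", ["p2"])
```

With these counts, `rel_freq("p1")` is (2+1)/(2+0+2) = 0.75 and `rel_freq("p2")` is (0+1)/(0+2+2) = 0.25.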
41. Our Method: Training Our System
- Then, we trained our MEM system using the rest of the training data.
- First, we ranked all legal moves by their relative frequencies.
- Then, we trained our MEM system on the top n (20-80) moves in that ranking.
42. Our Method: Training Our System
- [Diagram: the MEM module is trained on labeled samples produced using the relative-frequency module's ranking.]
43. Our Method: Training Our System
- The features for machine learning were:
- A pattern around the move
- Coordinates of the move
- Relative coordinates to the 4 previous moves
44. Our Method: Training Our System
- The actual expert moves were labeled as positive.
- The other legal moves were labeled as negative.
45. Our Method: Training Our System
- [Figure: the actual expert move is labeled positive; the other legal moves are labeled negative.]
46. Our Method: Predicting Moves
- First, we ranked all legal moves by their relative frequencies.
- Then, we re-ranked the top n (20-80) with MEM.
47. Our Method: Predicting Moves
- [Diagram: legal moves are ranked by the relative-frequency module, then the top of the ranking is re-ranked by the MEM module to produce the final ranked list.]
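The two-stage prediction can be sketched as follows; `rel_freq` and `mem_p_positive` stand in for the two trained modules, and the toy scores are invented for illustration:

```python
def predict(legal_moves, rel_freq, mem_p_positive, n=20):
    """Rank every legal move by relative frequency, then re-rank only the
    top n candidates by the MEM classifier's positive probability."""
    ranked = sorted(legal_moves, key=rel_freq, reverse=True)
    head, tail = ranked[:n], ranked[n:]
    head.sort(key=mem_p_positive, reverse=True)  # MEM touches only the head
    return head + tail

# Invented scores standing in for the two trained modules:
freq = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}.get
mem = {"a": 0.2, "b": 0.6, "c": 0.9, "d": 0.99}.get
predict(["a", "b", "c", "d"], freq, mem, n=3)  # → ['c', 'b', 'a', 'd']
```

Note that a move outside the top n (here "d") keeps its frequency rank even though MEM would favour it, which is why the re-ranking amount matters in the experiments below.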
48. Experiment
49. Experiment: Data
- We conducted our experiment using the game records in the GoGoD database (T. Mark Hall et al.). This database contains human expert games.
50. Experiment: Changing the re-ranking amount
- We used 2,000 matches as the training set and 500 matches as the development set. We varied the amount used for re-ranking and evaluated the accuracy.
51. Experiment: Changing the re-ranking amount
- [Diagram: the same pipeline as before; the amount of the ranked list passed to the MEM module for re-ranking was varied.]
52. Experiment: Changing the re-ranking amount
- The results are as follows. Column 0 is the result without re-ranking. "The cumulative density at rank x is y" means that y% of all expert moves are within the top x of the ranking produced by our system.
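The cumulative-density measure can be stated in a few lines (the ranks below are invented for illustration):

```python
def cumulative_density(expert_ranks, x):
    """Fraction of expert moves whose rank in the system's output is <= x."""
    return sum(1 for r in expert_ranks if r <= x) / len(expert_ranks)

# Invented ranks our system might assign to five expert moves:
ranks = [1, 3, 7, 25, 2]
cumulative_density(ranks, 5)  # → 0.6 (3 of the 5 expert moves are in the top 5)
```

The value at rank 1 is plain prediction accuracy; larger x values show how useful the ranking is for forward pruning.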
53. Experiment: Changing the re-ranking amount
- At ranks 1, 5, and 10, the results with re-ranking were better than those without re-ranking.
54. Experiment: Changing the re-ranking amount
- However, at the other ranks, re-ranking did not yield better results than no re-ranking.
55. Experiment: Changing the re-ranking amount
- Increasing the amount used for re-ranking had a bad influence at rank 1,
56. Experiment: Changing the re-ranking amount
- but a good influence at ranks 10, 20, 40, and 60.
57. Experiment: Comparison with Stern et al.'s work
- We used 20,000 matches as the training set and 500 matches as the test set, with the re-ranking amount set to 20, and compared our system with Stern et al.'s work (2005, 2006).
58. Experiment: Comparison with Stern et al.'s work
- Compared with Stern et al.'s work (2005), our system yielded better cumulative density at ranks 1, 5, and 10.
59. Experiment: Comparison with Stern et al.'s work
- However, our system was outperformed at rank 20.
60. Experiment: Comparison with Stern et al.'s work
- We could not obtain better results than Stern et al.'s work (2006) with the current size of our training data.
61. Experiment: Match against GnuGo
- We had our system (move prediction only) play against GnuGo 3.6.
- Many of our system's moves were not bad, but some moves caused it to lose.
62. Discussion and Conclusion
63. Discussion and Conclusion
- We attained high accuracy with a relatively small amount of training data.
- Our system lost to GnuGo, but many of its moves were not bad.
- We may obtain better results than Stern et al.'s work (2006) if we use more training data.
64. Discussion and Conclusion
- There may be problems with using the previous moves as features for machine learning.
- In the training data, there are few situations in which the previous moves are bad moves.
- Situations in which the previous moves are bad moves can harm the move prediction.
65.
- Any questions?
- Please speak slowly and clearly.
66.
- In our paper, we said that we added patterns around the previous moves as features for machine learning.
- But they are meaningless, because the patterns around the previous moves are the same for all the legal moves.
- These patterns don't contribute to scoring.