Move Prediction in Go with the Maximum Entropy Method - PowerPoint PPT Presentation
1
Move Prediction in Go with the Maximum Entropy
Method
  • Nobuo Araki
  • The University of Tokyo

2
Contents
  • About Go
  • What is move prediction?
  • Background
  • Our Method
  • Experiment
  • Discussion and Conclusion

3
About Go
4
About Go
  • Go is a board game played by two players (black
    and white).
  • The two players place stones alternately
    (placing a stone is called a "move").

5
About Go
  • This figure shows a Go match in progress.
  • A typical match ends after about 250 moves.

6
About Go
  • The purpose of Go is to surround as large an area
    as possible.

Black 68 - White 64.5: Black won.
7
About Go
  • Unlike chess programs, Go programs are still very
    weak; they are probably weaker than mid-level
    amateur players.
  • So, making strong Go programs is a hot topic in
    AI research.

8
What is move prediction?
9
What is move prediction?
  • In Go, there are many legal moves.
  • In this position, all the empty points are legal
    places to put a stone.

10
What is move prediction?
  • Unlike chess, there are too many legal moves to
    apply brute-force game tree search.
  • chess: 8x8 board
  • Go: usually a 19x19 board, so the number of legal
    moves is very large

11
What is move prediction?
  • But, in Go, strong human players can select a
    small number of candidate moves instantly
    (without thinking deeply).

12
What is move prediction?
  • Move prediction is performing such a selection
    with a computer program.

13
What is move prediction?
  • In Go programs, forward pruning by move
    prediction is necessary.
  • In this research, we tried to attain more
    accurate move prediction than existing approaches.

14
Background
15
  • First, I will explain Stern et al.'s research,
    which is the basis of this research.
  • Next, I will explain the Maximum Entropy Method,
    which we used in this research.

16
Background: Stern et al.'s work
  • Stern et al.'s research [2005, 2006] is the basis
    of our research.
  • They tried to predict moves by simple pattern
    matching of stone positions.

17
Background: Stern et al.'s work
  • In Go, patterns of stone positions are very
    important in tactics.
  • In Stern et al. [2006], they also used tactical
    features other than stone positions in pattern
    matching, but we didn't use them.

18
Background: Stern et al.'s work
  • Preparing pattern templates
  • Constructing pattern dictionary
  • Scoring patterns
  • Predicting moves

19
Background: Stern et al.'s work
  • First, pattern templates were prepared.
  • These templates defined the shape and range of
    patterns of stone position.

20
Background: Stern et al.'s work
  • Various sizes of templates were prepared because
    using only large templates causes sparseness
    problems.

21
Background: Stern et al.'s work
  • Next, they constructed a pattern dictionary by
    extracting patterns around the actual expert
    moves.

22
Background: Stern et al.'s work
  • With all the pattern templates, patterns around
    actual moves of expert games were extracted.

23
Background: Stern et al.'s work
  • Then, they scored the patterns in the pattern
    dictionary:
  • patterns around the actual expert move were
    positive examples,
  • patterns around the other legal moves were
    negative examples.

[Figure: board positions labeled "positive example" and "negative example"]
24
Background: Stern et al.'s work
  • Using the scores of the patterns, they predicted
    moves.
  • "Positive example" patterns are more likely to
    be selected than "negative example" patterns.

25
Background: Stern et al.'s work
  • When storing and comparing patterns, they used
    Zobrist hashing [1990].
  • With Zobrist hashing, they could store and compare
    patterns quickly with limited storage space.

26
Background: Stern et al.'s work
  • 20,000 matches for training data, 500 matches for
    test data: 26% accuracy in prediction (2005)
  • 181,000 matches for training data, 4,400 matches
    for test data: 34% accuracy in prediction (2006)

27
Background: Maximum Entropy Method
  • We used the Maximum Entropy Method (MEM)
    [Adam et al., 1996] for re-ranking.
  • MEM is a machine learning method.

28
Background: Maximum Entropy Method
[Diagram: labeled samples -> training -> MEM system;
a new sample -> MEM system -> output:
P(label, sample) for each label]
29
Background: Maximum Entropy Method
  • In this research, we could get P(positive, sample)
    and P(negative, sample) by MEM.
  • In other words, MEM was used as a binary
    classifier.
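A maximum entropy classifier is a log-linear model: each (label, feature) pair has a weight, and the label probabilities come from exponentiating and normalizing the weighted feature sums. The sketch below uses hypothetical feature names and hand-set weights; a real MEM system learns the weights from the labeled samples:

```python
import math

# Hypothetical binary features for a candidate move.
def features(sample):
    return {"f_pattern_" + sample["pattern"]: 1.0,
            "f_near_prev": 1.0 if sample["near_prev"] else 0.0}

# Hand-set (label, feature) weights; training would estimate these.
weights = {("positive", "f_pattern_A"): 1.2,
           ("negative", "f_pattern_A"): -0.3,
           ("positive", "f_near_prev"): 0.8,
           ("negative", "f_near_prev"): -0.8}

def p_label(sample):
    """Return {label: P(label | sample)} under a log-linear model."""
    scores = {}
    for label in ("positive", "negative"):
        s = sum(weights.get((label, f), 0.0) * v
                for f, v in features(sample).items())
        scores[label] = math.exp(s)
    z = sum(scores.values())  # normalization constant
    return {label: s / z for label, s in scores.items()}

probs = p_label({"pattern": "A", "near_prev": True})
assert abs(sum(probs.values()) - 1.0) < 1e-9
```

Because the features are just indicator functions that may fire together, this form handles the multiple overlapping features mentioned on the next slide without any independence assumption.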

30
Background: Maximum Entropy Method
  • MEM can manage multiple overlapping features.
  • MEM is often used in Natural Language Processing
    (NLP) research.

31
Background: Maximum Entropy Method
  • MEM consumes a lot of memory, so we couldn't apply
    MEM directly. Therefore, we applied MEM for
    re-ranking instead.

32
Our Method
33
Our Method: Overview
[Diagram: legal moves -> relative frequency module ->
ranking -> MEM module -> re-ranking -> ranked list;
both modules are trained on labeled samples]
34
Our Method: Features
  • Strong human players often select moves that are
    consistent with the previous moves.
  • So, we added information about the previous moves
    as features for machine learning to attain more
    accurate move prediction. (MEM can manage
    multiple overlapping features.)

35
Our Method: Features
  • We added
  • relative coordinates to the 4 previous moves
  • coordinates of the current move
  • as features for machine learning.

36
Our Method: Training Our System
  • First, we constructed a pattern dictionary in the
    same way as Stern et al.
  • Then, we learned the relative frequencies of
    patterns of stone positions by using half of the
    training data.

37
Our Method: Training Our System
[Diagram: labeled samples are used to train the
relative frequency module and the MEM module]
38
Our Method: Training Our System
  • We prepared two counters, "used" and "unused",
    for each pattern.
  • If a pattern appears around the actual expert
    move, its "used" counter is incremented.
  • If a pattern appears around another legal move,
    its "unused" counter is incremented.

39
Our Method: Training Our System
  • around the actual expert move, "used" is
    incremented
  • around the other legal moves, "unused" is
    incremented

[Figure: board positions labeled "used" and "unused"]
40
Our Method: Training Our System
  • We calculated the relative frequencies using
    Laplace's law:
  • relative frequency = ("used" + 1) / ("used" + "unused" + 2)
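Applied to the two counters, Laplace's law (add-one smoothing) gives the estimate below. This is a sketch, since the slide's own formula is not reproduced in this transcript:

```python
def relative_frequency(used, unused):
    """Laplace's law (add-one smoothing): the estimate never
    reaches 0 or 1, even for patterns seen only a few times."""
    return (used + 1) / (used + unused + 2)

# A never-seen pattern gets the uninformative prior 0.5.
assert relative_frequency(0, 0) == 0.5
# A frequently used pattern approaches, but never reaches, 1.
assert relative_frequency(98, 0) == 0.99
```

The smoothing matters because many patterns in the dictionary occur only a handful of times; raw frequencies would assign them scores of exactly 0 or 1.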

41
Our Method: Training Our System
  • Then, we trained our MEM system using the rest
    of the training data.
  • First, we ranked all legal moves by the relative
    frequencies.
  • Then, using the top n (20-80) moves in the ranking,
    we trained our MEM system.

42
Our Method: Training Our System
[Diagram: labeled samples are used to train the
relative frequency module and the MEM module]
43
Our Method: Training Our System
  • Features for machine learning were
  • A pattern around the move
  • Coordinates of the move
  • Relative coordinates to the 4 previous moves
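The three feature groups above could be encoded as binary indicator features, as is usual for MEM. The encoding below is a hypothetical sketch; the feature names and the exact encoding in the paper may differ:

```python
def move_features(move, prev_moves, pattern_hash):
    """Sketch of the three feature groups: the pattern around
    the move (via its Zobrist hash), the move's coordinates,
    and relative coordinates to up to 4 previous moves."""
    x, y = move
    feats = {"pattern_%016x" % pattern_hash: 1.0,  # pattern around the move
             "coord_%d_%d" % (x, y): 1.0}          # absolute coordinates
    # Relative coordinates to (up to) the 4 previous moves.
    for i, (px, py) in enumerate(prev_moves[-4:]):
        feats["rel%d_%d_%d" % (i, x - px, y - py)] = 1.0
    return feats

feats = move_features((3, 3), [(2, 2), (16, 3)], 0xDEADBEEF)
assert "coord_3_3" in feats and "rel1_-13_0" in feats
```

Each feature fires for many different moves, so they overlap freely, which is exactly the situation MEM is designed to handle.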

44
Our Method: Training Our System
  • The actual expert moves were labeled as
    "positive".
  • The other legal moves were labeled as "negative".

45
Our Method: Training Our System
  • the actual expert move is labeled as "positive"
  • the other legal moves are labeled as "negative"

[Figure: board positions labeled "positive" and "negative"]
46
Our Method: Predicting Moves
  • First, we ranked all legal moves by the relative
    frequencies.
  • Then we re-ranked the top n (20-80) with MEM.
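The two-stage prediction can be sketched as follows, assuming the relative-frequency and MEM scoring functions are given; the toy scores are illustrative:

```python
def predict(legal_moves, rel_freq, mem_p_positive, n=20):
    """Rank all legal moves by relative frequency, then re-rank
    only the top n by the MEM's P(positive | move)."""
    ranked = sorted(legal_moves, key=rel_freq, reverse=True)
    head, tail = ranked[:n], ranked[n:]
    head = sorted(head, key=mem_p_positive, reverse=True)
    return head + tail

# Toy scores: move "b" is most frequent, but the MEM prefers "a".
rel = {"a": 0.3, "b": 0.9, "c": 0.1}.get
mem = {"a": 0.8, "b": 0.4, "c": 0.1}.get
assert predict(["a", "b", "c"], rel, mem, n=2) == ["a", "b", "c"]
```

Restricting the expensive MEM scoring to the top n candidates is what keeps the memory-hungry model usable at all.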

47
Our Method: Predicting Moves
[Diagram: legal moves -> relative frequency module ->
ranking -> MEM module -> re-ranking -> ranked list]
48
Experiment
49
Experiment: Data
  • We conducted our experiment using the records in
    the GoGoD database [T. Mark et al.]. This database
    contains human expert games.

50
Experiment: Changing the re-ranking amount
  • We used 2,000 matches as the training set and 500
    matches as the development set. We varied the amount
    used for re-ranking and evaluated the accuracy.

51
Experiment: Changing the re-ranking amount
[Diagram: the overview pipeline (legal moves -> ranking ->
re-ranking -> ranked list), annotated "changed the amount
used for re-ranking" at the re-ranking step]
52
Experiment: Changing the re-ranking amount
  • The results are as follows. Column 0 is the
    result without re-ranking. "The cumulative density
    at rank x is y" means that y% of all expert moves
    are within the top x of our system's ranking.

53
Experiment: Changing the re-ranking amount
  • At ranks 1, 5, and 10, the results with re-ranking
    were better than those without re-ranking.

54
Experiment: Changing the re-ranking amount
  • However, at the other ranks, the results with
    re-ranking were not better than those without
    re-ranking.


55
Experiment: Changing the re-ranking amount
  • Increasing the amount used for re-ranking had a
    negative effect at rank 1,

56
Experiment: Changing the re-ranking amount
  • but a positive effect at ranks 10, 20, 40, and 60.

57
Experiment: Comparison with Stern et al.'s work
  • We used 20,000 matches as the training set and 500
    matches as the test set, and set the re-ranking
    amount to 20. We compared our system with Stern
    et al.'s work [2005, 2006].

58
Experiment: Comparison with Stern et al.'s work
  • Compared to Stern et al.'s work [2005], our system
    yielded better cumulative density at ranks 1, 5,
    and 10.

59
Experiment: Comparison with Stern et al.'s work
  • However, our system was outperformed at rank 20.

60
Experiment: Comparison with Stern et al.'s work
  • We could not obtain better results than Stern et
    al.'s work [2006] with the current size of our
    training data.

61
Experiment: Match with GnuGo
  • We had our system (move prediction only) play
    against GnuGo 3.6.
  • Many of our system's moves were not so bad, but
    some moves caused it to lose.

62
Discussion and Conclusion
63
Discussion and Conclusion
  • We attained high accuracy with a relatively
    small amount of training data.
  • Our system lost to GnuGo, but many of its moves
    were not so bad.
  • We may obtain better results than Stern et al.'s
    work [2006] if we use more training data.

64
Discussion and Conclusion
  • There may be problems with using the previous
    moves as features for machine learning.
  • In the training data, there are few situations in
    which the previous moves are bad moves.
  • Situations in which the previous moves are bad
    can therefore degrade the move prediction.

65
  • Any Questions?
  • Please speak slowly and clearly.

66
  • In our paper, we stated that we added patterns
    around the previous moves as features for machine
    learning.
  • But these are meaningless because, for all the
    legal moves, the patterns around the previous
    moves are the same.
  • These patterns don't contribute to scoring.