Solving Probabilistic Combinatorial Games
1
Solving Probabilistic Combinatorial Games
  • Ling Zhao, Martin Müller
  • University of Alberta
  • September 7, 2005

Paper link: http://www.cs.ualberta.ca/~zhao/PCG.pdf
2
Motivations
  • Ken Chen's previous work on maximizing the winning
    chance in Go.
  • Maximizing points vs. maximizing winning probability.
  • How to solve the abstract game efficiently, or
    play the abstract game well?

3
Motivations
Results (Black plays first): 15 (80%), −7 (20%) (example position on the original slide)
4
Combinatorial Games
5
Probabilistic Combinatorial Games
  • A terminal node expressed as a probability
    distribution (p1, v1, p2, v2, …, pn, vn) is a PCG.
  • If A1, A2, …, An, B1, B2, …, Bn are PCGs, then
    {A1, A2, …, An | B1, B2, …, Bn} is a PCG.
  • A sum of PCGs is a PCG.

In {A1, A2, …, An | B1, B2, …, Bn}, the Ai are the Left
options and the Bi are the Right options. A move in a sum
game consists of a move to an option in exactly one subgame
and leaves all other subgames unchanged. (This recursive
structure is sketched in code below.)
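
The recursive definition above can be written down in a few lines of code. The sketch below is illustrative only (the class and field names are not from the paper): a PCG is either a terminal probability distribution over values, or a position whose Left and Right options are themselves PCGs, and a sum game is a list of independent subgames.

from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class Terminal:
    # (p1, v1), (p2, v2), ..., (pn, vn) with probabilities summing to 1
    distribution: List[Tuple[float, float]]

@dataclass
class Position:
    left_options: List["PCG"]     # moves available to Left
    right_options: List["PCG"]    # moves available to Right

# A PCG is either a terminal distribution or a position with options.
PCG = Union[Terminal, Position]

# A sum of PCGs: a move replaces exactly one subgame by one of its
# options and leaves the other subgames unchanged.
SumGame = List[PCG]
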
6
Simple PCG (SPCG)
  • Each PCG has exactly one Left option and one
    Right option.
  • Each option leads immediately to a terminal node.
  • Each distribution has exactly two values, with
    associated probabilities.

7
Problems to Address
  • How to solve PCGs efficiently?
  • How to play PCGs well if resources are limited or
    fast play is required?

8
Game Tree Analysis
  • Very regular game tree: a node at depth k has
    exactly n − k children, so there are n!/(n−k)! nodes
    in total at depth k.
  • Very large number of transpositions: only
    C(n, k) · C(k, k/2) distinct nodes at depth k
    (both counts are checked numerically below).

9
Terminal Node Evaluation
  • Terminal node is a sum of probability
    distributions.
  • Winning probability: the probability that the total
    value of the sum is high enough to win (see the sketch
    below, which assumes a positive total wins).
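
The sketch below shows one way to compute this winning probability exactly (under the stated assumption that a position is won when the total value is positive), by convolving the subgame distributions one at a time. Names are illustrative, not the paper's code.

from collections import defaultdict
from typing import Dict, List, Tuple

Distribution = List[Tuple[float, float]]   # (probability, value) pairs

def win_probability(distributions: List[Distribution]) -> float:
    # Distribution of the total value, built by convolving subgames in turn.
    totals: Dict[float, float] = {0.0: 1.0}
    for dist in distributions:
        nxt: Dict[float, float] = defaultdict(float)
        for total, p_total in totals.items():
            for p, v in dist:
                nxt[total + v] += p_total * p
        totals = dict(nxt)
    # Assumption: the player wins when the total value is positive.
    return sum(p for total, p in totals.items() if total > 0)

# Example: two subgames, each worth +15 with prob. 0.8 or -7 with prob. 0.2.
print(win_probability([[(0.8, 15), (0.2, -7)], [(0.8, 15), (0.2, -7)]]))
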

10
Monte-Carlo Terminal Evaluation (MCTE)
  • Use Monte-Carlo methods to randomly collect k
    samples from the 2n data points in the sum of n
    distributions (sketched below).
  • Use the average winning percentage of the samples
    (P̂w) to approximate the overall winning
    probability (Pw).
  • Theory from statistics: P̂w − Pw is approximately
    normally distributed with mean 0 and standard
    deviation below 1/(2√k).
  • Experimental results
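
A minimal sketch of this sampling scheme, under one plausible reading of the slide: each sample draws one of the two data points from every distribution and records a win when the total is positive; 1/(2*sqrt(k)) is the standard-error bound mentioned above. All names are illustrative.

import random
from typing import List, Tuple

Distribution = List[Tuple[float, float]]   # (probability, value) pairs

def mc_terminal_evaluation(distributions: List[Distribution], k: int) -> float:
    wins = 0
    for _ in range(k):
        total = 0.0
        for dist in distributions:
            probs, values = zip(*dist)
            total += random.choices(values, weights=probs)[0]
        if total > 0:                      # assumed winning condition
            wins += 1
    return wins / k                        # estimate of Pw

dists = [[(0.8, 15), (0.2, -7)]] * 4
k = 1000
print(mc_terminal_evaluation(dists, k), "+/-", 1 / (2 * k ** 0.5))
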

11
Monte-Carlo Interior Evaluation (MCIE)
  • Evaluation of a node is approximated by averaging
    the values of terminal nodes reached from it through
    random play (see the sketch below).
  • Proposed by Abramson in 1990, using 4x4 tic-tac-toe
    and 6x6 Othello for experiments.
  • Applied to Monte-Carlo Go by Bouzy and
    Helmstetter and several other researchers.
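
The sketch below applies that idea to an SPCG sum game: from the current position, finish the game with uniformly random moves, then score the resulting terminal position by sampling it once. The representation (each unplayed subgame as a pair of Left/Right terminal distributions) and the positive-total winning rule are illustrative assumptions, not the paper's code.

import random
from typing import List, Tuple

Distribution = List[Tuple[float, float]]       # (probability, value) pairs
Subgame = Tuple[Distribution, Distribution]    # (Left option, Right option)

def sample_total(terminals: List[Distribution]) -> float:
    # Draw one value from every terminal distribution and add them up.
    return sum(random.choices([v for _, v in d], weights=[p for p, _ in d])[0]
               for d in terminals)

def mc_interior_evaluation(unplayed: List[Subgame], reached: List[Distribution],
                           left_to_move: bool, playouts: int) -> float:
    # Estimate Left's winning probability by playing randomly to the end.
    wins = 0
    for _ in range(playouts):
        pending, done, turn = list(unplayed), list(reached), left_to_move
        while pending:
            g = pending.pop(random.randrange(len(pending)))
            done.append(g[0] if turn else g[1])   # Left plays its option, etc.
            turn = not turn
        if sample_total(done) > 0:                # assumed winning condition
            wins += 1
    return wins / playouts

game = [([(0.8, 15), (0.2, -7)], [(1.0, -5)])] * 3
print(mc_interior_evaluation(game, [], left_to_move=True, playouts=1000))
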

12
SPCG Solver and Player
  • Solver: alpha-beta search, transposition tables,
    move ordering (MCIE, MCTE).
  • Player: alpha-beta search to a fixed depth, using
    Monte-Carlo interior evaluation for frontier
    nodes (sketched below).
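
A self-contained sketch of such a player, using the same illustrative SPCG representation and winning rule as the earlier sketches: depth-limited alpha-beta over the sum game, with a random-playout (Monte-Carlo interior) evaluation at frontier nodes. None of the names are the paper's, and the search omits transposition tables and move ordering.

import random
from typing import List, Tuple

Distribution = List[Tuple[float, float]]        # (probability, value) pairs
Subgame = Tuple[Distribution, Distribution]     # (Left option, Right option)
State = Tuple[List[Subgame], List[Distribution]]  # (unplayed, reached terminals)

def sample_win(unplayed: List[Subgame], reached: List[Distribution],
               left_to_move: bool) -> bool:
    # One random playout to the end, then one sample from every terminal.
    pending, done, turn = list(unplayed), list(reached), left_to_move
    while pending:
        g = pending.pop(random.randrange(len(pending)))
        done.append(g[0] if turn else g[1])
        turn = not turn
    total = sum(random.choices([v for _, v in d], [p for p, _ in d])[0]
                for d in done)
    return total > 0                       # assumed: positive total is a Left win

def mc_evaluate(state: State, left_to_move: bool, playouts: int = 200) -> float:
    unplayed, reached = state
    return sum(sample_win(unplayed, reached, left_to_move)
               for _ in range(playouts)) / playouts

def alpha_beta(state: State, depth: int, alpha: float, beta: float,
               left_to_move: bool) -> float:
    unplayed, reached = state
    if depth == 0 or not unplayed:
        return mc_evaluate(state, left_to_move)   # Monte-Carlo frontier value
    for i, (left_opt, right_opt) in enumerate(unplayed):
        child = (unplayed[:i] + unplayed[i + 1:],
                 reached + [left_opt if left_to_move else right_opt])
        value = alpha_beta(child, depth - 1, alpha, beta, not left_to_move)
        if left_to_move:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:
            break                                  # cutoff
    return alpha if left_to_move else beta

game = [([(0.8, 15), (0.2, -7)], [(1.0, -5)])] * 4
print(alpha_beta((game, []), depth=2, alpha=0.0, beta=1.0, left_to_move=True))
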

13
Experimental Results
  • 100 randomly generated games, each with 14 subgames.
  • Value distributions: probabilities from 0 to 1,
    values from −1000 to 1000.
  • AMD 2400 MHz CPUs.
  • 2^20 cache entries (terminal nodes have higher
    priority).
  • About 8 seconds to solve a game.

14
Solver Performance
  • Monte-Carlo move ordering: dm = depth limit within
    which Monte-Carlo move ordering is used; beyond it,
    the history heuristic is used.
  • Monte-Carlo interior evaluation: nt = percentage
    of the current node's descendant terminal nodes
    that are sampled.
  • Monte-Carlo terminal evaluation: nc = number of
    data points sampled.
  • Accurate terminal evaluation takes about 90% of the
    overall running time.

15
Solver Performance Results
16
Monte-Carlo Player
  • Test against the perfect player.
  • Each game has two rounds; each side plays first
    once.
  • Winning probability is the average of the two
    rounds.
  • Parameters: search depth and nc.

17
Monte-Carlo Player Results
18
Error in Move Ordering
  • Example of a misordered move pair:
        Move              A        B
        Actual win prob.  0.18  <  0.19
        Estimate          0.32  >  0.30
        Win prob. lost    0.01
  • Average probability error: the winning probability
    lost, averaged over all move pairs of a node (see
    the sketch below).
  • Worst probability error: the probability lost when
    the wrong best move is chosen.
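
One plausible reading of these two measures, in code (illustrative, not the paper's definition): for each pair of moves that the estimate orders differently from the actual values, the loss is the actual win-probability difference; the average error averages this loss over all pairs, and the worst error is the probability given up by playing the move the estimate ranks best.

from itertools import combinations

def ordering_errors(actual, estimate):
    # Loss for a pair: actual win-probability difference when the estimate
    # orders the pair differently from the actual values, else 0.
    losses = []
    for i, j in combinations(range(len(actual)), 2):
        if (actual[i] - actual[j]) * (estimate[i] - estimate[j]) < 0:
            losses.append(abs(actual[i] - actual[j]))
        else:
            losses.append(0.0)
    average_error = sum(losses) / len(losses) if losses else 0.0
    # Worst error: probability given up by playing the estimate's best move.
    best_by_estimate = max(range(len(actual)), key=lambda i: estimate[i])
    worst_error = max(actual) - actual[best_by_estimate]
    return average_error, worst_error

# The slide's example pair: actual 0.18 < 0.19, but estimate 0.32 > 0.30.
print(ordering_errors([0.18, 0.19], [0.32, 0.30]))   # roughly (0.01, 0.01)
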

19
Results
20
Conclusions
  • Efficient exact and heuristic solvers for SPCGs.
  • Successfully incorporated Monte-Carlo move
    ordering into alpha-beta search for SPCGs.
  • A heuristic evaluation technique based on
    Monte-Carlo sampling, with performance close to the
    perfect player.
  • Extensive experiments for the two solvers.

21
Future Work
  • Better algorithms to accurately evaluate terminal
    nodes?
  • Progressive pruning.
  • Why does Abramson's simple Expected Outcome
    model perform so well in move ordering?

22
New Directions to Apply Monte-Carlo to Computer Go
(diagram contrasting Monte-Carlo Go with the proposed new direction)
23
Future Work: Transformation
(diagram: transformation pipeline ending in the PCG solver or player)