Solving Probabilistic Combinatorial Games - PowerPoint PPT Presentation

About This Presentation

Title:

Solving Probabilistic Combinatorial Games

Description:

Use Monte-Carlo methods to randomly collect k samples from 2n data points in the ... Applied to Monte-Carlo Go by Bouzy and Helmstetter and several other researchers. ... – PowerPoint PPT presentation

Number of Views:69

Avg rating:3.0/5.0

Slides: 24

Provided by: hah1

Category:

more less

Transcript and Presenter's Notes

Title: Solving Probabilistic Combinatorial Games

1
Solving Probabilistic Combinatorial Games

Ling Zhao Martin Mueller
University of Alberta
September 7, 2005

Paper link http//www.cs.ualberta.ca/zhao/PCG.pd
f
2
Motivations

Ken Chens previous work to maximize winning
chance in Go.
Maximize points Vs. maximize winning probability
How to solve the abstract game efficiently or
play the abstract game well?

3
Motivations
Results (black plays first) 15 (80), -7 (20)
4
Combinatorial Games
5
Probabilistic Combinatorial Games

A Terminal node which is expressed as a
probability distribution (dp1, v1, p2, v2,
, pn, vn ) is a PCG.
If A1, A2, An, B1, B2, , Bn are PCGs, then
A1, A2, An B1, B2, , Bn is a PCG.
A sum of PCGs is a PCG.

Left options
Right options
A move in a sum game consists of a move to an
option in exactly one subgame and leaves all
other subgames unchanged.
6
Simple PCG (SPCG)

Each PCG has exactly one left option and one
right option.
Each option leads immediately to a terminal node.
Each distribution has exactly 2 values with
associate probabilities.

7
Problems to Address

How to solve PCGs efficiently?
How to play PCGs well if resources are limited or
fast play is required?

8
Game Tree Analysis

Very regular game tree a node at depth k has
exactly n-k children, so n!/(n-k)! nodes in total
at depth k.
Very large number of transpositions
C(n, k) C(k, k/2) distinct nodes at depth k.

9
Terminal Node Evaluation

Terminal node is a sum of probability
distributions.
Winning probability

10
Monte-Carlo Terminal Evaluation (MCTE)

Use Monte-Carlo methods to randomly collect k
samples from 2n data points in the sum of n
distributions.
Use the average winning percentage of samples
(Pw) to approximate the overall winning
probability (Pw).
Theory from statistics Pw - Pw is a normal
distribution, with mean0 and std dev
lt
Experimental results

11
Monte-Carlo Interior Evaluation (MCIE)

Evaluation of a node is
approximated by averaging the
values of terminal nodes reached
from it through random play.
Proposed by Abramson in 1990.
- Using 4x4 tic-tac-toe and 6x6 Othello for
experiments.
Applied to Monte-Carlo Go by Bouzy and
Helmstetter and several other researchers.

12
SPCG Solver and Player

Solver alpha-beta search, transposition tables,
move ordering (MCIE MCTE).
Player alpha-beta search to a certain depth and
use Monte-Carlo interior evaluation for frontier
nodes.

13
Experimental Results

100 randomly generated games, and each game has
14 subgames.
Value distribution probability from 0 to 1,
value from 1000 to 1000.
AMD 2400MHz CPUs
220 cache entries (terminal nodes have higher
priority)
About 8 seconds to solve a game.

14
Solver Performance

Monte-Carlo move ordering dm depth limit for
Monte-Carlo move ordering being used, otherwise
history heuristic is used.
Monte-Carlo interior evaluation nt percentage
of all the current nodes descendant terminal
nodes sampled.
Monte-Carlo terminal evaluation nc number of
data points sampled.
Accurate terminal evaluation occupies 90 of the
overall running time.

15
Solver Performance Results
16
Monte-Carlo Player

Test against the perfect player.
Each game has two rounds each side plays first
once.
Winning probability is the average of the two
rounds.
Parameters search depth and nc.

17
Monte-Carlo Player Results
18
Error in Move Ordering

Average probability error
Move A B
Actual win prob 0.18 lt 0.19
Estimate 0.32 gt 0.30
Win prob lost 0.01
Average probability error is the average of the
winning probability lost in all move pairs of a
node.
Worst probability error the probability lost due
to the wrong best move chosen.

19
Results
20
Conclusions

Efficient exact and heuristic solvers for SPCG.
Successfully incorporate Monte-Carlo move
ordering into alpha-beta search to SPCG.
A heuristic evaluation techniques based on
Monte-Carlo with performance close to the prefect
player.
Extensive experiments for the two solvers.

21
Future Work

Better algorithms to accurately evaluate terminal
nodes?
Progressive pruning.
Why does the simple Abramsons Expected Outcome
model perform so well in move ordering?

22
New Directions to Apply Monte-Carlo to Computer Go
Monte-Carlo Go

New Direction
23
Future Work Transformation
?
?
PCG solver or player

Write a Comment

User Comments (0)