Nesterov's excessive gap technique and poker - PowerPoint PPT Presentation

1
Nesterov's excessive gap technique and poker
  • Andrew Gilpin
  • CMU Theory Lunch
  • Feb 28, 2007
  • Joint work with
  • Samid Hoda, Javier Peña, Troels Sørensen, Tuomas
    Sandholm

2
Outline
  • Two-person zero-sum sequential games
  • First-order methods for convex optimization
  • Nesterov's excessive gap technique (EGT)
  • EGT for sequential games
  • Heuristics for EGT
  • Application to Texas Hold'em poker

3
We want to solve the saddle-point problem
  min_{x∈Q1} max_{y∈Q2} xᵀAy
If Q1 and Q2 are simplices, this is the Nash
equilibrium problem for two-person zero-sum
matrix games.
If Q1 and Q2 are complexes, this is the Nash
equilibrium problem for two-person zero-sum
sequential games.
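For the simplex case, the equilibrium can be found with a standard LP. A minimal sketch (not from the talk; it uses `scipy.optimize.linprog`, assumes A holds the row player's payoffs, and minimizes the row player's worst case):

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game(A):
    """Solve min_{x in simplex} max_j (A^T x)_j as an LP.

    Variables are [x_1..x_m, v]; we minimize the game value v
    subject to (A^T x)_j <= v for every column j of A."""
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = 1.0                                    # minimize v
    A_ub = np.hstack([A.T, -np.ones((n, 1))])      # (A^T x)_j - v <= 0
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                              # sum_i x_i = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[:m], res.x[-1]

# Rock-paper-scissors: the unique equilibrium is uniform with value 0.
A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
x, v = matrix_game(A)
```

This is exactly the standard-LP route that the later slides argue does not scale to sequential games.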
4
What's a complex?
It's just like a simplex, but more complex.
Each player's complex encodes her set of
realization plans in the game. In particular,
player 1's complex is Q1 = { x ≥ 0 : Ex = e },
where E and e depend on the game.
5
(Figure: example sequential game tree with nodes labeled A through H; omitted in transcript)
6
Recall our problem
where Q1 and Q2 are complexes
Since Q1 and Q2 have a linear description, this
problem can be solved as an LP. However, current
LP solution methods do not scale
7
(Un)scalability of LP solvers
  • Rhode Island Hold'em poker [Shi & Littman '01]
  • LP has 91 million rows and columns
  • Applying the GameShrink automated abstraction
    algorithm yields an LP with only 1.2 million rows
    and columns, and 50 million non-zeros [G. &
    Sandholm '06a]
  • Solution requires 25 GB RAM and over a week of
    CPU time
  • Texas Hold'em poker
  • 10^18 nodes in the game tree
  • Lossy abstractions need to be performed
  • Limitations of current solver technology are the
    primary barrier to achieving expert-level
    strategies [G. & Sandholm '06b, '07a]
  • Instead of standard LP solvers, what about a
    first-order method?

8
Convex optimization
Suppose we want to solve
  min f(x)
where f is convex.
Note that this formulation captures ALL convex
optimization problems (the feasible set can be
modeled with an indicator function).
For general f, convergence requires O(1/ε²)
iterations (e.g., for subgradient methods). For
smooth, strongly convex f with a Lipschitz-continuous
gradient, it can be done in O(1/ε^(1/2)) iterations.
Analysis is based on a black-box oracle access model.
Can we do better by looking inside the box?
9
Strong convexity
  • A function d is strongly convex if there exists
    σ > 0 such that
      d(αx + (1−α)y) ≤ α d(x) + (1−α) d(y) − (σ/2) α(1−α) ‖x − y‖²
  • for all x, y and all α ∈ [0, 1]
  • σ is the strong convexity parameter of d
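As a quick numeric sanity check (not from the slides), the Euclidean function d(x) = ½‖x‖² satisfies this inequality with σ = 1, in fact with equality:

```python
import numpy as np

def d(x):
    """Euclidean prox function: strongly convex with parameter sigma = 1."""
    return 0.5 * np.dot(x, x)

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
sigma = 1.0
for a in np.linspace(0.0, 1.0, 11):
    lhs = d(a * x + (1 - a) * y)
    rhs = a * d(x) + (1 - a) * d(y) - 0.5 * sigma * a * (1 - a) * np.dot(x - y, x - y)
    assert lhs <= rhs + 1e-12      # holds (with equality) for the quadratic
```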

10
Recall our problem
  min_{x∈Q1} max_{y∈Q2} xᵀAy
where Q1 and Q2 are complexes.
Equivalently, max_{y∈Q2} f(y) = min_{x∈Q1} F(x),
where f(y) = min_{x∈Q1} xᵀAy and F(x) = max_{y∈Q2} xᵀAy
11
Unfortunately, F and f are non-smooth. Fortunately,
they have a special structure.
Let d1, d2 be smooth and strongly convex on
Q1, Q2. These are called prox-functions. Now let
µ1, µ2 > 0 and consider
  f_µ1(y) = min_{x∈Q1} { xᵀAy + µ1 d1(x) }
  F_µ2(x) = max_{y∈Q2} { xᵀAy − µ2 d2(y) }
These are well-defined smooth functions.
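With the entropy prox-function d1(x) = ln m + Σ_i x_i ln x_i on the simplex, the smoothed minimum has a closed form via log-sum-exp. A small numeric check (illustrative, not the talk's code) that the smoothed value approaches the non-smooth one from above as µ shrinks:

```python
import numpy as np

def f_smooth(A, y, mu):
    """f_mu(y) = min_{x in simplex} { (Ay)^T x + mu * d1(x) } for the
    entropy prox d1(x) = ln(m) + sum_i x_i ln x_i; the minimum has the
    closed form -mu * ln( (1/m) * sum_i exp(-(Ay)_i / mu) )."""
    c = A @ y
    return -mu * np.log(np.mean(np.exp(-c / mu)))

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
y = np.array([0.2, 0.3, 0.5])
exact = (A @ y).min()              # the non-smooth f(y) = min_i (Ay)_i
approx = [f_smooth(A, y, mu) for mu in (1.0, 0.1, 0.01)]
# approx decreases toward `exact` from above as mu -> 0
```

Since d1 ≥ 0 on the simplex, f_µ(y) ≥ f(y) for every µ > 0, which is what drives the excessive gap condition on the next slide.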
12
Excessive gap condition
From weak duality, we have that f(y) ≤ F(x). The
excessive gap condition requires that
  f_µ1(y) ≥ F_µ2(x)    (EGC)
The algorithm maintains (EGC), and gradually
decreases µ1 and µ2. As they decrease, the smoothed
functions approach the non-smooth functions, and
thus iterates satisfying (EGC) converge to
optimal solutions.
13
Nesterov's main theorem
  • Theorem [Nesterov '05]
  • There exists an algorithm such that after at most
    N iterations, the iterates have duality gap at
    most O(‖A‖/N), with the constant depending on the
    prox functions
  • Furthermore, each iteration only requires solving
    three problems of the form
      argmax_{x∈Q} { gᵀx − d(x) }
  • and performing three matrix-vector product
    operations on A.

14
Nice prox functions
  • A prox function d for Q is nice if:
  • It is strongly convex, continuous everywhere in Q,
    and differentiable in the relative interior of Q
  • The min of d over Q is 0
  • The following map (the "smoothed argmax") is
    easily computable:
      sargmax_d(g) := argmax_{x∈Q} { gᵀx − d(x) }

15
Nice simplex prox function 1: Entropy
  d(x) = ln n + Σ_i x_i ln x_i
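For the entropy prox-function, the sargmax map has a well-known closed form: the maximizer of gᵀx − d(x) over the simplex is a softmax of g. A short sketch (a standard result, not the slide's code):

```python
import numpy as np

def sargmax_entropy(g):
    """argmax_{x in simplex} { g^T x - d(x) } for the entropy prox
    d(x) = ln(n) + sum_i x_i ln x_i. By the KKT conditions the
    optimum satisfies x_i proportional to exp(g_i): a softmax."""
    z = np.exp(g - g.max())        # subtract the max for numerical stability
    return z / z.sum()

x = sargmax_entropy(np.array([1.0, 2.0, 3.0]))
```

This map costs O(n), which is why the entropy prox is attractive inside EGT.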
16
Nice simplex prox function 2: Euclidean
sargmax can be computed in O(n log n) time
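Assuming the Euclidean prox d(x) = ½‖x‖² (the slide's exact prox may differ by a shift), the sargmax is a Euclidean projection of g onto the simplex, computable with one sort. A sketch of the classic O(n log n) scheme:

```python
import numpy as np

def sargmax_euclidean(g):
    """argmax_{x in simplex} { g^T x - 0.5*||x||^2 }, i.e. the Euclidean
    projection of g onto the simplex, via the standard O(n log n)
    sort-and-threshold scheme."""
    n = len(g)
    u = np.sort(g)[::-1]                               # descending sort
    css = np.cumsum(u)
    k = np.arange(1, n + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]   # largest valid index
    tau = (css[rho] - 1.0) / (rho + 1)                 # shift so result sums to 1
    return np.maximum(g - tau, 0.0)
```

The sort dominates the cost, matching the O(n log n) bound quoted on the slide.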
17
From the simplex to the complex
  • Theorem [Hoda, G., Peña '06]
  • A nice prox function can be constructed for
  • the complex via a recursive application of
  • any nice prox function for the simplex

18
Prox function example
Let d be any nice simplex prox function. The slide
constructs the prox function for an example complex
by recursive application of d (equation omitted in
transcript).
19
Solving
20
21
Heuristics [G., Hoda, Peña, Sandholm '07]
  • Heuristic 1: Aggressive µ reduction
  • The µ given in the previous algorithm is a
    conservative choice guaranteeing convergence
  • In practice, we can do much better by
    aggressively pushing µ, while checking that the
    excessive gap condition is satisfied
  • Heuristic 2: Balanced µ reduction
  • To prevent one µ from dominating the other, we
    also perform periodic adjustments to keep them
    within a small factor of one another
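Heuristic 1 can be sketched as a trial-and-fallback step; everything below is illustrative (the function name, the step sizes, and the `egc_holds` callback are hypothetical, not from the talk):

```python
def aggressive_mu_step(mu, egc_holds, shrink=0.5, conservative=0.95):
    """Heuristic 1 sketch: try an aggressive reduction of mu; if the
    excessive gap condition fails for the current iterates at the trial
    value, fall back to a provably safe conservative reduction.
    egc_holds(mu) is a hypothetical callback checking f_mu1(y) >= F_mu2(x)."""
    trial = mu * shrink
    if egc_holds(trial):
        return trial               # aggressive step accepted
    return mu * conservative       # safe fallback
```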

22
Matrix-vector multiplication in poker
[G., Hoda, Peña, Sandholm '07]
  • The main time and space bottleneck of the
    algorithm is the matrix-vector product on A
  • Instead of storing the entire matrix, we can
    represent it as a composition of Kronecker
    products
  • We can also effectively take advantage of
    parallelization in the matrix-vector product to
    achieve near-linear speedup
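The Kronecker trick can be sketched as follows (illustrative; the talk's actual decomposition of A is game-specific). Using the identity (B ⊗ C)·vec(X) = vec(B X Cᵀ) with row-major vec, the product never materializes the large matrix:

```python
import numpy as np

def kron_matvec(B, C, x):
    """Compute (B kron C) @ x without forming the Kronecker product,
    using (B kron C) vec(X) = vec(B X C^T) for row-major vec(X).
    Cost is O(p*q*s + p*s*r) versus O(p*r*q*s) for the explicit matrix,
    with B of shape (p, q) and C of shape (r, s)."""
    q, s = B.shape[1], C.shape[1]
    X = x.reshape(q, s)
    return (B @ X @ C.T).reshape(-1)

rng = np.random.default_rng(0)
B, C = rng.normal(size=(3, 4)), rng.normal(size=(5, 6))
x = rng.normal(size=4 * 6)
assert np.allclose(kron_matvec(B, C, x), np.kron(B, C) @ x)
```

The memory saving is the point: only the factors are stored, never the (p·r) × (q·s) product.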

23
Memory usage comparison
24
Poker
  • Poker is a recognized challenge problem in AI
    because (among other reasons)
  • the other player's cards are hidden
  • bluffing and other deceptive strategies are
    needed in a good player
  • there is uncertainty about future events
  • Texas Hold'em is the most popular variant of poker
  • Two-player game tree has 10^18 nodes

25
Potential-aware automated abstraction
[G., Sandholm, Sørensen '07]
  • Most prior automated abstraction algorithms
    employ a myopic expected value computation as a
    similarity metric
  • This metric ignores hands like flush draws where,
    although the current probability of winning is
    small, the potential payoff is high
  • Our newest algorithm considers higher-dimensional
    spaces consisting of histograms over abstracted
    classes of states from later stages of the game
  • This enables our bottom-up abstraction algorithm
    to automatically take into account positive and
    negative potential
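A toy sketch of the idea (not the authors' algorithm): summarize each hand by its histogram over abstracted next-round states, then cluster hands whose distributions over futures are similar, so a drawing hand groups with other high-potential hands rather than with weak made hands:

```python
import numpy as np

def cluster_histograms(H, k, iters=20, seed=0):
    """Plain k-means over the rows of H, where row i is hand i's
    histogram over (abstracted) next-round buckets. Hands with similar
    distributions over future states land in the same bucket."""
    rng = np.random.default_rng(seed)
    centers = H[rng.choice(len(H), size=k, replace=False)]
    for _ in range(iters):
        dists = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = H[labels == j].mean(axis=0)
    return labels

# Two hands headed for strong futures, two headed for weak ones:
H = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels = cluster_histograms(H, 2)
```

Here the first two rows cluster together even though their current expected values could differ; the histogram is what carries the "potential" signal.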

26
Solving the four-round model
  • Computed an abstraction with
  • 20 first-round buckets
  • 800 second-round buckets
  • 4,800 third-round buckets
  • 28,800 fourth-round buckets
  • Our algorithm runs using 30 GB RAM
  • Simply representing the problem as an LP would
    require 32 TB
  • Outputs a new, improved solution every 2.5 days

27
G., Sandholm, Sørensen '07 (results chart omitted in transcript)
28
(results chart omitted in transcript)
29
(results chart omitted in transcript)
30
Future research
  • Customizing second-order methods (e.g.,
    interior-point methods) for the equilibrium
    problem
  • Additional heuristics for improving the practical
    performance of the EGT algorithm
  • Techniques for finding an optimal solution from
    an ε-solution

31
Thank you