Title: Nesterov's excessive gap technique and poker
Slide 1: Nesterov's excessive gap technique and poker
- Andrew Gilpin
- CMU Theory Lunch
- Feb 28, 2007
- Joint work with Samid Hoda, Javier Peña, Troels Sørensen, and Tuomas Sandholm
Slide 2: Outline
- Two-person zero-sum sequential games
- First-order methods for convex optimization
- Nesterov's excessive gap technique (EGT)
- EGT for sequential games
- Heuristics for EGT
- Application to Texas Hold'em poker
Slide 3: We want to solve
max_{x ∈ Q1} min_{y ∈ Q2} x^T A y
If Q1 and Q2 are simplices, this is the Nash equilibrium problem for two-person zero-sum matrix games.
If Q1 and Q2 are complexes, this is the Nash equilibrium problem for two-person zero-sum sequential games.
Slide 4: What's a complex?
It's just like a simplex, but more complex.
Each player's complex encodes her set of realization plans in the game. In particular, player 1's complex is Q1 = { x ≥ 0 : Ex = e }, where the matrix E and vector e depend on the game.
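As a concrete illustration, here is the constraint system Ex = e for a hypothetical tiny game (this example game is invented for illustration, not taken from the talk): player 1 first chooses a or b, and after a chooses c or d. Each row of E says that the probability mass entering a decision point is split among its actions.

```python
import numpy as np

# Hypothetical tiny game for player 1: at the root she chooses a or b;
# after a she chooses c or d.  Sequences: (empty, a, b, ac, ad).
# Realization-plan constraints Ex = e, x >= 0 (sequence form):
#   x_empty = 1,  x_a + x_b = x_empty,  x_ac + x_ad = x_a
E = np.array([
    [ 1,  0,  0,  0,  0],   # x_empty = 1
    [-1,  1,  1,  0,  0],   # x_a + x_b - x_empty = 0
    [ 0, -1,  0,  1,  1],   # x_ac + x_ad - x_a = 0
], dtype=float)
e = np.array([1.0, 0.0, 0.0])

# Behavioral strategy: play a with prob 0.6, then c with prob 0.5.
# Its realization plan multiplies the probabilities along each sequence.
x = np.array([1.0, 0.6, 0.4, 0.3, 0.3])

assert np.allclose(E @ x, e) and np.all(x >= 0)  # x is a valid realization plan
```

Every behavioral strategy maps to a point of this polytope in the same way, and vice versa, which is why the equilibrium problem stays a linear saddle-point problem over the complexes.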
Slide 5: [Figure: example game tree with nodes labeled A through H]
Slide 6: Recall our problem
max_{x ∈ Q1} min_{y ∈ Q2} x^T A y, where Q1 and Q2 are complexes.
Since Q1 and Q2 have a linear description, this problem can be solved as an LP. However, current LP solution methods do not scale.
Slide 7: (Un)scalability of LP solvers
- Rhode Island poker [Shi & Littman 01]
  - LP has 91 million rows and columns
  - Applying the GameShrink automated abstraction algorithm yields an LP with only 1.2 million rows and columns, and 50 million non-zeros [G. & Sandholm 06a]
  - Solution requires 25 GB RAM and over a week of CPU time
- Texas Hold'em poker
  - 10^18 nodes in the game tree
  - Lossy abstractions need to be performed
  - Limitations of current solver technology are the primary obstacle to achieving expert-level strategies [G. & Sandholm 06b, 07a]
- Instead of standard LP solvers, what about a first-order method?
Slide 8: Convex optimization
Suppose we want to solve
min_x f(x), where f is convex.
Note that this formulation captures ALL convex optimization problems (the feasible set can be modeled using an indicator function).
For general f, convergence requires O(1/ε^2) iterations (e.g., for subgradient methods). For smooth, strongly convex f with Lipschitz-continuous gradient, it can be done in O(1/ε^(1/2)) iterations.
This analysis is based on a black-box oracle access model. Can we do better by looking inside the box?
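A toy illustration of the slow non-smooth regime (this example function and step-size constant are mine, not from the talk): a subgradient method on f(x) = |x − 1| with the classic 1/√k step sizes, which attains the O(1/ε²) black-box rate.

```python
# Toy subgradient method on the non-smooth f(x) = |x - 1|.
# With step sizes 1/sqrt(k), the best iterate converges at the slow
# O(1/eps^2) rate typical of black-box non-smooth methods.
def f(x):
    return abs(x - 1.0)

def subgrad(x):
    return 1.0 if x > 1.0 else -1.0   # a valid subgradient of f at x

x, best = 0.3, float("inf")
for k in range(1, 10001):
    best = min(best, f(x))
    x -= subgrad(x) / k ** 0.5        # step size 1/sqrt(k)

assert best < 0.05   # slow, but it gets there
```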
Slide 9: Strong convexity
- A function d is strongly convex if there exists σ > 0 such that
  d(αx + (1−α)y) ≤ α d(x) + (1−α) d(y) − (σ/2) α(1−α) ‖x − y‖^2
  for all x, y in the domain and all α ∈ [0, 1]
- σ is the strong convexity parameter of d
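A quick numerical sanity check of this definition, using d(x) = ½‖x‖² (strongly convex with σ = 1 in the Euclidean norm; this particular test function is my choice, not the talk's):

```python
import numpy as np

rng = np.random.default_rng(0)

def d(x):                      # half squared Euclidean norm: sigma = 1
    return 0.5 * np.dot(x, x)

sigma = 1.0
ok = True
for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    a = rng.uniform()
    lhs = d(a * x + (1 - a) * y)
    rhs = a * d(x) + (1 - a) * d(y) \
          - 0.5 * sigma * a * (1 - a) * np.dot(x - y, x - y)
    ok &= lhs <= rhs + 1e-9    # the strong convexity inequality
assert ok
```

For this quadratic d the inequality actually holds with equality, which is why σ = 1 is the largest valid parameter.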
Slide 10: Recall our problem
max_{x ∈ Q1} min_{y ∈ Q2} x^T A y, where Q1 and Q2 are complexes.
Equivalently, max_{x ∈ Q1} F(x) = min_{y ∈ Q2} f(y), where F(x) = min_{y ∈ Q2} x^T A y and f(y) = max_{x ∈ Q1} x^T A y.
Slide 11: Smoothing
Unfortunately, F and f are non-smooth. Fortunately, they have a special structure.
Let d1, d2 be smooth and strongly convex on Q1, Q2. These are called prox-functions. Now let µ > 0 and consider
F_µ(x) = min_{y ∈ Q2} { x^T A y + µ d2(y) }
f_µ(y) = max_{x ∈ Q1} { x^T A y − µ d1(x) }
These are well-defined, smooth functions.
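In the simplex case with the entropy prox function (assumptions: Q2 a simplex, d2(y) = Σ y_j ln y_j + ln n; the sequential-game case replaces simplices with complexes), the smoothed F_µ has a log-sum-exp closed form, and one can check numerically that it approaches the non-smooth F from above as µ → 0:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))          # payoff matrix of a small random game
x = np.full(3, 1 / 3)                # a point of the simplex Q1

# With the entropy prox d2(y) = sum_j y_j ln y_j + ln n (whose min over the
# simplex is 0), F_mu(x) = min_{y in Q2} { x'Ay + mu*d2(y) } has the
# log-sum-exp closed form below.
def F_mu(x, mu):
    c = A.T @ x
    return -mu * np.log(np.mean(np.exp(-c / mu)))

F = (A.T @ x).min()                  # the non-smooth F(x) = min_j (A'x)_j
mus = (1.0, 0.1, 0.01)
gaps = [F_mu(x, mu) - F for mu in mus]

# F_mu over-estimates F by at most mu*ln(n) and decreases toward F with mu
assert all(-1e-9 <= g <= mu * np.log(4) + 1e-9 for g, mu in zip(gaps, mus))
assert gaps[0] >= gaps[1] >= gaps[2]
```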
Slide 12: Excessive gap condition
From weak duality, we have that f(y) ≥ F(x). The excessive gap condition requires the reverse inequality for the smoothed functions:
f_µ(y) ≤ F_µ(x)    (EGC)
The algorithm maintains (EGC) and gradually decreases µ. As µ decreases, the smoothed functions approach the non-smooth functions, and thus iterates satisfying (EGC) converge to optimal solutions.
Slide 13: Nesterov's main theorem
- Theorem [Nesterov 05]
  - There exists an algorithm such that after at most N iterations, the iterates have duality gap at most (4 ‖A‖ / (N + 1)) · sqrt(D1 D2 / (σ1 σ2)), where D_i = max of d_i over Q_i and σ_i is the strong convexity parameter of d_i
  - Furthermore, each iteration only requires solving three problems of the form max_{x ∈ Q} { g^T x − d(x) } and performing three matrix-vector product operations on A.
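Plugging the simplex/entropy case (D_i = ln n_i, σ_i = 1) into this gap bound gives a concrete iteration count; a small arithmetic sketch, assuming the bound in the form (4‖A‖/(N+1))·sqrt(D1 D2/(σ1 σ2)):

```python
import math

# Iterations needed for duality gap <= eps under the bound
#   gap <= (4*||A||/(N+1)) * sqrt(D1*D2/(sigma1*sigma2)).
# Entropy prox on simplices of sizes n1, n2: D_i = ln n_i, sigma_i = 1.
def iterations_needed(norm_A, n1, n2, eps):
    return math.ceil(4 * norm_A * math.sqrt(math.log(n1) * math.log(n2)) / eps - 1)

# e.g. ||A|| = 1 and a million pure strategies per player
N = iterations_needed(1.0, 10**6, 10**6, 1e-3)
assert 0 < N < 10**5   # dimension enters only logarithmically
```

Note the O(1/ε) dependence on the target accuracy, versus O(1/ε²) for black-box subgradient methods, and that the problem size appears only through logarithms.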
Slide 14: Nice prox functions
- A prox function d for Q is nice if
  - it is strongly convex, continuous everywhere in Q, and differentiable in the relative interior of Q
  - the min of d over Q is 0
  - maps of the form sargmax_d(g) := argmax_{x ∈ Q} { g^T x − d(x) } are easily computable
Slide 15: Nice simplex prox function 1: Entropy
d(x) = Σ_i x_i ln x_i + ln n
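For the entropy prox function the sargmax map has a softmax closed form, so it costs O(n); a minimal sketch, assuming the prox form d(x) = Σ x_i ln x_i + ln n:

```python
import numpy as np

rng = np.random.default_rng(2)

def d_entropy(x):
    # entropy prox on the simplex: d(x) = sum x_i ln x_i + ln n, min value 0
    n = len(x)
    return np.sum(np.where(x > 0, x * np.log(np.where(x > 0, x, 1.0)), 0.0)) + np.log(n)

def sargmax_entropy(g):
    # argmax_{x in simplex} { g'x - d(x) } is the softmax of g
    z = np.exp(g - g.max())          # subtract max for numerical stability
    return z / z.sum()

g = rng.normal(size=5)
xs = sargmax_entropy(g)
best = g @ xs - d_entropy(xs)
# sanity check: no random simplex point does better
for _ in range(2000):
    x = rng.dirichlet(np.ones(5))
    assert g @ x - d_entropy(x) <= best + 1e-9
```

The closed form follows from the Lagrange conditions: g_i − ln x_i − 1 − λ = 0 forces x_i ∝ exp(g_i).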
Slide 16: Nice simplex prox function 2: Euclidean
d(x) = (1/2) Σ_i (x_i − 1/n)^2
The sargmax can be computed in O(n log n) time.
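Assuming the Euclidean prox d(x) = ½ Σ (x_i − 1/n)², the sargmax reduces to Euclidean projection onto the simplex, computable by the standard O(n log n) sort-and-threshold routine:

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto the probability simplex,
    # via the classic O(n log n) sort-and-threshold algorithm
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def sargmax_euclidean(g, n):
    # With d(x) = 0.5*||x - (1/n)1||^2, completing the square shows that
    # argmax_{x in simplex} { g'x - d(x) } = Proj_simplex(g + (1/n)1)
    return project_simplex(g + 1.0 / n)

rng = np.random.default_rng(3)
g = rng.normal(size=6)
xs = sargmax_euclidean(g, 6)
assert abs(xs.sum() - 1) < 1e-9 and np.all(xs >= 0)

# optimality check against random feasible points
d_euc = lambda x: 0.5 * np.sum((x - 1 / 6) ** 2)
best = g @ xs - d_euc(xs)
for _ in range(2000):
    x = rng.dirichlet(np.ones(6))
    assert g @ x - d_euc(x) <= best + 1e-9
```

The sort dominates the cost, which is where the O(n log n) on the slide comes from.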
Slide 17: From the simplex to the complex
- Theorem [Hoda, G., & Peña 06]
  - A nice prox function can be constructed for the complex via a recursive application of any nice prox function for the simplex
Slide 18: Prox function example
Let d be any nice simplex prox function. [The slide shows the constraint matrix of an example complex and the recursively constructed prox function for it.]
Slide 19: Solving [algorithm pseudocode shown on slide]
Slide 20: [continuation of the algorithm; similar to steps b(i–vii)]
Slide 21: Heuristics [G., Hoda, Peña, & Sandholm 07]
- Heuristic 1: Aggressive µ reduction
  - The µ given in the previous algorithm is a conservative choice guaranteeing convergence
  - In practice, we can do much better by aggressively shrinking µ, while checking that the excessive gap condition is still satisfied
- Heuristic 2: Balanced µ reduction
  - To prevent one µ from dominating the other, we also perform periodic adjustments to keep them within a small factor of one another
Slide 22: Matrix-vector multiplication in poker [G., Hoda, Peña, & Sandholm 07]
- The main time and space bottleneck of the algorithm is the matrix-vector product with A
- Instead of storing the entire matrix, we can represent it as a composition of Kronecker products
- We can also effectively take advantage of parallelization in the matrix-vector product to achieve near-linear speedup
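The space saving comes from the standard Kronecker identity kron(B, C)·vec(X) = vec(B·X·Cᵀ), which lets one multiply by a Kronecker product without ever materializing it; a sketch with small random factors (the poker matrices themselves are far larger):

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.normal(size=(3, 4))
C = rng.normal(size=(2, 5))
v = rng.normal(size=B.shape[1] * C.shape[1])   # vector of length 4*5 = 20

# Never materialize kron(B, C) (6 x 20 here; astronomically larger for the
# real poker instances).  With row-major vec, the Kronecker identity
#     kron(B, C) @ vec(X) = vec(B @ X @ C.T)
# lets us do the product using only the small factors.
def kron_matvec(B, C, v):
    X = v.reshape(B.shape[1], C.shape[1])
    return (B @ X @ C.T).reshape(-1)

assert np.allclose(np.kron(B, C) @ v, kron_matvec(B, C, v))
```

Storage drops from the product of the two matrices' sizes to their sum, and the two inner multiplications parallelize naturally.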
Slide 23: Memory usage comparison
Slide 24: Poker
- Poker is a recognized challenge problem in AI because (among other reasons)
  - the other players' cards are hidden
  - bluffing and other deceptive strategies are needed in a good player
  - there is uncertainty about future events
- Texas Hold'em is the most popular variant of poker
  - Its two-player game tree has 10^18 nodes
Slide 25: Potential-aware automated abstraction [G., Sandholm, & Sørensen 07]
- Most prior automated abstraction algorithms employ a myopic expected-value computation as a similarity metric
  - This ignores hands like flush draws, where although the probability of winning is small, the payoff could be high
- Our newest algorithm considers higher-dimensional spaces consisting of histograms over abstracted classes of states from later stages of the game
- This enables our bottom-up abstraction algorithm to automatically take into account positive and negative potential
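A toy illustration of why the histogram view matters (the numbers are invented for illustration, not the paper's data): a mediocre made hand and a flush draw can have identical expected winning probability, so a myopic expected-value metric lumps them together, while the histograms over future outcomes separate them.

```python
import numpy as np

# Hypothetical histograms over 5 abstracted classes of future outcomes
# (probability of winning: 0, 0.25, 0.5, 0.75, 1):
made_hand  = np.array([0.10, 0.20, 0.40, 0.20, 0.10])  # mass near the middle
flush_draw = np.array([0.45, 0.05, 0.00, 0.05, 0.45])  # mass at the extremes

bins = np.linspace(0, 1, 5)
mean_made = made_hand @ bins
mean_draw = flush_draw @ bins

# A myopic expected-value metric considers the two hands identical...
assert abs(mean_made - mean_draw) < 1e-9
# ...but the histogram (L2) distance separates them clearly,
# so a histogram-based clustering keeps them in different buckets
assert np.linalg.norm(made_hand - flush_draw) > 0.5
```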
Slide 26: Solving the four-round model
- Computed an abstraction with
  - 20 first-round buckets
  - 800 second-round buckets
  - 4,800 third-round buckets
  - 28,800 fourth-round buckets
- The algorithm runs using 30 GB RAM
  - Simply representing the problem as an LP would require 32 TB
- Outputs a new, improved solution every 2.5 days
Slides 27–29: [Experimental results; G., Sandholm, & Sørensen 07]
Slide 30: Future research
- Customizing second-order methods (e.g., interior-point methods) for the equilibrium problem
- Additional heuristics for improving the practical performance of the EGT algorithm
- Techniques for finding an optimal solution from an ε-solution
Slide 31: Thank you