Title: Uri Zwick Tel Aviv University
1 Uri Zwick, Tel Aviv University
Simple Stochastic Games, Mean Payoff Games, Parity Games
2 Randomized subexponential algorithm for SSGs
Deterministic subexponential algorithm for PGs
3 Simple Stochastic Games
Mean Payoff Games
Parity Games
4 A simple Simple Stochastic Game
5 Simple Stochastic Games (SSGs): Reachability version, Condon (1992)
[Figure: example game graph; legend: min, MAX, and RAND (R) vertices]
Two players: MAX and min
Objective: MAX maximizes, and min minimizes, the probability of reaching the MAX-sink
6 Simple Stochastic Games (SSGs): Strategies
A general strategy may be randomized and history
dependent
A positional strategy is deterministic and history independent
Positional strategy for MAX: a choice of an outgoing edge from each MAX vertex
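7 Simple Stochastic Games (SSGs): Values
Every vertex i in the game has a value v_i, which is the same whether the players use general or positional strategies:
$$v_i = \max_{\sigma} \min_{\tau} v_i(\sigma,\tau) = \min_{\tau} \max_{\sigma} v_i(\sigma,\tau)$$
where σ ranges over strategies of MAX, τ over strategies of min, and v_i(σ,τ) is the probability of reaching the MAX-sink from i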
Both players have positional optimal strategies
There are strategies that are optimal for every
starting position
8 Simple Stochastic Games (SSGs), Condon (1992)
Terminating binary games
The outdegrees of all non-sinks are 2
All probabilities are ½.
The game terminates with prob. 1
Easy reduction from general games to terminating binary games
9 Solving terminating binary SSGs
The values vi of the vertices of a game are the
unique solution of the following equations
The values are rational numbers requiring only a linear number of bits
Corollary: Decision version in NP ∩ co-NP
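For reference, the optimality equations for terminating binary games are commonly written as follows. This is a sketch in assumed notation: j and k denote the two successors of vertex i, and "min-sink" is an assumed name for the sink that is not the MAX-sink.

```latex
v_i =
\begin{cases}
\max\{v_j, v_k\} & \text{if } i \text{ is a MAX vertex with successors } j, k,\\
\min\{v_j, v_k\} & \text{if } i \text{ is a min vertex with successors } j, k,\\
\tfrac{1}{2}\,(v_j + v_k) & \text{if } i \text{ is a RAND vertex with successors } j, k,\\
1 & \text{if } i \text{ is the MAX-sink},\\
0 & \text{if } i \text{ is the min-sink}.
\end{cases}
```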
10 Value iteration (for binary SSGs)
Iterate the operator
Converges to the unique solution
But it may require an exponential number of iterations to get close
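A minimal Python sketch of value iteration on a binary SSG. The dictionary encoding, the vertex names, and the tiny example game are assumptions made for illustration, not part of the slides; the operator applied in each round is the usual max / min / average update.

```python
def value_iteration(game, rounds):
    """Apply the value-iteration operator `rounds` times, starting from 0 at non-sinks."""
    val = {v: (spec[1] if spec[0] == "SINK" else 0.0) for v, spec in game.items()}
    for _ in range(rounds):
        new = {}
        for v, spec in game.items():
            kind = spec[0]
            if kind == "SINK":
                new[v] = spec[1]
                continue
            a, b = val[spec[1]], val[spec[2]]
            if kind == "MAX":
                new[v] = max(a, b)
            elif kind == "MIN":
                new[v] = min(a, b)
            else:  # RAND: both successors with probability 1/2
                new[v] = (a + b) / 2
        val = new
    return val

# Hypothetical terminating binary game: vertex -> (kind, successor, successor).
GAME = {
    "win": ("SINK", 1.0),
    "lose": ("SINK", 0.0),
    "a": ("MAX", "r1", "r2"),
    "r1": ("RAND", "win", "lose"),
    "r2": ("RAND", "a", "win"),
    "b": ("MIN", "a", "r1"),
}

approx = value_iteration(GAME, 100)
```

In this example the estimate at vertex a approaches 1 only geometrically, which hints at why many rounds can be needed on harder instances.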
11 Simple Stochastic Games (SSGs): Payoff version, Shapley (1953)
[Figure: example game graph; legend: min, MAX, and RAND (R) vertices]
Limiting average version
Discounted version
12 Markov Decision Processes (MDPs)
[Figure: example game graph; legend: min, MAX, and RAND (R) vertices]
Theorem, d'Epenoux (1964):
Values and optimal strategies of an MDP can be found by solving an LP
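One way such an LP can look, as a sketch under stated assumptions: a terminating binary game with MAX and RAND vertices only (no min vertices), successors of i written j and k, and "min-sink" as an assumed name for the losing sink. This is a standard formulation and not necessarily the one shown on the slide.

```latex
\begin{aligned}
\text{minimize}\quad & \sum_i v_i \\
\text{subject to}\quad
 & v_i \ge v_j,\;\; v_i \ge v_k && \text{for each MAX vertex } i \text{ with successors } j, k,\\
 & v_i = \tfrac{1}{2}\,(v_j + v_k) && \text{for each RAND vertex } i \text{ with successors } j, k,\\
 & v_{\text{MAX-sink}} = 1,\quad v_{\text{min-sink}} = 0 .
\end{aligned}
```

An optimal strategy for MAX can then be read off by choosing, at each MAX vertex, a successor whose constraint is tight.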
13 Markov Decision Processes (MDPs)
[Figure: example game graph; legend: min, MAX, and RAND (R) vertices]
Theorem, d'Epenoux (1970):
Values and optimal strategies of an MDP can be found by solving an LP
14 SSG ∈ NP ∩ co-NP: Another proof
Deciding whether the value of a game is at least (at most) v is in NP ∩ co-NP
To show that value ≥ v, guess an optimal strategy σ for MAX
Find an optimal counter-strategy τ for min by solving the resulting MDP.
Is the problem in P?
15 Mean Payoff Games (MPGs), Ehrenfeucht, Mycielski (1979)
[Figure: example game graph; legend: min, MAX, and RAND (R) vertices]
Non-terminating version
Discounted version
Payoff SSGs
Reachability SSGs
MPGs
Pseudo-polynomial algorithm (PZ96)
16 Mean Payoff Games (MPGs), Ehrenfeucht, Mycielski (1979)
Again, both players have optimal positional strategies.
Value(σ,τ) = average of the cycle formed
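A small Python sketch of Value(σ,τ): once both players fix positional strategies, the play from any start vertex follows a single path into a cycle, and the value is the average edge weight on that cycle. The encoding (`owner`, `weight`, strategy dictionaries) and the example numbers are assumptions for illustration.

```python
from fractions import Fraction

def mean_payoff(start, sigma, tau, owner, weight):
    """Average edge weight of the cycle reached from `start` under positional strategies."""
    def choice(v):
        return sigma[v] if owner[v] == "MAX" else tau[v]

    seen = {}   # vertex -> position in `path` where it was first visited
    path = []
    v = start
    while v not in seen:
        seen[v] = len(path)
        path.append(v)
        v = choice(v)
    cycle = path[seen[v]:]   # the vertices that repeat forever
    total = sum(weight[(u, choice(u))] for u in cycle)
    return Fraction(total, len(cycle))

# Hypothetical example: a -> b -> c -> b -> c -> ...
owner = {"a": "MAX", "b": "MIN", "c": "MAX"}
sigma = {"a": "b", "c": "b"}      # positional strategy of MAX
tau = {"b": "c"}                  # positional strategy of min
weight = {("a", "b"): 5, ("b", "c"): -1, ("c", "b"): 3}

value = mean_payoff("a", sigma, tau, owner, weight)
```

The weight of the edge (a, b) lies on the path into the cycle and does not affect the result.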
17 Selecting the second largest element with only four storage locations, PZ96
18 Parity Games (PGs): A simple example
[Figure: example parity game with vertex priorities 2, 3, 2, 1, 4, 1]
EVEN wins if the largest priority seen infinitely often is even
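As a tiny illustration of the winning condition, here is a Python sketch that decides the winner of an eventually periodic play. The representation of a play as a finite prefix followed by a repeated cycle of priorities is an assumption; it is what results when both players use positional strategies.

```python
def parity_winner(prefix, cycle):
    """Winner of the infinite play prefix, cycle, cycle, ... (lists of priorities).

    Only the priorities on the cycle are seen infinitely often, so the prefix
    does not matter.
    """
    if not cycle:
        raise ValueError("an infinite play needs a non-empty cycle")
    return "EVEN" if max(cycle) % 2 == 0 else "ODD"
```

For example, a play that visits priority 4 once and then alternates forever between priorities 1 and 3 is won by ODD.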
19 Parity Games (PGs)
EVEN wins if the largest priority seen infinitely often is even
Equivalent to many interesting problems in automata and verification:
Non-emptiness of ω-tree automata
Modal μ-calculus model checking
20 Parity Games (PGs) → Mean Payoff Games (MPGs)
Stirling (1993), Puri (1995)
Replace priority k by payoff (−n)^k
Move payoffs to outgoing edges
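The reduction can be sketched in a few lines of Python. The encoding is an assumption: `edges` maps each vertex to its successors, `priority` maps each vertex to its priority, n is taken to be the number of vertices, and EVEN plays the role of MAX in the resulting mean payoff game.

```python
def parity_to_mean_payoff(edges, priority):
    """Give every edge (u, v) the payoff (-n)**priority[u], with n = number of vertices."""
    n = len(priority)
    return {(u, v): (-n) ** priority[u] for u in edges for v in edges[u]}

# Hypothetical two-vertex parity game.
edges = {0: [1], 1: [0, 1]}
priority = {0: 2, 1: 1}
weights = parity_to_mean_payoff(edges, priority)
```

On a simple cycle whose largest priority is even the payoffs sum to a positive number, and to a negative number if it is odd: here the cycle 0 → 1 → 0 has total payoff 4 − 2 = 2, while the self-loop at vertex 1 has payoff −2.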
21 Simple Stochastic Games (SSGs): Switches
A switch is a change of strategy at a single
vertex
A switch is profitable for MAX if it increases
the value of the game (sum of values of all
vertices)
A strategy is optimal iff no switch is profitable
22 Switches
23 Strategy/Policy Iteration
Start with some strategy σ (of MAX)
While there are improving switches, perform some
of them
As each step is strictly improving and as there
is a finite number of strategies, the algorithm
must end with an optimal strategy
SSG ∈ PLS (Polynomial Local Search)
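A Python sketch of the iteration for the one-player case. Assumptions: a terminating binary game with MAX and RAND vertices only, a dictionary encoding, exact evaluation of a fixed strategy by Gaussian elimination, and the local test "a successor has a strictly larger value" as the criterion for an improving switch. With min vertices present, evaluation would also require an optimal counter-strategy for min, which this sketch omits.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a must be nonsingular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]

def evaluate(game, sigma):
    """Exact values of all vertices when MAX plays the positional strategy sigma."""
    order = list(game)
    idx = {v: i for i, v in enumerate(order)}
    n = len(order)
    a = [[Fraction(0)] * n for _ in range(n)]
    b = [Fraction(0)] * n
    for v, spec in game.items():
        i = idx[v]
        a[i][i] += 1
        if spec[0] == "SINK":
            b[i] = Fraction(spec[1])          # v_i = 0 or 1
        elif spec[0] == "MAX":
            a[i][idx[sigma[v]]] -= 1          # v_i = value of the chosen successor
        else:                                 # RAND: v_i = average of successors
            for w in spec[1:]:
                a[i][idx[w]] -= Fraction(1, 2)
    return dict(zip(order, solve_linear(a, b)))

def strategy_iteration(game):
    """Perform all improving switches in each round until none remains."""
    sigma = {v: spec[1] for v, spec in game.items() if spec[0] == "MAX"}
    while True:
        val = evaluate(game, sigma)
        switched = False
        for v in sigma:
            best = max(game[v][1:], key=lambda w: val[w])
            if val[best] > val[sigma[v]]:
                sigma[v] = best
                switched = True
        if not switched:
            return sigma, val

# Hypothetical example game.
GAME = {
    "win": ("SINK", 1), "lose": ("SINK", 0),
    "a": ("MAX", "b", "c"),
    "b": ("RAND", "win", "lose"),
    "c": ("RAND", "win", "b"),
}
sigma, val = strategy_iteration(GAME)
```

The initial strategy sends a to b (value 1/2); one switch to c raises the value of a to 3/4, after which no improving switch remains.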
24 Strategy/Policy Iteration: Complexity?
Performing only one switch at a time may lead to exponentially many improvements, even for MDPs, Condon (1992)
What happens if we perform all profitable switches? Hoffman-Karp (1966)
???
Not known to be polynomial: O(2^n/n), Mansour-Singh (1999)
No non-linear examples: 2n−O(1), Madani (2002)
25 A randomized subexponential algorithm for simple stochastic games
26 A randomized subexponential algorithm for binary SSGs
Ludwig (1995), Kalai (1992), Matousek-Sharir-Welzl (1992)
Start with an arbitrary strategy σ for MAX
Choose a random vertex i ∈ V_MAX
Find the optimal strategy σ' for MAX in the game in which the only outgoing edge of i is (i,σ(i))
If switching σ' at i is not profitable, then σ' is optimal
Otherwise, let σ ← (σ')^i, that is, σ' switched at i, and repeat
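The recursion can be sketched in Python as follows. To keep the example short it makes strong simplifying assumptions that are not in the slides: the game has MAX and RAND vertices only and is acyclic, so the values under a fixed strategy can be computed by simple memoized recursion, and a switch counts as profitable when it increases the sum of all vertex values. The names `total_value`, `switch`, and `ludwig` are hypothetical.

```python
import random
from fractions import Fraction

def total_value(game, sigma):
    """Sum of vertex values under MAX strategy sigma (acyclic MAX/RAND game)."""
    memo = {}
    def val(v):
        if v not in memo:
            kind, *rest = game[v]
            if kind == "SINK":
                memo[v] = Fraction(rest[0])
            elif kind == "MAX":
                memo[v] = val(sigma[v])
            else:  # RAND
                memo[v] = (val(rest[0]) + val(rest[1])) / 2
        return memo[v]
    return sum(val(v) for v in game)

def switch(game, sigma, i):
    """Return sigma with the choice at the binary MAX vertex i flipped."""
    a, b = game[i][1:]
    new = dict(sigma)
    new[i] = b if sigma[i] == a else a
    return new

def ludwig(game, sigma, free, rng):
    """Optimal strategy among those that agree with sigma outside `free`."""
    if not free:
        return sigma
    i = rng.choice(sorted(free))
    # Optimal strategy in the game where the edge at i is fixed to sigma[i].
    best = ludwig(game, sigma, free - {i}, rng)
    candidate = switch(game, best, i)
    if total_value(game, candidate) <= total_value(game, best):
        return best                             # switching at i is not profitable
    return ludwig(game, candidate, free, rng)   # otherwise switch at i and repeat

# Hypothetical example game.
GAME = {
    "win": ("SINK", 1), "lose": ("SINK", 0),
    "c": ("RAND", "win", "lose"),
    "r": ("RAND", "win", "c"),
    "b": ("MAX", "r", "lose"),
    "a": ("MAX", "b", "c"),
}
start = {"a": "c", "b": "lose"}
result = ludwig(GAME, start, frozenset(start), random.Random(0))
```

Every repetition strictly increases the sum of values, so the recursion terminates; in this example the result does not depend on the random choices.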
27 A randomized subexponential algorithm for binary SSGs
Ludwig (1995), Kalai (1992), Matousek-Sharir-Welzl (1992)
[Figure: the MAX vertices in the hidden order; labels: "All correct!", "Would never be switched!"]
There is a hidden order of MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1,2,…,i
28 The hidden order
Let u_i(σ) be the sum of values of the optimal strategy of MAX that agrees with σ on i
Order the vertices such that u_1(σ) ≤ u_2(σ) ≤ ⋯
Positions 1,…,i were switched and would never be switched again
29 The hidden order
u_i(σ): the maximum sum of values of a strategy of MAX that agrees with σ on i
31 SSGs are LP-type problems, Halman (2002)
General (non-binary) SSGs can be solved in subexponential time
Independently observed by Björklund-Sandberg-Vorobyov (2005)
AUSO: Acyclic Unique Sink Orientations
32 SSGs and GPLCP
Gärtner-Rüst (2005), Björklund-Svensson-Vorobyov (2005)
GPLCP: Generalized Linear Complementarity Problem with a P-matrix
33 A deterministic subexponential algorithm for parity games
Mike Paterson, Marcin Jurdzinski, Uri Zwick
36 Exponential algorithm for PGs, McNaughton (1993), Zielonka (1998)
Vertices of highest priority (even)
First recursive call
Second recursive call
In the worst case, both recursive calls are on games of size n−1
Vertices from which EVEN can force the game to enter A
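A compact Python sketch of the recursive algorithm in its usual textbook form; the slides' own presentation may differ in detail. Assumptions: owner 0 is EVEN and owner 1 is ODD, every vertex has at least one successor, and the names `attractor` and `zielonka` as well as the example game are chosen for illustration.

```python
def attractor(vertices, edges, owner, player, target):
    """Vertices of the subgame from which `player` can force the play into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            succ = [w for w in edges[v] if w in vertices]
            if owner[v] == player:
                ok = any(w in attr for w in succ)
            else:
                ok = all(w in attr for w in succ)
            if ok:
                attr.add(v)
                changed = True
    return attr

def zielonka(vertices, edges, owner, priority):
    """Return (W_even, W_odd), the winning sets in the subgame induced by `vertices`."""
    if not vertices:
        return set(), set()
    d = max(priority[v] for v in vertices)
    p = d % 2                                   # player favoured by the top priority
    top = {v for v in vertices if priority[v] == d}
    a = attractor(vertices, edges, owner, p, top)
    win = list(zielonka(vertices - a, edges, owner, priority))   # first recursive call
    if not win[1 - p]:
        win[p], win[1 - p] = set(vertices), set()
        return tuple(win)
    b = attractor(vertices, edges, owner, 1 - p, win[1 - p])
    win = list(zielonka(vertices - b, edges, owner, priority))   # second recursive call
    win[1 - p] = win[1 - p] | b
    return tuple(win)

# Hypothetical example game.
edges = {0: [1], 1: [0, 2], 2: [2], 3: [3, 0]}
owner = {0: 0, 1: 1, 2: 1, 3: 0}
priority = {0: 2, 1: 1, 2: 1, 3: 0}
w_even, w_odd = zielonka(set(edges), edges, owner, priority)
```

In the example, ODD wins from vertices 0, 1, and 2 by moving to the self-loop of priority 1 at vertex 2, while EVEN wins from vertex 3 by staying on its self-loop of priority 0.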
37 Exponential algorithm for PGs, McNaughton (1993), Zielonka (1998)
Vertices of highest priority (even)
First recursive call
Vertices from which EVEN can force the game to enter A
Lemma: (i) (ii)
39 Deterministic subexponential algorithm for PGs, Jurdzinski, Paterson, Zwick (2006)
Idea: Look for small dominions!
Second recursive call
Dominions of size s can be found in O(n^s) time
Dominion: a (small) set from which one of the players can win without the play ever leaving this set
40 Open problems
- Polynomial algorithms?
- Is the Policy Improvement algorithm polynomial?
- Faster subexponential algorithms for parity games?
- Deterministic subexponential algorithms for MPGs and SSGs?
- Faster pseudo-polynomial algorithms for MPGs?