Uri Zwick Tel Aviv University - PowerPoint PPT Presentation

1
Uri Zwick, Tel Aviv University
Simple Stochastic Games, Mean Payoff Games, Parity Games
2
Randomized subexponential algorithm for SSG
Deterministic subexponential algorithm for PG
3
Simple Stochastic Games
Mean Payoff Games
Parity Games
4
A simple Simple Stochastic Game
5
Simple Stochastic Games (SSGs): reachability version, Condon (1992)
[Figure: game graph; vertex labels: min, MAX, RAND (R)]
Two players: MAX and min
Objective: MAX maximizes, and min minimizes, the probability of reaching the MAX-sink
6
Simple Stochastic Games (SSGs): Strategies
A general strategy may be randomized and history dependent
A positional strategy is deterministic and history independent
A positional strategy for MAX is a choice of an outgoing edge from each MAX vertex
7
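Simple Stochastic Games (SSGs): Values
Every vertex $i$ in the game has a value $v_i$:
$v_i = \max_{\sigma} \min_{\tau} v_i(\sigma, \tau) = \min_{\tau} \max_{\sigma} v_i(\sigma, \tau)$
Here $v_i(\sigma, \tau)$ is the probability of reaching the MAX-sink from $i$, and $\sigma$, $\tau$ may range over either general or positional strategies of MAX and min
Both players have positional optimal strategies
There are strategies that are optimal for every starting position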
8
Simple Stochastic Games (SSGs), Condon (1992): terminating binary games
The outdegree of every non-sink vertex is 2
All probabilities are ½
The game terminates with probability 1
Easy reduction from general games to terminating binary games
9
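Solving terminating binary SSGs
The values $v_i$ of the vertices of a game are the unique solution of the following equations, where $j$ and $k$ are the two successors of $i$:
$v_i = \max(v_j, v_k)$ if $i$ is a MAX vertex
$v_i = \min(v_j, v_k)$ if $i$ is a min vertex
$v_i = \frac{1}{2}(v_j + v_k)$ if $i$ is a RAND vertex
$v_i = 1$ at the MAX-sink and $v_i = 0$ at the min-sink
The values are rational numbers requiring only a linear number of bits
Corollary: the decision version is in NP $\cap$ co-NP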
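Because the values are the unique solution of the optimality equations (each non-sink value is the max, min, or average of its two successors' values, with the sinks fixed at 1 and 0), a guessed value vector can be checked exactly in polynomial time. A minimal sketch of such a check follows; the dictionary encoding of the game, the vertex names, and the function name `satisfies_equations` are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction

# Hypothetical encoding: name -> ("SINK", value) or (kind, successor, successor).
GAME = {
    "one": ("SINK", 1),              # MAX-sink
    "zero": ("SINK", 0),             # min-sink
    "r": ("RAND", "one", "zero"),
    "q": ("RAND", "zero", "x"),
    "m": ("MIN", "q", "one"),
    "x": ("MAX", "r", "m"),
}


def satisfies_equations(game, v):
    """Return True iff the vector v satisfies every optimality equation."""
    for name, node in game.items():
        kind = node[0]
        if kind == "SINK":
            expected = Fraction(node[1])
        else:
            a, b = v[node[1]], v[node[2]]
            if kind == "MAX":
                expected = max(a, b)
            elif kind == "MIN":
                expected = min(a, b)
            else:  # RAND: both successors with probability 1/2
                expected = (a + b) / 2
        if v[name] != expected:
            return False
    return True


candidate = {
    "one": Fraction(1), "zero": Fraction(0), "r": Fraction(1, 2),
    "q": Fraction(1, 4), "m": Fraction(1, 4), "x": Fraction(1, 2),
}
print(satisfies_equations(GAME, candidate))
```

Exact rational arithmetic is used so that the equality test is meaningful; with floats a tolerance would be needed.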
10
Value iteration (for binary SSGs)
Iterate the operator given by the right-hand sides of these equations
Converges to the unique solution
But may require an exponential number of iterations to get close
11
Simple Stochastic Games (SSGs): payoff version, Shapley (1953)
[Figure: game graph; vertex labels: min, MAX, RAND (R)]
Limiting average version
Discounted version
12
Markov Decision Processes (MDPs)
[Figure: game graph; vertex labels: min, MAX, RAND (R)]
Theorem, d'Epenoux (1964):
The values and optimal strategies of an MDP can be found by solving an LP
13
Markov Decision Processes (MDPs)
[Figure: game graph; vertex labels: min, MAX, RAND (R)]
Theorem, d'Epenoux (1970):
The values and optimal strategies of an MDP can be found by solving an LP
14
SSG $\in$ NP $\cap$ co-NP: another proof
Deciding whether the value of a game is at least (at most) $v$ is in NP $\cap$ co-NP
To show that value $\ge v$, guess an optimal strategy $\sigma$ for MAX
Find an optimal counter-strategy $\tau$ for min by solving the resulting MDP
Is the problem in P?
15
Mean Payoff Games (MPGs), Ehrenfeucht, Mycielski (1979)
[Figure: game graph; vertex labels: min, MAX, RAND (R)]
Non-terminating version
Discounted version
[Figure: relations among payoff SSGs, reachability SSGs, and MPGs]
Pseudo-polynomial algorithm (PZ96)
16
Mean Payoff Games (MPGs), Ehrenfeucht, Mycielski (1979)
Again, both players have optimal positional strategies.
Value($\sigma$, $\tau$) = average weight of the cycle formed
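Once both players fix positional strategies, every vertex has a single chosen successor, so the play from any start vertex runs into a cycle and the value is the average weight on that cycle. A minimal sketch, assuming the two strategies have already been merged into one successor map; the names `SUCC`, `WEIGHT`, and `mean_payoff` are illustrative.

```python
from fractions import Fraction

# Hypothetical example: the successor chosen at each vertex, and edge weights.
SUCC = {"a": "b", "b": "c", "c": "b"}
WEIGHT = {("a", "b"): 5, ("b", "c"): 3, ("c", "b"): -1}


def mean_payoff(succ, weight, start):
    """Follow the unique play from `start` and average the weights on its cycle."""
    position = {}   # vertex -> index of its first visit on the path
    path = []
    v = start
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = succ[v]
    cycle = path[position[v]:]          # the repeated part of the play
    total = sum(weight[(u, succ[u])] for u in cycle)
    return Fraction(total, len(cycle))


print(mean_payoff(SUCC, WEIGHT, "a"))
```

The weight 5 on the initial edge does not matter: only the cycle b, c contributes to the long-run average.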
17
Selecting the second largest element with only four storage locations (PZ96)
18
Parity Games (PGs): a simple example
[Figure: game graph with vertex priorities 2, 3, 2, 1, 4, 1]
EVEN wins if the largest priority seen infinitely often is even
19
Parity Games (PGs)
EVEN wins if the largest priority seen infinitely often is even
Equivalent to many interesting problems in automata and verification:
Non-emptiness of $\omega$-tree automata
Modal $\mu$-calculus model checking
20
Parity Games (PGs) $\to$ Mean Payoff Games (MPGs)
Stirling (1993), Puri (1995)
Replace priority $k$ by payoff $(-n)^k$
Move payoffs to outgoing edges
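The reduction can be sketched directly: with $n$ vertices, an edge leaving a vertex of priority $k$ gets payoff $(-n)^k$, so on any cycle the largest priority dominates the sum and its parity decides the sign of the mean payoff. The graph, priorities, and function name below are hypothetical and chosen only to illustrate this.

```python
# Hypothetical parity game: priority per vertex and adjacency lists.
PRIORITY = {"a": 2, "b": 3, "c": 1}
EDGES = {"a": ["b"], "b": ["c"], "c": ["a"]}


def parity_to_mean_payoff(priority, edges):
    """Give each edge (u, v) the payoff (-n)^priority[u], with n = number of vertices."""
    n = len(priority)
    return {(u, v): (-n) ** priority[u] for u in edges for v in edges[u]}


weights = parity_to_mean_payoff(PRIORITY, EDGES)
cycle_sum = sum(weights[e] for e in [("a", "b"), ("b", "c"), ("c", "a")])
print(weights, cycle_sum)
```

On the single cycle here the largest priority is 3 (odd), and the cycle sum is negative, matching a win for ODD.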
21
Simple Stochastic Games (SSGs): Switches
A switch is a change of strategy at a single vertex
A switch is profitable for MAX if it increases the value of the game (the sum of the values of all vertices)
A strategy is optimal iff no switch is profitable
22
Switches

23
Strategy/Policy Iteration
Start with some strategy $\sigma$ (of MAX)
While there are improving switches, perform some of them
As each step is strictly improving and there is a finite number of strategies, the algorithm must end with an optimal strategy
SSG $\in$ PLS (Polynomial Local Search)
24
Strategy/Policy Iteration: Complexity?
Performing only one switch at a time may lead to exponentially many improvements, even for MDPs, Condon (1992)
What happens if we perform all profitable switches? Hoffman-Karp (1966)
???
Not known to be polynomial: $O(2^n/n)$, Mansour-Singh (1999)
No non-linear examples: $2n - O(1)$, Madani (2002)
25
A randomized subexponential algorithm for simple
stochastic games
26
A randomized subexponential algorithm for binary SSGs
Ludwig (1995), Kalai (1992), Matousek-Sharir-Welzl (1992)
Start with an arbitrary strategy $\sigma$ for MAX
Choose a random vertex $i \in V_{MAX}$
Find the optimal strategy $\sigma'$ for MAX in the game in which the only outgoing edge of $i$ is $(i, \sigma(i))$
If switching $\sigma'$ at $i$ is not profitable, then $\sigma'$ is optimal
Otherwise, let $\sigma \leftarrow (\sigma')^i$, the strategy $\sigma'$ switched at $i$, and repeat
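The recursion can be sketched as follows. This is an illustration under stated assumptions, not the analysis-ready algorithm: strategies are evaluated approximately by value iteration against a best-responding min, a switch counts as profitable when the other successor has a strictly larger value, and the game encoding, the set `free` of MAX vertices still open to change, and all names are hypothetical.

```python
import random

# Hypothetical encoding: name -> ("SINK", value) or (kind, successor, successor).
GAME = {
    "one": ("SINK", 1),
    "zero": ("SINK", 0),
    "r": ("RAND", "one", "zero"),
    "q": ("RAND", "one", "r"),
    "m": ("MIN", "q", "one"),
    "x": ("MAX", "r", "m"),
    "y": ("MAX", "zero", "x"),
}
EPS = 1e-9


def evaluate(game, sigma, rounds=200):
    """Approximate values when MAX plays sigma and min responds optimally."""
    v = {n: float(node[1]) if node[0] == "SINK" else 0.0
         for n, node in game.items()}
    for _ in range(rounds):
        new = {}
        for name, node in game.items():
            kind = node[0]
            if kind == "SINK":
                new[name] = float(node[1])
            elif kind == "MAX":
                new[name] = v[sigma[name]]
            elif kind == "MIN":
                new[name] = min(v[node[1]], v[node[2]])
            else:  # RAND
                new[name] = (v[node[1]] + v[node[2]]) / 2
        v = new
    return v


def ludwig(game, sigma, free, rng):
    """Best strategy among those that agree with sigma outside `free`."""
    if not free:
        return dict(sigma)
    i = rng.choice(sorted(free))                   # random MAX vertex
    best = ludwig(game, sigma, free - {i}, rng)    # i keeps the edge (i, sigma(i))
    v = evaluate(game, best)
    _, a, b = game[i]
    alt = b if best[i] == a else a
    if v[alt] <= v[best[i]] + EPS:                 # switching at i is not profitable
        return best
    switched = dict(best)
    switched[i] = alt
    return ludwig(game, switched, free, rng)       # repeat from the switched strategy


result = ludwig(GAME, {"x": "r", "y": "zero"}, {"x", "y"}, random.Random(0))
print(result)
```

Termination relies on each switch strictly increasing the value, so no strategy is visited twice.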
27
A randomized subexponential algorithm for binary SSGs
Ludwig (1995), Kalai (1992), Matousek-Sharir-Welzl (1992)
[Figure: the MAX vertices; labels: "All correct!", "Would never be switched!"]
There is a hidden order of the MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices $1, 2, \ldots, i$
28
The hidden order
Let $u_i$ be the sum of values of the optimal strategy of MAX that agrees with $\sigma$ on $i$
Order the vertices such that $u_1 \le u_2 \le \cdots$
Positions $1, \ldots, i$ were switched and would never be switched again
29
The hidden order
$u_i(\sigma)$: the maximum sum of values of a strategy of MAX that agrees with $\sigma$ on $i$
30
The hidden order
Order the vertices such that $u_1(\sigma) \le u_2(\sigma) \le \cdots$
Positions $1, \ldots, i$ were switched and would never be switched again
31
SSGs are LP-type problems, Halman (2002)
General (non-binary) SSGs can be solved in time
Independently observed by Björklund-Sandberg-Vorobyov (2005)
AUSO: Acyclic Unique Sink Orientations
32
SSGs reduce to GPLCP
Gärtner-Rüst (2005), Björklund-Svensson-Vorobyov (2005)
GPLCP: Generalized Linear Complementarity Problem with a P-matrix
33
A deterministic subexponential algorithm for
parity games
Mike Paterson, Marcin Jurdzinski, Uri Zwick
34
Parity Games (PGs): a simple example
[Figure: game graph with vertex priorities 2, 3, 2, 1, 4, 1]
EVEN wins if the largest priority seen infinitely often is even
35
Parity Games (PGs) $\to$ Mean Payoff Games (MPGs)
Stirling (1993), Puri (1995)
Replace priority $k$ by payoff $(-n)^k$
Move payoffs to outgoing edges
36
Exponential algorithm for PGs, McNaughton (1993), Zielonka (1998)
Vertices of highest priority (even)
First recursive call
Second recursive call
In the worst case, both recursive calls are on games of size $n-1$
Vertices from which EVEN can force the game to enter A
37
Exponential algorithm for PGs, McNaughton (1993), Zielonka (1998)
Vertices of highest priority (even)
First recursive call
Vertices from which EVEN can force the game to enter A
Lemma (i) (ii)
38
Exponential algorithm for PGs, McNaughton (1993), Zielonka (1998)
Second recursive call
In the worst case, both recursive calls are on games of size $n-1$
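The recursive scheme on these slides is usually presented as below: take the vertices of highest priority, compute the set from which that priority's owner can force the play into them (the attractor), solve the rest recursively, and make a second recursive call only if the opponent wins somewhere. This is a sketch of the textbook formulation, assuming every vertex has a successor inside each subgame; the example game, player numbering (0 for EVEN, 1 for ODD), and names are illustrative.

```python
# Hypothetical parity game: owner (0 = EVEN, 1 = ODD), priority, adjacency lists.
OWNER = {"a": 0, "b": 1, "c": 0}
PRIORITY = {"a": 2, "b": 1, "c": 3}
EDGES = {"a": ["a", "b"], "b": ["a", "c"], "c": ["c"]}


def attractor(vertices, edges, owner, player, target):
    """Vertices of the subgame from which `player` can force the play into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            succ = [w for w in edges[v] if w in vertices]
            if owner[v] == player:
                forced = any(w in attr for w in succ)
            else:
                forced = all(w in attr for w in succ)
            if forced:
                attr.add(v)
                changed = True
    return attr


def zielonka(vertices, edges, owner, priority):
    """Return (W0, W1), the winning regions of EVEN and ODD in the subgame."""
    if not vertices:
        return set(), set()
    top_priority = max(priority[v] for v in vertices)
    p = top_priority % 2                            # player favoured by the top priority
    top = {v for v in vertices if priority[v] == top_priority}
    a = attractor(vertices, edges, owner, p, top)
    first = zielonka(vertices - a, edges, owner, priority)      # first recursive call
    if not first[1 - p]:                            # opponent wins nowhere
        won = (set(vertices), set())
        return won if p == 0 else won[::-1]
    b = attractor(vertices, edges, owner, 1 - p, first[1 - p])
    second = list(zielonka(vertices - b, edges, owner, priority))  # second call
    second[1 - p] |= b
    return second[0], second[1]


print(zielonka(set(OWNER), EDGES, OWNER, PRIORITY))
```

In the example, EVEN wins from a by looping on priority 2, while ODD wins from b and c by moving to the priority-3 self-loop at c.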
39
Deterministic subexponential algorithm for PGs
Jurdzinski, Paterson, Zwick (2006)
Idea: look for small dominions!
Second recursive call
Dominions of size $s$ can be found in $O(n^s)$ time
Dominion: a (small) set from which one of the players can win without the play ever leaving this set
40
Open problems
  • Polynomial algorithms?
  • Is the Policy Improvement algorithm polynomial?
  • Faster subexponential algorithms for parity
    games?
  • Deterministic subexponential algorithms for MPGs
    and SSGs?
  • Faster pseudo-polynomial algorithms for MPGs?