Title: Asymptotic Analysis for Large Scale Dynamic Stochastic Games
1. Asymptotic Analysis for Large Scale Dynamic Stochastic Games
- Sachin Adlakha, Ramesh Johari, Gabriel Weintraub and Andrea Goldsmith
- DARPA ITMANET Meeting
- September 13-14, 2009.
2. Asymptotic Analysis for Large Scale Dynamic Stochastic Games
S. Adlakha, R. Johari, G. Weintraub, A. Goldsmith
- MAIN RESULT: Taxonomy of Stochastic Games
- HOW IT WORKS
- Existence results for the competitive model are based on continuity arguments.
- The AME property for a competitive model is derived from the fact that opponents at higher states lead to lower payoff.
- ASSUMPTIONS AND LIMITATIONS
- Mean field requires all nodes to interact with each other: applies to dense networks only.
- The coordination model requires a different existence proof.
- General framework for interaction of multiple devices.
- Further, our results
- provide a common thread to analyze both competitive and coordination models.
- provide exogenous conditions for existence and AME for competitive models.
- provide results on a special class of coordination models: linear quadratic tracking games.
Many cognitive radio models do not account for the reaction of other devices to a single device's action. In prior work, we developed a general stochastic game model to tractably capture interactions of many devices.
In principle, tracking the state of other devices is complex. We approximate the state of other devices via a mean field limit.
Provide existence and AME results for a general class of coordination games. Our main goal is to develop a related model that applies when a single node interacts with a small number of other nodes each period.
New paradigm for analyzing large scale competitive and coordination games.
3. Modeling Interaction between Devices
- Wireless spectrum sharing
- Nodes interact with each other.
- The environment for a single node consists of active devices.
- Nodes operate in a reactive environment.
- Markov Perfect Equilibrium (MPE)
- Standard solution concept for stochastic games.
- The action of each player depends on the state of everyone.
- Problems
- MPE is hard to compute.
- Requires excessive information exchange.
4. Oblivious Equilibrium (OE)
- Mean field equilibrium concept.
- Each device reacts to an average state of other players.
- Requires little information exchange.
- Easy to compute and implement.
- Questions
- When do such policies exist?
- How close is OE to MPE in terms of the payoff received by a device?
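The consistency idea behind a mean field equilibrium can be sketched in a few lines: the average state that each device reacts to must equal the average state that those reactions generate. The sketch below is a toy illustration only. The reaction rule `reaction`, the linear dynamics in `stationary_mean`, and all parameter values are hypothetical and are not taken from the slides; in particular, the reaction rule stands in for a best-response computation that is not performed here.

```python
def reaction(mean_state, c=1.0, b=0.5):
    """Hypothetical oblivious reaction: the action falls as the average opponent state rises."""
    return max(0.0, c - b * mean_state)


def stationary_mean(action, rho=0.5):
    """Long-run mean of the toy dynamics x' = rho * x + action + zero-mean noise."""
    return action / (1.0 - rho)


def solve_mean_field(damping=0.5, tol=1e-10, max_iter=1000):
    """Damped fixed-point iteration on the average state."""
    mean_state = 0.0
    for _ in range(max_iter):
        # Average state implied when everyone reacts to the current guess.
        implied = stationary_mean(reaction(mean_state))
        updated = (1.0 - damping) * mean_state + damping * implied
        if abs(updated - mean_state) < tol:
            return updated
        mean_state = updated
    raise RuntimeError("fixed-point iteration did not converge")


equilibrium_mean = solve_mean_field()
```

With these assumed parameters the fixed point solves m = (c - b m) / (1 - rho), i.e. m = c / (1 - rho + b) = 1. The damping is there because the undamped map has slope -1 for these values and would oscillate.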
5. Taxonomy of Stochastic Games
- Competitive models
- Non-cooperative games
- Payoff characterized by non-increasing differences between own state and opponent states (submodular structure).
- Opponents at higher states lead to lower payoff.
- Coordination models
- Cooperative games
- Payoff has increasing differences between own state and opponent state (supermodular structure).
- Payoff depends on how close nodes are to other players.
- Contribution
6. State of the Art
- Generalized the idea of OE to general stochastic games (Allerton 07).
- Unified existing models, such as LQG games, via our framework (CDC 08).
- Exogenous conditions for approximating MPE using OE for linear dynamics and separable payoffs (Allerton 08).
- Current Results
- Exogenous conditions on model primitives which:
- Prove the existence of an oblivious equilibrium for competitive models.
- Show that OE is close to MPE asymptotically for competitive models.
7. Our model
- $m$ players
- State of player $i$ is $x_i$; action of player $i$ is $a_i$.
- State evolution: [equation missing in source]
- Payoff: $\pi(x_i, a_i, f_{-i})$,
- where $f_{-i}$ is the empirical distribution of other players' states.
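To make the object $f_{-i}$ concrete, here is a minimal sketch that computes the empirical distribution of the other players' states. It assumes a finite, discrete state space and a plain Python list of states, neither of which is specified in the slides; the function name and the sample data are hypothetical.

```python
from collections import Counter


def empirical_distribution_excluding(states, i):
    """Return f_{-i}: the fraction of players other than i at each state."""
    others = states[:i] + states[i + 1:]
    if not others:
        raise ValueError("need at least two players")
    counts = Counter(others)
    return {state: count / len(others) for state, count in counts.items()}


states = [0, 2, 2, 1, 2]  # hypothetical states of m = 5 players
f_minus_0 = empirical_distribution_excluding(states, 0)
# Player 0 sees three of the four others at state 2 and one at state 1:
# f_minus_0 == {2: 0.75, 1: 0.25}
```

Note that $f_{-i}$ records only how many opponents sit at each state, not which opponent is where, so the payoff is unchanged when opponents swap states.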
8. Common Assumptions
- A1: The state transition function is concave in state and action and has decreasing differences in state and action.
- A2: For any action, [symbol missing in source] is a non-increasing function of state, a non-decreasing function of action, and has negative drift at zero action.
- A3: The payoff function is jointly concave in state and action and has decreasing differences in state and action.
- A4: The first derivative of the payoff function w.r.t. state becomes negative as the state increases.
These assumptions imply that the optimal policy
is non-increasing and asymptotically goes to zero.
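Assumptions A1 and A3 both rely on decreasing differences in state and action. As a rough illustration, the sketch below checks the property numerically on a grid: a function has decreasing differences if the gain from raising the state shrinks (weakly) as the action rises. The two test functions are hypothetical stand-ins, not the payoff or transition functions from the talk, and a grid check can only refute the property, not prove it for all points.

```python
from itertools import combinations


def has_decreasing_differences(f, xs, ys, tol=1e-12):
    """Check f(x2, y2) - f(x1, y2) <= f(x2, y1) - f(x1, y1) for all x1 < x2, y1 < y2 on the grid."""
    xs, ys = sorted(xs), sorted(ys)
    for x1, x2 in combinations(xs, 2):
        for y1, y2 in combinations(ys, 2):
            gain_at_high_y = f(x2, y2) - f(x1, y2)
            gain_at_low_y = f(x2, y1) - f(x1, y1)
            if gain_at_high_y > gain_at_low_y + tol:
                return False
    return True


grid = [0.0, 0.5, 1.0, 1.5, 2.0]


def submodular_example(x, a):
    """Hypothetical function with decreasing differences."""
    return -x * a


def supermodular_example(x, a):
    """Hypothetical function with increasing differences instead."""
    return x * a


ok = has_decreasing_differences(submodular_example, grid, grid)
not_ok = has_decreasing_differences(supermodular_example, grid, grid)
```

Here `ok` is `True` and `not_ok` is `False`; reversing the inequality in the check would give the corresponding test for increasing differences.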
9. Competitive Model - Assumptions
- A5: The payoff function has decreasing differences between state and $f_{-i}$ and between action and $f_{-i}$.
- Ordering relation on $f_{-i}$: first-order stochastic dominance.
- A6: The payoff decreases as $f$ increases. That is,
- if $f_1 \succeq f_2$, then $\pi(x, a, f_1) \le \pi(x, a, f_2)$.
- A7: The logarithm of the payoff is Gateaux differentiable w.r.t. $f_{-i}$.
- Define $g(y)$: [definition missing in source]
- A8: Assume that the payoff function is such that $g(y) = O(y^K)$ for some $K$.
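The ordering used in A5 and A6 can be illustrated for distributions over a finite ordered state grid: one distribution first-order stochastically dominates another when its CDF lies at or below the other's everywhere. The sketch below assumes states indexed 0..n-1 and uses a hypothetical payoff, `toy_payoff`, that falls with the mean opponent state; neither comes from the slides.

```python
from itertools import accumulate


def dominates(f1, f2, tol=1e-12):
    """True if f1 first-order stochastically dominates f2.

    f1 and f2 are probability vectors over the same ordered states 0..n-1.
    """
    if len(f1) != len(f2):
        raise ValueError("distributions must share the same state grid")
    cdf1 = accumulate(f1)
    cdf2 = accumulate(f2)
    return all(c1 <= c2 + tol for c1, c2 in zip(cdf1, cdf2))


def toy_payoff(x, a, f):
    """Hypothetical payoff that decreases in the mean opponent state."""
    mean_opponent_state = sum(state * prob for state, prob in enumerate(f))
    return x - a - mean_opponent_state


low = [0.5, 0.3, 0.2]   # opponents mostly at low states
high = [0.2, 0.3, 0.5]  # opponents mostly at high states

high_dominates_low = dominates(high, low)
low_dominates_high = dominates(low, high)
```

For this example `high` dominates `low`, and the toy payoff is lower against `high`, which is the direction A6 requires.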
10. Main Result: Competitive Model
- Under A1-A8, OE exists for competitive models and the OE payoff is approximately optimal over Markov policies, as $m \to \infty$.
- In other words, OE is approximately an MPE.
- The key point here is that no single player is overly influential and the true state distribution is close to the time average, so knowledge of other players' policies does not significantly improve payoff.
- Advantage
- Each player can use an oblivious policy without loss in performance.
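One way to write the "approximately optimal" claim is as a vanishing gain from deviating. The notation below is assumed for illustration and does not appear in the slides; a formal statement would also need to specify how the initial state is drawn and which class of deviations is allowed.

```latex
% Assumed notation (not from the slides):
%   V^{(m)}(x \mid \mu', \mu) : expected payoff in the m-player game to a player starting
%       at state x who uses policy \mu' while all other players use \mu
%   \tilde{\mu}^{(m)}         : oblivious equilibrium policy in the m-player game
%   the supremum ranges over Markov policies \mu'
\[
  \lim_{m \to \infty}
  \Big[ \sup_{\mu'} V^{(m)}\big(x \mid \mu', \tilde{\mu}^{(m)}\big)
        - V^{(m)}\big(x \mid \tilde{\mu}^{(m)}, \tilde{\mu}^{(m)}\big) \Big] = 0
\]
```

Read this way, the bracketed term is the most a single player could gain by tracking everyone's state instead of reacting to the average.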
11. Competitive vs. Coordination Models
- Competitive and coordination models have significant differences.
- Existence in competitive models comes from continuity arguments.
- For the coordination model, assumption A6 does not hold, so a different existence proof is required.
- This dichotomy exists even in single shot games.
- We have results for a special class of coordination games: linear quadratic tracking models (a generalization of the model by Caines et al.).
12. Main Contributions and Future Work
- Provide a common thread to analyze both competitive and coordination models.
- Provide exogenous conditions for existence and the AME property for the competitive model.
- Existence results are important for these models to be meaningful.
- Provide results on a special class of coordination models: linear quadratic tracking games.
- Future Work
- Provide exogenous conditions for existence and the AME property for general coordination games.
- Develop similar models where a single node interacts with a small set of nodes at each time period.
- Apply these models to interfering transmissions between energy-constrained nodes.