Title: Structured Models for Multi-Agent Interactions
1Structured Models for Multi-Agent Interactions
- Daphne Koller
- Stanford University
Joint work with Brian Milch, U.C. Berkeley
2Scaling Up
- Question
- Modeling and solving small games is already hard
- How can we scale up to larger ones?
- Answer
- Real-world situations have a lot of structure
- Otherwise people wouldn't be able to handle them
- Goal: construct
- languages based on structured representations, allowing compact models of complex situations
- algorithms that exploit this structure to support effective reasoning
3Representations of Games
[Figure: payoff matrix indexed by strategies of player I and strategies of player II]
- Normal form
- basic units: strategies
- game representation loses all structure
- matrix size exponentially larger than game tree
- Extensive form
- basic units: events
- game structure explicitly encodes time, information
- game tree size can still be very large
4Representation & Inference
Minimax linear program for two-player zero-sum games
Applied to abstract 2-player poker [Koller & Pfeffer]
[Figure: solution time (sec) vs. size of tree]
Romanovskii, 1962; Koller, Megiddo & von Stengel, 1994
5MAID Representation
- MAID form
- basic units: variables & dependencies between them
- game structure explicitly encodes time, information, independence
- can be exponentially smaller than game tree
- game structure supports new forms of decomposition & backward induction
- solving can be exponentially more efficient than extensive form
6Outline
- Probabilistic Reasoning: Bayesian networks
- Pearl, Jensen, ...
- Influence Diagrams
- Strategic Relevance
- Exploiting Structure for Solving Games
7Probability Distributions
- Probabilistic model (e.g., a la Savage)
- set of possible states in which the world can be
- probability distribution over this space.
- State: assignment of values to variables
- diseases, symptoms, predisposing factors, ...
- Problem
- $n$ variables $\Rightarrow 2^n$ states (or more)
- representing the joint distribution is infeasible.
8Bayesian Network
$P(A \mid B, E)$: a function $\mathit{Val}(B, E) \to \Delta(\mathit{Val}(A))$
[Figure: network over Earthquake, Burglary, Alarm, Newscast, PhoneCall]
nodes = random variables; edges = direct probabilistic influence
Network structure encodes conditional independencies: PhoneCall is independent of Burglary given Alarm
9BN Semantics: Probability Model
qualitative BN structure + local probability models = full joint distribution over domain
- Compact, natural representation
- nodes have $\le k$ parents $\Rightarrow$ $2^k n$ vs. $2^n$ parameters
- parameters natural and easy to elicit.
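The parameter comparison above can be checked with a few lines of arithmetic. This is a minimal sketch assuming binary variables; the function names are illustrative and not part of the talk.

```python
def bn_parameter_bound(n: int, k: int) -> int:
    """Upper bound 2^k * n on CPD entries when each of n binary nodes has at most k parents."""
    return n * 2 ** k


def full_joint_entries(n: int) -> int:
    """Number of entries 2^n in an explicit joint table over n binary variables."""
    return 2 ** n


if __name__ == "__main__":
    # Illustrative (n, k) pairs; any values can be substituted.
    for n, k in [(10, 3), (20, 3), (50, 5)]:
        print(n, k, bn_parameter_bound(n, k), full_joint_entries(n))
```

The bound grows linearly in n for fixed k, while the explicit joint doubles with every added variable.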
10BN Semantics: Independencies
- The graph structure of the BN implies a set of conditional independence assumptions
- satisfied by every distribution over this graph
Burglary and Earthquake independent
Burglary and Call independent given Alarm
Newscast and Alarm independent given Earthquake
11BN Semantics: Dependencies
- BN structure also specifies potential dependencies
- those that might hold for some distribution over graph
- Burglary and Earthquake dependent given Alarm
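The independence and dependence statements for the alarm network can be verified numerically by brute-force enumeration. The sketch below assumes the usual edges (Burglary and Earthquake into Alarm, Earthquake into Newscast, Alarm into PhoneCall). All CPD numbers are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical CPD numbers (assumptions, not taken from the talk).
P_B = 0.01                                                       # P(Burglary = 1)
P_E = 0.02                                                       # P(Earthquake = 1)
P_A = {(0, 0): 0.001, (0, 1): 0.3, (1, 0): 0.9, (1, 1): 0.95}    # P(Alarm = 1 | B, E)
P_N = {0: 0.001, 1: 0.8}                                         # P(Newscast = 1 | E)
P_C = {0: 0.05, 1: 0.7}                                          # P(PhoneCall = 1 | A)

VARS = ("B", "E", "A", "N", "C")


def bern(p, x):
    """Probability that a binary variable with P(1) = p takes value x."""
    return p if x == 1 else 1.0 - p


def joint(b, e, a, n, c):
    """Chain rule of the network: product of the local probability models."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_N[e], n) * bern(P_C[a], c))


def prob(**fixed):
    """Probability of a partial assignment, by summing the joint over all worlds."""
    total = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        world = dict(zip(VARS, values))
        if all(world[name] == value for name, value in fixed.items()):
            total += joint(*values)
    return total


def cond(query, given):
    """Conditional probability P(query | given); the two dicts must not share keys."""
    return prob(**query, **given) / prob(**given)


if __name__ == "__main__":
    print(cond({"B": 1}, {"E": 1}), prob(B=1))                        # independent
    print(cond({"C": 1}, {"A": 1, "B": 1}), cond({"C": 1}, {"A": 1}))  # independent given Alarm
    print(cond({"B": 1}, {"A": 1}), cond({"B": 1}, {"A": 1, "E": 1}))  # dependent given Alarm
```

The last line illustrates "explaining away": once the alarm is known to have sounded, learning of an earthquake lowers the probability of a burglary.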
12Active paths
[Figure: example network over nodes A, B, C, D]
- B, C can be dependent
- B, C are independent given A
- B, C can be dependent given A, D
- Probabilistic influence flows along active paths
- d-separation if there is no active path
Simple linear-time algorithm for testing conditional independence using only graphical structure
- Sound: d-separation $\Rightarrow$ independence for all $P$
- Complete: no d-separation $\Rightarrow$ dependence for almost all $P$
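One standard way to realize such a linear-time test is a reachability search over (node, direction) pairs. The sketch below is an illustration of that textbook procedure, not necessarily the algorithm the talk had in mind. The example graph (A into B and C, B and C into D) is an assumption chosen to match the three statements above.

```python
from collections import defaultdict


def reachable(parents, source, observed):
    """Nodes reachable from `source` along active paths, given the set `observed`."""
    children = defaultdict(set)
    for node, pars in parents.items():
        for p in pars:
            children[p].add(node)

    # Phase 1: the observed nodes together with all of their ancestors.
    ancestors, stack = set(), list(observed)
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parents.get(node, ()))

    # Phase 2: search over (node, direction) pairs.
    # "up" means we arrived from a child, "down" means we arrived from a parent.
    visited, result = set(), set()
    stack = [(source, "up")]
    while stack:
        node, direction = stack.pop()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in observed:
            result.add(node)
        if direction == "up" and node not in observed:
            stack.extend((p, "up") for p in parents.get(node, ()))
            stack.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in observed:
                stack.extend((c, "down") for c in children[node])
            if node in ancestors:  # v-structure activated by an observed descendant
                stack.extend((p, "up") for p in parents.get(node, ()))
    return result


def d_separated(parents, x, y, observed):
    """True if there is no active path between x and y given `observed`."""
    return y not in reachable(parents, x, set(observed))


# Assumed example graph: A -> B, A -> C, B -> D, C -> D.
EXAMPLE = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

if __name__ == "__main__":
    print(d_separated(EXAMPLE, "B", "C", []))          # active path through A
    print(d_separated(EXAMPLE, "B", "C", ["A"]))       # blocked by observing A
    print(d_separated(EXAMPLE, "B", "C", ["A", "D"]))  # re-activated through D
```

Each (node, direction) pair is visited at most once, so the running time is linear in the size of the graph.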
13CPCS
$\approx 2^{1000}$ states
14Bayesian Networks
- Explicit representation of domain structure
- Cognitively intuitive compact models of complex domains
- Same model allows relevant probabilities to be computed in any evidence state
- Algorithms that exploit structure for effective inference even in very large models
15Outline
- Probabilistic Reasoning: Bayesian networks
- Influence Diagrams
- Howard, Shachter, Jensen, ...
- Strategic Relevance
- Exploiting Structure for Solving Games
16Example: The Tree Killer
- Alice wants a patio, but the benefit outweighs the cost only if she gets an ocean view
- Bob's tree blocks her view
- Alice chooses whether to poison the tree
- Tree may become sick
- Bob chooses whether to call a tree doctor
- Alice can see whether tree doctor comes
- Alice chooses whether to build her patio
- Tree may die when winter comes
17Standard Representation: Game Tree
Poison Tree?
Tree Sick?
Call Tree Doctor?
Build Patio?
Tree Dead?
5 levels $\Rightarrow 2^5 = 32$ terminal nodes
18Multi-Agent Influence Diagrams (MAIDs)
Influence diagram representation easily extended to multiple agents
[Figure: Tree Killer example MAID with nodes Spike Tree, Tree Doctor, Build Patio, TreeSick, TreeDead, Cost, View, Tree]
19MAIDs $\to$ Trees
- Same idea as for single-agent IDs
- Information is different for different agents
20Decision Nodes
- Incoming edges are information edges
- variables whose values the agent knows when deciding
- agent's strategy can depend on values of parents
- Each parent instantiation $u \in \mathit{Val}(\mathit{Parents}(D))$ is an information set
- Perfect recall: if $D_1$ precedes $D_2$, then at $D_2$ the agent remembers
- his decision at $D_1$
- everything he knew at $D_1$
- formally: $\{D_1\} \cup \mathit{Parents}(D_1) \subseteq \mathit{Parents}(D_2)$
- usually perfect recall edges are implicit, not drawn
[Figure: nodes Spike Tree, TreeSick, Tree Doctor, Build Patio]
21Strategies
- Strategy $\sigma$ at $D$
- A pure (deterministic) strategy specifies an action at $D$ for every information set $u$
- A behavior strategy specifies a distribution over actions for every $u$
- Strategy $\sigma$ specifies distribution $P_\sigma(D \mid \mathit{Parents}(D))$
- turns a decision node into a chance node
- information parents play exactly the same role as parents of chance node
22MAID Semantics
- MAID $M$ defines a set of possible strategy profiles
- $M$ plus any strategy profile $\sigma$ defines a BN $M_\sigma$
- Each decision node $D$ becomes a chance node, with $\sigma_D$ as its CPD
- $M_\sigma$ defines a probability distribution, from which we can derive an expected utility for each agent
- Thus, a MAID defines a mapping from strategy profiles to expected utility vectors
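The mapping from strategy profiles to expected utility vectors can be sketched by enumeration on a toy model. Everything below is a hypothetical example: the three-variable card game, its payoffs, and the dictionary layout are assumptions made for illustration. Utilities are written as plain functions of all variables rather than as separate utility nodes.

```python
from itertools import product

# Hypothetical toy MAID: one chance node and two decision nodes.
DOMAINS = {"Deal": ("high", "low"), "Bet1": ("bet", "pass"), "Bet2": ("call", "fold")}
ORDER = ("Deal", "Bet1", "Bet2")                         # a topological order
PARENTS = {"Deal": (), "Bet1": ("Deal",), "Bet2": ("Bet1",)}
CHANCE_CPDS = {"Deal": {(): {"high": 0.5, "low": 0.5}}}


def utility_agent1(deal, bet1, bet2):
    """Illustrative payoff to agent 1; agent 2 receives the negation."""
    if bet1 == "pass":
        return 0.0
    if bet2 == "fold":
        return 1.0
    return 2.0 if deal == "high" else -2.0


UTILITIES = {1: utility_agent1, 2: lambda *world: -utility_agent1(*world)}


def induced_bn(strategy_profile):
    """Each decision node becomes a chance node with its decision rule as CPD."""
    cpds = dict(CHANCE_CPDS)
    cpds.update(strategy_profile)
    return cpds


def expected_utilities(strategy_profile):
    """Expected utility of every agent under the induced Bayesian network."""
    cpds = induced_bn(strategy_profile)
    eu = {agent: 0.0 for agent in UTILITIES}
    for values in product(*(DOMAINS[v] for v in ORDER)):
        world = dict(zip(ORDER, values))
        p = 1.0
        for var in ORDER:
            pa = tuple(world[q] for q in PARENTS[var])
            p *= cpds[var][pa][world[var]]
        for agent, u in UTILITIES.items():
            eu[agent] += p * u(*values)
    return eu


# One strategy profile: a rule for Bet1 (pure on "high", mixed on "low") and one for Bet2.
SIGMA = {
    "Bet1": {("high",): {"bet": 1.0, "pass": 0.0},
             ("low",): {"bet": 0.5, "pass": 0.5}},
    "Bet2": {("bet",): {"call": 0.5, "fold": 0.5},
             ("pass",): {"call": 0.0, "fold": 1.0}},
}

if __name__ == "__main__":
    print(expected_utilities(SIGMA))
```

Each information set of a decision node is one key of its rule (a tuple of parent values), which is exactly the shape of a CPD for a chance node.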
23Readability
[Figure: poker MAID with nodes P1 Hand, P2 Hand, Flop Cards, Card 4, and repeated Bet decision nodes]
24Compactness
[Figure: Road example MAID with nodes Suitability, Building, and Util for each of 1W, 1E, 2W, 2E, 3W, 3E]
25Compactness
- Assume all variables have three values
- Each decision node observes three variables
- Number of information sets per agent: $3^3 = 27$
- Size of MAID: $\approx 54n$
- $n$ chance nodes of size 3
- $n$ decision nodes of size $27 \cdot 3$
- Size of game tree: $\approx 3^{2n}$
- $2n$ splits, each over three values
- Size of normal (matrix) form: $\approx (3^{27})^n$
- $n$ players, each with $3^{27}$ pure strategies
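The three growth rates can be tabulated directly. This is a rough sketch using the per-node sizes listed above; the function names are illustrative. The MAID count below adds up raw table entries (3 per chance node, 27 times 3 per decision node). That is a different counting convention from the slide's 54n summary, but both are linear in n.

```python
def maid_table_entries(n: int) -> int:
    """n chance nodes with 3 entries each plus n decision nodes with 27 * 3 entries each."""
    return n * 3 + n * 27 * 3


def game_tree_leaves(n: int) -> int:
    """2n splits, each over three values."""
    return 3 ** (2 * n)


def normal_form_profiles(n: int) -> int:
    """n players, each with 3**27 pure strategies."""
    return (3 ** 27) ** n


if __name__ == "__main__":
    for n in (1, 2, 4, 40):
        print(n, maid_table_entries(n), game_tree_leaves(n),
              len(str(normal_form_profiles(n))), "digits in matrix size")
```

For n = 4 the tree formula gives 6561, and for n = 40 it gives roughly 1.47e38. The MAID description stays in the hundreds or low thousands of entries.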
26Outline
- Probabilistic Reasoning: Bayesian networks
- Influence Diagrams
- Strategic Relevance
- Exploiting Structure for Solving Games
27Optimality and Equilibrium
- Let $E$ be a subset of $D_a$, and let $\sigma_E$ be a partial strategy over $E$
- Is $\sigma_E$ the best partial strategy for agent $a$ to adopt?
- Depends on decision rules for other decision nodes
- $\sigma_E$ is optimal for a strategy profile $\sigma$ if for all partial strategies $\sigma'_E$ over $E$: $EU_a((\sigma_{-E}, \sigma_E)) \ge EU_a((\sigma_{-E}, \sigma'_E))$
- A strategy profile $\sigma$ is a Nash equilibrium if for every agent $a$, $\sigma_{D_a}$ is optimal for $\sigma$
28MAIDs and Games
- A MAID is equivalent to a game tree: it defines a mapping from strategy profiles to payoff vectors
- Finding equilibria in the MAID is equivalent to finding equilibria in the game tree
- One way to find equilibrium in MAID
- construct the game tree
- solve the game
- Incurs exponential blowup in representation size
- Question: can we find equilibria in a MAID directly?
29Local Optimization
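- Consider finding a decision rule for a single decision node $D$ that is optimal for $\sigma$
- For each instantiation $pa$ of $Pa(D)$, must find a distribution $P$ over $D$ that maximizes $\sum_d P(d) \sum_U \sum_u P_\sigma(U = u \mid d, pa)\, u$, where $U$ ranges over the agent's utility nodes
- Some decision rules in $\sigma$ may not affect this maximization problem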
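Local optimization of a single decision rule can be sketched on a toy card game. The game, its payoffs, and all names below are hypothetical. The second player observes only the first player's bet, so the best response at each information set depends on a posterior over the deal. That posterior in turn depends on the first player's decision rule.

```python
P_DEAL = {"high": 0.5, "low": 0.5}          # hypothetical prior over the deal
ACTIONS = ("call", "fold")

# Two hypothetical decision rules for the first player: P(bet1 | deal).
BLUFFING_RULE = {"high": {"bet": 1.0, "pass": 0.0},
                 "low": {"bet": 0.5, "pass": 0.5}}
HONEST_RULE = {"high": {"bet": 1.0, "pass": 0.0},
               "low": {"bet": 0.0, "pass": 1.0}}


def u2(deal, bet1, bet2):
    """Illustrative payoff to the second player."""
    if bet1 == "pass":
        return 0.0
    if bet2 == "fold":
        return -1.0
    return -2.0 if deal == "high" else 2.0


def optimal_rule_for_bet2(bet1_rule):
    """For each observed bet1, pick the action with the highest expected utility."""
    rule = {}
    for bet1 in ("bet", "pass"):            # each instantiation of the parents
        # Unnormalised posterior over Deal given the observed bet1.
        weight = {deal: P_DEAL[deal] * bet1_rule[deal][bet1] for deal in P_DEAL}
        if sum(weight.values()) == 0:
            rule[bet1] = ACTIONS[0]         # unreachable information set: any action
            continue
        # Normalising would not change the argmax, so the weights are used directly.
        scores = {d: sum(weight[deal] * u2(deal, bet1, d) for deal in P_DEAL)
                  for d in ACTIONS}
        rule[bet1] = max(ACTIONS, key=scores.get)
    return rule


if __name__ == "__main__":
    print(optimal_rule_for_bet2(BLUFFING_RULE))
    print(optimal_rule_for_bet2(HONEST_RULE))
```

Changing the first player's rule flips the optimal response to a bet from "call" to "fold". In this toy game that rule does affect the maximization problem.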
30Strategic Relevance
- Intuitively, $D$ relies on $D'$ if we need to know the decision rule for $D'$ in order to determine the optimal decision rule for $D$.
- We define a relevance graph, with
- a node for each decision
- an edge from $D$ to $D'$ if $D$ relies on $D'$
[Figure: edge $D \to D'$]
31Examples I: Information
32Examples II: Simple Card Game
[Figure: MAID with nodes Deal, Bet1, Bet2]
- Bet2 relies on Bet1 even though Bet2 observes Bet1
- Bet1 can depend on Deal
- Deal influences U
- Need probability model of Bet1 to derive posterior on Deal and compute expectation over U
Decision $D$ can require $D'$ even if $D'$ is observed at $D$!
33Examples III: Decoupled Utilities
[Figure: nodes Deal, Bet1, Bet2, and utility nodes U]
- Bet2 relies on Bet1 even without influence on utility
- Bet1 can depend on Deal
- Deal influences U
- Need probability model of Bet1 to derive posterior on Deal and compute expectation over U
34Examples IV: Tree Killer
[Figure: Tree Killer example; node labels: Poison Tree, Tree Doctor, Build Patio, TreeSick, Cost, View, TreeDead, Tree]
35s-Reachability
[Figure: nodes $D$, $D'$, $U$]
$D$ relies on $D'$ ($D'$ relevant to $D$): the CPD of $D'$ influences $P(U \mid D, Pa(D))$
- $D'$ is s-reachable from $D$ if there is some utility node $U$ among the descendants of $D$, such that if a new parent $\hat{D'}$ were added to $D'$, there would be an active path from $\hat{D'}$ to $U$ given $D$ and $Pa(D)$.
36s-Reachability
Nodes that $D$ relies on are the nodes that are s-reachable from $D$.
Theorem: s-reachability is sound & complete for strategic relevance
- Sound: no s-reachability $\Rightarrow$ strategic irrelevance $\forall P, U$
- Complete: s-reachability $\Rightarrow$ relevance for some $P, U$
Theorem: Can build the relevance graph in quadratic time using only structure of MAID
37Outline
- Probabilistic Reasoning: Bayesian networks
- Influence Diagrams
- Strategic Relevance
- Exploiting Structure for Solving Games
38Intuition: Backward Induction
- $D'$ observes $D$
- Can optimize decision rule at $D'$ without knowing decision rule at $D$
- Having optimized $D'$, can optimize $D$
- $D$ doesn't care about $D'$
- Can optimize decision rule at $D$ without knowing decision rule at $D'$
- Having optimized $D$, can optimize $D'$
39Generalized Backward Induction
- Idea: Solve decisions by order of relevance graph
- Generalized Backward Induction
- Choose decision node $D$ that relies on no other
- Find optimal strategy for $D$ by maximizing its local expected utility
- Replace $D$ by chance node
40Finding Equilibria: Acyclic Relevance Graphs
[Figure: relevance graph over $D_1, D_2, \ldots, D_{n-1}, D_n$]
- Choose any strategy profile $\sigma$ for $D_1, \ldots, D_{n-1}$
- Derive decision rule $\delta$ for $D_n$ that is optimal for $\sigma$
- Node $D_n$ does not rely on preceding ones
- $\delta$ is optimal for any other strategy profile as well!
- We can now set $\delta$ as CPD for $D_n$
- And continue by optimizing $D_{n-1}$
41Generalized Backward Induction
- Given topological sort $D_1, \ldots, D_n$ of relevance graph
- Begin with arbitrary fully mixed strategy profile $\sigma$
- For $i = n$ down to 1
- Find decision rule $\delta$ for $D_i$ that is optimal for $\sigma$
- Decision rules at previous decisions fixed earlier
- Decision rules at subsequent decisions irrelevant
- Let $\sigma(D_i) \leftarrow \delta$
Theorem: If the relevance graph of a MAID is acyclic, it can be solved by generalized backward induction, and the result is a pure-strategy Nash equilibrium
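The control flow of generalized backward induction on an acyclic relevance graph can be sketched with the standard-library `graphlib` module (Python 3.9+). The relevance graph and the `optimize` callback below are hypothetical placeholders; computing the locally optimal decision rule itself is not shown.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def optimization_order(relies_on):
    """Order decisions so that every decision comes after all decisions it relies on.

    relies_on[D] is the set of decisions D relies on (the out-edges of D in the
    relevance graph). TopologicalSorter treats these as predecessors, so decisions
    that rely on nothing come first. A cyclic graph raises graphlib.CycleError.
    """
    return list(TopologicalSorter(relies_on).static_order())


def generalized_backward_induction(relies_on, sigma, optimize):
    """Replace decision rules one at a time, in relevance order.

    `optimize(decision, sigma)` is a caller-supplied (hypothetical) function that
    returns a decision rule for `decision` that is optimal for the current `sigma`.
    """
    sigma = dict(sigma)
    for decision in optimization_order(relies_on):
        sigma[decision] = optimize(decision, sigma)
    return sigma


# Hypothetical acyclic relevance graph: D1 relies on D2 and D3, D2 relies on D3.
EXAMPLE_GRAPH = {"D1": {"D2", "D3"}, "D2": {"D3"}, "D3": set()}

if __name__ == "__main__":
    print(optimization_order(EXAMPLE_GRAPH))
```

In the slide's indexing this corresponds to processing $D_n$ first and $D_1$ last: a decision is optimized only after every decision it relies on has been fixed.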
42When is the Relevance Graph Acyclic?
- Single-agent influence diagrams with perfect recall
- Multi-agent games with perfect information
- Some games with imperfect information
- e.g., Tree Killer example
But in many MAIDs the relevance graph has cycles
43Cyclic Relevance Graphs
Question: What if the relevance graph is cyclic?
- Strongly connected component (SCC)
- maximal subgraph s.t. $\exists$ directed path between every pair of nodes
- The decisions in each SCC require each other
- They must be optimized together
- Different SCCs can be solved separately
44Generalized Backward Induction
- Given topological sort $C_1, \ldots, C_m$ of SCCs in relevance graph
- Begin with arbitrary fully mixed strategy profile $\sigma$
- For $i = m$ down to 1
- Construct reduced MAID $M_{\sigma_{-C_i}}$
- Strategies for previous SCCs selected before
- Strategies for subsequent SCCs irrelevant
- Create game tree for $M_{\sigma_{-C_i}}$
- Use game solver to find equilibrium strategy profile $\tau$ for $C_i$ in this reduced game
- Let $\sigma(C_i) \leftarrow \tau$
Theorem: If find equilibrium for each SCC, the result is equilibrium for whole game
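The SCC ordering step can be sketched with Tarjan's algorithm, which emits components so that each one appears after every component it has edges into. With an edge from D to D' meaning "D relies on D'", that is the order in which the components should be solved. The example graph is hypothetical. Building and solving the reduced game for each component is left out.

```python
def strongly_connected_components(relies_on):
    """Tarjan's algorithm; returns SCCs with relied-upon components first."""
    index, low = {}, {}
    stack, on_stack, sccs = [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in relies_on.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:              # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(component)

    for v in relies_on:
        if v not in index:
            visit(v)
    return sccs


# Hypothetical relevance graph with two 2-cycles and one singleton component.
EXAMPLE_GRAPH = {
    "A1": {"A2"},
    "A2": {"A1", "B1"},
    "B1": {"B2"},
    "B2": {"B1"},
    "C": {"A1"},
}

if __name__ == "__main__":
    for component in strongly_connected_components(EXAMPLE_GRAPH):
        print(sorted(component))
```

This recursive version is adequate for small relevance graphs. A very deep graph would need an iterative variant to stay within Python's recursion limit.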
45Road Relevance Graph
[Figure: relevance graph over 1W, 1E, 2W, 2E, 3W, 3E]
Note: Reduced games over SCCs are not subgames!
46Experiment: Road Example
Reminder, for $n = 4$: tree size 6561 nodes, matrix size $4.7 \times 10^{27}$
For $n = 40$: tree size $1.47 \times 10^{38}$ nodes
47Cutting Cycles
- Idea: enumerate possible values $d$ for some decision $D$
- Once we determine $D$, residual MAID has acyclic relevance graph
- Solve residual MAID using generalized backward induction
- Check whether combined strategy with $d$ is an equilibrium
- May need to instantiate several decision nodes to cut cycle
- Can deal with each SCC separately
Theorem: Can find all pure strategy equilibria in time linear in # of SCCs, exponential in max # of decisions required to cut all loops in component
48Irrelevant Information
What if B can observe A's decision? It is completely irrelevant to him.
[Figure: influence diagram with nodes Cost, Resource, A Sales, B Sales, Commission, Commission, Sales-A, Sales-B, Revenue]
- We can automatically
- analyze relevance based on graph structure
- eliminate irrelevant information edges
- In associated tree, safe merging of information sets
- Leads to exponential decrease in # of decisions to optimize in influence diagram!
49Related Work
- Suryadi and Gmytrasiewicz (1999) use multi-agent influence diagrams, but with recursive modeling
- Milch and Koller (2000) use the MAID representation described here, but have no algorithm for finding equilibria
- Nilsson and Lauritzen (2000) discuss limited memory influence diagrams (LIMIDs) and derive s-reachability, but do not apply it to multi-agent case
- La Mura (2000) proposes game networks, with an undirected notion of strategic dependence
50Future Work
- Take advantage of structure within SCCs
- Represent asymmetric scenarios compactly
- Detect irrelevant observations
51Computational Game Theory
Game theory: Past
- Expert analysis of
- Prototypical examples that highlight key issues
- Abstracted problems for big organizations
- Simplified examples
- small enough to be analyzed by hand
Game theory: Future
- Autonomous agents interacting economically
- Decision support systems for consumers
- Complex problems
- many relevant variables
- interacting decisions
52Conclusions
- Multi-agent influence diagrams
- compact intuitive language for multi-agent interactions
- basic units: variables rather than strategies or events
- MAIDs make explicit structure that is lost in game trees
- Can exploit structure to find equilibria efficiently
- sometimes exponentially faster than existing algorithms
- Exciting question
- What else does structure buy us?
53http://robotics.stanford.edu/koller koller_at_cs.stanford.edu