Using Potential Games to Design Distributed Optimisation Systems

Provided by: archiec

Transcript and Presenter's Notes

1
Using Potential Games to Design Distributed
Optimisation Systems

Archie Chapman
Intelligence, Agents, Multimedia
Electronics and Computer Science

2
Motivation

Interested in building distributed optimisation
techniques, using agents
No need to collect the information at a central
point
Fault tolerant, robust to failures
Reduced communication requirements
Flexible response in dynamic environments
Use game theory to design and analyse
self-organising agent systems

3
Overview

Game theory
Nash equilibrium
Correlated equilibrium
Potential games
Existence of Nash and correlated equilibria
Convergence of simple adaptive procedures
Potential games for distributed optimisation
Alignment of global and private utilities
Example: Graph colouring problem

4
A very short introduction to game theory (i)

A game is an interaction between two or more
self-interested agents
Each agent i chooses from a set of strategies, $S_i$
A (joint) strategy profile, $s$, is the set of
chosen strategies, also called an outcome of the
game
Each agent has a utility function, $u_i(s)$,
specifying their preference for each outcome in
terms of a payoff

5
A very short introduction to game theory (ii)

An agent's best response is the strategy with the
highest payoff, given its opponents' choice of
strategy
A Nash equilibrium is a strategy profile such
that every agent's strategy is a best response to
the others' choice of strategy
No agent has an incentive to change strategy,
given that everyone else plays the equilibrium
strategy
A correlated equilibrium is a commonly known
probability distribution over correlated signals
recommending a joint strategy profile
No agent has an incentive to go against their
recommendation, given that everyone else follows
their recommendation

6
Example: Stag Hunt game
7
Nash equilibrium
Two pure-strategy Nash equilibria: (cooperate,
cooperate) and (defect, defect)
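
The two equilibria can be checked by brute force. The sketch below is only illustrative: the payoff table did not survive in this transcript, so the numbers are inferred from the expected-payoff arithmetic on the later correlated equilibrium slides, and the game is assumed to be symmetric. The names `ROW`, `payoff` and `is_pure_nash` are hypothetical.

```python
from itertools import product

# Assumed row-player payoffs, keyed by (row strategy, column strategy).
# Inferred from the arithmetic on the correlated-equilibrium slides:
# u(C,C)=4, u(C,D)=0, u(D,C)=3, u(D,D)=2. The game is assumed symmetric.
ROW = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}
STRATEGIES = ("C", "D")  # C = hunt stag (cooperate), D = hunt hare (defect)


def payoff(player, profile):
    """Payoff to `player` (0 = row, 1 = column) at profile (row, column)."""
    row, col = profile
    return ROW[(row, col)] if player == 0 else ROW[(col, row)]


def is_pure_nash(profile):
    """True if no player gains from a unilateral deviation."""
    for player in (0, 1):
        for alternative in STRATEGIES:
            deviated = list(profile)
            deviated[player] = alternative
            if payoff(player, tuple(deviated)) > payoff(player, profile):
                return False
    return True


pure_nash = [p for p in product(STRATEGIES, repeat=2) if is_pure_nash(p)]
print(pure_nash)
```

Under these assumed payoffs the search returns exactly the two profiles named on the slide.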
8
Correlated equilibrium
Probability distribution over signals
Assume row player is told by the signal to
cooperate
Each agent receives a signal recommending only
its own strategy (e.g. cooperate), but knows
the joint probability distribution over others'
recommendations
Then the probability that the column player has
been told to cooperate is $\frac{1/2}{1/2 + 1/6} = \frac{3}{4}$
The expected payoff for following the
recommendation is $\frac{3}{4}(4) + \frac{1}{4}(0) = 3$, and for
not following the recommendation is
$\frac{3}{4}(3) + \frac{1}{4}(2) = 2\frac{3}{4}$
9
Correlated equilibrium
Now assume row player is told by the signal to
defect
Then the probability that the column player has
been told to cooperate is $\frac{1/6}{1/6 + 1/6} = \frac{1}{2}$
The expected payoff for following the
recommendation is $\frac{1}{2}(3) + \frac{1}{2}(2) = 2\frac{1}{2}$, and for
not following the recommendation is
$\frac{1}{2}(4) + \frac{1}{2}(0) = 2$
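
The incentive checks for both signals can be reproduced with exact fractions. This is a sketch under stated assumptions: the payoffs and the joint signal distribution (1/2 on cooperate-cooperate, 1/6 on each other pair) are inferred from the arithmetic on these slides, since the original tables are not in the transcript. Only the row player is checked; the assumed distribution and payoffs are symmetric, so the column player's check is identical.

```python
from fractions import Fraction

# Assumed row-player payoffs, keyed by (row strategy, column strategy).
U = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}

# Assumed joint distribution over (row signal, column signal).
SIGNAL = {
    ("C", "C"): Fraction(1, 2), ("C", "D"): Fraction(1, 6),
    ("D", "C"): Fraction(1, 6), ("D", "D"): Fraction(1, 6),
}


def conditional_payoff(told, played):
    """Row's expected payoff from `played`, given it was told `told`."""
    marginal = sum(p for (row, _), p in SIGNAL.items() if row == told)
    return sum(
        SIGNAL[(told, col)] / marginal * U[(played, col)] for col in "CD"
    )


follow_c = conditional_payoff("C", "C")   # told cooperate, cooperates
deviate_c = conditional_payoff("C", "D")  # told cooperate, defects
follow_d = conditional_payoff("D", "D")   # told defect, defects
deviate_d = conditional_payoff("D", "C")  # told defect, cooperates

print(follow_c, deviate_c, follow_d, deviate_d)
```

Following the recommendation is at least as good as deviating for each signal, which is the condition for a correlated equilibrium.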
10
Correlated equilibrium
The set of correlated equilibria contains the set
of Nash equilibria
See that the Nash equilibria of the Stag Hunt
game are correlated equilibria with the following
signal probabilities:
(cooperate, cooperate)
(defect, defect)
11
Distributed optimisation design questions

How do we ensure that the equilibrium outcome is
desirable/optimal? That is, how do we ensure
that the equilibrium corresponds to a good global
utility?
How do we ensure that the agents converge to an
equilibrium? What if there are many equilibria
in the game? Which one will emerge?
Often these problems can be addressed using
results from the class of potential games

12
Potential Games

An ordinal potential function, $P(s_i, s_{-i})$, for a
game is a function such that, for every player i,
$u_i(s_i, s_{-i}) - u_i(s_i', s_{-i}) > 0 \iff P(s_i, s_{-i}) - P(s_i', s_{-i}) > 0$,
i.e. the sign of the change in private utility to
a unilaterally deviating player is matched by the
sign of the change in the potential function
A game that admits a potential is called a
potential game

13
Example: Stag Hunt game
[Tables: Stag hunt game; Potential function]
14
Example: Stag Hunt game
[Tables: Stag hunt game; Potential function]
Start in the defecting equilibrium
and see what happens to the potential function if
the column player changes strategy
15
Example: Stag Hunt game
[Tables: Stag hunt game; Potential function]
The change in the deviator's utility
is matched by a change in the value of the
potential function:
$0 - 2 = 0 - 2$
16
Example: Stag Hunt game
[Tables: Stag hunt game; Potential function]
Now, if the row player were to respond
by moving to the cooperative equilibrium
17
Example: Stag Hunt game
[Tables: Stag hunt game; Potential function]
The change in the row player's
utility is matched by a change in the potential
function:
$4 - 3 = 1 - 0$
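
The two deviations walked through on these slides can be extended to every unilateral deviation. The sketch below assumes the payoffs used earlier (inferred from the slides' arithmetic) and a potential with values 2 at (defect, defect), 0 at (defect, cooperate) and 1 at (cooperate, cooperate), as the slides' differences suggest. The value 0 at (cooperate, defect) is an assumption filled in by symmetry. The function names are hypothetical.

```python
from itertools import product

STRATEGIES = ("C", "D")

# Assumed row-player payoffs, keyed by (row, column); game assumed symmetric.
ROW = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}

# Assumed potential, keyed by (row, column). P(C,D)=0 is filled in by symmetry.
POTENTIAL = {("C", "C"): 1, ("C", "D"): 0, ("D", "C"): 0, ("D", "D"): 2}


def payoff(player, profile):
    """Payoff to `player` (0 = row, 1 = column) at profile (row, column)."""
    row, col = profile
    return ROW[(row, col)] if player == 0 else ROW[(col, row)]


def is_exact_potential(potential):
    """True if every unilateral change in utility equals the change in P."""
    for profile in product(STRATEGIES, repeat=2):
        for player in (0, 1):
            for alternative in STRATEGIES:
                deviated = list(profile)
                deviated[player] = alternative
                deviated = tuple(deviated)
                du = payoff(player, deviated) - payoff(player, profile)
                dp = potential[deviated] - potential[profile]
                if du != dp:
                    return False
    return True


print(is_exact_potential(POTENTIAL))
```

Matching the differences exactly is stronger than the ordinal (sign-only) condition in the definition, so a table that passes this check is also an ordinal potential.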
18
Potential games naturally arise as

Network congestion games
Oligopoly models, games of strategic complements
or substitutes
Coalition formation problems
Organisation of the firm
Principal-agent games

19
Potential games have been used for distributed
optimisation in

Automatic radio channel selection
Power control problems in wireless networks
Scheduling problems in communication networks
Autonomous vehicle target tracking

20
Properties of potential games

All finite potential games contain at least one
pure-strategy Nash equilibrium
Corollary: every local maximum of the potential
function is a Nash equilibrium
Potential games in which (i) the $S_i$ are convex and
compact, and (ii) the potential is smooth and strictly
concave have a unique pure-strategy correlated
equilibrium (which is also the unique Nash
equilibrium)
All finite potential games have the finite improvement
property: a convergence property for learning
processes in repeated games

21
Learning in repeated games

Typically refers to simple adaptive procedures
played in a repeated game
No known efficient learning procedures
converge to Nash equilibrium for all classes of games, but
Best Response and Fictitious Play converge to
Nash equilibrium in potential games (finite improvement
property)
Regret Matching converges to the set of
correlated equilibria in all finite games

22
Best Response and Fictitious Play

Both choose a strategy to maximise expected revenue
Best response
Maximise expected revenue with beliefs given by the last
round's play:
$s_i^{t+1} = \arg\max_{s_i} u_i(s_i, s_{-i}^{t})$
Fictitious Play
Maximise expected revenue with beliefs given by the
empirical frequency of play:
$q_i^{t}(s_{-i})$ = observed frequency of opponents'
profile $s_{-i}$
$s_i^{t+1} = \arg\max_{s_i} \sum_{s_{-i}} q_i^{t}(s_{-i}) \, u_i(s_i, s_{-i})$
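
Both update rules can be sketched on the Stag Hunt example. The payoffs are again inferred from the slides' arithmetic and the game is assumed symmetric. Two choices here are assumptions rather than anything stated on the slides: the best-response players take turns to move (simultaneous moves can cycle in this game), and ties are broken in favour of cooperate. All function names are hypothetical.

```python
from collections import Counter

# Assumed payoffs, keyed by (own strategy, opponent strategy).
U = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}
STRATEGIES = ("C", "D")


def best_reply(belief):
    """Strategy maximising expected payoff against `belief`.

    `belief` maps opponent strategies to weights. The weights need not be
    normalised, because scaling them does not change the argmax.
    """
    def expected(own):
        return sum(w * U[(own, other)] for other, w in belief.items())
    return max(STRATEGIES, key=expected)


def best_response_dynamics(start, rounds=10):
    """Players alternate, each replying to the other's last strategy."""
    profile = list(start)
    for t in range(rounds):
        mover = t % 2
        profile[mover] = best_reply({profile[1 - mover]: 1})
    return tuple(profile)


def fictitious_play(start, rounds=20):
    """Both players reply to the empirical frequency of past opponent play."""
    # seen[i] counts the strategies player i has observed from its opponent.
    seen = [Counter({start[1]: 1}), Counter({start[0]: 1})]
    profile = tuple(start)
    for _ in range(rounds):
        profile = tuple(best_reply(seen[i]) for i in (0, 1))
        seen[0][profile[1]] += 1
        seen[1][profile[0]] += 1
    return profile


print(best_response_dynamics(("C", "D")), fictitious_play(("C", "D")))
```

Starting from the miscoordinated profile (cooperate, defect), both procedures settle on a pure Nash equilibrium of the assumed game.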

23
Regret Matching

Regret: the difference between the payoff an
agent would have received if it chose $s_i$ and the
payoff it actually received:
$u_i(s_i, s_{-i}^{t}) - u_i(s^{t})$
Average regret for not selecting $s_i$ in every
subsequent period:
$R^{t+1}(s_i) = \max\left\{ \frac{1}{t} \sum_{\tau=1}^{t} \left[ u_i(s_i, s_{-i}^{\tau}) - u_i(s^{\tau}) \right],\ 0 \right\}$
Strategies are selected in proportion to their
average level of regret, e.g.
$\Pr(s_i) = R^{t+1}(s_i) \,/\, \sum_{s_i'} R^{t+1}(s_i')$
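
The regret formulas above can be sketched as follows, again on the Stag Hunt payoffs inferred from the earlier slides. The slide does not say what to do when no strategy has positive regret, so keeping the current strategy in that case is an assumption, as are the function names and the fixed random seed.

```python
import random

# Assumed payoffs, keyed by (own strategy, opponent strategy).
U = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}
STRATEGIES = ("C", "D")


def average_regrets(history, player):
    """Positive part of the average regret for each strategy.

    `history` is a list of (row, column) profiles; `player` is 0 or 1.
    """
    t = len(history)
    regrets = {}
    for s in STRATEGIES:
        total = sum(
            U[(s, prof[1 - player])] - U[(prof[player], prof[1 - player])]
            for prof in history
        )
        regrets[s] = max(total / t, 0.0)
    return regrets


def choice_probabilities(regrets, current):
    """Probabilities proportional to regret."""
    total = sum(regrets.values())
    if total == 0:
        # Assumption: with no positive regret, keep the current strategy.
        return {s: float(s == current) for s in STRATEGIES}
    return {s: r / total for s, r in regrets.items()}


def regret_matching(start, rounds=200, seed=0):
    """Simulate both players and return the full history of play."""
    rng = random.Random(seed)
    history = [tuple(start)]
    for _ in range(rounds):
        nxt = []
        for player in (0, 1):
            probs = choice_probabilities(
                average_regrets(history, player), history[-1][player]
            )
            weights = [probs[s] for s in STRATEGIES]
            nxt.append(rng.choices(STRATEGIES, weights=weights)[0])
        history.append(tuple(nxt))
    return history


history = regret_matching(("C", "D"), rounds=50)
print(history[-1])
```

After a single round at (cooperate, defect), the row player regrets only not having defected, so it switches to defect with probability one under this rule.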

24
Potential games for distributed optimisation

Take an optimisation problem, and assign each
variable to a separate agent
Construct the agents' utility functions to align
them with the global objectives, i.e. align the
equilibria of the game with the optimal global
state
This will result in a potential game, so by
allowing the agents to adjust using some learning
procedure, the system should converge to an
equilibrium/optimal state
How do we align the agents' private utilities with
the global goals...?

25
Aligning agents' utilities with global goals

Agents' private utility functions: $u_i(s_i, s_{-i})$
System's global utility function: $u_g(s_i, s_{-i})$
Aligned utilities:
$u_i(s_i, s_{-i}) - u_i(s_i', s_{-i}) > 0 \iff u_g(s_i, s_{-i}) - u_g(s_i', s_{-i}) > 0$
Construct agents' utility functions such that any
change in strategy has the same effect on the
global utility as it does on the agent's utility

26
Aligning agents' utilities with global goals

Strategic situations where agents' utilities are
aligned to global utilities are, by definition,
potential games, where the global utility
function is a potential function for the game
Typically, to align private utilities, use an
agent's marginal contribution to the global
utility as its private utility function (there
are other methods, but not today!)

27
Simple example of aligned utilities: Graph
colouring problem
The global utility is maximised by minimising the
number of conflicts: $u_g = -$(total number of
neighbouring nodes with the same colour)
So align the private utilities by setting
$u_i = -$(number of i's neighbours with the same colour)
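
The alignment can be checked exhaustively on a small instance. The graph below (a square with one diagonal, three colours) is purely hypothetical, and the global utility is read here as minus the number of conflicting edges, counting each neighbouring pair once. That reading is an assumption; counting each conflicting node instead would double every difference but leave the signs unchanged.

```python
from itertools import product

# Hypothetical example graph: a 4-cycle plus the diagonal (0, 2).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
N_NODES = 4
COLOURS = (0, 1, 2)


def neighbours(i):
    """Nodes that share an edge with node i."""
    return [b if a == i else a for a, b in EDGES if i in (a, b)]


def private_utility(i, colouring):
    """u_i = -(number of i's neighbours with the same colour)."""
    return -sum(colouring[j] == colouring[i] for j in neighbours(i))


def global_utility(colouring):
    """u_g = -(number of edges whose endpoints share a colour)."""
    return -sum(colouring[a] == colouring[b] for a, b in EDGES)


def utilities_aligned():
    """True if every unilateral recolouring changes u_i and u_g equally."""
    for colouring in product(COLOURS, repeat=N_NODES):
        for i in range(N_NODES):
            for c in COLOURS:
                changed = colouring[:i] + (c,) + colouring[i + 1:]
                du = private_utility(i, changed) - private_utility(i, colouring)
                dg = global_utility(changed) - global_utility(colouring)
                if du != dg:
                    return False
    return True


print(utilities_aligned())
```

Because the differences agree for every deviation, the global utility acts as a potential function for this game under the stated reading.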
28
Graph colouring as a potential game
Pair-wise interaction between agents
i.e. each agent plays this game with each of its
neighbours.
Potential function
Each agent's full utility function and the full
potential function are given by aggregating these
pair-wise functions
29
Graph colouring as a potential game

Now, simple learning dynamics will converge to a
Nash equilibrium of the game
Example graph
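
A minimal sketch of such a dynamic, assuming round-robin best response on a hypothetical 4-node graph (not the example graph from the slide, which is not in this transcript). Each agent recolours only when that strictly reduces its own conflicts, so the number of conflicting edges falls with every move and the loop must stop. The function names are illustrative.

```python
# Hypothetical example graph: a 4-cycle plus the diagonal (0, 2).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
N_NODES = 4
COLOURS = (0, 1, 2)


def conflicts(i, colour, colouring):
    """Number of i's neighbours that have `colour` under `colouring`."""
    return sum(
        colouring[b if a == i else a] == colour
        for a, b in EDGES if i in (a, b)
    )


def best_response_colouring(start):
    """Agents take turns; each moves only if it strictly improves."""
    colouring = list(start)
    cycles = 0
    changed = True
    while changed:
        changed = False
        cycles += 1
        for i in range(N_NODES):
            best = min(COLOURS, key=lambda c: conflicts(i, c, colouring))
            if conflicts(i, best, colouring) < conflicts(i, colouring[i], colouring):
                colouring[i] = best
                changed = True
    return tuple(colouring), cycles


def is_nash(colouring):
    """True if no agent can reduce its own conflicts by recolouring."""
    return all(
        conflicts(i, colouring[i], colouring) <= conflicts(i, c, colouring)
        for i in range(N_NODES) for c in COLOURS
    )


final, cycles = best_response_colouring((0, 0, 0, 0))
print(final, cycles, is_nash(final))
```

The final cycle is the one in which no agent moves, so it is included in the cycle count. A Nash equilibrium reached this way need not be conflict-free in general; it is only a colouring that no single agent can improve.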

30
Graph colouring as a potential game
Average time to converge:
Best response: 563 steps / 48 cycles
Weighted F-play: 560 steps / 47 cycles
Weighted R-match: 686 steps / 57 cycles
31
Future work

Investigate which types of problems can be
decomposed so as to align private and global
utilities, generating potential games
Find adaptive procedures that efficiently
converge to preferred outcomes
Examine the effects of dynamic environments on
outcomes
Examine the effects of network structure on
convergence

32
Thank you