1
Using Potential Games to Design Distributed
Optimisation Systems
  • Archie Chapman
  • Intelligence, Agents, Multimedia
  • Electronics and Computer Science

2
Motivation
  • Interested in building distributed optimisation
    techniques, using agents
  • No need to collect the information at a central
    point
  • Fault tolerant, robust to failures
  • Reduced communication requirements
  • Flexible response in dynamic environments
  • Use game theory to design and analyse
    self-organising agent systems

3
Overview
  • Game theory
  • Nash equilibrium
  • Correlated equilibrium
  • Potential games
  • Existence of Nash and correlated equilibria
  • Convergence of simple adaptive procedures
  • Potential games for distributed optimisation
  • Alignment of global and private utilities
  • Example: graph colouring problem

4
A very short introduction to game theory (i)
  • A game is an interaction between two or more
    self-interested agents
  • Each agent chooses from a set of strategies, S_i
  • A (joint) strategy profile, s, is the set of
    chosen strategies, also called an outcome of the
    game
  • Each agent has a utility function, ui(s),
    specifying their preference for each outcome in
    terms of a payoff

5
A very short introduction to game theory (ii)
  • An agent's best response is the strategy with the
    highest payoff, given its opponents' choice of
    strategy
  • A Nash equilibrium is a strategy profile such
    that every agent's strategy is a best response to
    the others' choices of strategy
  • No agent has an incentive to change strategy,
    given that everyone else plays the equilibrium
    strategy
  • A correlated equilibrium is a commonly known
    probability distribution over correlated signals
    recommending a joint strategy profile
  • No agent has an incentive to go against its
    recommendation, given that everyone else follows
    theirs

6
Example: Stag Hunt game
Payoff matrix (row player's payoff, column player's payoff):
                          Hunt stag (cooperate)   Hunt hare (defect)
  Hunt stag (cooperate)          4, 4                    0, 3
  Hunt hare (defect)             3, 0                    2, 2
7
Nash equilibrium
2 pure-strategy Nash equilibria: (cooperate, cooperate) and (defect, defect)
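A minimal Python sketch, assuming the Stag Hunt payoffs shown above, that verifies these two pure-strategy equilibria; the function names are illustrative, not from the slides.

```python
# Minimal sketch: enumerate the pure-strategy Nash equilibria of a two-player
# game given as a payoff dictionary. Payoffs are the Stag Hunt values above.
from itertools import product

STRATS = ["cooperate", "defect"]
# u[(row_strategy, col_strategy)] = (row payoff, column payoff)
u = {("cooperate", "cooperate"): (4, 4), ("cooperate", "defect"): (0, 3),
     ("defect", "cooperate"): (3, 0), ("defect", "defect"): (2, 2)}

def is_best_response(player, profile):
    """True if `player`'s strategy maximises its payoff, fixing the other's."""
    current = u[profile][player]
    for s in STRATS:
        alt = (s, profile[1]) if player == 0 else (profile[0], s)
        if u[alt][player] > current:
            return False
    return True

def pure_nash_equilibria():
    return [p for p in product(STRATS, repeat=2)
            if is_best_response(0, p) and is_best_response(1, p)]

print(pure_nash_equilibria())
# [('cooperate', 'cooperate'), ('defect', 'defect')]
```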
8
Correlated equilibrium
Probability distribution over signals (rows: row player's recommendation; columns: column player's recommendation):
               cooperate   defect
  cooperate       1/2        1/6
  defect          1/6        1/6
Each agent receives a signal recommending only
its own strategy (e.g. cooperate), but knows
the joint probability distribution over the others'
recommendations.
Assume the row player is told by the signal to
cooperate.
Then the probability that the column player has
also been told to cooperate is (1/2) / (1/2 + 1/6) = 3/4.
The expected payoff for following the
recommendation is (3/4)(4) + (1/4)(0) = 3, and for
not following it is (3/4)(3) + (1/4)(2) = 2 3/4.
9
Correlated equilibrium
Now assume the row player is told by the signal to
defect.
Then the probability that the column player has
been told to cooperate is (1/6) / (1/6 + 1/6) = 1/2.
The expected payoff for following the
recommendation is (1/2)(3) + (1/2)(2) = 2 1/2, and for
not following it is (1/2)(4) + (1/2)(0) = 2.
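A minimal Python sketch that checks these incentive constraints numerically, assuming the payoffs and the signal distribution used in the worked example above.

```python
# Minimal sketch: check the correlated-equilibrium conditions for a joint
# signal distribution over recommended strategy profiles (row, column).
from itertools import product

STRATS = ["cooperate", "defect"]
u = {("cooperate", "cooperate"): (4, 4), ("cooperate", "defect"): (0, 3),
     ("defect", "cooperate"): (3, 0), ("defect", "defect"): (2, 2)}
# Distribution over recommendations, as on the previous slides.
pi = {("cooperate", "cooperate"): 1/2, ("cooperate", "defect"): 1/6,
      ("defect", "cooperate"): 1/6, ("defect", "defect"): 1/6}

def is_correlated_equilibrium(u, pi):
    """No player gains by deviating from any recommendation it might receive."""
    for player in (0, 1):
        for rec, dev in product(STRATS, repeat=2):
            # Expected gain from following `rec` rather than playing `dev`,
            # summed over the opponent's possible recommendations.
            gain = 0.0
            for other in STRATS:
                joint = (rec, other) if player == 0 else (other, rec)
                alt = (dev, other) if player == 0 else (other, dev)
                gain += pi[joint] * (u[joint][player] - u[alt][player])
            if gain < -1e-12:
                return False
    return True

print(is_correlated_equilibrium(u, pi))  # expected: True
```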
10
Correlated equilibrium
The set of correlated equilibria contains the set
of Nash equilibria.
The Nash equilibria of the Stag Hunt game are
correlated equilibria with degenerate signal
distributions: probability 1 on (cooperate,
cooperate), or probability 1 on (defect, defect).
11
Distributed optimisation design questions
  • How do we ensure that the equilibrium outcome is
    desirable/optimal? That is, how do we ensure
    that the equilibrium corresponds to a good global
    utility?
  • How do we ensure that the agents converge to an
    equilibrium? What if there are many equilibria
    in the game? Which one will emerge?
  • Often these problems can be addressed using
    results from the class of potential games

12
Potential Games
  • An ordinal potential function, P(s_i, s_-i), for a
    game is a function such that
  • u_i(s_i', s_-i) - u_i(s_i, s_-i) > 0  ⟺  P(s_i', s_-i) - P(s_i, s_-i) > 0
  • i.e. the sign of the change in private utility to
    a unilaterally deviating player is matched by the
    sign of the change in the potential function
  • A game that admits a potential function is called
    a potential game
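A minimal Python sketch of this definition: it checks the sign-matching condition for every unilateral deviation. The Stag Hunt payoffs are those from the earlier example; the potential values are one candidate consistent with the walkthrough on the next slides.

```python
# Minimal sketch: check the ordinal potential condition for a two-player game.
from itertools import product

STRATS = ["cooperate", "defect"]
# u[(row_strategy, col_strategy)] = (row payoff, column payoff)
u = {("cooperate", "cooperate"): (4, 4), ("cooperate", "defect"): (0, 3),
     ("defect", "cooperate"): (3, 0), ("defect", "defect"): (2, 2)}
# Candidate potential function P(s).
P = {("cooperate", "cooperate"): 1, ("cooperate", "defect"): 0,
     ("defect", "cooperate"): 0, ("defect", "defect"): 2}

def sign(x):
    return (x > 0) - (x < 0)

def is_ordinal_potential(u, P):
    """True if every unilateral deviation changes u_i and P with the same sign."""
    for s_row, s_col in product(STRATS, repeat=2):
        for new in STRATS:
            # Row player deviates (payoff index 0).
            d_u = u[(new, s_col)][0] - u[(s_row, s_col)][0]
            d_P = P[(new, s_col)] - P[(s_row, s_col)]
            if sign(d_u) != sign(d_P):
                return False
            # Column player deviates (payoff index 1).
            d_u = u[(s_row, new)][1] - u[(s_row, s_col)][1]
            d_P = P[(s_row, new)] - P[(s_row, s_col)]
            if sign(d_u) != sign(d_P):
                return False
    return True

print(is_ordinal_potential(u, P))  # expected: True
```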

13
Example: Stag Hunt game
[Stag hunt payoff matrix and potential function]
Potential function values: P(cooperate, cooperate) = 1,
P(cooperate, defect) = P(defect, cooperate) = 0,
P(defect, defect) = 2
14
Example: Stag Hunt game
[Stag hunt payoff matrix and potential function]
Start in the defecting equilibrium, and see what
happens to the potential function if the column
player changes strategy.
15
Example: Stag Hunt game
[Stag hunt payoff matrix and potential function]
The change in the deviator's utility is matched by
the change in the value of the potential function:
utility change: 0 - 2 = -2; potential change: 0 - 2 = -2
16
Example: Stag Hunt game
[Stag hunt payoff matrix and potential function]
Now, suppose the row player responds by moving to
the cooperative equilibrium.
17
Example: Stag Hunt game
[Stag hunt payoff matrix and potential function]
The change in the row player's utility is matched
by the change in the potential function:
utility change: 4 - 3 = 1; potential change: 1 - 0 = 1
18
Potential games naturally arise as
  • Network congestion games
  • Oligopoly models, games of strategic complements
    or substitutes
  • Coalition formation problems
  • Organisation of the firm
  • Principal-agent games

19
Potential games have been used for distributed
optimisation in
  • Automatic radio channel selection
  • Power control problems in wireless networks
  • Scheduling problems in communication networks
  • Autonomous vehicle target tracking

20
Properties of potential games
  • All potential games contain at least one
    pure-strategy Nash equilibrium
  • Corollary: every local maximum of the potential
    function is a Nash equilibrium
  • Potential games in which (i) the S_i are convex and
    compact, and (ii) the potential is smooth and
    strictly concave have a unique pure-strategy
    correlated equilibrium (which is also the unique
    Nash equilibrium)
  • All finite potential games have the finite
    improvement property, a convergence property for
    learning processes in repeated games

21
Learning in repeated games
  • Typically refers to simple adaptive procedures
    played in a repeated game
  • There are no known efficient learning procedures
    that converge to Nash equilibrium for all classes
    of games, but Best Response and Fictitious Play
    converge to Nash equilibrium in potential games
    (via the finite improvement property)
  • Regret Matching converges to the set of correlated
    equilibria in all finite games

22
Best Response and Fictitious Play
  • Both choose a strategy to maximise expected payoff
  • Best Response
  • Maximises expected payoff with beliefs given by the
    last round's play
  • s_i^(t+1) = argmax_{s_i} u_i(s_i, s_-i^t)
  • Fictitious Play
  • Maximises expected payoff with beliefs given by the
    empirical frequency of play
  • q_i^t(s_-i) = observed frequency of the opponents'
    profile s_-i
  • s_i^(t+1) = argmax_{s_i} Σ_{s_-i} q_i^t(s_-i) u_i(s_i, s_-i)
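A minimal Python sketch of both update rules for a two-player game, reusing the Stag Hunt payoffs from the earlier example; names such as fictitious_play_response are illustrative, not from the slides.

```python
# Minimal sketch: best response and fictitious play for a two-player game
# given as a payoff dictionary (payoffs are the Stag Hunt values used earlier).
from collections import Counter

STRATS = ["cooperate", "defect"]
u = {("cooperate", "cooperate"): (4, 4), ("cooperate", "defect"): (0, 3),
     ("defect", "cooperate"): (3, 0), ("defect", "defect"): (2, 2)}

def payoff(player, s_own, s_other):
    # Player 0 is the row player, player 1 the column player.
    joint = (s_own, s_other) if player == 0 else (s_other, s_own)
    return u[joint][player]

def best_response(player, s_other_last):
    """Best response to the opponent's last-round strategy."""
    return max(STRATS, key=lambda s: payoff(player, s, s_other_last))

def fictitious_play_response(player, history_other):
    """Best response to the empirical frequency of the opponent's past play."""
    counts = Counter(history_other)
    total = sum(counts.values())
    def expected(s):
        return sum(counts[o] / total * payoff(player, s, o) for o in counts)
    return max(STRATS, key=expected)

# Usage: one best-response step, then fictitious play for a few rounds.
print(best_response(0, "defect"))  # -> 'defect'
history = {0: ["defect"], 1: ["cooperate"]}
for t in range(10):
    s0 = fictitious_play_response(0, history[1])
    s1 = fictitious_play_response(1, history[0])
    history[0].append(s0)
    history[1].append(s1)
print(history[0][-1], history[1][-1])  # settles at a Nash equilibrium here
```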

23
Regret Matching
  • Regret: the difference between the payoff an agent
    would have received had it chosen s_i and the
    payoff it actually received, u_i(s_i, s_-i^t) - u_i(s^t)
  • Average regret for not having selected s_i in each
    past period:
  • R^(t+1)(s_i) = max{ (1/t) Σ_{τ≤t} [u_i(s_i, s_-i^τ) - u_i(s^τ)], 0 }
  • Strategies are selected in proportion to their
    average regret, e.g.
  • Pr(s_i) = R^(t+1)(s_i) / Σ_{s_i'} R^(t+1)(s_i')
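A minimal Python sketch of regret matching in a repeated two-player game, again assuming the Stag Hunt payoffs; the tie-breaking and zero-regret behaviour are illustrative choices, not from the slides.

```python
# Minimal sketch: regret matching for one agent in a repeated two-player game.
import random

STRATS = ["cooperate", "defect"]
u = {("cooperate", "cooperate"): (4, 4), ("cooperate", "defect"): (0, 3),
     ("defect", "cooperate"): (3, 0), ("defect", "defect"): (2, 2)}

def regret_matching_strategy(player, history):
    """history: list of joint strategy profiles (row, column) played so far."""
    t = len(history)
    if t == 0:
        return random.choice(STRATS)
    regrets = {}
    for s in STRATS:
        diffs = []
        for joint in history:
            actual = u[joint][player]
            # Payoff the agent would have received by playing s instead.
            alt = list(joint)
            alt[player] = s
            diffs.append(u[tuple(alt)][player] - actual)
        regrets[s] = max(sum(diffs) / t, 0.0)
    total = sum(regrets.values())
    if total == 0:
        return history[-1][player]   # no regret: repeat the last strategy
    r = random.uniform(0, total)     # sample in proportion to average regret
    for s, reg in regrets.items():
        r -= reg
        if r <= 0:
            return s
    return s

# Usage: both players use regret matching for a number of rounds; the
# empirical distribution of joint play approaches the set of correlated equilibria.
history = []
for t in range(200):
    history.append((regret_matching_strategy(0, history),
                    regret_matching_strategy(1, history)))
print(history[-5:])
```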

24
Potential games for distributed optimisation
  • Take an optimisation problem, and assign each
    variable to a separate agent
  • Construct the agents' utility functions to align
    them with the global objectives, i.e. align the
    equilibria of the game with the optimal global
    state
  • This will result in a potential game, so by
    allowing the agents to adjust using some learning
    procedure, the system should converge to an
    equilibrium/optimal state
  • How do we align the agents' private utilities with
    the global goals...?

25
Aligning agents' utilities with global goals
  • Agent's private utility function: u_i(s_i, s_-i)
  • System's global utility function: u_g(s_i, s_-i)
  • Aligned utilities:
  • u_i(s_i', s_-i) - u_i(s_i, s_-i) > 0  ⟺  u_g(s_i', s_-i) - u_g(s_i, s_-i) > 0
  • Construct agents' utility functions such that any
    change in strategy has the same effect on the
    global utility as it does on the agent's utility
26
Aligning agents' utilities with global goals
  • Strategic situations where agents' utilities are
    aligned with the global utility are, by definition,
    potential games, where the global utility
    function is a potential function for the game
  • Typically, to align private utilities, use an
    agent's marginal contribution to the global
    utility as its private utility function (there
    are other methods, but not today!)
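A minimal Python sketch of a marginal-contribution private utility, with an illustrative toy global utility (the names and the toy objective are assumptions, not from the slides).

```python
# Minimal sketch: agent i's private utility is the global utility of the joint
# state minus the global utility of the state with agent i removed.
def marginal_contribution_utility(global_utility, i, profile):
    """u_i(s) = u_g(s) - u_g(s without agent i's contribution)."""
    others = {j: s for j, s in profile.items() if j != i}
    return global_utility(profile) - global_utility(others)

def u_g(profile):
    """Toy global utility: the number of distinct values chosen."""
    return len(set(profile.values()))

profile = {"a": 1, "b": 1, "c": 2}
print(marginal_contribution_utility(u_g, "a", profile))  # 0: "a" duplicates "b"
print(marginal_contribution_utility(u_g, "c", profile))  # 1: "c" adds a new value
```

Because the subtracted term does not depend on agent i's own strategy, any unilateral change in u_i equals the corresponding change in u_g, which is exactly the alignment condition above.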

27
Simple example of aligned utilities: graph
colouring problem
The global utility is maximised by minimising the
number of conflicts:
u_g = -(total number of pairs of neighbouring nodes with the same colour)
So align the private utilities by setting:
u_i = -(number of i's neighbours with the same colour as i)
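A minimal Python sketch of these two utility functions, assuming a small illustrative graph and colouring (not the example used in the slides).

```python
# Minimal sketch: global and private utilities for the graph colouring game.
GRAPH = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

def global_utility(colours):
    """u_g = -(number of edges whose two endpoints share a colour)."""
    return -sum(1 for i in GRAPH for j in GRAPH[i]
                if i < j and colours[i] == colours[j])

def private_utility(i, colours):
    """u_i = -(number of i's neighbours with the same colour as i)."""
    return -sum(1 for j in GRAPH[i] if colours[j] == colours[i])

colours = {"a": "red", "b": "red", "c": "green", "d": "green"}
print(global_utility(colours))        # -2: conflicts on edges a-b and c-d
print(private_utility("a", colours))  # -1: "a" conflicts with neighbour "b"
```

A unilateral colour change by agent i affects u_g only through the edges incident to i, so the change in u_i equals the change in u_g: the utilities are aligned.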
28
Graph colouring as a potential game
[Pair-wise game between neighbouring agents, and its potential function]
Pair-wise interaction between agents: each agent
plays this game with each of its neighbours.
Each agent's full utility function, and the full
potential function, are given by aggregating these
pair-wise functions.
29
Graph colouring as a potential game
  • Now, simple learning dynamics will converge to a
    Nash equilibrium of the game
  • Example graph [figure omitted]; a sketch of such
    dynamics is given below
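A minimal Python sketch of asynchronous best-response dynamics on the small illustrative graph used in the previous sketch; the slides' experiments used their own example graph and weighted variants of these procedures.

```python
# Minimal sketch: asynchronous best-response dynamics for graph colouring.
import random

GRAPH = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
COLOURS = ["red", "green", "blue"]

def private_utility(i, colour, colours):
    """u_i = -(number of i's neighbours sharing the proposed colour)."""
    return -sum(1 for j in GRAPH[i] if colours[j] == colour)

def best_response_dynamics(max_steps=1000, seed=0):
    rng = random.Random(seed)
    colours = {i: rng.choice(COLOURS) for i in GRAPH}
    for step in range(max_steps):
        i = rng.choice(list(GRAPH))                         # pick an agent
        colours[i] = max(COLOURS,                           # best-respond
                         key=lambda c: private_utility(i, c, colours))
        if all(colours[v] != colours[w] for v in GRAPH for w in GRAPH[v]):
            return colours, step                            # no conflicts left
    return colours, max_steps

colouring, steps = best_response_dynamics()
print(colouring, steps)  # a conflict-free colouring; by the finite improvement
                         # property the dynamics settle at a Nash equilibrium
```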

30
Graph colouring as a potential game
Average time to converge:
  Best response:            563 steps / 48 cycles
  Weighted fictitious play: 560 steps / 47 cycles
  Weighted regret matching: 686 steps / 57 cycles
31
Future work
  • Investigate which types of problems can be
    decomposed so as to align private and global
    utilities, generating potential games
  • Find adaptive procedures that efficiently
    converge to preferred outcomes
  • Examine the effects of dynamic environments on
    outcomes
  • Examine the effects of network structure on
    convergence

32
Thank you
  • Questions?
  • Comments?