Title: Design of Multi-Agent Systems
1 Design of Multi-Agent Systems
- Teacher
- Bart Verheij
- Student assistants
- Albert Hankel
- Elske van der Vaart
- Web site
- http://www.ai.rug.nl/verheij/teaching/dmas/
- (Nestor contains a link)
2 Student presentations
3 Student presentations
4 Some practical matters
- Please submit exercises to designofmas_at_gmail.com.
- Please use naming conventions for file names and message subjects.
- Please read your student mail.
5 Overview
- Introduction
- Evaluation criteria & equilibria
- Social welfare
- Pareto efficiency
- Nash equilibria
- The Prisoner's Dilemma
- Loose end: dominant strategies
Not, or different, in the book
6 Typical structure of a multi-agent system
7 Interactions
- Communication
- Influence on environment (spheres of influence)
- Organizations, communities, coalitions
- Hierarchical relations
- Cooperation, competition
8 Utilities & preferences
- How to measure the results of a multi-agent system? In terms of preferences and utilities.
- Some notation
- $\Omega = \{\omega_1, \omega_2, \ldots\}$: outcomes, future environmental states
- group preferences (assumes cooperation)
- individual preferences
9 Preferences
- Strict preferences
- Properties
- Reflexive
- Transitive
- Comparable
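A sketch of these notions in standard textbook notation. The symbols $u_i$, $\succeq_i$ and $\succ_i$ are assumed here, not taken from the slides, and the three properties are stated for the weak ordering $\succeq_i$:

```latex
\begin{align*}
  % Assumed: u_i is agent i's utility function over the outcome set \Omega.
  u_i &: \Omega \to \mathbb{R} \\
  % Weak and strict preference of agent i, defined via utilities.
  \omega \succeq_i \omega' &\iff u_i(\omega) \ge u_i(\omega') \\
  \omega \succ_i \omega'   &\iff u_i(\omega) > u_i(\omega') \\
  % Properties of the weak ordering.
  \text{Reflexive:}  &\quad \forall \omega \in \Omega:\ \omega \succeq_i \omega \\
  \text{Transitive:} &\quad \forall \omega, \omega', \omega'' \in \Omega:\
      (\omega \succeq_i \omega' \land \omega' \succeq_i \omega'')
      \Rightarrow \omega \succeq_i \omega'' \\
  \text{Comparable:} &\quad \forall \omega, \omega' \in \Omega:\
      \omega \succeq_i \omega' \lor \omega' \succeq_i \omega
\end{align*}
```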
10 Utilities
- According to utility theory, preferences can be measured in terms of real numbers.
- Example: money
- But money isn't always the right measure: think of the subjective value of a million dollars when you have nothing or when you are Bill Gates.
11 Utility & money
12 Zero-sum & constant-sum games
- Simplification: two agents
- Constant-sum games
- The sum of all players' payoffs is the same for any outcome.
- $u_i(\omega) + u_j(\omega) = C$ for all $\omega \in \Omega$
- Zero-sum games
- All outcomes involve a sum of the players' payoffs of 0.
- $u_i(\omega) + u_j(\omega) = 0$ for all $\omega \in \Omega$
- Chess
- Payoffs 0, ½, 1 (loss, draw, win): constant sum, C = 1
- Payoffs -½, 0, ½ (loss, draw, win): zero sum
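The two definitions can be checked mechanically. The sketch below assumes that the two rows of chess payoffs are alternative scorings of loss, draw and win; the names `scoring_a`, `scoring_b` and the helper functions are made up for illustration.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Chess outcomes as (payoff to white, payoff to black).
# Scoring A (assumed reading): loss = 0, draw = 1/2, win = 1.
scoring_a = {"white wins": (1, 0), "draw": (half, half), "black wins": (0, 1)}
# Scoring B (assumed reading): loss = -1/2, draw = 0, win = 1/2.
scoring_b = {"white wins": (half, -half), "draw": (0, 0), "black wins": (-half, half)}


def payoff_sums(game):
    """Return the set of values u_i(w) + u_j(w) over all outcomes w."""
    return {u_i + u_j for (u_i, u_j) in game.values()}


def is_constant_sum(game):
    """Constant sum: the payoff sum is the same for every outcome."""
    return len(payoff_sums(game)) == 1


def is_zero_sum(game):
    """Zero sum: the payoff sum is 0 for every outcome."""
    return payoff_sums(game) == {0}


print(is_constant_sum(scoring_a), is_zero_sum(scoring_a))
print(is_constant_sum(scoring_b), is_zero_sum(scoring_b))
```

Under these assumptions, scoring A is constant-sum but not zero-sum, and scoring B is zero-sum. Subtracting ½ from every payoff turns one into the other without changing either player's preferences.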
13 Zero-sum & constant-sum games
- One agent's gain is another agent's loss.
- Zero-sum games are necessarily competitive.
- But there are many non-zero-sum situations.
15 Kinds of evaluation criteria & equilibria
- Social welfare
- Pareto efficiency
- Nash equilibrium
16 Social welfare
- Social welfare measures the sum of all individual outcomes.
- Optimal social welfare may not be achievable when individuals are self-interested.
- Individual agents follow their own (different) utility function.
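A minimal sketch of the social welfare criterion. The two-agent payoff table `game` is made up for illustration (it is not one of the numbered examples), and `social_welfare` and `welfare_maximizers` are hypothetical helper names.

```python
# Hypothetical two-agent game:
# (strategy of a1, strategy of a2) -> (utility of a1, utility of a2).
game = {
    ("A", "A"): (3, 3),
    ("A", "B"): (0, 5),
    ("B", "A"): (5, 0),
    ("B", "B"): (1, 1),
}


def social_welfare(payoffs):
    """Social welfare of one outcome: the sum of the individual utilities."""
    return sum(payoffs)


def welfare_maximizers(game):
    """Outcomes with the highest social welfare."""
    best = max(social_welfare(p) for p in game.values())
    return [outcome for outcome, p in game.items() if social_welfare(p) == best]


for outcome, payoffs in game.items():
    print(outcome, payoffs, "welfare =", social_welfare(payoffs))
print("highest social welfare:", welfare_maximizers(game))
```

In this made-up table the welfare-maximizing outcome is ("A", "A"), yet each agent would individually gain by switching to "B" (5 instead of 3), which is why self-interested agents may fail to reach it.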
17 Example 1
[Payoff table; annotation: highest social welfare]
19 Pareto efficiency or optimality
- An outcome is Pareto optimal if a better outcome for one agent always results in a worse outcome for some other agent.
- When all agents pursue social welfare, highest social welfare is Pareto optimal. However, a Pareto optimal outcome need not be desirable. E.g., dictatorship.
- Pareto improvement: a change that is an improvement for someone without hurting anyone.
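The definitions of Pareto improvement and Pareto optimality can be sketched as follows. The payoff table `game` is hypothetical, and the function names are chosen for this illustration only.

```python
# Hypothetical two-agent game:
# (strategy of a1, strategy of a2) -> (utility of a1, utility of a2).
game = {
    ("A", "A"): (3, 3),
    ("A", "B"): (0, 5),
    ("B", "A"): (5, 0),
    ("B", "B"): (1, 1),
}


def dominates(p, q):
    """True if moving from payoffs q to payoffs p is a Pareto improvement:
    nobody is worse off and at least one agent is strictly better off."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_improvements(game, outcome):
    """Outcomes that are a Pareto improvement over the given outcome."""
    return [o for o, p in game.items() if dominates(p, game[outcome])]


def pareto_optimal(game):
    """Outcomes for which no Pareto improvement exists."""
    return [o for o in game if not pareto_improvements(game, o)]


print("Pareto optimal:", pareto_optimal(game))
print("improvements over (B, B):", pareto_improvements(game, ("B", "B")))
```

With these made-up numbers, three of the four outcomes are Pareto optimal, including the lopsided (5, 0) and (0, 5), which illustrates that a Pareto optimal outcome need not be desirable.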
20 Example 1
[Payoff table; annotations: Pareto efficient, Pareto improvements]
22 Nash equilibrium
- Two strategies s1 and s2 are in Nash equilibrium if
- under the assumption that agent i plays s1, agent j can do no better than play s2, and
- under the assumption that agent j plays s2, agent i can do no better than play s1.
- No individual has the incentive to unilaterally change strategy.
- Example: driving on the right side of the road
- Nash equilibria (in pure strategies) do not always exist and are not always unique.
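The definition translates directly into a search over strategy pairs. This is a sketch for pure strategies only; the three payoff tables (`driving`, `pennies`, `dilemma`) are hypothetical stand-ins and not the numbered examples of the slides.

```python
from itertools import product


def pure_nash_equilibria(game, strategies_1, strategies_2):
    """Strategy pairs from which neither agent gains by deviating alone."""
    equilibria = []
    for s1, s2 in product(strategies_1, strategies_2):
        u1, u2 = game[(s1, s2)]
        # Agent 1 cannot do better, given that agent 2 plays s2.
        best_1 = all(game[(t, s2)][0] <= u1 for t in strategies_1)
        # Agent 2 cannot do better, given that agent 1 plays s1.
        best_2 = all(game[(s1, t)][1] <= u2 for t in strategies_2)
        if best_1 and best_2:
            equilibria.append((s1, s2))
    return equilibria


# Driving side (hypothetical utilities): two equilibria.
driving = {("L", "L"): (1, 1), ("L", "R"): (0, 0),
           ("R", "L"): (0, 0), ("R", "R"): (1, 1)}
# Matching pennies (hypothetical utilities): no pure equilibrium.
pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
# Dilemma-style game (hypothetical utilities): exactly one equilibrium.
dilemma = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

print(pure_nash_equilibria(driving, "LR", "LR"))
print(pure_nash_equilibria(pennies, "HT", "HT"))
print(pure_nash_equilibria(dilemma, "CD", "CD"))
```

The three tables cover the cases named above: several equilibria, none, and a unique one.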
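23 Example 1
[Payoff table; annotations: Nash equilibria, Nash incentives]
24 Example 1
[Payoff table; annotation: outcomes corresponding to strategies in Nash equilibrium]
25 Example 2
[Payoff table; annotation: no Nash equilibrium]
26 Example 3
[Payoff table; annotation: unique Nash equilibrium]
27 Example 3
[Payoff table; annotations: unique Nash equilibrium, highest social welfare, Pareto efficient]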
29 The Prisoner's Dilemma
- Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
- if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
- if both confess, then each will be jailed for two years.
- Both prisoners know that if neither confesses, then they will each be jailed for one year.
30 The Prisoner's Dilemma
- The prisoners can either defect (confess) or cooperate (stay silent).
- The rational action for each individual prisoner is to defect.
- Example 3 is a prisoner's dilemma (but note that it tables utilities, not prison years: fewer years in prison means a higher utility).
- Real life: nuclear arms reduction, free riders
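Why defecting is the rational action can be read off the prison terms stated above. The sketch below assumes that utility is simply the negated number of years; the names `years`, `utility` and `best_response` are illustrative.

```python
# Prison years as (prisoner 1, prisoner 2); "C" = cooperate (stay silent),
# "D" = defect (confess). Numbers are the ones given in the story.
years = {
    ("C", "C"): (1, 1),
    ("C", "D"): (3, 0),
    ("D", "C"): (0, 3),
    ("D", "D"): (2, 2),
}

# Assumption: utility is the negated prison term (fewer years is better).
utility = {outcome: (-y1, -y2) for outcome, (y1, y2) in years.items()}


def best_response(player, other_move):
    """Move that maximizes the player's utility against a fixed opponent move."""
    def u(move):
        outcome = (move, other_move) if player == 0 else (other_move, move)
        return utility[outcome][player]
    return max(["C", "D"], key=u)


for other in ["C", "D"]:
    print("if the other plays", other,
          "-> best responses:", best_response(0, other), best_response(1, other))
print("total years if both cooperate:", sum(years[("C", "C")]))
print("total years if both defect:", sum(years[("D", "D")]))
```

Whatever the other prisoner does, defecting saves one year, so both defect and serve two years each, although mutual cooperation would have cost each only one.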
31 The Prisoner's Dilemma
- The Prisoner's Dilemma is the fundamental problem of multi-agent interactions.
- It appears to imply that cooperation will not occur in societies of self-interested agents.
32 Recovering cooperation ...
- Conclusions that some have drawn from this analysis:
- the game theory notion of rational action is wrong!
- somehow the dilemma is being formulated wrongly
- Arguments to recover cooperation:
- We are not all Machiavelli!
- The other prisoner is my twin!
- The shadow of the future
33 The Iterated Prisoner's Dilemma
- One answer: play the game more than once.
- If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate.
- When you know how many times you'll meet your opponent, defection is again rational: in the last round there is no future to protect, so defect there, and the same reasoning then applies to each earlier round.
34 Axelrod's tournament
- Suppose you play iterated prisoner's dilemma against a range of opponents. What strategy should you choose, so as to maximize your overall payoff?
- Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner's dilemma.
35 Strategies in Axelrod's tournament
- ALL-D
- Always defect
- TIT-FOR-TAT
- At the first meeting of an opponent, cooperate. Then do what your opponent did on the previous meeting.
- TESTER
- First defect. If the opponent retaliates, play TIT-FOR-TAT. Otherwise intersperse cooperation and defection.
- JOSS
- As TIT-FOR-TAT, except periodically defect
36 Reasons for TIT-FOR-TAT's success
- Don't be envious: don't play as if it were zero-sum!
- Be nice: start by cooperating, and reciprocate cooperation.
- Retaliate appropriately: always punish defection immediately, but use measured force; don't overdo it.
- Don't hold grudges: always reciprocate cooperation immediately.
38 Dominant strategy
- A strategy is dominant for an agent if it is the best under all circumstances.
- Dominant strategy equilibrium: each agent uses a dominant strategy.
- A dominant strategy equilibrium is always a Nash equilibrium (but there are more of the latter).
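A sketch of the dominance check, reading "best under all circumstances" as "a best response to every strategy of the other agent". The two payoff tables are hypothetical, and `dominant_strategies` is an illustrative helper.

```python
def dominant_strategies(game, own, other, player):
    """Strategies of `player` (0 or 1) that are a best response to every
    strategy of the other agent."""
    def u(mine, theirs):
        outcome = (mine, theirs) if player == 0 else (theirs, mine)
        return game[outcome][player]
    return [s for s in own
            if all(u(s, t) >= u(alt, t) for t in other for alt in own)]


# Dilemma-style game (hypothetical utilities): "D" is dominant for both agents.
dilemma = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
# Coordination game (hypothetical utilities): no dominant strategy at all,
# although ("L", "L") and ("R", "R") are both Nash equilibria.
coordination = {("L", "L"): (1, 1), ("L", "R"): (0, 0),
                ("R", "L"): (0, 0), ("R", "R"): (1, 1)}

print(dominant_strategies(dilemma, "CD", "CD", 0),
      dominant_strategies(dilemma, "CD", "CD", 1))
print(dominant_strategies(coordination, "LR", "LR", 0),
      dominant_strategies(coordination, "LR", "LR", 1))
```

The coordination game shows why there are more Nash equilibria than dominant strategy equilibria: each equilibrium strategy there is only best given what the other agent does, not under all circumstances.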
39 Example 4
[Payoff table; annotations: dominant for a2, dominant for a1]
40 Just to play with: new roads
- There are 6 cars going from A to D each day.
- (A,B) and (C,D) are highways
- time(c) = 5 + 2c, where c is the number of cars
- (B,D) and (A,C) are local roads
- time(c) = 20 + c
What will happen when a new highway is made between B and C?
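One way to explore the question is to enumerate every way the 6 cars can spread over the routes and keep the distributions where no single driver can shorten their own trip by switching. The sketch assumes travel times of 5 + 2c on highways and 20 + c on local roads, that the new road is used from B to C and follows the highway formula, and that every driver only minimizes their own travel time; all names are illustrative.

```python
from itertools import product


def highway(c):
    return 5 + 2 * c   # assumed highway travel time with c cars


def local(c):
    return 20 + c      # assumed local-road travel time with c cars


# Assumption: the new road (B,C) behaves like the other highways.
ROAD_TIME = {"AB": highway, "CD": highway, "BC": highway,
             "BD": local, "AC": local}

ROUTES_OLD = {"ABD": ["AB", "BD"], "ACD": ["AC", "CD"]}
ROUTES_NEW = dict(ROUTES_OLD, ABCD=["AB", "BC", "CD"])


def route_times(routes, counts):
    """Travel time of each route, given counts[route] = cars on that route."""
    load = {}
    for route, n in counts.items():
        for road in routes[route]:
            load[road] = load.get(road, 0) + n
    return {route: sum(ROAD_TIME[road](load[road]) for road in roads)
            for route, roads in routes.items()}


def is_equilibrium(routes, counts):
    """True if no single car gets a strictly shorter trip by switching route."""
    times = route_times(routes, counts)
    for src in routes:
        if counts[src] == 0:
            continue
        for dst in routes:
            if dst == src:
                continue
            moved = dict(counts)
            moved[src] -= 1
            moved[dst] += 1
            if route_times(routes, moved)[dst] < times[src]:
                return False
    return True


def equilibria(routes, cars=6):
    """All equilibrium distributions of the cars, with their route times."""
    names = list(routes)
    found = []
    for split in product(range(cars + 1), repeat=len(names)):
        if sum(split) == cars:
            counts = dict(zip(names, split))
            if is_equilibrium(routes, counts):
                found.append((counts, route_times(routes, counts)))
    return found


print("without (B,C):", equilibria(ROUTES_OLD))
print("with (B,C):   ", equilibria(ROUTES_NEW))
```

Under these assumptions the cars split 3 and 3 without the new road, with a travel time of 34 for everyone. With the new road the only stable distribution puts 2 cars on each of the three routes, and every trip takes 35: the extra road makes all drivers slightly worse off, a pattern known as Braess's paradox.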