Strategic Negotiation and Cooperation Among Autonomous Agents

Slides: 118

Provided by: ValuedGate1519

1
Automated Negotiation
Sarit Kraus, Bar-Ilan, Israel and UMD, USA
2
Plan of the course

Introduction
Rules of Encounters
Strategic Negotiation
Auctions
protocols
strategies
Argumentation

3
Machines Controlling and Sharing Resources

Electrical grids (load balancing)
Telecommunications networks (routing)
PDAs (schedulers)
Shared databases (intelligent access)
Traffic control (coordination)

4
Broad Working Assumption

Designers (from different companies, countries,
etc.) come together to agree on standards for how
their automated agents will interact (in a given
domain)
Discuss various possibilities and their
tradeoffs, and agree on protocols, strategies,
and social laws to be implemented in their
machines

5
Attributes of Standards

Efficient: Pareto optimal
Stable: no incentive to deviate
Simple: low computational and communication cost
Distributed: no central decision-maker
Symmetric: agents play equivalent roles

Designing protocols for specific classes of
domains that satisfy some or all of these
attributes
6
Distributed Artificial Intelligence (DAI)

Distributed Problem Solving (DPS): centrally
designed systems, built-in cooperation, have a
global problem to solve
Multi-Agent Systems (MAS): group of
utility-maximizing heterogeneous agents
co-existing in the same environment, possibly
competitive

7
Phone Call Competition Example

Customer wishes to place long-distance call
Carriers simultaneously bid, sending proposed
prices
Phone automatically chooses the carrier
(dynamically)

(Figure: three carriers, ATT, Sprint, and MCI; the bids shown are 0.20, 0.23, and 0.18.)
8
Best Bid Wins

Phone chooses carrier with lowest bid
Carrier gets amount that it bid

(Figure: three carriers, MCI, Sprint, and ATT; the bids shown are 0.20, 0.23, and 0.18.)
9
Attributes of the Mechanism

Distributed
Symmetric
Stable
Simple
Efficient

Carriers have an incentive to invest effort in
strategic behavior
(Figure: three carriers, ATT, MCI, and Sprint; the bids shown are 0.20, 0.23, and 0.18.)
10
Best Bid Wins, Gets Second Price

Phone chooses carrier with lowest bid
Carrier gets amount of second-best price

(Figure: three carriers, MCI, Sprint, and ATT; the bids shown are 0.20, 0.23, and 0.18.)
11
Attributes of the Mechanism

Distributed
Symmetric
Stable
Simple
Efficient

Carriers have no incentive to invest effort in
strategic behavior
(Figure: three carriers, ATT, MCI, and Sprint; the bids shown are 0.20, 0.23, and 0.18.)
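The two payment rules on the slides above can be compared with a small sketch. This is only an illustration: the function name `run_auction` is hypothetical, and the pairing of carriers with bids is assumed for the example rather than taken from the slides.

```python
def run_auction(bids, second_price=False):
    """Lowest bid wins; return (winner, payment).

    bids maps a carrier name to the price it bids.
    With second_price=True the winner is paid the second-lowest bid.
    """
    ranked = sorted(bids.items(), key=lambda item: item[1])
    winner, lowest = ranked[0]
    if second_price and len(ranked) > 1:
        payment = ranked[1][1]   # second-best (second-lowest) price
    else:
        payment = lowest         # winner gets the amount it bid
    return winner, payment


# Assumed pairing of carriers and bids, for illustration only.
bids = {"ATT": 0.20, "Sprint": 0.23, "MCI": 0.18}
first_price_result = run_auction(bids)
second_price_result = run_auction(bids, second_price=True)
```

Under the second-price rule the winner's payment does not depend on its own bid, which is the usual intuition for why there is no incentive to bid strategically.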
12
Database Domain
Common Database
TOD
13
Negotiation
A discussion in which interested parties
exchange information and come to an agreement.
Davis and Smith, 1977

Two-way exchange of information
Each party evaluates information from its own
perspective
Final agreement is reached by mutual selection

14
Game Theory--Short Introduction

Game theory is the study of decision making in
multi-person situations where the outcome depends
on everyone's choice.
In decision theory and the theory of competitive
equilibrium from economics, the other participants'
actions are treated as an environmental
parameter. The effect of the decision-maker's
actions on the other participants is not taken
into consideration.

15
Describing a Game

Essential elements: players, actions,
information, strategies, payoffs, outcome, and
equilibria.
Ways to present social interactions as a game:
Extensive form: the most complete description.
Strategic form: many details are omitted.
Coalitional form: binding agreements exist.

16
Example of a two-player game
(Figure: payoff matrix for two players labelled india and sikh; the actions shown are deal, op, and blow.)
17
Nash Equilibrium

An action profile is an ordered set $a = (a_1, \ldots, a_N)$ of
one action for each of the N players in the game.
An action profile a is a Nash Equilibrium (Nash
53) of a strategic game if no agent j has
a different action yielding an outcome
that it prefers to the one generated when it chooses
$a_j$, given that every other player i chooses $a_i$.
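The definition can be checked mechanically for small games. The sketch below enumerates the pure-strategy Nash equilibria of a strategic game by testing every unilateral deviation. The function name and the prisoner's-dilemma payoffs are hypothetical, chosen only to illustrate the definition, and the payoff table is assumed to cover every combination of actions.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """payoffs maps an action profile (a tuple with one action per player)
    to a tuple of payoffs (one per player). Returns all profiles from which
    no single player can gain by changing only its own action."""
    n = len(next(iter(payoffs)))
    actions = [sorted({profile[i] for profile in payoffs}) for i in range(n)]
    equilibria = []
    for profile in product(*actions):
        stable = True
        for j in range(n):
            for alternative in actions[j]:
                deviation = profile[:j] + (alternative,) + profile[j + 1:]
                if payoffs[deviation][j] > payoffs[profile][j]:
                    stable = False  # player j prefers to deviate
        if stable:
            equilibria.append(profile)
    return equilibria


# Hypothetical prisoner's-dilemma payoffs (not from the slides).
dilemma = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
equilibria = pure_nash_equilibria(dilemma)
```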

18
(Figure: game tree with a chance node c that branches with probabilities 0.4 and 0.6; the players Ind and sik choose among the actions op, blow, yes, and dealH, and the leaves carry payoff pairs.)
19
Rules of Encounter
Jeffrey S. Rosenschein and Gilad Zlotkin
20
Domain Theory

Task Oriented Domains
Agents have tasks to achieve
Task redistribution
State Oriented Domains
Goals specify acceptable final states
Side effects
Joint plan and schedules
Worth Oriented Domains
Function rating states' acceptability
Joint plan, schedules, and goal relaxation

21
Postmen Domain
TOD
(Figure: a graph with the Post Office and delivery nodes a to f.)
22
Database Domain
Common Database
TOD
23
Fax Domain
TOD
(Figure: nodes a to f, each with faxes to send.)
Cost is only to establish connection
24
Slotted Blocks World
SOD
(Figure: two configurations of blocks 1, 2, and 3 in slots.)
25
The Multi-Agent Tileworld
WOD
(Figure: a grid with agents A and B, tiles, obstacles, and holes labelled with numeric values.)
26
Task Oriented Domain (TOD)

A tuple $\langle T, A, c \rangle$ where
T is the set of all possible tasks
$A = \{A_1, \ldots, A_n\}$ is a list of agents
c is a monotonic function $c : 2^T \to \mathbb{R}^+$

An encounter is a list $T_1, \ldots, T_n$ of finite sets
of tasks from T such that agent $A_k$ needs to
achieve all the tasks in $T_k$ (also called agent
$A_k$'s goal).
27
Building Blocks

Domain
A precise definition of what a goal is
Agent operations
Negotiation Protocol
A definition of a deal
A definition of utility
A definition of the conflict deal
Negotiation Strategy
In Equilibrium
Incentive-compatible

28
Deal and Utility in two-agent TOD

Deal $\delta$ is a pair $(D_1, D_2)$ such that $D_1 \cup D_2 = T_1 \cup T_2$
Conflict deal: $\Theta = (T_1, T_2)$
$Utility_i(\delta) = Cost(T_i) - Cost(D_i)$

29
Negotiation Protocols

Agents use a product-maximizing negotiation
protocol (as in Nash bargaining theory)
It should be a symmetric PMM (product maximizing
mechanism)
Examples: 1-step protocol, monotonic concession
protocol
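As a rough illustration of deals, utilities, and product maximization in a two-agent TOD, the sketch below enumerates every pure deal and keeps those that maximize the product of the two utilities. The cost function (one unit per distinct task) and all function names are assumptions made for the example, not part of the original material.

```python
from itertools import product


def cost(tasks):
    # Hypothetical cost function: one unit per distinct task.
    return len(set(tasks))


def utility(own_tasks, assigned_tasks):
    # Utility_i = Cost(T_i) - Cost(D_i)
    return cost(own_tasks) - cost(assigned_tasks)


def pure_deals(t1, t2):
    """Yield every pair (D1, D2) with D1 | D2 == T1 | T2."""
    tasks = sorted(t1 | t2)
    # For each task: 1 = agent 1 does it, 2 = agent 2 does it, 3 = both do it.
    for owners in product((1, 2, 3), repeat=len(tasks)):
        d1 = frozenset(t for t, o in zip(tasks, owners) if o in (1, 3))
        d2 = frozenset(t for t, o in zip(tasks, owners) if o in (2, 3))
        yield d1, d2


def product_maximizing_deals(t1, t2):
    """Return (deals, value): the deals that are no worse than the conflict
    deal for both agents and maximize the product of the utilities."""
    best, best_value = [], None
    for d1, d2 in pure_deals(t1, t2):
        u1, u2 = utility(t1, d1), utility(t2, d2)
        if u1 < 0 or u2 < 0:
            continue  # someone would be worse off than under the conflict deal
        value = u1 * u2
        if best_value is None or value > best_value:
            best, best_value = [(d1, d2)], value
        elif value == best_value:
            best.append((d1, d2))
    return best, best_value


t1, t2 = frozenset("abc"), frozenset("bcd")
deals, value = product_maximizing_deals(t1, t2)
```

With these hypothetical task sets every maximizing deal splits the four tasks evenly, two per agent; a real protocol would still need a tie-breaking rule to select one of them.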

30
Building Blocks

Domain
A precise definition of what a goal is
Agent operations
Negotiation Protocol
A definition of a deal
A definition of utility
A definition of the conflict deal
Negotiation Strategy
In Equilibrium
Incentive-compatible

31
Negotiation with Incomplete Information
(Figure: Post Office graph with delivery nodes a to h and agents 1 and 2.)

What if the agents don't know each other's
letters?
32
1 Phase Game Broadcast Tasks
(Figure: Post Office graph with delivery nodes a to h and agents 1 and 2.)

Agents will flip a coin to decide who delivers
all the letters.
33
Hiding Letters
(Figure: Post Office graph with delivery nodes a to h; one letter of agent 1 is marked as hidden.)

They then agree that agent 2 delivers to f and e.
34
Another Possibility for Deception
(Figure: Post Office with delivery nodes a, b, and c; letters of agents 1 and 2 are marked on the graph.)

They will agree to flip a coin to decide who goes
to b and who goes to c.
35
Phantom Letter
(Figure: Post Office with delivery nodes a, b, c, and d; agents 1 and 2 have letters to b and c, and agent 1 declares a phantom letter to d.)

They agree that agent 1 goes to c.
36
Negotiation over Mixed Deals

Mixed deal $(D_1, D_2) : p$
The agents will perform $(D_1, D_2)$ with probability
p, and the symmetric deal $(D_2, D_1)$ with
probability $1 - p$

Theorem: With mixed deals, agents can always
agree on the all-or-nothing deal
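For an all-or-nothing mixed deal, the probability p can be worked out directly. The sketch below assumes that agent 1 performs every task with probability p and agent 2 does so with probability 1 - p, that the cost of doing nothing is 0, and that the agents pick the p that maximizes the product of expected utilities. Because the two expected utilities always sum to c(T1) + c(T2) - c(T1 ∪ T2), the product is largest when they are equal. The function name and the example costs are hypothetical.

```python
from fractions import Fraction


def all_or_nothing_split(cost_t1, cost_t2, cost_all):
    """Return (p, u1, u2) for the product-maximizing all-or-nothing deal.

    cost_t1, cost_t2: each agent's cost of doing its own tasks alone.
    cost_all: cost of doing all tasks of both agents (assumed > 0).
    Expected utilities: u1 = cost_t1 - p * cost_all,
                        u2 = cost_t2 - (1 - p) * cost_all.
    """
    # Setting u1 == u2 and solving for p:
    p = Fraction(cost_t1 - cost_t2 + cost_all, 2 * cost_all)
    u1 = cost_t1 - p * cost_all
    u2 = cost_t2 - (1 - p) * cost_all
    return p, u1, u2


# Hypothetical costs, for illustration only.
p, u1, u2 = all_or_nothing_split(cost_t1=4, cost_t2=2, cost_all=4)
```

The agent whose stand-alone cost is higher takes on the larger share of the risk, so that both end up with the same expected gain.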
37
Hiding Letters with Mixed All-or-Nothing Deals
(Figure: Post Office graph with delivery nodes a to h; one letter of agent 1 is marked as hidden.)

They will agree on the mixed deal where agent 1
has a 3/8 chance of delivering to f and e.
38
Phantom Letters with Mixed Deals
(Figure: Post Office with delivery nodes a, b, c, and d; agents 1 and 2 have letters to b and c, and agent 1 declares a phantom letter to d.)

They will agree on the mixed deal where A has a 3/4
chance of delivering all letters, lowering his
expected utility.
39
Sub-Additive TODs

TOD $\langle T, A, c \rangle$ is sub-additive if for all finite
sets of tasks X, Y in T we have
$c(X \cup Y) \le c(X) + c(Y)$
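For a small task set, sub-additivity can be verified by brute force over all pairs of subsets. The sketch below is only an illustration: both cost functions are hypothetical, chosen so that one satisfies the inequality and the other does not.

```python
from itertools import combinations


def subsets(tasks):
    """Yield every subset of tasks as a frozenset."""
    tasks = list(tasks)
    for size in range(len(tasks) + 1):
        for combo in combinations(tasks, size):
            yield frozenset(combo)


def is_sub_additive(tasks, cost):
    """Check c(X | Y) <= c(X) + c(Y) for all subsets X, Y of tasks."""
    all_subsets = list(subsets(tasks))
    return all(cost(x | y) <= cost(x) + cost(y)
               for x in all_subsets for y in all_subsets)


def round_trip_cost(tasks):
    # Hypothetical: a fixed overhead of 2 for leaving at all, plus 1 per task.
    return 0 if not tasks else 2 + len(tasks)


def squared_cost(tasks):
    # Hypothetical: cost grows with the square of the number of tasks.
    return len(tasks) ** 2


tasks = {"a", "b", "c"}
fixed_overhead_ok = is_sub_additive(tasks, round_trip_cost)
squared_ok = is_sub_additive(tasks, squared_cost)
```

The check is exponential in the number of tasks, so it is useful only for sanity-checking a cost model on toy instances.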

40
Sub-Additivity
(Figure: task sets X and Y.)
$c(X \cup Y) \le c(X) + c(Y)$
41
Sub-Additive TODs

The Postmen Domain, Database Domain, and Fax
Domain are sub-additive.

The Delivery Domain (where postmen don't have
to return to the Post Office) is not sub-additive.
42
Incentive Compatible Mechanisms
(Figure: the hidden-letter and phantom-letter Post Office examples.)

Sub-Additive   Hidden   Phantom
Pure           L        L
A/N            T        T/P
Mix            L        T/P

Theorem: For all encounters in all sub-additive
TODs, when using a PMM over all-or-nothing deals,
no agent has an incentive to hide a task.
43
Decoy Tasks
Decoy tasks, however, can be beneficial even with
all-or-nothing deals

Sub-Additive   Hidden   Phantom   Decoy
Pure           L        L         L
A/N            T        T/P       L
Mix            L        T/P       L
44
Concave TODs

TOD $\langle T, A, c \rangle$ is concave if for all finite sets
of tasks Y and Z in T, and $X \subseteq Y$, we have
$c(Y \cup Z) - c(Y) \le c(X \cup Z) - c(X)$

Concavity implies sub-additivity.
45
Concavity
(Figure: task sets X, Y, and Z, with X contained in Y.)

The cost Z adds to X is at least as large as the cost it
adds to Y. (Z - X is a superset of Z - Y.)

46
Concave TODs

The Database Domain and Fax Domain are concave
(not the Postmen Domain, unless restricted to
trees).

(Figure: a Postmen Domain graph with the task sets X and Z marked.)
This example is not concave: Z adds 0 to X, but
adds 2 to its superset Y (all blue nodes).
47
Three-Dimensional Incentive Compatible Mechanism Table
Theorem: For all encounters in all concave TODs,
when using a PMM over all-or-nothing deals, no
agent has any incentive to lie.

Concave        Hidden   Phantom   Decoy
Pure           L        L         L
A/N            T        T         T
Mix            L        T         T

Sub-Additive   Hidden   Phantom   Decoy
Pure           L        L         L
A/N            T        T/P       L
Mix            L        T/P       L
48
Modular TODs

TOD $\langle T, A, c \rangle$ is modular if for all finite sets
of tasks X, Y in T we have
$c(X \cup Y) = c(X) + c(Y) - c(X \cap Y)$

Modularity implies concavity.
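Concavity and modularity can be tested by brute force on a small task set in the same spirit as the definitions. In the sketch below both cost functions are hypothetical: `connection_cost` charges one unit per distinct task and is loosely modelled on the Fax Domain idea that cost is only to establish a connection, while `round_trip_cost` adds a fixed overhead.

```python
from itertools import combinations


def subsets(tasks):
    """Yield every subset of tasks as a frozenset."""
    tasks = list(tasks)
    for size in range(len(tasks) + 1):
        for combo in combinations(tasks, size):
            yield frozenset(combo)


def is_concave(tasks, cost):
    """Check c(Y | Z) - c(Y) <= c(X | Z) - c(X) whenever X is a subset of Y."""
    s = list(subsets(tasks))
    return all(cost(y | z) - cost(y) <= cost(x | z) - cost(x)
               for y in s for z in s for x in s if x <= y)


def is_modular(tasks, cost):
    """Check c(X | Y) == c(X) + c(Y) - c(X & Y) for all subsets X, Y."""
    s = list(subsets(tasks))
    return all(cost(x | y) == cost(x) + cost(y) - cost(x & y)
               for x in s for y in s)


def connection_cost(tasks):
    # Hypothetical: one unit per distinct destination.
    return len(tasks)


def round_trip_cost(tasks):
    # Hypothetical: fixed overhead of 2 plus one unit per task.
    return 0 if not tasks else 2 + len(tasks)


tasks = {"a", "b", "c"}
results = {
    "connection": (is_concave(tasks, connection_cost), is_modular(tasks, connection_cost)),
    "round_trip": (is_concave(tasks, round_trip_cost), is_modular(tasks, round_trip_cost)),
}
```

The second cost function is concave but not modular, which is consistent with modularity being the stronger property.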
49
Modularity
(Figure: task sets X and Y.)

$c(X \cup Y) = c(X) + c(Y) - c(X \cap Y)$

50
Modular TODs

The Fax Domain is modular (not the Database
Domain nor the Postmen Domain, unless restricted
to a star topology).

Even in modular TODs, hiding tasks can be
beneficial in general mixed deals.
51
Three-Dimensional Incentive Compatible Mechanism Table
(H = Hidden, P = Phantom, D = Decoy)

Modular        H        P         D
Pure           L        T         T
A/N            T        T         T
Mix            L        T         T

Concave        H        P         D
Pure           L        L         L
A/N            T        T         T
Mix            L        T         T

Sub-Additive   H        P         D
Pure           L        L         L
A/N            T        T/P       L
Mix            L        T/P       L
52
Related Work

Coalition formation: Shehory, Sandholm
Mechanism design: Ephrati, Kraus, Tennenholtz
Other models of negotiation: Sycara, Durfee,
Lesser, Gasser, Gmytrasiewicz, Jennings
Consensus mechanisms, voting techniques, economic
models: Ephrati, Wellman, Sandholm

53
Conclusions

By appropriately adjusting the rules of encounter
by which agents must interact, we can influence
the private strategies that designers build into
their machines
The interaction mechanism should ensure the
efficiency of multi-agent systems

Rules of Encounter
Efficiency
54
Conclusions

To maintain efficiency over time of dynamic
multi-agent systems, the rules must also be
stable
The use of formal tools enables the design of
efficient and stable mechanisms, and the precise
characterization of their properties

Stability
Formal Tools
55
Strategic Negotiation

Collaborators: Jon Wilkenfeld, Rina
Schwartz-Azoulay, Orna Shechter, Esti Freitsis

56
DAI Overview

(Diagram: AI contains DAI; DAI divides into DPS and MA; strategic negotiation is placed under MA.)

57
Strategic Negotiation Model

A model of alternating offers (Rubinstein) that
takes negotiation time into consideration and
reduces negotiation time.
During the strategic negotiation, agents
communicate their respective desires to reach a
mutually beneficial agreement.
The model provides a unified approach to many problems.

58
Structure of the Negotiation

There are N self-motivated agents, randomly
designated 1, 2, ..., N.
All the agents negotiate to reach an agreement.
The negotiation process may include several
equidistant iterations $0, 1, 2, \ldots \in Time$ and can
continue forever. In each time period t, agent
$j(t) = t \bmod N$ makes an offer.

59
Structure of the Negotiation - cont.

The other agents respond simultaneously: Yes,
No, or Opt (opt out).
If the offer is accepted by all the agents, the
last offer is implemented.
If at least one agent opts out, a conflict
occurs.
Otherwise (the offer was rejected by at least
one agent), the negotiation proceeds to period
t+1.
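The structure of the negotiation described on these two slides can be sketched as a simple loop. Everything named here is hypothetical: `make_offer` and `respond` stand in for the agents' strategies, agents are indexed from 0 so that the proposer in period t is t mod N, and `max_periods` is an added cut-off so that the sketch terminates (the model itself allows negotiation to continue forever).

```python
YES, NO, OPT = "yes", "no", "opt"


def negotiate(n_agents, make_offer, respond, max_periods=100):
    """Run the alternating-offers protocol.

    make_offer(proposer, t) returns the offer made in period t.
    respond(agent, offer, t) returns YES, NO, or OPT.
    Returns (outcome, offer, period).
    """
    for t in range(max_periods):
        proposer = t % n_agents  # j(t) = t mod N
        offer = make_offer(proposer, t)
        # Responses are collected without showing any agent the others' answers.
        responses = [respond(agent, offer, t)
                     for agent in range(n_agents) if agent != proposer]
        if OPT in responses:
            return ("conflict", None, t)
        if all(r == YES for r in responses):
            return ("agreement", offer, t)
        # Otherwise at least one agent rejected: proceed to period t + 1.
    return ("no agreement", None, max_periods)


# Hypothetical strategies: everyone rejects until period 2, then accepts.
patient = negotiate(
    3,
    make_offer=lambda proposer, t: ("offer", proposer, t),
    respond=lambda agent, offer, t: YES if t >= 2 else NO,
)

# Hypothetical strategies: agent 0 opts out in period 1.
breakdown = negotiate(
    3,
    make_offer=lambda proposer, t: ("offer", proposer, t),
    respond=lambda agent, offer, t: OPT if (agent == 0 and t == 1) else NO,
)
```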

60
Applications

Information servers (large databases).
Resources sharing.
Tasks distribution.
Computer assisted negotiation.
Union/management negotiation.

61
Negotiation on data allocation in multi-server
environment
62
Environment Description

There are several information servers. Each
server is located at a different geographical
area.
Each server receives queries from the clients in
its area, and sends documents as responses to
queries. These documents can be stored locally,
or in another server.

63
Environment Description
(Figure: a client in area i sends a query to server i and receives the document/s; server i forwards the query to server j in area j, at some distance, which returns the document/s.)
64
Environment Description - cont.

The information is clustered in datasets
(corresponding to file, fragment, etc.)
Each new dataset has to be allocated to one of
the servers by mutual agreement among the
servers.
Each server wants to store the datasets in a
location which reduces its communication and
storage costs.
A negotiation session is initiated when a set of
new datasets arrive.

65
Motivation

Cooperation among servers with similar areas of
interest (e.g., Web servers).
The Data and Information System component of the
Earth Observing System (EOSDIS) of NASA: a
distributed knowledge system which supports
archival and distribution of data at multiple and
independent servers.

66
Motivation - cont.

Each data collection, or file, is called a
dataset. The datasets are huge, so each dataset
has only one copy.
The current policy for data allocation in NASA is
static: old datasets are not reallocated, and each
new dataset is located at the server with the
nearest topics (defined according to the topics
of the datasets stored by this server).

67
Related Work - File Allocation Problem

The original problem: how to distribute files
among computers in order to optimize the system
performance.
Our problem: how can self-motivated servers
decide about the distribution of files, when each
server has its own objectives.

68
Basic Definitions

SERVERS: the set of the servers.
DATASETS: the set of datasets (files) to be
allocated.
Allocation: a mapping of each dataset to one of
the servers. The set of all possible allocations
is denoted by Allocs.
U: the utility function of each server.

69
The Conflict Allocation

If at least one server opts out of the
negotiation, then the conflict allocation
conflict_alloc is implemented.
We consider the conflict allocation to be the
static allocation (each dataset is stored in the
server with the closest topics).

70
Utility Function

$U_{server}(alloc, t)$ specifies the utility of server
from $alloc \in Allocs$ at time t.
It consists of:
The utility from the assignment of each dataset.
The cost of negotiation delay.
$U_{server}(alloc, 0) = \sum_{x \in DATASETS} V_{server}(x, alloc(x))$.

71
Parameters of utility

query price: payment for retrieved documents.
usage(ds, s): the expected number of documents of
dataset ds requested by clients in the area of
server s.
storage costs, retrieve costs, answer costs.

72
Cost over time

Cost of communication and computation time of the
negotiation.
Loss of unused information: new documents cannot
be used until the negotiation ends.
Dataset usage and storage cost are assumed to
decrease over time, with the same discount ratio
(p-1).
Thus, there is a constant discount ratio of the
utility from an allocation:
$U_{server}(alloc, t) = \delta^t U_{server}(alloc, 0) - tC$.
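The two utility formulas can be combined in a short sketch: the utility at time 0 sums a per-dataset value over the allocation, and the utility at time t discounts that sum and subtracts a per-period negotiation cost. The value table, the discount factor 0.5, and the cost of 1 per period are hypothetical numbers chosen to keep the arithmetic exact.

```python
def utility_at_zero(server, alloc, value):
    """U_server(alloc, 0): sum over datasets x of V_server(x, alloc(x)).

    alloc maps each dataset to the server that stores it.
    value(server, dataset, location) is V_server(dataset, location).
    """
    return sum(value(server, dataset, location)
               for dataset, location in alloc.items())


def utility(server, alloc, t, value, delta, cost_per_period):
    """U_server(alloc, t) = delta**t * U_server(alloc, 0) - t * C."""
    return delta ** t * utility_at_zero(server, alloc, value) - t * cost_per_period


# Hypothetical values of each (dataset, location) pair for server "s1".
VALUES = {
    ("s1", "d1", "s1"): 10, ("s1", "d1", "s2"): 4,
    ("s1", "d2", "s1"): 3, ("s1", "d2", "s2"): 6,
}


def table_value(server, dataset, location):
    return VALUES[(server, dataset, location)]


alloc = {"d1": "s1", "d2": "s2"}
u_now = utility("s1", alloc, 0, table_value, delta=0.5, cost_per_period=1)
u_later = utility("s1", alloc, 2, table_value, delta=0.5, cost_per_period=1)
```

Delay hurts twice in this form: the allocation's value shrinks by the factor delta each period, and the fixed negotiation cost accumulates.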

73
Assumptions

Each server prefers any agreement over
continuation of the negotiation indefinitely.
The utility of each server from the conflict
allocation is always greater than or equal to 0.
OFFERS: the set of allocations that are
preferred by all the agents over opting out.

74
Equilibrium

Nash equilibrium: a strategy profile p is a Nash
equilibrium if no player i has a different strategy
yielding an outcome that it prefers to the one
generated when it chooses $p_i$, given the other
players' strategies in p.
Subgame perfect equilibrium: a strategy profile
such that the strategy profile it induces in every
subgame is a Nash equilibrium of this subgame.

75
Negotiation Analysis - Simultaneous Responses

Simultaneous responses: a server, when
responding, is not informed of the other
responses.
Theorem: for each offer $x \in OFFERS$, there is a
subgame-perfect equilibrium of the bargaining
game, with the outcome x offered and unanimously
accepted in period 0.

76
Choosing the Allocation

The designers of the servers can agree in advance
on a joint technique for choosing x:
giving each server at least its conflict utility, and
maximizing a social welfare criterion:
the sum of the servers' utilities,
or the generalized Nash product of the servers'
utilities, $\prod_s (U_s(x) - U_s(conflict))$.
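For a tiny instance, the two social welfare criteria can be compared by enumerating every allocation. The sketch below is illustrative only: the function names, the utility table, and the choice of conflict allocation are all hypothetical. It scores the sum over gains relative to the conflict utility, which selects the same allocation as the plain sum of utilities because the conflict utilities are constants.

```python
from itertools import product
from math import prod


def all_allocations(datasets, servers):
    """Yield every mapping of datasets to servers."""
    for placement in product(servers, repeat=len(datasets)):
        yield dict(zip(datasets, placement))


def choose_allocation(datasets, servers, utility, conflict_alloc, criterion):
    """Pick the allocation maximizing 'sum' or 'product' of the servers' gains
    over the conflict allocation, among allocations no server would reject."""
    conflict = {s: utility(s, conflict_alloc) for s in servers}
    best, best_score = None, None
    for alloc in all_allocations(datasets, servers):
        gains = [utility(s, alloc) - conflict[s] for s in servers]
        if any(g < 0 for g in gains):
            continue  # some server would prefer to opt out
        score = sum(gains) if criterion == "sum" else prod(gains)
        if best_score is None or score > best_score:
            best, best_score = alloc, score
    return best


# Hypothetical utilities, keyed by where d1 and d2 are stored.
TABLE = {
    ("s1", "s1"): {"s1": 1, "s2": 1},
    ("s1", "s2"): {"s1": 7, "s2": 2},
    ("s2", "s1"): {"s1": 4, "s2": 4},
    ("s2", "s2"): {"s1": 0, "s2": 9},
}


def table_utility(server, alloc):
    return TABLE[(alloc["d1"], alloc["d2"])][server]


datasets, servers = ["d1", "d2"], ["s1", "s2"]
conflict_alloc = {"d1": "s1", "d2": "s1"}
by_sum = choose_allocation(datasets, servers, table_utility, conflict_alloc, "sum")
by_product = choose_allocation(datasets, servers, table_utility, conflict_alloc, "product")
```

In this made-up instance the sum criterion picks the allocation with the larger total but uneven gains, while the product criterion picks the more balanced one.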

77
Choosing the Allocation - cont.

The problem of finding an optimal allocation is
NP-complete (by a reduction from multiprocessor
scheduling).
When finding x is intractable, we suggest the
following protocol:
each server will search for an allocation;
the allocation which maximizes the predefined
social welfare criterion will be chosen.

78
Search Methods

We have implemented the following algorithms:
A backtracking algorithm: searching the search
space of the allocation problem.
A random restart hill-climbing algorithm: starts
with a random allocation and tries to improve it.
A genetic algorithm: searching by simulating an
evolution process. Each individual represents an
allocation. The algorithm involves reproduction,
crossover and mutation of individuals.
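A minimal sketch of random-restart hill climbing over allocations is shown below. It is not the implementation referred to on the slide: the neighbourhood (moving one dataset to another server), the number of restarts, the seed, and the toy scoring function are all assumptions made for illustration.

```python
import random


def hill_climb(datasets, servers, score, restarts=20, seed=0):
    """Random-restart hill climbing; returns (best allocation, best score).

    score(alloc) is the social welfare criterion to maximize.
    A move reassigns a single dataset to a different server.
    """
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(restarts):
        # Start from a random allocation.
        alloc = {d: rng.choice(servers) for d in datasets}
        current = score(alloc)
        improved = True
        while improved:
            improved = False
            for d in datasets:
                for s in servers:
                    if s == alloc[d]:
                        continue
                    candidate = dict(alloc)
                    candidate[d] = s
                    value = score(candidate)
                    if value > current:  # accept any strictly improving move
                        alloc, current, improved = candidate, value, True
        if current > best_score:
            best, best_score = alloc, current
    return best, best_score


# Toy criterion: count datasets stored on a (hypothetical) preferred server.
PREFERRED = {"d1": "s2", "d2": "s1", "d3": "s3"}


def toy_score(alloc):
    return sum(alloc[d] == PREFERRED[d] for d in alloc)


best_alloc, best_value = hill_climb(["d1", "d2", "d3"], ["s1", "s2", "s3"], toy_score)
```

The toy criterion is separable, so a single climb already reaches the optimum; with interacting datasets the restarts are what give the method a chance of escaping poor local maxima.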

79
Experimental Evaluation

How do the parameters influence the results of
the negotiation?
vcost(alloc): the variable costs due to an
allocation (excludes storage_cost and the gains
due to queries).
vcost_ratio: the ratio of the vcost when using
negotiation to the vcost of the static allocation.

80
Effect of Parameters on The Results

As the number of servers grows, vcost_ratio
increases (more complex computations).
As the number of datasets grows, vcost_ratio
decreases (negotiation is more beneficial).
Changing the mean usage did not influence
vcost_ratio significantly, but vcost_ratio
decreases as the standard deviation of the usage
increases.

81
Influence of Parameters - cont.

When the standard deviation of the distances
between servers increases, vcost_ratio
decreases.
When the distance between servers increases,
vcost_ratio decreases.
In the domains tested:
as answer_cost increases, vcost_ratio increases.
as storage_cost increases, vcost_ratio increases.
as retrieve_cost increases, vcost_ratio decreases.
as query_price increases, vcost_ratio decreases.

82
Social Criteria

We studied the effect of the choice of the social
welfare criterion on the results.
We compare the following criteria:
Sum of the agents' utilities.
Product of the agents' utilities.
Maximizing the sum achieves a lower vcost_ratio.
Maximizing the product achieves a lower dispersion
of the agents' utilities.

83
Incomplete Information

Each server knows
The usage frequency of all datasets, by clients
from its area.
The usage frequency of datasets stored in it, by
all clients.

84
Incomplete Information - cont.

A revelation mechanism
First, all the servers report simultaneously all
their private information
for each dataset, the past usage of the dataset
by this server.
for each server, the past usage of each local
dataset by this server.
Then, the negotiation proceeds as in the complete
information case.

85
Incomplete Information - cont.

Lemma: there is a Nash equilibrium where each
server tells the truth about its past usage of
remote datasets, and the other servers' usage of
its local datasets.
Lies concerning details about local usage of
local datasets are intractable.

86
Summary: negotiation on data allocation

We have considered the data allocation problem in
a distributed environment.
We have presented the utility function of the
servers, which expresses their preferences.
We have proposed using a negotiation protocol for
solving the problem.
For incomplete information situations, a
revelation process was added to the protocol.

87
Negotiations in the pollution sharing problem

Collaborator: Esti Freitsis

88
Environment Description

There are some closely grouped plants in an
industrial region.
Each plant can produce several types of products.
Each plant has a utility function (profit).
There are several types of pollution substances.
Each plant has norms, restricting maximal
emission of each polluting substance that it
emits. The pollution always has to be below these
norms. We refer to the situation when only these
norms have to be carried out as usual
circumstances.

89
Special circumstances

Sometimes there is a need to reduce pollution for
some period because of external factors such as
weather (high humidity, wind towards residential
area). In this case plants receive new norms. We
refer to this situation as special circumstances.

90
Current solution

Current solution: each plant reduces pollution
according to the new norms.
Disadvantage: for one plant it is less costly to
reduce one substance, while for another it is less
costly to reduce another substance.

91
Negotiations

Our solution: plants negotiate to reach
beneficial agreements about which substances'
emissions must be reduced, and by what percentage
each plant must reduce them.
The conflict solution: following the new norms.
We consider complete information situations.

92
Negotiations Protocols

Simultaneous responses: an agent responding to an
offer is not informed of the other responses.
Sequential responses: an agent responding to an
offer is informed of the responses of the
preceding agents (assuming that the agents are
ordered).

93
Negotiations strategies for simultaneous responses

As in the data allocation case:
For each possible agreement x that is better for
all the plants than the conflict solution, there
is a subgame-perfect equilibrium of the
bargaining game, with the outcome x offered and
unanimously accepted in period 0.

94
Negotiations strategies for sequential responses

Assumption: there is a time period T at which
negotiation cannot continue anymore. In T the
conflict allocation is implemented.
Perfect equilibrium by backward induction:
At T-1, if negotiations haven't ended, $A_{T-1}$
suggests the agreement that is best for itself among
those better for all agents than the conflict solution
(denoted by $O_{T-1}$); the other agents accept.
At T-2, $A_{T-2}$ suggests the agreement that is best for
itself among those better for all agents than the
conflict solution and $O_{T-1}$ (denoted by $O_{T-2}$).
The other agents accept.
By induction, at the first time period $A_0$ suggests
$O_0$ and the others accept.
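The backward-induction argument can be sketched over a finite set of candidate agreements. Several simplifications here are assumptions rather than part of the original model: the proposer in period t is agent t mod N, "better" is read as "at least as good" so that ties are accepted, and no time discounting is applied. The agreements and utilities are hypothetical.

```python
def backward_induction(agreements, utilities, conflict, n_agents, horizon):
    """Return the offer O_0 made and accepted in period 0, or None if no
    agreement is at least as good as conflict for every agent.

    utilities[a][i] is agent i's utility from agreement a.
    conflict[i] is agent i's utility from the conflict solution.
    horizon is T, the period in which conflict is implemented.
    """
    # What each agent gets if the next period is reached; at T it is conflict.
    continuation = list(conflict)
    offer = None
    for t in range(horizon - 1, -1, -1):
        proposer = t % n_agents  # assumed proposer order
        acceptable = [
            a for a in agreements
            if all(utilities[a][i] >= max(continuation[i], conflict[i])
                   for i in range(n_agents))
        ]
        if acceptable:
            # O_t: the acceptable agreement the proposer likes best.
            offer = max(acceptable, key=lambda a: utilities[a][proposer])
            continuation = list(utilities[offer])
    return offer


# Hypothetical agreements for two agents.
AGREEMENTS = ["x", "y", "z"]
UTILITIES = {"x": (5, 1), "y": (3, 3), "z": (1, 5)}
CONFLICT = (0, 0)

o_horizon_1 = backward_induction(AGREEMENTS, UTILITIES, CONFLICT, 2, 1)
o_horizon_2 = backward_induction(AGREEMENTS, UTILITIES, CONFLICT, 2, 2)
```

Without discounting, the agent who proposes in the last period before T effectively dictates the outcome, which is why the result flips between the two horizons in this toy example.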

95
Assumptions about the environment

Profit is a linear function of the number of
items of each product produced by the plant
Pollution is a linear function of the number of
items of each product produced.

96
Techniques which were checked

Strategic negotiations
Sequential responses: backtracking
Simultaneous responses: maximization of the sum
with guarantees of default profit
Simplex method: method for linear optimization
Nash Product
Praxis: method for multi-variable nonlinear
function minimization.
Hill Climbing

97
Simulation Parameters

Number of plants is varied from 5 to 20.
Number of pollution types is varied from 5 to 20.
For each product pollution of some type is
produced with probability 1/2.
Each plant produces Max_prod different types of
products. Max_prod is varied from 5 to 20.
Pollution and profit per item of product and
pollution constraints are set randomly.
Results: average of 25 simulation runs.

98
Plants' utility as a function of the number of
plants
99
Standard Deviation as the function of the number
of plants
100
Computation time as a function of number of plants
101
Plants' utility as a function of the number of
pollution substances
102
Standard deviation as the function of the number
of pollution substances
103
Computation time as a function of the number of
pollution substances
104
Plants' utility as a function of the number of
products
105
Standard deviation as a function of the number
of products
106
Computation time as the function of the number of
products
107
Computation time as a function of the number of
products
108
Conclusions

Maximizing the sum yields the highest average
utility, but also the highest standard deviation;
it requires agreement between the designers on
selecting a solution.
Backward induction yields a reasonable average
utility with low standard deviations and no need
for the designers' agreement on a detailed protocol.
Ongoing work: incomplete information.

109
Sharing Resources Through Negotiation

Joint resource: public communication system,
satellite
Agents: self-motivated.
Environment: no central controller.

110
Environment Description

Two agents must share a joint resource; the
resource can only be used by one agent at a time.
No central controller.
One agent (A) is using the resource, and the
second (W) wants to use it too.
The agents negotiate to reach an agreement: a
schedule that divides the usage of the resource,
<s, t>.

111
Environment Description -cont

A continues to use the resource as the
negotiation proceeds: A gains over time.
W is not able to use the resource: W loses over
time.
Opting out causes damage to the resource: both
agents wait q time steps.
Additional option: an agent can leave the
negotiation.

112
Applying the strategic model

We developed a detailed utility function for the
agents (U_A, U_W). Parameters: type of goal,
deadlines, costs of negotiation, gains from
goal, etc.
Main factor in the negotiation: the best
agreement for A which is still better for W than
opting out (O_n).

113
Perfect equilibrium strategies

O_n depends on the specific situation; we proved
lemmas which specify the value of O_n as a
function of the utility function parameters.
Complete information: negotiation ends after at
most one step with an agreement, or W leaves.
The strategies are simple.

114
Experiments Using MINUET
Agent 2
Agent 1
Send request <5,3>
Working on goal 102

Receive request <5,3>
Resources
1001 - free 1002 - busy
115
Experiments Results
Nego.
EDF
Metric
Utility score
91
91
Abandon goals
9.6
8.4
21.2
Nego./Alter.
15.5
116
Summary

A strategic model of negotiation, taking the
passage of time into account.
We consider a wide range of situations: complete
or incomplete information; N > 2 agents; agents lose
over time, or some lose and some gain over time