Title: Intelligent Agents: Agent Rationality and Auctions
1 Intelligent Agents: Agent Rationality and Auctions
- Katia Sycara
- The Robotics Institute
- email katia_at_cs.cmu.edu
- www.cs.cmu.edu/softagents
2 The Electronic Society
- A society of electronic autonomous agents
- The agents have means to communicate with each other
- The agents do not necessarily know one another. Means for finding others exist (may be costly)
- Agents may be heterogeneous in terms of
  - design philosophy
  - expertise/capabilities
  - capacity/access to resources
  - intelligence: the algorithms used for problem solving
  - as a result, different performance: efficiency, quality
3 Agents' Attitudes
- Self-interest: a self-interested agent attempts to maximize its own personal payoff
- Benevolence/altruism: a benevolent agent attempts to increase others' payoffs and the cumulative payoff of the society
- Cooperation: an agent is considered cooperative when it performs tasks on behalf of other agents (possibly for payment)
- Non-cooperative: exhibits game-theoretic strategic behavior
4 Rationality
- Rational behavior is such that an agent prefers a greater payoff over a smaller one
- A rational agent should always behave rationally. That is, from among several options, it should select the one that results in maximum payoff
- The problem
  - in many cases the number of options is overwhelming
  - there may be no algorithm for finding the best option
5 Bounded Rationality
- To overcome problems of rationality, bounded rationality
  - limits the time/computation for option consideration
  - prunes the search space
  - imposes restrictions on the types of options
- Results in fewer possibilities, hence
  - computationally feasible
  - may be too restrictive, far from optimal
  - strategically inferior to rational
6 Good Enough Behavior
- Make the bounded rationality rational
  - modify linear payoff functions to incorporate computational costs
  - put a cap on payoff
  - add a small-amounts indifference
- The payoff of an option is good enough if
  - too much additional computation is needed to find other good options, or
  - other options do not provide a significant payoff increase, or
  - the agent is indifferent w.r.t. the increase
7 What are Protocols?
- A protocol (aka mechanism)
  - provides a set of rules and behaviors to be followed by agents who participate in it
  - following the rules of a protocol is at an agent's discretion, though deviation may leave it out of the game
  - examples: auctions, negotiation, voting
- Desired properties
  - maximize payoffs
  - not manipulable/enforceable
  - simple to implement and execute
8 What are Strategies?
- A strategy
  - is one of the possible actions an agent can select given the protocol
  - is not dictated (or provided) by the protocol
  - is usually the result of the agent's reasoning and decisions, based on local algorithms and info.
  - examples: bid lowest possible, vote for your faction
- A good strategy
  - should maximize the agent's payoff given the protocol and the behavior of other agents
  - should be difficult or impossible to manipulate
  - should be computationally feasible
  - may depend on the strategies of other agents
9 Protocol evaluation
- Payoff maximization: can refer to individual payoffs, group payoffs, or social welfare (the sum of individual payoffs)
- Pareto-optimality: a payoff vector p = (x1, x2, ..., xn) is Pareto-optimal if there is no other feasible payoff vector p' such that at least one payoff is better in p' and no payoff is worse in p'
- Stability: a protocol is stable if, once the agents have arrived at a solution, they do not deviate from it
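The Pareto-optimality test above can be sketched in a few lines of Python. This is only an illustration: the outcome list is hypothetical, and the function names are made up for the example.

```python
from typing import List, Sequence, Tuple

def dominates(p_prime: Sequence[float], p: Sequence[float]) -> bool:
    """True if p_prime is no worse than p for every agent and better for at least one."""
    no_worse = all(a >= b for a, b in zip(p_prime, p))
    better = any(a > b for a, b in zip(p_prime, p))
    return no_worse and better

def pareto_optimal(feasible: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the feasible payoff vectors that no other feasible vector dominates."""
    return [p for p in feasible if not any(dominates(q, p) for q in feasible)]

def social_welfare(p: Sequence[float]) -> float:
    """Social welfare as the sum of individual payoffs."""
    return sum(p)

# Hypothetical two-agent outcomes (payoff of agent 1, payoff of agent 2).
outcomes = [(3, 3), (5, 0), (0, 5), (1, 1)]
frontier = pareto_optimal(outcomes)            # (1, 1) is dominated by (3, 3)
best_welfare = max(outcomes, key=social_welfare)
```

Note that several outcomes can be Pareto-optimal at once, while only one of them here maximizes social welfare.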
10 Stability and Equilibria
- There are multiple stability concepts. In game theory, the notion of equilibrium is used
  - dominant strategies: the agents have some strategies that, regardless of what others do, maximize payoff
  - Nash equilibrium: the agents have strategies that, as long as the others stick to theirs, maximize payoff
  - Mixed Nash: the agents each have a set of strategies from among which they select one with some probability
  - Bayes-Nash: adds history to the previous one
11 The Prisoner's Dilemma
12 No Pure Nash Equilibrium
13 Mixed Nash
- Player 1 will cooperate with probability pc and defect with probability pd
- Player 2 will cooperate with probability qc and defect with probability qd
- The expected utility of an agent is the utility from a strategy times the probability of this strategy being selected
- When there are multiple possibilities, the expected utility is a sum over these possibilities
14 Computing the Probabilities
- The expected utility of an agent x when the other agent y follows strategy s is denoted by Ux(s)
- In the case of equilibrium (mixed Nash), the expected utility of x should be the same for all of the possible strategies of y
- In our case we have agents 1, 2 and strategies c, d
- We require that Ux(c) = Ux(d), which means that for each of the two agents, the expected utility from the other cooperating should be equal to the expected utility from the other defecting
15 Computation Details
- For agent 1 we have
  - U1(c) = qc(6 pc + 0 pd), U1(d) = qd(0 pc + 5 pd)
- For agent 2 we have
  - U2(c) = pc(2 qc + 3 qd), U2(d) = pd(1 qc - 0 qd)
- The requirement that Ux(c) = Ux(d) results in
  - qc = 0.642, qd = 0.358
  - pc = 0.317, pd = 0.683
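As a rough illustration of how mixed-strategy probabilities can be computed, here is a minimal Python sketch for a general 2x2 game. It uses the textbook indifference condition (each player mixes so that the other is indifferent between its two actions), which may differ from the formulation on the slide. The payoff matrices are hypothetical (matching pennies), not the ones used in this deck, and the function name is made up for the example.

```python
from fractions import Fraction

def mixed_nash_2x2(A, B):
    """Fully mixed equilibrium of a 2x2 game with integer payoffs.

    A[i][j], B[i][j]: payoffs of player 1 / player 2 when player 1 plays
    row i and player 2 plays column j.
    Returns (p, q): p = probability that player 1 plays row 0,
                    q = probability that player 2 plays column 0.
    Raises ZeroDivisionError or ValueError when no fully mixed equilibrium exists.
    """
    # q makes player 1 indifferent between row 0 and row 1.
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p makes player 2 indifferent between column 0 and column 1.
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[0][1] - B[1][0] + B[1][1])
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("no fully mixed equilibrium")
    return p, q

# Hypothetical game: matching pennies (no pure Nash equilibrium).
A = [[1, -1], [-1, 1]]      # payoffs of player 1
B = [[-1, 1], [1, -1]]      # payoffs of player 2
p, q = mixed_nash_2x2(A, B)

# Player 1's expected utility is the same for both of its actions at q.
u_row0 = q * A[0][0] + (1 - q) * A[0][1]
u_row1 = q * A[1][0] + (1 - q) * A[1][1]
```

Exact fractions are used so that the indifference check is not blurred by floating-point rounding.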
16 Tragedy of Common Goods
- Information on the web is (mostly) free
- Agents that seek up-to-date information may query web sites as frequently as desired
- If all agents do so, the network will be overly congested, and some servers will crash
- So, is it undesirable to behave this way?
- If all (or most) of the agents prevent congestion, it is in the best interest of each individual agent to increase network use ...
17 The Contract Net Protocol
- An agent coordination and distributed task allocation mechanism, where
  - multiple heterogeneous agents can perform tasks
  - agents can play two roles: managers, contractees
  - managers receive tasks, select prospective contractees and ask for bids
  - best bid wins task, performs it, manager monitors
- Pros and cons
  - simple to implement, base for many other protocols
  - fully distributed
  - performance quality not checked
  - easy to manipulate (free riders), may cause loops
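A single announce/bid/award round of the protocol might look like the following Python sketch. It assumes that a bid is a cost and that the "best" bid is the lowest one; the class, function and agent names are hypothetical, and monitoring of task execution is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Contractee:
    name: str
    # Returns the agent's cost for a task, or None if it cannot perform it.
    cost_of: Callable[[str], Optional[float]]

    def bid(self, task: str) -> Optional[float]:
        """Sincere bidding: the bid equals the agent's cost (an assumption)."""
        return self.cost_of(task)

def run_round(task: str, contractees: List[Contractee]) -> Optional[Tuple[str, float]]:
    """Manager side: announce the task, collect bids, award to the lowest bid."""
    bids = {c.name: c.bid(task) for c in contractees}
    bids = {name: b for name, b in bids.items() if b is not None}
    if not bids:
        return None                      # nobody can perform the task
    winner = min(bids, key=bids.get)
    return winner, bids[winner]

# Hypothetical contractees with different capabilities and costs.
agents = [
    Contractee("A", lambda task: 8.0),
    Contractee("B", lambda task: 5.0 if task == "deliver" else None),
    Contractee("C", lambda task: None),
]
award = run_round("deliver", agents)
```

Nothing in this round checks whether a bid reflects the bidder's true cost, which is the manipulation weakness listed among the cons.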
18 Lying
- Optimality analysis of previous contracts assumes that agents are sincere: they reveal their real marginal cost when they bid, and their real tasks
  - but it is beneficial to lie about costs ...
  - and about tasks: hide, phantom, generate
- Immunity to lying
  - pure deals (disjoint tasks): not immune
  - mixed deals (probability distribution): phantom immune
19 Commitment
- A contractee should commit to task performance, and a manager should commit not to terminate the contract, but
  - during execution, the contractee may receive a more profitable task, or the manager a better bid
  - self-interest will result in de-commitment
- To prevent losses: need for enforcement
- De-commitment penalties, leveled commitment
- Hedging against risk: pricing a contract like an option, taking into account the future via probabilities of events
20 Contracts
- Contingency contracts: when the probability of future events is known, similar to mixed Nash
  - but this is, in general, exponentially complex
- Leveled commitment contracts: allow unilateral de-commitment at any time, penalties set in advance
  - self-interested agents may avoid beneficial de-commitment based on the chance that the other party will do so
  - not optimal, but better than non-leveled
- Option pricing
  - optimal (complete knowledge, infinite markets)
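The basic de-commitment decision under a leveled commitment contract can be sketched as below. This is a deliberately myopic illustration: it ignores the strategic waiting mentioned above (holding out in the hope that the other party de-commits first and pays the penalty). All field names and numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Contract:
    payment: float             # what the manager pays the contractee
    cost: float                # contractee's cost of performing the task
    contractee_penalty: float  # paid by the contractee if it de-commits
    manager_penalty: float     # paid by the manager if it de-commits

def contractee_should_decommit(contract: Contract, outside_profit: float) -> bool:
    """De-commit only if the outside task, net of the penalty, beats staying."""
    stay = contract.payment - contract.cost
    leave = outside_profit - contract.contractee_penalty
    return leave > stay

def manager_should_decommit(contract: Contract, better_bid: float) -> bool:
    """De-commit only if the saving from the better bid exceeds the penalty."""
    saving = contract.payment - better_bid
    return saving > contract.manager_penalty

# Hypothetical contract: contractee profit is 50 - 30 = 20 if it stays.
contract = Contract(payment=50, cost=30, contractee_penalty=10, manager_penalty=10)
small_offer = contractee_should_decommit(contract, outside_profit=25)  # 15 vs 20
large_offer = contractee_should_decommit(contract, outside_profit=35)  # 25 vs 20
```

Because the penalties are set in advance, each party can evaluate this comparison on its own whenever a new offer arrives.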
21 Background
- Advantages of contingent contracts
  - (1) the space of possible decisions is enlarged
  - (2) the decision maker's payoff can benefit given these flexibilities
  - (3) the overall payoff (social welfare) could be improved
- For binding contracts, decisions are now or never.
- For contingent contracts, decisions can be deferred to the future, when more information can be obtained.
22 Model
- A call option gives the holder the right to buy the underlying asset by a certain date (expiration date) for a certain price (strike price).
- An American option can be exercised at any time up to the expiration date.
- Six factors affect the price of a stock option: stock price, strike price, time to expiration, volatility, risk-free interest rate and dividends.
- Contingent contracts where an agent can decommit at any time can be viewed as American call options without dividends.
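To make the option view concrete, the sketch below prices an American call without dividends using a Cox-Ross-Rubinstein binomial tree, a standard textbook method. Nothing here is taken from the slides' own model: the method choice, the parameter names and the example numbers are all assumptions. Reading the underlying price as the current task value is one possible mapping to contracts.

```python
import math

def american_call_binomial(s0: float, strike: float, rate: float,
                           sigma: float, maturity: float, steps: int = 200) -> float:
    """Cox-Ross-Rubinstein binomial price of an American call, no dividends.

    s0: current underlying price, strike: strike price, rate: risk-free rate,
    sigma: volatility, maturity: time to expiration in years.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    growth = math.exp(rate * dt)
    prob_up = (growth - down) / (up - down)   # risk-neutral probability
    discount = 1.0 / growth

    # Option values at expiration, indexed by the number of up moves j.
    values = [max(s0 * up**j * down**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Roll back through the tree, allowing early exercise at every node.
    for step in range(steps - 1, -1, -1):
        for j in range(step + 1):
            hold = discount * (prob_up * values[j + 1] + (1 - prob_up) * values[j])
            exercise = s0 * up**j * down**(step - j) - strike
            values[j] = max(hold, exercise)
    return values[0]

# Hypothetical inputs: value 100, strike 100, 5% rate, 20% volatility, 1 year.
price = american_call_binomial(100, 100, 0.05, 0.2, 1.0)
```

Five of the six factors listed above appear as arguments; dividends are omitted because the contracts are modeled as options without dividends.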
23 Simulation and Analysis
- Assumptions for the Simulation
- Only one agent can take the role of manager.
- The manager generates non-decomposable tasks at a predefined frequency and specifies the contract duration and execution time for each task.
- The task value is linearly increasing from 10 to 100 by 10 periodically. Similar to a stock price, the task value is stochastic. Using the Monte Carlo method, the manager simulates the task value at each time period and announces the current task value to all contractors.
24 Simulation and Analysis
- Each contractor has a queue with a capacity of 10 tasks, and it schedules its tasks only according to the latest start time.
- If a contractor breaches a contract during task execution, both manager and contractor can get a partial result.
- The volatility and risk-free interest rate in the option pricing model are fixed for all the experiments.
25 Performance Evaluation
- Performance is evaluated along three dimensions: Throughput, Social Welfare and Negotiation Efficiency.
- Throughput: the total number of tasks executed within the predefined experiment duration.
- Social Welfare: defined as the total payoff of all the agents; used to check global optimality.
- Negotiation Efficiency: measured by the total decision-making time. It represents the total time spent on all the tasks from the moment they are assigned until the moment they are executed or breached.
26 Conclusion
- CCP incorporates option pricing theory, so contracts can be modeled in a very natural way.
- CCP provides a computational framework for the agents to calculate the value of a flexible contract, the payoff and penalty fee, and when to breach.
27 Conclusion (contd)
- Compared to CNP, LCP and CCP are less computationally efficient. CCP provides a more general framework for the agents to evaluate and compute optimal decisions in the face of uncertainty.
- Both LCP and CCP in scenario 2 have a high solution quality, while LCP achieves a good tradeoff between commitment and flexibility.
28 Auctions
- A centralized protocol, includes one auctioneer and multiple bidders
- The auctioneer puts a good up for sale. In some cases, the good may be a combination of other goods, or a good with multiple attributes
- The bidders make offers. This may be repeated several times, depending on the auction type
- The auctioneer determines the winner
29 Auctions: Pros and Cons
- Usually easier to prevent bidder lying
- Simple protocols
- Centralized: a single point of failure
- Multi-attribute: exponentially complex
- Allows collusion behind the scenes
- May favor the auctioneer
30 Auction Types
- Private value: the value of a good to a bidder agent depends only on its private preferences. Assumed to be known exactly
- Common value: the good's value depends entirely on other agents' valuations
- Correlated value: the good's value depends on internal and external valuations
31 Auction Protocols
- English auction (aka first-price open-cry)
  - bidders are free to raise their bids
  - end: no more raises; the winner is the highest bidder, at its bid
  - agent strategy: a series of bids, based on private value, estimates of others' valuations, and their past bids
  - dominant strategy: bid a small amount more than the current highest bid, stop when private value is reached
- For correlated value
  - auctioneer increases the price at a constant or other rate
  - open-exit: allows bidders to quit without re-entry
32 More Protocols
- First-price sealed-bid auction
  - each bidder submits one bid, not knowing the others' bids
  - highest wins, pays its bid
  - agent strategy: a function of private value and beliefs about others' valuations
  - no dominant strategy. The best bid is less than the true value
  - how much less? Nash is computable if the probability distribution of agents' values is known
  - Example: n agents, uniform value distribution, agent i has value vi; there is a Nash equilibrium if each agent i bids vi(n-1)/n
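The bid-shading rule vi(n-1)/n from the example can be tried out with a short Python sketch. The bidder names and values are hypothetical; the values are assumed to be drawn from a uniform distribution, as the example requires.

```python
from typing import Dict, Tuple

def equilibrium_bid(value: float, n_bidders: int) -> float:
    """Nash bid for uniformly distributed private values: v * (n - 1) / n."""
    return value * (n_bidders - 1) / n_bidders

def first_price_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Highest bid wins and the winner pays its own bid."""
    winner = max(bids, key=bids.get)
    return winner, bids[winner]

# Hypothetical private values of three bidders.
values = {"a": 90, "b": 60, "c": 30}
bids = {name: equilibrium_bid(v, len(values)) for name, v in values.items()}
winner, price = first_price_auction(bids)
surplus = values[winner] - price   # what the winner gains by shading its bid
```

With more bidders the factor (n-1)/n approaches 1, so equilibrium bids move closer to the true values.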
33 Yet More Auctions
- Dutch auction (descending)
  - the seller lowers the price until a bidder takes it
  - strategically equivalent to first-price sealed-bid
  - advantage: the auctioneer can accelerate the auction
- All-pay auction
  - each bidder pays its bid to the auctioneer
  - several types of such auctions are used for resource (re-)allocation
34 Vickrey (second-price sealed-bid)
- Each bidder submits one bid, not knowing the others' bids
- The highest bid wins, but the bidder pays the second-highest bid
- Agent strategy: base bid on private value and beliefs about others' values
- Dominant strategy: bid true valuation
  - if it bids more and this increment makes it win, the agent may end up with a loss, since it may pay more than its true value
  - if it bids less, there is a smaller chance of winning (but the winning price is not affected)
- Meaning: bid true value regardless of others
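The dominance argument above can be checked numerically with a small Python sketch. The bidder names and numbers are hypothetical, and ties are resolved against the bidder under study, which is an arbitrary choice for the example.

```python
from typing import Dict, Tuple

def vickrey_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Highest bid wins; the winner pays the second-highest bid (needs >= 2 bids)."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    return winner, bids[runner_up]

def utility(true_value: float, my_bid: float, other_bids: Dict[str, float]) -> float:
    """Payoff of bidder 'me': true value minus price if it wins, otherwise 0."""
    bids = dict(other_bids)
    bids["me"] = my_bid            # added last, so 'me' loses ties
    winner, price = vickrey_auction(bids)
    return true_value - price if winner == "me" else 0

# Hypothetical case: my true value is 10.
truthful = utility(10, 10, {"x": 7, "y": 4})    # win, pay 7
underbid = utility(10, 6, {"x": 7, "y": 4})     # lose
overbid = utility(10, 13, {"x": 12, "y": 4})    # win, but pay 12

# Truthful bidding is never worse than any other bid, whatever the rival bids.
truth_is_dominant = all(
    utility(10, 10, {"x": rival}) >= utility(10, bid, {"x": rival})
    for rival in range(21) for bid in range(21)
)
```

The bid only decides whether the agent wins; the price it pays is set by the others, which is why shading or inflating the bid cannot help.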
35 So, Which Auction is Better?
- Computation: auctions with dominant strategies (Vickrey and English) are more efficient: no need to speculate regarding other bidders
- Auctioneer's revenue
  - second-price is less than the true price, however first-price bidders under-bid. Which effect is stronger?
  - for risk-neutral bidders with private independent values, the effects are equivalent
  - for risk-averse bidders, Dutch and first-price sealed-bid auctions maximize the auctioneer's revenue
36 Real Auctions
- In real auctions, values are not private
- As a result, for 3 or more bidders, English auctions provide higher auctioneer revenue than Vickrey does
- Explanation: when it observes other bidders increasing their bids, a bidder increases its own valuation of the good
- Both English and Vickrey are better for the auctioneer than Dutch and first-price sealed-bid
37 Collusion
- Bidders can coordinate their bids to lower them
- In English and Vickrey auctions, collusion is a dominant strategy!
- Example
  - the values of the good to agents a, b, c are 10, 10, 12, respectively
  - they can agree to bid 5, 5, 6 respectively
  - if one defects, all observe that, and can increase to real value, so there is no benefit from defection
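The example can be replayed with a toy English auction in Python. The bidding loop is a simplification made up for this sketch: bidders take turns in a fixed order, raise by a fixed increment of 1, and stop at a personal cap. Only the values 10, 10, 12 and the agreed caps 5, 5, 6 come from the example.

```python
from typing import Dict, Optional, Tuple

def english_auction(caps: Dict[str, int], increment: int = 1,
                    start: int = 0) -> Tuple[Optional[str], int]:
    """Ascending auction: each bidder raises until the price would exceed its cap."""
    price, leader = start, None
    while True:
        raised = False
        for name, cap in caps.items():
            # The current leader does not outbid itself.
            if name != leader and price + increment <= cap:
                price += increment
                leader = name
                raised = True
        if not raised:
            return leader, price

true_values = {"a": 10, "b": 10, "c": 12}
agreed_caps = {"a": 5, "b": 5, "c": 6}

sincere = english_auction(true_values)     # everyone bids up to its true value
collusive = english_auction(agreed_caps)   # everyone stops at the agreed cap

# If a bidder defects, the others see the extra raise and revert to their true
# values, so the outcome falls back to the sincere one.
after_defection = english_auction(true_values)
gain_for_c = sincere[1] - collusive[1]     # price reduction enjoyed by the winner
```

In this toy run agent c wins in every case, and a defecting a or b still does not win once the others revert to their true values.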
38 Avoiding Collusion
- In the first-price sealed-bid and Dutch auctions, bidder collusion is not dominant, but possible
  - in the previous example, after a, b, c decided on bidding 5, 5, 6, it is beneficial for a, b to bid more than 5. For any bid of c below 10 they can bid and win
- In first-price sealed-bid, Vickrey and Dutch auctions, all bidders must identify each other and collude jointly. An external bidder can win
- In the English auction, identification is through bidding. Computerized anonymization can prevent identification and collusion
39 Insincere Auctioneer
- Private value auctions
  - Vickrey: the auctioneer can overstate the second-highest bid to the winner
  - Solution: electronic signature
  - Other auctions do not motivate auctioneer lying, since the winner pays its bid
- Non-private value
  - English: the auctioneer can use shills that bid in the auction to increase real bidders' valuations
- Any auction: the auctioneer may bid, to guarantee a minimum price
40 Example: Auctioneer Bid
- In the Vickrey auction, the auctioneer is motivated to bid over its true reservation price
- In case its bid is second, it sets the good's price higher than the reservation price
- On the other hand, the auctioneer may win although others value the good at more than the reservation price
41 Insincere Bidders
- Non-private value
  - winner's curse: an agent that bids its true value and wins knows that it was too high
  - this means that a win is a loss (of money)
  - hence, agents should bid less than true value
  - this is the best strategy even in Vickrey (unlike private value Vickrey)
- Private value, Vickrey
  - dominant truthful bidding reveals true valuations
  - this may be disadvantageous
  - when subcontracting, subcontractors may re-negotiate
42 Auctions of Interrelated Goods
- Multiple homogeneous goods: truth revelation of Vickrey holds
- Heterogeneous goods, one at a time, interdependent values
  - for optimal bidding, agents need full lookahead
  - but then agents don't bid true values per good
- Protocol modifications to overcome that
  - pool of goods at a single auction
  - allow decommitment, with penalties
43 Negotiation
- A process by which two or more agents reach a joint decision, each trying to achieve an individual goal/objective. Includes
  - a conversation language
  - a protocol
  - a decision process (by which an agent decides upon its position, concessions, criteria for agreement)
- Can be performed 1:1, 1:N, N:N
- May include a single-shot message by each party or a conversation with several messages going back and forth
44 Negotiation: Desired Properties
- Efficiency: little time spent on arriving at agreements
- Stability: once an agreement is reached, agents should stick to it
- Simplicity: computation and communication overheads should be small
- Distribution: there should not be a central decision maker
- Symmetry: the mechanism should not be biased
45 Some Details
- Goal of negotiation: arrive at an agreement beneficial to all parties
- Initially, parties may have different beliefs
- So, a proposal and a counter-proposal are not merely an offer for an agreement; they are an attempt to change the beliefs of the other party
- Beliefs are based on facts and justifications
- In MAS, Truth Maintenance Systems (TMS), utilizing logic, serve for belief maintenance and revision
- Commonly includes internal, external, out
46 Are Auctions a Negotiation Protocol?
- In auctions, the protocol does not assume, and attempts to prevent, inter-agent cooperation
- Auctions centralize the commerce
- If the number of buyer-bidders is small, the final price may be far lower than the reservation price
- To enable maximal exploitation of cooperation opportunities, and thus payoff maximization, there is a need for free, elaborate, one-on-one and many-to-many negotiation
47 Cooperation via Coalitions
- A coalition: a set of agents that agree to cooperate to execute a task/achieve a goal
- Assumptions
  - agents have different expertise and capacities
  - tasks require cooperation of different agents for their execution (in terms of reasoning/performance)
  - tasks have values, which depend on coalition members
- To perform tasks and increase benefits, agents may need to cooperate via coalition formation
48 Transportation Example
- A transportation company needs to mobilize 10 passengers
- It has access to cars, vans, buses and helicopters, all of which are possibly self-interested service providers
- A car can take 3 passengers, a van can take 7, a bus can take 50 and a helicopter can take 6
- Each vehicle is priced differently and has different speed and comfort
- Passengers are willing to pay a fixed amount
- The company needs to find the best, or at least a good, combination of vehicles for the task
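For a problem this small, the best combination can be found by brute force, as in the Python sketch below. The capacities come from the example; the prices are hypothetical, and speed, comfort and the providers' self-interest are ignored, so "best" here only means cheapest.

```python
from itertools import product
from typing import Dict, Optional, Tuple

CAPACITY = {"car": 3, "van": 7, "bus": 50, "helicopter": 6}    # from the example
PRICE = {"car": 30, "van": 60, "bus": 200, "helicopter": 400}  # hypothetical prices

def cheapest_fleet(passengers: int,
                   max_per_type: int = 4) -> Optional[Tuple[int, Dict[str, int]]]:
    """Try every combination of vehicle counts and keep the cheapest feasible one."""
    kinds = list(CAPACITY)
    best = None
    for counts in product(range(max_per_type + 1), repeat=len(kinds)):
        seats = sum(n * CAPACITY[k] for n, k in zip(counts, kinds))
        if seats < passengers:
            continue                      # this coalition cannot carry everyone
        cost = sum(n * PRICE[k] for n, k in zip(counts, kinds))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(kinds, counts)))
    return best

cost, fleet = cheapest_fleet(10)
```

The number of combinations grows exponentially with the number of vehicle types, which is why exhaustive search does not scale to larger coalition problems.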
49 Issues in Coalition Formation
- Given the tasks and the other agents, which coalitions should an agent attempt to form?
- What mechanism can an agent use for coalition formation?
- What guarantees regarding efficiency and quality of task performance can the mechanism provide?
- Once a coalition has formed, how should its members go about distribution of work/payoff?
50 Coalition Formation Solutions
- Self-interest vs. benevolence: the mechanisms for benevolent agents are usually much simpler, as such agents do not need means to maintain their own payoff maximization
- Centralization vs. distribution: central design of coalitions is usually much simpler to execute and enforce than a distributed one
- Environment super-additivity: in super-additive environments any unification of two coalitions increases overall payoff. Strongly influences the mechanism
51 [Figure: agents with their labels, and tasks]
- Tasks: T1(b1,b2,b3), T2(b2,b4), T3, T4(b1,b3,b7), T5
- Agents: Agent1 b1, Agent2 b2, Agent3 b3, Agent4 b4, Agent5 b5, Agent6 b6, Agent7 b8, Agent8, Agent9, Agent10 b1, Agent11 b3, Agent12 b7, Agent13 b4
52 Coalition Formation in Dynamic, Open MAS
- Coalition formation (Shehory and Sycara)
- Main idea (distributed greedy algorithm)
- Each agent performs iteratively
  - Design possible coalitions w.r.t. tasks
  - Compute coalition-task values
  - Choose best one and form it
  - Re-design when new tasks/agents arrive
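The iterative steps can be illustrated with a centralized, single-process Python sketch. It is a simplification for illustration only, not the distributed algorithm itself: it assumes each agent offers exactly one capability, each task lists the capabilities it needs, and a coalition's value is just the task's value. The labels follow the diagram slide (reading b1, b2, ... as capabilities is an assumption); the task values are hypothetical.

```python
from typing import Dict, List, Tuple

def greedy_coalitions(agents: Dict[str, str],
                      tasks: Dict[str, Tuple[List[str], float]]
                      ) -> List[Tuple[str, List[str], float]]:
    """Repeatedly form the most valuable feasible coalition from the free agents."""
    free = dict(agents)
    pending = dict(tasks)
    formed = []
    while pending:
        candidates = []
        # Design a possible coalition for each pending task.
        for task, (needs, value) in pending.items():
            pool = dict(free)
            members = []
            for capability in needs:
                match = next((a for a, c in pool.items() if c == capability), None)
                if match is None:
                    break                 # capability not available: infeasible
                members.append(match)
                del pool[match]
            else:
                candidates.append((value, task, members))
        if not candidates:
            break
        # Choose the best coalition and form it.
        value, task, members = max(candidates, key=lambda c: c[0])
        formed.append((task, members, value))
        del pending[task]
        for agent in members:
            del free[agent]
    return formed

agents = {"Agent1": "b1", "Agent2": "b2", "Agent3": "b3", "Agent4": "b4",
          "Agent10": "b1", "Agent11": "b3", "Agent12": "b7"}
tasks = {"T1": (["b1", "b2", "b3"], 30.0),   # values are hypothetical
         "T2": (["b2", "b4"], 25.0),
         "T4": (["b1", "b3", "b7"], 40.0)}
formed = greedy_coalitions(agents, tasks)
```

Re-design on arrival of new tasks or agents would amount to calling the function again with the updated inputs. Being greedy, the procedure gives no optimality guarantee in general.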
53 [Figure: agents, a task and two middle agents]
- Task: T1
- Agents: Agent1 b1, Agent2 b8, Agent3 b1, Agent4, Agent5 b2, Agent6 b3, Agent7 b7, Agent8 b3, Agent10, Agent11, Agent12
- Middle Agent (appears twice)