Title: Intelligent Agents: Agent Rationality and Auctions

Intelligent Agents: Agent Rationality and Auctions
  • Katia Sycara
  • The Robotics Institute

The Electronic Society
  • A society of electronic autonomous agents
  • The agents have means to communicate with each
    other
  • The agents do not necessarily know others. Means
    for finding others exist (may be costly)
  • Agents may be heterogeneous in terms of:
  • design philosophy
  • expertise/capabilities
  • capacity/access to resources
  • intelligence: the algorithms used for problem
    solving
  • as a result different performance - efficiency,

Agents' Attitudes
  • Self-interest: a self-interested agent is
    attempting to maximize its own personal payoff
  • Benevolence/altruism: a benevolent agent is
    attempting to increase others' payoffs and the
    cumulative payoff of the society
  • Cooperation: an agent is considered cooperative
    when it performs tasks on behalf of other agents
    (possibly for payment)
  • Non-cooperative has game-theoretic strategic

  • A rational behavior is such that an agent prefers
    a greater payoff over a smaller one
  • A rational agent should always behave rationally.
    That is, from among several options, it should
    select the one that results in maximum payoff
  • The problem:
  • in many cases the number of options is
  • there may be no algorithm for finding the best

Bounded Rationality
  • To overcome problems of rationality, bounded
    rationality:
  • limits the time/computation for option
  • prunes the search space
  • imposes restrictions on the types of options
  • Results in fewer possibilities, hence
  • computationally feasible
  • may be too restrictive, far from optimal
  • strategically inferior to rational

Good Enough Behavior
  • Make bounded rationality rational:
  • modify linear payoff functions to incorporate
    computational costs
  • put a cap on payoff
  • add indifference to small amounts
  • The payoff of an option is good enough if:
  • too much additional computation to find other
    good options, or
  • other options do not provide a significant payoff
    increase, or
  • the agent is indifferent w.r.t. the increase
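
The three "good enough" conditions above can be illustrated with a small sketch. It is only one possible reading of the slide: the function name, the per-option evaluation cost, the cap and the indifference threshold are all assumptions introduced for illustration.

```python
def good_enough_choice(options, payoff, eval_cost, cap, indifference):
    """Scan options in order and stop as soon as the best one is good enough.

    Hypothetical sketch: eval_cost models the computational cost of looking
    at one more option, cap is the payoff ceiling, and indifference is the
    smallest payoff increase the agent cares about.
    """
    best_option, best_value = None, float("-inf")
    spent = 0.0
    for option in options:
        spent += eval_cost
        value = min(payoff(option), cap)  # payoff is capped
        if value > best_value + indifference:  # ignore insignificant gains
            best_option, best_value = option, value
        # Stop when the cap is reached, or when evaluating one more option
        # costs at least as much as the largest gain still possible.
        if best_value >= cap or eval_cost >= cap - best_value:
            break
    return best_option, best_value - spent


# Illustrative values only: payoffs equal the option itself.
result = good_enough_choice([3, 8, 5, 10, 9], payoff=lambda x: x,
                            eval_cost=1, cap=9, indifference=0.5)
print(result)
```

The agent stops at the second option: the remaining possible gain no longer covers the cost of evaluating another one, so the better options later in the list are never examined.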

What are Protocols?
  • A protocol (aka mechanism):
  • provides a set of rules and behaviors to be
    followed by agents who participate in it
  • following the rules of a protocol is at an
    agent's discretion, though deviation may leave it
    out of the game
  • examples: auctions, negotiation, voting
  • Desired properties:
  • maximize payoffs
  • not manipulable/enforceable
  • simple to implement and execute

What are Strategies?
  • A strategy:
  • is one of the possible actions an agent can
    select given the protocol
  • is not dictated (or provided) by the protocol
  • is usually the result of the agent's reasoning
    and decisions, based on local algorithms and
  • examples: bid lowest possible, vote for your
  • A good strategy:
  • should maximize the agent's payoff given the
    protocol and the behavior of other agents
  • should be difficult or impossible to manipulate
  • should be computationally feasible
  • may depend on the strategies of other agents

Protocol evaluation
  • Payoff maximization: can refer to individual
    payoffs, group payoffs, or social welfare (the
    sum of individual payoffs)
  • Pareto-optimality: a payoff vector
    p = (x1, x2, ..., xn) is Pareto-optimal if there
    is no other feasible payoff vector p' such that
    at least one payoff is better in p' and no
    payoff is worse in p'
  • Stability: a protocol is stable if, once the
    agents have arrived at a solution, they do not
    deviate from it
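
The Pareto-optimality and social welfare definitions above translate directly into code. The following is a minimal sketch; the payoff vectors are hypothetical and chosen to resemble a two-agent game.

```python
def dominates(p_new, p_old):
    """True if p_new is at least as good for everyone and better for someone."""
    pairs = list(zip(p_new, p_old))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


def pareto_optimal(vectors):
    """Keep the payoff vectors that no other feasible vector dominates."""
    return [p for p in vectors if not any(dominates(q, p) for q in vectors)]


def social_welfare(p):
    """Social welfare is the sum of individual payoffs."""
    return sum(p)


# Hypothetical feasible payoff vectors for two agents.
outcomes = [(3, 3), (5, 0), (0, 5), (1, 1)]
print(pareto_optimal(outcomes))
print(max(outcomes, key=social_welfare))
```

Note that a vector can be Pareto-optimal without maximizing social welfare: (5, 0) is not dominated, yet its welfare is lower than that of (3, 3).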

Stability and Equilibria
  • There are multiple stability concepts. In game
    theory, the notion of equilibrium is used:
  • dominant strategies: the agents have some
    strategies that, regardless of what others do,
    maximize payoff
  • Nash equilibrium: the agents have strategies
    that, as long as others stick to theirs, maximize
    payoff
  • Mixed Nash: the agents each have a set of
    strategies from among which they select one with
    some probability
  • Bayes-Nash: adds history to the previous one

The Prisoner's Dilemma

No Pure Nash Equilibrium

Mixed Nash
  • Player 1 will cooperate with probability pc and
    defect with probability pd
  • Player 2 will cooperate with probability qc and
    defect with probability qd
  • Expected utility of an agent is the utility from
    a strategy times the probability of this strategy
    being selected
  • When there are multiple possibilities, the
    expected utility is a sum over these
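
The expected utility described above, a sum over strategy pairs weighted by the probability of each pair, can be sketched as follows. The payoff matrix is a hypothetical matching-pennies game, not the one shown on the original slide.

```python
def expected_utility(payoffs, p, q):
    """Expected utility for the row player.

    payoffs[i][j] is the row player's utility when it plays strategy i and
    the column player plays strategy j; p and q are the two players'
    probability distributions over their strategies.
    """
    return sum(p[i] * q[j] * payoffs[i][j]
               for i in range(len(p))
               for j in range(len(q)))


# Hypothetical 2x2 game (matching pennies) from the row player's view.
pennies = [[1, -1],
           [-1, 1]]
print(expected_utility(pennies, (0.5, 0.5), (0.5, 0.5)))
print(expected_utility(pennies, (1.0, 0.0), (0.0, 1.0)))
```

A pure strategy is the special case where one probability is 1 and the others are 0.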

Computing the Probabilities
  • The expected utility of an agent x when the other
    agent y follows strategy s is denoted by Ux(s)
  • In the case of equilibrium (mixed Nash), the
    expected utility of x should be the same for all
    of the possible strategies of y
  • In our case we have agents 1,2 and strategies c,d
  • We require that Ux(c) = Ux(d), which means that
    for each of the two agents, the expected utility
    from the other cooperating should be equal to the
    expected utility from the other defecting

Computation Details
  • For agent 1 we have
  • U1(c) = qc(6 pc + 0 pd), U1(d) = qd(0 pc + 5 pd)
  • For agent 2 we have
  • U2(c) = pc(2 qc + 3 qd), U2(d) = pd(1 qc - 0 qd)
  • The requirement that Ux(c) = Ux(d) results in
  • qc = 0.642, qd = 0.358
  • pc = 0.317, pd = 0.683

Tragedy of Common Goods
  • Information on the web is (mostly) free
  • Agents that seek up-to-date information may query
    web sites as frequently as desired
  • If all agents do so, the network will be
    overly congested, and some servers will crash
  • So, is it undesirable to behave this way?
  • If all (or most) of the agents prevent
    congestion, it is in the best interest of each
    individual agent to increase network use ...

The Contract Net Protocol
  • An agent coordination and distributed task
    allocation mechanism, where
  • multiple heterogeneous agents can perform tasks
  • agents can play two roles: managers and contractees
  • managers receive tasks, select prospective
    contractees and ask for bids
  • the best bidder wins the task and performs it;
    the manager monitors
  • Pros and cons:
  • simple to implement, base for many other
  • fully distributed
  • performance quality not checked
  • easy to manipulate (free riders), may cause loops
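
One announce-bid-award round of the protocol can be sketched as below. This is a simplified illustration, not a full Contract Net implementation: the `Contractee` class, its cost model and the assumption that contractees bid their true cost are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Contractee:
    name: str
    cost_per_unit: float  # hypothetical private marginal cost
    capacity: int         # largest task size the agent can accept

    def bid(self, task_size):
        """Return a sincere bid (the true cost), or None to decline."""
        if task_size > self.capacity:
            return None
        return self.cost_per_unit * task_size


def award(task_size, contractees):
    """Manager side: collect bids and award the task to the lowest bid."""
    bids = {}
    for contractee in contractees:
        offer = contractee.bid(task_size)
        if offer is not None:
            bids[contractee.name] = offer
    if not bids:
        return None
    winner = min(bids, key=bids.get)
    return winner, bids[winner]


agents = [Contractee("A", 2.0, 10), Contractee("B", 1.5, 4), Contractee("C", 1.8, 10)]
print(award(5, agents))
```

Agent B has the lowest unit cost but declines because the task exceeds its capacity. Nothing in this round checks whether the winner performs the task well, which matches the "performance quality not checked" drawback listed above.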

  • Optimality analysis of previous contracts assumes
    that agents are sincere: they reveal their real
    marginal cost when they bid, their tasks
  • but it is beneficial to lie about costs ...
  • and about tasks: hide, phantom, generate
  • Immunity to lying:
  • pure deals (disjoint tasks): not immune
  • mixed deals (probability distribution): phantom

  • A contractee should commit to task performance,
    and a manager not to terminate the contract, but
  • During execution, the contractee may receive a
    more profitable task, or the manager a better bid
  • Self-interest will result in de-commitment
  • To prevent losses, enforcement is needed
  • De-commitment penalties, leveled commitment
  • Hedging against risk: pricing a contract like an
    option, taking into account the future via
    probabilities of events

  • Contingency contracts: when the probability of
    future events is known, similar to mixed Nash
  • but this is, in general, exponentially complex
  • Leveled commitment contracts: allow unilateral
    de-commitment at any time, penalties set in
  • self-interested agents may avoid beneficial
    de-commitment based on the chance that the other
    party will do so
  • not optimal, but better than non-leveled
  • Option pricing:
  • optimal (complete knowledge, infinite markets)

  • Advantages of contingent contracts:
  • (1) the space of possible decisions is enlarged
  • (2) the decision maker's payoff can benefit given
    these flexibilities
  • (3) the overall payoff (social welfare) could be
  • For the binding contracts, decisions are now or
  • For the contingent contracts, decisions could be
    deferred for the future when more information
    could be obtained.

  • A call option gives the holder the right to buy
    the underlying asset by a certain date (expiration
    date) for a certain price (strike price).
  • An American option can be exercised at any time up
    to the expiration date.
  • Six factors affect the price of a stock option:
    stock price, strike price, time to expiration,
    volatility, risk-free interest rate and dividends
  • Contingent contracts where an agent can decommit
    at any time can be viewed as American call
    options without dividends.
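
To make the option analogy concrete, here is a minimal sketch that prices an American call without dividends on a Cox-Ross-Rubinstein binomial tree. The slides do not say which pricing model was used, so the binomial tree is an assumption, as are the function name and all parameter values.

```python
import math


def american_call_binomial(spot, strike, rate, volatility, maturity, steps):
    """Price an American call (no dividends) on a CRR binomial tree."""
    dt = maturity / steps
    up = math.exp(volatility * math.sqrt(dt))
    down = 1.0 / up
    growth = math.exp(rate * dt)
    prob_up = (growth - down) / (up - down)  # risk-neutral probability
    discount = 1.0 / growth

    # Option values at expiration, indexed by the number of up moves j.
    values = [max(spot * up**j * down**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Roll back through the tree, allowing early exercise at every node.
    for step in range(steps - 1, -1, -1):
        for j in range(step + 1):
            hold = discount * (prob_up * values[j + 1]
                               + (1.0 - prob_up) * values[j])
            exercise = spot * up**j * down**(step - j) - strike
            values[j] = max(hold, exercise)
    return values[0]


# Illustrative parameters only.
price = american_call_binomial(spot=100, strike=100, rate=0.05,
                               volatility=0.2, maturity=1.0, steps=200)
print(round(price, 2))
```

How the contract quantities (task value, decommitment penalty, contract duration) map onto spot, strike and maturity is left open here, since the slides do not spell it out.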

Simulation and Analysis
  • Assumptions for the simulation:
  • Only one agent can take the role of manager.
  • The manager generates non-decomposable tasks at a
    predefined frequency, and each task specifies the
    contract duration and execution time for each
  • The task value increases linearly from 10 to
    100 in steps of 10, periodically. Similar to a
    stock price, the task value is stochastic. Using
    the Monte Carlo method, the manager simulates the
    task value at each time period and announces the
    current task value to all contractors.

Simulation and Analysis (contd)
  • Each contractor has a queue with a capacity of
    10 tasks, and it schedules its tasks only
    according to the latest start time.
  • If a contractor breaches a contract during
    task execution, both manager and contractor can
    get a partial result.
  • The volatility and risk-free interest in the
    option pricing model are fixed for all the

Performance Evaluation
  • Performance is evaluated along three
    dimensions: Throughput, Social Welfare and
    Negotiation Efficiency.
  • Throughput: the total number of tasks executed
    within the predefined experiment duration.
  • Social Welfare: defined as the total payoff of
    all the agents, which is used to check the global
  • Negotiation Efficiency: measured by the total
    decision making time. It represents the total
    time spent in all the tasks from the moment they
    are assigned until the moment they are executed
    or breached.

Conclusion
  • CCP incorporates option pricing theory, so
    contracts can be modeled in a very natural way.
  • CCP provides a computational framework for the
    agents to calculate the value of a flexible
    contract, the payoff and penalty fee, when to

Conclusion (contd)
  • Compared to CNP, LCP and CCP are less
    computationally efficient. CCP provides a more
    general framework for the agents to evaluate and
    compute optimal decisions in the face of
    uncertainty.
  • Both LCP and CCP in scenario 2 have a high
    solution quality, while LCP achieves a good
    tradeoff between commitment and flexibility.

  • A centralized protocol that includes one
    auctioneer and multiple bidders
  • The auctioneer puts a good up for sale. In some
    cases, the good may be a combination of other
    goods, or a good with multiple attributes
  • The bidders make offers. This may be repeated
    several times, depending on the auction type
  • The auctioneer determines the winner

Auctions: Pros and Cons
  • Usually easier to prevent bidder lying
  • Simple protocols
  • Centralized: a single point of failure
  • Multi-attribute: exponentially complex
  • Allows collusion behind the scenes
  • May favor the auctioneer

Auction Types
  • Private value: the value of a good to a bidder
    agent depends only on its private preferences.
    Assumed to be known exactly
  • Common value: the good's value depends entirely
    on other agents' valuations
  • Correlated value: the good's value depends on
    internal and external valuations

Auction Protocols
  • English auction (aka first-price open-cry):
  • bidders are free to raise their bids
  • end: no more raises, winner is the highest bidder at
  • agent strategy: a series of bids, based on
    private value, estimates of others' valuations,
    their past bids
  • dominant strategy: bid a small amount more than
    the current highest bid, stop when private value
    is reached
  • For correlated value:
  • auctioneer increases price by constant or other
  • open-exit: allows bidders to quit without re-entry
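
The private-value dominant strategy for the English auction (raise by a small amount, stop at your private value) can be simulated with a short sketch. The bidder names, the valuations and the fixed increment are hypothetical.

```python
def english_auction(values, increment):
    """Simulate an ascending auction in which every bidder raises the
    current price by `increment` as long as that stays within its private
    value. Returns the winner and the final price."""
    price = 0
    leader = None
    while True:
        raised = False
        for name, value in values.items():
            # The current leader has no reason to outbid itself.
            if name != leader and price + increment <= value:
                price += increment
                leader = name
                raised = True
        if not raised:
            return leader, price


# Hypothetical private values.
print(english_auction({"a": 10, "b": 10, "c": 12}, increment=1))
```

The bidder with the highest valuation wins, and the price ends up about one increment above the second-highest valuation rather than at the winner's own value.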

More Protocols
  • First-price sealed-bid auction:
  • each bidder submits one bid, not knowing others'
  • highest bid wins, winner pays its bid
  • agent strategy: a function of private value and
    beliefs about others' valuations
  • no dominant strategy. Best bid is less than true
    value
  • how much less? Nash is computable if the
    probability distribution of agents' values is known
  • Example: n agents, uniform value distribution,
    agent i has value vi; there is a Nash equilibrium
    if each agent i bids vi(n-1)/n
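
The equilibrium bid vi(n-1)/n can be checked numerically. The sketch below assumes values drawn uniformly from [0, 1] (the slide only says "uniform") and that all other bidders follow the equilibrium rule. It then searches a grid of bids for one agent's best response. All function names are illustrative.

```python
def expected_payoff(bid, value, n):
    """Expected payoff of `bid` for an agent with private `value` when the
    other n-1 agents have values uniform on [0, 1] and each bids
    (n-1)/n times its own value."""
    # A rival bids below `bid` when its value is below bid * n / (n - 1).
    win_prob = min(bid * n / (n - 1), 1.0) ** (n - 1)
    return (value - bid) * win_prob


def best_response(value, n, grid=10000):
    """Grid search for the bid that maximizes expected payoff."""
    candidates = [value * k / grid for k in range(grid + 1)]
    return max(candidates, key=lambda b: expected_payoff(b, value, n))


value, n = 0.9, 3
print(best_response(value, n), value * (n - 1) / n)
```

The grid search lands next to vi(n-1)/n, which is what an equilibrium requires: no agent gains by deviating when the others keep to the rule.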

Yet More Auctions
  • Dutch auction (descending):
  • the seller lowers the price until a bidder takes
  • strategically equivalent to first-price
    sealed-bid
  • advantage: the auctioneer can accelerate the auction
  • All-pay auction:
  • each bidder pays its bid to the auctioneer
  • several types of such auctions are used for
    resource (re-)allocation

Vickrey (second-price sealed-bid)
  • Each bidder submits one bid, not knowing others'
  • The highest bid wins, but the bidder pays the
    second-highest bid
  • Agent strategy: base bid on private value and
    beliefs about others' values
  • Dominant strategy: bid true valuation
  • if it bids more and this increment makes it win,
    the agent may end up with a loss, since it may pay
    more than its true value
  • if it bids less, there is a smaller chance of
    winning (but the winning price is not affected)
  • Meaning: bid true value regardless of others
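
The argument for truthful bidding can be checked with a small sketch of the Vickrey rule. The rival bids and the valuations are hypothetical, and ties are broken in favor of the bidder listed first, which is an arbitrary choice.

```python
def vickrey_outcome(bids):
    """Return (winner, price): the highest bidder wins and pays the
    second-highest bid. Requires at least two bidders."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    return ranked[0], bids[ranked[1]]


def payoff(agent, true_value, bids):
    """Payoff of `agent`: its value minus the price if it wins, else 0."""
    winner, price = vickrey_outcome(bids)
    return true_value - price if winner == agent else 0


# Hypothetical rival bids.
others = {"b": 7, "c": 9}


def payoff_of_bid(true_value, bid):
    return payoff("a", true_value, {"a": bid, **others})


print(payoff_of_bid(8, 8), payoff_of_bid(8, 10))   # truthful vs. overbid
print(payoff_of_bid(10, 10), payoff_of_bid(10, 8))  # truthful vs. underbid
```

Overbidding only changes the outcome when it turns a loss into a win at a price above the true value, and underbidding only changes it by giving up a profitable win.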

So, Which Auction is Better?
  • Computation: auctions with dominant strategies
    (Vickrey and English) are more efficient, with no
    need to speculate regarding other bidders
  • Auctioneer's revenue:
  • second-price is less than the true price, however
    first-price bidders under-bid. Which effect is
  • for risk-neutral bidders with private independent
    values, the effects are equivalent
  • for risk-averse bidders, Dutch and first-price
    sealed-bid auctions maximize the auctioneer's revenue
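
The equivalence claimed for risk-neutral bidders with independent private values can be illustrated with a small Monte Carlo sketch. It assumes values uniform on [0, 1], first-price bidders who bid (n-1)/n of their value, and truthful second-price bidders. The function name, trial count and seed are arbitrary.

```python
import random


def average_revenues(n_bidders, trials, seed=0):
    """Estimate the auctioneer's average revenue in first-price and
    second-price sealed-bid auctions over the same random valuations."""
    rng = random.Random(seed)
    first_price_total = 0.0
    second_price_total = 0.0
    for _ in range(trials):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First-price: the highest-value bidder wins and pays its shaded bid.
        first_price_total += values[-1] * (n_bidders - 1) / n_bidders
        # Second-price: the winner pays the second-highest (truthful) bid.
        second_price_total += values[-2]
    return first_price_total / trials, second_price_total / trials


first_price, second_price = average_revenues(n_bidders=3, trials=20000)
print(round(first_price, 3), round(second_price, 3))
```

Under these assumptions both averages approach (n-1)/(n+1), which is 0.5 for three bidders.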

Real Auctions
  • In real auctions, values are not private
  • As a result, for 3 or more bidders, English
    auctions provide higher auctioneer revenue than
    Vickrey auctions do
  • Explanation: when it observes other bidders
    increasing their bid, a bidder increases its own
    valuation of the good
  • Both English and Vickrey are better for the
    auctioneer than Dutch and first-price sealed-bid

  • Bidders can coordinate their bids to lower them
  • In English and Vickrey auctions, collusion is a
    dominant strategy!
  • Example:
  • agents a, b, c value the good at 10, 10, 12
  • they can agree to bid 5, 5, 6 respectively
  • if one defects, all observe that, and can
    increase to real value, so there is no benefit
    from defection

Avoiding Collusion
  • In the first-price sealed-bid and Dutch auctions,
    bidder collusion is not dominant, but possible
  • in the previous example, after a, b, c decided on
    bidding 5, 5, 6, it is beneficial for a, b to bid
    more than 5. For any bid of c below 10 they can
    outbid it and win
  • In first-price sealed-bid, Vickrey and Dutch
    auctions, all bidders must identify each other
    and collude jointly. An external bidder can win
  • In the English auction, identification is through
    bidding. Computerized anonymization can prevent
    identification and collusion

Insincere Auctioneer
  • Private value auctions:
  • Vickrey: the auctioneer can overstate the second
    highest bid to the winner
  • Solution: electronic signature
  • Other auctions do not motivate auctioneer lying,
    since the winner pays its bid
  • Non-private value:
  • English: the auctioneer can use shills that bid in
    the auction to increase real bidders' valuations
  • Any auction: the auctioneer may bid, to guarantee a
    minimum price

Example: Auctioneer Bid
  • In the Vickrey auction, the auctioneer is motivated
    to bid over its true reservation price
  • In case its bid is second, it sets the
    good's price higher than the reservation price
  • On the other hand, the auctioneer may win although
    others value the good at more than reservation

Insincere Bidders
  • Non-private value:
  • winner's curse: an agent that bids its true value
    and wins knows that it was too high
  • this means that a win is a loss (of money)
  • hence, agents should bid less than true value
  • this is the best strategy even in Vickrey (unlike
    private value Vickrey)
  • Private value, Vickrey:
  • dominant truthful bidding reveals true valuations
  • this may be disadvantageous
  • when subcontracting, subcontractors may

Auctions of Interrelated Goods
  • Multiple homogeneous goods: truth revelation of
    Vickrey holds
  • Heterogeneous goods, one at a time,
    interdependent values:
  • for optimal bidding, agents need full lookahead
  • but then agents don't bid true values per good
  • Protocol modifications to overcome that:
  • pool of goods at a single auction
  • allow decommitment, with penalties

  • A process by which two or more agents reach a
    joint decision, each trying to achieve an
    individual goal/objective. Includes
  • a conversation language
  • a protocol
  • a decision process (by which an agent decides
    upon its position, concessions, criteria for
  • Can be performed 1:1, 1:N, N:N
  • May include a single shot message by each party
    or conversation with several messages going back
    and forth

Negotiation: Desired Properties
  • Efficiency: little time spent on arriving at
  • Stability: once an agreement is reached, agents
    should stick to it
  • Simplicity: computation and communication
    overheads should be small
  • Distribution: there should not be a central
    decision maker
  • Symmetry: the mechanism should not be biased

Some Details
  • Goal of negotiation: arrive at an agreement
    beneficial to all parties
  • Initially, parties may have different beliefs
  • So, a proposal and a counter-proposal are not
    merely an offer for an agreement - they are an
    attempt to change the beliefs of the other party
  • Beliefs are based on facts and justifications
  • In MAS, Truth Maintenance Systems (TMS),
    utilizing logic, serve for belief maintenance and
  • Commonly includes internal, external, out

Are Auctions a Negotiation Protocol?
  • In auctions, the protocol does not assume, and
    attempts to prevent, inter-agent cooperation
  • Auctions centralize the commerce
  • If the number of buyer-bidders is small, the
    final price may be far lower than reservation
  • To enable maximal exploitation of cooperation
    opportunities, thus payoff maximization, there is
    a need for free, elaborate, one on one, and many
    to many negotiation

Cooperation via Coalitions
  • A coalition: a set of agents that agree to
    cooperate to execute a task/achieve a goal
  • Assumptions:
  • agents have different expertise and capacities
  • tasks require cooperation of different agents for
    their execution (in terms of reasoning/performance
  • tasks have values that depend on coalition members
  • To perform tasks and increase benefits agents may
    need to cooperate via coalition formation

Transportation Example
  • A transportation company needs to mobilize 10
  • It has access to cars, vans, buses and
    helicopters, all are possibly self-interested
    service providers
  • A car can take 3 passengers, a van can take 7, a
    bus can take 50 and a helicopter can take 6
  • Each vehicle is priced differently and has
    different speed and comfort
  • Passengers are willing to pay a fixed amount
  • The company needs to find the best, or at least a
    good, combination of vehicles for the task

Issues in Coalition Formation
  • Given the tasks and the other agents, which
    coalitions should an agent attempt to form?
  • What mechanism can an agent use for coalition
    formation?
  • What guarantees regarding efficiency and quality
    of task performance can the mechanism provide?
  • Once a coalition has formed, how should its
    members go about distribution of work/payoff?

Coalition Formation Solutions
  • Self-interest vs. benevolence: the mechanisms for
    benevolent agents are usually much simpler, as
    such agents do not need means to maintain their
    own payoff maximization
  • Centralization vs. distribution: central design
    of coalitions is usually much simpler to execute
    and enforce than a distributed one is
  • Environment super-additivity: in super-additive
    environments any unification of two coalitions
    increases overall payoff. Strongly influences the


Coalition Formation in Dynamic, Open MAS
  • Coalition formation (Shehory and Sycara)
  • Main idea (distributed greedy algorithm):
  • Each agent performs iteratively:
  • Design possible coalitions w.r.t. tasks
  • Compute coalition-task values
  • Choose best one and form it
  • Re-design when new tasks/agents arrive
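
The iteration above (design candidate coalitions, compute their values, form the best one, repeat) can be sketched in a simplified, centralized form. This is not the distributed algorithm of Shehory and Sycara. It only illustrates the greedy loop. The seat capacities come from the transportation example, while the costs, the reward, the group of 10 passengers and the size limit are hypothetical.

```python
from itertools import combinations


def greedy_coalitions(agents, tasks, value, max_size):
    """Repeatedly form the (coalition, task) pair with the highest positive
    value among the still-free agents and the still-pending tasks."""
    free = set(agents)
    pending = list(tasks)
    assignments = []
    while free and pending:
        best = None
        for task in pending:
            for size in range(1, max_size + 1):
                for members in combinations(sorted(free), size):
                    v = value(members, task)
                    if v > 0 and (best is None or v > best[0]):
                        best = (v, members, task)
        if best is None:
            break  # no remaining coalition is worth forming
        v, members, task = best
        assignments.append((task, members, v))
        free -= set(members)
        pending.remove(task)
    return assignments


# Seat capacities from the transportation example; costs are hypothetical.
seats = {"car": 3, "van": 7, "bus": 50, "helicopter": 6}
cost = {"car": 20, "van": 40, "bus": 90, "helicopter": 120}
demand = {"trip": (10, 100)}  # hypothetical: 10 passengers, reward 100


def coalition_value(members, task):
    needed, reward = demand[task]
    if sum(seats[m] for m in members) < needed:
        return 0  # the coalition cannot carry everyone
    return reward - sum(cost[m] for m in members)


print(greedy_coalitions(seats, demand, coalition_value, max_size=2))
```

Limiting coalition size keeps the enumeration tractable. Without such a bound the number of candidate coalitions grows exponentially with the number of agents.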

