Title: Computer Science and the Socio-Economic Sciences
1Computer Science and the Socio-Economic Sciences
Fred Roberts, Rutgers University
2CS and SS
- Many recent applications in CS involve issues/problems of long interest to social scientists:
  - preference, utility
  - conflict and cooperation
  - allocation
  - incentives
  - consensus
  - social choice
  - measurement
- Methods developed in SS are beginning to be used in CS.
3CS and SS
- CS applications place great strain on SS methods:
  - Sheer size of problems addressed
  - Computational power of agents an issue
  - Limitations on information possessed by players
  - Sequential nature of repeated applications
- Thus: need for a new generation of SS methods.
- Also: these new methods will provide powerful tools to social scientists.
4CS and SS: Outline
- 1. CS and Consensus/Social Choice
- 2. CS and Game Theory
- 3. Algorithmic Decision Theory
6CS and Consensus/Social Choice
- Relevant social science problems: voting, group decision making
- Goal: based on everyone's opinions, reach a consensus
- Typical opinions:
  - first choice
  - ranking of all alternatives
  - scores
  - classifications
- Long history of research on such problems.
7CS and Consensus/Social Choice
Background: Arrow's Impossibility Theorem
There is no consensus method that satisfies certain reasonable axioms about how societies should reach decisions.
Input: rankings of alternatives.
Output: consensus ranking.
Kenneth Arrow, Nobel prize winner
8CS and Consensus/Social Choice
There are widely studied and widely used consensus methods.
One well-known consensus method: Kemeny-Snell medians. Given a set of rankings, find a ranking minimizing the sum of distances to the given rankings.
Kemeny-Snell medians are finding surprising new applications in CS.
John Kemeny, pioneer in time sharing in CS
9CS and Consensus/Social Choice
Kemeny-Snell distance between rankings: twice the number of pairs of candidates i and j for which i is ranked above j in one ranking and below j in the other, plus the number of pairs that are ranked in one ranking and tied in the other.
Kemeny-Snell median: given rankings $a_1, a_2, \ldots, a_p$, find a ranking x so that
$d(a_1,x) + d(a_2,x) + \cdots + d(a_p,x)$
is minimized. Sometimes just called the Kemeny median.
10CS and Consensus/Social Choice
a1        a2        a3
Fish      Fish      Chicken
Chicken   Chicken   Fish
Beef      Beef      Beef
Median: a1. If x = a1, then $d(a_1,x) + d(a_2,x) + d(a_3,x) = 0 + 0 + 2 = 2$ is minimized. If x = a3, the sum is 4. For any other x, the sum is at least $1 + 1 + 1 = 3$.
11CS and Consensus/Social Choice
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
Three medians: a1, a2, a3. This is the voter's paradox situation.
12CS and Consensus/Social Choice
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
Note that sometimes we wish to minimize $d(a_1,x)^2 + d(a_2,x)^2 + \cdots + d(a_p,x)^2$. A ranking x that minimizes this is called a Kemeny-Snell mean. In this example, there is one mean: the ranking declaring all three alternatives tied.
13CS and Consensus/Social Choice
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
If x is the ranking declaring Fish, Chicken and Beef tied, then $d(a_1,x)^2 + d(a_2,x)^2 + \cdots + d(a_p,x)^2 = 3^2 + 3^2 + 3^2 = 27$. It is not hard to show that this is the minimum.
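The median and mean examples on the preceding slides can be checked by brute force. The following is a minimal sketch, not an efficient method: it assumes a ranking is stored as a dict mapping each alternative to a level (lower is better, equal levels mean a tie), and it enumerates every ranking with ties, which is only feasible for a handful of alternatives. The helper names (`ks_distance`, `all_rankings`, `consensus`, `linear`) are illustrative choices, not from the talk.

```python
from itertools import combinations, product


def ks_distance(r1, r2):
    """Kemeny-Snell distance: 2 per reversed pair, 1 per pair tied in exactly one ranking."""
    d = 0
    for i, j in combinations(sorted(r1), 2):
        s1 = (r1[i] > r1[j]) - (r1[i] < r1[j])  # -1, 0 or +1
        s2 = (r2[i] > r2[j]) - (r2[i] < r2[j])
        d += abs(s1 - s2)
    return d


def all_rankings(items):
    """Yield every ranking of items, ties allowed, as a dict item -> level."""
    seen = set()
    for levels in product(range(len(items)), repeat=len(items)):
        dense = {v: k for k, v in enumerate(sorted(set(levels)))}
        key = tuple(dense[v] for v in levels)
        if key not in seen:
            seen.add(key)
            yield dict(zip(items, key))


def consensus(profile, power=1):
    """Return (best score, minimizers): power=1 gives medians, power=2 gives means."""
    items = sorted(profile[0])
    scored = [(sum(ks_distance(a, x) ** power for a in profile), x)
              for x in all_rankings(items)]
    best = min(score for score, _ in scored)
    return best, [x for score, x in scored if score == best]


def linear(*order):
    """Strict ranking, best alternative first."""
    return {item: pos for pos, item in enumerate(order)}


first = [linear("Fish", "Chicken", "Beef")] * 2 + [linear("Chicken", "Fish", "Beef")]
cycle = [linear("Fish", "Chicken", "Beef"),
         linear("Chicken", "Beef", "Fish"),
         linear("Beef", "Fish", "Chicken")]

print(consensus(first))           # median of the first example
print(consensus(cycle))           # medians of the voter's paradox profile
print(consensus(cycle, power=2))  # mean of the voter's paradox profile
```

Since the number of rankings grows very quickly with the number of alternatives, this exhaustive search is only a way to confirm small examples by machine.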
14CS and Consensus/Social Choice
- Theorem (Bartholdi, Tovey, and Trick, 1989; Wakabayashi, 1986): Computing the Kemeny median of a set of rankings is an NP-complete problem.
15Meta-search and Collaborative Filtering
- Meta-search
- A consensus problem
- Combine page rankings from several search engines
- Dwork, Kumar, Naor, Sivakumar (2000): Kemeny-Snell medians are good at spam resistance in meta-search (a page spams if it causes meta-search to rank it too highly)
- Approximation methods make this computationally tractable
16Meta-search and Collaborative Filtering
- Collaborative Filtering
- Recommending books or movies
- Combine book or movie ratings
- Produce ordered list of books or movies to recommend
- Freund, Iyer, Schapire, Singer (2003): boosting algorithm for combining rankings.
- Related topic: recommender systems
17Meta-search and Collaborative Filtering
- A major difference from SS applications:
- In SS applications, number of voters is large, number of candidates is small.
- In CS applications, number of voters (search engines) is small, number of candidates (pages) is large.
- This makes for major new complications and research challenges.
18Large Databases and Inference
- Real data often in form of sequences
- GenBank has over 7 million sequences comprising 8.6 billion bases.
- The search for similarity or patterns has extended from pairs of sequences to finding patterns that appear in common in a large number of sequences or throughout the database: consensus sequences.
- Emerging field of bioconsensus applies SS consensus methods to biological databases.
19Large Databases and Inference
Why look for such patterns? Similarities between sequences or parts of sequences lead to the discovery of shared phenomena. For example, it was discovered that the sequence for platelet-derived growth factor, which causes growth in the body, is 87% identical to the sequence for v-sis, a cancer-causing gene. This led to the discovery that v-sis works by stimulating growth.
20Large Databases and Inference
Example: bacterial promoter sequences studied by Waterman (1989)
RRNABP1   ACTCCCTATAATGCGCCA
TNAA      GAGTGTAATAATGTAGCC
UVRBP2    TTATCCAGTATAATTTGT
SFC       AAGCGGTGTTATAATGCC
Notice that if we are looking for patterns of length 4, each sequence has the pattern TAAT.
23Large Databases and Inference
Example: However, suppose that we add another sequence:
M1 RNA    AACCCTCTATACTGCGCG
The pattern TAAT does not appear here. However, it almost appears, since the pattern TACT appears, and this has only one mismatch from the pattern TAAT. So, in some sense, the pattern TAAT is a good consensus pattern.
24Large Databases and Inference
Example: We make this precise using best-mismatch distance. Consider two sequences a and b with b longer than a. Then d(a,b) is the smallest number of mismatches over all possible alignments of a as a consecutive subsequence of b.
25Large Databases and Inference
Example: a = 0011, b = 111010
Possible alignments:
111010    111010    111010
0011       0011       0011
The best-mismatch distance is 2, which is achieved in the third alignment.
26Large Databases and Inference
Example: Now we are given a database of sequences $a_1, a_2, \ldots, a_n$ and look for a pattern of length k. One standard method (Smith-Waterman): look for a consensus sequence b that minimizes $\sum_i [k - d(b,a_i)] / d(b,a_i)$, where d is best-mismatch distance. In fact, this turns out to be equivalent to calculating medians like Kemeny-Snell medians. Algorithms for computing consensus sequences are important in modern molecular biology.
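Best-mismatch distance and a median-style consensus pattern can be sketched in a few lines. This is an illustration under stated assumptions, not Waterman's algorithm: the objective used here is the plain sum of best-mismatch distances (the median criterion), and the search simply tries all 4^k patterns, which is only reasonable for small k. The function names are hypothetical.

```python
from itertools import product


def best_mismatch(a, b):
    """Fewest mismatches over all placements of a as a consecutive block of b."""
    k = len(a)
    return min(sum(x != y for x, y in zip(a, b[start:start + k]))
               for start in range(len(b) - k + 1))


def median_patterns(sequences, k, alphabet="ACGT"):
    """All length-k patterns minimizing the summed best-mismatch distance."""
    totals = {}
    for letters in product(alphabet, repeat=k):
        pattern = "".join(letters)
        totals[pattern] = sum(best_mismatch(pattern, s) for s in sequences)
    best = min(totals.values())
    return best, sorted(p for p, t in totals.items() if t == best)


# The four promoter sequences from the slides plus the added M1 RNA sequence.
promoters = [
    "ACTCCCTATAATGCGCCA",  # RRNABP1
    "GAGTGTAATAATGTAGCC",  # TNAA
    "TTATCCAGTATAATTTGT",  # UVRBP2
    "AAGCGGTGTTATAATGCC",  # SFC
    "AACCCTCTATACTGCGCG",  # M1 RNA
]

print(best_mismatch("0011", "111010"))
print(median_patterns(promoters, 4))
```

On these five sequences no 4-letter pattern occurs exactly in all of them, so the smallest achievable total is 1, and TAAT is one of the patterns that attains it.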
27Large Databases and Inference
- Preferential Queries
- Look for flight from New York to Beijing
- Have preferences for:
  - airline
  - itinerary
  - type of ticket
- Try to combine responses from multiple travel-related websites
- Sequential decision making: next query or information access depends on prior responses.
28Consensus Computing, Image Processing
- Old SS problem: dynamic modeling of how individuals change opinions over time, eventually reaching consensus.
- Often use dynamic models on graphs
- Related to neural nets.
- CS applications: distributed computing.
- Values of processors in a network are updated until all have the same value.
29Consensus Computing, Image Processing
- CS application: noise removal in digital images
- Does a pixel level represent noise?
- Compare neighboring pixels.
- If values are beyond a threshold, replace the pixel value with the mean or median of the values of its neighbors.
- Related application in distributed computing.
- Values of faulty processors are replaced by those of neighboring non-faulty ones.
- Berman and Garay (1993) use a parliamentary procedure called cloture.
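The neighbor-comparison rule for noise removal can be sketched as follows. This is one possible reading of the rule, under assumptions that are not in the talk: the image is a list of rows of numbers, the neighbors of a pixel are the up to eight surrounding pixels, and a pixel counts as noise when it differs from the median of its neighbors by more than `threshold`.

```python
from statistics import median


def denoise(image, threshold):
    """Replace each outlying pixel by the median of its neighbors."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so every test uses original values
    for r in range(rows):
        for c in range(cols):
            neighbors = [image[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if (rr, cc) != (r, c)]
            consensus_value = median(neighbors)
            if abs(image[r][c] - consensus_value) > threshold:
                out[r][c] = consensus_value
    return out


noisy = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 10]]
print(denoise(noisy, threshold=50))
```

Using the median rather than the mean keeps a single bad neighbor from distorting the replacement value, which is the same robustness argument made for medians as consensus functions.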
30Computational Intractability of Consensus
Functions
- Bartholdi, Tovey and Trick: there are voting schemes where it can be computationally intractable to determine who won an election.
- Computational intractability can be a good thing in an election: designing voting systems where it is computationally intractable to manipulate the outcome of an election by:
  - Insincere voting
  - Adding voters
  - Declaring voters ineligible
  - Adding candidates
  - Declaring candidates ineligible
31Electronic Voting
- Issues
- Correctness
- Anonymity
- Availability
- Security
- Privacy
-
32Electronic Voting
- Security Risks in Electronic Voting
- Threat of denial-of-service attacks
- Threat of penetration attacks involving a delivery mechanism to transport a malicious payload to the target host (through a Trojan horse or remote control program)
- Private and correct counting of votes
- Cryptographic challenges to keep votes private
- Relevance of work on secure multiparty computation
33Electronic Voting
- Other CS Challenges
- Resistance to vote buying
- Development of user-friendly interfaces
- Vulnerabilities of the communication path between the voting client (where you vote) and the server (where votes are counted)
- Reliability issues: random hardware and software failures
34Software and Hardware Measurement
- Theory of measurement developed by mathematical social scientists
- Measurement theory studies ways to combine scores obtained on different criteria.
- A statement involving scales of measurement is considered meaningful if its truth or falsity is unchanged under acceptable transformations of all scales involved.
- Example: it is meaningful to say that I weigh more than my daughter.
- That is because if it is true in kilograms, then it is also true in pounds, in grams, etc.
35Software and Hardware Measurement
- Measurement theory has studied what statements you can make after averaging scores.
- Think of averaging as a consensus method.
- One general principle: to say that the average score of one set of tests is greater than the average score of another set of tests is not meaningful (it is meaningless) under certain conditions.
- This is often the case if the averaging procedure is to take the arithmetic mean: if $s(x_i)$ is the score of $x_i$, $i = 1, 2, \ldots, n$, then the arithmetic mean is $\sum_i s(x_i)/n$.
- Long literature on what averaging methods lead to meaningful conclusions.
36Software and Hardware Measurement
- A widely used method in hardware measurement:
  - Score a computer system on different benchmarks.
  - Normalize score relative to performance of one base system.
  - Average normalized scores.
  - Pick system with highest average.
- Fleming and Wallace (1986): outcome can depend on choice of base system.
- Meaningless in the sense of measurement theory
- Leads to theory of merging normalized scores
37Software and Hardware Measurement
                            BENCHMARK
PROCESSOR       E       F       G       H       I
R             417      83      66  39,449     772
M             244      70     153  33,527     368
Z             134      70     135  66,000     369
Data from Heath, Comput. Archit. News (1984)
38Software and Hardware Measurement
- Normalize Relative to Processor R
(each cell: raw score, then score normalized to R)
                                      BENCHMARK
PROCESSOR      E             F             G             H             I
R            417  1.00      83  1.00      66  1.00  39,449  1.00     772  1.00
M            244   .59      70   .84     153  2.32  33,527   .85     368   .48
Z            134   .32      70   .85     135  2.05  66,000  1.67     369   .45
40Software and Hardware Measurement
- Take Arithmetic Mean of Normalized Scores
(each cell: raw score, then score normalized to R; last column: arithmetic mean of normalized scores)
                                      BENCHMARK
PROCESSOR      E             F             G             H             I        Mean
R            417  1.00      83  1.00      66  1.00  39,449  1.00     772  1.00  1.00
M            244   .59      70   .84     153  2.32  33,527   .85     368   .48  1.01
Z            134   .32      70   .85     135  2.05  66,000  1.67     369   .45  1.07
Conclude that machine Z is best
41Software and Hardware Measurement
- Now Normalize Relative to Processor M
(each cell: raw score, then score normalized to M)
                                      BENCHMARK
PROCESSOR      E             F             G             H             I
R            417  1.71      83  1.19      66   .43  39,449  1.18     772  2.10
M            244  1.00      70  1.00     153  1.00  33,527  1.00     368  1.00
Z            134   .55      70  1.00     135   .88  66,000  1.97     369  1.00
43Software and Hardware Measurement
- Take Arithmetic Mean of Normalized Scores
(each cell: raw score, then score normalized to M; last column: arithmetic mean of normalized scores)
                                      BENCHMARK
PROCESSOR      E             F             G             H             I        Mean
R            417  1.71      83  1.19      66   .43  39,449  1.18     772  2.10  1.32
M            244  1.00      70  1.00     153  1.00  33,527  1.00     368  1.00  1.00
Z            134   .55      70  1.00     135   .88  66,000  1.97     369  1.00  1.08
Conclude that machine R is best
44Software and Hardware Measurement
- So, the conclusion that a given machine is best by taking the arithmetic mean of normalized scores is meaningless in this case.
- Above example from Fleming and Wallace (1986), data from Heath (1984)
- Sometimes, the geometric mean is helpful.
- Geometric mean is $\sqrt[n]{\prod_i s(x_i)}$
45Software and Hardware Measurement
- Normalize Relative to Processor R
(each cell: raw score, then score normalized to R; last column: geometric mean of normalized scores)
                                      BENCHMARK
PROCESSOR      E             F             G             H             I        Mean
R            417  1.00      83  1.00      66  1.00  39,449  1.00     772  1.00  1.00
M            244   .59      70   .84     153  2.32  33,527   .85     368   .48   .86
Z            134   .32      70   .85     135  2.05  66,000  1.67     369   .45   .84
Conclude that machine R is best
46Software and Hardware Measurement
- Now Normalize Relative to Processor M
(each cell: raw score, then score normalized to M; last column: geometric mean of normalized scores)
                                      BENCHMARK
PROCESSOR      E             F             G             H             I        Mean
R            417  1.71      83  1.19      66   .43  39,449  1.18     772  2.10  1.17
M            244  1.00      70  1.00     153  1.00  33,527  1.00     368  1.00  1.00
Z            134   .55      70  1.00     135   .88  66,000  1.97     369  1.00   .99
Still conclude that machine R is best
47Software and Hardware Measurement
- In this situation, it is easy to show that the conclusion that a given machine has the highest geometric mean normalized score is a meaningful conclusion.
- Even meaningful: a given machine has geometric mean normalized score 20% higher than another machine.
- Fleming and Wallace give general conditions under which comparing geometric means of normalized scores is meaningful.
- Research area: what averaging procedures make sense in what situations? Large literature.
- Note: there are situations where comparing arithmetic means is meaningful but comparing geometric means is not.
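The dependence on the base system can be reproduced from the raw benchmark data. The sketch below assumes, as the slides do, that a higher normalized score is better; the function names are illustrative. It recomputes the normalized scores from the raw values rather than from the rounded table entries.

```python
from math import prod

# Raw benchmark scores (benchmarks E, F, G, H, I), data from Heath (1984).
raw = {
    "R": [417, 83, 66, 39449, 772],
    "M": [244, 70, 153, 33527, 368],
    "Z": [134, 70, 135, 66000, 369],
}


def normalized(base):
    """Divide every machine's scores by the base machine's scores."""
    return {machine: [s / b for s, b in zip(scores, raw[base])]
            for machine, scores in raw.items()}


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    return prod(xs) ** (1 / len(xs))


def winner(base, mean):
    """Machine with the highest mean normalized score for a given base."""
    scores = normalized(base)
    return max(scores, key=lambda machine: mean(scores[machine]))


for base in ("R", "M"):
    print(base, winner(base, arithmetic_mean), winner(base, geometric_mean))
```

The arithmetic-mean winner changes with the base (Z for base R, R for base M), while the geometric-mean winner is R in both cases. The reason is that changing the base multiplies every machine's geometric mean by the same constant, so the ordering cannot change.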
48Software and Hardware Measurement
- Message from measurement theory to computer science:
- Do not perform arithmetic operations on data without paying attention to whether the conclusions you get are meaningful.
49CS and SS: Outline
- 1. CS and Consensus/Social Choice
- 2. CS and Game Theory
- 3. Algorithmic Decision Theory
50CS and Game Theory
- Game theory: a long history in economics, also in operations research and mathematics
- Recently, computer scientists have been discovering its relevance to their problems
- Increasingly complex games arise in practical applications: auctions, the Internet
- Need new game-theoretic methods for CS problems.
- Need new CS methods to solve modern game theory problems.
51CS and Game Theory Algorithmic Issues
- Nash Equilibrium
- Each player chooses a strategy
- If no player can benefit by changing his strategy while the others leave theirs unchanged, we are in Nash equilibrium.
- In 1951, Nash showed that every finite game has a Nash equilibrium (in mixed strategies).
- How hard is this to compute?
John Nash, Nobel prize winner
52Example: Nash Equilibrium
- 2-player game
- Strategy: a number between 0 and 3
- Both players win the lower amount.
- The player with the higher amount pays 2 to the player with the lower amount.
Payoffs (player 1, player 2):
                     Player 2 strategy
Player 1 strategy    0       1       2       3
0                    0,0     2,-2    2,-2    2,-2
1                    -2,2    1,1     3,-1    3,-1
2                    -2,2    -1,3    2,2     4,0
3                    -2,2    -1,3    0,4     3,3
Source: Wikipedia
56Example: Nash Equilibrium
- 0-0 is the unique Nash equilibrium
- From any other pair of strategies, one player can lower his number below the other's (or down to 0) and improve his payoff.
- E.g., from 2-2, player 1 lowers his number to 1 (or player 2 lowers his to 1)
Payoffs (player 1, player 2):
                     Player 2 strategy
Player 1 strategy    0       1       2       3
0                    0,0     2,-2    2,-2    2,-2
1                    -2,2    1,1     3,-1    3,-1
2                    -2,2    -1,3    2,2     4,0
3                    -2,2    -1,3    0,4     3,3
Source: Wikipedia
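The claim that 0-0 is the only equilibrium can be checked by testing every strategy pair for profitable unilateral deviations. This is a minimal sketch restricted to pure strategies; the `payoff` function encodes the game rules stated on the slides, and the function names are illustrative.

```python
from itertools import product

STRATEGIES = range(4)  # each player names a number between 0 and 3


def payoff(a, b):
    """Payoffs (player 1, player 2) when player 1 names a and player 2 names b."""
    if a == b:
        return (a, a)
    if a < b:                    # player 2 named the higher number and pays 2
        return (a + 2, a - 2)
    return (b - 2, b + 2)        # player 1 named the higher number and pays 2


def pure_nash_equilibria():
    """Strategy pairs from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(STRATEGIES, repeat=2):
        u1, u2 = payoff(a, b)
        p1_content = all(payoff(x, b)[0] <= u1 for x in STRATEGIES)
        p2_content = all(payoff(a, y)[1] <= u2 for y in STRATEGIES)
        if p1_content and p2_content:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria())
```

Exhaustive checking like this takes time proportional to the size of the payoff table, which is why it does not scale to the large games discussed next.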
57CS and Game Theory Algorithmic Issues
- Nash Equilibrium
- 2-player games: can use linear programming methods.
- Recent powerful result (Daskalakis, Goldberg, Papadimitriou 2005): for 4-player games, the problem is PPAD-complete.
- (PPAD: class of search problems where a solution is known to exist by graph-theoretic arguments.)
- PPAD-complete means: if there exists a polynomial algorithm, then there exists one for Brouwer fixed points, which seems unlikely.
58CS and Game Theory Algorithmic Issues
- Other Algorithmic Challenges
- Repeated games.
- Issues of sequential decision making
- Issues of learning to play
- Other solution concepts in multi-player games: power indices (Shapley, Banzhaf, Coleman)
- Need to calculate them for huge games
- Mostly computationally intractable
- Arise in many applications in CS, e.g., multicasting
59Computational Issues in Auction Design
- Auctions increasingly used in business and government.
- Information technology allows complex auctions with huge numbers of bidders.
- Auctions are unusually complicated games.
60Computational Issues in Auction Design
Bidding functions maximizing expected profit can
be exceedingly difficult to compute. Determining
the winner of an auction can be extremely hard.
(Rothkopf, Pekec, Harstad 1998)
61Computational Issues in Auction Design
- Combinatorial Auctions
- Multiple goods auctioned off.
- Submit bids for combinations of goods.
- This leads to NP-complete allocation problems.
- Might not even be able to feasibly express all possible preferences for all subsets of goods.
- Rothkopf, Pekec, Harstad (1998): determining the winner is computationally tractable for many economically interesting kinds of combinations.
62Computational Issues in Auction Design
- Some Other Issues
- Internet auctions: unsuccessful bidders learn from previous auctions.
- Issues of learning in repeated plays of a game.
- Related to software agents acting on behalf of humans in electronic marketplaces based on auctions.
- Cryptographic methods needed to preserve privacy of participants.
63Allocating/Sharing Costs and Revenues
- Game-theoretic solutions have long been used to allocate costs to different users in shared projects:
  - Allocating runway fees in airports
  - Allocating highway fees to trucks of different sizes
  - Universities sharing library facilities
  - Fair allocation of telephone calling charges among users sharing complex phone systems (Cornell's experiment)
64Allocating/Sharing Costs and Revenues
- Shapley Value
- The Shapley value assigns a payoff to each player in a multi-player game.
- Consider a game in which some coalitions of players win and some lose, with no subset of a losing coalition winning.
- Consider a coalition forming at random, one player at a time.
- A player i is pivotal if the addition of i throws the coalition from losing to winning.
- Shapley value of i: probability that i is pivotal if an order of players is chosen at random.
- In such games with winners/losers, it is called the Shapley-Shubik power index.
Lloyd Shapley
65Allocating/Sharing Costs and Revenues
- Shapley Value
- Example: Board of Directors of a Company
- Shareholder 1 holds 3 shares.
- Shareholders 2, 3, 4, 5, 6, 7 hold 1 share each.
- A majority of shares is needed to make a decision.
- Coalition {1,4,6} is winning.
- Coalition {2,3,4,5,6} is winning.
- Shareholder 1 is pivotal if he is 3rd, 4th, or 5th.
- So shareholder 1's Shapley value is 3/7.
- Sum of Shapley values is 1 (since they are probabilities).
- Thus, each other shareholder has Shapley value (4/7)/6 = 2/21.
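The board-of-directors values can be confirmed by counting pivots over all orderings of the shareholders. This is a brute-force sketch that assumes a weighted majority game (a coalition wins when its shares reach the quota); with 9 shares in total, a majority means a quota of 5. It enumerates all 7! = 5040 orderings, so it illustrates the definition rather than a scalable method.

```python
from fractions import Fraction
from itertools import permutations


def shapley_shubik(weights, quota):
    """Probability that each player is pivotal in a uniformly random ordering."""
    n = len(weights)
    pivots = [0] * n
    orderings = 0
    for order in permutations(range(n)):
        orderings += 1
        total = 0
        for player in order:
            total += weights[player]
            if total >= quota:       # this player turns losing into winning
                pivots[player] += 1
                break
    return [Fraction(count, orderings) for count in pivots]


shares = [3, 1, 1, 1, 1, 1, 1]       # shareholder 1 first, then shareholders 2-7
index = shapley_shubik(shares, quota=5)
print(index[0], index[1])
```

Exact fractions are used so that the result can be compared directly with 3/7 and 2/21. The factorial growth in the number of orderings is one reason such indices are hard to compute for huge games.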
66Allocating/Sharing Costs and Revenues
- Shapley Value
- Allocating Runway Fees at Airports
- Larger planes require longer runways.
- Divide runways into meter-long segments.
- Each month, we know how many landings a plane has made.
- Given a runway of length y meters, consider a game in which the players are landings and a coalition wins if the runway is not long enough for the planes in the coalition.
67Allocating/Sharing Costs and Revenues
- Shapley Value
- Allocating Runway Fees at Airports
- A landing is pivotal if it is the first landing added that makes a coalition require a longer runway.
- The Shapley value gives the cost of the yth meter of runway to a given landing.
- We then add up these costs over all runway lengths a plane requires and all landings it makes.
68Allocating/Sharing Costs and Revenues
- Multicasting
- Applications in multicasting.
- Unicast routing: each packet sent from a source is delivered to a single receiver.
- Sending it to multiple sites: send multiple copies and waste bandwidth.
- In multicast routing: use a directed tree connecting the source to all receivers.
- At branch points, a packet is duplicated as necessary.
69Multicasting
70Allocating/Sharing Costs and Revenues
- Multicasting
- Multicast routing: use a directed tree connecting the source to all receivers.
- At branch points, a packet is duplicated as necessary.
- Bandwidth is not directly attributable to a single receiver.
- How to distribute costs among receivers?
- One idea: use the Shapley value.
71Allocating/Sharing Costs and Revenues
- Feigenbaum, Papadimitriou, Shenker (2001): no feasible implementation for Shapley value in multicasting.
- Note: the Shapley value is uniquely characterized by four simple axioms.
- Sometimes we state axioms as general principles we want a solution concept to have.
- Jain and Vazirani (1998): polynomial time computable cost-sharing algorithm
  - Satisfying some important axioms
  - Calculating cost of optimum multicast tree within a factor of two of optimal.
72Bounded Rationality
- Traditional game theory assumption: strategic agents are fully rational and can completely reason about the consequences of their actions.
- But: consider bounded computational power.
73Bounded Rationality
- Some issues:
- Looking at bounded rationality as bounded recall in repeated games.
- Modeling bounded rationality when strategies are limited to those implementable on finite state automata
- What are optimal strategies in large, complex games arising in CS applications for players with bounded computational power?
- E.g., how do players with limited computational power determine minimal bid increases in an auction to transform losing bids into winning ones?
74Streaming Data in Game Theory
- Streaming Data Analysis
- When you only have one shot at the data as it streams by
- Widely used to detect trends and sound alarms in applications in telecommunications and finance
- AT&T uses this to detect fraudulent use of credit cards or impending billing defaults
- Other relevant work: methods for detecting fraudulent behavior in financial systems
75Streaming Data in Game Theory
- Streaming Data Analysis
- One-pass mechanisms are of interest in game theory-based allocation schemes in multicasting (Herzog, Shenker, Estrin 1997)
- Arises in on-line auctions.
- Need to develop bidding strategies if only one pass is allowed
76CS and SS: Outline
- 1. CS and Consensus/Social Choice
- 2. CS and Game Theory
- 3. Algorithmic Decision Theory
77Algorithmic Decision Theory
- Decision makers in many fields (engineering, medicine, economics, ...) have:
  - Remarkable new technologies to use
  - Huge amounts of information to help them
  - Ability to share information at unprecedented speeds and quantities
78Algorithmic Decision Theory
- These tools bring daunting new problems:
- Massive amounts of data are often incomplete, unreliable, or distributed
- Interoperating/distributed decision makers and decision making devices need coordination
- Many sources of data need to be fused into a good decision.
- There are few highly efficient algorithms to support decisions.
79Sequential Decision Making
- Making some decisions before all data is in.
- Sequential decision problems arise in:
- Communication networks
  - Testing connectivity, paging cellular customers, sequencing tasks
- Manufacturing
  - Testing machines, fault diagnosis, routing customer service calls
80Sequential Decision Making
- Sequential decision problems arise in:
- Artificial Intelligence
  - Optimal derivation strategies in knowledge bases, best-value satisficing search, coding decision tables
- Medicine
  - Diagnosing patients, sequencing treatments
81Sequential Decision Making
- Online Text Filtering Algorithms
- We seek to identify interesting documents from a stream of documents
- Widely studied problem in machine learning
82Sequential Decision Making
- Online Text Filtering Algorithms: A Model
- As a document arrives, need to decide whether or not to present it to an oracle
- If the document is presented to the oracle and is interesting, get r reward units.
- If presented and not interesting, get penalty of c units.
- What is a strategy for maximizing expected payoff?
- See Fradkin and Littman (2005) for recent work using sequential decision making methods
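As a baseline for the model above, consider a one-step rule. The sketch assumes something the model does not provide: an estimated probability p that the arriving document is interesting. It also ignores the value of learning from the oracle's answers, which is what makes the real problem sequential. Under those assumptions, presenting pays off in expectation when p*r - (1-p)*c > 0, that is, when p > c/(r + c).

```python
def expected_payoff(p, r, c):
    """Expected payoff of presenting a document that is interesting with probability p."""
    return p * r - (1 - p) * c


def present_threshold(r, c):
    """Probability above which presenting has positive expected payoff."""
    return c / (r + c)


def should_present(p, r, c):
    """Myopic rule: present only when the expected payoff is positive."""
    return expected_payoff(p, r, c) > 0


# Hypothetical numbers: reward 3 units, penalty 1 unit.
print(present_threshold(3, 1))
print(should_present(0.3, 3, 1), should_present(0.2, 3, 1))
```

A full sequential strategy may deliberately present documents below this threshold in order to learn what the oracle finds interesting; the rule here only shows where the break-even point lies.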
83Inspection Problems
- Inspection problem: in what order to do tests to inspect containers for drugs, bombs, etc.?
- Do we inspect? What test do we do next? How do outcomes of earlier tests affect this decision?
- Simplest case: entities being inspected need to be classified as ok (0) or suspicious (1).
- Binary decision tree model for testing.
- Follow left branch if ok, right branch if suspicious.
- Find cost-minimizing binary decision tree.
84Inspection Problems
Follow left branch if ok, right branch if
suspicious.
85Sequential Decision Making Problem
- Some More Details
- Containers have attributes, each in a number of states
- Sample attributes:
  - Levels of certain kinds of chemicals or biological materials
  - Whether or not there are items of a certain kind in the cargo list
  - Whether cargo was picked up in a certain port
86Sequential Decision Making Problem
- Simplest case: attributes are in state 0 or 1.
- State 1 means the container has the attribute, and that is suspicious.
- Then: a container is a binary string like 011001.
- So: classification is a decision function F that assigns each binary string to a category 0 or 1, i.e., a Boolean function.
011001 -> F(011001)
If attributes 2, 3, and 6 are present and others are not, assign container to category F(011001).
87Binary Decision Tree Approach
- Reach category 1 from the root by:
  - a0 L to a1, R to a2, R to 1, or
  - a0 R to a2, R to 1
- Container classified in category 1 iff it has:
  - a1 and a2 and not a0, or
  - a0 and a2 and possibly a1.
- Corresponding Boolean function: F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.
88Binary Decision Tree Approach
- This binary decision tree corresponds to the same Boolean function: F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.
- However, it has one less observation node ai. So, it is more efficient if all observations are equally costly and equally likely.
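The equivalence of two decision trees can be checked by evaluating both on all eight binary strings. In the sketch below, `TREE_A` follows the paths described on the earlier slide (a0 at the root). The tree figures themselves are not in this transcript, so `TREE_B` is a hypothetical three-node tree for the same function, not necessarily the one shown in the talk. A tree is written as either a category (0 or 1) or a tuple (attribute index, left subtree, right subtree), with left taken when the attribute is 0.

```python
from itertools import product

# a0 at the root; left branch = attribute absent (0), right = present (1).
TREE_A = (0, (1, 0, (2, 0, 1)), (2, 0, 1))
# Hypothetical smaller tree: test a2 first, then a0, then a1.
TREE_B = (2, 0, (0, (1, 0, 1), 1))


def classify(tree, bits):
    """Follow the branches for the given attribute bits until a category is reached."""
    while not isinstance(tree, int):
        attribute, left, right = tree
        tree = right if bits[attribute] else left
    return tree


def observation_nodes(tree):
    """Number of internal (test) nodes in the tree."""
    if isinstance(tree, int):
        return 0
    _, left, right = tree
    return 1 + observation_nodes(left) + observation_nodes(right)


for bits in product((0, 1), repeat=3):
    label = "".join(map(str, bits))
    print(label, classify(TREE_A, bits), classify(TREE_B, bits))
print(observation_nodes(TREE_A), observation_nodes(TREE_B))
```

Counting nodes is only a proxy for cost. With unequal test costs or unequal attribute probabilities, the comparison would use expected cost instead.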
89Binary Decision Tree Approach
- Realistic problem much more difficult:
- Test result errors
- Tests cost different amounts of money and take different amounts of time
- There are queues to wait for testing
- One can adjust the thresholds of detectors.
- There are penalties for false negatives and false positives.
- Challenging problems for computer science
Gamma ray detector
90Inspection Problems
- Problem of finding optimal binary decision tree has many other uses:
  - AI rule-based systems
  - Circuit complexity
  - Reliability analysis
  - Theory of programming/databases
- In general, problem is NP-complete
91Inspection Problems
- Some cases of decision functions where the problem is tractable:
  - k-out-of-n systems
  - Certain series-parallel systems
  - Read-once systems
  - Regular systems
  - Horn systems
- Recent results in the case of inspection problems at ports: Stroud and Saeger (2004), Anand et al. (2006).
92Computational Approaches to Information
Management in Decision Making Representation and
Elicitation
- Successful decision making requires efficient elicitation of information and efficient representation of the information elicited.
- Old problems in the social sciences.
- Computational aspects becoming a focal point because of need to deal with massive and complex information.
93Computational Approaches to Information
Management in Decision Making
- Representation and Elicitation
- Example I: Social scientists study preferences: "I prefer beef to fish"
- Extracting and representing preferences is key in decision making applications.
94Computational Approaches to Information
Management in Decision Making
- Representation and Elicitation
- Brute force approach: for every pair of alternatives, ask which is preferred to the other.
- Often computationally infeasible.
95Computational Approaches to Information
Management in Decision Making
- Representation and Elicitation
- In many applications (repeated games, collaborative filtering), important to elicit preferences automatically.
- CP-nets introduced as a tool to represent preferences succinctly and provide ways to make inferences about preferences (Boutilier, Brafman, Domshlak, Hoos, Poole 2004).
96Computational Approaches to Information
Management in Decision Making Representation and
Elicitation
- Example II: combinatorial auctions.
- Decision maker needs to elicit preferences from all agents for all plausible combinations of items in the auction.
- Similar problem arises in optimal bundling of goods and services.
- Elicitation requires exponentially many queries in general.
97Computational Approaches to Information
Management in Decision Making Representation and
Elicitation
- Challenge: recognize situations in which efficient elicitation and representation is possible.
- One result: Fishburn, Pekec, Reeds (2002)
- Even more complicated: when objects in auction have complex structure.
- Problem arises in:
  - Legal reasoning, sequential decision making, automatic decision devices, collaborative filtering.
98Concluding Comment
- In recent years, interplay between CS and biology has transformed major parts of biology into an information science.
- Led to major scientific breakthroughs in biology such as sequencing of the human genome.
- Led to significant new developments in CS, such as database search.
- The interplay between CS and SS is not nearly as far along.
- Moreover, problems are spread over many disciplines.
99Concluding Comment
- However, CS-SS interplay has already developed a unique momentum of its own.
- One can expect many more exciting outcomes as partnerships between computer scientists and social scientists expand and mature.