CS 4700: Foundations of Artificial Intelligence - PowerPoint PPT Presentation

Provided by: csCor

Learn more at: https://www.cs.cornell.edu

Transcript and Presenter's Notes

Title: CS 4700: Foundations of Artificial Intelligence

1
CS 4700: Foundations of Artificial Intelligence

Carla P. Gomes
gomes_at_cs.cornell.edu
Module: Adversarial Search
(Reading: R&N Chapter 6)

2
Outline

Game Playing
Optimal decisions
Minimax
α-β pruning
Imperfect, real-time decisions

3
Game Playing

Mathematical Game Theory
Branch of economics that views any multi-agent
environment as a game, provided that the impact
of each agent on the others is significant,
regardless of whether the agents are cooperative
or competitive.
Game Playing in AI (typical case)
Deterministic
Turn taking
2-player
Zero-sum game of perfect information (fully
observable)

4
Game Playing vs. Search

Game vs. search problem
"Unpredictable" opponent → specifying a move for
every possible opponent reply
Time limits → unlikely to find goal, must
approximate

5
Game Playing

Formal definition of a game
Initial state
Successor function: returns a list of (move, state)
pairs
Terminal test: determines when the game is over
Terminal states: states where the game ends
Utility function (objective function or payoff
function): gives a numeric value for terminal
states

We will consider games with 2 players (Max and
Min); Max moves first.
6
Game Tree Example: Tic-Tac-Toe
Tree from Max's perspective
7
Minimax Algorithm

Minimax algorithm
Perfect play for a deterministic, 2-player game
Max tries to maximize its score
Min tries to minimize Max's score
Goal: move to the position of highest minimax value
→ Identify the best achievable payoff against best
play
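
A minimal sketch of minimax in Python, assuming the game tree is given explicitly as nested lists whose leaves are utilities from Max's point of view. The function name `minimax` and the nested-list encoding are illustrative assumptions, not part of the lecture; the leaf values 3, 9, 0, 7, 2, 6 are the ones that appear in the example slides.

```python
def minimax(node, max_to_move=True):
    """Return the minimax value of an explicit game tree.

    A node is either a number (utility of a terminal state, from Max's
    point of view) or a list of child nodes. This encoding is an
    assumption made for illustration.
    """
    if not isinstance(node, list):      # terminal state: return its utility
        return node
    child_values = [minimax(child, not max_to_move) for child in node]
    # Max picks the largest child value, Min the smallest.
    return max(child_values) if max_to_move else min(child_values)


# Max at the root, three Min nodes below it, two leaves under each.
tree = [[3, 9], [0, 7], [2, 6]]
print(minimax(tree))  # Min values are 3, 0, 2, so the root value is 3
```

The recursion visits every leaf, which is what makes the approach infeasible for large games.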

8
Minimax Algorithm
9
Minimax Algorithm
10
Minimax Algorithm (contd)
[Figure: game tree with a Max root, three Min nodes, and leaf values 3, 9 | 0, 7 | 2, 6]
11
Minimax Algorithm (contd)
[Figure: same tree with Min-node values filled in: 3, 0, 2 (leaf values 3, 9 | 0, 7 | 2, 6)]
12
Minimax Algorithm (contd)
[Figure: same tree with root (Max) value 3; Min-node values 3, 0, 2; leaf values 3, 9 | 0, 7 | 2, 6]
13
Minimax Algorithm (contd)

Properties of minimax algorithm
Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration,
if it generates all successors at once)

m = maximum depth of the tree; b = branching factor (number of legal moves)
For chess, b ≈ 35, m ≈ 100 for "reasonable"
games → exact solution completely infeasible
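
To get a feel for why an exact solution is infeasible, the size of a full tree with the chess figures quoted above (b = 35, m = 100) can be computed directly. This is only back-of-the-envelope arithmetic; the variable names are illustrative.

```python
import math

b, m = 35, 100               # branching factor and depth quoted for chess
leaves = b ** m              # exact integer: number of leaves in a full tree
digits = len(str(leaves))
print(digits)                        # 155 decimal digits
print(round(m * math.log10(b), 1))   # exponent of 10, about 154.4
```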
14
Minimax Algorithm

Limitations
Not always feasible to traverse entire tree
Time limitations
Key Improvement
Use evaluation function instead of utility
Evaluation function provides estimate of utility
at given position

→ More soon
15
α-β Pruning

Can we improve search by reducing the size of
the game tree to be examined?

→ Yes! Using alpha-beta pruning

Principle
If a move is determined worse than another move
already examined, then there is no need for
further examination of the node.

16
α-β Pruning Example
17
α-β Pruning Example (contd)
18
α-β Pruning Example (contd)
19
α-β Pruning Example (contd)
20
α-β Pruning Example (contd)
21
Alpha-Beta Pruning (α-β prune)

Rules of Thumb
α is the best (highest) value found so far along the
path for Max
β is the best (lowest) value found so far along the
path for Min
Search below a MIN node may be alpha-pruned if
its β ≤ α of some MAX ancestor
Search below a MAX node may be beta-pruned if
its α ≥ β of some MIN ancestor.
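
The two pruning rules above can be sketched in Python as follows. This is a minimal illustration, not the pseudocode from R&N: it assumes the game tree is an explicit nested list with numeric leaves (utilities for Max), and the function name `alphabeta` is hypothetical.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, max_to_move=True):
    """Minimax value of a nested-list tree, skipping branches that cannot matter.

    alpha: best (highest) value found so far along the path for Max.
    beta:  best (lowest) value found so far along the path for Min.
    """
    if not isinstance(node, list):      # leaf: utility for Max
        return node
    if max_to_move:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            if value >= beta:           # beta-prune: a MIN ancestor will never allow this
                return value
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        if value <= alpha:              # alpha-prune: a MAX ancestor already has better
            return value
        beta = min(beta, value)
    return value


print(alphabeta([[3, 9], [0, 7], [2, 6]]))  # 3, same value as plain minimax
```

In this tree the leaves 7 and 6 are never examined: once the second and third Min nodes see 0 and 2, they cannot beat the 3 already available to Max.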

22
Alpha-Beta Pruning Example

1. Search below a MIN node may be alpha-pruned if
the beta value is less than or equal to the alpha value of some
MAX ancestor.
2. Search below a MAX node may be beta-pruned if
the alpha value is greater than or equal to the beta value of some
MIN ancestor.

23
Alpha-Beta Pruning Example

[Figure: α-β pruning example tree, step 1]
24
Alpha-Beta Pruning Example

[Figure: α-β pruning example tree, step 2, with a β cutoff marked]
25
Alpha-Beta Pruning Example

[Figure: α-β pruning example tree, step 3, with α and β cutoffs marked]
26
Alpha-Beta Pruning Example

[Figure: α-β pruning example tree, step 4, with α and β cutoffs marked]
27
α-β Search Algorithm
[Figure: algorithm listing with the two pruning steps marked]
See page 170, R&N
28
The α-β algorithm
29
The α-β algorithm
30
Another Example

31
Example

[Figure: larger α-β pruning example tree with α and β cutoffs marked]
32
Why is it called α-β?

α is the value of the best (i.e., highest-value)
choice found so far at any choice point along the
path for Max
If v is worse than α, Max will avoid it
→ prune that branch
Define β similarly for Min

33
Properties of α-β Prune

Pruning does not affect final result
Good move ordering improves effectiveness of
pruning (e.g., in chess, try captures first, then
threats, then forward moves, then backward moves)
With "perfect ordering," time complexity is
O(b^(m/2))
→ doubles the depth of search that alpha-beta pruning
can explore

Example of the value of reasoning about which
computations are relevant (a form of
metareasoning)
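
The effect of move ordering can be seen on a tiny tree by counting how many leaves alpha-beta actually evaluates. The sketch below is illustrative only: the nested-list tree encoding, the helper names, and the two orderings of the same six leaves are assumptions made for the example.

```python
import math


def alphabeta_count(node, alpha, beta, max_to_move, counter):
    """Alpha-beta on a nested-list tree; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    value = -math.inf if max_to_move else math.inf
    for child in node:
        child_value = alphabeta_count(child, alpha, beta, not max_to_move, counter)
        if max_to_move:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:       # remaining siblings cannot change the result
            break
    return value


def leaves_visited(tree):
    """Return (minimax value, number of leaves evaluated)."""
    counter = [0]
    value = alphabeta_count(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


good_order = [[3, 9], [0, 7], [2, 6]]   # best reply for Min is tried first
bad_order = [[6, 2], [7, 0], [9, 3]]    # same leaves, listed in the opposite order
print(leaves_visited(good_order))  # (3, 4)
print(leaves_visited(bad_order))   # (3, 6)
```

Both orderings give the same value, as expected since pruning does not affect the final result, but the good ordering evaluates 4 leaves instead of all 6.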

34
Resource limits

Suppose we have 100 secs and explore 10^4 nodes/sec →
10^6 nodes per move
Standard approach:
evaluation function
= estimated desirability of position
cutoff test
e.g., depth limit

What is the problem with that?

→ add quiescence search
quiescent position: position where the next move is
unlikely to cause a large change in the players'
positions
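
A minimal sketch of the standard approach (cutoff test plus evaluation function), using a toy game that is not from the lecture: players alternately take 1 to 3 tokens from a pile, and whoever takes the last token wins. The game, the function names, and both evaluation functions are assumptions chosen to keep the example small. Quiescence search is not shown.

```python
def h_minimax(tokens, max_to_move, depth, evaluate):
    """Depth-limited minimax for the toy token game (value +1 = Max wins)."""
    if tokens == 0:
        # Terminal test: the previous player took the last token and won.
        return -1 if max_to_move else +1
    if depth == 0:
        # Cutoff test: stop searching and estimate the position instead.
        return evaluate(tokens, max_to_move)
    values = [
        h_minimax(tokens - take, not max_to_move, depth - 1, evaluate)
        for take in (1, 2, 3)
        if take <= tokens
    ]
    return max(values) if max_to_move else min(values)


def no_knowledge(tokens, max_to_move):
    """Evaluation function that knows nothing: every position looks even."""
    return 0


def multiple_of_four(tokens, max_to_move):
    """Evaluation function encoding the known theory of this toy game:
    the player to move loses exactly when the pile is a multiple of 4."""
    mover_wins = tokens % 4 != 0
    return (1 if mover_wins else -1) * (1 if max_to_move else -1)


print(h_minimax(9, True, 2, no_knowledge))      # 0: cutoff hides the win
print(h_minimax(9, True, 2, multiple_of_four))  # 1: good evaluation finds it
print(h_minimax(9, True, 9, no_knowledge))      # 1: deep enough to reach terminals
```

With the same shallow depth limit, the quality of the evaluation function decides whether the search sees the win.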

35
Cutoff Search

Suppose we have 100 secs and explore 10^4 nodes/sec →
10^6 nodes per move
Does it work in practice?
b^m = 10^6, b = 35 → m ≈ 4
4-ply lookahead is a hopeless chess player!
4-ply ≈ human novice
8-ply ≈ typical PC, human master
12-ply ≈ Deep Blue, Kasparov
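
The depth estimate can be checked with a few lines of arithmetic. The sketch solves b^m = 10^6 for m with b = 35, and then doubles it, on the assumption (from the α-β properties slide) that perfect move ordering reduces the cost to O(b^(m/2)).

```python
import math

budget = 100 * 10**4     # 100 seconds at 10^4 nodes per second = 10^6 nodes
b = 35                   # branching factor quoted for chess

# Solve b^m = budget for m.
plain_depth = math.log(budget) / math.log(b)
print(round(plain_depth, 2))       # 3.89, i.e. about 4 ply

# With perfectly ordered alpha-beta the same budget reaches about twice the depth.
print(round(2 * plain_depth, 2))   # 7.77, i.e. about 8 ply
```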

Other improvements
36
Evaluation Function

Evaluation function
Performed at search cutoff point
Must agree with the utility function on terminal/goal
states
Tradeoff between accuracy and time → reasonable
complexity
Accurate
Performance of game-playing system dependent on
accuracy/goodness of evaluation
Evaluation of nonterminal states strongly
correlated with actual chances of winning

37
Evaluation functions

For chess, typically a linear weighted sum of
features:
Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_n f_n(s)
e.g., w_1 = 9 with
f_1(s) = (number of white queens) - (number of
black queens), etc.
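
A small sketch of such a linear evaluation function, restricted to material. The queen weight of 9 comes from the slide; the remaining weights are the conventional material values and are an assumption here, as are the dictionary-based position encoding and the function name.

```python
# Weight 9 for the queen is from the slide; the other weights are the
# conventional material values and are assumptions for this example.
WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}


def material_eval(white_counts, black_counts):
    """Eval(s) = sum of w_i * f_i(s), with f_i = (white count) - (black count)."""
    return sum(
        weight * (white_counts.get(piece, 0) - black_counts.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )


white = {"Q": 1, "R": 2, "B": 2, "N": 2, "P": 8}
black = {"Q": 0, "R": 2, "B": 2, "N": 2, "P": 8}
print(material_eval(white, black))  # 9: White is up a queen
```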

Key challenge: find a good evaluation
function. Isolated pawns are bad. How well
protected is your king? How much maneuverability
do you have? Do you control the center of the
board? Strategies change as the game proceeds.
38
When Chance is Involved: Backgammon Board
39
Expectiminimax

Generalization of minimax for games with chance
nodes
Examples: backgammon, bridge
Calculates expected value, where the probability is
taken over all possible dice rolls/chance events
- Max and Min nodes determined as before
- Chance nodes evaluated as weighted average

40
Game Tree for Backgammon

C

41
Expectiminimax
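Expectiminimax(n) =
Utility(n), for n a terminal state
max over successors s of n of Expectiminimax(s), for n a Max node
min over successors s of n of Expectiminimax(s), for n a Min node
sum over successors s of n of P(s) · Expectiminimax(s), for n a chance node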
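
The four cases can be sketched directly in Python. The tuple-based node encoding and the small tree are assumptions for illustration; the probabilities 0.67, 0.33, and 1.0 echo the numbers visible in the example slide, but the tree shape here is not taken from it.

```python
def expectiminimax(node):
    """Value of a game tree with Max, Min, and chance nodes.

    Node encoding (an assumption for this sketch):
      ("leaf", utility)
      ("max", [children])
      ("min", [children])
      ("chance", [(probability, child), ...])
    """
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "max":
        return max(expectiminimax(child) for child in node[1])
    if kind == "min":
        return min(expectiminimax(child) for child in node[1])
    if kind == "chance":
        # Weighted average over all chance outcomes.
        return sum(p * expectiminimax(child) for p, child in node[1])
    raise ValueError(f"unknown node kind: {kind!r}")


risky = ("chance", [(0.67, ("leaf", 0)), (0.33, ("leaf", 6))])   # about 1.98
safe = ("chance", [(1.0, ("leaf", 3))])                          # exactly 3.0
print(expectiminimax(("max", [risky, safe])))  # 3.0: Max prefers the safe move
```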
42
Expectiminimax
43
Expectiminimax Example
[Figure: expectiminimax example tree. Chance-node values are probability-weighted averages, e.g. 0 × 0.67 + 6 × 0.33 ≈ 2 and 3 × 1.0 = 3]
44
Chess Case Study
45
Combinatorics of Chess

Opening book
Endgame
database of all 5-piece endgames exists; database
of all 6-piece endgames being built
Middle game
Positions evaluated (estimation):
1 move by each player: 1,000
2 moves by each player: 1,000,000
3 moves by each player: 1,000,000,000

46
Positions with Smart Pruning

Search Depth                  Positions
2                             60
4                             2,000
6                             60,000
8                             2,000,000
10 (< 1 second, Deep Blue)    60,000,000
12                            2,000,000,000
14 (5 minutes, Deep Blue)     60,000,000,000
16                            2,000,000,000,000

How many lines of play does a grandmaster
consider?
Around 5 to 7
47
(No Transcript)
48
Formal Complexity of Chess
How hard is chess?

Obvious problem: standard complexity theory tells
us nothing about finite games!
Generalizing chess to an NxN board: optimal play is
PSPACE-hard

49
Game Tree Search

How to search a game tree was independently
invented by Shannon (1950) and Turing (1951).
Technique called Minimax search.
Evaluation function combines material & position.
Pruning "bad" nodes: doesn't work in practice
Extending "unstable" nodes (e.g., after captures):
works well in practice (selective extensions)

50
A Note on Minimax

Minimax obviously correct -- but
Nau (1982) discovered pathological game trees
Games where
evaluation function grows more accurate as it
nears the leaves
but performance is worse the deeper you search!

51
Clustering

Monte Carlo simulations showed clustering is
important
if winning or losing terminal leaves tend to be
clustered, pathologies do not occur
in chess a position is strong or weak,
rarely completely ambiguous!
But still no completely satisfactory theoretical
understanding of why minimax is good!

52
History of Search Innovations

Shannon, Turing     Minimax search          1950
Kotok/McCarthy      Alpha-beta pruning      1966
MacHack             Transposition tables    1967
Chess 3.0           Iterative-deepening     1975
Belle               Special hardware        1978
Cray Blitz          Parallel search         1983
Hitech              Parallel evaluation     1985
Deep Blue           ALL OF THE ABOVE        1997

53
Evaluation Functions

Primary way knowledge of chess is encoded
material
position
doubled pawns
how constrained position is
Must execute quickly: constant time
parallel evaluation allows more complex
functions
tactics: patterns to recognize weak positions
arbitrarily complicated domain knowledge

54
Learning better evaluation functions

Deep Blue learns by tuning weights in its board
evaluation function
f(p) = w_1 f_1(p) + w_2 f_2(p) + ... + w_n f_n(p)
Tune weights to find the best least-squares fit with
respect to moves actually chosen by grandmasters
in 1000 games.
The key difference between the 1996 and 1997 matches!
Note that Kasparov also trained on
computer chess play.
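
The slides do not say how the fit was carried out. As a generic illustration of least-squares weight tuning for a linear evaluation function, the sketch below fits two weights by solving the 2x2 normal equations. The feature vectors, target values, and function name are all made up for the example and have nothing to do with Deep Blue's actual data.

```python
def fit_two_weights(features, targets):
    """Least-squares weights for f(p) = w1*f1(p) + w2*f2(p).

    Solves the normal equations (A^T A) w = A^T y by Cramer's rule,
    where each row of A is one position's feature pair (f1, f2).
    """
    s11 = sum(f1 * f1 for f1, _ in features)
    s12 = sum(f1 * f2 for f1, f2 in features)
    s22 = sum(f2 * f2 for _, f2 in features)
    t1 = sum(f1 * y for (f1, _), y in zip(features, targets))
    t2 = sum(f2 * y for (_, f2), y in zip(features, targets))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("features are linearly dependent")
    w1 = (t1 * s22 - s12 * t2) / det
    w2 = (s11 * t2 - s12 * t1) / det
    return w1, w2


# Hypothetical training data: feature pairs and the values the tuned
# evaluation function should reproduce for those positions.
features = [(1, 0), (0, 1), (1, 1)]
targets = [9, 5, 14]
print(fit_two_weights(features, targets))  # (9.0, 5.0)
```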

55
Transposition Tables

Introduced by Greenblatt's Mac Hack (1966)
Basic idea: caching
once a board is evaluated, save it in a hash table
to avoid re-evaluating it.
called transposition tables because different
orderings (transpositions) of the same set of
moves can lead to the same board.
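
The caching idea can be sketched with an ordinary Python dictionary as the hash table. The toy game (take 1 to 3 tokens, taking the last token wins), the key format, and the function name are assumptions for illustration; a chess program would key the table on a hash of the board instead.

```python
def solve(tokens, max_to_move, table, stats):
    """Full minimax for the toy token game, caching results by position."""
    key = (tokens, max_to_move)          # the same position however it was reached
    if key in table:
        stats["hits"] += 1               # transposition: reuse the stored value
        return table[key]
    stats["evaluated"] += 1
    if tokens == 0:
        value = -1 if max_to_move else +1    # previous player took the last token
    else:
        values = [
            solve(tokens - take, not max_to_move, table, stats)
            for take in (1, 2, 3)
            if take <= tokens
        ]
        value = max(values) if max_to_move else min(values)
    table[key] = value
    return value


stats = {"evaluated": 0, "hits": 0}
table = {}
print(solve(20, True, table, stats))  # -1: with 20 tokens the player to move loses
print(stats)
```

Different move orders such as "take 1, then 2" and "take 2, then 1" reach the same position, so each of the at most 42 distinct positions (21 pile sizes times 2 sides to move) is evaluated once; every other visit is a table hit.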

56
Transposition Tables as Learning

Is a form of rote learning (memorization).
positions generalize sequences of moves
learning on-the-fly
don't repeat blunders: can't beat the computer
twice in a row using the same moves!
Deep Blue: huge transposition tables
(100,000,000), must be carefully managed.

57
Time vs Space

Iterative Deepening
a good idea in chess, as well as almost
everywhere else!
Chess 4.x, first to play at Master's level
trades a little time for a huge reduction in
space
lets you do breadth-first search with (more space
efficient) depth-first search
anytime: good for response-time critical
applications
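
A minimal sketch of iterative deepening as an anytime procedure, again on a toy token game (take 1 to 3 tokens, taking the last token wins) that is not from the lecture. The function names and the time-budget parameter are assumptions. For simplicity the deadline is only checked between iterations, and the sketch returns a value rather than a move.

```python
import time


def depth_limited(tokens, max_to_move, depth):
    """Depth-first minimax with a depth limit; 0 means 'unknown' at the cutoff."""
    if tokens == 0:
        return -1 if max_to_move else +1
    if depth == 0:
        return 0
    values = [
        depth_limited(tokens - take, not max_to_move, depth - 1)
        for take in (1, 2, 3)
        if take <= tokens
    ]
    return max(values) if max_to_move else min(values)


def iterative_deepening(tokens, max_depth, time_budget_s):
    """Search to depth 1, 2, 3, ... and keep the deepest completed result."""
    deadline = time.monotonic() + time_budget_s
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                        # out of time: answer with what we have
        best = (depth, depth_limited(tokens, True, depth))
    return best


# Returns (deepest completed depth, value found at that depth).
print(iterative_deepening(9, max_depth=9, time_budget_s=5.0))
```

Each iteration is a space-efficient depth-first search, yet the depths are explored in breadth-first order, and a result is available whenever the budget runs out.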

58
Special-Purpose and Parallel Hardware

Belle (Thompson 1978)
Cray Blitz (1983)
Hitech (1985)
Deep Blue (1987-1996)
Parallel evaluation allows more complicated
evaluation functions
Hardest part: coordinating parallel search
Deep Blue never quite plays the same game,
because of noise in its hardware!

59
Deep Blue

Hardware
32 general processors
220 VLSI chess chips
Overall: 200,000,000 positions per second
5 minutes: depth 14
Selective extensions: search deeper at unstable
positions
down to depth 25!

60
Evolution of Deep Blue

From 1987 to 1996
faster chess processors
port to IBM base machine from Sun
Deep Blue's non-chess hardware is actually quite
slow in integer performance!
bigger opening and endgame books
1996 differed little from 1997 - fixed bugs and
tuned evaluation function!
After its loss in 1996, people underestimated its
strength!

61
(No Transcript)
62
Tactics into Strategy

As Deep Blue goes deeper and deeper into a
position, it displays elements of strategic
understanding. Somewhere out there mere tactics
translate into strategy. This is the closest
thing I've ever seen to computer intelligence.
It's a very weird form of intelligence, but you
can feel it. It feels like thinking.
Frederick Friedel (grandmaster), Newsday, May 9,
1997

63
Automated reasoning: the path
[Figure: case complexity plotted against problem size, with number of variables and number of rules (constraints) on the axes]
Application                                                        Variables   Rules   Case complexity
Car repair diagnosis                                               100         200     10^30
Deep space mission control                                         10K         50K     10^47
Chess (20 steps deep), Kriegspiel (!)                              20K         100K    10^3010
Military logistics                                                 100K        450K    10^6020
VLSI verification                                                  0.5M        1M      10^150,500
Multi-agent systems combining reasoning, uncertainty & learning    1M          5M      10^301,020
Reference points marked in the figure: no. of atoms on Earth; seconds until heat death of the Sun; protein folding calculation (petaflop-year)
25M DARPA research program, 2004-2009
64
Kriegspiel
Pieces hidden from opponent
Interesting combination of reasoning, game
tree search, and uncertainty.
Another chess variant: multiplayer asynchronous
chess.
65
The Danger of Introspection

When people express the opinion that human
grandmasters do not examine 200,000,000 move
sequences per second, I ask them, "How do you
know?" The answer is usually that human
grandmasters are not aware of searching this
number of positions, or are aware of searching
many fewer. But almost everything that goes on
in our minds we are unaware of.
Drew McDermott

State-of-the-art of other games

67
Deterministic games in practice

Checkers: Chinook ended the 40-year reign of human
world champion Marion Tinsley in 1994. Used a pre-computed
endgame database defining perfect play for all positions
involving 8 or fewer pieces on the board, a total of
444 billion positions.
2007: proved to be a draw! Schaeffer et al.
solved checkers for the White Doctor opening (draw)
(about 50 other openings).
Othello: human champions refuse to compete
against computers, which are too good.
Backgammon: TD-Gammon is competitive with the World
Champion (ranked among the top 3 players in the world).
Tesauro's approach (1992) used learning to come up with
a good evaluation function. Exciting application of
reinforcement learning.

68
Playing GO

Go: human champions refuse to compete against
computers, which are too bad. In Go, b > 300, so
most programs use pattern knowledge bases to
suggest plausible moves (R&N).

Not true! "Computer Beats Pro at U.S. Go Congress"
http//www.usgo.org/index.php?23_id460 2
On August 7, 2008, the computer program MoGo
running on 25 nodes (800 cores) beat professional
Go player Myungwan Kim (8p) in a handicap game on
the 19x19 board. The handicap given to the
computer was nine stones. MoGo uses Monte
Carlo based methods combined with upper
confidence bounds applied to trees (UCT).
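
For reference, the selection rule at the heart of UCT is the UCB1 formula: pick the child that maximizes its average reward plus an exploration bonus that shrinks as the child is visited more often. The sketch below is a generic illustration of that rule, not MoGo's implementation; the function names, the child tuples, and the exploration constant sqrt(2) are assumptions.

```python
import math


def ucb1(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """Upper confidence bound for one child; unvisited children come first."""
    if visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children):
    """children: list of (name, total_reward, visits). Returns the chosen name."""
    parent_visits = sum(visits for _, _, visits in children)

    def score(child):
        _, total_reward, visits = child
        mean = total_reward / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits)

    return max(children, key=score)[0]


# "a" has the better average (0.6 vs 0.5), but "b" has been tried far less
# often, so its exploration bonus is larger and it is selected.
print(select_child([("a", 6.0, 10), ("b", 1.0, 2)]))  # b
```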
69
Summary

Game systems rely heavily on
Search techniques
Heuristic functions
Bounding and pruning techniques
Knowledge database on game
For AI, the abstract nature of games makes them
an appealing subject for study
state of the game is easy to represent
agents are usually restricted to a small number
of actions whose outcomes are defined by precise
rules

70
Summary

Game playing was one of the first tasks
undertaken in AI as soon as computers became
programmable (e.g., Turing, Shannon, Wiener
tackled chess). Game playing research has
spawned a number of interesting research ideas on
search, data structures, databases, heuristics,
evaluation functions and many areas of computer
science.
Games are fun: teach your computer how to play
a game!
