Game Playing - PowerPoint PPT Presentation

Slides: 49
Provided by: Davi279

1
Game Playing
  • Perfect decisions
  • Heuristically based decisions
  • Pruning search trees
  • Games involving chance

2
What is a game?
  • Search problem with:
  • Initial state: board position and whose turn it is
  • Successor function: What are possible moves from here?
  • Terminal test: Is the game over?
  • Utility function: How good is this terminal state?

3
Differences from problem solving
  • Multiagent environment
  • Opponent makes own choices!
  • Playing quickly may be important: need a good
    way of approximating solutions and improving
    search

4
Starting point: Look at entire tree
5
Simple game
  • Let's play a game!
  • Motivate minimax

6
Minimax Decision
  • Assign a utility value to each possible ending
  • Assures best possible ending, assuming opponent
    also plays perfectly
  • opponent tries to give you worst possible ending
  • Depth-first search tree traversal that updates
    utility values as it recurses back up the tree
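
The depth-first backup described above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the nested-list tree encoding and the name `minimax` are assumptions made for the example, and the leaf utilities are those of the simple example game used in this deck.

```python
from typing import Union

Tree = Union[int, list]  # a leaf utility, or a list of child subtrees


def minimax(node: Tree, maximizing: bool) -> int:
    """Return the backed-up utility of `node` by depth-first recursion."""
    if isinstance(node, int):          # terminal state: utility is known
        return node
    child_values = [minimax(child, not maximizing) for child in node]
    # MAX picks the largest child value, MIN (the opponent) the smallest.
    return max(child_values) if maximizing else min(child_values)


# Assumed encoding of a two-ply game: MAX moves at the root, MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, maximizing=True))  # MIN backs up 3, 2, 2; MAX picks 3
```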

7
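Simple game for example: Minimax decision
[Figure: two-ply game tree. MAX (player) moves at the root; each of three MIN (opponent) nodes has three leaves, with utilities 3, 12, 8; 2, 4, 6; and 14, 5, 2.]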
8
Simple game for example: Minimax decision
[Figure: the same tree with backed-up values. The MIN (opponent) nodes take values 3, 2, 2; the MAX (player) root takes value 3.]
9
Properties of Minimax
  • Time complexity
  • O(b^m), for branching factor b and maximum depth m
  • Space complexity
  • O(bm) (or O(m) if you can just generate next
    successor)
  • Same complexity as depth-first search

10
Multiplayer games
  • Same strategy exactly, but each node has a
    utility for each player involved
  • Assume that each player maximizes own utility at
    each node
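
A minimal sketch of this multiplayer backup, assuming three players who move in a fixed rotation and leaves that hold one utility per player. The function name `maxn`, the tree encoding, and the numbers are all hypothetical.

```python
def maxn(node, player, n_players):
    """Back up a utility vector; the player to move maximizes own entry."""
    if isinstance(node, tuple):        # leaf: one utility per player
        return node
    next_player = (player + 1) % n_players
    results = [maxn(child, next_player, n_players) for child in node]
    return max(results, key=lambda utilities: utilities[player])


# Player 0 moves at the root, player 1 at the next level.
tree = [[(1, 2, 6), (4, 3, 3)],
        [(6, 1, 2), (7, 0, 1)]]
print(maxn(tree, player=0, n_players=3))  # (6, 1, 2)
```

Player 1 picks (4, 3, 3) on the left and (6, 1, 2) on the right, since those maximize the second entry; player 0 then prefers the right branch.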

12
Typical tree size
  • For chess, b ≈ 35, m ≈ 100 for a reasonable
    game
  • completely intractable!

13
So what can you do?
  • Cut off search early and apply a heuristic
    evaluation function
  • Evaluation function can assign point values to
    pieces, board position, and/or other
    characteristics
  • Evaluation function represents, in some sense,
    the probability of winning
  • In practice, evaluation function is often a
    weighted sum
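
As an illustration of a weighted-sum evaluation function, here is a small Python sketch. The two features (material and mobility), the weights, and the function names are assumptions chosen for the example; the piece values are the conventional textbook ones, and real programs tune such numbers.

```python
# Conventional piece values, used here only as example feature weights.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material(pieces: str) -> int:
    """Sum piece values for a string such as 'QRRPPP'."""
    return sum(PIECE_VALUES[p] for p in pieces)


def evaluate(my_pieces: str, their_pieces: str,
             my_mobility: int, their_mobility: int,
             w_material: float = 1.0, w_mobility: float = 0.1) -> float:
    """Weighted sum of feature differences; positive favors 'me'."""
    return (w_material * (material(my_pieces) - material(their_pieces))
            + w_mobility * (my_mobility - their_mobility))


# Queen for rook (+4 material) plus 5 extra legal moves: about 4.5.
print(evaluate("QRPP", "RRPP", my_mobility=30, their_mobility=25))
```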

14
When do you cut off search?
  • Most straightforward: depth limit
  • ... or even iterative deepening
  • Bad in some cases
  • What if just beyond depth limit, catastrophic
    move happens?
  • One fix: only apply evaluation function to
    quiescent states, i.e. those unlikely to have wild
    swings in evaluation function
  • Example: no pieces about to be captured
  • Run test on state: if not quiescent, run a
    quiescence search for a nearby suitable state
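
The cutoff rule above can be sketched as a depth-limited minimax that keeps searching past the depth limit while a state is not quiescent. Everything here is a toy assumption: the dict-based state layout, the names `cutoff_search` and `node`, and the `max_extension` cap that keeps the quiescence search from running forever.

```python
def cutoff_search(state, depth, maximizing, is_quiescent, max_extension=2):
    """Depth-limited minimax that only evaluates quiescent states.

    `state` is a dict with keys 'eval' (heuristic value), 'quiet' (bool)
    and 'children' (list of states); this layout is assumed for the demo.
    """
    children = state["children"]
    if not children:                      # terminal test
        return state["eval"]
    out_of_depth = depth <= 0
    may_extend = depth > -max_extension   # cap on the quiescence search
    if out_of_depth and (is_quiescent(state) or not may_extend):
        return state["eval"]              # cutoff: apply evaluation function
    values = [cutoff_search(c, depth - 1, not maximizing,
                            is_quiescent, max_extension) for c in children]
    return max(values) if maximizing else min(values)


def node(value, quiet=True, children=()):
    return {"eval": value, "quiet": quiet, "children": list(children)}


# Move b looks better (3 vs 1) at the depth limit, but it is not quiet:
# the opponent has a reply worth -5 just beyond the limit.
a = node(1, children=[node(0)])
b = node(3, quiet=False, children=[node(-5)])
root = node(0, children=[a, b])

plain = cutoff_search(root, 1, True, is_quiescent=lambda s: True)
quiet = cutoff_search(root, 1, True, is_quiescent=lambda s: s["quiet"])
print(plain, quiet)  # 3 1
```

With a plain depth limit the search trusts the misleading value 3; with the quiescence test it looks one move further and settles for the safer value 1.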

15
Horizon Effect
  • One piece is about to transform the game
  • e.g. pawn becoming queen
  • Opponent can prevent this for a long time, but
    not forever
  • Minimax places this stellar move beyond the
    horizon
  • Procrastination
  • Resolved (somewhat) with singular extensions
  • Go much deeper on best moves
  • Related to quiescence search

16
How much lookahead for chess?
  • Ply: half-move
  • Human novice: 4 ply
  • Typical PC, human master: 8 ply
  • Deep Blue, Deep Fritz: 10-20 ply
  • Kasparov, Kramnik: 20-30 ply, but only on select
    strategies
  • But if b = 35, m = 10 (for example)
  • Time: O(b^m) = 35^10 ≈ 2.8 x 10^15
  • Need to cut this down
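
To get a feel for these numbers, the leaf count b^m of a uniform tree can be computed exactly with Python's arbitrary-precision integers. The helper name `leaf_count` is an assumption; b = 35 and the depths 10 and 100 are the figures quoted in this deck.

```python
def leaf_count(b: int, m: int) -> int:
    """Number of leaves in a uniform tree: branching factor b, depth m."""
    return b ** m


for m in (10, 100):
    n = leaf_count(35, m)
    # Report the order of magnitude via the number of decimal digits.
    print(f"b=35, m={m}: about 10^{len(str(n)) - 1} leaves")
```

Ten ply already gives on the order of 10^15 leaves, and a full 100-ply game is on the order of 10^154.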

17
Alpha-Beta Pruning Example
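[Figure: partial game tree. MAX (player) at the root; the first MIN (opponent) node has value 3 from leaves 3, 12, 8; the first leaf of the second MIN node is 2.]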
18
Alpha-Beta Pruning Example
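  • Stop right here when evaluating this node:
  • opponent takes minimum of these nodes,
  • player will take maximum of nodes above

[Figure: the same partial tree, with the MAX (player) root labeled 3. The first MIN (opponent) node has value 3 from leaves 3, 12, 8; the first leaf of the second MIN node is 2.]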
19
Alpha-Beta Pruning Concept
If m > n, the player would choose the m-node to get a
guaranteed utility of at least m. The n-node would
never be reached; stop evaluation of the n-node as
soon as you find a child with smaller utility.
[Figure: game tree with nodes labeled m and n.]
20
Alpha-Beta Pruning Concept
If m < n, the opponent would choose the m-node to get
a guaranteed utility of at most m. The n-node would never
be reached; stop evaluation of the n-node as soon as
you find a child > m.
[Figure: game tree with nodes labeled m and n.]
21
The Alpha and the Beta
  • For a leaf, α = β = utility
  • At a max node
  • α = largest child utility found so far for MAX
  • β = β of parent
  • At a min node
  • α = α of parent
  • β = smallest child utility found so far for MIN
  • For any node
  • α <= utility <= β
  • "If I had to decide now, it would be..."
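
The bookkeeping above can be sketched in Python as follows. This is one common way to write alpha-beta, offered as an illustration rather than the presenter's implementation: the nested-list tree, the `visited` list used to show which leaves get evaluated, and the cutoff test `alpha >= beta` are choices made for the example. The leaf utilities are those of the simple example game in this deck.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax value of `node` with alpha-beta pruning.

    Leaves are ints; internal nodes are lists of children. `visited`
    collects the leaves that were actually evaluated.
    """
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)   # best found so far for MAX
            if alpha >= beta:           # MIN above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)         # best found so far for MIN
        if alpha >= beta:               # MAX above already has something better
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))  # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

The leaves 4 and 6 are never evaluated: once the second MIN node sees 2, it cannot beat the 3 that MAX already has.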

22
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = -inf, β = inf
D: α = -inf, β = inf
E: α = 10, β = 10, utility = 10
Originally from http://yoda.cis.temple.edu:8080/UGAIWWW/lectures95/search/alpha-beta.html
23
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = -inf, β = inf
D: α = -inf, β = 10
E: α = 10, β = 10
24
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = -inf, β = inf
D: α = -inf, β = 10
F: α = 11, β = 11
25
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = -inf, β = inf
D: α = -inf, β = 10, utility = 10
F: α = 11, β = 11, utility = 11
26
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = 10, β = inf
D: α = -inf, β = 10, utility = 10
27
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = 10, β = inf
G: α = 10, β = inf
28
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = 10, β = inf
G: α = 10, β = inf
H: α = 9, β = 9, utility = 9
29
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = 10, β = inf
G: α = 10, β = 9, utility = ?
H: α = 9, β = 9
At an opponent node, with α >= β: stop here and
backtrack (never visit I)
30
A: α = -inf, β = inf
B: α = -inf, β = inf
C: α = 10, β = inf, utility = 10
G: α = 10, β = 9, utility = ?
31
A: α = -inf, β = inf
B: α = -inf, β = 10
C: α = 10, β = inf, utility = 10
32
A: α = -inf, β = inf
B: α = -inf, β = 10
J: α = -inf, β = 10
... and so on!
33
How effective is alpha-beta in practice?
  • Pruning does not affect final result
  • With some extra heuristics (good move ordering)
  • Branching factor becomes b^(1/2)
  • 35 → about 6
  • Can look ahead twice as far for same cost
  • Can easily reach depth 8 and play good chess
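
A quick arithmetic check of the "twice as far for same cost" claim, assuming the idealized case where move ordering reduces the effective branching factor to exactly the square root of b. The variable names are illustrative.

```python
import math

b, depth = 35, 8
effective_b = math.sqrt(b)          # about 5.9, i.e. roughly 6
plain_cost = b ** (depth // 2)      # plain minimax searching to depth 4
pruned_cost = effective_b ** depth  # well-ordered alpha-beta to depth 8

# Both costs are about 1.5 million leaves.
print(round(effective_b, 2), plain_cost, round(pruned_cost))
```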

34
Deterministic games today
  • Checkers: Chinook ended the 40-year reign of human
    world champion Marion Tinsley in 1994. Used an
    endgame database defining perfect play for all
    positions involving 8 or fewer pieces on the
    board, a total of 443,748,401,247 positions.
  • Othello: human champions refuse to compete
    against computers, who are too good.
  • Go: human champions refuse to compete against
    computers, who are too bad. In Go, b > 300, so
    most programs use pattern knowledge bases to
    suggest plausible moves.

35
Deterministic games today
  • Chess: Deep Blue defeated human world champion
    Garry Kasparov in a six-game match in 1997. Deep
    Blue searched 197 million positions per second,
    used very sophisticated evaluation, and
    undisclosed methods for extending some lines of
    search up to 40 ply.

36
More on Deep Blue
  • Garry Kasparov, world champ, beat IBM's Deep Blue
    in 1996
  • In 1997, played a rematch
  • Game 1: Kasparov won
  • Game 2: Kasparov resigned when he could have had
    a draw
  • Game 3: Draw
  • Game 4: Draw
  • Game 5: Draw
  • Game 6: Kasparov made some bad mistakes, resigned

Info from http://www.mark-weeks.com/chess/97dk.htm
37
Kasparov said...
  • Unfortunately, I based my preparation for this
    match ... on the conventional wisdom of what
    would constitute good anti-computer
    strategy. Conventional wisdom is -- or was until
    the end of this match -- to avoid early
    confrontations, play a slow game, try to
    out-maneuver the machine, force positional
    mistakes, and then, when the climax comes, not
    lose your concentration and not make any tactical
    mistakes. It was my bad luck that this strategy
    worked perfectly in Game 1 -- but never again for
    the rest of the match. By the middle of the
    match, I found myself unprepared for what turned
    out to be a totally new kind of intellectual
    challenge.

http://www.cs.vu.nl/aske/db.html
38
Some technical details on Deep Blue
  • 32-node IBM RS/6000 supercomputer
  • Each node has a Power Two Super Chip (P2SC)
    Processor and 8 specialized chess processors
  • Total of 256 chess processors working in parallel
  • Could calculate 60 billion moves in 3 minutes
  • Evaluation function (tuned via neural networks)
    considers:
  • material: how much pieces are worth
  • position: how many safe squares can pieces attack
  • king safety: some measure of king safety
  • tempo: have you accomplished little while
    opponent has gotten better position?
  • Written in C under AIX Operating System
  • Uses MPI to pass messages between nodes

http://www.research.ibm.com/deepblue/meet/html/d.3.3a.html
39
Deep Fritz
  • Played world champion Vladimir Kramnik in 2002
  • Fairer contest: Kramnik could play with Deep
    Fritz software in advance
  • Ran on 40k 8-processor Compaq server running
    Windows XP, essentially same software sold for
    normal computers
  • Searched fewer moves per second than Deep Blue,
    but heuristics were better

Pic from ww.chess.gr
40
Kramnik starts strong
  • Game 1: Kramnik black, Fritz white
  • Typically play to a draw when playing black.
    Fritz ended up in Berlin endgame, which Kramnik
    knows better than anyone. Kramnik sealed a draw.
  • Game 2: Kramnik white, Fritz black
  • Fritz makes a dreadfully stupid mistake that
    even beginners don't make. Kramnik wins.
    http://www.chessbase.com/images2/2002/bahrain/games/bahrain2.htm
  • Game 3: Kramnik black, Fritz white
  • Fritz traded queens, but couldn't fight this kind
    of battle. Kramnik wins.

41
But later
  • Game 4: Kramnik white, Fritz black
  • Kramnik ended up in a long, drawn-out ending
    resulting in a draw
  • Game 5: Kramnik black, Fritz white
  • Deep in a difficult game, Kramnik makes worst
    mistake of career and resigns. Fritz wins.
  • Game 6: Kramnik white, Fritz black
  • Kramnik resigns, but analysis after the fact
    hasn't found a certain win for black. Fritz wins.
  • Game 7: Kramnik black, Fritz white
  • Kramnik plays to draw
  • Game 8: Kramnik white, Fritz black
  • 21 moves in, Kramnik can't do anything, offers
    draw and Fritz accepts

42
Alpha-Beta Pruning: Coding It
  (defun max-value (state alpha beta)
    ;; CUTOFF-TEST, EVALUATE and NEIGHBORS are assumed to be defined elsewhere.
    (if (cutoff-test state)
        (evaluate state)
        ;; DOLIST returns ALPHA when no cutoff occurs.
        (dolist (new-state (neighbors state) alpha)
          (setf alpha (max alpha (min-value new-state alpha beta)))
          (when (>= alpha beta) (return beta)))))

43
Alpha-Beta Pruning: Coding It
  (defun min-value (state alpha beta)
    ;; Mirror image of MAX-VALUE: the opponent lowers BETA.
    (if (cutoff-test state)
        (evaluate state)
        ;; DOLIST returns BETA when no cutoff occurs.
        (dolist (new-state (neighbors state) beta)
          (setf beta (min beta (max-value new-state alpha beta)))
          (when (<= beta alpha) (return alpha)))))

44
Nondeterministic Games
  • Games with an element of chance (e.g., dice,
    drawing cards) like backgammon, Risk, RoboRally,
    Magic, etc.
  • Add chance nodes to tree

45
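Example with coin flip instead of dice (simple)
[Figure: game tree with a MAX root, two chance nodes whose branches each have probability 0.5, and four MIN nodes with leaf utilities 2, 4; 7, 4; 6, 0; and 5, -2.]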
46
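Example with coin flip instead of dice (simple)
[Figure: the same tree with backed-up values. The MIN nodes take values 2, 4, 0, -2; the chance nodes take expected values 3 and -1; the root takes value 3.]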
47
Expectiminimax Methodology
  • For each chance node, determine expected value
  • Evaluation function should be linear with value,
    otherwise expected value calculations are wrong
  • Evaluation should be linearly proportional to
    expected payoff
  • Complexity: O(b^m n^m), where n = number of random
    states (distinct dice rolls)
  • Alpha-beta pruning can be done
  • Requires a bounded evaluation function
  • Need to calculate upper / lower bounds on
    utilities
  • Less effective
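
The expected-value backup at chance nodes can be sketched in Python as follows. The tuple-based tree encoding and the helper `coin` are assumptions made for the example; the leaf utilities and the 0.5 probabilities are those of the coin-flip example in this deck.

```python
def expectiminimax(node):
    """Evaluate a tree of ('max' | 'min' | 'chance', children) nodes.

    Leaves are plain numbers. A chance node's children are
    (probability, subtree) pairs.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "chance":
        # Expected value: probability-weighted average of the outcomes.
        return sum(p * expectiminimax(child) for p, child in children)
    values = [expectiminimax(child) for child in children]
    return max(values) if kind == "max" else min(values)


def coin(heads, tails):
    """Chance node for a fair coin flip."""
    return ("chance", [(0.5, heads), (0.5, tails)])


tree = ("max", [
    coin(("min", [2, 4]), ("min", [7, 4])),
    coin(("min", [6, 0]), ("min", [5, -2])),
])
print(expectiminimax(tree))  # 3.0
```

The MIN nodes back up 2, 4, 0 and -2, the chance nodes average these to 3 and -1, and MAX picks the left move.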

48
Real World
  • Most gaming systems start with these concepts,
    then apply various hacks and tricks to get around
    computability problems
  • Databases of stored game configurations
  • Learning (coming up next): Chapter 18