Title: Artificial Intelligence in Games
1Artificial Intelligence in Games
- CA107 Topics in Computing
- Dr. David Sinclair
- School of Computer Applications
- David.Sinclair_at_computing.dcu.ie
2What is Artificial Intelligence?
- There are many answers but the simplest
definition is:
- A field of research whose goal is to make
machines that do things that require
intelligence if done by a human.
- Intelligence is the ability to learn and
understand, to solve problems and make decisions.
3Why add A.I. to games?
- From the A.I. side of the house
- Games are excellent testbeds as they
- Have well-defined rules generating a large search
space
- Are easily represented in a computer
- Are easy to test
- From the Games side of the house
- A.I. can make the game much more enjoyable to
play.
4Search
- The brute force approach of search has been
highly effective in games such as Draughts and
Chess.
- Draughts/Checkers
- Chinook (World Champion)
- Chess
- Best programs can hold their own with the best
humans.
- Deep Blue II
- move generation and evaluation in hardware
- parallel search in software
5Total Search
- From the starting position:
1. Generate every legal move for player 1.
2. For each legal move of player 1 generate every
legal move for player 2.
3. Repeat steps 1 and 2 until the game reaches a
definitive result.
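The procedure above can be sketched on a game small enough to search completely. The game used here is an assumption chosen for illustration, not one from the slides: a pile of stones, each player removes 1 to 3 per turn, and whoever takes the last stone wins. `legal_moves` and `total_search` are hypothetical helper names.

```python
def legal_moves(stones):
    """Every legal move from a position: take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def total_search(stones):
    """Search to the end of the game from a pile of `stones`.

    Returns (result, positions): result is +1 if the player to move can
    force a win and -1 otherwise; positions counts every position visited.
    """
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1, 1
    best, positions = -1, 1
    for take in legal_moves(stones):
        result, visited = total_search(stones - take)
        positions += visited
        best = max(best, -result)  # the opponent's loss is our win
    return best, positions


result, positions = total_search(4)
print(result, positions)
```

Even in this toy game the number of positions visited grows quickly with the size of the pile, which is why the same approach does not scale to Chess.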
6Problem with Total Search
- Not practical
- A player in chess has, on average, 36 legal
moves.
- A game could take 45 moves to reach a conclusion
(underestimate).
- Total number of positions: 36^90 (45 moves by
each player gives 90 turns)
- There are only about 10^81 atoms in the universe
- Couldn't store all the positions in a computer the
size of the universe.
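The comparison can be checked with a few lines of arithmetic. This sketch reads the slide's figures as 36^90 positions (36 moves per turn, 90 turns) and 10^81 atoms; the variable names are illustrative only.

```python
import math

BRANCHING = 36       # average legal moves per position (from the slide)
PLIES = 90           # 45 moves by each of the two players
ATOMS_EXPONENT = 81  # the slide's estimate of 10^81 atoms

# Work with base-10 exponents so the sizes are easy to compare.
positions_exponent = PLIES * math.log10(BRANCHING)

print(f"36^90 is about 10^{positions_exponent:.0f}")
print(f"positions per atom: about 10^{positions_exponent - ATOMS_EXPONENT:.0f}")
```

Under these assumptions the number of positions exceeds the number of atoms by a factor of roughly 10^59.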
7Evaluation Functions
- Searching from a position to a definitive result
is not practical.
- Generate all possible outcomes in a fixed number
of moves from a position.
- This builds a game tree
- For each terminal position in the game tree
calculate the likelihood that the terminal
position will result in a win, loss or draw for
the player moving.
8Searching the Game Tree
[Figure: game tree with node values -2, -3, -5, -2, -2, -5, 1, 4, 2, -3]
This is the Minimax Algorithm
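A minimal sketch of minimax follows. The tree layout is hypothetical, since the shape of the slide's tree is not recoverable from the text: a node is either a number (the evaluation of a terminal position from the maximising player's point of view) or a list of child nodes.

```python
def minimax(node, maximising):
    """Back up leaf evaluations through a game tree.

    The maximising player picks the child with the highest value, the
    minimising player (the opponent) picks the lowest.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximising) for child in node]
    return max(values) if maximising else min(values)


# Hypothetical two-ply tree: three moves for us, each with three replies.
tree = [[-2, 4, 1], [-3, 2, -5], [1, -2, -5]]
best_value = minimax(tree, maximising=True)
print(best_value)
```

In this example the opponent's best replies are worth -2, -5 and -5, so the best we can guarantee is -2.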
9Improving Minimax
- The Minimax Algorithm has various improvements
that are used in practice.
- Alpha-Beta
- Principal Variation Search (PVS)
- Transposition Tables
- Killer Move Heuristics
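- At best (alpha-beta with perfect move ordering)
they cut the positions searched to about the
square root of the full tree, so the search can
go roughly twice as deep for the same work.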
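Of the improvements listed, alpha-beta is the simplest to sketch. The version below is an illustrative sketch under the same assumed tree layout as a plain minimax over nested lists (numbers are leaf evaluations); the `visited` parameter is a hypothetical addition used only to show how many leaves were actually evaluated.

```python
import math


def alphabeta(node, maximising, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot change it.

    `alpha` is the best value the maximiser can already guarantee and
    `beta` the best the minimiser can already guarantee.
    """
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)
        return node
    if maximising:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimising player will never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximising player already has a better option
    return value


tree = [[-2, 4, 1], [-3, 2, -5], [1, -2, -5]]  # hypothetical example tree
leaves_seen = []
value = alphabeta(tree, True, visited=leaves_seen)
print(value, len(leaves_seen))
```

On this small tree the result is the same as full minimax, but only 6 of the 9 leaves are evaluated. How much is saved depends heavily on the order in which moves are tried.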
10Computer Chess
- Deep Blue II
- 256 dedicated chess processors
- generate moves
- evaluate positions
- Search process in software (PVS)
- Database of opening sequences
- Databases of endgame sequences
- Deep Blue II can evaluate 200 million positions
per second (36 billion in 3 minutes).
- Deep Blue II can hold its own with the best
players in the world, but it is not invincible!
11Learning
- Backgammon is a very interesting testing ground
for computer game playing for two reasons:
- the stochastic nature of the game and
- the experience that an accurate evaluation of a
position is far more effective than a deep
search.
- Backgammon (TD-Gammon)
- In the top 10 (6th)
- One of the top human players says it has a better
evaluation capability than him.
- Has changed the way humans play backgammon.
12Learning how to evaluate a position
- TD-Gammon evaluates positions with a neural
network that has trained itself by playing over
200,000 games against itself.
- From an initial state of knowing the rules and
zero strategic/tactical knowledge:
- The network learned a number of elementary
strategies and tactics during the first few
thousand training games against itself.
- After several tens of thousands of training games
more sophisticated concepts began to emerge.
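The idea behind the "TD" in TD-Gammon, temporal-difference learning, is to nudge the value of each position towards the value of the position that follows it. The sketch below is a deliberately simplified stand-in, not TD-Gammon's method: it uses a lookup table instead of a neural network, tabular TD(0), and a toy random walk instead of backgammon. The function name and all parameters are assumptions made for illustration.

```python
import random


def td_random_walk(episodes=20000, step_size=0.02, seed=0):
    """Learn position values on a 7-square random walk with TD(0).

    Squares 0 and 6 end the game (a loss worth 0 and a win worth 1);
    from squares 1..5 the token moves left or right at random.
    """
    rng = random.Random(seed)
    value = [0.0] + [0.5] * 5 + [1.0]  # terminal values are the outcomes
    for _ in range(episodes):
        square = 3
        while 0 < square < 6:
            next_square = square + rng.choice((-1, 1))
            # Nudge this position's value towards the next position's value.
            value[square] += step_size * (value[next_square] - value[square])
            square = next_square
    return value


learned = td_random_walk()
print([round(v, 2) for v in learned])
```

The true chance of winning from square i in this toy game is i/6, and the learned values should settle close to that without the program ever being told it.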
13Learning how to evaluate a position (contd.)
- After 200,000 training games with a basic board
encoding the network was as strong as its
predecessor NeuroGammon.
- NeuroGammon was trained on a corpus of expert
games and used a sophisticated board encoding.
- When TD-Gammon was retrained using NeuroGammon's
board encoding, TD-Gammon reached the level of
strong master play.
14Deep Anchors: TD-Gammon's influence on humans
What should white play when he rolls a double 4?
15Opponent Modeling - Poker
- Poker differs from games such as Chess and
Draughts in two major respects.
- It is a game of imperfect information
- The game-theoretic optimal strategy does not work
as well as a maximising strategy in practice
- Essential elements are bluffing (betting to give
the impression that a bad hand is good) and
sandbagging (betting to give the impression that
a good hand is bad)
- To do this you need to model your opponent!
16Properties of a World Class Poker program
- Hand Strength Assessment
- Hand Potential Assessment
- Betting Strategy
- Bluffing
- Unpredictability
- Opponent Modelling
17Loki
- Plays Texas Hold'em
- Pre-flop: Each player is dealt two cards face
down, followed by the first round of betting.
- Flop: Three community cards are dealt face up and
a second round of betting occurs.
- Turn: A fourth community card is dealt face up
and the third round of betting occurs.
- River: A final fifth community card is dealt face
up and the final round of betting occurs.
- There are 1326 possible combinations from the
initial two cards.
18Hand Strength Assessment in Loki
- Loki played a million hands to calculate the
approximate income rate from each starting hand.
- After the flop, there are 47 remaining
unknown cards and 1081 possible hands an opponent
might hold. We can calculate how many of these
hands, combined with the community cards, will
lose to our hand, tie with our hand or beat
our hand.
- For example, if our hand is A♦-Q♣ and the flop is
3♥-4♣-J♥ then 444 cases would beat us, 9 would
tie and the remaining 628 cases would lose to our
hand. Counting ties as half, our hand strength is
(628 + 9/2) / 1081 = 0.585 (58.5%).
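The counts on this slide can be checked with a few lines of arithmetic. The sketch assumes a tie counts as half a win, which is the reading under which 628, 9 and 444 give the stated 0.585; `hand_strength` is a hypothetical helper name.

```python
from math import comb


def hand_strength(ahead, tied, behind):
    """Fraction of opponent hands we beat, counting each tie as half."""
    return (ahead + tied / 2) / (ahead + tied + behind)


starting_hands = comb(52, 2)          # two hole cards from a full deck
opponent_hands = comb(52 - 2 - 3, 2)  # our two cards and the flop are known
strength = hand_strength(ahead=628, tied=9, behind=444)

print(starting_hands, opponent_hands, round(strength, 3))
```

This reproduces the 1326 starting hands, the 1081 possible opponent hands after the flop, and the hand strength of 0.585.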
19Opponent Modeling in Loki
- A weight is assigned to each of the 1081 possible
combinations of hole cards an opponent may hold.
- These weights are based on the 169 distinct
income rates found by simulation.
- There are 36 possible classes of opponent actions
depending on
- their action (fold, call/check, bet/raise),
- how much the action costs (bets of 0, 1, > 1) and
- when the action occurred (pre-flop, flop, turn,
river).
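Both counts on this slide can be reproduced by enumeration. The 36 classes are simply 3 actions x 3 costs x 4 betting rounds. The 169 is assumed here to be the number of starting hands that are distinct once suits are ignored apart from "suited" versus "offsuit"; the labels and variable names are illustrative.

```python
from itertools import product

ACTIONS = ("fold", "call/check", "bet/raise")
COSTS = ("0 bets", "1 bet", "more than 1 bet")
ROUNDS = ("pre-flop", "flop", "turn", "river")

# One class per (action, cost, round) combination.
action_classes = list(product(ACTIONS, COSTS, ROUNDS))

RANKS = "23456789TJQKA"
pairs = [rank + rank for rank in RANKS]
# For two different ranks, list the higher rank first.
suited = [a + b + "s" for i, a in enumerate(RANKS) for b in RANKS[:i]]
offsuit = [a + b + "o" for i, a in enumerate(RANKS) for b in RANKS[:i]]
distinct_starting_hands = pairs + suited + offsuit

print(len(action_classes), len(distinct_starting_hands))
```

That gives 13 pairs, 78 suited and 78 offsuit combinations, 169 in total.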
20Opponent Modeling in Loki (contd.)
- Each action modifies the probabilities for each
of the possible 1081 combinations of hole cards
an opponent may have.
- We can make the opponent models interact so that
if one player's hand has a very high probability
of containing a particular ace, then we can
reduce the weights on other players' hands that
also contain that ace.
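The re-weighting described above can be sketched as a table of weights that is scaled after each observation. This is only an illustration of the idea, not Loki's actual update rules: the known cards, the scaling factors and the helper names (`initial_weights`, `reweight`, `discount_card`) are all assumptions.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def initial_weights(known_cards):
    """Give every hole-card pair the opponent could hold a weight of 1.0."""
    unseen = [card for card in DECK if card not in known_cards]
    return {pair: 1.0 for pair in combinations(unseen, 2)}


def reweight(weights, factor):
    """Scale each hand's weight by factor(hand), clamped to [0, 1]."""
    return {hand: min(1.0, max(0.0, weight * factor(hand)))
            for hand, weight in weights.items()}


def discount_card(weights, card, amount):
    """Reduce the weight of hands containing a card likely held elsewhere."""
    return reweight(weights, lambda hand: 1.0 - amount if card in hand else 1.0)


# Illustrative known cards: our two hole cards plus a three-card flop.
weights = initial_weights({"Ad", "Qc", "3h", "4c", "Jh"})
# Hypothetical rule: after a raise, halve every hand without an ace or king.
after_raise = reweight(
    weights, lambda hand: 1.0 if any(card[0] in "AK" for card in hand) else 0.5)
# Another player very probably holds the ace of spades.
adjusted = discount_card(after_raise, "As", amount=0.9)

print(len(weights), adjusted[("Ks", "As")], adjusted[("2c", "2d")])
```

Keeping one such table per opponent, and applying `discount_card` across tables, is one way the models could be made to interact.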
21Intelligent Opponents?
- A simple way to make an opponent appear
intelligent is to use a stochastic state machine.
- Stochastic means it has a random element
- A state machine is a program that is in a definite
state and only moves from state to state
depending on how it is interacted with.
22Example (loosely based on Civilization II)
- There is a collection of civilisations competing
for shared resources. A civilisation's behaviour
towards another civilisation is influenced by
- The goodwill (0...100) between the two
civilisations and
- The expected/actual gain (-100...100).
- An action will modify the goodwill
23State machine
[State machine diagram. States and the probabilities of their actions:]
- ALLIED: treaty 0.8, assist 0.8
- COOPERATIVE: treaty 0.4, assist 0.2
- NEUTRAL: treaty 0.1, attack 0.1, assist 0.05
- AGGRESSIVE: attack 0.6, ceasefire 0.3, peace 0.8
- HOSTILE: attack 0.95, ceasefire 0.05
[Transition labels on the diagram's arrows: goodwill < 85; goodwill > 85;
goodwill < 65; goodwill > 65; goodwill < 65; goodwill > 85;
30 < goodwill < 45 or gain > 60; goodwill < 50 and gain > 90;
goodwill > 45; goodwill > 30 or gain > -50; gain > 10]
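A stochastic state machine along these lines could be sketched as follows. The states and action probabilities are the ones listed above. The transitions are an assumption: the text does not say which condition belongs to which arrow, so this sketch uses goodwill thresholds only, ignores gain, and moves one state at a time. `step` and `choose_actions` are hypothetical names.

```python
import random

# Per-state action probabilities, as listed for the state machine.
ACTION_PROBABILITIES = {
    "ALLIED": {"treaty": 0.8, "assist": 0.8},
    "COOPERATIVE": {"treaty": 0.4, "assist": 0.2},
    "NEUTRAL": {"treaty": 0.1, "attack": 0.1, "assist": 0.05},
    "AGGRESSIVE": {"attack": 0.6, "ceasefire": 0.3, "peace": 0.8},
    "HOSTILE": {"attack": 0.95, "ceasefire": 0.05},
}

# ASSUMED transitions: (condition on goodwill, target state) per state.
TRANSITIONS = {
    "ALLIED": [(lambda g: g < 85, "COOPERATIVE")],
    "COOPERATIVE": [(lambda g: g > 85, "ALLIED"), (lambda g: g < 65, "NEUTRAL")],
    "NEUTRAL": [(lambda g: g > 65, "COOPERATIVE"), (lambda g: g < 45, "AGGRESSIVE")],
    "AGGRESSIVE": [(lambda g: g > 45, "NEUTRAL"), (lambda g: g < 30, "HOSTILE")],
    "HOSTILE": [(lambda g: g > 30, "AGGRESSIVE")],
}


def step(state, goodwill):
    """Follow the first transition whose condition holds, else stay put."""
    for condition, target in TRANSITIONS[state]:
        if condition(goodwill):
            return target
    return state


def choose_actions(state, rng):
    """Try each action independently with its listed probability."""
    return [action for action, probability in ACTION_PROBABILITIES[state].items()
            if rng.random() < probability]


rng = random.Random(0)
state = step("NEUTRAL", goodwill=70)  # COOPERATIVE under the assumed rules
actions = choose_actions(state, rng)
print(state, actions)
```

The actions are drawn independently because the listed probabilities for a state do not sum to 1 (ALLIED has 0.8 and 0.8), which suggests each is a separate chance rather than a single choice. That reading is also an assumption.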
24Go
- Go represents the biggest challenge yet to the
application of A.I. in games.
- None of the existing techniques has proved
successful.
- Go will require a combination of techniques.
- Pattern matching
- Search (Forward pruning and focusing)
- Planning
- Resolving threats and plans
25Go the game
- Go is played on a 19x19 grid made of horizontal
and vertical lines. Each player, black and white,
places stones on the intersection points of the
grid. Once a stone is placed it cannot be moved,
unless it is captured.
- Each stone or set of vertically and/or
horizontally connected stones has a set of
liberty points. These are the unoccupied grid
points vertically or horizontally adjacent to it.
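Finding a group and its liberties is a flood fill over connected stones. The sketch below assumes a board stored as a list of strings using 'B', 'W' and '.' for empty points; that representation, the example position and the function name are illustrative choices, not part of the slides.

```python
def group_and_liberties(board, row, col):
    """Return (stones, liberties) for the group containing (row, col).

    Stones connect vertically and horizontally only; liberties are the
    empty points adjacent to any stone of the group.
    """
    colour = board[row][col]
    if colour == ".":
        raise ValueError("no stone at that point")
    stones, liberties = set(), set()
    frontier = [(row, col)]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(board) and 0 <= nc < len(board[0]):
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == colour:
                    frontier.append((nr, nc))
    return stones, liberties


# Hypothetical 5x5 position: three black stones almost surrounded.
board = [
    ".WW..",
    "WBBW.",
    "WB.W.",
    ".W...",
    ".....",
]
stones, liberties = group_and_liberties(board, 1, 1)
print(len(stones), liberties)
```

Here the black group has three stones and a single liberty at (2, 2), so a white stone played there would leave it with none.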
26Go the game (contd.)
- To capture a group of stones all you need do is
reduce the group's number of liberty points to
zero.
- There are 2 restrictions on placing stones on the
board.
- The first is that you cannot place a stone on a
point that would result in it having no
liberties. This is called suicide.
- The second is that you cannot immediately play a
stone on a point that has just been captured. You
must play the stone elsewhere on the board on the
move immediately after the capture. Then you can
return to the capture point.
27Capture and liberties
If white plays a stone at the point a then the
three black stones will be captured and removed
from the board.
[Figure: three black stones whose last liberty is the point a]
28Eyes
This black group of stones can never be captured
since white would have to remove both the
liberties at points a and b at the same time.
But white can only play one stone at a time, and
white cannot play into a or b as this is suicide.
29The result
- At each turn a player has the option of placing a
stone on the board or passing (skipping their
move).
- The game continues with both players placing
stones on the board until both players pass
consecutively.
- The result is determined by each player adding up
the territory they control plus the number of the
opponent's stones they have captured. Each
territory point controlled and each stone
captured are worth one point.
30Go the great challenge
- Huge branching factor (180 for Go, 36 for
Chess)
- Evaluation of a position
- In Chess there is a good correlation between the
strength of position and the number and quality
of pieces.
- In Go there is a poor correlation between the
strength of the position and the number of stones
and territory surrounded.
- The best Go program is of the standard of an
average player (despite a 1 million prize).