1
Artificial Intelligence in Games

CA107 Topics in Computing
Dr. David Sinclair
School of Computer Applications
David.Sinclair_at_computing.dcu.ie

2
What is Artificial Intelligence?

There are many answers, but the simplest
definition is:
A field of research whose goal is to make
machines that do things that would require
intelligence if done by a human.
Intelligence is the ability to learn and
understand, to solve problems and make decisions.

3
Why add A.I. to games?

From the A.I. side of the house
Games are excellent testbeds as they:
Have well-defined rules generating a large search
space
Are easily represented in a computer
Are easy to test
From the Games side of the house
A.I. can make the game much more enjoyable to
play.

4
Search

The brute force approach of search has been
highly effective in games such as Draughts and
Chess.
Draughts/Checkers
Chinook (World Champion)
Chess
Best programs can hold their own with the best
humans.
Deep Blue II
move generation and evaluation in hardware
parallel search in software

5
Total Search

From the starting position
Generate every legal move for player 1.
For each legal move of player 1 generate every
legal move for player 2.
Repeat steps 1 and 2 until the game reaches a
definitive result.
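The steps above amount to an exhaustive search of the game tree. A minimal sketch follows, assuming a toy take-away game that is not from the slides: players alternately remove 1 to 3 counters from a pile, and whoever takes the last counter wins. The name `can_force_win` is hypothetical, and results are cached so repeated positions are searched only once.

```python
from functools import lru_cache

MAX_TAKE = 3  # assumed rule of the toy game: remove 1, 2 or 3 counters per turn


@lru_cache(maxsize=None)
def can_force_win(pile: int) -> bool:
    """Return True if the player to move can force a win from this pile.

    Total search: try every legal move, then every legal reply, and so on
    until the game reaches a definitive result (an empty pile).
    """
    if pile == 0:
        # The previous player took the last counter, so the player to move lost.
        return False
    legal_moves = range(1, min(MAX_TAKE, pile) + 1)
    # A position is won if some move leaves the opponent in a lost position.
    return any(not can_force_win(pile - take) for take in legal_moves)


if __name__ == "__main__":
    for pile in range(1, 9):
        print(pile, "win" if can_force_win(pile) else "loss")
```

This only works because the toy game is tiny; the next slide explains why the same approach fails for chess.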

6
Problem with Total Search

Not practical
A player in chess has, on average, 36 legal
moves.
A game could take 45 moves by each player, i.e.
90 turns, to reach a conclusion (an underestimate).
Total number of positions: 36^90 (about 10^140)
There are only 10^81 atoms in the universe.
We couldn't store all the positions in a computer
the size of the universe.
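The scale of the problem is easier to appreciate when computed. A small sketch, reading the slide's figures as 36 legal moves per turn over 90 turns (45 per player) and 10^81 atoms; the variable names are illustrative only.

```python
import math

BRANCHING_FACTOR = 36           # average legal moves per turn (from the slide)
MOVES_PER_PLAYER = 45           # moves by each player (from the slide)
PLIES = 2 * MOVES_PER_PLAYER    # 90 individual turns in total

positions = BRANCHING_FACTOR ** PLIES   # 36^90 as an exact integer
atoms = 10 ** 81                        # atoms in the universe (from the slide)

# Express 36^90 as a power of ten to compare orders of magnitude.
exponent = PLIES * math.log10(BRANCHING_FACTOR)
print(f"36^90 is about 10^{exponent:.0f}")
print(f"positions per atom: about 10^{exponent - 81:.0f}")
```

Even with one position stored per atom, the universe would fall short by dozens of orders of magnitude.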

7
Evaluation Functions

Searching from a position to a definitive result
is not practical.
Generate all possible outcomes in a fixed number
of moves from a position.
Builds a game tree
For each terminal position in the game tree
calculate the likelihood that the terminal
position will result in a win, loss or draw for
the player moving.

8
Searching the Game Tree
[Figure: game tree with node values -2, -3, -5, -2, -2, -5, 1, 4, 2, -3]
This is the Minimax Algorithm
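Minimax backs up the scores of the terminal positions: at our turns we pick the child with the highest value, at the opponent's turns the lowest. The sketch below assumes a tree written as nested lists with integer leaf scores seen from the maximising player's side; the example tree is hypothetical and is not the one in the figure.

```python
from typing import List, Union

# A tree is either a leaf score (int) or a list of child trees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximising: bool) -> int:
    """Back up leaf scores: the maximiser picks the largest child value,
    the minimiser the smallest."""
    if isinstance(node, int):
        return node  # terminal position: use its evaluation score
    child_values = [minimax(child, not maximising) for child in node]
    return max(child_values) if maximising else min(child_values)


# Hypothetical two-ply tree: two moves for us, two replies to each.
example_tree: Tree = [[3, 5], [2, 9]]
print(minimax(example_tree, maximising=True))  # prints 3
```

The first move is worth min(3, 5) = 3 and the second min(2, 9) = 2, so the maximiser chooses the first.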
9
Improving Minimax

The Minimax Algorithm has various improvements
that are used in practice.
Alpha-Beta
Principal Variation Search (PVS)
Transposition Tables
Killer Move Heuristics
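At best (with perfect move ordering) Alpha-Beta
examines about the square root of the number of
positions that plain Minimax does, so the search
can go roughly twice as deep for the same work.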
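Alpha-Beta, the first improvement listed above, returns the same value as Minimax but stops exploring a branch once it cannot affect the result. A minimal sketch on the same nested-list tree representation (an assumption of this example); the optional `visited` list is a hypothetical addition used only to show which leaves were evaluated.

```python
import math


def alphabeta(node, maximising, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot change the result.

    `visited` (optional list) records every leaf that was actually evaluated.
    """
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)
        return node
    if maximising:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimiser above will never allow this line
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximiser above already has a better option
    return best


leaves_seen = []
value = alphabeta([[3, 5], [2, 9]], True, visited=leaves_seen)
print(value, leaves_seen)  # prints 3 [3, 5, 2]
```

The leaf 9 is never evaluated: once the second move is known to be worth at most 2, it cannot beat the first move's 3.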

10
Computer Chess

Deep Blue II
256 dedicated chess processors
generate moves
evaluate positions
Search process in software (PVS)
Database of opening sequences
Databases of endgame sequences
Deep Blue II can evaluate 200 million positions
per second (36 billion in 3 minutes).
Deep Blue II can hold its own with the best
players in the world, but it is not invincible!

11
Learning

Backgammon is a very interesting testing ground
for computer game playing for two reasons:
the stochastic nature of the game, and
the experience that an accurate evaluation of a
position is far more effective than a deep
search.
Backgammon (TD-Gammon)
In the top 10 (6th)
One of the top human players says it has a better
evaluation capability than him.
Has changed the way humans play backgammon.

12
Learning how to evaluate a position

TD-Gammon evaluates positions with a neural
network that has trained itself by playing over
200,000 games against itself.
From an initial state of knowing the rules and
having zero strategic/tactical knowledge, the
network learned a number of elementary strategies
and tactics during the first few thousand
training games against itself.
After several tens of thousands of training games
more sophisticated concepts began to emerge.
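The "TD" in TD-Gammon stands for temporal-difference learning: after each move, the estimate for the previous position is nudged towards the estimate for the new one. The sketch below is not TD-Gammon's neural network. It assumes a lookup table and a toy random-walk game (five positions in a row, stepping left or right at random, winning by leaving on the right), with an assumed learning rate and number of training games.

```python
import random

N_STATES = 5          # non-terminal states 0..4 of a toy random walk (assumed task)
ALPHA = 0.01          # learning rate (assumed)
EPISODES = 20000      # number of training games (assumed)


def train(seed: int = 0) -> list:
    """Learn, by temporal-difference updates, the chance of finishing on the
    right-hand side from each state. Returns one estimate per state."""
    rng = random.Random(seed)
    values = [0.5] * N_STATES          # initial guess: no knowledge
    for _ in range(EPISODES):
        state = N_STATES // 2          # every game starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:                 # left the board on the left: loss
                target = 0.0
            elif next_state >= N_STATES:       # left the board on the right: win
                target = 1.0
            else:                              # game continues
                target = values[next_state]
            # TD(0): nudge this state's value towards the next prediction.
            values[state] += ALPHA * (target - values[state])
            if next_state < 0 or next_state >= N_STATES:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in train()])
```

The true winning chances in this toy game are 1/6, 2/6, ..., 5/6, so the learned table can be checked against them.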

13
Learning how to evaluate a position (contd.)

After 200,000 training games with a basic board
encoding the network was as strong as its
predecessor NeuroGammon.
NeuroGammon was trained on a corpus of expert
games and used a sophisticated board encoding.
When TD-Gammon was retrained using NeuroGammon's
board encoding, TD-Gammon reached the level of
strong master play.

14
Deep Anchors: TD-Gammon's influence on humans
What should white play when he rolls a double 4?
15
Opponent Modeling - Poker

Poker differs from games such as Chess and
Draughts in two major respects:
it is a game of imperfect information, and
the game-theoretic optimal strategy does not work
as well as a maximising strategy in practice.
Essential elements are bluffing (betting to give
the impression that a bad hand is good) and
sandbagging (betting to give the impression that
a good hand is bad).
To do this you need to model your opponent!

16
Properties of a World Class Poker program

Hand Strength Assessment
Hand Potential Assessment
Betting Strategy
Bluffing
Unpredictability
Opponent Modelling

17
Loki

Plays Texas Hold'em
Pre-flop: Each player is dealt two cards face
down, followed by the first round of betting.
Flop: Three community cards are dealt face up and
a second round of betting occurs.
Turn: A fourth community card is dealt face up
and the third round of betting occurs.
River: A final fifth community card is dealt face
up and the final round of betting occurs.
There are 1326 possible combinations for the
initial two cards.
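The figure of 1326 is the number of ways to choose 2 cards from 52. The sketch below enumerates them and also shows that ignoring the particular suits collapses them into 169 distinct starting-hand types (13 pairs, 78 suited and 78 unsuited combinations). The card encoding and the `hand_class` helper are assumptions of this example.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]  # 52 cards

# Every unordered pair of hole cards a player can be dealt.
starting_hands = list(combinations(DECK, 2))
print(len(starting_hands), comb(52, 2))  # prints 1326 1326


def hand_class(card_a: str, card_b: str) -> str:
    """Collapse suits: a pair, or two ranks marked suited ('s') / offsuit ('o')."""
    ranks = sorted((card_a[0], card_b[0]), key=RANKS.index, reverse=True)
    if ranks[0] == ranks[1]:
        return ranks[0] + ranks[1]
    return ranks[0] + ranks[1] + ("s" if card_a[1] == card_b[1] else "o")


classes = {hand_class(a, b) for a, b in starting_hands}
print(len(classes))  # prints 169
```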

18
Hand Strength Assessment in Loki

Loki played a million hands to calculate the
approximate income rate from each starting hand.
After the flop, there are 47 remaining unknown
cards and 1081 possible hands an opponent
might hold. We can calculate how many of these
hands, combined with the community cards, will
beat our hand, tie with our hand or lose to our
hand.
For example, if our hand is A♦-Q♣ and the flop is
3♥-4♣-J♥ then 444 cases would beat us, 9 would
tie and the remaining 628 cases would lose to our
hand. Counting a tie as half a win, our hand
strength is (628 + 9/2) / 1081 = 0.585 (58.5%).
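The arithmetic behind the example can be checked directly. This sketch assumes that a tie counts as half a win, which is the reading that reproduces the quoted 0.585; the function name is hypothetical.

```python
from math import comb


def hand_strength(ahead: int, tied: int, behind: int) -> float:
    """Fraction of opponent hands we beat, counting a tie as half a win."""
    total = ahead + tied + behind
    return (ahead + tied / 2) / total


unknown_cards = 52 - 2 - 3                  # our two cards and the flop are known
opponent_hands = comb(unknown_cards, 2)     # pairs the opponent might hold
strength = hand_strength(ahead=628, tied=9, behind=444)
print(opponent_hands, round(strength, 3))   # prints 1081 0.585
```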

19
Opponent Modeling in Loki

A weight is assigned to each of the 1081 possible
combinations of hole cards an opponent may hold.
These weights are derived from the 169 distinct
income rates obtained by simulation.
There are 36 possible classes of opponent actions
depending on:
their action (fold, call/check, bet/raise),
how much the action costs (bets of 0, 1, >1) and
when the action occurred (pre-flop, flop, turn,
river).
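The 36 classes are simply the 3 x 3 x 4 combinations of the three factors listed above. A short sketch that enumerates them; the label strings and the `observed` tally are assumptions of this example, not Loki's actual data structures.

```python
from itertools import product

ACTIONS = ("fold", "call/check", "bet/raise")
COSTS = ("0 bets", "1 bet", ">1 bet")
ROUNDS = ("pre-flop", "flop", "turn", "river")

# One class per (action, cost, betting round) combination.
action_classes = list(product(ACTIONS, COSTS, ROUNDS))
print(len(action_classes))  # prints 36

# Hypothetical per-opponent tally of how often each class was observed.
observed = {action_class: 0 for action_class in action_classes}
observed[("bet/raise", "1 bet", "flop")] += 1
```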

20
Opponent Modeling in Loki (contd.)

Each action modifies the probabilities for each
of the possible 1081 combinations of hole cards
an opponent may have.
We can make the opponent models interact so that
if one player's hand has a very high probability
of containing a particular card (say, a specific
ace), then we can reduce the weights on other
players' hands that also contain that card.
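One way to picture the interaction between opponent models is as two weight tables over the same 1081 hands. The sketch below is an illustration under stated assumptions, not Loki's actual update rule: the known cards, the factor of 10 applied after a raise, and the helper names are all hypothetical.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]


def initial_weights(known_cards):
    """Weight 1.0 for every hole-card pair not ruled out by visible cards."""
    unseen = [c for c in DECK if c not in known_cards]
    return {frozenset(pair): 1.0 for pair in combinations(unseen, 2)}


def probability_of_holding(weights, card):
    """Share of the total weight on hands that contain `card`."""
    total = sum(weights.values())
    return sum(w for hand, w in weights.items() if card in hand) / total


def discount_card(weights, card, probability_elsewhere):
    """Scale down hands containing `card` when another player probably has it."""
    for hand in weights:
        if card in hand:
            weights[hand] *= 1.0 - probability_elsewhere


# Illustrative known cards (our hand plus the flop): 47 unseen, 1081 pairs.
known = {"Ad", "Qc", "3h", "4c", "Jh"}
player_a = initial_weights(known)
player_b = initial_weights(known)

# Hypothetical re-weighting after player A raises: favour hands
# that hold the ace of spades.
for hand in player_a:
    if "As" in hand:
        player_a[hand] *= 10.0

# The more likely A is to hold that ace, the less likely B is to hold it.
discount_card(player_b, "As", probability_of_holding(player_a, "As"))
```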

21
Intelligent Opponents?

A simple way to make an opponent appear
intelligent is to use a stochastic state machine.
Stochastic: has a random element.
A state machine is a program that is in a definite
state and only moves from state to state
depending on how it is interacted with.

22
Example (loosely based on Civilisation II)

There is a collection of civilisations competing
for shared resources. A civilisation's behaviour
towards another civilisation is influenced by:
The goodwill (0...100) between the two
civilisations, and
The expected/actual gain (-100...100).
An action will modify the goodwill.

23
State machine
[Figure: state machine diagram; state labels and transition conditions follow]
goodwill < 85
ALLIED: treaty 0.8, assist 0.8
COOPERATIVE: treaty 0.4, assist 0.2
goodwill > 85
goodwill < 65
goodwill > 65
goodwill < 65
NEUTRAL: treaty 0.1, attack 0.1, assist 0.05
goodwill > 85
30 < goodwill < 45 or gain > 60
goodwill < 50 and gain > 90
goodwill > 45
AGGRESSIVE: attack 0.6, ceasefire 0.3, peace 0.8
HOSTILE: attack 0.95, ceasefire 0.05
goodwill > 30 or gain > -50
gain > 10
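A stochastic state machine of this kind can be sketched in a few lines. The action probabilities below are the ones listed on the slide. Which states each threshold connects cannot be recovered from the transcript, so the transition wiring is a guess made for illustration, the gain conditions are left out, and each action is assumed to fire independently with its listed probability.

```python
import random

# Action probabilities per state, as listed on the slide.
ACTIONS = {
    "ALLIED":      {"treaty": 0.8, "assist": 0.8},
    "COOPERATIVE": {"treaty": 0.4, "assist": 0.2},
    "NEUTRAL":     {"treaty": 0.1, "attack": 0.1, "assist": 0.05},
    "AGGRESSIVE":  {"attack": 0.6, "ceasefire": 0.3, "peace": 0.8},
    "HOSTILE":     {"attack": 0.95, "ceasefire": 0.05},
}


def next_state(state: str, goodwill: int) -> str:
    """ASSUMED wiring: a ladder of states driven by goodwill thresholds only."""
    if state == "ALLIED" and goodwill < 85:
        return "COOPERATIVE"
    if state == "COOPERATIVE":
        if goodwill > 85:
            return "ALLIED"
        if goodwill < 65:
            return "NEUTRAL"
    if state == "NEUTRAL":
        if goodwill > 65:
            return "COOPERATIVE"
        if goodwill < 45:
            return "AGGRESSIVE"
    if state == "AGGRESSIVE":
        if goodwill > 45:
            return "NEUTRAL"
        if goodwill < 30:
            return "HOSTILE"
    if state == "HOSTILE" and goodwill > 30:
        return "AGGRESSIVE"
    return state


def choose_actions(state: str, rng: random.Random) -> list:
    """The random element: each action fires independently with its probability."""
    return [action for action, p in ACTIONS[state].items() if rng.random() < p]


if __name__ == "__main__":
    rng = random.Random(0)
    state = next_state("NEUTRAL", goodwill=40)
    print(state, choose_actions(state, rng))
```

Because the actions are drawn at random, two civilisations in the same state will not behave identically, which is what makes the opponent appear less mechanical.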
24
Go

Go represents the biggest challenge yet to the
application of A.I. in games.
None of the existing techniques has proved
successful.
Go will require a combination of techniques.
Pattern matching
Search (forward pruning and focusing)
Planning
Resolving threats and plans

25
Go - the game

Go is played on a 19x19 grid made of horizontal
and vertical lines. The two players, black and
white, place stones on the intersection points of
the grid. Once a stone is placed it cannot be
moved, unless it is captured.
Each stone or set of vertically and/or
horizontally connected stones has a set of
liberty points. These are the unoccupied grid
points vertically or horizontally adjacent to it.
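Finding a group and its liberties is a flood fill over vertical and horizontal neighbours. A minimal sketch, assuming a board encoded as a list of strings with 'B', 'W' and '.' for empty points; the encoding, the small 5x5 board and the function name are choices made for this example.

```python
def group_and_liberties(board, row, col):
    """Return (stones, liberties) for the group containing board[row][col].

    Both results are sets of (row, col) points.
    """
    colour = board[row][col]
    if colour == ".":
        raise ValueError("no stone at this point")
    size = len(board)
    stones, liberties = set(), set()
    frontier = [(row, col)]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        # Only vertical and horizontal neighbours count, never diagonals.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == colour:
                    frontier.append((nr, nc))
    return stones, liberties


board = [
    ".....",
    ".BB..",
    ".BW..",
    ".....",
    ".....",
]
stones, liberties = group_and_liberties(board, 1, 1)
print(len(stones), len(liberties))  # prints 3 6
```

The three connected black stones share six liberties, while the lone white stone has only two.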

26
Go - the game (contd.)

To capture a group of stones all you need do is
reduce the group's number of liberty points to
zero.
There are 2 restrictions on placing stones on the
board.
The first is that you cannot place a stone on a
point that would result in it having no
liberties, unless doing so captures opposing
stones. This is called suicide.
The second is that you cannot immediately play a
stone on a point that has just been captured. You
must play the stone elsewhere on the board on the
move immediately after the capture. Then you can
return to the capture point.
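Capture and the suicide restriction can be sketched together. This example assumes a board stored as a list of lists holding 'B', 'W' or '.', resolves captures before testing for suicide (as in the usual rules), and does not check the second restriction about replaying on a just-captured point. All names are hypothetical.

```python
def neighbours(size, r, c):
    """Vertically and horizontally adjacent points that lie on the board."""
    return [(nr, nc) for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= nr < size and 0 <= nc < size]


def group(board, start):
    """Stones connected to `start` and the number of liberties they share."""
    colour = board[start[0]][start[1]]
    stones, liberties, frontier = set(), set(), [start]
    while frontier:
        point = frontier.pop()
        if point in stones:
            continue
        stones.add(point)
        for nr, nc in neighbours(len(board), *point):
            if board[nr][nc] == ".":
                liberties.add((nr, nc))
            elif board[nr][nc] == colour:
                frontier.append((nr, nc))
    return stones, len(liberties)


def play(board, colour, r, c):
    """Return (new_board, captured_count) or raise ValueError if illegal."""
    if board[r][c] != ".":
        raise ValueError("point is occupied")
    new_board = [row[:] for row in board]
    new_board[r][c] = colour
    opponent = "W" if colour == "B" else "B"
    captured = 0
    for nr, nc in neighbours(len(board), r, c):
        if new_board[nr][nc] == opponent:
            stones, liberty_count = group(new_board, (nr, nc))
            if liberty_count == 0:
                # The group has no liberties left: remove it from the board.
                for sr, sc in stones:
                    new_board[sr][sc] = "."
                captured += len(stones)
    # After removing captures, the new stone's own group must have a liberty.
    if group(new_board, (r, c))[1] == 0:
        raise ValueError("suicide")
    return new_board, captured


corner = [list(row) for row in ["BW.", "...", "..."]]
after, taken = play(corner, "W", 1, 0)
print(taken, after[0][0])  # prints 1 .
```

`play` returns a new board and leaves the original untouched, which keeps it easy to try a move and discard it during a search.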

27
Capture and liberties
If white plays a stone at the point a then the
three black stones will be captured and removed
from the board.
[Figure: board position with the point a marked]
28
Eyes
This black group of stones can never be captured
since white would have to remove both the
liberties at points a and b at the same time.
But white can only play one stone at a time, and
white cannot play into a or b as this is suicide.
29
The result

At each turn a player has the option of placing a
stone on the board or passing (skipping their
move).
The game continues with both players placing
stones on the board until both players pass
consecutively.
The result is determined by each player adding up
the territory they control plus the number of the
opponent's stones they have captured. Each
territory point controlled and each stone
captured is worth one point.

30
Go - the great challenge

Huge branching factor (180 for Go, 36 for
Chess)
Evaluation of a position
In Chess there is a good correlation between the
strength of a position and the number and quality
of pieces.
In Go there is a poor correlation between the
strength of the position and the number of stones
and territory surrounded.
The best Go program plays at the standard of an
average player (despite a $1 million prize).