Artificial Intelligence in Games - PowerPoint PPT Presentation

Title: Artificial Intelligence in Games


1
Artificial Intelligence in Games
  • CA107 Topics in Computing
  • Dr. David Sinclair
  • School of Computer Applications
  • David.Sinclair_at_computing.dcu.ie

2
What is Artificial Intelligence?
  • There are many answers, but the simplest
    definition is:
  • A field of research whose goal is to make
    machines that do things that would require
    intelligence if done by a human.
  • Intelligence is the ability to learn and
    understand, to solve problems and make decisions.

3
Why add A.I. to games?
  • From the A.I. side of the house:
  • Games are excellent testbeds as they
  • have well-defined rules generating a large search
    space
  • are easily represented in a computer
  • are easy to test
  • From the Games side of the house:
  • A.I. can make the game much more enjoyable to
    play.

4
Search
  • The brute force approach of search has been
    highly effective in games such as Draughts and
    Chess.
  • Draughts/Checkers
  • Chinook (World Champion)
  • Chess
  • Best programs can hold their own with the best
    humans.
  • Deep Blue II
  • move generation and evaluation in hardware
  • parallel search in software

5
Total Search
  • From the starting position
  • Generate every legal move for player 1.
  • For each legal move of player 1 generate every
    legal move for player 2.
  • Repeat steps 1 and 2 until the game reaches a
    definitive result.

6
Problem with Total Search
  • Not practical
  • A player in chess has, on average, 36 legal
    moves.
  • A game could take 45 moves by each player to
    reach a conclusion (an underestimate).
  • Total number of positions: 36^90
  • There are only about 10^81 atoms in the universe.
  • We couldn't store all the positions in a computer
    the size of the universe.
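
The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch: it assumes that "45 moves" means 45 moves by each player (90 plies), which is what the exponent in 36^90 implies, and the constant names are illustrative only.

```python
import math

BRANCHING = 36        # average legal moves per turn (figure from the slide)
PLIES = 90            # assumption: 45 moves by each of the two players
ATOMS_EXPONENT = 81   # about 10^81 atoms in the universe (figure from the slide)

positions = BRANCHING ** PLIES                 # exact integer, Python ints are unbounded
positions_exponent = math.log10(positions)     # size of the tree as a power of ten

print(f"36^90 is about 10^{positions_exponent:.0f}")
print(f"positions per atom: about 10^{positions_exponent - ATOMS_EXPONENT:.0f}")
```

Even with one position stored per atom, the tree would be too large by many orders of magnitude.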

7
Evaluation Functions
  • Searching from a position to a definitive result
    is not practical.
  • Generate all possible outcomes in a fixed number
    of moves from a position.
  • Builds a game tree
  • For each terminal position in the game tree
    calculate the likelihood that the terminal
    position will result in a win, loss or draw for
    the player moving.

8
Searching the Game Tree
[Figure: a game tree whose nodes carry the values -2, -3, -5, -2, -2, -5, 1, 4, 2 and -3.]
This is the Minimax Algorithm
9
Improving Minimax
  • The Minimax Algorithm has various improvements
    that are used in practice.
  • Alpha-Beta
  • Principal Variation Search (PVS)
  • Transposition Tables
  • Killer Move Heuristics
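  • At best (with ideal move ordering) Alpha-Beta
    halves the exponent of the search work: about
    b^(d/2) positions instead of b^d for branching
    factor b and depth d.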
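
As an illustration of the first improvement, here is a minimal alpha-beta sketch on a small hypothetical tree of nested lists (integers are evaluated terminal positions). The `stats` dictionary is an assumption added only to count how many leaves get evaluated.

```python
import math


def alphabeta(node, maximizing, alpha, beta, stats):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if isinstance(node, int):
        stats["leaves"] += 1          # count every evaluated terminal position
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, stats))
            alpha = max(alpha, value)
            if alpha >= beta:         # the minimiser already has a better option
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, stats))
        beta = min(beta, value)
        if alpha >= beta:             # the maximiser already has a better option
            break
    return value


pruning_tree = [[3, 5], [2, 9], [0, -1]]   # hypothetical example tree
stats = {"leaves": 0}
pruned_value = alphabeta(pruning_tree, True, -math.inf, math.inf, stats)
print(pruned_value, stats["leaves"])
```

On this tree the result matches plain minimax while two of the six leaves are never evaluated. The saving depends heavily on the order in which moves are tried.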

10
Computer Chess
  • Deep Blue II
  • 256 dedicated chess processors
  • generate moves
  • evaluate positions
  • Search process in software (PVS)
  • Database of opening sequences
  • Databases of endgame sequences
  • Deep Blue II can evaluate 200 million positions
    per second (about 36 billion in 3 minutes).
  • Deep Blue II can hold its own with the best
    players in the world, but it is not invincible!

11
Learning
  • Backgammon is a very interesting testing ground
    for computer game playing for two reasons
  • the stochastic nature of the game and
  • the experience that an accurate evaluation of a
    position is far more effective than a deep
    search.
  • Backgammon (TD-Gammon)
  • In the top 10 (6th)
  • One of the top human players says it evaluates
    positions better than he does.
  • Has changed the way humans play backgammon.

12
Learning how to evaluate a position
  • TD-Gammon evaluates positions with a neural
    network that has trained itself by playing over
    200,000 games against itself.
  • It started from an initial state of knowing the
    rules but having zero strategic/tactical knowledge.
  • The network learned a number of elementary
    strategies and tactics during the first few
    thousand training games against itself.
  • After several tens of thousands of training games
    more sophisticated concepts began to emerge.

13
Learning how to evaluate a position (contd.)
  • After 200,000 training games with a basic board
    encoding the network was as strong as its
    predecessor NeuroGammon.
  • NeuroGammon was trained on a corpus of expert
    games and used a sophisticated board encoding.
  • When TD-Gammon was retrained using NeuroGammon's
    board encoding, TD-Gammon reached the level of
    strong master play.
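
The idea of learning an evaluation from self-play alone can be sketched with temporal-difference learning, the technique that gives TD-Gammon its name. The sketch below is a deliberate simplification and not TD-Gammon's method: it replaces the neural network with a lookup table and backgammon with a toy random walk (step left or right at random; leaving on the right counts as a win, leaving on the left as a loss). The function name and all parameters are illustrative assumptions.

```python
import random


def td0_random_walk(episodes=10000, n_states=5, step_size=0.02, seed=0):
    """Learn win-probability estimates for a toy random walk with tabular TD(0)."""
    rng = random.Random(seed)
    values = [0.5] * n_states              # start with no knowledge: every state 50/50
    for _ in range(episodes):
        state = n_states // 2              # every game starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                target = 0.0               # walked off the left edge: loss
            elif next_state >= n_states:
                target = 1.0               # walked off the right edge: win
            else:
                target = values[next_state]    # bootstrap from the next estimate
            # Move this state's estimate a little towards the target.
            values[state] += step_size * (target - values[state])
            if next_state < 0 or next_state >= n_states:
                break
            state = next_state
    return values


learned = td0_random_walk()
print([round(v, 2) for v in learned])
```

Each estimate is nudged towards the estimate of the position that followed it, so knowledge of the final result gradually propagates back to earlier positions. States nearer the winning edge end up with higher values.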

14
Deep Anchors: TD-Gammon's influence on humans
What should white play when he rolls a double 4?
15
Opponent Modeling - Poker
  • Poker differs from games such as Chess and
    Draughts in two major respects.
  • it is a game of imperfect information
  • the game-theoretic optimal strategy does not work
    as well as a maximising strategy in practice
  • Essential elements are bluffing (betting to give
    the impression that a bad hand is good) and
    sandbagging (betting to give the impression that
    a good hand is bad)
  • To do this you need to model your opponent!

16
Properties of a World Class Poker program
  • Hand Strength Assessment
  • Hand Potential Assessment
  • Betting Strategy
  • Bluffing
  • Unpredictability
  • Opponent Modelling

17
Loki
  • Plays Texas Hold'em
  • Pre-flop: Each player is dealt two cards face
    down, followed by the first round of betting.
  • Flop: Three community cards are dealt face up and
    a second round of betting occurs.
  • Turn: A fourth community card is dealt face up
    and the third round of betting occurs.
  • River: A final fifth community card is dealt face
    up and the final round of betting occurs.
  • There are 1326 possible combinations from the
    initial two cards.

18
Hand Strength Assessment in Loki
  • Loki played a million hands to calculate the
    approximate income rate from each starting hand.
  • After the flop, there are 47 remaining
    unknown cards and 1081 possible hands an opponent
    might hold. We can calculate how many of these
    hands, combined with the community cards, will
    beat our hand, tie with our hand or lose to
    our hand.
  • For example, if our hand is A♦-Q♣ and the flop is
    3♥-4♣-J♥ then 444 cases would beat us, 9 would
    tie and the remaining 628 cases would lose to our
    hand. Counting a tie as half a win, our hand
    strength is 0.585 (58.5%).
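
The numbers on this slide can be reproduced directly. This sketch assumes that a tie counts as half a win, which is the weighting that yields the quoted 0.585; the function name is illustrative and is not taken from Loki.

```python
from math import comb


def hand_strength(ahead_of_us, tied, behind_us):
    """Fraction of opponent holdings we beat, counting a tie as half a win."""
    total = ahead_of_us + tied + behind_us
    return (behind_us + tied / 2) / total


# 52 cards minus our 2 hole cards and the 3 flop cards leaves 47 unknown cards.
unknown_cards = 52 - 2 - 3
opponent_hands = comb(unknown_cards, 2)     # two-card holdings the opponent may have

# Case counts quoted on the slide.
strength = hand_strength(ahead_of_us=444, tied=9, behind_us=628)
print(opponent_hands, round(strength, 3))
```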

19
Opponent Modeling in Loki
  • A weight is assigned to each of the 1081 possible
    combinations of hole cards an opponent may
    hold.
  • These weights are determined by the 169 distinct
    income rates determined by simulation.
  • There are 36 possible classes of opponent actions,
    depending on
  • their action (fold, call/check, bet/raise),
  • how much the action costs (bets of 0, 1, > 1) and
  • when the action occurred (pre-flop, flop, turn,
    river).

20
Opponent Modeling in Loki (contd.)
  • Each action modifies the probabilities for each
    of the possible 1081 combinations of hole cards
    an opponent may have.
  • We can make the opponent models interact so that
    if one player's hand has a very high probability
    of containing a particular ace, then we can reduce
    the weights on other players' hands that also
    contain that ace.
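
The weight table described on these two slides can be sketched as a dictionary keyed by hole-card pairs. Everything specific below is an assumption made for illustration: the uniform starting weights (the slides say Loki derives them from simulated income rates), the card notation, the example cards, the re-weighting factors and the function names. None of it is taken from Loki itself.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def initial_weights(known_cards):
    """One weight per two-card holding drawn from the cards we cannot see."""
    unknown = [card for card in DECK if card not in known_cards]
    return {hand: 1.0 for hand in combinations(unknown, 2)}


def apply_action(weights, factor_for_hand):
    """Re-weight every holding after an observed action (hypothetical rule)."""
    return {hand: w * factor_for_hand(hand) for hand, w in weights.items()}


def discount_card(weights, card, factor):
    """Reduce holdings containing a card another player probably has."""
    return {hand: (w * factor if card in hand else w) for hand, w in weights.items()}


def raise_factor(hand):
    # Made-up rule: after a raise, holdings without an ace become half as likely.
    return 1.0 if any(card[0] == "A" for card in hand) else 0.5


# Our two hole cards plus the flop (suits chosen only for illustration).
known = ["Ad", "Qc", "3h", "4c", "Jh"]
weights = initial_weights(known)
weights = apply_action(weights, raise_factor)
weights = discount_card(weights, "As", 0.1)   # someone else likely holds the ace of spades
print(len(weights))
```

Normalising the weights so they sum to one would turn them into a probability distribution over the opponent's possible holdings.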

21
Intelligent Opponents?
  • A simple way to make an opponent appear
    intelligent is to use a stochastic state machine.
  • Stochastic: has a random element
  • A state machine is a program that is always in
    one definite state and only moves from state to
    state depending on how it is interacted with.

22
Example (loosely based on Civilisation II)
  • There is a collection of civilisations competing
    for shared resources. A civilisation's behaviour
    towards another civilisation is influenced by
  • The goodwill (0...100) between the two
    civilisations and
  • The expected/actual gain (-100...100).
  • An action will modify the goodwill.

23
State machine
[Figure: state machine diagram. States with their action probabilities:]
  • ALLIED: treaty 0.8, assist 0.8
  • COOPERATIVE: treaty 0.4, assist 0.2
  • NEUTRAL: treaty 0.1, attack 0.1, assist 0.05
  • AGGRESSIVE: attack 0.6, ceasefire 0.3, peace 0.8
  • HOSTILE: attack 0.95, ceasefire 0.05
[Transition labels on the arrows of the diagram, in transcript order:]
  • goodwill < 85
  • goodwill > 85
  • goodwill < 65
  • goodwill > 65
  • goodwill < 65
  • goodwill > 85
  • 30 < goodwill < 45 or gain > 60
  • goodwill < 50 and gain > 90
  • goodwill > 45
  • goodwill > 30 or gain > -50
  • gain > 10
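
A stochastic state machine of this kind can be sketched as two tables: one giving each state's action probabilities and one giving its transitions. The action probabilities below are the ones on the slide, read here as independent chances per turn (an assumption, since several rows sum to more than 1). The transition table is an assumed simplification: the arrows of the original diagram are not available, so it uses goodwill thresholds only and ignores gain.

```python
import random

# Action probabilities per state, as listed on the slide.
ACTIONS = {
    "ALLIED": {"treaty": 0.8, "assist": 0.8},
    "COOPERATIVE": {"treaty": 0.4, "assist": 0.2},
    "NEUTRAL": {"treaty": 0.1, "attack": 0.1, "assist": 0.05},
    "AGGRESSIVE": {"attack": 0.6, "ceasefire": 0.3, "peace": 0.8},
    "HOSTILE": {"attack": 0.95, "ceasefire": 0.05},
}

# ASSUMED transitions: a goodwill-only ladder, not the original diagram.
TRANSITIONS = {
    "ALLIED": [(lambda g: g < 85, "COOPERATIVE")],
    "COOPERATIVE": [(lambda g: g > 85, "ALLIED"), (lambda g: g < 65, "NEUTRAL")],
    "NEUTRAL": [(lambda g: g > 65, "COOPERATIVE"), (lambda g: g < 45, "AGGRESSIVE")],
    "AGGRESSIVE": [(lambda g: g > 45, "NEUTRAL"), (lambda g: g < 30, "HOSTILE")],
    "HOSTILE": [(lambda g: g > 30, "AGGRESSIVE")],
}


def next_state(state, goodwill):
    """Follow the first transition whose condition holds, else stay put."""
    for condition, target in TRANSITIONS[state]:
        if condition(goodwill):
            return target
    return state


def choose_actions(state, rng):
    """The random element: each action fires independently with its probability."""
    return [action for action, p in ACTIONS[state].items() if rng.random() < p]


rng = random.Random(1)
state = next_state("ALLIED", goodwill=80)      # goodwill has dropped below 85
print(state, choose_actions(state, rng))
```

Because the actions are drawn at random, two civilisations in the same state need not behave identically, which is what makes the opponent appear less mechanical.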
24
Go
  • Go represents the biggest challenge yet to the
    application of A.I. in games.
  • None of the existing techniques has proved
    successful.
  • Go will require a combination of techniques.
  • Pattern matching
  • Search (Forward pruning and focusing)
  • Planning
  • Resolving threats and plans

25
Go: the game
  • Go is played on a 19x19 grid made of horizontal
    and vertical lines. The two players, black and
    white, take turns placing stones on the
    intersection points of the grid. Once a stone is
    placed it cannot be moved, unless it is captured.
  • Each stone, or set of vertically and/or
    horizontally connected stones, has a set of
    liberty points. These are the unoccupied grid
    points vertically or horizontally adjacent to it.

26
Go: the game (contd.)
  • To capture a group of stones all you need do is
    reduce the group's number of liberty points to
    zero.
  • There are 2 restrictions on placing stones on the
    board.
  • The first is that you cannot place a stone on a
    point that would result in it having no
    liberties, unless the move captures opposing
    stones. This is called suicide.
  • The second is that you cannot immediately play a
    stone on a point that has just been captured. You
    must play the stone elsewhere on the board on the
    move immediately after the capture. Then you can
    return to the capture point.
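
Liberties and capture can be sketched with a flood fill over connected stones. The board encoding ("B", "W" and "." for an empty point), the small 5x5 board and the function name are assumptions made for illustration only.

```python
def group_and_liberties(board, row, col):
    """Return the connected group containing (row, col) and its liberty points."""
    colour = board[row][col]
    group, liberties = set(), set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        # Only vertical and horizontal neighbours count, never diagonals.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(board) and 0 <= nc < len(board[0]):
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == colour:
                    stack.append((nr, nc))
    return group, liberties


# Hypothetical position: three connected black stones with one liberty left.
board = [list(line) for line in (
    ".WW..",
    "WBBW.",
    ".WBW.",
    ".....",
    ".....",
)]

group, liberties = group_and_liberties(board, 1, 1)
print(len(group), liberties)

board[3][2] = "W"                      # white fills the last liberty
_, liberties_after = group_and_liberties(board, 1, 1)
captured = len(liberties_after) == 0   # zero liberties means the group is captured
```

A full implementation would also remove the captured stones and enforce the two placement restrictions, which this sketch leaves out.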

27
Capture and liberties
If white plays a stone at the point marked a then
the three black stones will be captured and removed
from the board.
[Figure: board position with the point a marked.]
28
Eyes
This black group of stones can never be captured,
since white would have to fill both liberties, at
points a and b, at the same time.
But white can only play one stone at a time, and
white cannot play at a or b alone as this is suicide.
29
The result
  • At each turn a player has the option of placing a
    stone on the board or passing (skipping their
    move).
  • The game continues with both players placing
    stones on the board until both players pass
    consecutively.
  • The result is determined by each player adding up
    the territory they control plus the number of the
    opponent's stones they have captured. Each
    territory point controlled and each stone
    captured is worth one point.

30
Go: the great challenge
  • Huge branching factor (180 for Go, 36 for
    Chess)
  • Evaluation of a position is hard:
  • In Chess there is a good correlation between the
    strength of position and the number and quality
    of pieces.
  • In Go there is a poor correlation between the
    strength of the position and the number of stones
    and territory surrounded.
  • The best Go program plays at the standard of an
    average player (despite a 1 million prize on offer).