Title: Evolving & Adapting Opponents in PacMan

1. Evolving & Adapting Opponents in PacMan
- PRESENTED BY
- DEBARGHYA MAJUMDAR
- SOMDAS BANDYOPADHYAY
- KAUSTAV DEY BISWAS
2. Topics to be discussed
- PacMan
- The Game
- Areas of Intelligence
- Classical Ghost Algorithm
- Basic Neuro-Evolution
- The Model
- The ANN
- Offline Learning
- Online Learning
- Neuro-Evolution of Augmenting Topologies
- The Model
- The ANN
- Training Algorithm
- Experiments
- Discussions
- Conclusions
- References
3. PacMan: The Game
- Classical PacMan released by Namco (Japan) in 1980
- Single-player predator/prey game
- Player plays as the PacMan
- Navigates through a 2D maze
- Eats pellets
- Avoids Ghosts
- Ghosts played by the computer
- Usually 4 in number
- Chase the PacMan and try to catch it
- Winning / Losing
- PacMan wins if it has eaten all pellets or survived for 180 sec
- PacMan loses if caught by a Ghost
- Special Power-pills
- Allow PacMan to eat the Ghosts for a short period of time
- Eaten Ghosts respawn
4. PacMan: Areas of Intelligence
- Programming the PacMan to replace the human player
- Heavily researched field
- Gives insight into AI, but doesn't add value to the game engine
- Making the Ghosts more intelligent
- Strives to
- Make the Ghosts more efficient killers
- Incorporate teamwork
- Minimize the score of the PacMan
- Make the Ghosts learn and adapt
- Make the game more 'interesting'
- Adds value to the game
- Scarcely researched field
- Currently our topic of interest
5. PacMan: The Classical Ghost Algorithm
- Two modes of play
- Attack: chase down PacMan
- Scatter: break away when PacMan has a power-pill
- Decide target cell on reaching an intersection
- Attack targets
- Red Ghost: current position of PacMan
- Pink Ghost: cell 4 positions ahead of PacMan
- Blue Ghost: cell 2 positions ahead of PacMan
- Orange Ghost: targets PacMan when far away, retires to its corner when nearby
- Scatter target
- Pseudo-random behaviour
- Minimize distance from target cell
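The per-ghost targeting rules above can be sketched in code. This is an illustrative approximation, not the exact Namco implementation: the `ahead` helper, the orange ghost's distance threshold, its home-corner cell, and the Manhattan distance metric are all assumptions.

```python
def ahead(pacman_pos, pacman_dir, n):
    """Cell n positions ahead of PacMan along its current direction (assumed helper)."""
    x, y = pacman_pos
    dx, dy = pacman_dir
    return (x + n * dx, y + n * dy)

def attack_target(ghost, ghost_pos, pacman_pos, pacman_dir):
    """Target cell for each ghost in attack mode, per the rules above."""
    if ghost == "red":                       # current position of PacMan
        return pacman_pos
    if ghost == "pink":                      # cell 4 positions ahead
        return ahead(pacman_pos, pacman_dir, 4)
    if ghost == "blue":                      # cell 2 positions ahead
        return ahead(pacman_pos, pacman_dir, 2)
    # orange: chase when far away, retire to its corner when nearby
    gx, gy = ghost_pos
    px, py = pacman_pos
    if abs(gx - px) + abs(gy - py) > 8:      # assumed distance threshold
        return pacman_pos
    return (0, 0)                            # assumed home corner

def choose_move(ghost_pos, target, options):
    """At an intersection, pick the option minimizing distance to the target cell."""
    tx, ty = target
    return min(options, key=lambda c: abs(c[0] - tx) + abs(c[1] - ty))
```

In scatter mode the target would instead be drawn pseudo-randomly, which is why the classical ghosts remain somewhat unpredictable without any learning.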
6. PacMan: Making the Ghosts Intelligent
- We will discuss two approaches for programming intelligence, learnability and adaptability into the Ghosts
- Local optimization using basic Neuro-Evolution
- Global optimization using Neuro-Evolution of Augmenting Topologies (NEAT)
7. Basic Neuro-Evolution: The Model
- Game field modeled as a grid. Each square can hold any of
- Wall block
- Pellet
- PacMan
- Ghost(s)
- Power-pills not incorporated, to keep things simple
- Ghosts have only local info about the grid
- Game proceeds in cycles of 2 steps
- Gather information about the environment
- Make a move by processing the information
- Each Ghost controlled independently by a dedicated ANN
- ANNs trained by an Evolutionary Algorithm (GA)
- Offline (train, then deploy)
- Online (learn as you go)
- Weight-adjusting is the only means of training the ANNs
8. Basic Neuro-Evolution: The ANN
- A 4-5-4 feed-forward neural controller, comprising sigmoid neurons, is employed to manage the Ghosts' motion.
- Inputs
- Using their sensors, Ghosts inspect the environment from their own point of view and decide their next action.
- Each Ghost receives input information from its environment, expressed in the neural network's input array of dimension 4
- Outputs
- Four scores, one for each of the four directions
- The direction with the maximum score is selected
9. Basic Neuro-Evolution: The ANN Inputs
- The input array consists of
- Δx_P = x_g − x_p
- Δy_P = y_g − y_p
- Δx_C = x_g − x_c
- Δy_C = y_g − y_c
Picture courtesy Ref. 1
where (x_g, y_g), (x_p, y_p) and (x_c, y_c) are the Cartesian co-ordinates of the current positions of the current Ghost, PacMan and the closest Ghost respectively.
10. Basic Neuro-Evolution: The Network
[Network diagram: inputs Δx_P, Δy_P, Δx_C, Δy_C feed a hidden layer of 5 sigmoid neurons; outputs are the scores for UP, DOWN, LEFT and RIGHT]
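The controller above can be sketched as a plain 4-5-4 forward pass. The weight layout and the bias-free neurons are assumptions: the slides specify only the layer sizes and sigmoid activations.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_out):
    """inputs: [dx_P, dy_P, dx_C, dy_C]; returns the 4 direction scores."""
    hidden = [sigmoid(sum(w * i for w, i in zip(row, inputs)))
              for row in w_hidden]                 # 5 hidden sigmoid neurons
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
            for row in w_out]                      # 4 output scores

DIRECTIONS = ["UP", "DOWN", "LEFT", "RIGHT"]

def choose_direction(ghost, pacman, closest, w_hidden, w_out):
    """Build the input array from positions and pick the max-score move."""
    gx, gy = ghost
    px, py = pacman
    cx, cy = closest
    inputs = [gx - px, gy - py, gx - cx, gy - cy]  # the Δ input array above
    scores = forward(inputs, w_hidden, w_out)
    return DIRECTIONS[scores.index(max(scores))]
```

Note the controller never sees the maze layout directly: only relative offsets to PacMan and the closest fellow Ghost, which is what makes it a local-information model.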
11. Basic Neuro-Evolution: Offline Learning
- An offline evolutionary learning approach is used to produce some good (i.e. in terms of performance) initial behaviors for the online learning mechanism.
- The neural networks that determine the behavior of the Ghosts are themselves evolved.
- The evolving process is limited to the connection weights of the neural network.
- The evolutionary procedure is based on a genetic algorithm.
- Each Ghost has a genome that encodes the connection weights of its neural network.
- A population of neural networks (Ghosts) is initialized with uniformly distributed random connection weights that lie within [-5, 5].
12. Basic Neuro-Evolution: Offline Learning Algorithm
- At each generation
- Step 1
- Every Ghost in the population is cloned 4 times.
- These 4 clones are placed in the PacMan game field and play N games, each one for an evaluation period of t simulation steps.
- The outcome of these games is to ascertain the time t_k taken to kill PacMan in each game.
13. Basic Neuro-Evolution: Offline Learning Algorithm (contd.)
- Step 2
- Each Ghost is evaluated for each game and its fitness value is the average E{f} over the N games, where f is a function of the killing time t_k.
- By the use of the fitness function f, we promote PacMan-killing behaviors capable of achieving high performance values.
14. Basic Neuro-Evolution: Offline Learning Algorithm (contd.)
- Step 3
- A pure elitism selection method is used where
- only the 10 best-fit solutions determine the members of the intermediate population and, therefore, are able to breed.
- Step 4
- Each parent clones an equal number of offspring in order to replace the non-picked solutions from elitism.
15. Basic Neuro-Evolution: Offline Learning Algorithm (contd.)
- Step 5
- Mutation occurs in each gene (connection weight) of each offspring's genome with a small probability. A uniform random distribution is used to define the mutated value of the connection weight.
- The algorithm terminates when a predetermined number of generations g is reached (e.g. g = 1000) and the best-fit Ghost's connection weights are saved.
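Steps 1-5 can be sketched as a single evolutionary loop. The population size, mutation rate, and `play_games` are illustrative stand-ins, and the fitness here is an assumed form that merely rewards short killing times t_k; the exact formula is in Ref. 1.

```python
import random

POP_SIZE = 50
N_WEIGHTS = 4 * 5 + 5 * 4    # connection weights of the 4-5-4 network
ELITE = 10                   # Step 3: only the 10 best-fit solutions breed
MUT_RATE = 0.05              # Step 5: small per-gene mutation probability

def random_genome():
    """Uniformly random connection weights within [-5, 5]."""
    return [random.uniform(-5, 5) for _ in range(N_WEIGHTS)]

def play_games(genome, n_games=5, t_max=200):
    """Placeholder for Steps 1-2: returns the killing times t_k of N games."""
    return [random.uniform(10, t_max) for _ in range(n_games)]

def fitness(genome):
    """Assumed fitness: grows as the Ghosts kill PacMan faster, averaged over N games."""
    times = play_games(genome)
    return sum(1.0 / tk for tk in times) / len(times)

def evolve(generations=20):
    pop = [random_genome() for _ in range(POP_SIZE)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:ELITE]                      # Step 3: pure elitism
        children = []
        while len(elite) + len(children) < POP_SIZE:
            child = list(random.choice(elite))   # Step 4: parents clone offspring
            for i in range(len(child)):          # Step 5: uniform mutation
                if random.random() < MUT_RATE:
                    child[i] = random.uniform(-5, 5)
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)                 # best-fit Ghost's weights
```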
16. Basic Neuro-Evolution: Offline Learning
- Pros
- Computationally efficient: minimal in-game computation
- Can be tailor-made for specific maps
- Cons
- Cannot adapt to changing maps
- May overfit to the training player's characteristics
17. Basic Neuro-Evolution: Online Learning
- This learning approach is based on the idea of Ghosts that learn while they are playing against PacMan.
- In other words, Ghosts that are reactive to any player's behavior and learn from its strategy instead of being predictable and uninteresting.
- Furthermore, this approach's additional objective is to keep the game's interest at a high level for as long as it is being played.
18. Basic Neuro-Evolution: Online Learning (contd.)
- Beginning from any initial group of homogeneous offline-trained (OLT) Ghosts, the online learning (OLL) mechanism attempts to transform them into a group of heterogeneous Ghosts that are interesting to play against.
- An OLT Ghost is cloned 4 times and its clones are placed in the PacMan game field to play against a selected PacMan type of player.
19. Basic Neuro-Evolution: Online Learning Algorithm
- At each generation
- Step 1
- Each Ghost is evaluated every t simulation steps while the game is played. The fitness function is
- f = d_1 − d_t
- where
- d_i = distance between the Ghost and PacMan at the i-th simulation step.
- This fitness function promotes Ghosts that move towards PacMan within an evaluation period of t simulation steps.
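The online fitness above is simply the gap closed over the evaluation window: positive when the Ghost ended the window closer to PacMan than it started.

```python
def online_fitness(distances):
    """f = d_1 - d_t, where distances[i] is the Ghost-PacMan
    distance at simulation step i+1 of the evaluation window."""
    return distances[0] - distances[-1]
```

For example, a Ghost whose distance shrank from 10 to 4 over the window scores 6, while one that drifted from 3 to 5 scores -2.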
20. Basic Neuro-Evolution: Online Learning Algorithm (contd.)
- Step 2
- A pure elitism selection method is used where only the fittest solution is able to breed. The best-fit parent clones an offspring.
- Step 3
- Mutation occurs in each gene (connection weight) of each offspring's genome with a probability that is inversely proportional to the entropy of the group of Ghosts.
21. Basic Neuro-Evolution: Online Learning Algorithm (contd.)
- Step 4
- The cloned offspring is evaluated briefly in offline mode, that is, by replacing the worst-fit member of the population and playing an offline (i.e. no visualization of the actions) short game of t simulation steps.
- The fitness values of the mutated offspring and the worst-fit Ghost are compared and the better one is kept for the next generation.
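One online generation (Steps 2-4) can be sketched as follows. The genome representation, the particular diversity measure used for the entropy, and the `simulate` callback are assumptions made for illustration.

```python
import math
import random

def behaviour_entropy(pop):
    """Assumed diversity measure: entropy of the genomes' weight-sign patterns."""
    counts = {}
    for g in pop:
        key = tuple(w > 0 for w in g)
        counts[key] = counts.get(key, 0) + 1
    n = len(pop)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def online_generation(pop, fitnesses, simulate):
    # Step 2: pure elitism - only the fittest Ghost clones one offspring.
    best = max(range(len(pop)), key=lambda i: fitnesses[i])
    child = list(pop[best])
    # Step 3: mutation probability inversely proportional to group entropy
    # (assumed form), so a homogeneous group mutates more aggressively.
    p_mut = 1.0 / (1.0 + behaviour_entropy(pop))
    for i in range(len(child)):
        if random.random() < p_mut:
            child[i] = random.uniform(-5, 5)
    # Step 4: evaluate the offspring in a short offline game in place of
    # the worst-fit Ghost; keep whichever of the two scores better.
    worst = min(range(len(pop)), key=lambda i: fitnesses[i])
    child_fit = simulate(child)
    if child_fit > fitnesses[worst]:
        pop[worst], fitnesses[worst] = child, child_fit
    return pop, fitnesses
```

The offline trial in Step 4 is what keeps the live game safe: a badly mutated offspring is discarded before it ever controls a visible Ghost.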
22. Basic Neuro-Evolution: Online Learning
- Pros
- Adapts easily to varying maps and player mindsets
- Highly generalized
- Cons
- Slow due to intensive computations during run-time
- May take some time to re-train on new maps
23. Neuro-Evolution of Augmenting Topologies: The Model
- Takes into account teamwork and strategic formations
- Operates on global data
- Has three modes of operation
- Chasing (pursuing PacMan)
- Fleeing (evading PacMan when PacMan has consumed a power-pill)
- Returning (returning to the hideout to be restored)
- Optimizes the team of Ghosts as a whole
- Each Ghost controlled independently by a dedicated ANN
- ANNs trained by an Evolutionary Algorithm (GA)
- ANN training affects
- Weights of the edges
- Interconnection of the perceptrons (ANN topology)
- Ghosts trained in real-time
- Training proceeds in parallel with the game
- Adapts the Ghosts over short time slices
- Ghosts classified according to their distance from PacMan
- Each distance class has a dedicated ANN population which evolves genetically
- Multiple populations aid in heterogeneous strategy development
24. Neuro-Evolution of Augmenting Topologies: The ANN
- Each ANN represents a Ghost
- Input
- Current status of the Ghost
- Current status of the closest Ghost
- Current status of the Ghost closest to PacMan
- Distances to objects of interest (PacMan, Ghost, power-pill, pellet, intersection, etc.)
- Distances between PacMan and objects of interest (Ghost, power-pill, pellet, intersection, etc.)
- Output
- Score of a cell
- Applied 4 times, once for each adjacent cell
- Cell with the maximum score selected for making the move
- Connections
- Minimally connected
- Evolves through NEAT
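The single-output scoring scheme above can be sketched as follows: the same (NEAT-evolved) network is applied once per adjacent cell and the Ghost moves to the highest-scoring cell. `score_cell` stands in for the evolved network, and `features_for` is an assumed helper that would assemble the input list from the game state.

```python
def select_move(score_cell, adjacent_cells, game_state):
    """Apply the one-output scoring network to each adjacent cell
    and return the cell with the maximum score."""
    best_cell, best_score = None, float("-inf")
    for cell in adjacent_cells:                     # up to 4 applications
        features = game_state.features_for(cell)    # assumed feature builder
        s = score_cell(features)
        if s > best_score:
            best_cell, best_score = cell, s
    return best_cell
```

Scoring cells rather than directions is what lets a single evolved network generalize across Ghosts in different positions and modes.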
25. Neuro-Evolution of Augmenting Topologies: Training Algorithm
- Initialize
- A number of random neural network populations are generated, each corresponding to Ghosts classified according to their distance to PacMan.
- The game is divided into time slices of a small number of moves. G_n represents the state of the game at the beginning of time slice n.
26. Neuro-Evolution of Augmenting Topologies: Training Algorithm (contd.)
- Algorithm
- Mark a Ghost for learning during the current time slice, beginning at G_n.
- Look ahead (based on the models of the other Ghosts and PacMan) and store the game state expected at the beginning of the next slice through simulated play (eG_{n+1}). This will be the starting state for the NEAT simulation runs.
- The fitness of a Ghost strategy is determined by evaluating the game state that we expect to reach when the strategy is used in place of the marked Ghost (eG_{n+2}). This is an evaluation of the end state. Various fitness schemes are considered.
- In parallel with the running of the actual game, run NEAT until the actual game reaches G_{n+1}.
- The best individual from the simulations is substituted into the game, replacing the marked Ghost.
- Repeat the process for the next Ghost in turn.
27. Neuro-Evolution of Augmenting Topologies: Training Algorithm - Pictorial Representation
Picture courtesy Ref. 2
28. Neuro-Evolution of Augmenting Topologies: Experiment 1 - Chasing and Evading PacMan
Fitness ranking: Rank 1: PacMan's number of lives; Rank 2: …

               Score    Lives Lost
Classical AI   4808.4   1.44
Experiment 1   4127.6   1.12

Table courtesy Ref. 2
- Improvement over Classical AI
- Ghosts tend to form clusters, reducing effectiveness
29. Neuro-Evolution of Augmenting Topologies: Experiment 2 - Remaining Dispersed
Fitness ranking: Rank 1: PacMan's number of lives; Rank 2: …; Rank 3: …

               Score    Lives Lost
Classical AI   4808.4   1.44
Experiment 1   4127.6   1.12
Experiment 2   4930.8   1.52

Table courtesy Ref. 2
- Inefficient as compared to Experiment 1
- Ghosts tend to oscillate in dispersed locations
30. Neuro-Evolution of Augmenting Topologies: Experiment 3 - Protection Behaviour
Fitness ranking: Rank 1: PacMan's number of lives; Rank 2: count(Ghost_r); Rank 3: count(Ghost_f); Rank 4: …; Rank 5: …

               Score    Lives Lost
Classical AI   4808.4   1.44
Experiment 1   4127.6   1.12
Experiment 2   4930.8   1.52
Experiment 3   4271.6   1.64

Table courtesy Ref. 2
- Teamwork improved
- Ghosts committing suicide!
31. Neuro-Evolution of Augmenting Topologies: Experiment 4 - Ambushing PacMan
Fitness ranking: Rank 1: PacMan's number of lives; Rank 2: intersections controlled by PacMan; Rank 3: …; Rank 4: …; Rank 5: PacMan's score

               Score    Lives Lost
Classical AI   4808.4   1.44
Experiment 1   4127.6   1.12
Experiment 2   4930.8   1.52
Experiment 3   4271.6   1.64
Experiment 4   4494.4   1.96

Table courtesy Ref. 2
- Kill rate significantly increased
32. Neuro-Evolution of Augmenting Topologies: Discussions
- Uses high-level global data about the state of the game
- Reduces computational lag by looking ahead and employing parallelism
- Encourages the system to learn short-term strategies rather than generalized long-term ones
- Lays out basic fitness schemes, opening up a horizon for many more
- Demonstrates complex team behaviours
33. Conclusion
- PacMan serves as a good test-bed for programming intelligent agents
- Generalized strategies are applicable to a vast class of predator/prey games
- Programming Ghosts gives good insight into efficient team strategies
34. References
- [1] G. N. Yannakakis and J. Hallam, "Evolving Opponents for Interesting Interactive Computer Games," in Proceedings of the 8th International Conference on the Simulation of Adaptive Behavior (SAB'04): From Animals to Animats 8, pp. 499-508, Los Angeles, CA, USA, July 13-17, 2004. The MIT Press.
- [2] M. Wittkamp, L. Barone, and P. Hingston, "Using NEAT for Continuous Adaptation and Teamwork Formation in Pacman," in 2008 IEEE Symposium on Computational Intelligence and Games (CIG'08).
- [3] K. O. Stanley and R. Miikkulainen, "Evolving Neural Networks through Augmenting Topologies," Evolutionary Computation, 2002. The MIT Press Journals.