1
Evolving & Adapting Opponents in PACMAN
  • PRESENTED BY
  • DEBARGHYA MAJUMDAR
  • SOMDAS BANDYOPADHYAY
  • KAUSTAV DEY BISWAS

2
Topics to be discussed
  • PacMan
  • The Game
  • Areas of Intelligence
  • Classical Ghost Algorithm
  • Basic Neuro-Evolution
  • The Model
  • The ANN
  • Offline Learning
  • Online Learning
  • Neuro-Evolution of Augmenting Topologies
  • The Model
  • The ANN
  • Training Algorithm
  • Experiments
  • Discussions
  • Conclusions
  • References

3
PacMan: The Game
  • Classical PacMan released by Namco (Japan) in
    1980
  • Single player predator/prey game
  • Player plays as the PacMan
  • Navigates through a 2D maze
  • Eats pellets
  • Avoids Ghosts
  • Ghosts played by the computer
  • Usually 4 in number
  • Chase the PacMan and try to catch it
  • Winning / Losing
  • PacMan wins if it has eaten all pellets or
    survived for 180 sec
  • PacMan loses if caught by a Ghost
  • Special Power-pills
  • Allows PacMan to eat the Ghosts for a short
    period of time
  • Eaten Ghosts respawn

4
PacMan: Areas of Intelligence
  • Programming the PacMan to replace the human
    player
  • Heavily researched field
  • Gives insight into AI, but doesn't add value to
    the game engine
  • Making the Ghosts more intelligent
  • Strives to
  • Make the Ghosts more efficient killers
  • Incorporate teamwork
  • Minimize the score of the PacMan
  • Make the Ghosts learn & adapt
  • Make the game more 'interesting'
  • Adds value to the game
  • Scarcely researched field
  • Currently our topic of interest

5
PacMan: The Classical Ghost Algorithm
  • Two modes of play
  • Attack: chase down Pacman
  • Scatter: break away when Pacman has a power-pill
  • Decide target cell on reaching an intersection
  • Attack targets
  • Red Ghost: current position of Pacman
  • Pink Ghost: cell 4 positions ahead of Pacman
  • Blue Ghost: cell 2 positions ahead of Pacman
  • Orange Ghost: target Pacman when far away, retire
    to its corner when nearby
  • Scatter target
  • Pseudo-random behaviour
  • Minimize distance from the target cell
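The targeting rules above can be sketched as follows. The coordinate convention, the `heading` vector, and the Manhattan-distance threshold `far` are illustrative assumptions, not details taken from the original game.

```python
import random

def ahead(pacman, heading, n):
    """Cell n positions ahead of Pacman along its heading (dx, dy)."""
    return (pacman[0] + n * heading[0], pacman[1] + n * heading[1])

def target_cell(name, ghost, pacman, heading, scatter=False,
                corner=(0, 0), far=8, rng=random):
    """Target cell a ghost steers toward on reaching an intersection."""
    if scatter:  # Pacman holds a power-pill: pseudo-random break-away
        return (rng.randint(0, 27), rng.randint(0, 30))
    if name == "red":
        return pacman                     # Pacman's current cell
    if name == "pink":
        return ahead(pacman, heading, 4)  # 4 cells ahead of Pacman
    if name == "blue":
        return ahead(pacman, heading, 2)  # 2 cells ahead of Pacman
    # orange: chase Pacman while far away, retire to a corner when nearby
    dist = abs(ghost[0] - pacman[0]) + abs(ghost[1] - pacman[1])
    return pacman if dist > far else corner
```

At every subsequent move the ghost then simply minimizes its distance to the returned cell.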

6
PacMan: Making the Ghosts Intelligent
  • We will discuss two approaches for programming
    intelligence, learnability & adaptability into
    the Ghosts
  • Local optimization using basic Neuro-Evolution
  • Global optimization using Neuro-Evolution of
    Augmenting Topologies (NEAT)

7
Basic Neuro-Evolution: The Model
  • Game field modeled as a grid. Each square can
    contain any of
  • Wall block
  • Pellet
  • Pacman
  • Ghost(s)
  • Power-pills not incorporated, to keep things
    simple
  • Ghosts have only local info about the grid
  • Game proceeds in cycles of 2 steps
  • Gather information about environment
  • Make a move by processing the information
  • Each Ghost controlled independently by a
    dedicated ANN
  • ANNs trained by Evolutionary Algorithm (GA)
  • Offline (Train & Deploy)
  • Online (Learn as you go)
  • Weight-adjusting is the only means of training
    the ANNs

8
Basic Neuro-Evolution: The ANN
  • A 4-5-4 feed-forward neural controller,
    comprising sigmoid neurons, is employed to
    manage each Ghost's motion.
  • Inputs
  • Using their sensors, Ghosts inspect the
    environment from their own point of view and
    decide their next action.
  • Each Ghost receives input information from its
    environment, expressed in the neural network's
    input array of dimension 4
  • Outputs
  • Four scores, one for each of the four directions
  • The direction with the maximum score is selected

9
Basic Neuro-Evolution: The ANN Inputs
  • The input array consists of
  • Δx,P = xg − xp
  • Δy,P = yg − yp
  • Δx,C = xg − xc
  • Δy,C = yg − yc

Picture courtesy Ref.1
where (xg, yg), (xp, yp) and (xc, yc) are the
Cartesian co-ordinates of the current Ghost's,
Pacman's and closest Ghost's current positions
respectively.
10
Basic Neuro-Evolution: The Network
[Figure: the 4-5-4 feed-forward network, mapping the
inputs Δy,P, Δx,P, Δx,C, Δy,C to the four direction
scores UP, DOWN, LEFT, RIGHT]
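A minimal sketch of this controller, assuming no bias units and plain Python lists for the weight matrices (the slides only fix the 4-5-4 layer sizes, the sigmoid units, and the argmax move selection):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, inputs):
    """weights = (W1, W2): the 4->5 and 5->4 weight matrices."""
    w1, w2 = weights
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w1]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]

def choose_move(weights, dxp, dyp, dxc, dyc):
    """Score UP/DOWN/LEFT/RIGHT and pick the best-scoring direction."""
    scores = forward(weights, [dxp, dyp, dxc, dyc])
    return ["UP", "DOWN", "LEFT", "RIGHT"][scores.index(max(scores))]

def random_weights(rng=random):
    """Uniform initial weights in [-5, 5], as on the offline-learning slide."""
    w1 = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(5)]
    w2 = [[rng.uniform(-5, 5) for _ in range(5)] for _ in range(4)]
    return (w1, w2)
```

Flattening `(w1, w2)` into one list of 40 weights gives the genome that the evolutionary algorithm below operates on.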
11
Basic Neuro-Evolution: Offline Learning
  • An off-line evolutionary learning approach is
    used in order to produce some good (i.e. in
    terms of performance) initial behaviors for the
    online learning mechanism.
  • The neural networks that determine the behavior
    of the Ghosts are themselves evolved.
  • The evolution process is limited to the
    connection weights of the neural network.
  • The evolutionary procedure is based on a genetic
    algorithm
  • Each Ghost has a genome that encodes the
    connection weights of its neural network.
  • A population of neural networks (Ghosts) is
    initialized with uniformly distributed random
    connection weights that lie within [-5, 5].

12
Basic Neuro-Evolution: Offline Learning
Algorithm
  • At each generation
  • Step 1
  • Every Ghost in the population is cloned 4 times.
  • These 4 clones are placed in the PacMan game
    field and play N games, each one for an
    evaluation period of t simulation steps.
  • The outcome of these games is the time taken to
    kill Pacman, tk, for each game.

13
Basic Neuro-Evolution: Offline Learning
Algorithm (contd.)
  • Step 2
  • Each Ghost is evaluated over each game, and its
    fitness value is given by the expectation E{f}
    over the N games, where f is given by -
  • By the use of the fitness function f, we promote
    Pacman-killing behaviors capable of achieving
    high performance values.

14
Basic Neuro-Evolution: Offline Learning
Algorithm (contd.)
  • Step 3
  • A pure elitism selection method is used, where
    only the 10 best-fit solutions determine the
    members of the intermediate population and,
    therefore, are able to breed.
  • Step 4
  • Each parent clones an equal number of offspring
    in order to replace the non-picked solutions
    from elitism.

15
Basic Neuro-Evolution: Offline Learning
Algorithm (contd.)
  • Step 5
  • Mutation occurs in each gene (connection weight)
    of each offspring's genome with a small
    probability. A uniform random distribution is
    used to define the mutated value of the
    connection weight.
  • The algorithm terminates when a predetermined
    number of generations g is reached (e.g. g =
    1000), and the best-fit Ghost's connection
    weights are saved.
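Steps 1-5 can be condensed into the following sketch. The game itself is abstracted behind a caller-supplied `fitness` function (the slides' formula for f is not reproduced here), and the population size, mutation rate, and generation count are illustrative assumptions.

```python
import random

ELITE = 10  # pure elitism: only the 10 best-fit genomes breed

def mutate(genome, rate=0.05, rng=random):
    """Each gene mutates with small probability to a uniform value in [-5, 5]."""
    return [rng.uniform(-5, 5) if rng.random() < rate else g for g in genome]

def evolve(fitness, pop_size=50, genome_len=40, generations=100, rng=random):
    # Uniform random initial connection weights within [-5, 5]
    pop = [[rng.uniform(-5, 5) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Steps 1-2: evaluate every ghost (the clones playing N games
        # are folded into the `fitness` callable here)
        pop.sort(key=fitness, reverse=True)
        elite = pop[:ELITE]  # Step 3: intermediate population
        # Step 4: parents clone offspring to refill the population;
        # Step 5: offspring genes mutate with small probability
        children = [mutate(elite[i % ELITE], rng=rng)
                    for i in range(pop_size - ELITE)]
        pop = elite + children
    return max(pop, key=fitness)  # best-fit Ghost's connection weights
```

With `sum` as a stand-in fitness, the loop steadily drives the genome toward high-valued weights, which illustrates the selection pressure without needing the game.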

16
Basic Neuro-Evolution: Offline Learning
  • Pros
  • Computationally efficient: minimum in-game
    computation
  • Can be tailor-made for specific maps
  • Cons
  • Cannot adapt to changing maps
  • May overfit to the training player's
    characteristics

17
Basic Neuro-Evolution: Online Learning
  • This learning approach is based on the idea of
    Ghosts that learn while they are playing against
    Pacman.
  • In other words, Ghosts that are reactive to any
    player's behavior and learn from its strategy,
    instead of being predictable and uninteresting
  • Furthermore, this approach's additional
    objective is to keep the game's interest at high
    levels for as long as it is being played.

18
Basic Neuro-Evolution: Online Learning (contd.)
  • Beginning from any initial group of homogeneous
    offline-trained (OLT) Ghosts, the online
    learning (OLL) mechanism attempts to transform
    them into a group of heterogeneous Ghosts that
    are interesting to play against.
  • An OLT Ghost is cloned 4 times and its clones
    are placed in the Pacman game field to play
    against a selected Pacman type of player.

19
Basic Neuro-Evolution: Online Learning Algorithm
  • At each generation
  • Step 1
  • Each ghost is evaluated every t simulation steps
    while the game is played. The fitness function
    is -
  • f = d1 - dt
  • where
  • di = distance between the ghost and Pacman at
    the ith simulation step.
  • This fitness function promotes ghosts that move
    towards Pacman within an evaluation period of t
    simulation steps.

20
Basic Neuro-Evolution: Online Learning
Algorithm (contd.)
  • Step 2
  • A pure elitism selection method is used where
    only the fittest solution is able to breed. The
    best-fit parent clones an offspring.
  • Step 3
  • Mutation occurs in each gene (connection weight)
    of the offspring's genome with a probability
    that is inversely proportional to the entropy of
    the group of ghosts.

21
Basic Neuro-Evolution: Online Learning
Algorithm (contd.)
  • Step 4
  • The cloned offspring is evaluated briefly in
    offline mode, that is, by replacing the
    worst-fit member of the population and playing a
    short offline (i.e. no visualization of the
    actions) game of t simulation steps.
  • The fitness values of the mutated offspring and
    the worst-fit Ghost are compared, and the better
    one is kept for the next generation.
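One online generation (Steps 1-4) can be sketched as below. The entropy measure over the ghost group and its exact mapping to a mutation probability are illustrative assumptions, and `evaluate` stands in for the short offline trial game.

```python
import random

def online_fitness(distances):
    """Step 1: f = d1 - dt rewards ghosts that closed in on Pacman."""
    return distances[0] - distances[-1]

def mutation_prob(entropy, p_max=0.4):
    """Step 3: mutation probability shrinks as group entropy grows
    (an assumed form of the 'inversely proportional' rule)."""
    return p_max / (1.0 + entropy)

def online_generation(pop, evaluate, entropy, rng=random):
    """Steps 2-4: the fittest ghost breeds one mutated clone, which
    replaces the worst-fit ghost only if it scores higher in a trial."""
    fits = [evaluate(g) for g in pop]
    best, worst = fits.index(max(fits)), fits.index(min(fits))
    p = mutation_prob(entropy)
    child = [rng.uniform(-5, 5) if rng.random() < p else g
             for g in pop[best]]
    if evaluate(child) > fits[worst]:  # keep the better of the two
        pop[worst] = child
    return pop
```

The entropy-driven rate means a diverse (high-entropy, interesting) group of ghosts is left largely undisturbed, while a homogeneous group gets mutated more aggressively.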

22
Basic Neuro-Evolution: Online Learning
  • Pros
  • Adapts easily to varying maps and player
    mindsets
  • Highly generalized
  • Cons
  • Slow due to intensive computations during
    run-time
  • May take some time to re-train on new maps

23
Neuro-Evolution of Augmenting Topologies: The
Model
  • Takes into account team work and strategic
    formations
  • Operates on global data
  • Has three modes of operation
  • Chasing (pursuing Pacman)
  • Fleeing (evading Pacman when Pacman has consumed
    a power-pill)
  • Returning (returning back to the hideout to be
    restored)
  • Optimizes the team of Ghosts as a whole
  • Each Ghost controlled independently by a
    dedicated ANN
  • ANNs trained by Evolutionary Algorithm (GA)
  • ANN training affects
  • Weights of the edges
  • Interconnection of the perceptrons (ANN topology)
  • Ghosts trained in real-time
  • Training proceeds in parallel with the game
  • Adapts the Ghosts over short time slices
  • Ghosts classified according to their distance
    from Pacman
  • Each distance class has a dedicated ANN
    population which evolves genetically
  • Multiple populations aid in heterogeneous
    strategy development
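The distance classification driving the multiple populations might look like the following; the class boundaries are illustrative assumptions.

```python
def distance_class(ghost, pacman, bounds=(5, 10)):
    """Map a ghost to its ANN population index by Manhattan distance
    to Pacman: 0 = near, 1 = mid, 2 = far for the default bounds."""
    d = abs(ghost[0] - pacman[0]) + abs(ghost[1] - pacman[1])
    return sum(d > b for b in bounds)  # count of thresholds exceeded
```

Each class index selects a separate, independently evolving population, which is what lets near and far ghosts develop heterogeneous strategies.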

24
Neuro-Evolution of Augmenting Topologies: The ANN
  • Each ANN represents a Ghost
  • Input
  • Current status of Ghost
  • Current status of closest Ghost
  • Current status of closest Ghost to Pacman
  • Distances to objects of interest (Pacman, Ghost,
    Powerpill, Pellet, Intersection, etc)
  • Distances between Pacman & objects of interest
    (Ghost, Powerpill, Pellet, Intersection, etc)
  • Output
  • Score of a cell
  • Applied 4 times, one for each adjacent cell
  • Cell with maximum score selected for making move
  • Connections
  • Minimally connected
  • Evolves through NEAT
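The per-cell scoring above can be sketched with the network abstracted into a `score` callable; computing the status and distance features per candidate cell is an assumption folded into `score`.

```python
def best_adjacent_cell(score, ghost, walls):
    """Apply the scoring network once per adjacent cell; the cell with
    the maximum score becomes the ghost's next move."""
    x, y = ghost
    candidates = [(x, y - 1), (x, y + 1), (x - 1, y), (x + 1, y)]
    open_cells = [c for c in candidates if c not in walls]
    return max(open_cells, key=score)
```

Scoring cells rather than directions keeps the network's output a single value, so NEAT only has to evolve one output node.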

25
Neuro-Evolution of Augmenting Topologies:
Training Algorithm
  • Initialize
  • A number of random neural network populations
    are generated, each corresponding to ghosts
    classified according to their distance to
    Pacman.
  • The game is divided into time slices of a small
    number of moves. G(n) represents the state of
    the game beginning at time slice n.

26
Neuro-Evolution of Augmenting Topologies:
Training Algorithm (contd.)
  • Algorithm
  • Mark a ghost for learning during the current
    time slice, beginning at G(n).
  • Look ahead (based on the models of the other
    ghosts and Pacman) and store the game state as
    it is expected to be at the beginning of the
    next slice, through simulated play (eG(n+1)).
    This will be the starting state for the NEAT
    simulation runs.
  • The fitness of a ghost strategy is determined by
    evaluating the game state that we expect to
    reach when the strategy is used in place of the
    marked ghost (eG(n+2)). This is an evaluation of
    the end state. Various fitness schemes are
    considered.
  • In parallel with the running of the actual game,
    run NEAT until the actual game reaches G(n+1).
  • The best individual from the simulations is
    substituted into the game, replacing the marked
    ghost.
  • Repeat the process for the next ghost in turn.
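The loop above can be outlined schematically. Here `simulate`, `evaluate_state`, and `neat_step` are stand-ins (assumptions, not a real NEAT library API) for the look-ahead play, the end-state evaluation, and one NEAT generation respectively.

```python
def train_slice(game_state, ghosts, marked, population, slice_moves,
                simulate, evaluate_state, neat_step, budget):
    """One time slice of training for the marked ghost."""
    # Look ahead to the expected state at the start of the next slice,
    # eG(n+1); the NEAT simulation runs start from this state.
    start = simulate(game_state, ghosts, slice_moves)

    def fitness(candidate):
        # Value of the expected end state one slice further on, eG(n+2),
        # with `candidate` playing in place of the marked ghost.
        trial = list(ghosts)
        trial[marked] = candidate
        return evaluate_state(simulate(start, trial, slice_moves))

    # Run NEAT generations while the real game plays toward G(n+1);
    # `budget` caps how many generations fit into one slice.
    for _ in range(budget):
        population = neat_step(population, fitness)

    # Substitute the best individual for the marked ghost in the live game.
    ghosts[marked] = max(population, key=fitness)
    return ghosts, population
```

Because only the expected states are simulated, the expensive evolution runs off the real game's critical path, which is where the reduced computational lag discussed later comes from.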

27
Neuro-Evolution of Augmenting Topologies:
Training Algorithm, Pictorial Representation
Picture courtesy Ref.2
28
Neuro-Evolution of Augmenting Topologies:
Experiment 1, Chasing and Evading Pacman
Fitness ranking: Rank 1: Pacman's number of lives;
Rank 2: ...

              Score    Lives Lost
Classical AI  4808.4   1.44
Experiment 1  4127.6   1.12
Table courtesy Ref.2
  • Improvement over Classical AI
  • Ghosts tend to form clusters, reducing
    effectiveness

29
Neuro-Evolution of Augmenting Topologies:
Experiment 2, Remaining Dispersed
Fitness ranking: Rank 1: Pacman's number of lives;
Rank 2: ...; Rank 3: ...

              Score    Lives Lost
Classical AI  4808.4   1.44
Experiment 1  4127.6   1.12
Experiment 2  4930.8   1.52
Table courtesy Ref.2
  • Inefficient as compared to Experiment 1
  • Ghosts tend to oscillate in dispersed locations

30
Neuro-Evolution of Augmenting Topologies:
Experiment 3, Protection Behaviour
Fitness ranking: Rank 1: Pacman's number of lives;
Rank 2: count(Ghost_r); Rank 3: count(Ghost_f);
Rank 4: ...; Rank 5: ...

              Score    Lives Lost
Classical AI  4808.4   1.44
Experiment 1  4127.6   1.12
Experiment 2  4930.8   1.52
Experiment 3  4271.6   1.64
Table courtesy Ref.2
  • Teamwork improved
  • Ghosts committing suicide!

31
Neuro-Evolution of Augmenting Topologies:
Experiment 4, Ambushing Pacman
Fitness ranking: Rank 1: Pacman's number of lives;
Rank 2: intersections controlled by Pacman;
Rank 3: ...; Rank 4: ...; Rank 5: Pacman's score

              Score    Lives Lost
Classical AI  4808.4   1.44
Experiment 1  4127.6   1.12
Experiment 2  4930.8   1.52
Experiment 3  4271.6   1.64
Experiment 4  4494.4   1.96
Table courtesy Ref.2
  • Kill rate significantly increased

32
Neuro-Evolution of Augmenting Topologies:
Discussions
  • Uses high-level global data about the state of
    the game
  • Reduces computational lag by looking ahead and
    employing parallelism
  • Encourages the system to learn short-term
    strategies, rather than generalized long-term
    ones
  • Sets out basic fitness schemes, opening up a
    horizon for many more
  • Demonstrates complex team behaviours

33
Conclusion
  • PacMan serves as a good test-bed for programming
    intelligent agents
  • Generalized strategies applicable to a vast class
    of predator/prey games
  • Programming Ghosts gives good insight into
    efficient team strategies

34
References
  • [1] G. N. Yannakakis and J. Hallam, "Evolving
    Opponents for Interesting Interactive Computer
    Games," in Proceedings of the 8th International
    Conference on the Simulation of Adaptive
    Behavior (SAB'04): From Animals to Animats 8,
    pp. 499-508, Los Angeles, CA, USA, July 13-17,
    2004. The MIT Press.
  • [2] M. Wittkamp, L. Barone, and P. Hingston,
    "Using NEAT for Continuous Adaptation and
    Teamwork Formation in Pacman," in 2008 IEEE
    Symposium on Computational Intelligence and
    Games (CIG'08).
  • [3] K. O. Stanley and R. Miikkulainen, "Evolving
    Neural Networks through Augmenting Topologies,"
    Evolutionary Computation, 10(2), 2002. The MIT
    Press.