Behavioral Diversity in Genetic Programming Starting Populations

Provided by: Cam53

1
Behavioral Diversity in Genetic Programming
Starting Populations
  • Lawrence Beadle

2
Introduction 1/2
  • A brief introduction to Genetic Programming (GP).
  • Existing methods to create starting populations.
  • Why are starting populations important in genetic
    programming?
  • A brief introduction to Reduced Ordered Binary
    Decision Diagrams (ROBDDs).
  • Semantic analysis of the output of the existing
    methods.

3
Introduction 2/2
  • The State Differential Algorithm
  • Results using the State Differential Algorithm
  • Reverse Engineering a Search Space
  • Conclusions
  • Future work.

4
An Introduction to Genetic Programming
  • Flowchart of the GP cycle. Box labels: Initialise
    Starting Population, Evaluate Fitness, Select Best
    Programs, Perform Crossover, GP Complete.

5
Why are Starting Populations Important in GP?
  • They provide a variety of syntactic and semantic
    material to be used as components for programs
    evolved using GP. Variety is key.
  • Previous work identified that syntactic bias
    present in programs at the start of GP runs can
    have a dramatic impact on the performance of the
    GP.
  • The size and shape of the programs in a starting
    population are important, as they affect the size
    of the programs as they evolve.

6
Existing Population Generation Methods
  • The three most commonly used initialisation
    methods are:
  • Grow
  • Full
  • Ramped Half and Half

7
But First, a Little More Background...
  • Terminals: input variables, for example a Boolean
    variable or a number, depending on the problem
    being tackled.
  • Functions: some form of operation, for example
    IF, AND, +, -, depending on the problem being
    tackled.
  • In this work, functions and terminals are
    represented in a tree form.

8
The Grow Starting Population
  • A maximum program depth is defined (commonly 6).
  • Grow randomly selects functions or terminals
    until the tree depth reaches 5, at which point it
    selects terminals only, so that the depth never
    exceeds 6.
  • If functions outnumber terminals and both are
    selected with uniform probability, there is a
    bias towards filling the tree to full depth (6).
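
The Grow procedure described above can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation: the function set, the terminal names, the nested-tuple tree representation and the helper names grow and tree_depth are all assumptions made for the example. The root is counted as depth 1, matching the slide's convention that terminals are forced once the tree is about to reach the maximum depth.

```python
import random

# Hypothetical primitive sets for a Boolean problem (not from the slides).
FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}  # name -> arity
TERMINALS = ["A0", "D0", "D1"]


def grow(max_depth, depth=1, rng=random):
    """Return a random tree (nested tuples) at most max_depth levels deep."""
    if depth >= max_depth:
        # Last permitted level: terminals only, so the limit always holds.
        return rng.choice(TERMINALS)
    # Uniform choice over functions and terminals together.
    choice = rng.choice(list(FUNCTIONS) + TERMINALS)
    if choice in FUNCTIONS:
        children = [grow(max_depth, depth + 1, rng) for _ in range(FUNCTIONS[choice])]
        return (choice,) + tuple(children)
    return choice


def tree_depth(tree):
    """Depth of a tree, counting a lone terminal as depth 1."""
    if isinstance(tree, str):
        return 1
    return 1 + max(tree_depth(child) for child in tree[1:])


tree = grow(max_depth=6, rng=random.Random(0))
print(tree, tree_depth(tree))
```

Because the choice is uniform over four functions and three terminals in this sketch, a function is picked more often than a terminal at each step, which is the source of the bias towards deep trees mentioned above.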

9
The Full Starting Population
  • Again a maximum program depth is defined
    (commonly 6).
  • Full will fill the tree such that all branches of
    the tree will reach the maximum depth.
  • This results in all the programs having the
    same size and shape. Is this realistic in terms
    of a solution?
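
A minimal Python sketch of the Full method follows, under the same kind of assumptions as any toy example: the function set, terminal names and the helpers full and leaf_depths are hypothetical and chosen only for illustration. The check at the end confirms that every branch ends at the maximum depth.

```python
import random

FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}  # hypothetical, name -> arity
TERMINALS = ["A0", "D0", "D1"]                       # hypothetical


def full(max_depth, depth=1, rng=random):
    """Return a random tree in which every branch ends at max_depth."""
    if depth >= max_depth:
        return rng.choice(TERMINALS)
    # Above the last level only functions are allowed.
    name = rng.choice(list(FUNCTIONS))
    children = [full(max_depth, depth + 1, rng) for _ in range(FUNCTIONS[name])]
    return (name,) + tuple(children)


def leaf_depths(tree, depth=1):
    """Yield the depth of every terminal in the tree."""
    if isinstance(tree, str):
        yield depth
    else:
        for child in tree[1:]:
            yield from leaf_depths(child, depth + 1)


tree = full(max_depth=4, rng=random.Random(1))
print(set(leaf_depths(tree)))  # every leaf sits at depth 4
```

With the mixed arities assumed in this sketch, node counts can still differ between trees; the trees are identical in size and shape only when every function has the same arity.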

10
Ramped Half and Half
  • Ramped Half and Half is a combination of Full and
    Grow with varying maximum depths within a range
    (for example, depths 2 to 6).
  • 20% of the population will be initialised to
    depth 2, the next 20% to depth 3 and so on until
    the final 20% is at depth 6.
  • Of each 20%, half the trees are generated using
    the Full method and half using the Grow method.
  • This is one of the most commonly used generation
    techniques in GP.
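
The split described above can be sketched in Python. This is an illustrative sketch only: the primitive sets and the names make_tree and ramped_half_and_half are assumptions, and the even round-robin allocation is one simple way to obtain equal shares per depth and per method.

```python
import random
from collections import Counter

FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}  # hypothetical, name -> arity
TERMINALS = ["A0", "D0", "D1"]                       # hypothetical


def make_tree(max_depth, method, rng, depth=1):
    """Build a tree with 'full' or 'grow'; the root counts as depth 1."""
    if depth >= max_depth:
        return rng.choice(TERMINALS)
    # Full picks functions only; Grow picks from functions and terminals.
    pool = list(FUNCTIONS) if method == "full" else list(FUNCTIONS) + TERMINALS
    name = rng.choice(pool)
    if name in FUNCTIONS:
        children = [make_tree(max_depth, method, rng, depth + 1)
                    for _ in range(FUNCTIONS[name])]
        return (name,) + tuple(children)
    return name


def ramped_half_and_half(pop_size, min_depth=2, max_depth=6, rng=None):
    """Spread the population evenly over depths, alternating Full and Grow."""
    rng = rng or random.Random()
    depths = list(range(min_depth, max_depth + 1))
    population = []
    for i in range(pop_size):
        depth = depths[i % len(depths)]
        method = "full" if (i // len(depths)) % 2 == 0 else "grow"
        population.append((depth, method, make_tree(depth, method, rng)))
    return population


pop = ramped_half_and_half(100, rng=random.Random(0))
print(Counter((depth, method) for depth, method, _ in pop))
```

For a population of 100 and depths 2 to 6, each depth receives 20 programs, 10 built with Full and 10 with Grow.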

11
Syntax Vs Semantics
  • What do these generation techniques actually
    produce?
  • A number of papers have focused on controlling
    the syntax of the output population, but only one
    has studied semantic output from starting
    population generation techniques.
  • We have taken this further by not only counting
    semantically equivalent programs but establishing
    whether there are repeated behaviours produced by
    the Ramped Half and Half technique.

12
Measuring Semantics Using ROBDDs
  • Reduced Ordered Binary Decision Diagrams allow us
    to represent behaviour in a canonical form.
  • This is important because whilst there can be
    many syntax representations for one behaviour,
    there is only one ROBDD for a particular
    behaviour.
  • Not only can we count the number of times a
    specific behaviour is represented in syntax form,
    we can also detect other useful, or not so useful,
    properties of a behaviour.
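
The idea of one canonical form per behaviour can be illustrated without a BDD library. The sketch below swaps in a full truth table as the canonical form instead of an ROBDD; this is practical only for a handful of inputs and is not the method used in the presentation. The variable names, the nested-tuple program format and the helpers evaluate and signature are assumptions for the example.

```python
from itertools import product

VARIABLES = ["A0", "D0", "D1"]  # hypothetical input names


def evaluate(tree, env):
    """Evaluate a nested-tuple Boolean program under a variable assignment."""
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    vals = [evaluate(arg, env) for arg in args]
    if op == "AND":
        return vals[0] and vals[1]
    if op == "OR":
        return vals[0] or vals[1]
    if op == "NOT":
        return not vals[0]
    if op == "IF":
        return vals[1] if vals[0] else vals[2]
    raise ValueError(f"unknown function: {op}")


def signature(tree):
    """Truth table over all assignments; equal for equivalent programs."""
    return tuple(
        bool(evaluate(tree, dict(zip(VARIABLES, bits))))
        for bits in product([False, True], repeat=len(VARIABLES))
    )


a = ("AND", "D0", "D0")
b = "D0"
c = ("OR", "D0", ("AND", "D0", "D1"))
print(signature(a) == signature(b) == signature(c))  # True
```

Three syntactically different programs map to the same signature, so counting distinct signatures in a population counts distinct behaviours.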

13
So what do ROBDDs look like? 1/2
  • Consider the program IF A0 D0 D1

(Diagram: ROBDD for IF A0 D0 D1, with decision nodes A0, D0 and D1 and terminal nodes 0 and 1.)

14
What do ROBDDs look like? 2/2
  • Consider the program AND A1 A1
  • AND A1 A1 reduces to A1 when the reduction
    mechanism is applied.

(Diagram: ROBDD for A1, with a single decision node A1 and terminal nodes 0 and 1.)

15
What Can ROBDDs Tell Us?
  • A node count of zero (nodes are the circles on
    the diagram; in the GP context, the terminals)
    implies that the ROBDD represents a tautology.
  • A sat count of 1 or 0 implies that the ROBDD
    represents a tautology of true or false
    respectively.
  • We can also deduce that, with only two nodes and
    a sat count of 0.25 or 0.75, the functions are AND
    and OR respectively. Using these details, we can
    establish whether a behaviour is simplistic or
    not.
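
The node counts and sat counts quoted above can be reproduced with a small hand-rolled ROBDD builder. This is a sketch under stated assumptions, not the tooling used in the presentation: build_robdd and sat_fraction are hypothetical helpers, the variable order is fixed by the caller, construction enumerates every assignment (so it only suits small examples), and the sat count is expressed as the fraction of satisfying assignments.

```python
def build_robdd(func, variables):
    """Build a reduced ordered BDD for func by Shannon expansion.

    Returns (root, nodes), where nodes maps id -> (variable, low, high)
    and ids 0 and 1 are the terminal nodes.
    """
    unique = {}  # (variable, low, high) -> id, so identical nodes are shared
    nodes = {}

    def mk(var, low, high):
        if low == high:  # the test on var is redundant: skip the node
            return low
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
            nodes[unique[key]] = key
        return unique[key]

    def expand(index, assignment):
        if index == len(variables):
            return int(bool(func(**assignment)))
        var = variables[index]
        low = expand(index + 1, {**assignment, var: False})
        high = expand(index + 1, {**assignment, var: True})
        return mk(var, low, high)

    return expand(0, {}), nodes


def sat_fraction(node, nodes):
    """Fraction of input assignments for which the function is true."""
    if node in (0, 1):
        return float(node)
    _, low, high = nodes[node]
    return (sat_fraction(low, nodes) + sat_fraction(high, nodes)) / 2


examples = {
    "AND A0 A1": lambda A0, A1: A0 and A1,            # 2 nodes, 0.25
    "OR A0 A1": lambda A0, A1: A0 or A1,              # 2 nodes, 0.75
    "AND A1 A1": lambda A0, A1: A1 and A1,            # 1 node, 0.5
    "OR A0 (NOT A0)": lambda A0, A1: A0 or not A0,    # 0 nodes, 1.0
}
for name, func in examples.items():
    root, nodes = build_robdd(func, ["A0", "A1"])
    print(name, len(nodes), sat_fraction(root, nodes))
```

The third example shows the reduction at work: AND A1 A1 collapses to the single node for A1. The last example has no decision nodes at all and a sat fraction of 1, which is how a tautology shows up.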

16
Semantic Analysis of the Output of RHH
  • The experiment used in this case is a three bit
    multiplexer.
  • The objective is that one control bit specifies
    which input to return from the other two bits.
  • This is a very simple experiment, but it gives us
    a manageable search space of 256 behaviours to
    work with.
  • We used RHH initialisations to generate
    populations of varying sizes and counted the
    numbers of unique behaviours.
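
The figure of 256 behaviours follows from the size of the truth table: three Boolean inputs give 2^3 = 8 input rows, and each row can independently map to 0 or 1, giving 2^8 = 256 possible functions. A short Python sketch of this arithmetic and of the target behaviour follows. The function name multiplexer and the convention that A0 = 1 selects D0 are assumptions for the example.

```python
from itertools import product


def multiplexer(a0, d0, d1):
    """Three bit multiplexer written as IF A0 D0 D1."""
    return d0 if a0 else d1


# All 8 combinations of (A0, D0, D1), in counting order.
inputs = list(product([0, 1], repeat=3))
truth_table = tuple(multiplexer(*bits) for bits in inputs)

# Each of the 8 rows can independently be 0 or 1.
behaviour_count = 2 ** len(inputs)

print(truth_table)       # (0, 1, 0, 1, 0, 0, 1, 1)
print(behaviour_count)   # 256
```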

17
3 Bit Multiplexer and Ramped Half and Half

18
RHH Bias

19
Simplistic Output
  • We can see a bias in the output of the Ramped
    Half and Half technique towards small simple
    programs and undesirable tautologies.
  • We can also see that some behaviours are
    difficult to generate, even with populations of
    3000 programs for a search space of 256 behaviours.
  • We repeated this experiment on a 6 bit
    multiplexer system.
  • We saw the same kind of bias towards simplistic
    behaviours as well as significant behaviour
    duplication.

20
A New Approach
  • We know that randomly throwing together syntax
    does not produce 100% useful output.
  • We need an algorithm capable of building more
    complex behaviour quickly.
  • We need an algorithm that is driven by behaviour
    and not syntax.
  • We need an algorithm that encourages maximum
    diversity through unique behaviour.
  • We need an algorithm that will not produce
    undesirable effects such as tautologies.

21
The State Differential Algorithm
  • Step 1: we do one run with a Full generation
    technique (depth 4) and capture the unique
    behaviours in ROBDD form.
  • Step 2: from the unique behaviours obtained in
    step 1, we randomly select ROBDDs and combine them
    at the root using a random function, until we
    obtain the desired population of unique behaviours.
  • Step 3: we translate the ROBDDs back to Boolean code.
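
The combine-until-unique loop can be sketched in Python. This is a loose illustration and not the State Differential Algorithm as implemented by the presenter. It makes several assumptions: behaviours are stored as 8-bit truth tables rather than ROBDDs, the seed pool is just the three input variables rather than the output of a Full depth-4 run, the function set is hypothetical, and program text is carried alongside each behaviour instead of being translated back afterwards.

```python
import random

N_VARS = 3
ROWS = 2 ** N_VARS            # 8 truth-table rows
FULL_MASK = (1 << ROWS) - 1   # bit mask of the constant-true behaviour


def variable_table(index):
    """Truth table, as a bit mask, of input variable number index."""
    return sum(1 << row for row in range(ROWS) if (row >> index) & 1)


# Hypothetical function set: name -> (arity, operation on bit masks).
FUNCTIONS = {
    "AND": (2, lambda a, b: a & b),
    "OR": (2, lambda a, b: a | b),
    "NOT": (1, lambda a: ~a & FULL_MASK),
    "IF": (3, lambda c, t, e: (c & t) | (~c & e & FULL_MASK)),
}


def state_differential_sketch(seeds, target_size, rng):
    """Extend seeds (truth table -> program text) with unique behaviours.

    Assumes the seeds can reach target_size distinct non-constant
    behaviours under FUNCTIONS; otherwise the loop would not terminate.
    """
    assert target_size <= FULL_MASK - 1  # 254 non-constant behaviours exist
    pool = dict(seeds)
    while len(pool) < target_size:
        name = rng.choice(list(FUNCTIONS))
        arity, op = FUNCTIONS[name]
        # Combine randomly chosen existing behaviours under a new root.
        args = [rng.choice(list(pool)) for _ in range(arity)]
        table = op(*args)
        if table in (0, FULL_MASK) or table in pool:
            continue  # reject tautologies and repeated behaviours
        pool[table] = "(" + " ".join([name] + [pool[a] for a in args]) + ")"
    return pool


seeds = {variable_table(i): name for i, name in enumerate(["A0", "D0", "D1"])}
population = state_differential_sketch(seeds, 50, random.Random(0))
print(len(population))  # 50
```

Every entry in the resulting pool has a distinct truth table and none is constant, which mirrors the two properties the approach is designed to guarantee.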

22
Experiments Comparing SDA to RHH

23
Reverse Engineering
  • The State Differential Algorithm results in
    semantically unique output with no tautologies.
  • If we think back to the 3 Bit Multiplexer with
    the small search space of 256, we should be able
    to generate all behaviours.
  • We can check these for bias for particular types
    of functions.
  • More specifically we can look at the frequency of
    root functions.

24
Reverse Engineering Results

25
Conclusions
  • The existing Ramped Half and Half technique
    applies a bias to the output programs, producing
    small and simplistic programs.
  • The assumption that randomness in syntax
    selection is required to generate a variety of
    behaviours is simply not always true.
  • Behaviourally driven population creation using
    the State Differential Algorithm demonstrates a
    significant improvement in performance and speed
    for comparable results.

26
Future Work
  • We are planning to work on mechanisms to enable
    us to utilise this approach for non-Boolean
    problems.
  • We have implemented systems to apply semantic
    control at other stages of Genetic Programming
    and have had some success.

27
  • Questions