Behavioral Diversity in Genetic Programming Starting Populations

Provided by: Cam53

1
Behavioral Diversity in Genetic Programming
Starting Populations

Lawrence Beadle

2
Introduction 1/2

A brief introduction to Genetic Programming (GP).
Existing methods to create starting populations.
Why are starting populations important in genetic
programming?
A brief introduction to Reduced Ordered Binary
Decision Diagrams (ROBDDs).
Semantic analysis of the output of the existing
methods.

3
Introduction 2/2

The State Differential Algorithm
Results using the State Differential Algorithm
Reverse Engineering a Search Space
Conclusions
Future work.

4
An Introduction to Genetic Programming

[Flowchart with the stages: Initialise Starting
Population, Evaluate Fitness, Perform Crossover,
Select Best Programs, GP Complete.]
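The stages named on this slide can be sketched as a generic loop. This is a minimal illustration only: the function names (`evolve`, `init_bits`, `one_point`), the elitist selection scheme, and the toy bit-list "programs" are all assumptions made for the example, not details taken from the presentation.

```python
import random

def evolve(init, fitness, crossover, is_done, keep=10, max_gens=50, rng=None):
    """Generic loop: initialise, evaluate, select the best, cross over, repeat."""
    rng = rng or random.Random(0)
    population = init(rng)                       # initialise starting population
    for _ in range(max_gens):
        ranked = sorted(population, key=fitness, reverse=True)  # evaluate fitness
        if is_done(ranked[0]):                   # GP complete
            break
        parents = ranked[:keep]                  # select best programs
        children = [crossover(rng.choice(parents), rng.choice(parents), rng)
                    for _ in range(len(population) - keep)]     # perform crossover
        population = parents + children
    return max(population, key=fitness)

# Toy stand-in for programs (an assumption): 16-bit lists scored by their ones.
def init_bits(rng):
    return [[rng.randint(0, 1) for _ in range(16)] for _ in range(30)]

def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

best = evolve(init_bits, sum, one_point, lambda p: sum(p) == 16)
```

Because the selected parents are carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next.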
5
Why are Starting Populations Important in GP?

They provide a variety of syntactic and semantic
material to be used as components for programs
evolved using GP. Variety is key.
Previous work identified that syntactic bias
present in programs at the start of GP runs can
have a dramatic impact on the performance of the
GP.
The size and shape of the programs in a starting
population are important, as they will have an
impact on the size of the programs as they evolve.

6
Existing Population Generation Methods

The three most commonly used initialisation
methods are:
Grow
Full
Ramped Half and Half

7
But First, a Little More Background...

Terminals: input variables, for example a Boolean
variable or a number, depending on the problem
being tackled.
Functions: some form of operation, for example
IF, AND, +, -, depending on the problem being
tackled.
In this work, functions and terminals are
represented in a tree form.

8
The Grow Starting Population

A maximum program depth is defined (commonly 6).
Grow randomly selects functions or terminals
until the tree depth reaches 5, at which point it
selects terminals only, so that the depth does
not exceed 6.
If functions outnumber terminals, then selecting
functions and terminals with uniform probability
biases the tree towards filling to full depth (6).
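A minimal sketch of Grow as described on this slide. The function set, the terminal names, the nested-tuple tree representation, and the convention that the root sits at depth 1 are assumptions chosen for illustration.

```python
import random

# Hypothetical primitive sets: function name -> arity, plus terminal names.
FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}
TERMINALS = ["A0", "D0", "D1"]

def grow(max_depth, rng):
    """Build a tree as nested tuples; a terminal is a plain string."""
    if max_depth <= 1:
        return rng.choice(TERMINALS)          # depth limit reached: terminals only
    primitive = rng.choice(list(FUNCTIONS) + TERMINALS)  # uniform over all primitives
    if primitive in TERMINALS:
        return primitive                      # branch stops early
    arity = FUNCTIONS[primitive]
    return (primitive,) + tuple(grow(max_depth - 1, rng) for _ in range(arity))

def depth(tree):
    """Number of nodes on the longest root-to-leaf path."""
    if isinstance(tree, str):
        return 1
    return 1 + max(depth(child) for child in tree[1:])

example = grow(6, random.Random(0))
```

With four functions and three terminals in the pool, a function is picked slightly more often than a terminal at each step, which is the kind of bias towards deeper trees that the slide mentions.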

9
The Full Starting Population

Again a maximum program depth is defined
(commonly 6).
Full will fill the tree such that all branches of
the tree will reach the maximum depth.
This will result in all the programs being the
same size and shape. Is this realistic in terms
of a solution?
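A minimal sketch of Full, under the assumption that every function takes two arguments (the function and terminal names here are hypothetical). With a single arity, every generated tree has exactly the same size and shape, as the slide points out.

```python
import random

# Hypothetical primitive sets; all functions are binary in this sketch.
FUNCTIONS = {"AND": 2, "OR": 2}
TERMINALS = ["A0", "D0", "D1"]

def full(max_depth, rng):
    """Pick functions only until the depth limit, then terminals only."""
    if max_depth <= 1:
        return rng.choice(TERMINALS)
    name = rng.choice(list(FUNCTIONS))
    return (name,) + tuple(full(max_depth - 1, rng) for _ in range(FUNCTIONS[name]))

def size(tree):
    """Total number of nodes (functions plus terminals) in the tree."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(size(child) for child in tree[1:])

rng = random.Random(0)
sizes = {size(full(4, rng)) for _ in range(50)}
```

For depth 4 and binary functions, every tree has 1 + 2 + 4 + 8 = 15 nodes, so `sizes` contains a single value.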

10
Ramped Half and Half

Ramped Half and Half is a combination of Full and
Grow with varying maximum depths within a range
(for example, depths 2 to 6).
20% of the population will be initialised to
depth 2, the next 20% to depth 3 and so on until
the final 20% is depth 6.
Of each 20%, half the trees are generated using
the Full method and half using the Grow method.
This is one of the most commonly used generation
techniques in GP.
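A minimal sketch of Ramped Half and Half as described above: an equal share of the population per depth, alternating between Full and Grow within each share. The primitive sets, the tree representation, and the helper names are assumptions for illustration.

```python
import random

# Hypothetical primitive sets: function name -> arity, plus terminal names.
FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}
TERMINALS = ["A0", "D0", "D1"]

def make_tree(max_depth, rng, method):
    """method is "full" (functions until the limit) or "grow" (any primitive)."""
    if max_depth <= 1:
        return rng.choice(TERMINALS)
    pool = list(FUNCTIONS) if method == "full" else list(FUNCTIONS) + TERMINALS
    primitive = rng.choice(pool)
    if primitive in TERMINALS:
        return primitive
    arity = FUNCTIONS[primitive]
    return (primitive,) + tuple(make_tree(max_depth - 1, rng, method)
                                for _ in range(arity))

def depth(tree):
    if isinstance(tree, str):
        return 1
    return 1 + max(depth(child) for child in tree[1:])

def ramped_half_and_half(pop_size, rng, min_depth=2, max_depth=6):
    depths = range(min_depth, max_depth + 1)
    population = []
    for i in range(pop_size):
        limit = depths[i * len(depths) // pop_size]  # equal share per depth
        method = "full" if i % 2 == 0 else "grow"    # half Full, half Grow
        population.append(make_tree(limit, rng, method))
    return population

population = ramped_half_and_half(100, random.Random(0))
```

With 100 programs and depths 2 to 6, each depth receives 20 programs, 10 from Full and 10 from Grow.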

11
Syntax Vs Semantics

What do these generation techniques actually
produce?
A number of papers have focused on controlling
the syntax of the output population, but only one
has studied semantic output from starting
population generation techniques.
We have taken this further by not only counting
semantically equivalent programs but establishing
whether there are repeated behaviours produced by
the Ramped Half and Half technique.

12
Measuring Semantics Using ROBDDs

Reduced Ordered Binary Decision Diagrams allow us
to represent behaviour in a canonical form.
This is important because whilst there can be
many syntactic representations for one behaviour,
there is only one ROBDD for a particular
behaviour (for a fixed variable ordering).
Not only can we count the number of times a
specific behaviour is represented in syntax form,
we can also detect other useful (or not so
useful) properties of a behaviour.

13
So what do ROBDDs look like? 1/2

Consider the program IF A0 D0 D1

[Figure: ROBDD for IF A0 D0 D1, with decision
nodes A0, D1 and D0 and terminal nodes 0 and 1.]
14
What do ROBDDs look like? 2/2

Consider the program AND A1 A1
AND A1 A1 reduces to A1 when the reduction
mechanism is applied.

[Figure: ROBDD for AND A1 A1, with the single
decision node A1 and terminal nodes 0 and 1.]
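The canonical-form property can be illustrated with a small ROBDD builder. This is a sketch under stated assumptions, not the tooling used in the presentation: it enumerates all assignments (fine for a handful of variables), represents a decision node as the tuple `(variable, low, high)`, and reads `IF A0 D0 D1` as "if A0 then D0 else D1".

```python
def build(f, variables):
    """Reduced ordered BDD of the Boolean function f under a fixed variable order.

    Terminals are False/True; a decision node is (variable, low, high).
    Equal sub-diagrams are equal tuples, so equal behaviours give equal results.
    """
    def expand(i, env):
        if i == len(variables):
            return bool(f(env))
        low = expand(i + 1, {**env, variables[i]: False})
        high = expand(i + 1, {**env, variables[i]: True})
        if low == high:              # the test on this variable is redundant
            return low
        return (variables[i], low, high)
    return expand(0, {})

# IF A0 D0 D1, read here as "if A0 then D0 else D1" (an assumption).
if_bdd = build(lambda e: e["D0"] if e["A0"] else e["D1"], ["A0", "D0", "D1"])

# AND A1 A1 and plain A1 are different syntax for the same behaviour.
and_bdd = build(lambda e: e["A1"] and e["A1"], ["A1"])
a1_bdd = build(lambda e: e["A1"], ["A1"])
```

In this sketch `if_bdd` has the three decision nodes A0, D1 and D0, and `and_bdd` is identical to `a1_bdd`, mirroring the reduction described on the slide.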
15
What Can ROBDDs Tell Us?

A node count of zero (nodes being the circles on
the diagram, which in the GP context correspond
to the terminals) would imply the ROBDD
represents a tautology.
A sat count of 1 or 0 would imply the ROBDD
represents a tautology of true or false
respectively.
We can also deduce that, with only two nodes and
a sat count of 0.25 or 0.75, the functions are
AND and OR respectively. Using these details we
can establish whether a behaviour is simplistic
or not.
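The node count and sat count checks can be sketched as follows. The tuple-based diagram representation and the names `build`, `node_count` and `sat_fraction` are assumptions for illustration; the sat count is taken to be the fraction of input assignments that evaluate to true, matching the 0.25 and 0.75 values above.

```python
def build(f, variables):
    """Reduced ordered BDD as nested tuples (variable, low, high); terminals are bools."""
    def expand(i, env):
        if i == len(variables):
            return bool(f(env))
        low = expand(i + 1, {**env, variables[i]: False})
        high = expand(i + 1, {**env, variables[i]: True})
        return low if low == high else (variables[i], low, high)
    return expand(0, {})

def node_count(bdd):
    """Number of distinct decision nodes (terminals are not counted)."""
    seen = set()
    def walk(node):
        if isinstance(node, tuple) and node not in seen:
            seen.add(node)
            walk(node[1])
            walk(node[2])
    walk(bdd)
    return len(seen)

def sat_fraction(bdd):
    """Fraction of assignments for which the behaviour is true."""
    if not isinstance(bdd, tuple):
        return 1.0 if bdd else 0.0
    return 0.5 * sat_fraction(bdd[1]) + 0.5 * sat_fraction(bdd[2])

ORDER = ["A0", "D0"]
and_bdd = build(lambda e: e["A0"] and e["D0"], ORDER)
or_bdd = build(lambda e: e["A0"] or e["D0"], ORDER)
always_true = build(lambda e: e["A0"] or not e["A0"], ORDER)
```

Here `and_bdd` and `or_bdd` each have two nodes with sat fractions 0.25 and 0.75, while `always_true` collapses to a terminal with no decision nodes.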

16
Semantic Analysis of the Output of RHH

The experiment used in this case is a 3 bit
multiplexer.
The objective is that one control bit specifies
which input to return from the other two bits.
This is a very simple experiment, but it gives us
a manageable search space of 256 behaviours to
work with (2^8, one output bit for each of the 8
possible input combinations).
We used RHH initialisations to generate
populations of varying sizes and counted the
number of unique behaviours.
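A sketch of this kind of count. Note the substitution: instead of ROBDDs it uses the 8-row truth table of each program as the canonical behaviour signature, which is equivalent for three inputs. The function set, terminal names, and helper names are assumptions, not the experimental setup from the presentation.

```python
import itertools
import random
from collections import Counter

FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}   # hypothetical function set
TERMINALS = ["A0", "D0", "D1"]                        # one control bit, two data bits

def make_tree(max_depth, rng, method):
    if max_depth <= 1:
        return rng.choice(TERMINALS)
    pool = list(FUNCTIONS) if method == "full" else list(FUNCTIONS) + TERMINALS
    primitive = rng.choice(pool)
    if primitive in TERMINALS:
        return primitive
    return (primitive,) + tuple(make_tree(max_depth - 1, rng, method)
                                for _ in range(FUNCTIONS[primitive]))

def evaluate(tree, env):
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    vals = [evaluate(arg, env) for arg in args]
    if op == "AND":
        return vals[0] and vals[1]
    if op == "OR":
        return vals[0] or vals[1]
    if op == "NOT":
        return not vals[0]
    return vals[1] if vals[0] else vals[2]            # IF cond then else

def behaviour(tree):
    """Truth table over all 8 inputs: a canonical signature of the behaviour."""
    return tuple(evaluate(tree, dict(zip(TERMINALS, bits)))
                 for bits in itertools.product([False, True], repeat=3))

def count_behaviours(pop_size, seed=0):
    rng = random.Random(seed)
    # Depths 2..6 in equal shares, alternating Full and Grow within each depth.
    population = [make_tree(2 + i % 5, rng, "full" if (i // 5) % 2 == 0 else "grow")
                  for i in range(pop_size)]
    return Counter(behaviour(tree) for tree in population)

counts = count_behaviours(500)
unique_behaviours = len(counts)
```

`counts.most_common(5)` then lists the most heavily duplicated behaviours, which is one way to look for the bias discussed on the following slides.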

17
3 Bit Multiplexer and Ramped Half and Half
18
RHH Bias
19
Simplistic Output

We can see a bias in the output of the Ramped
Half and Half technique towards small, simple
programs and undesirable tautologies.
We can also see that some behaviours are
difficult to generate, even with populations of
3000 programs for a space of only 256 behaviours.
We repeated this experiment on a 6 bit
multiplexer system.
We saw the same kind of bias towards simplistic
behaviours as well as significant behaviour
duplication.

20
A New Approach

We know that randomly throwing together syntax
does not produce 100% useful output.
We need an algorithm capable of building more
complex behaviour quickly.
We need an algorithm that is driven by behaviour
and not syntax.
We need an algorithm that encourages maximum
diversity through unique behaviour.
We need an algorithm that will not produce
undesirable effects such as tautologies.

21
The State Differential Algorithm

1. We do one run with a full generation (depth 4)
technique and capture the unique behaviours in
ROBDD form.
2. We then take the unique behaviours obtained
from step 1, randomly select ROBDDs and combine
them at the root using a random function, until
we obtain the desired population of unique
behaviours.
3. We translate the ROBDDs back to Boolean code.
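The steps above can be sketched roughly as follows. This is not the author's implementation. It substitutes truth-table signatures for ROBDDs, keeps a syntax tree alongside each behaviour instead of translating back at the end, and rejects constant behaviours explicitly. The primitive sets, the seed population size, and all function names are assumptions.

```python
import itertools
import random

FUNCTIONS = {"AND": 2, "OR": 2, "NOT": 1, "IF": 3}   # hypothetical function set
TERMINALS = ["A0", "D0", "D1"]
ROWS = list(itertools.product([False, True], repeat=len(TERMINALS)))

def apply_op(op, v):
    if op == "AND":
        return v[0] and v[1]
    if op == "OR":
        return v[0] or v[1]
    if op == "NOT":
        return not v[0]
    return v[1] if v[0] else v[2]                     # IF cond then else

def full(depth, rng):
    if depth <= 1:
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    return (op,) + tuple(full(depth - 1, rng) for _ in range(FUNCTIONS[op]))

def behaviour(tree):
    """Truth table of the tree: one output per row of ROWS."""
    if isinstance(tree, str):
        column = TERMINALS.index(tree)
        return tuple(row[column] for row in ROWS)
    columns = [behaviour(child) for child in tree[1:]]
    return tuple(apply_op(tree[0], vals) for vals in zip(*columns))

def is_constant(signature):
    return len(set(signature)) == 1

def state_differential(target, seed_size=50, seed=0, max_tries=100_000):
    rng = random.Random(seed)
    pool = {}                                         # behaviour -> one program for it
    for _ in range(seed_size):                        # step 1: full, depth 4
        tree = full(4, rng)
        signature = behaviour(tree)
        if not is_constant(signature):
            pool.setdefault(signature, tree)
    for _ in range(max_tries):                        # step 2: combine at the root
        if len(pool) >= target:
            break
        op = rng.choice(list(FUNCTIONS))
        parts = rng.choices(list(pool), k=FUNCTIONS[op])
        combined = tuple(apply_op(op, vals) for vals in zip(*parts))
        if not is_constant(combined) and combined not in pool:
            pool[combined] = (op,) + tuple(pool[sig] for sig in parts)
    return pool

population = state_differential(100)
```

Each new behaviour is computed directly from the behaviours of its operands, so a candidate can be accepted or rejected without first building and evaluating its syntax tree.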

22
Experiments Comparing SDA to RHH
23
Reverse Engineering

The State Differential Algorithm results in
semantically unique output with no tautologies.
If we think back to the 3 Bit Multiplexer with
the small search space of 256, we should be able
to generate all behaviours.
We can check these for bias towards particular
types of functions.
More specifically we can look at the frequency of
root functions.

24
Reverse Engineering Results
25
Conclusions

The existing Ramped Half and Half technique
applies a bias to the output programs, producing
small and simplistic programs.
The assumption that randomness in syntax
selection is required to generate a variety of
behaviours is simply not always true.
Behaviourally driven population creation using
the State Differential Algorithm demonstrates a
significant improvement in performance and speed
for comparable results.

26
Future Work

We are planning to work on mechanisms to enable
us to utilise this approach for non-Boolean
problems.
We have implemented systems to apply semantic
control at other stages of Genetic Programming
and have had some success.