Title: Evolutionary Computing Dialects
1 Evolutionary Computing Dialects
- Presented by A.E. Eiben
- Free University Amsterdam
- with thanks to the EvoNet Training Committee and its Flying Circus
2 Contents
- General formal framework
- Dialects
- genetic algorithms
- evolution strategies
- evolutionary programming
- genetic programming
- Beyond dialects
3 General EA Framework
- t ← 0
- initialize P(0) = {a_1(0), ..., a_n(0)}
- evaluate F(a_1(0)), ..., F(a_n(0))
- while (stop(P(t)) ≠ true)
- recombine: P'(t) ← r_par(r)(P(t))
- mutate: P''(t) ← m_par(m)(P'(t))
- evaluate F(a''_1(t)), ..., F(a''_n(t))
- select: P(t+1) ← s_par(s)(P''(t) ∪ Q)
- t ← t + 1
- where
- F is the fitness function
- r, m, s are the recombination, mutation, and selection operators
- par(x) contains the parameters of operator x
- Q is either ∅ or P(t)
(a runnable sketch of this loop is given below)
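The loop above can be made concrete in a few lines of Python. This is a minimal sketch, not code from the lecture: the OneMax fitness, the parameter values, and the deterministic survivor selection are all illustrative assumptions.

```python
import random

L, N, PC, PM = 20, 30, 0.7, 0.05    # illustrative values, not from the slides

def fitness(a):                     # F: OneMax (count 1-bits), an assumed toy problem
    return sum(a)

def recombine(pop):                 # r: 1-point crossover with probability PC
    random.shuffle(pop)
    out = []
    for p1, p2 in zip(pop[::2], pop[1::2]):
        if random.random() < PC:
            cut = random.randrange(1, L)
            p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
        out += [p1, p2]
    return out

def mutate(pop):                    # m: bit-flip with probability PM per gene
    return [[1 - g if random.random() < PM else g for g in a] for a in pop]

def select(pop):                    # s: keep the N best of the pool
    return sorted(pop, key=fitness, reverse=True)[:N]

P = [[random.randint(0, 1) for _ in range(L)] for _ in range(N)]  # P(0)
for t in range(50):                 # stop(P(t)): a fixed generation budget here
    offspring = mutate(recombine(list(P)))
    P = select(offspring + P)       # Q = P(t): a "plus" strategy
print(max(fitness(a) for a in P))
```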
4 From framework to dialects
- In theory
- every EA is an instantiation of this framework, thus
- specifying a dialect only requires filling in the characteristic features
- In practice
- this would be too formalistic
- there are many exceptions (EAs not fitting into this framework)
5 Genetic algorithm(s)
- Developed: USA in the 1970s
- Early names: J. Holland, K. DeJong, D. Goldberg
- Typically applied to
- discrete optimization
- Attributed features
- not too fast
- good solver for combinatorial problems
- Special
- many variants, e.g., reproduction models, operators
- formerly "the GA", nowadays "a GA", GAs
6 GA representation
Required accuracy determines the number of bits needed to represent a trait (variable), as the worked example below shows.
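A small check of this rule, assuming we want to encode a real-valued trait on an interval to a given number of decimal places (the interval and precision below are made up):

```python
import math

# Hypothetical trait: x in [-5.12, 5.12], needed to 4 decimal places.
lb, ub, decimals = -5.12, 5.12, 4
# We need 2**n distinct codes >= (ub - lb) * 10**decimals values.
n_bits = math.ceil(math.log2((ub - lb) * 10**decimals))
print(n_bits)   # 17, since 2**17 = 131072 >= 102400
```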
7 GA crossover (1)
- Crossover is used with probability pc
- 1-point crossover
- Choose a random point on the two parents (same for both)
- Split parents at this crossover point
- Create children by exchanging tails
- n-point crossover
- Choose n random crossover points
- Split along those points
- Glue parts, alternating between parents
- uniform crossover
- Assign 'heads' to one parent, 'tails' to the other
- Flip a coin for each gene of the first child
- Make an inverse copy of the gene for the second child
(all three variants are sketched below)
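A minimal sketch of the three variants on bit-list parents; the function names are ours, not from the slides:

```python
import random

def one_point(p1, p2):
    cut = random.randrange(1, len(p1))            # same point for both parents
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def n_point(p1, p2, n):
    cuts = sorted(random.sample(range(1, len(p1)), n))
    c1, c2, prev = [], [], 0
    for i, cut in enumerate(cuts + [len(p1)]):
        a, b = (p1, p2) if i % 2 == 0 else (p2, p1)   # alternate between parents
        c1 += a[prev:cut]
        c2 += b[prev:cut]
        prev = cut
    return c1, c2

def uniform(p1, p2):
    flips = [random.random() < 0.5 for _ in p1]   # one coin flip per gene
    c1 = [a if f else b for f, a, b in zip(flips, p1, p2)]
    c2 = [b if f else a for f, a, b in zip(flips, p1, p2)]   # inverse copy
    return c1, c2
```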
8 GA crossover (2)
9 GA mutation
- Mutation
- Alter each gene independently with a probability pm
- Relatively large chance of not being mutated at all
- (exercise: L = 100, pm = 1/L; worked out below)
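The exercise can be checked directly: with pm = 1/L the expected number of mutated genes is one, and the chance that nothing changes is (1 - 1/L)^L, close to 1/e:

```python
# Slide exercise: L = 100, pm = 1/L.
L = 100
pm = 1 / L
print((1 - pm) ** L)   # ~0.366: roughly a 37% chance of no mutation at all
```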
10 GA crossover OR mutation?
- If we define distance in the search space as Hamming distance, then
- Crossover is explorative: it makes a big jump to an area somewhere in between two (parent) areas.
- Mutation is exploitative: it creates random small variations, thereby staying near the parent.
- To hit the optimum you often need a lucky mutation.
- GA community: crossover is mission critical.
11 GA selection
- Basic idea: fitness proportional selection.
- Implementation: roulette wheel technique.
- Assign to each individual a part of the roulette wheel (size proportional to its fitness).
- Spin the wheel n times to select n individuals.
- Example (coded below)
- f → max
- 3 individuals
- f(A) = 6, f(B) = 5, f(C) = 1
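A sketch of the roulette wheel on the slide's three individuals (assuming higher fitness is better, as f → max indicates):

```python
import random

fitness = {"A": 6, "B": 5, "C": 1}   # the slide's example; total wheel size 12

def roulette(fit, n):
    names = list(fit)
    total = sum(fit.values())
    picks = []
    for _ in range(n):               # spin the wheel n times
        r = random.uniform(0, total)
        acc = 0.0
        for name in names:
            acc += fit[name]         # each name owns a slice proportional to fitness
            if r <= acc:
                picks.append(name)
                break
    return picks

print(roulette(fitness, 12))   # A expected ~6/12 of the picks, B ~5/12, C ~1/12
```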
12 GA reproduction cycle
1. Select parents for the mating pool (size of mating pool equals population size).
2. Shuffle the mating pool.
3. For each consecutive pair apply crossover with probability pc.
4. For each new-born apply mutation.
5. Replace the whole population by the mating pool.
(the full cycle is sketched below)
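The five steps as a Python sketch; the fitness-proportional selection, 1-point crossover, and bit-flip mutation reuse the earlier conventions and are assumptions about the unspecified details:

```python
import random

def generation(pop, fitness, pc=0.7, pm=0.01):
    weights = [fitness(a) for a in pop]
    # 1. select a mating pool as large as the population (fitness proportional)
    pool = [list(a) for a in random.choices(pop, weights=weights, k=len(pop))]
    random.shuffle(pool)                     # 2. shuffle the mating pool
    for i in range(0, len(pool) - 1, 2):     # 3. crossover per consecutive pair
        if random.random() < pc:
            cut = random.randrange(1, len(pool[i]))
            pool[i][cut:], pool[i+1][cut:] = pool[i+1][cut:], pool[i][cut:]
    # 4. mutate every new-born
    pool = [[1 - g if random.random() < pm else g for g in a] for a in pool]
    return pool                              # 5. replace the whole population
```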
13 GA Goldberg '89 example (1)
14 GA Goldberg '89 example (2)
15 Evolution strategies
- Developed: Germany in the 1970s
- Early names: I. Rechenberg, H.-P. Schwefel
- Typically applied to
- numerical optimization
- Attributed features
- fast
- good optimizer for real-valued optimization
- relatively much theory
- Special
- self-adaptation of (mutation) parameters is standard
16 ES representation
- Problem: optimize f : ℝ^n → ℝ
- Phenotype space (solution space): ℝ^n
- Genotype space (individual space)
- object values directly (no encoding)
- strategy parameter values
- standard deviations (σ's) and
- rotation angles (α's) of mutation
- One individual: ⟨x_1, ..., x_n, σ_1, ..., σ_n, α_1, ..., α_k⟩
17 ES mutation (1)
- One step size for all x_i (the same σ in every coordinate direction)
- Individual: (x_1, ..., x_n, σ)
- x_i is mutated by adding some Δx_i drawn from a normal probability distribution
- σ is mutated by a log-normal scheme: multiplying by e^Γ, with Γ drawn from a normal distribution
- σ is mutated first! (why?)
(a sketch follows below)
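A sketch of this scheme; tau is an assumed learning rate (often taken proportional to 1/sqrt(n)):

```python
import math
import random

def es_mutate(ind, tau=0.1):
    *x, sigma = ind                               # individual: (x_1, ..., x_n, sigma)
    sigma *= math.exp(tau * random.gauss(0, 1))   # sigma first: the child's fitness
                                                  # then reflects the step size that
                                                  # actually produced it
    return [xi + sigma * random.gauss(0, 1) for xi in x] + [sigma]
```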
18 ES mutation (2)
- Each x_i (coordinate direction) has its own step size
- Individual: (x_1, ..., x_n, σ_1, ..., σ_n)
- ε_0, τ, τ' are parameters
(sketched below)
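A sketch with per-coordinate step sizes; the learning-rate formulas (τ' proportional to 1/sqrt(2n), τ to 1/sqrt(2 sqrt(n))) and the floor eps0 follow common ES practice and are assumptions, since the slide does not spell them out:

```python
import math
import random

def es_mutate_n(x, sigmas, eps0=1e-8):
    n = len(x)
    tau_prime = 1 / math.sqrt(2 * n)
    tau = 1 / math.sqrt(2 * math.sqrt(n))
    g = random.gauss(0, 1)                    # one common draw shared by all i
    sigmas = [max(s * math.exp(tau_prime * g + tau * random.gauss(0, 1)), eps0)
              for s in sigmas]                # log-normal update, floored at eps0
    x = [xi + s * random.gauss(0, 1) for xi, s in zip(x, sigmas)]
    return x, sigmas
```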
19 ES mutation (3)
Figure: equal probability to place an offspring
20 ES recombination (1)
- Basic ideas
- I^ρ → I: ρ parents yield 1 offspring
- Applied λ times, typically λ >> μ
- Applied to object variables as well as strategy parameters
- Per offspring gene, two corresponding parent genes are involved
- Two ways to recombine two parent alleles
- Discrete recombination: choose one of them randomly
- Intermediate recombination: average the values
- Might involve two or more parents (global recombination)
21 ES recombination (2)
- The standard operator (sketched below)
- For each object variable
- a) Choose two parents
- b) Apply discrete recombination on the corresponding variables
- For each strategy parameter
- a) Choose two parents
- b) Apply intermediate recombination on the corresponding parameters
- Global recombination: re-choosing the two parents anew for each variable (step a above).
- Local recombination: the same two parents for each variable (position i).
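The standard operator as a sketch, with parents represented as (object vector, step-size vector) pairs (our representation choice, not from the slides):

```python
import random

def es_recombine(parents):
    n = len(parents[0][0])                    # each parent is (x, sigmas)
    child_x, child_s = [], []
    for i in range(n):
        p, q = random.sample(parents, 2)      # re-chosen per position: global
        child_x.append(random.choice([p[0][i], q[0][i]]))   # discrete on x_i
        p, q = random.sample(parents, 2)
        child_s.append((p[1][i] + q[1][i]) / 2)             # intermediate on sigma_i
    return child_x, child_s
```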
22 ES recombination (3)
Figure: recombination illustrated (μ = 3)
23 ES selection
- Strictly deterministic, rank based (both variants sketched below)
- The μ best ranks are handled equally
- The μ best individuals survive from
- the λ offspring: (μ, λ) selection
- the parents and the offspring: (μ + λ) selection
- (μ, λ) selection is often preferred because it is
- important for self-adaptation
- applicable also for noisy objective functions and moving optima
- Selective pressure: very high
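Both survivor-selection variants in one sketch (f assumed to be minimized):

```python
def es_survivors(parents, offspring, mu, f, plus=False):
    # (mu + lambda) pools parents and offspring; (mu, lambda) uses offspring only.
    pool = offspring + parents if plus else offspring
    return sorted(pool, key=f)[:mu]   # strictly deterministic, rank based
```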
24 Evolutionary programming
- Developed: USA in the 1960s
- Early names: D. Fogel
- Typically applied to
- traditional EP: machine learning tasks by finite state machines
- contemporary EP: (numerical) optimization
- Attributed features
- very open framework: any representation and mutation ops are OK
- crossbred with ES (contemporary EP)
- consequently, hard to say what standard EP is
- Special
- no recombination
- self-adaptation of parameters is standard (contemporary EP)
25 Traditional EP: Finite State Machines
Initial state: C. Input string: 011101. Output string: 110111. Good predictions: 60%.
Fitness: prediction capability, comparing output_i with input_{i+1} (checked below).
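The 60% on the slide can be reproduced directly: the i-th output symbol is a prediction of the (i+1)-th input symbol, so a 6-symbol string yields 5 checkable predictions:

```python
inp, out = "011101", "110111"
hits = sum(out[i] == inp[i + 1] for i in range(len(inp) - 1))
print(hits / (len(inp) - 1))   # 3 of 5 correct = 60% good predictions
```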
26 EP mutation & crossover
- Mutation
- For FSMs: change a state transition, add a state, etc.
- For numerical optimization: see later
- Crossover: none!
- representation naturally determines the operators
- no crossover between species
27 Modern EP: representation & mutation
- Representation: ⟨x_1, ..., x_n, σ_1, ..., σ_n⟩
- Mutation (sketched below)
- x_i is mutated by adding some Δx_i drawn from a normal probability distribution
- σ is mutated by a normal scheme (ES: log-normal)
- σ_i' = σ_i · (1 + α · N_i(0, 1))
- x_i' = x_i + σ_i' · N_i(0, 1)
- σ is mutated first!
- other probability distributions, e.g., Cauchy, are also applied
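The two update rules as a sketch; the constant alpha and the positivity floor are assumed values, not from the slides:

```python
import random

def ep_mutate(x, sigmas, alpha=0.2, floor=1e-8):
    # sigma first (normal scheme), clipped so step sizes stay positive
    sigmas = [max(s * (1 + alpha * random.gauss(0, 1)), floor) for s in sigmas]
    x = [xi + s * random.gauss(0, 1) for xi, s in zip(x, sigmas)]
    return x, sigmas
```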
28 EP selection
- Stochastic variant of (μ + μ) selection
- P(t): μ parents, P'(t): μ offspring
- Selection by conducting pairwise competitions in round-robin format (sketched below)
- Each solution x ∈ P(t) ∪ P'(t) is evaluated against q other randomly chosen solutions from the population
- For each comparison, a "win" is assigned if x is better than its opponent
- The μ solutions with the greatest number of wins are retained to be parents of the next generation
- Typically q = 10
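A sketch of the round-robin tournament (f assumed to be minimized; opponents drawn without the contestant itself):

```python
import random

def ep_select(parents, offspring, f, mu, q=10):
    pool = parents + offspring               # the (mu + mu) pool
    wins = []
    for i, x in enumerate(pool):
        opponents = random.sample(pool[:i] + pool[i+1:], q)
        wins.append(sum(f(x) < f(o) for o in opponents))   # count wins against q others
    order = sorted(range(len(pool)), key=lambda i: wins[i], reverse=True)
    return [pool[i] for i in order[:mu]]     # mu solutions with most wins survive
```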
29 Genetic programming
- Developed: USA in the 1990s
- Early names: J. Koza
- Typically applied to
- machine learning tasks
- Attributed features
- competes with neural nets and the like
- slow
- needs huge populations (thousands)
- Special
- non-linear chromosomes: trees, graphs
- mutation possible but not necessary (disputed!)
30 GP representation
- Problem domain: modelling (forecasting, regression, classification, data mining, robot control).
- Fitness: the performance on a given (training) data set, e.g., the number of hits/matches/good predictions.
- Representation implied by problem domain, i.e.
- individual = model = parse tree
- parse trees sometimes viewed as LISP expressions ⇒ GP = evolving computer programs
- parse trees sometimes viewed as just-another-genotype ⇒ GP = a GA sub-dialect
31 GP mutation
- Replace a randomly chosen subtree by a randomly generated (sub)tree (sketched together with crossover below)
32 GP crossover
- Exchange randomly selected subtrees in the parents
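Both GP variation operators in one sketch, on parse trees encoded as nested tuples such as ("+", ("x",), ("1",)); the encoding and the grow() argument for generating random subtrees are our assumptions:

```python
import random

def nodes(t, path=()):                  # enumerate all subtree positions
    yield path
    for i, child in enumerate(t[1:], 1):
        yield from nodes(child, path + (i,))

def get(t, path):                       # fetch the subtree at a position
    for i in path:
        t = t[i]
    return t

def replace(t, path, sub):              # rebuild t with the subtree at path swapped
    if not path:
        return sub
    i = path[0]
    return t[:i] + (replace(t[i], path[1:], sub),) + t[i+1:]

def gp_mutate(t, grow):                 # replace a random subtree by a grown one
    return replace(t, random.choice(list(nodes(t))), grow())

def gp_crossover(t1, t2):               # exchange random subtrees of the parents
    p1 = random.choice(list(nodes(t1)))
    p2 = random.choice(list(nodes(t2)))
    return replace(t1, p1, get(t2, p2)), replace(t2, p2, get(t1, p1))
```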
33 GP selection
- Standard GA selection is usual
- Sometimes overselection to increase efficiency (sketched below)
- rank population by fitness and divide it into two groups
- group 1: best c% of population
- group 2: the other (100 - c)%
- when executing selection
- 80% of selection operations choose from group 1
- 20% from group 2
- for pop. size 1000, 2000, 4000, 8000 the proportion c is
- c = 32, 16, 8, 4
- the percentages come from a rule of thumb
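Overselection as a sketch (f assumed to be maximized; one parent is drawn per call):

```python
import random

def overselect(pop, f, c):
    ranked = sorted(pop, key=f, reverse=True)       # rank population by fitness
    split = max(1, len(pop) * c // 100)             # best c% vs the remaining (100-c)%
    group1, group2 = ranked[:split], ranked[split:]
    # 80% of selections come from group 1, 20% from group 2
    return random.choice(group1 if random.random() < 0.8 else group2)

# Rule of thumb from the slide: c = 32, 16, 8, 4 for pop. sizes 1000, 2000, 4000, 8000.
```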
34 Beyond dialects
- Field merging since the early 1990s
- No hard barriers between dialects; many hybrids, outliers
- Choice of dialect should be motivated by the given problem
- Best practical approach: choose representation, operators, and population model pragmatically (and end up with an unclassifiable EA)
- There are general issues for EC as a whole
35 General issues
What is an evolutionary algorithm? (HC, SA, TS)
When does it work? (problem X, setup Y, performance Z)
Why does it work? (Markov chains, schema theory, BBH)
Who invented evolutionary algorithms first? (Turing, Fogel, Holland, Schwefel, ...)
Where are we going next for a nice conference?
36 The end