Evolutionary Computation I

Provided by: jhall9

1
Evolutionary Computation I

COMP4001/7001
5 September 2005

2
Learning Objectives
At the end of this lecture students will understand

What is an evolutionary algorithm?
Effects of evolutionary operators
The course of computational evolution
Interactions between evolution and

4
Why look to evolution?

Evolution is inherently interesting
Evolutionary theories are easy to generate,
almost impossible to test
EC modelling can be used as a proof of principle,
but never as an existence proof
Evolution is an optimization technique
Biological evolution optimizes organisms to their
environments
EC can be used to optimize programs / processes
Evolution is the only process to date which has
produced intelligence
EC can be used in an attempt to understand how
human intelligence works
May be the best hope for AI

5
History of EC

1957 Fraser (Australian geneticist)
bitstring representation of a chromosome and a
stochastic Monte Carlo approach
investigating questions such as the effects of
linkage on the efficiency of selection, the
relationship between the fitnesses of alleles and
factors such as population size and intensity of
selection, and the comparison of the efficiencies
of different breeding plans for varying degrees
of inter-locus interactions
The algorithm was run on SILLIAC, the parallel
computer at the University of Sydney. SILLIAC
(Sydney ILLIAC) was a slightly modified version
of the ILLIAC developed by the University of
Illinois, Urbana, United States. It cost 50,000
to construct, had a store of 1024 40-bit words,
could perform 13,333 additions/subtractions per
second, and read its input off punched paper tape

6
More history

1957 Box
evolutionary optimisation process for the
improvement of processes in a chemical plant,
involving carefully planned variations to the
procedures used in the operation of the plant
itself
1966 Fogel, Owens and Walsh
groundbreaking analysis of the possibilities of
simulated evolution for the development of
artificial intelligence
1975 John Holland
Adaptation in Natural and Artificial Systems: An
Introductory Analysis with Applications to
Biology, Control, and Artificial Intelligence
Classic GA

7
Evolutionary algorithms

All EC algorithms involve
a population of individuals
which undergo repeated generations of genetic
modification, fitness evaluation and
fitness-proportionate selection.
The genetic operators used to perform the
genetic modifications are simplified versions of
those found in biological systems.
Many operators have been described in the
literature
Lots of different flavours of EA
Each makes different decisions about
implementation

9
Operators

Representation
Holland's GA used binary chromosomes
(bitstrings)
Representations ranging from strings of floating
point numbers to entire Lisp programs are used
for different problems by various practitioners
Mutation
Acts to introduce variability into the population
by altering the chromosome
The most usual mutation operator for a bitstring
chromosome consists of flipping a bit from 0 to 1
or vice versa, with a given probability, the
mutation rate.
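
A minimal sketch of the bit-flip mutation operator described above, assuming the chromosome is stored as a Python list of 0s and 1s. The function name `mutate` and the `rng` parameter are illustrative choices, not part of the lecture.

```python
import random

def mutate(chromosome, rate, rng=random):
    """Return a copy in which each bit is flipped independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

# Example: on average one flip per chromosome when rate = 1 / length.
parent = [0, 1, 1, 0, 1, 0, 0, 1]
child = mutate(parent, rate=1 / len(parent))
```

The parent is left untouched, so it can still be used for further breeding in the same generation.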

10
More operators

Crossover
recombines parts of two (or more) chromosomes to
form new individuals
Single point crossover
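
Single-point crossover could be sketched as follows, assuming two equal-length list chromosomes. The names `single_point_crossover`, `mum` and `dad` are hypothetical.

```python
import random

def single_point_crossover(mum, dad, rng=random):
    """Swap the tails of two equal-length chromosomes at one random cut point."""
    if len(mum) != len(dad) or len(mum) < 2:
        raise ValueError("parents must have the same length (at least 2)")
    point = rng.randint(1, len(mum) - 1)  # cut strictly inside the string
    return mum[:point] + dad[point:], dad[:point] + mum[point:]

# Example: crossing all-zeros with all-ones gives two complementary children.
a, b = single_point_crossover([0] * 8, [1] * 8)
```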

11
Selection

Selection should be fitness proportionate
fitter individuals should contribute more to the
next generation, on average, than less fit
individuals
selection method should have an element of
stochasticity so that every individual, no matter
how unfit, has a chance of becoming a parent
If only the fittest individuals in each
generation are allowed to breed the population
rapidly converges to the best solution found
early, which is very unlikely to be the global
best solution
Many different selection algorithms exist, producing
different types of selection pressure
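
One widely used fitness-proportionate scheme is roulette-wheel selection. The sketch below is only one possible implementation and assumes non-negative fitness values; the function name `roulette_select` is illustrative.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one parent with probability proportional to its fitness."""
    if sum(fitnesses) <= 0:
        # Every fitness is zero: fall back to a uniform random choice.
        return rng.choice(population)
    return rng.choices(population, weights=fitnesses, k=1)[0]

# Example: "b" is nine times as fit as "a", so it is chosen about 90% of the time.
parent = roulette_select(["a", "b"], [1, 9])
```

Note that under pure roulette-wheel selection an individual whose fitness is exactly zero is never chosen, so implementations that want every individual to have some chance of breeding often add a small constant to each fitness.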

12
The Simple Genetic Algorithm

13
Other Approaches

Evolutionary Programming (EP)
Developed by Fogel in the early 1960s. EP has no genomic
representation: each individual in the population
is an algorithm chosen at random over an
appropriate sample space. Mutation is the only
genetic operator used; EP does not use crossover
Evolution Strategies (ES)
Developed by Schwefel, also in the 1960s, as an optimisation
tool. ES uses a real-valued chromosome with a
population size of one and mutation as the only
genetic operator. In each generation the parent
is mutated to produce a descendant. If the
descendant is fitter it becomes the parent for
the next generation; otherwise the original
parent is retained.
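
The ES loop described above (one parent, one mutated descendant per generation) might look like the sketch below. Gaussian mutation with a fixed step size `sigma` is an assumption made here for illustration; the slide does not specify the mutation distribution, and the `sphere` test function is likewise hypothetical.

```python
import random

def one_plus_one_es(fitness, parent, sigma=0.1, generations=200, rng=random):
    """Mutate a single real-valued parent; keep the descendant only if it is fitter."""
    best = fitness(parent)
    for _ in range(generations):
        child = [x + rng.gauss(0.0, sigma) for x in parent]
        score = fitness(child)
        if score > best:  # otherwise the original parent is retained
            parent, best = child, score
    return parent, best

def sphere(x):
    """Toy fitness function to maximise: highest (zero) at the origin."""
    return -sum(v * v for v in x)

solution, score = one_plus_one_es(sphere, [1.0, -1.0])
```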

14
And more

Classifier Systems
Holland (1975). A classifier takes inputs from
the environment and produces outputs indicating a
classification of the input events. A classifier
system produces new classifiers through the
action of a genetic algorithm on the system's
population of classifiers
Genetic Programming (GP)
Developed by Koza in the late 1980s. The aim of GP is the
automatic programming of computers, allowing
programs to evolve to solve a given problem. The
population consists of programs expressed as
parse trees. Operators used include crossover,
mutation and architecture-altering operations
patterned after gene duplication and gene
deletion in nature
Many others, often tailored to the problem at hand

16
Fitness landscapes

Wright (1932): for a given set of genes, each
possible combination of gene values (alleles)
could be assigned a fitness value for a
particular set of conditions
Entire genotype space can then be visualized as a
landscape, with genotypes of high fitness
occupying peaks and those of low fitness forming
troughs
Generally very high-dimensional

17
The course of evolution in silico

This EA has a chromosome length of 10 bits and a
population of 10 individuals. The fitness
function is simply a count of the number of 1s in
the chromosome; maximum fitness is therefore 10.
The EA uses elitism, where the fittest individual
in each generation is retained. Elitism ensures
that a good solution, once found, is never lost,
and means that the maximum fitness in the
population never decreases
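
A rough sketch of an EA of this kind is given below. Only the chromosome length (10), population size (10), ones-counting fitness function and elitism come from the slide. The selection scheme, crossover operator, mutation rate and number of generations are assumptions chosen for illustration, and `evolve_onemax` is a hypothetical name.

```python
import random

def evolve_onemax(length=10, pop_size=10, generations=50, rng=random):
    """Evolve bitstrings whose fitness is their number of 1s, with elitism.

    Returns the best fitness seen at the start of each generation.
    """
    rate = 1.0 / length  # assumed: about one mutation per chromosome
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        elite = max(pop, key=sum)
        best_per_generation.append(sum(elite))
        # Assumed selection: fitness-proportionate, +1 so all-zero strings can breed.
        weights = [sum(ind) + 1 for ind in pop]
        next_pop = [elite[:]]  # elitism: the fittest individual is copied unchanged
        while len(next_pop) < pop_size:
            mum, dad = rng.choices(pop, weights=weights, k=2)
            cut = rng.randint(1, length - 1)  # single-point crossover
            child = mum[:cut] + dad[cut:]
            child = [1 - b if rng.random() < rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    return best_per_generation

history = evolve_onemax()
```

Because the elite individual is always carried over, the values in `history` can stay level or rise from one generation to the next, but cannot fall.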
18
Computational evolution

Fitness originally random
Increases over time
Faster at first
Eventually converges to a local optimum
Not necessarily the global optimum
Stochastic, so usually must be repeated
Can be time consuming
Can produce good solutions that work in unexpected ways

19
Schema Theorem

Holland, 1975
short, low-order, above-average schemata receive
exponentially increasing trials in subsequent
generations
If the chromosome is a bit string, a schema is a
set of building blocks described by a template
consisting of ones, zeros and asterisks
Template 10*001*1 can be
10100111
10100101
10000101
10000111
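
Schema matching can be sketched in a few lines, reading the template above as 10\*001\*1, with the asterisk positions inferred from the four listed instances. The helper names `matches`, `order` and `defining_length` are illustrative.

```python
def matches(schema, bitstring):
    """True if `bitstring` is an instance of `schema` ('*' matches either bit)."""
    return len(schema) == len(bitstring) and all(
        s == "*" or s == b for s, b in zip(schema, bitstring)
    )

def order(schema):
    """Number of fixed (non-asterisk) positions in the schema."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """Distance between the first and last fixed positions."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

instances = ["10100111", "10100101", "10000101", "10000111"]
all_match = all(matches("10*001*1", s) for s in instances)
```

A schema with k asterisks has 2^k instances, which is why the two asterisks here give exactly four bitstrings.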

20
Schema theorem

an evolutionary algorithm proceeds by identifying
short schemas of high fitness in different
individuals, and recombining them using crossover
in order to produce longer schemas of higher
fitness, and eventually entire individuals having
high fitness
attractive because it suggests that schemas can
be identified and the effects of mutation and
crossover upon schemas in a population of a given
size can be calculated exactly
mathematical tractability would potentially
provide useful insights into the way in which an
EA functions
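
For reference, one common textbook statement of the schema theorem for a classic GA with fitness-proportionate selection, single-point crossover and bit-flip mutation is given below. The symbols are introduced here and are not taken from the slides: m(H,t) is the number of instances of schema H in generation t, f(H) is their mean fitness, \bar{f} is the mean fitness of the population, \delta(H) is the defining length of H, o(H) is its order, l is the chromosome length, and p_c and p_m are the crossover and per-bit mutation probabilities.

```latex
E\left[m(H, t+1)\right] \;\ge\; m(H, t)\,\frac{f(H)}{\bar{f}}
\left[\,1 - p_c\,\frac{\delta(H)}{l - 1} - o(H)\,p_m\right]
```

Short schemata (small \delta(H)) of low order (small o(H)) lose little to crossover and mutation, so those with above-average fitness (f(H) > \bar{f}) are expected to gain instances from one generation to the next.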

21
Testing schema theory

Royal road functions - Mitchell, Forrest and
Holland (1991)
structured to provide a smooth, easy path to
maximum fitness under the assumptions of schema
theory
hierarchical fitness landscape, in which
crossover between instances of fit lower-order
schemas tends to produce ever fitter higher-order
schemas
However, relatively highly fit intermediate stages could
in fact interfere with the finding of fit
higher-order solutions, since once an instance of
a fit intermediate schema is discovered its
relatively high fitness allows it to spread
quickly throughout the population, carrying with
it hitchhiking genes in positions not included
in the schema. Low-order schemas tend to be
discovered more-or-less sequentially, rather than
in parallel
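
A simplified fitness function in the spirit of the royal road functions can be sketched as follows. It rewards only complete, aligned blocks of 1s, which play the role of the fit lower-order schemas. The block size of 8 and the name `royal_road` are assumptions for illustration, and the hierarchical bonuses for combined blocks are omitted.

```python
def royal_road(bits, block_size=8):
    """Score `block_size` points for every aligned block that consists entirely of 1s."""
    blocks = (bits[i:i + block_size] for i in range(0, len(bits), block_size))
    return sum(
        block_size
        for block in blocks
        if len(block) == block_size and all(block)
    )

# A block with a single 0 contributes nothing, however many 1s it contains.
score = royal_road([1] * 7 + [0] + [1] * 56)
```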

22
Variability

Basis of evolution
Mostly mutation
In EAs, mostly point mutations

23
Mutation

24
Mutation rate

Mutational meltdown: a mutation rate so high
that the species cannot survive in the face of
the number of errors generated
The limit is about 1 mutation per genome per generation, given
that mutations occur at random
This is the maximum rate at which an organism can expect to
produce at least one error-free offspring in its
lifetime
Many EC implementations use a mutation rate of
1 per genome
In RNA viruses, about one nucleotide per genome
is incorrectly reproduced per replication; for
retroviruses the rate is one nucleotide per ten
genomic replications; and for DNA-based microbes
it is about one per 300 replications
Longer genomes do not have higher mutation rates:
error-correcting machinery
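
The arithmetic behind a rate of one mutation per genome can be checked with a short calculation, assuming each site mutates independently with the same probability. The function name `p_error_free` is illustrative.

```python
import math

def p_error_free(genome_length, per_site_rate):
    """Probability that a copy of the genome contains no mutations at all."""
    return (1.0 - per_site_rate) ** genome_length

# With a per-site rate of 1/L (one expected mutation per genome), the chance
# of an error-free copy approaches 1/e as the genome length L grows.
for length in (10, 100, 10_000):
    print(length, p_error_free(length, 1.0 / length), math.exp(-1))
```

So at this rate roughly a third of offspring are exact copies of their parent, whatever the genome length.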

25
Error correction (Ridley, 2000)

Autocopying: the first reproducers were probably
molecules of RNA or something similar, that could
copy themselves using bases from their
environment
Copying enzymes: the evolution of enzymes which
catalysed the copying process would also have
made the process more reliable
Double stranded genetic material: organisms which
used DNA rather than RNA have the advantage of
having a more stable information carrying
molecule, plus the advantage of having two
complementary copies of the sequence, to
facilitate error checking
Suite of proofreading and repair enzymes
Development: a developmental process translating
a genotype into a phenotype allows for the
correction of errors on the fly in the course of
development; not all errors have to be
corrected in the genome
Ploidy: using two or more copies of each
chromosome provides redundancy of the genetic
information, permitting the identification and
correction of errors
Sex: recombination of genetic material from more
than one individual introduces the possibility of
concentrating genetic errors in a small
proportion of scapegoat offspring, allowing the
other offspring to be error-free

26
Neutrality

Before the details of the molecular basis of
genetics were worked out in the late 1950s, it
was generally assumed that most mutations cause
phenotypic alterations that are immediately
subject to selection. Under these circumstances
all the variation in a population is adaptive
Electrophoresis: revealed a huge amount of variability at
the protein level
Motoo Kimura (1968): evolution is driven
primarily by random drift among equally
well-adapted sequence variants
Ohta (1973): nearly neutral variants which do,
in fact, have a small selective difference can
become effectively neutral in small populations,
where random events become more important
Neutral networks in EC have been demonstrated
to affect the course of evolution by facilitating
random drift to more useful areas of the search
space

27
Managing variability

Variability is systematically eroded by
selection, while at the same time being
replenished via mutation and recombination
Different flavours of EA emphasise the importance
of mutation (e.g. evolutionary programming)
versus recombination (e.g. genetic algorithms) in
generating novelty
Effects of selection tend to outweigh those of
mutation and recombination, and the population
converges towards a peak in fitness space
Neutral mutations rarely occur, unless
deliberately designed into the algorithm

28
Premature convergence

In most EAs the entire population eventually
reaches a single peak and tends to stay there
If this peak is not the global maximum, the EA is
considered to have converged prematurely
Premature convergence occurs when the population
loses the genetic variability which is essential
to continued evolution
This almost complete loss of genetic diversity is
never observed in biological populations

29
Causes of premature convergence

Haploid genotype exposes every mutation to
selection
Diploid genotypes have been used, but require a
dominance map or equivalent
EAs using diploid chromosomes do
tend to maintain more genetic variability
than haploid EAs, but they rarely find
better solutions
The benefit of recessively masked variability
will only be realised if the environment
in which the population is evolving changes

30
Pseudo Founder Effects

The Founder Effect occurs when a population
passes through a population size bottleneck, from
which only a few individuals emerge to establish
a new population, for example when a small number
of individuals colonize a new island
In EAs a related phenomenon frequently occurs:
when a very fit individual arises in the
population it tends to dominate future
generations
Since most individuals are descended from a
single individual they tend to be very similar in
sequence, and so the crossover operator will have
little effect
Any genes which happen to be on the
pseudo-founder's chromosome will also spread
throughout the population, whether or not they
are valuable, a phenomenon known as hitchhiking

31
Other factors

Intense, unidirectional selection pressure
Development
Troubleshooting mechanisms
Added source of noise
Environmental interactions

32
Speciation

Preselection: two individuals are mated to
produce an offspring, which is compared with both
the parents. If the fitness of the child is
greater than that of the worse parent, it
replaces that parent in the population. The idea
is that individuals are replaced by others which
are fitter than they are, but similar in
sequence, so that a number of different solutions
can be maintained in the population, improving
gradually over time
Crowding: the crowding of solutions in search
space is discouraged, for example by comparing a
new individual with a subset of the existing
population, and replacing the most similar of
that subset with the new individual.
Fitness Sharing: when there are a number of
individuals with very similar sequences, the
fitness of that genotype is shared amongst them
all. This is a very popular diversity maintenance
operator, and there are a large number of
variants on the scheme.
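
As an illustration of fitness sharing, the sketch below divides each individual's raw fitness by a niche count built from Hamming distances. It is only one of the many variants mentioned above: the linear sharing function, the `radius` parameter and the function names are assumptions, not the lecture's definition.

```python
def hamming(a, b):
    """Number of positions at which two equal-length chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))

def shared_fitness(population, raw_fitness, radius):
    """Divide each raw fitness by a niche count of individuals closer than `radius`."""
    shared = []
    for ind, fit in zip(population, raw_fitness):
        # Each neighbour contributes between 0 and 1, more the closer it is.
        niche = sum(
            1.0 - hamming(ind, other) / radius
            for other in population
            if hamming(ind, other) < radius
        )
        # niche >= 1 because every individual is at distance 0 from itself.
        shared.append(fit / niche)
    return shared

pop = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]]
print(shared_fitness(pop, [4, 4, 1], radius=2))  # prints [2.0, 2.0, 1.0]
```

The two identical individuals split their fitness between them, while the isolated individual keeps its full fitness, which reduces the pressure to crowd onto a single peak.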

33
More speciation

Niching: encouraging the development of different
ecological niches in the population, using an
approach such as the spatially restrained
grid-based algorithm
Coevolution: evolving more than one type of
individual at once, with different species
attempting as part of their fitness function to
maintain as much genetic distance from other
species as possible.
Restricted Mating: individuals are only allowed
to mate if they are in the basin of attraction of
the same optimum. Once again, this scheme
attempts to replace like with like in the
population. There are a number of variants on the
restricted mating approach

34
Hill Climbing

Implicit parallelization: by maintaining a
population of candidate solutions which are
modified by mutation and/or crossover, the
algorithm is, in effect, exploring different
regions of its search space in parallel
The simplest alternative to a population based EA is
a hill climber, an algorithm which has a
population of one individual, and performs a
strictly local search using mutation
The parallel nature of an EA provides no
advantages over multiple random restarts of a
hill climber in terms of the number of solution
evaluations performed
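
A hill climber with random restarts might be sketched as follows for bitstring problems. The acceptance rule (keep a flip unless it makes things worse), the step and restart counts, and the function names are assumptions for illustration.

```python
import random

def hill_climb(fitness, length, steps, rng=random):
    """Population of one: flip a random bit and undo the flip if fitness drops."""
    current = [rng.randint(0, 1) for _ in range(length)]
    score = fitness(current)
    for _ in range(steps):
        i = rng.randrange(length)
        current[i] ^= 1  # point mutation
        new_score = fitness(current)
        if new_score >= score:
            score = new_score
        else:
            current[i] ^= 1  # revert the harmful mutation
    return current, score

def restart_hill_climb(fitness, length, steps, restarts, rng=random):
    """Run several independent climbs and return the best (solution, score) pair."""
    runs = (hill_climb(fitness, length, steps, rng) for _ in range(restarts))
    return max(runs, key=lambda run: run[1])

# Example on the ones-counting problem: fitness is the number of 1s.
best, best_score = restart_hill_climb(sum, length=10, steps=200, restarts=3)
```

Each call to `fitness` counts as one solution evaluation, so this version can be compared with a population-based EA on an equal evaluation budget.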

35
When is an EA better?

An EA should outperform a hill climber if either:
the action of the genetic operators used in the
EA provides advantages over local search, which
would indeed be the case if the schema theorem
were acting as described, with useful partial
solutions discovered by different individuals
being recombined to produce fitter individuals
more rapidly than could be done by mutation
alone; or
the structure of the fitness landscape is such
that the implicit memory of a population-based
algorithm (i.e. the memory encoded into the
structure of the population itself as a result of
evolution) allows it to concentrate its search
in areas of high fitness in a manner that would
not be possible for a hill climber
In practice, hill climbers with multiple restarts
often perform as efficiently as or better than
population-based algorithms

37
Coevolution

The interactions between two or more species as
they evolve
Kauffman's rubber sheet: evolution by one species
modifies the fitness landscape for both species;
the coevolving species is thus given a spur to
further evolution, as its environment changes
Fitness landscape is constantly changing
A powerful strategy for avoiding premature
convergence in evolutionary algorithms: there is less
chance of the population converging to a local
optimum, since local (and global) optima are
constantly forming and dissolving as the fitness
landscape changes

38
Using coevolution

Samuel's checkers players (1963)
A hill climber, in which two programs played
against each other
In the course of the game one program modified
its parameter settings, while the other remained
static
If the modified copy won the game, it was
accepted; otherwise the original was retained
Eventually played checkers at the level of a
human champion
Fogel (2001): still using evolution to develop
checkers players (Blondie21)

39
Learning and evolution

Neural networks may be evolved: architecture,
connection weights, or both
Baldwin Effect (Baldwin, 1896): learning on the
part of individuals could guide the course of
evolution in the population as a whole
A particular trait may be learned, or it may be
innate
A learned trait has the advantage of providing
flexibility, but the disadvantage of being slow
to acquire; an innate trait is present from
birth, but inflexible
Traits which are initially learned may become,
over time, encoded into the genotype of the
population

40
The Baldwin effect

Two preconditions must be met
The trait in question (which may be a behaviour
or a physical trait) must be influenced by
several interacting genes, so that a mutation in
one of these genes will make the phenotypic
expression of the trait more likely; and
an individual bearing such a mutation can learn
to express the trait
Learning acts to provide partial credit for a
mutation
An individual carrying a mutation that
predisposes it towards an advantageous phenotype
will learn the trait more easily than its
genetically less fortunate conspecifics, and
thus will survive and pass on more copies of that
allele to the next generation. Over time,
multiple mutations will accumulate in the genes
for the desirable trait, which will thus become
innate in the population