Title: Introduction to Genetic Algorithms
1Introduction to Genetic Algorithms
2Genetic Algorithms
- What are they?
- Evolutionary algorithms that make use of
operations like mutation, recombination, and
selection
- Uses?
- Difficult search problems
- Optimization problems
- Machine learning
- Adaptive rule-bases
3Theory of Evolution
- Every organism has unique attributes that can be
transmitted to its offspring
- Offspring are unique and have attributes from
each parent
- Selective breeding can be used to manage changes
from one generation to the next
- Nature applies certain pressures that cause
individuals to evolve over time
4Evolutionary Pressures
- Environment
- Creatures must work to survive by finding
resources like food and water
- Competition
- Creatures within the same species compete with
each other on similar tasks
- Rivalry
- Different species affect each other by direct
confrontation (e.g. hunting) or indirectly by
fighting for the same resources
5Natural Selection
- Creatures that are not good at completing tasks
like hunting have fewer chances of having
offspring
- Creatures that are successful in completing basic
tasks are more likely to transmit their
attributes to the next generation since there
will be more creatures born that can survive and
pass on these attributes
6Genetics
- Genome (class)
- Sequence of genes describing the overall
genetic structure of a particular species
- Genomics
- Study of the meaning of the genes for a
particular species
- Alleles
- Values that can be assigned to a given gene
- Genotype (instance)
- Sequence of alleles
7Physical Properties
- Phenetics
- Study of physical properties and morphology of
creatures independent of genetic information
- Phenome
- General structure of a creature's body and
attributes
- Phenotype
- Particular instance of phenome realized as a
unique creature
- Product of genotype and environmental forces
8Conversions
- In the real world, mapping between genotypes and
phenotypes is hard
- In AI work it can be done by defining a
convenient function or even designing encodings
by hand
- It is often easier to adapt genetic operators to
work with the evolutionary data structure used to
represent the phenotype than to encode and decode
phenotypes
9Genetic Algorithmic Process
- Potential solutions for problem domains are
encoded using a machine representation (e.g. bit
strings) that supports variation and selection
operations
- Mating and mutation operations produce a new
generation of solutions from parent encodings
- A fitness function judges which individuals are
best suited (e.g. most appropriate problem
solution) for survival
10Initialization
- Initial population must be a representative
sample of the search space
- Random initialization can be a good idea (if the
sample is large enough)
- Random number generator must not be biased
- Can reuse or seed population with existing
genotypes based on algorithms or expert opinion
or previous evolutionary cycles
11Evaluation
- Each member of the population can be seen as a
candidate solution to a problem
- The fitness function determines the quality of
each solution
- The fitness function takes a phenotype and
returns a floating point number as its score
- It is problem dependent, so it can be very simple
- It can be a bottleneck if it is not carefully
thought out (there are no magic ways to create one)
12Selection
- Want to give preference to better individuals
when adding to the mating pool
- If the entire population ends up being selected, it
may be desirable to conduct a tournament to order
the individuals in the population
- Would like to keep the best in the mating pool
and drop the worst (elitism)
- Elitism is a trade-off with search space
completeness
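The selection ideas above could be sketched roughly as follows. This is only an illustration: the function names, the tournament size `k`, and the number of elites `n_elite` are assumptions, not something the slides specify.

```python
import random
from typing import Callable, List

def tournament_select(pop: List[str], fitness: Callable[[str], float],
                      k: int, rng: random.Random) -> str:
    """Draw k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(pop, k)
    return max(contenders, key=fitness)

def build_mating_pool(pop: List[str], fitness: Callable[[str], float],
                      pool_size: int, n_elite: int, k: int,
                      rng: random.Random) -> List[str]:
    """Copy the n_elite best individuals unchanged, then fill by tournaments."""
    ranked = sorted(pop, key=fitness, reverse=True)
    pool = ranked[:n_elite]                      # elitism: the best always survive
    while len(pool) < pool_size:
        pool.append(tournament_select(pop, fitness, k, rng))
    return pool

population = ["00000", "11100", "11111", "10000"]
count_ones = lambda bits: bits.count("1")        # toy fitness (assumption)
pool = build_mating_pool(population, count_ones, 4, 1, 2, random.Random(1))
```

A larger `n_elite` or `k` pushes harder toward the current best individuals, at the cost of exploring less of the search space.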
13Crossover
- In sexual reproduction the genetic codes of both
parents are combined to create offspring
- Asexual crossover has no impact on the mating
pool
- Would like to keep a 60/40 split between parent
contributions
- 95/5 splits negate the benefits of crossover
14Crossover
- If we have selected two strings
- A = 11111 and B = 00000
- We might choose a uniformly random site (e.g.
position 3) and trade bits
- This would create two new strings
- A' = 11100 and B' = 00011
- These new strings might then be added to the
mating pool if they are fit
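The bit-string example above can be reproduced with a few lines of Python. The function name `one_point_crossover` is a hypothetical choice for this sketch.

```python
from typing import Tuple

def one_point_crossover(a: str, b: str, site: int) -> Tuple[str, str]:
    """Keep the first `site` bits of each parent and swap the remaining tails."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    if not 0 <= site <= len(a):
        raise ValueError("site out of range")
    return a[:site] + b[site:], b[:site] + a[site:]

child_a, child_b = one_point_crossover("11111", "00000", 3)
print(child_a, child_b)  # 11100 00011
```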
15Mutation
- Mutations happen at the genome level (rare and
undesirable) and the genotype level (better for the
GA process)
- Mutation is important for maintaining diversity
in the genetic code
- In humans, mutation was responsible for the
evolution of intelligence
- Example: the occasional (low probability) alteration
of a bit position in a string
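A minimal sketch of the bit-string mutation described in the example, assuming each bit is flipped independently with a small probability `rate` (the name and the per-bit scheme are assumptions):

```python
import random

def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in bits)

# With a low rate most strings pass through unchanged.
mutant = mutate("1111100000", 0.02, random.Random(42))
```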
16Operators
- Selection and mutation
- When used together, give us the genetic algorithm
equivalent of a parallel, noise-tolerant hill
climbing algorithm
- Selection, crossover, and mutation
- Provide an insurance policy against losing
population diversity and help avoid some of the
pitfalls of ordinary hill climbing
17Replacement
- Determine when to insert new offspring into the
mating pool and which individuals to drop out
based on fitness
- Steady state evolution calls for the same number
of individuals in the population, so each new
offspring is processed one at a time and fit
individuals can remain a long time
- In generational evolution, the offspring are
placed into a new population with all other
offspring (genetic code only survives in kids)
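One way to picture a steady-state replacement step is shown below. The policy of replacing the current worst individual only when the child is fitter is an assumption made for this sketch. Other policies fit the description equally well.

```python
from typing import Callable, List

def steady_state_insert(pop: List[str], child: str,
                        fitness: Callable[[str], float]) -> bool:
    """Replace the worst individual if the child is fitter; size stays constant."""
    worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
    if fitness(child) > fitness(pop[worst]):
        pop[worst] = child
        return True
    return False

population = ["11100", "00000", "11000"]
count_ones = lambda bits: bits.count("1")     # toy fitness (assumption)
accepted = steady_state_insert(population, "11110", count_ones)
```

In a generational scheme the whole list would instead be rebuilt from offspring each cycle.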
18Genetic Algorithm
- Set time t = 0
- Initialize population P(t)
- While termination condition not met
- Evaluate fitness of each member of P(t)
- Select members from P(t) based on fitness
- Produce offspring from the selected pairs
- Replace members of P(t) with better offspring
- Set time t = t + 1
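The loop above might look like this in Python. Everything specific here is an assumption chosen to keep the sketch small: the toy objective (count the 1 bits), binary tournament selection, one-point crossover, per-bit mutation, and generational replacement that carries the best individual forward.

```python
import random
from typing import List

def evolve(length: int = 20, pop_size: int = 30, generations: int = 60,
           p_mut: float = 0.02, seed: int = 0) -> List[int]:
    rng = random.Random(seed)
    fitness = sum                                   # toy objective: number of 1 bits

    # Initialize population P(0) with random bit lists.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def pick() -> List[int]:
        """Binary tournament: the fitter of two random individuals."""
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for t in range(generations):                    # t = 0, 1, 2, ...
        best = max(pop, key=fitness)                # evaluate fitness of P(t)
        if fitness(best) == length:                 # termination condition
            break
        offspring = [best[:]]                       # keep the best (elitism)
        while len(offspring) < pop_size:
            p1, p2 = pick(), pick()                 # select based on fitness
            site = rng.randrange(1, length)         # one-point crossover
            child = p1[:site] + p2[site:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            offspring.append(child)
        pop = offspring                             # replace P(t)
    return max(pop, key=fitness)

best = evolve()
```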
19Why use genetic algorithms?
- They can solve hard problems
- Easy to interface genetic algorithms to existing
simulations and models
- GAs are extensible
- GAs are easy to hybridize
- GAs work by sampling, so populations can be
sized to detect differences with specified error
rates
- Use little problem specific code
20Traveling Salesman Problem
- To use a genetic algorithm to solve the traveling
salesman problem we could begin by creating a
population of candidate solutions
- We need to define mutation, crossover, and
selection methods to aid in evolving a solution
from this population
- At random, pick two solutions and combine them to
create a child solution; a fitness function
is then used to rank the solutions
21Traveling Salesman Problem
- For crossover we might take two paths (P1 and P2),
break them at arbitrary points, and define new
solutions Left1 + Right2 and Left2 + Right1
- For mutation we might randomly switch two cities
in an existing path
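A small sketch of these two operators on tours stored as lists of city numbers. The function names and the sample tours are made up for illustration. Note that the naive split-and-join crossover can produce children that visit a city twice, so it needs a validity check or a repair step.

```python
import random
from typing import List, Sequence, Tuple

def naive_crossover(p1: List[int], p2: List[int], cut: int) -> Tuple[List[int], List[int]]:
    """Join the left part of one parent with the right part of the other."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def swap_mutation(tour: Sequence[int], rng: random.Random) -> List[int]:
    """Exchange two randomly chosen cities in a copy of the tour."""
    i, j = rng.sample(range(len(tour)), 2)
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

def is_valid_tour(tour: Sequence[int], cities: Sequence[int]) -> bool:
    """A valid tour visits every city exactly once."""
    return sorted(tour) == sorted(cities)

p1, p2 = [1, 2, 3, 4, 5], [3, 5, 1, 2, 4]
c1, c2 = naive_crossover(p1, p2, 2)
print(c1, is_valid_tour(c1, p1))  # [1, 2, 1, 2, 4] False
```

Swap mutation, by contrast, always keeps a valid tour valid.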
22Evolve Algorithm for TSP
- Set up initial population
- For G generations
- Create M mutations and add them to the population
- Subject mutations to population constraints and
determine their relative fitness
- Create C crossovers and add them to the
population
- Subject crossovers to population constraints and
determine their relative fitness
23Solving TSP using GA
- Steps
- Create group of random tours
- Stored as sequence of numbers (parents)
- Choose 2 of the better solutions
- Combine and create new sequences (children)
- Problems here
- City 1 repeated in Child 1
- City 5 repeated in Child 2
24Modifications Needed
- Algorithm must not allow repeated cities
- Also, order must be considered
- 12345 is same as 32154
- Based upon these considerations, a computer model
for N cities can be created
- Gets quite detailed
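Both requirements can be illustrated briefly. The crossover below is a simplified order-based variant, not necessarily the one the original model used: it keeps a slice of one parent in place and fills the remaining positions with the other parent's cities in their original order, so no city repeats. The `canonical` helper shows why 12345 and 32154 describe the same closed tour. All names are hypothetical.

```python
from typing import List

def order_crossover(p1: List[int], p2: List[int], start: int, end: int) -> List[int]:
    """Keep p1[start:end] in place; fill the rest with p2's cities in p2's order."""
    kept = p1[start:end]
    kept_set = set(kept)
    fill = [c for c in p2 if c not in kept_set]
    return fill[:start] + kept + fill[start:]

def canonical(tour: List[int]) -> List[int]:
    """Rotate so the smallest city comes first, then choose the smaller direction."""
    i = tour.index(min(tour))
    forward = tour[i:] + tour[:i]
    backward = [forward[0]] + forward[:0:-1]
    return min(forward, backward)

child = order_crossover([1, 2, 3, 4, 5], [3, 5, 1, 2, 4], 1, 3)
print(child)                                                    # [5, 2, 3, 1, 4]
print(canonical([1, 2, 3, 4, 5]) == canonical([3, 2, 1, 5, 4]))  # True
```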
25Genetic Algorithm Example
[Figure: tours for Parent A and Parent B over cities A to E]
26Genetic Algorithm Example
[Figure: combined path built from Parent A and Parent B]
27Genetic Algorithm Example
[Figure: child tour]
28Mutations
Chance of 1 in 50 to introduce a mutation to the
next generation (the child if it replaces a
parent, or the first parent)
[Figure: a tour before and after mutation, with random positions R1 and R2]
Before: E B F D G A C
After: E A G D F B C
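The two rows of cities on this slide are consistent with reversing the part of the tour between the random positions R1 and R2. That reading is an inference from the figure, not something the slide states. Under that assumption, the 1-in-50 mutation could be sketched as:

```python
import random
from typing import List

def inversion_mutation(tour: List[str], r1: int, r2: int) -> List[str]:
    """Reverse the cities between positions r1 and r2 (inclusive)."""
    lo, hi = sorted((r1, r2))
    return tour[:lo] + tour[lo:hi + 1][::-1] + tour[hi + 1:]

def maybe_mutate(tour: List[str], rng: random.Random, chance: float = 1 / 50) -> List[str]:
    """Apply an inversion with the given probability, else return a copy."""
    if rng.random() < chance:
        r1, r2 = rng.sample(range(len(tour)), 2)
        return inversion_mutation(tour, r1, r2)
    return list(tour)

print("".join(inversion_mutation(list("EBFDGAC"), 1, 5)))  # EAGDFBC
```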
29Premature Convergence
- Occasionally a gene takes over because it is so
much fitter than all others (genetic drift)
- If this is the best solution, that may be OK (if
not, you may never find the optimal solution if
this happens too soon)
- In large populations genetic drift is less likely to
happen
- Using higher mutation rates can combat genetic
drift
30Premature Convergence
- High levels of randomness are not always helpful
to a GA
- To prevent genetic drift
- You might have several small populations and
cross-breed individuals from them
- Take game of life approach: pretend individuals
live on a 2D grid and only allow breeding between
neighbors (spatial organizational structure)
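The grid idea could be sketched as below. The wrap-around edges, the 8-cell neighborhood, and choosing the fittest neighbor as the mate are all assumptions made for this illustration.

```python
from typing import Callable, List, Tuple

def grid_neighbors(row: int, col: int, rows: int, cols: int) -> List[Tuple[int, int]]:
    """Return the 8 surrounding cells on a wrap-around (toroidal) grid."""
    return [((row + dr) % rows, (col + dc) % cols)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def pick_mate(grid: List[List[int]], fitness: Callable[[int], float],
              row: int, col: int) -> int:
    """Choose the fittest individual among the cells next to (row, col)."""
    cells = grid_neighbors(row, col, len(grid), len(grid[0]))
    r, c = max(cells, key=lambda rc: fitness(grid[rc[0]][rc[1]]))
    return grid[r][c]

# Toy grid where each individual is just its own fitness value.
grid = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
mate = pick_mate(grid, lambda x: x, 0, 0)
```

Because good genes can only spread one neighborhood at a time, a single very fit individual takes longer to dominate the population.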
31Slow Convergence
- Some GAs will simply fail to converge
- Similar to plateau problem in hill climbing (need
to add noise to fitness values to make them
converge)
- Can increase elitism to encourage fitter
individuals to spread their genes (at the risk of
premature convergence)
- Increasing the level of random mutations sometimes
helps
32Parameters
- Require lots of parameters (mutation rate,
crossover type, population size, fitness scaling
policy)
- Can make use of a hierarchy of GAs with a master
GA setting the parameters for an ordinary GA
- Parameterless GAs have default values chosen for
parameters so that human interaction is not
needed for fine tuning
33Domain Knowledge
- GAs do not exploit domain knowledge unless the
knowledge engineer (KE) designs special policies
and operators
- During initialization there can be a bias toward
certain genotypes selected by the domain expert
- Can use gene dependent mutation rates and
heuristic crossover split points
- The choice of representation can affect the size
and search efficiency of the problem space
34GA Strengths
- Do well at avoiding local minima and can often
find near optimal solutions since search is
not restricted to small search areas
- Easy to extend by creating custom operators
- Perform well for global optimizations
- Work required to choose representations and
conversion routines is acceptable
35GA Weaknesses
- Do not take advantage of domain knowledge
- Not very efficient at local optimization (fine
tuning solutions)
- Randomness inherent in GAs makes them hard to
predict (solutions can take a long time to
stumble upon)
- Require entire populations to work (takes lots of
time and memory) and may not work well for
real-time applications
36Evolvee
- Uses existing representations (like Neural Net)
- Realism is relatively poor
- Simple tasks (e.g. attack behaviors) do
not pose any problems for it
37Actions and Parameters
- Limited action set needed
- Look parameter: direction
- Single value: up, ahead, down
- Move parameter: weights
- Vector (projectile, collision point, impact
location)
- Fire parameter
- Jump parameter
38Sequences
- Contained in simple arrays of actions and times
- Times can be associated with actions in two ways
- Time offset relative to previous action
- Absolute time since start of sequence
- The order of sequences in an array is not
important (this allows symmetric solutions but
avoids the cost of sorting actions before
evolution is complete)
39Random Generation
- Time offset will be a randomly generated value
within the maximum sequence length
- Action type can be encoded as a symbol randomly
chosen from the set of all possible actions
- Parameter values are action specific and need to
be chosen after the action is selected, within
their allowed ranges
40Random Generation
- The length of all action sequences can also be
generated randomly (with an upper bound)
- The sequences of actions will be housed in a
dynamic array
- Start time of first action in a sequence can be
reset to zero
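A rough sketch of random sequence generation along these lines. The class name `Gene`, the bounds `MAX_LENGTH` and `MAX_OFFSET`, the weight range, and the choice to give fire and jump no parameters are all assumptions for illustration. Only the action names and the look directions come from the slides.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List

ACTIONS = ("look", "move", "fire", "jump")
MAX_LENGTH = 8       # hypothetical upper bound on sequence length
MAX_OFFSET = 2.0     # hypothetical largest time offset, in seconds

@dataclass
class Gene:
    offset: float                     # time offset relative to the previous action
    action: str
    params: Dict[str, object] = field(default_factory=dict)

def random_params(action: str, rng: random.Random) -> Dict[str, object]:
    """Parameters depend on the action, so they are drawn after it is chosen."""
    if action == "look":
        return {"direction": rng.choice(("up", "ahead", "down"))}
    if action == "move":
        # One weight per vector: projectile, collision point, impact location.
        return {"weights": [rng.uniform(-1.0, 1.0) for _ in range(3)]}
    return {}                         # assumed: fire and jump take no parameters

def random_gene(rng: random.Random) -> Gene:
    action = rng.choice(ACTIONS)
    return Gene(rng.uniform(0.0, MAX_OFFSET), action, random_params(action, rng))

def random_sequence(rng: random.Random) -> List[Gene]:
    """Random-length list of genes; the first action starts at time zero."""
    seq = [random_gene(rng) for _ in range(rng.randint(1, MAX_LENGTH))]
    seq[0].offset = 0.0
    return seq

sequence = random_sequence(random.Random(3))
```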
41Crossover
- Simple one point crossover
- Randomly split two move sequences from parents
and swap sub-arrays to create two new children
- Fairly easy to program using arrays
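With sequences held in Python lists, the crossover described above takes only a few lines. Cutting each parent at its own random point (so children may differ in length from their parents) is an assumption of this sketch.

```python
import random
from typing import List, Tuple

def sequence_crossover(mom: List[str], dad: List[str],
                       rng: random.Random) -> Tuple[List[str], List[str]]:
    """Split each parent at its own random point and swap the tails."""
    cut_m = rng.randint(0, len(mom))
    cut_d = rng.randint(0, len(dad))
    return mom[:cut_m] + dad[cut_d:], dad[:cut_d] + mom[cut_m:]

mom = ["look", "move", "jump", "fire"]
dad = ["fire", "fire", "move", "look", "jump", "move"]
kid1, kid2 = sequence_crossover(mom, dad, random.Random(0))
```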
42Mutation
- A low probability mutation might be to change the
length of a sequence
- Empty spaces can be filled with random actions
- Excess actions are simply ignored
- A low probability mutation might be to replace
individual actions within existing sequences
- Gene storage time follows a normal distribution
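The two mutations could be combined in one helper, as sketched below. The probabilities, the `max_length` bound, and the `random_action` callback are hypothetical. Excess actions are dropped here by truncating the list, which is one simple way to ignore them.

```python
import random
from typing import Callable, List

def mutate_sequence(seq: List[str], random_action: Callable[[random.Random], str],
                    rng: random.Random, p_length: float = 0.05,
                    p_action: float = 0.05, max_length: int = 8) -> List[str]:
    """Occasionally resize the sequence, then occasionally replace single actions."""
    out = list(seq)
    if rng.random() < p_length:
        new_len = rng.randint(1, max_length)
        out = out[:new_len]                              # excess actions dropped
        out += [random_action(rng) for _ in range(new_len - len(out))]  # fill gaps
    return [random_action(rng) if rng.random() < p_action else a for a in out]

pick_action = lambda rng: rng.choice(["look", "move", "fire", "jump"])
mutated = mutate_sequence(["look", "move", "jump"], pick_action, random.Random(5))
```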
43Evolution
- Population size will remain constant
- Evolution happens on request
- If an individual with unassigned fitness exists, choose it;
otherwise choose two parents with probabilities
proportional to their fitness for
crossover/mutation
- Individuals are removed from the population using
random selection based on inverse fitness
- To diversify the population, remove the poorer of
two similar behaviors
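An on-request evolution step along these lines might be organized as follows. The function names, the use of `None` for an unassigned fitness, and the small `eps` that guards against division by zero are assumptions. Scores are assumed to be non-negative and not all zero.

```python
import random
from typing import List, Optional, Tuple

def on_request(pop: List[str], scores: List[Optional[float]],
               rng: random.Random) -> Tuple:
    """Return ('evaluate', i) for an unscored individual, else ('breed', i, j)."""
    for i, s in enumerate(scores):
        if s is None:
            return ("evaluate", i)
    # Fitness-proportional choice; the same parent may be drawn twice.
    i, j = rng.choices(range(len(pop)), weights=scores, k=2)
    return ("breed", i, j)

def removal_index(scores: List[float], rng: random.Random, eps: float = 1e-9) -> int:
    """Pick an individual to remove with probability proportional to 1/fitness."""
    weights = [1.0 / (s + eps) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]

rng = random.Random(0)
step = on_request(["a", "b", "c"], [1.0, None, 3.0], rng)   # ('evaluate', 1)
```

Removing one individual for every child added keeps the population size constant.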
44Object for Defensive Tactics
- In combat game terms, defensive tactics is the
sequence of actions carried out by an object to
protect itself when it comes under attack
- This is a natural choice for learning behavior by
genetic algorithm, because the object is in a
highly competitive situation with a survival
mandate
- It should be possible to decide on the fittest
behaviors and select for them in the evolving
sequence of actions
- To keep things simple, we will focus on only two
behaviors: dodging enemy fire and rocket jumping
- But the method could be extended to include other
defensive moves, such as weaving and seeking
cover
45Computing Fitness: Rocket Jumping
- Assign rewards only for upward movement when
object is not touching the floor, to avoid
rewarding running up the stairs
- Reward high jumps a lot more than lower jumps
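A minimal sketch of such a fitness function, assuming the game can report a per-frame height and a floor-contact flag. Squaring the height gained per jump is just one hypothetical way to reward high jumps much more than low ones.

```python
from typing import Iterable, Tuple

def jump_fitness(samples: Iterable[Tuple[float, bool]]) -> float:
    """samples: (height, touching_floor) per frame, in time order."""
    total, gain = 0.0, 0.0
    prev_height = None
    for height, touching_floor in samples:
        if touching_floor:
            total += gain ** 2          # close out the jump; squaring favors height
            gain = 0.0
        elif prev_height is not None and height > prev_height:
            gain += height - prev_height    # upward movement while airborne
        prev_height = height
    return total + gain ** 2

stairs = [(0, True), (1, True), (2, True)]
jump = [(0, True), (1, False), (3, False), (1, False), (0, True)]
print(jump_fitness(stairs), jump_fitness(jump))  # 0.0 9.0
```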
46Computing Fitness: Dodging Fire
- Provide 0 reward when hit and high reward when
object escapes with no damage
- Must include distance of dodging movement away
from point of impact to avoid rewarding standing
still
- Damage to object must also be measured and
subtracted from fitness value
- Use time as a 4th dimension to resolve ties
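One possible reading of these rules in code. The parameter names, the `escape_reward` constant, and the treatment of time as a secondary tie-breaking value are all assumptions. The slide does not say how time enters the score.

```python
from typing import Tuple

def dodge_fitness(hit: bool, damage: float, distance: float,
                  time_margin: float, escape_reward: float = 100.0) -> Tuple[float, float]:
    """Return (score, time_margin); tuples compare by score first, then time."""
    base = 0.0 if hit else escape_reward      # no reward when hit
    score = base + distance - damage          # reward moving away, penalize damage
    return (score, time_margin)

escaped = dodge_fitness(False, 0.0, 3.0, 0.2)
stood_still = dodge_fitness(False, 0.0, 0.0, 0.2)
was_hit = dodge_fitness(True, 40.0, 3.0, 0.2)
print(escaped > stood_still > was_hit)  # True
```

Here `time_margin` stands for a hypothetical measurement such as how long before impact the object reached safety.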
47For the Game
- Make use of genetic algorithm
- Learn its jumping and dodging behaviors during
the game
- Fitness function provides rewards on a per jump
or per dodge basis
48Evaluation
- Learns to jump fairly quickly
- Multiple jumps are no problem
- Dodging behavior is also learned quickly
- Any balanced combination of vector weights
(estimated point of impact, closest collision
point, projectile attributes) that causes movement
to safety works well
- Approach is sub-optimal but acceptable
49Evaluation
- Continuous fitness values are more helpful to the
genetic algorithm than Boolean success indicators
- Scheme reveals how well it is possible to evolve
behaviors using genetic operators
- The representation is better suited to modeling
sequences than either decision trees or fuzzy
rules
- Representation is incompatible with rule-based
schemes
50Related Technologies
- Genetic Programming
- Existing programs are combined to breed new
programs
- Artificial Life
- Using cellular automata to simulate population
growth