Title: Exploring Processor Design Using Genetic Programming
1. Exploring Processor Design Using Genetic Programming
- Borys Bradel
- Kirk Stewart
ECE1718, 28 April 2004
2. Outline
- Introduction
- Genetic Programming
- Processor Parameters
- Methodology
- Results
- Conclusion
3. Introduction
- Processors have many parameters
- Good parameters are not obvious
- Good parameters may be application specific
- Simulation takes a long time
- Impractical to search entire space
- Efficient algorithms need to be used
4. Genetic Programming
- Mimics real life
- Entities are defined by chromosomes
- Entities evolve over many generations
- Survival of the fittest
- Propagation, crossover, mutation
5. Propagation
- Random selection
- Weighted by fitness function
[Figure: fitness-weighted selection along a line from 0.0 to 5.0]
Entities and fitness: 1 (f = 0.5), 2 (f = 0.7), 3 (f = 0.9), 4 (f = 1.1), 5 (f = 1.8)
Example 1: 2 4 2 1 5
Example 2: 3 2 1 5 5
- Choose four times randomly
- Propagate best
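The selection step above can be sketched in Python. This is an illustration, not the authors' code: the function name `propagate` and its parameters are assumptions, and the fitness values are the five listed on the slide.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def propagate(fitness, n_random, rng):
    """Pick n_random entities by fitness-weighted chance, then append the best.

    fitness maps an entity id to a non-negative fitness value.
    """
    ids = list(fitness)
    # Cumulative sums lay the entities end to end on a line from 0.0 to the total.
    bounds = list(accumulate(fitness[i] for i in ids))
    total = bounds[-1]
    chosen = []
    for _ in range(n_random):
        x = rng.random() * total                      # random point on the line
        idx = min(bisect_right(bounds, x), len(ids) - 1)
        chosen.append(ids[idx])
    # Always carry the fittest entity forward.
    chosen.append(max(ids, key=fitness.get))
    return chosen

fitness = {1: 0.5, 2: 0.7, 3: 0.9, 4: 1.1, 5: 1.8}
next_generation = propagate(fitness, n_random=4, rng=random.Random(0))
```

Entity 5 covers the longest stretch of the line (1.8 of 5.0), so it is the most likely random pick and is also the one copied as the best.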
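6. Crossover
[Figure: two chromosomes exchange the genes between "start" and "end"]
Example 1 (crossover):
1 4 5 6 4 5 3 6 2 3  ->  1 4 5 3 2 4 0 6 2 3
2 4 0 3 2 4 0 5 2 1  ->  2 4 0 6 4 5 3 5 2 1
Example 2 (crossover and mutation):
2 2 3 6 5 4 0 2 2 5  ->  0 2 3 6 5 4 0 2 5 1
0 1 5 3 6 4 3 6 5 1  ->  2 1 5 3 2 4 3 6 2 5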
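A minimal Python sketch of the two operators, using the chromosomes from the first example on this slide. The function names, the 0-based `start`/`end` convention, and the gene range 0..6 are assumptions made for illustration.

```python
import random

def crossover(parent_a, parent_b, start, end):
    """Swap the genes in positions start..end-1 between two equal-length parents."""
    child_a = parent_a[:start] + parent_b[start:end] + parent_a[end:]
    child_b = parent_b[:start] + parent_a[start:end] + parent_b[end:]
    return child_a, child_b

def mutate(chromosome, rate, n_values, rng):
    """Replace each gene by a random value in 0..n_values-1 with probability rate."""
    return [rng.randrange(n_values) if rng.random() < rate else gene
            for gene in chromosome]

a = [1, 4, 5, 6, 4, 5, 3, 6, 2, 3]
b = [2, 4, 0, 3, 2, 4, 0, 5, 2, 1]
child_a, child_b = crossover(a, b, start=3, end=7)
# child_a == [1, 4, 5, 3, 2, 4, 0, 6, 2, 3]
# child_b == [2, 4, 0, 6, 4, 5, 3, 5, 2, 1]
mutated = mutate(child_b, rate=0.1, n_values=7, rng=random.Random(1))
```

Note that a mutated gene may be redrawn to its old value, so the effective change rate is slightly below `rate`.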
7. Transition Between Generations
[Figure: a population of seven 10-gene chromosomes passing through one generation]
Population:
5 4 3 3 4 3 6 0 3 4
1 4 5 0 0 2 0 0 2 2
6 3 2 2 5 0 1 5 3 4
5 6 3 5 6 5 3 4 4 3
6 2 5 4 1 4 4 1 4 2
6 0 4 1 1 3 5 5 0 6
3 4 0 4 5 0 2 2 1 4
After shuffle:
5 6 3 5 6 5 3 4 4 3
6 3 2 2 5 0 1 5 3 4
1 4 5 0 0 2 0 0 2 2
6 2 5 4 1 4 4 1 4 2
3 4 0 4 5 0 2 2 1 4
6 0 4 1 1 3 5 5 0 6
5 4 3 3 4 3 6 0 3 4
After crossover and mutate:
5 6 3 5 6 5 3 1 4 3
4 3 1 2 5 0 1 4 3 4
6 2 5 4 1 2 0 0 4 2
1 4 0 0 0 4 4 1 2 2
6 4 0 4 3 0 2 5 0 6
3 0 4 1 1 3 5 5 1 4
5 4 3 3 4 3 6 0 3 4
- Shuffle
- Crossover and mutate
- Evaluate fitness and propagate based on it
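One pass through these steps might look like the following Python sketch. It is not the authors' implementation: the function name, the pairing of neighbours after the shuffle, keeping the best chromosome in the last slot, and the toy fitness (`sum`) are all assumptions.

```python
import random

def next_generation(population, fitness_fn, mutation_rate, n_values, rng):
    """Produce one new generation: propagate, shuffle, then crossover and mutate."""
    scores = [fitness_fn(c) for c in population]
    best = population[scores.index(max(scores))]
    # Propagate: fitness-weighted draws fill every slot except one.
    # The tiny offset keeps the weights valid when every score is zero.
    weights = [s + 1e-9 for s in scores]
    pool = rng.choices(population, weights=weights, k=len(population) - 1)
    rng.shuffle(pool)
    children = []
    # Crossover: pair up neighbours in the shuffled pool.
    for i in range(0, len(pool) - 1, 2):
        a, b = pool[i], pool[i + 1]
        start, end = sorted(rng.sample(range(len(a) + 1), 2))
        children.append(a[:start] + b[start:end] + a[end:])
        children.append(b[:start] + a[start:end] + b[end:])
    if len(pool) % 2:                      # odd one out skips crossover
        children.append(pool[-1])
    # Mutate: each gene is redrawn with probability mutation_rate.
    children = [[rng.randrange(n_values) if rng.random() < mutation_rate else g
                 for g in c] for c in children]
    # The best chromosome is copied unchanged.
    return children + [list(best)]

rng = random.Random(0)
population = [[rng.randrange(7) for _ in range(10)] for _ in range(7)]
new_population = next_generation(population, sum, 0.05, 7, rng)
```

In a real run the fitness function would come from simulation results rather than `sum`, and the loop would repeat for many generations.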
8. Chromosomes
- Instruction fetch queue length
- Branch prediction algorithm
- Decode width
- Issue width
- Commit width
- Load/store queue length
- L1 D-cache size
- L1 I-cache size
- L2 size
- L2 associativity
- L1 replacement algorithm
- L2 replacement algorithm
- Number of integer ALUs
- Number of integer multiplier/dividers
- Number of floating point ALUs
- Number of floating point multiplier/dividers
47 bits -> 2^47 ≈ 140 thousand billion
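To make the encoding concrete, here is a small Python sketch that unpacks a chromosome stored as an integer into named bit fields. The three field widths are hypothetical: the slide gives only the 47-bit total, not the width of each parameter.

```python
def decode(chromosome, layout):
    """Split an integer chromosome into named bit fields, lowest bits first."""
    fields = {}
    for name, width in layout:
        fields[name] = chromosome & ((1 << width) - 1)
        chromosome >>= width
    return fields

# Hypothetical layout for illustration only; not the authors' encoding.
layout = [("decode_width", 2), ("issue_width", 2), ("l2_associativity", 3)]
config = decode(0b1011001, layout)
# config == {"decode_width": 1, "issue_width": 2, "l2_associativity": 5}

design_space = 2 ** 47   # 140_737_488_355_328 possible chromosomes
```

Each field value would then index into a table of allowed settings, for example a list of cache sizes.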
9. Processor Fitness
- Fast execution is good
- Large area is bad
- Large power consumption is bad
- Fitness = IPC/power
- Hard upper limit on area
- Alternatives: IPC/power^2 or IPC^2/power
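The fitness measures listed above can be written as one small Python function. The slide does not say how a configuration over the area limit is scored, so returning 0.0 is an assumption, as are the parameter names and the sample numbers.

```python
def fitness(ipc, power, area, area_limit, ipc_exp=1, power_exp=1):
    """Return IPC**ipc_exp / power**power_exp, or 0.0 if the area limit is exceeded."""
    if area > area_limit:
        return 0.0   # assumption: over-limit designs get the lowest fitness
    return ipc ** ipc_exp / power ** power_exp

# Placeholder numbers, not measured results.
base = fitness(ipc=1.5, power=30.0, area=80.0, area_limit=100.0)          # IPC/power
power_heavy = fitness(1.5, 30.0, 80.0, 100.0, power_exp=2)                # IPC/power^2
too_big = fitness(1.5, 30.0, 120.0, 100.0)                                # over the limit
```

Raising the power exponent favours low-power designs; raising the IPC exponent favours fast ones.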
10. Methodology
- Simulations performed in Wattch
- An extension of SimpleScalar v3.0 sim-outorder
- Calculates static and dynamic power consumption
- Used area model from "Exploring the Design Space of Future CMPs" (Huh, Keckler and Burger)
- Perl subroutine written by authors
- Very rough estimate
11. Methodology
- C program to calculate fitness, create new generations
- Perl scripts coordinate simulation, population generation
- Ran on cluster of 30 dual AMD Athlon MP 2600
[Diagram: genetic.cpp and run_sim.pl driving several sim-outorder runs]
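The fan-out to several simulator runs can be sketched in Python as follows. This is not the authors' Perl script: `evaluate_population` and `fake_simulate` are hypothetical names, and the stand-in function returns made-up values instead of launching sim-outorder.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_population(population, simulate, max_workers=4):
    """Run simulate(config) for every configuration concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate, population))

def fake_simulate(config):
    """Stand-in for one simulator run; returns a made-up (ipc, power) pair."""
    return (sum(config) / len(config), 1.0 + len(config))

results = evaluate_population([[1, 2, 3], [4, 5, 6]], fake_simulate)
# results == [(2.0, 4.0), (5.0, 4.0)]
```

Threads suffice here because a real `simulate` would mostly wait on an external simulator process; on a cluster, each call would instead submit a job to a remote node.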
12. Results: Fitness Over Time
13. Results: IPC Over Time
14. Results: Power Over Time
15. Results: Area Over Time
16. Results: Effect of Mutation Rate
[Plot label: crafty]
17. Results: Number of New Members
[Plot labels: 0.5, 2.5]
18. Conclusion
- Created a fast algorithm to search a large design space
- Explored effects of varying algorithm parameters
- Evaluated algorithm performance through extensive simulation
- Algorithm is effective at finding good processor configurations
19. Future Work
- Improve area estimation
- Use area as part of fitness
- Evaluate the best configurations' performance
- Simulate benchmarks not used in genetic algorithm
- Use results from average runs on individual benchmarks
- Compare to existing architectures