Title: Expediting GA-Based Evolution Using Group Testing Techniques for Reconfigurable Hardware
1. Expediting GA-Based Evolution Using Group Testing Techniques for Reconfigurable Hardware
ReConFig06, San Luis Potosi, Mexico
Rashad S. Oreifej, Carthik A. Sharma, and Ronald F. DeMara
University of Central Florida
Research supported in part by NSF grant CRCD 0203446
2. Evolvable Hardware
- Evolutionary Design
  - Start with available CLBs and IOBs
  - Implement a design using genetic operators, etc. [Fogarty97]
  - Limited or no ability to re-design to account for suspected faulty resources
- Evolutionary Regeneration
  - Start with an existing pool of designs
  - Some existing configurations may use faulty resources
  - Eliminate use of suspected faulty resources
  - Genetic operators can be applied to refurbish designs [Vigander01]
3. Previous Work
- Pre-compiled Column-Based Dual FPGA architecture [Mitra04]
  - Autonomous detection, repair by shifting pre-compiled columns
  - Isolation using distributed CED checkers and blind reconfiguration attempts
- Overview of Combinatorial Group Testing and Applications [Du00]
  - Provides taxonomy and general algorithms for applying CGT
  - Examples of CGT applications: DNA clone library filtering, vaccine screening, computer fault diagnosis, etc.
- CGT Enhanced Circuit Diagnosis [Kahng04]
  - Presents doubling, halving, etc. for circuit fault diagnosis using BIST and CGT
  - Requires ability to test resources individually
- Chinese Remainder Sieve technique [Eppstein05]
  - Efficient non-adaptive and two-stage CGT based on prime-number-driven test formation
  - Improved algorithms for practical problem sizes (n < 1080) with a small number of defectives (d < 4)
4. Genetic Algorithms and Evolvable Hardware
- GAs are strong candidates for implementing system refurbishment
  - They implement guided trial-and-error search using principles of Darwinian evolution
  - Iterative selection enforces survival of the fittest
  - Genetic operators (mutation, crossover) can be used to refurbish designs
- Hypothesis: information regarding resource performance can expedite GA-based refurbishment
- GAs frequently use strings of 1s and 0s to represent candidate solutions
  - An FPGA configuration file is a string of 1s and 0s
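To make the bit-string view of genetic operators concrete, here is a minimal sketch of single-point crossover and bit-flip mutation on strings of 1s and 0s. The function names and the `rate` parameter are illustrative assumptions; the slides do not specify the operators at this level of detail, and a real FPGA configuration file is far longer and has structural constraints that this toy ignores.

```python
import random

def crossover(parent_a, parent_b, point):
    """Single-point crossover: prefix from parent_a, suffix from parent_b."""
    return parent_a[:point] + parent_b[point:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate` (hypothetical parameter)."""
    flipped = []
    for b in bits:
        if rng.random() < rate:
            flipped.append("1" if b == "0" else "0")
        else:
            flipped.append(b)
    return "".join(flipped)

if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so that repeated runs match
    child = crossover("11110000", "00001111", 3)
    print(child, mutate(child, 0.1, rng))
```

Both operators preserve the string length, so any offspring can still be interpreted as a candidate configuration of the same size.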
5. Conventional vs. CGT-Pruned GA
- Conventional GA: searches the whole space to evolve a working design or repair
  - Information about resource suitability may accelerate the search
- CGT-Pruned GA: prefers resources of higher fitness to evolve a working design or repair
- Q. How to obtain resource fitness information?
- A. Using group testing techniques.
  - Combinatorial Group Testing identifies a decreasing group of defectives by iterative refinement
  - Tests on subsets of suspects
  - Is expected to take less time: faster design and faster repair
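The idea of narrowing a group of suspects through tests on subsets can be illustrated with a generic adaptive halving scheme. This is a textbook-style sketch, not the competitive group testing method used in this work: it assumes a hypothetical oracle `group_fails(subset)` that reports whether a subset contains at least one defective resource.

```python
def find_defectives(items, group_fails):
    """Return the defective items by recursively halving failing groups."""
    items = list(items)
    if not items or not group_fails(items):
        return []            # a passing group clears all of its members
    if len(items) == 1:
        return items         # a failing singleton is a defective
    mid = len(items) // 2
    return (find_defectives(items[:mid], group_fails)
            + find_defectives(items[mid:], group_fails))

if __name__ == "__main__":
    faulty = {5, 40}         # hidden ground truth for the demonstration
    tests_run = 0

    def group_fails(subset):
        global tests_run
        tests_run += 1
        return any(r in faulty for r in subset)

    found = find_defectives(range(64), group_fails)
    print(found, "identified with", tests_run, "group tests")
```

When defectives are rare, the number of group tests grows roughly with the number of defectives times the logarithm of the pool size, which is why the suspect set shrinks much faster than with one-by-one testing.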
6. CGT-Pruned GA Simulator
7. Experimental Setup
8. CGT-Pruned Refurbishment
- Isolate suspect resources and avoid using them
- Hypothesis
  - CGT-Pruned GA Repair evolves a full-fitness circuit faster than Conventional GA Repair
- Results show a performance improvement with CGT-Pruned Repair
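One way to picture pruning is a mutation operator that never moves logic onto a suspect resource. The sketch below assumes a hypothetical representation in which a design is a list of cell indices, one per logic block, and `suspects` is the set of cells flagged by group testing; neither the representation nor the function name comes from the slides.

```python
import random

def pruned_mutation(assignment, num_cells, suspects, rng):
    """Move one randomly chosen logic block to a free, non-suspect cell."""
    used = set(assignment)
    candidates = [c for c in range(num_cells)
                  if c not in used and c not in suspects]
    if not candidates:
        return list(assignment)   # nothing safe to move to; leave unchanged
    child = list(assignment)
    child[rng.randrange(len(child))] = rng.choice(candidates)
    return child

if __name__ == "__main__":
    rng = random.Random(0)
    design = [0, 1, 2, 3]         # logic blocks placed on cells 0..3
    print(pruned_mutation(design, num_cells=8, suspects={2, 5}, rng=rng))
```

A conventional mutation would draw the new cell from all free cells; restricting the draw to non-suspect cells is what shrinks the search space.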
9. Results: Conventional vs. CGT-Pruned Repair
10. Achieving Refurbishment with Cell Swapping
- Isolate and swap suspect resources
- Cell Swapping Operator
  - Copy the suspect cell's configuration to another, unused cell
  - GA searches for a routing strategy to re-route interconnect to the previously unused cell
- Refurbishment with Cell Swapping
  - Swap suspect cells one by one and evaluate fitness until full fitness is evolved
  - If swapping all suspect cells does not realize complete refurbishment, then employ other GA operators
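The swap-and-evaluate loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: a design is a list of cell indices, `fitness` is a caller-supplied function, and the routing search that the GA performs after each swap is omitted entirely. All names are hypothetical.

```python
def refurbish_by_swapping(assignment, suspects, spare_cells, fitness, full_fitness):
    """Swap suspect cells one by one until full fitness is reached.

    Returns (new_assignment, repaired). When repaired is False, the caller
    would fall back to other GA operators.
    """
    current = list(assignment)
    spares = [c for c in spare_cells if c not in suspects and c not in current]
    for idx, cell in enumerate(assignment):
        if fitness(current) >= full_fitness:
            break                          # already fully refurbished
        if cell in suspects and spares:
            current[idx] = spares.pop(0)   # copy the block to an unused cell
    return current, fitness(current) >= full_fitness

if __name__ == "__main__":
    faulty = {2}                           # hidden ground truth for the demo
    score = lambda design: sum(1 for c in design if c not in faulty)
    print(refurbish_by_swapping([0, 1, 2, 3], {1, 2}, [4, 5, 6], score, 4))
```

In the demonstration, cell 1 is a false suspect and cell 2 is truly faulty; both get swapped because the loop only stops once the fitness function reports full fitness.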
11. Repair Progress
12. CGT-Pruned GA Design
- Evolve the entire circuit design from scratch
- Avoid suspect resources and take advantage of resource redundancy within the FPGA
- CGT-Pruning outperforms conventional GA-based techniques
13. Results: Conventional vs. CGT-Pruned Design
14. Comparison of Performance: Number of Generations for Repair
- More than 70% of the experiments benefited substantially from resource information generated using CGT
15. Results Summary
- As opposed to Conventional GAs, CGT-Pruned GAs
  - Completely refurbish configurations in 38% fewer generations
  - Design fully functional configurations in 16% fewer generations
- Faulty resources are eliminated from
  - The pool of unused resources in the case of repair, as opposed to the pool of all resources in the case of design
- Repair complexity vs. Design complexity
  - Repair complexity << Design complexity
  - Repairs were realized in one-fifth of the time required for Design
16. Backup Slides
17. Motivation
- Mission-critical embedded systems require high reliability and availability
- Characteristics of the operating environment may induce hardware failures
  - Aging, manufacturing defects, etc.
- System Reliability
  - Fault Avoidance. Always possible? No
  - Design Margin. Always adequate? No
  - Modular Redundancy. Always recoverable? No
  - Fault Refurbishment. Highly flexible? Yes, but technically challenging to achieve
18. Group Testing Techniques
- Competitive Group Testing
  - Algorithm based on group testing methods
  - Uses competition between configurations
  - Temporal information stored in the H matrix, with entries $H_{i,j}$
  - Successive intersection
  - Monitors the health history of resources, which represents resource fitness
- Simulated using the C programming language and GSL functions [Sharma-06]
- Relative fitness of resource $(i,j) \propto 1/H_{i,j}$
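The inverse relation between a resource's history entry and its relative fitness can be sketched as below. Two assumptions are made that the slides do not state: H[i][j] is taken to count how often resource (i, j) was used by a configuration that showed a discrepancy, and 1/(1 + H) is used instead of 1/H so that a resource with no recorded failures does not cause a division by zero. The function names are hypothetical.

```python
def record_outcome(H, cells_used, discrepancy):
    """Increment the history of every cell used by a failing configuration."""
    if discrepancy:
        for i, j in cells_used:
            H[i][j] += 1

def relative_fitness(H):
    """Normalized weights proportional to 1 / (1 + H[i][j]) (assumed form)."""
    raw = [[1.0 / (1 + h) for h in row] for row in H]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

if __name__ == "__main__":
    H = [[0, 0], [0, 0]]
    record_outcome(H, [(0, 0), (0, 1)], discrepancy=True)
    record_outcome(H, [(0, 0), (1, 1)], discrepancy=True)
    record_outcome(H, [(1, 0)], discrepancy=False)
    print(H)
    print(relative_fitness(H))
```

Cell (0, 0) appears in both failing configurations, so it accumulates the largest history value and receives the lowest relative fitness; a GA could then sample resources in proportion to these weights.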
19. Three Fast Runs of the CGT-Pruned GA Repair
- The GA evolves to a relatively high fitness within the first few hundred generations, but takes significantly more generations to reach the maximum fitness
20. References
- [1] T. C. Fogarty, J. F. Miller, and P. Thomson, "Evolving Digital Logic Circuits on Xilinx 6000 Family FPGAs," in Proceedings of the 2nd Online Conference on Soft Computing, 23-27 June 1997.
- [2] S. Vigander, "Evolutionary Fault Repair in Space Applications," Master's Thesis, Dept. of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, 2001.
- [3] C. A. Sharma and R. F. DeMara, "A Combinatorial Group Testing Method for FPGA Fault Location," accepted to International Conference on Advances in Computer Science and Technology (ACST 2006), Puerto Vallarta, Mexico, January 23-25, 2006.
- [4] S. Mitra and E. J. McCluskey, "Which Concurrent Error Detection Scheme to Choose?," in Proceedings of the International Test Conference 2000, p. 985, October 2000.
- [5] D. Du and F. K. Hwang, Combinatorial Group Testing and its Applications, volume 12 of Series on Applied Mathematics. World Scientific, 2000.
- [6] A. B. Kahng and S. Reda, "Combinatorial Group Testing Methods for the BIST Diagnosis Problem," in Proceedings of the Asia and South Pacific Design Automation Conference, January 2004.
- [7] D. Keymeulen, R. S. Zebulum, Y. Jin, and A. Stoica, "Fault-Tolerant Evolvable Hardware Using Field-Programmable Transistor Arrays," IEEE Transactions on Reliability, Vol. 49, No. 3, September 2000.
- [8] J. Lohn, G. Larchev, and R. DeMara, "Evolutionary Fault Recovery in a Virtex FPGA Using a Representation that Incorporates Routing," in Proceedings of the International Parallel and Distributed Processing Symposium, 22-26 April 2003.
- [9] J. Lach, W. H. Mangione-Smith, and M. Potkonjak, "Low Overhead Fault-Tolerant FPGA Systems," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 6, No. 2, June 1998.
- [10] M. Abramovici, J. M. Emmert, and C. E. Stroud, "Roving STARs: An Integrated Approach to On-Line Testing, Diagnosis, and Fault Tolerance for FPGAs in Adaptive Computing Systems," in The Third NASA/DoD Workshop on Evolvable Hardware, Long Beach, California, 2001.
21. Previous Work
- Fault Tolerant Design and Detection Characteristics
- Incorporates resource performance information
22. Previous Work
- Fault Recovery Characteristics
23. Our Goal: Autonomous FPGA Refurbishment
Increase availability without carrying pre-configured spares
- Everyday example
  - Redundancy: spare tires
  - Refurbishment: can of fix-a-flat
- Overhead from Unutilized Spares (weight, size, power)
  - Redundancy: increases with amount of spare capacity
  - Refurbishment: weakly-related to number recovery capacity
- Granularity of Fault Coverage (resolution where fault handled)
  - Redundancy: restricted at design-time
  - Refurbishment: variable at recovery-time
- Fault-Resolution Latency (availability via downtime required to handle fault)
  - Redundancy: based on time required to select spare resource
  - Refurbishment: based on time required to find suitable recovery
- Quality of Repair (likelihood and completeness)
  - Redundancy: determined by adequacy of spares available (?)
  - Refurbishment: affected by multiple characteristics (+ or -)
- Autonomous Operation (fix without outside intervention)
  - Redundancy: yes
  - Refurbishment: yes
24. GA Success Stories
- Commercial Applications
  - Nextel: frequency allocation for cellular phone networks; $15M predicted savings in NY market
  - Pratt & Whitney: turbine engine design; engineer: 8 weeks, GA: 2 days with 3x improvement
  - International Truck: production scheduling improved by 90% in 5 plants
- NASA: superior Jupiter trajectory optimization, antennas, FPGAs
- Koza: 25 instances showing human-competitive performance, such as analog circuit design, amplifiers, filters
25. Adaptive GA Design
- Arithmetic mean for twenty experiments
- Standard deviation for twenty experiments
26. Analysis Metrics
27. CGT-Pruned GA Simulator
- C-based console application
- Consists of
  - Combinatorial Group Testing component
    - Uses GNU Scientific Library (GSL)
  - Genetic Algorithm component
    - Object-oriented architecture that models FPGA resources
- Modes of Operation
  - CGT-Pruned GA Repair
    - Use CGT to isolate suspect resources
    - Avoid use of suspect-faulty resources in the design refurbishment process
  - CGT-Pruned GA Repair with Cell Swapping
    - Swap suspect-faulty resources with previously unused resources to evolve a recovery
  - CGT-Pruned GA Design
    - Evolve a new working design while avoiding suspect resources