Title: http://www.cs.virginia.edu/~krw7c/avf.html
Dynamic Prediction of Architectural Vulnerability
Kristen Walcott, Greg Humphreys, Sudhanva Gurumurthi
University of Virginia
{walcott, humper, gurumurthi}@cs.virginia.edu
Challenge
As soft errors become more of a problem,
protection will be needed even for everyday
PCs. Providing total redundancy is too expensive
and assumes that AVF is 100 percent. Our work shows
that AVF varies over time and across
applications.
- Transient faults due to particle strikes are a key challenge in microprocessor design.
- As transistor counts increase exponentially, per-chip faults are a growing burden.
- Spatial and temporal redundancy techniques are used to protect against faults.
- Redundancy techniques assume that any fault will result in a visible program error (i.e., the Architectural Vulnerability Factor (AVF) is 100 percent).
- Over-design can hurt performance and drain power.
2 SimPoints of bzip2
Rising Problem
Dynamic AVF Prediction
Outliers
(Correlation to AVF)
Intel Corporation
Microarchitectural Metrics
FIT (Failure in Time): 1 failure in a billion
hours
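To make the FIT unit concrete, the sketch below converts a FIT rate into a mean time to failure and scales a raw fault rate by AVF. The function names and the sample numbers are illustrative assumptions, not values from this poster. The scaling step assumes AVF is read as the fraction of faults that become visible program errors.

```python
HOURS_PER_BILLION = 1e9   # 1 FIT = 1 failure per billion device-hours
HOURS_PER_YEAR = 24 * 365

def fit_to_mttf_hours(fit):
    """Mean time to failure, in hours, for a given FIT rate."""
    return HOURS_PER_BILLION / fit

def derated_fit(raw_fit, avf):
    """Scale a raw fault rate by AVF (a fraction between 0 and 1)."""
    return raw_fit * avf

# Hypothetical inputs, chosen only for illustration.
raw_fit = 1000.0   # assumed raw fault rate of some structure
avf = 0.25         # assumed vulnerability of that structure

effective_fit = derated_fit(raw_fit, avf)
mttf_years = fit_to_mttf_hours(effective_fit) / HOURS_PER_YEAR
print(f"effective FIT: {effective_fit}, MTTF: {mttf_years:.1f} years")
```

Treating AVF as 100 percent corresponds to `avf = 1.0`, which is the assumption full redundancy makes.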
Prediction Results
We identify strong correlations between
structural AVF values and a small set of
processor metrics.
[Figure: Particle Strike Causes Bit Flip! Flowchart with yes/no branches; outcome: Detection Correction]
Using linear and quadratic regression, we
determined an AVF characterization that uses only
a few easily measurable variables. These
characterizations can be used to predict AVF
accurately.
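As a rough illustration of fitting AVF against one measurable variable, here is a minimal least-squares sketch for linear and quadratic models in plain Python. The metric name `occupancy` and all data points are synthetic and hypothetical; the poster does not list its metrics or data in this text, and this is not the authors' regression code.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit of ys ~ sum_k c[k] * x**k; returns [c0, c1, ...]."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][k] = xs[i] ** k.
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def sum_squared_error(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys))

# Synthetic data: a made-up metric and a made-up quadratic AVF response.
occupancy = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
avf = [0.05 + 0.5 * x + 0.2 * x * x for x in occupancy]

linear = fit_polynomial(occupancy, avf, 1)
quadratic = fit_polynomial(occupancy, avf, 2)
print("linear SSE:   ", sum_squared_error(linear, occupancy, avf))
print("quadratic SSE:", sum_squared_error(quadratic, occupancy, avf))
```

With several metrics, the same normal-equation approach extends to one column per metric; a numerical library would be the more usual choice at that point.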
[Flowchart branch labels: yes / no; outcome: Detection only]
galgel benchmark
What bits matter?
Future Work
With an accurate predictor, redundancy may be
turned on only when vulnerability is high.
Preliminary results show that partial redundancy
provides a significant performance boost over
full redundancy. Next we will perform a more
rigorous exploration of the design space of
partial redundant multithreading implementations
and investigate redundancy toggling policies.
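One possible toggling policy, sketched here only to make the idea concrete, enables redundancy when predicted AVF rises above one threshold and disables it when AVF falls below a lower one. The hysteresis scheme, the threshold values, and the function name are assumptions for illustration; the poster leaves the choice of policy to future work.

```python
def toggle_redundancy(predicted_avf, on_threshold=0.30, off_threshold=0.20):
    """Return one on/off decision per interval of predicted AVF.

    Two thresholds (hysteresis) keep redundancy from flapping when the
    prediction hovers near a single cutoff. Both values are hypothetical.
    """
    enabled = False
    decisions = []
    for avf in predicted_avf:
        if not enabled and avf >= on_threshold:
            enabled = True       # vulnerability is high: turn redundancy on
        elif enabled and avf <= off_threshold:
            enabled = False      # vulnerability is low again: turn it off
        decisions.append(enabled)
    return decisions

# Made-up per-interval predictions, for illustration only.
trace = [0.10, 0.35, 0.25, 0.15, 0.40]
decisions = toggle_redundancy(trace)
print(decisions)
```

The fraction of intervals with redundancy enabled gives a first estimate of how much of the full-redundancy overhead such a policy would still pay.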
Calculating Vulnerability
Computer Science at the University of Virginia
http://www.cs.virginia.edu/~krw7c/avf.html