Title: SMARTS: Accelerating Microarchitecture Simulation via Rigorous Statistical Sampling
1 SMARTS: Accelerating Microarchitecture Simulation via Rigorous Statistical Sampling
- Roland E. Wunderlich, Thomas F. Wenisch, Babak Falsafi, James C. Hoe
- Computer Architecture Laboratory
- Carnegie Mellon University, Pittsburgh, PA
- Joo hyung Kim
2 Contents
- Introduction
- Statistical sampling
- The SMARTS framework
- SMARTS in practice
- Using SMARTS
- Conclusion
3 Introduction
- Current approaches
- Abbreviated instruction execution streams
- Fewer or smaller input sets
- Shortcomings
- On the efficiency front, large sampling units
- On the accuracy front, no tight error bounds
- The SMARTS approach
- Statistical sampling theory
- Exact and constructive procedure
4 Statistical sampling
- Theory of sampling
- Choosing a minimal but representative sample to achieve a quantifiable accuracy and precision in the estimate
- Does not presume a normally distributed population
- Our goal
- Identify a minimal but representative sample from the population for microarchitecture simulation
- Establish a confidence level for the error on sample estimates
Sampling variables
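The relationship between sample size, variability, and confidence can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual large-sample (central limit theorem) confidence interval and ignores any finite-population correction; the function names are hypothetical and do not come from the SMARTS tools.

```python
import math
from statistics import NormalDist, mean, stdev


def z_score(confidence):
    """Two-sided standard-normal quantile for a given confidence level."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the sample mean."""
    return stdev(samples) / mean(samples)


def relative_half_width(samples, confidence):
    """Confidence-interval half-width expressed as a fraction of the mean."""
    cv = coefficient_of_variation(samples)
    return z_score(confidence) * cv / math.sqrt(len(samples))


def required_sample_size(cv, rel_error, confidence):
    """Smallest n for which z * cv / sqrt(n) <= rel_error."""
    return math.ceil((z_score(confidence) * cv / rel_error) ** 2)


if __name__ == "__main__":
    # Hypothetical input: a population whose coefficient of variation is 1.0.
    print(required_sample_size(cv=1.0, rel_error=0.03, confidence=0.997))
```

The required sample size grows with the square of the coefficient of variation, which is why the variability of the measured quantity matters as much as the target error.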
5 The SMARTS framework
- Technique overview
- Detailed mode
- Functional mode
- Detailed warming shortcomings
- Expensive
- Difficult to derive analytically
- Functional warming
- The cache hierarchies and branch predictors are prime candidates
SMARTS variables
Systematic sampling in SMARTS
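The alternation between functional and detailed modes under systematic sampling can be sketched as a schedule generator. This is an illustrative sketch only: the function, its parameters, and the mode labels are hypothetical, and "functional" stands for fast-forwarding with or without functional warming. It measures every k-th unit of U instructions and precedes each measured unit with up to W instructions of detailed warming.

```python
def smarts_schedule(total_instructions, unit_size, warming, interval, offset=0):
    """Yield (mode, start, length) phases for one systematic-sampling run.

    unit_size is U, warming is W, interval is k, and offset selects which
    unit within the first interval is measured. All names are illustrative.
    """
    period = unit_size * interval
    position = 0
    unit_start = offset * unit_size
    while unit_start + unit_size <= total_instructions:
        # Detailed warming is clipped so it never overlaps the previous unit.
        warm_start = max(position, unit_start - warming)
        if warm_start > position:
            yield ("functional", position, warm_start - position)
        if unit_start > warm_start:
            yield ("detailed_warming", warm_start, unit_start - warm_start)
        yield ("measure", unit_start, unit_size)
        position = unit_start + unit_size
        unit_start += period
    if position < total_instructions:
        # Instructions after the last measured unit are only fast-forwarded.
        yield ("functional", position, total_instructions - position)


if __name__ == "__main__":
    for phase in smarts_schedule(10_000, unit_size=1000, warming=2000,
                                 interval=5, offset=2):
        print(phase)
```

Only the "detailed_warming" and "measure" phases pay the cost of detailed simulation, so the schedule makes it easy to see how W and U together determine the detailed-simulation share of a run.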
6 The SMARTS framework
- Benchmarks
- Demonstrate the effectiveness of SMARTS
- CPI and EPI of the SPEC CPU2000 integer and floating-point benchmarks
- SimpleScalar 3.0 sim-outorder simulator
Machine configurations
7 The SMARTS framework
Coefficient of variation of CPI
V_CPI decreases with increasing U because short-term CPI variations within a window of U instructions are hidden by averaging over the sampling unit.
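The averaging effect can be reproduced on a toy trace. The sketch below uses a purely synthetic per-instruction cycle count (an assumption made for illustration, not SPEC CPU2000 data): a short-term spike every fourth instruction plus a slow phase change every 5000 instructions. The helper names are hypothetical.

```python
from statistics import mean, stdev


def unit_cpis(cycles_per_instruction, unit_size):
    """Average per-instruction cycle counts over consecutive units of U instructions."""
    whole = len(cycles_per_instruction) - len(cycles_per_instruction) % unit_size
    return [
        sum(cycles_per_instruction[i:i + unit_size]) / unit_size
        for i in range(0, whole, unit_size)
    ]


def cv(values):
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(values) / mean(values)


# Synthetic trace (assumption): base cost of 1 cycle, 4 extra cycles on every
# fourth instruction, and 1 extra cycle during alternating 5000-instruction phases.
trace = [1 + 4 * (i % 4 == 0) + (i // 5000) % 2 for i in range(20000)]

if __name__ == "__main__":
    for u in (1, 10, 100, 1000):
        print(u, round(cv(unit_cpis(trace, u)), 3))
```

With larger units the short-term spikes average out and only the slow phase behavior remains, so the coefficient of variation falls toward the level set by long-term variation.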
8 The SMARTS framework
Modeled SMARTS simulation rate. The two S_D plots show the simulation rate without functional warming. The S_FW plot shows the simulation rate when using functional warming to bound W.
9 SMARTS in practice
- SMARTSim
- sim-outorder
- Supports functional simulation prior to starting detailed simulation
- SMARTSim
- Repeated transitions back and forth between functional and detailed simulation modes
- Accepts the systematic sampling parameters
- Fast-forwarding options
- Functional simulation
- Functional simulation with warming
10 SMARTS in practice
- Optimal sampling unit size
Optimal U
The left chart shows that the optimal U increases with W. The right chart shows that U = 1000 is a reasonable choice across benchmarks and W.
11 SMARTS in practice
- Effectiveness of detailed warming
Detailed warming requirements without functional warming (8-way). Choosing U = 1000 and n sufficient for a 99.7% confidence interval of ±3%.
12 SMARTS in practice
- Effectiveness of functional warming
- All benchmarks have bias under 2.0%
- Only 6 benchmarks in each configuration exceed 1.0%
CPI bias achieved with functional warming and minimal detailed warming
13 Using SMARTS
- SMARTS procedure
- W is selected to exceed the bounded history of the microarchitectural state
- Setting U = 1000
- Determine n, and correspondingly k
- Correct value for n
- A sampling measurement is made using a generic initial value n_init
- n_tuned for a second run is calculated from the V_x of the initial run
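The two-run procedure above can be sketched as follows. This is a hedged illustration, not the SMARTSim implementation: the function names are hypothetical, the default targets mirror the 99.7% confidence and 3% error figures quoted in these slides (read here as ±3%), and the interval k is computed from a total instruction count rather than a count of sampling units.

```python
import math
from statistics import NormalDist, mean, stdev


def tune_sample_size(initial_cpis, rel_error=0.03, confidence=0.997):
    """Return (V_x, achieved relative half-width, n_tuned) from an initial run.

    initial_cpis holds one CPI measurement per sampling unit of the first run,
    which was taken with a generic sample size n_init = len(initial_cpis).
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    v_x = stdev(initial_cpis) / mean(initial_cpis)
    achieved = z * v_x / math.sqrt(len(initial_cpis))
    n_tuned = math.ceil((z * v_x / rel_error) ** 2)
    return v_x, achieved, n_tuned


def sampling_interval(total_instructions, unit_size, n):
    """Interval k such that about n units of U instructions are measured."""
    return max(1, total_instructions // (unit_size * n))


if __name__ == "__main__":
    # Hypothetical first-run measurements; real values come from the simulator.
    first_run = [1.0, 2.0, 3.0] * 100
    v_x, achieved, n_tuned = tune_sample_size(first_run)
    if achieved > 0.03:
        k = sampling_interval(10**9, unit_size=1000, n=n_tuned)
        print(f"rerun with n={n_tuned}, k={k}")
```

If the first run already meets the target half-width, no second run is needed; otherwise the measured V_x sizes the second run directly.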
14 Using SMARTS
- Comparison to SimPoint
- SimPoint advantages
- Obviates the need for functional warming
- Quick integration into a simulation infrastructure
- Early termination of simulation
- SimPoint shortcomings
- High CPI error
- Unquantifiable confidence in estimates
15 Using SMARTS
- Comparison to SimPoint
- SimPoint has a higher average error (3.7% vs. 0.6%)
- Higher worst-case error (-14.3% for gcc-2)
- A SimPoint estimate based on just a single instance of the basic block sequences yields a large error.
- SMARTS uses the measured coefficient of variation to help gauge both the required sample size and the confidence in the estimates.
Comparison of SMARTS with SimPoint. SimPoint's mean runtime per benchmark is 2.8 hours compared to 5.0 hours for SMARTS.
16 Conclusion
- Evaluation Results
- Average error of 0.64% on CPI and 0.59% on EPI
- Average speedups of 35 and 60 times
- Future simulator designs
- Designers should focus on the simulator's flexibility and realism
- Designers should focus on techniques to speed up fast-forwarding and functional warming