Title: Ronald Barnes
1. Combining Predication and Software Speculation with Dynamic Speculation
- Ronald Barnes
- George Mason University
2. Motivation
- Architectural features empower compilers to express instruction-level parallelism (ILP)
  - Large, architecturally visible register files
  - Predicated instruction sets
  - Explicit speculation
- Compiler schedulers meticulously group instructions for parallel execution
  - Make static assumptions about control flow based on profile information
  - Assume hit latency for memory operations
3. Compiler-Expressed Parallelism
- Compiler can find a significant number of instructions for parallel execution on a 6-issue processor
4. Motivation (cont.)
- Run-time stalls slow the compiler's carefully planned execution
  - Instruction cache misses
  - Branch mispredictions
  - Long, fixed-latency instructions
  - Variable memory-access latency
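
As a rough illustration of how a miss disrupts a schedule built around hit latency, the sketch below models an in-order machine that holds each issue group until all of its sources are available. The function name, register names, and latencies (2-cycle hit, 20-cycle miss) are hypothetical and are not taken from the talk's experiments.

```python
def cycles_in_order(schedule, latency):
    """Count cycles for an in-order machine.

    schedule: list of issue groups; each op is (dest, srcs, kind).
    latency:  dict mapping kind -> cycles until the result is available.
    """
    ready_at = {}  # register -> cycle at which its value becomes available
    cycle = 0
    for group in schedule:
        # The whole group stalls until every source register is ready.
        sources = [s for _, srcs, _ in group for s in srcs]
        start = max([cycle] + [ready_at.get(s, 0) for s in sources])
        for dest, _, kind in group:
            ready_at[dest] = start + latency[kind]
        cycle = start + 1
    return cycle


# A three-group schedule laid out assuming the load hits (hypothetical).
schedule = [
    [("r1", ["r9"], "load"), ("r2", ["r8"], "alu")],
    [("r3", ["r2"], "alu")],
    [("r4", ["r1", "r3"], "alu")],  # first consumer of the load
]

planned = cycles_in_order(schedule, {"load": 2, "alu": 1})
with_miss = cycles_in_order(schedule, {"load": 20, "alu": 1})
print(planned, with_miss)
```

With the assumed hit latency the three groups issue back to back; with the assumed miss latency the last group waits for the load, so the statically planned parallelism is lost to the stall.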
5. Compiler-Expressed Parallelism
- Dynamic stalls (of which cache misses are the most important) drastically reduce observed performance
6. Experimental configuration
- Benchmarks compiled and (unless otherwise stated) aggressively optimized with the IMPACT C compiler
7. Microarchitectural solutions to tolerating memory latency
- Out-of-order EPIC execution [Wang01]
- In-order runahead [Dundas97]
- Multipass [Barnes05]
8. Dynamic Speculation in EPIC
- Many options for tolerating memory latency have characteristics of dynamic speculation/scheduling
  - An instruction stream chosen by branch prediction is speculatively executed
  - This execution (even if it is only prefetching) is speculated ahead of control-flow instructions
  - In particular, speculation occurs during a cache miss that would otherwise cause the processor to stall
- EPIC features are envisioned to statically expose ILP; are they helpful w/ dynamic speculation?
9. EPIC Features
- Predication
  - Allows compiler to effectively manage parallel execution in the presence of control dependences
  - Converts control-flow dependence to data-flow dependence
- Software control speculation
  - Enables compiler to schedule instructions around (otherwise serializing) branches
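
The two features above can be sketched in a few lines. This is only an illustration under stated assumptions: all function names are hypothetical, the guarded writes are modeled with conditional expressions, and the `NAT` poison token is a loose stand-in for a deferred-exception mechanism, not a description of any particular instruction set.

```python
NAT = object()  # hypothetical stand-in for a deferred-exception token


def branchy(a, b):
    # Original control flow: the branch serializes the two arms.
    if a > b:
        r = a - b
    else:
        r = b - a
    return r


def if_converted(a, b):
    # One compare defines a pair of complementary predicates.
    p_taken = a > b
    p_fall = not p_taken
    r = None
    # Both arms are issued; a write takes effect only if its guard is true.
    # The control dependence has become a data dependence on the predicates.
    r = (a - b) if p_taken else r   # (p_taken) sub r = a, b
    r = (b - a) if p_fall else r    # (p_fall)  sub r = b, a
    return r


def spec_load(mem, addr):
    # Speculative load: never faults, returns a poison token instead.
    return mem[addr] if 0 <= addr < len(mem) else NAT


def guarded_read(mem, addr, wanted):
    v = spec_load(mem, addr)        # hoisted above the branch below
    if not wanted:                  # the branch the load was moved across
        return 0                    # poisoned or not, the value is ignored
    if v is NAT:
        raise IndexError(addr)      # deferred exception surfaces at the check
    return v
```

For example, `guarded_read([10, 20], 5, False)` returns 0 even though the hoisted load touched an invalid address, because the fault is deferred until the value is actually needed.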
10. Predication
- [Wang01] discussed some of the problems with predication and out-of-order execution
  - Multiple definitions
  - Predicate define on dependence chain
11. Register renaming and predication
- Register renaming of predicated code requires added complexity in the register alias table (RAT) and select-µops
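
To make the multiple-definition problem concrete, here is a minimal sketch of renaming with inserted select operations. It is a simplification under stated assumptions: the tuple format, the `rename` function, and the physical register names are hypothetical, and predicate registers are left unrenamed for brevity.

```python
from itertools import count


def rename(instrs, num_arch_regs=4):
    """Rename (pred, dest, op, srcs) tuples; pred is None when unguarded."""
    # Register alias table: architectural name -> current physical name.
    rat = {f"r{i}": f"phys{i}" for i in range(num_arch_regs)}
    fresh = count(num_arch_regs)
    out = []
    for pred, dest, op, srcs in instrs:
        phys_srcs = [rat[s] for s in srcs]
        new = f"phys{next(fresh)}"
        out.append((op, new, phys_srcs))
        if pred is not None:
            # A guarded write may not happen, so consumers need either the
            # new value or the previous mapping. A select op merges the two.
            merged = f"phys{next(fresh)}"
            out.append(("select", merged, [pred, new, rat[dest]]))
            new = merged
        rat[dest] = new
    return out, rat


example = [
    ("p1", "r1", "add", ["r2", "r3"]),  # (p1) add r1 = r2, r3
    ("p2", "r1", "sub", ["r2", "r3"]),  # (p2) sub r1 = r2, r3
    (None, "r0", "mov", ["r1"]),        # mov r0 = r1
]
renamed, final_rat = rename(example)
for op in renamed:
    print(op)
```

In this sketch each guarded write costs an extra operation and an extra physical register, and the final `mov` reads the output of the second select, so it waits on both predicates even though only one arm supplies its value.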
12. Compare instruction dependences
- Predication adds a data-flow dependence that hinders instruction dispatch from the issue queue
  - Predicate slip [Wang01] allows dispatch of a predicated instruction before its predicate is computed
  - Instruction cannot bypass to other instructions until its predicate is known
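
The timing effect described on this slide can be written as a small model. This is a sketch under simplifying assumptions (a single instruction, no resource conflicts); the function name and the cycle values in the usage lines are hypothetical.

```python
def dispatch_and_forward(operands_ready, predicate_ready, exec_latency, slip):
    """Return (dispatch_cycle, cycle_result_can_be_bypassed)."""
    if slip:
        # Predicate slip: dispatch as soon as the data operands are ready.
        dispatch = operands_ready
        done = dispatch + exec_latency
        # The result still cannot be bypassed until the predicate is known.
        forward = max(done, predicate_ready)
    else:
        # Without slip, the predicate is one more source to wait for.
        dispatch = max(operands_ready, predicate_ready)
        forward = dispatch + exec_latency
    return dispatch, forward


# Operands ready at cycle 0, predicate at cycle 3, 4-cycle operation.
no_slip = dispatch_and_forward(0, 3, 4, slip=False)
with_slip = dispatch_and_forward(0, 3, 4, slip=True)
print(no_slip, with_slip)
```

Under these assumptions slip hides the wait for the predicate behind a long operation, while for a single-cycle operation the consumer still waits until the predicate resolves.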
13. Dependence with distance from miss
14. Dynamic speculation performance
15. Branch prediction
- Biggest difference: twolf, < 5% performance improvement
16. Using aggressive promotion to reduce dependences
17. Performance w/ aggressive promotion
18. Speculation
- Fills up reservation tables with off-path instructions
19. Fullness of issue queues
20. Effect of speculation on dynamic scheduling
21. Conclusions
- Look for more EPIC-friendly ways to achieve benefits of dynamic speculation
  - Runahead execution/Multipass pipelines
  - Prefetching helper threads
- Researchers should consider characteristics of compilation
  - Dynamic systems should be specifically targeted (e.g., less predication, more aggressive promotion, less aggressive static speculation)
22. Runahead execution performance