Title: Branch Prediction Techniques
1Branch Prediction Techniques
- Alex Ramirez
- (based on the work of others)
- UPC-Barcelona
2Motivation (Run-time)
- Pipelined execution
- A new instruction enters the pipeline every cycle...
- ...but still takes several cycles to execute
- Control flow changes
- Two possible paths after a branch is fetched
- Introduces pipeline "bubbles"
- Branch delay slots
- Prediction offers a chance to avoid these bubbles
- Problem increases with superscalar execution
[Figure: a branch is fetched but takes N cycles to execute, leaving a pipeline bubble]
3Motivation (Compile-time)
- Many compiler optimizations depend on accurate branch prediction
- Trace / Superblock scheduling
- Routine inline expansion
- Inter-procedural register allocation
- Code transformations
- Code layout optimizations
4Performance Metrics
- Misprediction rate
- Mispredicted branches per executed branch
- Unfortunately, the most commonly reported metric
- Instructions per mispredicted branch
- Gives a better idea of the program behaviour
- Branches are not evenly spaced
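The two metrics above can be compared on a small worked example. This is a minimal sketch: the function names and the instruction, branch and misprediction counts are hypothetical, chosen only to show that two programs with the same misprediction rate can differ in how often the pipeline is actually disturbed.

```python
def misprediction_rate(mispredicted, branches):
    """Mispredicted branches per executed branch."""
    return mispredicted / branches


def instructions_per_misprediction(instructions, mispredicted):
    """Average number of instructions between two mispredictions."""
    return instructions / mispredicted


# Hypothetical counts: same misprediction rate, different branch density.
dense = {"instructions": 1_000_000, "branches": 250_000, "mispredicted": 12_500}
sparse = {"instructions": 1_000_000, "branches": 50_000, "mispredicted": 2_500}

for name, c in (("dense", dense), ("sparse", sparse)):
    rate = misprediction_rate(c["mispredicted"], c["branches"])
    ipm = instructions_per_misprediction(c["instructions"], c["mispredicted"])
    print(f"{name}: rate={rate:.1%}, instructions per misprediction={ipm:.0f}")
```

Both programs mispredict 5% of their branches, but the branch-dense one hits a misprediction every 80 instructions and the sparse one every 400.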
5Talk Outline
- Static branch prediction
- Simple heuristics
- Complex heuristics
- Semi-static branch prediction
- Profile-based
- Dynamic branch prediction
- Bimodal predictors
- Two level predictors
- Hybrid predictors
- Multiple branch prediction
- Adapted branch predictors
- Tree-like subgraph predictors
- Trace predictors
6Simple Static Predictors [J.E.Smith, ISCA'81]
- Simple heuristics
- Always taken
- Always not taken
- Backwards taken / Forward not taken
- Relies on the compiler to arrange the code following this assertion
- Certain opcodes taken
- Programmer provided hints
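The first three heuristics above can be sketched as functions of the branch and target addresses. The trace is hypothetical (a loop back-edge that exits once, plus a forward error check that rarely fires), and the function names are illustrative only.

```python
def predict_always_taken(branch_pc, target_pc):
    return True


def predict_always_not_taken(branch_pc, target_pc):
    return False


def predict_btfn(branch_pc, target_pc):
    """Backwards taken / forward not taken: a target at a lower address
    is assumed to be a loop back-edge."""
    return target_pc < branch_pc


# Hypothetical trace of (branch address, target address, actual outcome).
trace = (
    [(0x40, 0x10, True)] * 9 + [(0x40, 0x10, False)]          # loop back-edge
    + [(0x20, 0x60, False)] * 8 + [(0x20, 0x60, True)] * 2    # forward check
)


def accuracy(predictor, trace):
    """Fraction of branches in the trace whose outcome was predicted."""
    hits = sum(predictor(pc, tgt) == taken for pc, tgt, taken in trace)
    return hits / len(trace)


for p in (predict_always_taken, predict_always_not_taken, predict_btfn):
    print(p.__name__, accuracy(p, trace))
```

On this trace the backwards taken / forward not taken rule gets 17 of 20 branches right, against 11 for always taken and 9 for always not taken.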
7Simple Static Predictors (II)
8Simple Static Predictors (III): Optimized code layout
9Complex Static Predictors [Ball & Larus, PLDI'93]
- Based on
- Control flow analysis of the code to determine loops
- Classify branches in
- Pointer comparison: predict false
- Avoid executing subroutine calls (e.g. exception handlers)
- Less than zero: predict false; greater than zero: predict true (error codes are less than zero)
- FP equal: predict false
- Avoid returning from a subroutine (recursion base case)
- Avoid blocks containing a store instruction
- Go towards loop headers, avoid exiting loops
- Favor reuse of the branch operand
- Misprediction rate about 20%
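A rough sketch of how such heuristics might be chained follows. Everything here is an assumption made for illustration: the `Successor` and `Branch` records, the condition names, the reading of "false" as "branch not taken", and above all the priority order, which is not taken from the slide or from the cited paper. The operand-reuse heuristic is left out.

```python
from dataclasses import dataclass


@dataclass
class Successor:
    """Hypothetical summary of the block reached on one side of a branch."""
    has_call: bool = False
    has_return: bool = False
    has_store: bool = False
    is_loop_header: bool = False
    exits_loop: bool = False


@dataclass
class Branch:
    condition: str          # e.g. "ptr_eq", "lt_zero", "gt_zero", "fp_eq", "other"
    taken: Successor
    fallthrough: Successor


def avoid(branch, attr):
    """Predict away from the successor that has `attr`, if exactly one does."""
    t, f = getattr(branch.taken, attr), getattr(branch.fallthrough, attr)
    if t != f:
        return not t        # predict taken only when the fall-through has attr
    return None


def predict(branch):
    """Return True (taken), False (not taken) or None (no heuristic applies)."""
    # Go towards loop headers, avoid exiting loops.
    if branch.taken.is_loop_header != branch.fallthrough.is_loop_header:
        return branch.taken.is_loop_header
    guess = avoid(branch, "exits_loop")
    if guess is not None:
        return guess
    # Opcode heuristics: these conditions are predicted false or true.
    if branch.condition in ("ptr_eq", "lt_zero", "fp_eq"):
        return False
    if branch.condition == "gt_zero":
        return True
    # Successor heuristics: avoid calls, returns and stores.
    for attr in ("has_call", "has_return", "has_store"):
        guess = avoid(branch, attr)
        if guess is not None:
            return guess
    return None


error_check = Branch("lt_zero", Successor(has_call=True), Successor())
print(predict(error_check))
```

Branches for which no heuristic fires return `None`; a real compiler would need a default direction for them.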
10Improving Static Predictors [Mueller & Whalley PLDI'92/'95, Krall PLDI'94, Young & Smith ASPLOS'94]
- The compiler can lay out the code to match the static prediction
- Code replicating techniques allow
- Unconditional jump elimination
- Replicate code after the if-then-else convergence in both conditional paths
- Change unconditional loop edges to conditional ones
- Conditional branch elimination
- In some situations the branch outcome is known
- Test a variable which has not been modified or has just been set
- Static prediction using branch history information
- Replicate if-then-else bodies to know the previous branch outcomes
11Removing Unconditional Jumps
12Removing Unconditional Jumps (II)
13Removing Conditional Branches
Original:
flag = 1
while (cnd1 && flag) { A; if (cnd2) { B; flag = 0; } C; }
After code replication:
while (cnd1) { A; if (cnd2) { B; C; break; } C; }
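The transformation can be checked on a small model. This sketch assumes the slide's first loop is guarded by `cnd1 && flag`, with `flag` cleared after B, and that the second loop replicates C before a `break`. The loop counter `i`, the condition functions and the test counting are hypothetical additions for illustration.

```python
def with_flag(cnd1, cnd2):
    """Original loop: the exit is signalled through `flag`."""
    trace, tests, i, flag = [], 0, 0, True
    while True:
        tests += 1                  # conditional branch on cnd1
        if not cnd1(i):
            break
        tests += 1                  # conditional branch on flag
        if not flag:
            break
        trace.append("A")
        tests += 1                  # conditional branch on cnd2
        if cnd2(i):
            trace.append("B")
            flag = False
        trace.append("C")
        i += 1
    return trace, tests


def replicated(cnd1, cnd2):
    """Loop after replicating C: the branch on `flag` is gone."""
    trace, tests, i = [], 0, 0
    while True:
        tests += 1                  # conditional branch on cnd1
        if not cnd1(i):
            break
        trace.append("A")
        tests += 1                  # conditional branch on cnd2
        if cnd2(i):
            trace.append("B")
            trace.append("C")       # replicated copy of C
            break
        trace.append("C")
        i += 1
    return trace, tests


before = with_flag(lambda i: i < 5, lambda i: i == 2)
after = replicated(lambda i: i < 5, lambda i: i == 2)
print(before, after)
```

Both versions execute the same sequence of A, B and C blocks, but the replicated loop evaluates fewer conditional branches (6 against 11 for these inputs). The equivalence relies on `cnd1` having no side effects.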
14Static Prediction Using Branch History
[Figure: control flow graphs with the tests "if x > 10" and "if x > 20" and their True / False edges, with the "if x > 20" test replicated]
15Semi-static Predictors [McFarling & Hennessy ISCA'86, Fisher & Freudenberger ASPLOS'92]
- Use profile information from previous program runs
- Branches tend to behave in a fixed way
- Branches tend to behave in the same way across different program executions
- Performance metric
- How close to a perfect static predictor
- Best direction to statically predict a branch
- Real static predictors
- Based on a past dataset / group of datasets
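The comparison between a real profile-based predictor and the perfect static predictor can be sketched as follows. The two runs are hypothetical traces of (branch address, outcome) pairs, and the helper names are illustrative.

```python
from collections import Counter, defaultdict


def best_directions(profile):
    """Per-branch majority direction from a profile of (pc, taken) pairs."""
    counts = defaultdict(Counter)
    for pc, taken in profile:
        counts[pc][taken] += 1
    return {pc: c[True] >= c[False] for pc, c in counts.items()}


def accuracy(directions, run, default=False):
    """Fraction of branches in `run` matching their static direction."""
    hits = sum(directions.get(pc, default) == taken for pc, taken in run)
    return hits / len(run)


# Hypothetical training run and evaluation run of the same program.
training = ([(0x10, True)] * 8 + [(0x10, False)] * 2
            + [(0x20, False)] * 6 + [(0x20, True)] * 4)
evaluation = ([(0x10, True)] * 7 + [(0x10, False)] * 3
              + [(0x20, True)] * 6 + [(0x20, False)] * 4)

real = accuracy(best_directions(training), evaluation)       # past dataset
perfect = accuracy(best_directions(evaluation), evaluation)  # same dataset

print(f"real={real:.2f} perfect={perfect:.2f}")
```

Here the branch at 0x20 changes its dominant direction between runs, so the profile-trained predictor reaches 0.55 where the perfect static predictor reaches 0.65.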
16Semi-static Predictors (II)
17Bimodal Branch Predictors
- Dynamically store information about the branch behaviour
- Branches tend to behave in a fixed way
- Branches tend to behave in the same way across program execution
- Index a Pattern History Table using the branch address
- 1 bit: branch behaves as it did last time
- Saturating 2 bit counter: branch behaves as it usually does
- Usually used as an extension to the BTB
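The difference between the 1-bit and the 2-bit scheme shows up on a loop branch. This is a minimal sketch under stated assumptions: the table size, the initial counter value, the 4-byte instruction alignment and the use of an unbounded dictionary for the 1-bit predictor are all illustrative choices.

```python
class LastOutcomePredictor:
    """1 bit per branch: predict what the branch did last time.
    Uses an unbounded dict instead of a finite table (simplification)."""

    def __init__(self):
        self.last = {}

    def predict(self, pc):
        return self.last.get(pc, False)

    def update(self, pc, taken):
        self.last[pc] = taken


class BimodalPredictor:
    """Pattern History Table of 2-bit saturating counters indexed by address."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.pht = [1] * (1 << index_bits)    # start weakly not taken (assumption)

    def _index(self, pc):
        return (pc >> 2) & self.mask          # assumes 4-byte instructions

    def predict(self, pc):
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)


def run(predictor, trace):
    """Count mispredictions, updating the predictor after each branch."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# A loop branch taken 9 times and then not taken, with the loop run 3 times.
loop = ([(0x400, True)] * 9 + [(0x400, False)]) * 3

print(run(LastOutcomePredictor(), loop), run(BimodalPredictor(), loop))
```

The 1-bit predictor misses twice per loop execution (at the exit and again on re-entry), while the 2-bit counter absorbs the single not-taken outcome and misses only at the exit once warmed up.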
18Bimodal Predictors BTB
[Figure: BTB accessed with the Branch Address]
19Bimodal Branch Predictors (II)
20Two-level Branch Predictors [Pan, So & Rahmeh ASPLOS'92, Yeh & Patt ISCA'93]
- A branch outcome depends on the outcomes of previous branches
- First level: Branch History Registers (BHR)
- Global history / Branch correlation: past executions of all branches
- Self history / Private history: past executions of the same branch
- Second level: Pattern History Table (PHT)
- Use first level information to index a table
- Possibly XOR with the branch address [McFarling '93]
- PHT: usually saturating 2 bit counters
- Also private, shared or global
21Gshare Two-level Predictor
[Figure: gshare predictor. Branch History and Branch Address are combined into the Index]
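A gshare-style predictor can be sketched in a few lines: a global history register is XORed with the branch address to index a table of 2-bit counters. The history length, initial counter value and address shift are assumptions, and the alternating trace is hypothetical.

```python
class GsharePredictor:
    def __init__(self, history_bits=8):
        self.mask = (1 << history_bits) - 1
        self.history = 0                          # global branch history register
        self.pht = [1] * (1 << history_bits)      # 2-bit saturating counters

    def _index(self, pc):
        # XOR of branch address and global history (4-byte instructions assumed)
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def mispredictions(predictor, trace):
    """Return one boolean per branch: True where the prediction was wrong."""
    wrong = []
    for pc, taken in trace:
        wrong.append(predictor.predict(pc) != taken)
        predictor.update(pc, taken)
    return wrong


# Hypothetical branch that alternates taken / not taken.
alternating = [(0x400, i % 2 == 0) for i in range(200)]
wrong = mispredictions(GsharePredictor(), alternating)
print(sum(wrong[:100]), sum(wrong[100:]))
```

Once the history register is warmed up, each history pattern maps to its own counter, so the alternating branch is predicted without error. An address-indexed 2-bit counter alone could not learn this pattern.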
22PAp Two-level Predictor
[Figure: PAp predictor with the labels Branch Address, Branch History, PHT]
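A PAp-style organization keeps a private history register and a private PHT for every branch. The sketch below idealizes both levels as unbounded dictionaries keyed by branch address, which is an assumption (real tables are finite and indexed by address bits). The history length and the trace are also hypothetical.

```python
from collections import defaultdict


class PApPredictor:
    def __init__(self, history_bits=4):
        self.mask = (1 << history_bits) - 1
        size = 1 << history_bits
        self.bhr = defaultdict(int)                    # one history register per branch
        self.pht = defaultdict(lambda: [1] * size)     # one PHT per branch

    def predict(self, pc):
        return self.pht[pc][self.bhr[pc]] >= 2

    def update(self, pc, taken):
        counters, h = self.pht[pc], self.bhr[pc]
        if taken:
            counters[h] = min(3, counters[h] + 1)
        else:
            counters[h] = max(0, counters[h] - 1)
        self.bhr[pc] = ((h << 1) | int(taken)) & self.mask


def mispredictions(predictor, trace):
    """Return one boolean per branch: True where the prediction was wrong."""
    wrong = []
    for pc, taken in trace:
        wrong.append(predictor.predict(pc) != taken)
        predictor.update(pc, taken)
    return wrong


# Two interleaved branches: one repeats taken, taken, not taken;
# the other alternates. Each pattern is visible only in the branch's own history.
trace = []
for i in range(300):
    trace.append((0xA0, i % 3 != 2))
    trace.append((0xB0, i % 2 == 0))

wrong = mispredictions(PApPredictor(), trace)
print(sum(wrong[:300]), sum(wrong[300:]))
```

Because each branch sees only its own past outcomes, the interleaving does not disturb either pattern, and both are predicted without error after warm-up.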
23Two-level Branch Predictors (II)
24Improving Dynamic Predictors
- Combining branch predictors [McFarling, WRL'93]
- Use two different branch predictors
- Access both in parallel
- A third table determines which prediction to use
- Using value prediction to predict branch outcomes [Gonzalez & Gonzalez, PACT'99]
- Combine a branch predictor with a value predictor
- Predict the branch inputs
- Compute branch outcome
- Variable history length [Juan, Sanjeevan & Navarro ISCA'98, Stark, Evers & Patt ASPLOS'98]
- Record accuracy of a given history length
- Define a time frame
- Use the best recorded history length
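The first technique in the list, two predictors plus a third table that chooses between them, can be sketched as below. The choice of a bimodal and a gshare-style component, the table sizes, the rule of training the chooser only when the components disagree, and the trace are all assumptions made for illustration.

```python
class Counters:
    """Table of 2-bit saturating counters."""

    def __init__(self, bits):
        self.mask = (1 << bits) - 1
        self.table = [1] * (1 << bits)

    def high(self, index):
        return self.table[index & self.mask] >= 2

    def train(self, index, up):
        i = index & self.mask
        if up:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


class HybridPredictor:
    def __init__(self, bits=10):
        self.mask = (1 << bits) - 1
        self.bimodal = Counters(bits)   # predictor A: indexed by address
        self.gshare = Counters(bits)    # predictor B: address XOR global history
        self.meta = Counters(bits)      # chooser: high means "use predictor B"
        self.history = 0

    def _components(self, pc):
        a = self.bimodal.high(pc >> 2)
        b = self.gshare.high((pc >> 2) ^ self.history)
        return a, b

    def predict(self, pc):
        a, b = self._components(pc)     # both components are consulted
        return b if self.meta.high(pc >> 2) else a

    def update(self, pc, taken):
        a, b = self._components(pc)
        if a != b:                      # train the chooser only on disagreement
            self.meta.train(pc >> 2, b == taken)
        self.bimodal.train(pc >> 2, taken)
        self.gshare.train((pc >> 2) ^ self.history, taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def mispredictions(predictor, trace):
    """Return one boolean per branch: True where the prediction was wrong."""
    wrong = []
    for pc, taken in trace:
        wrong.append(predictor.predict(pc) != taken)
        predictor.update(pc, taken)
    return wrong


# Hypothetical alternating branch: the address-indexed component keeps
# missing, so the chooser learns to select the history-based one.
alternating = [(0x400, i % 2 == 0) for i in range(200)]
wrong = mispredictions(HybridPredictor(), alternating)
print(sum(wrong[:100]), sum(wrong[100:]))
```

In hardware the three tables would be read in parallel; the sequential calls here only model the selection logic.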
25Hybrid Branch Predictors
Branch Address
- Predictor A
- static
- bimodal
- PApsg
- Meta Predictor
- bimodal
- two-level (does not replicate the history register)
26Hybrid Branch Predictors (II)
27Multiple Branch Predictors
- Adapt existing branch predictors
- Multi-bank BTB with bimodal predictor
- Access many consecutive addresses
- Invalid predictions after the first taken branch
- Adapted two-level predictor [Yeh, Marr & Patt, SC'93]
- Fast sequential accesses to the same PHT
- Tree-like subgraph predictors [S.Dutta & M.Franklin, MICRO'95]
- Predict which of the tree-like paths to follow
- Next trace predictor [Jacobson, Rotenberg & Smith, MICRO'97]
- Designed to work in conjunction with the trace cache
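The multi-bank BTB idea in the list above (look up several consecutive addresses, discard everything after the first predicted-taken branch) can be modelled as follows. The dictionary-based BTB, the fetch width and the instruction size are assumptions; a real design reads the banks in parallel, and the loop only models the selection logic.

```python
def predict_fetch_block(btb, fetch_pc, width=4, insn_bytes=4):
    """Return (addresses fetched, next fetch address).

    `btb` maps a branch address to (predicted_taken, target); addresses not
    present are treated as non-branches. Entries after the first
    predicted-taken branch are ignored, since their predictions are invalid.
    """
    fetched = []
    for slot in range(width):
        pc = fetch_pc + slot * insn_bytes
        fetched.append(pc)
        entry = btb.get(pc)
        if entry is not None and entry[0]:
            return fetched, entry[1]              # redirect to the taken target
    return fetched, fetch_pc + width * insn_bytes  # fall through to next block


# Hypothetical BTB contents.
btb = {
    0x104: (False, 0x200),   # branch predicted not taken
    0x108: (True, 0x300),    # branch predicted taken
    0x10C: (True, 0x400),    # not reached when fetching from 0x100
}

print(predict_fetch_block(btb, 0x100))
```

Fetching from 0x100 yields three useful instructions and a redirect to 0x300; the entry for 0x10C is looked up by the hardware but its prediction is discarded.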
28Multi-bank Branch Target Buffer
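[Figure: multi-bank BTB accessed with the Branch Address; entries hold Tag, Type, Pred and Target fields; three tag comparators; outcomes labelled "Taken!" and "Not taken"]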
29Adapted Gshare Predictor
[Figure: adapted gshare predictor with the labels Branch History, Branch Address, Index, PHT and predictions P1, P2, P3]
30Adapted (fast) Gshare Predictor
[Figure: adapted (fast) gshare predictor with the labels Branch History, Branch Address, Index, PHT and predictions P1, P2, P3]
31Tree-like Subgraph Predictor (I)
[Figure: tree-like subgraph starting at the Subgraph Address, with labels A0 to A6, L0 to L6 and T0 to T3]
Path 0 Attributes: A0, A1, A2, L0, L1, L2, T0
32Tree-like Subgraph Predictor (II)
[Figure: predictor organization with the labels Subgraph Address, Subgraph history, 2-bit counters, Tag, Path Attributes, Path Selector, To Fetch Unit]
33Next Trace Predictor
[Figure: next trace predictor with the labels History register, Hash Function, Index Generation, Correlating table, Secondary table]
34Bibliography
35Bibliography
- Static branch prediction
- J.E.Smith. A study of branch prediction strategies. ISCA-8, 1981.
- F.Mueller and D.B.Whalley. Avoiding unconditional jumps by code replication. PLDI, 1992.
- T.Ball and J.R.Larus. Branch prediction for free. PLDI, 1993.
- C.Young and M.D.Smith. Improving the accuracy of static branch prediction using branch correlation. ASPLOS-6, 1994.
- F.Mueller and D.B.Whalley. Avoiding conditional branches by code replication. PLDI, 1995.
36Bibliography
- Semi-static branch prediction
- S.McFarling and J.Hennessy. Reducing the cost of branches. ISCA-13, 1986.
- D.E.Wall. Predicting program behavior using real or estimated profiles. PLDI, 1991.
- J.A.Fisher and S.M.Freudenberger. Predicting conditional branch directions from previous runs of a program. ASPLOS-5, 1992.
- A.Krall. Improving semi-static branch prediction by code replication. PLDI, 1994.
- B.Calder and D.Grunwald. Reducing branch costs via branch alignment. ASPLOS-6, 1994.
- B.Calder, D.Grunwald, D.Lindsay, J.Martin, M.Mozer and B.Zorn. Corpus-based static branch prediction. PLDI, 1995.
37Bibliography
- Bimodal predictors and BTBs
- J.E.Smith. A study of branch prediction strategies. ISCA-8, 1981.
- J.Lee and A.Smith. Branch prediction strategies and branch target buffer design. IEEE Computer 21(7). 1984.
- S.McFarling and J.Hennessy. Reducing the cost of branches. ISCA-13, 1986.
- T.Yeh and Y.N.Patt. A comprehensive instruction fetch mechanism for a processor supporting speculative execution. MICRO-25, 1992.
- B.Calder and D.Grunwald. Fast accurate instruction fetch and branch prediction. ISCA-21, 1994.
38Bibliography
- Two-level branch predictors
- S.Pan, K.So and J.Rahmeh. Improving the accuracy of dynamic branch prediction using branch correlation. ASPLOS-5, 1992.
- T.Yeh and Y.N.Patt. Alternative implementations of two-level adaptive branch prediction. ISCA-19, 1992.
- T.Yeh and Y.N.Patt. A comparison of dynamic branch predictors that use two levels of branch history. ISCA-20, 1993.
- S.McFarling. Combining branch predictors. Technical note TN-36, DEC-WRL, 1993.
39Bibliography
- Combining branch predictors
- S.McFarling. Combining branch predictors. Technical note TN-36, DEC-WRL, 1993.
- P.Y.Chang, E.Hao and Y.N.Patt. Alternative implementations of hybrid branch predictors. MICRO-28, 1995.
- Dynamic history length
- T.Juan, S.Sanjeevan and J.Navarro. Dynamic history length fitting: a third level of adaptivity for branch prediction. ISCA-25, 1998.
- J.Stark, M.Evers and Y.N.Patt. Variable length path branch prediction. ASPLOS-8, 1998.
40Bibliography
- Novel branch predictors
- C.C.Lee, I.Chen and T.Mudge. The bi-mode branch predictor. MICRO-30, 1997.
- E.Sprangle, R.Chappell, M.Alsup and Y.N.Patt. The agree predictor: a mechanism for reducing negative branch history interference. ISCA-24, 1997.
- J.Gonzalez and A.Gonzalez. Control-flow speculation through value prediction for superscalar processors. PACT, 1999.
41Bibliography
- Multiple branch predictors
- T.Yeh, D.Marr and Y.N.Patt. Increasing instruction fetch rate via multiple branch prediction and a branch address cache. SC-7, 1993.
- S.Dutta and M.Franklin. Control flow prediction with tree-like subgraphs for superscalar processors. MICRO-28, 1995.
- T.Conte, K.Menezes, P.Mills and B.Patel. Optimization of instruction fetch mechanisms for high issue rates. ISCA-22, 1995.
- S.Wallace and N.Bagherzadeh. Multiple branch and block prediction. 1997.
42Bibliography
- Trace predictors
- Q.Jacobson, E.Rotenberg and J.E.Smith. Path-based next trace prediction. MICRO-30, 1997.
- B.Black, B.Rychlik and J.P.Shen. The block-based trace cache. ISCA-26, 1999.