Research on Branch Prediction Algorithms

Provided by: Daniela106


1
Research on Branch Prediction Algorithms
Baozhen Yu, Wenyuan Xu, Xiaoxuan Li
Department of Electrical & Computer Engineering, Rutgers University
2
Agenda

Why branch prediction
Our Branch Prediction Simulator
Our simulator system environment
Some branch prediction schemes and experimental results
Comparison
Conclusions
Contributions from team members

3
Why do we need branch prediction?
4
How an Instruction is Processed
Processing can be divided into five stages:
Instruction fetch
Instruction decode
Execute
Memory access
Write back
5
Instruction-Level Parallelism
To speed up the process, pipelining overlaps
execution of multiple instructions, exploiting
parallelism between instructions
[Figure: pipeline diagram with overlapping stages: instruction fetch, instruction decode, execute, memory access, write back]
6
Control Hazards: Branches
Conditional branches create a problem for pipelining: the next instruction can't be fetched until the branch has executed, several stages later.
[Figure: pipeline diagram highlighting the branch instruction]
7
Pipelining and Branches
Pipelining overlaps instructions to exploit
parallelism, allowing the clock rate to be
increased. Branches cause bubbles in the
pipeline, where some stages are left idle.
[Figure: pipeline diagram (instruction fetch, instruction decode, execute, memory access, write back) with bubbles behind an unresolved branch instruction]
8
Branch Prediction
A branch predictor allows the processor to
speculatively fetch and execute instructions down
the predicted path.
[Figure: pipeline diagram (instruction fetch, instruction decode, execute, memory access, write back) showing speculative execution]
Branch predictors must be highly accurate to
avoid mispredictions!
9
Branch Predictors Must Improve

The cost of a misprediction is proportional to pipeline depth.
The Pentium 4 pipeline has 20 stages.
Future pipelines will have > 32 stages.
As pipelines deepen, we need more accurate branch predictors.
Our research on branch prediction is mainly based on prediction accuracy.
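The link between pipeline depth and misprediction cost can be illustrated with a rough cycles-per-instruction model. This is a minimal sketch and not part of the original slides: the name `effective_cpi`, the 20% branch frequency, the 5% miss rate, and the simplification that the penalty equals the pipeline depth are all illustrative assumptions.

```python
def effective_cpi(base_cpi, branch_fraction, miss_rate, penalty_cycles):
    """Average cycles per instruction, including misprediction stalls."""
    return base_cpi + branch_fraction * miss_rate * penalty_cycles

# Assumed numbers for illustration only: 20% of instructions are branches,
# 5% of those are mispredicted, and each miss costs one full pipeline refill.
for depth in (5, 20, 32):
    cpi = effective_cpi(1.0, 0.20, 0.05, depth)
    print(f"{depth}-stage pipeline: CPI ~ {cpi:.2f}")
```

Under these assumptions the same 5% miss rate costs more as the pipeline deepens, which is why accuracy is the metric of interest.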

10
Our BP Simulator System environment
11
Simulator System

Our benchmarks represent general-purpose computing, because sorting is a typical task in both integer and floating-point programs.
Our benchmarks use quicksort and heapsort, two of the most widely used sorting algorithms.
The data file collects two kinds of information:
Conditional branch address
Branch taken result
There is one BP simulator for each branch prediction scheme.
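A data file of this kind could be produced by instrumenting the sort itself. The sketch below is an assumption about how such a trace might be collected, not the authors' tool: the helper names `quicksort_trace` and `miss_rate` and the branch ids `0x10` and `0x14` are hypothetical stand-ins for real branch addresses.

```python
import random

def quicksort_trace(values):
    """Sort a copy of `values`, recording (branch_id, taken) for each
    instrumented conditional branch. Branch ids are hypothetical."""
    trace = []
    data = list(values)

    def branch(branch_id, condition):
        trace.append((branch_id, bool(condition)))
        return condition

    def sort(lo, hi):
        if not branch(0x10, lo < hi):          # recursion guard
            return
        pivot = data[hi]
        i = lo
        for j in range(lo, hi):
            if branch(0x14, data[j] < pivot):  # comparison against the pivot
                data[i], data[j] = data[j], data[i]
                i += 1
        data[i], data[hi] = data[hi], data[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(data) - 1)
    return data, trace

def miss_rate(trace, predict, update):
    """Replay a trace through a predictor given as two callables."""
    misses = 0
    for address, taken in trace:
        if predict(address) != taken:
            misses += 1
        update(address, taken)
    return misses / len(trace)

random.seed(0)
sorted_data, trace = quicksort_trace([random.randint(0, 999) for _ in range(1000)])
# Baseline "scheme": always predict taken, never learn.
print(miss_rate(trace, lambda addr: True, lambda addr, taken: None))
```

Any prediction scheme that exposes a predict step and an update step can be replayed against the same trace, which keeps the comparison between schemes fair.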

12
Simulator System: Benchmarks

Heapsort is much more predictable than quicksort.
The branch pattern in heapsort is much larger than the pattern in quicksort.
The pattern in heapsort is more obvious.
Patterns are more regular when sorting a long list of numbers than when sorting a short one.
Heapsort has more branch addresses.

13
Branch Prediction Schemes and Some Experimental Results
14
Scheme 1: Branch History Table (BHT)

Uses part of the branch address as the index.
Uses branch history to predict the outcome.
A history table with more bits yields better performance, but at higher cost.

[Figure: the branch address indexes the branch history table, which outputs the prediction; the branch taken result updates the table]
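One common way to build such a table is with saturating counters. The slides do not say what each BHT entry stores, so the sketch below is one possible reading: the class name `BranchHistoryTable`, the 2-bit counters, the modulo indexing, and the branch address `0x400` are all assumptions made for illustration.

```python
class BranchHistoryTable:
    """Table of n-bit saturating counters indexed by low-order address bits."""

    def __init__(self, entries=64, counter_bits=2):
        self.entries = entries
        self.max_count = (1 << counter_bits) - 1
        self.threshold = 1 << (counter_bits - 1)
        # Start every counter just below the threshold (weakly not taken).
        self.counters = [self.threshold - 1] * entries

    def _index(self, address):
        return address % self.entries

    def predict(self, address):
        return self.counters[self._index(address)] >= self.threshold

    def update(self, address, taken):
        i = self._index(address)
        if taken:
            self.counters[i] = min(self.counters[i] + 1, self.max_count)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)

# A loop-style branch: taken 9 times, then not taken once, repeated.
bht = BranchHistoryTable(entries=64, counter_bits=2)
outcomes = ([True] * 9 + [False]) * 100
misses = 0
for taken in outcomes:
    if bht.predict(0x400) != taken:
        misses += 1
    bht.update(0x400, taken)
print(misses / len(outcomes))
```

With this pattern the counter mispredicts roughly once per loop exit, which is the behavior a simple history-based entry is expected to show.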
15
BHT Prediction for HeapSort (10000)

A larger width (in bits) gives a lower miss rate, but the miss rate won't improve forever. If the pattern is much smaller than the width, more bits will hurt performance.
A bigger BHT gives a lower miss rate, because there are fewer conflicts (aliases).
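The aliasing effect can be reproduced with a toy trace. This is a minimal sketch under assumed conditions: the function `simulate`, the 2-bit counters, and the two branch addresses `0x08` and `0x18` are hypothetical and chosen so that they collide in a 16-entry table but not in a 32-entry one.

```python
def simulate(trace, entries):
    """Miss rate of 2-bit saturating counters indexed by `address % entries`."""
    counters = [1] * entries  # weakly not taken
    misses = 0
    for address, taken in trace:
        i = address % entries
        if (counters[i] >= 2) != taken:
            misses += 1
        counters[i] = min(counters[i] + 1, 3) if taken else max(counters[i] - 1, 0)
    return misses / len(trace)

# Two hypothetical branches: 0x08 is always taken, 0x18 is never taken.
trace = [(0x08, True), (0x18, False)] * 500

print(simulate(trace, entries=16))  # both map to entry 8 and interfere
print(simulate(trace, entries=32))  # entries 8 and 24, no interference
```

In the smaller table the two branches keep overwriting each other's state, while the larger table gives each branch its own entry.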

16
BHT Prediction for QuickSort (10000)

Table size has little influence on the miss rate; this is the benchmark's limit.
A longer width doesn't match the relatively random pattern in quicksort, so it hurts performance.

17
Scheme 2: Two-level adaptive prediction

Uses two levels of branch history information to make a prediction.
Uses the branch address and the branch history pattern as a two-dimensional index.
Predicts according to the contents of the Pattern History Table.
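The two-dimensional indexing described above can be sketched as follows. The slides do not give the exact organization, so this is an assumption-laden illustration: the class name `TwoLevelPredictor`, the single global history register, the 2-bit counters, and the default sizes are all hypothetical choices.

```python
class TwoLevelPredictor:
    """Global branch history register plus a 2-D pattern history table (PHT)."""

    def __init__(self, address_entries=64, history_bits=4):
        self.address_entries = address_entries
        self.history_mask = (1 << history_bits) - 1
        self.history = 0  # first level: recent outcomes, newest in bit 0
        # Second level: one row per address index, one 2-bit counter per pattern.
        self.pht = [[1] * (1 << history_bits) for _ in range(address_entries)]

    def predict(self, address):
        row = self.pht[address % self.address_entries]
        return row[self.history] >= 2

    def update(self, address, taken):
        row = self.pht[address % self.address_entries]
        count = row[self.history]
        row[self.history] = min(count + 1, 3) if taken else max(count - 1, 0)
        # Shift the new outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.history_mask

# A repeating taken, taken, not-taken pattern on one branch.
predictor = TwoLevelPredictor()
outcomes = [True, True, False] * 200
misses = 0
for taken in outcomes:
    if predictor.predict(0x400) != taken:
        misses += 1
    predictor.update(0x400, taken)
print(misses / len(outcomes))
```

Once each history pattern has been seen, the counter selected by that pattern predicts the next outcome, so a short repeating pattern is mispredicted only during warm-up.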

18
2-level-adaptive prediction for HeapSort
19
2-level-adaptive prediction for QuickSort
20
2-level-adaptive prediction: conclusions

Using the Heapsort benchmark:
The bigger the PHT width, the better the performance. But after PHT > 64, the improvement is not obvious, because no more address aliases exist after BHT > 64.
The bigger the BHR width, the better the performance. But when BHR width > 64, the improvement is not obvious, because no more patterns can be traced.
Using the Quicksort benchmark:
A wider PHT and BHR don't always bring better performance. That may be due to the lack of an obvious pattern in the branches of Quicksort.

21
Scheme 3: A Neural Method

Perceptron: an artificial neural network
A simple model of the neural networks in the brain
Learns to recognize and classify patterns
A small and fast neural method
Very high accuracy for branch prediction

22
Branch-Predicting Perceptron

Inputs (x's) are from the branch history register.
Weights (w's) are small integers learned by on-line training.
Output (y) gives the prediction; it is the dot product of the x's and w's.
Training finds correlations between history and outcome.
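A small numeric example may make the dot product and the training step concrete. The slides do not spell out the encoding or the update rule, so this sketch follows the published perceptron-predictor description as an assumption: inputs are +1 (taken) or -1 (not taken), `weights[0]` is a bias weight, and weights are adjusted when the prediction is wrong or the output magnitude is at most a threshold. The names `perceptron_output` and `train` are hypothetical.

```python
def perceptron_output(weights, history):
    """y = w0 + sum(w_i * x_i), with x_i = +1 (taken) or -1 (not taken)."""
    bias, rest = weights[0], weights[1:]
    return bias + sum(w * x for w, x in zip(rest, history))

def train(weights, history, taken, threshold):
    """Adjust weights in place when the prediction was wrong or weak."""
    y = perceptron_output(weights, history)
    t = 1 if taken else -1
    if (y >= 0) != taken or abs(y) <= threshold:
        weights[0] += t
        for i, x in enumerate(history, start=1):
            weights[i] += t * x  # reinforce agreement between x_i and outcome

weights = [0, 0, 0, 0]   # bias plus three history weights
history = [1, -1, 1]     # taken, not taken, taken
print(perceptron_output(weights, history))
train(weights, history, taken=False, threshold=4)
print(weights, perceptron_output(weights, history))
```

After one training step on a not-taken outcome, the output for the same history moves to the negative side, so the same situation would now be predicted not taken.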

23
Training Algorithm
24
Organization of the Perceptron Predictor

Keeps a table of perceptrons, indexed by branch
address
Inputs are from branch history register
Predict taken if output ≥ 0, otherwise predict not taken
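The organization above can be sketched as a table of weight vectors sharing one global history register. This is a minimal illustration, not the authors' simulator: the class name `PerceptronPredictor`, the modulo indexing, the unbounded integer weights (hardware would saturate them), and the training threshold floor(1.93 * h + 14) taken from the published perceptron-predictor work are assumptions. The defaults of 163 perceptrons and a history length of 15 mirror the figures quoted in the conclusions.

```python
class PerceptronPredictor:
    """Table of perceptrons indexed by branch address, with global history."""

    def __init__(self, num_perceptrons=163, history_length=15):
        self.n = num_perceptrons
        # Assumed training threshold: floor(1.93 * h + 14).
        self.theta = int(1.93 * history_length + 14)
        # One weight vector per perceptron: bias weight plus one per history bit.
        self.weights = [[0] * (history_length + 1) for _ in range(num_perceptrons)]
        self.history = [-1] * history_length  # +1 taken, -1 not taken, newest last

    def _output(self, address):
        w = self.weights[address % self.n]
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, address):
        return self._output(address) >= 0

    def update(self, address, taken):
        y = self._output(address)
        t = 1 if taken else -1
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self.weights[address % self.n]
            w[0] += t
            for i, x in enumerate(self.history, start=1):
                w[i] += t * x
        self.history = self.history[1:] + [t]

# An alternating branch at a hypothetical address.
predictor = PerceptronPredictor()
outcomes = [True, False] * 500
misses = 0
for taken in outcomes:
    if predictor.predict(0x400) != taken:
        misses += 1
    predictor.update(0x400, taken)
print(misses / len(outcomes))
```

Training stops adjusting a perceptron once its output is both correct and larger in magnitude than the threshold, which keeps the weights from growing without bound on easy branches.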

25
Perceptron prediction: history length
26
Perceptron prediction: number of perceptrons
27
Comparing Different Schemes
28
Schemes on the QuickSort data file
[Chart: x-axis is the number of random numbers to be sorted, with ticks at 0, 10, 100, 1000, 10000, 100000]
29
Schemes on the HeapSort data file
[Chart: x-axis is the number of random numbers to be sorted, with ticks at 0, 10, 100, 1000, 10000, 100000]
30
Scheme comparison
[Chart: x-axis is the number of random numbers to be sorted, with ticks at 0, 10, 100, 1000, 10000, 100000]
31
Conclusions
32
Conclusions

The branch pattern of QuickSort is more random than that of HeapSort, so it's harder to predict, especially when the list to be sorted is small.
For the BHT and 2-level-adaptive predictors, the best parameters are sensitive to the branch pattern. Performance does not improve indefinitely as the table size or branch history register grows; sometimes a larger parameter even hurts.

33
Conclusions (continued)

The perceptron neural predictor is a smart scheme.
It predicts accurately even when the branches have no obvious patterns.
Better representation:
The weight for a past branch result is not always 1; more important results get higher weights.
The neural predictor needs less hardware (a global perceptron history register of length 15 and 163 perceptrons) to achieve higher prediction accuracy.

34
Contribution
35
References

Some slides are from "Neural Methods for Dynamic Branch Prediction" by Professor Daniel A. Jiménez.
[1] B. Calder, D. Grunwald, and J. Emer, "A System Level Perspective on Branch Prediction Architecture Performance," Proceedings of the 28th Intl. Symposium on Microarchitecture, pp. 199-206, 1995.
[2] S.-T. Pan, K. So, and J. T. Rahmeh, "Improving the Accuracy of Dynamic Branch Prediction Using Branch Correlation," Fifth Intl. Conf. on Arch. Support for Prog. Lang. and OS, Boston, MA, Oct. 1992, pp. 76-84.
[3] T.-Y. Yeh and Y. N. Patt, "A Comprehensive Instruction Fetch Mechanism for a Processor Supporting Speculative Execution," 25th Intl. Symposium on Microarchitecture, Portland, OR, ACM, Dec. 1992, pp. 129-139.
[4] S. McFarling, "Combining Branch Predictors," WRL Technical Note TN-36, Digital Equipment Corporation, Jun. 1993.
[5] P.-Y. Chang, E. Hao, T.-Y. Yeh, and Y. N. Patt, "Branch Classification: A New Mechanism for Improving Branch Predictor Performance," Proceedings of the International Conference on Parallel Processing, 1995.
[6] D. Jiménez, "Neural Methods for Dynamic Branch Prediction," ACM Transactions on Computer Systems, Vol. 20, No. 4, November 2002, pp. 369-397.