CS 230: Computer Organization and Assembly Language

1
CS 230: Computer Organization and Assembly Language
  • Aviral Shrivastava

Department of Computer Science and Engineering
School of Computing and Informatics
Arizona State University
Slides courtesy Prof. Yann Hang Lee, ASU; Prof. Mary Jane Irwin, PSU; Ande Carle, UCB
2
Announcements
  • Alternate Project
  • Submit Nov 24
  • Quiz 5
  • Thursday, Nov 19, 2009
  • Pipelining
  • Finals
  • Tuesday, Dec 08, 2009
  • Please come on time (you'll need all the time)
  • Open book, notes, and internet
  • No communication with any other human

3
Benefits of Pipelining
  • Pipeline latches pass the status and result of
    the current instruction to next stage
  • Comparison

[Figure: clock timing diagram comparing the single-cycle implementation with the pipeline for lw followed by sw; stage labels: Ifetch, Dec/Reg, Exec, Mem]
4
Branch Hazards
  • So far, we've limited discussion of hazards to:
  • Arithmetic/logic operations
  • Data transfers
  • Also need to consider hazards involving branches
  • Example:
  • 40 beq $1, $3, 28
  • 44 and $12, $2, $5
  • 48 or $13, $6, $2
  • 52 add $14, $2, $2
  • 72 lw $4, 50($7)
  • How long will it take before the branch decision
    takes effect?
  • What happens in the meantime?

5
Branch signal determined in MEM stage
[Figure: pipelined datapath with the branch decision made in the MEM stage]
6
Pipeline impact on branch
  • If branch condition true, must skip 44, 48, 52
  • But, these have already started down the pipeline
  • They will complete unless we do something about
    it
  • How do we deal with this?
  • We'll consider 2 possibilities

7
Dealing w/branch hazards: always stall
  • Branch taken
  • Wait 3 cycles
  • No proper instructions in the pipeline
  • Same delay as without stalls (no time lost)

8
Dealing w/branch hazards: always stall
  • Branch not taken
  • Still must wait 3 cycles
  • Time lost
  • Could have spent cycles fetching and decoding
    next instructions

9
Assume branch not taken
  • On average, branches are taken ½ the time
  • If branch not taken
  • Continue normal processing
  • Else, if branch is taken
  • Need to flush improper instruction from pipeline
  • Cuts overall time for branch processing in ½
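The comparison between always stalling and assuming the branch is not taken can be sketched numerically. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the 3-cycle delay and ½ taken rate are the figures quoted above.

```python
BRANCH_DELAY = 3  # cycles until the branch decision takes effect (decided in MEM)

def always_stall_penalty(taken: bool) -> int:
    """Cycles spent waiting per branch when the pipeline always stalls."""
    return BRANCH_DELAY  # same wait whether or not the branch is taken

def assume_not_taken_penalty(taken: bool) -> int:
    """Cycles lost per branch when fetching continues on the fall-through path."""
    return BRANCH_DELAY if taken else 0  # flush only when the guess was wrong

def average_penalty(policy, p_taken: float) -> float:
    """Expected cycles per branch for a policy, given the taken probability."""
    return p_taken * policy(True) + (1 - p_taken) * policy(False)

print(average_penalty(always_stall_penalty, 0.5))      # 3.0
print(average_penalty(assume_not_taken_penalty, 0.5))  # 1.5
```

With branches taken half the time, the expected cost per branch drops from 3 cycles to 1.5, which is the "cut in ½" claimed above.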

10
Flushing unwanted instructions from pipeline
  • Useful to compare w/stalling pipeline
  • Simple stall: inject bubble into pipe at ID
    stage only
  • Change control to 0 in the ID stage
  • Let bubbles percolate to the right
  • Flushing pipe: must change inst. in IF, ID, and
    EX
  • IF Stage:
  • Zero instruction field of IF/ID pipeline register
  • Use new control signal IF.Flush
  • ID Stage:
  • Use existing bubble injection mux that zeros
    control for stalls
  • Signal ID.Flush is ORed w/stall signal from
    hazard detection unit
  • EX Stage:
  • Add new muxes to zero EX pipeline register
    control lines
  • Both muxes controlled by single EX.Flush signal
  • Control determines when to flush
  • Depends on Opcode and value of branch condition
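The three flush paths described above can be modeled in a few lines. This is a rough sketch under stated assumptions, not the actual datapath: the dictionary keys, the control-bit groups (EX, M, WB), and the function names are illustrative, and all three flush signals are assumed to be asserted together when a branch is taken.

```python
# Hypothetical model: each pipeline register field is one dict entry.
NOP_ID_EX_CONTROL = {"EX": 0, "M": 0, "WB": 0}  # zeroed control written into ID/EX
NOP_EX_MEM_CONTROL = {"M": 0, "WB": 0}          # zeroed control written into EX/MEM

def flush_signals(is_branch: bool, condition_true: bool) -> dict:
    """Control decides from the opcode and the branch condition (assumed logic)."""
    flush = is_branch and condition_true
    return {"IF.Flush": flush, "ID.Flush": flush, "EX.Flush": flush}

def next_pipeline_regs(regs: dict, signals: dict, stall: bool = False) -> dict:
    """Return the register contents after applying the flush muxes."""
    out = dict(regs)
    if signals["IF.Flush"]:
        out["IF/ID.instruction"] = 0                    # zero the instruction field
    if signals["ID.Flush"] or stall:                    # ID.Flush ORed with stall
        out["ID/EX.control"] = dict(NOP_ID_EX_CONTROL)
    if signals["EX.Flush"]:                             # one signal drives both muxes
        out["EX/MEM.control"] = dict(NOP_EX_MEM_CONTROL)
    return out

regs = {
    "IF/ID.instruction": 0x12345678,  # arbitrary nonzero placeholder word
    "ID/EX.control": {"EX": 1, "M": 0, "WB": 1},
    "EX/MEM.control": {"M": 0, "WB": 1},
}
taken = next_pipeline_regs(regs, flush_signals(is_branch=True, condition_true=True))
not_taken = next_pipeline_regs(regs, flush_signals(is_branch=True, condition_true=False))
print(taken)
print(not_taken == regs)  # True: nothing is flushed when the branch falls through
```

Passing `stall=True` with no flush asserted zeros only the ID/EX control, which mirrors the simple-stall case where a bubble enters at the ID stage only.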

11
Flushing Pipeline
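[Figure: pipelined datapath with flush support. Labels: PC, Control, Hazard Detection Unit, Branch Decision; flush signals IF.Flush, ID.Flush, EX.Flush; pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB; muxes selecting 0 for the EX, M, and WB control fields]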
12
Assume branch not taken... and branch is not taken
  • Execution proceeds normally: no penalty

13
Assume branch not taken... and branch is taken
  • Bubbles injected into 3 stages during cycle 5

14
Reservation Table Picture
  • Another way of looking at it

40 beq $1, $3, 72
44 and $12, $2, $5
48 or  $13, $6, $2
52 add $14, $2, $2
72 lw  $4, 50($7)

Assume Branch Not Taken and Correct (no penalty)
Cycle  1    2    3    4    5    6    7    8    9
IF     Beq  And  Or   Add  56
ID          Beq  And  Or   Add  56
EX               Beq  And  Or   Add  56
Mem                   Beq  And  Or   Add  56
WB                         Beq  And  Or   Add  56

Assume Branch Not Taken and NOT Correct (3 cycle penalty)
Cycle  1    2    3    4    5    6    7    8    9
IF     Beq  And  Or   Add  Sw
ID          Beq  And  Or   Add  Sw
EX               Beq  And  Or   Add  Sw
Mem                   Beq  ---  ---  ---  56
WB                         Beq  ---  ---  ---  56

(FYI: branch freq. 20%; 3 cycle penalty 50% of time)
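A reservation table for the no-penalty case can be generated mechanically, since each instruction simply moves one stage to the right per cycle. The sketch below is illustrative only: the function name and the list-of-labels input are assumptions, and it covers just the case where no instruction is flushed.

```python
STAGES = ["IF", "ID", "EX", "Mem", "WB"]

def reservation_table(fetch_order, n_cycles=9):
    """Map each stage to its per-cycle occupant when nothing stalls or flushes.

    fetch_order[i] is the label of the instruction fetched in cycle i + 1.
    """
    table = {stage: [""] * n_cycles for stage in STAGES}
    for fetch_cycle, name in enumerate(fetch_order):
        for depth, stage in enumerate(STAGES):
            cycle = fetch_cycle + depth  # one stage later each cycle
            if cycle < n_cycles:
                table[stage][cycle] = name
    return table

table = reservation_table(["Beq", "And", "Or", "Add", "56"])
print("Cycle " + " ".join(f"{c:<4}" for c in range(1, 10)))
for stage in STAGES:
    print(f"{stage:<5} " + " ".join(f"{name:<4}" for name in table[stage]))
```

Each row is the same instruction sequence shifted right by one cycle per stage, so the last instruction ("56") reaches WB in cycle 9.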
15
Branch Penalty Impact
  • Assume 16% of all instructions are branches
  • 4% unconditional branches: 3 cycle penalty
  • 12% conditional: 50% taken
  • For a sequence of N instructions (assume N is
    large)
  • N cycles to initiate each
  • 3 × 0.04 × N delays due to unconditional branches
  • 0.5 × 3 × 0.12 × N delays due to conditional
    taken
  • Also, an extra 4 cycles for pipeline to empty
  • Total
  • 1.3N + 4 total cycles (or 1.3 cycles/instruction)
    (CPI)
  • 30% Performance Hit!!! (Bad thing)
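The arithmetic above can be checked with a short calculation. This is a sketch using the slide's numbers; the function and parameter names are hypothetical.

```python
def branch_cpi(uncond_frac, cond_frac, taken_frac, penalty, base_cpi=1.0):
    """CPI with branch penalties: every unconditional branch and every
    taken conditional branch costs `penalty` extra cycles."""
    uncond_delay = penalty * uncond_frac
    cond_delay = taken_frac * penalty * cond_frac
    return base_cpi + uncond_delay + cond_delay

def total_cycles(n_instructions, cpi, drain_cycles=4):
    """Cycles for N instructions plus the extra cycles for the pipeline to empty."""
    return cpi * n_instructions + drain_cycles

cpi = branch_cpi(uncond_frac=0.04, cond_frac=0.12, taken_frac=0.5, penalty=3)
print(round(cpi, 2))                      # 1.3
print(round(total_cycles(1000, cpi), 1))  # 1304.0
```

The 0.12 from unconditional branches and 0.18 from taken conditional branches add up to the 0.3 extra cycles per instruction, i.e. the 30% hit.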

16
Branch Penalty Impact
  • Some solutions
  • In ISA: branches always execute next 1 or 2
    instructions
  • Instruction so executed said to be in delay slot
  • See SPARC ISA
  • (example: loop counter update)
  • In organization: move comparator to ID stage and
    decide in the ID stage
  • Reduces branch delay by 2 cycles
  • Increases the cycle time

17
Branch Prediction
  • Prior solutions are ugly
  • Better (and more common): guess in IF stage
  • Technique is called branch prediction; needs 2
    parts
  • Predictor: to guess if instruction will
    branch (and to where)
  • Recovery mechanism: i.e., a way to fix your
    mistake
  • Prior strategy
  • Predictor: always guess branch never taken
  • Recovery: flush instructions if branch taken
  • Alternative: accumulate info. in IF stage as to
  • Whether or not for any particular PC value a
    branch was taken next
  • To where it is taken
  • How to update with information from later stages

18
A Branch Predictor
19
Branch History Table
20
Branch Prediction Information
  • One bit predictor
  • Use result from last time we saw this instruction
  • Problem
  • Even if branch is almost always taken, we will be
    wrong at least twice
  • 1st time we see the instruction
  • 1st time the branch is not taken
  • Also, 1st time branch is taken again after that
  • And if branch alternates b/t taken, not taken
  • We get 0% accuracy
  • Can we do better? Yep.
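The one-bit predictor's weaknesses are easy to reproduce in a small simulation. The sketch below is illustrative: the function name, the initial "not taken" state, and the two outcome sequences are assumptions chosen to match the cases listed above.

```python
def one_bit_accuracy(outcomes, initial_prediction=False):
    """Predict whatever the branch did last time; return the fraction correct."""
    prediction = initial_prediction
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken  # remember only the most recent outcome
    return correct / len(outcomes)

# A loop branch: taken 9 times, then falls through, run twice.
loop_branch = ([True] * 9 + [False]) * 2
# A branch that alternates taken / not taken.
alternating = [True, False] * 10

print(one_bit_accuracy(loop_branch))  # 0.8
print(one_bit_accuracy(alternating))  # 0.0
```

In the loop case the four misses are the first encounter, each loop exit, and the first taken branch after the exit; in the alternating case the predictor is always one step behind.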

21
Branch Prediction Information
  • How to do better?
  • Keep a counter in each entry of the number of
    times taken in the last N times executed
  • Keep information about the pattern of previous
    branches
  • Book's scheme: a 2-bit saturating counter
  • Increment when branch is taken
  • Decrement when branch is not taken
  • Don't increment above the max count or
    decrement below the min count
  • Use sign of count as predictor
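A 2-bit saturating counter can be simulated in a few lines. This sketch makes some assumptions that are not in the slides: the counter ranges over 0 to 3, "sign of count" is read as "predict taken when the counter is in the upper half (2 or 3)", and the counter starts at 3.

```python
def two_bit_accuracy(outcomes, counter=3):
    """Simulate one 2-bit saturating counter; return the fraction correct."""
    correct = 0
    for taken in outcomes:
        prediction = counter >= 2          # upper half of the range means "taken"
        if prediction == taken:
            correct += 1
        if taken:
            counter = min(counter + 1, 3)  # saturate at the max count
        else:
            counter = max(counter - 1, 0)  # saturate at the min count
    return correct / len(outcomes)

# A loop branch: taken 9 times, then falls through, run twice.
loop_branch = ([True] * 9 + [False]) * 2
print(two_bit_accuracy(loop_branch))  # 0.9
```

Only the two loop exits are mispredicted: a single not-taken outcome moves the counter from 3 to 2, which still predicts taken when the loop is entered again.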

22
Book's 2-Bit Branch Counter
23
Computing Performance
  • Program assumptions
  • 23% loads, and in ½ of cases the next instruction
    uses load value
  • 13% stores
  • 19% conditional branches
  • 2% unconditional branches
  • 43% other
  • Machine Assumptions
  • 5 stage pipe with all forwarding
  • Only penalty is 1 cycle on use of load value
    immediately after a load
  • Jumps are totally resolved in ID stage for a 1
    cycle branch penalty
  • 75% branch prediction accuracy
  • 1 cycle delay on misprediction

24
The Answer
  • CPI penalty calculation
  • Loads
  • 50% of the 23% of loads have 1 cycle penalty:
    0.5 × 0.23 = 0.115
  • Jumps
  • All of the 2% of jumps have 1 cycle penalty:
    0.02 × 1 = 0.02
  • Conditional Branches
  • 25% of the 19% are mispredicted for a 1 cycle
    penalty: 0.25 × 0.19 × 1 = 0.0475
  • Total Penalty: 0.115 + 0.02 + 0.0475 = 0.1825
  • Average CPI: 1 + 0.1825 = 1.1825
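The same penalty calculation can be written as a small function so the assumptions are easy to vary. This is a sketch using the numbers from the program and machine assumptions; the function and parameter names are hypothetical.

```python
def average_cpi(load_frac, load_use_frac, jump_frac, cond_frac,
                prediction_accuracy, penalty_cycles=1, base_cpi=1.0):
    """Base CPI plus the per-instruction penalty from each hazard source."""
    penalties = {
        "loads": load_use_frac * load_frac * penalty_cycles,
        "jumps": jump_frac * penalty_cycles,
        "conditional branches": (1 - prediction_accuracy) * cond_frac * penalty_cycles,
    }
    return base_cpi + sum(penalties.values()), penalties

cpi, penalties = average_cpi(load_frac=0.23, load_use_frac=0.5, jump_frac=0.02,
                             cond_frac=0.19, prediction_accuracy=0.75)
for source, value in penalties.items():
    print(source, round(value, 4))
print("CPI", round(cpi, 4))  # CPI 1.1825
```

Stores and the "other" 43% contribute no penalty under these assumptions, so they do not appear in the calculation.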

25
Yoda says
  • Death is a natural part of life. Rejoice for
    those around you who transform into the Force.
    Mourn them do not. Miss them do not.