EECS 470 - PowerPoint PPT Presentation

About This Presentation
Title:

EECS 470

Description:

EECS 470: ILP and Exceptions, Lecture 7. Coverage: Chapter 3

Slides: 22
Provided by: GaryT159
Tags: eecs | finite | machine | state


Transcript and Presenter's Notes



1
EECS 470
  • ILP and Exceptions
  • Lecture 7
  • Coverage: Chapter 3

2
Optimizing CPU Performance
  • Golden Rule: tCPU = Ninst × CPI × tCLK
  • Given this, what are our options?
      • Reduce the number of instructions executed
      • Reduce the cycles to execute an instruction
      • Reduce the clock period
  • Our first focus: Reducing CPI
  • Approach: Instruction Level Parallelism (ILP)
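
To make the golden rule concrete, the sketch below evaluates tCPU = Ninst × CPI × tCLK for a made-up workload. The instruction count, both CPI values, and the clock period are assumptions chosen only for illustration; they do not come from the lecture.

```python
def cpu_time(n_inst, cpi, t_clk):
    """Execution time in seconds: instructions x cycles/instruction x seconds/cycle."""
    return n_inst * cpi * t_clk

# Hypothetical workload: 1 billion instructions on a 1 GHz clock (t_clk = 1 ns).
N_INST = 1_000_000_000
T_CLK = 1e-9

baseline = cpu_time(N_INST, cpi=2.0, t_clk=T_CLK)  # assumed baseline CPI
with_ilp = cpu_time(N_INST, cpi=0.5, t_clk=T_CLK)  # assumed CPI after exploiting ILP

print(f"baseline: {baseline:.2f} s, with ILP: {with_ilp:.2f} s, "
      f"speedup: {baseline / with_ilp:.1f}x")
```

With Ninst and tCLK held fixed, the speedup is just the ratio of the two CPI values, which is why reducing CPI is a natural first target.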

3
Why ILP?
  • Requirements
  • Parallelism
  • Large window
  • Limited control deps
  • Eliminate false deps
  • Find run-time deps

4
How Much ILP is There?
5
How Large Must the Window Be?
6
ALU Operation GOOD, Branch BAD
Expected Number of Branches Between Mispredicts: E(X) = 1/(1-p)
E.g., p = 95%, E(X) = 20 brs, 100-ish insts
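
The 1/(1-p) figure is the mean of a geometric distribution in which each branch is mispredicted independently with probability 1-p. A minimal sketch of the arithmetic follows; the instructions-per-branch ratio is an assumption (roughly one branch in five instructions), picked because it reproduces the "100-ish insts" estimate.

```python
def branches_between_mispredicts(p):
    """Mean of a geometric distribution with per-branch mispredict probability 1 - p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("prediction accuracy p must be in [0, 1)")
    return 1.0 / (1.0 - p)

INSTS_PER_BRANCH = 5  # assumption: about one branch every five instructions

for p in (0.90, 0.95, 0.99):
    brs = branches_between_mispredicts(p)
    print(f"p = {p:.2f}: ~{brs:.0f} branches, ~{brs * INSTS_PER_BRANCH:.0f} instructions")
```

The window of useful instructions grows quickly as p approaches 1, which is why predictor accuracy matters so much for a large instruction window.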
7
How Accurate are Branch Predictors?
8
Impact of Physical Storage Limitations
  • Each instruction in flight must have storage for its result
  • Really worse than this because of misspeculation

9
Registers GOOD, Memory BAD
  • Benefits of registers
  • Well described deps
  • Fast access
  • Finite resource
  • Memory loses these benefits for flexibility
[Figure: memory references through p, q, and p again; does q alias p?]
10
Bottom Line for an Ambitious Design
11
First Optimization: Out-of-Order Writeback
12
Playing by the Rules: In-order Writeback
DIV.D: IF ID D1 D2 D3 D4 D5 MEM WB
ADD:   IF ID EX MEM WB
13
Playing by the Rules: In-order Writeback
Divide by Zero!
DIV.D: IF ID D1 D2 D3 D4 D5 MEM WB
ADD:   IF ID EX MEM WB
What's wrong with this picture?
14
Playing by the Rules: In-order Writeback
Divide by Zero!
DIV.D: IF ID D1 D2 D3 D4 D5 MEM WB
ADD:   IF ID EX MEM WB
What's wrong with this picture?
DIV.D: IF ID D1 D2 D3 D4 D5 MEM WB
ADD:   IF ID stall stall stall stall EX MEM WB
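
The four stall cycles can be checked with a small model. This is a simplified sketch, not the lecture's pipeline: it assumes one instruction is fetched per cycle, at most one writeback per cycle, and that each instruction's stage count (9 for DIV.D, 5 for ADD, read off the diagram) is its latency from IF through WB. The function name is hypothetical.

```python
def inorder_wb_stalls(stage_counts):
    """Stall cycles each instruction needs so that WB occurs in program order.

    stage_counts[i] is the number of pipeline stages (IF through WB) of the
    i-th instruction; instruction i enters IF in cycle i + 1.
    """
    stalls = []
    prev_wb = 0  # cycle in which the previous instruction wrote back
    for i, n_stages in enumerate(stage_counts):
        natural_wb = i + n_stages                 # WB cycle if never stalled
        actual_wb = max(natural_wb, prev_wb + 1)  # must follow the older WB
        stalls.append(actual_wb - natural_wb)
        prev_wb = actual_wb
    return stalls

# DIV.D (IF ID D1..D5 MEM WB = 9 stages) followed by ADD (IF ID EX MEM WB = 5 stages)
print(inorder_wb_stalls([9, 5]))
```

Under these assumptions DIV.D writes back in cycle 9 and ADD, which would otherwise write back in cycle 6, is held until cycle 10.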
15
Another Way to Get in the Same Mess
  • Many systems use microcode
  • Simplifies mapping of complex instructions to CPU
    resources
  • IA-32 add-with-carry: ADC (EAX), EBX
      • tmp = MEM[EAX]
      • tmp = tmp + EBX + CF, update CF
      • MEM[EAX] = tmp

Side Effect!
Potential Fault!
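
The hazard can be sketched as a toy micro-op interpreter. Everything here is hypothetical and for illustration only: the `PageFault` class, the dictionary-based register and memory state, and the `writable` set that stands in for page protection. It assumes 32-bit operands and that the fault is detected on the final store.

```python
class PageFault(Exception):
    """Stand-in for a memory fault raised by the store micro-op."""

def adc_mem_microcode(state, mem, writable):
    """Hypothetical micro-op sequence for ADC (EAX), EBX on 32-bit values."""
    addr = state["EAX"]
    tmp = mem[addr]                           # uop 1: tmp = MEM[EAX]
    total = tmp + state["EBX"] + state["CF"]  # uop 2: tmp = tmp + EBX + CF
    state["CF"] = total >> 32                 #        update CF (side effect)
    tmp = total & 0xFFFFFFFF
    if addr not in writable:                  # uop 3: MEM[EAX] = tmp may fault
        raise PageFault(hex(addr))
    mem[addr] = tmp

state = {"EAX": 0x1000, "EBX": 1, "CF": 0}
mem = {0x1000: 0xFFFFFFFF}
try:
    adc_mem_microcode(state, mem, writable=set())  # store target is not writable
except PageFault:
    pass
print(state["CF"], hex(mem[0x1000]))
```

After the fault, CF has already changed while memory still holds the old value, so the instruction is half-executed. Restarting it from this state would add the wrong carry, which is the same mess as out-of-order writeback.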
16
Exceptions and Interrupts
Exception Type       Sync/Async    Maskable?    Restartable?
I/O request          Async         Yes          Yes
System call          Sync          No           Yes
Breakpoint           Sync          Yes          Yes
Overflow             Sync          Yes          Yes
Page fault           Sync          No           Yes
Misaligned access    Sync          No           Yes
Memory Protect       Sync          No           Yes
Machine Check        Async/Sync    No           No
Power failure        Async         No           No
17
Solution: Precise Interrupts
  • Implementation approaches
      • Don't
          • E.g., Cray-1
      • Force in-order WB
          • E.g., ARM SA-1
      • Force in-order checks
          • E.g., Alpha 21064
      • Buffer speculative results
          • E.g., P4, Alpha 21264
          • History buffer
          • Future file/Reorder buffer

[Figure labels: Instructions Completely Finished | Precise State | PC | Speculative State | No Instruction Has Executed At All]
18
Precise Interrupts via the Reorder Buffer
  • @ Alloc
      • Allocate result storage at Tail
  • @ Sched
      • Get inputs (ROB T-to-H then ARF)
      • Wait until all inputs ready
  • @ WB
      • Write results/fault to ROB
      • Indicate result is ready
  • @ CT
      • Wait until inst @ Head is done
      • If fault, initiate handler
      • Else, write results to ARF
      • Deallocate entry from ROB

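[Figure: pipeline IF, ID, Alloc (in-order); Sched, EX, MEM (any order); CT (in-order), which writes the ARF]
[Figure: ROB entry fields: PC | Dst regID | Dst value | Except?, with Head and Tail pointers]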
  • Reorder Buffer (ROB)
  • Circular queue of spec state
  • May contain multiple definitions of same register
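
The four steps (Alloc, Sched, WB, CT) can be sketched as a toy reorder buffer. This is a minimal illustration under stated assumptions, not the lecture's design: the class and method names, the dictionary entry layout, and the use of a `deque` in place of an explicit circular array with Head and Tail indices are all hypothetical. The demo uses the register values f2 = 3.0, f3 = 2.0, r2 = 6, r3 = 5; the initial f1 and r4 values, the PCs, and the hand-injected fault on the divide are made up.

```python
from collections import deque

class ReorderBuffer:
    """Toy ROB: a bounded queue of speculative results (left = Head, right = Tail)."""

    def __init__(self, size, arf):
        self.size = size
        self.arf = arf          # architected register file: {reg: value}
        self.entries = deque()

    def alloc(self, pc, dst):
        """@ Alloc: reserve result storage at Tail, in program order."""
        if len(self.entries) == self.size:
            raise RuntimeError("ROB full: allocation must stall")
        entry = {"pc": pc, "dst": dst, "value": None, "fault": False, "done": False}
        self.entries.append(entry)
        return entry

    def read(self, reg, consumer):
        """@ Sched: scan from the consumer toward Head for the newest older
        definition of reg, else read the ARF. Returns (ready, value)."""
        entries = list(self.entries)
        idx = next(i for i, e in enumerate(entries) if e is consumer)
        for e in reversed(entries[:idx]):
            if e["dst"] == reg:
                return e["done"], e["value"]
        return True, self.arf[reg]

    def writeback(self, entry, value=None, fault=False):
        """@ WB: record the result or fault and mark the entry ready (any order)."""
        entry["value"], entry["fault"], entry["done"] = value, fault, True

    def commit(self):
        """@ CT: retire finished entries from Head in order. On a fault, discard
        all speculative state and return the faulting PC; otherwise return None."""
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["fault"]:
                self.entries.clear()
                return head["pc"]
            self.arf[head["dst"]] = head["value"]
        return None


arf = {"f1": 0.0, "f2": 3.0, "f3": 2.0, "r2": 6, "r3": 5, "r4": 0}
rob = ReorderBuffer(size=4, arf=arf)
div = rob.alloc(pc=0, dst="f1")   # long-latency divide writing f1
add = rob.alloc(pc=4, dst="r3")   # integer op writing r3
sub = rob.alloc(pc=8, dst="r4")   # integer op reading the new r3, writing r4

rob.writeback(add, value=11)      # younger instructions finish first
print(rob.read("r3", sub))        # sub sees add's speculative r3, not the ARF copy
rob.writeback(sub, value=5)
print(rob.commit(), arf["r3"])    # Head (the divide) is not done: nothing retires
rob.writeback(div, fault=True)    # divide reports an exception (injected by hand)
fault_pc = rob.commit()
print(fault_pc, arf["r3"], arf["r4"], len(rob.entries))
```

Because the faulting divide sits at Head when it commits, the results of the two younger instructions never reach the ARF: r3 and r4 keep their pre-fault architected values, and the handler sees precise state at the divide's PC.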

19
Reorder Buffer Example
ROB
Code Sequence:
  f1 = f2 / f3
  r3 = r2 + r3
  r4 = r3 - r2
Initial Conditions:
  - reorder buffer empty
  - f2 = 3.0
  - f3 = 2.0
  - r2 = 6
  - r3 = 5
ROB contents over Time (H = Head, T = Tail):
  Snapshot 1
    regID    result    Except
    f1       ?         ?
    r8       2         n
  Snapshot 2
    regID    result    Except
    r8       2         n
    f1       ?         ?
    r3       ?         ?
  Snapshot 3
    regID    result    Except
    r4       ?         ?
    r8       2         n
    f1       ?         ?
    r3       11        n
20
Reorder Buffer Example
ROB contents over Time (H = Head, T = Tail):
  Snapshot 1
    regID    result    Except
    r4       5         n
    r8       2         n
    f1       ?         ?
    r3       11        n
  Snapshot 2
    regID    result    Except
    r4       5         n
    f1       ?         y
    r3       11        n
    r8       2         n
  Snapshot 3
    regID    result    Except
    r4       5         n
    f1       ?         y
    r3       11        n
21
Reorder Buffer Example
ROB contents over Time (H = Head, T = Tail):
  Snapshot 1
    no entries listed
  Snapshot 2
    first inst of fault handler