Title: SEQ CPU Implementation
1 SEQ CPU Implementation
2 Outline
- Naïve PIPE Implementation
- Suggested Reading 4.5
3 [Figure: SEQ hardware structure, showing the Fetch, Decode, Execute, Memory, and Write Back stages and the PC, with signals such as icode, ifun, rA, rB, valC, valP, valA, valB, valE, valM, and Bch]
4 [Figure: stage layout with Fetch, Decode, Execute, and Memory]
5 Pipeline Stages
- Fetch
  - Select current PC
  - Read instruction
  - Compute incremented PC
- Decode
  - Read program registers
- Execute
  - Operate ALU
- Memory
  - Read or write data memory
- Write Back
  - Update register file
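As a rough illustration of the division of labor listed above, the following Python sketch walks a single `addl %edx,%eax` through the five stages in order. The dictionary register file and the name `run_addl` are assumptions made for this sketch. It ignores condition codes and 32-bit wraparound.

```python
# Hypothetical sketch: one addl instruction traced through the five stages.
def run_addl(regs, pc, rA, rB):
    """Return (new_regs, next_pc) after executing 'addl rA,rB' located at pc."""
    # Fetch: read the instruction and compute the incremented PC.
    # addl occupies 2 bytes in Y86 (icode:ifun byte, then rA:rB byte).
    valP = pc + 2
    # Decode: read the program registers.
    valA = regs[rA]
    valB = regs[rB]
    # Execute: operate the ALU.
    valE = valB + valA
    # Memory: addl neither reads nor writes data memory.
    # Write back: update the register file.
    new_regs = dict(regs)
    new_regs[rB] = valE
    return new_regs, valP

regs, next_pc = run_addl({"%edx": 10, "%eax": 3}, 0x00c, "%edx", "%eax")
print(regs["%eax"], hex(next_pc))
```

In the pipelined design, each of these steps is handled by separate hardware, so five different instructions can occupy the five stages at once.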
6 PIPE- Hardware
- Pipeline registers hold intermediate values from instruction execution
- Forward (Upward) Paths
  - Values passed from one stage to next
  - Cannot jump past stages
    - e.g., valC passes through decode
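The forward paths can be sketched as a chain of registers that each load from the preceding stage on every clock. This is a deliberately simplified model: the `clock` function and the dictionary layout are assumptions, and each stage passes its fields along unchanged rather than computing anything.

```python
# Hypothetical sketch: pipeline registers D, E, M, W as a simple shift chain.
def clock(regs, fetched):
    """Clock edge: every pipeline register loads from the preceding stage."""
    return {"D": fetched, "E": regs["D"], "M": regs["E"], "W": regs["M"]}

regs = {"D": None, "E": None, "M": None, "W": None}

# An irmovl leaves fetch carrying its constant valC.
regs = clock(regs, {"icode": "irmovl", "valC": 10})
after_one = regs

# One cycle later the same fields have moved on to register E.
regs = clock(regs, None)
after_two = regs

print(after_one["D"], after_two["E"])
```

Even though decode has no use for valC, the value still spends a cycle in register D before execute can see it in register E.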
7 [Figure: PIPE- hardware structure. Pipeline registers F, D, E, M, and W separate the Fetch, Decode, Execute, Memory, and Write back stages. Labeled signals include predPC, f_PC, d_srcA, d_srcB, d_rvalA, e_Bch, M_Bch, M_valA, W_valE, and W_valM]
8 Feedback Paths
- Predicted PC
  - Guess value of next PC
- Branch information
  - Jump taken/not-taken
  - Fall-through or target address
- Return point
  - Read from memory
- Register updates
  - To register file write ports
9 [Figure: PIPE- hardware structure, repeated from slide 7]
10 Predicting the PC
- Start fetch of new instruction after current one has completed fetch stage
  - Not enough time to reliably determine next instruction
- Guess which instruction will follow
  - Recover if prediction was incorrect
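11 Predicting the PC
[Figure: fetch-stage hardware with pipeline register F (predPC), Select PC, Instruction Memory (split into Byte 0 and Bytes 1-5), the Instr valid, Need regids, and Need valC signals, Predict PC, and pipeline register D. Inputs to Select PC: M_icode, M_Bch, M_valA, W_icode, W_valM]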
12 Our Prediction Strategy
- Instructions that Don't Transfer Control
  - Predict next PC to be valP
  - Always reliable
- Call and Unconditional Jumps
  - Predict next PC to be valC (destination)
  - Always reliable
13 Our Prediction Strategy
- Conditional Jumps
  - Predict next PC to be valC (destination)
  - Only correct if branch is taken
  - Typically right 60% of the time
- Return Instruction
  - Don't try to predict
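The strategy on these two slides can be summarized in a few lines. This is a sketch only: the string instruction codes and the choice to return `None` for `ret` (meaning "no prediction") are assumptions of this example, not how the hardware encodes it.

```python
# Hypothetical sketch of the prediction strategy described above.
def predict_pc(icode, valC, valP):
    """Return the predicted next PC, or None when no prediction is attempted."""
    if icode in ("IJXX", "ICALL"):
        # Jumps (conditional or not) and calls: predict the destination.
        return valC
    if icode == "IRET":
        # Return: the target is on the stack, so do not guess.
        return None
    # Everything else falls through to the next sequential instruction.
    return valP

print(hex(predict_pc("IJXX", 0x100, 0x005)))
print(hex(predict_pc("IOPL", 0x000, 0x00e)))
print(predict_pc("IRET", 0x000, 0x001))
```

Only the conditional-jump case can be wrong, which is why recovery logic is needed for jumps and for `ret`.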
14 Recovering from PC Misprediction
- Mispredicted Jump
  - Will see branch flag once instruction reaches memory stage
  - Can get fall-through PC from valA
- Return Instruction
  - Will get return PC when ret reaches write-back stage
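15 Select PC
[Figure: fetch-stage PC logic. Select PC takes M_icode, M_Bch, M_valA, W_icode, W_valM, and the predicted PC and produces f_PC; Predict PC produces the next predicted PC]
int F_predPC = [
    f_icode in { IJXX, ICALL } : f_valC;
    1 : f_valP;
];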
16 Select PC
int f_PC = [
    # Mispredicted branch. Fetch at incremented PC
    M_icode == IJXX && !M_Bch : M_valA;
    # Completion of RET instruction
    W_icode == IRET : W_valM;
    # Default: Use predicted value of PC
    1 : F_predPC;
];
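For readers less used to HCL case expressions, the PC selection can be mirrored in Python. The cases are tested in order and the first match wins. The function name `select_pc` and the string instruction codes are assumptions of this sketch.

```python
# Hypothetical Python mirror of the f_PC selection logic.
def select_pc(M_icode, M_Bch, M_valA, W_icode, W_valM, F_predPC):
    # Mispredicted branch: jump in memory stage was not taken,
    # so fetch at the fall-through address carried in M_valA.
    if M_icode == "IJXX" and not M_Bch:
        return M_valA
    # ret has reached write-back: the return address was read from memory.
    if W_icode == "IRET":
        return W_valM
    # Default: use the predicted value of the PC.
    return F_predPC

print(hex(select_pc("IJXX", False, 0x01a, "INOP", 0, 0x100)))
print(hex(select_pc("INOP", False, 0, "IRET", 0x02c, 0x100)))
print(hex(select_pc("INOP", False, 0, "INOP", 0, 0x100)))
```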
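17 Pipeline Demonstration
Cycle                     1  2  3  4  5  6  7  8  9
irmovl $1,%eax  # I1      F  D  E  M  W
irmovl $2,%ecx  # I2         F  D  E  M  W
irmovl $3,%edx  # I3            F  D  E  M  W
irmovl $4,%ebx  # I4               F  D  E  M  W
halt            # I5                  F  D  E  M  W
Cycle 5: W holds I1, M holds I2, E holds I3, D holds I4, F holds I5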
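The staircase pattern in this demonstration follows one rule: with no hazards, instruction i enters fetch in cycle i and advances one stage per cycle. A small sketch that regenerates the diagram and the cycle 5 snapshot (function names are assumptions for illustration):

```python
# Hypothetical sketch: ideal pipeline timing with no stalls.
STAGES = "FDEMW"

def pipeline_diagram(n_instr):
    """Row i is instruction i+1; column c is cycle c+1."""
    n_cycles = n_instr + len(STAGES) - 1
    rows = []
    for i in range(n_instr):
        row = [" "] * n_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append("".join(row))
    return rows

def stage_contents(n_instr, cycle):
    """Map stage letter to the 1-based instruction number in that cycle."""
    contents = {}
    for i in range(n_instr):
        idx = cycle - 1 - i
        if 0 <= idx < len(STAGES):
            contents[STAGES[idx]] = i + 1
    return contents

for row in pipeline_diagram(5):
    print(row)
print(stage_contents(5, 5))
```

Five instructions need 9 cycles in total, and cycle 5 is the first in which all five stages are busy.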
18 Data Dependencies in Processors
- Result from one instruction used as operand for another
  - Read-after-write (RAW) dependency
- Very common in actual programs
- Must make sure our pipeline handles these properly
  - Get correct results
  - Minimize performance impact
19 Data Dependencies: 3 Nops
# demo-h3.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: nop
0x00d: nop
0x00e: nop
0x00f: addl %edx,%eax
0x011: halt
Cycle 6, stage W: R[%eax] ← 3
Cycle 7, stage D: valA ← R[%edx] = 10, valB ← R[%eax] = 3
20 Data Dependencies: 2 Nops
# demo-h2.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: nop
0x00d: nop
0x00e: addl %edx,%eax
0x010: halt
21 Data Dependencies: 1 Nop
# demo-h1.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: nop
0x00d: addl %edx,%eax
0x00f: halt
22 Data Dependencies: No Nop
# demo-h0.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: addl %edx,%eax
0x00e: halt
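The four demo programs differ only in how many nops separate the irmovl instructions from addl. The sketch below estimates which operand values addl would read in a pipeline with no hazard handling. It rests on stated assumptions: instruction i (counting from 0) decodes in cycle i + 2 and writes back in cycle i + 5, a register write only becomes visible in the cycle after write-back, and both registers start at 0.

```python
# Hypothetical sketch: operands seen by addl when hazards are ignored.
def naive_addl_operands(n_nops):
    """Return (valA, valB) read by 'addl %edx,%eax' after n_nops nops."""
    program = [("irmovl", 10, "%edx"), ("irmovl", 3, "%eax")]
    program += [("nop",)] * n_nops
    addl_index = len(program)          # 0-based position of addl
    decode_cycle = addl_index + 2      # instruction i decodes in cycle i + 2
    regs = {"%edx": 0, "%eax": 0}      # assumed initial register values
    for i, instr in enumerate(program):
        if instr[0] == "irmovl":
            writeback_cycle = i + 5    # instruction i writes back in cycle i + 5
            # The write lands at the end of that cycle, so only later
            # cycles can read the new value.
            if writeback_cycle < decode_cycle:
                regs[instr[2]] = instr[1]
    return regs["%edx"], regs["%eax"]

for n in (3, 2, 1, 0):
    print(n, naive_addl_operands(n))
```

Under these assumptions only the 3-nop version reads both new values; with 2 nops `%eax` is still stale, and with 1 or 0 nops both operands are stale.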
23 Classes of Data Hazards
- Hazards can potentially occur when one instruction updates part of the program state that is read by a later instruction
24 Classes of Data Hazards
- Program states
  - Program registers
    - The hazards already identified.
  - Condition codes
    - Both written and read in the execute stage.
    - No hazards can arise.
  - Program counter
    - Conflicts between updating and reading PC cause control hazards.
  - Memory
    - Both written and read in the memory stage.
    - Without self-modifying code, no hazards.
25 Data Dependencies: 2 Nops
# demo-h2.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: nop
0x00d: nop
0x00e: addl %edx,%eax
0x010: halt
26 Data Dependencies: 2 Nops
# demo-h2.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: nop
0x00d: nop
       bubble
0x00e: addl %edx,%eax
0x010: halt
- If instruction follows too closely after one that writes register, slow it down
- Hold instruction in decode
- Dynamically inject nop into execute stage
27 [Figure: PIPE- hardware structure, repeated from slide 7, with srcA, srcB, dstE, and dstM additionally labeled at the decode stage]
28 Stall Condition
- Source Registers
  - srcA and srcB of current instruction in decode stage
- Destination Registers
  - dstE and dstM fields
  - Instructions in execute, memory, and write-back stages
- Condition
  - srcA == dstE or srcA == dstM
  - srcB == dstE or srcB == dstM
- Special Case
  - Don't stall for register ID 8
    - Indicates absence of register operand
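The condition above amounts to a membership test with one exclusion. A minimal sketch, assuming register IDs are small integers (%eax = 0, %edx = 2) and that the caller collects the dstE and dstM fields of the three later stages into one list; `must_stall` and `RNONE` are names chosen for this example:

```python
# Hypothetical sketch of the stall condition.
RNONE = 8  # register ID 8: no register operand

def must_stall(d_srcA, d_srcB, dests):
    """dests: dstE and dstM values from the execute, memory, write-back stages."""
    for src in (d_srcA, d_srcB):
        # An absent operand never causes a stall, even if some dst is also RNONE.
        if src != RNONE and src in dests:
            return True
    return False

# addl %edx,%eax in decode while irmovl $3,%eax (dstE = 0) is in execute
# and irmovl $10,%edx (dstE = 2) is in memory.
print(must_stall(2, 0, [0, RNONE, 2, RNONE, RNONE, RNONE]))
# A nop in decode: both sources are RNONE, so no stall.
print(must_stall(RNONE, RNONE, [0, RNONE, 2, RNONE, RNONE, RNONE]))
```

Without the special case, the second call would report a stall, because unused dstM fields also hold ID 8.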
29 Data Dependencies: 2 Nops
# demo-h2.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
0x00c: nop
0x00d: nop
       bubble
0x00e: addl %edx,%eax
0x010: halt
30 Stalling X3
# demo-h0.ys
0x000: irmovl $10,%edx
0x006: irmovl $3,%eax
       bubble
       bubble
       bubble
0x00c: addl %edx,%eax
0x00e: halt
31 What Happens When Stalling?
- Stalling instruction held back in decode stage
- Following instruction stays in fetch stage
- Bubbles injected into execute stage
  - Like dynamically generated nops
  - Move through later stages
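The behavior described above can be checked with a small cycle-by-cycle model. This is a sketch under simplifying assumptions: instructions are dictionaries with a set of source registers and at most one destination, and the stall test compares the decode-stage sources against destinations in execute, memory, and write-back. The names `make` and `execute_trace` are hypothetical.

```python
# Hypothetical sketch: stalling in decode and injecting bubbles into execute.
def make(op, srcs=(), dst=None):
    return {"op": op, "srcs": set(srcs), "dst": dst}

BUBBLE = make("bubble")

def execute_trace(program):
    """Ops entering the execute stage, in order, including injected bubbles."""
    queue = list(program)
    F = queue.pop(0) if queue else None
    D = E = M = W = None               # None means the stage is still empty
    trace = []
    while F is not None or D is not None:
        pending = {s["dst"] for s in (E, M, W)
                   if s is not None and s["dst"] is not None}
        stall = D is not None and bool(D["srcs"] & pending)
        W, M = M, E                    # later stages always advance
        if stall:
            E = BUBBLE                 # hold D and F, inject a bubble
        else:
            E, D = D, F
            F = queue.pop(0) if queue else None
        if E is not None:
            trace.append(E["op"])
    return trace

def demo(n_nops):
    return ([make("irmovl", dst="%edx"), make("irmovl", dst="%eax")]
            + [make("nop")] * n_nops
            + [make("addl", srcs=("%edx", "%eax"), dst="%eax"), make("halt")])

print(execute_trace(demo(0)))
print(execute_trace(demo(2)))
```

In this model the no-nop program picks up three bubbles and the two-nop program picks up one, matching the listings on the earlier slides.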
32 Implementing Stalling
- Pipeline Control
  - Combinational logic detects stall condition
  - Sets mode signals for how pipeline registers should update
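One way to picture the mode signals is a register with two control inputs. The three behaviors below (load normally, hold on stall, reset to a nop-like state on bubble) follow the usual textbook treatment of the Y86 pipeline; since the slides on register modes carry no surviving detail, treat the class `PipeReg` and its interface as assumptions of this sketch.

```python
# Hypothetical sketch: a pipeline register with stall and bubble inputs.
class PipeReg:
    def __init__(self, reset_value):
        self.reset_value = reset_value   # state that behaves like a nop
        self.state = reset_value

    def clock(self, new_input, stall=False, bubble=False):
        """Update on a clock edge according to the mode signals."""
        if stall and bubble:
            raise ValueError("stall and bubble must not both be set")
        if stall:
            return self.state            # stall: keep the current state
        # bubble: load the reset value; normal: load the input
        self.state = self.reset_value if bubble else new_input
        return self.state

D = PipeReg("nop")
E = PipeReg("nop")
D.clock("addl")                          # normal: addl enters decode

# Stall detected: hold addl in D and inject a bubble into E.
D.clock("halt", stall=True)
E.clock(D.state, bubble=True)
print(D.state, E.state)
```

Stalling an instruction in decode therefore takes two signals at once: stall on D (and F) plus bubble on E.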
34 Pipeline Register Modes