Title: ELEN 350 MultiCycle Datapath
1. ELEN 350: Multi-Cycle Datapath
- Adapted from the lecture notes of John Kubiatowicz (UCB) and Hank Walker (TAMU)
2. Abstract View of Our Single-Cycle Processor
[Figure: Control Unit with inputs op, fun and Equal, driving the control signals nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd and MemWr. Datapath blocks: Next PC, PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Reg. Wrt]
- Looks like an FSM with the PC as state
3. What's Wrong with Our CPI = 1 Processor?
[Figure: components on the path of each instruction class]
Arithmetic & Logical:  PC, Inst Memory, Reg File, mux, ALU, mux, setup
Load (critical path):  PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
Store:                 PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch:                PC, Inst Memory, Reg File, cmp, mux
- All instructions take as much time as the slowest
- Long Cycle Time
- Real memory is not as nice as our idealized memory
  - It cannot always get the job done in one (short) cycle
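The cost of "all instructions take as much time as the slowest" can be made concrete with a small calculation. The following is a minimal Python sketch: the component delays are hypothetical numbers chosen only for illustration (the slides give none), and the component lists follow the paths named on this slide.

```python
# Hypothetical component delays in ns (illustrative only, not from the slides).
DELAY = {"PC": 1, "Inst Memory": 10, "Reg File": 5, "mux": 1,
         "ALU": 8, "Data Mem": 10, "cmp": 3, "setup": 1}

# Components each instruction class passes through in the single-cycle design.
PATHS = {
    "Arithmetic/Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}


def path_delay(path):
    """Sum the delays of the components along one path."""
    return sum(DELAY[component] for component in path)


delays = {name: path_delay(path) for name, path in PATHS.items()}

# With one cycle per instruction, the clock must cover the slowest path.
critical = max(delays, key=delays.get)
cycle_time = delays[critical]

for name, d in delays.items():
    print(f"{name:20s} {d:3d} ns (idle for {cycle_time - d} ns of each cycle)")
print(f"Critical path: {critical}, cycle time {cycle_time} ns")
```

Under these assumed delays the load path sets the cycle time, and every other instruction class wastes the difference on each cycle.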
4. Reducing Cycle Time
- Cut the combinational dependency graph and insert a register / latch
- Do the same work in two fast cycles, rather than one slow one
- May be able to short-circuit the path and remove some components for some instructions!
[Figure: storage element, Combinational Logic (A), storage element, Combinational Logic (B), storage element]
5. Partitioning the Single-Cycle Datapath
- Add registers between smallest steps
- Place enables on all registers
[Figure: single-cycle datapath cut into steps: Next PC, PC, Instruction Fetch, Operand Fetch, Exec, Mem Access, Data Mem, Result Store, Reg. File. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr]
6. Example Multicycle Datapath
[Figure: multicycle datapath showing Next PC, PC, IR (Instruction Fetch) and Reg File, A, B, E (Operand Fetch), with signals Equal and nPC_sel]
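7. R-type (add, sub, ...)
- Instruction and logical register transfers
  ADDU   R[rd] <- R[rs] + R[rt]; PC <- PC + 4
- Register transfers by cycle
  1. IR <- MEM[PC]
  2. A <- R[rs]; B <- R[rt]
  3. S <- A + B
  4. R[rd] <- S; PC <- PC + 4
[Figure: multicycle datapath: Next PC, PC, Inst. Mem, IR, Reg File, E, Exec, Mem Access, Data Mem, Reg. File]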
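8. Logical Immediate
- Instruction and logical register transfers
  ORI    R[rt] <- R[rs] OR ZExt(Im16); PC <- PC + 4
- Register transfers by cycle
  1. IR <- MEM[PC]
  2. A <- R[rs]; B <- R[rt]
  3. S <- A OR ZExt(Im16)
  4. R[rt] <- S; PC <- PC + 4
[Figure: multicycle datapath: Next PC, PC, Inst. Mem, IR, Reg File, B, E, Exec, Mem Access, Data Mem, Reg. File]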
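9. Load
- Instruction and logical register transfers
  LW     R[rt] <- MEM[R[rs] + SExt(Im16)]; PC <- PC + 4
- Register transfers by cycle
  1. IR <- MEM[PC]
  2. A <- R[rs]; B <- R[rt]
  3. S <- A + SExt(Im16)
  4. M <- MEM[S]
  5. R[rt] <- M; PC <- PC + 4
[Figure: multicycle datapath: Next PC, PC, Inst. Mem, IR, Reg File, B, E, Exec, Mem Access, Data Mem, Reg. File]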
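10. Store
- Instruction and logical register transfers
  SW     MEM[R[rs] + SExt(Im16)] <- R[rt]; PC <- PC + 4
- Register transfers by cycle
  1. IR <- MEM[PC]
  2. A <- R[rs]; B <- R[rt]
  3. S <- A + SExt(Im16)
  4. MEM[S] <- B; PC <- PC + 4
[Figure: multicycle datapath: Next PC, PC, Inst. Mem, IR, Reg File, B, E, Exec, Mem Access, Data Mem, Reg. File]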
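11. Branch
- Instruction and logical register transfers
  BEQ    if R[rs] == R[rt] then PC <- PC + 4 + (SExt(Im16) || 00) else PC <- PC + 4
- Register transfers by cycle
  1. IR <- MEM[PC]
  2. E <- (R[rs] == R[rt])
  3. if !E then PC <- PC + 4 else PC <- PC + 4 + (SExt(Im16) || 00)
[Figure: multicycle datapath: Next PC, PC, Inst. Mem, IR, Reg File, A, B, E, Exec, Mem Access, Data Mem, Reg. File]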
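The per-cycle register transfers for the five instruction classes can be tried out in a few lines of Python. This is only a sketch under stated assumptions: instructions are given as already-decoded tuples `(op, rs, rt, rd, imm16)` instead of 32-bit words, instruction and data memory are separate dictionaries, and the names `run` and `sext16` are made up for the example.

```python
def sext16(imm):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def run(program, regs, mem, max_instr=100):
    """Run decoded instructions and return the total number of cycles.

    program maps a byte address to (op, rs, rt, rd, imm16); this decoded
    tuple stands in for the IR contents and is an assumption of the sketch.
    """
    pc = cycles = 0
    for _ in range(max_instr):
        if pc not in program:
            break
        op, rs, rt, rd, imm = program[pc]          # 1. IR <- MEM[PC]
        if op not in ("ADDU", "ORI", "LW", "SW", "BEQ"):
            raise ValueError(f"unsupported instruction: {op}")
        a, b = regs[rs], regs[rt]                  # 2. A <- R[rs]; B <- R[rt]
        e = a == b                                 #    E <- (R[rs] == R[rt])
        cycles += 2
        if op == "BEQ":                            # 3. PC <- PC + 4 (+ offset if E)
            pc += 4 + ((sext16(imm) << 2) if e else 0)
            cycles += 1
            continue
        if op == "ADDU":
            s = (a + b) & 0xFFFFFFFF               # 3. S <- A + B
        elif op == "ORI":
            s = a | (imm & 0xFFFF)                 # 3. S <- A OR ZExt(Im16)
        else:
            s = (a + sext16(imm)) & 0xFFFFFFFF     # 3. S <- A + SExt(Im16)
        cycles += 1
        if op == "LW":
            m = mem.get(s, 0)                      # 4. M <- MEM[S]
            cycles += 1
            regs[rt] = m                           # 5. R[rt] <- M
        elif op == "SW":
            mem[s] = b                             # 4. MEM[S] <- B
        elif op == "ADDU":
            regs[rd] = s                           # 4. R[rd] <- S
        else:
            regs[rt] = s                           # 4. R[rt] <- S
        regs[0] = 0                                # register 0 stays zero
        pc += 4                                    # PC <- PC + 4 in the last cycle
        cycles += 1
    return cycles


regs = [0] * 32
mem = {}
program = {
    0:  ("ORI", 0, 1, 0, 5),    # R1 <- 5
    4:  ("ORI", 0, 2, 0, 7),    # R2 <- 7
    8:  ("ADDU", 1, 2, 3, 0),   # R3 <- R1 + R2
    12: ("SW", 0, 3, 0, 16),    # MEM[16] <- R3
    16: ("LW", 0, 4, 0, 16),    # R4 <- MEM[16]
    20: ("BEQ", 3, 4, 0, 1),    # R3 == R4, so the next instruction is skipped
    24: ("ORI", 0, 5, 0, 1),    # skipped by the taken branch
}
cycles = run(program, regs, mem)
print("cycles:", cycles, "R3:", regs[3], "R4:", regs[4])
```

The six executed instructions take 4 + 4 + 4 + 4 + 5 + 3 = 24 cycles, which is the per-type cycle count used in the performance evaluation.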
12. Performance Evaluation
- What is the average CPI?
  - The state diagram gives the CPI for each instruction type
  - The workload gives the frequency of each type
Type          CPIi for type   Frequency   CPIi x freqi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6
Average CPI                               4.1
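The average is the frequency-weighted sum of the per-type CPIs. As a quick check, the short Python sketch below redoes the arithmetic with the CPI and frequency values from the table; the dictionary layout and variable names are just for illustration.

```python
# (CPI, frequency) for each instruction type, taken from the table.
workload = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}

# The frequencies must cover the whole instruction mix.
total_freq = sum(freq for _, freq in workload.values())
assert abs(total_freq - 1.0) < 1e-9

# Average CPI = sum over types of CPI_i * freq_i.
average_cpi = sum(cpi * freq for cpi, freq in workload.values())
print(f"Average CPI = {average_cpi:.1f}")  # prints "Average CPI = 4.1"
```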
13. Verilog Implementation (IM)
  module IM(IR, PC, clk, IRen);
    output [31:0] IR;
    input  [31:0] PC;
    input         clk, IRen;
    reg  [31:0] IR;
    reg  [31:0] mem[0:1023];
    wire [31:0] IR_next;

    // OK, but slow
    // always @(posedge clk)
    //   IR <= mem[PC[11:2]];

    // Word index: 10 bits (PC[11:2]) address the 1024-word memory.
    assign IR_next = mem[PC[11:2]];

    always @(posedge clk)
      if (IRen) IR <= IR_next;   // load IR only when enabled
  endmodule
[Figure: PC, Inst. Mem, IR]
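14. Verilog Implementation (REGS)
  module REGS(A, B, E, RA, RB, RW, W, RegWr, clk, REGSen);
    output [31:0] A, B;
    output        E;               // A == B
    input  [4:0]  RA, RB, RW;
    input  [31:0] W;
    input         RegWr, clk, REGSen;
    reg  [31:0] A, B;
    reg         E;
    wire [31:0] A_next, B_next;    // values read from the register file
    wire        E_next;
    reg  [31:0] regs[0:31];

    assign A_next = regs[RA];
    assign B_next = regs[RB];
    assign E_next = (A_next == B_next) ? 1 : 0;

    always @(posedge clk) begin
      if (REGSen == 1) begin       // latch the operands during decode
        A <= A_next;
        B <= B_next;
        E <= E_next;
      end
      if (RegWr == 1'b1) begin     // write-back happens in a later state
        regs[RW] <= W;
        regs[0]  <= 0;             // register 0 always holds zero
      end
    end
  endmodule
[Figure: Reg File with output registers A, B and E]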
15. Verilog Implementation (ALU)
  module ALU(S, A, B, ALUCtr, clk, ALUen);
    output [31:0] S;
    input  [31:0] A, B;
    input  [2:0]  ALUCtr;
    input         clk, ALUen;
    reg  [31:0] S, S_next;

    // Combinational result
    always @(A or B or ALUCtr) begin
      if (ALUCtr == 3'h0) S_next = A + B;
      ...
    end

    // Result register, loaded only when enabled
    always @(posedge clk) begin
      if (ALUen == 1) S <= S_next;
    end
  endmodule
[Figure: A, B, Exec, S]
16. Control
- State specifies control points for Register Transfer
- Transfer occurs upon entering state (rising edge)
[Figure: FSM block diagram: inputs, Next State Logic, Current State, Output Logic, output control signals]
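17. State Machine for Multicycle MIPS
start / instruction fetch:  IR <- MEM[PC]
decode / operand fetch:     A <- R[rs]; B <- R[rt]
Then one branch per instruction type, one line per state:
BEQ
  if E then PC <- PC + 4 + (SExt(Im16) || 00) else PC <- PC + 4
R-type
  S <- A fun B
  R[rd] <- S; PC <- PC + 4
ORi
  S <- A OR ZExt(Im16)
  R[rt] <- S; PC <- PC + 4
LW
  S <- A + SExt(Im16)
  M <- MEM[S]
  R[rt] <- M; PC <- PC + 4
SW
  S <- A + SExt(Im16)
  MEM[S] <- B; PC <- PC + 4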
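18. State Machine that Generates Control Signals
Each state is shown with the control signals it asserts in brackets.
start / instruction fetch:  IR <- MEM[PC]            [IRen]
decode:                     A <- R[rs]; B <- R[rt]   [REGSen]
R-type
  S <- A fun B                                       [ALUCtr, ALUen]
  R[rd] <- S; PC <- PC + 4                           [RegDst, RegWr, PCen]
The BEQ, ORi, LW and SW branches have the same states as in the previous state machine.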
19. State Machine Implementation in Verilog 1
  module CTRL(clk, rst, opcode, IRen, REGSen, ALUen, ALUCtr, REGDst, REGWr, PCen);
    input        clk, rst;
    input  [5:0] opcode;
    // Outputs are driven by continuous assignments, so they are wires.
    output       IRen, REGSen, ALUen, REGDst, REGWr, PCen;
    output [2:0] ALUCtr;              // 3 bits, matching the ALU module
    reg    [3:0] state, next_state;
    parameter [3:0] START = 0, DECODE = 1,
                    RTYPE_1 = 2, RTYPE_2 = 3; // other states omitted
20. State Machine in Verilog 2
    always @(posedge clk or negedge rst) begin
      if (!rst) state <= START;       // asynchronous reset
      else      state <= next_state;
    end

    always @(opcode or state) begin
      next_state = START;             // default, avoids inferring a latch
      case (state)
        START:   next_state = DECODE;
        // MIPS opcodes: R-type = 6'h00, ORI = 6'h0d, LW = 6'h23
        DECODE:  if      (opcode == 6'h00) next_state = RTYPE_1;
                 else if (opcode == 6'h0d) next_state = ORI;
                 else if (opcode == 6'h23) next_state = LW;
        // other states omitted
        RTYPE_1: next_state = RTYPE_2;
        RTYPE_2: next_state = START;
      endcase
    end
21. State Machine in Verilog 3
    assign IRen   = (state == START)  ? 1 : 0;
    assign REGSen = (state == DECODE) ? 1 : 0;
    assign ALUen  = (state == RTYPE_1 || state == ORI ||
                     state == LW || state == SW) ? 1 : 0;
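22. Assigning States
start / instruction fetch:  IR <- MEM[PC]            0000
decode:                     A <- R[rs]; B <- R[rt]   0001
BEQ
  PC <- next PC (depends on E)                       0011
R-type
  S <- A fun B                                       0100
  R[rd] <- S; PC <- PC + 4                           0101
ORi
  S <- A OR ZExt(Im16)                               0110
  R[rt] <- S; PC <- PC + 4                           0111
LW
  S <- A + SExt(Im16)                                1000
  M <- MEM[S]                                        1001
  R[rt] <- M; PC <- PC + 4                           1010
SW
  S <- A + SExt(Im16)                                1011
  MEM[S] <- B; PC <- PC + 4                          1100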
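23. (Mostly) Detailed Control Specification (missing => 0)
Missing entries are shown as "." and mean 0.
Column groups: Ops = A B E; Exec = Ex Sr ALU S; Mem = R W M; Write-Back = M-R Wr Dst.
State  Op field  Eq  Next  IR  PC en  PC sel  A  B  E  Ex  Sr  ALU  S  R  W  M  M-R  Wr  Dst  Instr
0000   ??????    ?   0001  1   .      .       .  .  .  .   .   .    .  .  .  .  .    .   .
0001   BEQ       x   0011  .   .      .       1  1  1  .   .   .    .  .  .  .  .    .   .
0001   R-type    x   0100  .   .      .       1  1  1  .   .   .    .  .  .  .  .    .   .
0001   ORI       x   0110  .   .      .       1  1  1  .   .   .    .  .  .  .  .    .   .
0001   LW        x   1000  .   .      .       1  1  1  .   .   .    .  .  .  .  .    .   .
0001   SW        x   1011  .   .      .       1  1  1  .   .   .    .  .  .  .  .    .   .
0011   xxxxxx    0   0000  .   1      0       .  .  .  .   .   .    .  .  .  .  x    0   x    BEQ
0011   xxxxxx    1   0000  .   1      1       .  .  .  .   .   .    .  .  .  .  x    0   x    BEQ
0100   xxxxxx    x   0101  .   .      .       .  .  .  0   1   fun  1  .  .  .  .    .   .    R
0101   xxxxxx    x   0000  .   1      0       .  .  .  .   .   .    .  .  .  .  0    1   1    R
0110   xxxxxx    x   0111  .   .      .       .  .  .  0   0   or   1  .  .  .  .    .   .    ORi
0111   xxxxxx    x   0000  .   1      0       .  .  .  .   .   .    .  .  .  .  0    1   0    ORi
1000   xxxxxx    x   1001  .   .      .       .  .  .  1   0   add  1  .  .  .  .    .   .    LW
1001   xxxxxx    x   1010  .   .      .       .  .  .  .   .   .    .  1  0  1  .    .   .    LW
1010   xxxxxx    x   0000  .   1      0       .  .  .  .   .   .    .  .  .  .  1    1   0    LW
1011   xxxxxx    x   1100  .   .      .       .  .  .  1   0   add  1  .  .  .  .    .   .    SW
1100   xxxxxx    x   0000  .   1      0       .  .  .  .   .   .    .  0  1  0  .    .   .    SW
- The 0001 rows all have the same outputs, as in a Moore machine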
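The State and Next columns of the control specification are enough to walk the state machine for each instruction type. The Python sketch below does that with a small lookup table; the 4-bit state codes are the ones from the specification, while the names `NEXT`, `DISPATCH` and `state_sequence` are made up for this example.

```python
FETCH, DECODE = 0b0000, 0b0001

# Decode (0001) dispatches on the instruction type.
DISPATCH = {"BEQ": 0b0011, "R-type": 0b0100, "ORI": 0b0110,
            "LW": 0b1000, "SW": 0b1011}

# Every other state has a single successor.
NEXT = {
    FETCH: DECODE,
    0b0011: FETCH,                                   # BEQ
    0b0100: 0b0101, 0b0101: FETCH,                   # R-type
    0b0110: 0b0111, 0b0111: FETCH,                   # ORI
    0b1000: 0b1001, 0b1001: 0b1010, 0b1010: FETCH,   # LW
    0b1011: 0b1100, 0b1100: FETCH,                   # SW
}


def state_sequence(op_class):
    """Return the states visited by one instruction, starting at fetch."""
    seq = [FETCH]
    while True:
        state = seq[-1]
        nxt = DISPATCH[op_class] if state == DECODE else NEXT[state]
        if nxt == FETCH:          # back at fetch: the instruction is done
            return seq
        seq.append(nxt)


for op in DISPATCH:
    seq = state_sequence(op)
    print(f"{op:7s}", [format(s, "04b") for s in seq], "cycles:", len(seq))
```

The number of states visited is the CPI of that instruction type: 3 for BEQ, 4 for R-type, ORI and SW, and 5 for LW.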
24. Controller Design Alternative: Microprogramming
- The state machines defining the controller for an instruction set processor are highly structured
- Use this structure to construct a simple microsequencer
- Control reduces to programming this very simple device
  => microprogramming