Title: Computer Architecture Chapter 5
1Computer Architecture: Chapter 5
- Fall 2005
- Department of Computer Science
- Kent State University
2The Processor: Datapath and Control
- Our implementation of the MIPS is simplified
- memory-reference instructions: lw, sw
- arithmetic-logical instructions: add, sub, and, or, slt
- control flow instructions: beq, j
- Generic implementation
- use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
- decode the instruction (and read registers)
- execute the instruction
- All instructions (except j) use the ALU after reading the registers
- How? memory-reference? arithmetic? control flow?
3Abstract / Simplified View
- Two types of functional units
- elements that operate on data values (combinational)
- elements that contain state (sequential)
4More Implementation Details
5Fetching Instructions
- Fetching instructions involves
- reading the instruction from the Instruction Memory
- updating the PC to hold the address of the next instruction
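[Figure: instruction fetch datapath. The PC drives the Read Address input of the Instruction Memory, which outputs the Instruction; an adder adds 4 to the PC.]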
- PC is updated every cycle, so it does not need an explicit write control signal
- Instruction Memory is read every cycle, so it doesn't need an explicit read control signal
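The fetch step can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the `INSTRUCTION_MEMORY` dictionary, its two sample words, and the `fetch` function are hypothetical names introduced here.

```python
# Hypothetical instruction memory: maps a byte address to a 32-bit word.
INSTRUCTION_MEMORY = {
    0x0000: 0x8C430004,  # lw  $3, 4($2)
    0x0004: 0x00432020,  # add $4, $2, $3
}


def fetch(pc):
    """Return (instruction, next_pc) for the given program counter."""
    # The Instruction Memory is read every cycle, so there is no read enable.
    instruction = INSTRUCTION_MEMORY[pc]
    # The dedicated adder computes PC + 4 (wrapped to 32 bits).
    next_pc = (pc + 4) & 0xFFFFFFFF
    return instruction, next_pc


instruction, next_pc = fetch(0x0000)
```

Because both the read and the PC update happen unconditionally, the function needs no control inputs, which mirrors the absence of write and read control signals on the slide.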
6Fetch-Decode-Execute
- In order to execute an instruction we must
- Fetch the instruction from memory
- Determine what the instruction is (decode)
- Execute it
- Fetch and decode are the same for all instructions
- Execute depends on the type of instruction
7Instruction Formats
Format  Fields (bit ranges)
R-type  op (31-26)  rs (25-21)  rt (20-16)  rd (15-11)  shamt (10-6)  funct (5-0)
I-type  op (31-26)  rs (25-21)  rt (20-16)  immed (15-0)
J-type  op (31-26)  addr (25-0)
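The field boundaries above translate directly into shifts and masks. The sketch below is illustrative only; the function name `decode_fields` and the dictionary layout are assumptions made for this example.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into every field of the three formats.

    All fields are extracted unconditionally; which ones are meaningful
    depends on the format selected by the op field.
    """
    return {
        "op": (word >> 26) & 0x3F,     # bits 31-26, all formats
        "rs": (word >> 21) & 0x1F,     # bits 25-21, R- and I-type
        "rt": (word >> 16) & 0x1F,     # bits 20-16, R- and I-type
        "rd": (word >> 11) & 0x1F,     # bits 15-11, R-type
        "shamt": (word >> 6) & 0x1F,   # bits 10-6, R-type
        "funct": word & 0x3F,          # bits 5-0, R-type
        "immed": word & 0xFFFF,        # bits 15-0, I-type
        "addr": word & 0x03FFFFFF,     # bits 25-0, J-type
    }


fields = decode_fields(0x00432020)  # add $4, $2, $3
```

Extracting every field regardless of format matches what the hardware does: the wires are always there, and the control unit decides which ones matter.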
8Decoding Instructions
- Decoding instructions involves
- sending the fetched instruction's opcode and function field bits to the control unit
[Figure: the Instruction feeding the Control Unit]
- reading two values from the Register File
- Register File addresses are contained in the instruction
9Executing Load and Store
- Load
- Fetch operand (base address) from register
- Compute effective address
- Read data from memory
- Write result back to register
- Store
- Fetch operands from registers
- Compute effective address
- Write data to memory
10Executing Arithmetic/Logic
- Arithmetic/logic (add, sub, and, or, slt)
- Fetch operands from registers
- Perform operation
- Write result back to register
11Executing Branch and Jump
- Conditional branch (beq)
- Fetch operands from registers
- Compare operands
- If equal add displacement to PC
- Jump (j)
- Write new value to PC
12ALU Instructions
- Components
- Register File
- ALU
- Operation
- Use instruction fields to select registers
- Read source registers and send them to ALU
- Send ALU result to destination register
13Components for ALU Instrs
14ALU Datapath
15Memory Access
- Components
- Register File
- ALU
- Data Memory
- Sign-Extension Unit
- Operation
- ALU adds base register and sign-extended immediate
- Send ALU result to memory as the address
- Read the value from memory into the destination register (lw) or write the value from the source register into memory (sw)
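The memory-access operation above can be sketched as follows. This is a simplified model under stated assumptions: the register file is a Python list, the data memory is a dictionary keyed by byte address, and the helper names (`sign_extend16`, `effective_address`, `execute_memory_access`) are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, immed16):
    """ALU step: base register plus sign-extended immediate, wrapped to 32 bits."""
    return (base + sign_extend16(immed16)) & 0xFFFFFFFF


def execute_memory_access(is_load, regs, data_memory, rs, rt, immed16):
    """Perform lw (is_load=True) or sw (is_load=False); return the address used."""
    address = effective_address(regs[rs], immed16)
    if is_load:
        regs[rt] = data_memory.get(address, 0)   # lw: memory -> destination register
    else:
        data_memory[address] = regs[rt]          # sw: source register -> memory
    return address


regs = [0] * 32
regs[2], regs[5] = 0x1000, 42
memory = {}
execute_memory_access(False, regs, memory, rs=2, rt=5, immed16=0xFFFC)  # sw $5, -4($2)
execute_memory_access(True, regs, memory, rs=2, rt=6, immed16=0xFFFC)   # lw $6, -4($2)
```

Unwritten locations read as 0 here purely for convenience; real memory contents would be whatever was last stored.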
16Components for Mem Access
17Memory Access Datapath
18Branches
- Components
- Register File
- ALU
- Program Counter (PC)
- Adder
- Sign-Extension Unit
- Operation
- Send source register values to ALU for comparison
- Adder computes branch target address
- Control logic decides whether branch is taken or
not
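The branch operation described above can be sketched like this. It is an illustration rather than the slides' implementation; `branch_next_pc` is a hypothetical name, and the shift of the offset by two bits follows the branch-target formula used in the multicycle decode step of this deck.

```python
def branch_next_pc(pc, rs_value, rt_value, immed16):
    """Return the next PC for beq: the branch target if the operands are equal."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Sign-extend the 16-bit offset.
    offset = immed16 & 0xFFFF
    if offset & 0x8000:
        offset -= 0x10000
    # The separate adder computes the target from PC + 4 and the word offset.
    target = (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF
    # The ALU subtracts the operands; its Zero output signals equality.
    zero = ((rs_value - rt_value) & 0xFFFFFFFF) == 0
    # Control logic picks the target only when the branch is taken.
    return target if zero else pc_plus_4


taken = branch_next_pc(0x100, 7, 7, 3)
not_taken = branch_next_pc(0x100, 7, 8, 3)
```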
19Branch Datapath
20Putting It All Together
21Control Unit
- Control unit takes an instruction as input and produces control signals as output
- Types of control signals
- Multiplexor selector signals
- Write enables for state elements
- Control signals for other blocks (ALU, etc.)
- In a single-cycle datapath the control unit is simple: it just looks up the instruction in a table
22Control Signals
- RegDst: Selects either rd or rt as the destination register
- RegWrite: When asserted, the value on the write data port is written into the register specified by the write register input
- ALUOp: Selects the ALU operation
- ALUSrc: Selects the second ALU input to be either the second register output or the sign-extended immediate value
23Control Signals (cont'd)
- PCSrc: Selects the new PC as either PC + 4 or the output of the branch target adder
- This signal is derived from the Branch control signal and the ALU's Zero output
- MemRead/MemWrite: Causes data memory to perform a read/write operation when asserted
- MemToReg: Selects either the ALU output or the data memory output as the data input into the register file
24ALU Control
- In order to simplify design of the control unit we give the ALU its own control logic
- The ALU control block takes a 2-bit input from the control unit (ALUOp) and the funct field from the instruction and produces the ALU control signals
25ALU Control Signals
Instruction  ALUOp  funct Field  ALU Function      ALU Control Input
lw           00     xxxxxx       Add               0010
sw           00     xxxxxx       Add               0010
beq          01     xxxxxx       Subtract          0110
add          10     100000       Add               0010
sub          10     100010       Subtract          0110
and          10     100100       AND               0000
or           10     100101       OR                0001
slt          10     101010       Set on less than  0111
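The table can be read as a small two-level decoder followed by the ALU itself. The sketch below assumes 32-bit operands passed as non-negative Python integers; the names `alu_control`, `alu`, and `FUNCT_TO_ALU` are introduced here for illustration.

```python
MASK32 = 0xFFFFFFFF

# funct field -> 4-bit ALU control input, taken from the table rows with ALUOp = 10.
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # sub
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
}


def alu_control(alu_op, funct):
    """Map the 2-bit ALUOp and the funct field to the 4-bit ALU control input."""
    if alu_op == 0b00:   # lw, sw: always add, funct ignored
        return 0b0010
    if alu_op == 0b01:   # beq: always subtract, funct ignored
        return 0b0110
    if alu_op == 0b10:   # R-type: operation chosen by funct
        return FUNCT_TO_ALU[funct]
    raise ValueError("unsupported ALUOp")


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(control, a, b):
    """Return (result, zero) for the given 4-bit ALU control input."""
    if control == 0b0000:
        result = a & b
    elif control == 0b0001:
        result = a | b
    elif control == 0b0010:
        result = a + b
    elif control == 0b0110:
        result = a - b
    elif control == 0b0111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unsupported ALU control input")
    result &= MASK32
    return result, result == 0
```

Splitting the decoding this way keeps the main control unit small: it only needs to emit two bits, and the funct field is interpreted locally next to the ALU.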
26Operation of Control Unit
Signal    ALU (R-type)  lw  sw  beq
ALUOp     10            00  00  01
ALUSrc    0             1   1   0
Branch    0             0   0   1
MemRead   0             1   0   0
MemWrite  0             0   1   0
MemToReg  0             1   x   x
RegDst    1             0   x   x
RegWrite  1             1   0   0
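Because the single-cycle control unit is a pure table lookup, it can be sketched as a dictionary keyed by opcode. The opcode values (0x00 for R-type, 0x23 for lw, 0x2B for sw, 0x04 for beq) are the standard MIPS encodings and do not appear on the slide; `CONTROL_TABLE`, `control`, and `pc_src` are hypothetical names, and `None` stands for the table's don't-care entries.

```python
CONTROL_TABLE = {
    0x00: dict(ALUOp=0b10, ALUSrc=0, Branch=0, MemRead=0, MemWrite=0,
               MemToReg=0, RegDst=1, RegWrite=1),        # R-type
    0x23: dict(ALUOp=0b00, ALUSrc=1, Branch=0, MemRead=1, MemWrite=0,
               MemToReg=1, RegDst=0, RegWrite=1),        # lw
    0x2B: dict(ALUOp=0b00, ALUSrc=1, Branch=0, MemRead=0, MemWrite=1,
               MemToReg=None, RegDst=None, RegWrite=0),  # sw
    0x04: dict(ALUOp=0b01, ALUSrc=0, Branch=1, MemRead=0, MemWrite=0,
               MemToReg=None, RegDst=None, RegWrite=0),  # beq
}


def control(opcode):
    """Look up the control signals for an opcode (bits 31-26 of the instruction)."""
    return CONTROL_TABLE[opcode]


def pc_src(branch, zero):
    """PCSrc is the AND of the Branch control signal and the ALU's Zero output."""
    return branch & zero


signals = control(0x23)  # lw
```

The don't-care entries are safe for sw and beq because RegWrite is 0 for both, so the values on the register write path are never stored.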
27Datapath with Control Unit
28Jump Instructions
- The unconditional branch instruction (j) computes its branch target differently from the conditional branch instruction (beq)
- Branch target address is the concatenation of
- Top 4 bits of PC + 4
- 26-bit immediate value
- Two zero bits
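The concatenation of the three pieces can be written with masks and a shift. This is a small sketch; `jump_target` is a name chosen for the example, and the sample instruction word uses the standard MIPS opcode 2 for j.

```python
def jump_target(pc, instruction):
    """Build the 32-bit jump target from PC + 4 and the 26-bit address field."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000            # top 4 bits of PC + 4
    lower = (instruction & 0x03FFFFFF) << 2   # 26-bit field followed by two zero bits
    return upper | lower


target = jump_target(0x40000000, 0x08000010)  # j with address field 0x10
```

Since only 28 bits come from the instruction, a jump can reach any word within the current 256 MB region selected by the top four PC bits.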
29Datapath with Jump
30Performance
- The single-cycle datapath executes each instruction in just one cycle
- CPI is 1.0, which is optimal
- However, minimum clock cycle time is determined by slowest instruction
- In practice the execution time can vary considerably between instructions, making a single-cycle implementation a poor choice
31Using Multiple Cycles
- A multi-cycle datapath splits instruction execution into multiple steps, where each step takes one cycle
- If an instruction doesn't need a step it skips it, so different instructions run for different numbers of cycles
- Slow instructions don't slow down the entire processor
- Control unit becomes more complicated
- Hardware can be shared between steps
32Multicycle Datapath (1)
33Multicycle Differences
- A functional unit can be used more than once in the execution of an instruction, so long as those uses occur in different steps
- Instruction memory and data memory are combined into a single unit
- ALU takes over for the two separate adders
- Additional registers are needed to save information between steps
34Multicycle Registers
- Instruction register (IR): holds the instruction during its execution
- Memory data register (MDR): holds the data read from memory for one cycle
- A: holds a source register value for one cycle
- B: holds a source register value for one cycle
- ALUOut: holds the ALU output for one cycle
35Multicycle Datapath (2)
36Multicycle Datapath (3)
37New Control Signals
- ALUSrcA: selects the first ALU operand to be either the PC or the A register
- ALUSrcB: selects the second ALU operand from the B register, the constant 4, the sign-extended immediate, or the sign-extended and shifted immediate
- MemtoReg: selects register file write data as coming from either ALUOut or MDR
- IorD: selects the memory address as coming from either PC or ALUOut
38New Control Signals (cont'd)
- IRWrite: If asserted, the memory output is written to IR
- PCSource: Selects the new value for the PC from the ALU output, ALUOut, or the jump target address
- PCWrite: If asserted, the PC is written
- PCWriteCond: If asserted and the Zero output from the ALU is 1, the PC is written
39Instruction Execution Steps
- Instruction fetch
- Instruction decode and register fetch
- Execution, memory address computation, or branch completion
- Memory access or R-type completion
- Memory read completion
40Instruction Fetch
- Fetch instruction from memory
- IR ← Memory[PC]
- Increment the PC
- PC ← PC + 4
41Instruction Decode
- Fetch operands from register file
- A ← Reg[IR[25:21]]
- B ← Reg[IR[20:16]]
- Compute branch target address
- ALUOut ← PC + (sign-extend(IR[15:0]) << 2)
42Execute
- Load/store: Compute memory address
- ALUOut ← A + sign-extend(IR[15:0])
- R-type: Perform operation specified by instruction
- ALUOut ← A op B
- Branch: Compare registers and set PC if equal
- if (A == B) PC ← ALUOut
- Jump: Set PC to jump target address
- PC ← {PC[31:28], (IR[25:0] << 2)}
43Memory Access
- Load: Read memory word into MDR
- MDR ← Memory[ALUOut]
- Store: Write B into memory
- Memory[ALUOut] ← B
- R-type: Write result to destination register
- Reg[IR[15:11]] ← ALUOut
44Memory Read Completion
- Load: Write result to destination register
- Reg[IR[20:16]] ← MDR
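The five execution steps can be traced with a small behavioural model. The sketch below is an assumption-laden illustration, not a hardware description: memory is one dictionary holding both instructions and data (as in the multicycle design), only add and sub are modelled among the R-type operations, the opcode values are the standard MIPS encodings, and `run_instruction` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_instruction(pc, regs, memory):
    """Run one instruction through the multicycle steps; return (new_pc, cycles)."""
    # Step 1, instruction fetch: IR <- Memory[PC]; PC <- PC + 4
    ir = memory[pc]
    pc = (pc + 4) & MASK32
    # Step 2, decode and register fetch: load A and B, compute the branch target
    op = (ir >> 26) & 0x3F
    rs, rt, rd = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F, (ir >> 11) & 0x1F
    a, b = regs[rs], regs[rt]
    alu_out = (pc + (sign_extend16(ir) << 2)) & MASK32
    # Step 3, execute (jump and branch complete here)
    if op == 0x02:                                   # j
        return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2), 3
    if op == 0x04:                                   # beq
        return (alu_out if a == b else pc), 3
    if op == 0x00:                                   # R-type (add and sub only)
        funct = ir & 0x3F
        if funct not in (0x20, 0x22):
            raise ValueError("funct not modelled in this sketch")
        alu_out = (a + b if funct == 0x20 else a - b) & MASK32
        regs[rd] = alu_out                           # Step 4, R-type completion
        return pc, 4
    if op not in (0x23, 0x2B):
        raise ValueError("opcode not modelled in this sketch")
    alu_out = (a + sign_extend16(ir)) & MASK32       # memory address computation
    if op == 0x2B:                                   # sw
        memory[alu_out] = b                          # Step 4, memory write
        return pc, 4
    mdr = memory[alu_out]                            # Step 4, memory read (lw)
    regs[rt] = mdr                                   # Step 5, memory read completion
    return pc, 5
```

The returned cycle count makes the cost difference visible: jumps and branches finish in three steps, stores and R-type instructions in four, and only loads need all five.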
45Multicycle Datapath (4)
46State Machine
- A state machine is a sequential logic device with
- Set of states
- Next-state function which determines the next state from the current state and the inputs
- Output function which determines the outputs from the current state and possibly the inputs
- In a Moore machine the output depends only on the state; in a Mealy machine the output depends on the state and the inputs
47Control with a State Machine
- The control unit for our multicycle datapath will be a state machine
- The only input is the op field of the instruction; the outputs are the control signals
- Each step may have multiple states if control signals depend on the instruction
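A next-state function for such a controller might look like the sketch below. The ten state names and their numbering are an assumption modelled on the usual textbook state diagram, since the state figures are not reproduced in this text; the opcode constants are the standard MIPS encodings, and `next_state` and `cycles_for` are hypothetical helpers.

```python
# Assumed state names; one state per distinct set of control-signal values.
(FETCH, DECODE, MEM_ADDR, MEM_READ, MEM_WB,
 MEM_WRITE, EXECUTE, R_COMPLETE, BRANCH, JUMP) = range(10)

OP_RTYPE, OP_J, OP_BEQ, OP_LW, OP_SW = 0x00, 0x02, 0x04, 0x23, 0x2B


def next_state(state, op):
    """Next-state function: depends on the current state and the op field only."""
    if state == FETCH:
        return DECODE
    if state == DECODE:
        return {OP_LW: MEM_ADDR, OP_SW: MEM_ADDR, OP_RTYPE: EXECUTE,
                OP_BEQ: BRANCH, OP_J: JUMP}[op]
    if state == MEM_ADDR:
        return MEM_READ if op == OP_LW else MEM_WRITE
    if state == MEM_READ:
        return MEM_WB
    if state == EXECUTE:
        return R_COMPLETE
    # MEM_WB, MEM_WRITE, R_COMPLETE, BRANCH, and JUMP all end the instruction.
    return FETCH


def cycles_for(op):
    """Count the states visited before the machine returns to FETCH."""
    state, cycles = FETCH, 0
    while True:
        cycles += 1
        state = next_state(state, op)
        if state == FETCH:
            return cycles
```

Fetch and decode are single states shared by every instruction, while the execute step fans out into one state per instruction class because the control signals differ there.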
48Fetch and Decode States
49Load and Store States
50R-Type States
51Branch State
52Jump State
53Complete State Machine
54Single Cycle Datapath with Control Unit
[Figure: single-cycle datapath with control unit. The PC addresses the Instruction Memory; Instr[31-26] feeds the Control Unit, which produces RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; Instr[25-21], Instr[20-16], and Instr[15-11] address the Register File; Instr[15-0] is sign-extended from 16 to 32 bits; Instr[5-0] feeds the ALU control; the ALU (zero and ovf outputs) supplies the Data Memory address; two adders compute PC + 4 and the branch target (offset shifted left 2), selected by PCSrc.]
55R-type Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
56Load Word Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
57Load Word Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
58Branch Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
59Branch Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
60Executing R Format Operations
- R format operations (add, sub, slt, and, or)
- perform the (op and funct) operation on values in rs and rt
- store the result back into the Register File (into location rd)
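[Figure: R format datapath. The Instruction supplies Read Addr 1, Read Addr 2, and Write Addr to the Register File; Read Data 1 and Read Data 2 feed the ALU (ALU control input; zero and overflow outputs); the ALU result returns to Write Data, gated by RegWrite.]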
- The Register File is not written every cycle
(e.g. sw), so we need an explicit write control
signal for the Register File
61Executing Load and Store Operations
- Load and store operations involve
- computing the memory address by adding the base register (read from the Register File during decode) to the 16-bit sign-extended offset field in the instruction
- store value (read from the Register File during decode) written to the Data Memory
- load value, read from the Data Memory, written to the Register File
62Executing Branch Operations
- Branch operations involve
- comparing the operands read from the Register File during decode for equality (zero ALU output)
- computing the branch target address by adding the updated PC to the 16-bit sign-extended offset field in the instruction, shifted left by 2 bits
[Figure: branch datapath. One adder computes PC + 4; a second adder adds the sign-extended (16 to 32 bits) offset, shifted left 2, to produce the branch target address; the Register File outputs feed the ALU, whose zero output goes to the branch control logic.]
63Executing Jump Operations
- Jump operation involves
- replacing the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
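[Figure: jump datapath. The lower 26 bits of the Instruction are shifted left 2 to give 28 bits, which are combined with the top 4 bits of the PC to form the jump address; an adder computes PC + 4.]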
64Creating a Single Datapath from the Parts
- Assemble the datapath segments and add control lines and multiplexors as needed
- Single cycle design: fetch, decode and execute each instruction in one clock cycle
- no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders)
- multiplexors needed at the input of shared elements with control lines to do the selection
- write signals to control writing to the Register File and Data Memory
- Cycle time is determined by length of the longest path
65Fetch, R, and Memory Access Portions
66Adding the Control
- Selecting the operations to perform (ALU, Register File and Memory read/write)
- Controlling the flow of data (multiplexor inputs)
R-type  op (31-26)  rs (25-21)  rt (20-16)  rd (15-11)  shamt (10-6)  funct (5-0)
I-Type  op (31-26)  rs (25-21)  rt (20-16)  address offset (15-0)
- Observations
- op field always in bits 31-26
- addresses of registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw, rs is the base register
- address of register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
- offset for beq, lw, and sw always in bits 15-0
67Single Cycle Datapath with Control Unit
[Figure: single-cycle datapath with control unit; diagram not reproduced]
68R-type Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
69Load Word Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
70Load Word Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
71Branch Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
72Branch Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit; diagram not reproduced]
73Adding the Jump Operation
[Figure: single-cycle datapath with control unit, extended for jump. Instr[25-0] is shifted left 2 (26 to 28 bits) and concatenated with PC+4[31-28] to form the 32-bit jump address; an additional multiplexor controlled by the Jump signal selects it as the next PC.]
74Single Cycle Disadvantages & Advantages
- Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
- especially problematic for more complex instructions like floating point multiply
- May be wasteful of area since some functional units (e.g., adders) must be duplicated because they cannot be shared during a clock cycle
- but
- Is simple and easy to understand
75Multicycle Datapath Approach
- Let an instruction take more than 1 clock cycle to complete
- Break up instructions into steps where each step takes a cycle while trying to
- balance the amount of work to be done in each step
- restrict each cycle to use only one major functional unit
- Not every instruction takes the same number of clock cycles
- In addition to faster clock rates, multicycle allows functional units to be used more than once per instruction as long as they are used on different clock cycles; as a result
- only need one memory, but only one memory access per cycle
- need only one ALU/adder, but only one ALU operation per cycle
76Multicycle Datapath Approach, cont
- At the end of a cycle
- Store values needed in a later cycle by the current instruction in an internal register (not visible to the programmer). All (except IR) hold data only between a pair of adjacent clock cycles (no write control signal needed)
- IR: Instruction Register; MDR: Memory Data Register
- A, B: regfile read data registers; ALUout: ALU output register
- Data used by subsequent instructions are stored in programmer visible registers (i.e., register file, PC, or memory)
77The Multicycle Datapath with Control Signals
[Figure: multicycle datapath with control signals. A single Memory (instructions and data), the IR, the Register File, the A and B registers, the ALU, and ALUout are steered by the Control outputs PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, and RegDst; the jump address is formed from PC[31-28] and Instr[25-0] shifted left 2.]
78Multicycle Control Unit
- Multicycle datapath control signals are not determined solely by the bits in the instruction
- e.g., op code bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next
- Must use a finite state machine (FSM) for control
- a set of states (current state stored in State Register)
- next state function (determined by current state and the input)
- output function (determined by current state and the input)
79The Five Steps of the Load Instruction
      Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5
lw    IFetch   Dec      Exec     Mem      WB
- IFetch: Instruction Fetch and Update PC
- Dec: Instruction Decode, Register Read, Sign Extend Offset
- Exec: Execute R-type; Calculate Memory Address; Branch Comparison; Branch and Jump Completion
- Mem: Memory Read; Memory Write Completion; R-type Completion (RegFile write)
- WB: Memory Read Completion (RegFile write)
INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
80Multicycle Advantages & Disadvantages
- Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step
- Multicycle implementations allow functional units to be used more than once per instruction as long as they are used on different clock cycles
- but
- Requires additional internal state registers, more muxes, and more complicated (FSM) control
81Single Cycle vs. Multiple Cycle Timing
82Next Lecture and Reminders
- Next lecture
- MIPS pipelined datapath review
- Reading assignment: PH, Chapter 6.1-6.3
- Reminders
- HW2 due September 27th
- Evening midterm exam scheduled
- Tuesday, October 18th, 2015 to 2215, Location 113 IST
- You should have let me know by now if you have a conflict!
83MIPS Subset
- Memory access instructions
- lw, sw
- Arithmetic and logic instructions
- add, sub, and, or, slt
- Branch instructions
- beq, j
89Instruction Fetch
- Components
- Instruction Memory
- Program Counter (PC)
- Adder
- Operation
- Fetch the instruction whose address is in the PC
- Increment the PC by 4
90Components for Instr Fetch
91Instruction Fetch Datapath
135Exceptions
- An exception is an event that causes an unscheduled transfer of control
- Also known as interrupts and traps
- Typically an interrupt is caused externally while an exception or trap is caused internally
- Arithmetic overflow is an example of an exception; an I/O device request is an example of an interrupt
136Handling Exceptions
- When hardware detects an exception it transfers control to a software routine called an exception handler, which is typically a part of an operating system
- The hardware saves the value of the PC in the exception PC (EPC) register so it can return there after the exception is handled
137Determining the Cause
- The hardware must tell the exception handler what the cause of the exception was
- One way to do this is to store a value into a special Cause register (MIPS)
- Another way is to use vectored interrupts, where control is transferred to a different address depending on the cause
138Exceptions to Implement
- Undefined instruction: occurs when the op field of an instruction indicates an undefined or unimplemented instruction
- Arithmetic overflow: occurs when the ALU indicates that overflow has occurred during an R-type instruction
139Adding Exceptions
- The EPC register saves the old PC; it is written when EPCWrite is asserted
- The Cause register records the cause of the exception; it is written when CauseWrite is asserted
- The IntCause signal indicates the cause of the exception
- Control is always transferred to 0x80000180
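The exception mechanism can be sketched behaviourally as below. Several details are assumptions made for the example: the cause encodings (0 for undefined instruction, 1 for arithmetic overflow), the use of a dictionary for processor state, the simplification that `state["PC"]` holds the address of the offending instruction, and the helper names `raise_exception` and `add_overflows`. Only the handler address 0x80000180 comes from the slide.

```python
EXCEPTION_VECTOR = 0x80000180          # handler address given on the slide

CAUSE_UNDEFINED_INSTRUCTION = 0        # assumed encoding of IntCause
CAUSE_ARITHMETIC_OVERFLOW = 1          # assumed encoding of IntCause


def raise_exception(state, int_cause):
    """Model EPCWrite and CauseWrite, then transfer control to the handler."""
    state["EPC"] = state["PC"]         # EPCWrite: save the old PC
    state["Cause"] = int_cause         # CauseWrite: record why we got here
    state["PC"] = EXCEPTION_VECTOR     # control always goes to one address
    return state


def add_overflows(a, b):
    """Signed 32-bit overflow: operands share a sign that the result lacks."""
    result = (a + b) & 0xFFFFFFFF
    sign_a, sign_b, sign_r = (a >> 31) & 1, (b >> 31) & 1, (result >> 31) & 1
    return sign_a == sign_b and sign_r != sign_a


state = {"PC": 0x00000400, "EPC": 0, "Cause": 0}
if add_overflows(0x7FFFFFFF, 1):
    raise_exception(state, CAUSE_ARITHMETIC_OVERFLOW)
```

With a single handler address, the Cause register is what lets software distinguish the two exceptions; a vectored scheme would instead pick a different target address per cause.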
140Changes to the Datapath
141Changes to the Control Unit
142Microprogramming
- An alternative to state machines for control is microprogramming
- Each instruction corresponds to a sequence of microinstructions (a microprogram)
- The opcode bits specify the starting address of the microprogram within the microcode ROM
- A microinstruction contains values for all of the control signals plus some sequencing control bits
- Microprogramming makes it easier to change the control unit or to implement complex instructions
143Microprogramming (cont'd)
144Multicycle Performance
- The multicycle datapath has a much shorter clock cycle time than the single-cycle datapath
- However, it also has a larger CPI
- Is the multicycle datapath really faster?
- Depends on the instruction mix
- Can we still do better?
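The dependence on instruction mix can be made concrete with a short calculation. The per-class cycle counts (lw 5, sw 4, R-type 4, beq 3, j 3) follow from the execution steps in this deck; the instruction mix and both clock periods below are purely hypothetical numbers chosen for illustration, and `average_cpi` and `time_per_instruction` are names introduced here.

```python
# Cycles per instruction class in the multicycle design (from the execution steps).
MULTICYCLE_CYCLES = {"lw": 5, "sw": 4, "rtype": 4, "beq": 3, "j": 3}

# Hypothetical instruction mix (fractions sum to 1.0); not measured data.
ASSUMED_MIX = {"lw": 0.25, "sw": 0.10, "rtype": 0.52, "beq": 0.11, "j": 0.02}

# Hypothetical clock periods in picoseconds; not taken from the slides.
ASSUMED_SINGLE_CYCLE_PS = 600
ASSUMED_MULTICYCLE_PS = 200


def average_cpi(mix, cycles):
    """Weighted average: sum of (fraction of instructions) x (cycles per instruction)."""
    return sum(fraction * cycles[name] for name, fraction in mix.items())


def time_per_instruction(cpi, cycle_time_ps):
    """Average time per instruction = CPI x clock cycle time."""
    return cpi * cycle_time_ps


multicycle_cpi = average_cpi(ASSUMED_MIX, MULTICYCLE_CYCLES)
single_time = time_per_instruction(1.0, ASSUMED_SINGLE_CYCLE_PS)
multi_time = time_per_instruction(multicycle_cpi, ASSUMED_MULTICYCLE_PS)
```

Changing the assumed mix or the ratio of the two clock periods changes which design wins, which is exactly why the answer to "really faster?" depends on the workload.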