Computer Architecture Chapter 5 - PowerPoint PPT Presentation

About This Presentation
Title:

Computer Architecture Chapter 5

Description:

Multicycle Registers. Instruction register (IR): hold the instruction during its execution ... The control unit for our multicycle datapath will be a state machine ... – PowerPoint PPT presentation

Slides: 141
Provided by: KevinSc2
Learn more at: https://www.cs.kent.edu

Transcript and Presenter's Notes

Title: Computer Architecture Chapter 5


1
Computer Architecture Chapter 5
  • Fall 2005
  • Department of Computer Science
  • Kent State University

2
The Processor: Datapath & Control
  • Our implementation of the MIPS is simplified
  • memory-reference instructions: lw, sw
  • arithmetic-logical instructions: add, sub, and,
    or, slt
  • control flow instructions: beq, j
  • Generic implementation
  • use the program counter (PC) to supply the
    instruction address and fetch the instruction
    from memory (and update the PC)
  • decode the instruction (and read registers)
  • execute the instruction
  • All instructions (except j) use the ALU after
    reading the registers
  • How? memory-reference? arithmetic? control
    flow?

3
Abstract / Simplified View
  • Two types of functional units
  • elements that operate on data values
    (combinational)
  • elements that contain state (sequential)

4
More Implementation Details
5
Fetching Instructions
  • Fetching instructions involves
  • reading the instruction from the Instruction
    Memory
  • updating the PC to hold the address of the next
    instruction

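[Figure: instruction fetch datapath. The PC drives the Read Address input of the Instruction Memory, which outputs the Instruction; an adder computes PC + 4.]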
  • PC is updated every cycle, so it does not need an
    explicit write control signal
  • Instruction Memory is read every cycle, so it
    doesn't need an explicit read control signal

6
Fetch-Decode-Execute
  • In order to execute an instruction we must
  • Fetch the instruction from memory
  • Determine what the instruction is (decode)
  • Execute it
  • Fetch and decode are the same for all
    instructions
  • Execute depends on the type of instruction

7
Instruction Formats
op      rs      rt      rd      shamt   funct
31-26   25-21   20-16   15-11   10-6    5-0

op      rs      rt      immed
31-26   25-21   20-16   15-0

op      addr
31-26   25-0
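
The field boundaries above amount to shift-and-mask operations on the 32-bit instruction word. The following Python is a minimal sketch for illustration and is not part of the original slides; the helper name `decode_fields` and the sample instruction word are assumptions chosen for the example.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its named fields.

    Hypothetical helper: every field is extracted regardless of format,
    and the caller uses only the fields that apply to the instruction.
    """
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31-26
        "rs":    (word >> 21) & 0x1F,   # bits 25-21
        "rt":    (word >> 16) & 0x1F,   # bits 20-16
        "rd":    (word >> 11) & 0x1F,   # bits 15-11
        "shamt": (word >> 6) & 0x1F,    # bits 10-6
        "funct": word & 0x3F,           # bits 5-0
        "immed": word & 0xFFFF,         # bits 15-0
        "addr":  word & 0x3FFFFFF,      # bits 25-0
    }

# Sample word: add with rs = 9, rt = 10, rd = 8 and funct = 100000 (binary).
fields = decode_fields(0x012A4020)
```

Extracting all fields unconditionally mirrors the hardware, where the wires for every field exist and the control signals decide which ones are used.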
8
Decoding Instructions
  • Decoding instructions involves
  • sending the fetched instruction's opcode and
    function field bits to the control unit
  • reading two values from the Register File
  • Register File addresses are contained in the
    instruction

9
Executing Load and Store
  • Load
  • Fetch operand (base address) from register
  • Compute effective address
  • Read data from memory
  • Write result back to register
  • Store
  • Fetch operands from registers
  • Compute effective address
  • Write data to memory

10
Executing Arithmetic/Logic
  • Arithmetic/logic (add, sub, and, or, slt)
  • Fetch operands from registers
  • Perform operation
  • Write result back to register

11
Executing Branch and Jump
  • Conditional branch (beq)
  • Fetch operands from registers
  • Compare operands
  • If equal add displacement to PC
  • Jump (j)
  • Write new value to PC

12
ALU Instructions
  • Components
  • Register File
  • ALU
  • Operation
  • Use instruction fields to select registers
  • Read source registers and send them to ALU
  • Send ALU result to destination register

13
Components for ALU Instrs
14
ALU Datapath
15
Memory Access
  • Components
  • Register File
  • ALU
  • Data Memory
  • Sign-Extension Unit
  • Operation
  • ALU adds base register and sign-extended
    immediate
  • Send ALU result to memory as the address
  • Read the value from memory into the destination
    register (lw) or write the value from the source
    register into memory (sw)
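
The address calculation described above (base register plus sign-extended 16-bit immediate) can be sketched as follows. This is an illustrative Python fragment, not the slides' own material; the function names and the sample base address are assumptions.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def effective_address(base, immed):
    """Add the sign-extended immediate to the base register, wrapping to 32 bits."""
    return (base + sign_extend16(immed)) & 0xFFFFFFFF

# Hypothetical base register value with a positive and a negative offset.
addr_pos = effective_address(0x10010000, 0x0008)   # offset +8
addr_neg = effective_address(0x10010000, 0xFFFC)   # offset -4
```

The same ALU addition serves both lw and sw; only what happens at the resulting address differs.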

16
Components for Mem Access
17
Memory Access Datapath
18
Branches
  • Components
  • Register File
  • ALU
  • Program Counter (PC)
  • Adder
  • Sign-Extension Unit
  • Operation
  • Send source register values to ALU for comparison
  • Adder computes branch target address
  • Control logic decides whether branch is taken or
    not

19
Branch Datapath
20
Putting It All Together
21
Control Unit
  • Control unit takes an instruction as input and
    produces control signals as output
  • Types of control signals
  • Multiplexor selector signals
  • Write enables for state elements
  • Control signals for other blocks (ALU, etc.)
  • In a single-cycle datapath the control unit is
    simple, just look up instruction in a table

22
Control Signals
  • RegDst: selects either rd or rt as the
    destination register
  • RegWrite: when asserted, the value on the write
    data port is written into the register specified
    by the write register input
  • ALUOp: selects the ALU operation
  • ALUSrc: selects the second ALU input to be either
    the second register output or the sign-extended
    immediate value

23
Control Signals (cont'd)
  • PCSrc: selects the new PC as either PC + 4 or the
    output of the branch target adder
  • This signal is derived from the Branch control
    signal and the ALU's Zero output
  • MemRead/MemWrite: cause the data memory to perform
    a read/write operation when asserted
  • MemToReg: selects either the ALU output or the
    data memory output as the data input into the
    register file

24
ALU Control
  • In order to simplify design of the control unit
    we give the ALU its own control logic
  • The ALU control block takes a 2-bit input from
    the control unit (ALUOp) and the funct field from
    the instruction and produces the ALU control
    signals

25
ALU Control Signals
Instruction  ALUOp  funct Field  ALU Function      ALU Inputs
lw           00     xxxxxx       Add               0010
sw           00     xxxxxx       Add               0010
beq          01     xxxxxx       Subtract          0110
add          10     100000       Add               0010
sub          10     100010       Subtract          0110
and          10     100100       AND               0000
or           10     100101       OR                0001
slt          10     101010       Set on less than  0111
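
The table can be read as a small two-level decoder: ALUOp alone decides the operation for lw, sw, and beq, and the funct field is consulted only when ALUOp is 10. A minimal Python sketch of that reading follows; the function and constant names are assumptions made for the example, while the bit patterns are the ones in the table.

```python
# 4-bit ALU control values taken from the table.
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b0000, 0b0001, 0b0010, 0b0110, 0b0111

# funct field to ALU control value, used only when ALUOp is 10.
FUNCT_TO_ALU = {
    0b100000: ALU_ADD,
    0b100010: ALU_SUB,
    0b100100: ALU_AND,
    0b100101: ALU_OR,
    0b101010: ALU_SLT,
}

def alu_control(alu_op, funct):
    """Return the 4-bit ALU control value for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:          # lw / sw: always add
        return ALU_ADD
    if alu_op == 0b01:          # beq: always subtract
        return ALU_SUB
    if alu_op == 0b10:          # R-type: operation chosen by funct
        return FUNCT_TO_ALU[funct]
    raise ValueError("ALUOp value not used in this table")
```

Splitting the decoding this way keeps the main control unit small: it only has to produce two bits for the ALU rather than the full operation code.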
26
Operation of Control Unit
Signal    ALU  lw  sw  beq
ALUOp     10   00  00  01
ALUSrc    0    1   1   0
Branch    0    0   0   1
MemRead   0    1   0   0
MemWrite  0    0   1   0
MemToReg  0    1   x   x
RegDst    1    0   x   x
RegWrite  1    1   0   0
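
Because the single-cycle control unit is a pure table lookup, it can be sketched as a dictionary keyed by instruction class. The Python below is an illustration only. The opcode values (0x00, 0x23, 0x2B, 0x04) are the standard MIPS encodings and do not appear in the slides; the names `CONTROL_TABLE`, `control_signals`, and `pc_src` are assumptions, and `None` stands for a don't-care (x) entry.

```python
X = None  # don't-care entry from the table

CONTROL_TABLE = {
    "R-type": dict(ALUOp=0b10, ALUSrc=0, Branch=0, MemRead=0, MemWrite=0,
                   MemToReg=0, RegDst=1, RegWrite=1),
    "lw":     dict(ALUOp=0b00, ALUSrc=1, Branch=0, MemRead=1, MemWrite=0,
                   MemToReg=1, RegDst=0, RegWrite=1),
    "sw":     dict(ALUOp=0b00, ALUSrc=1, Branch=0, MemRead=0, MemWrite=1,
                   MemToReg=X, RegDst=X, RegWrite=0),
    "beq":    dict(ALUOp=0b01, ALUSrc=0, Branch=1, MemRead=0, MemWrite=0,
                   MemToReg=X, RegDst=X, RegWrite=0),
}

# Standard MIPS opcodes (an assumption relative to the slides).
OPCODE_CLASS = {0x00: "R-type", 0x23: "lw", 0x2B: "sw", 0x04: "beq"}

def control_signals(opcode):
    """Look up the control signal values for a 6-bit opcode."""
    return CONTROL_TABLE[OPCODE_CLASS[opcode]]

def pc_src(branch, zero):
    """PCSrc is derived from the Branch signal and the ALU's Zero output."""
    return branch & zero
```

For example, `control_signals(0x23)` returns the lw column, and `pc_src` selects the branch target only when a beq is executing and its operands compare equal.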
27
Datapath with Control Unit
28
Jump Instructions
  • The unconditional branch instruction (j) computes
    its branch target differently from the
    conditional branch instruction (beq)
  • Branch target address is
  • Top 4 bits of PC + 4
  • 26-bit immediate value
  • Two zero bits
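
The three pieces of the jump target listed above can be assembled with masks and a shift. This is a minimal Python sketch under the assumption of 32-bit addresses; the function name and the sample instruction words are hypothetical.

```python
def jump_target(pc, instr):
    """Build the jump target from PC + 4 and the 26-bit immediate field."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000       # top 4 bits of PC + 4
    field = instr & 0x03FFFFFF           # 26-bit immediate value
    return upper | (field << 2)          # shifting appends the two zero bits

target_low = jump_target(0x00400000, 0x08100003)
# PC + 4 crosses into a new 256 MB region here, so the top bits come from PC + 4.
target_high = jump_target(0x9FFFFFFC, 0x08100003)
```

The second call shows why the slide specifies PC + 4 rather than PC: the upper four bits are taken after the increment.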

29
Datapath with Jump
30
Performance
  • The single-cycle datapath executes each
    instruction in just one cycle
  • CPI is 1.0, which is optimal
  • However, minimum clock cycle time is determined
    by slowest instruction
  • In practice the execution time can vary
    considerably between instructions making a
    single-cycle implementation a poor choice

31
Using Multiple Cycles
  • A multi-cycle datapath splits instruction
    execution into multiple steps, where each step
    takes one cycle
  • If an instruction doesn't need a step it skips
    it, so different instructions run for different
    numbers of cycles
  • Slow instructions don't slow down the entire
    processor
  • Control unit becomes more complicated
  • Hardware can be shared between steps

32
Multicycle Datapath (1)
33
Multicycle Differences
  • A functional unit can be used more than once in
    the execution of an instruction, so long as those
    uses occur in different steps
  • Instruction memory and data memory are combined
    into a single unit
  • ALU takes over for the two separate adders
  • Additional registers are needed to save
    information between steps

34
Multicycle Registers
  • Instruction register (IR): holds the instruction
    during its execution
  • Memory data register (MDR): holds the data read
    from memory for one cycle
  • A: holds a source register value for one cycle
  • B: holds a source register value for one cycle
  • ALUOut: holds the ALU output for one cycle

35
Multicycle Datapath (2)
36
Multicycle Datapath (3)
37
New Control Signals
  • ALUSrcA: selects the first ALU operand to be
    either the PC or the A register
  • ALUSrcB: selects the second ALU operand from the B
    register, the constant 4, the sign-extended
    immediate, or the sign-extended and shifted
    immediate
  • MemtoReg: selects register file write data as
    coming from either ALUOut or MDR
  • IorD: selects the memory address as coming from
    either PC or ALUOut

38
New Control Signals (cont'd)
  • IRWrite: if asserted, the memory output is written
    to IR
  • PCSource: selects the new value for the PC from
    the ALU, ALUOut, or the jump target address
  • PCWrite: if asserted, the PC is written
  • PCWriteCond: if asserted and the zero output from
    the ALU is 1, then the PC is written

39
Instruction Execution Steps
  • Instruction fetch
  • Instruction decode and register fetch
  • Execution, memory address computation, or branch
    completion
  • Memory access or R-type completion
  • Memory read completion

40
Instruction Fetch
  • Fetch instruction from memory
  • IR ← Memory[PC]
  • Increment the PC
  • PC ← PC + 4

41
Instruction Decode
  • Fetch operands from register file
  • A ← Reg[IR[25-21]]
  • B ← Reg[IR[20-16]]
  • Compute branch target address
  • ALUOut ← PC + (sign-extend(IR[15-0]) << 2)
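
The branch target computed during decode can be sketched in Python as below. This is an illustration, not the slides' own code: the function names are assumptions, and the PC passed in is taken to be the value after the fetch step has already added 4.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc_after_fetch, instr):
    """PC plus the sign-extended 16-bit offset shifted left by 2, wrapped to 32 bits."""
    offset = sign_extend16(instr & 0xFFFF)
    return (pc_after_fetch + (offset << 2)) & 0xFFFFFFFF

# Hypothetical beq words with offsets +3 and -1 (in instructions, not bytes).
forward = branch_target(0x00400004, 0x11090003)
backward = branch_target(0x00400004, 0x1109FFFF)
```

Computing the target in the decode step is speculative: the ALU is otherwise idle in that cycle, and the result in ALUOut is simply ignored if the instruction turns out not to be a branch.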

42
Execute
  • Load/store: Compute memory address
  • ALUOut ← A + sign-extend(IR[15-0])
  • R-type: Perform operation specified by
    instruction
  • ALUOut ← A op B
  • Branch: Compare registers and set PC if equal
  • if (A == B) PC ← ALUOut
  • Jump: Set PC to jump target address
  • PC ← PC[31-28] concatenated with (IR[25-0] << 2)

43
Memory Access
  • Load: Read memory word into MDR
  • MDR ← Memory[ALUOut]
  • Store: Write B into memory
  • Memory[ALUOut] ← B
  • R-type: Write result to destination register
  • Reg[IR[15-11]] ← ALUOut

44
Memory Read Completion
  • Load: Write result to destination register
  • Reg[IR[20-16]] ← MDR

45
Multicycle Datapath (4)
46
State Machine
  • A state machine is a sequential logic device
    with
  • Set of states
  • Next-state function which determines the next
    state from the current state and the inputs
  • Output function which determines the outputs from
    the current state and possibly the inputs
  • In a Moore machine the output depends only on the
    state; in a Mealy machine the output depends on
    the state and the inputs

47
Control with a State Machine
  • The control unit for our multicycle datapath will
    be a state machine
  • The only input is the op field of the
    instruction; the outputs are the control signals
  • Each step may have multiple states if control
    signals depend on the instruction
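
A next-state function for such a control unit can be sketched in a few lines of Python. The state names below are hypothetical labels for the execution steps listed earlier (the slides' own state diagrams are figures and are not reproduced in the transcript), and the instruction class is passed as a string instead of an op field for readability.

```python
# Where each instruction class goes after the shared fetch and decode states.
AFTER_DECODE = {
    "lw": "MEM_ADDR", "sw": "MEM_ADDR",
    "R-type": "EXECUTE", "beq": "BRANCH", "j": "JUMP",
}

def next_state(state, op):
    """Next-state function: depends on the current state and the op input."""
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        return AFTER_DECODE[op]
    if state == "MEM_ADDR":
        return "MEM_READ" if op == "lw" else "MEM_WRITE"
    if state == "MEM_READ":
        return "WRITE_BACK"
    if state == "EXECUTE":
        return "R_COMPLETE"
    # MEM_WRITE, WRITE_BACK, R_COMPLETE, BRANCH and JUMP finish the instruction.
    return "FETCH"

def states_for(op):
    """List the states visited while executing one instruction of class op."""
    state, trace = "FETCH", []
    while True:
        trace.append(state)
        state = next_state(state, op)
        if state == "FETCH":
            return trace
```

In this sketch `states_for("lw")` visits five states while `states_for("beq")` visits three, which is how the multicycle design lets different instructions take different numbers of cycles.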

48
Fetch and Decode States
49
Load and Store States
50
R-Type States
51
Branch State
52
Jump State
53
Complete State Machine
54
Single Cycle Datapath with Control Unit
[Figure: single-cycle datapath with control unit. Elements: PC, Instruction Memory, Register File, Sign Extend (16 to 32 bits), ALU, ALU control, Data Memory, two adders, Shift left 2, and the Control Unit (input Instr31-26) producing RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite; PCSrc selects the next PC.]
55
R-type Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit, shown for an R-type instruction]
56
Load Word Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit, shown for a load word instruction]
58
Branch Instruction Data/Control Flow
[Figure: single-cycle datapath with control unit, shown for a branch instruction]
60
Executing R Format Operations
  • R format operations (add, sub, slt, and, or)
  • perform the (op and funct) operation on values in
    rs and rt
  • store the result back into the Register File
    (into location rd)

[Figure: R-format execution datapath with the Register File (Read Addr 1, Read Addr 2, Write Addr, Write Data, Read Data 1, Read Data 2; RegWrite control) and the ALU (ALU control input; overflow and zero outputs)]
  • The Register File is not written every cycle
    (e.g. sw), so we need an explicit write control
    signal for the Register File

61
Executing Load and Store Operations
  • Load and store operations involve
  • computing the memory address by adding the base
    register (read from the Register File during
    decode) to the 16-bit sign-extended offset
    field in the instruction
  • for a store, writing the value (read from the
    Register File during decode) to the Data Memory
  • for a load, writing the value read from the Data
    Memory to the Register File

62
Executing Branch Operations
  • Branch operations involve
  • comparing the operands read from the Register File
    during decode for equality (zero ALU output)
  • computing the branch target address by adding the
    updated PC to the 16-bit sign-extended offset
    field in the instruction, shifted left by 2

[Figure: branch datapath. One adder computes PC + 4; a second adder adds the sign-extended (16 to 32 bits) offset, shifted left by 2, to produce the branch target address; the ALU compares the two Register File outputs and sends zero to the branch control logic.]
63
Executing Jump Operations
  • Jump operation involves
  • replace the lower 28 bits of the PC with the
    lower 26 bits of the fetched instruction shifted
    left by 2 bits

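[Figure: jump datapath. The low 26 bits of the Instruction are shifted left by 2 to form 28 bits, which are combined with the upper 4 bits from the PC + 4 adder to form the jump address.]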
64
Creating a Single Datapath from the Parts
  • Assemble the datapath segments and add control
    lines and multiplexors as needed
  • Single cycle design: fetch, decode, and execute
    each instruction in one clock cycle
  • no datapath resource can be used more than once
    per instruction, so some must be duplicated
    (e.g., separate Instruction Memory and Data
    Memory, several adders)
  • multiplexors needed at the input of shared
    elements with control lines to do the selection
  • write signals to control writing to the Register
    File and Data Memory
  • Cycle time is determined by length of the longest
    path

65
Fetch, R, and Memory Access Portions
66
Adding the Control
  • Selecting the operations to perform (ALU,
    Register File and Memory read/write)
  • Controlling the flow of data (multiplexor inputs)

R-type:  op (31-26) | rs (25-21) | rt (20-16) | rd (15-11) | shamt (10-6) | funct (5-0)
I-type:  op (31-26) | rs (25-21) | rt (20-16) | address offset (15-0)
  • Observations
  • op field is always in bits 31-26
  • addresses of the registers to be read are always
    specified by the rs field (bits 25-21) and the
    rt field (bits 20-16); for lw and sw, rs is the
    base register
  • address of the register to be written is in one
    of two places: in rt (bits 20-16) for lw, or in
    rd (bits 15-11) for R-type instructions
  • offset for beq, lw, and sw is always in bits 15-0

73
Adding the Jump Operation
[Figure: single-cycle datapath with control unit, extended for jump. Instr25-0 (26 bits) is shifted left by 2 to 28 bits and combined with bits 31-28 of PC + 4 to form a 32-bit jump address; an additional multiplexor controlled by the new Jump signal selects it as the next PC.]
74
Single Cycle Disadvantages & Advantages
  • Uses the clock cycle inefficiently: the clock
    cycle must be timed to accommodate the slowest
    instruction
  • especially problematic for more complex
    instructions like floating point multiply
  • May be wasteful of area, since some functional
    units (e.g., adders) must be duplicated because
    they cannot be shared during a clock cycle
  • but
  • Is simple and easy to understand

75
Multicycle Datapath Approach
  • Let an instruction take more than 1 clock cycle
    to complete
  • Break up instructions into steps where each step
    takes a cycle while trying to
  • balance the amount of work to be done in each
    step
  • restrict each cycle to use only one major
    functional unit
  • Not every instruction takes the same number of
    clock cycles
  • In addition to faster clock rates, multicycle
    allows functional units to be used more than
    once per instruction, as long as they are used
    on different clock cycles; as a result
  • only need one memory, but only one memory access
    per cycle
  • need only one ALU/adder, but only one ALU
    operation per cycle

76
Multicycle Datapath Approach, cont
  • At the end of a cycle
  • Store values needed in a later cycle by the
    current instruction in an internal register (not
    visible to the programmer). All (except IR) hold
    data only between a pair of adjacent clock cycles
    (no write control signal needed)
  • IR: Instruction Register; MDR: Memory Data
    Register
  • A, B: regfile read data registers; ALUout: ALU
    output register
  • Data used by subsequent instructions are stored
    in programmer visible registers (i.e., register
    file, PC, or memory)

77
The Multicycle Datapath with Control Signals
[Figure: multicycle datapath with control signals. Elements: PC, a single Memory (instruction or data), IR, Register File, A, B, ALU, ALUout, Sign Extend, two Shift left 2 units, and ALU control. Control outputs: PCWriteCond, PCWrite, PCSource, ALUOp, IorD, MemRead, MemWrite, ALUSrcA, ALUSrcB, MemtoReg, RegWrite, IRWrite, RegDst.]
78
Multicycle Control Unit
  • Multicycle datapath control signals are not
    determined solely by the bits in the instruction
  • e.g., op code bits tell what operation the ALU
    should be doing, but not what instruction cycle
    is to be done next
  • Must use a finite state machine (FSM) for control
  • a set of states (current state stored in State
    Register)
  • next state function (determined by current state
    and the input)
  • output function (determined by current state
    and the input)

79
The Five Steps of the Load Instruction
[Figure: timing of the lw instruction across Cycle 1 through Cycle 5]
  • IFetch: Instruction Fetch and Update PC
  • Dec: Instruction Decode, Register Read, Sign
    Extend Offset
  • Exec: Execute R-type; Calculate Memory Address;
    Branch Comparison; Branch and Jump Completion
  • Mem: Memory Read; Memory Write Completion; R-type
    Completion (RegFile write)
  • WB: Memory Read Completion (RegFile write)

INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
80
Multicycle Advantages & Disadvantages
  • Uses the clock cycle efficiently: the clock
    cycle is timed to accommodate the slowest
    instruction step
  • Multicycle implementations allow functional units
    to be used more than once per instruction as long
    as they are used on different clock cycles
  • but
  • Requires additional internal state registers,
    more muxes, and more complicated (FSM) control

81
Single Cycle vs. Multiple Cycle Timing
82
Next Lecture and Reminders
  • Next lecture
  • MIPS pipelined datapath review
  • Reading assignment PH, Chapter 6.1-6.3
  • Reminders
  • HW2 due September 27th
  • Evening midterm exam scheduled
  • Tuesday, October 18th, 20:15 to 22:15, Location
    113 IST
  • You should have let me know by now if you have a
    conflict !!

83
MIPS Subset
  • Memory access instructions
  • lw, sw
  • Arithmetic and logic instructions
  • add, sub, and, or, slt
  • Branch instructions
  • beq, j

89
Instruction Fetch
  • Components
  • Instruction Memory
  • Program Counter (PC)
  • Adder
  • Operation
  • Fetch the instruction whose address is in the PC
  • Increment the PC by 4

90
Components for Instr Fetch
91
Instruction Fetch Datapath
135
Exceptions
  • An exception is an event that causes an
    unscheduled transfer of control
  • Also known as interrupts and traps
  • Typically an interrupt is caused externally while
    an exception or trap is caused internally
  • Arithmetic overflow is an example of an
    exception; an I/O device request is an example of
    an interrupt

136
Handling Exceptions
  • When hardware detects an exception it transfers
    control to a software routine called an exception
    handler which is typically a part of an operating
    system
  • The hardware saves the value of the PC in the
    exception PC (EPC) register so it can return
    there after the exception is handled

137
Determining the Cause
  • The hardware must tell the exception handler what
    the cause of the exception was
  • One way to do this is store a value into a
    special Cause register (MIPS)
  • Another way is to use vectored interrupts where
    control is transferred to a different address
    depending on the cause

138
Exceptions to Implement
  • Undefined instruction occurs when the op field of
    an instruction indicates an undefined or
    unimplemented instruction
  • Arithmetic overflow occurs when the ALU indicates
    that overflow has occurred during an R-type
    instruction

139
Adding Exceptions
  • The EPC register saves the old PC; it is written
    when EPCWrite is asserted
  • The Cause register records the cause of the
    exception; it is written when CauseWrite is
    asserted
  • The IntCause signal indicates the cause of the
    exception
  • Control is always transferred to 0x80000180
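
The register updates described above can be sketched as a single state transition. The Python below is illustrative only: the dictionary representation of the processor state and the function name are assumptions, and the cause codes 0 and 1 are assumed values, since the slides do not give the encoding. The handler address 0x80000180 is the one stated on the slide. Which PC value counts as "the old PC" depends on the datapath (the PC may already have been incremented by fetch), so the sketch simply saves whatever the caller supplies.

```python
EXCEPTION_VECTOR = 0x80000180        # handler address stated on the slide

# Assumed cause encoding; the slides do not specify the values.
CAUSE_UNDEFINED_INSTRUCTION = 0
CAUSE_ARITHMETIC_OVERFLOW = 1

def raise_exception(state, int_cause):
    """Return a new processor state after the hardware takes an exception."""
    new_state = dict(state)
    new_state["EPC"] = state["PC"]       # EPCWrite: save the old PC
    new_state["Cause"] = int_cause       # CauseWrite: record IntCause
    new_state["PC"] = EXCEPTION_VECTOR   # transfer control to the handler
    return new_state

before = {"PC": 0x00400010, "EPC": 0, "Cause": 0}
after = raise_exception(before, CAUSE_ARITHMETIC_OVERFLOW)
```

After the call, the handler can read `Cause` to decide what happened and use `EPC` to return to the interrupted program.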

140
Changes to the Datapath
141
Changes to the Control Unit
142
Microprogramming
  • An alternative to state machines for control is
    microprogramming
  • Each instruction corresponds to a sequence of
    microinstructions (a microprogram)
  • The opcode bits specify the starting address of
    the microprogram within the microcode ROM.
  • A microinstruction contains values for all of the
    control signals plus some sequencing control bits
  • Microprogramming makes it easier to change the
    control unit or to implement complex instructions
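
The idea of opcode-selected microprograms can be illustrated with a tiny microcode ROM. Everything in the Python sketch below is hypothetical: the ROM layout, the sequencing keywords ("next", "dispatch", "fetch"), and the dispatch addresses are made up for the example. Only one-bit control signals that are asserted are listed in each microinstruction; multi-bit fields such as ALUSrcB and ALUOp are omitted, so this is far from a complete control store.

```python
# Each entry: (set of asserted one-bit signals, sequencing control).
MICROCODE_ROM = [
    ({"MemRead", "IRWrite", "PCWrite"}, "next"),      # 0: instruction fetch
    (set(),                             "dispatch"),  # 1: decode, then jump on opcode
    ({"ALUSrcA"},                       "next"),      # 2: lw address computation
    ({"MemRead", "IorD"},               "next"),      # 3: lw memory read
    ({"RegWrite", "MemtoReg"},          "fetch"),     # 4: lw register write
    ({"ALUSrcA", "PCWriteCond"},        "fetch"),     # 5: beq compare and branch
]

# Starting address of each instruction's microprogram (hypothetical).
DISPATCH = {"lw": 2, "beq": 5}

def run_microprogram(op):
    """Return the ROM addresses visited while executing one instruction."""
    trace, addr = [], 0
    while True:
        _signals, seq = MICROCODE_ROM[addr]
        trace.append(addr)
        if seq == "fetch":               # microprogram done; next instruction
            return trace
        addr = DISPATCH[op] if seq == "dispatch" else addr + 1
```

Adding an instruction in this style means appending rows to the ROM and one entry to the dispatch table, which is the flexibility the last bullet refers to.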

143
Microprogramming (cont'd)
144
Multicycle Performance
  • The multicycle datapath has a much shorter clock
    cycle time than the single-cycle datapath
  • However, it also has a larger CPI
  • Is the multicycle datapath really faster?
  • Depends on the instruction mix
  • Can we still do better?
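
Whether the multicycle design wins can be checked with a short calculation. In the Python sketch below, the per-class cycle counts follow from the execution steps in these slides (loads use all five steps, branches and jumps finish in three). The instruction mix and the two clock periods are made-up numbers for illustration, not measurements.

```python
# Cycles per instruction class in the multicycle design.
CYCLES = {"lw": 5, "sw": 4, "R-type": 4, "beq": 3, "j": 3}

def average_cpi(mix):
    """Weighted average CPI for a mix given as {class: weight}."""
    total = sum(mix.values())
    return sum(CYCLES[name] * weight for name, weight in mix.items()) / total

def multicycle_speedup(mix, single_cycle_time, multicycle_time):
    """Single-cycle time per instruction divided by multicycle time per instruction."""
    return single_cycle_time / (average_cpi(mix) * multicycle_time)

# Hypothetical mix (percent) and clock periods (picoseconds).
mix = {"lw": 25, "sw": 10, "R-type": 45, "beq": 15, "j": 5}
cpi = average_cpi(mix)
speedup = multicycle_speedup(mix, single_cycle_time=800, multicycle_time=200)
```

With these assumed numbers the CPI is 4.05 and the speedup is slightly below 1, so the multicycle design would be marginally slower; a mix dominated by three-cycle branches would tip the result the other way, which is the sense in which the answer depends on the instruction mix.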