The Processor: Datapath and Control


Slides: 55

Provided by: jamiirulu


1
The Processor: Datapath and Control
CHAPTER 5 Part 2
2
The processor datapath and control

Instruction Execution Cycle

We will see how to design the processor

3
The Processor: Datapath and Control

We're ready to look at an implementation of the MIPS processor.
Simplified to contain only:
memory-reference instructions: lw, sw
arithmetic-logical instructions: add, sub, and, or, slt
control flow instructions: beq, j
Generic implementation:
use the program counter (PC) to supply the instruction address
get the instruction from memory
read registers
use the instruction to decide exactly what to do

4
The Processor More Implementation Details

Abstract / Simplified View of a Processor
Two types of functional units:
elements that operate on data values (combinational), e.g. the ALU
elements that contain state (sequential), e.g. registers and memory

5
Overview of chapter 5

State (sequential) Elements and storage mechanism
for registers.
Register File (reading and writing)
Building a single cycle MIPS datapath to
accommodate
Instruction fetch
R-type instructions
lw/sw instructions
beq instruction
Control unit for a single cycle MIPS datapath
Multicycle MIPS datapath
Steps involved in executing an instruction
Overview of design

6
Combinational vs. Sequential Circuits

Combinational circuits
Output fully depends on the inputs.
Applying the same inputs always produces the same
output.
E.g. circuits that just do arithmetic and have no memory.
E.g. an ALU with a = 3, b = 2, Operation = 10 (addition): the output will always be 5.

Sequential (state) circuits
Output depends on both inputs and state (memory).
The same inputs can yield different outputs, depending on the state (memory).
State can also change with inputs!
E.g. a register containing, say, 0010 0010 that is shifted and then read. Input: the shift command (the same each time). Output: the value read (different each time!).
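
The contrast between the two kinds of circuit can be sketched in Python. This is only an illustration: the function and class names are made up for this sketch, the register is assumed to be 8 bits wide, and the shift is assumed to be a left shift by one position (the slide does not say which direction).

```python
def alu(a, b, operation):
    """Combinational: the output is a pure function of the inputs."""
    if operation == 0b10:  # the slide's operation code for addition
        return a + b
    raise ValueError("only addition is modeled in this sketch")


class ShiftRegister:
    """Sequential: the output depends on stored state as well as the input."""

    def __init__(self, value):
        self.value = value & 0xFF  # assumed 8 bits of state

    def shift_and_read(self):
        # The command is identical every time, but the state changes,
        # so the value read back differs from call to call.
        self.value = (self.value << 1) & 0xFF
        return self.value


print(alu(3, 2, 0b10), alu(3, 2, 0b10))  # same inputs, same output
reg = ShiftRegister(0b00100010)
print(format(reg.shift_and_read(), "08b"))
print(format(reg.shift_and_read(), "08b"))
```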

7
State Elements (Sequential Elements)

Unclocked vs. clocked
Clocks are used in synchronous logic
When should an element that contains state be updated? (Possibilities: rising edge, falling edge, while the clock is asserted, while the clock is deasserted.)

8
(Storing a bit) An unclocked state element

The set-reset latch
Output depends on present inputs and also on past inputs (Q' below denotes the complement of Q).
If R = 0, S = 0: the value stored on the output Q is recycled by inverting it to obtain Q' and then inverting Q' to obtain Q, and so on. The latch acts as a storage device.
If R = 0, S = 1: the latch eventually stores 1 into Q and 0 into Q'.
If R = 1, S = 0: the latch eventually stores 0 into Q and 1 into Q'.
If R = 1, S = 1: the latch stores 0 into Q and 0 into Q' (unacceptable!)
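
The four cases can be checked with a small simulation. The sketch below assumes the latch is built from two cross-coupled NOR gates and that both gates update at the same time until the outputs settle; the names `nor` and `sr_latch` are made up for this illustration.

```python
def nor(a, b):
    return int(not (a or b))


def sr_latch(s, r, q, q_bar, max_steps=10):
    """Iterate the two cross-coupled NOR gates until the outputs stop changing.

    q and q_bar are the outputs before the inputs S and R are applied.
    """
    for _ in range(max_steps):
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


print(sr_latch(0, 0, 1, 0))  # hold: the stored value is kept
print(sr_latch(1, 0, 0, 1))  # set
print(sr_latch(0, 1, 1, 0))  # reset
print(sr_latch(1, 1, 1, 0))  # both outputs forced to 0 (unacceptable)
```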

9
Clocked state elements Latches / Flip-flops

In computers, flip-flops and latches are used to store data, state, and signals.
A clocking methodology defines when data/signals can be read and written.
We wouldn't want to read a signal at the same time it was being written.
The output is equal to the stored value inside the element (no need to ask for permission to look at the value).
When the state (value) changes depends on the type of component:
Latches: state changes whenever the inputs change and the clock is asserted.
Flip-flops: state changes only on a clock edge (edge-triggered methodology).

10
D-latch

Two inputs:
the data value to be stored (D)
the clock signal (C) indicating when to read and store D
Two outputs:
the value of the internal state (Q) and its complement Q'
When C = 1, the D-latch stores D as Q (it can accept data for as long as C = 1).
When C = 0, the D-latch keeps its internal state in Q and Q'.

[Figure: D latch with inputs D and C and output Q]
11
D flip-flop: state changes only on the falling clock edge

Two inputs:
the data value to be stored (D)
the clock signal (C) indicating when to read and store D
Two outputs:
the value of the internal state (Q) and its complement Q'
Internal state changes only on the clock edge (falling edge). As soon as C falls from 1 to 0, the D flip-flop stores D as Q.
At other times the D flip-flop keeps its internal state in Q and Q'.
How would you implement a D flip-flop that changes state only at the rising edge?
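
One common way to build an edge-triggered flip-flop is to chain two D latches (a master and a slave) driven by opposite clock phases. The sketch below assumes that construction; the class names and the `rising_edge` option are hypothetical and not taken from the slides. Inverting the clock before it reaches the latches is one possible answer to the rising-edge question.

```python
class DLatch:
    """Level-sensitive: follows D while C is 1, holds its state while C is 0."""

    def __init__(self):
        self.q = 0

    def update(self, c, d):
        if c == 1:
            self.q = d
        return self.q


class DFlipFlop:
    """Master-slave pair; Q changes on the falling edge of C by default."""

    def __init__(self, rising_edge=False):
        self.master = DLatch()
        self.slave = DLatch()
        self.rising_edge = rising_edge

    def update(self, c, d):
        if self.rising_edge:
            c = 1 - c  # invert the clock to trigger on the other edge
        captured = self.master.update(c, d)        # master open while C is 1
        return self.slave.update(1 - c, captured)  # slave open while C is 0


ff = DFlipFlop()
print(ff.update(1, 1))  # clock high: D is captured by the master, Q unchanged
print(ff.update(0, 0))  # falling edge: Q takes the value captured earlier
```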

12
Our Implementation

An edge-triggered methodology: state elements accept data only at a clock edge.
The edge used is either the rising or the falling edge, but not both.
Typical execution:
read contents of some state elements,
send values through some combinational logic,
write results to one or more state elements.

13
Register File

A MIPS processor contains 32 registers. The registers are grouped in a structure called a register file. A register file consists of a set of registers that can be read or written by supplying the number of the register to be accessed.
Registers are built using D flip-flops.
Implementation of a n1-Register file for reading purposes.

14
Reading a Register File

Consider an operation z = x + y with x = 6, y = 7, to be carried out by the MIPS instruction add $s2, $s0, $s1, which performs $s2 = $s0 + $s1.
Supply the addresses of $s0 and $s1 (16 and 17) via Read register1 and Read register2, and read the register contents 6 and 7 via Read data1 and Read data2.

15
Decoder

An n-to-2^n decoder is a logic block that has n input bits and up to 2^n outputs, where only one output is asserted (enabled) for each input combination.
Consider the 3-to-8 decoder below with inputs a2 a1 a0 and outputs s7 s6 s5 s4 s3 s2 s1 s0.
What is the Boolean formula for each output? A single product term!
E.g. the input a2a1a0 = 011 selects output 3: s3 = 1 and all other outputs are 0.

[Figure: 3-to-8 decoder with input 011 asserting output s3]
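
A decoder of this kind can be sketched in Python by computing each output as a single product (AND) of the input bits or their complements. The function name `decoder` and the least-significant-bit-first input order are choices made for this sketch only.

```python
def decoder(inputs):
    """inputs: list of bits [a0, a1, ...], least significant bit first.

    Returns a list of 2**n outputs where outputs[i] corresponds to s_i.
    """
    n = len(inputs)
    outputs = []
    for i in range(2 ** n):
        term = 1
        for k in range(n):
            bit = inputs[k]
            # Use a_k itself if bit k of i is 1, otherwise its complement.
            term &= bit if (i >> k) & 1 else 1 - bit
        outputs.append(term)
    return outputs


# a2 a1 a0 = 0 1 1, passed as [a0, a1, a2]
print(decoder([1, 1, 0]))
```

For three inputs, the product term for output 3 is (NOT a2) AND a1 AND a0, which is 1 only for the input 011.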
16
Writing into a Register File

We will use a decoder to choose which register should receive the data.
Note: we still use the real clock to determine when to write.
Example: show how to write 4 into register number 1.

17
Writing into a Register File

Inputs: register number, register data, write signal.
Process:
The register number is decoded to select the proper register to receive the data.
The proper register is enabled by the write signal.
Register data is supplied.
The proper register receives the register data.

18
Reading and writing a Register File

Consider an operation z = x + y with x = 6, y = 7, to be carried out by the MIPS instruction add $s2, $s0, $s1, which performs $s2 = $s0 + $s1.
Supply the addresses of $s0 and $s1 (16 and 17) via Read register1 and Read register2, and read the register contents 6 and 7 via Read data1 and Read data2.
Write the result 13 (via Write data) into register $s2 by supplying the register address 18 (via Write register) with the write signal set to 1.
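
The read and write ports can be modeled with a short Python class. This is a behavioral sketch only (no clock or decoder is modeled); the class and method names are invented here and simply mirror the port names used on the slide.

```python
class RegisterFile:
    """32 registers with two read ports and one write port."""

    def __init__(self, count=32):
        self.regs = [0] * count

    def read(self, read_register1, read_register2):
        # Returns (Read data1, Read data2)
        return self.regs[read_register1], self.regs[read_register2]

    def write(self, write_register, write_data, write_signal):
        # The register is updated only when the write signal is 1.
        if write_signal == 1:
            self.regs[write_register] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(16, 6, 1)            # $s0 = 6
rf.write(17, 7, 1)            # $s1 = 7
data1, data2 = rf.read(16, 17)
rf.write(18, data1 + data2, 1)  # $s2 = $s0 + $s1
print(rf.regs[18])
```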

19
Building a datapath

The program to be executed is first loaded into
the instruction memory.
Each instruction to be executed is fetched into
the datapath.
The address in the instruction memory of the
current instruction being executed is in the
program counter, PC.
This address in the PC is incremented by 4 using
the adder in preparation for the next
instruction.

20
Instructions are fetched from Memory

In this chapter we consider the datapath and
control and how they relate to memory.
Instruction execution is timed by a CPU clock.
The rate at which the CPU's clock cycles is called the processor speed.
Processor clocks nowadays run at hundreds of millions of cycles per second or more, e.g. 800 MHz = 800 million cycles per second.
Each instruction takes a few CPU cycles, say 2-5 cycles.
Instructions in execution are stored in a part of
memory called Instruction Memory.
The address in the instruction memory of the
currently executed instruction is stored in the
Program Counter, PC.
To get the address of the next instruction, the
datapath adds 4 to PC. A specialized addition
machine called an adder is used for this purpose.

21
Instructions are fetched from Memory

Memory implementation: memory is composed of memory words. Each memory word holds 32 bits of storage. Each storage bit is implemented electronically using flip-flops or latches.

22
Datapath: fetching an instruction and adding 4 to PC

Supply the address in PC to instruction memory.
Read the instruction.
Increment PC by 4 to get the next instruction
byte address.
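
The fetch step can be sketched as follows. The instruction memory is modeled here as a dictionary from byte address to instruction, with assembly text standing in for the 32-bit words; that representation and the name `fetch` are assumptions made for illustration.

```python
def fetch(instruction_memory, pc):
    """Read the instruction at address pc and compute the next address."""
    instruction = instruction_memory[pc]
    next_pc = pc + 4  # each instruction occupies 4 bytes
    return instruction, next_pc


instruction_memory = {
    0: "lw $t2, 0($t3)",
    4: "lw $t3, 4($t3)",
    8: "add $t5, $t2, $t3",
}

pc = 0
while pc in instruction_memory:
    instruction, pc = fetch(instruction_memory, pc)
    print(instruction)
```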

23
Datapath for executing R-type instructions

Supply address of registers to be read via Read
register 1, 2
Read the register contents via Read data 1, 2
Direct ALU to do operation by supplying its ALU
operation
Store the result back into register file via
Write data at address Write register and enabling
RegWrite

24
Executing R-type instructions

Consider add $s3, $s1, $s2
The addresses of $s1 and $s2 are supplied to the register file via Read register1 and Read register2.
Data is read from registers $s1 and $s2 via Read data1 and Read data2.
The data is added in the ALU.
The ALU result is written via Write data into $s3, whose address is supplied via Write register.
All of the above is regulated by sending control signals at the proper time.
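
These steps can be traced in a small Python sketch. The register file is a plain list, the ALU is selected by the 3-bit control codes listed later in the deck (000 AND, 001 OR, 010 add, 110 subtract, 111 set-on-less-than), and the function names are hypothetical.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit value as a two's complement number."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(a, b, control):
    """Return (result, zero) for the given 3-bit ALU control code."""
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = a + b
    elif control == 0b110:
        result = a - b
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unsupported ALU control code")
    result &= MASK
    return result, int(result == 0)


def execute_r_type(regs, rs, rt, rd, control):
    a, b = regs[rs], regs[rt]          # Read data1, Read data2
    result, _zero = alu(a, b, control)
    regs[rd] = result                  # Write data, with RegWrite enabled
    return result


regs = [0] * 32
regs[17], regs[18] = 6, 7              # $s1 = 6, $s2 = 7
execute_r_type(regs, 17, 18, 19, 0b010)  # add $s3, $s1, $s2
print(regs[19])
```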

25
Datapath for lw and sw
lw $s1, 20($s2) is the same as $s1 = Memory[$s2 + 20]
Decoded as: op | $s2 | $s1 | 20

sw $s1, 20($s2) is the same as Memory[$s2 + 20] = $s1
Decoded as: op | $s2 | $s1 | 20

26
Executing lw and sw instructions

Consider lw $s1, 20($s2)
The address of $s2 is supplied to the register file via Read register1.
Data is read from register $s2 via Read data1 (note that this data is itself an address).
The 16-bit offset, 20, is sign-extended to 32 bits.
The sum $s2 + 20 is computed and used as the address to fetch data from the memory; the fetched data is then written into $s1.
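
The address calculation for a load can be sketched in Python. Data memory is modeled as a dictionary from byte address to word, and the helper names are invented for this illustration.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def load_word(regs, data_memory, rt, rs, offset16):
    """lw rt, offset(rs): compute the address, read memory, write rt."""
    address = (regs[rs] + sign_extend16(offset16)) & 0xFFFFFFFF
    regs[rt] = data_memory[address]
    return address


regs = [0] * 32
regs[18] = 1000                    # $s2 holds a base address
data_memory = {1020: 42}
address = load_word(regs, data_memory, 17, 18, 20)  # lw $s1, 20($s2)
print(address, regs[17])
```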

27
Datapath for beq
28
Executing a beq instruction

beq $t1, $t2, offset
If $t1 == $t2, branch to offset; else go to the next instruction. Assume offset = 4; then the program instructions in memory are laid out as follows:
        beq $t1, $t2, offset
        --------------------
        --------------------
        --------------------
        --------------------
offset: --------------------

29
Steps in executing a beq instruction

Step 1: PC = PC + 4
Step 2: Supply the addresses of $t1 and $t2 to the register file via Read register1 and Read register2 to get the contents of $t1 and $t2 via Read data1 and Read data2.
Step 3: Use the ALU to determine whether the values in Read data1 and Read data2 are equal (Zero = 1) or not equal (Zero = 0). Zero is sent to the branch logic to determine when to branch.
Step 4: Sign-extend the 16-bit offset to 32 bits. Shift the offset left by 2 bits (the same as multiplying by 4 to convert words to bytes). Add the offset to PC.
Step 5: If we are not branching, the control logic keeps the PC value computed in Step 1.
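
The next-PC calculation for beq can be sketched as follows. The function names are hypothetical, and `zero` stands for the ALU's Zero output from Step 3.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc_for_beq(pc, offset16, zero):
    pc_plus_4 = pc + 4                                      # Step 1
    byte_offset = sign_extend16(offset16) << 2              # Step 4: words to bytes
    branch_target = (pc_plus_4 + byte_offset) & 0xFFFFFFFF
    return branch_target if zero == 1 else pc_plus_4        # Step 5


print(next_pc_for_beq(100, 4, 1))  # branch taken
print(next_pc_for_beq(100, 4, 0))  # branch not taken
```

With offset = 4 and the beq at address 100, a taken branch skips the four instructions that follow it and lands on the fifth, matching the memory layout described for the example.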

30
Building the Datapath

Use multiplexors to stitch them together

31
Building the Complete Datapath

Share datapath elements among instruction
classes.
Allow multiple connections to an element.
Component A to B and C: split the connection.
Component A from B and C: use a mux and add a control signal.
[Figure: A fanning out to B and C; B and C feeding A through a mux]
32
Stages of Combining Components

Combine the R-type and memory-reference units: add 2 muxes (ALUSrc, MemToReg).
Add the instruction fetch part: connect the instruction output.
Add the branch datapath: add the PCSrc mux and split common sources.
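
A two-input multiplexor is simple to model, and the three muxes named above can be shown as uses of it. Which source is wired to input 0 and which to input 1 is an assumption in this sketch, as are all the variable names and sample values.

```python
def mux(select, input0, input1):
    """Two-way multiplexor: pass input1 when select is 1, otherwise input0."""
    return input1 if select else input0


# Illustrative values only
read_data2, immediate = 7, 20
alu_result, memory_data = 1020, 42
pc_plus_4, branch_target = 104, 120

alu_src, mem_to_reg, pc_src = 1, 1, 0        # e.g. settings for a load

alu_input_b = mux(alu_src, read_data2, immediate)
write_data = mux(mem_to_reg, alu_result, memory_data)
next_pc = mux(pc_src, pc_plus_4, branch_target)
print(alu_input_b, write_data, next_pc)
```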

33
Complete Datapath with all control lines identified: Single-cycle Datapath

Calculate the cycle time assuming negligible delays except for:
memory (2 ns), ALU and adders (2 ns), register file access (1 ns)
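
One way to approach the calculation is to add up the delays along the path each instruction class uses and take the longest. The sketch below assumes the usual paths through the single-cycle datapath (the PC + 4 adder works in parallel with the instruction fetch, so it does not lengthen any path); the dictionary names are made up for this illustration.

```python
DELAY_NS = {"memory": 2, "alu": 2, "register": 1}

# Functional units used in sequence by each instruction class (assumed paths)
PATHS = {
    "R-type": ["memory", "register", "alu", "register"],
    "lw": ["memory", "register", "alu", "memory", "register"],
    "sw": ["memory", "register", "alu", "memory"],
    "beq": ["memory", "register", "alu"],
    "j": ["memory"],
}


def path_delay(units):
    return sum(DELAY_NS[unit] for unit in units)


for name, units in PATHS.items():
    print(name, path_delay(units), "ns")

# In a single-cycle design, the clock must be long enough for the slowest class.
cycle_time_ns = max(path_delay(units) for units in PATHS.values())
print("cycle time:", cycle_time_ns, "ns")
```

Under these assumptions the load instruction is the slowest, so every instruction pays for a cycle as long as a load.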

37
The Instruction classes (R-type, load, store,
branch)

Can you figure out where each field of each instruction goes on the datapath?

38
The effect of each of the seven control signals
39
Control

Selecting the operations to perform (ALU,
read/write, etc.)
Controlling the flow of data (multiplexor inputs)
Information comes from the 32 bits of the
instruction
Example: add $t0, $s1, $s2 [Figure: instruction format]
The ALU's operation is based on the instruction type and function code
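
The control unit works from fields of the 32-bit instruction word. The sketch below extracts them using the standard MIPS field positions (op in bits 31-26, rs 25-21, rt 20-16, rd 15-11, shamt 10-6, funct 5-0, immediate 15-0). The function name is hypothetical, and 0x02324020 is the encoding of add $t0, $s1, $s2 worked out by hand from those positions.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm": word & 0xFFFF,
    }


# add $t0, $s1, $s2: op = 0, rs = 17, rt = 18, rd = 8, shamt = 0, funct = 32
fields = decode_fields(0x02324020)
print(fields)
```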

40
Control

e.g., what should the ALU do with this instruction?
Example: lw $1, 100($2)
op    rs    rt    16-bit offset
35    2     1     100
ALU control input:
000   AND
001   OR
010   add
110   subtract
111   set-on-less-than
Why is the code for subtract 110 and not 011?

41
Control

Must describe hardware to compute the 3-bit ALU control input,
given the instruction type (input into ALU Control):
00: lw, sw (the ALU performs an addition)
01: beq (the ALU performs a subtraction)
11: arithmetic (the ALU operation is determined from the instruction)
Also input into ALU Control: the function code for arithmetic instructions
Describe it using a truth table (can be turned into gates)
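
The ALU control logic can be written as a lookup. This sketch assumes loads and stores need an addition (to form the address) and beq needs a subtraction (to compare for equality); any other instruction-type code is treated as arithmetic and decoded from the function code. The function codes are the standard MIPS values for add, sub, and, or and slt, and the names `alu_control` and `FUNCT_TO_ALU` are invented here.

```python
# Standard MIPS function codes mapped to the 3-bit ALU control input
FUNCT_TO_ALU = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(instruction_type, funct):
    """Compute the 3-bit ALU control input."""
    if instruction_type == 0b00:   # lw, sw: add base and offset
        return 0b010
    if instruction_type == 0b01:   # beq: subtract and test for zero
        return 0b110
    return FUNCT_TO_ALU[funct]     # arithmetic: use the function code


print(format(alu_control(0b00, 0), "03b"))
print(format(alu_control(0b01, 0), "03b"))
print(format(alu_control(0b11, 0b101010), "03b"))
```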

42
Control

43
Control

Simple combinational logic (truth tables)

Main Control unit
ALU Control unit
44
Improving on the Datapath: Multi-cycle Datapath

The single-cycle datapath is inefficient. Why?
The five execution steps are:
Instruction Fetch
Instruction Decode and Register Fetch
Execution, Memory Address Computation, or Branch Completion
Memory Access or R-type Instruction Completion
Write-back
INSTRUCTIONS TAKE FROM 3 TO 5 CYCLES!

45
Step 1 Instruction Fetch

Use PC to get instruction and put it in the
Instruction Register.
Increment the PC by 4 and put the result back in
the PC.
Can be described succinctly using RTL ("Register-Transfer Language"):
IR = Memory[PC]
PC = PC + 4

46
Step 2 Instruction Decode and Register Fetch

Read registers rs and rt in case we need them
Compute the branch address in case the
instruction is a branch
RTL:
A = Reg[IR[25-21]]
B = Reg[IR[20-16]]
ALUOut = PC + (sign-extend(IR[15-0]) << 2)
We aren't setting any control lines based on the instruction type (we are busy "decoding" it in our control logic)

47
Step 3 (instruction dependent)

ALU is performing one of three functions, based
on instruction type
Memory reference: ALUOut = A + sign-extend(IR[15-0])
R-type: ALUOut = A op B
Branch: if (A == B) PC = ALUOut

48
Step 4 (R-type or memory-access)

Loads and stores access memory: MDR = Memory[ALUOut] or Memory[ALUOut] = B
R-type instructions finish: Reg[IR[15-11]] = ALUOut

Step 5: Write-back step
Load finishes: Reg[IR[20-16]] = MDR
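
The register-transfer steps can be traced with a small interpreter. This is a sketch under several assumptions: a single memory (a dictionary from byte address to word) holds both instructions and data, only lw, sw, beq and the R-type add are modeled, and the opcode and function-code values are the standard MIPS ones. The function and key names are made up for this illustration.

```python
def sign_extend16(value):
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_instruction(state):
    """Execute one instruction step by step; return the number of cycles used."""
    reg, memory = state["Reg"], state["Memory"]
    # Step 1: IR = Memory[PC]; PC = PC + 4
    ir = memory[state["PC"]]
    state["PC"] += 4
    # Step 2: A = Reg[IR[25-21]]; B = Reg[IR[20-16]]; ALUOut = branch target
    op = (ir >> 26) & 0x3F
    a = reg[(ir >> 21) & 0x1F]
    b = reg[(ir >> 16) & 0x1F]
    alu_out = state["PC"] + (sign_extend16(ir) << 2)
    # Step 3: instruction dependent
    if op == 4:                                  # beq completes here
        if a == b:
            state["PC"] = alu_out
        return 3
    if op == 0:                                  # R-type (only add is modeled)
        if ir & 0x3F != 32:
            raise ValueError("only add is modeled in this sketch")
        alu_out = (a + b) & 0xFFFFFFFF
        reg[(ir >> 11) & 0x1F] = alu_out         # Step 4: R-type completion
        return 4
    if op not in (35, 43):
        raise ValueError("unsupported opcode in this sketch")
    alu_out = (a + sign_extend16(ir)) & 0xFFFFFFFF
    if op == 43:                                 # sw: Step 4 writes memory
        memory[alu_out] = b
        return 4
    mdr = memory[alu_out]                        # lw: Step 4 reads memory
    reg[(ir >> 16) & 0x1F] = mdr                 # Step 5: write-back
    return 5


# lw $1, 100($2) encoded as op=35, rs=2, rt=1, offset=100
state = {"PC": 0, "Reg": [0] * 32, "Memory": {0: 0x8C410064, 100: 77}}
cycles = run_instruction(state)
print(cycles, state["Reg"][1], state["PC"])
```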

49
Summary
50
Simple Questions

How many cycles will it take to execute this code?
        lw  $t2, 0($t3)
        lw  $t3, 4($t3)
        beq $t2, $t3, Label    # assume not equal
        add $t5, $t2, $t3
        sw  $t5, 8($t3)
Label:  ...
Can you represent these instructions as micro-operations?
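
One way to approach the cycle-count question is to tally the steps each instruction needs, following the step descriptions earlier: a branch completes in step 3, R-type instructions and stores in step 4, and loads in step 5. The sketch below assumes those counts and that the branch is not taken, so all five instructions execute; the names are invented for this illustration.

```python
# Cycles per instruction class on the multi-cycle datapath (assumed from the steps)
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3}

program = ["lw", "lw", "beq", "add", "sw"]  # beq not taken: nothing is skipped

total_cycles = sum(CYCLES[name] for name in program)
print(total_cycles)
```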

51
High level multi-cycle processor
52
MIPS Multi-cycle processor without controls
53
MIPS Multi-cycle processor with controls
54
Chapter five Summary

The datapath and control can be designed based on the instruction set architecture.
The datapath is composed of combinational units (e.g. adder, ALU, mux) and sequential units such as registers and memory.
We have considered mainly the single-cycle datapath design and introduced the multi-cycle datapath.
The control unit issues the right control signals at the right time to enable complete execution of an instruction.
The control design requires a thorough understanding of the design.
We have only seen control design for the single-cycle datapath.
Control design of the multi-cycle datapath requires finite state machine theory.
The datapath has a mechanism for fetching the next instruction, and thus for executing a whole program.