The Processor: Datapath and Control - PowerPoint PPT Presentation

Slides: 55
Provided by: jamiirulu
Transcript and Presenter's Notes


1
The Processor: Datapath and Control
CHAPTER 5 Part 2
2
The processor datapath and control
  • Instruction Execution Cycle
  • We will see how to design the processor

3
The Processor: Datapath and Control
  • We're ready to look at an implementation of the
    MIPS processor
  • Simplified to contain only:
  • memory-reference instructions: lw, sw
  • arithmetic-logical instructions: add, sub, and,
    or, slt
  • control flow instructions: beq, j
  • Generic implementation:
  • use the program counter (PC) to supply the
    instruction address
  • get the instruction from memory
  • read registers
  • use the instruction to decide exactly what to do

4
The Processor: More Implementation Details
  • Abstract / simplified view of a processor
  • Two types of functional units:
  • elements that operate on data values
    (combinational), e.g. the ALU
  • elements that contain state (sequential), e.g.
    registers and memory

5
Overview of chapter 5
  • State (sequential) Elements and storage mechanism
    for registers.
  • Register File (reading and writing)
  • Building a single cycle MIPS datapath to
    accommodate
  • Instruction fetch
  • R-type instructions
  • lw/sw instructions
  • beq instruction
  • Control unit for a single cycle MIPS datapath
  • Multicycle MIPS datapath
  • Steps involved in executing an instruction
  • Overview of design

6
Combinational vs. Sequential Circuits
  • Combinational circuits
  • Output fully depends on the inputs.
  • Applying the same inputs always produces the same
    output.
  • E.g. circuits that just do arithmetic and have
    no memory.
  • E.g. an ALU with a = 3, b = 2, Operation = 10
    (addition): the output will always be 5.
  • Sequential (state) circuits
  • Output depends on both inputs and state (memory).
  • The same inputs can yield different outputs,
    depending on the state (memory).
  • State can also change with inputs!
  • E.g. a register containing, say, 0010 0010 that
    is shifted and read. Input: shift command (the
    same each time). Output: value read (different
    each time!)
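
The contrast above can be sketched in a few lines of Python. This is only an illustration: the names alu_add and ShiftRegister are hypothetical, and the shift direction (left) and the 8-bit width are assumptions, since the slide does not specify them.

```python
def alu_add(a: int, b: int) -> int:
    """Combinational: the output depends only on the current inputs."""
    return (a + b) & 0xFFFFFFFF


class ShiftRegister:
    """Sequential: the output depends on stored state as well as the input."""

    def __init__(self, value: int, width: int = 8) -> None:
        self.width = width
        self.value = value & ((1 << width) - 1)

    def shift_left(self) -> int:
        # The command is the same each time, but the stored state changes.
        self.value = (self.value << 1) & ((1 << self.width) - 1)
        return self.value


reg = ShiftRegister(0b00100010)
first = reg.shift_left()   # 0b01000100
second = reg.shift_left()  # 0b10001000
```

Calling alu_add(3, 2) returns 5 every time, while two identical shift commands return two different values.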

7
State Elements (Sequential Elements)
  • Unclocked vs. clocked
  • Clocks are used in synchronous logic
  • When should an element that contains state be
    updated? (Possibilities: rising edge, falling
    edge, while asserted, while deasserted.)

8
(Storing a bit) An unclocked state element
  • The set-reset latch
  • output depends on present inputs and also on past
    inputs
  • If R = 0, S = 0: the value stored on the output Q
    is recycled by inverting it to obtain Q' (the
    complement of Q) and then inverting Q' to obtain
    Q, and so on. The latch acts as a storage device.
  • If R = 0, S = 1: the latch eventually stores
    S = 1 into Q and 0 into Q'.
  • If R = 1, S = 0: the latch eventually stores
    S = 0 into Q and 1 into Q'.
  • If R = 1, S = 1: the latch stores 0 into Q and 0
    into Q' (unacceptable!)
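
The four cases can be checked with a small simulation. This sketch assumes the latch is built from two cross-coupled NOR gates (the usual textbook construction, which matches the R = 1, S = 1 case giving 0 on both outputs); the function names are hypothetical.

```python
from typing import Tuple


def nor(a: int, b: int) -> int:
    """Two-input NOR gate."""
    return 0 if (a or b) else 1


def sr_latch(r: int, s: int, q: int, q_bar: int) -> Tuple[int, int]:
    """Re-evaluate the two cross-coupled NOR gates until the outputs settle."""
    for _ in range(4):  # a few passes are enough to reach a stable state
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


held = sr_latch(0, 0, 1, 0)     # R = 0, S = 0: keeps the stored value
set_ = sr_latch(0, 1, 0, 1)     # R = 0, S = 1: stores 1 into Q
reset = sr_latch(1, 0, 1, 0)    # R = 1, S = 0: stores 0 into Q
illegal = sr_latch(1, 1, 1, 0)  # R = 1, S = 1: both outputs forced to 0
```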

9
Clocked state elements: Latches / Flip-flops
  • In computer applications, flip-flops and latches
    are used to store data/state/signals.
  • A clocking methodology defines when data/signals
    can be read and written
  • We wouldn't want to read a signal at the same
    time it was being written
  • Output is equal to the stored value inside the
    element (no need to ask for permission to look
    at the value)
  • When the state (value) changes depends on the
    type of component
  • Latches: state changes whenever the inputs
    change and the clock is asserted
  • Flip-flops: state changes only on a clock
    edge (edge-triggered methodology)

10
D-latch
  • Two inputs
  • the data value to be stored (D)
  • the clock signal (C) indicating when to read and
    store D
  • Two outputs
  • the value of the internal state (Q) and its
    complement Q'
  • When C = 1, the D latch stores D as Q (it can
    accept data for as long as C = 1)
  • When C = 0, the D latch keeps its internal state
    in Q and Q'

[Figure: D latch with inputs D and C and output Q]
11
D flip-flop: State changes only on the falling clock
edge
  • Two inputs
  • the data value to be stored (D)
  • the clock signal (C) indicating when to read and
    store D
  • Two outputs
  • the value of the internal state (Q) and its
    complement Q'
  • Internal state changes only on the clock edge
    (falling edge). As soon as C falls from 1 to 0,
    the D flip-flop stores D as Q.
  • At other times the D flip-flop keeps its
    internal state in Q and Q'
  • How would you implement a D flip flop that
    changes state only at the rising edge?
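
The difference between a level-sensitive latch and an edge-triggered flip-flop can be sketched as follows. The master-slave construction used here is an assumption (it is one common way to build a falling-edge D flip-flop from two D latches), and the class names are hypothetical.

```python
class DLatch:
    """Level-sensitive: Q follows D while C = 1 and holds while C = 0."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, c: int, d: int) -> int:
        if c == 1:
            self.q = d
        return self.q


class FallingEdgeDFlipFlop:
    """Master-slave pair: Q changes only when C falls from 1 to 0."""

    def __init__(self) -> None:
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, c: int, d: int) -> int:
        self.master.update(c, d)  # master is open while C = 1
        # slave sees the inverted clock, so it is open while C = 0
        return self.slave.update(1 - c, self.master.q)


ff = FallingEdgeDFlipFlop()
q_high = ff.update(1, 1)  # C = 1: master captures D, output unchanged
q_fall = ff.update(0, 0)  # C falls: output takes the captured value
```

One possible answer to the question on the slide is to feed the inverted clock into the same structure, so that the output changes on the rising edge instead.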

12
Our Implementation
  • An edge-triggered methodology (state elements
    accept data only at a clock edge).
  • The triggering edge is either the rising or the
    falling edge, but not both.
  • Typical execution:
  • read contents of some state elements,
  • send values through some combinational logic,
  • write results to one or more state elements

13
Register File
  • A MIPS processor contains 32 registers. The
    registers are grouped in a place called a
    register file. A register file consists of a set
    of registers that can be read or written by
    supplying the number of the register to be
    accessed.
  • Registers are built using D flip-flops.
  • Implementation of a n1-Register file for reading
    purposes.

14
Reading a Register File
  • Consider an operation z = x + y with x = 6,
    y = 7, carried out by the MIPS instruction
    add $s2, $s0, $s1, i.e. $s2 = $s0 + $s1.
  • Supply the addresses of $s0 and $s1 (16 and 17)
    via Read register1 and Read register2, and read
    the register contents 6 and 7 via Read data1 and
    Read data2.

15
Decoder
  • An (n-1) decoder is a logic block that has n
    input bits and up to 2^n outputs, where only
    one output is asserted (enabled) for each input
    combination (commonly called an n-to-2^n
    decoder).
  • Consider the 3-1 decoder below with inputs
    a2 a1 a0 and outputs s7 s6 s5 s4 s3 s2 s1 s0
  • What is the Boolean formula for each output? A
    single product term!
  • E.g. assume input a2 a1 a0 = 011. Then input 011
    selects output 3: s7 through s0 = 0000 1000.

[Figure: 3-1 decoder with input 011 and output s3 asserted]
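
A decoder of this kind can be sketched in Python as below. The function name decode and the output ordering (index 0 is s0) are choices made for this illustration.

```python
from typing import List


def decode(value: int, n: int) -> List[int]:
    """Return 2**n outputs [s0, s1, ...] with exactly one output asserted."""
    if not 0 <= value < 2 ** n:
        raise ValueError("input does not fit in n bits")
    return [1 if i == value else 0 for i in range(2 ** n)]


outputs = decode(0b011, 3)  # input 011 asserts s3 only
```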
16
Writing into a Register File
  • We will use a decoder to choose which register
    should receive the data.
  • Note: we still use the real clock to determine
    when to write.
  • Example: show how to write 4 into register
    number 1.

17
Writing into a Register File
  • Inputs: register number, register data, write
    signal.
  • Process: the register number is decoded to
    select (enable) the proper register to receive
    the data.
  • The proper register is enabled by the write
    signal.
  • The register data is supplied.
  • The proper register receives the register data.

18
Reading and writing a Register File
  • Consider an operation z = x + y with x = 6,
    y = 7, carried out by the MIPS instruction
    add $s2, $s0, $s1, i.e. $s2 = $s0 + $s1.
  • Supply the addresses of $s0 and $s1 (16 and 17)
    via Read register1 and Read register2, and read
    the register contents 6 and 7 via Read data1 and
    Read data2.
  • Write the result 13 (via Write data) into
    register $s2 by supplying the register address 18
    (via Write register) with the write signal set
    to 1.
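
The read and write steps above can be sketched with a small register file model. The class and method names are hypothetical, and each call to write stands in for one active clock edge.

```python
class RegisterFile:
    """Minimal model: two read ports, one write port, 32 registers."""

    def __init__(self, count: int = 32) -> None:
        self.regs = [0] * count

    def read(self, read_register1: int, read_register2: int):
        # Reading needs no enable signal: the outputs follow the register numbers.
        return self.regs[read_register1], self.regs[read_register2]

    def write(self, write_register: int, write_data: int, reg_write: int) -> None:
        # The register is updated only when the write signal is asserted.
        if reg_write:
            self.regs[write_register] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(16, 6, 1)           # $s0 = 6
rf.write(17, 7, 1)           # $s1 = 7
a, b = rf.read(16, 17)       # Read data1 = 6, Read data2 = 7
rf.write(18, a + b, 1)       # $s2 = 13
rf.write(18, 99, 0)          # write signal 0: $s2 is left unchanged
```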

19
Building a datapath
  • The program to be executed is first loaded into
    the instruction memory.
  • Each instruction to be executed is fetched into
    the datapath.
  • The address in the instruction memory of the
    current instruction being executed is in the
    program counter, PC.
  • This address in the PC is incremented by 4 using
    the adder in preparation for the next
    instruction.

20
Instructions are fetched from Memory
  • In this chapter we consider the datapath and
    control and how they relate to memory.
  • Instruction execution is timed by a CPU clock.
    The rate at which the CPU's clock cycles run is
    called the processor speed.
  • Processor speeds nowadays run from hundreds of
    millions to billions of cycles per second, e.g.
    800 MHz = 800 million cycles per second.
  • Each instruction takes a few CPU cycles, say 2-5
    cycles.
  • Instructions in execution are stored in a part of
    memory called Instruction Memory.
  • The address in the instruction memory of the
    currently executed instruction is stored in the
    Program Counter, PC.
  • To get the address of the next instruction, the
    datapath adds 4 to PC. A specialized addition
    machine called an adder is used for this purpose.

21
Instructions are fetched from Memory
  • Memory implementation: memory is composed of
    memory words. Each memory word holds 32 bits of
    storage. Each storage bit is implemented
    electronically using flip-flops or latches.

22
Datapath: Fetching an instruction and adding 4 to
PC
  • Supply the address in PC to instruction memory.
  • Read the instruction.
  • Increment PC by 4 to get the next instruction
    byte address.
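
The fetch step can be sketched as follows. Instruction memory is modelled as a dictionary from byte address to 32-bit word, which is an assumption of this illustration; the two words are the standard MIPS encodings of add $s2, $s0, $s1 and lw $s1, 20($s2).

```python
def fetch(instruction_memory: dict, pc: int):
    """Read the instruction at PC and compute the next PC."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & 0xFFFFFFFF  # each instruction is 4 bytes
    return instruction, next_pc


program = {
    0: 0x02119020,  # add $s2, $s0, $s1
    4: 0x8E510014,  # lw  $s1, 20($s2)
}
instruction, next_pc = fetch(program, 0)
```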

23
Datapath for executing R-type instructions
  • Supply address of registers to be read via Read
    register 1, 2
  • Read the register contents via Read data 1, 2
  • Direct ALU to do operation by supplying its ALU
    operation
  • Store the result back into register file via
    Write data at address Write register and enabling
    RegWrite

24
Executing R-type instructions
  • Consider add $s3, $s1, $s2
  • The addresses of $s1 and $s2 are supplied to the
    register file via Read register1, Read register2.
  • Data is read from registers $s1 and $s2 via Read
    data1, Read data2.
  • Data is added in the ALU.
  • The ALU result is written via Write data to $s3,
    whose address is supplied via Write register.
  • All the above is regulated by sending control
    signals at the proper time.

25
Datapath for lw and sw
lw $s1, 20($s2) is the same as $s1 = Memory[$s2 + 20]
Decoded as: op | $s2 | $s1 | 20

sw $s1, 20($s2) is the same as Memory[$s2 + 20] = $s1
Decoded as: op | $s2 | $s1 | 20

26
Executing lw and sw instructions
  • Consider lw $s1, 20($s2)
  • The address of $s2 is supplied to the register
    file via Read register1.
  • Data is read from register $s2 via Read data1
    (note that this data is itself an address).
  • The 16-bit offset, 20, is sign-extended to 32
    bits.
  • The sum $s2 + 20 is computed and used as the
    address to fetch data from memory.
  • The fetched data is written into $s1 via Write
    data.
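
The load steps can be sketched in Python. The helper names, the list used for registers, and the dictionary used for data memory are assumptions made for this illustration; the base address 1000 and the stored value 42 are made-up example data.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def load_word(regs: list, data_memory: dict, rt: int, rs: int, offset: int) -> int:
    """Model lw rt, offset(rs): compute the address, read memory, write rt."""
    address = (regs[rs] + sign_extend16(offset)) & 0xFFFFFFFF
    regs[rt] = data_memory[address]
    return address


regs = [0] * 32
regs[18] = 1000                 # $s2 holds a base address
data_memory = {1020: 42}
address = load_word(regs, data_memory, 17, 18, 20)  # lw $s1, 20($s2)
```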

27
Datapath for beq
28
Executing a beq instruction
  • beq $t1, $t2, offset
  • if $t1 == $t2, branch to offset; else go to the
    next instruction. Assume offset = 4; then the
    program instructions in memory are as follows:
  • beq $t1, $t2, offset
  • --------------------
  • --------------------
  • --------------------
  • --------------------
  • offset: --------------------

29
Steps in executing a beq instruction
  • Step 1: PC = PC + 4
  • Step 2: Supply the addresses of $t1, $t2 to the
    register file via Read register1, Read register2
    to get the contents of $t1, $t2 via Read data1,
    Read data2.
  • Step 3: Use the ALU to determine whether the
    values in Read data1, Read data2 are equal
    (Zero = 1) or not equal (Zero = 0). Zero is sent
    to the branch logic to determine whether to
    branch.
  • Step 4: Sign-extend the 16-bit offset to 32 bits.
    Shift the offset left by 2 bits (the same as
    multiplying by 4, the instruction size in bytes).
    Add the offset to PC.
  • Step 5: If we are not branching, the control
    logic keeps the PC value from Step 1.
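
The branch steps can be sketched as a single function that returns the next PC. The function names are hypothetical; the equality test is done by subtraction, which is how the Zero output is normally produced.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def beq_next_pc(pc: int, rs_value: int, rt_value: int, offset: int) -> int:
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF                       # Step 1
    zero = 1 if (rs_value - rt_value) & 0xFFFFFFFF == 0 else 0  # Step 3
    target = (pc_plus_4 + (sign_extend16(offset) << 2)) & 0xFFFFFFFF  # Step 4
    return target if zero else pc_plus_4                    # Step 5


taken = beq_next_pc(64, 7, 7, 4)      # equal: 64 + 4 + 4 * 4 = 84
not_taken = beq_next_pc(64, 7, 8, 4)  # not equal: 64 + 4 = 68
```

With offset = 4 the taken branch skips four instructions, which matches the memory layout shown on the previous slide.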

30
Building the Datapath
  • Use multiplexors to stitch them together

31
Building the Complete Datapath
  • Share datapath elements among instruction
    classes.
  • Allow multiple connections to an element.
  • Component A to B and C (split connection)
  • Component A from B and C (use a mux and add
    control)

[Figure: component A feeding B and C; component A fed from B and C]
32
Stages of Combining Components
  • Combine R-type and memory-reference units: add
    2 muxes (ALUSrc, MemToReg)
  • Add the instruction fetch part: connect the
    instruction output
  • Add the branch datapath: add the PCSrc mux and
    split common sources.

33
Complete Datapath with all control lines
identified: Single-cycle Datapath
  • Calculate the cycle time assuming negligible
    delays except for:
  • memory (2 ns), ALU and adders (2 ns), register
    file access (1 ns)
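
One way to work this out is to add up the delays along the path each instruction class uses. The list of units per class below is an assumption based on the usual single-cycle analysis (for example, a load uses instruction memory, register read, ALU, data memory and register write); only the delay values come from the slide.

```python
DELAY_NS = {"memory": 2, "alu": 2, "regfile": 1}

# Units on the critical path of each instruction class (assumed).
PATHS = {
    "R-type": ["memory", "regfile", "alu", "regfile"],
    "lw": ["memory", "regfile", "alu", "memory", "regfile"],
    "sw": ["memory", "regfile", "alu", "memory"],
    "beq": ["memory", "regfile", "alu"],
}

latency_ns = {name: sum(DELAY_NS[unit] for unit in units)
              for name, units in PATHS.items()}

# In a single-cycle design the clock must fit the slowest instruction.
cycle_time_ns = max(latency_ns.values())
```

Under these assumptions the load is the slowest instruction at 8 ns, so every instruction pays for an 8 ns cycle.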

34
(No Transcript)
35
(No Transcript)
36
(No Transcript)
37
The Instruction classes (R-type, load, store,
branch)
  • Can you figure out where each field of each
    instruction goes on the datapath?

38
The effect of each of the seven control signals
39
Control
  • Selecting the operations to perform (ALU,
    read/write, etc.)
  • Controlling the flow of data (multiplexor inputs)
  • Information comes from the 32 bits of the
    instruction
  • Example: add $t0, $s1, $s2 (instruction format)
  • The ALU's operation is based on instruction type
    and function code

40
Control
  • E.g., what should the ALU do with this
    instruction?
  • Example: lw $1, 100($2)
        35 |  2 |  1 | 100
        op | rs | rt | 16-bit offset
  • ALU control input:
        000  AND
        001  OR
        010  add
        110  subtract
        111  set-on-less-than
  • Why is the code for subtract 110 and not 011?
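
The field layout of the lw example can be checked by packing and unpacking the 32-bit word. The function names are hypothetical; the bit positions (op in bits 31-26, rs in 25-21, rt in 20-16, offset in 15-0) are the standard MIPS I-type layout.

```python
def encode_i_type(op: int, rs: int, rt: int, imm: int) -> int:
    """Pack the four I-type fields into one 32-bit instruction word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i_type(word: int) -> dict:
    """Split a 32-bit instruction word into its I-type fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


word = encode_i_type(35, 2, 1, 100)  # lw $1, 100($2)
fields = decode_i_type(word)
```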

41
Control
  • Must describe hardware to compute the 3-bit ALU
    control input
  • given the instruction type (input into ALU
    control):
        00  lw, sw (ALU performs addition)
        01  beq (ALU performs subtraction)
        11  arithmetic (ALU operation determined
            from the instruction)
  • Also input into ALU control: the function code
    for arithmetic instructions
  • Describe it using a truth table (can turn into
    gates)
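
The truth table can be sketched as a lookup function. This assumes that loads and stores need an add (to form the address) and that beq needs a subtract (to compare), and it treats any instruction type with the high bit set as arithmetic. The function codes are the standard MIPS funct values for add, sub, and, or and slt; the names alu_control and FUNCT_TO_ALU are hypothetical.

```python
# funct field -> 3-bit ALU control input
FUNCT_TO_ALU = {
    32: 0b010,  # add
    34: 0b110,  # sub
    36: 0b000,  # and
    37: 0b001,  # or
    42: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Map the 2-bit instruction type and the funct field to the ALU control."""
    if alu_op == 0b00:   # lw, sw: add base and offset
        return 0b010
    if alu_op == 0b01:   # beq: subtract and test for zero
        return 0b110
    return FUNCT_TO_ALU[funct & 0x3F]  # arithmetic: decided by funct
```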

42
Control

43
Control
  • Simple combinational logic (truth tables)

Main Control unit
ALU Control unit
44
Improving on the Datapath: Multi-cycle Datapath
  • Single-cycle datapath is inefficient. Why?
  • The five execution steps are:
  • Instruction Fetch
  • Instruction Decode and Register Fetch
  • Execution, Memory Address Computation, or Branch
    Completion
  • Memory Access or R-type instruction completion
  • Write-back step
  • INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!

45
Step 1: Instruction Fetch
  • Use PC to get the instruction and put it in the
    Instruction Register.
  • Increment the PC by 4 and put the result back in
    the PC.
  • Can be described succinctly using RTL
    ("Register-Transfer Language"):
        IR = Memory[PC]
        PC = PC + 4

46
Step 2: Instruction Decode and Register Fetch
  • Read registers rs and rt in case we need them
  • Compute the branch address in case the
    instruction is a branch
  • RTL:
        A = Reg[IR[25-21]]
        B = Reg[IR[20-16]]
        ALUOut = PC + (sign-extend(IR[15-0]) << 2)
  • We aren't setting any control lines based on the
    instruction type (we are busy "decoding" it in
    our control logic)

47
Step 3 (instruction dependent)
  • ALU is performing one of three functions, based
    on instruction type
  • Memory reference:
        ALUOut = A + sign-extend(IR[15-0])
  • R-type:
        ALUOut = A op B
  • Branch:
        if (A == B) PC = ALUOut

48
Step 4 (R-type or memory-access)
  • Loads and stores access memory:
        MDR = Memory[ALUOut]   (load)
        Memory[ALUOut] = B     (store)
  • R-type instructions finish:
        Reg[IR[15-11]] = ALUOut
Step 5: Write-back step
  • Load finishes:
        Reg[IR[20-16]] = MDR
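
The five steps can be traced for a load by applying the register transfers one after another. This is a sketch under assumptions: the state dictionary, the single memory shared by instructions and data, and the example values (base register holding 8, word 77 at address 108) are all made up for illustration.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_lw_multicycle(state: dict, memory: dict) -> int:
    """Apply the register transfers for one lw and return the cycle count."""
    # Step 1: instruction fetch
    state["IR"] = memory[state["PC"]]
    state["PC"] += 4
    ir = state["IR"]
    # Step 2: instruction decode and register fetch
    state["A"] = state["Reg"][(ir >> 21) & 0x1F]
    state["B"] = state["Reg"][(ir >> 16) & 0x1F]
    state["ALUOut"] = state["PC"] + (sign_extend16(ir & 0xFFFF) << 2)
    # Step 3: memory address computation
    state["ALUOut"] = state["A"] + sign_extend16(ir & 0xFFFF)
    # Step 4: memory access
    state["MDR"] = memory[state["ALUOut"]]
    # Step 5: write-back
    state["Reg"][(ir >> 16) & 0x1F] = state["MDR"]
    return 5


memory = {0: 0x8C410064, 108: 77}   # lw $1, 100($2) and one data word
state = {"PC": 0, "Reg": [0] * 32}
state["Reg"][2] = 8
cycles = run_lw_multicycle(state, memory)
```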

49
Summary
50
Simple Questions
  • How many cycles will it take to execute this
    code?
        lw  $t2, 0($t3)
        lw  $t3, 4($t3)
        beq $t2, $t3, Label   # assume not equal
        add $t5, $t2, $t3
        sw  $t5, 8($t3)
    Label: ...
  • Can you represent these instructions as
    micro-operations?
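
The first question can be answered by summing the cycles per instruction class. The counts below follow from the five steps described earlier, assuming a branch completes in step 3, R-type instructions and stores in step 4, and loads in step 5.

```python
# Cycles per instruction on the multi-cycle datapath (assumed from the steps).
CYCLES = {"lw": 5, "sw": 4, "add": 4, "beq": 3}

program = ["lw", "lw", "beq", "add", "sw"]  # beq is not taken
total_cycles = sum(CYCLES[op] for op in program)  # 5 + 5 + 3 + 4 + 4
```

Under these assumptions the sequence takes 21 cycles.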

51
High level multi-cycle processor
52
MIPS Multi-cycle processor without controls
53
MIPS Multi-cycle processor with controls
54
Chapter five Summary
  • The datapath and control can be designed based
    on the instruction set architecture.
  • The datapath is composed of combinational units
    (e.g. adder, ALU, mux) and sequential units such
    as registers and memory.
  • We have considered mainly the single-cycle
    datapath design and introduced the multi-cycle
    datapath.
  • The control unit issues the right control signals
    at the right time to enable complete execution of
    an instruction.
  • The control design requires a thorough
    understanding of the design.
  • We have only seen control design for the
    single-cycle datapath.
  • Control design of the multi-cycle datapath
    requires finite state machine theory.
  • The datapath has a mechanism for fetching the
    next instruction, and thus for executing a whole
    program.