COMP 206: Computer Architecture and Implementation - PowerPoint PPT Presentation

Slides: 24
Provided by: Montek5
Learn more at: http://www.cs.unc.edu

1
COMP 206: Computer Architecture and Implementation
  • Montek Singh
  • Wed., Sep 18, 2002
  • Topic: Pipelining -- Intermediate Concepts
  • (Multicycle Operations and Exceptions)

2
Pipelining Multicycle Operations
  • Assume five-stage pipeline
  • Third stage (execution) has two functional units
    E1 and E2
  • Instruction goes through either E1 or E2, but not
    both
  • E1 and E2 are not pipelined
  • Stage delay of E1 = 2 cycles
  • Stage delay of E2 = 4 cycles
  • No buffering on inputs of E1 and E2
  • Stage delay of other stages = 1 cycle
  • Consider an instruction sequence of five
    instructions
  • Instructions 1, 3, 5 need E1
  • Instructions 2, 4 need E2

3
Space-Time Diagram: Multicycle Operations
  • Out-of-order completion
  • 3 finishes before 2, and 5 finishes before 4
  • Instructions may be delayed after entering the
    pipeline because of structural hazards
  • Instructions 2 and 4 both want to use E2 unit at
    same time
  • Instruction 4 stalls in ID unit
  • This causes instruction 5 to stall in IF unit
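
The stall pattern above can be reproduced with a small scheduling sketch. It is only an illustration under stated assumptions: instructions issue in order, an instruction holds ID until its execution unit is free (no input buffering), IF is held while ID is occupied, and conflicts in MEM and WB are ignored. The function name `schedule` and the cycle numbering (first IF in cycle 1) are hypothetical, not from the slides.

```python
def schedule(units, latency):
    """Return the WB cycle of each instruction, in program order.

    units:   execution unit needed by each instruction, e.g. ["E1", "E2"]
    latency: cycles each (non-pipelined) unit is occupied per instruction
    """
    unit_free = {u: 1 for u in latency}  # first cycle each unit is free
    id_free = 1                          # first cycle the ID stage is free
    if_free = 1                          # first cycle the IF stage is free
    wb_cycles = []
    for unit in units:
        if_start = if_free
        id_start = max(if_start + 1, id_free)
        ex_start = max(id_start + 1, unit_free[unit])  # wait in ID if unit busy
        ex_end = ex_start + latency[unit] - 1
        unit_free[unit] = ex_end + 1
        id_free = ex_start   # ID is released when the instruction enters EX
        if_free = id_start   # IF is released when the instruction enters ID
        wb_cycles.append(ex_end + 2)  # one cycle of MEM, then WB
    return wb_cycles


wb = schedule(["E1", "E2", "E1", "E2", "E1"], {"E1": 2, "E2": 4})
print(wb)  # [6, 9, 8, 13, 12]
```

Under these assumptions instruction 3 writes back in cycle 8, before instruction 2 (cycle 9), and instruction 5 in cycle 12, before instruction 4 (cycle 13).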

4
Floating-Point Operations in MIPS
  • [Figure: pipeline stages IF, ID, EX, MEM, WB]
  • Out-of-order completion has ramifications for
    exceptions
  • WAW hazards possible; WAR hazards not possible
  • Longer operation latency implies more frequent
    stalls for RAW hazards
  • Structural hazard: instructions have varying
    running times
  • Structural hazard: not fully pipelined

5
Structural Hazard on WB Unit
  • This is the worst-case scenario; the maximum
    steady-state number of write ports needed is 1
  • Don't replicate resources; detect and serialize
    access as needed
  • Early resolution
  • Track use of WB in ID stage (using shift
    register), stall instructions there
  • reservation register
  • Simplifies pipeline control: all stalls occur in
    ID
  • adds shift register and write-conflict logic
  • Late resolution
  • Stall instructions at entry to MEM or WB stage
  • Complicates pipeline control (two stall locations)
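
The early-resolution scheme can be sketched as a shift register of reservation bits. This is a minimal illustration, not the actual hardware: bit k is assumed to mean "the WB unit is already claimed k cycles from now", and the class and method names (`WBReservation`, `try_issue`, `tick`) are hypothetical.

```python
class WBReservation:
    """Shift register tracking future use of the single write port."""

    def __init__(self, depth):
        # bits[k] is True if WB is reserved k cycles from the current cycle
        self.bits = [False] * depth

    def tick(self):
        """Advance one clock cycle: every reservation moves one slot closer."""
        self.bits = self.bits[1:] + [False]

    def try_issue(self, cycles_until_wb):
        """Called in ID. Reserve WB if free; otherwise the instruction stalls."""
        if self.bits[cycles_until_wb]:
            return False  # write conflict detected: stall in ID
        self.bits[cycles_until_wb] = True
        return True


res = WBReservation(depth=8)
print(res.try_issue(4))  # True: a long operation claims WB four cycles ahead
res.tick()
print(res.try_issue(3))  # False: would reach WB in the same cycle
print(res.try_issue(4))  # True: one cycle later is free
```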

6
WAW Hazards
  • WAW hazard arises only when no instruction
    between ADD.D and L.D uses result computed by
    ADD.D
  • Adding an instruction like ADD.D F8,F2,F4
    before L.D would stall pipeline enough for RAW
    hazard to avoid WAW hazard
  • Can happen through a branch/trap (example in HP3,
    Section A.9)
  • Rare situation, but must still handle correctly
  • Hazard resolution
  • Delay the issue of L.D until ADD.D enters MEM
  • Cancel write of ADD.D

7
RAW Hazards
  • Longer delays of FP operations increase the number
    of stalls in response to RAW hazards
  • Two methods for reducing stalls
  • Compiler could have moved instruction D between
    instructions M and A, which would allow D to
    complete earlier; or hardware could detect this
    possibility and issue instruction D out of order
  • ID stage is a bottleneck because instructions
    wait there for their operands to be available;
    could add buffers (reservation stations) to
    functional units and let instructions await their
    operands there

8
Responsibilities of ID (all stalls in ID)
  • Three sets of checks
  • Structural hazards
  • Check for availability of FP unit
  • Ensure WB unit will be available when needed
  • RAW hazards
  • Stall current instruction until its source
    registers are not listed as pending registers in
    a pipeline register that will not be available
    when current instruction needs the result
  • WAW hazards
  • If any instruction in adder, divider, or
    multiplier has same register destination as
    current instruction, stall current instruction
  • Hazards between FP and integer instructions
  • Integer and FP instructions use disjoint sets of
    registers, except for FP-integer register moves
  • FP load-stores can conflict with integer
    load-stores in MEM stage
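
The checks listed above can be sketched as a single stall decision made in ID. This is a simplified illustration: the write-port check is left out, the `ready` flag stands in for "the result will be available when the current instruction needs it", and the function name and dictionary layout are hypothetical.

```python
def id_must_stall(instr, in_flight, fu_busy):
    """Return the reason the instruction in ID must stall, or None.

    instr:     {"unit": str, "srcs": set of registers, "dest": register}
    in_flight: older, unfinished instructions as
               {"dest": register, "ready": bool}
    fu_busy:   {unit name: True if the unit cannot accept an instruction}
    """
    # Structural hazard: the required functional unit is not available.
    if fu_busy.get(instr["unit"], False):
        return "structural"
    # RAW hazard: a source register is still pending in an older instruction.
    for older in in_flight:
        if older["dest"] in instr["srcs"] and not older["ready"]:
            return "RAW"
    # WAW hazard: an older in-flight instruction writes the same destination.
    for older in in_flight:
        if older["dest"] == instr["dest"]:
            return "WAW"
    return None


# An FP add writing F2 is still in the adder when a load to F2 reaches ID.
pending = [{"dest": "F2", "ready": False}]
load = {"unit": "int", "srcs": {"R2"}, "dest": "F2"}
print(id_must_stall(load, pending, {"int": False}))  # WAW
```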

9
MIPS R4000 Floating-Point Pipeline
10
Instruction Mixes in FP Pipeline: Adds Only
  • Can't initiate another add on cycle 2 (conflict
    here)
  • Can't initiate another add on cycle 3 (conflict
    here)
  • Forbidden latencies: 1 and 2
  • Steady-state utilization (cycles 4 through 18)
  • (5×7)/(8×15) = 35/120 = 29.17%
  • Total utilization (cycles 1 through 19)
  • (5 + 5×7 + 2)/(8×19) = 42/152 = 27.63%

11
FP Pipeline: Multiplies Only
  • Collision vector
  • 1 indicates forbidden latency
  • 0 indicates allowed latency
  • Steady-state utilization (cycles 5-24)
  • (5×10)/(8×20) = 50/160 = 31.25%
  • Total utilization (cycles 1-28)
  • (5 + 5×10 + 5)/(8×28) = 60/224 = 26.79%
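
A collision vector can be derived from a reservation table by collecting, for each stage, the distances between the cycles in which one operation uses that stage. The sketch below uses a made-up three-stage table (not the R4000 table from the slides), and the function names are hypothetical.

```python
def forbidden_latencies(reservation_table):
    """reservation_table maps a stage name to the set of cycles (0-based)
    in which a single operation occupies that stage."""
    forbidden = set()
    for cycles in reservation_table.values():
        for a in cycles:
            for b in cycles:
                if b > a:
                    # A second operation started b - a cycles later would
                    # need this stage in the same cycle as the first one.
                    forbidden.add(b - a)
    return forbidden


def collision_vector(forbidden, max_latency):
    """Entry i-1 is 1 if latency i is forbidden, 0 if it is allowed."""
    return [1 if lat in forbidden else 0 for lat in range(1, max_latency + 1)]


# Hypothetical table: stage S2 is held for three consecutive cycles.
table = {"S1": {0}, "S2": {1, 2, 3}, "S3": {4}}
lats = forbidden_latencies(table)
print(sorted(lats))               # [1, 2]
print(collision_vector(lats, 4))  # [1, 1, 0, 0]
```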
12
FP Pipeline: Adds and Multiplies
  • Note out-of-order completion
  • Steady-state utilization (cycles 6-21)
  • (4×17)/(8×16) = 68/128 = 53.13%
  • Total utilization
  • 85/(8×28) = 85/224 = 37.95%
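
The utilization figures on these slides all follow one pattern: busy stage-cycles divided by (number of stages × number of cycles in the window). A small check of the arithmetic, with a hypothetical helper name:

```python
def utilization(busy_slots, stages, cycles):
    """Percentage of stage-cycles that are busy in the given window."""
    return 100.0 * busy_slots / (stages * cycles)


# Figures quoted on the slides (8 stages throughout).
print(utilization(35, 8, 15))  # adds only, steady state
print(utilization(42, 8, 19))  # adds only, total
print(utilization(50, 8, 20))  # multiplies only, steady state
print(utilization(60, 8, 28))  # multiplies only, total
print(utilization(68, 8, 16))  # adds and multiplies, steady state
print(utilization(85, 8, 28))  # adds and multiplies, total
```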

13
Interrupts, Faults, or Exceptions
  • Synchronous, coerced interrupts that occur within
    instructions and after which execution must
    resume are the hardest to implement
  • See Figure A.27 in HP3

14
Precise Interrupts (Sequential Processor)
  • When interrupt occurs, state of interrupted
    process is saved, including PC (= u), registers,
    and memory
  • Interrupt is precise if the following three
    conditions hold
  • All instructions preceding u have been executed,
    and have modified the state correctly
  • All instructions following u are unexecuted, and
    have not modified the state
  • If the interrupt was caused by an instruction, it
    was caused by instruction u, which is either
    completely executed (overflow) or completely
    unexecuted (VM page fault)
  • Precise interrupts are desirable if software is
    to fix up error that caused interrupt and
    execution has to be resumed
  • Easy for external interrupts, could be complex
    and costly for internal
  • Imperative for some interrupts (VM page faults,
    IEEE FP standard)

15
Problems on Sequential Processors
  • Long-running instructions
  • Not enough to be able to restore state, must make
    progress from interrupt to interrupt
  • Example: MVC on IBM 360 copies 256 bytes
  • No virtual memory, so interrupts not allowed to
    stop MVC
  • Example: MVC on IBM 370 copies 256 bytes
  • Has virtual memory, so first access all pages
    involved; after that, no interrupts allowed
  • Example: MVCL on IBM 370 copies up to 2^24 bytes
  • Has VM; two addresses and length are in registers
  • Registers saved and restored on interrupts
    (making progress)
  • Instruction modifies state early, then causes an
    interrupt
  • State change must be undone
  • Example: First operand of VAX instruction uses
    autodecrement addressing mode, which writes a
    register. Trying to access second operand causes
    a page fault. Since instruction execution cannot
    be completed, we must restore the register
    written by autodecrement to its original value

16
Interrupts in MIPS Pipeline
  • How do we stop and restart execution on an
    interrupt to keep it precise?
  • What problems do delayed branches cause?
  • What happens if multiple exceptions occur in the
    pipeline?
  • Can exceptions occur out-of-order?
  • What problems do multi-cycle instructions cause?

17
MIPS Integer Pipeline, Single Interrupt
  • Force TRAP instruction in pipeline on next IF
  • Turn off all writes for faulting instruction and
    subsequent instructions
  • After exception-handling routine in OS receives
    control, save PC of faulting instruction
  • When exception has been handled, the RFE
    instruction reloads PC and restarts sequential
    instruction execution

18
Complications with Delayed Branches
  • Suppose instruction 2 causes an exception (e.g.,
    a page fault) after the taken branch completes
    (determining that the branch outcome is true)
  • Instruction 2 cannot complete
  • Neither can instruction u
  • On restart, we do not have sequential execution
  • We must remember two PC values: 2 and u

19
Complications with Multiple Exceptions
  • At same cycle, LW takes a data page fault and ADD
    takes an arithmetic exception
  • On an unpipelined machine, LW's exception would
    occur first
  • Handle the page fault
  • Restart execution
  • ADD will cause arithmetic exception to reoccur;
    handle it then

20
Complications with Out-of-order Exceptions
  • LW takes data page fault, ADD takes instruction
    page fault
  • Relative timing differs between unpipelined and
    pipelined machines
  • To maintain precise interrupts, we need to
    consider both when they occur and the
    instructions that caused them
  • Post exceptions in exception status vector, turn
    off state modifications, and check vector in WB
    unit
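
The difference between the order in which exceptions are raised and the order in which they must be handled can be sketched as follows. The model is deliberately simple and assumed: instruction i enters IF in cycle i with no stalls, each instruction carries its own exception status vector, and the vectors are checked in program order as instructions reach WB. The function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def raised_first_in_time(program):
    """program: list of (name, {stage: exception}) in program order.
    Returns the (name, exception) that occurs in the earliest cycle."""
    events = []
    for i, (name, faults) in enumerate(program):
        for stage, exc in faults.items():
            events.append((i + STAGES.index(stage), name, exc))
    return min(events)[1:] if events else None


def handled_first(program):
    """Check status vectors in program order, as the WB unit would."""
    for name, faults in program:
        for stage in STAGES:
            if stage in faults:
                return (name, faults[stage])
    return None


program = [
    ("LW", {"MEM": "data page fault"}),
    ("ADD", {"IF": "instruction page fault"}),
]
print(raised_first_in_time(program))  # ('ADD', 'instruction page fault')
print(handled_first(program))         # ('LW', 'data page fault')
```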

21
Complications with Multicycle Operations
  • Instructions are independent (no hazards) and
    therefore issue immediately
  • Differences in running times cause out-of-order
    termination
  • DIVF throws arithmetic exception late in its
    execution
  • At that point, ADDF and SUBF have both completed
    execution and destroyed one of their operands
  • Can we maintain precise interrupts under these
    conditions?

22
FP Pipeline Exceptions: Solns. 1 and 2
  • Settle for imprecise interrupts (CRAY, with
    checkpointing)
  • Done on Alpha 21064 and 21164, IBM Power-1 and
    Power-2, MIPS R8000 by supporting a fast
    imprecise mode and a slow precise mode
  • Not an option if you have to support virtual
    memory or IEEE floating point standard
  • Software finishes certain instructions (SPARC)
  • Keep enough state around for trap handler to
    create a precise sequence for exception and
    finish work for some instruction stages
  • Only FP instructions cause this problem

23
FP Pipeline Exceptions: Solns. 3 and 4
  • Stalling (MIPS R2000/3000, MIPS R4000, Pentium)
  • An instruction is allowed to issue only if it is
    certain that all the instructions before the
    issuing instruction will complete without causing
    an exception
  • To prevent excessive stalling, FP units must
    decide on possibility of exceptions early in
    pipeline
  • General methods (PowerPC 620, MIPS R10000)
  • Reorder buffer, history file, future file
  • An instruction is allowed to finalize its writes
    only when all previously issued instructions are
    complete
  • More naturally used in connection with ILP
    (Chapter 4)
  • Significant complexity (to be discussed later)
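
The reorder-buffer rule (finalize writes only when all earlier instructions are complete) can be sketched in a few lines. This is a toy model under stated assumptions, not any particular processor's design: results are buffered per entry, entries retire strictly from the head, and an exception at the head discards all younger entries. The class and field names are hypothetical.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self):
        self.entries = deque()  # program order, oldest at the left

    def issue(self, name, dest):
        entry = {"name": name, "dest": dest, "value": None,
                 "done": False, "exc": None}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, exc=None):
        """Execution finished (possibly out of order); buffer the outcome."""
        entry["value"], entry["exc"], entry["done"] = value, exc, True

    def commit(self, regs):
        """Retire finished entries from the head, in program order.
        Returns the name of an excepting instruction, or None."""
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            if entry["exc"] is not None:
                self.entries.clear()  # squash younger, uncommitted work
                return entry["name"]
            regs[entry["dest"]] = entry["value"]
        return None


regs = {"F0": 0.0, "F2": 1.0, "F4": 2.0}
rob = ReorderBuffer()
div = rob.issue("DIVF", "F0")
add = rob.issue("ADDF", "F2")
sub = rob.issue("SUBF", "F4")
rob.complete(add, value=10.0)  # finishes early but cannot commit yet
rob.complete(sub, value=20.0)
print(rob.commit(regs))        # None: DIVF at the head is not done
rob.complete(div, exc="arithmetic exception")
print(rob.commit(regs))        # DIVF
print(regs)                    # registers unchanged: state is precise
```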