CSECE 365 Computer Architecture - PowerPoint PPT Presentation


1
CS/ECE 365 Computer Architecture
  • Soundararajan Ezekiel
  • Department of Computer Science
  • Ohio Northern University

2
Plan for Today
  • Homework out today
  • Quiz on Thursday
  • Study hazards for the DLX architecture; do
    problems
  • Hazard types: structural, control, and data
  • Datapath: introduction

3
Structural Hazard
  • A machine with only one memory port will generate
    a conflict whenever a memory reference occurs
  • In this example, the load instruction uses the
    memory for a data access at the same time that
    instruction 3 wants to fetch an instruction from
    memory

4
Two instructions need access to memory in clock
cycle 4. This is the main reason for having
separate memories: an I-memory for instructions
and a D-memory for data values
5
Two instructions need access to memory in clock
cycle 4. We had to stall to fix this as it
6
Example
  • Suppose that data transfers constitute 40% of the
    mix, and that the ideal CPI of the pipelined
    machine, ignoring the structural hazard, is 1.
    Assume that the machine with the structural
    hazard has a clock rate that is 1.05 times
    higher than the clock rate of the machine without
    the hazard. Disregarding any other performance
    losses, is the pipeline with or without the
    structural hazard faster, and by how much?

7
Answer
  • There are several ways to do this problem
  • The simplest is to compute the average
    instruction time for the two machines
  • Average instruction time = CPI × Clock cycle time
  • Since it has no stalls, the average instruction
    time for the ideal machine is simply
    Clock cycle time (ideal)
  • The average instruction time for the machine with
    the structural hazard is
  • Average instruction time = CPI × Clock cycle time
  • = (1 + 0.4 × 1) × (Clock cycle time (ideal) / 1.05)
  • = 1.33 × Clock cycle time (ideal)
  • The machine without the hazard is faster, by a
    factor of about 1.33
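The arithmetic above can be checked with a minimal Python sketch. The variable names are illustrative, and the ideal clock cycle time is normalized to 1.0 as an assumption; only the 40%, the one-cycle stall, and the 1.05 clock-rate factor come from the example.

```python
def avg_instruction_time(cpi, clock_cycle_time):
    """Average instruction time = CPI x clock cycle time."""
    return cpi * clock_cycle_time

# Assumption: normalize the ideal machine's clock cycle time to 1.0.
IDEAL_CYCLE = 1.0
DATA_TRANSFER_FRACTION = 0.40  # 40% of the instruction mix
STALL_CYCLES = 1               # one stall cycle per data transfer
CLOCK_RATE_FACTOR = 1.05       # hazard machine's clock rate is 1.05x higher

# Ideal machine: CPI of 1, no stalls.
ideal_time = avg_instruction_time(1.0, IDEAL_CYCLE)

# Machine with the structural hazard: extra stall cycles, shorter cycle.
hazard_cpi = 1.0 + DATA_TRANSFER_FRACTION * STALL_CYCLES
hazard_time = avg_instruction_time(hazard_cpi, IDEAL_CYCLE / CLOCK_RATE_FACTOR)

speedup_of_ideal = hazard_time / ideal_time
print(f"Machine without the hazard is {speedup_of_ideal:.2f} times faster")
```

The higher clock rate does not make up for the stalls: the CPI grows by 40% while the cycle time shrinks by only about 5%.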

8
Data Hazards
  • A major effect of pipelining is to change the
    relative timing of instructions by overlapping
    their execution
  • ADD R1,R2,R3
  • SUB R4,R1,R5
  • AND R6,R1,R7
  • OR R8,R1,R9
  • XOR R10,R1,R11
  • All the instructions after the ADD use the result
    of the ADD instruction

9
[Figure: pipeline diagram, time in clock cycles CC1
to CC6, showing the stages IM, Reg, ALU, DM, Reg for
ADD R1,R2,R3; SUB R4,R1,R5; AND R6,R1,R7;
OR R8,R1,R9; XOR R10,R1,R11]
The use of the result of the ADD instruction in
the next three instructions causes a
hazard, since the register is not written until
after those instructions read it
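The timing argument can be sketched in Python. This is a simplified model under stated assumptions, not the DLX hardware: a five-stage pipeline with no stalls, where the k-th instruction reads its registers in cycle k + 2 and writes its result in cycle k + 5. The `split_cycle_regfile` flag is a hypothetical switch for a register file that writes in the first half of a cycle and reads in the second half.

```python
PROGRAM = [
    "ADD R1,R2,R3",
    "SUB R4,R1,R5",
    "AND R6,R1,R7",
    "OR R8,R1,R9",
    "XOR R10,R1,R11",
]

def parse(instr):
    """Split 'OP Rd,Rs1,Rs2' into (dest, [sources])."""
    _, operands = instr.split(maxsplit=1)
    regs = [r.strip() for r in operands.split(",")]
    return regs[0], regs[1:]

def raw_hazards(program, split_cycle_regfile=False):
    """List (producer, consumer, register) triples where the consumer
    reads a register before the producer has written it.

    Simplification: later redefinitions of the register are ignored.
    """
    hazards = []
    for i, producer in enumerate(program):
        dest, _ = parse(producer)
        write_cycle = i + 5          # WB stage of instruction i
        for j in range(i + 1, len(program)):
            _, sources = parse(program[j])
            read_cycle = j + 2       # ID stage of instruction j
            same_cycle_ok = split_cycle_regfile and read_cycle == write_cycle
            if dest in sources and read_cycle <= write_cycle and not same_cycle_ok:
                hazards.append((producer, program[j], dest))
    return hazards

for producer, consumer, reg in raw_hazards(PROGRAM):
    print(f"{consumer} reads {reg} before {producer} writes it")
```

With the default setting, the sketch reports SUB, AND, and OR, matching the three instructions named above; XOR reads R1 after it has been written. With `split_cycle_regfile=True`, OR would no longer be flagged.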
10
Remedy
  • The problem posed on the previous slide can be
    solved with a simple hardware technique called
    forwarding
  • That is, the ADD produces its result in the
    EX/MEM register
  • The SUB needs this value at the ALU input latch
  • Forwarding moves the result from the EX/MEM
    register to the ALU input latch
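As a rough illustration of the idea, the following Python sketch selects an ALU operand either from the register file or from the EX/MEM register. The `ExMem` class, the `alu_operand` function, and the register values are hypothetical, chosen only to show the selection; real forwarding logic is a hardware multiplexer with more sources.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class ExMem:
    """Hypothetical model of the EX/MEM pipeline register."""
    dest: str        # destination register of the instruction that left EX
    alu_result: int  # value it computed

def alu_operand(source_reg: str, register_file: Dict[str, int],
                ex_mem: Optional[ExMem]) -> int:
    """Return the value for one ALU input, forwarding from EX/MEM
    when that register holds a newer value for source_reg."""
    if ex_mem is not None and ex_mem.dest == source_reg:
        return ex_mem.alu_result
    return register_file[source_reg]

# Illustrative values; R1 in the register file is still stale.
regs = {"R1": 0, "R2": 10, "R3": 20, "R5": 4}

# ADD R1,R2,R3 has just left EX: its result sits in EX/MEM.
ex_mem = ExMem(dest="R1", alu_result=regs["R2"] + regs["R3"])

# SUB R4,R1,R5 is now in EX and picks up R1 from EX/MEM.
sub_result = alu_operand("R1", regs, ex_mem) - alu_operand("R5", regs, ex_mem)
print(sub_result)
```

Without the forwarding path (`ex_mem=None`), the SUB would read the stale value of R1 from the register file.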

11
Data Hazard Classification
  • Consider two instructions i and j, with i
    occurring before j
  • Possible data hazards:
  • RAW (read after write): j tries to read a source
    before i writes it, so j incorrectly gets the
    old value. This is the most common type, and the
    one forwarding is used to overcome
  • WAW (write after write): j tries to write an
    operand before it is written by i
  • This happens only in pipelines that write in
    more than one pipe stage
  • The DLX integer pipeline writes a register only
    in WB and avoids this class of hazard
  • We will discuss this situation later

12
Continued
  • WAR (write after read): j tries to write a
    destination before it is read by i, so i
    incorrectly gets the new value
  • Not all potential data hazards can be handled by
    bypassing
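The three classes can be summarized in a small Python sketch. It only looks at which registers each instruction reads and writes, so it reports potential hazards; whether one actually occurs depends on the pipeline, as noted above for WAW in the DLX integer pipeline. The function name and the example register sets are illustrative.

```python
def classify_hazards(i_reads, i_writes, j_reads, j_writes):
    """Return the potential data hazards between an earlier
    instruction i and a later instruction j."""
    hazards = set()
    if set(i_writes) & set(j_reads):
        hazards.add("RAW")  # j reads what i writes
    if set(i_writes) & set(j_writes):
        hazards.add("WAW")  # j writes what i writes
    if set(i_reads) & set(j_writes):
        hazards.add("WAR")  # j writes what i reads
    return hazards

# i = ADD R1,R2,R3 ; j = SUB R4,R1,R5
print(classify_hazards(["R2", "R3"], ["R1"], ["R1", "R5"], ["R4"]))

# i = ADD R1,R2,R3 ; j = SUB R2,R4,R5
print(classify_hazards(["R2", "R3"], ["R1"], ["R4", "R5"], ["R2"]))
```

Read after read is not listed because two reads of the same register never conflict.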

13
Example
  • Suppose that 30% of the instructions are loads,
    and half the time the instruction following a
    load instruction depends on the result of the
    load. If this hazard creates a single-cycle
    delay, how much faster is the ideal pipelined
    machine (with a CPI of 1) that does not delay the
    pipeline than the real pipeline? Ignore any
    stalls other than pipeline stalls.

14
Answer
  • The ideal machine will be faster by the ratio of
    the CPIs
  • The CPI for an instruction following a load is
    1.5 (since it stalls half the time)
  • The effective CPI is (0.7 × 1) + (0.3 × 1.5) = 1.15
  • This means that the ideal machine is 1.15 times
    faster
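The same calculation as a short Python sketch; the variable names are illustrative and the numbers are those of the example.

```python
LOAD_FRACTION = 0.30       # 30% of instructions are loads
DEPENDENT_FRACTION = 0.5   # half of the following instructions depend on the load
STALL_CYCLES = 1           # single-cycle delay per dependent instruction
IDEAL_CPI = 1.0

# CPI of an instruction that follows a load: it stalls half the time.
cpi_after_load = IDEAL_CPI + DEPENDENT_FRACTION * STALL_CYCLES

# Weighted average over instructions that do and do not follow a load.
effective_cpi = (1 - LOAD_FRACTION) * IDEAL_CPI + LOAD_FRACTION * cpi_after_load

speedup_of_ideal = effective_cpi / IDEAL_CPI
print(f"Effective CPI: {effective_cpi:.2f}")
print(f"Ideal machine is {speedup_of_ideal:.2f} times faster")
```

Because both machines have the same clock cycle time and instruction count, the ratio of the CPIs is the ratio of the execution times.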

15
Compiler scheduling for data hazards
  • Many types of stalls are quite frequent
  • The typical code-generation pattern for a
    statement such as A = B + C produces a stall for
    the load of the second data value, C
  • The next slide shows that the store of A need not
    cause another stall, since the result of the
    addition can be forwarded to the data memory for
    use by the store

16
Figure
LW  R1,B       IF   ID   EX   MEM    WB
LW  R2,C            IF   ID   EX     MEM    WB
ADD R3,R1,R2             IF   ID     stall  EX    MEM   WB
SW  A,R3                      IF     stall  ID    EX    MEM   WB
  • The DLX code sequence for A = B + C. The ADD
    instruction must be stalled to allow the load of
    C to complete. The SW need not be delayed further
    because the forwarding hardware passes the result
    from the MEM/WB register directly to the data
    memory input for storing
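The timing in the figure can be reproduced with a small Python sketch. It is a simplified model under stated assumptions: an in-order five-stage pipeline with full forwarding, where the only stall is one cycle when an instruction uses the destination of the load immediately before it. The `timeline` function and its parsing of the slide's operand syntax are illustrative.

```python
def operands(instr):
    """Split 'OP a,b,c' into (op, [a, b, c])."""
    op, rest = instr.split(maxsplit=1)
    return op, [x.strip() for x in rest.split(",")]

def timeline(program):
    """Return one row per instruction; row[c - 1] is the stage
    occupied in clock cycle c ('' before the instruction is fetched).

    Simplification: every use right after a load is treated as
    needing the loaded value in EX.
    """
    rows = []
    prev_op, prev_dest, prev_id, prev_ex = None, None, 1, 2
    for instr in program:
        op, regs = operands(instr)
        if_c = prev_id   # fetch when the previous instruction enters ID
        id_c = prev_ex   # decode when the previous instruction enters EX
        load_use = prev_op == "LW" and prev_dest in regs[1:]
        ex_c = id_c + (2 if load_use else 1)
        stages = {if_c: "IF", id_c: "ID", ex_c: "EX",
                  ex_c + 1: "MEM", ex_c + 2: "WB"}
        rows.append([stages.get(c, "stall" if if_c < c < ex_c else "")
                     for c in range(1, ex_c + 3)])
        prev_op, prev_dest, prev_id, prev_ex = op, regs[0], id_c, ex_c
    return rows

PROGRAM = ["LW R1,B", "LW R2,C", "ADD R3,R1,R2", "SW A,R3"]
for instr, row in zip(PROGRAM, timeline(PROGRAM)):
    print(f"{instr:<14}" + "".join(f"{stage:<6}" for stage in row))
```

The ADD waits one cycle in ID for the load of C, and the SW, held behind it in IF, shifts by the same cycle without needing a stall of its own.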

17
Pipeline scheduling or instruction scheduling
  • Rather than just allowing the pipeline to stall,
    the compiler could try to schedule the pipeline
    to avoid these stalls by rearranging the code
    sequence to eliminate the hazard.
  • For example, the compiler could try to avoid
    generating code with a load followed by the
    immediate use of the load destination register.
  • This technique is called pipeline scheduling or
    instruction scheduling
  • It was first used in the 1960s and became more
    popular in the 1980s
18
Implementing the control for the DLX Pipeline
  • The process of letting an instruction move from
    the instruction decode stage (ID) into the
    execution stage (EX) of this pipeline is usually
    called instruction issue; an instruction that has
    made this step is said to have issued
  • For the DLX integer pipeline, all the data
    hazards can be checked during the ID phase of
    the pipeline

19
  • The load instruction has a delay or latency that
    cannot be eliminated by forwarding alone.
    Instead, we need to add hardware, called a
    pipeline interlock, to preserve the correct
    execution pattern.
  • In general, a pipeline interlock detects a hazard
    and stalls the pipeline until the hazard is
    cleared

20
Datapath and Control: Introduction
  • We have already looked at the performance of a
    machine
  • Three factors: instruction count, clock cycle
    time, and clock cycles per instruction (CPI)
  • CPU time = IC × CPI × Clock cycle time
  • Clock cycle time = 1 / clock rate
  • Clock cycle time depends on hardware technology
    and organization
  • CPI depends on organization and instruction set
    architecture
  • Instruction count depends on instruction set
    architecture and compiler technology
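The performance equation can be written as a short Python function. The instruction count and clock rate in the usage example are made-up values for illustration; only the formula comes from the slide.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = IC x CPI x clock cycle time, with
    clock cycle time = 1 / clock rate."""
    clock_cycle_time = 1.0 / clock_rate_hz
    return instruction_count * cpi * clock_cycle_time

# Hypothetical program: 1 billion instructions on a 500 MHz clock.
ideal = cpu_time(1e9, 1.0, 500e6)
with_stalls = cpu_time(1e9, 1.15, 500e6)
print(f"{ideal:.2f} s ideal, {with_stalls:.2f} s with an effective CPI of 1.15")
```

With instruction count and clock rate held fixed, the CPU time scales directly with the CPI, which is why the hazard examples compare machines by their CPI ratio.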

21
Continued
  • We will discuss the datapath and control unit for
    two different implementations of the MIPS
    instruction set
  • which includes
  • Memory-reference instructions: load word (lw)
    and store word (sw)
  • Arithmetic-logical instructions: add, sub, and,
    or, slt
  • The instructions branch equal (beq) and jump (j)
