Title: CS 2200 Lecture 09a Hazards
1CS 2200 Lecture 09a Hazards
- (Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, and Michael
Niemier)
2The hazards of pipelining
- Pipeline hazards prevent the next instruction
from executing during its designated clock cycle
- There are 3 classes of hazards
- Structural Hazards
- Arise from resource conflicts when HW cannot
support all possible combinations of instructions
- Data Hazards
- Occur when a given instruction depends on data
from an instruction ahead of it in the pipeline
- Control Hazards
- Result from branch-type and other instructions
that change the flow of the program (i.e., the PC)
3How do we deal with hazards?
- Often, the pipeline must be stalled
- Stalling the pipeline usually lets some
instruction(s) in the pipeline proceed while
another/others wait for data, a resource, etc.
- A note on terminology
- If we say an instruction was issued later than
instruction x, we mean that it was issued after
instruction x and is not as far along in the
pipeline
- If we say an instruction was issued earlier than
instruction x, we mean that it was issued before
instruction x and is further along in the pipeline
4Stalls and performance
- Stalls impede the progress of a pipeline and
cause it to deviate from the ideal of 1 instruction
completing each clock cycle
- Recall that pipelining can be viewed as a way to
- Decrease the CPI or the clock cycle time for an
instruction
- Let's see what effect stalls have on CPI
- $\text{CPI}_{\text{pipelined}} = \text{Ideal CPI} + \text{Pipeline stall cycles per instruction}$
- $= 1 + \text{Pipeline stall cycles per instruction}$
- Ignoring overhead and assuming stages are
balanced
5More pipeline performance issues
- Pipelining can also be viewed as improving the clock
cycle time
- We can assume that the CPI of an unpipelined and
of an ideal pipelined machine is 1
- This results in
- If pipe stages are perfectly balanced and we
assume no overhead, the clock cycle on the pipelined
machine is smaller than on the unpipelined machine by a
factor equal to the pipeline depth.
6Even more pipeline performance issues!
- This results in
- Which leads to
- Thus, if there are no stalls, the speedup is
equal to the number of pipeline stages in the
ideal case
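The equations that followed "This results in" and "Which leads to" on the two slides above did not survive in this transcript. A reconstruction consistent with the surrounding bullets (CPI of 1 on both machines before stalls, balanced stages, no overhead), based on the standard textbook derivation rather than on the original slide images, would be:

```latex
\begin{align*}
\text{Speedup}
  &= \frac{\text{CPI}_{\text{unpipelined}}}{\text{CPI}_{\text{pipelined}}}
     \times \frac{\text{Clock cycle}_{\text{unpipelined}}}{\text{Clock cycle}_{\text{pipelined}}} \\
  &= \frac{1}{1 + \text{Pipeline stall cycles per instruction}}
     \times \frac{\text{Clock cycle}_{\text{unpipelined}}}{\text{Clock cycle}_{\text{pipelined}}} \\
\text{Clock cycle}_{\text{pipelined}}
  &= \frac{\text{Clock cycle}_{\text{unpipelined}}}{\text{Pipeline depth}} \\
\text{Speedup}
  &= \frac{1}{1 + \text{Pipeline stall cycles per instruction}} \times \text{Pipeline depth}
\end{align*}
```

With zero stall cycles per instruction, the last line reduces to a speedup equal to the pipeline depth, which matches the conclusion stated on the slide.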
7Structural hazards
- One way to avoid structural hazards is to
duplicate resources
- For example: an ALU to perform an arithmetic
operation and a separate adder to increment the PC
- However, if not all possible combinations of
instructions can be executed, structural hazards
occur
- The most common instances of structural hazards
occur
- When some functional unit is not fully pipelined
- When some resource has not been duplicated enough
- Pipelines stall as a result of this hazard, and
CPI increases from the ideal value of 1
8An example of a structural hazard
[Pipeline diagram: Load, Instruction 1, Instruction 2,
Instruction 3, and Instruction 4 in successive rows;
horizontal axis: Time]
What's the problem here?
9How is it resolved?
[Pipeline diagram: Load, Instruction 1, Instruction 2,
Stall, and Instruction 3 in successive rows;
horizontal axis: Time]
The pipeline is generally stalled by inserting a
bubble or NOP
10Or alternatively
[Table: pipeline stage occupied by each instruction,
by clock number]
- The LOAD instruction effectively steals an
instruction fetch cycle, which causes the
pipeline to stall.
- Thus, no instruction completes on clock cycle 8
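A minimal sketch of how the stolen fetch cycle plays out, assuming a 5-stage pipeline (IF, ID, EX, MEM, WB) with a single memory port shared by instruction fetch and data access, where the data access wins. The function name `fetch_cycles` and the list encoding of the program are illustrative, not from the slides.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def fetch_cycles(uses_data_memory):
    """Return the clock cycle in which each instruction is fetched.

    uses_data_memory[k] is True when instruction k is a load or store.
    Assumption: one memory port shared by IF and MEM; data accesses win.
    """
    busy = set()   # cycles in which a load/store occupies the memory port
    fetch = []
    cycle = 0
    for is_mem in uses_data_memory:
        cycle += 1               # earliest slot: one cycle after the previous fetch
        while cycle in busy:     # port taken by an older load/store: insert a bubble
            cycle += 1
        fetch.append(cycle)
        if is_mem:
            busy.add(cycle + STAGES.index("MEM"))   # MEM comes 3 cycles after IF
    return fetch

# LOAD followed by four instructions that do not touch data memory
program = [True, False, False, False, False]
fetch = fetch_cycles(program)
completes = [f + STAGES.index("WB") for f in fetch]   # WB comes 4 cycles after IF

print(fetch)       # [1, 2, 3, 5, 6]
print(completes)   # [5, 6, 7, 9, 10]
```

Under these assumptions the fourth instruction cannot be fetched in cycle 4, because the LOAD is using memory in its MEM stage, so nothing reaches WB in cycle 8.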
11An example
- The facts
- Data references constitute 40% of the instruction
mix
- The ideal CPI of the pipelined machine is 1
- The machine with the structural hazard has a
clock rate that's 1.05 times higher than the
machine without the hazard.
- How much does this LOAD problem hurt us?
- Recall: $\text{Avg. Inst. Time} = \text{CPI} \times \text{Clock cycle time}$
- $= (1 + 0.4 \times 1) \times (\text{Clock cycle time}_{\text{ideal}} / 1.05)$
- $\approx 1.3 \times \text{Clock cycle time}_{\text{ideal}}$
- Therefore the machine without the hazard is
better: its average instruction time is just
$1 \times \text{Clock cycle time}_{\text{ideal}}$
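The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the calculation above; the variable names are illustrative, and times are expressed in units of the hazard-free machine's clock cycle.

```python
data_ref_fraction = 0.40      # 40% of instructions reference data memory
stall_per_data_ref = 1        # each such instruction costs one stall cycle
ideal_cpi = 1.0
clock_rate_advantage = 1.05   # hazard machine's clock is 1.05 times faster

cpi_with_hazard = ideal_cpi + data_ref_fraction * stall_per_data_ref

# Average instruction time = CPI x clock cycle time
time_without_hazard = ideal_cpi * 1.0
time_with_hazard = cpi_with_hazard * (1.0 / clock_rate_advantage)

print(f"CPI with hazard: {cpi_with_hazard:.2f}")
print(f"Relative avg. instruction time: {time_with_hazard / time_without_hazard:.2f}")
```

The ratio comes out at roughly 1.33, which the slide rounds to 1.3: the faster clock does not make up for the extra stall cycles.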
12Remember the common case!
- All things being equal, a machine without
structural hazards will always have a lower CPI.
- However, in some cases it may be better to
allow them than to eliminate them.
- These are situations a computer architect might
have to consider
- Is pipelining functional units or duplicating
them costly in terms of HW?
- Does the structural hazard occur often?
- What's the common case???
13Data hazards
- These exist because of pipelining
- Why do they exist???
- Pipelining changes the order of read/write
accesses to operands
- The order differs from the order seen by
sequentially executing instructions on an
unpipelined machine
- Consider this example
- ADD R1, R2, R3
- SUB R4, R1, R5
- AND R6, R1, R7
- OR R8, R1, R9
- XOR R10, R1, R11
All instructions after ADD use the result of the
ADD instruction. However, for the DLX µP,
ADD writes the register in WB but SUB needs it in
ID. This is a data hazard.
14Illustrating a data hazard
The ADD instruction causes a hazard in the next 3
instructions because the register is not written
until after those 3 read it.
[Pipeline diagram: ADD R1, R2, R3; SUB R4, R1, R5;
AND R6, R1, R7; OR R8, R1, R9; XOR R10, R1, R11 in
successive rows, each passing through the Reg and
Mem (DM) stages; horizontal axis: Time]
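A minimal sketch of why exactly three instructions are affected, assuming the 5-stage pipeline (IF, ID, EX, MEM, WB) with register reads in ID, register writes in WB, and no forwarding. The function name and the `split_cycle_regfile` flag are illustrative, not from the slides.

```python
ID_STAGE = 2   # stage numbers: IF=1, ID=2, EX=3, MEM=4, WB=5
WB_STAGE = 5

def reads_stale_value(distance, split_cycle_regfile=False):
    """True if an instruction issued `distance` slots after the producer
    reads the register file before the producer's write is visible."""
    write_cycle = WB_STAGE               # producer is issued in cycle 1
    read_cycle = distance + ID_STAGE     # consumer is issued in cycle 1 + distance
    if split_cycle_regfile:
        # write in the first half of the cycle, read in the second half
        return read_cycle < write_cycle
    return read_cycle <= write_cycle

consumers = ["SUB", "AND", "OR", "XOR"]
hazards = [name for d, name in enumerate(consumers, start=1)
           if reads_stale_value(d)]
print(hazards)   # ['SUB', 'AND', 'OR']
```

If the register file is assumed to write in the first half of a cycle and read in the second half, calling `reads_stale_value(3, split_cycle_regfile=True)` returns `False`, so OR no longer sees a stale value and only SUB and AND remain affected.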
15Forwarding
- The problem illustrated on the previous slide can
actually be solved relatively easily with
forwarding
- In this example, the key is that the result of
the ADD instruction is not really needed by SUB until
after the ADD actually produces it
- Can we move the result from the EX/MEM register
to the beginning of the ALU (where SUB needs it)?
- Yes! Hence this slide!
- Generally speaking
- Forwarding occurs when a result is passed
directly to the functional unit that requires it.
- The result goes from the output of one unit to the
input of another
16When can we forward?
SUB inst. gets its information from the EX/MEM
pipe register.
AND inst. gets its information from the MEM/WB
pipe register.
OR inst. gets its information by forwarding from
the register file.
[Pipeline diagram: ADD R1, R2, R3; SUB R4, R1, R5;
AND R6, R1, R7; OR R8, R1, R9; XOR R10, R1, R11 in
successive rows, each passing through the Reg and
Mem (DM) stages; horizontal axis: Time]
The rule of thumb: if a line goes forward, you
can do forwarding. If it's drawn backward, it's
physically impossible.
17HW Change for Forwarding
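The figure for this slide is not preserved. As a rough sketch of the selection logic such hardware typically implements, the function below picks an ALU input from the EX/MEM pipe register, the MEM/WB pipe register, or the register file. The `PipeReg` fields and the function name are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple, Optional

class PipeReg(NamedTuple):
    """Hypothetical subset of a pipeline register's fields."""
    writes_reg: bool
    dest: Optional[int]   # destination register number, if any
    value: int            # result carried in this pipeline register

def select_operand(src, regfile, ex_mem, mem_wb):
    """Pick the ALU input for source register number `src`."""
    if ex_mem.writes_reg and ex_mem.dest == src:
        return ex_mem.value, "EX/MEM"     # newest result takes priority
    if mem_wb.writes_reg and mem_wb.dest == src:
        return mem_wb.value, "MEM/WB"
    return regfile[src], "register file"

regfile = [0] * 32
regfile[1] = 111                                          # stale R1
ex_mem = PipeReg(writes_reg=True, dest=1, value=42)       # ADD result, one stage ahead
mem_wb = PipeReg(writes_reg=False, dest=None, value=0)

print(select_operand(1, regfile, ex_mem, mem_wb))   # (42, 'EX/MEM')
print(select_operand(2, regfile, ex_mem, mem_wb))   # (0, 'register file')
```

Checking EX/MEM before MEM/WB matters when both hold a result for the same register: the instruction closer behind the consumer produced the more recent value.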
18Data hazard specifics
- Not only are there data hazards, but there are
different kinds of data hazards!
- Specifically, there are 3 different kinds
- Read After Write (RAW)
- Write After Write (WAW)
- Write After Read (WAR)
- We will discuss and illustrate each on
forthcoming slides. However, first a note on
convention.
- The discussion of hazards will involve 2 generic
instructions i and j. Instruction i is always
issued before instruction j. Thus, instruction i
will always be further along in the pipeline.
- With an in-order issue/in-order completion
machine, we're not as concerned with WAW,
WAR... but we will be, oh, we will be.
19Read after write (RAW) hazards
- With a RAW hazard, instruction j tries to read a
source operand before instruction i writes it.
- Thus, j would incorrectly receive an old or
incorrect value
- Graphically/Example
i: ADD R1, R2, R3
j: SUB R4, R1, R6
Instruction i is a write instruction issued
before j
Instruction j is a read instruction issued after i
- Can use stalling or forwarding to resolve this
hazard
20Write after write (WAW) hazards
- With a WAW hazard, instruction j tries to write
an operand before instruction i writes it.
- The writes end up being performed in the wrong
order, leaving the value written by the earlier
instruction (i) in the destination instead of the
value written by j
- Graphically/Example
i: DIV F1, F2, F3
j: SUB F1, F4, F6
Instruction i is a write instruction issued
before j
Instruction j is a write instruction issued after
i
21Write after read (WAR) hazards
- With a WAR hazard, instruction j tries to write
an operand before instruction i reads it.
- Thus, instruction i would incorrectly receive the
newer value of its operand: instead of getting
the old value, it could receive some newer,
undesired value
- Graphically/Example
i: DIV F7, F1, F3
j: SUB F1, F4, F6
Instruction i is a read instruction issued before
j
Instruction j is a write instruction issued after
i
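The three definitions can be summarized as a check on the register names of i and j. This is a sketch only: it finds the dependences that could become hazards, and whether they actually do depends on the pipeline's timing. The tuple encoding `(dest, sources)` and the function name are illustrative.

```python
def classify_hazards(i, j):
    """i and j are (dest, sources) tuples; i is issued before j."""
    i_dest, i_srcs = i
    j_dest, j_srcs = j
    kinds = []
    if i_dest in j_srcs:
        kinds.append("RAW")   # j reads what i writes
    if i_dest == j_dest:
        kinds.append("WAW")   # both write the same register
    if j_dest in i_srcs:
        kinds.append("WAR")   # j writes what i reads
    return kinds

# The three examples from the slides above
print(classify_hazards(("R1", ("R2", "R3")), ("R4", ("R1", "R6"))))   # ['RAW']
print(classify_hazards(("F1", ("F2", "F3")), ("F1", ("F4", "F6"))))   # ['WAW']
print(classify_hazards(("F7", ("F1", "F3")), ("F1", ("F4", "F6"))))   # ['WAR']
```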
22Forwarding: It ain't all it's cracked up to be
The load instruction has a latency
that forwarding can't solve. The pipeline must
be stalled until the hazard is cleared (starting
with the instruction that wants to use the
data, until the source produces it).
[Pipeline diagram: LW R1, 0(R2); SUB R4, R1, R5;
AND R6, R1, R7; OR R8, R1, R9 in successive rows,
each passing through the IM, Reg, and DM stages;
horizontal axis: Time]
Thus, to get the data to the subtract instruction
we would need a time machine!
23The solution pictorially
[Pipeline diagram: LW R1, 0(R2); SUB R4, R1, R5;
AND R6, R1, R7; OR R8, R1, R9 in successive rows,
each passing through the IM, Reg, and DM stages;
horizontal axis: Time]
The insertion of a bubble causes the number of
cycles needed to complete this sequence to grow by 1
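A small timing sketch of why the load needs one bubble while an ALU producer needs none, assuming the 5-stage pipeline with full forwarding into the ALU inputs. The stage numbering and function names are illustrative assumptions.

```python
EX, MEM = 3, 4   # stage numbers in the 5-stage pipeline (IF=1 ... WB=5)

def stalls_needed(producer_is_load, distance):
    """Bubbles needed before a consumer issued `distance` slots after the
    producer, assuming full forwarding into the ALU inputs."""
    ready_after = MEM if producer_is_load else EX   # value exists at the end of this cycle
    needed_in = distance + EX                       # cycle in which the consumer enters EX
    return max(0, ready_after + 1 - needed_in)

def total_cycles(n_instructions, stalls, n_stages=5):
    """Cycles to drain a sequence: fill the pipe, then one per instruction."""
    return n_stages + (n_instructions - 1) + stalls

load_use = stalls_needed(producer_is_load=True, distance=1)
print(load_use)                                         # 1
print(stalls_needed(producer_is_load=False, distance=1))   # 0
print(total_cycles(4, 0), total_cycles(4, load_use))    # 8 9
```

For the four-instruction sequence on the slide (LW, SUB, AND, OR), the single bubble takes the total from 8 cycles to 9 under this model. The stall also pushes AND and OR back by one cycle, so they need no bubbles of their own.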
24Data hazards and the compiler
- The compiler should be able to help eliminate
some of the stalls caused by data hazards
- For example, the compiler could avoid generating a
LOAD instruction that is immediately followed by
an instruction that uses the LOAD's
destination register.
- This technique is called pipeline/instruction
scheduling
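To illustrate the payoff of scheduling, the sketch below counts load-use stalls for a hand-written sequence computing `a = b + c; d = e - f` and for a hand-reordered version of the same code. It is not a scheduler. The instruction encoding, register assignments, and function name are assumptions made for this example, and it presumes a pipeline with forwarding where only a load followed immediately by its consumer stalls.

```python
def count_load_use_stalls(program):
    """program: list of (opcode, dest, sources). Counts one stall for each
    load whose destination is read by the very next instruction."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "LW" and dest in curr[2]:
            stalls += 1
    return stalls

unscheduled = [
    ("LW",  "R1", ()),             # R1 = b
    ("LW",  "R2", ()),             # R2 = c
    ("ADD", "R3", ("R1", "R2")),   # uses R2 right after its load
    ("SW",  None, ("R3",)),        # a = R3
    ("LW",  "R4", ()),             # R4 = e
    ("LW",  "R5", ()),             # R5 = f
    ("SUB", "R6", ("R4", "R5")),   # uses R5 right after its load
    ("SW",  None, ("R6",)),        # d = R6
]

scheduled = [
    ("LW",  "R1", ()),
    ("LW",  "R2", ()),
    ("LW",  "R4", ()),             # independent load fills the slot after LW R2
    ("ADD", "R3", ("R1", "R2")),
    ("LW",  "R5", ()),
    ("SW",  None, ("R3",)),        # store fills the slot after LW R5
    ("SUB", "R6", ("R4", "R5")),
    ("SW",  None, ("R6",)),
]

print(count_load_use_stalls(unscheduled))   # 2
print(count_load_use_stalls(scheduled))     # 0
```

The reordering keeps every producer ahead of its consumers and assumes the variables a, e, and f do not alias, so moving the loads of e and f above the store to a is safe.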
25What about control logic?
- For the DLX integer pipeline, all data hazards
can be checked during the ID phase of the
pipeline
- If a data hazard exists, the instruction is
stalled before it is issued
- Whether or not forwarding will be needed can also
be determined at this stage, and control signals
are set accordingly
- If a hazard is detected, the control unit of the
pipeline must stall the pipeline and prevent
instructions in IF and ID from advancing
- All control information is carried along in the
pipeline registers, so only these fields must be
changed
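One way to picture the ID-stage check is as a comparison between fields of the IF/ID and ID/EX pipeline registers. The sketch below covers only the load interlock case. The register field names and the control signal names (`pc_write`, `if_id_write`, `insert_bubble`) are hypothetical, not taken from the DLX definition.

```python
from typing import NamedTuple, Optional, Tuple

class IFID(NamedTuple):
    """Hypothetical fields of the IF/ID pipeline register."""
    sources: Tuple[str, ...]   # registers read by the instruction in ID

class IDEX(NamedTuple):
    """Hypothetical fields of the ID/EX pipeline register."""
    is_load: bool
    dest: Optional[str]        # register written by the instruction in EX

def must_stall(if_id, id_ex):
    """Load interlock: the instruction in ID reads a register that the
    load currently in EX has not yet fetched from memory."""
    return id_ex.is_load and id_ex.dest in if_id.sources

def control_signals(if_id, id_ex):
    stall = must_stall(if_id, id_ex)
    return {
        "pc_write": not stall,      # hold the PC so IF does not advance
        "if_id_write": not stall,   # hold IF/ID so ID does not advance
        "insert_bubble": stall,     # clear the control fields written to ID/EX
    }

# LW R1, 0(R2) is in EX while SUB R4, R1, R5 is in ID
print(control_signals(IFID(("R1", "R5")), IDEX(True, "R1")))
# ADD R1, R2, R3 in EX: forwarding covers it, so no stall
print(control_signals(IFID(("R1", "R5")), IDEX(False, "R1")))
```

Only the control fields are overwritten when a bubble is inserted, which is why the slide notes that only those fields in the pipeline registers need to change.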
26Some example situations