ILP: Software Approaches - PowerPoint PPT Presentation

Provided by: Aka143

1
ILP Software Approaches
  • Based on the slides of Vincent H. Berk

2
HW Support for More ILP
  • Avoid branch prediction by turning branches into
    conditionally executed instructions
  • if (X) then A = B op C else NOP
  • If false, then neither store result nor cause
    exception
  • Expanded ISAs of Alpha, MIPS, PowerPC, and SPARC have
    conditional move; PA-RISC can annul any
    following instruction
  • IA-64: 64 1-bit condition fields are selected, so
    any instruction can be executed conditionally
  • Drawbacks to conditional instructions:
  • Still takes a clock even if annulled
  • Stall if condition evaluated late
  • Complex conditions reduce effectiveness because the
    condition becomes known late in the pipeline
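
The guarded form "if (X) then A = B op C else NOP" can be modelled in a few lines of Python. This is only a sketch and does not describe any real ISA: the helper name predicated is hypothetical, and a Python exception stands in for a hardware exception.

```python
import operator


def predicated(x, op, b, c, old_a):
    """Model of `if (X) then A = B op C else NOP`; returns the new value of A.

    Hypothetical helper. When the predicate x is false the instruction is
    annulled: A keeps its old value and an exception raised by the
    operation is suppressed.
    """
    try:
        result = op(b, c)          # the operation still occupies a slot
    except ArithmeticError:
        if x:
            raise                  # predicate true: the exception is real
        return old_a               # predicate false: no exception
    return result if x else old_a  # predicate false: result is not stored
```

For example, predicated(False, operator.floordiv, 1, 0, 7) leaves A at 7 instead of raising a division error, while the same call with a true predicate raises it.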

3
Software Pipelining
  • Observation: if the iterations of a loop are
    independent, more ILP can be obtained by taking
    instructions from different iterations
  • Software pipelining reorganizes loops so that
    each iteration is made from instructions chosen
    from different iterations of the original loop

4
SW Pipelining Example
Unrolled 3 times                Software pipelined
 1  LD   F0, 0(R1)                  LD   F0, 0(R1)
 2  ADDD F4, F0, F2                 ADDD F4, F0, F2
 3  SD   0(R1), F4                  LD   F0, -8(R1)
 4  LD   F6, -8(R1)              1  SD   0(R1), F4       Stores M[i]
 5  ADDD F8, F6, F2              2  ADDD F4, F0, F2      Adds to M[i-1]
 6  SD   -8(R1), F8              3  LD   F0, -16(R1)     Loads M[i-2]
 7  LD   F10, -16(R1)            4  SUBI R1, R1, 8
 8  ADDD F12, F10, F2            5  BNEZ R1, LOOP
 9  SD   -16(R1), F12               SD   0(R1), F4
10  SUBI R1, R1, 24                 ADDD F4, F0, F2
11  BNEZ R1, LOOP                   SD   -8(R1), F4

Pipeline timing inside the software-pipelined loop:
SD    IF  ID  EX  Mem  WB              Reads F4
ADDD      IF  ID  EX  Mem  WB          Reads F0, writes F4
LD            IF  ID  EX  Mem  WB      Writes F0
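
The transformation above can be checked with a small Python model. This is a sketch under assumptions: the function names are hypothetical, the array is walked with an ascending index for readability (the slide code walks R1 downward), and the pipelined version assumes at least two elements.

```python
def add_scalar_plain(m, s):
    """Reference loop: load, add, store for each element in turn."""
    out = list(m)
    for i in range(len(out)):
        f0 = out[i]        # LD
        f4 = f0 + s        # ADDD
        out[i] = f4        # SD
    return out


def add_scalar_pipelined(m, s):
    """Software-pipelined version; assumes len(m) >= 2."""
    out = list(m)
    n = len(out)
    # Start-up code: fill the software pipeline.
    f0 = out[0]            # LD   for element 0
    f4 = f0 + s            # ADDD for element 0
    f0 = out[1]            # LD   for element 1
    # Kernel: each pass mixes three different original iterations.
    for i in range(n - 2):
        out[i] = f4        # SD   for element i
        f4 = f0 + s        # ADDD for element i + 1
        f0 = out[i + 2]    # LD   for element i + 2
    # Wind-down code: drain the software pipeline.
    out[n - 2] = f4        # SD   for element n - 2
    f4 = f0 + s            # ADDD for element n - 1
    out[n - 1] = f4        # SD   for element n - 1
    return out
```

Both functions return the same list for the same input; only the order in which loads, adds, and stores from different iterations are issued differs.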
5
SW Pipelining Example
  • Symbolic Loop Unrolling
  • Smaller code space
  • Overhead paid only once vs. each iteration in
    loop unrolling
  • 100 iterations = 25 loops with 4 unrolled
    iterations each
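
The arithmetic in the last bullet can be made concrete with a loop unrolled by 4. This is a minimal sketch: add_scalar_unrolled4 is a hypothetical name, and the length of the input is assumed to be a multiple of 4.

```python
def add_scalar_unrolled4(m, s):
    """Add s to every element, four elements per loop trip.

    Assumes len(m) is a multiple of 4. Returns the new list and the
    number of loop trips, i.e. how often the loop overhead is paid.
    """
    out = list(m)
    trips = 0
    for i in range(0, len(out), 4):
        out[i] += s
        out[i + 1] += s
        out[i + 2] += s
        out[i + 3] += s
        trips += 1
    return out, trips
```

With 100 elements the function reports 25 trips, so the decrement and branch are executed 25 times rather than 100.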

[Figure: number of overlapped operations over time for (a) software pipelining and (b) loop unrolling]
6
Trace Scheduling
  • Focus on critical path (trace selection)
  • Compiler has to decide what the critical path
    (the trace) is
  • Most likely basic blocks are put in the trace
  • Loops are unrolled in the trace
  • Now speed it up (trace compaction)
  • Focus on limiting instruction count
  • Branches are seen as jumps into or out of the
    trace
  • Problem
  • Significant overhead for parts that are not in
    the trace
  • Unclear if it is feasible in practice
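
Trace selection can be sketched as a greedy walk that always follows the most likely successor. This is a simplification, not the scheduling algorithm of any particular compiler: select_trace is a hypothetical name, and the control-flow graph and its branch probabilities are made up for illustration.

```python
def select_trace(cfg, entry):
    """Greedy trace selection: follow the most likely successor.

    cfg maps a block name to a list of (successor, probability) pairs.
    The walk stops at a block without successors or when the next block
    is already in the trace (a loop back edge).
    """
    trace = [entry]
    block = entry
    while cfg.get(block):
        nxt, _ = max(cfg[block], key=lambda edge: edge[1])
        if nxt in trace:
            break
        trace.append(nxt)
        block = nxt
    return trace


# Hypothetical control-flow graph with invented branch probabilities.
example_cfg = {
    "A": [("B", 0.9), ("C", 0.1)],
    "B": [("D", 1.0)],
    "C": [("D", 1.0)],
    "D": [("E", 0.8), ("F", 0.2)],
    "E": [],
    "F": [],
}
```

Here select_trace(example_cfg, "A") picks A, B, D, E; blocks C and F stay outside the trace and need compensation code wherever scheduled instructions have moved across their entry or exit points.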

7
Superblocks
  • Similar to Trace Scheduling but
  • Single entrance, multiple exits
  • Tail duplication
  • Handle cases that exited the superblock
  • Residual loop handling
  • Could in itself be a superblock
  • Problem
  • Code size
  • Worth the hassle?
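
Tail duplication can be sketched on a control-flow graph stored as a successor map. This is an illustrative model only: tail_duplicate is a hypothetical name, the graph is assumed to be acyclic with every block present as a key, and copies are named by appending a prime to the block name.

```python
def tail_duplicate(succ, trace):
    """Remove side entrances into `trace` by duplicating its tail.

    succ maps a block name to a list of successor names. Returns a new
    successor map in which off-trace blocks enter copies of the tail,
    so the trace itself is entered only at trace[0].
    """
    in_trace = set(trace)
    # First non-head trace block that is entered from an off-trace block.
    first = next(
        (i for i in range(1, len(trace))
         if any(trace[i] in succ[p] for p in succ if p not in in_trace)),
        None,
    )
    if first is None:
        return {b: list(s) for b, s in succ.items()}  # already a superblock
    tail = trace[first:]
    copy = {b: b + "'" for b in tail}
    new_succ = {}
    for b, targets in succ.items():
        if b in in_trace:
            new_succ[b] = list(targets)                      # trace unchanged
        else:
            new_succ[b] = [copy.get(t, t) for t in targets]  # redirect entry
    for b in tail:
        new_succ[copy[b]] = [copy.get(t, t) for t in succ[b]]  # duplicated tail
    return new_succ


# Hypothetical graph: trace A, B, D, E with a side entrance from C into D.
example_succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"],
                "D": ["E", "F"], "E": [], "F": []}
superblock_succ = tail_duplicate(example_succ, ["A", "B", "D", "E"])
```

After the call, C jumps to the copy D' instead of D, which shows where the code-size cost comes from: every block in the duplicated tail exists twice.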

9
Conditional instructions
  • Instruction that is executed depending on one of
    its arguments
  • BNEZ R1, L
    ADDU R2, R3, R0
    L:
  • vs.
  • CMOVZ R2, R3, R1
  • Instruction is executed but results are not
    always written.
  • Should only be used for very small sequences,
    else use normal branch
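
The two sequences above compute the same value of R2, which a short Python model can confirm. The function names are hypothetical, and registers are modelled as plain integers.

```python
def with_branch(r1, r2, r3):
    """BNEZ R1, L ; ADDU R2, R3, R0 ; L:"""
    if r1 != 0:        # BNEZ R1, L: skip the move when R1 is non-zero
        return r2
    r2 = r3 + 0        # ADDU R2, R3, R0 (R0 always reads as zero)
    return r2


def with_cmovz(r1, r2, r3):
    """CMOVZ R2, R3, R1: R2 receives R3 only if R1 is zero."""
    candidate = r3                       # the instruction always executes
    return candidate if r1 == 0 else r2  # the result is written only if R1 == 0
```

The conditional move removes the branch, so there is nothing to predict; the price is that the instruction occupies an issue slot even when its result is discarded.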

10
Speculation
  • Compiler moves instructions before branch if
  • Data flow is not affected (optionally with use of
    renaming)
  • Preserve exception behavior
  • Avoid load/store address conflicts (no renaming
    for memory loc.)
  • Preserving exception behavior
  • Mechanism to indicate an instruction is
    speculative
  • Poison bit: raise the exception when the value is used
  • Using Conditional instructions
  • Requires In-Order instruction commit
  • Register renaming
  • Writeback at commit
  • Forwarding
  • Raise exceptions at commit

11
Speculation
  • if (A == 0) A = B; else A = A + 4;
  • Original code:
          LD    R1, 0(R3)      ; load A
          BNEZ  R1, L1         ; test A
          LD    R1, 0(R2)      ; then
          J     L2             ; skip else
      L1: DADDI R1, R1, 4      ; else
      L2: SD    R1, 0(R3)      ; store A
  • With a speculative load:
          LD    R1, 0(R3)      ; load A
          LD    R14, 0(R2)     ; load B (speculative)
          BEQZ  R1, L3         ; branch if A == 0
          DADDI R14, R1, 4     ; else
      L3: SD    R14, 0(R3)     ; store A
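
A Python model of the two code sequences shows that they leave memory in the same state. The function names are hypothetical, and a dict keyed by variable name stands in for memory.

```python
def original(mem, a, b):
    """Branching version of: if (A == 0) A = B; else A = A + 4;"""
    r1 = mem[a]            # LD    R1, 0(R3)     load A
    if r1 == 0:            # BNEZ  R1, L1        test A
        r1 = mem[b]        # LD    R1, 0(R2)     then
    else:
        r1 = r1 + 4        # DADDI R1, R1, 4     else
    mem[a] = r1            # SD    R1, 0(R3)     store A


def speculated(mem, a, b):
    """Version with the load of B hoisted above the branch."""
    r1 = mem[a]            # LD    R1, 0(R3)     load A
    r14 = mem[b]           # LD    R14, 0(R2)    load B (speculative)
    if r1 != 0:            # BEQZ  R1, L3        skip the else part if A == 0
        r14 = r1 + 4       # DADDI R14, R1, 4    else
    mem[a] = r14           # SD    R14, 0(R3)    store A
```

The speculative version always reads B, even when A is non-zero, so a fault on that load must not become visible unless the value is actually needed.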
