Title: Balanced Scheduling
1. Advanced Compilers, CMPSCI 710, Spring 2003
Balanced Scheduling
- Emery Berger
- University of Massachusetts, Amherst
2. Topics
- Last time
- Instruction scheduling
- Gibbons & Muchnick
- This time
- Balanced scheduling
- Kerns & Eggers
3. List Scheduling, Redux
- Build dependence dag
- Choose instructions from ready list
- Schedule using heuristics (Gibbons & Muchnick):
- Instruction with greatest latency
- Instruction with most successors
- Instruction on critical path
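A minimal sketch of such a list scheduler in Python; the Instr representation, default latencies, and field names are illustrative assumptions, not the Gibbons & Muchnick implementation:

    from dataclasses import dataclass, field

    @dataclass
    class Instr:
        name: str
        latency: int = 1                            # assumed execution latency
        succs: list = field(default_factory=list)   # instructions that depend on this one
        preds: int = 0                              # number of unscheduled predecessors

    def critical_path(instrs):
        """Longest-latency path from each instruction to a leaf of the dependence dag."""
        cp = {}
        def walk(i):
            if i.name not in cp:
                cp[i.name] = i.latency + max((walk(s) for s in i.succs), default=0)
            return cp[i.name]
        for i in instrs:
            walk(i)
        return cp

    def list_schedule(instrs):
        cp = critical_path(instrs)
        ready = [i for i in instrs if i.preds == 0]   # roots of the dag
        schedule = []
        while ready:
            # Heuristic priority: critical path, then latency, then successor count.
            ready.sort(key=lambda i: (cp[i.name], i.latency, len(i.succs)), reverse=True)
            i = ready.pop(0)
            schedule.append(i.name)
            for s in i.succs:                         # release newly ready successors
                s.preds -= 1
                if s.preds == 0:
                    ready.append(s)
        return schedule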
4. Fly in the Ointment
- When scheduling loads, assume a hit in the primary cache
- On older architectures, this makes sense
- Stall execution on cache miss
- But newer architectures are non-blocking
- Processor executes other instructions while a load is in progress
- Good: creates more ILP, but...
5. Scheduling Options
- Now what?
- Assume cache miss takes N cycles
- N typically 10 or more
- Do we schedule the load
- Anticipating a 1-cycle delay (a hit)?
- optimistic
- Or an N-cycle delay (a miss)?
- pessimistic
6. Optimistic vs. Pessimistic
Optimistic:  L0 X2 X1 X3 X4
Pessimistic: L0 X2 X3 X1 X4
- Optimistic: fine for hits, inferior for misses
- Pessimistic: fine for hits, better for misses
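To make the comparison concrete, a small cycle counter under an assumed machine model: single issue, one instruction per cycle, and an instruction that uses a load's result stalls until the load completes. Treating X1 as the consumer of L0 and a miss as 3 cycles are assumptions purely for illustration:

    def cycles(schedule, uses, latency):
        """Issue cycles for a schedule; names starting with 'L' are loads."""
        ready_at = {}                              # cycle when each load's result is available
        clock = 0
        for instr in schedule:
            clock += 1                             # next issue slot
            if instr in uses:                      # stall until the consumed load is done
                clock = max(clock, ready_at[uses[instr]])
            if instr.startswith('L'):
                ready_at[instr] = clock + latency
        return clock

    uses = {'X1': 'L0'}                            # assumed dependence: X1 consumes L0's result
    for latency in (1, 3):                         # 1 = hit, 3 = an assumed miss
        opt = cycles(['L0', 'X2', 'X1', 'X3', 'X4'], uses, latency)
        pes = cycles(['L0', 'X2', 'X3', 'X1', 'X4'], uses, latency)
        print(latency, opt, pes)                   # hit: 5 vs 5; miss: 6 vs 5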
7. Optimistic vs. Pessimistic, Multiple Loads
Optimistic:  L1 X1 L2 X2 X3
Pessimistic: L1 X1 X2 L2 X3
- Optimistic: better for hits, same for misses
- Pessimistic: worse for hits, same for misses
8. Balanced Scheduling
- Key insights
- No fixed estimate of memory latency is best
- Schedule based on available parallelism in the code
- Load-level parallelism
- Balanced scheduling
- Computes each weight separately
- Takes other possible instructions into account
- Space out loads, using available instructions as filler
9. Balanced Scheduling, Example
Balanced: L0 X2 X3 X1 X4
- Maximizes distance between L0 and X1
- Good in case of miss
10. Balanced Scheduling, Example
- W = load instruction weight
- W = 5: over-estimate
- Greedy schedule
- W = 1: under-estimate
- Lazy schedule
- Balanced scheduler
- W = 3 (load-level parallelism)
11. Balanced Scheduling, Results
- Always achieves fewest interlocks
12. Algorithm Idea
- Examine each instruction i in dag
- Determine which loads can run in parallel with i
- Use all (or part) of i's execution time to cover the latency of loads
13. Balanced Scheduling, Weight Calculation
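A minimal Python sketch of this weight calculation, following the description on the surrounding slides; the dag encoding (a depends_on predicate and a can_run_in_parallel test over instruction names) and the base weight of 1 per load are assumptions, and the definitive formulation is in Kerns & Eggers' paper:

    def connected_components(nodes, depends_on):
        """Group loads that are linked (in either direction) by dependences."""
        comps, remaining = [], set(nodes)
        while remaining:
            frontier = {remaining.pop()}
            comp = set()
            while frontier:
                n = frontier.pop()
                comp.add(n)
                linked = {m for m in remaining if depends_on(n, m) or depends_on(m, n)}
                remaining -= linked
                frontier |= linked
            comps.append(comp)
        return comps

    def longest_load_path(comp, depends_on):
        """Maximum number of loads on any dependence chain within a component."""
        def depth(n, seen=()):
            succs = [m for m in comp if depends_on(m, n) and m not in seen]
            return 1 + max((depth(m, seen + (n,)) for m in succs), default=0)
        return max(depth(n) for n in comp)

    def balanced_weights(instrs, loads, can_run_in_parallel, depends_on):
        """For each instruction i, split i's issue slot across the loads it can hide."""
        weight = {l: 1.0 for l in loads}                  # assumed base weight of one cycle
        for i in instrs:
            avail = [l for l in loads if l != i and can_run_in_parallel(i, l)]
            for comp in connected_components(avail, depends_on):
                m = longest_load_path(comp, depends_on)   # loads on the longest load path
                for l in comp:
                    weight[l] += 1.0 / m                  # add 1/(# of loads) per the slides
        return weight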
14. Balanced Scheduling, Example
- Locate longest load paths in connected components
- Add 1/(# of loads) to loads' weights
15. Balanced Scheduling, Example II
- Consider instruction X1
- Locate longest load paths in connected components
- Add 1/(# of loads) to loads' weights
- These additions are the contributions of X1
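Applying the sketch above to a tiny made-up case shows how a single instruction's contribution depends on whether the available loads form a chain (all names here are hypothetical):

    # Two loads that can both run in parallel with X1.
    loads = ['L1', 'L2']
    always = lambda i, l: True

    # Independent loads: two components, so X1 contributes 1 to each weight.
    indep = lambda a, b: False
    print(balanced_weights(['X1'], loads, always, indep))    # {'L1': 2.0, 'L2': 2.0}

    # L2 depends on L1: one component whose longest load path has 2 loads,
    # so X1's single issue slot is split, contributing 1/2 to each weight.
    chain = lambda a, b: (a, b) == ('L2', 'L1')
    print(balanced_weights(['X1'], loads, always, chain))    # {'L1': 1.5, 'L2': 1.5}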
16. Balanced Scheduling, All Weights
17. Balanced Scheduling Algorithm
- After computing weights, perform list scheduling where:
- Priority = weight plus max priority of successors
- Break ties:
- Largest delta between consumed and defined registers
- Rank based on successors in dag that would be exposed
- Select instruction generated earliest
- Bottom-up scheduler
- Reverse-order, schedule from leaves toward roots
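A sketch of the priority computation described here, with the same assumed representation as before; the three tie-breakers are noted in a comment rather than implemented:

    def priorities(instrs, weight, succs):
        """priority(i) = weight(i) + max priority among i's successors in the dag.

        weight: balanced weights for loads, an assumed weight of 1 for other
        instructions; succs maps each instruction to its dependents (empty for leaves).
        """
        prio = {}
        def visit(i):
            if i not in prio:
                prio[i] = weight[i] + max((visit(s) for s in succs[i]), default=0)
            return prio[i]
        for i in instrs:
            visit(i)
        return prio

    # The bottom-up list scheduler would then pick the highest-priority ready
    # instruction, breaking ties by register delta, exposed successors, and
    # original instruction order, per this slide.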
18. Balanced Scheduling, Example I
Balanced: L0 X2 X3 X1 X4
19. Balanced Scheduling, Example II
20. Limitations
- Performed after register allocation
- But introduces false dependences
- Reuse of registers ⇒ dag has extra edges
- Can be fixed with software register renaming
- Had to modify gcc's RTL
- Approach required manual pipelining
- Profile-based feedback
- Benchmarks based on FORTRAN, converted to C with f2c
- Can't disambiguate memory
- Adds many edges to dag
21. Workaround: Simulate Fortran
- Modify code to avoid aliases
- Improves results, but incorrect!
- Needs advanced alias analysis
22. Empirical Results
- Evaluated using simulation
- 3% to 18% improvement over regular scheduler across different models
- Mean: 9.9%
- Unfortunately
- No results presented without the above-mentioned modifications
23. Conclusion
- Balanced scheduling
- Spreads out instructions to cover load latency
- Based on exploitable load-level parallelism
- Effective at improving performance
- Modulo methodological limitations
- Not so great for C/C++, possibly useful for Java
- Next time: interprocedural analysis
- ACDI Ch. 19, pp. 607-636, 641-656