Title: Low Cost, High Performance, and Scalability: A New Approach to User-Level Distributed Shared Memory
2 - GUARANTEED!
- Low Cost, High Performance, and Scalability
- A New Approach to User-Level Distributed Shared Memory
OR YOUR MONEY BACK!!!
Patrick Anthony La Fratta, WORTS 2005, 15 December 2005
3, 4 - Programming Models: Message-Passing
5 - Programming Models: Shared Memory
6 - Implementing a DSM System at the User Level
7 - Implementing the DSM Client
Initialization, Step 1: Get the size of the shared memory segment.
8 - Implementing the DSM Client
Initialization, Step 2: Map n pages into local memory.
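Steps 1 and 2 can be sketched as follows, assuming a POSIX system. The helper names `dsm_page_count` and `dsm_map_pages` are hypothetical, and the segment size is passed in as a parameter here, whereas the slides do not say where the client obtains it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper for step 1: round the segment size up to whole pages. */
size_t dsm_page_count(size_t segment_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (segment_bytes + page - 1) / page;
}

/* Hypothetical helper for step 2: map n_pages of anonymous memory into the
 * local address space. Returns NULL if the mapping fails. */
void *dsm_map_pages(size_t n_pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *base = mmap(NULL, n_pages * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}
```

Using `mmap` rather than `malloc` gives page-aligned memory, which the later protection changes require.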
9 - Implementing the DSM Client
Initialization, Step 3: Take away all access privileges from the shared segments.
10 - Implementing the DSM Client
Initialization, Step 4: Set up the segmentation fault handler.
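A minimal sketch of steps 3 and 4, assuming POSIX `mprotect` and `sigaction`. All names are hypothetical, and the handler is only a placeholder: it counts the fault and re-enables access to the faulting page, where a real client would run the DSM protocol. Calling `mprotect` from a signal handler works on common systems but is not guaranteed async-signal-safe by POSIX.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Bookkeeping for this sketch; all names are hypothetical. */
static char *dsm_base;                    /* start of the shared segment */
static size_t dsm_len;                    /* its length in bytes */
static size_t dsm_page_size;              /* cached so the handler avoids sysconf */
static volatile sig_atomic_t dsm_faults;  /* faults seen so far */

/* Placeholder handler: re-enable access to the faulting page only. */
static void dsm_on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;
    (void)sig;
    (void)ctx;
    if (addr < dsm_base || addr >= dsm_base + dsm_len)
        _exit(1);  /* a genuine crash, not an access to the shared segment */
    dsm_faults++;
    char *start = (char *)((uintptr_t)addr & ~(uintptr_t)(dsm_page_size - 1));
    mprotect(start, dsm_page_size, PROT_READ | PROT_WRITE);
}

/* Step 3: revoke all access. Step 4: install the SIGSEGV handler.
 * Returns 0 on success, -1 on error. */
int dsm_protect_and_trap(void *base, size_t len)
{
    struct sigaction sa;
    dsm_base = base;
    dsm_len = len;
    dsm_page_size = (size_t)sysconf(_SC_PAGESIZE);
    if (mprotect(base, len, PROT_NONE) != 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = dsm_on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;  /* needed to receive the faulting address */
    return sigaction(SIGSEGV, &sa, NULL);
}
```

After this call, the first touch of each page in the segment traps into the handler, and later touches of the same page proceed at full speed.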
11 - Implementing the DSM System
Application Reads Shared Address: Preview
12 - Implementing the DSM System
Shared address read, Step 1: Application reads shared address.
13 - Implementing the DSM System
Shared address read, Step 2: Control is transferred to the seg-fault handler.
14 - Implementing the DSM System
Shared address read, Step 3: Client contacts the server to get the page's data.
15 - Implementing the DSM System
Shared address read, Step 4: Client grants read access privileges to the application.
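The read path above can be sketched on a POSIX system as follows. This is an illustration under stated assumptions, not the presented implementation: the server is replaced by a local buffer (`dsm_server_copy`), the fetch function `dsm_fetch_page` stands in for a network request, and all names are hypothetical. The handler covers reads only; telling reads from writes is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

enum { DSM_PAGES = 2 };

static char *dsm_base;         /* client's local mapping, initially PROT_NONE */
static char *dsm_server_copy;  /* stand-in for the server's copy of the segment */
static size_t dsm_page;
static volatile sig_atomic_t dsm_read_faults;

/* Stand-in for step 3: a real client would request the page over the network. */
static void dsm_fetch_page(size_t index, char *dst)
{
    memcpy(dst, dsm_server_copy + index * dsm_page, dsm_page);
}

/* Step 2: the kernel transfers control here when the application touches
 * a protected page. */
static void dsm_read_fault(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;
    (void)sig;
    (void)ctx;
    if (addr < dsm_base || addr >= dsm_base + DSM_PAGES * dsm_page)
        _exit(1);  /* not a shared address */
    size_t index = (size_t)(addr - dsm_base) / dsm_page;
    char *start = dsm_base + index * dsm_page;
    dsm_read_faults++;
    mprotect(start, dsm_page, PROT_READ | PROT_WRITE);  /* writable while filling */
    dsm_fetch_page(index, start);                       /* step 3 */
    mprotect(start, dsm_page, PROT_READ);               /* step 4: read-only */
}

/* Set up both buffers and the handler. Returns 0 on success, -1 on error. */
int dsm_read_init(void)
{
    struct sigaction sa;
    dsm_page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = DSM_PAGES * dsm_page;
    dsm_server_copy = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    dsm_base = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dsm_server_copy == (char *)MAP_FAILED || dsm_base == (char *)MAP_FAILED)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = dsm_read_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    return sigaction(SIGSEGV, &sa, NULL);
}
```

When the handler returns, the faulting load is retried and now succeeds, so the application never sees the fault.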
16 - Implementing the DSM System
Application Writes Shared Address: Preview
17 - Implementing the DSM System
Shared address write, Step 1: Application writes shared address.
18 - Implementing the DSM System
Shared address write, Step 2: Control is transferred to the seg-fault handler.
19 - Implementing the DSM System
Shared address write, Step 3: Client contacts the server with a write notification.
20 - Implementing the DSM System
Shared address write, Step 4: Server calls back all other copies of the pages being written.
21 - Implementing the DSM System
Shared address write, Step 5: Server tells the client to proceed.
22 - Implementing the DSM System
Shared address write, Step 6: Client grants write privileges to the application.
23 - Implementing the DSM System
Shared address write, Step 7: Later, the app detaches the pages so others may use them.
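The server's side of the write steps can be modeled with a small per-page directory. The sketch below is an assumption about how such bookkeeping might look, not the presented server: the type `dsm_page_entry`, the function names, the 32-client limit, and the choice to refuse requests while another client holds the page for writing are all hypothetical. Network messages are left out.

```c
#include <assert.h>
#include <stdint.h>

enum { DSM_MAX_CLIENTS = 32, DSM_NO_WRITER = -1 };

/* Hypothetical directory entry for one shared page. */
typedef struct {
    uint32_t copyset;  /* bit i set: client i holds a copy */
    int writer;        /* client with write access, or DSM_NO_WRITER */
} dsm_page_entry;

void dsm_entry_init(dsm_page_entry *e)
{
    e->copyset = 0;
    e->writer = DSM_NO_WRITER;
}

/* Read request: returns 0, or -1 if another client is writing the page. */
int dsm_server_read(dsm_page_entry *e, int client)
{
    assert(client >= 0 && client < DSM_MAX_CLIENTS);
    if (e->writer != DSM_NO_WRITER && e->writer != client)
        return -1;
    e->copyset |= UINT32_C(1) << client;
    return 0;
}

/* Steps 3 to 5: on a write notification, call back every other copy and
 * let the client proceed. Returns the number of copies called back, or -1
 * if another client is writing the page. */
int dsm_server_write(dsm_page_entry *e, int client)
{
    assert(client >= 0 && client < DSM_MAX_CLIENTS);
    if (e->writer != DSM_NO_WRITER && e->writer != client)
        return -1;
    uint32_t self = UINT32_C(1) << client;
    uint32_t others = e->copyset & ~self;
    int recalled = 0;
    for (int i = 0; i < DSM_MAX_CLIENTS; i++)
        if (others & (UINT32_C(1) << i))
            recalled++;  /* a real server would message client i here */
    e->copyset = self;
    e->writer = client;
    return recalled;
}

/* Step 7: the client detaches the page so others may use it. */
void dsm_server_detach(dsm_page_entry *e, int client)
{
    assert(client >= 0 && client < DSM_MAX_CLIENTS);
    if (e->writer == client)
        e->writer = DSM_NO_WRITER;
    e->copyset &= ~(UINT32_C(1) << client);
}
```

For example, if clients 0, 1, and 2 hold read copies and client 1 then writes, `dsm_server_write` returns 2 (the copies at clients 0 and 2 are called back) and the page stays unavailable to the others until client 1 detaches.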
24 - Preliminary Results: All Pairs Shortest Paths
Note: Results matched for all test cases, and all runs completed successfully.
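For reference, a serial all-pairs shortest paths kernel might look like the sketch below. The slides do not say which algorithm the benchmark used, so Floyd-Warshall is an assumption here, as are the names `apsp_floyd_warshall`, `APSP_N`, and `APSP_INF`.

```c
#include <assert.h>
#include <limits.h>

/* APSP_INF marks "no path"; half of INT_MAX so that INF + INF cannot overflow. */
enum { APSP_N = 4, APSP_INF = INT_MAX / 2 };

/* Floyd-Warshall: on entry dist[i][j] is the edge weight from i to j
 * (APSP_INF if there is no edge, 0 on the diagonal); on return it is the
 * length of the shortest path from i to j. */
void apsp_floyd_warshall(int dist[APSP_N][APSP_N])
{
    for (int k = 0; k < APSP_N; k++)
        for (int i = 0; i < APSP_N; i++)
            for (int j = 0; j < APSP_N; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}
```

In a shared-memory version, the distance matrix would live in the shared segment and each client would update a subset of the rows for every value of `k`, with a synchronization point between iterations.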
25, 26 - System Profiling: All Pairs Shortest Paths
27 - System Modifications and Extensions
System profiles resulted in:
- Better understanding of the trade-offs in the design of the interface.
- Efficient synchronization primitives through extended memory semantics with full/empty bits.
- Server-side per-page locking and client-side full-page flushing.
- Speedups > 1!
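The idea behind full/empty bits can be illustrated with a single word that carries a state tag: a write succeeds only when the word is empty and leaves it full, and a read succeeds only when the word is full and leaves it empty. The sketch below is an in-process illustration using C11 atomics with non-blocking "try" operations. How the presented system implements these semantics across the DSM is not stated in the slides, and the names `fe_word`, `fe_try_write_ef`, and `fe_try_read_fe` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* FE_BUSY marks a word whose value is being transferred. */
enum { FE_EMPTY, FE_FULL, FE_BUSY };

typedef struct {
    atomic_int state;  /* FE_EMPTY, FE_FULL, or FE_BUSY */
    int value;
} fe_word;

void fe_init(fe_word *w)
{
    atomic_init(&w->state, FE_EMPTY);
    w->value = 0;
}

/* Write if empty, then mark full. Returns false if the word was not empty. */
bool fe_try_write_ef(fe_word *w, int v)
{
    int expected = FE_EMPTY;
    if (!atomic_compare_exchange_strong(&w->state, &expected, FE_BUSY))
        return false;
    w->value = v;
    atomic_store(&w->state, FE_FULL);
    return true;
}

/* Read if full, then mark empty. Returns false if the word was not full. */
bool fe_try_read_fe(fe_word *w, int *out)
{
    int expected = FE_FULL;
    if (!atomic_compare_exchange_strong(&w->state, &expected, FE_BUSY))
        return false;
    *out = w->value;
    atomic_store(&w->state, FE_EMPTY);
    return true;
}
```

A producer and a consumer that retry these calls until they succeed get a one-slot handoff without a separate lock, which is the kind of primitive the bullet above refers to.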
28 - Performance Results: Speedup for Various Configurations
29 - Performance Results: Trends
30
- Scalability: Enable clients to use more than one server.
- Peer-to-peer: Merge the server and client modules.
- Fault-tolerance: Checkpoint and migration?
- Further testing: Implement and evaluate the performance of other parallel applications.
Questions?