
1
SEMPLAR: High-Performance Remote Parallel I/O
over SRB

Nawab Ali and Mario Lauria
Department of Computer Science and Engineering
The Ohio State University
Columbus, OH 43210
{alin, lauria}@cse.ohio-state.edu

2
Presentation Outline

Introduction
Remote I/O
Storage Resource Broker
SEMPLAR
Design
Implementation
Asynchronous I/O over WANs
Experimental Setup
Results
Conclusions

3
Introduction

Application Trends
Big Science projects increasingly require
processing of large data sets
Sloan Digital Sky Survey, Large Hadron Collider,
National Earthquake Engineering Simulation Grid,
NIH GenBank.
Large data sets stored in repositories at
specialized facilities (supercomputer centers)
San Diego Supercomputer Center (SDSC)
Technological Trends
Bandwidth of WAN and Internet backbones is growing
at a rate that makes it comparable to local
interconnect speed
TeraGrid, LambdaRail: 40 Gb/s
InfiniBand: 10 Gb/s

4
Problem Definition

Data Storage
How do we store the large amounts of data
generated by HPC applications?
Data Retrieval
How do we effectively retrieve the data for local
processing?
Research Focus
High-Performance I/O over WANs
How do we reduce the performance penalty of
remote data access?

5
Remote I/O Constraints

I/O Bandwidth (CCGRID 2005)
I/O Latency (HPDC 2006)

6
Motivation

Common approach for retrieving large data sets
Staging
FTP, GridFTP, Wget
Problems with staging
Overlapping of data transfer and computation not
possible
Dynamic data sets require frequent refreshes of
the local copy

8
Storage Resource Broker

SRB was developed at SDSC to provide access to
massive volumes of data in a production
environment
It provides transparent access to heterogeneous
storage resources
Filesystems
Database Systems
Archival Storage Systems
Other services offered
Authentication, location transparency

9
SRB Architecture

SRB Servers
Control distinct set of physical resources
Metadata Catalog Service
Stores file metadata
Access Control
File location
SRB Clients
Connect to the SRB servers using client API
C high-level API
C low-level API

11
SEMPLAR: SRB Enabled MPI I/O Library for Access
to Remote Storage

I/O over the Internet
Storage Virtualization
SRB
High I/O bandwidth
Multiple TCP Streams
Multiple I/O nodes
Standard Application Interface
MPI I/O
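
The idea of spreading one transfer over multiple TCP streams can be sketched as a simple range split. This is only an illustration: the slides do not say how SEMPLAR assigns data to streams, so the contiguous, near-equal partitioning and the name `split_range` are assumptions.

```python
def split_range(total_bytes: int, num_streams: int) -> list:
    """Split [0, total_bytes) into contiguous (offset, length) pairs, one per stream.

    Hypothetical helper: near-equal shares, with the first streams taking
    one extra byte each when the size does not divide evenly.
    """
    if num_streams <= 0:
        raise ValueError("num_streams must be positive")
    base, extra = divmod(total_bytes, num_streams)
    ranges = []
    offset = 0
    for i in range(num_streams):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


# Each (offset, length) pair would be handed to its own connection.
ranges = split_range(10, 4)
```

Each stream then moves only its own slice, so the slices can travel concurrently over separate connections.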

12
SEMPLAR Implementation

MPI I/O implementations such as ROMIO use the
portable ADIO interface
ADIO implementations are optimized for a
particular filesystem
We have provided an ADIO implementation for the
SRB filesystem
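
The layering described above, portable I/O code on top of a per-filesystem backend, can be sketched with an abstract interface. This is not ROMIO's actual ADIO API (which is written in C): the class and method names are hypothetical, and an in-memory buffer stands in for a real filesystem or SRB backend.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical stand-in for a per-filesystem ADIO-style backend."""

    @abstractmethod
    def write_at(self, offset: int, data: bytes) -> int:
        """Write data at offset and return the number of bytes written."""

    @abstractmethod
    def read_at(self, offset: int, size: int) -> bytes:
        """Return up to size bytes starting at offset."""


class InMemoryDriver(StorageDriver):
    """Toy backend; a real one would call a filesystem or SRB client API."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def write_at(self, offset: int, data: bytes) -> int:
        end = offset + len(data)
        if end > len(self._buf):
            self._buf.extend(bytes(end - len(self._buf)))  # zero-fill the gap
        self._buf[offset:end] = data
        return len(data)

    def read_at(self, offset: int, size: int) -> bytes:
        return bytes(self._buf[offset:offset + size])


def copy_block(src: StorageDriver, dst: StorageDriver, offset: int, size: int) -> int:
    """Portable code: works with any driver that implements the interface."""
    return dst.write_at(offset, src.read_at(offset, size))


local = InMemoryDriver()
remote = InMemoryDriver()
local.write_at(0, b"hello world")
copied = copy_block(local, remote, 6, 5)
```

Supporting a new storage system then means writing one more driver class, while the portable layer stays unchanged.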

13
Remote Asynchronous I/O

Asynchronous I/O has been shown to be a flexible
programming model
To date, it has not been applied to remote I/O
Traditional advantages of Asynchronous I/O
Overlapping of I/O and computation
Efficient use of system resources
Enhanced I/O performance
Additional benefits specific to remote I/O
Multiple concurrent TCP streams
On-the-fly data compression

14
Asynchronous I/O Implementation

Dual-threaded implementation
Main Compute Thread
Auxiliary I/O thread
Shared I/O queue
Asynchronous calls place I/O requests in queue
and return immediately
I/O thread dequeues requests from the I/O queue in FIFO order
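
The dual-threaded design above can be sketched with Python's standard `queue` and `threading` modules. This is a minimal illustration rather than SEMPLAR's code (which uses POSIX threads): the names `async_write` and `io_thread_main` are hypothetical, and a dictionary stands in for the remote server.

```python
import queue
import threading

io_queue = queue.Queue()   # shared FIFO queue of pending I/O requests
completed = []             # order in which the I/O thread served requests
_STOP = object()           # sentinel telling the I/O thread to exit


def io_thread_main(storage: dict) -> None:
    """Auxiliary I/O thread: serve queued requests in FIFO order."""
    while True:
        request = io_queue.get()
        if request is _STOP:
            break
        name, data = request
        storage[name] = data       # stand-in for a remote write
        completed.append(name)


def async_write(name: str, data: bytes) -> None:
    """Enqueue the request and return immediately to the compute thread."""
    io_queue.put((name, data))


storage = {}
worker = threading.Thread(target=io_thread_main, args=(storage,))
worker.start()
for i in range(3):
    async_write(f"block{i}", bytes([i]) * 4)
# The main compute thread would keep computing here while I/O proceeds.
io_queue.put(_STOP)
worker.join()
```

Because a single thread drains the queue, requests complete in the order they were issued.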

15
Asynchronous I/O Primitives

POSIX pthread library
Asynchronous Primitives
MPI_File_iread
MPI_File_iwrite
MPIO_Wait
MPIO_Test
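
The issue, test and wait pattern behind these primitives can be illustrated with `concurrent.futures` from the Python standard library. The mapping is only an analogy: `submit` plays the role of a non-blocking write call, `done()` the role of a test call, and `result()` the role of a wait call. The function `slow_write` and the event `release` are hypothetical stand-ins for a slow remote write.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

release = threading.Event()


def slow_write(buffer: bytes) -> int:
    """Stand-in for a remote write that finishes only once `release` is set."""
    release.wait()
    return len(buffer)


with ThreadPoolExecutor(max_workers=1) as io_thread:
    request = io_thread.submit(slow_write, b"x" * 1024)  # non-blocking issue
    done_before = request.done()   # test: poll without blocking
    # ... the compute thread could do useful work here ...
    release.set()                  # let the simulated write complete
    written = request.result()     # wait: block until the request finishes
    done_after = request.done()
```

The request handle lets the caller choose between polling and blocking, which is what makes overlapping computation with I/O possible.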

17
Experimental Setup

SRB server v3.2.1 running on orion.sdsc.edu
Our reference server in San Diego
NCSA TeraGrid cluster
High bandwidth, Low latency
DAS-2 at VU, Amsterdam
Low bandwidth, High Latency
Intel Pentium 4 Xeon cluster at OSC
High bandwidth, Low latency
Private IP addresses

18
Benchmarks

ROMIO perf
Measures the read and write performance
Synchronous and Contiguous I/O
Upper-bound on the MPI I/O performance
NAS btio
Non-contiguous data access pattern
Class C full version
Collective I/O
MPI-BLAST
BLAST: searches protein and nucleotide databases
for local alignments
MPI-BLAST: MPI wrapper for BLAST

20
Synchronous Remote I/O Performance Results

21
perf I/O Performance

[Figure panels: NCSA TeraGrid Cluster, DAS-2 Cluster]

22
perf I/O Performance

Results Summary
NCSA TeraGrid: Read 290.88 Mbps, Write 139.44 Mbps, ttcp 46.34 Mbps
DAS-2 Cluster: Read 68.32 Mbps, Write 97.68 Mbps, Iperf 4.82 Mbps
OSC Xeon Cluster: Read 82.96 Mbps, Write 76.48 Mbps, Iperf 10.81 Mbps
[Figure panel: OSC P4 Cluster]

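
The perf numbers in the results summary can be put in proportion by dividing each measured rate by the ttcp or Iperf figure for the same site. The sketch below assumes those ttcp/Iperf figures are a plain TCP baseline for the same link, which the slide does not state explicitly. The throughput values are copied from the slide; the names are illustrative.

```python
# Throughput in Mbps, copied from the perf results summary slide.
# "baseline" is the ttcp/Iperf figure (assumed to be a plain TCP baseline).
results_mbps = {
    "NCSA TeraGrid": {"read": 290.88, "write": 139.44, "baseline": 46.34},
    "DAS-2": {"read": 68.32, "write": 97.68, "baseline": 4.82},
    "OSC Xeon": {"read": 82.96, "write": 76.48, "baseline": 10.81},
}


def speedup_over_baseline(results: dict) -> dict:
    """Return read/write throughput as a multiple of each site's baseline."""
    return {
        site: {op: round(m[op] / m["baseline"], 2) for op in ("read", "write")}
        for site, m in results.items()
    }


speedups = speedup_over_baseline(results_mbps)
for site, s in speedups.items():
    print(f"{site}: read x{s['read']}, write x{s['write']}")
```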
23
btio Class C Write Performance

[Figure panels: NCSA TeraGrid Cluster, DAS-2 Cluster]

24
btio Class C Write Performance

Results Summary
NCSA TeraGrid: btio Class C Write 74.04 Mbps, ttcp 46.34 Mbps
DAS-2 Cluster: btio Class C Write 56.49 Mbps, Iperf 4.82 Mbps
OSC Xeon Cluster: btio Class C Write 70.28 Mbps, Iperf 10.81 Mbps
[Figure panel: OSC P4 Cluster]

25
Asynchronous Remote I/O Performance Results

26
Asynchronous I/O Experiments

In our experiments we evaluated the performance
benefits achievable by
Restructuring of application code to achieve
overlap of computation and I/O
Doubling the number of TCP connections between
each node and the SRB server
Compressing/decompressing data on the fly
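
On-the-fly compression can be illustrated with the streaming interface of Python's standard `zlib` module, which compresses data chunk by chunk instead of buffering the whole file. This is a sketch under assumptions: the slides do not name the compression algorithm SEMPLAR uses, and the function names and sample payload are hypothetical.

```python
import zlib


def compress_chunks(chunks):
    """Yield compressed pieces as input chunks arrive (sender side)."""
    comp = zlib.compressobj()
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:                      # the compressor may buffer small inputs
            yield out
    yield comp.flush()               # emit whatever is still buffered


def decompress_chunks(chunks):
    """Yield decompressed pieces as compressed chunks arrive (receiver side)."""
    decomp = zlib.decompressobj()
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail


payload = [b"ACGT" * 1000, b"TTGA" * 1000]     # repetitive sample data
wire = list(compress_chunks(payload))          # what would cross the WAN
restored = b"".join(decompress_chunks(wire))
```

The trade-off is CPU time for network bytes, so the benefit depends on how compressible the data is and how constrained the link is.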

27
MPI-BLAST pseudocode

28
MPI-BLAST I/O Performance

[Figure panels: NCSA TeraGrid Cluster, DAS-2 Cluster]

29
MPI-BLAST I/O Performance

[Figure panel: OSC P4 Cluster]

30
perf I/O Performance

[Figure panels: DAS-2 Cluster, NCSA TeraGrid Cluster]

31
Related Work

Synchronous Remote I/O
MPI-IO Remote I/O
RIO: single client-server connection
Multiple connections
GridFTP: striped connections out of a single
client
Asynchronous Remote I/O
MTIO: multi-threaded MPI-based I/O library (More et
al.)
RFS: Active Buffering with Threads (ABT)

33
Conclusions

End-to-end Parallel Remote I/O
Multiple, parallel TCP streams increase the
available I/O bandwidth
SRB provides a consistent interface to
heterogeneous storage resources
By integrating SRB with MPI I/O, we have
developed a scalable, high-performance remote I/O
library based on widely deployed tools
Asynchronous Remote I/O
Asynchronous primitives enable the deployment of
different performance enhancing measures for
remote I/O
Overlap of computation with I/O
Asynchronous Split-TCP
On-the-fly data compression

34
Future Work

Caching in the network using public
infrastructure (IBP)
Dynamic degree of data stream parallelism
Adjust the number of connections based on the
network load
Coordination between multiple data streams

35
Acknowledgments

Thanks are due to the following:
Reagan Moore, Marcio Faerman and Arcot Rajasekar
of the Data Intensive Group (DICE) at the San
Diego Supercomputer Center for giving us access
to the SRB source.
Henri Bal of Vrije Universiteit, Amsterdam for
giving us access to the DAS-2 cluster.
Rob Pennington and Ruth Aydt at the National
Center for Supercomputing Applications (NCSA) for
allowing us to use the NCSA TeraGrid cluster for
our experiments.
This work is supported in part by the National
Partnership for Advanced Computational
Infrastructure, by the Ohio Supercomputer Center
through grants PAS0036 and PAS0121, and by NSF
grant CNS-0403342. Mario Lauria is partially
supported by NSF grant DBI-0317335. Support from
Hewlett-Packard is also gratefully acknowledged.