Research @ Northeastern University - PowerPoint PPT Presentation

About This Presentation
Title:

Research @ Northeastern University

Description:

Title: EMC Presentation
Author: David Kaeli
Last modified by: Dr. David R. Kaeli
Created Date: 1/21/2003 2:51:20 AM
Document presentation format: On-screen Show

Slides: 42
Provided by: DavidK203

Transcript and Presenter's Notes


1
Research @ Northeastern University
  • I/O storage modeling and performance
  • David Kaeli
  • Soft error modeling and mitigation
  • Mehdi B. Tahoori

2
I/O Storage Research at Northeastern University
  • David Kaeli
  • Yijian Wang
  • Department of Electrical and Computer Engineering
  • Northeastern University
  • Boston, MA
  • kaeli@ece.neu.edu

3
Outline
  • Motivation to study file-based I/O
  • Profile-driven partitioning for parallel file I/O
  • I/O Qualification Laboratory @ NU
  • Areas for future work

4
Important File-based I/O Workloads
  • Many subsurface sensing and imaging workloads
    involve file-based I/O
  • Cellular biology: in-vitro fertilization with
    NU biologists
  • Medical imaging: cancer therapy with MGH
  • Underwater mapping: multi-sensor fusion with
    Woods Hole Oceanographic Institution
  • Ground-penetrating radar: toxic waste tracking
    with Idaho National Labs

5
The Impact of Profile-guided Parallelization on
SSI Applications
  • Reduced the runtime of a single-body Steepest
    Descent Fast Multipole Method (SDFMM) application
    by 74% on a 32-node Beowulf cluster
  • Hot-path parallelization
  • Data restructuring
  • Reduced the runtime of a Monte Carlo
    scattered light simulation by 98% on
    a 16-node Silicon Graphics Origin 2000
  • Matlab-to-C compilation
  • Hot-path parallelization
  • Obtained superlinear speedup of the Ellipsoid
    Algorithm run on a 16-node IBM SP2
  • Matlab-to-C compilation
  • Hot-path parallelization

6
Limits of Parallelization
  • For compute-bound workloads, Beowulf clusters can
    be used effectively to overcome computational
    barriers
  • Middleware (e.g., MPI and MPI-IO) can
    significantly reduce the programming effort on
    parallel systems
  • Multiple clusters can be combined, utilizing Grid
    Middleware (Globus Toolkit)
  • For file-based I/O-bound workloads, Beowulf
    clusters and Grid systems are presently
    ill-suited to exploit the potential parallelism
    present on these systems

7
Outline
  • Motivation to study file-based I/O
  • Profile-driven partitioning for parallel file I/O
  • I/O Qualification Laboratory @ NU
  • Areas for future work

8
Parallel I/O Acceleration
  • The I/O bottleneck
  • The growing gap between the speed of processors,
    networks and underlying I/O devices
  • Many imaging and scientific applications access
    disks very frequently
  • I/O intensive applications
  • Out-of-core applications
  • Work on large datasets that cannot fit in main
    memory
  • File-intensive applications
  • Access file-based datasets frequently
  • Large number of file operations

9
Introduction
  • Storage architectures
  • Direct Attached Storage (DAS)
  • Storage device is directly attached to the
    computer
  • Network Attached Storage (NAS)
  • Storage subsystem is attached to a network of
    servers and file requests are passed through a
    parallel filesystem to the centralized storage
    device
  • Storage Area Network (SAN)
  • A dedicated network to provide an any-to-any
    connection between processors and disks

10
I/O Partitioning
[Figure: an I/O-intensive application process (P) accessing a disk]

11
I/O Partitioning
  • I/O is parallelized at both the application level
    (using MPI and MPI-IO) and the disk level (using
    file partitioning)
  • Ideally, every process will only access files on
    local disk (though this is typically not possible
    due to data sharing)
  • How to recognize the access patterns?
  • Profile-guided approach

12
Profile Generation
  • Run the application
  • Capture I/O execution profiles
  • Apply our partitioning algorithm
  • Rerun the tuned application

13
I/O traces and partitioning
  • For every process, for every contiguous file
    access, we capture the following I/O profile
    information
  • Process ID
  • File ID
  • Address
  • Chunk size
  • I/O operation (read/write)
  • Timestamp
  • Generate a partition for every process
  • Optimal partitioning is NP-complete, so we
    develop a greedy algorithm
  • We have found we can use partial profiles to
    guide partitioning
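
The profile fields listed above could be held in a small record type. The following is a minimal Python sketch, not the authors' tooling: the whitespace-separated text format, the class name `IOAccess`, and the helper names are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IOAccess:
    """One contiguous file access, with the six profile fields from the slide."""
    pid: int          # process ID
    file_id: int      # file ID
    address: int      # byte offset of the access within the file
    size: int         # chunk size in bytes
    op: str           # "read" or "write"
    timestamp: float  # time at which the access was issued


def parse_trace_line(line: str) -> IOAccess:
    """Parse one record from a hypothetical whitespace-separated trace file."""
    pid, file_id, address, size, op, ts = line.split()
    if op not in ("read", "write"):
        raise ValueError(f"unknown I/O operation: {op}")
    return IOAccess(int(pid), int(file_id), int(address), int(size), op, float(ts))


def group_by_process(accesses):
    """Return {pid: [accesses in timestamp order]} as input for partitioning."""
    per_process = defaultdict(list)
    for access in sorted(accesses, key=lambda a: a.timestamp):
        per_process[access.pid].append(access)
    return dict(per_process)
```

Grouping by process ID first mirrors the "generate a partition for every process" step: each per-process list is the raw material for that process's partition.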

14
Greedy File Partitioning Algorithm
for each I/O process, create a partition
for each contiguous data chunk
    total up the number of read/write accesses on a process-ID basis
    if the chunk is accessed by only one process
        assign the chunk to the associated partition
    if the chunk is read (but never written) by multiple processes
        duplicate the chunk in all partitions where read
    if the chunk is written by one process, but later read by multiple
        assign the chunk to all partitions where read
        and broadcast the updates on writes
    else
        assign the chunk to a shared partition
for each partition
    sort chunks based on the earliest timestamp for each chunk
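
One possible reading of the greedy algorithm above, as a minimal Python sketch. The trace layout (tuples of process ID, chunk, operation, timestamp), the `SHARED` key, and the function name are assumptions of this sketch rather than part of the original work. It also simplifies the third rule: it does not check that the reads actually happen after the write.

```python
from collections import defaultdict

SHARED = "shared"  # key for the shared partition (a name chosen for this sketch)


def greedy_partition(accesses):
    """accesses: iterable of (pid, chunk, op, timestamp), op in {"read", "write"}.

    Returns (partitions, broadcast): partitions maps each pid (and SHARED) to
    its list of chunks; broadcast maps a chunk to the pid whose writes must be
    propagated to the replicated copies.
    """
    accesses = list(accesses)
    readers = defaultdict(set)
    writers = defaultdict(set)
    counts = defaultdict(lambda: defaultdict(int))  # chunk -> pid -> access count
    first_seen = {}                                 # chunk -> earliest timestamp
    for pid, chunk, op, ts in accesses:
        (readers if op == "read" else writers)[chunk].add(pid)
        counts[chunk][pid] += 1
        first_seen[chunk] = min(ts, first_seen.get(chunk, ts))

    partitions = {pid: [] for pid, _, _, _ in accesses}
    partitions[SHARED] = []
    broadcast = {}
    for chunk, per_pid in counts.items():
        r, w = readers[chunk], writers[chunk]
        if len(per_pid) == 1:             # accessed by only one process
            targets = set(per_pid)
        elif not w:                       # read-only, multiple readers: duplicate
            targets = r
        elif len(w) == 1 and len(r) > 1:  # one writer, multiple readers
            targets = r
            broadcast[chunk] = next(iter(w))
        else:                             # anything else goes to the shared partition
            targets = {SHARED}
        for target in targets:
            partitions[target].append(chunk)

    # Lay out each partition in order of first use.
    for chunks in partitions.values():
        chunks.sort(key=first_seen.__getitem__)
    return partitions, broadcast
```

In practice a chunk would be identified by something like a (file ID, address, size) triple; any hashable value works here.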
15
Parallel I/O Workloads
  • NAS Parallel Benchmarks (NPB2.4)/BT
  • Computational fluid dynamics
  • Generates a file (1.6 GB) dynamically and then
    reads it back
  • Writes/reads sequentially in chunk sizes of 2040
    Bytes
  • SPEChpc96/seismic
  • Seismic processing
  • Generates a file (1.5 GB) dynamically and then
    reads it back
  • Writes sequential chunks of 96 KB and reads
    sequential chunks of 2 KB
  • Tile-IO
  • Parallel Benchmarking Consortium
  • Tile access to a two-dimensional matrix (1 GB)
    with overlap
  • Writes/reads sequential chunks of 32 KB, with 2 KB
    of overlap
  • Perf
  • Parallel I/O test program within MPICH
  • Writes a 1 MB chunk at a location determined by
    rank, no overlap
  • Mandelbrot
  • An image processing application that includes
    visualization
  • Chunk size is dependent on the number of processes

16
[Figure: Beowulf cluster of P2-350MHz nodes with local PCI-IDE disks, connected through a 10/100Mb Ethernet switch to RAID nodes]

17
Hardware Specifics
  • DAS configuration
  • Linux box, Western Digital WD800BB (IDE), 80 GB,
    7200 rpm
  • Beowulf cluster (base configuration)
  • Fast Ethernet, 100 Mbit/s
  • Network-attached RAID: Morstor TF200 with 6-9GB
    drives Seagate SCSI disks, 7200 rpm, RAID-5
  • Locally attached IDE disks: IBM UltraATA-350840,
    5400 rpm
  • Fibre Channel disks
  • Seagate Cheetah X15 ST-336752FC, 15000 rpm

18
Write/Read Bandwidth
[Charts: NPB2.4/BT and SPEChpc96/seismic]

19
Write/Read Bandwidth
[Charts: MPI-Tile, Perf, and Mandelbrot]

21
Profile training sensitivity analysis
  • We have found that IO access patterns are
    independent of file-based data values
  • When we increase the problem size or reduce the
    number of processes, either
  • the number of IOs increases, but access patterns
    and chunk size remain the same (SPEChpc96,
    Mandelbrot), or
  • the number of IOs and IO access patterns remain
    the same, but the chunk size increases (NPB2.4/BT,
    Tile-IO, Perf)
  • Re-profiling can be avoided

22
Execution-driven Parallel I/O Modeling
  • Growing need to process large, complex datasets
    in high performance parallel computing
    applications
  • Efficient implementation of storage architectures
    can significantly improve system performance
  • An accurate simulation environment for users to
    test and evaluate different storage architectures
    and applications

23
Execution-driven I/O Modeling
  • Target applications: parallel scientific programs
    (MPI)
  • Target machine/Host machine: Beowulf clusters
  • Use DiskSim as the underlying disk drive
    simulator
  • Direct execution to model CPU and network
    communication
  • We execute the real parallel I/O accesses while
    calculating the simulated I/O response
    time
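
The idea of performing the real access while accumulating a simulated response time can be sketched as follows. This is an illustration only: the toy `SimpleDiskModel` stands in for DiskSim, its parameter values are invented for the example, and none of the class or method names come from the original framework.

```python
class SimpleDiskModel:
    """Toy stand-in for a disk simulator: positioning cost plus transfer time.

    The default parameters are illustrative assumptions, not measured values.
    """

    def __init__(self, seek_ms=8.0, rotation_ms=4.0, bandwidth_mb_s=40.0):
        self.seek_ms = seek_ms
        self.rotation_ms = rotation_ms
        self.bandwidth_mb_s = bandwidth_mb_s
        self.head = 0  # byte offset just past the previous access

    def response_ms(self, offset, size):
        # A sequential access (starting where the last one ended) pays no
        # positioning cost; any other access pays seek plus rotation.
        positioning = 0.0 if offset == self.head else self.seek_ms + self.rotation_ms
        transfer = size / (self.bandwidth_mb_s * 1e6) * 1e3
        self.head = offset + size
        return positioning + transfer


class SimulatedFile:
    """Executes each real read/write and adds its simulated disk time."""

    def __init__(self, path, disk):
        self.f = open(path, "r+b")  # the file must already exist
        self.disk = disk
        self.simulated_ms = 0.0

    def write_at(self, offset, data):
        self.f.seek(offset)
        self.f.write(data)  # real I/O
        self.simulated_ms += self.disk.response_ms(offset, len(data))

    def read_at(self, offset, size):
        self.f.seek(offset)
        data = self.f.read(size)  # real I/O
        self.simulated_ms += self.disk.response_ms(offset, size)
        return data

    def close(self):
        self.f.close()
```

Because the data really is written and read back, the application behaves normally, while the reported I/O time comes from the model rather than from the host's disk.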

24
Validation: Synthetic I/O Workload on DAS
25
Simulation Framework - NAS
27
Simulation Framework - SAN direct
  • A variety of SAN where disks are distributed
    across the network and each
    server is directly connected to a single device
  • File partitioning
  • Utilize I/O profiling and data partitioning
    heuristics to distribute portions of
    files to disks close to the processing nodes

29
Hardware Specifications
32
Publications
  • Profile-guided File Partitioning on Beowulf
    Clusters, Journal of Cluster Computing, Special
    Issue on Parallel I/O, to appear 2005.
  • Execution-Driven Simulation of Network Storage
    Systems, Proceedings of the 12th ACM/IEEE
    International Symposium on Modeling, Analysis and
    Simulation of Computer and Telecommunication
    Systems (MASCOTS), October 2004, pp. 604-611.
  • Profile-Guided I/O Partitioning, Proceedings of
    the 17th ACM International Symposium on
    Supercomputing, June 2003, pp. 252-260.
  • Source Level Transformations to Apply I/O Data
    Partitioning, Proceedings of the IEEE Workshop
    on Storage Network Architecture And Parallel IO,
    Oct. 2003, pp. 12-21.
  • Profile-Based Characterization and Tuning for
    Subsurface Sensing and Imaging Applications,
    International Journal of Systems, Science and
    Technology, September 2002, pp. 40-55.

33
Summary of Cluster-based Work
  • Many imaging applications are dominated by
    file-based I/O
  • Parallel systems can only be effectively utilized
    if I/O is also parallelized
  • Developed a profile-guided approach to I/O data
    partitioning
  • Impacting clinical trials at MGH
  • Reduced overall execution time by 27-82% over
    MPI-IO
  • Execution-driven I/O model is highly accurate and
    provides significant modeling flexibility

34
Outline
  • Motivation to study file-based I/O
  • Profile-driven partitioning for parallel file I/O
  • I/O Qualification Laboratory @ NU
  • Areas for future work

35
I/O Qualification Laboratory
  • Working with Enterprise Strategy Group
  • Develop a state-of-the-art facility to provide
    independent performance qualification of
    Enterprise Storage systems
  • Provide a quarterly report to ES customer base on
    the status of current ES offerings
  • Work with leading ES vendors to provide them with
    custom early performance evaluation of their beta
    products

36
I/O Qualification Laboratory
  • Contacted by IOIntegrity and SANGATE for product
    qualification
  • Developed potential partners that are leaders in
    the ES field
  • Initial proposals already reviewed by IBM,
    Hitachi and other ES vendors
  • Looking for initial endorsement from industry

37
I/O Qualification Laboratory
  • Why @ NU?
  • Track record with industry (EMC, IBM, Sun)
  • Experience with benchmarking and IO
    characterization
  • Interesting set of applications (medical,
    environmental, etc.)
  • Great opportunity to work within the cooperative
    education model

38
Outline
  • Motivation to study file-based I/O
  • Profile-driven partitioning for parallel file I/O
  • I/O Qualification Laboratory @ NU
  • Areas for future work

39
Areas for Future Work
  • Designing a Peer-to-Peer storage system on a Grid
    system by partitioning datasets across
    geographically distributed storage devices

[Figure: Grid storage system with multiple head nodes]

41
Areas for Future Work
  • Reduce simulation time by identifying
    characteristic phases in I/O workloads
  • Apply machine learning algorithms to identify
    clusters of representative I/O behavior
  • Utilize K-Means and Multinomial clustering to
    obtain high fidelity in simulation runs utilizing
    sampled I/O behavior
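
As an illustration of the K-Means part of this plan only (the multinomial model is not shown), the sketch below clusters per-interval I/O feature vectors and picks one representative interval per cluster to simulate in detail. The feature choice, the deterministic initialization from the first k points, and all function names are assumptions of this sketch.

```python
def dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm.

    Initial centroids are simply the first k points, which keeps the sketch
    deterministic; a real study would use a better initialization.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each interval to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(range(k), key=lambda c: dist2(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


def representatives(points, assignment, centroids):
    """For each cluster, return the index of the interval nearest its centroid."""
    reps = {}
    for c, centre in enumerate(centroids):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            reps[c] = min(members, key=lambda i: dist2(points[i], centre))
    return reps
```

Each point might be, for example, (requests per interval, fraction of reads); only the representative intervals would then be simulated in full, with results weighted by cluster size.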

A Multinomial Clustering Model for Fast
Simulation of Architecture Designs, submitted to
the 2005 ACM KDD Conference.