Title: MPI-I/O for EQM APPLICATIONS
1. MPI-I/O for EQM APPLICATIONS
- David Cronk
- Innovative Computing Lab
- University of Tennessee
- June 20, 2001
2. Outline
- Introduction
  - What is parallel I/O?
  - Why do we need parallel I/O?
  - What is MPI-I/O?
- MPI-I/O
  - Derived datatypes and file views
3. OUTLINE (cont)
- MPI-I/O (cont)
  - Data access
    - Non-collective access
    - Collective access
    - Split collective access
- Examples
  - LBMPI - Bob Maier (ARC)
  - CE-QUAL-ICM - Victor Parr (UT-Austin)
4. INTRODUCTION
- What is parallel I/O?
  - Multiple processes accessing a single file
5. INTRODUCTION
- What is parallel I/O?
  - Multiple processes accessing a single file
  - Often, both data and file access are non-contiguous
    - Ghost cells cause non-contiguous data access
    - Block or cyclic distributions cause non-contiguous file access
6. Non-Contiguous Access
[Figure: non-contiguous file layout mapped to contiguous local memory]
7. INTRODUCTION
- What is parallel I/O?
  - Multiple processes accessing a single file
  - Often, both data and file access are non-contiguous
    - Ghost cells cause non-contiguous data access
    - Block or cyclic distributions cause non-contiguous file access
  - Want to access data and files with as few I/O calls as possible
8. INTRODUCTION (cont)
- Why use parallel I/O?
  - Many users do not have time to learn the complexities of I/O optimization
9. INTRODUCTION (cont)

      ! Traditional Fortran I/O: sequential, unformatted write
      Integer dim
      parameter (dim=10000)
      Integer*4 out_array(dim)
      OPEN (fh, filename, UNFORMATTED)
      WRITE (fh) (out_array(I), I=1,dim)

      ! Traditional Fortran I/O: direct-access write
      rl = 4*dim
      OPEN (fh, filename, DIRECT, RECL=rl)
      WRITE (fh, REC=1) out_array
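By contrast, a minimal MPI-I/O sketch of the same dump, where every process deposits its block of out_array into one shared file with a single call. The rank-based offset, file name, and mode flags are our illustration, not from the original slide; it assumes USE MPI and the declarations above.

      ! Sketch: one MPI-I/O call per process, all ranks sharing one file.
      integer :: fh, rank, ierr
      integer(kind=MPI_OFFSET_KIND) :: offset
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_FILE_OPEN(MPI_COMM_WORLD, 'out.dat', &
           MPI_MODE_CREATE + MPI_MODE_WRONLY, MPI_INFO_NULL, fh, ierr)
      ! Each rank's byte offset: dim 4-byte integers per rank (illustrative).
      offset = int(rank, MPI_OFFSET_KIND) * dim * 4
      call MPI_FILE_WRITE_AT(fh, offset, out_array, dim, MPI_INTEGER, &
           MPI_STATUS_IGNORE, ierr)
      call MPI_FILE_CLOSE(fh, ierr)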
10. INTRODUCTION (cont)
- Why use parallel I/O?
  - Many users do not have time to learn the complexities of I/O optimization
  - Use of parallel I/O can simplify coding
    - Single read/write operation vs. multiple read/write operations
11. INTRODUCTION (cont)
- Why use parallel I/O?
  - Many users do not have time to learn the complexities of I/O optimization
  - Use of parallel I/O can simplify coding
    - Single read/write operation vs. multiple read/write operations
  - Parallel I/O potentially offers significant performance improvement over traditional approaches
12. INTRODUCTION (cont)
- Traditional approaches
  - Each process writes to a separate file
    - Often requires an additional post-processing step
    - Without post-processing, restarts must use the same number of processors
  - Results are sent to a master processor, which collects them and writes out to disk
  - Each processor calculates its position in the file and writes individually
13. INTRODUCTION (cont)
- What is MPI-I/O?
  - MPI-I/O is a set of extensions to the original MPI standard
  - It is an interface specification: it does NOT give implementation specifics
  - It provides routines for file manipulation and data access
  - Calls to MPI-I/O routines are portable across a large number of architectures
14. DERIVED DATATYPES & VIEWS
- Derived datatypes are not part of MPI-I/O
  - They are used extensively in conjunction with MPI-I/O
- A filetype is really a datatype expressing the access pattern of a file
  - Filetypes are used to set file views
15. DERIVED DATATYPES & VIEWS
- Non-contiguous memory access
- MPI_TYPE_CREATE_SUBARRAY
  - NDIMS - number of dimensions
  - ARRAY_OF_SIZES - number of elements in each dimension of the full array
  - ARRAY_OF_SUBSIZES - number of elements in each dimension of the sub-array
  - ARRAY_OF_STARTS - starting position of the sub-array in each dimension of the full array
  - ORDER - MPI_ORDER_C or MPI_ORDER_FORTRAN
  - OLDTYPE - datatype stored in the full array
  - NEWTYPE - handle to the new datatype
16. NONCONTIGUOUS MEMORY ACCESS
[Figure: a 102 x 102 local array indexed (0,0)-(101,101); the interior region (1,1)-(100,100) holds the real data, and the outer layer holds ghost cells]
17. NONCONTIGUOUS MEMORY ACCESS

      INTEGER sizes(2), subsizes(2), starts(2), dtype, ierr
      sizes(1)    = 102
      sizes(2)    = 102
      subsizes(1) = 100
      subsizes(2) = 100
      starts(1)   = 1
      starts(2)   = 1
      CALL MPI_TYPE_CREATE_SUBARRAY(2, sizes, subsizes, starts, &
           MPI_ORDER_FORTRAN, MPI_REAL8, dtype, ierr)
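Before use, the new type must be committed; the ghost-free interior can then be written with a single collective call. A minimal sketch, where the file handle fh and array local_array are our illustrative names:

      CALL MPI_TYPE_COMMIT(dtype, ierr)
      ! Writes only the 100 x 100 interior of the 102 x 102 local_array;
      ! the ghost layer is skipped by the subarray type.
      CALL MPI_FILE_WRITE_ALL(fh, local_array, 1, dtype, MPI_STATUS_IGNORE, ierr)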
18NONCONTIGUOUS FILE ACCESS
- MPI_FILE_SET_VIEW(
- FH,
- DISP,
- ETYPE,
- FILETYPE,
- DATAREP,
- INFO,
- IERROR)
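In Fortran, DISP must be an integer of kind MPI_OFFSET_KIND. A minimal sketch of setting a view, assuming a committed filetype (variable names are ours):

      INTEGER fh, filetype, ierr
      INTEGER(KIND=MPI_OFFSET_KIND) disp
      disp = 0
      CALL MPI_FILE_SET_VIEW(fh, disp, MPI_REAL8, filetype, &
           'native', MPI_INFO_NULL, ierr)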
19. NONCONTIGUOUS FILE ACCESS
- The file has holes in it from the processor's perspective
  - Multi-dimensional array access
20. NONCONTIGUOUS FILE ACCESS
- The file has holes in it from the processor's perspective
  - Multi-dimensional array access
    - MPI_TYPE_CREATE_SUBARRAY()
21. Distributed array access
[Figure: a 200 x 200 global array, indexed (0,0)-(199,199), divided into four 100 x 100 blocks]
22. Distributed array access

      INTEGER(KIND=MPI_OFFSET_KIND) DISP
      SIZES(1)    = 200
      SIZES(2)    = 200
      SUBSIZES(1) = 100
      SUBSIZES(2) = 100
      STARTS(1)   = 0          ! this process owns the (0,0) block
      STARTS(2)   = 0
      CALL MPI_TYPE_CREATE_SUBARRAY(2, SIZES, SUBSIZES, STARTS, &
           MPI_ORDER_FORTRAN, MPI_INTEGER, FILETYPE, IERR)
      CALL MPI_TYPE_COMMIT(FILETYPE, IERR)
      DISP = 0
      CALL MPI_FILE_SET_VIEW(FH, DISP, MPI_INTEGER, FILETYPE, &
           'native', MPI_INFO_NULL, IERR)
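Once the view is in place, each process writes its block with a single collective call. A sketch, where LOCAL_ARRAY is our illustrative name for this process's 100 x 100 block:

      CALL MPI_FILE_WRITE_ALL(FH, LOCAL_ARRAY, 100*100, MPI_INTEGER, &
           MPI_STATUS_IGNORE, IERR)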
23. NONCONTIGUOUS FILE ACCESS
- The file has holes in it from the processor's perspective
  - Multi-dimensional array distributed with a block distribution
  - Irregularly distributed arrays
24. Irregularly distributed arrays
- MPI_TYPE_CREATE_INDEXED_BLOCK
  - COUNT - number of blocks
  - LENGTH - number of elements per block
  - MAP - array of displacements
  - OLD - old datatype
  - NEW - new datatype
25. Irregularly distributed arrays
[Figure: example file map - this process's 10 elements occupy file displacements 0 1 2 4 7 11 12 15 20 22]
26. Irregularly distributed arrays

      CALL MPI_TYPE_CREATE_INDEXED_BLOCK(10, 1, FILE_MAP, MPI_INTEGER, &
           FILETYPE, IERR)
      CALL MPI_TYPE_COMMIT(FILETYPE, IERR)
      DISP = 0
      CALL MPI_FILE_SET_VIEW(FH, DISP, MPI_INTEGER, FILETYPE, &
           'native', MPI_INFO_NULL, IERR)
27. DATA ACCESS
- Non-collective access
- Collective access
- Split collective access
(write variants for each are sketched below)
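The detail of this slide is not recoverable from the transcript, but the outline names three access flavors. A hedged sketch of the corresponding write calls (BUF and COUNT are illustrative names):

      ! Non-collective: each process reads/writes independently.
      CALL MPI_FILE_WRITE(FH, BUF, COUNT, MPI_REAL8, STATUS, IERR)
      ! Collective: all processes that opened the file call together,
      ! letting the library merge their requests into large operations.
      CALL MPI_FILE_WRITE_ALL(FH, BUF, COUNT, MPI_REAL8, STATUS, IERR)
      ! Split collective: a begin/end pair that lets computation overlap
      ! the I/O (BUF must not be modified in between).
      CALL MPI_FILE_WRITE_ALL_BEGIN(FH, BUF, COUNT, MPI_REAL8, IERR)
      CALL MPI_FILE_WRITE_ALL_END(FH, BUF, STATUS, IERR)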
28. COLLECTIVE I/O
[Figure: memory layout on 4 processors]
29. EXAMPLE 1
- Bob Maier - ARC
- Production-level Fortran code
- Challenge problem
- Every X iterations, write a restart file
- At conclusion, write an output file
- On an SP with 512 processors: 12 hrs computation, 12 hrs I/O
30. EXAMPLE 1 (cont)
- Conceptually, four 3-dimensional arrays
- Implemented with a single 4-dimensional array (see the sketch below)
  - Improved cache-hit ratio
- Uses ghost cells
- Writes out to 4 separate files
- Block-block data distribution
- Memory access is completely non-contiguous
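A hedged sketch of what that layout implies for the subarray arguments on the code slide below. The array name Z and the mem_* names appear there, but the extents nx, ny, nz and the exact dimension order are our assumptions:

      ! Assumed shape: Z(4, 0:nx+1, 0:ny+1, 0:nz+1) - four variables,
      ! each with a one-cell ghost layer in every spatial dimension.
      mem_sizes    = (/ 4, nx+2, ny+2, nz+2 /)
      mem_subsizes = (/ 1, nx,   ny,   nz   /)  ! one variable, no ghosts
      mem_starts   = (/ 0, 1, 1, 1 /)           ! starts are 0-based;
                                                ! mem_starts(1) picks the variable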
31. EXAMPLE 1 (cont)
32. EXAMPLE 1 - Solution

      ! set up array with size of file
      ! set up array with subsize of file
      ! set up array with size of local arrays
      ! set up array with subsize of memory
      ! set up array with starting positions in file
      ! set up array with starting positions in memory
      disp = 0
      call mpi_type_create_subarray(3, file_sizes, file_subsizes, &
           file_starts, MPI_ORDER_FORTRAN, MPI_REAL8, file_type, ierr)
      call mpi_type_commit(file_type, ierr)
      do vars = 1, 4
         mem_starts(1) = vars - 1    ! select variable vars in the 4-D array
         call mpi_type_create_subarray(4, mem_sizes, mem_subsizes, &
              mem_starts, MPI_ORDER_FORTRAN, MPI_REAL8, mem_type, ierr)
         call mpi_type_commit(mem_type, ierr)
         call mpi_file_open(...)
         call mpi_file_set_view(fh, disp, MPI_REAL8, file_type, &
              'native', ...)
         call mpi_file_write_all(fh, Z, 1, mem_type, ...)
         call mpi_file_close(fh, ierr)
      enddo
33-34. LBMPI - PERFORMANCE
[Charts: I/O performance of LBMPI; the original version took 5204 seconds]
35. EXAMPLE 2
- Victor Parr - UT-Austin
- Production-level Fortran code performing EPA simulations (CE-QUAL-ICM, a message-passing code)
- A typical production run performs a 10-year simulation, dumping output for every simulation month
- Irregular grid and irregular data distribution
- High ratio of ghost cells
36. EXAMPLE 2 (cont)
[Figure: layout of the global output file, which begins with a header]
37. EXAMPLE 2 - CURRENT
- Each processor writes all output (including ghost cells) to a process-specific file
- A post-processor reads in the process-specific files
  - Determines whether a value is from a resident cell
  - Places resident values in the appropriate position in a global output array
  - Writes out the global array to a global output file
38. EXAMPLE 2 - SOLUTION
[Figure: example file map 1 2 4 7 9 10 11 14 20 24 and the corresponding local values 32 63 7 21 44 2 77 31 55 19]
39EXAMPLE 2 - SOLUTION
DONE FOR EACH OUTPUT call mpi_file_set_view (fh,
disp, memtype, filetype, native, MPI_INFO_NULL,
ierr) call mpi_file_write_all (fh, buf, 1,
memtype, status, ierr) disp disp total
number of bytes written by all processes
DONE ONCE create mem_map create
file_map sort file_map permute mem_map to
match file_map call mpi_type_create_indexed_block
(num, 1, mem_map, MPI_DOUBLE_PRECISION, memtype,
ierr) call mpi_type_commit (memtype, ierr) call
mpi_type_create_indexed_block (num, 1, file_map,
MPI_DOUBLE_PRECISION, filetype, ierr) call
mpi_type_commit (filetype, ierr) disp size of
initial header in bytes
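The "sort file_map, permute mem_map" step matters because MPI requires the displacements of a filetype used in a file view to be monotonically nondecreasing. A minimal sketch of that step over num paired entries; the insertion sort is our own choice of method, not necessarily the code's:

      ! Sort file_map ascending and apply the same permutation to
      ! mem_map, so each file slot still pairs with its memory slot.
      integer :: i, j, tf, tm
      do i = 2, num
         tf = file_map(i)
         tm = mem_map(i)
         j = i - 1
         do while (j >= 1)
            if (file_map(j) <= tf) exit
            file_map(j+1) = file_map(j)
            mem_map(j+1)  = mem_map(j)
            j = j - 1
         end do
         file_map(j+1) = tf
         mem_map(j+1)  = tm
      end do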
40. CONCLUSIONS
- MPI-I/O potentially offers significant improvement in I/O performance
- This improvement can be attained with minimal effort on the part of the user
- Simpler programming with fewer calls to I/O routines
- Easier program maintenance due to the simple API