Title: Portable MPI and Related Parallel Development Tools
1. Portable MPI and Related Parallel Development Tools
- Rusty Lusk
- Mathematics and Computer Science Division
- Argonne National Laboratory
- (The rest of our group: Bill Gropp, Rob Ross, David Ashton, Brian Toonen, Anthony Chan)
2. Outline
- MPI
- What is it?
- Where did it come from?
- One implementation
- Why has it succeeded?
- Case study: an MPI application
- Portability
- Libraries
- Tools
- Future developments in parallel programming
- MPI development
- Languages
- Speculative approaches
3. What is MPI?
- A message-passing library specification
- extended message-passing model
- not a language or compiler specification
- not a specific implementation or product
- For parallel computers, clusters, and heterogeneous networks
- Full-featured
- Designed to provide access to advanced parallel hardware for
- end users
- library writers
- tool developers
4. Where Did MPI Come From?
- Early vendor systems (NX, EUI, CMMD) were not portable.
- Early portable systems (PVM, p4, TCGMSG, Chameleon) were mainly research efforts.
- Did not address the full spectrum of message-passing issues
- Lacked vendor support
- Were not implemented at the most efficient level
- The MPI Forum organized in 1992 with broad participation by vendors, library writers, and end users.
- MPI Standard (1.0) released June 1994; many implementation efforts followed.
- MPI-2 Standard (1.2 and 2.0) released July 1997.
5. Informal Status Assessment
- All MPP vendors now have MPI-1 (1.0, 1.1, or 1.2).
- Public implementations (MPICH, LAM, CHIMP) support heterogeneous workstation networks.
- MPI-2 implementations are being undertaken now by all vendors.
- MPI-2 is harder to implement than MPI-1 was.
- MPI-2 implementations will appear piecemeal, with I/O first.
6. MPI Sources
- The Standard itself
- at http://www.mpi-forum.org
- All MPI official releases, in both PostScript and HTML
- Books on MPI and MPI-2
- Using MPI: Portable Parallel Programming with the Message-Passing Interface (2nd edition), by Gropp, Lusk, and Skjellum, MIT Press, 1999.
- Using MPI-2: Extending the Message-Passing Interface, by Gropp, Lusk, and Thakur, MIT Press, 1999.
- MPI: The Complete Reference, volumes 1 and 2, MIT Press, 1999.
- Other information on the Web
- at http://www.mcs.anl.gov/mpi
- pointers to lots of stuff, including other talks and tutorials, a FAQ, other MPI pages
7. The MPI Standard Documentation
8. Tutorial Material on MPI, MPI-2
9. The MPICH Implementation of MPI
- As a research project: exploring tradeoffs between performance and portability; conducting research in implementation issues.
- As a software project: providing a free MPI implementation on most machines; enabling vendors and others to build complete MPI implementations on their own communication services.
- MPICH 1.2.2 just released, with complete MPI-1, parts of MPI-2 (I/O and C++), and a port to Windows 2000.
- Available at http://www.mcs.anl.gov/mpi/mpich
10. Lessons From MPI: Why Has It Succeeded?
- The MPI Process
- Portability
- Performance
- Simplicity
- Modularity
- Composability
- Completeness
11. The MPI Process
- Started with an open invitation to all those interested in standardizing the message-passing model
- Participation from
- Parallel computing vendors
- Computer Scientists
- Application scientists
- Open process
- All invited, but hard work required
- All deliberations available at all times
- Reference implementation developed during design process
- Helped debug design
- Immediately available when design completed
12. Portability
- Most important property of a programming model for high-performance computing
- Application lifetimes: 5 to 20 years
- Hardware lifetimes much shorter
- (not to mention corporate lifetimes!)
- Need not lead to a lowest-common-denominator approach
- Example: MPI semantics allow direct copy of data from a user-space send buffer to a user-space receive buffer
- Might be implemented by a hardware data mover
- Might be implemented by network hardware
- Might be implemented by sockets
- The hard part: portability with performance
13. Performance
- MPI can help manage the crucial memory hierarchy
- Local vs. remote memory is explicit
- A received message is likely to be in cache
- MPI provides collective operations for both communication and computation that hide the complexity or non-portability of scalable algorithms from the programmer (see the sketch below).
- Can interoperate with optimizing compilers
- Promotes use of high-performance libraries
- Doesn't provide performance portability
- This problem is still too hard, even for the best compilers
- E.g. BLAS
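As a concrete illustration of the collective-operations point above, here is a minimal sketch in C: a global sum computed with MPI_Allreduce, where the implementation, not the application, supplies the scalable reduction algorithm. The array size and fill values are invented for illustration.

```c
#include <mpi.h>
#include <stdio.h>

#define N 1000   /* local array size (arbitrary for this example) */

int main(int argc, char *argv[])
{
    double local[N], local_sum = 0.0, global_sum;
    int i, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < N; i++)        /* each process fills its own slice */
        local[i] = (double) rank;
    for (i = 0; i < N; i++)        /* local partial sum */
        local_sum += local[i];

    /* One call: the implementation chooses the reduction algorithm
       (tree, ring, hardware support, ...) suited to the platform. */
    MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f\n", global_sum);

    MPI_Finalize();
    return 0;
}
```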
14. Simplicity
- Simplicity is in the eye of the beholder
- MPI-1 has about 125 functions
- Too big!
- Too small!
- MPI-2 has about 150 more
- Even this is not very many by comparison
- Few applications use all of MPI
- But few MPI functions go unused
- One can write serious MPI programs with as few as six functions (see the sketch after this list)
- Other programs with a different six
- Economy of concepts
- Communicators encapsulate both process groups and contexts
- Datatypes both enable heterogeneous communication and allow non-contiguous message buffers
- Symmetry helps make MPI easy to understand.
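As a concrete illustration of the six-function claim, here is a minimal sketch of such a program, using only MPI_Init, MPI_Comm_size, MPI_Comm_rank, MPI_Send, MPI_Recv, and MPI_Finalize. The tag and message contents are arbitrary choices for illustration.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, i, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* process 0 collects one message from every other process */
        for (i = 1; i < size; i++) {
            MPI_Recv(&value, 1, MPI_INT, i, 0, MPI_COMM_WORLD, &status);
            printf("process 0 received %d from process %d\n", value, i);
        }
    } else {
        /* every other process sends its rank to process 0 */
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

Compiled and launched with any MPI implementation (for example, MPICH's mpicc and mpirun), this is a complete parallel program.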
15. Modularity
- Modern applications often combine multiple parallel components.
- MPI supports component-oriented software through its use of communicators (see the sketch below).
- Support of libraries means applications may contain no MPI calls at all.
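A minimal sketch of the communicator idiom behind this modularity: a parallel library duplicates the communicator it is given at initialization, so its internal messages can never collide with the application's. The names lib_handle, lib_init, and lib_finalize are invented for illustration; the MPI_Comm_dup idiom itself is standard practice.

```c
#include <mpi.h>

typedef struct {
    MPI_Comm comm;   /* private communicator: library traffic stays in its
                        own context, separate from application messages */
} lib_handle;

int lib_init(MPI_Comm app_comm, lib_handle *h)
{
    /* All subsequent library communication uses h->comm, never app_comm. */
    return MPI_Comm_dup(app_comm, &h->comm);
}

int lib_finalize(lib_handle *h)
{
    return MPI_Comm_free(&h->comm);
}
```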
16. Composability
- MPI works with other tools
- Compilers
- Since it is a library
- Debuggers
- Debugging interface used by MPICH, TotalView, and others
- Profiling tools
- The MPI profiling interface is part of the standard (see the wrapper sketch below)
- MPI-2 provides precise interaction with multi-threaded programs
- MPI_THREAD_SINGLE
- MPI_THREAD_FUNNELED (OpenMP loops)
- MPI_THREAD_SERIALIZED (OpenMP single)
- MPI_THREAD_MULTIPLE
- The interface provides for both portability and performance
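To illustrate the profiling interface, here is a minimal sketch of a wrapper that times MPI_Barrier by intercepting the call and forwarding it to the name-shifted PMPI_Barrier. The statistics gathered here are an invented example of what a real tool would record.

```c
#include <mpi.h>
#include <stdio.h>

static double barrier_time  = 0.0;   /* time accumulated inside MPI_Barrier */
static long   barrier_calls = 0;

/* The tool's MPI_Barrier replaces the library's; the real work is done by
   the name-shifted PMPI_Barrier required by the MPI profiling interface. */
int MPI_Barrier(MPI_Comm comm)
{
    double t   = MPI_Wtime();
    int    err = PMPI_Barrier(comm);
    barrier_time += MPI_Wtime() - t;
    barrier_calls++;
    return err;
}

/* Report the statistics just before the program shuts MPI down. */
int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: %ld barriers, %.6f s\n", rank, barrier_calls, barrier_time);
    return PMPI_Finalize();
}
```

Linking this file ahead of the MPI library intercepts the calls without any change to the application source, which is the point of the profiling interface.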
17. Completeness
- MPI provides a complete programming model.
- Any parallel algorithm can be expressed.
- Collective operations operate on subsets of processes (see the sketch below).
- Easy things are not always easy, but
- Hard things are possible.
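A minimal sketch of a collective operation on a subset of processes: MPI_Comm_split divides MPI_COMM_WORLD into two halves, and each half performs its own broadcast. The split criterion and broadcast payload are arbitrary choices for illustration.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, color, value;
    MPI_Comm half;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Split criterion: lower half vs. upper half of the world ranks. */
    color = (rank < size / 2) ? 0 : 1;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &half);

    value = rank;                            /* each half's rank 0 supplies the data */
    MPI_Bcast(&value, 1, MPI_INT, 0, half);  /* collective over the subset only */

    printf("world rank %d got %d from its half\n", rank, value);

    MPI_Comm_free(&half);
    MPI_Finalize();
    return 0;
}
```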
18. The Center for the Study of Astrophysical Thermonuclear Flashes
- To simulate matter accumulation on the surface of compact stars, nuclear ignition of the accreted (and possibly underlying stellar) material, and the subsequent evolution of the star's interior, surface, and exterior
- X-ray bursts (on neutron star surfaces)
- Novae (on white dwarf surfaces)
- Type Ia supernovae (in white dwarf interiors)
19. FLASH Scientific Results
- Wide range of compressibility
- Wide range of length and time scales
- Many interacting physical processes
- Only indirect validation possible
- Rapidly evolving computing environment
- Many people in collaboration
(Slide images: flame-vortex interactions, compressible turbulence, laser-driven shock instabilities, nova outbursts on white dwarfs, Richtmyer-Meshkov instability, cellular detonations, helium burning on neutron stars, Rayleigh-Taylor instability. Gordon Bell prize at SC2000.)
20. The FLASH Code: MPI in Action
- Solves complex systems of equations for hydrodynamics and nuclear burning
- Written primarily in Fortran 90
- Uses the Paramesh library for adaptive mesh refinement; Paramesh is implemented with MPI
- I/O (for checkpointing, visualization, and other purposes) done with the HDF5 library, which is implemented with MPI-IO (see the sketch below)
- Debugged with TotalView, using the standard debugger interface
- Tuned with Jumpshot and Vampir, using the MPI profiling interface
- Gordon Bell prize winner in 2000
- Portable to all parallel computing environments (since it uses MPI)
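This is not FLASH's actual I/O code, but a minimal sketch of the MPI-IO layer that HDF5's parallel driver builds on: each process collectively writes its own block of a checkpoint file at a rank-dependent offset. The file name and block size are invented for illustration.

```c
#include <mpi.h>

#define BLOCK 1024   /* doubles per process (arbitrary for this example) */

int main(int argc, char *argv[])
{
    double local[BLOCK];
    int rank, i;
    MPI_File fh;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < BLOCK; i++)      /* fill the local checkpoint data */
        local[i] = rank + i * 1e-6;

    /* Each process writes its block at a rank-dependent byte offset. */
    offset = (MPI_Offset) rank * BLOCK * sizeof(double);

    MPI_File_open(MPI_COMM_WORLD, "ckpt.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, offset, local, BLOCK, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);   /* collective write */
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}
```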
21. FLASH Scaling Runs
22. X-Ray Burst on the Surface of a Neutron Star
23. Showing the AMR Grid
24. MPI Performance Visualization with Jumpshot
- For detailed analysis of parallel program behavior, timestamped events are collected into a log file during the run.
- A separate display program (Jumpshot) aids the user in conducting a post-mortem analysis of program behavior.
- Log files can become large, making it impossible to inspect the entire run at once.
- The FLASH project motivated an indexed file format (SLOG) that uses a preview to select a time of interest and quickly display an interval.
25. Removing Barriers From Paramesh
26. Using Jumpshot
- MPI functions and messages automatically logged
- User-defined states (see the logging sketch below)
- Nested states
- Zooming and scrolling
- Spotting opportunities for optimization
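A minimal sketch of how user-defined states can be logged, assuming the MPE logging library distributed with MPICH; the MPE calls shown are from that library, but exact signatures may vary by version, and the state name, color, log file name, and do_work routine are invented for illustration.

```c
#include <mpi.h>
#include <mpe.h>   /* MPE logging library shipped with MPICH (assumed) */

void do_work(void) { /* placeholder for application computation */ }

int main(int argc, char *argv[])
{
    int ev_begin, ev_end;

    MPI_Init(&argc, &argv);
    MPE_Init_log();

    /* Define a user state named "compute" bounded by two event numbers. */
    ev_begin = MPE_Log_get_event_number();
    ev_end   = MPE_Log_get_event_number();
    MPE_Describe_state(ev_begin, ev_end, "compute", "red");

    MPE_Log_event(ev_begin, 0, "start compute");
    do_work();
    MPE_Log_event(ev_end, 0, "end compute");

    MPE_Finish_log("flash_example");   /* invented log file base name */
    MPI_Finalize();
    return 0;
}
```

The resulting log file can then be viewed in Jumpshot alongside the automatically logged MPI calls and messages.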
27. Future Developments in Parallel Programming: MPI and Beyond
- MPI not perfect
- Any widely-used replacement will have to share the properties that made MPI a success.
- Some directions (in decreasing order of speculativeness)
- Improvements to MPI implementations
- Improvements to the MPI definition
- Continued evolution of libraries
- Research and development for parallel languages
- Further out: radically different programming models for radically different architectures.
28. MPI Implementations
- Implementations beget implementation research
- Datatypes, I/O, memory motion elimination
- On most platforms, better collective operations
- Most MPI implementations build collectives on point-to-point, which is too high-level (see the sketch below)
- Need stream-oriented methods that understand MPI datatypes
- Optimize for new hardware
- In progress for VIA, InfiniBand
- Need more emphasis on collective operations
- Off-loading message processing onto NIC
- Scaling beyond 10,000 processes
- Parallel I/O
- Clusters
- Remote
- Fault-tolerance
- Intercommunicators provide an approach
- Working with multithreading approaches
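To illustrate why building collectives on point-to-point is considered too high-level, here is a sketch of a naive broadcast written that way; naive_bcast is an invented name. A good implementation would replace the sequential sends with a tree algorithm, stream-oriented datatype handling, or hardware support below this layer.

```c
#include <mpi.h>

/* Naive broadcast layered on point-to-point: the root sends the buffer to
   every other process in turn, so the cost grows linearly with the number
   of processes and no platform-specific optimization is possible here. */
int naive_bcast(void *buf, int count, MPI_Datatype type, int root, MPI_Comm comm)
{
    int rank, size, i, err = MPI_SUCCESS;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    if (rank == root) {
        for (i = 0; i < size; i++)        /* O(size) sequential sends */
            if (i != root)
                err = MPI_Send(buf, count, type, i, 0, comm);
    } else {
        err = MPI_Recv(buf, count, type, root, 0, comm, MPI_STATUS_IGNORE);
    }
    return err;
}
```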
29. Improvements to MPI Itself
- Better Remote-memory-access interface
- Simpler for some simple operations
- Atomic fetch-and-increment
- Some minor fixup already in progress
- MPI 2.1
- Building on experience with MPI-2
- Interactions with compilers
30. Libraries and Languages
- General Libraries
- Global Arrays
- PETSc
- ScaLAPACK
- Application-specific libraries
- Most are built on MPI, at least for their portable versions.
31. More Speculative Approaches
- HTMT for Petaflops
- Blue Gene
- PIMS
- MTA
- All will need a programming model that explicitly manages a deep memory hierarchy.
- Exotic + small benefit = dead
32. Summary
- MPI is a successful example of a community defining, implementing, and adopting a standard programming methodology.
- It happened because of the open MPI process, the MPI design itself, and early implementation.
- MPI research continues to refine implementations on modern platforms, and this is the main road ahead.
- Tools that work with MPI programs are thus a good investment.
- MPI provides portability and performance for complex applications on a variety of architectures.