Title: MPI Message Passing Interface
1. MPI: Message Passing Interface
Mehmet Balman, Cmpe 587, December 2001
2. Parallel Computing
- Separate workers or processes
- Interact by exchanging information
- Types of parallel computing
- SIMD (single instruction multiple data)
- SPMD (single program multiple data)
- MPMD (multiple program multiple data)
- Hardware models
- Distributed memory (Paragon, IBM SPx, workstation network)
- Shared memory (SGI Power Challenge, Cray T3D)
3. Communication with other processes
- One sided
- one worker performs transfer of data
- Cooperative
- all parties agree to transfer data
4. What is MPI?
- A message-passing library specification
- Coordinates multiple processors by message passing
- A library of functions and macros that can be used in C and FORTRAN programs
- For parallel computers, clusters, and heterogeneous networks
- Who designed MPI?
- Vendors: IBM, Intel, TMC, Meiko, Cray, Convex, Ncube
- Library writers: PVM, p4, Zipcode, TCGMSG, Chameleon, Express, Linda
- Broad participation
5. Development history (1993-1994)
- Began at Williamsburg Workshop in April, 1992
- Organized at Supercomputing '92 (November)
- Met every six weeks for two days
- Pre-final draft distributed at Supercomputing '93
- Final version of draft in May, 1994
- Public and vendor implementations available
6. Features of MPI
- Point-to-point communication
- blocking, nonblocking
- synchronous, asynchronous
- ready, buffered
- Collective routines
- built-in, user defined
- A large number of data movement routines
- Built-in support for grids and graphs
- 125 functions (MPI is large)
- 6 basic functions (MPI is small)
- Communicators combine context and groups for
message security
7. Example
include "mpi.h" include ltstdio.hgt int main(
int argc, char argv) int rank, size
MPI_Init( argc, argv ) MPI_Comm_rank(
MPI_COMM_WORLD, rank ) MPI_Comm_size(
MPI_COMM_WORLD, size ) printf( "Hello world!
I'm d of d\n", rank, size ) MPI_Finalize()
return 0
8. What happens when an MPI job is run
- The user issues a directive to the operating system which places a copy of the executable program on each processor
- Each processor begins execution of its copy of the executable
- Different processes can execute different statements by branching within the program; typically the branching is based on process rank
- Envelope of a message (control block)
- the rank of the receiver
- the rank of the sender
- a tag
- a communicator
9. Two mechanisms for partitioning message space
- Tags (0-32767)
- Communicators (e.g., MPI_COMM_WORLD)
10. Basic MPI calls
- MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, MPI_Recv
- MPI_Send(start, count, datatype, dest, tag, comm)
- MPI_Recv(start, count, datatype, source, tag, comm, status)
- MPI_Bcast(start, count, datatype, root, comm)
- MPI_Reduce(start, result, count, datatype, operation, root, comm)
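A minimal sketch of how these calls fit together for point-to-point messaging; the tag value (99) and the payload are illustrative choices, not part of the slides:

#include "mpi.h"
#include <stdio.h>

/* Sketch: rank 0 sends one integer to rank 1; run with at least 2 processes. */
int main(int argc, char *argv[])
{
    int rank, value = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                                      /* illustrative payload */
        MPI_Send(&value, 1, MPI_INT, 1, 99, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}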
11. Collective patterns
12. Collective computation operations
Operation Name      Meaning
MPI_MAX             Maximum
MPI_MIN             Minimum
MPI_SUM             Sum
MPI_PROD            Product
MPI_LAND            Logical and
MPI_BAND            Bitwise and
MPI_LOR             Logical or
MPI_BOR             Bitwise or
MPI_LXOR            Logical exclusive or
MPI_BXOR            Bitwise exclusive or
MPI_MAXLOC          Maximum and location of maximum
MPI_MINLOC          Minimum and location of minimum
MPI_Op_create(user_function, commute, &op)   /* commute is true if the operation is commutative */
MPI_Op_free(&op)
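A short sketch of a built-in collective computation; having each rank contribute its own rank number under MPI_SUM is only for illustration:

#include "mpi.h"
#include <stdio.h>

/* Sketch: every rank contributes its rank number; rank 0 receives the sum. */
int main(int argc, char *argv[])
{
    int rank, size, local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    local = rank;                                        /* illustrative local value */
    MPI_Reduce(&local, &global, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks 0..%d = %d\n", size - 1, global);

    MPI_Finalize();
    return 0;
}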
13. User-defined communication groups
A communicator contains a context and a group. A group is just a set of processes.
MPI_Comm_create(oldcomm, group, &newcomm)
MPI_Comm_group(oldcomm, &group)
MPI_Group_free(&group)
MPI_Group_incl, MPI_Group_excl
MPI_Group_range_incl, MPI_Group_range_excl
MPI_Group_union, MPI_Group_intersection
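A sketch of how these calls compose: build a group containing only the even ranks of MPI_COMM_WORLD and create a communicator over it. The helper name make_even_comm and the even-rank selection are assumptions for illustration.

#include "mpi.h"
#include <stdlib.h>

/* Sketch: build a communicator that contains only the even ranks of
 * MPI_COMM_WORLD. On ranks outside the group, MPI_Comm_create returns
 * MPI_COMM_NULL. */
void make_even_comm(MPI_Comm *even_comm)
{
    MPI_Group world_group, even_group;
    int size, i, n = 0, *ranks;

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    ranks = (int *) malloc(size * sizeof(int));
    for (i = 0; i < size; i += 2)
        ranks[n++] = i;                       /* pick ranks 0, 2, 4, ... */

    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Group_incl(world_group, n, ranks, &even_group);
    MPI_Comm_create(MPI_COMM_WORLD, even_group, even_comm);

    MPI_Group_free(&even_group);
    MPI_Group_free(&world_group);
    free(ranks);
}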
14. Non-blocking operations
Non-blocking operations return immediately.
- MPI_Isend(start, count, datatype, dest, tag, comm, request)
- MPI_Irecv(start, count, datatype, source, tag, comm, request)
- MPI_Wait(request, status)
- MPI_Waitall
- MPI_Waitany
- MPI_Waitsome
- MPI_Test( request, flag, status)
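A sketch of the usual pattern: post the non-blocking receive first, do the send, then wait for the receive to complete. The pairing of ranks 0 and 1 is illustrative.

#include "mpi.h"
#include <stdio.h>

/* Sketch: ranks 0 and 1 exchange one integer each without deadlock,
 * because each receive is posted before the matching send. */
int main(int argc, char *argv[])
{
    int rank, other, sendval, recvval;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank < 2) {
        other = 1 - rank;                     /* partner rank: 0 <-> 1 */
        sendval = rank;
        MPI_Irecv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &req);
        MPI_Send(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
        MPI_Wait(&req, &status);              /* receive completes here */
        printf("rank %d got %d\n", rank, recvval);
    }

    MPI_Finalize();
    return 0;
}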
15. Communication modes
- Synchronous mode (MPI_Ssend): the send does not complete until a matching receive has begun.
- Buffered mode (MPI_Bsend): the user supplies a buffer to the system for its use.
- Ready mode (MPI_Rsend): the user guarantees that a matching receive has already been posted.
- Non-blocking versions: MPI_Issend, MPI_Ibsend, MPI_Irsend
int bufsize;
char *buf = malloc(bufsize);          /* bufsize chosen by the user */
MPI_Buffer_attach(buf, bufsize);
...
MPI_Bsend( /* ... same arguments as MPI_Send ... */ );
...
MPI_Buffer_detach(&buf, &bufsize);
16. Datatypes
- Two main purposes
- Heterogeneity --- parallel programs between different processors
- Noncontiguous data --- structures, vectors with non-unit stride
MPI datatype          C datatype
MPI_CHAR              signed char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE              (no C equivalent)
MPI_PACKED            (no C equivalent)
17. Build derived type
void Build_derived_type(INDATA_TYPE* indata, MPI_Datatype* message_type_ptr)
{
    int block_lengths[3];
    MPI_Aint displacements[3];
    MPI_Aint addresses[4];
    MPI_Datatype typelist[3];

    /* Describe the three struct members. */
    typelist[0] = MPI_FLOAT;
    typelist[1] = MPI_FLOAT;
    typelist[2] = MPI_INT;

    /* Each member is a single element. */
    block_lengths[0] = block_lengths[1] = block_lengths[2] = 1;

    /* Compute member displacements relative to the start of the struct. */
    MPI_Address(indata, &addresses[0]);
    MPI_Address(&(indata->a), &addresses[1]);
    MPI_Address(&(indata->b), &addresses[2]);
    MPI_Address(&(indata->n), &addresses[3]);
    displacements[0] = addresses[1] - addresses[0];
    displacements[1] = addresses[2] - addresses[0];
    displacements[2] = addresses[3] - addresses[0];

    /* Build and commit the derived type. */
    MPI_Type_struct(3, block_lengths, displacements, typelist, message_type_ptr);
    MPI_Type_commit(message_type_ptr);
}
18. Other derived data types
- int MPI_Type_contiguous(int count, MPI_Datatype oldtype, MPI_Datatype *newtype)
- elements are contiguous entries in an array
- int MPI_Type_vector(int count, int block_length, int stride, MPI_Datatype element_type, MPI_Datatype *new_type)
- elements are equally spaced entries of an array
- int MPI_Type_indexed(int count, int array_of_blocklengths[], int array_of_displacements[], MPI_Datatype element_type, MPI_Datatype *new_type)
- elements are arbitrary entries of an array
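A sketch of MPI_Type_vector for non-unit stride data: describing one column of a row-major matrix so it can be sent with a single call. The 10x10 dimensions and the helper name send_column are assumptions for illustration.

#include "mpi.h"

#define N 10    /* illustrative matrix dimension */

/* Sketch: send column 'col' of a row-major NxN float matrix to 'dest'. */
void send_column(float A[N][N], int col, int dest, MPI_Comm comm)
{
    MPI_Datatype column_type;

    /* N blocks of 1 float each, N floats apart in memory. */
    MPI_Type_vector(N, 1, N, MPI_FLOAT, &column_type);
    MPI_Type_commit(&column_type);

    MPI_Send(&A[0][col], 1, column_type, dest, 0, comm);

    MPI_Type_free(&column_type);
}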
19. Pack/unpack
void Get_data4(int my_rank, float* a_ptr, float* b_ptr, int* n_ptr)
{
    int root = 0;                  /* rank of the broadcast root */
    char buffer[100];              /* pack buffer */
    int position;                  /* current position in the buffer */

    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%f %f %d", a_ptr, b_ptr, n_ptr);

        /* Pack the data into the buffer. */
        position = 0;
        MPI_Pack(a_ptr, 1, MPI_FLOAT, buffer, 100, &position, MPI_COMM_WORLD);
        MPI_Pack(b_ptr, 1, MPI_FLOAT, buffer, 100, &position, MPI_COMM_WORLD);
        MPI_Pack(n_ptr, 1, MPI_INT, buffer, 100, &position, MPI_COMM_WORLD);

        /* Broadcast the packed buffer. */
        MPI_Bcast(buffer, 100, MPI_PACKED, root, MPI_COMM_WORLD);
    } else {
        MPI_Bcast(buffer, 100, MPI_PACKED, root, MPI_COMM_WORLD);

        /* Unpack the data from the buffer. */
        position = 0;
        MPI_Unpack(buffer, 100, &position, a_ptr, 1, MPI_FLOAT, MPI_COMM_WORLD);
        MPI_Unpack(buffer, 100, &position, b_ptr, 1, MPI_FLOAT, MPI_COMM_WORLD);
        MPI_Unpack(buffer, 100, &position, n_ptr, 1, MPI_INT, MPI_COMM_WORLD);
    }
}
20. Profiling
static int nsend = 0;
int MPI_Send(void *start, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    nsend++;    /* count user-level calls to MPI_Send */
    return PMPI_Send(start, count, datatype, dest, tag, comm);
}
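This works because every MPI routine is also available under the name-shifted entry point PMPI_...; a wrapper library that defines its own MPI_Send can be linked ahead of the MPI library and still reach the real implementation through PMPI_Send, without modifying the application.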
21. Architecture of MPI
- Complex communication operations can be expressed portably in terms of lower-level ones
- All MPI functions are implemented in terms of the macros and functions that make up the ADI (Abstract Device Interface)
- ADI
- specifying a message to be sent or received
- moving data between the API and the message-passing hardware
- managing lists of pending messages (both sent and received)
- providing basic information about the execution environment (e.g., how many tasks there are)
22. Upper layers of MPICH
23. Channel Interface
- Routines for sending and receiving envelope (control) information
- MPID_SendControl (MPID_SendControlBlock)
- MPID_RecvAnyControl
- MPID_ControlMsgAvail
- Send and receive data
- MPID_SendChannel
- MPID_RecvFromChannel
24. Channel Interface: three different data exchange mechanisms
- Eager (default): data is sent to the destination immediately and buffered on the receiver side.
- Rendezvous (MPI_Bsend): data is sent to the destination only when requested.
- Get (shared memory): data is read directly by the receiver.
25. Lower layers of MPICH
26. Summary
- Point-to-point and collective operations
- Blocking
- Non-blocking
- Asynchronous
- Synchronous
- Buffered
- Ready
- Abstraction for processes
- Rank within the group
- Virtual topologies
- Data types
- User-defined
- Predefined
- pack/unpack
- Architecture of MPI
- ADI(Abstract Device Interface)
- Channel Interface