Title: Basic MPI Programming Spring 2005
1Basic MPI Programming Spring 2005
- By Yaohang Li, Ph.D.
- Department of Computer Science
- North Carolina A&T State University
- yaohang@ncat.edu
2Review
- Last Class
- Definition of Parallel Computing System
- Hardware Characteristics
- Kinds of Processors
- Types of Memory Organization
- Flow of Control
- Interconnection Network
- Software Characteristics of Scientific Computing Applications
- Computation-bounded
- Communication-bounded
- Computation and Communication-bounded
- MPI
- Compiling and Running MPI programs
- MPI program Structure
- Primary elements
- Error handling
- Finding out the environment
- This Class
- MPI Programming
3Better Hello (C)
- include "mpi.h"
- include ltstdio.hgt
- int main( int argc, char argv )
-
- int rank, size
- MPI_Init( argc, argv )
- MPI_Comm_rank( MPI_COMM_WORLD, rank )
- MPI_Comm_size( MPI_COMM_WORLD, size )
- printf( "I am d of d\n", rank, size )
- MPI_Finalize()
- return 0
4Better Hello (Fortran)
- program main
- use MPI
- integer ierr, rank, size
- call MPI_INIT( ierr )
- call MPI_COMM_RANK( MPI_COMM_WORLD, rank, ierr )
- call MPI_COMM_SIZE( MPI_COMM_WORLD, size, ierr )
- print *, 'I am ', rank, ' of ', size
- call MPI_FINALIZE( ierr )
- end
5Communicators
- Communicators
- A parameter for most MPI calls
- A collection of processors working on a parallel job
- MPI_COMM_WORLD is defined in the MPI include file as all the processors in your program
- Can create subsets of MPI_COMM_WORLD (see the sketch below)
- Processors within a communicator are assigned numbers 0 to n-1
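The sub-communicators mentioned above can be built with MPI_Comm_split. Here is a minimal sketch, not taken from the slides, that splits MPI_COMM_WORLD into two groups (even and odd world ranks); all variable names are illustrative only.

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int world_rank, world_size, sub_rank, sub_size;
        MPI_Comm sub_comm;   /* illustrative name for the new communicator */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);

        /* Split MPI_COMM_WORLD: even-ranked and odd-ranked processes each
           get their own communicator; ranks are renumbered 0..n-1 inside it. */
        MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &sub_comm);

        MPI_Comm_rank(sub_comm, &sub_rank);
        MPI_Comm_size(sub_comm, &sub_size);
        printf("World rank %d is rank %d of %d in its sub-communicator\n",
               world_rank, sub_rank, sub_size);

        MPI_Comm_free(&sub_comm);
        MPI_Finalize();
        return 0;
    }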
6MPI Datatype
- Data Types
- When sending a message, it is given a data type
- Predefined types correspond to normal types
- MPI_REAL, MPI_FLOAT (Fortran Real and C float)
- MPI_DOUBLE_PRECISION, MPI_DOUBLE (Fortran double precision and C double)
- MPI_INTEGER and MPI_INT (Fortran and C integer)
- Can create user-defined type
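As a quick illustration of a user-defined type, here is a minimal sketch (not part of the original slides) that packs three doubles into one new datatype with MPI_Type_contiguous; treating the three doubles as a "point" is just an assumption for this example. It needs at least two processes to run.

    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        double coords[3] = {1.0, 2.0, 3.0};  /* illustrative data */
        MPI_Datatype point_type;             /* hypothetical user-defined type */
        MPI_Status status;
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Three contiguous doubles become one "point"; commit before use. */
        MPI_Type_contiguous(3, MPI_DOUBLE, &point_type);
        MPI_Type_commit(&point_type);

        if (rank == 0)
            MPI_Send(coords, 1, point_type, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(coords, 1, point_type, 0, 0, MPI_COMM_WORLD, &status);

        MPI_Type_free(&point_type);
        MPI_Finalize();
        return 0;
    }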
7Point-to-Point Communication
- Basic communication in message passing libraries
- Data values are transferred from one process to another
- One process sends the data
- Another receives the data
8MPI_Send and MPI_Recv
- int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
- Input
- buf - initial address of send buffer (choice)
- count - number of elements in send buffer (nonnegative integer)
- datatype - datatype of each send buffer element (handle)
- dest - rank of destination (integer)
- tag - message tag (integer)
- comm - communicator (handle)
- int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
- Output
- buf - initial address of receive buffer
- status - status object, provides information about the message received; status is a structure of type MPI_Status, the element status.MPI_SOURCE is the source of the message received, and the element status.MPI_TAG is the tag value
- Input
- count - maximum number of elements in receive buffer (integer)
- datatype - datatype of each receive buffer element (handle)
- source - rank of source (integer)
- tag - message tag (integer)
- comm - communicator (handle)
9A Simple Send and Receive Program
- #include <stdio.h>
- #include "mpi.h"
- /* This is a simple send/receive program in MPI */
- int main(int argc, char *argv[])
- {
-   int myid;
-   int tag, source, destination, count;
-   int buffer;
-   MPI_Status status;
-   MPI_Init(&argc, &argv);
-   MPI_Comm_rank(MPI_COMM_WORLD, &myid);
-   tag = 1234;
-   source = 0;
-   destination = 1;
-   count = 1;
-   if (myid == source) {
-     buffer = 5678;
-     MPI_Send(&buffer, count, MPI_INT, destination, tag, MPI_COMM_WORLD);
-     printf("processor %d sent %d\n", myid, buffer);
-   }
-   if (myid == destination) {
-     MPI_Recv(&buffer, count, MPI_INT, source, tag, MPI_COMM_WORLD, &status);
-     printf("processor %d got %d\n", myid, buffer);
-   }
-   MPI_Finalize();
-   return 0;
- }
10MPI is Simple
- Many parallel programs can be written using just these six functions, only two of which are non-trivial
- MPI_INIT
- MPI_FINALIZE
- MPI_COMM_SIZE
- MPI_COMM_RANK
- MPI_SEND
- MPI_RECV
- Point-to-point (send/recv) isn't the only way...
11Introduction to Collective Operations in MPI
- Collective operations are called by all processes in a communicator.
- MPI_BCAST distributes data from one process (the root) to all others in a communicator.
- MPI_REDUCE combines data from all processes in a communicator and returns it to one process.
- In many numerical algorithms, SEND/RECEIVE can be replaced by BCAST/REDUCE, improving both simplicity and efficiency.
12MPI_BCAST
- MPI_BCAST( buffer, count, datatype, root, comm )
- INOUT buffer starting address of buffer (choice)
- IN count number of entries in buffer (integer)
- IN datatype data type of buffer (handle)
- IN root rank of broadcast root (integer)
- IN comm communicator (handle)
- C prototype
- int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
- Fortran prototype
- MPI_BCAST(BUFFER, COUNT, DATATYPE, ROOT, COMM, IERROR)
- <type> BUFFER(*)
- INTEGER COUNT, DATATYPE, ROOT, COMM, IERROR
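A minimal usage sketch of MPI_Bcast (not on the slide): rank 0 owns an integer and broadcasts it to every process in MPI_COMM_WORLD, the same pattern the PI example on the later slides uses for n.

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int myid, value = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        if (myid == 0)
            value = 100;          /* only the root has the data initially */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("process %d now has value %d\n", myid, value);

        MPI_Finalize();
        return 0;
    }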
13MPI_REDUCE
- MPI_REDUCE( sendbuf, recvbuf, count, datatype, op, root, comm )
- IN sendbuf address of send buffer (choice)
- OUT recvbuf address of receive buffer (choice, significant only at root)
- IN count number of elements in send buffer (integer)
- IN datatype data type of elements of send buffer (handle)
- IN op reduce operation (handle)
- Predefined operations include MPI_MAX, MPI_MIN and MPI_SUM
- IN root rank of root process (integer)
- IN comm communicator (handle)
- C Prototype
- int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
- Fortran Prototype
- MPI_REDUCE(SENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM, IERROR)
- <type> SENDBUF(*), RECVBUF(*)
- INTEGER COUNT, DATATYPE, OP, ROOT, COMM, IERROR
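A minimal usage sketch of MPI_Reduce (not on the slide): every process contributes its own rank, and MPI_SUM combines the contributions at the root (rank 0).

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int myid, sum = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        /* recvbuf (sum) is significant only at the root */
        MPI_Reduce(&myid, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (myid == 0)
            printf("sum of all ranks = %d\n", sum);

        MPI_Finalize();
        return 0;
    }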
14Example PI in C - 1
- /* Example program to calculate the value of pi by integrating f(x) = 4 / (1 + x^2). */
- #include "mpi.h"
- #include <math.h>
- #include <stdio.h>
- int main(int argc, char *argv[])
- {
-   int done = 0, n, myid, numprocs, i, rc;
-   double PI25DT = 3.141592653589793238462643;
-   double mypi, pi, h, sum, x, a;
-   MPI_Init(&argc, &argv);
-   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
-   MPI_Comm_rank(MPI_COMM_WORLD, &myid);
-   if (myid == 0) {
-     printf("Enter the number of intervals: ");
-     scanf("%d", &n);
-   }
-   MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
-   if (n == 0) {
-     MPI_Finalize();
-     return 0;
-   }
15Example PI in C - 2
-   h = 1.0 / (double) n;
-   sum = 0.0;
-   for (i = myid + 1; i <= n; i += numprocs) {
-     x = h * ((double)i - 0.5);
-     sum += 4.0 / (1.0 + x*x);
-   }
-   mypi = h * sum;
-   MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
-   if (myid == 0)
-     printf("pi is approximately %.16f, Error is %.16f\n", pi, fabs(pi - PI25DT));
-   MPI_Finalize();
-   return 0;
- }
16Timing
- MPI Timing
- Performance Evaluation
- Debugging
- MPI Prototype
- MPI_WTIME()
- returns a floating-point number of seconds
- wall-clock time
- C: double MPI_Wtime(void)
- Fortran: DOUBLE PRECISION MPI_WTIME()
17PI Program with Timing (I)
- #include <mpi.h>
- #include <stdio.h>
- #include <math.h>
- int main(int argc, char *argv[])
- {
-   int done = 0, n, myid, numprocs, i, rc;
-   double PI25DT = 3.141592653589793238462643;
-   double mypi, pi, h, sum, x, a;
-   double mytime;
-   char str[100];
-   MPI_Init(&argc, &argv);
-   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
-   MPI_Comm_rank(MPI_COMM_WORLD, &myid);
-   if (myid == 0) {
-     printf("Enter the number of intervals: ");
-     scanf("%d", &n);
-   }
18PI Program with Timing (II)
-   MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
-   mytime = MPI_Wtime();
-   h = 1.0 / (double) n;
-   sum = 0.0;
-   for (i = myid + 1; i <= n; i += numprocs) {
-     x = h * ((double)i - 0.5);
-     sum += 4.0 / (1.0 + x*x);
-   }
-   mypi = h * sum;
-   mytime = MPI_Wtime() - mytime;
-   fprintf(stderr, "Computation time at process %d: %f\n", myid, mytime);
-   MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
-   if (myid == 0)
-     printf("pi is approximately %.16f, Error is %.16f\n", pi, fabs(pi - PI25DT));
-   MPI_Finalize();
-   return 0;
- }
19Other MPI Functions
- MPI_GATHER
- MPI_SCATTER
- MPI_ALLGATHER
- MPI_ALLREDUCE
- Non-blocking communications
- Group Communications
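Since the slide only names these routines, here is a minimal sketch (an assumed example, not from the slides) of non-blocking communication: MPI_Isend and MPI_Irecv start the transfer, MPI_Wait completes it, and useful work could overlap in between. It needs at least two processes.

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int myid, inbuf = 0, outbuf;
        MPI_Request request;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        outbuf = myid;

        if (myid == 0) {
            MPI_Irecv(&inbuf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
            /* ... other computation could overlap with the transfer here ... */
            MPI_Wait(&request, &status);
            printf("process 0 received %d\n", inbuf);
        } else if (myid == 1) {
            MPI_Isend(&outbuf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
            MPI_Wait(&request, &status);
        }

        MPI_Finalize();
        return 0;
    }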
20Advantages of MPI Programming
- Universality
- The message-passing model fits well on separate processors connected by a fast or slow network
- Matches the hardware of most of today's parallel supercomputers as well as networks of workstations (NOW)
- Expressivity
- Message passing has been found to be a useful and complete model in which to express parallel algorithms
- Ease of Debugging
- Debugging of parallel programs remains a challenging research area
- Debugging is easier in the MPI paradigm than in the shared-memory paradigm
21Summary
- MPI Programming
- MPI Communicator
- MPI Point-to-Point Communication
- MPI Collective Communication
- MPI Timing
- Advantage of MPI
22What I want you to do
- Review Slides
- Read the UNIX handbook if you are not familiar with UNIX
- Read the introduction to MPI
- Work on your Assignment 1