Title: Message Passing Interface
1 Message Passing Interface
- Outline
- Introduction to the message passing library (MPI)
- Basics of MPI implementation (blocking communication)
- Basic input and output of data
- Basic non-blocking communication
2 Introduction
- Basic concept of message passing
- Most commonly used method of programming distributed-memory MIMD systems
- In message passing, the processes coordinate their activities by explicitly sending and receiving messages
3 Message Passing Interface
4 Introduction to MPI
- Message Passing Interface (MPI)
- Commonly used message passing library, which statically allocates processes (the number of processes is set at the beginning of program execution, and no additional processes are created during execution)
- Each process is assigned a unique integer rank in the range 0, 1, ..., p-1 (p is the total number of processes defined)
- Basically, one can write a single program and execute it on different processes (SPMD)
5 Introduction to MPI
- Message Passing Interface (MPI)
- Selective execution is based on conditional branches within the source code (see the sketch below)
- Buffering in communication
- Blocking and non-blocking communication
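A minimal sketch of such rank-based selective execution (the printed strings are illustrative):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char* argv[]) {
        int my_rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

        if (my_rank == 0)
            printf("I am the master\n");          /* executed only by process 0 */
        else
            printf("I am worker %d\n", my_rank);  /* executed by all other processes */

        MPI_Finalize();
        return 0;
    }

Every process runs the same executable; the if/else on the rank decides which statements each process actually executes.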
6 Introduction to MPI
- A parallel computing utility library of subroutines/functions, not an independent language
- MPI subroutines and functions can be called from Fortran and C, respectively
- Compiled with FORTRAN or C compilers
- MPI-1 doesn't support F90, but MPI-2 supports F90 and C++
7 Introduction to MPI (cont.)
- Why do people use MPI?
- To speed up computation
- Big demands on CPU time and memory
- More portable and scalable than using an automatic "parallelizer", which might not work
- Good for distributed-memory computers, such as distributed clusters, network-based computers, or workstations
8 Introduction to MPI (cont.)
- Why are people afraid of MPI?
- More complicated than serial computing
- More complicated to master the technique
- Synchronization can be lost
- The amount of time required to convert serial code to parallelized code
9 Introduction to MPI (cont.)
- Alternative ways?
- Data parallel model using a high-level language such as HPF
- Advanced libraries (or interfaces), such as the Portable, Extensible Toolkit for Scientific Computation (PETSc)
- Java multithreaded computing for internet-based distributed computation
10 Basics of MPI
- The MPI header file should be included in the user's FORTRAN or C code. The header file contains definitions of constants and prototypes:
      include 'mpif.h'     (for FORTRAN code)
      #include "mpi.h"     (for C code)
11 Basics of MPI
- MPI is initiated by calling MPI_Init() first, before invoking any other MPI subroutines or functions.
- MPI processing ends with a call to MPI_Finalize().
12 Basics of MPI
- The only difference between MPI subroutines (for FORTRAN) and MPI functions (for C) is the error reporting flag.
- In FORTRAN, it is returned as the last member of the subroutine's argument list. In C, the integer error flag is returned through the function return value.
13 Basics of MPI
- Consequently, MPI FORTRAN subroutines always contain one more variable in the argument list than their C counterparts. (Compare the two calls below.)
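For example, the same rank query in the two languages; only the error-flag convention differs:

    C:        ierr = MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);    /* flag is the return value */
    FORTRAN:  call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierr)  ! flag is the extra last argument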
14 Basics of MPI (cont.)
- C MPI function names start with MPI_ followed by a character string, with the leading character in upper case and the rest in lower case
- FORTRAN subroutines bear the same names but are case-insensitive
- On SGI's Origin 2000 (NCSA), parallel I/O is supported
15 Compilation and Execution (f77)
- To compile and execute an f77 (or f90) code without MPI:
      f77 -o example example.f
      f90 -o example example.f
      /bin/time example     (or: time example)
16 - To compile and execute an f77 (or f90) code with MPI:
      f77 -o example1_1 example1_1.f -lmpi
      g77 -o example1_1 example1_1.f -lmpi
      f90 -o example1_1 example1_1.f -lmpi
      mpif77 -o example1_1 example1_1.f     (our cluster)
      mpif90 -o example1_1 example1_1.f     (our cluster)
      /bin/time mpirun -np 4 example1_1
      time mpirun -np 4 example1_1
17 - To compile and execute a C code without MPI:
      gcc -o exampleC exampleC.c -lm
      (or: cc -o exampleC exampleC.c -lm)
      exampleC
18 - To compile and execute a C code with MPI:
      cc -o exampleC1_1 exampleC1_1.c -lm -lmpi
      gcc -o exampleC1_1 exampleC1_1.c -lm -lmpi
      mpicc exampleC1_1.c     (our cluster)
  Execution:
      /bin/time mpirun -np 10 exampleC1_1
      time mpirun -np 10 exampleC1_1
19 Basic communication among processes
- Example 0: basic communication between processes
- p processes, numbered from 0 to p-1
- Process 0 receives a message from each of the other processes
[Diagram: processes 1, 2, and 3 each send a message to process 0]
20 Learning MPI by Examples
- Example 0: mechanism
- The system copies the executable code to each process
- Each process begins execution of the copied executable code, simultaneously
- Different processes can execute different statements by branching within the program based on their ranks (this form of MIMD programming is called single-program multiple-data (SPMD) programming)
21 /* greetings.c -- greetings program
 *
 * Send a message from all processes with rank != 0 to process 0.
 * Process 0 prints the messages received.
 *
 * Input: none.
 * Output: contents of messages received by process 0.
 */
#include <stdio.h>
#include <string.h>
#include "mpi.h"     /* include MPI library */
22 Passing command-line parameters to the main function
main(int argc, char* argv[]) {
    int         my_rank;       /* rank of process           */
    int         p;             /* number of processes       */
    int         source;        /* rank of sender            */
    int         dest;          /* rank of receiver          */
    int         tag = 0;       /* tag for messages          */
    char        message[100];  /* storage for message       */
    MPI_Status  status;        /* return status for receive */

    /* Start up MPI */
    MPI_Init(&argc, &argv);
23 Obtain the rank number
    /* Find out process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    printf("my_rank is %d\n", my_rank);

    /* Find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    printf("p, the total number of processes %d\n", p);

    if (my_rank != 0) {   /* other processes, but not process 0 */
        /* Create message */
        sprintf(message, "Greetings from process %d!", my_rank);
        dest = 0;         /* destination to which the message is sent */
24      /* Use strlen+1 so that '\0' gets transmitted */
        MPI_Send(message, strlen(message)+1, MPI_CHAR,
                 dest, tag, MPI_COMM_WORLD);
    } else {              /* my_rank == 0, process 0 */
        for (source = 1; source < p; source++) {
            MPI_Recv(message, 100, MPI_CHAR, source, tag,
                     MPI_COMM_WORLD, &status);
            printf("%s\n", message);
        }
    }
25 Learning MPI by Examples
    /* Shut down MPI */
    MPI_Finalize();
} /* main */

Commands:
    mpicc greetings.c
    mpirun -np 8 a.out
26 Result: mpicc greetings.c; mpirun -np 8 a.out
my_rank is 3
p, the total number of processes 8
my_rank is 4
p, the total number of processes 8
my_rank is 0
p, the total number of processes 8
my_rank is 1
p, the total number of processes 8
Greetings from process 1!
my_rank is 2
27 p, the total number of processes 8
my_rank is 7
p, the total number of processes 8
Greetings from process 2!
Greetings from process 3!
my_rank is 5
p, the total number of processes 8
Greetings from process 4!
Greetings from process 5!
my_rank is 6
p, the total number of processes 8
Greetings from process 6!
Greetings from process 7!
28 c  greetings.f -- greetings program
c
c  Send a message from all processes with rank .ne. 0 to process 0.
c  Process 0 prints the messages received.
c
c  Input: none.
c  Output: contents of messages received by process 0.
c
c  Note: Due to the differences between character data in Fortran
c  and char in C, there may be problems in MPI_Send/MPI_Recv.
c
29    program greetings
c
      include 'mpif.h'
c
      integer my_rank
      integer p
      integer source
      integer dest
      integer tag
      character*100 message
      character*10  digit_string
      integer size
      integer status(MPI_STATUS_SIZE)
      integer ierr
c
30 c  function
      integer string_len
c
      call MPI_Init(ierr)
c
      call MPI_Comm_rank(MPI_COMM_WORLD, my_rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, p, ierr)
c
      if (my_rank .ne. 0) then
          call to_string(my_rank, digit_string, size)
          message = 'Greetings from process ' //
     +        digit_string(1:size) // '!'
          dest = 0
          tag = 0
          call MPI_Send(message, string_len(message), MPI_CHARACTER,
     +        dest, tag, MPI_COMM_WORLD, ierr)
      else
31        do 200 source = 1, p-1
              tag = 0
              call MPI_Recv(message, 100, MPI_CHARACTER, source,
     +            tag, MPI_COMM_WORLD, status, ierr)
              call MPI_Get_count(status, MPI_CHARACTER, size, ierr)
              write(6,100) message(1:size)
 100          format(' ',a)
 200      continue
      endif
c
      call MPI_Finalize(ierr)
      stop
      end
c
32 cccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c
c  Converts the integer stored in number into an ascii
c  string.  The string is returned in string.  The number
c  of digits is returned in size.
      subroutine to_string(number, string, size)
      integer number
      character*(*) string
      integer size
      character*100 temp
      integer local
      integer last_digit
      integer i

      local = number
      i = 0
33 c  strip digits off, starting with the least significant
c  do-while loop
 100  last_digit = mod(local,10)
      local = local/10
      i = i + 1
      temp(i:i) = char(last_digit + ichar('0'))
      if (local .ne. 0) go to 100
      size = i
c  reverse digits
      do 200 i = 1, size
          string(size-i+1:size-i+1) = temp(i:i)
 200  continue
c
      return
      end
34 c  end of to_string
c
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c  Finds the number of characters stored in a string
c
      integer function string_len(string)
      character*(*) string
c
      character*1 space
      parameter (space = ' ')
      integer i
c
      i = len(string)
35 c  while loop
 100  if ((string(i:i) .eq. space) .and. (i .gt. 1)) then
          i = i - 1
          go to 100
      endif
c
      if ((i .eq. 1) .and. (string(i:i) .eq. space)) then
          string_len = 0
      else
          string_len = i
      endif
c
      return
      end
c  end of string_len
36    mpif77 greetings.f
      mpirun -np 8 a.out
37 - It is not necessary to call the MPI_Init function at the very beginning of your code.
- It is not necessary to call the MPI_Finalize function at the very end of your code.
- The MPI section needs to be inserted only where you need the code to run in parallel (see the sketch below).
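A minimal sketch of this layout (the printed strings are illustrative):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char* argv[]) {
        int my_rank;

        /* Non-MPI work may precede MPI_Init; no MPI routine
           may be called before this point. */
        printf("before MPI_Init\n");

        MPI_Init(&argc, &argv);                  /* parallel section starts */
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        printf("parallel work on process %d\n", my_rank);
        MPI_Finalize();                          /* parallel section ends   */

        /* Non-MPI work may follow MPI_Finalize; no MPI routine
           may be called after this point. */
        printf("after MPI_Finalize\n");
        return 0;
    }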
38 Numerical Integration
- Example 1: numerical integration using the mid-point method
- Mathematical problem
- Numerical method
- Serial programming and parallel programming
39 - Problem
- Test: integration of cos(x) from 0 to π/2
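For reference, the exact answer, and the mid-point (rectangle) rule as it is used in the Fortran example later in this section:

    \int_0^{\pi/2} \cos x \, dx = \sin(\pi/2) - \sin(0) = 1

    \int_a^b f(x)\, dx \approx \sum_{j=0}^{N-1} f\!\left(a + \left(j + \tfrac{1}{2}\right)h\right) h,
    \qquad h = \frac{b-a}{N}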
41 Example of a C serial program
/* serial.c -- serial version of the trapezoidal rule
 *
 * Calculate a definite integral using the trapezoidal rule.
 * The function f(x) is hardwired.
 * Input: a, b, n.
 * Output: estimate of the integral from a to b of f(x)
 *         using n trapezoids.
 *
 * See Chapter 4, pp. 53 ff. in PPMPI.
 */
#include <stdio.h>
42 main() {
    float  integral;   /* Store result in integral */
    float  a, b;       /* Left and right endpoints */
    int    n;          /* Number of trapezoids     */
    float  h;          /* Trapezoid base width     */
    float  x;
    int    i;

    float f(float x);  /* Function we're integrating */

    printf("Enter a, b, and n\n");
    scanf("%f %f %d", &a, &b, &n);
43
    h = (b-a)/n;
    integral = (f(a) + f(b))/2.0;
    x = a;
    for (i = 1; i <= n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;

    printf("With n = %d trapezoids, our estimate\n", n);
    printf("of the integral from %f to %f = %f\n", a, b, integral);
}
44 float f(float x) {
    float return_val;

    /* Calculate f(x).  Store calculation in return_val. */
    return_val = x*x;
    return return_val;
}
45 Example of serial code in Fortran
C  serial.f -- calculate definite integral using trapezoidal rule.
C
C  The function f(x) is hardwired.
C  Input: a, b, n.
C  Output: estimate of integral from a to b of f(x)
C          using n trapezoids.
C
C  See Chapter 4, pp. 53 ff. in PPMPI.
C
      PROGRAM serial
      real integral
      real a
      real b
46    integer n
      real h
      real x
      integer i
C
      real f
C
      print *, 'Enter a, b, and n'
      read *, a, b, n
C
      h = (b-a)/n
      integral = (f(a) + f(b))/2.0
      x = a
      do 100 i = 1, n-1
          x = x + h
          integral = integral + f(x)
 100  continue
47    integral = integral*h
C
      print *, 'With n = ', n, ' trapezoids, our estimate'
      print *, 'of the integral from ', a, ' to ', b, ' = ', integral
      end
C
C
      real function f(x)
      real x
C  Calculate f(x).
C
      f = x*x
      return
      end
48 - To compile and execute serial.f:
      g77 -o serial serial.f
      serial
- Result:
      The result = 1.000000
      real 0.021  user 0.002  sys 0.013
49 - Parallel programming with MPI: blocking Send/Receive
- This version is implementation-dependent because the inputs are assigned (hardwired) in the code rather than read in
- Uses the following MPI functions:
  - MPI_Init and MPI_Finalize
  - MPI_Comm_rank
  - MPI_Comm_size
  - MPI_Recv
  - MPI_Send
50 - Parallel programming with MPI: blocking Send/Receive
- The master process receives each partial result, based on subinterval integration, from the other processes
- The master sums all of the sub-results together
- The other processes are idle during the master's work (due to blocking communication)
51 Example of parallel programming in C (trap.c)
/* trap.c -- Parallel Trapezoidal Rule, first version
 *
 * Input: None.
 * Output: Estimate of the integral from a to b of f(x)
 *         using the trapezoidal rule and n trapezoids.
 *
 * Algorithm:
 *    1. Each process calculates "its" interval of integration.
 *    2. Each process estimates the integral of f(x)
 *       over its interval using the trapezoidal rule.
 *    3a. Each process != 0 sends its integral to process 0.
 *    3b. Process 0 sums the calculations received from
 *        the individual processes and prints the result.
52  * Notes:
 *    1. f(x), a, b, and n are all hardwired.
 *    2. The number of processes (p) should evenly divide
 *       the number of trapezoids (n = 1024).
 *
 * See Chap. 4, pp. 56 ff. in PPMPI.
 */
#include <stdio.h>

/* We'll be using MPI routines, definitions, etc. */
#include "mpi.h"
53 main(int argc, char* argv[]) {
    int    my_rank;       /* My process rank           */
    int    p;             /* The number of processes   */
    float  a = 0.0;       /* Left endpoint             */
    float  b = 1.0;       /* Right endpoint            */
    int    n = 1024;      /* Number of trapezoids      */
    float  h;             /* Trapezoid base length     */

    /* local_a and local_b are the bounds of the
       integration performed by each individual process */
    float  local_a;       /* Left endpoint my process  */
    float  local_b;       /* Right endpoint my process */
    int    local_n;       /* Number of trapezoids for  */
                          /* my calculation            */
    float  integral;      /* Integral over my interval */
54  float  total;         /* Total integral            */
    int    source;        /* Process sending integral  */
    int    dest = 0;      /* All messages go to 0      */
    int    tag = 0;
    MPI_Status status;

    /* Trap function prototype.  Trap is used to
       calculate the local integral */
    float Trap(float local_a, float local_b, int local_n,
               float h);

    /* Let the system do what it needs to start up MPI */
    MPI_Init(&argc, &argv);

    /* Get my process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
55  /* Find out how many processes are being used */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    h = (b-a)/n;      /* h is the same for all processes */
    local_n = n/p;    /* So is the number of trapezoids  */

    /* Length of each process' interval of
       integration = local_n*h.  So my interval
       starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);

    if (my_rank == 0) {
        /* Add up the integrals calculated by each process */
        total = integral;   /* this is the integral calculated by process 0 */
56      for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        printf("The integral calculated by process %d is %f\n",
               my_rank, integral);
        MPI_Send(&integral, 1, MPI_FLOAT, dest, tag,
                 MPI_COMM_WORLD);
    }
57  /* Print the result */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n", a, b, total);
    }

    /* Shut down MPI */
    MPI_Finalize();
} /* main */
58 float Trap(
        float  local_a  /* in */,
        float  local_b  /* in */,
        int    local_n  /* in */,
        float  h        /* in */) {

    float integral;    /* Store result in integral */
    float x;
    int   i;

    float f(float x);  /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
    x = local_a;
59  for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
} /* Trap */

float f(float x) {
    float return_val;

    /* Calculate f(x). */
    /* Store calculation in return_val. */
    return_val = x*x;
    return return_val;
} /* f */
60 - To compile a C code with the MPI library:
      cc -o trap trap.c -lmpi -lm
- In our cluster system:
      mpicc trap.c
      mpirun -np 8 a.out
61 With n = 1024 trapezoids, our estimate
of the integral from 0.000000 to 1.000000 = 0.333333
The integral calculated by process 3 is 0.024089
The integral calculated by process 4 is 0.039714
The integral calculated by process 7 is 0.110026
The integral calculated by process 5 is 0.059245
The integral calculated by process 1 is 0.004557
The integral calculated by process 2 is 0.012370
The integral calculated by process 6 is 0.082682
62 - Example of parallel programming in Fortran (trap.f)
c  trap.f -- Parallel Trapezoidal Rule, first version
c
c  Input: None.
c  Output: Estimate of the integral from a to b of f(x)
c      using the trapezoidal rule and n trapezoids.
c
c  Algorithm:
c      1. Each process calculates "its" interval of
c         integration.
c      2. Each process estimates the integral of f(x)
c         over its interval using the trapezoidal rule.
c      3a. Each process .ne. 0 sends its integral to 0.
c      3b. Process 0 sums the calculations received from
63 c        the individual processes and prints the result.
c
c  Notes:
c      1. f(x), a, b, and n are all hardwired.
c      2. Assumes the number of processes (p) evenly divides
c         the number of trapezoids (n = 1024).
c
c  See Chap. 4, pp. 56 ff. in PPMPI.
c
      program trapezoidal
c
      include 'mpif.h'
c
      integer my_rank
      integer p
      real    a
64    real    b
      integer n
      real    h
      real    local_a
      real    local_b
      integer local_n
      real    integral
      real    total
      integer source
      integer dest
      integer tag
      integer status(MPI_STATUS_SIZE)
      integer ierr
c
      real    Trap
c
65    data a, b, n, dest, tag /0.0, 1.0, 1024, 0, 0/

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, p, ierr)

      h = (b-a)/n
      local_n = n/p
      local_a = a + my_rank*local_n*h
      local_b = local_a + local_n*h
      integral = Trap(local_a, local_b, local_n, h)

      if (my_rank .EQ. 0) then
          total = integral
66        do 100 source = 1, p-1
              call MPI_RECV(integral, 1, MPI_REAL, source, tag,
     +            MPI_COMM_WORLD, status, ierr)
              total = total + integral
 100      continue
      else
          call MPI_SEND(integral, 1, MPI_REAL, dest,
     +        tag, MPI_COMM_WORLD, ierr)
      endif

      if (my_rank .EQ. 0) then
          write(6,200) n
 200      format(' ','With n = ',I4,' trapezoids, our estimate')
          write(6,300) a, b, total
 300      format(' ','of the integral from ',f6.2,' to ',f6.2,
     +        ' = ',f11.5)
      endif
67    call MPI_FINALIZE(ierr)
      end
c
c
      real function Trap(local_a, local_b, local_n, h)
      real    local_a
      real    local_b
      integer local_n
      real    h
c
      real    integral
      real    x
      integer i
c
      real    f
68    integral = (f(local_a) + f(local_b))/2.0
      x = local_a
      do 100 i = 1, local_n-1
          x = x + h
          integral = integral + f(x)
 100  continue
      Trap = integral*h
      return
      end
c
      real function f(x)
      real x
      real return_val

      return_val = x*x
      f = return_val
      return
      end
69 - Example of parallel programming in Fortran (trap.f)
- Result:
      With n = 1024 trapezoids, our estimate
      of the integral from 0.00 to 1.00 = 0.33333
- To compile an f77 code with the MPI library:
      f77 -o trap trap.f -lmpi
- In our cluster system:
      mpif77 trap.f
      mpirun -np 8 trap
70 - Basic mechanism of message passing through buffering
- Compose a message and put it in a buffer
- Drop the message in a "mailbox", done by calling MPI_Send
- The destination address must be specified
- An envelope must be created, which contains the destination of the message and the size of the message; the source process is also added to the envelope
- Tags (message types) are standard in message passing
- A tag is used to identify what the process should do with the data
71 - The message envelope contains at least the following information:
  - The rank of the receiver
  - The rank of the sender
  - A tag, like a project identification
  - A communicator: a collection of processes that can send messages to each other. The predefined MPI_COMM_WORLD, available on all MPI systems, consists of all the processes running when execution of the program starts.
- "Message" refers to the actual data being transmitted
- Status holds information on the data that was actually received (see the sketch below)
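A sketch of how the receiver can inspect the envelope after MPI_Recv, using the standard status fields and MPI_Get_count (run with at least two processes; the tag value 7 and the message text are illustrative):

    #include <stdio.h>
    #include <string.h>
    #include "mpi.h"

    int main(int argc, char* argv[]) {
        int        my_rank, count;
        char       buf[100];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

        if (my_rank == 1) {
            sprintf(buf, "hello");
            MPI_Send(buf, strlen(buf)+1, MPI_CHAR, 0, 7, MPI_COMM_WORLD);
        } else if (my_rank == 0) {
            /* Accept a message from any sender with any tag,
               then read the envelope back from status */
            MPI_Recv(buf, 100, MPI_CHAR, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            MPI_Get_count(&status, MPI_CHAR, &count);  /* size actually received */
            printf("source = %d, tag = %d, count = %d\n",
                   status.MPI_SOURCE, status.MPI_TAG, count);
        }

        MPI_Finalize();
        return 0;
    }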
72 - MPI datatypes (and the corresponding C types):
- MPI_CHAR: signed char
- MPI_SHORT: signed short int
- MPI_INT: signed int
- MPI_LONG: signed long int
- MPI_UNSIGNED_CHAR: unsigned char
- MPI_UNSIGNED_SHORT: unsigned short int
- MPI_UNSIGNED: unsigned int
- MPI_UNSIGNED_LONG: unsigned long int
- MPI_FLOAT: float
- MPI_DOUBLE: double
- MPI_LONG_DOUBLE: long double
- MPI_BYTE
- MPI_PACKED
73 int MPI_Send(
        void*         message   /* in */,
        int           count     /* in */,
        MPI_Datatype  datatype  /* in */,
        int           dest      /* in */,
        int           tag       /* in */,
        MPI_Comm      comm      /* in */)

int MPI_Recv(
        void*         message   /* out */,
        int           count     /* in  */,
        MPI_Datatype  datatype  /* in  */,
        int           source    /* in  */,
        int           tag       /* in  */,
        MPI_Comm      comm      /* in  */,
        MPI_Status*   status    /* out */)
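A minimal sketch filling in these argument lists, with every process other than 0 sending one float to process 0 (the per-process value is illustrative):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char* argv[]) {
        int        my_rank, p, source;
        float      value;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        if (my_rank != 0) {
            value = 0.5f * my_rank;   /* some per-process result */
            /* message, count, datatype, dest, tag, comm */
            MPI_Send(&value, 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD);
        } else {
            for (source = 1; source < p; source++) {
                /* message, count, datatype, source, tag, comm, status */
                MPI_Recv(&value, 1, MPI_FLOAT, source, 0,
                         MPI_COMM_WORLD, &status);
                printf("received %f from process %d\n", value, source);
            }
        }

        MPI_Finalize();
        return 0;
    }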
74 - Parallel programming with MPI: non-blocking Send/Receive
- Does not leave processes idle
- Uses the following MPI functions:
  - MPI_Init and MPI_Finalize
  - MPI_Comm_rank
  - MPI_Comm_size
  - MPI_Recv
  - MPI_Isend (see the sketch below)
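The basic non-blocking pattern, as used in the examples later in this section: MPI_Isend starts the send and returns immediately, and MPI_Wait blocks until the send has completed and the buffer may be reused. A minimal sketch (run with at least two processes):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char* argv[]) {
        int         my_rank;
        float       value = 1.0f;
        MPI_Request req;
        MPI_Status  status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

        if (my_rank == 1) {
            /* Start the send; the call returns without waiting */
            MPI_Isend(&value, 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &req);
            /* ... other useful work could overlap the communication here ... */
            MPI_Wait(&req, &status);  /* complete the send before reusing value */
        } else if (my_rank == 0) {
            MPI_Recv(&value, 1, MPI_FLOAT, 1, 0, MPI_COMM_WORLD, &status);
            printf("received %f\n", value);
        }

        MPI_Finalize();
        return 0;
    }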
75 - Basic input and output in MPI
- Global and local variables:
  - Some variables are significant on all the processes
  - Some variables are significant on individual processes
- I/O on a parallel system:
  - Many parallel systems provide standard I/O (keyboard input and terminal output) on process 0
  - Some systems allow all the processes to read and write
- How do we deal with input?
76 - If we want to input values such as a, b, and n from the keyboard, should we just add
      scanf("%f %f %d", &a, &b, &n);  ?
- Usually we assume only process 0 can read and write
- Modified parallel code:
77 /* get_data.c -- Parallel Trapezoidal Rule, uses a basic
 *     Get_data function for input.
 *
 * Input:  a, b: limits of integration.
 *         n: number of trapezoids.
 * Output: Estimate of the integral from a to b of f(x)
 *         using the trapezoidal rule and n trapezoids.
 *
 * Notes:
 *    1. f(x) is hardwired.
 *    2. Assumes the number of processes (p) evenly divides
 *       the number of trapezoids (n).
 *
 * See Chap. 4, pp. 60 ff in PPMPI.
 */
78 #include <stdio.h>

/* We'll be using MPI routines, definitions, etc. */
#include "mpi.h"

main(int argc, char* argv[]) {
    int    my_rank;   /* My process rank           */
    int    p;         /* The number of processes   */
    float  a;         /* Left endpoint             */
    float  b;         /* Right endpoint            */
    int    n;         /* Number of trapezoids      */
    float  h;         /* Trapezoid base length     */
    float  local_a;   /* Left endpoint my process  */
    float  local_b;   /* Right endpoint my process */
    int    local_n;   /* Number of trapezoids for  */
                      /* my calculation            */
79  float  integral;  /* Integral over my interval */
    float  total;     /* Total integral            */
    int    source;    /* Process sending integral  */
    int    dest = 0;  /* All messages go to 0      */
    int    tag = 0;
    MPI_Status status;

    /* function prototypes */
    void Get_data(float* a_ptr, float* b_ptr,
                  int* n_ptr, int my_rank, int p);
    float Trap(float local_a, float local_b, int local_n,
               float h);   /* Calculate local integral */

    /* Let the system do what it needs to start up MPI */
    MPI_Init(&argc, &argv);
80  /* Get my process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Find out how many processes are being used */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    Get_data(&a, &b, &n, my_rank, p);

    h = (b-a)/n;      /* h is the same for all processes */
    local_n = n/p;    /* So is the number of trapezoids  */

    /* Length of each process' interval of
       integration = local_n*h.  So my interval
       starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);
81  /* Add up the integrals calculated by each process */
    if (my_rank == 0) {
        total = integral;
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        MPI_Send(&integral, 1, MPI_FLOAT, dest, tag,
                 MPI_COMM_WORLD);
    }
82  /* Print the result */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n",
               a, b, total);
    }

    /* Shut down MPI */
    MPI_Finalize();
} /* main */

/**********************************************************/
/* Function Get_data
 * Reads in the user input a, b, and n.
 * Input parameters:
83  *    1. int my_rank: rank of current process.
 *    2. int p: number of processes.
 * Output parameters:
 *    1. float* a_ptr: pointer to left endpoint a.
 *    2. float* b_ptr: pointer to right endpoint b.
 *    3. int* n_ptr: pointer to number of trapezoids.
 * Algorithm:
 *    1. Process 0 prompts user for input and
 *       reads in the values.
 *    2. Process 0 sends input values to other processes.
 */
void Get_data(
        float*  a_ptr  /* out */,
        float*  b_ptr  /* out */,
        int*    n_ptr  /* out */,
84      int     my_rank  /* in */,
        int     p        /* in */) {

    int source = 0;   /* All local variables used by */
    int dest;         /* MPI_Send and MPI_Recv       */
    int tag;
    MPI_Status status;

    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%f %f %d", a_ptr, b_ptr, n_ptr);
        for (dest = 1; dest < p; dest++) {
            tag = 0;
85          MPI_Send(a_ptr, 1, MPI_FLOAT, dest,
                     tag, MPI_COMM_WORLD);
            tag = 1;
            MPI_Send(b_ptr, 1, MPI_FLOAT, dest,
                     tag, MPI_COMM_WORLD);
            tag = 2;
            MPI_Send(n_ptr, 1, MPI_INT, dest,
                     tag, MPI_COMM_WORLD);
        }
    } else {
        tag = 0;
        MPI_Recv(a_ptr, 1, MPI_FLOAT, source,
                 tag, MPI_COMM_WORLD, &status);
        tag = 1;
86      MPI_Recv(b_ptr, 1, MPI_FLOAT, source,
                 tag, MPI_COMM_WORLD, &status);
        tag = 2;
        MPI_Recv(n_ptr, 1, MPI_INT, source,
                 tag, MPI_COMM_WORLD, &status);
    }
} /* Get_data */

/**********************************************************/
float Trap(
        float  local_a  /* in */,
        float  local_b  /* in */,
        int    local_n  /* in */,
        float  h        /* in */) {
87  float integral;   /* Store result in integral */
    float x;
    int   i;

    float f(float x); /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
    x = local_a;
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
} /* Trap */
88 /**********************************************************/
float f(float x) {
    float return_val;

    /* Calculate f(x). */
    /* Store calculation in return_val. */
    return_val = x*x;
    return return_val;
} /* f */
89 Enter a, b, and n
0 1.0 1024
With n = 1024 trapezoids, our estimate
of the integral from 0.000000 to 1.000000 = 0.333333
90 - Non-blocking Send/Receive
/* get_dataNonBlocking.c -- Parallel Trapezoidal Rule; uses a
 *     basic Get_data function for input and non-blocking
 *     MPI functions.
 *
 * Input:  a, b: limits of integration.
 *         n: number of trapezoids.
 * Output: Estimate of the integral from a to b of f(x)
 *         using the trapezoidal rule and n trapezoids.
 *
 * Notes:
91  *    1. f(x) is hardwired.
 *    2. Assumes the number of processes (p) evenly divides
 *       the number of trapezoids (n).
 *
 * See Chap. 4, pp. 60 ff in PPMPI.
 */
#include <stdio.h>

/* We'll be using MPI routines, definitions, etc. */
#include "mpi.h"

main(int argc, char* argv[]) {
    int    my_rank;   /* My process rank         */
    int    p;         /* The number of processes */
    float  a;         /* Left endpoint           */
92  float  b;         /* Right endpoint            */
    int    n;         /* Number of trapezoids      */
    float  h;         /* Trapezoid base length     */
    float  local_a;   /* Left endpoint my process  */
    float  local_b;   /* Right endpoint my process */
    int    local_n;   /* Number of trapezoids for  */
                      /* my calculation            */
    float  integral;  /* Integral over my interval */
    float  total;     /* Total integral            */
    int    source;    /* Process sending integral  */
    int    dest = 0;  /* All messages go to 0      */
    int    tag = 0;
    MPI_Status  status;
    MPI_Request req;
93  /* function prototypes */
    void Get_data(float* a_ptr, float* b_ptr,
                  int* n_ptr, int my_rank, int p);
    float Trap(float local_a, float local_b, int local_n,
               float h);   /* Calculate local integral */

    /* Let the system do what it needs to start up MPI */
    MPI_Init(&argc, &argv);

    /* Get my process rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Find out how many processes are being used */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    Get_data(&a, &b, &n, my_rank, p);
94  h = (b-a)/n;      /* h is the same for all processes */
    local_n = n/p;    /* So is the number of trapezoids  */

    /* Length of each process' interval of
       integration = local_n*h.  So my interval
       starts at: */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);

    /* Add up the integrals calculated by each process */
    if (my_rank == 0) {
        total = integral;
95      for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        MPI_Isend(&integral, 1, MPI_FLOAT, dest,
                  tag, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, &status);
    }

    /* Print the result */
    if (my_rank == 0) {
96      printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n",
               a, b, total);
    }

    /* Shut down MPI */
    MPI_Finalize();
} /* main */

/**********************************************************/
/* Function Get_data
 * Reads in the user input a, b, and n.
 * Input parameters:
 *    1. int my_rank: rank of current process.
 *    2. int p: number of processes.
97  * Output parameters:
 *    1. float* a_ptr: pointer to left endpoint a.
 *    2. float* b_ptr: pointer to right endpoint b.
 *    3. int* n_ptr: pointer to number of trapezoids.
 * Algorithm:
 *    1. Process 0 prompts user for input and
 *       reads in the values.
 *    2. Process 0 sends input values to other processes.
 */
void Get_data(
        float*  a_ptr    /* out */,
        float*  b_ptr    /* out */,
        int*    n_ptr    /* out */,
        int     my_rank  /* in  */,
        int     p        /* in  */) {
98  int source = 0;   /* All local variables used by */
    int dest;         /* MPI_Send and MPI_Recv       */
    int tag;
    MPI_Status status;

    if (my_rank == 0) {
        printf("Enter a, b, and n\n");
        scanf("%f %f %d", a_ptr, b_ptr, n_ptr);
        for (dest = 1; dest < p; dest++) {
            tag = 0;
            MPI_Send(a_ptr, 1, MPI_FLOAT, dest,
                     tag, MPI_COMM_WORLD);
            tag = 1;
99          MPI_Send(b_ptr, 1, MPI_FLOAT, dest,
                     tag, MPI_COMM_WORLD);
            tag = 2;
            MPI_Send(n_ptr, 1, MPI_INT, dest,
                     tag, MPI_COMM_WORLD);
        }
    } else {
        tag = 0;
        MPI_Recv(a_ptr, 1, MPI_FLOAT, source,
                 tag, MPI_COMM_WORLD, &status);
        tag = 1;
        MPI_Recv(b_ptr, 1, MPI_FLOAT, source,
                 tag, MPI_COMM_WORLD, &status);
        tag = 2;
        MPI_Recv(n_ptr, 1, MPI_INT, source,
                 tag, MPI_COMM_WORLD, &status);
    }
} /* Get_data */
100 /**********************************************************/
float Trap(
        float  local_a  /* in */,
        float  local_b  /* in */,
        int    local_n  /* in */,
        float  h        /* in */) {

    float integral;   /* Store result in integral */
    float x;
    int   i;

    float f(float x); /* function we're integrating */

    integral = (f(local_a) + f(local_b))/2.0;
101 x = local_a;
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
} /* Trap */

/**********************************************************/
float f(float x) {
    float return_val;

    /* Calculate f(x). */
    /* Store calculation in return_val. */
    return_val = x*x;
    return return_val;
} /* f */
102 - Non-blocking Send/Receive (in Fortran)
      Program Example1_2
c
c  example1_1.f
c  Parallel programming in Fortran to solve numerical
c  integration using the mid-point method.
c  The function selected is cos(x).
c  It demonstrates non-blocking communication.
c
c  This is an MPI example on parallel integration.
103 c It demonstrates the use of:
c
c  MPI_Init
c  MPI_Comm_rank
c  MPI_Comm_size
c  MPI_Recv
c  MPI_Isend
c  MPI_Wait
c  MPI_Finalize
c
      implicit none
      integer n, p, i, j, k, ierr, master
      real h, a, b, integral, pi
      integer req(1)
104   include "mpif.h"   !! This brings in pre-defined MPI constants, ...
      integer Iam, source, dest, tag, status(MPI_STATUS_SIZE)
      real my_result, Total_result, result
      data master/0/

c  Starts MPI processes ...
      call MPI_Init(ierr)                            !! starts MPI
      call MPI_Comm_rank(MPI_COMM_WORLD, Iam, ierr)  !! get current proc id
      call MPI_Comm_size(MPI_COMM_WORLD, p, ierr)    !! get number of procs

      pi = acos(-1.0)   !! = 3.14159...
      a = 0.0           !! lower limit of integration
      b = pi/2.         !! upper limit of integration
105   n = 500           !! number of increments within each process
      dest = master     !! define the process that computes the final result
      tag = 123         !! set the tag to identify this particular job
      h = (b-a)/n/p     !! length of increment

      my_result = integral(a,Iam,h,n)
      write(*,*) 'Iam', Iam, ', my_result', my_result

      if (Iam .eq. master) then    ! the following is serial
          result = my_result
          do k = 1, p-1   !! more efficient, less prone to deadlock
c  root receives my_result from each proc
              call MPI_Recv(my_result, 1, MPI_REAL,
     +            MPI_ANY_SOURCE, tag, MPI_COMM_WORLD,
     +            status, ierr)
              result = result + my_result
          enddo
106   else
          call MPI_Isend(my_result, 1, MPI_REAL, dest, tag,
     +        MPI_COMM_WORLD, req, ierr)   !! send my_result to intended dest.
          call MPI_Wait(req, status, ierr) !! wait for nonblocking send ...
      endif

c  results from all procs have been collected and summed ...
      if (Iam .eq. 0) then
          write(*,*) 'Final Result =', result
      endif

      call MPI_Finalize(ierr)   !! let MPI finish up ...
      stop
      end
107   real function integral(a,i,h,n)
      implicit none
      integer n, i, j
      real h, h2, aij, a
      real fct, x
      fct(x) = cos(x)        !! kernel of the integral

      integral = 0.0         !! initialize integral
      h2 = h/2.
      do j = 0, n-1          !! sum over all "j" integrals
          aij = a + (i*n + j)*h   !! lower limit of "j" integral
          integral = integral + fct(aij+h2)*h
      enddo
      return
      end
108 Result:
Process 6 has the partial result of 0.056906
Process 1 has the partial result of 0.187593
Process 0 has the partial result of 0.195090
Process 2 has the partial result of 0.172887
Process 3 has the partial result of 0.151536
Process 4 has the partial result of 0.124363
Process 5 has the partial result of 0.092410
Process 7 has the partial result of 0.019215
The result = 0.9999998