1
Message Passing Programming
2
Learning Objectives
  • Understanding how MPI programs execute
  • Familiarity with fundamental MPI functions

3
Review of Flynn's Taxonomy
  • SISD
  • SIMD
  • MISD
  • MIMD
  • SPMD (Single Program, Multiple Data)

MIMD programs can be converted to SPMD form; MPI
is primarily for MIMD/SPMD codes.
4
Message-passing Model
5
SPMD Model
  • Single Program, Multiple Data
  • Each processor has a copy of the same program
  • All run it at their own rate
  • May take different paths through the code
  • Process-specific control through:
  • My process number
  • Total number of processors
  • Explicit communication and synchronization
    (see the sketch below)

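A minimal SPMD sketch (not from the slides; the MPI calls it uses
are introduced on later slides): every process runs this one
program, but each takes a different path based on its rank.

  #include <stdio.h>
  #include "mpi.h"
  int main (int argc, char *argv[]) {
      int id, p;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &id);   /* my process number  */
      MPI_Comm_size (MPI_COMM_WORLD, &p);    /* total processes    */
      if (id == 0)                           /* rank-specific path */
          printf ("Process 0 coordinates %d processes\n", p);
      else
          printf ("Process %d does worker tasks\n", id);
      MPI_Finalize ();
      return 0;
  }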
6
Task/Channel vs. Message-passing
  • Communication
  • Point to Point
  • Broadcast

7
Advantages of Message-passing Model
  • Portability to many architectures
  • Natural fit for multicomputers
  • Distinguishes between local memory (fast access)
    and remote memory (slow access)
  • Ability to manage memory hierarchy
  • Each process controls its own memory
  • No cache coherence problems
  • Easier to create a deterministic program
  • Simplifies debugging

8
The Message Passing Interface
  • 1980s: vendors had unique libraries
  • 1989: Parallel Virtual Machine (PVM) developed at
    Oak Ridge National Lab
  • 1992: work on the MPI standard begun
  • 1994: version 1.0 of the MPI standard
  • 1997: version 2.0 of the MPI standard
  • Today MPI is the dominant message-passing library
    standard
  • Public-domain versions at
    http://www-unix.mcs.anl.gov/mpi/

9
MPI Features
  • A message-passing library specification
  • Message-passing model
  • Not a language or compiler specification
  • For parallel computers, clusters, and heterogeneous
    networks
  • Designed for ease of parallel software
    development
  • Designed to provide access to advanced hardware
  • Not designed for fault tolerance
  • No process management
  • No virtual memory management

10
Flexibility of MPI
  • Large (125 functions)
  • But most programs can be written using just 6
    functions (see the skeleton below)
  • Need not master all parts of MPI to use it

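As a sketch (not from the slides) of how far those six functions
go, here is the skeleton nearly every program in this deck
follows; only the compute/communicate middle changes:

  #include "mpi.h"
  int main (int argc, char *argv[]) {
      int p, id;
      MPI_Init (&argc, &argv);               /* 1. start MPI          */
      MPI_Comm_size (MPI_COMM_WORLD, &p);    /* 2. how many processes */
      MPI_Comm_rank (MPI_COMM_WORLD, &id);   /* 3. which one am I     */
      /* ... compute; exchange data with MPI_Send (4) and
         MPI_Recv (5) as needed ... */
      MPI_Finalize ();                       /* 6. shut down MPI      */
      return 0;
  }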
11
Getting Started
12
Getting Started
  • Set up the path to MPICH (csh syntax):
    set MPI_ROOT = /home/software/mpich
    set path = ($path $MPI_ROOT/bin)
  • Copy Makefile and machines.sample into your working
    directory
  • Create a .rhosts file in your home directory
  • Add names of machines from machines.sample, one per
    line: machine_name userid

13
Hello World Version 1 (hello0.c)

  #include <stdio.h>
  #include "mpi.h"
  int main (int argc, char *argv[]) {
      MPI_Init (&argc, &argv);
      printf ("\n Hello World\n");
      MPI_Finalize ();
  }

Compile: make hello0
Run: mpirun -np x -machinefile machines.sample hello0
14
Include Files
#include <mpi.h>
  • MPI header file

#include <stdio.h>
  • Standard I/O header file

15
Initialize MPI
MPI_Init (&argc, &argv)
  • First MPI function called by each process
  • Not necessarily first executable statement
  • Allows system to do any necessary setup

16
Shutting Down MPI
MPI_Finalize()
  • Call after all other MPI library calls
  • Allows system to free up MPI resources

17
Communicators
  • Communicator: an opaque object that provides a
    message-passing environment for processes
  • MPI_COMM_WORLD
  • Default communicator
  • Includes all processes

18
Communicator
[Figure: MPI_COMM_WORLD containing six processes, ranked 0-5]
19
Determine Number of Processes
MPI_Comm_size (MPI_COMM_WORLD, &p)
  • First argument is communicator
  • Number of processes returned through second
    argument

20
Determine Process Rank
MPI_Comm_rank (MPI_COMM_WORLD, &id)
  • First argument is communicator
  • Process rank (in range 0, 1, ..., p-1) returned
    through second argument

21
Replication of Automatic Variables
22
Hello World Version 2 (hello1.c)
  #include <stdio.h>
  #include "mpi.h"
  int main (int argc, char *argv[]) {
      int rank, n, i, message;
      char buff[1000];
      MPI_Init (&argc, &argv);
      MPI_Comm_size (MPI_COMM_WORLD, &n);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      printf ("\n Hello from process %3d \n", rank);
      MPI_Finalize ();
  }
23
Point-to-point Communication
  • Involves a pair of processes
  • One process sends a message
  • Other process receives the message

24
Send/Receive (Blocking)
25
Function MPI_Send
  int MPI_Send (
      void          *message,
      int            count,
      MPI_Datatype   datatype,
      int            dest,
      int            tag,
      MPI_Comm       comm
  )
26
Function MPI_Recv
  int MPI_Recv (
      void          *message,
      int            count,
      MPI_Datatype   datatype,
      int            source,
      int            tag,
      MPI_Comm       comm,
      MPI_Status    *status
  )

MPI_Recv blocks until the message has been
received, or an error occurs.
27
Inside MPI_Send and MPI_Recv
[Figure: the message goes from the sending process's program
memory into a system buffer, across the network to the receiving
process's system buffer, and then into its program memory]
28
Return from MPI_Send
  • Function blocks until the message buffer is free
  • The message buffer is free when:
  • Message copied to system buffer, or
  • Message transmitted
  • Typical scenario:
  • Message copied to system buffer
  • Transmission overlaps computation

29
Return from MPI_Recv
  • Function blocks until the message is in the buffer
  • If the message never arrives, the function never
    returns

30
Deadlock
  • Deadlock: a process waits for a condition that
    will never become true
  • Easy to write send/receive code that deadlocks
    (see the sketch below)
  • Two processes both receive before they send
  • Send tag doesn't match receive tag
  • Process sends message to wrong destination process

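A minimal sketch of the first pitfall (not from the slides;
assumes exactly 2 processes): if both processes receive before
sending, each blocks waiting for the other. Ordering the calls
by rank breaks the cycle.

  #include <stdio.h>
  #include "mpi.h"
  int main (int argc, char *argv[]) {
      int rank, other, in, out;
      MPI_Status status;
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      other = 1 - rank;   /* partner's rank (2-process run) */
      out = rank;
      /* Deadlock: both processes would block in MPI_Recv:
         MPI_Recv (&in, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &status);
         MPI_Send (&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);       */
      if (rank == 0) {    /* safe: one side sends first */
          MPI_Send (&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
          MPI_Recv (&in, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &status);
      } else {
          MPI_Recv (&in, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &status);
          MPI_Send (&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
      }
      printf ("Process %d received %d\n", rank, in);
      MPI_Finalize ();
      return 0;
  }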
31
Hello World Version 3 (hello2.c)
  #include <stdio.h>
  #include "mpi.h"
  int main (int argc, char *argv[]) {
      int rank, n, i, message;
      char buff[1000];
      MPI_Status status;
      MPI_Init (&argc, &argv);
      MPI_Comm_size (MPI_COMM_WORLD, &n);
      MPI_Comm_rank (MPI_COMM_WORLD, &rank);
      if (rank == 0) {   /* Process 0 will output data */
          printf ("\n Hello from process %3d", rank);
          for (i = 1; i < n; i++) {
              MPI_Recv (&message, 1, MPI_INT, i, 111,
                        MPI_COMM_WORLD, &status);
              printf ("\n Hello Sent from process %3d\n", message);
          }
      }
      else
          MPI_Send (&rank, 1, MPI_INT, 0, 111, MPI_COMM_WORLD);
      MPI_Finalize ();
  }
32
MPI Functions
  • MPI_Init
  • MPI_Comm_size
  • MPI_Comm_rank
  • MPI_Send
  • MPI_Recv
  • MPI_Finalize

33
Global Communications
  • MPI_Reduce
  • MPI_Bcast

34
Prototype of MPI_Reduce()
  int MPI_Reduce (
      void          *operand,    /* addr of 1st reduction element */
      void          *result,     /* addr of 1st reduction result  */
      int            count,      /* reductions to perform         */
      MPI_Datatype   type,       /* type of elements              */
      MPI_Op         operator,   /* reduction operator            */
      int            root,       /* process getting result(s)     */
      MPI_Comm       comm        /* communicator                  */
  )
35
Addition (add1.c)
  ...
  MPI_Init (&argc, &argv);
  MPI_Comm_size (MPI_COMM_WORLD, &n);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  division = total_numbers / n;
  start = rank * division;
  end = (rank + 1) * division;
  sum = 0; total_sum = 0;
  for (i = start; i < end; i++)
      sum = sum + i;
  printf ("\n Process %3d calculated from %d to %d \n",
          rank, start, end);
  MPI_Reduce (&sum, &total_sum, 1, MPI_INT, MPI_SUM, 0,
              MPI_COMM_WORLD);
  if (rank == 0)   /* Process 0 will output data */
      printf ("\n Total sum is %3d\n", total_sum);
  MPI_Finalize ();
36
Function MPI_Bcast
  int MPI_Bcast (
      void          *buffer,     /* Addr of 1st element     */
      int            count,      /* # elements to broadcast */
      MPI_Datatype   datatype,   /* Type of elements        */
      int            root,       /* ID of root process      */
      MPI_Comm       comm        /* Communicator            */
  )

  MPI_Bcast (&k, 1, MPI_INT, 0, MPI_COMM_WORLD);
37
Addition (add2.c)

  MPI_Init (&argc, &argv);
  MPI_Comm_size (MPI_COMM_WORLD, &n);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  if (rank == 0) {
      printf ("How many numbers ? \n");
      fscanf (stdin, "%d", &total_numbers);
  }
  MPI_Bcast (&total_numbers, 1, MPI_INT, 0, MPI_COMM_WORLD);
  printf ("\n Process %3d knows total numbers is %d \n",
          rank, total_numbers);
  division = total_numbers / n;
  start = rank * division;
  end = (rank + 1) * division;
  sum = 0; total_sum = 0;
  for (i = start; i < end; i++)
      sum = sum + i;
  printf ("\n Process %3d calculated from %d to %d \n",
          rank, start, end);
  MPI_Reduce (&sum, &total_sum, 1, MPI_INT, MPI_SUM, 0,
              MPI_COMM_WORLD);
  if (rank == 0)   /* Process 0 will output data */
      printf ("\n Total sum is %3d\n", total_sum);
  MPI_Finalize ();
38
MPI_Datatype Options
  • MPI_CHAR
  • MPI_DOUBLE
  • MPI_FLOAT
  • MPI_INT
  • MPI_LONG
  • MPI_LONG_DOUBLE
  • MPI_SHORT
  • MPI_UNSIGNED_CHAR
  • MPI_UNSIGNED
  • MPI_UNSIGNED_LONG
  • MPI_UNSIGNED_SHORT

39
MPI_Op Options
  • MPI_BAND: Bitwise AND
  • MPI_BOR: Bitwise OR
  • MPI_BXOR: Bitwise XOR
  • MPI_LAND: Logical AND
  • MPI_LOR: Logical OR
  • MPI_MAX: Maximum
  • MPI_MAXLOC: Maximum and location
  • MPI_MIN: Minimum
  • MPI_MINLOC: Minimum and location
  • MPI_PROD: Product
  • MPI_SUM: Sum

40
Benchmarking the Program
  • MPI_Barrier: barrier synchronization
  • MPI_Wtick: timer resolution
  • MPI_Wtime: current time
    (see the timing sketch below)

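A minimal sketch of the timing pattern (not from the slides;
add3.c on the next slide applies the same idea, without MPI_Wtick):

  #include <stdio.h>
  #include "mpi.h"
  int main (int argc, char *argv[]) {
      double start, elapsed;
      MPI_Init (&argc, &argv);
      MPI_Barrier (MPI_COMM_WORLD);   /* line processes up first    */
      start = MPI_Wtime();            /* wall-clock time in seconds */
      /* ... code being benchmarked ... */
      elapsed = MPI_Wtime() - start;
      printf ("Elapsed %f s (timer resolution %f s)\n",
              elapsed, MPI_Wtick());
      MPI_Finalize ();
      return 0;
  }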
41
Addition (add3.c)

  long sum, total_sum;
  double startwtime, endwtime, exetime;
  MPI_Init (&argc, &argv);
  MPI_Barrier (MPI_COMM_WORLD);
  MPI_Comm_size (MPI_COMM_WORLD, &n);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  if (rank == 0) {
      printf ("How many numbers ? \n");
      fscanf (stdin, "%d", &total_numbers);
      startwtime = MPI_Wtime();
  }
  MPI_Bcast (&total_numbers, 1, MPI_INT, 0, MPI_COMM_WORLD);
  division = total_numbers / n;
  start = rank * division;
  end = (rank + 1) * division;
  sum = 0; total_sum = 0;
  for (i = start; i < end; i++)
      sum = sum + i;
  MPI_Reduce (&sum, &total_sum, 1, MPI_LONG, MPI_SUM, 0,
              MPI_COMM_WORLD);
  if (rank == 0) {   /* Process 0 will output data */
      endwtime = MPI_Wtime();
      printf ("\n Total sum is %3ld\n", total_sum);
      printf ("Time taken %f\n", endwtime - startwtime);
  }
  MPI_Finalize ();
42
Benchmarking Results
43
Circuit Satisfiability
[Figure: a combinational circuit with 16 binary inputs; the 0/1
labels show one assignment of input values]
44
Solution Method
  • Circuit satisfiability is NP-complete
  • No known algorithm solves it in polynomial time
  • We seek all solutions
  • We find them through exhaustive search
  • 16 inputs → 65,536 combinations to test

45
Partitioning: Functional Decomposition
  • Embarrassingly parallel: no channels between
    tasks

46
Agglomeration and Mapping
  • Properties of the parallel algorithm:
  • Fixed number of tasks
  • No communication between tasks
  • Time needed per task is variable
  • Consult the mapping strategy decision tree:
  • Map tasks to processors in a cyclic fashion

47
Cyclic (Interleaved) Allocation
  • Assume p processes
  • Each process gets every pth piece of work
  • Example: 5 processes and 12 pieces of work
    (see the sketch after this list)
  • P0: 0, 5, 10
  • P1: 1, 6, 11
  • P2: 2, 7
  • P3: 3, 8
  • P4: 4, 9

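A sketch (not from the slides) that prints the allocation above;
in an MPI program each process would run only its own rank's
inner loop, and the circuit program on a later slide uses the
same for (i = rank; i < n; i += p) stride:

  #include <stdio.h>
  int main (void) {
      int p = 5, n = 12, rank, i;   /* the example's values */
      for (rank = 0; rank < p; rank++) {
          printf ("P%d:", rank);
          for (i = rank; i < n; i += p)   /* every pth piece */
              printf (" %d", i);
          printf ("\n");
      }
      return 0;
  }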
48
Cyclic Allocation
  • Assume n pieces of work, p processes, and cyclic
    allocation
  • What is the largest number of pieces of work any
    process has?
  • What is the smallest number of pieces of work any
    process has?
  • How many processes have the largest number of
    pieces? (answers sketched below)

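The answers follow from the definition of cyclic allocation; a
minimal sketch (not from the slides) computing them with integer
arithmetic, checked against the previous slide's example:

  #include <stdio.h>
  int main (void) {
      int n = 12, p = 5;                         /* example values  */
      int most      = (n + p - 1) / p;           /* ceil(n/p)  = 3  */
      int least     = n / p;                     /* floor(n/p) = 2  */
      int with_most = (n % p != 0) ? n % p : p;  /* = 2 (P0 and P1) */
      printf ("most %d, least %d, with most %d\n",
              most, least, with_most);
      return 0;
  }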
49
Summary of Program Design
  • Program will consider all 65,536 combinations of
    16 Boolean inputs
  • Combinations are allocated to processes in cyclic
    fashion
  • Each process examines each of its combinations
  • If it finds a satisfying combination, it
    prints it

50
  #include <mpi.h>
  #include <stdio.h>
  int main (int argc, char *argv[]) {
      int i, id, p;
      void check_circuit (int, int);
      MPI_Init (&argc, &argv);
      MPI_Comm_rank (MPI_COMM_WORLD, &id);
      MPI_Comm_size (MPI_COMM_WORLD, &p);
      for (i = id; i < 65536; i += p)
          check_circuit (id, i);
      printf ("Process %d is done\n", id);
      fflush (stdout);
      MPI_Finalize();
      return 0;
  }
51
  /* Return 1 if 'i'th bit of 'n' is 1; 0 otherwise */
  #define EXTRACT_BIT(n,i) ((n & (1 << i)) ? 1 : 0)

  void check_circuit (int id, int z) {
      int v[16];   /* Each element is a bit of z */
      int i;
      for (i = 0; i < 16; i++)
          v[i] = EXTRACT_BIT(z, i);
      if ((v[0] || v[1]) && (!v[1] || !v[3]) && (v[2] || v[3])
          && (!v[3] || !v[4]) && (v[4] || !v[5])
          && (v[5] || !v[6]) && (v[5] || v[6]) && (v[6] || !v[15])
          && (v[7] || !v[8]) && (!v[7] || !v[13])
          && (v[8] || v[9]) && (v[8] || !v[9])
          && (!v[9] || !v[10]) && (v[9] || v[11])
          && (v[10] || v[11]) && (v[12] || v[13])
          && (v[13] || !v[14]) && (v[14] || v[15])) {
          printf ("%d) %d%d%d%d%d%d%d%d%d%d%d%d%d%d%d%d\n", id,
                  v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7],
                  v[8], v[9], v[10], v[11], v[12], v[13], v[14], v[15]);
          fflush (stdout);
      }
  }
52
Our Call to MPI_Reduce()
  MPI_Reduce (&count, &global_count, 1, MPI_INT,
              MPI_SUM, 0, MPI_COMM_WORLD);
  if (!id) printf ("There are %d different solutions\n",
                   global_count);
53
Benchmarking Code
  double elapsed_time;
  MPI_Init (&argc, &argv);
  MPI_Barrier (MPI_COMM_WORLD);
  elapsed_time = -MPI_Wtime();
  MPI_Reduce (...);
  elapsed_time += MPI_Wtime();
54
Summary
  • Message-passing programming follows naturally
    from the task/channel model
  • Portability of message-passing programs
  • MPI is the most widely adopted standard

55
Summary
  • MPI functions introduced
  • MPI_Init
  • MPI_Comm_rank
  • MPI_Comm_size
  • MPI_Reduce
  • MPI_Finalize
  • MPI_Barrier
  • MPI_Wtime
  • MPI_Wtick