Title: Account Setup and MPI Introduction
1Account Setup and MPI Introduction
- Parallel Computing
- Bioinformatics Lab
Sylvain Pitre (spitre_at_scs.carleton.ca)
Web: http://cgmlab.carleton.ca
2Overview
- CGM Cluster specs
- Account Creation
- Logging in Remotely (Putty, X-Win32)
- Account Setup for MPI
- Checking Cluster Load
- Listing Your Jobs
- MPI Introduction and Basics
3CGM Lab Cluster
4CGM Lab Cluster (2)
- 8 dual-core workstations (total of 16 processors).
- Named cgm01, cgm02, ..., cgm08.
- Intel Core 2 Duo 1.6GHz, 4GB DDR2 RAM, 320GB disks.
- Server (cgm01) has an extra terabyte (1TB) of disk space.
- Connected through a dedicated gigabit switch.
- Running Fedora 8 (64-bit).
- OpenMPI (http://www.open-mpi.org/)
- cgmXX.carleton.ca (SSH, where XX = 01 to 08)
- Putty (terminal): http://www.putty.nl/download.html
- WinSCP (file transfer): http://winscp.net/eng/index.php
- X-Win32 (http://www.starnet.com/)
5CGM Lab Cluster (3)
- Accounts are handled by LDAP (Lightweight Directory Access Protocol) on the server.
- User files are stored on the server and accessed by every workstation using NFS (Network File System).
- The same login and password work on any workstation.
6CGM Lab Cluster (4)
[Diagram: workstations cgm01 to cgm08 connected to the Carleton network, with the NFS server and LDAP server roles marked.]
7Account Creation
- To get an account, send an email to Sylvain Pitre (spitre_at_scs.carleton.ca).
- Include in your email:
  - your full name
  - your email address (if different from the one used to send the email)
  - your supervisor's name (or course professor)
  - your preferred login name (8 characters max)
8Logging In Remotely
- You can log in remotely to the cluster by SSH (Secure Shell).
- Users familiar with unix/linux should already know how to do this.
- Windows users can use Putty, a lightweight SSH client (see link on slide 4).
- Windows users can also log in with X-Win32.
- DNS names: cgmXX.carleton.ca (XX = 01 to 08)
- Log in to any node except cgm01 (server).
9Logging in with Putty
- Under Host Name, enter the cgm machine you want to log into (cgm03 in this case), then click Open.
- A terminal will open and ask you for your username, then your password.
- That's it! You are logged into one of the cgm nodes.
10Login with X-Win32
- You can also log in to the nodes using X-Win32.
- Open the X-Win32 Configuration program (X-Config).
- Under the Sessions tab, click on Wizard.
- Enter a name for the session (ex: cgm03) and under Type click on ssh, then click Next.
- As host, enter the name of the node you wish to connect to (ex: cgm03.carleton.ca), then click Next.
- Enter your login name and password and click Next.
- For Command, click on Linux, then click Finish.
- The new session is now added to your Sessions window.
11Login with X-Win32 (2)
- Click on the newly created session, then click on Launch.
- After launching the session, you might get asked to accept a key (click on yes).
- You should now be at a terminal.
- You can work in this terminal if you wish (like in Putty), but if you wish to have the visual interface, type:
    gnome-session
- After a few seconds the visual interface will start up.
- Now you have access to all the menus and windows of the Fedora 8 interface (using Gnome).
12Login with X-Win32 (3)
13Account Setup
- First time login:
  - Once you have your account (send me an email to get one) and log in, change your password with the passwd command.
- If you are unfamiliar with unix/linux:
  - I strongly recommend reading some tutorials and playing around with commands (but be careful!).
- I assume you have some basic unix/linux knowledge in the rest of the slides.
14Password-less SSH
- In order to run MPI on different nodes transparently, we need to set up SSH so it doesn't constantly ask us for a password. Type:
    ssh-keygen -t rsa
    cd .ssh
    cp id_rsa.pub authorized_keys2
    chmod go-rwx authorized_keys2
    ssh-agent $SHELL
    ssh-add
    cd ..
15Password-less SSH (2)
- Now, after your initial login, you should be able to SSH into any other cgmXX machine without a password.
- SSH to every workstation once in order to add that node to your known_hosts. Type:
    ssh cgm01 date   (answer yes when asked)
    ssh cgm02 date
    ...
    ssh cgm08 date
16Ready for MPI!
- After completing the steps above, your account is ready to run MPI jobs.
- Running big jobs on multiple processors:
  - Since there is no job scheduler, jobs are launched manually, so please be considerate. Use nodes that are not in use or that have less load (I'll show you how to check).
  - If you need all the nodes for a longer period of time, we'll try to reserve them for you.
17Network Vs. Local Files
- If you need to do a lot of disk I/O, it is preferable to use the local disk's /tmp directory.
- Since your home directory is mounted over NFS, all files written to it are sent to the server (network bottleneck).
- To reduce network transfers, place your large input/output files in /tmp on your local node.
- Make the filename unique, since /tmp is shared by every user and job on that node.
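One way to make a scratch filename unique is to build it from the user name and the process ID. The sketch below is only an illustration: the function name make_scratch_path and the naming scheme are assumptions, not something provided by the cluster.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: build a scratch path of the form
 * /tmp/<user>_<pid>_<name>. The process ID keeps two jobs of the
 * same user on the same node from colliding.
 * Returns the path length, or -1 on error or if `out` is too small. */
int make_scratch_path(char *out, size_t outlen,
                      const char *user, const char *name)
{
    int n = snprintf(out, outlen, "/tmp/%s_%ld_%s",
                     user, (long)getpid(), name);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return n;
}
```

Because /tmp is local to each node, a file written this way is only visible on the node that created it; copy the final results back to your home directory when the job is done.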
18Checking Cluster Load
- To check the load on each workstation, type the command:
    load
19Listing Your Jobs
- To check all of your jobs (processes) across the cluster, type:
    listjobs
20MPI Introduction
- Message Passing Interface (MPI)
- Portable message-passing standard that facilitates the development of parallel applications and libraries.
- For parallel computers, clusters
- Not a language of its own. It is used as a library with another language, like C or Fortran.
- Different implementations: OpenMPI, LAM/MPI, MPICH
- Portable: not limited to a specific architecture.
21MPI Basics
- Every node (process) executes the same code.
- Nodes can follow different paths (master/slave model) but don't abuse!
- Communication is done by message passing.
- Every node has a unique rank (ID) from 0 to p-1, where p is the number of processes.
- The total number of nodes is known to every node.
- Synchronous or asynchronous messages.
- Thread safe, depending on the implementation and on the thread support level requested at initialization (MPI_Init_thread).
22Compiling/Running MPI Programs
- Compiler: mpicc
- Command line:
    mpirun -n <p> --hostfile <hostfile> <prog> <params>
- Where <p> is the number of processes you want to use. It can be greater than the number of processors available (used for overloading or simulation).
23Hostfile
- For running a job on more than one node, a hostfile must be used.
- What's in a hostfile:
  - Node name or IP.
  - How many processors on each node (1 by default).
- Example:
    cgm01 slots=2
    cgm02 slots=2
    ...
24MPI Startup/Finalize
    #include "mpi.h"

    int main(int argc, char *argv[]) {
        int rank, wsize;
        MPI_Init(&argc, &argv);                 /* start MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &wsize);  /* total number of processes */
        /* CODE */
        MPI_Finalize();                         /* shut MPI down */
        return 0;
    }
25MPI Types
MPI datatype          C type
MPI_CHAR              char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE              -
MPI_PACKED            -
26MPI Functions
- Send/receive
- Broadcast
- All to all
- Gather/Scatter
- Reduce
- Barrier
- Other
27MPI Send/Receive (synch)
28MPI Send/Receive (synch)
- Communication between nodes (processors).
- Blocking
- int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
- int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
- buf: send buffer address (receive buffer address for MPI_Recv)
- count: number of entries in buffer
- datatype: data type of entries
- dest: destination process rank
- source: source process rank
- tag: message tag
- comm: communicator
- status: status after operation (returned)
29MPI Send/Receive (asynch)
- A buffer can be used with asynchronous messages.
- Problems occur when the buffer becomes empty or
full.
30MPI Send/Receive (asynch)
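- Non-blocking: the call returns immediately, and the transfer is not guaranteed to be complete until it has been waited on.
- int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)
- int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)
- Parameters are the same as MPI_Send() and MPI_Recv(), except that both take a final request handle and MPI_Irecv() has no status argument. Pass the request to MPI_Wait() to complete the operation.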
31MPI Broadcast
- One to all (including itself).
32MPI Broadcast (syntax)
- int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
- buf: buffer address (data to send on the root, filled in on every other process)
- count: number of entries in buffer
- datatype: data type of entries
- root: rank of root
- comm: communicator
33MPI All to All
- Every process sends a distinct block of data to every process (including itself).
- int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)
- sendbuf: send buffer address
- sendcount: number of elements sent to each process
- sendtype: data type of send elements
- recvbuf: receive buffer address (loaded)
- recvcount: number of elements received from each process
- recvtype: data type of receive elements
- comm: communicator
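The data movement of an all-to-all can be pictured as a block transpose: block j of process i's send buffer ends up as block i of process j's receive buffer. The following is a serial model of that layout only, with no MPI calls; the function name model_alltoall and the flat array layout are assumptions made for the illustration.

```c
/* Serial model of an all-to-all exchange among p processes.
 * `send` holds the p send buffers back to back; process i's buffer
 * has p blocks of `count` ints, and block j is destined for process j.
 * `recv` uses the same layout; block i of process j's buffer is the
 * data that came from process i. */
void model_alltoall(const int *send, int *recv, int p, int count)
{
    for (int i = 0; i < p; i++)                 /* sending process */
        for (int j = 0; j < p; j++)             /* receiving process */
            for (int k = 0; k < count; k++)     /* element in block */
                recv[(j * p + i) * count + k] =
                    send[(i * p + j) * count + k];
}
```

In a real program each process only holds its own row of send and recv; MPI_Alltoall performs the equivalent exchange over the network.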
34MPI All to All
35MPI All to All (alternative)
- MPI_Alltoallv()
- Sends data to all processes, with per-process counts and displacements.
- int MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void *recvbuf, int *recvcnts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)
36MPI Gather (Description)
- MPI_Gather()
- Each process in comm sends the contents of sendbuf to the process with rank root. The process root concatenates the received data in process rank order in recvbuf. That is, the data from process 0 is followed by the data from process 1, which is followed by the data from process 2, etc. The recv arguments are significant only on the process with rank root. The argument recvcount indicates the number of items received from each process, not the total number received.
37MPI Scatter (Description)
- MPI_Scatter()
- The process with rank root distributes the contents of sendbuf among the processes. The contents of sendbuf are split into p segments, each consisting of sendcount items. The first segment goes to process 0, the second to process 1, etc. The send arguments are significant only on process root.
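The segment arithmetic in the two descriptions above can be checked with a small serial model. This is a sketch of the buffer layout only and makes no MPI calls; model_scatter and model_gather are hypothetical names.

```c
#include <string.h>

/* Scatter: process `rank` receives segment number `rank`
 * (sendcount items) of the root's sendbuf. */
void model_scatter(const int *root_sendbuf, int sendcount,
                   int rank, int *recvbuf)
{
    memcpy(recvbuf, root_sendbuf + (size_t)rank * sendcount,
           (size_t)sendcount * sizeof *recvbuf);
}

/* Gather: the root stores the sendcount items sent by process
 * `rank` at offset rank * sendcount of its recvbuf, so the data
 * ends up in rank order. */
void model_gather(const int *sendbuf, int sendcount,
                  int rank, int *root_recvbuf)
{
    memcpy(root_recvbuf + (size_t)rank * sendcount, sendbuf,
           (size_t)sendcount * sizeof *sendbuf);
}
```

With 8 items and 4 processes (sendcount of 2), process 2 receives items 4 and 5, and a gather of the same pieces rebuilds the original order on the root.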
38MPI Gather/Scatter
Gather
Scatter
39MPI Gather/Scatter (syntax)
- int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
- int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
- sendbuf: send buffer address
- sendcount: number of send buffer elements
- sendtype: data type of send elements
- recvbuf: receive buffer address (loaded)
- recvcount: number of elements in each receive
- recvtype: data type of receive elements
- root: rank of sending (scatter) or receiving (gather) process
- comm: communicator
40MPI Gatherv/Scatterv
- Similar to gather/scatter, but they allow varying amounts of data to be sent instead of a fixed amount.
- For example, varying parts of an array can be scattered/gathered in one step.
- See the Parallel Image Processing example to see how they can be used.
41MPI Gatherv/Scatterv (Syntax)
- int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
- int MPI_Gatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, int root, MPI_Comm comm)
- sendcounts: number of send buffer elements for each process
- recvcounts: number of elements received from each process
- displs: displacement for each process (offset, in elements, into the root's buffer)
- Other parameters are the same as gather/scatter.
42MPI Reduce
- Gather results and reduce them to one value using
an operation (Max, Min, Sum, Product).
43MPI Reduce (syntax)
- int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
- sendbuf: send buffer address
- recvbuf: receive buffer address
- count: number of send buffer elements
- datatype: data type of send elements
- op: reduce operation
  - MPI_MAX: Maximum
  - MPI_MIN: Minimum
  - MPI_SUM: Sum
  - MPI_PROD: Product
- root: root process rank for result
- comm: communicator
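When count is greater than 1, the reduction is applied element by element: element k of the result combines element k of every process's send buffer. The serial model below illustrates this for sum and maximum. It makes no MPI calls, and the names model_reduce_sum and model_reduce_max are assumptions for the sketch.

```c
/* Serial model of a reduction over p processes.
 * `contrib` holds the p send buffers back to back, `count` ints
 * each; `result` plays the role of the root's recvbuf. */
void model_reduce_sum(const int *contrib, int p, int count, int *result)
{
    for (int k = 0; k < count; k++) {
        int acc = 0;
        for (int r = 0; r < p; r++)
            acc += contrib[r * count + k];   /* element k of process r */
        result[k] = acc;
    }
}

void model_reduce_max(const int *contrib, int p, int count, int *result)
{
    for (int k = 0; k < count; k++) {
        int best = contrib[k];               /* element k of process 0 */
        for (int r = 1; r < p; r++)
            if (contrib[r * count + k] > best)
                best = contrib[r * count + k];
        result[k] = best;
    }
}
```

With count set to 1 this collapses to the single value described on the previous slide.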
44MPI Barrier
- Blocks until all processes have called it.
- int MPI_Barrier(MPI_Comm comm)
- comm: communicator
45Other MPI Routines
- MPI_Allgather(): Gather values and distribute to all.
- MPI_Allgatherv(): Gather values into specified locations and distribute to all.
- MPI_Reduce_scatter(): Combine values and scatter results.
- MPI_Wait(): Waits for a non-blocking MPI send/receive to complete, then returns.
46Parallel Problem Examples
- Embarrassingly Parallel
  - Simple Image Processing (Brightness, Negative)
- Pipelined computations
  - Sorting
- Synchronous computations
  - Heat Distribution Problem
  - Cellular Automata
- Divide and Conquer
  - N-Body Problem
47MPI Hello World!
    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[]) {
        int rank, wsize;
        MPI_Status status;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &wsize);
        printf("Hello World!, I am processor %d.\n", rank);
        MPI_Finalize();
        return 0;
    }
48Parallel Image processing
- Input: image of size MxN.
- Output: negative of the image.
- Each processor should have an equal share of the work, roughly (MxN)/P.
- Master/slave model
  - The master will read in the image and distribute the pixels to the slave nodes. Once done, the slaves will return the results to the master, which will output the negative image.
49Parallel Image processing (2)
- Workload
  - If we have 32 pixels to process, and 4 CPUs, each CPU will process 8 pixels.
  - For P0, the work will start at pixel 0 (displacement) and process 8 pixels (count).
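The count and displacement for every process can be computed in one loop. The sketch below uses a common block distribution in which any leftover items go to the lowest ranks; the function name block_partition and that tie-breaking rule are assumptions, since the slides only cover the evenly divisible case.

```c
/* Split `total` items over `p` processes as evenly as possible.
 * counts[r] is how many items process r handles and displs[r] is
 * the index of its first item. When total is not a multiple of p,
 * the first (total % p) processes get one extra item. */
void block_partition(int total, int p, int *counts, int *displs)
{
    int base = total / p;
    int extra = total % p;
    int offset = 0;
    for (int r = 0; r < p; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

For 32 pixels and 4 CPUs this gives a count of 8 for every process and displacements 0, 8, 16 and 24. For 10 items and 4 processes it gives counts 3, 3, 2, 2 and displacements 0, 3, 6, 8.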
50Parallel Image processing (3)
- Find the displacement/count for each processor.
- Master processor scatters the image.
- Execute the negative operation.
- Gather the results on the master processor.
- Displacement (displs) tells you where to start, count (counts) tells you how many to do.
- MPI_Scatterv(image, counts, displs, MPI_CHAR, image, counts[myId], MPI_CHAR, 0, MPI_COMM_WORLD)
- MPI_Gatherv(image, counts[myId], MPI_CHAR, image, counts, displs, MPI_CHAR, 0, MPI_COMM_WORLD)
- Note: on the master, the send and receive buffers of these calls must not overlap, so use a separate receive buffer (or MPI_IN_PLACE) there.
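The negative operation itself is purely local: each process inverts the pixels it received, between the scatter and the gather. A minimal sketch, assuming 8-bit pixels (one unsigned char per pixel) and a hypothetical function name negate_pixels:

```c
/* Invert `count` 8-bit pixels in place:
 * 0 becomes 255, 255 becomes 0, and v becomes 255 - v. */
void negate_pixels(unsigned char *pixels, int count)
{
    for (int i = 0; i < count; i++)
        pixels[i] = (unsigned char)(255 - pixels[i]);
}
```

Each process would call this on its own chunk with its own count, so no communication is needed during the computation, which is what makes the problem embarrassingly parallel.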
51MPI Timing
- Calculate the wall clock time of some code. Can be executed by the master to find out the total runtime.
    double start, total;
    start = MPI_Wtime();
    /* Do some work! */
    total = MPI_Wtime() - start;
    printf("Total Runtime: %f\n", total);
52Compiling and Running Your First MPI Program
- Download the MPI_hello.tar.gz example from the cgmlab.carleton.ca website. In the terminal type:
    wget http://cgmlab.carleton.ca/files/MPI_hello.tar.gz
- Uncompress the files by typing:
    tar zxvf MPI_hello.tar.gz
- Compile the program by typing:
    make
- Run the program on all 16 cores by typing:
    mpirun -np 16 --hostfile hostfile ./hello
53What To Do Next?
- There is also a prefix sums example on the cgmlab.carleton.ca website.
- Try other examples you find on the web.
- Find MPI tutorials online or in books.
- Write your own MPI programs.
- Have fun :)
54References
- Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Barry Wilkinson and Michael Allen, Prentice Hall, 1999.
- MPI Information/Tutorials: http://www-unix.mcs.anl.gov/mpi/learning.html
- A draft of a Tutorial/User's Guide for MPI by Peter Pacheco: ftp://math.usfca.edu/pub/MPI/mpi.guide.ps
- OpenMPI (http://www.open-mpi.org/)