Account Setup and MPI Introduction - PowerPoint PPT Presentation

1
Account Setup and MPI Introduction
  • Parallel Computing
  • Bioinformatics Lab

Sylvain Pitre (spitre_at_scs.carleton.ca)
Web: http://cgmlab.carleton.ca
2
Overview
  • CGM Cluster specs
  • Account Creation
  • Logging in Remotely (Putty, X-Win32)
  • Account Setup for MPI
  • Checking Cluster Load
  • Listing Your Jobs
  • MPI Introduction and Basics

3
CGM Lab Cluster
4
CGM Lab Cluster (2)
  • 8 dual-core workstations (total of 16 processors)
  • Named cgm01, cgm02, ..., cgm08.
  • Intel Core 2 Duo 1.6GHz, 4GB DDR2 RAM, 320GB
    disks
  • Server (cgm01) has an extra terabyte (1TB) disk
    space.
  • Connected through a dedicated gigabit switch.
  • Running Fedora 8 (64-bit).
  • OpenMPI (http://www.open-mpi.org/)
  • cgmXX.carleton.ca (SSH, where XX = 01 to 08)
  • Putty (terminal): http://www.putty.nl/download.html
  • WinSCP (file transfer): http://winscp.net/eng/index.php
  • XWin-32 (http://www.starnet.com/)

5
CGM Lab Cluster (3)
  • Accounts are handled by LDAP (Lightweight
    Directory Access Protocol) on the server.
  • User files are stored on the server and accessed
    by every workstation using NFS (Network File
    System).
  • Same login and password will work on any
    workstation.

6
CGM Lab Cluster (4)
[Diagram: workstations cgm01 to cgm08 connected to the Carleton Network; cgm01 is the NFS server and LDAP server]
7
Account Creation
  • To get an account, send an email to Sylvain Pitre
    (spitre_at_scs.carleton.ca).
  • Include in your email:
  • your full name
  • your email address (if different from the one
    used to send the email)
  • your supervisor's name (or course professor)
  • your preferred login name (8 characters max)

8
Logging In Remotely
  • You can login remotely to the cluster by SSH
    (Secure Shell).
  • Users familiar with Unix/Linux should already know
    how to do this.
  • Windows users can use Putty, a lightweight SSH
    client (see link on slide 4).
  • Windows users can also log in with X-Win32.
  • DNS names: cgmXX.carleton.ca (XX = 01 to 08)
  • Log in to any node except cgm01 (the server).

9
Logging in with Putty
  • Under Host Name, enter the cgm machine you want
    to log into (cgm03 in this case) then click Open.
  • A terminal will open and ask you for your
    username then password.
  • That's it! You are logged into one of the cgm
    nodes.

10
Login with X-Win32
  • You can also log in to the nodes using X-Win32
  • Open the X-Win32 Configuration program (X-Config)
  • Under the Sessions Tab, click on Wizard.
  • Enter a name for the session (e.g. cgm03) and
    under Type click on ssh, then click Next.
  • As host, enter the name of the node you wish to
    connect to (e.g. cgm03.carleton.ca), then click
    Next.
  • Enter your login name and password and click
    Next.
  • For Command, click on Linux, then click Finish.
  • The new session is now added to your Sessions
    Window.

11
Login with X-Win32 (2)
  • Click on the newly created session then click on
    Launch.
  • After launching the session, you might get asked
    to accept a key (click on yes).
  • You should now be at a terminal.
  • You can work in this terminal if you wish (as
    in Putty), but if you want the visual
    interface, type:
  • gnome-session
  • After a few seconds the visual interface will
    start up.
  • Now you have access to all the menus and windows
    of the Fedora 8 interface (using Gnome).

12
Login with X-Win32 (3)
  • Demonstration

13
Account Setup
  • First time login
  • Once you have your account (send me an email to
    get one) and have logged in, change your password
    with the passwd command.
  • If you are unfamiliar with Unix/Linux:
  • I strongly recommend reading some tutorials and
    playing around with commands (but be careful!).
  • The rest of the slides assume some basic
    Unix/Linux knowledge.

14
Password-less SSH
  • In order to run MPI on different nodes
    transparently, we need to set up SSH so it doesn't
    constantly ask us for a password. Type:
  • ssh-keygen -t rsa
  • cd .ssh
  • cp id_rsa.pub authorized_keys2
  • chmod go-rwx authorized_keys2
  • ssh-agent $SHELL
  • ssh-add
  • cd ..

15
Password-less SSH (2)
  • Now, after your initial login, you should be able
    to SSH into any other cgmXX machine without a
    password. SSH to every workstation in order to
    add that node to your known_hosts. Type:
  • ssh cgm01 date (answer yes when asked)
  • ssh cgm02 date
  • ...
  • ssh cgm08 date

16
Ready for MPI!
  • After completing the steps above your account is
    now ready to run MPI jobs.
  • Running big jobs on multiple processors
  • Since there is no job scheduler, jobs are launched
    manually, so please be considerate. Use nodes that
    are not in use or that have less load (I'll show
    you how to check).
  • If you need all the nodes for a longer period of
    time, we'll try to reserve them for you.

17
Network Vs. Local Files
  • If you need to do a lot of disk I/O, it is
    preferable to use the local disk's /tmp
    directory.
  • Since your account is mounted by NFS, all files
    written to your home directory are sent to the
    server (network bottleneck).
  • To reduce network transfers, place your large
    input/output files in /tmp on your local node.
  • Make the filename unique.
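
One possible way to build a unique scratch filename is to combine a prefix with the node name and the process ID. The sketch below is only an illustration: the helper name make_scratch_path and the "/tmp/<prefix>_<host>_<pid>.tmp" pattern are assumptions, not something prescribed by the slides.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the slides): writes
 * "/tmp/<prefix>_<host>_<pid>.tmp" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int make_scratch_path(char *buf, size_t size,
                      const char *prefix, const char *host, long pid)
{
    int n = snprintf(buf, size, "/tmp/%s_%s_%ld.tmp", prefix, host, pid);

    /* snprintf returns the length it wanted to write; a value >= size
     * means the path was truncated and should not be used. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

On the cluster, the host and pid arguments could come from the POSIX calls gethostname() and getpid(), so that two jobs on the same node never share a file.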

18
Checking Cluster Load
  • To check the load on each workstation, type the
    command: load

19
Listing Your Jobs
  • To check all of your jobs (processes) across the
    cluster, type: listjobs

20
MPI Introduction
  • Message Passing Interface (MPI)
  • Portable message-passing standard that
    facilitates the development of parallel
    applications and libraries.
  • For parallel computers, clusters
  • Not a language on its own. It is used as a
    library with another language, like C or Fortran.
  • Different implementations: OpenMPI, LAM/MPI,
    MPICH
  • Portable: not limited to a specific architecture.

21
MPI Basics
  • Every node (process) executes the same code.
  • Nodes can follow different paths (master/slave
    model), but don't abuse this!
  • Communication is done by message passing.
  • Every node has a unique rank (ID) from 0 to p-1,
    where p is the number of processes.
  • The total number of nodes is known to every node.
  • Synchronous or asynchronous messages.
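  • Thread safety depends on the MPI implementation
    and on the thread support level requested with
    MPI_Init_thread().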

22
Compiling/Running MPI Programs
  • Compiler: mpicc
  • Command line:
  • mpirun -n <p> --hostfile <hostfile> <prog>
    <params>
  • Where <p> is the number of processes you want to
    use. Can be greater than the number of processors
    available (used for overloading or simulation).

23
Hostfile
  • For running a job on more than one node, a
    hostfile must be used.
  • What's in a hostfile:
  • Node name or IP.
  • How many processors on each node (1 by default).
  • Example:
  • cgm01 slots=2
  • cgm02 slots=2

24
MPI Startup/Finalize
  • #include "mpi.h"
  • int main(int argc, char **argv) {
  • int rank, wsize;
  • MPI_Init(&argc, &argv);
  • MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  • MPI_Comm_size(MPI_COMM_WORLD, &wsize);
  • /* CODE */
  • MPI_Finalize();
  • return 0;
  • }

25
MPI Types
MPI datatype          C type
MPI_CHAR              char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE              -
MPI_PACKED            -
26
MPI Functions
  • Send/receive
  • Broadcast
  • All to all
  • Gather/Scatter
  • Reduce
  • Barrier
  • Other

27
MPI Send/Receive (synch)
28
MPI Send/Receive (synch)
  • Communication between nodes (processors).
  • Blocking
  • int MPI_Send(void *buf, int count, MPI_Datatype
    datatype, int dest, int tag, MPI_Comm comm)
  • int MPI_Recv(void *buf, int count, MPI_Datatype
    datatype, int source, int tag, MPI_Comm comm,
    MPI_Status *status)
  • buf: send (or receive) buffer address
  • count: number of entries in buffer
  • datatype: data type of entries
  • dest: destination process rank
  • source: source process rank
  • tag: message tag
  • comm: communicator
  • status: status after operation (returned)
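
As a minimal sketch of the blocking calls above, the program below has rank 0 send one integer to rank 1. The payload value 42 and the tag 0 are arbitrary choices made for this illustration, and it assumes the job is started with at least two processes.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, wsize;
    const int tag = 0;              /* arbitrary tag, assumed for this example */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);

    if (wsize < 2) {
        if (rank == 0)
            fprintf(stderr, "Run with at least 2 processes.\n");
    } else if (rank == 0) {
        int value = 42;             /* example payload */
        /* Blocks until the send buffer can safely be reused. */
        MPI_Send(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Status status;
        /* Blocks until a matching message from rank 0 has arrived. */
        MPI_Recv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
        printf("Rank 1 received %d from rank %d\n", value, status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}
```

Ranks other than 0 and 1 simply skip the exchange and finalize.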

29
MPI Send/Receive (asynch)
  • A buffer can be used with asynchronous messages.
  • Problems occur when the buffer becomes empty or
    full.

30
MPI Send/Receive (asynch)
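  • Non-blocking: the call returns immediately, and
    the transfer is only known to be complete after
    MPI_Wait() or MPI_Test() succeeds.
  • int MPI_Isend(void *buf, int count, MPI_Datatype
    datatype, int dest, int tag, MPI_Comm comm,
    MPI_Request *request)
  • int MPI_Irecv(void *buf, int count, MPI_Datatype
    datatype, int source, int tag, MPI_Comm comm,
    MPI_Request *request)
  • Parameters are the same as MPI_Send() and
    MPI_Recv(), except that a request handle is
    returned instead of a status.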
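
A small sketch of the non-blocking calls, under the assumption that each process passes its rank to its right-hand neighbour in a ring. The ring pattern and the variable names are chosen only for illustration.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, wsize;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);

    int right = (rank + 1) % wsize;            /* neighbour we send to */
    int left  = (rank + wsize - 1) % wsize;    /* neighbour we receive from */
    int sendval = rank;
    int recvval = -1;
    MPI_Request reqs[2];

    /* Both calls return immediately; the transfers proceed in the background. */
    MPI_Irecv(&recvval, 1, MPI_INT, left, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendval, 1, MPI_INT, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* Independent work could go here, as long as it does not
     * touch sendval or recvval before the wait below. */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    printf("Rank %d received %d from rank %d\n", rank, recvval, left);

    MPI_Finalize();
    return 0;
}
```

Posting the receive before the send is a common habit that avoids relying on internal buffering.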

31
MPI Broadcast
  • One to all (including itself).

32
MPI Broadcast (syntax)
  • int MPI_Bcast(void *buf, int count, MPI_Datatype
    datatype, int root, MPI_Comm comm)
  • buf: send buffer address
  • count: number of entries in buffer
  • datatype: data type of entries
  • root: rank of root
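
A minimal broadcast sketch, assuming rank 0 is the root and the value being shared is a single integer parameter. The name nsteps and the value 100 are made up for this example.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    int nsteps = 0;                 /* hypothetical parameter to share */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        nsteps = 100;               /* only the root knows the value at first */

    /* Every process makes the same call; afterwards all copies match the root's. */
    MPI_Bcast(&nsteps, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank %d has nsteps = %d\n", rank, nsteps);

    MPI_Finalize();
    return 0;
}
```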
33
MPI All to All
  • Flood a message from every process to every
    process.
  • int MPI_Alltoall(void *sendbuf, int sendcount,
    MPI_Datatype sendtype, void *recvbuf, int
    recvcount, MPI_Datatype recvtype, MPI_Comm comm)
  • sendbuf: send buffer address
  • sendcount: number of elements sent to each process
  • sendtype: data type of send elements
  • recvbuf: receive buffer address (loaded)
  • recvcount: number of elements received from each
    process
  • recvtype: data type of receive elements
  • comm: communicator

34
MPI All to All
35
MPI All to All (alternative)
  • MPI_Alltoallv()
  • Sends data to all processes, with displacement.
  • int MPI_Alltoallv(void *sendbuf, int *sendcounts,
    int *sdispls, MPI_Datatype sendtype, void
    *recvbuf, int *recvcnts, int *rdispls,
    MPI_Datatype recvtype, MPI_Comm comm)

36
MPI Gather (Description)
  • MPI_Gather()
  • Each process in comm sends the contents of
    sendbuf to the process with rank root. The process
    root concatenates the received data in process
    rank order in recvbuf. That is, the data from
    process 0 is followed by the data from process 1,
    which is followed by the data from process 2, etc.
    The recv arguments are significant only on the
    process with rank root. The argument recvcount
    indicates the number of items received from each
    process, not the total number received.

37
MPI Scatter (Description)
  • MPI_Scatter()
  • The process with rank root distributes the
    contents of sendbuf among the processes. The
    contents of sendbuf are split into p segments,
    each consisting of sendcount items. The first
    segment goes to process 0, the second to process
    1, etc. The send arguments are significant only
    on process root.

38
MPI Gather/Scatter
[Figure: Gather and Scatter diagrams]
39
MPI Gather/Scatter (syntax)
  • int MPI_Gather(void *sendbuf, int sendcount,
    MPI_Datatype sendtype, void *recvbuf, int
    recvcount, MPI_Datatype recvtype, int root,
    MPI_Comm comm)
  • int MPI_Scatter(void *sendbuf, int sendcount,
    MPI_Datatype sendtype, void *recvbuf, int
    recvcount, MPI_Datatype recvtype, int root,
    MPI_Comm comm)
  • sendbuf: send buffer address
  • sendcount: number of send buffer elements
  • sendtype: data type of send elements
  • recvbuf: receive buffer address (loaded)
  • recvcount: number of elements each process
    receives (scatter) or the root receives from each
    process (gather)
  • recvtype: data type of receive elements
  • root: rank of sending (scatter) or receiving
    (gather) process
  • comm: communicator
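
The two calls are often used as a pair. The sketch below assumes a fixed chunk of 4 integers per process (the CHUNK constant and the "double every value" step are invented for this example): the root scatters an array, every process works on its piece, and the root gathers the results back in rank order.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define CHUNK 4                     /* items per process, assumed for this example */

int main(int argc, char **argv)
{
    int rank, wsize;
    int *data = NULL;               /* full array, only allocated on the root */
    int local[CHUNK];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);

    if (rank == 0) {
        data = malloc((size_t)wsize * CHUNK * sizeof *data);
        if (data == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);
        for (int i = 0; i < wsize * CHUNK; i++)
            data[i] = i;
    }

    /* Segment k of data (CHUNK items) goes to rank k. */
    MPI_Scatter(data, CHUNK, MPI_INT, local, CHUNK, MPI_INT, 0, MPI_COMM_WORLD);

    for (int i = 0; i < CHUNK; i++)
        local[i] *= 2;              /* the per-process work */

    /* Pieces are concatenated on the root in rank order. */
    MPI_Gather(local, CHUNK, MPI_INT, data, CHUNK, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Last element is now %d\n", data[wsize * CHUNK - 1]);
        free(data);
    }

    MPI_Finalize();
    return 0;
}
```

Note that sendcount and recvcount are per-process counts, not the total array length.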

40
MPI Gatherv/Scatterv
  • Similar to gather/scatter, but allow varying
    amounts of data to be sent instead of a fixed
    amount.
  • For example, varying parts of an array can be
    scattered/gathered in one step.
  • See the Parallel Image Processing example to see
    how they can be used.

41
MPI Gatherv/Scatterv (Syntax)
  • int MPI_Scatterv(void *sendbuf, int *sendcounts,
    int *displs, MPI_Datatype sendtype, void
    *recvbuf, int recvcount, MPI_Datatype recvtype,
    int root, MPI_Comm comm)
  • int MPI_Gatherv(void *sendbuf, int sendcount,
    MPI_Datatype sendtype, void *recvbuf, int
    *recvcounts, int *displs, MPI_Datatype recvtype,
    int root, MPI_Comm comm)
  • sendcounts: number of send buffer elements for
    each process
  • recvcounts: number of elements received from
    each process
  • displs: displacement for each process
  • Other parameters are the same as gather/scatter.

42
MPI Reduce
  • Gather results and reduce them to one value using
    an operation (Max, Min, Sum, Product).

43
MPI Reduce (syntax)
  • int MPI_Reduce(void *sendbuf, void *recvbuf, int
    count, MPI_Datatype datatype, MPI_Op op, int
    root, MPI_Comm comm)
  • sendbuf: send buffer address
  • recvbuf: receive buffer address
  • count: number of send buffer elements
  • datatype: data type of send elements
  • op: reduce operation
  • - MPI_MAX: Maximum
  • - MPI_MIN: Minimum
  • - MPI_SUM: Sum
  • - MPI_PROD: Product
  • root: root process rank for result
  • comm: communicator
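
A short reduction sketch, assuming each process contributes the value rank + 1 and the root wants the sum. The contributed values are invented for illustration; any per-process result could take their place.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, wsize;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);

    int local = rank + 1;           /* this process's contribution */
    int total = 0;                  /* only meaningful on the root afterwards */

    /* Combine one int from every process with MPI_SUM; result lands on rank 0. */
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of 1..%d is %d\n", wsize, total);

    MPI_Finalize();
    return 0;
}
```

Swapping MPI_SUM for MPI_MAX, MPI_MIN or MPI_PROD changes the operation without any other change to the call.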

44
MPI Barrier
  • Blocks until all processes have called it.
  • int MPI_Barrier(MPI_Comm comm)
  • comm communicator

45
Other MPI Routines
  • MPI_Allgather(): Gather values and distribute to
    all.
  • MPI_Allgatherv(): Gather values into specified
    locations and distribute to all.
  • MPI_Reduce_scatter(): Combine values and scatter
    results.
  • MPI_Wait(): Waits for an MPI send/receive to
    complete, then returns.

46
Parallel Problem Examples
  • Embarrassingly Parallel
  • Simple Image Processing (Brightness, Negative)
  • Pipelined computations
  • Sorting
  • Synchronous computations
  • Heat Distribution Problem
  • Cellular Automata
  • Divide and Conquer
  • N-Body Problem

47
MPI Hello World!
  • #include <stdio.h>
  • #include "mpi.h"
  • int main(int argc, char **argv) {
  • int rank, wsize;
  • MPI_Status status;
  • MPI_Init(&argc, &argv);
  • MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  • MPI_Comm_size(MPI_COMM_WORLD, &wsize);
  • printf("Hello World! I am processor %d.\n", rank);
  • MPI_Finalize();
  • return 0;
  • }

48
Parallel Image processing
  • Input: image of size MxN.
  • Output: negative of the image.
  • Each processor should have an equal share of the
    work, roughly (MxN)/P.
  • Master/slave model
  • The master will read in the image and distribute
    the pixels to the slave nodes. Once done, the
    slaves will return the results to the master,
    which will output the negative image.

49
Parallel Image processing (2)
  • Workload
  • If we have 32 pixels to process, and 4 CPUs, each
    CPU will process 8 pixels.
  • For P0, the work will start at pixel 0
    (displacement) and process 8 pixels (count).
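
The count and displacement for every processor can be computed with a short loop. The helper below is a sketch only: the name compute_partition and the rule of giving the leftover pixels to the lowest ranks are assumptions, since the slides only show the evenly divisible case.

```c
/* Hypothetical helper (not from the slides): split n items among p processes.
 * counts[i] is how many items rank i handles, displs[i] is where it starts.
 * When n is not divisible by p, the first (n % p) ranks take one extra item. */
void compute_partition(int n, int p, int *counts, int *displs)
{
    int base = n / p;
    int extra = n % p;
    int offset = 0;

    for (int i = 0; i < p; i++) {
        counts[i] = base + (i < extra ? 1 : 0);
        displs[i] = offset;
        offset += counts[i];
    }
}
```

For 32 pixels and 4 CPUs this gives counts of 8, 8, 8, 8 and displacements of 0, 8, 16, 24, which matches the example on this slide.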

50
Parallel Image processing (3)
  • Find the displacement/count for each processor.
  • Master processor scatters the image
  • Execute the negative operation
  • Gather the results on the master processor.
  • Displacement (displs) tells you where to start,
    count (counts) tells you how many to do.
  • MPI_Scatterv(image, counts, displs, MPI_CHAR,
    image, counts[myId], MPI_CHAR, 0,
    MPI_COMM_WORLD)
  • MPI_Gatherv(image, counts[myId], MPI_CHAR,
    image, counts, displs, MPI_CHAR, 0,
    MPI_COMM_WORLD)
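
Putting the steps together, a complete program might look like the sketch below. Several things here are assumptions made for illustration: the image is a synthetic 32-pixel array rather than a file read from disk, pixels are treated as unsigned 8-bit values (hence MPI_UNSIGNED_CHAR), and each process receives into a separate buffer named local, because MPI does not allow the send and receive buffers of one call to overlap.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NPIXELS 32                  /* assumed image size for this example */

int main(int argc, char **argv)
{
    int myId, p;
    unsigned char *image = NULL;    /* full image, only allocated on the master */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myId);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    /* Step 1: every process computes the same counts/displs tables. */
    int *counts = malloc((size_t)p * sizeof *counts);
    int *displs = malloc((size_t)p * sizeof *displs);
    if (counts == NULL || displs == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);
    int offset = 0;
    for (int i = 0; i < p; i++) {
        counts[i] = NPIXELS / p + (i < NPIXELS % p ? 1 : 0);
        displs[i] = offset;
        offset += counts[i];
    }

    if (myId == 0) {
        image = malloc(NPIXELS);
        if (image == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);
        for (int i = 0; i < NPIXELS; i++)
            image[i] = (unsigned char)(i * 8);   /* synthetic pixel data */
    }

    /* Allocate at least one byte so the pointer is valid even for a zero count. */
    unsigned char *local = malloc(counts[myId] > 0 ? (size_t)counts[myId] : 1);
    if (local == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);

    /* Step 2: the master scatters the image. */
    MPI_Scatterv(image, counts, displs, MPI_UNSIGNED_CHAR,
                 local, counts[myId], MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    /* Step 3: negative operation on the local pixels. */
    for (int i = 0; i < counts[myId]; i++)
        local[i] = (unsigned char)(255 - local[i]);

    /* Step 4: gather the results back on the master. */
    MPI_Gatherv(local, counts[myId], MPI_UNSIGNED_CHAR,
                image, counts, displs, MPI_UNSIGNED_CHAR, 0, MPI_COMM_WORLD);

    if (myId == 0)
        printf("First pixel after negative: %d\n", image[0]);

    free(local);
    free(image);
    free(counts);
    free(displs);
    MPI_Finalize();
    return 0;
}
```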

51
MPI Timing
  • Calculate the wall-clock time of some code. Can
    be executed by the master to find the total
    runtime.
  • double start, total;
  • start = MPI_Wtime();
  • // Do some work!
  • total = MPI_Wtime() - start;
  • printf("Total Runtime: %f\n", total);

52
Compiling and Running Your First MPI Program
  • Download the MPI_hello.tar.gz example from the
    cgmlab.carleton.ca website. In the terminal, type:
  • wget http://cgmlab.carleton.ca/files/MPI_hello.tar.gz
  • Uncompress the files by typing:
  • tar zxvf MPI_hello.tar.gz
  • Compile the program by typing:
  • make
  • Run the program on all 16 cores by typing:
  • mpirun -np 16 --hostfile hostfile ./hello

53
What To Do Next?
  • There is also a prefix sums example on the
    cgmlab.carleton.ca website.
  • Try other examples you find on the web.
  • Find MPI tutorials online or in books.
  • Write your own MPI programs.
  • Have fun :)

54
References
  • Parallel Programming: Techniques and Applications
    Using Networked Workstations and Parallel
    Computers, Barry Wilkinson and Michael Allen,
    Prentice Hall, 1999.
  • MPI Information/Tutorials:
  • http://www-unix.mcs.anl.gov/mpi/learning.html
  • A draft of a Tutorial/User's Guide for MPI by
    Peter Pacheco:
  • ftp://math.usfca.edu/pub/MPI/mpi.guide.ps
  • OpenMPI (http://www.open-mpi.org/)