1
Distributed Parallel Processing MPICH-VMI
  • Avneesh Pant

2
VMI
  • What is VMI?
  • Virtual Machine Interface
  • High performance communication middleware
  • Abstracts underlying communication network
  • What is MPICH-VMI?
  • MPI library based on MPICH 1.2 from Argonne that
    uses VMI for underlying communication

3
Features
  • Communication over heterogeneous networks
  • Infiniband, Myrinet, TCP, Shmem supported
  • Underlying networks selected at runtime
  • Enables cross-site jobs over compute grids
  • Optimized point-to-point communication
  • Higher level MPICH protocols (eager and
    rendezvous) implemented over RDMA Put and Get
    primitives
  • RDMA emulated on networks without native RDMA
    support (TCP)
  • Extensive support for profiling
  • Profiling counters collect information about
    communication pattern of the application
  • Profiling information is logged to a database
    during MPI_Finalize
  • Profile Guided Optimization (PGO) framework uses
    profile databases to optimize subsequent
    executions of the application

4
Features
  • Hiding pt-to-pt communication latency
  • RDMA get protocol very useful in overlapping
    communication and computation
  • PGO infrastructure maps MPI processes to nodes to
    take advantage of heterogeneity of underlying
    network, effectively hiding latencies
  • Optimized Collectives
  • RDMA-based collectives (e.g., MPI_Barrier)
  • Multicast-based collectives (e.g., MPI_Bcast has
    an experimental implementation using multicast)
  • Topology aware collectives (currently MPI_Bcast,
    MPI_Reduce, MPI_Allreduce supported)

5
MPI on Teragrid
  • MPI flavors available on Teragrid
  • MPICH-GM
  • MPICH-G2
  • MPICH-VMI 1
  • Deprecated! Was part of CTSS v1
  • MPICH-VMI 2
  • Available as part of CTSS v2 and v3
  • All are part of CTSS
  • Which one to use?
  • We are biased!

6
MPI on Teragrid
  • MPI Designed for
  • MPICH-GM -> single-site runs using Myrinet
  • MPICH-G2 -> running across Globus grids
  • MPICH-VMI2 -> scaling out seamlessly from a single
    site to across the grid
  • Currently you need to keep two separate executables
  • Single-site runs use MPICH-GM and grid jobs use
    MPICH-G2
  • MPICH-VMI2 allows you to use the same executable
    with comparable or better performance

7
MPI on Teragrid
8
Using MPICH-VMI
  • Two flavors of MPICH-VMI2 on Teragrid
  • GCC compiled library
  • Intel compiled library
  • It is recommended not to mix them
  • CTSS defines a key for each compiled library
  • GCC: mpich-vmi-2.1.0-1-gcc-3-2
  • Intel: mpich-vmi-2.1.0-1-intel-8.0
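A quick way to check which of these keys a site actually provides is the SoftEnv listing command (a sketch; assumes the standard SoftEnv tools deployed with CTSS):

    softenv | grep mpich-vmi    # list the MPICH-VMI keys installed on this site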

9
Setting the Environment
  • To use MPICH-VMI 2.1:
  • soft add mpich-vmi-2.1.0-1-gcc-3-2
    (or mpich-vmi-2.1.0-1-intel-8.0)
  • To preserve the VMI 2.1 environment across sessions,
    add mpich-vmi-2.1.0-1-gcc-3-2 (or
    mpich-vmi-2.1.0-1-intel-8.0) to the .soft file in
    your home directory
  • Intel 8.1 is also available at NCSA. Other sites
    do not have Intel 8.1 completely installed yet.
  • Softenv brings the compiler wrapper scripts into
    your environment
  • mpicc and mpiCC for C and C++ codes
  • mpif77 and mpif90 for F77 and F90 codes
  • Some underlying compilers, such as the GNU compiler
    suite, do not support F90. Use mpif90 -show to
    determine the underlying compiler being used.
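A minimal sketch of the session setup described above, using the GCC key from the previous slide (substitute the Intel key if that is the library you need):

    soft add mpich-vmi-2.1.0-1-gcc-3-2   # load MPICH-VMI 2.1 (GCC build) into the current session
    mpif90 -show                         # print the underlying Fortran compiler the wrapper will invoke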

10
Compiling with MPICH-VMI
  • The compiler scripts are wrappers that include
    all MPICH-VMI specific libraries and paths
  • All underlying compiler switches are supported
    and passed to the compiler
  • e.g., mpicc hello.c -o hello
  • The MPICH-VMI library by default is compiled with
    debug symbols.
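For illustration, a couple of hedged compile lines showing ordinary switches passing straight through to the underlying compiler (hello.c and solver.f90 are placeholder sources):

    mpicc  -O2 hello.c    -o hello    # C code; -O2 is handed to the underlying C compiler
    mpif90 -O2 solver.f90 -o solver   # F90 code; requires an underlying compiler with F90 support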

11
Running with MPICH-VMI
  • mpirun script is available for launching jobs
  • Supports all standard arguments in addition to
    MPICH-VMI specific arguments
  • mpirun uses ssh, rsh and MPD for launching jobs.
    Default is MPD.
  • Provides automatic selection/failover
  • If an MPD ring is not available, it falls back to
    ssh/rsh
  • Supports standard way to run jobs
  • mpirun -np <# of procs> -machinefile <nodelist
    file> <executable> <arguments>
  • -machinefile argument is not needed when running
    within a PBS or LSF environment
  • Can select the network to use at runtime by
    specifying -specfile <network>
  • Supported networks are myrinet, tcp and
    xsite-myrinet-tcp
  • Default network on Teragrid is Myrinet
  • Recommend always specifying the network explicitly
    using the -specfile switch
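Putting the pieces together, a sketch of two launches of the same binary outside a batch system (nodes.txt is a hypothetical host list, one hostname per line):

    mpirun -np 8 -machinefile nodes.txt -specfile myrinet ./hello   # Myrinet transport
    mpirun -np 8 -machinefile nodes.txt -specfile tcp     ./hello   # same executable, TCP selected at runtime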

12
Running with MPICH-VMI
  • MPICH-VMI 2.1 specific arguments fall into three
    broad categories
  • Parameters for runtime tuning
  • Parameters for launching GRID jobs
  • Parameters for controlling profiling of job
  • mpirun -help lists all tunable parameters
  • All MPICH-VMI 2.1 specific parameters are
    optional. GRID jobs require some parameters to be
    set.
  • To run a simple job within a Teragrid cluster:
  • mpirun -np 4 /path/to/hello
  • mpirun -np 4 -specfile myrinet /path/to/hello
  • Within PBS, the PBS_NODEFILE environment variable
    contains the path to the list of nodes allocated
    at runtime
  • mpirun -np <# procs> -machinefile $PBS_NODEFILE
    /path/to/hello
  • For cross-site jobs, additional arguments
    required (discussed later)
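As an illustration, a minimal PBS batch script along the lines described above (the resource request and executable path are placeholders):

    #!/bin/sh
    #PBS -l nodes=2:ppn=2
    #PBS -l walltime=00:30:00
    cd $PBS_O_WORKDIR
    mpirun -np 4 -machinefile $PBS_NODEFILE -specfile myrinet /path/to/hello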

13
For Detecting/Reporting Errors
  • Verbosity switches
  • -v Verbose Level 1. Output VMI startup messages
    and make MPIRUN verbose.
  • -vv Verbose Level 2. Additionally output any
    warning messages.
  • -vvv Verbose Level 3. Additionally output any
    error messages.
  • -vvvv Verbose Level 10. Excess Debug. Useful only
    for developers of MPICH-VMI and submitting crash
    dumps.
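For example, adding warnings to the output of a small run (a sketch built from the switches above):

    mpirun -vv -np 4 -specfile myrinet /path/to/hello   # level 2: startup messages plus warnings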

14
Running Inter Site Jobs
  • An MPICH-VMI GRID job consists of one or more
    subjobs
  • A subjob is launched on each site using
    individual mpirun commands. The specfile selected
    should be one of the xsite network transports
    (xsite-mst-tcp or xsite-myrinet-tcp).
  • The higher performance SAN (Infiniband or Myrinet)
    is used for intra-site communication. Cross-site
    communication uses TCP automatically.
  • In addition to the intra-site parameters, all
    inter-site runs must specify the same grid-specific
    parameters
  • A Grid CRM must be available on the network to
    synchronize subjobs
  • Grid CRM on Teragrid is available at
    tg-master2.ncsa.uiuc.edu
  • No reason why any other site can't host their own
  • In fact, you can run one on your own desktop!
  • Grid Specific Parameters
  • -grid-procs Specifies the total number of
    processes in the job. The -np parameter to mpirun
    still specifies the number of processes in the
    subjob.
  • -grid-crm Specifies the host running the grid CRM
    to be used for subjob synchronization.
  • -key Alphanumeric string that uniquely identifies
    the grid job. This should be the same for all
    subjobs!

15
Running Inter Site Jobs
  • Running xsite across SDSC (2 procs) and NCSA (6
    procs)
  • @SDSC: mpirun -np 2 -grid-procs 8 -key myxsitejob
    -specfile xsite-myrinet-tcp -grid-crm
    tg-master2.ncsa.teragrid.org cpi
  • @NCSA: mpirun -np 6 -grid-procs 8 -key myxsitejob
    -specfile xsite-myrinet-tcp -grid-crm
    tg-master2.ncsa.teragrid.org cpi

16
MPICH-VMI2 Support
  • Support
  • help@teragrid.org
  • Mailing lists: http://vmi.ncsa.uiuc.edu/mailingLists.php
  • Announcements: vmi-announce@yams.ncsa.uiuc.edu
  • Users: vmi-user@yams.ncsa.uiuc.edu
  • Developers: vmi-devel@yams.ncsa.uiuc.edu