Hybrid Programming with OpenMP and MPI

1
Hybrid Programming with OpenMP and MPI
  • Kushal Kedia
  • Reacting Gas Dynamics Laboratory
  • 18.337J Final Project Presentation, May 13th
    2009
  • kushal_at_mit.edu

2
Motivation: Flame dynamics
  • 8 processors on a single shared-memory machine, pure OpenMP
  • 2 cycles of a 200 Hz flame oscillation (0.01
    seconds of real time) take approximately 5 days!

3
MPI? OpenMP?
  • MPI: Message Passing Interface (cluster of computers)
  • OpenMP: Open Multi-Processing (desktop)

Symmetric Multiprocessing (SMP)
4
Modern Clusters (multiple SMPs)
5
MPI vs. OpenMP
  • MPI pros
      • Portable to distributed and shared memory
        machines
      • Scales beyond one node
      • No data placement problem
  • MPI cons
      • Difficult to develop and debug
      • High latency, low bandwidth
      • Explicit communication
      • Difficult load balancing
  • OpenMP pros
      • Easy to implement parallelism
      • Low latency, high bandwidth
      • Implicit communication
      • Dynamic load balancing
  • OpenMP cons
      • Runs only on shared memory machines
      • Scales only within one node
      • Possible data placement problem
      • No specific thread order

6
Why Hybridization? The best of both paradigms
  • Introducing MPI into OpenMP applications can help
    them scale across multiple SMP nodes
  • Introducing OpenMP into MPI applications can help
    make more efficient use of the shared memory on
    SMP nodes, reducing the need for explicit
    intra-node communication
  • Introducing both MPI and OpenMP during the
    design and coding of a new application can help
    maximize efficiency, performance, and scaling

7
Problem Statement
Steady-state, heat-transfer-like problem
8
Solution
9
Grid and parallel Decomposition
10
MPI Scenario
  • Each chunk of the grid goes to a separate
    processor
  • Explicit communication calls are made after every
    iteration, regardless of whether the processors
    are on the same SMP node or on different nodes
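
The scheme on this slide can be sketched in a few lines. This is a toy, single-process imitation, not the presenter's code: the slides do not show the solver, so a Jacobi-style four-point average is assumed, the "ranks" are plain Python lists, and `exchange_halos` only stands in for the MPI send/receive calls. All function names and the hot-top-edge setup are hypothetical.

```python
def jacobi_sweep(block):
    """One Jacobi sweep over the interior of a block; edge rows/columns are kept."""
    new = [row[:] for row in block]
    for i in range(1, len(block) - 1):
        for j in range(1, len(block[i]) - 1):
            new[i][j] = 0.25 * (block[i - 1][j] + block[i + 1][j]
                                + block[i][j - 1] + block[i][j + 1])
    return new


def decompose(grid, n_ranks):
    """Give each rank a contiguous band of interior rows plus two ghost rows.

    Assumes n_ranks does not exceed the number of interior rows.
    """
    interior = len(grid) - 2
    base, extra = divmod(interior, n_ranks)
    blocks, start = [], 1
    for r in range(n_ranks):
        rows = base + (1 if r < extra else 0)
        blocks.append([row[:] for row in grid[start - 1:start + rows + 1]])
        start += rows
    return blocks


def exchange_halos(blocks):
    """Stand-in for MPI messages: copy edge rows into the neighbours' ghost rows."""
    for r in range(len(blocks) - 1):
        upper, lower = blocks[r], blocks[r + 1]
        lower[0] = upper[-2][:]   # last owned row of r -> top ghost of r + 1
        upper[-1] = lower[1][:]   # first owned row of r + 1 -> bottom ghost of r


def solve_decomposed(grid, n_ranks, iterations):
    """Sweep every chunk, then communicate, once per iteration."""
    blocks = decompose(grid, n_ranks)
    for _ in range(iterations):
        blocks = [jacobi_sweep(b) for b in blocks]
        exchange_halos(blocks)
    # Reassemble: top boundary row, owned rows of each rank, bottom boundary row.
    out = [grid[0][:]]
    for b in blocks:
        out.extend(row[:] for row in b[1:-1])
    out.append(grid[-1][:])
    return out


def make_grid(n, hot=100.0):
    """n x n grid, zero everywhere except a hot top edge (hypothetical setup)."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [hot] * n
    return grid
```

Calling `solve_decomposed(make_grid(10), 4, 25)` should reproduce 25 serial sweeps of the same grid, because every chunk sees up-to-date neighbour rows at the start of each iteration. In a real MPI code the exchange costs a message per neighbour per iteration, whether or not the neighbour sits on the same node.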

11
Hybrid Scenario
  • A single MPI process on each SMP node
  • Each process spawns 4 threads, which carry out
    the OpenMP iterations
  • The master thread on each SMP node communicates
    after every iteration
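
The structure of one hybrid iteration can be sketched as follows. This is an illustration under assumptions, not the presenter's implementation: a Jacobi-style four-point average is assumed for the update, Python threads stand in for OpenMP threads (so no speedup is expected here), and the row copies in `hybrid_step` stand in for the MPI calls made by the master thread. The names `update_rows`, `threaded_sweep`, and `hybrid_step` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

N_THREADS = 4  # threads per MPI process, as on the slide


def update_rows(old, new, first, last):
    """Jacobi update of rows first..last-1; reads `old`, writes only `new`."""
    for i in range(first, last):
        for j in range(1, len(old[i]) - 1):
            new[i][j] = 0.25 * (old[i - 1][j] + old[i + 1][j]
                                + old[i][j - 1] + old[i][j + 1])


def threaded_sweep(block, pool):
    """One sweep of a node's block, with its owned rows shared among the threads."""
    new = [row[:] for row in block]
    owned = len(block) - 2                      # rows 1..owned are updated
    bounds = [1 + (owned * t) // N_THREADS for t in range(N_THREADS + 1)]
    futures = [pool.submit(update_rows, block, new, bounds[t], bounds[t + 1])
               for t in range(N_THREADS)]
    for f in futures:
        f.result()                              # wait for all threads, like a barrier
    return new


def hybrid_step(blocks, pool):
    """Compute with threads on every node, then exchange halos from one thread."""
    blocks = [threaded_sweep(b, pool) for b in blocks]
    for r in range(len(blocks) - 1):            # communication phase, master only
        blocks[r + 1][0] = blocks[r][-2][:]     # last owned row -> neighbour's ghost
        blocks[r][-1] = blocks[r + 1][1][:]     # first owned row -> neighbour's ghost
    return blocks
```

Each element of `blocks` is one node's band of rows with a ghost row above and below. Because every thread reads only the old array and writes a disjoint set of rows in the new one, no locking is needed during the sweep; the worker threads have nothing to do during the exchange.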

12
Schematic Hybrid Scenario
13
Computational resources
  • The parallel simulations run on the Pharos
    cluster (a shared and distributed memory
    architecture) of the Reacting Gas Dynamics
    Laboratory (Dept. of Mechanical Engineering, MIT).
  • Pharos consists of about 60 Intel Xeon
    Harpertown nodes, each with dual quad-core CPUs
    (8 processors) running at 2.66 GHz.

14
Fixed grid size of 2500 x 2500
15
Constant number of processors: 20
16
Hybrid program slower than pure MPI!
This is seen with many problems. Cases where hybrid
programs are faster than pure MPI are few, but the
hybrid approach works very well when the problem suits it.
17
Why? Possible arguments
  • OpenMP scales less well because its parallelism
    is implicit
  • All threads except one are idle while inter-node
    communication is taking place
  • High cache-miss rate due to the larger dataset
    and the data placement problem
  • OpenMP libraries are less optimized than MPI
    libraries
  • Communication/computation ratio is low
