Domain Decomposed Parallel Heat Distribution Problem in Two Dimensions

1
Domain Decomposed Parallel Heat Distribution
Problem in Two Dimensions
  • Yana Kortsarts
  • Jeff Rufinus
  • Widener University
  • Computer Science Department

2
Introduction
  • In 2004 the Office of Science of the Department
    of Energy issued a twenty-year strategic plan
    with seven highest priorities, ranging from
    fusion energy to genomics.
  • To achieve the necessary levels of algorithmic
    and computational capability it is essential to
    educate students in computation and computational
    techniques.
  • Parallel computing is one of the most attractive
    topics in the computational science field.

3
Introductory Parallel Computer Course
  • Computer Science Department, Widener University;
    CS and CIS majors
  • A series of two courses: Introduction to Parallel
    Computing I and II
  • Resources: a computer cluster of six nodes; each
    node has two 2.4 GHz processors and 1 GB of
    memory; the nodes are connected by a Gigabit
    Ethernet switch.

4
Course Curriculum
  • Matrix Manipulation
  • Numerical Simulation Concepts
  • Direct applications in science and engineering
  • Introduction to MPI libraries and their
    applications
  • Concepts of parallelism
  • Finite difference method for the 2-D heat
    equation using parallel algorithm

5
2-D Heat Distribution Problem
  • The problem: to determine the temperature
    u(x, y, t) in an isotropic two-dimensional
    rectangular plate
  • The model: the two-dimensional heat equation

6
Finite Difference Method
  • The finite difference method begins with the
    discretization of space and time such that there
    is an integer number of points in space and an
    integer number of times at which we calculate the
    temperature

[Figure: space-time discretization grid showing a spatial point (x_i, y_j), time levels t_k and t_{k+1}, and spacings Δt, Δx, Δy]
7
We will use the following notation:
We will use the finite difference approximations
for the derivatives:
Expressing u_{i,j,k+1} from this equation yields:
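The equations on this slide were rendered as images in the original deck; a standard reconstruction of the explicit scheme for u_t = α(u_xx + u_yy), assuming diffusivity α and spacings Δx, Δy, Δt, is:

```latex
% notation: u_{i,j,k} \approx u(x_i, y_j, t_k)
\frac{u_{i,j,k+1}-u_{i,j,k}}{\Delta t}
  = \alpha\left(
      \frac{u_{i+1,j,k}-2u_{i,j,k}+u_{i-1,j,k}}{\Delta x^{2}}
    + \frac{u_{i,j+1,k}-2u_{i,j,k}+u_{i,j-1,k}}{\Delta y^{2}}
    \right)
% solving for the new time level:
u_{i,j,k+1} = u_{i,j,k}
  + \lambda_x\,(u_{i+1,j,k}-2u_{i,j,k}+u_{i-1,j,k})
  + \lambda_y\,(u_{i,j+1,k}-2u_{i,j,k}+u_{i,j-1,k}),
\qquad
\lambda_x = \frac{\alpha\,\Delta t}{\Delta x^{2}},\quad
\lambda_y = \frac{\alpha\,\Delta t}{\Delta y^{2}}
```

This update, presumably the "formula (1)" referenced in the single-processor pseudocode, gives every point at time level k+1 explicitly from its four neighbours at level k.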
8
Finite Difference Method: Explicit Scheme
[Figure: five-point stencil: u_{i,j,k+1} is computed from u_{i,j,k}, u_{i+1,j,k}, u_{i-1,j,k}, u_{i,j+1,k}, u_{i,j-1,k}; the scheme advances k → k+1]
9
Single Processor Implementation
  • double u_old[n+1][n+1], u_new[n+1][n+1]
  • Initialize u_old with initial values and boundary
    conditions
  • while (still time points to compute)
  •   for (i = 1; i < n; i++)
  •     for (j = 1; j < n; j++)
  •       compute u_new[i][j] using formula (1)
  •     // end of for
  •   // end of for
  •   u_old ← u_new
  • // end of while

10
Parallel Implementation: Domain Decomposition
  • Dividing computation and data into pieces
  • The domain can be decomposed in three ways:
  • Column-wise: adjacent groups of columns (A)
  • Row-wise: adjacent groups of rows (B)
  • Block-wise: adjacent groups of two-dimensional
    blocks (C)

11
Domain Decomposition and Partition
  • Example: column-wise domain decomposition method,
    200 points to be calculated simultaneously,
    4 processors
  • Processors communicate using MPI_Send and
    MPI_Recv

Processor 1: x0..x49
Processor 2: x50..x99
Processor 3: x100..x149
Processor 4: x150..x199
12
Load Imbalance
  • When dividing the data among processes we have to
    pay attention to the amount of work assigned to
    each processor
  • Uneven load distribution may cause some processes
    to finish earlier than others and sit idle
  • Load imbalance is one source of overhead
  • Good task mapping is needed
  • All tasks should be mapped onto processes as
    evenly as possible so that all tasks complete in
    the shortest amount of time and the idle time is
    minimized

13
Communication
  • Communication time depends on the latency and the
    bandwidth of the communication network; both are
    much slower than the CPU
  • There is a penalty for using too many
    communication calls

14
Running Time and Speedup
  • The running time of one time iteration of the
    sequential algorithm is Θ(MN), where M and N are
    the numbers of grid points in each direction
  • The running time of one time iteration of the
    parallel algorithm is computation time plus
    communication time:
  • Θ(MN/p) + B
  • where p is the number of processors and B is
    the total send-receive communication time
    required for one time iteration
  • The speedup is defined as the ratio of the
    sequential running time to the parallel running
    time
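The slide's speedup formula was rendered as an image; a reconstruction consistent with the running times above is:

```latex
T_{\text{seq}} = \Theta(MN), \qquad
T_{\text{par}} = \Theta\!\left(\frac{MN}{p}\right) + B, \qquad
S(p) = \frac{T_{\text{seq}}}{T_{\text{par}}}
```

Because B does not shrink as p grows, S(p) falls below p, and for small problems it can even fall below 1, as the results below show.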

15
Results
The temperature distribution on the
two-dimensional plate at a much later time
16
Results
  • Two cases were considered:
  • M × N = 1,000
  • M × N = 500,000
  • The next slide shows the speedup versus the
    number of processors for the two inputs: 500,000
    (top chart) and 1,000 (bottom chart)
  • The dashed line indicates a speedup of one, i.e.,
    the performance of the sequential version of the
    algorithm
  • The higher the speedup at a given number of
    processors, the better the performance of the
    parallel algorithm
  • Most of the results come from the column-wise
    domain decomposition method

17
Results
[Figure: speedup versus number of processors; 500,000 inputs (top chart) and 1,000 inputs (bottom chart)]
18
Results
  • For the case of input = 1,000, the sequential
    version (p = 1) is faster than the parallel
    version (p = 2). The parallel version is slower
    because of the latency and limited bandwidth of
    the communication network, overheads that do not
    exist in the sequential version.
  • The top chart shows the speedup versus the number
    of processors for a total input of 500,000. In
    this case, as we increase the number of
    processors the speedup also increases, reaching
    4.13 at p = 10.
  • For a large number of inputs the computation time
    outgrows the communication time, resulting in
    better performance of the parallel algorithm.

19
Speedup comparisons for column-wise and block-wise
decomposition methods for 4 and 9 processors

Total inputs   Column-wise, P=4   Column-wise, P=9   Block-wise, P=4   Block-wise, P=9
1,000          0.093              0.05               0.065             0.04
500,000        2.12               3.82               2.19              3.51
20
Results
  • Overall, the speedups between the two methods are
    not very different.
  • For number of inputs = 1,000 the column-wise
    decomposition produces better speedups than the
    block-wise decomposition.
  • For number of inputs = 500,000 we have a mixed
    result: the column-wise method performs better
    for 9 processors, while the block-wise method
    performs slightly better for 4 processors.
  • The results in the table do not conclusively show
    which decomposition method is better; that would
    require extending the number of inputs and the
    number of processors beyond those used here.

21
Summary
  • A numerical simulation of two-dimensional heat
    distribution has been used as an example for
    teaching parallel computing concepts in an
    introductory course.
  • With this simple example we introduce the core
    concepts of parallelism:
  • Domain decomposition and partitioning
  • Load balancing and mapping
  • Communication
  • Speedup
  • We show benchmarking results for the parallel
    version of the two-dimensional heat distribution
    problem with different numbers of processors.

22
References
  • 1. J. Dongarra, I. Foster, G. Fox, W. Gropp, K.
    Kennedy, L. Torczon, and A. White, (Editors),
    Sourcebook of Parallel Computing. Elsevier
    Science (2003).
  • 2. I. Foster, Designing and Building Parallel
    Programs. Addison Wesley (1994).
  • 3. G. E. Karniadakis and R. M. Kirby, Parallel
    Scientific Computing in C and MPI. Cambridge
    University Press (2003).
  • 4. M. J. Quinn, Parallel Programming in C with
    MPI and OpenMP. McGraw Hill Publishers (2005).
  • 5. B. Wilkinson and M. Allen, Parallel
    Programming. Second edition. Prentice-Hall
    (2005).
  • 6. M. Snir, S. Otto, S. Huss-Lederman, D. Walker,
    and J. Dongarra, MPI: The Complete Reference,
    Volume 1. Second edition. MIT Press (1998).
  • 7. W. F. Ames, Numerical Methods for Partial
    Differential Equations. Second edition. Academic
    Press, New York (1977).
  • 8. T. Myint-U and L. Debnath, Partial
    Differential Equations for Scientists and
    Engineers. Elsevier Science (1987).
  • 9. G. D. Smith, Numerical Solution of Partial
    Differential Equations: Finite Difference
    Methods. Third edition. Oxford University Press
    (1985).
  • 10. S. S. Rao, Applied Numerical Methods for
    Engineers and Scientists. Prentice-Hall (2002).