1
Description of ScaleME
W.C. Chew, L. Hastriter, and S. Velamparambil
Center for Computational Electromagnetics
Dept. of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
2
ScaleME
  • ScaleME is a parallel implementation of the
    Multi-level Fast Multipole Algorithm (MLFMA).
  • It has been tested up to 10 million unknowns on
    the SGI Origin 2000 with 128 processors at NCSA.
  • It has been demonstrated to have better parallel
    efficiency than FISC.
  • It is about 7 times faster than FISC at
    performing a matrix-vector product (see the
    sketch after this list).
  • It is intended to be portable.
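
The matrix-vector product mentioned above is the operation that dominates each iteration of the underlying iterative solver, which is why its cost is the headline comparison against FISC. The following is a minimal sketch, not ScaleME's interface: it only shows how a fast matvec such as MLFMA can stand in for a dense matrix product inside a standard Krylov solver. The name mlfma_matvec and the small dense stand-in matrix are hypothetical placeholders.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    # Toy problem so the sketch runs; in practice Z is never formed and
    # the product Z @ x is evaluated by the (parallel) MLFMA instead.
    rng = np.random.default_rng(0)
    n = 200
    Z = np.eye(n) + 0.01 * rng.standard_normal((n, n))  # stand-in impedance matrix
    b = rng.standard_normal(n)                           # stand-in excitation (RHS)

    def mlfma_matvec(x):
        """Hypothetical placeholder for the fast O(N log N) product Z @ x."""
        return Z @ x

    # The Krylov solver only ever calls the matvec, so swapping the dense
    # product for an MLFMA evaluation leaves the iteration itself unchanged.
    A = LinearOperator((n, n), matvec=mlfma_matvec, dtype=float)
    x, info = gmres(A, b, atol=1e-8)
    print("converged" if info == 0 else f"info = {info}",
          "| residual:", np.linalg.norm(b - Z @ x))

Because the solver sees only the matvec, speeding up that one operation is what drives the per-iteration times quoted on the later slides.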

3
Essential Ideas
  • A simple way to parallelize MLFMA, which is a
    tree code, is to split the tree according to
    the workload at each node.
  • However, this gives rise to exorbitant
    communication cost.
  • Hence, a two-pronged approach is used: the bottom
    part of the tree is split according to the workload
    at each node, but the top is split according to
    the length of the messages passed between nodes
    (a sketch follows this list).
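
As a rough illustration of the two-pronged split, the sketch below partitions the distributed (bottom) levels by per-box workload and the shared (top) levels by message length. The box IDs, workloads, pattern length, and the greedy heuristic are all invented for illustration and are not ScaleME's actual data structures or algorithm.

    # Hypothetical sketch of the two-pronged partitioning described above.

    def split_by_workload(box_work, num_procs):
        """Distributed (bottom) levels: greedily assign whole boxes so that
        the summed per-box workload is balanced across processors."""
        loads = [0.0] * num_procs
        owner = {}
        for box, work in sorted(box_work.items(), key=lambda kv: -kv[1]):
            p = loads.index(min(loads))     # least-loaded processor so far
            owner[box] = p
            loads[p] += work
        return owner, loads

    def split_by_message_length(pattern_len, num_procs):
        """Shared (top) levels: the tree is replicated, and each box's
        radiation/receiving pattern is cut into equal slices so that the
        messages exchanged between processors have the same length."""
        slice_len = pattern_len // num_procs
        return [(p * slice_len, (p + 1) * slice_len) for p in range(num_procs)]

    # Toy example: six leaf-level boxes with uneven work on 2 processors,
    # and a 512-sample pattern at a shared level.
    owner, loads = split_by_workload(
        {"b0": 5, "b1": 1, "b2": 3, "b3": 4, "b4": 2, "b5": 5}, 2)
    print(owner, loads)                     # total work per processor: [10.0, 10.0]
    print(split_by_message_length(512, 2))  # [(0, 256), (256, 512)]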

4
Essential Ideas - Illustrated
  • We call the top levels of the tree the shared
    levels.
  • At the shared levels, the same tree is replicated
    on each processor.
  • Each processor gets half the radiation/receiving
    patterns of the boxes numbered 1, 2 and 3 in the
    accompanying figure.

5
Examples
  • We will show the scaling properties of ScaleME as
    the number of processors increases.

6
Matrix-Vector Products - Sphere (6λ)
  • Total number of levels: 5.
  • There is an initial improvement in parallel
    efficiency when the code goes from no shared
    levels to a small number of shared levels.
  • For small problem sizes, the use of more shared
    levels reduces parallel efficiency.

7
Matrix-Vector Products - Sphere (12λ)
  • Total number of levels: 6.
  • For a larger problem, the use of more shared
    levels enhances parallel efficiency, but
    eventually, parallel efficiency is lost with too
    many shared levels.

8
Matrix-Vector Products - Pencil at 8 GHz
  • Number of levels: 9
  • 1.2 million unknowns
  • Length: 3.17 meters; radius: 0.1 meters
  • f = 8 GHz
  • 5 GB of RAM
  • 300 s/iteration on 1 processor; 10 s/iteration
    on 32 processors (see the quick check below)
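
As a quick check, the per-iteration times quoted on this slide imply the following speedup and parallel efficiency; this is simple arithmetic on the numbers above, not additional data from the original figure.

    # Speedup and efficiency from the quoted timings:
    # 300 s/iteration on 1 processor, 10 s/iteration on 32 processors.
    t1, tp, p = 300.0, 10.0, 32
    speedup = t1 / tp             # 30.0
    efficiency = speedup / p      # 0.9375, i.e. roughly 94% parallel efficiency
    print(f"speedup = {speedup:.1f}, efficiency = {efficiency:.0%}")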
9
Matrix-Vector Products - Pencil at 4 GHz
  • Number of levels: 8
  • N = 291,774
  • < 1.5 GB of RAM
  • A carefully chosen shared level results in
    impressive scaling properties

10
Matrix-Vector Products - VFY-218
  • Full scale model
  • Realistic target
  • Has many geometric features which do not allow
    easy load balancing
  • Tested for frequencies from 500 MHz to 8 GHz

11
Matrix-Vector Products - VFY-218 at 500 MHz
  • Total number of levels: 6
  • For such a complex structure, communication cost
    is high.
  • When no shared levels are used, parallel
    efficiency is poor.

12
Matrix-Vector Products - VFY-218 at 1 GHz
When the problem size gets larger, the use of
shared levels can greatly enhance the parallel
efficiency.
13
Matrix-Vector Products - Scaling with Size (VFY-218)
14
Scaling of RCS computations
  • As a result of the efficient parallel FMM, RCS
    evaluation also becomes scalable

15
Parallel RCS Computation Time
  • Actual time required for evaluating bistatic RCS
    for 1800 angles on the VFY-218

16
Very Large Scale Problem - Sphere
  • N = 10,002,828; number of levels: 9
  • Time for matrix-vector products: 34 s on 126
    processors
  • Total solution time: 2 hrs, 5 mins

17
Very Large Scale Problem - VFY-218
  • Frequency: 8 GHz; N = 10,186,446
  • Time for matrix-vector products: 119 s on 126
    processors
  • Total solution time: 7 hrs and 25 mins (2 RHS)

18
Conclusions
  • The objective of the paper was to summarize our
    efforts at developing a scalable MLFMA-based fast
    solver for electromagnetic scattering
    calculations
  • Presented the essential ideas in the
    parallelization of dynamic MLFMA, which has
    exorbitant communication cost at the coarse level
    for a naïve parallelization
  • Demonstrated the performance of the method with
    several examples
  • Demonstrated the ability of the code to handle
    extremely large scale problems by solving
    problems involving more than 10 million unknowns