PaGrid: A Mesh Partitioner for Computational Grids



1
PaGrid: A Mesh Partitioner for Computational Grids
  • Virendra C. Bhavsar
  • Professor and Dean
  • Faculty of Computer Science
  • UNB, Fredericton
  • bhavsar_at_unb.ca
  • This work is done in collaboration with
    Sili Huang and Dr. Eric Aubanel.

2
Outline
  • Introduction
  • Background
  • PaGrid Mesh Partitioner
  • Experimental Results
  • Conclusion

3
Advanced Computational Research Laboratory
  • Virendra C. Bhavsar

4
ACRL Facilities
5
ACEnet Project
  • ACEnet (Atlantic Computational Excellence
    Network) is Atlantic Canada's entry into this
    national fabric of HPC facilities.
  • A partnership of seven institutions, including
    UNB, MUN, MTA, Dalhousie, StFX, SMU, and UPEI.
  • ACEnet was awarded $9.9M by the CFI in March
    2004. The project will be worth nearly $28M.

6
Mesh Partitioning Problem
(a) Heat distribution problem
(b) Corresponding application graph
7
Mesh Partitioning Problem
  • Mapping of the mesh onto the processors while
    minimizing the inter-processor communication cost
  • Balance the computational load among processors
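
The two objectives above can be made concrete with a small sketch. It assumes the mesh is given as a list of weighted edges and the partition as a list mapping each vertex to a processor; the names `edge_cut` and `imbalance` are illustrative and not taken from any partitioner. On a homogeneous system, the edge cut is a common proxy for inter-processor communication cost.

```python
def edge_cut(edges, part):
    """Sum the weights of edges whose endpoints lie in different subdomains."""
    return sum(w for u, v, w in edges if part[u] != part[v])


def imbalance(vertex_weights, part, num_parts):
    """Ratio of the heaviest subdomain load to the ideal (average) load."""
    loads = [0.0] * num_parts
    for v, w in enumerate(vertex_weights):
        loads[part[v]] += w
    return max(loads) / (sum(loads) / num_parts)


# A 2x2 grid mesh: vertices 0-1 on the top row, 2-3 on the bottom row.
edges = [(0, 1, 1), (2, 3, 1), (0, 2, 1), (1, 3, 1)]
vertex_weights = [1, 1, 1, 1]
part = [0, 0, 1, 1]  # top row on processor 0, bottom row on processor 1

cut = edge_cut(edges, part)                 # the two vertical edges are cut
ratio = imbalance(vertex_weights, part, 2)  # 1.0 means perfectly balanced
```

A partitioner searches for a `part` that keeps `ratio` close to 1 while making `cut` as small as possible.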

8
Computational Grids
This slide is from the Centre for Unified
Computing, University College Cork, Ireland.
9
Computational Grid Applications
10
A Computational Grid Model
  • Computational Grids and their heterogeneity in
    both processors and networks

11
Mesh Partitioning Problem
Equation: Total communication cost
12
Background
  • Generic Multilevel Partitioning Algorithm

This slide is from the CEPBA-IBM Research
Institute, Spain.
13
Background
  • Coarsening phase
  • Matching and contraction.
  • Heavy Edge Matching Heuristic.

(Figure: matching example with vertex u and candidate neighbours v1 and v2)
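
A minimal sketch of the heavy edge matching heuristic follows. It assumes the graph is stored as a dictionary of dictionaries of edge weights, and it takes the visiting order as an argument so that the example is deterministic; practical implementations usually visit vertices in random order. The function name and data layout are illustrative only.

```python
def heavy_edge_matching(adj, order):
    """Match each unmatched vertex with its unmatched neighbour of heaviest edge.

    adj:   dict vertex -> dict neighbour -> edge weight (undirected graph)
    order: sequence giving the order in which vertices are visited
    Returns a dict mapping every vertex to its partner (or to itself).
    """
    match = {}
    for u in order:
        if u in match:
            continue
        candidates = [(w, v) for v, w in adj[u].items() if v not in match]
        if candidates:
            _, v = max(candidates, key=lambda c: c[0])  # heaviest incident edge
            match[u] = v
            match[v] = u
        else:
            match[u] = u  # no free neighbour: u stays unmatched
    return match


# Vertex u has two neighbours; the edge to v2 is heavier than the edge to v1.
adj = {
    "u": {"v1": 2, "v2": 5},
    "v1": {"u": 2},
    "v2": {"u": 5},
}
match = heavy_edge_matching(adj, ["u", "v1", "v2"])
```

Here u is paired with v2, so the heavier edge is hidden inside a coarse vertex and can no longer be cut at the coarser level.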
14
Background
  • Refinement (Uncoarsening Phase)
  • Kernighan-Lin/Fiduccia-Mattheyses (KL-FM)
    refinement
  • Refine partitions under load balance constraint.
  • Compute a gain for each candidate vertex.
  • Each step, move a single vertex to a different
    subdomain.
  • Vertices with negative gains are allowed to
    migrate.
  • Greedy refinement
  • Similar to KL-FM refinement
  • Vertices with negative gains are not allowed to
    move
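
The gain computation and the greedy variant can be sketched as follows, assuming unit vertex weights and a simple cap on subdomain size as the load balance constraint. The names `gain`, `greedy_pass`, and `max_size` are hypothetical. A KL-FM pass would differ by also accepting negative-gain moves tentatively and rolling back to the best configuration seen.

```python
def gain(adj, part, v, target):
    """Reduction in edge cut if vertex v moves to subdomain `target`."""
    internal = sum(w for n, w in adj[v].items() if part[n] == part[v])
    external = sum(w for n, w in adj[v].items() if part[n] == target)
    return external - internal


def greedy_pass(adj, part, num_parts, max_size):
    """One greedy sweep: apply only strictly positive-gain moves that keep
    every subdomain at or below max_size vertices. Returns the move count."""
    sizes = [0] * num_parts
    for p in part.values():
        sizes[p] += 1
    moved = 0
    for v in adj:
        best_gain, best_target = 0, None
        for t in range(num_parts):
            if t == part[v] or sizes[t] + 1 > max_size:
                continue
            g = gain(adj, part, v, t)
            if g > best_gain:
                best_gain, best_target = g, t
        if best_target is not None:
            sizes[part[v]] -= 1
            sizes[best_target] += 1
            part[v] = best_target
            moved += 1
    return moved


def cut_weight(adj, part):
    """Edge cut for a symmetric adjacency dict (each edge is stored twice)."""
    return sum(w for u in adj for v, w in adj[u].items() if part[u] != part[v]) // 2


# Path a-b-c-d with a light middle edge; c starts on the wrong side.
adj = {"a": {"b": 3}, "b": {"a": 3, "c": 1}, "c": {"b": 1, "d": 3}, "d": {"c": 3}}
part = {"a": 0, "b": 0, "c": 0, "d": 1}
cut_before = cut_weight(adj, part)
moved = greedy_pass(adj, part, num_parts=2, max_size=3)
cut_after = cut_weight(adj, part)
```

Only c has a positive gain (3 - 1 = 2), so the sweep moves it next to d and the cut drops from 3 to 1.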

15
Background
  • (Computational) Load balancing
  • To balance the load among the processors
  • Small imbalance can lead to a better partition.
  • Diffusion-based Flow Solutions
  • Determine how much load is to be transferred among
    processors
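
As an illustration of a diffusion-based flow solution, the sketch below uses a first-order diffusion scheme on a homogeneous processor graph. The slides do not say which diffusion variant is used, so the scheme, the parameter `alpha`, and the function name are assumptions. In each iteration every link carries a fraction of the load difference between its endpoints, and the accumulated flow on a link is the amount of load to migrate across it.

```python
def diffusion_flows(loads, links, alpha=0.25, iterations=200):
    """First-order diffusion: returns (balanced loads, accumulated flow per link).

    loads: initial load of each processor
    links: list of (i, j) processor pairs; positive flow goes from i to j
    alpha: diffusion coefficient, assumed smaller than 1 / (max degree)
    """
    loads = list(loads)
    flow = {link: 0.0 for link in links}
    for _ in range(iterations):
        # Compute all transfers from the loads at the start of the iteration.
        delta = {(i, j): alpha * (loads[i] - loads[j]) for i, j in links}
        for (i, j), d in delta.items():
            loads[i] -= d
            loads[j] += d
            flow[(i, j)] += d
    return loads, flow


# Three processors in a chain; all of the load starts on processor 0.
final_loads, flow = diffusion_flows([12.0, 0.0, 0.0], [(0, 1), (1, 2)])
```

The loads converge to 4 each, with about 8 units flowing over link (0, 1) and 4 units over link (1, 2).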

16
Mesh Partitioning Tools
  • Mesh Partitioning Tools
  • METIS (Karypis and Kumar, 1995)
  • JOSTLE (Walshaw, 1997)
  • CHACO (Hendrickson and Leland, 1994)
  • PART (Chen and Taylor, 1996)
  • SCOTCH (Pellegrini, 1994)
  • PARTY (Preis and Diekmann, 1996)
  • MiniMax (Kumar, Das, and Biswas, 2002)

17
METIS
  • A widely used partitioning tool.
  • In development since 1995.
  • Uses Multilevel partitioning algorithm.
  • Heavy Edge Matching for Coarsening Phase
  • Greedy Refinement algorithm
  • Does not consider the network heterogeneity.

18
JOSTLE
  • In development since 1997.
  • A heterogeneous partitioner
  • Uses multilevel partitioning algorithm
  • Heavy Edge Matching
  • KL-type refinement algorithm
  • Does not factor in the ratio of communication
    time and computation time.

19
PaGrid Mesh Partitioner
  • Grid System Modeling
  • Refinement Cost Function
  • KL-type Refinement
  • Estimated Execution Time Load Balancing

20
Grid System Modeling
  • A Grid system contains a set of processors (P)
    connected by a set of edges (C) → weighted
    processor graph S.
  • Vertex weight: relative computational power
  • If p0 is twice as powerful as p1 and p1 = 0.5,
    then p0 = 1.
  • Path length: accumulated weights along the
    shortest path.
  • Weighted matrix W of size |P| × |P| is
    constructed, where
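
The path lengths described above can be computed with an all-pairs shortest path algorithm. The sketch below uses Floyd-Warshall on a small processor graph; it only produces the accumulated link weights along shortest paths and makes no assumption about how the entries of W are derived from them, since that definition is not preserved in this transcript. The function name and the link weights are illustrative.

```python
def path_length_matrix(num_procs, links):
    """All-pairs shortest path lengths in a weighted processor graph.

    links: list of (p, q, weight) tuples for undirected network links
    Returns a num_procs x num_procs matrix of accumulated link weights.
    """
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(num_procs)]
            for i in range(num_procs)]
    for p, q, w in links:
        dist[p][q] = dist[q][p] = min(dist[p][q], w)
    # Floyd-Warshall: allow each processor k in turn as an intermediate hop.
    for k in range(num_procs):
        for i in range(num_procs):
            for j in range(num_procs):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Processors 0 and 2 share a slow direct link; routing via 1 is shorter.
lengths = path_length_matrix(3, [(0, 1, 1), (1, 2, 4), (0, 2, 10)])
```

In this example the path length between processors 0 and 2 is 5 (via processor 1) rather than the direct link weight of 10.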

21
Refinement Cost Function
  • Given a processor mapping cost matrix W, the
    total mapping cost for a partition is given by

22
Refinement Cost Function
23
Multilevel Partitioning Algorithm
  • Coarsening Phase.
  • Heavy Edge Matching
  • Iterate until the number of vertices in the
    coarsest graph is the same as the given number of
    processors.
  • Initial Partitioning Phase.
  • Assign each vertex to a processor, while
    minimizing the cost function.
  • Uncoarsening Phase.
  • Load balancing based on vertex weights
  • KL-type refinement algorithm.
  • Load balancing based on estimated execution time.
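
The coarsening phase can be sketched as repeated matching and contraction. The code below is a simplified illustration under stated assumptions: vertices are integers, vertices are visited in index order rather than randomly, and the loop stops once the graph has at most `target` vertices, which may undershoot the exact processor count the slide asks for. All function names are hypothetical.

```python
def match_heavy_edges(adj):
    """Pair each free vertex with the free neighbour on its heaviest edge."""
    match = {}
    for u in adj:
        if u in match:
            continue
        free = [(w, v) for v, w in adj[u].items() if v not in match]
        if free:
            _, v = max(free, key=lambda c: c[0])
            match[u], match[v] = v, u
        else:
            match[u] = u
    return match


def contract(adj, vweight, match):
    """Merge matched pairs into coarse vertices, summing vertex and edge weights."""
    coarse_of, next_id = {}, 0
    for u in adj:
        if u not in coarse_of:
            coarse_of[u] = coarse_of[match[u]] = next_id
            next_id += 1
    cadj = {c: {} for c in range(next_id)}
    cweight = [0] * next_id
    for u in adj:
        cu = coarse_of[u]
        cweight[cu] += vweight[u]
        for v, w in adj[u].items():
            cv = coarse_of[v]
            if cu != cv:  # edges inside a coarse vertex disappear
                cadj[cu][cv] = cadj[cu].get(cv, 0) + w
    return cadj, cweight, coarse_of


def coarsen(adj, vweight, target):
    """Coarsen until at most `target` vertices remain or no edge can be matched."""
    levels = []
    while len(adj) > target:
        cadj, cweight, coarse_of = contract(adj, vweight, match_heavy_edges(adj))
        if len(cadj) == len(adj):
            break
        levels.append(coarse_of)
        adj, vweight = cadj, cweight
    return adj, vweight, levels


# Path 0-1-2-3 with heavy outer edges and a light middle edge.
fine = {0: {1: 5}, 1: {0: 5, 2: 1}, 2: {1: 1, 3: 5}, 3: {2: 5}}
coarse, coarse_weights, levels = coarsen(fine, [1, 1, 1, 1], target=2)
```

The two heavy edges are contracted, leaving two coarse vertices of weight 2 joined by the light edge. The `levels` list records the fine-to-coarse mapping needed to project a partition back during uncoarsening.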

24
Estimated Execution Time Load Balancing
  • Input is the final partition after refinement
    stage.
  • Tries to improve the quality of final partition
    in terms of estimated execution time.
  • Execution time for a processor is the sum of time
    required for computation and the time required
    for communication.
  • Execution time is a more accurate metric for the
    quality of a partition.
  • Uses KL-type algorithm

25
Estimated Execution Time Load Balancing
  • For a processor p with one of its edges (p, q) in
    the processor graph, let
  • Estimated execution time for processor p is given
    as
  • Estimated execution time of the application is

26
Experimental Results
  • Test application graphs
  • Grid system graphs
  • Comparison with METIS and JOSTLE

27
Test Application Graphs
Graph   V         E         E/V    Description
598a    110971    741934    6.69   3D finite element mesh (Submarine I)
144     144649    1074393   7.43   3D finite element mesh (Parafoil)
m14b    214765    1679018   7.82   3D finite element mesh (Submarine II)
auto    448695    3314611   7.39   3D finite element mesh (GM Saturn)
mrng2   1017253   2015714   1.98   (description not available)
V is the total number of vertices and E is the total number of edges in the graph.
28
Grid Systems
29
Metrics
  • Total Communication Cost
  • Maximum Estimated Execution Time

30
Total Communication Cost
32-processor Grid System
31
Total Communication Cost
  • Average values of Total Communication Cost of
    PaGrid are similar to those of METIS.
  • Average values of Total Communication Cost of
    PaGrid are slightly worse than those of JOSTLE.

32
Maximum Estimated Execution Time
32-processor Grid System
33
Maximum Estimated Execution Time
  • The minimum and average values of Execution Time
    for PaGrid are always lower than for JOSTLE and
    METIS, except for graph mrng2, where PaGrid is
    slightly worse than METIS.
  • Even though the results of PaGrid are worse than
    those of JOSTLE in terms of average Total
    Communication Cost, PaGrid's Estimated Execution
    Time Load Balancing generates lower average
    Execution Time than JOSTLE in all cases.

34
Total Communication Cost
64-processor Grid System
35
Total Communication Cost
  • Average values of Total Communication Cost of
    PaGrid are better than those of METIS in most
    cases, except for graph mrng2 (because of its low
    E/V ratio).
  • Average values of Total Communication Cost of
    PaGrid are much worse than those of JOSTLE in
    three of the five test application graphs.

36
Maximum Estimated Execution Time
64-processor Grid System
37
Maximum Estimated Execution Time
  • The differences between PaGrid and JOSTLE are
    amplified.
  • Even though the results of PaGrid are much worse
    than those of JOSTLE in terms of average Total
    Communication Cost, the minimum and average
    values of Execution Time for PaGrid are much
    lower than for JOSTLE.
  • The minimum Estimated Execution Times for PaGrid
    are always much lower than for METIS, and the
    average Execution Times for PaGrid are almost
    always lower than those of METIS, except for
    application graph mrng2.

38
Conclusion
  • There is a strong need for a mesh partitioner that
    considers the heterogeneity of the processors and
    networks in a computational Grid environment.
  • Current partitioning tools provide only a limited
    solution.
  • PaGrid: a heterogeneous mesh partitioner
  • Considers both processor and network
    heterogeneity.
  • Uses a multilevel graph partitioning algorithm.
  • Incorporates load balancing that is based on
    estimated execution time.
  • Experimental results indicate that load balancing
    based on estimated execution time improves the
    quality of partitions.

39
Future Work
  • Cost function can be modified to be based on
    estimated execution time.
  • Algorithms can be developed to address the
    repartitioning problem.
  • Parallelization of PaGrid.

40
Publications
  • S. Huang, E. Aubanel, and V.C. Bhavsar, "PaGrid:
    A Mesh Partitioner for Computational Grids",
    Journal of Grid Computing, 18 pages, in press,
    2006.
  • S. Huang, E. Aubanel, and V. Bhavsar, "Mesh
    Partitioners for Computational Grids: A
    Comparison", in V. Kumar, M. Gavrilova, C. Tan,
    and P. L'Ecuyer (eds.), Computational Science and
    Its Applications, Vol. 2269 of Lecture Notes in
    Computer Science, Springer Inc., Berlin
    Heidelberg New York, pp. 60-68, 2003.

41
Questions?