Title: Training Manual 001419
12.4 Parallel Performance Enhancements
- In this section, we will discuss the following topics:
  - A. New add-on product: Parallel Performance for ANSYS
  - B. Distributed Domain Solver (DDS)
  - C. Algebraic Multigrid Solver (AMG)
2 Parallel Performance Enhancements: Overview
- Driven by user requirements for higher accuracy and fidelity in solutions
  - e.g. mesh refinement and adaptive meshing
- Desire to solve assemblies instead of analyzing individual components
  - e.g. assembly contact problems
3 Parallel Performance Enhancements: A. Parallel Performance for ANSYS
- A new add-on product for shared memory and distributed memory environments
- Offers powerful new solvers enabling quick, accurate solutions to large models using multiple processors
- Algebraic MultiGrid (AMG) solver
  - Solves static / transient nonlinear analyses using multiple processors (up to 8) on a single system (shared memory parallel)
- Distributed Domain Solver (DDS)
  - Solves large static / transient nonlinear analyses over multiple systems (distributed memory parallel), over multiple processors on a single machine (shared memory parallel), or any combination
4 Parallel Performance Enhancements: B. DDS
- What is DDS?
  - Automatically breaks large problems (up to 10 million DOFs) into smaller domains (1000 to 10000 DOFs)
  - Compatibility among domains is obtained by solving for interface variables (Lagrange multipliers)
5 Parallel Performance Enhancements: DDS
- ...What is DDS?
  - Transfers the subdomains to the slave machines and factorizes them there using a direct solver
  - Master machine retrieves and assembles the subdomain solutions, solves for the interface variables using an iterative solver, and computes results for the entire model
6 Parallel Performance Enhancements: DDS
- Why DDS?
  - Highly scalable
  - More processors / less elapsed time
  - Example below shows a 3.5 million-DOF SOLID92 model
    - 2020 subdomains on an SGI Origin 2000, 12 GB memory
Speed-up 21.0
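As a quick sanity check on the example figures, the average subdomain size can be compared with the 1000 to 10000 DOF range quoted for DDS subdomains. This is a minimal sketch that assumes the decomposer splits the model roughly evenly, which the slides do not state; the variable names are illustrative only.

```python
# Figures quoted on the slide.
total_dofs = 3_500_000   # 3.5 million-DOF SOLID92 model
num_subdomains = 2020

# Average subdomain size, assuming a roughly even split (an assumption).
avg_dofs = total_dofs / num_subdomains
print(f"average DOFs per subdomain: {avg_dofs:.0f}")

# DDS subdomains are stated to hold 1000 to 10000 DOFs each.
in_range = 1000 <= avg_dofs <= 10000
print("within the stated 1000-10000 DOF range:", in_range)
```

The average comes out near the lower end of the stated range, which is consistent with a decomposition into many small subdomains.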
7 Parallel Performance Enhancements: DDS
- Memory / Disk requirements
  - 2 to 4 times more memory than PCG; however, this is not a problem for distributed memory architecture
  - Memory required is the sum of the master and all individual slave machine memories
  - In general, the Master machine will need large memory
8 Parallel Performance Enhancements: DDS - Under the Hood
- DDS has 2 components
  - Domain decomposer
    - Embedded in ANSYS
    - Divides domain into n subdomains
    - Creates scratch.dds, file.dds, and file.erot
    - Issues mpirun command and launches appropriate ansdds.e57 executable
  - ANSDDS.E57
    - A stand-alone, MPI-enabled executable
    - Computes solution for subdomain on the slave processor
    - Writes out a file called scratch.u, which is later retrieved by the Master to calculate element results
9 Parallel Performance Enhancements: DDS
- System requirements
  - Network must be homogeneous (same operating system)
  - Message Passing Interface (MPI) used to communicate
- Master (where the job is submitted)
  - Parallel Performance for ANSYS add-on required
  - ANSYS 5.7 must be installed (including ansdds.e57)
  - Installation of MPI
  - 256 MB RAM / 10 GB disk required
- Slave
  - Installation of MPI on all slave machines
  - ansdds.e57 executable must be installed
10 Parallel Performance Enhancements: DDS
- How to use DDS
  - Specify Parallel Performance for ANSYS add-on when starting ANSYS
    - ansys57 -pp
  - Choose DDS Solver
    - EQSLV,DOMAIN
  - Specify information about slave processors
    - DDSOPT command
    - DDSOPT command covered in Systems Training
11 Parallel Performance Enhancements: DDS
- How to use DDS (cont'd)
  - Solve
  - Postprocessing
    - You get a results file as usual
    - /PNUM,DOMAIN,ON will display domains by colors / numbers
12 Parallel Performance Enhancements: DDS Solver
- Are there any modeling restrictions for using DDS?
  - Structural static / transient only (linear or nonlinear)
  - Symmetric matrices
  - h-elements only
  - No coupling / constraint equations
  - No inertia relief
13 Parallel Performance Enhancements: DDS Solver
14 Parallel Performance Enhancements: C. AMG Solver
- What is AMG solver?
  - A preconditioned conjugate gradient solver similar to the PCG solver
  - The preconditioner used in the AMG solver is derived using the Algebraic MultiGrid technique
  - MultiGrid techniques derive a preconditioner that is very close to K^-1 (the inverse of K) by working on a coarser mesh of the FE model supplied
  - Algebraic MultiGrid methods work on a coarsened version of the full K matrix instead of the mesh (that is, they are mesh independent)
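To illustrate where a preconditioner enters a conjugate gradient iteration, here is a minimal pure-Python sketch of preconditioned CG for K u = f. It is not the ANSYS implementation. The preconditioner is a simple Jacobi (diagonal) scaling used as a stand-in; an AMG solver would supply a much better approximation of K^-1 in the `apply_precond` step. All function names and the small test matrix are assumptions made for this example.

```python
def matvec(K, v):
    """Dense matrix-vector product; K is a list of rows."""
    return [sum(kij * vj for kij, vj in zip(row, v)) for row in K]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(K, f, apply_precond, tol=1e-8, max_iter=1000):
    """Solve K u = f for symmetric positive definite K.

    apply_precond(r) must return an approximation of K^-1 r.
    Returns the solution and the number of iterations used.
    """
    u = [0.0] * len(f)
    r = list(f)                      # residual f - K u for u = 0
    f_norm = dot(f, f) ** 0.5
    if f_norm == 0.0:
        return u, 0                  # zero load: zero solution
    z = apply_precond(r)
    p = list(z)                      # first search direction
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Kp = matvec(K, p)
        alpha = rz / dot(p, Kp)      # step length along p
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * kpi for ri, kpi in zip(r, Kp)]
        if dot(r, r) ** 0.5 <= tol * f_norm:
            return u, iteration
        z = apply_precond(r)         # the only place the preconditioner is used
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return u, max_iter


# Small symmetric positive definite test system (illustrative values).
K = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
f = [1.0, 2.0, 3.0]


def jacobi(r):
    """Stand-in preconditioner: divide by the diagonal of K."""
    return [ri / K[i][i] for i, ri in enumerate(r)]


u, iters = pcg(K, f, jacobi)
print("solution:", u, "iterations:", iters)
```

The closer the preconditioner is to K^-1, the fewer iterations the loop needs, which is the motivation for deriving it with a multigrid technique.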
15 Parallel Performance Enhancements: AMG Solver
- Why do we need AMG solver?
  - Sensitivity to ill-conditioning
    - Much less sensitive to ill-conditioned problems than PCG
    - Will get solutions in fewer iterations than PCG for ill-conditioned problems
    - Expected to perform as well as PCG for well-conditioned problems
  - Scalability
    - Speed-up of up to 5 times on 8 processors
    - Scales much better than PCG
    - Used in shared memory parallel (single machine with multiple processors) only
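The scalability figure can be restated as a parallel efficiency, using the standard definition of speed-up divided by processor count. The sketch below treats the "5 times for 8 processors" figure as a speed-up relative to a single processor, which is an assumption about how it was measured; the function name is illustrative.

```python
def parallel_efficiency(speedup, num_processors):
    """Parallel efficiency = speed-up / processor count (1.0 is ideal scaling)."""
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")
    return speedup / num_processors


# Best case quoted for AMG: speed-up of 5 on 8 processors.
efficiency = parallel_efficiency(5.0, 8)
print(f"parallel efficiency: {efficiency:.1%}")  # 62.5%
```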
16 Parallel Performance Enhancements: AMG Solver
17 Parallel Performance Enhancements: AMG Solver
- How to use AMG solver
  - Specify Parallel Performance for ANSYS add-on when starting ANSYS
    - ansys57 -pp
  - Specify number of processors
    - /CONFIG,NPROC,N
    - or config57.ans
    - or use the macro SETNPROC
  - Choose AMG Solver
    - EQSLV,AMG,Toler
    - Tolerance defaults to 1e-8, similar to PCG
  - Solve
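The command sequence above can be assembled programmatically when input files are generated by script. The following is a minimal sketch: the helper name `amg_solve_commands` is hypothetical, it emits only the commands named on this slide plus SOLVE for the "Solve" step, and the 8-processor limit is taken from the product overview earlier in this section. Preprocessing, load definition, and the `ansys57 -pp` launch are outside its scope.

```python
def amg_solve_commands(nproc, toler=1e-8):
    """Return the ANSYS input lines for an AMG solve (hypothetical helper).

    nproc: number of processors (the AMG solver is stated to support up to 8).
    toler: solver tolerance; 1e-8 is the default quoted on the slide.
    """
    if not 1 <= nproc <= 8:
        raise ValueError("nproc must be between 1 and 8")
    return [
        f"/CONFIG,NPROC,{nproc}",   # specify number of processors
        f"EQSLV,AMG,{toler:g}",     # choose the AMG solver and tolerance
        "SOLVE",                    # solve
    ]


for line in amg_solve_commands(4):
    print(line)
```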
18 Parallel Performance Enhancements: AMG Solver
- When to use AMG solver
  - Structural static / transient analyses
  - Nonlinear analyses
  - Large aspect ratio elements, reduced integration elements
  - Models with a combination of shells / solids / beams
  - Shared memory parallel machines
- When not to use AMG solver
  - Non-structural problems (it works but is less efficient)
  - Models made of only SHELL63 elements, where AMG does not seem to be as CPU efficient as PCG
19 Parallel Performance Enhancements: AMG Solver
- Memory / Disk requirements
  - 1.3 to 2 times more memory than PCG solver
  - Rule of thumb is 130 MB per 100,000 DOF for SOLID92 elements
  - Memory required is also a function of the number of processors used (overhead)
  - Files created during AMG solution are very similar to PCG and about the same size
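The rule of thumb above translates directly into a quick memory estimate. This is a rough sketch for SOLID92 models only: it scales the quoted figure linearly with DOF count and leaves out the per-processor overhead, which the slides do not quantify. The function and constant names are illustrative.

```python
MB_PER_100K_DOF = 130.0  # rule of thumb quoted for SOLID92 models


def amg_memory_estimate_mb(num_dofs):
    """Estimate AMG solver memory in MB, excluding per-processor overhead."""
    if num_dofs < 0:
        raise ValueError("num_dofs must be non-negative")
    return MB_PER_100K_DOF * num_dofs / 100_000


# Example: a 1 million-DOF SOLID92 model.
estimate = amg_memory_estimate_mb(1_000_000)
print(f"estimated AMG memory: {estimate:.0f} MB")  # 1300 MB
```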