Sparse Matrix Methods
Transcript and Presenter's Notes

1
Sparse Matrix Methods
  • Day 1: Overview
  • Day 2: Direct methods
  • Day 3: Iterative methods
  • The conjugate gradient algorithm
  • Parallel conjugate gradient and graph
    partitioning
  • Preconditioning methods and graph coloring
  • Domain decomposition and multigrid
  • Krylov subspace methods for other problems
  • Complexity of iterative and direct methods

2
SuperLU-dist: Iterative refinement to improve solution
  • Iterate:
  •   r = b - A*x
  •   backerr = max_i ( |r_i| / (|A|*|x| + |b|)_i )
  •   if backerr < ε or backerr > lasterr/2 then stop iterating
  •   solve L*U*dx = r
  •   x = x + dx
  •   lasterr = backerr
  • repeat
  • Usually 0-3 steps are enough
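
A minimal SciPy sketch of this refinement loop (not SuperLU-dist itself); the stopping constant eps and the iteration cap are illustrative assumptions:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def refine(A, b, eps=1e-12, maxiter=5):
    lu = splu(A.tocsc())                              # sparse LU factors of A
    x = lu.solve(b)
    lasterr = np.inf
    absA, absb = abs(A), np.abs(b)
    for _ in range(maxiter):
        r = b - A @ x                                 # residual
        denom = absA @ np.abs(x) + absb               # (|A||x| + |b|)_i
        backerr = np.max(np.abs(r) / denom)           # componentwise backward error
        if backerr < eps or backerr > lasterr / 2:    # converged, or stagnating
            break
        x = x + lu.solve(r)                           # dx from the existing factors
        lasterr = backerr
    return x
```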

3
(Matlab demo)
  • iterative refinement

4
Convergence analysis of iterative refinement

Let C = I - A(LU)^{-1}, so A = (I - C)(LU).

x_1 = (LU)^{-1} b
r_1 = b - Ax_1 = (I - A(LU)^{-1}) b = Cb
dx_1 = (LU)^{-1} r_1 = (LU)^{-1} Cb
x_2 = x_1 + dx_1 = (LU)^{-1} (I + C) b
r_2 = b - Ax_2 = (I - (I - C)(I + C)) b = C^2 b
. . .

In general, r_k = b - Ax_k = C^k b.
Thus r_k -> 0 if the largest eigenvalue of C is less than 1 in magnitude.
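
A small NumPy check of this analysis. The inexact factorization is modeled as the exact inverse of a perturbed matrix A + E (an assumption for illustration); the printed norms ||r_k - C^k b|| sit at rounding level:

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 50, 1e-3
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
b = rng.standard_normal(n)

# Model an inexact factorization: treat "LU" as the exact factorization
# of a slightly perturbed matrix A + E.
LU = A + eps * rng.standard_normal((n, n))
LU_inv = np.linalg.inv(LU)            # stands in for the triangular solves
C = np.eye(n) - A @ LU_inv            # C = I - A (LU)^{-1}

x = LU_inv @ b                        # x_1
r = b - A @ x                         # r_1
for k in range(1, 4):
    # theory above: r_k = C^k b
    print(k, np.linalg.norm(r - np.linalg.matrix_power(C, k) @ b))
    x = x + LU_inv @ r                # refinement step
    r = b - A @ x
```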
5
The Landscape of Sparse Ax = b Solvers
[Figure: chart of the sparse Ax = b solver landscape; not captured in this transcript]
6
Conjugate gradient iteration

x_0 = 0,  r_0 = b,  p_0 = r_0
for k = 1, 2, 3, . . .
    α_k = (r_{k-1}^T r_{k-1}) / (p_{k-1}^T A p_{k-1})    step length
    x_k = x_{k-1} + α_k p_{k-1}                          approximate solution
    r_k = r_{k-1} - α_k A p_{k-1}                        residual
    β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})              improvement this step
    p_k = r_k + β_k p_{k-1}                              search direction
  • One matrix-vector multiplication per iteration
  • Two vector dot products per iteration
  • Four n-vectors of working storage
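
A direct NumPy transcription of the loop above, as a sketch (assumes A is symmetric positive definite); note the single matrix-vector product A @ p and the two dot products per pass:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, maxiter=None):
    x = np.zeros(len(b))         # x_0 = 0
    r = b.copy()                 # r_0 = b
    p = r.copy()                 # p_0 = r_0
    rr = r @ r
    for _ in range(maxiter or len(b)):
        Ap = A @ p                   # the one matvec per iteration
        alpha = rr / (p @ Ap)        # step length
        x += alpha * p               # approximate solution
        r -= alpha * Ap              # residual
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p    # next search direction
        rr = rr_new
    return x
```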

7
Conjugate gradient: Krylov subspaces
  • Eigenvalues: Au = λu, with eigenvalues λ_1, λ_2, . . ., λ_n
  • Cayley-Hamilton theorem: (A - λ_1 I)(A - λ_2 I) · · · (A - λ_n I) = 0
  • Therefore Σ_{0 ≤ i ≤ n} c_i A^i = 0 for some coefficients c_i (with c_0 ≠ 0 when A is nonsingular)
  • so A^{-1} = Σ_{1 ≤ i ≤ n} (-c_i/c_0) A^{i-1}
  • Krylov subspace:
  • Therefore if Ax = b, then x = A^{-1} b, and so
  • x ∈ span(b, Ab, A^2 b, . . ., A^{n-1} b) = K_n(A, b)
8
Conjugate gradient: Orthogonal sequences
  • Krylov subspace: K_i(A, b) = span(b, Ab, A^2 b, . . ., A^{i-1} b)
  • Conjugate gradient algorithm: for i = 1, 2, 3, . . . find x_i ∈ K_i(A, b)
    such that r_i = (Ax_i - b) ⊥ K_i(A, b)
  • Notice r_i ∈ K_{i+1}(A, b), so r_i ⊥ r_j for all j < i
  • Similarly, the search directions are A-orthogonal:
    (x_i - x_{i-1})^T A (x_j - x_{j-1}) = 0
  • The magic: short recurrences. . . A is symmetric
    => can get the next residual and direction from
    the previous one, without saving them all.

9
Conjugate gradient: Convergence
  • In exact arithmetic, CG converges in n steps
    (completely unrealistic!!)
  • Accuracy after k steps of CG is related to:
  • consider polynomials of degree k that are equal
    to 1 at 0
  • how small can such a polynomial be at all the
    eigenvalues of A?
  • Thus, eigenvalues close together are good.
  • Condition number: κ(A) = ||A||_2 ||A^{-1}||_2
    = λ_max(A) / λ_min(A)   (for SPD A)
  • The residual is reduced by a constant factor by
    O(κ(A)^{1/2}) iterations of CG.

10
(Matlab demo)
  • CG on grid5(15) and bcsstk08
  • n steps of CG on bcsstk08

11
Conjugate gradient: Parallel implementation
  • Lay out matrix and vectors by rows
  • Hard part is the matrix-vector product y = Ax
  • Algorithm: each processor j
  • broadcasts x(j)
  • computes y(j) = A(j,:) * x
  • May send more of x than needed
  • Partition / reorder matrix to reduce communication
    (a toy sketch follows)
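
A toy sketch of the row-wise parallel matvec y = Ax described above: each "processor" owns a block of rows of A and the matching piece of x, and the broadcast is simulated here by concatenating the owned pieces. Illustrative only; a real implementation would use MPI:

```python
import numpy as np
import scipy.sparse as sp

def rowwise_matvec(row_blocks, x_blocks):
    # Simulated broadcast: every processor assembles the full x.
    x = np.concatenate(x_blocks)
    # Each processor j computes its slice y(j) = A(j,:) * x independently.
    return np.concatenate([A_j @ x for A_j in row_blocks])

# Usage: split a sparse matrix into 4 row blocks (sizes are assumptions).
A = sp.random(100, 100, density=0.05, format="csr", random_state=0)
x = np.ones(100)
blocks = [A[i*25:(i+1)*25, :] for i in range(4)]
x_blocks = [x[i*25:(i+1)*25] for i in range(4)]
assert np.allclose(rowwise_matvec(blocks, x_blocks), A @ x)
```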

12
(Matlab demo)
  • 2-way partition of eppstein mesh
  • 8-way dice of eppstein mesh

13
Preconditioners
  • Suppose you had a matrix B such that:
  • (1) the condition number κ(B^{-1}A) is small
  • (2) By = z is easy to solve
  • Then you could solve (B^{-1}A)x = B^{-1}b instead of
    Ax = b
  • B = A is great for (1), not for (2)
  • B = I is great for (2), not for (1)
  • Domain-specific approximations sometimes work
  • B = diagonal of A sometimes works (see the sketch below)
  • Or, bring back the direct-methods technology. . .
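
A minimal sketch of preconditioned CG with B = diag(A) (Jacobi preconditioning), in dense NumPy for brevity; applying B^{-1} is just a componentwise divide, so condition (2) above is trivially met:

```python
import numpy as np

def pcg_jacobi(A, b, tol=1e-8, maxiter=None):
    d = np.diag(A)                  # B = diagonal of A
    x = np.zeros_like(b)
    r = b.copy()
    z = r / d                       # z = B^{-1} r
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter or len(b)):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = r / d                   # apply the preconditioner
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```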

14
(Matlab demo)
  • bcsstk08 with diagonal precond

15
Incomplete Cholesky factorization (IC, ILU)
  • Compute factors of A by Gaussian elimination,
    but ignore fill
  • Preconditioner B = R^T R ≈ A, not formed explicitly
  • Compute B^{-1}z by triangular solves (in time
    nnz(A))
  • Total storage is O(nnz(A)), static data structure
  • Either symmetric (IC) or nonsymmetric (ILU)
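
SciPy has no incomplete Cholesky, so this sketch substitutes its ILU (spilu, via SuperLU) as the preconditioner B; applying B^{-1}z is two triangular solves inside ilu.solve. The 2D Poisson test matrix is an assumption, and note that ILU factors of an SPD matrix need not give a symmetric preconditioner, so treat the CG call as illustrative:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spilu, LinearOperator, cg

n = 30
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = sp.kronsum(T, T).tocsc()            # 2D Poisson test matrix
b = np.ones(A.shape[0])

ilu = spilu(A, drop_tol=1e-3)           # incomplete factors, fill dropped
M = LinearOperator(A.shape, ilu.solve)  # z -> B^{-1} z via triangular solves
x, info = cg(A, b, M=M)
print(info, np.linalg.norm(b - A @ x))
```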

16
(Matlab demo)
  • bcsstk08 with ic precond

17
Incomplete Cholesky and ILU: Variants
  • Allow one or more levels of fill
  • unpredictable storage requirements
  • Allow fill whose magnitude exceeds a drop
    tolerance (see the sketch after this list)
  • may get better approximate factors than levels of
    fill
  • unpredictable storage requirements
  • choice of tolerance is ad hoc
  • Partial pivoting (for nonsymmetric A)
  • Modified ILU (MIC): add dropped fill to the
    diagonal of U or R
  • then A and R^T R have the same row sums
  • good in some PDE contexts
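
A small sketch of the drop-tolerance tradeoff using SciPy's spilu on the same 2D Poisson test matrix as above: looser tolerances drop more fill (smaller factors, typically more CG iterations), tighter tolerances approximate A better:

```python
import scipy.sparse as sp
from scipy.sparse.linalg import spilu

n = 30
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = sp.kronsum(T, T).tocsc()

for tol in [1e-1, 1e-2, 1e-3, 1e-4]:
    ilu = spilu(A, drop_tol=tol)
    print(f"drop_tol={tol:g}  nnz(L)+nnz(U) = {ilu.L.nnz + ilu.U.nnz}")
```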

18
Incomplete Cholesky and ILU: Issues
  • Choice of parameters:
  • good: smooth transition from iterative to direct
    methods
  • bad: very ad hoc, problem-dependent
  • tradeoff: time per iteration (more fill => more
    time) vs. number of iterations (more
    fill => fewer iterations)
  • Effectiveness:
  • condition number usually improves (only) by a
    constant factor (except MIC for some problems
    from PDEs)
  • still, often good when tuned for a particular
    class of problems
  • Parallelism:
  • triangular solves are not very parallel
  • reordering for parallel triangular solve by graph
    coloring

19
(Matlab demo)
  • 2-coloring of grid5(15)

20
Sparse approximate inverses
  • Compute B^{-1} ≈ A^{-1} explicitly
  • Minimize ||B^{-1}A - I||_F (in parallel, by
    columns)
  • Variants: factored form of B^{-1}, more fill, . . .
  • Good: very parallel (see the sketch below)
  • Bad: effectiveness varies widely
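
A minimal column-wise sparse-approximate-inverse sketch. It minimizes ||AM - I||_F column by column (the transposed variant of the objective above), restricting each column m_j to the sparsity pattern of A's column j, a common simple pattern choice; each column is an independent small least-squares problem, hence "in parallel, by columns". Assumes A has a nonzero diagonal:

```python
import numpy as np
import scipy.sparse as sp

def spai(A):
    A = A.tocsc()
    n = A.shape[0]
    cols = []
    for j in range(n):                        # independent => parallel by columns
        J = A[:, j].nonzero()[0]              # allowed pattern: column j of A
        I = np.unique(A[:, J].nonzero()[0])   # rows touched by those columns
        e = np.zeros(len(I))
        e[np.searchsorted(I, j)] = 1.0        # e_j restricted to rows I
        sub = A[I, :][:, J].toarray()         # small dense least-squares problem
        m, *_ = np.linalg.lstsq(sub, e, rcond=None)
        cols.append(sp.csc_matrix((m, (J, np.zeros_like(J))), shape=(n, 1)))
    return sp.hstack(cols).tocsc()            # M with A M ≈ I
```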

21
Support graph preconditioners: example (Vaidya)

[Figure: graph G(A) and its spanning tree G(B)]

  • A is symmetric positive definite with negative
    off-diagonal nonzeros
  • B is a maximum-weight spanning tree for A (with
    its diagonal modified to preserve row sums);
    a construction sketch follows
  • factor B in O(n) space and O(n) time
  • applying the preconditioner costs O(n) time per
    iteration
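
A rough sketch of constructing B, assuming A's graph is connected and using |a_ij| as the edge weights; the weight flip below turns SciPy's minimum spanning tree into a maximum-weight one (a fixed tree has n-1 edges, so minimizing the flipped weights maximizes the originals):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import minimum_spanning_tree

def vaidya_tree(A):
    A = A.tocsr()
    W = -sp.triu(A, k=1)                    # edge weights |a_ij| > 0
    Wt = W.copy()
    Wt.data = W.data.max() + 1.0 - Wt.data  # flip: min tree of Wt = max tree of W
    kept = minimum_spanning_tree(Wt).astype(bool)
    offdiag = -W.multiply(kept)             # tree edges keep original negative values
    offdiag = offdiag + offdiag.T
    # choose diag(B) so B has the same row sums as A
    rowsum_A = np.asarray(A.sum(axis=1)).ravel()
    d = rowsum_A - np.asarray(offdiag.sum(axis=1)).ravel()
    return offdiag + sp.diags(d)            # B: factor with splu, use as preconditioner
```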

22
Support graph preconditioners: example

[Figure: each edge of G(A) mapped to a path in the tree G(B)]

  • support each edge of A by a path in B
  • dilation(A edge) = length of its supporting path in B
  • congestion(B edge) = number of A edges it supports
  • p = max congestion, q = max dilation
  • condition number κ(B^{-1}A) is bounded by pq (at most
    O(n^2))

23
Support graph preconditioners: example

[Figure: tree G(B) augmented with extra edges from G(A)]

  • can improve congestion and dilation by adding a
    few strategically chosen edges to B
  • cost of factor + solve is O(n^{1.75}), or O(n^{1.2}) if A
    is planar
  • in recent experiments [Chen & Toledo], often
    better than drop-tolerance MIC for 2D problems,
    but not for 3D

24
Domain decomposition (introduction)

        [ B    0    E ]
    A = [ 0    C    F ]
        [ E^T  F^T  G ]

  • Partition the problem (e.g. the mesh) into
    subdomains
  • Use solvers for the subdomains (B^{-1} and C^{-1}) to
    precondition an iterative solver on the
    interface
  • The interface system is the Schur complement:
    S = G - E^T B^{-1} E - F^T C^{-1} F
  • Parallelizes naturally by subdomains (a dense
    sketch follows)
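
A dense NumPy sketch of a solve via the Schur complement for the block form above; the block partitioning of the right-hand side is an assumption, and a real code would apply S implicitly inside an iterative solver rather than forming it:

```python
import numpy as np

def schur_solve(B, C, E, F, G, b1, b2, b3):
    Binv_E = np.linalg.solve(B, E)          # subdomain solves
    Cinv_F = np.linalg.solve(C, F)
    S = G - E.T @ Binv_E - F.T @ Cinv_F     # Schur complement on the interface
    # eliminate subdomain unknowns from the right-hand side
    rhs = b3 - E.T @ np.linalg.solve(B, b1) - F.T @ np.linalg.solve(C, b2)
    x3 = np.linalg.solve(S, rhs)            # interface unknowns
    x1 = np.linalg.solve(B, b1 - E @ x3)    # back-substitute into subdomain 1
    x2 = np.linalg.solve(C, b2 - F @ x3)    # back-substitute into subdomain 2
    return x1, x2, x3
```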

25
(Matlab demo)
  • grid and matrix structure for overlapping 2-way
    partition of eppstein

26
Multigrid (introduction)
  • For a PDE on a fine mesh, precondition using a
    solution on a coarser mesh
  • Use the idea recursively on a hierarchy of meshes
  • Solves the model problem (Poisson's equation) in
    linear time!
  • Often useful when a hierarchy of meshes can be
    built
  • Hard to parallelize the coarse meshes well
  • This is just the intuition; there is lots of theory
    and technology (a two-grid sketch follows)
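
A minimal two-grid sketch for the 1D Poisson model problem, the simplest instance of the idea above: smooth on the fine grid, solve the residual equation on a grid with half the points (Galerkin operator R A P), interpolate the correction back, and smooth again. Applying the same cycle recursively to the coarse solve gives a multigrid V-cycle. Grid sizes and smoothing parameters are assumptions:

```python
import numpy as np

def interpolation(n, nc):
    # linear interpolation from nc coarse points to n = 2*nc + 1 fine points
    P = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1                  # coarse point j sits at fine index i
        P[i, j] = 1.0
        P[i - 1, j] += 0.5
        P[i + 1, j] += 0.5
    return P

def two_grid(A, b, x, n_smooth=3, omega=2/3):
    n = len(b)
    nc = (n - 1) // 2                  # assumes n odd
    P = interpolation(n, nc)
    R = 0.5 * P.T                      # full-weighting restriction
    d = np.diag(A)
    for _ in range(n_smooth):          # pre-smooth with weighted Jacobi
        x = x + omega * (b - A @ x) / d
    Ac = R @ A @ P                     # Galerkin coarse-grid operator
    ec = np.linalg.solve(Ac, R @ (b - A @ x))   # coarse residual solve
    x = x + P @ ec                     # coarse-grid correction
    for _ in range(n_smooth):          # post-smooth
        x = x + omega * (b - A @ x) / d
    return x

# Usage: 1D Poisson with 63 interior points
n = 63
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = np.zeros(n)
for _ in range(10):
    x = two_grid(A, b, x)
print(np.linalg.norm(b - A @ x))
```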

27
Other Krylov subspace methods
  • Nonsymmetric linear systems:
  • GMRES: for i = 1, 2, 3, . . . find x_i ∈ K_i(A, b)
    such that r_i = (Ax_i - b) ⊥ K_i(A, b).
    But there is no short recurrence => save old vectors
    => lots more space
  • BiCGStab, QMR, etc.: two spaces K_i(A, b) and
    K_i(A^T, b) with mutually orthogonal bases. Short
    recurrences => O(n) space, but less robust
    (see the sketch after this list)
  • Convergence and preconditioning are more delicate
    than for CG
  • Active area of current research
  • Eigenvalues: Lanczos (symmetric), Arnoldi
    (nonsymmetric)
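
A quick illustration using SciPy's implementations of two of the methods named above on a small nonsymmetric sparse system (the tridiagonal test matrix is an assumption):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, bicgstab

n = 200
A = sp.diags([-1.0, 2.5, -1.2], [-1, 0, 1], shape=(n, n), format="csr")  # nonsymmetric
b = np.ones(n)

x_g, info_g = gmres(A, b)        # stores a growing basis: more memory
x_b, info_b = bicgstab(A, b)     # short recurrences: O(n) extra memory
print(info_g, np.linalg.norm(b - A @ x_g))
print(info_b, np.linalg.norm(b - A @ x_b))
```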

28
The Landscape of Sparse Ax = b Solvers
29
Complexity of direct methods

Time and space to solve any problem on any well-shaped
finite element mesh:

                   2D            3D
  Space (fill)     O(n log n)    O(n^{4/3})
  Time (flops)     O(n^{3/2})    O(n^2)
30
Complexity of linear solvers

Time to solve the model problem (Poisson's equation)
on a regular mesh:

                          2D            3D
  Sparse Cholesky         O(n^{1.5})    O(n^2)
  CG, exact arithmetic    O(n^2)        O(n^2)
  CG, no precond          O(n^{1.5})    O(n^{1.33})
  CG, modified IC         O(n^{1.25})   O(n^{1.17})
  CG, support trees       O(n^{1.20})   O(n^{1.75})
  Multigrid               O(n)          O(n)