Title: Sparse Matrix Methods
1. Sparse Matrix Methods
- Day 1: Overview
- Day 2: Direct methods
- Day 3: Iterative methods
  - The conjugate gradient algorithm
  - Parallel conjugate gradient and graph partitioning
  - Preconditioning methods and graph coloring
  - Domain decomposition and multigrid
  - Krylov subspace methods for other problems
  - Complexity of iterative and direct methods
2. SuperLU-dist: Iterative refinement to improve the solution
- Iterate:
  - r = b − A x
  - backerr = max_i ( |r|_i / (|A| |x| + |b|)_i )
  - if backerr < ε or backerr > lasterr/2 then stop iterating
  - solve L U dx = r
  - x = x + dx
  - lasterr = backerr
  - repeat
- Usually 0-3 steps are enough (a runnable sketch follows below)
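A minimal MATLAB sketch of this loop, assuming the (approximate) LU factors of A have already been computed; the function name and stopping parameter eps_stop are illustrative, not SuperLU-dist's actual interface:

    % Iterative refinement sketch: L, U are precomputed approximate
    % factors of A; eps_stop is the backward-error target (assumed name).
    function x = refine(A, L, U, b, eps_stop)
      x = U \ (L \ b);                                % initial solve
      lasterr = Inf;
      while true
        r = b - A*x;                                  % residual
        backerr = max(abs(r) ./ (abs(A)*abs(x) + abs(b)));
        if backerr < eps_stop || backerr > lasterr/2
          break                                       % converged or stalled
        end
        dx = U \ (L \ r);                             % solve L U dx = r
        x = x + dx;
        lasterr = backerr;
      end
    end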
3. (Matlab demo)
4. Convergence analysis of iterative refinement
Let C = I − A(LU)^{-1}, so A = (I − C)(LU).
  x_1 = (LU)^{-1} b
  r_1 = b − A x_1 = (I − A(LU)^{-1}) b = C b
  dx_1 = (LU)^{-1} r_1 = (LU)^{-1} C b
  x_2 = x_1 + dx_1 = (LU)^{-1} (I + C) b
  r_2 = b − A x_2 = (I − (I − C)(I + C)) b = C^2 b
  . . .
In general, r_k = b − A x_k = C^k b.
Thus r_k → 0 if the largest eigenvalue of C is < 1 in absolute value.
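A quick numerical check of the identity r_k = C^k b, using a deliberately perturbed factorization to stand in for an inexact LU (the setup is illustrative):

    n = 50;
    A = randn(n); A = A'*A + n*eye(n);        % random SPD test matrix
    [L, U] = lu(A + 1e-3*randn(n));           % approximate factors of A
    C = eye(n) - A / (L*U);                   % C = I - A (LU)^-1
    b = randn(n, 1);
    x = U \ (L \ b);   r1 = b - A*x;          % r1 should equal C b
    dx = U \ (L \ r1); x = x + dx;
    r2 = b - A*x;                             % r2 should equal C^2 b
    disp([norm(r1 - C*b), norm(r2 - C*C*b)])  % both ~ machine precision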
5. The Landscape of Sparse Ax = b Solvers
(figure: chart classifying sparse solvers; not reproduced here)
6. Conjugate gradient iteration
x_0 = 0,  r_0 = b,  p_0 = r_0
for k = 1, 2, 3, . . .
  α_k = (r_{k-1}^T r_{k-1}) / (p_{k-1}^T A p_{k-1})   step length
  x_k = x_{k-1} + α_k p_{k-1}                         approximate solution
  r_k = r_{k-1} − α_k A p_{k-1}                       residual
  β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})             improvement
  p_k = r_k + β_k p_{k-1}                             search direction
- One matrix-vector multiplication per iteration
- Two vector dot products per iteration
- Four n-vectors of working storage
- (a runnable MATLAB sketch follows below)
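A minimal MATLAB translation of the iteration above, assuming A is symmetric positive definite; the function name and tolerance handling are illustrative:

    function x = cg(A, b, tol, maxit)
      x = zeros(size(b));
      r = b;  p = r;                    % x_0 = 0, r_0 = b, p_0 = r_0
      rho = r' * r;
      for k = 1:maxit
        Ap = A * p;
        alpha = rho / (p' * Ap);        % step length
        x = x + alpha * p;              % approximate solution
        r = r - alpha * Ap;             % residual
        rho_new = r' * r;
        if sqrt(rho_new) <= tol * norm(b), break; end
        beta = rho_new / rho;           % improvement
        p = r + beta * p;               % search direction
        rho = rho_new;
      end
    end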
7. Conjugate gradient: Krylov subspaces
- Eigenvalues: A u = λ u, with eigenvalues λ_1, λ_2, . . ., λ_n
- Cayley-Hamilton theorem: (A − λ_1 I)(A − λ_2 I) · · · (A − λ_n I) = 0
- Therefore Σ_{0 ≤ i ≤ n} c_i A^i = 0 for some coefficients c_i (with c_0 ≠ 0 when A is nonsingular)
- so A^{-1} = Σ_{1 ≤ i ≤ n} (−c_i / c_0) A^{i-1}
- Therefore if A x = b, then x = A^{-1} b and
  x ∈ span(b, A b, A^2 b, . . ., A^{n-1} b) = K_n(A, b), the Krylov subspace
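To make the point concrete, here is a MATLAB sketch that builds the Krylov basis explicitly and recovers x in it; this is never done in practice, since the explicit basis is often severely ill-conditioned:

    n = 10;
    A = randn(n); A = A'*A + n*eye(n);       % random SPD test matrix
    b = randn(n, 1);
    K = zeros(n, n); K(:,1) = b;
    for i = 2:n, K(:,i) = A * K(:,i-1); end  % columns b, Ab, ..., A^(n-1) b
    y = (A * K) \ b;                         % coordinates of x in the basis
    x = K * y;
    disp(norm(A*x - b) / norm(b))            % small, up to conditioning of K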
8. Conjugate gradient: Orthogonal sequences
- Krylov subspace: K_i(A, b) = span(b, A b, A^2 b, . . ., A^{i-1} b)
- Conjugate gradient algorithm: for i = 1, 2, 3, . . . find x_i ∈ K_i(A, b) such that r_i = (b − A x_i) ⊥ K_i(A, b)
- Notice r_i ∈ K_{i+1}(A, b), so r_i ⊥ r_j for all j < i
- Similarly, the search directions are A-orthogonal: (x_i − x_{i-1})^T A (x_j − x_{j-1}) = 0
- The magic: short recurrences . . . A is symmetric ⇒ can get the next residual and direction from the previous one, without saving them all.
9. Conjugate gradient: Convergence
- In exact arithmetic, CG converges in n steps (completely unrealistic!!)
- Accuracy after k steps of CG is related to:
  - consider polynomials of degree k that are equal to 1 at 0
  - how small can such a polynomial be at all the eigenvalues of A?
- Thus, eigenvalues close together are good (see the experiment below).
- Condition number: κ(A) = ||A||_2 ||A^{-1}||_2 = λ_max(A) / λ_min(A)
- Residual is reduced by a constant factor by O(κ^{1/2}(A)) iterations of CG.
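A small MATLAB experiment illustrating the clustering effect; the diagonal test matrices are illustrative stand-ins for clustered versus spread spectra:

    n = 500;  b = rand(n, 1);
    Aclust  = spdiags(1 + 1e-3*rand(n,1), 0, n, n);    % eigenvalues near 1
    Aspread = spdiags(linspace(1, 1e4, n)', 0, n, n);  % kappa ~ 1e4
    [~, ~, ~, it1] = pcg(Aclust,  b, 1e-10, n);
    [~, ~, ~, it2] = pcg(Aspread, b, 1e-10, n);
    fprintf('clustered: %d iterations, spread: %d iterations\n', it1, it2);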
10. (Matlab demo)
- CG on grid5(15) and bcsstk08
- n steps of CG on bcsstk08
11. Conjugate gradient: Parallel implementation
- Lay out matrix and vectors by rows
- Hard part is the matrix-vector product y = A x
- Algorithm: each processor j
  - Broadcast x(j)
  - Compute y(j) = A(j,:) x
- May send more of x than needed
- Partition / reorder matrix to reduce communication (serial sketch below)
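A serial MATLAB sketch of the block-row matvec; each loop iteration plays the role of one processor, which after the broadcast sees all of x (A and x assumed given):

    p = 4;  n = size(A, 1);
    rows = round(linspace(0, n, p+1));   % row-block boundaries
    y = zeros(n, 1);
    for j = 1:p                          % each iteration = one "processor"
      myrows = rows(j)+1 : rows(j+1);    % rows owned by processor j
      y(myrows) = A(myrows, :) * x;      % y(j) = A(j,:) x
    end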
12. (Matlab demo)
- 2-way partition of eppstein mesh
- 8-way dice of eppstein mesh
13. Preconditioners
- Suppose you had a matrix B such that:
  (1) the condition number κ(B^{-1}A) is small
  (2) B y = z is easy to solve
- Then you could solve (B^{-1}A) x = B^{-1} b instead of A x = b
- B = A is great for (1), not for (2)
- B = I is great for (2), not for (1)
- Domain-specific approximations sometimes work
- B = diagonal of A sometimes works (sketch below)
- Or, bring back the direct methods technology . . .
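A minimal sketch of the diagonal (Jacobi) choice using MATLAB's pcg; solving B y = z is then trivial:

    B = spdiags(diag(A), 0, size(A,1), size(A,2));   % B = diagonal of A
    [x, flag, relres, iters] = pcg(A, b, 1e-8, 500, B);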
14. (Matlab demo)
- bcsstk08 with diagonal precond
15. Incomplete Cholesky factorization (IC, ILU)
- Compute factors of A by Gaussian elimination, but ignore fill
- Preconditioner B = R^T R ≈ A, not formed explicitly
- Compute B^{-1} z by triangular solves (in time nnz(A))
- Total storage is O(nnz(A)), static data structure
- Either symmetric (IC) or nonsymmetric (ILU); see the sketch below
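In MATLAB this is available directly; a sketch for a sparse SPD matrix A (ichol returns a lower-triangular factor, so the preconditioner is B = L L^T here):

    L = ichol(A);                      % zero-fill incomplete Cholesky
    x = pcg(A, b, 1e-8, 500, L, L');   % preconditioner applied by two
                                       % triangular solves per iteration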
16. (Matlab demo)
17. Incomplete Cholesky and ILU: Variants
- Allow one or more levels of fill
  - unpredictable storage requirements
- Allow fill whose magnitude exceeds a drop tolerance
  - may get better approximate factors than levels of fill
  - unpredictable storage requirements
  - choice of tolerance is ad hoc
- Partial pivoting (for nonsymmetric A)
- Modified ILU (MIC): add dropped fill to the diagonal of U or R
  - A and R^T R have the same row sums
  - good in some PDE contexts (see the options sketch below)
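The drop-tolerance and modified variants map onto ichol's options struct; the field values below are illustrative choices:

    opts.type = 'ict';         % threshold-based incomplete Cholesky
    opts.droptol = 1e-3;       % drop fill smaller than this tolerance
    opts.michol = 'on';        % modified IC: add dropped fill to diagonal
    L = ichol(A, opts);
    x = pcg(A, b, 1e-8, 500, L, L');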
18. Incomplete Cholesky and ILU: Issues
- Choice of parameters
  - good: smooth transition from iterative to direct methods
  - bad: very ad hoc, problem-dependent
  - tradeoff: time per iteration (more fill ⇒ more time) vs. number of iterations (more fill ⇒ fewer iterations)
- Effectiveness
  - condition number usually improves (only) by a constant factor (except MIC for some problems from PDEs)
  - still, often good when tuned for a particular class of problems
- Parallelism
  - Triangular solves are not very parallel
  - Reordering for parallel triangular solve by graph coloring
19. (Matlab demo)
20. Sparse approximate inverses
- Compute B^{-1} ≈ A^{-1} explicitly
- Minimize ||B^{-1} A − I||_F (in parallel, by columns)
- Variants: factored form of B^{-1}, more fill, . . .
- Good: very parallel (sketch below)
- Bad: effectiveness varies widely
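A sketch of one common formulation, minimizing ||A M − I||_F column by column over a fixed sparsity pattern (here, the pattern of A itself); each column is an independent small least-squares problem, hence the parallelism:

    n = size(A, 1);
    M = sparse(n, n);              % M will play the role of B^{-1}
    for j = 1:n                    % embarrassingly parallel loop
      J = find(A(:, j));           % allowed nonzero positions in column j
      e = zeros(n, 1); e(j) = 1;
      M(J, j) = full(A(:, J)) \ e; % small least squares: min ||A m - e_j||
    end
    % Apply the preconditioner by a matvec: z = M * r.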
21. Support graph preconditioners: example (Vaidya)
(figure: graphs G(A) and G(B))
- A is symmetric positive definite with negative off-diagonal nonzeros
- B is a maximum-weight spanning tree for A (with diagonal modified to preserve row sums); a construction sketch follows below
- factor B in O(n) space and O(n) time
- applying the preconditioner costs O(n) time per iteration
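A hedged MATLAB sketch of the construction, assuming A is SPD with negative off-diagonals and a connected graph; the weight transform makes minspantree return a maximum-weight spanning tree, since a minimum spanning tree depends only on the ordering of the edge weights:

    n = size(A, 1);
    W = -triu(A, 1);                          % positive off-diagonal weights
    G = graph(W, 'upper');
    G.Edges.Weight = max(G.Edges.Weight) + 1 - G.Edges.Weight;  % reverse order
    T = minspantree(G);                       % = max-weight tree of original W
    e = T.Edges.EndNodes;                     % tree edges (i < j)
    idx = sub2ind([n n], e(:,1), e(:,2));
    Boff = sparse(e(:,1), e(:,2), full(A(idx)), n, n);
    Boff = Boff + Boff';                      % original negative tree entries
    d = full(sum(A, 2) - sum(Boff, 2));       % diagonal preserving row sums
    B = Boff + spdiags(d, 0, n, n);           % B is SPD if A is diag. dominant
    x = pcg(A, b, 1e-8, 1000, B);             % pcg solves with B each iteration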
22. Support graph preconditioners: example (continued)
(figure: graphs G(A) and G(B))
- support each edge of A by a path in B
- dilation(A edge) = length of its supporting path in B
- congestion(B edge) = number of A edges it supports
- p = max congestion, q = max dilation
- condition number κ(B^{-1}A) is bounded by p·q (at most O(n^2))
23. Support graph preconditioners: example (continued)
(figure: graphs G(A) and G(B))
- can improve congestion and dilation by adding a few strategically chosen edges to B
- cost of factor + solve is O(n^{1.75}), or O(n^{1.2}) if A is planar
- in recent experiments [Chen & Toledo], often better than drop-tolerance MIC for 2D problems, but not for 3D
24. Domain decomposition (introduction)

        [ B    0    E ]
  A  =  [ 0    C    F ]
        [ E^T  F^T  G ]

- Partition the problem (e.g. the mesh) into subdomains
- Use solvers for the subdomains, B^{-1} and C^{-1}, to precondition an iterative solver on the interface
- Interface system is the Schur complement: S = G − E^T B^{-1} E − F^T C^{-1} F (sketch below)
- Parallelizes naturally by subdomains
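A MATLAB sketch of forming and using the Schur complement for the block system above; b1, b2, bG are the right-hand-side blocks for the two subdomains and the interface (names assumed), and the explicit solves are for illustration only:

    S = G - E' * (B \ E) - F' * (C \ F);             % Schur complement
    uI = S \ (bG - E' * (B \ b1) - F' * (C \ b2));   % interface unknowns
    u1 = B \ (b1 - E * uI);                          % back-substitute into
    u2 = C \ (b2 - F * uI);                          % each subdomain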
25. (Matlab demo)
- grid and matrix structure for overlapping 2-way
partition of eppstein
26. Multigrid (introduction)
- For a PDE on a fine mesh, precondition using a solution on a coarser mesh
- Use the idea recursively on a hierarchy of meshes
- Solves the model problem (Poisson's equation) in linear time!
- Often useful when a hierarchy of meshes can be built
- Hard to parallelize coarse meshes well
- This is just the intuition; there is lots of theory and technology (a two-grid sketch follows below)
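A minimal two-grid V-cycle sketch for the 1D Poisson model problem, assuming the number of fine grid points n is odd so the coarse grid nests; function and variable names are illustrative:

    function x = twogrid(A, b, x)
      D = diag(diag(A));  w = 2/3;                    % weighted Jacobi smoother
      for s = 1:3, x = x + w * (D \ (b - A*x)); end   % pre-smooth
      P = interp1d(size(A, 1));                       % linear interpolation
      Ac = P' * A * P;                                % Galerkin coarse operator
      x = x + P * (Ac \ (P' * (b - A*x)));            % coarse-grid correction
      for s = 1:3, x = x + w * (D \ (b - A*x)); end   % post-smooth
    end

    function P = interp1d(n)                          % prolongation, n odd
      nc = (n - 1) / 2;
      P = sparse(n, nc);
      for j = 1:nc
        i = 2*j;                                      % fine index of coarse pt j
        P(i-1, j) = 0.5;  P(i, j) = 1;  P(i+1, j) = 0.5;
      end
    end

Real multigrid replaces the direct coarse solve by a recursive call down the mesh hierarchy.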
27. Other Krylov subspace methods
- Nonsymmetric linear systems:
  - GMRES: for i = 1, 2, 3, . . . find x_i ∈ K_i(A, b) such that r_i = (A x_i − b) ⊥ K_i(A, b)
    But, no short recurrence ⇒ save old vectors ⇒ lots more space
  - BiCGStab, QMR, etc.: two spaces K_i(A, b) and K_i(A^T, b) with mutually orthogonal bases
    Short recurrences ⇒ O(n) space, but less robust
  - Convergence and preconditioning more delicate than CG
  - Active area of current research (usage sketch below)
- Eigenvalues: Lanczos (symmetric), Arnoldi (nonsymmetric)
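In MATLAB the nonsymmetric solvers are built in; a usage sketch with zero-fill ILU as the preconditioner:

    [L, U] = ilu(A);                          % zero-fill ILU factors
    x1 = gmres(A, b, 20, 1e-8, 100, L, U);    % restarted GMRES(20)
    x2 = bicgstab(A, b, 1e-8, 500, L, U);     % short-recurrence alternative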
28. The Landscape of Sparse Ax = b Solvers
29. Complexity of direct methods
Time and space to solve any problem on any well-shaped finite element mesh:

                    2D            3D
    Space (fill)    O(n log n)    O(n^{4/3})
    Time (flops)    O(n^{3/2})    O(n^2)
30. Complexity of linear solvers
Time to solve the model problem (Poisson's equation) on a regular mesh:

                            2D            3D
    Sparse Cholesky         O(n^{1.5})    O(n^2)
    CG, exact arithmetic    O(n^2)        O(n^2)
    CG, no precond          O(n^{1.5})    O(n^{1.33})
    CG, modified IC         O(n^{1.25})   O(n^{1.17})
    CG, support trees       O(n^{1.20})   O(n^{1.75})
    Multigrid               O(n)          O(n)