Title: CSE 245: Computer Aided Circuit Simulation and Verification
1. CSE 245: Computer Aided Circuit Simulation and Verification
Lecture 5: Matrix Solver II - Iterative Methods
2. Outline
- Iterative Method
- Stationary Iterative Method (Jacobi, Gauss-Seidel, SOR)
- Krylov Method (CG, GMRES)
- Multigrid Method
3. Iterative Methods
- Stationary:
  x(k+1) = G x(k) + c
  where G and c do not depend on the iteration count k
- Non-stationary:
  x(k+1) = x(k) + a_k p(k)
  where the computation involves information that changes at each iteration
4. Stationary: Jacobi Method
- In the i-th equation, solve for the value of xi while assuming the other entries of x remain fixed
- In matrix terms the method becomes
  x(k+1) = D^-1 (L + U) x(k) + D^-1 b
- where D, -L and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M, with M = D - L - U
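
As a concrete sketch of the update above, here is a minimal NumPy implementation of the Jacobi iteration; the small diagonally dominant test system, tolerance, and iteration cap are illustrative assumptions, not values from the lecture.

import numpy as np

def jacobi(M, b, x0, tol=1e-8, max_iter=500):
    """Jacobi iteration for M x = b, with M = D - L - U."""
    D = np.diag(np.diag(M))           # diagonal part D
    LU = D - M                        # off-diagonal part L + U
    D_inv = 1.0 / np.diag(M)          # D^-1 stored as a vector
    x = x0.copy()
    for _ in range(max_iter):
        x_new = D_inv * (LU @ x + b)  # x(k+1) = D^-1 ((L+U) x(k) + b)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

# Assumed diagonally dominant test system, for illustration only.
M = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(jacobi(M, b, np.zeros(3)))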
5. Stationary: Gauss-Seidel Method
- Like Jacobi, but now assume that previously computed results are used as soon as they are available
- In matrix terms the method becomes
  x(k+1) = (D - L)^-1 (U x(k) + b)
- where D, -L and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M, with M = D - L - U
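
Below is a similar minimal sketch of Gauss-Seidel, updating each entry in place so freshly computed values are reused within the same sweep; the test system is the same assumed example as before.

import numpy as np

def gauss_seidel(M, b, x0, tol=1e-8, max_iter=500):
    """Gauss-Seidel: reuse freshly updated entries within the same sweep."""
    n = len(b)
    x = x0.copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # Solve the i-th equation for x[i], using the newest values of the others.
            s = M[i, :i] @ x[:i] + M[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / M[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

M = np.array([[4.0, -1.0, 0.0],       # assumed test data
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(gauss_seidel(M, b, np.zeros(3)))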
6. Stationary: Successive Overrelaxation (SOR)
- Devised by applying extrapolation to Gauss-Seidel in the form of a weighted average
- In matrix terms the method becomes
  x(k+1) = (D - wL)^-1 (wU + (1 - w)D) x(k) + w (D - wL)^-1 b
- where D, -L and -U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M, with M = D - L - U
7. SOR
- Choose w to accelerate convergence:
  - w = 1: reduces to the underlying Jacobi / Gauss-Seidel iteration
  - 1 < w < 2: over-relaxation
  - w < 1: under-relaxation
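
A minimal SOR sketch, written as the weighted average of the old value and the Gauss-Seidel update; the relaxation factor w = 1.1 and the test system are illustrative assumptions (w = 1 reproduces Gauss-Seidel).

import numpy as np

def sor(M, b, x0, w=1.1, tol=1e-8, max_iter=500):
    """SOR: x[i] <- (1 - w) * x[i] + w * (Gauss-Seidel update of x[i])."""
    n = len(b)
    x = x0.copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = M[i, :i] @ x[:i] + M[i, i+1:] @ x[i+1:]
            gs_update = (b[i] - s) / M[i, i]
            x[i] = (1.0 - w) * x[i] + w * gs_update
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

M = np.array([[4.0, -1.0, 0.0],       # assumed test data
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(sor(M, b, np.zeros(3), w=1.1))  # w = 1 would reproduce Gauss-Seidel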
8. Convergence of Stationary Methods
- Linear equation Mx = b
- A sufficient condition for convergence of the iteration (Gauss-Seidel, Jacobi) is that the matrix M is diagonally dominant.
- If M is symmetric positive definite, SOR converges for any w with 0 < w < 2.
- A necessary and sufficient condition for convergence is that the magnitude of the largest eigenvalue (the spectral radius) of the iteration matrix G is smaller than 1, where G is the iteration matrix of:
  - Jacobi
  - Gauss-Seidel
  - SOR
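
The sketch below builds the Jacobi, Gauss-Seidel, and SOR iteration matrices G for an assumed small diagonally dominant matrix and checks that each spectral radius is below 1, which is the convergence condition stated above; the test matrix and relaxation factor are assumptions for illustration.

import numpy as np

def iteration_matrices(M, w=1.1):
    """Iteration matrices G for Jacobi, Gauss-Seidel and SOR, with M = D - L - U."""
    D = np.diag(np.diag(M))
    L = -np.tril(M, -1)                  # strictly lower part (sign convention M = D - L - U)
    U = -np.triu(M, 1)                   # strictly upper part
    G_jacobi = np.linalg.solve(D, L + U)
    G_gs     = np.linalg.solve(D - L, U)
    G_sor    = np.linalg.solve(D - w * L, w * U + (1 - w) * D)
    return G_jacobi, G_gs, G_sor

def spectral_radius(G):
    return max(abs(np.linalg.eigvals(G)))

M = np.array([[4.0, -1.0, 0.0],          # assumed diagonally dominant test matrix
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
for name, G in zip(("Jacobi", "Gauss-Seidel", "SOR"), iteration_matrices(M)):
    print(name, spectral_radius(G))      # each is < 1, so each iteration converges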
9. Outline
- Iterative Method
- Stationary Iterative Method (Jacobi, Gauss-Seidel, SOR)
- Krylov Method (CG, GMRES)
- Steepest Descent
- Conjugate Gradient
- Preconditioning
- Multigrid Method
10. Linear Equation as an Optimization Problem
- Quadratic function of vector x:
  f(x) = (1/2) x^T A x - b^T x + c
- Matrix A is positive definite if x^T A x > 0 for any nonzero vector x
- If A is symmetric positive definite, f(x) is minimized by the solution of Ax = b
11. Linear Equation as an Optimization Problem
- Quadratic function: f(x) = (1/2) x^T A x - b^T x + c
- Derivative: f'(x) = (1/2) A^T x + (1/2) A x - b
- If A is symmetric: f'(x) = A x - b
- If A is positive definite: f(x) is minimized by setting f'(x) to 0, which gives A x = b
12. For a Symmetric Positive Definite Matrix A
13. Gradient of the Quadratic Form
The gradient f'(x) points in the direction of steepest increase of f(x).
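
As a quick numerical check of this fact, the sketch below compares a finite-difference gradient of f(x) = (1/2) x^T A x - b^T x + c with A x - b for a symmetric A; the 2x2 matrix and vectors are assumed example data.

import numpy as np

A = np.array([[3.0, 2.0],     # assumed symmetric positive definite example
              [2.0, 6.0]])
b = np.array([2.0, -8.0])
c = 0.0

def f(x):
    return 0.5 * x @ A @ x - b @ x + c

x = np.array([1.0, 1.5])
eps = 1e-6
# Central-difference approximation of the gradient of f at x
fd_grad = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(2)])
print(fd_grad)       # numerically matches A @ x - b
print(A @ x - b)     # gradient from the formula f'(x) = A x - b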
14. Symmetric Positive Definite Matrix A
- If A is symmetric positive definite,
- p is an arbitrary point,
- x is the solution point,
  then f(p) = f(x) + (1/2) (p - x)^T A (p - x).
Since A is positive definite, we have f(p) > f(x) if p != x.
15. If A Is Not Positive Definite
- a) Positive-definite matrix, b) negative-definite matrix, c) singular matrix, d) indefinite matrix
16. Non-stationary Iterative Methods
- Start from an initial guess x(0) and adjust it until close enough to the exact solution:
  x(i+1) = x(i) + a_i p(i),   i = 0, 1, 2, 3, ...
  where p(i) is the adjustment direction and a_i is the step size
- How to choose the direction and the step size?
17. Steepest Descent Method (1)
- Choose the direction in which f decreases most quickly: the direction opposite of f'(x(i))
- This is also the direction of the residual: -f'(x(i)) = b - A x(i) = r(i)
18. Steepest Descent Method (2)
- How to choose the step size a_i? Line search:
- a_i should minimize f along the direction of r(i), which means the new gradient is orthogonal to the search direction:
  f'(x(i+1))^T r(i) = 0
- Solving this gives a_i = (r(i)^T r(i)) / (r(i)^T A r(i))
19. Steepest Descent Algorithm
Given x(0), iterate until the residual is smaller than the error tolerance:
  r(i) = b - A x(i)
  a_i = (r(i)^T r(i)) / (r(i)^T A r(i))
  x(i+1) = x(i) + a_i r(i)
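
A minimal NumPy sketch of this algorithm, applied to an assumed 2x2 symmetric positive definite system and started at (-2, -2) as in the example on the next slide.

import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
    """Steepest descent for a symmetric positive definite A."""
    x = x0.copy()
    for _ in range(max_iter):
        r = b - A @ x                    # residual = direction of steepest decrease
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))  # exact line-search step size
        x = x + alpha * r
    return x

A = np.array([[3.0, 2.0],                # assumed example system
              [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(steepest_descent(A, b, np.array([-2.0, -2.0])))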
20. Steepest Descent Method: Example
- a) Starting at (-2, -2), take the direction of steepest descent of f
- b) Find the point on the intersection of these two surfaces that minimizes f
- c) Intersection of surfaces
- d) The gradient at the bottommost point is orthogonal to the gradient of the previous step
21. Iterations of the Steepest Descent Method
22. Convergence of Steepest Descent (1)
Let the error be expanded in the eigenvectors v_j of A:
  e(i) = sum_j ξ_j v_j,   A v_j = λ_j v_j (eigenvector v_j, eigenvalue λ_j),   j = 1, 2, ..., n
Energy norm: ||e||_A = (e^T A e)^(1/2)
23. Convergence of Steepest Descent (2)
24. Convergence Study (n = 2)
Assume λ1 >= λ2.
Let the spectral condition number κ = λ1 / λ2, and let μ = ξ2 / ξ1 (the slope of the error in the eigenvector basis).
The energy-norm reduction per step can then be written as a function w(κ, μ).
25. Plot of w
26. Case Study
27. Bound of Convergence
  ||e(i)||_A <= ((κ - 1) / (κ + 1))^i ||e(0)||_A
It can be proved that this bound is also valid for n > 2, where κ = λmax / λmin.
28. Conjugate Gradient Method
- Steepest descent repeats search directions: the same direction can be searched again in later iterations
- Why not take exactly one step in each search direction?
(Figure: search directions of the steepest descent method.)
29. Orthogonal Directions
Pick orthogonal search directions.
30. Orthogonal -> A-orthogonal
- Instead of orthogonal search directions, we make the search directions A-orthogonal (conjugate):
  d(i)^T A d(j) = 0 for i != j
31. Search Step Size
  a_i = (d(i)^T r(i)) / (d(i)^T A d(i))
32. Iteration Finishes in n Steps
Expand the initial error in the A-orthogonal search directions: e(0) = sum_j δ_j d(j).
The error component in direction d(j) is eliminated at step j; after n steps, all error components are eliminated.
33. Conjugate Search Directions
- How to construct A-orthogonal search directions, given a set of n linearly independent vectors?
- Since the residual vectors in the steepest descent method are orthogonal, the residuals are a good candidate set to start with.
34. Construct Search Directions (1)
- In the steepest descent method, the new residual is just a linear combination of the previous residual and A r(i):
  r(i+1) = r(i) - a_i A r(i)
- Let D_i = span{r(0), r(1), ..., r(i-1)}; we have
  D_i = span{r(0), A r(0), A^2 r(0), ..., A^(i-1) r(0)}
Krylov subspace: built by repeatedly applying a matrix to a vector.
35. Construct Search Directions (2)
Let d(0) = r(0).
For i > 0, construct d(i) from r(i) by conjugate Gram-Schmidt, making it A-orthogonal to the previous search directions.
36. Construct Search Directions (3)
- We can get the next direction from the previous one, without saving them all:
  let β_i = (r(i)^T r(i)) / (r(i-1)^T r(i-1)),
  then d(i) = r(i) + β_i d(i-1)
37. Conjugate Gradient Algorithm
Given x(0), iterate until the residual is smaller than the error tolerance:
  d(0) = r(0) = b - A x(0)
  a_i = (r(i)^T r(i)) / (d(i)^T A d(i))
  x(i+1) = x(i) + a_i d(i)
  r(i+1) = r(i) - a_i A d(i)
  β_(i+1) = (r(i+1)^T r(i+1)) / (r(i)^T r(i))
  d(i+1) = r(i+1) + β_(i+1) d(i)
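
The same recurrences in a minimal NumPy sketch; the 2x2 test system is an assumed example, and in exact arithmetic the loop terminates in at most n = 2 steps.

import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Conjugate gradient for a symmetric positive definite A."""
    n = len(b)
    max_iter = max_iter or n
    x = x0.copy()
    r = b - A @ x
    d = r.copy()                      # d(0) = r(0)
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ad = A @ d
        alpha = rs_old / (d @ Ad)     # step size
        x = x + alpha * d
        r = r - alpha * Ad            # r(i+1) = r(i) - a_i A d(i)
        rs_new = r @ r
        beta = rs_new / rs_old        # β_(i+1)
        d = r + beta * d              # new A-orthogonal search direction
        rs_old = rs_new
    return x

A = np.array([[3.0, 2.0],             # assumed example system
              [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(conjugate_gradient(A, b, np.zeros(2)))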
38. Conjugate Gradient Convergence
- In exact arithmetic, CG converges in n steps (completely unrealistic!)
- Accuracy after k steps of CG is related to:
  - consider polynomials of degree k that are equal to 1 at step 0
  - how small can such a polynomial be at all the eigenvalues of A?
- Eigenvalues close together are good.
- Condition number: κ(A) = ||A||_2 ||A^-1||_2 = λmax(A) / λmin(A)
- The residual is reduced by a constant factor in O(κ^(1/2)(A)) iterations of CG.
39. Other Krylov Subspace Methods
- Nonsymmetric linear systems:
  - GMRES: for i = 1, 2, 3, ..., find x(i) in K_i(A, b) such that r(i) = (A x(i) - b) is orthogonal to K_i(A, b). But there is no short recurrence, so old vectors must be saved, which costs a lot more space (usually restarted every k iterations to use less space).
  - BiCGStab, QMR, etc.: use two spaces K_i(A, b) and K_i(A^T, b) with mutually orthogonal bases. Short recurrences give O(n) space, but the methods are less robust.
  - Convergence and preconditioning are more delicate than for CG.
  - Active area of current research.
- Eigenvalues: Lanczos (symmetric), Arnoldi (nonsymmetric)
40. Preconditioners
- Suppose you had a matrix B such that
  (1) the condition number κ(B^-1 A) is small
  (2) By = z is easy to solve
- Then you could solve (B^-1 A) x = B^-1 b instead of Ax = b
- B = A is great for (1), not for (2)
- B = I is great for (2), not for (1)
- Domain-specific approximations sometimes work
- B = diagonal of A sometimes works
- Better: blend in some direct-methods ideas...
41. Preconditioned Conjugate Gradient Iteration
x(0) = 0, r(0) = b, d(0) = B^-1 r(0), y(0) = B^-1 r(0)
for k = 1, 2, 3, ...
    a_k = (y(k-1)^T r(k-1)) / (d(k-1)^T A d(k-1))    step length
    x(k) = x(k-1) + a_k d(k-1)                       approximate solution
    r(k) = r(k-1) - a_k A d(k-1)                     residual
    y(k) = B^-1 r(k)                                 preconditioning solve
    β_k = (y(k)^T r(k)) / (y(k-1)^T r(k-1))          improvement
    d(k) = y(k) + β_k d(k-1)                         search direction
- One matrix-vector multiplication per iteration
- One solve with preconditioner per iteration
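
A minimal NumPy translation of the iteration above, using B = diagonal of A (one of the simple choices from the preconditioner slide) applied through a B_solve callback; the 3x3 test system is an assumed example.

import numpy as np

def pcg(A, b, B_solve, tol=1e-10, max_iter=None):
    """Preconditioned CG following the iteration above; B_solve(r) applies B^-1."""
    n = len(b)
    max_iter = max_iter or n
    x = np.zeros(n)                  # x(0) = 0, so r(0) = b
    r = b.copy()
    y = B_solve(r)                   # y(0) = B^-1 r(0)
    d = y.copy()                     # d(0) = B^-1 r(0)
    yr_old = y @ r
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Ad = A @ d                   # one matrix-vector multiplication per iteration
        alpha = yr_old / (d @ Ad)    # step length
        x = x + alpha * d            # approximate solution
        r = r - alpha * Ad           # residual
        y = B_solve(r)               # one preconditioner solve per iteration
        yr_new = y @ r
        beta = yr_new / yr_old       # improvement
        d = y + beta * d             # search direction
        yr_old = yr_new
    return x

A = np.array([[4.0, 1.0, 0.0],        # assumed symmetric positive definite example
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
jacobi_precond = lambda r: r / np.diag(A)   # B = diagonal of A
print(pcg(A, b, jacobi_precond))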
42. Outline
- Iterative Method
- Stationary Iterative Method (Jacobi, Gauss-Seidel, SOR)
- Krylov Method (CG, GMRES)
- Multigrid Method
43. What Is Multigrid?
- A multilevel iterative method to solve Ax = b
- Originated in PDEs on geometric grids
- Extending the multigrid idea to unstructured problems gives Algebraic Multigrid (AMG)
- Geometric multigrid is used here to present the basic ideas of the multigrid method.
44. The Model Problem
Ax = b
45. Simple Iterative Methods
- x(0) -> x(1) -> ... -> x(k)
- Jacobi iteration
- Matrix form: x(k) = R_J x(k-1) + C_J
- General form: x(k) = R x(k-1) + C   (1)
- Stationary: x = R x + C   (2)
46. Error and Convergence
- Definitions: error e = x - x(k)   (3), residual r = b - A x(k)   (4)
- e, r relation: A e = r   (5)   (from (3) and (4))
- e(1) = x - x(1) = (R x + C) - (R x(0) + C) = R e(0)
- Error equation: e(k) = R^k e(0)   (6)   (from (1), (2), (3))
- Convergence: the error vanishes as k grows iff the spectral radius of R is smaller than 1
47. Error of Different Frequencies
- Wavenumber k and frequency θ = kπ/n
- High-frequency error is more oscillatory between points
48. Iterations Reduce High-Frequency Error Efficiently
- Smoothing iterations reduce high-frequency error efficiently, but not low-frequency error
(Plot: error vs. iteration count for wavenumbers k = 1, k = 2, k = 4.)
49. Multigrid: A First Glance
- Two levels: a coarse and a fine grid
  Fine grid Ωh: Ah xh = bh (the original Ax = b)
  Coarse grid Ω2h: A2h x2h = b2h
50. Idea 1: The V-cycle Iteration
- Also called the nested iteration
  Start on the coarse grid Ω2h: iterate on A2h x2h = b2h
  Prolongation: map the coarse-grid result up to the fine grid Ωh
  Iterate on Ah xh = bh to get the solution
  Restriction: maps fine-grid quantities down to Ω2h
Question 1: Why do we need the coarse grid?
51. Prolongation
- Prolongation (interpolation) operator I_2h^h
- xh = I_2h^h x2h
52. Restriction
- Restriction operator I_h^2h
- x2h = I_h^2h xh
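
One common concrete choice of these operators in 1D geometric multigrid is linear interpolation for prolongation and full weighting for restriction; the sketch below builds them as explicit matrices. The stencils are standard textbook choices assumed here, not taken from the slides.

import numpy as np

def prolongation(n_coarse):
    """Linear-interpolation prolongation I_2h^h: n_coarse points -> 2*n_coarse + 1 points."""
    n_fine = 2 * n_coarse + 1
    P = np.zeros((n_fine, n_coarse))
    for j in range(n_coarse):
        P[2 * j + 1, j] = 1.0       # fine point that coincides with coarse point j
        P[2 * j, j] += 0.5          # neighboring fine points get the average
        P[2 * j + 2, j] += 0.5      # of the adjacent coarse values
    return P

def restriction(n_coarse):
    """Full-weighting restriction I_h^2h = 0.5 * P^T (stencil [1/4, 1/2, 1/4])."""
    return 0.5 * prolongation(n_coarse).T

P = prolongation(3)      # 3 coarse interior points -> 7 fine interior points
R = restriction(3)
print(P.shape, R.shape)  # (7, 3) (3, 7)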
53. Smoothing
- The basic iterations at each level
- On Ω_ph: x_ph(old) -> x_ph(new)
- The iteration reduces the error and makes the error geometrically smooth.
- So the iteration is called smoothing.
54. Why Multilevel?
- Coarse-level iteration is cheap.
- More than this:
- Coarse-level smoothing reduces the error more efficiently than the fine level, in a certain sense.
- Why? (Question 2)
55. Error Restriction
- Mapping the error to the coarse grid makes the error more oscillatory:
  a mode with wavenumber k = 4 has frequency θ = kπ/n on Ωh, but θ = kπ/(n/2), twice as high, on Ω2h.
56. Idea 2: Residual Correction
- Known: the current solution x
- Solving Ax = b is equivalent to solving the residual equation Ae = r
- MG does NOT map x directly between levels
- It maps the residual equation to the coarse level:
  - Calculate rh = bh - Ah xh
  - b2h = I_h^2h rh   (restriction)
  - Solve A2h x2h = b2h on the coarse grid
  - eh = I_2h^h x2h   (prolongation)
  - xh = xh + eh
57. Why Residual Correction?
- The error is smooth at the fine level, but the actual solution may not be.
- Prolongating the coarse-grid correction produces a smooth error on the fine level, which is supposed to be a good approximation of the fine-level error.
- If the solution is not smooth on the fine level, prolongating the solution itself would introduce more high-frequency error.
58. Revised V-cycle with Idea 2
Ωh -> Ω2h -> Ωh:
- Smooth on xh
- Calculate rh = bh - Ah xh
- b2h = I_h^2h rh   (restriction)
- Smooth on x2h
- eh = I_2h^h x2h   (prolongation)
- Correct: xh = xh + eh
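
Putting the pieces together, here is a sketch of one two-grid residual-correction cycle for an assumed 1D Poisson model problem, with a few damped-Jacobi sweeps as the smoother and the Galerkin product R Ah P as the coarse operator A2h; the specific choices (problem, smoother weight, sweep count, grid sizes) are illustrative assumptions.

import numpy as np

def poisson_1d(n):
    """1D Poisson matrix (assumed model problem) with n interior points, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
            - np.diag(np.ones(n - 1), -1)) / h**2

def prolongation(nc):
    """Linear interpolation from nc coarse points to 2*nc + 1 fine points."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j + 1, j] = 1.0
        P[2 * j, j] += 0.5
        P[2 * j + 2, j] += 0.5
    return P

def weighted_jacobi(A, b, x, sweeps=3, w=2.0 / 3.0):
    """Smoother: a few damped-Jacobi sweeps to remove high-frequency error."""
    Dinv = 1.0 / np.diag(A)
    for _ in range(sweeps):
        x = x + w * Dinv * (b - A @ x)
    return x

def two_grid_cycle(Ah, bh, xh):
    """One Omega_h -> Omega_2h -> Omega_h residual-correction cycle."""
    nc = (len(bh) - 1) // 2
    P = prolongation(nc)                 # I_2h^h
    R = 0.5 * P.T                        # I_h^2h (full weighting)
    A2h = R @ Ah @ P                     # Galerkin coarse operator (assumed choice for A2h)
    xh = weighted_jacobi(Ah, bh, xh)     # pre-smoothing on xh
    rh = bh - Ah @ xh                    # fine-grid residual
    b2h = R @ rh                         # restrict the residual
    e2h = np.linalg.solve(A2h, b2h)      # solve the coarse residual equation
    xh = xh + P @ e2h                    # prolongate and correct
    return weighted_jacobi(Ah, bh, xh)   # post-smoothing

n = 15                                   # fine grid: 15 interior points, coarse grid: 7
Ah = poisson_1d(n)
bh = np.ones(n)
xh = np.zeros(n)
for _ in range(5):
    xh = two_grid_cycle(Ah, bh, xh)
print(np.linalg.norm(bh - Ah @ xh))      # residual shrinks rapidly with each cycle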
59. What Is A2h?
60. Going to Multiple Levels
- V-cycle and W-cycle (levels h, 2h, 4h)
- Full Multigrid V-cycle (levels h, 2h, 4h, 8h)
61. Performance of Multigrid
Gaussian elimination: O(N^2)
Jacobi iteration: O(N^2 log ε)
Gauss-Seidel: O(N^2 log ε)
SOR: O(N^(3/2) log ε)
Conjugate gradient: O(N^(3/2) log ε)
Multigrid (iterative): O(N log ε)
Multigrid (FMG): O(N)
62. Summary of MG Ideas
- Three important ideas of MG:
  - Nested iteration
  - Residual correction
  - Elimination of error:
    - high frequency: fine grid
    - low frequency: coarse grid
63. AMG for Unstructured Grids
- Ax = b, with no regular grid structure
- The "fine grid" is defined from A
64. Three Questions for AMG
- How to choose the coarse grid?
- How to define the smoothness of errors?
- How are prolongation and restriction done?
65. How to Choose the Coarse Grid
- Idea: C/F splitting
  - As few coarse-grid points as possible
  - For each F-node, at least one of its neighbors is a C-node
  - Choose nodes with strong coupling to other nodes as C-nodes
(Example: a small graph with nodes 1-6 split into C- and F-nodes.)
66. How to Define the Smoothness of Error
- AMG fundamental concept:
- Smooth error = small residual: ||r|| << ||e||
67. How Are Prolongation and Restriction Done?
- Prolongation is based on smooth error and strong connections
- Common practice I
68. AMG Prolongation (2)
69. AMG Prolongation (3)
70. Summary
- Multigrid is a multilevel iterative method.
- Advantage: scalable
- If no geometric grid is available, try the Algebraic Multigrid method.
71. The Landscape of Solvers
(Chart: direct and iterative solvers for Ax = b arranged along two axes, "more robust" and "less storage (if sparse)".)