Eigenvalue Problems in Nanoscale Materials Modeling

Slides: 35

Provided by: hzh87

Learn more at: https://www.cse.psu.edu

Transcript and Presenter's Notes

Title: Eigenvalue Problems in Nanoscale Materials Modeling

1
Eigenvalue Problems in Nanoscale Materials
Modeling

Hong Zhang
Computer Science, Illinois Institute of
Technology
Mathematics and Computer Science, Argonne
National Laboratory

2
Collaborators

Barry Smith
Mathematics and Computer Science, Argonne
National Laboratory
Michael Sternberg, Peter Zapol
Materials Science, Argonne National Laboratory

3
Modeling of Nanostructured Materials
System size
Accuracy

4
Density-Functional based Tight-Binding (DFTB)
5
Matrices are

- large: ultimate goal is 50,000 atoms with electronic structure, N = 200,000
- sparse: non-zero density -> 0 as N increases
- dense solutions are requested: 60% of the eigenvalues and eigenvectors
Dense solutions of large sparse problems!

6
DFTB implementation (2002)
7
Two classes of methods

Direct methods (dense matrix storage)
- compute all or almost all eigensolutions out of dense matrices of small to medium size
- Tridiagonal reduction + QR or bisection
- Time O(N^3), Memory O(N^2)
- LAPACK, ScaLAPACK
Iterative methods (sparse matrix storage)
- compute a selected small set of eigensolutions out of sparse matrices of large size
- Lanczos
- Time O(nnz*N) < O(N^3), Memory O(nnz) < O(N^2)
- ARPACK, BLZPACK, ...
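
To put the O(N^2) and O(nnz) memory estimates in perspective, here is a small arithmetic sketch for N = 200,000, the target size quoted earlier in the talk. The 8-byte double-precision entries, the 12 bytes per stored nonzero (an 8-byte value plus a 4-byte column index), and the figure of 100 nonzeros per row are illustrative assumptions, not numbers from the slides.

```python
def dense_bytes(n, bytes_per_entry=8):
    """Storage for a full n x n matrix (assumes 8-byte doubles)."""
    return n * n * bytes_per_entry


def sparse_bytes(n, nnz_per_row, bytes_per_nonzero=12):
    """Rough storage for a sparse matrix with nnz = n * nnz_per_row.

    Assumes 8 bytes per value plus 4 bytes per column index and ignores
    row-pointer overhead; nnz_per_row is a hypothetical input.
    """
    return n * nnz_per_row * bytes_per_nonzero


n = 200_000
print(dense_bytes(n) / 1e9, "GB for dense storage")
print(sparse_bytes(n, nnz_per_row=100) / 1e9, "GB for sparse storage")
```

Under these assumptions a single dense matrix needs about 320 GB, far more than one node of a typical cluster of that era could hold, while the sparse form stays well under 1 GB.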

8
DFTB eigenvalue problem is distinguished by

(A, B) is large and sparse
-> Iterative method
A large number of eigensolutions (60%) are requested
-> Iterative method + multiple shift-and-invert
The spectrum has
- poor average eigenvalue separation O(1/N),
- clusters with hundreds of tightly packed eigenvalues
- gaps >> O(1/N)
-> Iterative method + multiple shift-and-invert + robustness
The matrix factorization (A - σB) = LDL^T:
not-very-sparse (7%) < nonzero density < dense (50%)
-> Iterative method + multiple shift-and-invert + robustness + efficiency
Ax = λBx is solved many times (possibly 1000s)
-> Iterative method + multiple shift-and-invert + robustness + efficiency + initial approximation of eigensolutions

9
Lanczos shift-and-invert method for Ax = λBx

Cost
- one matrix factorization
- many triangular matrix solves
Gain
- fast convergence
- clustered eigenvalues are transformed to well-separated eigenvalues
- preferred in most practical cases
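
The "Gain" items above follow from the spectral transformation itself: an eigenvalue lam of Ax = lam*Bx becomes nu = 1/(lam - sigma) for the operator (A - sigma*B)^(-1) B, where sigma is the shift. The sketch below applies that map to a made-up spectrum to show the effect; the eigenvalues and the shift are hypothetical values chosen for illustration, and no matrix is factored here.

```python
def shift_invert(eigenvalues, sigma):
    """Map each eigenvalue lam of Ax = lam*Bx to nu = 1/(lam - sigma).

    nu is the corresponding eigenvalue of (A - sigma*B)^(-1) B.
    """
    return [1.0 / (lam - sigma) for lam in eigenvalues]


# Hypothetical spectrum: a tight cluster near 1.0 plus two distant eigenvalues.
lams = [1.000, 1.001, 1.002, 5.0, 10.0]
sigma = 0.9995  # shift placed just below the cluster

for lam, nu in zip(lams, shift_invert(lams, sigma)):
    print(f"lambda = {lam:7.3f}  ->  nu = {nu:10.3f}")
```

The three clustered eigenvalues, 0.001 apart before the transformation, map to values that are hundreds apart and dominate the transformed spectrum, which is the situation in which Lanczos converges quickly.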

10
Multiple Shift-and-Invert Parallel Eigenvalue
Algorithm
11
Multiple Shift-and-Invert Parallel Eigenvalue
Algorithm
12
Idea: distributed spectral slicing
Compute eigensolutions in distributed subintervals
Example: Proc1
- Assigned spectrum (σ_0, σ_2): shrinks
- Computed spectrum {σ_1}: expands
[Figure: spectrum from λ_min (index i_min) to λ_max (index i_max), with shifts σ_0, σ_1, σ_2 for Proc0, Proc1, Proc2]
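
One way to picture the starting configuration is the sketch below, which gives each process one shift and assigns it the interval between its neighbors' shifts, so that the middle process gets the interval between the first and last shifts, as in the Proc1 example. Placing the initial shifts at the midpoints of equal subintervals is an assumption made for illustration; the slides do not say how the initial shifts are chosen, and the function name is hypothetical.

```python
def initial_slices(lam_min, lam_max, nproc):
    """Return (shifts, assigned) for nproc processes on [lam_min, lam_max].

    Assumption: initial shifts sit at the midpoints of nproc equal
    subintervals. Process p is assigned the interval between its
    neighbors' shifts, clipped to [lam_min, lam_max] at the two ends.
    """
    width = (lam_max - lam_min) / nproc
    shifts = [lam_min + (p + 0.5) * width for p in range(nproc)]
    assigned = []
    for p in range(nproc):
        left = shifts[p - 1] if p > 0 else lam_min
        right = shifts[p + 1] if p < nproc - 1 else lam_max
        assigned.append((left, right))
    return shifts, assigned


shifts, assigned = initial_slices(0.0, 6.0, 3)
print(shifts)    # [1.0, 3.0, 5.0]
print(assigned)  # [(0.0, 3.0), (1.0, 5.0), (3.0, 6.0)]
```

The assigned intervals deliberately overlap at the start; each one shrinks as neighboring processes report what they have already computed.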
13
Software Structure

Shift-and-Invert Parallel Spectral Transforms
(SIPs)
Select shifts
Bookkeep and validate eigensolutions
Balance parallel jobs
Ensure global orthogonality of eigenvectors
Subgroup of communicators

ARPACK
SLEPc
PETSc
MUMPS
MPI
14
Software Structure

ARPACK
www.caam.rice.edu/software/ARPACK/
SLEPc
Scalable Library for Eigenvalue Problem
Computations
www.grycap.upv.es/slepc/
MUMPS
MUltifrontal Massively Parallel sparse direct
Solver
www.enseeiht.fr/lima/apo/MUMPS/
PETSc
Portable, Extensible Toolkit for Scientific
Computation
www.mcs.anl.gov/petsc/
MPI
Message Passing Interface
www.mcs.anl.gov/mpi/

15
Select shifts

- robustness
be able to compute all the desired eigenpairs
under extreme pathological conditions
- efficiency
reduce the total computation cost
(matrix factorization and Lanczos runs)

16
Select shifts
[Figure: number line marking σ_i, λ_1, λ_k, σ~, σ_mid, σ_{i+1}, λ_max]

e.g., extension to the right side of σ_i:
σ~ = λ_k + 0.45 (λ_k - λ_1)
σ_mid = (σ_i + λ_max)/2
σ_{i+1} = min(σ~, σ_mid)
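
The right-side extension rule on this slide can be sketched as a small function. It reads the slide's formulas as: candidate shift = lam_k + 0.45 (lam_k - lam_1), midpoint = (sigma_i + lam_max)/2, next shift = the smaller of the two. Taking lam_1 and lam_k to be the smallest and largest eigenvalues already computed to the right of sigma_i is an assumption, and the function and argument names are hypothetical.

```python
def next_shift_right(computed, sigma_i, lam_max, factor=0.45):
    """Pick the next shift to the right of the current shift sigma_i.

    computed: eigenvalues already found to the right of sigma_i (non-empty).
    lam_max:  right end of the interval still to be covered.
    """
    lam_1, lam_k = min(computed), max(computed)
    sigma_ext = lam_k + factor * (lam_k - lam_1)  # extend past the computed range
    sigma_mid = (sigma_i + lam_max) / 2.0         # midpoint of the remaining interval
    return min(sigma_ext, sigma_mid)


# Hypothetical values: the extension (about 2.45) is well below the midpoint (5.45).
print(next_shift_right([1.0, 1.2, 1.4, 2.0], sigma_i=0.9, lam_max=10.0))
# Near the end of the interval the midpoint (about 1.95) caps the step instead.
print(next_shift_right([1.0, 1.2, 1.4, 2.0], sigma_i=0.9, lam_max=3.0))
```

The extension term scales the step with the width of the range the previous shift managed to cover, while the midpoint cap keeps the new shift from overshooting the interval that remains.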

17
Eigenvalue clusters and gaps

Gap detection
Move shift
outside of a gap

18
Bookkeep eigensolutions
[Figure: intervals around shifts σ_0 (proc0) and σ_1 (proc1), marked DONE, COMPUT, and UNCOMPUT; eigenvalues in the overlap between the two processes are matched]

Multiple eigenvalues across processors
19
Bookkeep eigensolutions
20
Balance parallel jobs
21
SIPs
[Figure: spectrum from λ_min (index i_min) to λ_max (index i_max), with shifts σ_0, σ_1, σ_2 for Proc0, Proc1, Proc2]
22
d) pick next shift σ; update computed spectrum λ_min, λ_max and send to neighboring processes
e) receive messages from neighbors; update its assigned spectrum (λ_min, λ_max)
[Figure: Proc0, Proc1, Proc2 over the spectrum from λ_min to λ_max, with shifts σ0, σ01, σ1, σ11, σ2]
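
Step e) can be pictured with the sketch below, in which a process narrows its assigned interval whenever a neighbor reports the end of its computed spectrum. The message contents and the clipping rule are assumptions made for illustration; the slides only state that the assigned spectrum is updated from neighbor messages, and all names here are hypothetical.

```python
def update_assigned(assigned, left_computed_max=None, right_computed_min=None):
    """Shrink an assigned interval using neighbors' computed spectra.

    assigned: (lo, hi) currently assigned to this process.
    left_computed_max:  right end of the left neighbor's computed spectrum.
    right_computed_min: left end of the right neighbor's computed spectrum.
    A value of None means no message arrived from that neighbor.
    """
    lo, hi = assigned
    if left_computed_max is not None:
        lo = max(lo, left_computed_max)   # never grow the interval
    if right_computed_min is not None:
        hi = min(hi, right_computed_min)
    return (lo, hi)


# Hypothetical messages: both neighbors have covered part of (1.0, 5.0).
print(update_assigned((1.0, 5.0), 1.8, 4.1))  # (1.8, 4.1)
```

Using max and min makes the update monotone, so a stale or out-of-order message cannot enlarge an interval that has already been narrowed.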
23
Accuracy of the Eigensolutions

Residual norm of all computed eigenvalues is inherited from ARPACK
Orthogonality of the eigenvectors computed from the same shift is inherited from ARPACK
Orthogonality between the eigenvectors computed from different shifts?
- Each eigenvalue singleton is computed through a single shift
- Eigenvalue separation between two singletons => satisfying eigenvector orthogonality
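
The two accuracy measures named on this slide, the residual norm of A x - lam B x and the B-orthogonality x1^T B x2 between eigenvectors of distinct eigenvalues, can be checked directly once eigenpairs are in hand. The sketch below does so on a 2 x 2 toy problem with hand-picked eigenpairs; the matrices are hypothetical, and this is not a description of how SIPs or ARPACK perform the check.

```python
import math


def matvec(M, x):
    """Dense matrix-vector product for small list-of-lists matrices."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def b_inner(x, y, B):
    """B-inner product x^T B y."""
    return sum(x_i * w_i for x_i, w_i in zip(x, matvec(B, y)))


def residual_norm(A, B, lam, x):
    """2-norm of the residual A x - lam B x."""
    Ax, Bx = matvec(A, x), matvec(B, x)
    return math.sqrt(sum((a - lam * b) ** 2 for a, b in zip(Ax, Bx)))


# Toy 2 x 2 problem (hypothetical data) with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
s = 1.0 / math.sqrt(2.0)
pairs = [(1.0, [s, -s]), (3.0, [s, s])]

for lam, x in pairs:
    print("residual norm:", residual_norm(A, B, lam, x))
print("cross term x1^T B x2:", b_inner(pairs[0][1], pairs[1][1], B))
```

For a symmetric pencil, eigenvectors belonging to distinct eigenvalues are B-orthogonal in exact arithmetic, so a cross term far from zero between vectors obtained from different shifts signals either an inaccurate eigenpair or eigenvalues too close to separate.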

24
Subgroups of communicators

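when a single process cannot store the matrix factor
or the distributed eigenvectors

Process grid (ids 0 to 11); the second column is annotated with its subgroup ids:
0    3 (idEps 1, idMat 0)    6    9
1    4 (idEps 1, idMat 1)    7   10
2    5 (idEps 1, idMat 2)    8   11
[Figure labels: commEps, commMat; spectrum axis from λ_min to λ_max]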
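
The annotated ids on this slide (3, 4, 5 mapping to idEps 1 with idMat 0, 1, 2) are consistent with a simple quotient/remainder rule for a subgroup size of 3. The sketch below spells that rule out in plain Python. It is an inference from the three annotated ids, not a statement of how SIPs builds its communicators; in an MPI code the same grouping would typically be created with MPI_Comm_split, using these ids as the color and key. Which group is named commMat and which commEps is also an assumed reading of the figure labels.

```python
def subgroup_ids(rank, np_mat):
    """Map a global rank to (id_eps, id_mat).

    Assumption: np_mat consecutive ranks share one matrix subgroup,
    as the annotated ranks 3, 4, 5 suggest.
    """
    id_eps, id_mat = divmod(rank, np_mat)
    return id_eps, id_mat


def comm_mat_members(rank, np_mat):
    """Ranks that share this rank's matrix subgroup (same id_eps)."""
    start = (rank // np_mat) * np_mat
    return list(range(start, start + np_mat))


def comm_eps_members(rank, np_mat, nproc):
    """Ranks with the same id_mat across all matrix subgroups."""
    return list(range(rank % np_mat, nproc, np_mat))


for rank in (3, 4, 5):
    print(rank, subgroup_ids(rank, np_mat=3))
print(comm_mat_members(4, 3))      # [3, 4, 5]
print(comm_eps_members(4, 3, 12))  # [1, 4, 7, 10]
```

With this layout, the processes in one matrix subgroup jointly hold a factorization that does not fit on a single process, while different subgroups work on different parts of the spectrum.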
25
Numerical Experiments on Jazz

Jazz, Argonne National Laboratory
Compute: 350 nodes, each with a 2.4 GHz Pentium Xeon
Memory: 175 nodes with 2 GB of RAM, 175 nodes with 1 GB of RAM
Storage: 20 TB of clusterwide disk (10 TB GFS and 10 TB PVFS)
Network: Myrinet 2000, Ethernet

26
Tests

Diamond (a diamond crystal)
Grainboundary-s13, Grainboundary-s29,
Graphene,
MixedSi, MixedSiO2,
Nanotube2 (a single-wall carbon nanotube)
Nanowire9
Nanowire25 (a diamond nanowire)

27
Numerical results: Nanotube2 (a single-wall carbon nanotube)
Non-zero density of matrix factor: 7.6%, N = 16k
28
Numerical results: Nanotube2 (a single-wall carbon nanotube)
[Plots: Myrinet and Ethernet]
29
Numerical results: Nanowire25 (a diamond nanowire)
Non-zero density of matrix factor: 15%, N = 16k
30
Numerical results: Nanowire25 (a diamond nanowire)
[Plots: Myrinet and Ethernet]
31
Numerical results: Diamond (a diamond crystal)
Non-zero density of matrix factor: 51%, N = 16k
32
Numerical results: Diamond (a diamond crystal)
[Plots: Myrinet and Ethernet, npMat = 4]
33
Summary