Eigenvalue Problems in Nanoscale Materials Modeling - PowerPoint PPT Presentation

Slides: 35
Provided by: hzh87
Learn more at: https://www.cse.psu.edu
Transcript and Presenter's Notes

Title: Eigenvalue Problems in Nanoscale Materials Modeling


1
Eigenvalue Problems in Nanoscale Materials
Modeling
  • Hong Zhang
  • Computer Science, Illinois Institute of
    Technology
  • Mathematics and Computer Science, Argonne
    National Laboratory

2
Collaborators
  • Barry Smith
  • Mathematics and Computer Science, Argonne
    National Laboratory
  • Michael Sternberg, Peter Zapol
  • Materials Science, Argonne National Laboratory

3
Modeling of Nanostructured Materials
[Figure: axes labeled System size and Accuracy]

4
Density-Functional based Tight-Binding (DFTB)
5
Matrices are
  • large: ultimate goal
  • 50,000 atoms with electronic structure
  • N = 200,000
  • sparse
  • non-zero density -> 0 as N increases
  • dense solutions are requested
  • 60% eigenvalues and eigenvectors
  • Dense solutions of large sparse problems!
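
A rough storage estimate illustrates why "dense solutions of large sparse problems" is hard. The sketch below assumes 8-byte double-precision entries (not stated on the slide) and reads the slide's "60" as 60% of the eigenpairs; the function names are illustrative only.

```python
# Storage estimate for the target problem size.
# Assumption: every matrix or vector entry is an 8-byte double.
BYTES_PER_DOUBLE = 8


def dense_matrix_bytes(n):
    """Bytes needed to store one dense n-by-n matrix."""
    return n * n * BYTES_PER_DOUBLE


def eigenvector_bytes(n, percent_requested):
    """Bytes needed to store the requested eigenvectors, each of length n."""
    n_vectors = n * percent_requested // 100
    return n_vectors * n * BYTES_PER_DOUBLE


N = 200_000
print(f"one dense N x N matrix: {dense_matrix_bytes(N) / 1e9:.0f} GB")
print(f"60% of the eigenvectors: {eigenvector_bytes(N, 60) / 1e9:.0f} GB")
```

Even if A and B themselves are stored sparsely, the requested eigenvectors alone form a dense block of comparable size to a full matrix, so they have to be distributed across processes.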

6
DFTB implementation (2002)
7
Two classes of methods
  • Direct methods (dense matrix storage)
  • - compute all or almost all eigensolutions out
    of dense matrices of small to medium size
  • - Tridiagonal reduction + QR or bisection
  • - Time O(N^3), Memory O(N^2)
  • - LAPACK, ScaLAPACK
  • Iterative methods (sparse matrix storage)
  • - compute a selected small set of eigensolutions
    out of sparse matrices of large size
  • - Lanczos
  • - Time O(nnz·N) < O(N^3), Memory O(nnz) <
    O(N^2)
  • - ARPACK, BLZPACK, ...
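
The two classes can be compared on a toy problem. This is a minimal sketch using SciPy, whose `scipy.linalg.eigh` calls LAPACK and whose `scipy.sparse.linalg.eigsh` calls ARPACK; the tridiagonal test matrix is hypothetical and far smaller than the DFTB matrices discussed here.

```python
import numpy as np
import scipy.linalg
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 200
# Hypothetical sparse symmetric test matrix (tridiagonal).
main = np.arange(1.0, n + 1.0)
off = 0.1 * np.ones(n - 1)
A = sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")

# Direct method: dense storage, all n eigenvalues (LAPACK).
all_vals = scipy.linalg.eigh(A.toarray(), eigvals_only=True)

# Iterative method: sparse storage, only the k largest eigenvalues (ARPACK).
k = 4
top_vals = eigsh(A, k=k, which="LA", return_eigenvectors=False)

print("dense, largest k:    ", all_vals[-k:])
print("iterative, largest k:", np.sort(top_vals))
```

The iterative call never forms the dense matrix, which is what makes the O(nnz) memory bound possible.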

8
DFTB-eigenvalue problem is distinguished by
  • (A, B) is large and sparse
  • -> Iterative method
  • A large number of eigensolutions (60%) are
    requested
  • -> Iterative method + multiple shift-and-invert
  • The spectrum has
  • - poor average eigenvalue separation O(1/N),
  • - clusters with hundreds of tightly packed
    eigenvalues
  • - gaps >> O(1/N)
  • -> Iterative method + multiple shift-and-invert
    + robustness
  • The matrix factorization of (A - σB) = LDL^T:
  • not-very-sparse (7%) < nonzero density <
    dense (50%)
  • -> Iterative method + multiple shift-and-invert
    + robustness + efficiency
  • Ax = λBx is solved many times (possibly 1000s)
  • -> Iterative method + multiple shift-and-invert
    + robustness + efficiency + initial
    approximation of eigensolutions

9
Lanczos shift-and-invert method for Ax = λBx
  • Cost
  • - one matrix factorization
  • - many triangular matrix solves
  • Gain
  • - fast convergence
  • - clustered eigenvalues are transformed to
    well-separated eigenvalues
  • - preferred in most practical cases
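
The cost and gain listed above can be tried with SciPy's ARPACK wrapper, which supports a shift-and-invert mode through the `sigma` argument of `eigsh`. This is only a small sketch on a hypothetical tridiagonal pencil (A, B), not the SIPs implementation: SciPy factors A - σB once, then each Lanczos step uses solves with that factor.

```python
import numpy as np
import scipy.linalg
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 200
ones = np.ones(n - 1)
# Hypothetical test pencil: A symmetric, B symmetric positive definite.
A = sp.diags([-ones, 2.0 * np.ones(n), -ones], [-1, 0, 1], format="csc")
B = sp.diags([ones / 6.0, 4.0 * np.ones(n) / 6.0, ones / 6.0],
             [-1, 0, 1], format="csc")

sigma = 1.0  # shift placed inside the spectrum
k = 6        # number of eigenvalues wanted near the shift

# Shift-and-invert: with sigma set, which="LM" selects the eigenvalues
# whose transformed values 1/(lambda - sigma) are largest in magnitude,
# i.e. the eigenvalues of (A, B) closest to sigma.
vals = eigsh(A, k=k, M=B, sigma=sigma, which="LM", return_eigenvectors=False)

# Dense reference solution, used here only to check the small example.
dense = scipy.linalg.eigh(A.toarray(), B.toarray(), eigvals_only=True)
nearest = dense[np.argsort(np.abs(dense - sigma))[:k]]

print("near sigma (shift-and-invert):", np.sort(vals))
print("near sigma (dense reference): ", np.sort(nearest))
print("transformed values 1/(lambda - sigma):", 1.0 / (np.sort(vals) - sigma))
```

Eigenvalues that sit close together near σ map to transformed values that are large in magnitude and far apart, which is the source of the fast convergence.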

10
Multiple Shift-and-Invert Parallel Eigenvalue
Algorithm
11
Multiple Shift-and-Invert Parallel Eigenvalue
Algorithm
12
Idea: distributed spectral slicing
compute eigensolutions in distributed subintervals
Example: Proc1
  • Assigned spectrum (σ0, σ2): shrink
  • Computed spectrum σ1: expand
[Figure: Proc0, Proc1, Proc2 along the spectrum; labels λmin, imin, λmax, imax, σ0, σ1, σ2]
13
Software Structure
  • Shift-and-Invert Parallel Spectral Transforms
    (SIPs)
  • Select shifts
  • Bookkeep and validate eigensolutions
  • Balance parallel jobs
  • Ensure global orthogonality of eigenvectors
  • Subgroup of communicators

[Figure: software stack with layers ARPACK, SLEPc, PETSc, MUMPS, MPI]
14
Software Structure
  • ARPACK
  • www.caam.rice.edu/software/ARPACK/
  • SLEPc
  • Scalable Library for Eigenvalue Problem
    Computations
  • www.grycap.upv.es/slepc/
  • MUMPS
  • MUltifrontal Massively Parallel sparse direct
    Solver
  • www.enseeiht.fr/lima/apo/MUMPS/
  • PETSc
  • Portable, Extensible Toolkit for Scientific
    Computation
  • www.mcs.anl.gov/petsc/
  • MPI
  • Message Passing Interface
  • www.mcs.anl.gov/mpi/

15
Select shifts
  • - robustness
  • be able to compute all the desired eigenpairs
    under extreme pathological conditions
  • - efficiency
  • reduce the total computation cost
  • (matrix factorization and Lanczos runs)

16
Select shifts
[Figure: number line with labels σ*, σmid, λk, λ1, σi+1, σi, λmax]
  • e.g., extension to the right side of σi:
  • σ* = λk + 0.45 (λk - λ1)
  • σmid = (σi + λmax)/2
  • σi+1 = min(σ*, σmid)
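
The shift-extension rule can be written as a small function. The operators and Greek letters on this slide were lost in extraction, so the sketch assumes the reading: extended shift = λk + 0.45 (λk - λ1), midpoint = (σi + λmax)/2, next shift = the smaller of the two. It also assumes λ1 and λk are the smallest and largest eigenvalues already computed to the right of σi. The function and parameter names are hypothetical.

```python
def next_shift_right(sigma_i, computed, lam_max, extend=0.45):
    """Propose the next shift to the right of sigma_i.

    computed : sorted eigenvalues already found to the right of sigma_i
               (assumed non-empty)
    lam_max  : right end of the spectrum assigned to this process
    extend   : extension factor (0.45 on the slide)
    """
    lam_1, lam_k = computed[0], computed[-1]
    # Extend past the last computed eigenvalue by a fraction of the
    # width already covered.
    sigma_ext = lam_k + extend * (lam_k - lam_1)
    # Never jump past the midpoint between the current shift and lam_max.
    sigma_mid = (sigma_i + lam_max) / 2.0
    return min(sigma_ext, sigma_mid)


print(next_shift_right(0.0, [0.1, 0.2, 0.3, 0.5], lam_max=2.0))
print(next_shift_right(0.0, [0.1, 0.2, 0.3, 0.5], lam_max=1.0))
```

In the first call the extension rule decides the shift; in the second the midpoint cap takes over because the assigned interval is short.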

17
Eigenvalue clusters and gaps
  • Gap detection
  • Move shift outside of a gap
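
One possible way to detect gaps and relocate a shift is sketched below. The slide does not say how gaps are detected or to which side the shift is moved, so both the threshold (a multiple of the median spacing) and the choice of edge are assumptions; all names are hypothetical.

```python
from statistics import median


def find_gaps(eigs, factor=10.0):
    """Return (left, right) pairs of adjacent eigenvalues whose spacing
    exceeds `factor` times the median spacing.

    eigs must be sorted ascending and contain at least two values.
    """
    pairs = list(zip(eigs, eigs[1:]))
    typical = median(b - a for a, b in pairs)
    return [(a, b) for a, b in pairs if b - a > factor * typical]


def move_out_of_gap(shift, gaps, toward_right=True, margin=0.0):
    """If `shift` lies strictly inside a gap, move it just past one edge.

    toward_right : assumed sweep direction; selects which edge is used
    margin       : distance to step beyond the edge, so the shift does
                   not coincide with a known eigenvalue
    """
    for left, right in gaps:
        if left < shift < right:
            return right + margin if toward_right else left - margin
    return shift


eigs = [0.0, 0.1, 0.2, 0.3, 2.3, 2.4, 2.5]
gaps = find_gaps(eigs)
print(gaps)
print(move_out_of_gap(1.0, gaps, margin=0.01))
```

A shift left in the middle of a gap would spend a factorization on a region with no eigenvalues nearby, which is presumably why the shift is relocated.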

18
Bookkeep eigensolutions
[Figure: spectrum around shifts σ0 (proc0) and σ1 (proc1), with regions marked DONE, COMPUT, and UNCOMPUT, and an overlap match between the two processes]
  • Multiple eigenvalues across processors
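
The overlap match can be illustrated with a small merge routine. This is a guess at the bookkeeping, not the SIPs code: it assumes two neighboring shifts each return a sorted list of eigenvalues, checks that the lists agree where they overlap, and keeps a single copy. The name `merge_overlap` and the tolerance are hypothetical.

```python
def merge_overlap(left, right, tol=1e-8):
    """Merge sorted eigenvalue lists from two neighboring shifts.

    left  : eigenvalues computed around the lower shift (non-empty)
    right : eigenvalues computed around the upper shift (non-empty)
    Raises ValueError if the lists do not overlap or do not match there.
    """
    lo, hi = right[0], left[-1]  # overlap region is [lo, hi]
    if lo > hi + tol:
        raise ValueError("no overlap: eigenvalues may be missing in between")
    left_ov = [x for x in left if x >= lo - tol]
    right_ov = [x for x in right if x <= hi + tol]
    # Both sides must see the same eigenvalues, with the same
    # multiplicities, inside the overlap.
    if len(left_ov) != len(right_ov) or any(
        abs(a - b) > tol for a, b in zip(left_ov, right_ov)
    ):
        raise ValueError("overlap mismatch between neighboring shifts")
    # Keep the overlap copy from the left list only.
    return left + right[len(right_ov):]


proc0 = [0.10, 0.20, 0.30, 0.30, 0.41]
proc1 = [0.30, 0.30, 0.41, 0.55, 0.70]
print(merge_overlap(proc0, proc1))
```

A multiple eigenvalue at the edge of one list is the awkward case: if one process found both copies of 0.30 and its neighbor found only one, the counts disagree and the routine raises instead of silently dropping or duplicating an eigenvalue.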
19
Bookkeep eigensolutions
20
Balance parallel jobs
21
SIPs
[Figure: Proc0, Proc1, Proc2 along the spectrum; labels λmin, imin, λmax, imax, σ0, σ1, σ2]
22
d) pick next shift σ; update computed spectrum λmin, λmax and send to neighboring processes
e) receive messages from neighbors; update its assigned spectrum (λmin, λmax)
[Figure: Proc0, Proc1, Proc2 between λmin and λmax; shifts σ0, σ01, σ1, σ11, σ2]
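
Step e) can be pictured as interval bookkeeping. The sketch below assumes each neighbor reports the bound of the spectrum it has already computed and that a process simply trims its assigned interval to exclude that part; the message contents and the function name are assumptions, and no actual MPI communication is shown.

```python
def shrink_assigned(assigned, left_computed_max=None, right_computed_min=None):
    """Return the assigned interval after hearing from neighbors.

    assigned           : (lo, hi) interval this process is responsible for
    left_computed_max  : upper end of what the left neighbor has computed
    right_computed_min : lower end of what the right neighbor has computed
    A bound of None means no message arrived from that side.
    """
    lo, hi = assigned
    if left_computed_max is not None:
        lo = max(lo, left_computed_max)   # left neighbor already covers up to here
    if right_computed_min is not None:
        hi = min(hi, right_computed_min)  # right neighbor already covers from here
    return (lo, hi)


# Proc1 starts with the interval between its neighbors' first shifts
# and shrinks it as the neighbors expand their computed spectra.
print(shrink_assigned((0.0, 2.0), left_computed_max=0.4, right_computed_min=1.7))
```

Using max and min makes the update safe against stale or repeated messages: an old bound can never enlarge the assigned interval again.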
23
Accuracy of the Eigensolutions
  • Residual norm of all computed eigenvalues
    is inherited from ARPACK
  • Orthogonality of the eigenvectors computed from
    the same shift is inherited from ARPACK
  • Orthogonality between the eigenvectors computed
    from different shifts?
  • Each eigenvalue singleton is computed through a
    single shift
  • Eigenvalue separation between two singletons
  • -> satisfying eigenvector orthogonality
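
Both accuracy measures are easy to check once eigenpairs are available. The sketch below uses NumPy on a small hypothetical dense pencil and solves it through a Cholesky reduction rather than through ARPACK or SIPs; the helper names are illustrative. For Ax = λBx the natural orthogonality condition is X^T B X = I.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Hypothetical test pencil: A symmetric, B symmetric positive definite.
M = rng.standard_normal((n, n))
A = (M + M.T) / 2.0
C = rng.standard_normal((n, n))
B = C @ C.T + n * np.eye(n)

# Reduce A x = lambda B x to a standard problem using B = L L^T.
L = np.linalg.cholesky(B)
S = np.linalg.solve(L, np.linalg.solve(L, A).T).T  # S = L^{-1} A L^{-T}
S = (S + S.T) / 2.0                                # remove rounding asymmetry
lam, Y = np.linalg.eigh(S)
X = np.linalg.solve(L.T, Y)                        # x = L^{-T} y


def residual_norms(A, B, lam, X):
    """2-norm of A x_j - lam_j B x_j for every column x_j of X."""
    R = A @ X - (B @ X) * lam
    return np.linalg.norm(R, axis=0)


def orthogonality_error(B, X):
    """Largest entry of |X^T B X - I|."""
    G = X.T @ B @ X
    return np.max(np.abs(G - np.eye(X.shape[1])))


print("max residual norm:      ", residual_norms(A, B, lam, X).max())
print("B-orthogonality error:  ", orthogonality_error(B, X))
```

In a multiple-shift setting the same `orthogonality_error` check could be applied to the concatenated eigenvectors from different shifts, which is the case the slide raises as a question.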

24
Subgroups of communicators
  • when a single process cannot store matrix factor
    or distributed eigenvectors

Example with 12 processes (process ids shown in a grid):
   0   3   6   9
   1   4   7  10
   2   5   8  11
id 3: idEps 1, idMat 0; id 4: idEps 1, idMat 1; id 5: idEps 1, idMat 2
[Figure labels: commEps, commMat, λmax, λmin]
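
The id table on this slide is consistent with a simple integer mapping from the global process id. The sketch below assumes npMat = 3 processes share each matrix group and that idEps and idMat are the quotient and remainder of the id; this matches ids 3, 4, and 5 on the slide but is otherwise an inference, and the function name is hypothetical.

```python
def split_ids(rank, np_mat):
    """Map a global process id to (idEps, idMat).

    np_mat : number of processes that share one matrix factorization
    idEps  : index of the matrix group (which slice of the spectrum)
    idMat  : position of the process inside its matrix group
    """
    return rank // np_mat, rank % np_mat


nproc, np_mat = 12, 3
for rank in range(nproc):
    id_eps, id_mat = split_ids(rank, np_mat)
    print(f"id {rank:2d}: idEps {id_eps}, idMat {id_mat}")

# Processes sharing idEps form one matrix group (a commMat candidate).
group_1 = [r for r in range(nproc) if split_ids(r, np_mat)[0] == 1]
print("processes with idEps 1:", group_1)
```

In an MPI program such ids would typically serve as the color and key arguments when splitting the global communicator into subcommunicators.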
25
Numerical Experiments on Jazz
  • Jazz, Argonne National Laboratory
  • Compute
  • 350 nodes, each with a 2.4 GHz Pentium Xeon
  • Memory
  • 175 nodes with 2 GB of RAM,
  • 175 nodes with 1 GB of RAM
  • Storage
  • 20 TB of clusterwide disk
  • 10 TB GFS and 10 TB PVFS
  • Network
  • Myrinet 2000, Ethernet

26
Tests
  • Diamond (a diamond crystal)
  • Grainboundary-s13, Grainboundary-s29,
  • Graphene,
  • MixedSi, MixedSiO2,
  • Nanotube2 (a single-wall carbon nanotube)
  • Nanowire9
  • Nanowire25 (a diamond nanowire)

27
Numerical results: Nanotube2 (a single-wall
carbon nanotube)
Non-zero density of matrix factor: 7.6%, N = 16k
28
Numerical results: Nanotube2 (a single-wall
carbon nanotube)
[Figure panels: Myrinet, Ethernet]
29
Numerical results: Nanowire25 (a diamond
nanowire)
Non-zero density of matrix factor: 15%, N = 16k
30
Numerical results: Nanowire25 (a diamond
nanowire)
[Figure panels: Myrinet, Ethernet]
31
Numerical results: Diamond (a diamond crystal)
Non-zero density of matrix factor: 51%, N = 16k
32
Numerical results: Diamond (a diamond crystal)
[Figure panels: Myrinet, Ethernet; npMat = 4]
33
Summary
  • SIPs: a new multiple Shift-and-Invert Parallel
    eigensolver.
  • Competitive computational speed
  • - matrices with sparse factorization:
  • SIPs (O(N^2)) vs. ScaLAPACK (O(N^3))
  • - matrices with dense factorization:
  • SIPs outperforms ScaLAPACK on slower network
    (fast Ethernet) as the
  • number of processors increases
  • Efficient memory usage
  • SIPs solves much larger eigenvalue problems than
    ScaLAPACK,
  • e.g., nproc = 64: SIPs N > 64k, ScaLAPACK N = 19k
  • Object-oriented design
  • - developed on top of PETSc and SLEPc.
  • PETSc provides sequential and parallel data
    structure
  • SLEPc offers built-in support for eigensolver
    and spectral transformation.
  • - through the interfaces of PETSc and SLEPc,
    SIPs easily uses external

34
Challenges ahead
Matrix size: 6k, 32k, 64k <- we are here, 200k
  • Memory
  • Execution time
  • Numerical difficulties!!!
  • eigenvalue spectrum (-1.5, 0.5) = O(1)
  • -> huge eigenvalue clusters
  • -> large eigenspace with extremely
    sensitive vectors
  • Increase or mix arithmetic precision?
  • Eigenspace replaces individual eigenvectors?
  • Use previously computed eigenvectors as initial
    guess?
  • Adaptive residual tol?
  • New model?