1. DOE HENP SciDAC Project: Electromagnetic Systems Simulation-1 (ESS)
DOE HENP SciDAC Project: Advanced Computing for 21st Century Accelerator Science and Technology
Cho Ng, Advanced Computations Department, Stanford Linear Accelerator Center
SciDAC Meeting, 8/11-8/12, 2004, at Fermilab
Work supported by the U.S. DOE ASCR and HENP Divisions under contract DE-AC03-76SF00515
2. Outline of Presentation
- Overview
- Parallel Code Development
- Accelerator Modeling
- ISICs and SAPP Collaborations
- (Rich Lee's presentation)
3. SciDAC ESS Team
Advanced Computations Department
- Accelerator Modeling: V. Ivanov, A. Kabel, K. Ko, Z. Li, C. Ng
- Computing Technologies: N. Folwell, A. Guetz, G. Schussman, R. Uplenchwar
- Computational Mathematics: R. Lee, L. Ge, L. Xiao, M. Kowalski, S. Chen (Stanford)
ISICs Collaborators and SAPP Partners
- LBNL: E. Ng, W. Guo, P. Husbands, X. Li, A. Pinar, C. Yang, Z. Bai
- LLNL: L. Diachin, K. Chand, B. Henshaw, D. White
- SNL: P. Knupp, T. Tautges, K. Devine, E. Boman
- Stanford: G. Golub
- UCD: K. Ma, H. Yu, E. Lum
- RPI: M. Shephard, E. Seol, A. Bauer
- Columbia: D. Keyes
- Carnegie Mellon U: O. Ghattas, V. Akcelik
- U. of Wisconsin: H. Kim
4. Parallel Unstructured EM Codes
5. Accelerator Modeling
The SciDAC tools are being used to improve existing machines and to optimize future facilities across the Office of Science:
- PEP-II IR (HEP)
- NLC Structures (HEP)
- RIA RFQ (NP)
- PSI Ring Cyclotron
- LCLS RF Gun (BES)
6. Outline of Presentation
- Overview
- Parallel Code Development
- Accelerator Modeling
- ISICs and SAPP Collaborations
- (Rich Lee's presentation)
7. Omega3P Progress and Plans
Progress includes:
- Complex solver to model lossy cavities
- Linear solver framework to facilitate direct methods
- Hierarchical preconditioner (up to 6th-order bases)
- New eigensolvers (SAPP/TOPS)
- AMR to accelerate convergence (TSTT)
- Shape optimization (TOPS/TSTT)
Next steps:
- Waveguide boundary conditions leading to a quadratic (nonlinear) eigenvalue problem (SAPP/TOPS); a sketch follows this list
- Adaptive p refinement combined with AMR
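As a point of reference, a minimal sketch of the form such a problem takes. The notation is an assumption rather than a quotation from the slides: K is the curl-curl stiffness matrix, M the mass matrix, W a boundary coupling matrix contributed by the open waveguide ports, k the wavenumber, and x the field degrees of freedom.

\[
  % Hedged sketch: open waveguide ports add a k-dependent boundary term, so
  % the discretized problem is quadratic in k rather than linear (K x = k^2 M x):
  \left( K + i\,k\,W - k^{2} M \right) x = 0 .
\]

Such a quadratic eigenvalue problem is usually attacked either by linearizing it into a linear eigenproblem of twice the size or by a dedicated nonlinear eigensolver, which is where the SAPP/TOPS collaboration comes in.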
8. Omega3P Solving Large Problems
SLAC, Stanford, and LBNL are developing new algorithms to improve the accuracy, convergence, and scalability of Omega3P/S3P for increasingly large problem sizes. (The underlying eigenproblem is sketched after the figure.)
Advances:
- Largest problem solved is 93 million DOFs
- Use of direct solvers provides 50-100x faster solution times
- Higher-order hierarchical bases (up to p = 6) and preconditioners increase accuracy and convergence
- AV FEM formulation accelerates CG convergence by 5x to 10x
[Figure; axis label: Degrees of Freedom]
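For orientation, a minimal sketch, in standard notation assumed here rather than taken from the slides, of the closed-cavity eigenproblem that Omega3P discretizes:

\[
  % Source-free Maxwell eigenproblem for a closed, lossless cavity Omega
  % with perfectly conducting walls:
  \nabla \times \left( \tfrac{1}{\mu_r}\, \nabla \times \mathbf{E} \right)
    = \frac{\omega^{2}}{c^{2}}\, \epsilon_r\, \mathbf{E}
    \quad \text{in } \Omega ,
  \qquad
  \mathbf{n} \times \mathbf{E} = 0 \ \text{on the wall},
\]
\[
  % Finite-element discretization turns this into a generalized eigenproblem,
  % with K the curl-curl stiffness matrix and M the mass matrix:
  K\, x = \frac{\omega^{2}}{c^{2}}\, M\, x .
\]

The 93 million DOFs quoted above is the dimension of the vector x; the direct solvers, hierarchical bases, and AV formulation listed under Advances all target this system.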
9. T3P Progress and Plans
Progress includes:
- Unconditionally stable time-stepping scheme (one such scheme is sketched after this list)
- Mesh-independent particle trajectories
- High-order spatial discretizations on tetrahedral elements
Next steps:
- Waveguide boundary conditions
- Model entire PEP-II IR beam line complex
- Runtime parallel performance tuning
- Particle-in-cell capability
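The slides do not name the time-stepping scheme, so the following is only an illustrative assumption: a standard unconditionally stable choice for the semi-discrete second-order system M x'' + K x = f is the implicit Newmark-beta update.

\[
  % Newmark-beta update for M \ddot{x} + K x = f, with a^n = M^{-1}(f^n - K x^n):
  x^{n+1} = x^{n} + \Delta t\, v^{n}
          + \Delta t^{2} \left[ \left( \tfrac{1}{2} - \beta \right) a^{n} + \beta\, a^{n+1} \right],
  \qquad
  v^{n+1} = v^{n} + \Delta t \left[ (1 - \gamma)\, a^{n} + \gamma\, a^{n+1} \right].
\]

The scheme is unconditionally stable for 2\beta \ge \gamma \ge 1/2 (for example \beta = 1/4, \gamma = 1/2), so the time step is chosen for accuracy rather than being limited by a CFL condition on the smallest mesh element.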
10. T3P Null Space Suppression
- Standard formulations excite modes in the null space of the curl-curl operator.
- T3P, in contrast, correctly models the null space of the curl-curl operator, removing the need for a Poisson-type correction to the static electric field. (The null space itself is recalled after the figure.)
[Figure: field comparison, Standard Formulation vs. T3P]
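As background, a standard vector-calculus fact rather than anything specific to T3P: the null space in question consists of gradient fields.

\[
  % The curl of a gradient vanishes identically, so every gradient field
  % lies in the null space of the curl-curl operator:
  \nabla \times \left( \tfrac{1}{\mu_r}\, \nabla \times (\nabla \phi) \right) = 0
  \qquad \text{for any scalar field } \phi .
\]

A time-domain scheme that does not control these gradient components can accumulate a spurious quasi-static electric field, which is what the Poisson-type correction in standard formulations is meant to remove.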
11. Tau3P Progress and Plans
Progress includes:
- Curved beam paths for application to the PEP-II IR
- Dielectric and lossy materials
- Checkpoint and restart capabilities
- Parallel performance improvement through partitioning schemes in Zoltan (SAPP)
Next steps:
- Further advances in parallel performance through Zoltan
- Apply curved beam paths to model the entire IR complex
12. Tau3P Curved Beam Paths
- Curved beam paths are needed to accurately model beam transits in the PEP-II IR, which consists of two beam lines with a finite crossing angle at the Interaction Point.
[Figures: snapshots of beam transits; curved section of beam line]
13. Outline of Presentation
- Overview
- Parallel Code Development
- Accelerator Modeling
- ISICs and SAPP Collaborations
- (Rich Lee's presentation)
14. NLC DDS Wakefields from Tau3P
H60VG3 55-cell DDS with power and HOM couplers; Tau3P simulation with beam
- 1st wakefield analysis of an actual DDS prototype
- 1st direct verification of DDS wakefield suppression by end-to-end simulation
15. NLC DDS Wakefields from Omega3P
16. NLC DDS Wakefields Comparison
[Figures: mode spectrum and wakefields behind the leading bunch, Omega3P vs. Tau3P]
17. NLC Dark Current from Track3P
Track3P is used to model the dark current in the X-Band 30-cell constant-impedance structure for comparison with experiment.
Dark current at 3 pulse risetimes: 10 nsec, 15 nsec, 20 nsec
[Figures: Track3P simulation and measured data]
18. PEP-II IR Heating with Tau3P/Omega3P
Tau3P/Omega3P are being used to study beam heating in the Interaction Region and the absorber design for damping trapped modes. (The role of the Q values is sketched after the figure.)
[Figure: wall-loss Q, absorber, and damped Q]
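As background, the standard relation between a mode's quality factor and the power it dissipates; this is textbook RF material rather than something stated on the slides, with U the stored energy and omega the mode's angular frequency.

\[
  % Quality factor relates stored energy to dissipated power; independent
  % loss channels (walls, absorber) add reciprocally:
  P = \frac{\omega\, U}{Q},
  \qquad
  \frac{1}{Q_{\mathrm{damped}}} = \frac{1}{Q_{\mathrm{wall}}} + \frac{1}{Q_{\mathrm{absorber}}} .
\]

The fraction of a trapped mode's power that ends up in the walls is then Q_damped / Q_wall, so driving the damped Q well below the wall-loss Q routes most of the beam-induced heating into the absorber, which is designed to take it.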
19. LCLS Coupler Design with S3P
Dual-feed racetrack coupler design modeled with S3P
- Dual feed to remove dipole fields
- Racetrack cavity shape to minimize quadrupole fields
- Rounded coupling iris to reduce RF heating
[Figures: iris, racetrack shape, and dual feed; S3P model of the S-band structure]
20. LCLS RF Gun Design with Omega3P/S3P
- Rounded iris to lower pulse heating
- Racetrack dual-feed coupler design to minimize dipole and quadrupole fields
[Figure: 1.6-cell Cubit mesh model]
21. RIA RFQ with Omega3P
- RIA will consist of many RFQs in its low-frequency linacs
- Tuners are needed to cover cavity frequency deviations of 1 (see the df/f comparison below), for lack of better predictions
- Omega3P improves accuracy by an order of magnitude
- This can significantly reduce the number of tuners and their tuning range, helping to lower the linac cost
df/f: MWS 1.5, Omega3P -0.2
[Figures: Omega3P solid model, mesh, and wall loss]
22. RIA Hybrid RFQ with Omega3P
The Hybrid RFQ has disparate spatial scales that are difficult to model, requiring high mesh adaptivity and higher-order elements to obtain fast convergence, that is, h-p refinement. (The expected convergence rate is sketched after the figure.)
[Plot: frequency vs. 4th power of grid size]
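As background, a standard a priori finite-element estimate, assumed here rather than quoted from the slides: for order-p elements on a sufficiently smooth problem, the computed eigenfrequency converges as

\[
  % Eigenvalue (and hence frequency) error scales as h^{2p} for
  % order-p elements and mesh size h:
  \frac{\left| f_h - f \right|}{f} = O\!\left( h^{2p} \right),
\]

so plotting the frequency against the 4th power of the grid size, as on this slide, corresponds to quadratic (p = 2) elements: the computed points should lie on a straight line whose intercept at zero grid size is the converged frequency.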
23. PSI Ring Cyclotron with Omega3P
(Lukas Stingelin, PhD work, PSI)
First-ever mode analysis of an entire ring cyclotron, possible only with parallel computing and unstructured grids
Omega3P model: 1.2 M elements, 6.9 M DOFs
[Figures: layout and top view]
24. PSI Ring Cyclotron with Omega3P
(Lukas Stingelin, PhD work, PSI)
- Omega3P finds the tightly clustered modes in 45 min for 20 modes on 32 CPUs (IBM SP4), using about 120 GB
- 280 eigenmodes are calculated with eigenfrequencies close to beam harmonics
[Figure: modes classified as cavity, vacuum-chamber, and mixed modes]
25. Outline of Presentation
- Overview
- Parallel Code Development
- Accelerator Modeling
- ISICs and SAPP Collaborations
- (Rich Lee's presentation)