Title: Department of Energy
1. Department of Energy Office of High Energy and Nuclear Physics
Computational Science: present and projected potentials
- Outline
- Very general overview of HENP
- Some project overviews
- Lattice QCD
- PDSF
- Nuclear Structure
- Accelerator Design
- Astrophysics
David J. Dean, ORNL
NERSC-NUG/NUGeX meeting, 22-23 February 2001
2. DOE Has Led the Nation in Developing Major Accelerator Facilities
(Figure: major DOE accelerator facilities, including APS, IPNS, NSLS, and SSRL.)
From Rob Ryne
3. Some of the Science of HENP
(Figure: facilities and physics areas -- SNS (neutrons and molecules), RIA, FNAL, SLAC, CEBAF, and RHIC; heavy nuclei and few-nucleon systems; quarks, gluons, and the vacuum; weak decays of mesons and nucleons, QCD, and the Standard Model; few-body systems with the free NN force; many-body systems with effective NN forces; the quark-gluon plasma and QCD.)
4. Lattice Quantum ChromoDynamics (LQCD)
- A comprehensive method to extract, with controlled systematic errors, first-principles predictions from QCD for a wide range of important particle phenomena.
- Scientific Motivations
- 1) Tests of the Standard Model
- Quark mixing matrix elements Vtd, Vts
- CP-violating K-meson decays
- 2) Quark and gluon distributions in hadrons
- 3) Phase transitions of QCD (in search of the quark-gluon plasma)
Concern I: Lattice spacing (x, y, z, t)
Concern II: Quenched approximation
5. Two (of many) LQCD Examples and People
Example I: QGP formation
Example II: Leptonic decay constants of B-mesons
NERSC involvement (PIs):
  370,000 (Toussaint)
  210,000 (Gupta)
  190,000 (Soni)
  187,500 (Sinclair)
  150,000 (Negele)
  100,000 (Liu)
   40,000 (Lee)
  -----------
  ~1.2 million MPP hours total
Unitarity triangle: better LQCD calculations constrain the physical parameters tremendously.
6. LQCD Computational Needs (from Doug Toussaint)
- The lattice is in 4 dimensions (3 space, 1 time).
- Reducing the lattice spacing to 1/sqrt(2) of current calculations implies 8X computer power (a rough accounting follows this list).
- Would cut systematic errors in half.
- Scientific gain: push to smaller quark masses and study more complicated phenomena such as flavor-singlet meson masses.
- What is important to this community?
- Sustained memory bandwidth and cache performance (present performance on the SP at SDSC: 170 Mflops/processor; 70 Mflops/processor on the big problem due to fewer cache hits).
- Node interconnect bandwidth and latency are very important; frequent global reductions (gsum).
- Tremendous potential here, though it may not be a NERSC issue.
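As a rough accounting of the 8X figure above (my own back-of-the-envelope reading, not spelled out on the slide): the cost of a dynamical-fermion calculation grows as an inverse power of the lattice spacing a, and taking the cost to scale like the inverse sixth power of a (four powers from the 4D volume plus roughly two more from the finer integration step and slower solver convergence) reproduces the quoted factor:

```latex
\frac{\mathrm{cost}(a/\sqrt{2})}{\mathrm{cost}(a)}
  \approx \left(\sqrt{2}\right)^{6} = 8 ,
\qquad
\left(\sqrt{2}\right)^{4} = 4 \ \text{from the 4D volume alone.}
```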
Given the straw machine (60 Tflops), the equation of state for high-temperature QCD using 3 dynamical flavors and a lattice spacing of 0.13 fm would be practical.
Main computational challenge: inversion of the fermion matrix (a sparse-matrix solve).
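To make the sparse-solve bottleneck concrete, here is a minimal conjugate-gradient sketch in C for a symmetric positive-definite matrix in compressed sparse row format. It illustrates the class of Krylov iteration involved; it is not production LQCD code (the actual fermion matrix is complex and non-Hermitian, so in practice one solves the normal equations or uses a variant such as BiCGstab).

```c
#include <stdlib.h>
#include <math.h>

/* Sparse matrix in compressed sparse row (CSR) format. */
typedef struct {
    int n;          /* matrix dimension           */
    int *rowptr;    /* length n+1                 */
    int *col;       /* column index of each entry */
    double *val;    /* value of each entry        */
} csr_t;

/* y = A x */
static void spmv(const csr_t *A, const double *x, double *y)
{
    for (int i = 0; i < A->n; i++) {
        double s = 0.0;
        for (int k = A->rowptr[i]; k < A->rowptr[i + 1]; k++)
            s += A->val[k] * x[A->col[k]];
        y[i] = s;
    }
}

static double dot(int n, const double *a, const double *b)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += a[i] * b[i];
    return s;
}

/* Solve A x = b for symmetric positive-definite A by conjugate gradients. */
void cg_solve(const csr_t *A, const double *b, double *x, double tol, int maxit)
{
    int n = A->n;
    double *r  = malloc(n * sizeof *r);   /* residual         */
    double *p  = malloc(n * sizeof *p);   /* search direction */
    double *Ap = malloc(n * sizeof *Ap);

    spmv(A, x, Ap);                        /* r = b - A x ; p = r */
    for (int i = 0; i < n; i++) { r[i] = b[i] - Ap[i]; p[i] = r[i]; }

    double rr = dot(n, r, r);
    for (int it = 0; it < maxit && sqrt(rr) > tol; it++) {
        spmv(A, p, Ap);                    /* dominant cost: sparse matvec     */
        double alpha = rr / dot(n, p, Ap); /* global reduction in parallel     */
        for (int i = 0; i < n; i++) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        double rr_new = dot(n, r, r);      /* another global reduction (gsum)  */
        double beta = rr_new / rr;
        for (int i = 0; i < n; i++) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    free(r); free(p); free(Ap);
}
```

The two dot products per iteration are exactly the frequent global reductions (gsum) flagged above as sensitive to interconnect latency, and the sparse matrix-vector product is where memory bandwidth and cache behavior dominate.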
7. Parallel Distributed Systems Facility (PDSF)
Evolving to: ALICE at the LHC; ICE CUBE in the Antarctic.
Today:
- BaBar (SLAC B-Factory): CP violation
- E871 (AGS): CP violation in hyperon decays
- CDF (Fermilab): proton-antiproton collider
- D0 (Fermilab)
- E895 (AGS): RHI
- E896 (AGS): RHI
- NA49 (CERN): RHI
- PHENIX: RHIC at Brookhaven
- GC5: data mining for the quark-gluon plasma
- STAR: RHIC at Brookhaven (85)
- AMANDA: Antarctic Muon and Neutrino Detector Array
- SNO (Sudbury): solar neutrinos
A theoretical point of view leads to the experimental. Example: one STAR event at RHIC.
- Computational Characteristics
- Processing independent event data is naturally parallel (see the sketch after this list).
- Large data sets.
- Distributed or global nature of the complete computing picture.
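Because the defining characteristic above is that independent events parallelize trivially, here is a hypothetical sketch in C with MPI of that pattern: each rank processes a disjoint share of events with no communication until a single final reduction. The event count and the process_event stub are placeholders of mine, not PDSF or STAR code.

```c
#include <mpi.h>
#include <stdio.h>

/* Placeholder for real reconstruction/analysis of one event. */
static double process_event(long event_id)
{
    /* ... read the event, reconstruct tracks, fill histograms ... */
    return (double)(event_id % 7);   /* dummy "result" */
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long n_events = 1000000;   /* hypothetical total event count */
    double local_sum = 0.0;

    /* Each rank takes a disjoint, round-robin slice of the events:
       no communication is needed while events are being processed. */
    for (long e = rank; e < n_events; e += size)
        local_sum += process_event(e);

    /* One reduction at the end to merge per-rank summaries. */
    double total = 0.0;
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("combined result over %ld events: %g\n", n_events, total);

    MPI_Finalize();
    return 0;
}
```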
8. Evolution of PDSF (from Doug Olson)
Planning Assumptions
- Current STAR activity continues (very certain).
- An upgrade to STAR increases the data rate by 3x around 2004.
- Another large experiment (e.g., ALICE or ICE CUBE) chooses PDSF as a major center, with usage comparable to STAR in 2005.
PDSF requirements by fiscal year (1 PDSF hour = 1 T3E hour):

                               FY01   FY02   FY03   FY04   FY05   FY06
PDSF hours needed (millions)   1.2    1.7    2.3    7.0    20     28
Disk storage capacity (TB)     16     32     45     134    375    527
Mass storage (TB)              16     32     45     134    376    527
Mass storage (millions files)  1      2      3      9      20     30
Throughput to NERSC (MB/s)     5      10     15     45     120    165
Other important factor: HENP experiments are moving towards data grid services; NERSC should plan to be a full-function site on the grid.
9. Computational Nuclear Structure
(Figure: the chart of nuclides, protons vs. neutrons, showing the magic numbers 2, 8, 20, 28, 50, 82, and 126, the limits of nuclear existence, and the r-process and rp-process paths; theory annotations include Density Functional Theory / self-consistent mean field, ab initio few-body calculations, the No-Core Shell Model, the G-matrix, and mass regions around A = 10, 12, and 60.)
Towards a unified description of the nucleus.
10. Nuclear Structure Examples: Quantum Monte Carlo
Physics of medium-mass nuclei: nuclear shell model with effective NN interactions; application to SN-Ia nucleosynthesis.
Start with a realistic NN potential fit to low-energy NN scattering data, plus a 3-body potential; nuclear-structure calculations are performed using GFMC techniques.
ANL/LANL/UIUC: NN + 3N interactions.
For A = 10, each state takes about 1.5 Tflop-hours.
11. Projected Needs for Nuclear Structure
Physics to be addressed using AFMC/NSM: nuclear structure of A = 60-100 nuclei; studies of weak interactions, thermal properties, and r-process nucleosynthesis.
Physics to be addressed using GFMC: 12C structure and 3-alpha burning; nuclear matter at finite temperature; asymmetric nuclear matter.
Projected requirements (K-MPP hours):

FY            02    03    04    05    06
NERSC only    200   300   450   600   800
Total         400   700   1000  1700  3000
Memory needs are very important: 1 Gbyte of memory/CPU by 2004.
Memory needs: 0.25 Gbyte of memory/CPU by 2004.
NERSC involvement (PIs):
  125,000 (Pieper)
   70,000 (Dean)
   60,000 (Carlson)
   60,000 (Alhassid)
  -----------
  ~0.32 million MPP hours total
Sustained memory bandwidth and/or cache performance is also very important; Pieper is seeing a drop in performance when more CPUs are clustered on a node.
Cache performance is important (many matrix-matrix multiplies; see the blocking sketch below).
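To illustrate why cache behavior dominates the matrix-matrix multiplies mentioned above, here is a generic cache-blocked multiply in C. It is a textbook tiling sketch, not code from any of the groups named; the tile size BS is an assumption to be tuned to the cache of the target node.

```c
#include <stddef.h>

#define BS 64   /* tile edge; tune so three BS x BS tiles fit in cache */

/* C += A * B for n x n row-major matrices, processed in BS x BS tiles
   so each tile of A, B, and C is reused while it is cache-resident. */
void matmul_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BS)
        for (size_t kk = 0; kk < n; kk += BS)
            for (size_t jj = 0; jj < n; jj += BS) {
                size_t imax = ii + BS < n ? ii + BS : n;
                size_t kmax = kk + BS < n ? kk + BS : n;
                size_t jmax = jj + BS < n ? jj + BS : n;
                for (size_t i = ii; i < imax; i++)
                    for (size_t k = kk; k < kmax; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < jmax; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```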
12. Next-generation machines will require extreme precision control and push the frontiers of beam energy, beam intensity, and system complexity (supplied by Rob Ryne)
- Physics issues
- highly three-dimensional
- nonlinear
- multi-scale
- many-body
- multi-physics
- Terascale simulation codes are being developed to meet the challenges: Omega3P, IMPACT, Tau3P
- Simulation requirements/issues
- require high resolution
- are enormous in size
- CPU-intensive
- highly complex
13. Challenges in Electromagnetic Systems Simulation. Example: NLC Accelerator Structure (RDDS) Design
- Start with a cylindrical cell geometry
- adjust the geometry for maximum efficiency
- add micron-scale variations from cell to cell to reduce wakefields
- stack into a multi-cell structure
- Add a damping manifold to suppress long-range wakefields and improve vacuum conductance, while preserving RDS performance. Highly 3D structure.
Requires 0.01 accuracy in the accelerating frequency to maintain structure efficiency (high-resolution modeling).
Verify wake suppression in the entire 206-cell section (system-scale simulation).
Parallel solvers are needed to model large, complex 3D electromagnetic structures to high accuracy.
14. Computer Science Issues
(Figure: PEP-II cavity model with mesh refinement; an accurate wall-loss calculation is needed to guide cooling-channel design.)
- Meshing.
- Mesh generation, refinement, quality.
- Complex 3-D geometries: structured and unstructured meshes, and eventually overset meshes.
- Partitioning (see the halo-exchange sketch after this list).
- Domain decomposition.
- Load balancing.
- Impact of memory hierarchy on efficiency.
- Cache, locally shared memory, remote memory.
- Visualization of large data sets.
- Performance, scalability, and tuning on terascale platforms.
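As a concrete, deliberately simplified picture of the partitioning and domain-decomposition items above, the sketch below splits a 1-D field across MPI ranks and exchanges one ghost cell with each neighbor. It is a generic illustration of my own, not taken from Omega3P, Tau3P, or IMPACT.

```c
#include <mpi.h>
#include <stdio.h>

#define NLOCAL 1000   /* interior cells owned by each rank (assumed size) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* u[0] and u[NLOCAL+1] are ghost cells filled from the neighbors. */
    double u[NLOCAL + 2];
    for (int i = 1; i <= NLOCAL; i++) u[i] = (double)rank;

    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Send my first interior cell left while receiving my right ghost,
       and vice versa; MPI_PROC_NULL makes the domain ends no-ops. */
    MPI_Sendrecv(&u[1],          1, MPI_DOUBLE, left,  0,
                 &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[NLOCAL],     1, MPI_DOUBLE, right, 1,
                 &u[0],          1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d ghosts: left=%g right=%g\n", rank, u[0], u[NLOCAL + 1]);
    MPI_Finalize();
    return 0;
}
```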
15. Challenges in Beam Systems Simulation
- Simulation size for 3D modeling of rf linacs:
- (128^3 to 512^3 grid points) x (20 particles/point) = roughly 40M to 2B particles (see the arithmetic after this list)
- 2D linac simulations with 1M particles require a weekend on a PC
- a 100M-particle PC simulation, if it could be performed, would take about 7 months
- new 3D codes already enable 100M-particle runs in 10 hours using 256 processors
- Intense beams in rings (PSR, AGS, SNS ring):
- 100 to 1000 times more challenging than linac simulations
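Spelling out the particle-count bullet above (the 40M-2B range is the slide's own rounding):

```latex
128^{3} \times 20 \approx 4.2 \times 10^{7} \ \text{particles},
\qquad
512^{3} \times 20 \approx 2.7 \times 10^{9} \ \text{particles}.
```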
NERSC involvement: 0.600 million MPP hours (mainly Rob Ryne).
16. Supernova Simulations
Type Ia supernova (from Nugent)
Core-collapse supernova:
Spherically symmetric simulations of the core collapse, including Boltzmann neutrino transport, fail to explode. This indicates the need to (a) improve the nuclear-physics inputs and (b) move to 2- and 3-dimensional simulations. Calculations are currently done on PVP platforms and are moving to MPP.
From Mezzacappa
17. Supernova Simulations: Computational Needs
Core Collapse Supernova project (hydrodynamics + Boltzmann neutrino transport)
Supernova Cosmology project
- Important computational issues
- Optimized floating-point performance; memory performance is very important.
- Infrequent 50-Mbyte messages (send/receive).
- Communications with global file systems.
- Crucial need for global storage (GPFS) with many I/O nodes.
- HPSS is very important for data storage.
NERSC-4 platform:

Year   3-D MGFLD models        2-D MGBT models
       node hrs    Memory      node hrs     Memory
1      520,000     62 GB       --           --
2      260,000     62 GB       260,000      25 GB
3      --          --          750,000      25 GB
4      --          --          1,000,000    100 GB
5      --          --          2,000,000    256 GB   (3-D MGBT?)
Assumptions:
Yr. 1: 3-D Newtonian MGFLD to understand convection when compared to 2-D.
Yr. 2: general relativistic 3-D MGFLD to compare to the Newtonian models.
Yr. 3: 2-D MGBT at moderate resolution.
Yr. 4: 2-D MGBT at high resolution with AMR technology.
Yr. 5: may expand to 3-D MGBT, but this will require growth of NERSC (NERSC-5 phase-in?).
- With the straw system:
- chaotic velocity fields; 2D, maybe 3D, calculations with good input physics.
- SMP is somewhat useless for this application (CPUs on one node run independently using MPI).
From Doug Swesty
Current NERSC involvement: Nugent 125,000 MPP hours; Mezzacappa 43,500 MPP hours; total 0.15 million MPP hours.
From Peter Nugent
18. People I Left Out
- Haiyan Gao (MIT)
- 3-body problem: relativistic effects in e,e scattering; 24,000 MPP hours/year
- G. Malli / Walter Loveland (Simon Fraser)
- Coupled-cluster methods for the chemical structure of superheavy elements (about 15,000 PVP hours)
- Big user who did not respond:
- Chan Joshi -- 287,500 MPP hours (plasma-driven accelerators)
- General Conclusions
- More CPU is good for most people.
- A Bytes/Flop ratio of 0.5 is okay for most people.
- Concern with memory access on a single node.
- Concern with access to large disk space and HPSS (important to several groups).
- An exciting physics portfolio, matched with DOE facilities and requiring computation.