Title: MFE Simulation Data Management
1MFE Simulation Data Management
- SLAC DMW 2004
- March 16, 2004
- W. W. Lee and S. Klasky
- Princeton Plasma Physics Laboratory
- Princeton, NJ
2Spatial and Temporal Scales Present a Major Challenge to Theory and Simulations
- Huge range of spatial and temporal scales.
- Overlap in scales often means strong (simplified) ordering is not possible.
- Different codes/theory for different scales.
- 5 years: Integration of physics into Fusion Simulation Project
3Major Fusion Codes
4Data Rates of Major Fusion Codes
Code      Data (GB)          Runtime (hr)   Processors    Rate (Mb/s)
          now / 5 yr         now / 5 yr     now / 5 yr    now / 5 yr
GTC       4,000 / 100,000    300 / 150      2048          80 / 1600
Gyro      10 / 100           30 / 30        512 / 2048    0.8 / 8
GS2       10 / 100           30 / 30        512 / 2048    0.8 / 8
Degas2    0.1                1              10            0.2
Transp    0.05               3              1             0.04
Nimrod    5 / 50             20 / 20        128           0.6 / 6
M3D       10 / 100           20 / 20        128           1.1 / 11
NSTX      .25/shot 1 / 4     0.25           40            9, 36
Total (TB) 4.3 / 101
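The rate column can be roughly cross-checked by dividing data volume by runtime. The sketch below assumes decimal units (1 GB = 8000 Mb) and a uniform output rate over the whole run; both are assumptions, not statements from the slides, and the function name is illustrative. Averages computed this way are best read as a lower bound on the bandwidth to provision, since output is usually bursty.

```python
def sustained_rate_mbps(data_gb: float, runtime_hr: float) -> float:
    """Average output rate in megabits per second.

    Assumes decimal units: 1 GB = 8000 Mb, 1 hr = 3600 s.
    """
    megabits = data_gb * 8000.0
    seconds = runtime_hr * 3600.0
    return megabits / seconds


# (data in GB, runtime in hours), taken from the table above.
rows = {
    "Gyro (now)": (10, 30),
    "Gyro (5 yr)": (100, 30),
    "GTC (5 yr)": (100_000, 150),
}

for name, (gb, hr) in rows.items():
    print(f"{name}: {sustained_rate_mbps(gb, hr):.2f} Mb/s")
```

For Gyro this gives about 0.74 and 7.4 Mb/s, close to the tabulated 0.8 and 8; for GTC in five years it gives about 1480 Mb/s against the tabulated 1600.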
5Plasma Turbulence Simulation
- Gyrokinetic Particle-In-Cell Simulation
- -- Reduced Vlasov-Maxwell Equations
- Simulations on MPP Platforms
- -- Cray T3E, IBM SP (NERSC), Cray-X1 (ORNL), SX6 (Earth Simulator, Japan)
- Simulation of Burning Plasmas
- -- International Thermonuclear Experimental Reactor (ITER)
- Integrated Fusion Simulation Project (MFE)
- Visualization -- turbulence evolution, particle orbits
6Gyrokinetic Approximation
- Gyromotion
- Polarization provides quasineutrality
W. W. Lee, PF 83; JCP 87
7(No Transcript)
8(Figure; credit: Ethier)
9Ion Temperature Gradient Driven Turbulence
Particle Trajectories
Electrostatic Potential
10Data Management challenges
- GTC is producing TBs of data
- Data rates: 80 Mb/s now, 1.6 Gb/s in 5 years.
- Need QOS to stream data.
- This data needs to be post-processed
- Essential to parallelize the post-processing routines to handle our larger datasets.
- We need a cluster to post-process this data.
- M (supercomputer processors) x N (cluster processors) problem.
- QOS becomes more important to sustain this post-processing.
- The post-processed data needs to be shared among collaborators
- Different sections of the post-processed data may go to different users.
- Post-processed data, along with other metadata, should be archived into a relational database.
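The M x N problem mentioned above can be illustrated with a minimal sketch. It assumes a 1D array that is block-decomposed over M simulation ranks and must be re-decomposed over N cluster ranks; the slides do not say how GTC data is actually laid out, and the function names here are hypothetical. The schedule lists which index range each source rank must send to each destination rank.

```python
def block_ranges(n_items: int, n_ranks: int) -> list[tuple[int, int]]:
    """Split [0, n_items) into n_ranks contiguous half-open blocks."""
    base, extra = divmod(n_items, n_ranks)
    ranges, start = [], 0
    for rank in range(n_ranks):
        # The first `extra` ranks get one additional item.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def transfer_schedule(n_items: int, m_src: int, n_dst: int):
    """Return (src_rank, dst_rank, start, stop) for every overlapping block pair."""
    schedule = []
    dst_blocks = block_ranges(n_items, n_dst)
    for src, (s0, s1) in enumerate(block_ranges(n_items, m_src)):
        for dst, (d0, d1) in enumerate(dst_blocks):
            lo, hi = max(s0, d0), min(s1, d1)
            if lo < hi:  # non-empty intersection means one message
                schedule.append((src, dst, lo, hi))
    return schedule


# Example: 12 items moving from 4 simulation ranks to 3 cluster ranks.
for message in transfer_schedule(12, 4, 3):
    print(message)
```

With contiguous blocks the number of messages grows roughly as M + N rather than M x N, which is one reason the choice of decomposition on both sides matters for sustaining the transfer.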
11Post processing of GTC Data.
- Particle Data
- No compression possible.
- Sent to 1 cluster for visualization/analysis.
- Work being done with K. Ma, U.C. Davis: visualize a million particles.
- Gain new insights into the theory.
- Field Data
- Geometric/Temporal compression of the data is possible.
- Data needs to be streamed to a local cluster at PPPL.
- Reduced subset needs to be sent to PPPL collaborators.
- Use Logistical Networking (Beck, UT-K).
- Data transfer needs to be automatic, and integrated into a dataflow/webflow for use with parallel analysis routines.
- We desire to see post-processed data during the simulation.
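The slides do not say which geometric or temporal compression is intended for field data. As one possible illustration only, the sketch below applies lossy reduction to a stream of 2D field snapshots: it keeps every k-th time step and averages non-overlapping tiles in space. The function names and the choice of tile averaging are assumptions.

```python
def coarsen(field, factor):
    """Average non-overlapping factor x factor tiles of a 2D grid (list of lists).

    Trailing rows/columns that do not fill a whole tile are dropped.
    """
    rows, cols = len(field), len(field[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            tile = [field[r + i][c + j]
                    for i in range(factor) for j in range(factor)]
            out_row.append(sum(tile) / len(tile))
        out.append(out_row)
    return out


def reduce_stream(snapshots, time_stride, space_factor):
    """Yield (step, coarsened_field) for every time_stride-th snapshot."""
    for step, field in enumerate(snapshots):
        if step % time_stride == 0:
            yield step, coarsen(field, space_factor)


# Example: six 4x4 snapshots, keep every third, coarsen 2x in each direction.
snapshots = [[[float(t)] * 4 for _ in range(4)] for t in range(6)]
for step, small in reduce_stream(snapshots, time_stride=3, space_factor=2):
    print(step, small)
```

Because `reduce_stream` is a generator, reduced snapshots become available as the run proceeds, which fits the stated wish to see post-processed data during the simulation.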
12After the analysis
- Post-processed data needs to be saved into a relational database
- How do we query this abstract data to compare it with experiments?
- 3D correlation functions
- Processing of TBs of data/run now, 100s of TBs of data/run in 5 years.
- Data mining techniques will be necessary to understand this data.
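Archiving post-processed data in a relational database could look like the sketch below. It uses Python's built-in sqlite3 module. The schema (a `runs` table and a `products` table), the column names, and the file path are all hypothetical; the slides do not specify a design. Only metadata and file locations are stored, on the assumption that the bulk data stays in files.

```python
import sqlite3

# Hypothetical schema: one row per simulation run, one row per derived product.
SCHEMA = """
CREATE TABLE runs (
    run_id      INTEGER PRIMARY KEY,
    code        TEXT NOT NULL,
    description TEXT
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    run_id     INTEGER NOT NULL REFERENCES runs(run_id),
    quantity   TEXT NOT NULL,
    time_step  INTEGER NOT NULL,
    file_path  TEXT NOT NULL
);
"""


def open_archive(path=":memory:"):
    """Create a new archive database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_run(conn, code, description):
    """Register a run and return its run_id."""
    cur = conn.execute(
        "INSERT INTO runs (code, description) VALUES (?, ?)",
        (code, description),
    )
    return cur.lastrowid


def add_product(conn, run_id, quantity, time_step, file_path):
    """Register one post-processed product belonging to a run."""
    conn.execute(
        "INSERT INTO products (run_id, quantity, time_step, file_path) "
        "VALUES (?, ?, ?, ?)",
        (run_id, quantity, time_step, file_path),
    )


def find_products(conn, code, quantity):
    """Return (run_id, time_step, file_path) rows for a code and quantity."""
    rows = conn.execute(
        "SELECT r.run_id, p.time_step, p.file_path "
        "FROM products p JOIN runs r ON r.run_id = p.run_id "
        "WHERE r.code = ? AND p.quantity = ? "
        "ORDER BY r.run_id, p.time_step",
        (code, quantity),
    )
    return rows.fetchall()


conn = open_archive()
run_id = add_run(conn, "GTC", "illustrative run entry")
add_product(conn, run_id, "correlation_3d", 100, "products/corr_000100.dat")
print(find_products(conn, "GTC", "correlation_3d"))
```

A query like `find_products` is one way to locate derived quantities such as 3D correlation functions for comparison with experiment without scanning the raw output.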