Title: Fast Adaptive Storage and Retrieval
1Fast Adaptive Storage and Retrieval
- Scott B. Baden
- Department of Computer Science and Engineering
- University of California, San Diego
2Motivation
Some applications are able to distinguish
interesting features from background data using
on-line analysis
3Features
4Animation
5Fast Adaptive Storage and Retrieval
- If the volume fraction of interesting data is small, then we can reduce storage, memory, and network bandwidth requirements significantly by storing only what is needed
- We call a scheme that realizes this capability Fast Adaptive Storage and Retrieval (FASTR)
- This is a new paradigm for scientific users, since they are reluctant to part with their data
- We use resources only to the extent that we require them: remote knowledge discovery and data browsing
6The KeLP Project
- C++ run time libraries for parallel application library development
- Hide low level details without sacrificing performance
- Irregular block structured data
- Express communication at a high level using intuitive geometric set operations
- Also applies to data intensive applications
- KeLP I/O: out of core (Bradley Broom, Rice)
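The idea of expressing communication through geometric set operations can be illustrated with a minimal sketch. It is written in Python for brevity and does not reproduce the KeLP API: the `Box` type and the `ghost_exchange_region` helper are hypothetical names introduced here. The sketch assumes blocks are axis-aligned integer boxes with inclusive corners, and it finds the cells one block must send to a neighbor by growing the neighbor's box by a ghost width and intersecting the result with the sender's box.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned integer box with inclusive corners (hypothetical type)."""
    lo: Tuple[int, ...]
    hi: Tuple[int, ...]

    def is_empty(self) -> bool:
        # A box is empty if any lower bound exceeds its upper bound.
        return any(l > h for l, h in zip(self.lo, self.hi))

    def grow(self, width: int) -> "Box":
        # Extend the box by `width` cells in every direction.
        return Box(tuple(l - width for l in self.lo),
                   tuple(h + width for h in self.hi))

    def intersect(self, other: "Box") -> "Box":
        # Component-wise max of lower corners, min of upper corners.
        return Box(tuple(max(a, b) for a, b in zip(self.lo, other.lo)),
                   tuple(min(a, b) for a, b in zip(self.hi, other.hi)))

    def size(self) -> int:
        # Number of cells in the box (0 if empty).
        if self.is_empty():
            return 0
        count = 1
        for l, h in zip(self.lo, self.hi):
            count *= h - l + 1
        return count


def ghost_exchange_region(receiver: Box, sender: Box, ghost_width: int) -> Box:
    """Cells owned by `sender` that lie inside the ghost halo of `receiver`."""
    return receiver.grow(ghost_width).intersect(sender)


# Two side-by-side 8x8 blocks; the left block needs one column from the right.
left = Box((0, 0), (7, 7))
right = Box((8, 0), (15, 7))
overlap = ghost_exchange_region(left, right, ghost_width=1)
print(overlap, overlap.size())
```

Because the exchange region is computed from the geometry alone, the same two calls work for irregularly placed blocks, which is the appeal of the set-based formulation.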
7Data intensive application of KeLP
- KDistuf
- Turbulent flow with Direct Numerical Simulation
- Collaboration involving K. Nomura (UCSD MAE), W. Kerney and D. Shalit (UCSD CSE), G. Balls (UCSDSC), P. Diamessis (USC)
- Content-based data compression
- Borrow structured adaptive mesh refinement grid techniques to:
- Capture features at full resolution
- Discard remaining background data
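A minimal sketch of content-based compression follows, under simplifying assumptions. It tiles a 2D field into fixed-size blocks and keeps, at full resolution, only the tiles that contain at least one interesting cell. Fixed-size tiling is a stand-in chosen for brevity; it is not the grid generation used in structured adaptive mesh refinement, nor the method used in this project. The function names, the threshold test, and the synthetic field are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

BlockIndex = Tuple[int, int]          # (block row, block column)
Tile = List[List[float]]


def capture_features(field: List[List[float]],
                     block: int,
                     is_interesting: Callable[[float], bool]) -> Dict[BlockIndex, Tile]:
    """Keep full-resolution tiles that contain at least one interesting cell."""
    kept: Dict[BlockIndex, Tile] = {}
    rows, cols = len(field), len(field[0])
    for bi in range(0, rows, block):
        for bj in range(0, cols, block):
            tile = [row[bj:bj + block] for row in field[bi:bi + block]]
            if any(is_interesting(v) for row in tile for v in row):
                kept[(bi // block, bj // block)] = tile
            # Tiles with no interesting cell are background and are discarded.
    return kept


def stored_cells(kept: Dict[BlockIndex, Tile]) -> int:
    """Total number of cells retained across all kept tiles."""
    return sum(len(row) for tile in kept.values() for row in tile)


# Synthetic 16x16 field: zero background with one small feature.
field = [[0.0] * 16 for _ in range(16)]
for i in range(5, 7):
    for j in range(9, 12):
        field[i][j] = 1.0

kept = capture_features(field, block=4, is_interesting=lambda v: v > 0.5)
ratio = (16 * 16) / stored_cells(kept)
print(sorted(kept), stored_cells(kept), ratio)
```

The achievable ratio depends on how tightly the kept tiles fit the features, which is why the compression is data dependent.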
8More about the application
- Turbulent mixing in stably stratified flow under the influence of background shear
- Solve the incompressible Navier-Stokes equations
- Follow the time evolution of regions of overturned dense fluid, which are the main agents of stirring and mixing
The efficiency of mixing in turbulent patches: inferences from direct simulations and microstructure observations, in press, J. Phys. Ocean. Smyth, Moum, and Caldwell, 2001.
9Information discovery
- Oceanographic observations are incomplete: restricted to 1 dimensional observations
- Discovery: time evolution, energy dissipation and lifetime of overturn regions, which have irregular shapes
Bill Smyth, Dept. Oceanic and Atmospheric Sciences, Oregon State University
10Fast Adaptive Storage and Retrieval
- Compression depends on the data, currently on 128^3
- Best case 20:1 compression (10 GB → 500 MB), worst case 2.8:1
- Lempel-Ziv (gzip) gives us only 10%
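The best and worst case figures above can be checked with a short calculation. This is only an arithmetic sketch: it assumes 1 GB = 1000 MB, and the function name is hypothetical.

```python
def compressed_size_mb(original_mb: float, ratio: float) -> float:
    """Size after compression for a given ratio (original : compressed)."""
    return original_mb / ratio


ORIGINAL_MB = 10_000.0  # 10 GB, assuming 1 GB = 1000 MB

best = compressed_size_mb(ORIGINAL_MB, 20.0)   # best case, 20:1
worst = compressed_size_mb(ORIGINAL_MB, 2.8)   # worst case, 2.8:1
print(f"best case:  {best:.0f} MB")
print(f"worst case: {worst:.0f} MB")
```

Even in the worst case the 10 GB data set shrinks to roughly a third of its original size.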
11Further savings: another application
- Use volume tracking (Silver, Rutgers) to follow individual features
- FASTR permits us to extract only the data we need out of the many features present
- Computational volume: 2M pts
- Average feature size: 1K points
- Maximum feature size: 20K pts
- Saves additional two orders of magnitude in communication bandwidth requirements
- Perform local analysis on a workstation
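The saving from extracting a single feature can be sketched as follows. The example separates flagged cells into individual features with plain 4-connected component labeling on a 2D mask. This is a simple stand-in and not the volume tracking method cited above. All names are hypothetical, and the slide figures are read as 2M = 2,000,000, 1K = 1,000 and 20K = 20,000 points.

```python
from collections import deque
from typing import List, Set, Tuple

Cell = Tuple[int, int]


def label_features(mask: List[List[bool]]) -> List[Set[Cell]]:
    """Group flagged cells into 4-connected features using breadth-first search."""
    rows, cols = len(mask), len(mask[0])
    seen: Set[Cell] = set()
    features: List[Set[Cell]] = []
    for i in range(rows):
        for j in range(cols):
            if not mask[i][j] or (i, j) in seen:
                continue
            feature: Set[Cell] = set()
            queue = deque([(i, j)])
            seen.add((i, j))
            while queue:
                r, c = queue.popleft()
                feature.add((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            features.append(feature)
    return features


def reduction_factor(total_points: int, feature_points: int) -> float:
    """How many times less data is moved when only one feature is fetched."""
    return total_points / feature_points


# Small synthetic mask with two separate features.
mask = [[False] * 8 for _ in range(8)]
for r, c in [(1, 1), (1, 2), (2, 1), (5, 5), (5, 6), (6, 6), (6, 7)]:
    mask[r][c] = True

features = label_features(mask)
largest = max(features, key=len)
print(len(features), len(largest), reduction_factor(8 * 8, len(largest)))

# Figures from the slide: whole volume versus the largest and the average feature.
print(reduction_factor(2_000_000, 20_000))
print(reduction_factor(2_000_000, 1_000))
```

With these figures, even the largest feature is a factor of 100 smaller than the full volume, which matches the two orders of magnitude quoted above; an average feature is smaller still.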
12Future plans
- Develop remote analysis capability
- Integrate with DTF data handling infrastructure
- Larger scale simulations on Blue Horizon and on clusters: 256^3
- Study vortex pairs in a stratified turbulent environment
- Improved understanding of aircraft wake vortices
- Practical importance for air traffic control
13Remote analysis capability
- Perform analysis on data sets stored remotely, e.g. Data Cutter
- We can perform some data analysis on a local workstation
- For highly intensive data analysis, we can use higher end resources, but again we access only the data we need
14Publications and people
- FASTR is based on a research prototype called MOLD, which is the M.S. thesis research of UCSD CSE student William Kerney
- MOLD: A System for Breaking Down Large Visualization and Post-Processing Problems. Expected March 2002.
- Peter Diamessis, then a PhD student with Keiko Nomura (UCSD MAE Dept), used MOLD to carry out an exploration of overturns
- An Investigation of Vortical Structures and Density Overturns in Stably Stratified Homogeneous Turbulence by Means of Direct Numerical Simulation, P. Diamessis, PhD thesis, 2001
- Automated Tracking of Turbulent Structures in Direct Numerical Simulation, P. Diamessis et al, PARA 2002, Helsinki, Finland. To appear.
15Software availability
- FASTR: contact us
- KeLP
- Hardened version of KeLP, AKA KeLP1.4
- http://www.cse.ucsd.edu/groups/hpcl/scg/kelp
- NPACI Blue Horizon, Sun HPC, Cray T3E, Linux clusters
- Workstations: Solaris, Linux, etc.
- Dual tier variant, KeLP2.1: hierarchical KeLP for SMP clusters and SMP based machines (e.g. BH)