Title: JASPER A. VRUGT
1. Parameter Exploration Using Self-adaptive Sampling and Optimization to Solve Environmental Models
JASPER A. VRUGT
Computational Earth Science (EES-16)
Applied Mathematics and Plasma Physics (T-5)
Center for NonLinear Studies (CNLS)
LOS ALAMOS NATIONAL LABORATORY
2. OUTLINE OF PRESENTATION
- Short Introduction to Environmental Modeling
- Reconciling Environmental Models with Data
- Parameter Estimation
- Treatment of Parameter Uncertainty
- Markov Chain Monte Carlo (MCMC) Simulation
- Novel Theory (DREAM) Improves Sampling Efficiency
- Applications of DREAM in Environmental Modeling
- Surface hydrology: Treatment of rainfall error
- Ecohydrology: Multi-layer soil and canopy model
- Soil physics: Free-form soil hydraulic functions
- Avian biology: Assessing optimal bird migration routes
- Groundwater hydrology: Contaminant transport
- Summary and Conclusions
3. ENVIRONMENTAL MODELING
[Figure: environmental model schematic; surviving label: parameters]
RECONCILING ENVIRONMENTAL MODELS WITH DATA USING
PARAMETER ESTIMATION
4. LET'S TAKE A HYDROLOGIC SYSTEM AS AN EXAMPLE
NO MATTER HOW SPATIALLY EXPLICIT AND DETAILED,
ALL MODELS CONTAIN PARAMETERS WHOSE VALUES CAN
ONLY BE DETERMINED THROUGH CALIBRATION
5. ITERATIVE METHODS FOR PARAMETER ESTIMATION
MANUAL PARAMETER ESTIMATION (HAND CALIBRATION)
- Advantage: Simple to implement
- Disadvantage: Subjective, time-consuming, and requires considerable experience
COMPUTERIZED ALGORITHMS (AUTOMATIC CALIBRATION)
- Advantage: Objective and more efficient
- Disadvantage: Complicated, and typically requires programming experience and familiarity with technical jargon and cluster computers
6. OPTIMIZATION: A SINGLE SOLUTION
[Figure: parameter space with axes x1 and x2; surviving label: Model-data]
7. NO UNCERTAINTY: THIS CANNOT BE JUSTIFIED
Maximum likelihood estimate
8. NO UNCERTAINTY: THIS CANNOT BE JUSTIFIED
[Figure: streamflow (m3/s)]
9. MODEL EVALUATION WITH UNCERTAINTY
State (Prognostic Variables)
DIFFICULT TO SEPARATE ERROR SOURCES FROM TOTAL
MISFIT
STRUCTURAL MODEL ERROR IS KEY TO IMPROVING THEORY
10. PROPER TREATMENT OF UNCERTAINTY IS DIFFICULT
[Figure: HYMOD model diagram; surviving labels: Forcing, U(t), State, X(t), Output, System invariants, Update]
JOINT PROBABILITY DISTRIBUTIONS
- TYPICALLY HIGH-DIMENSIONAL
- SIGNIFICANT MULTI-MODALITY
- POOR PARAMETER SENSITIVITY
- UNKNOWN ORIENTATION AND SCALE
JOINT PROBABILITY DISTRIBUTIONS CAN GENERALLY NOT
BE DERIVED BY ANALYTICAL MEANS NOR BY ANALYTICAL
APPROXIMATION
NEED FOR METHODS THAT CAN EFFICIENTLY RETRIEVE
THE POSTERIOR PROBABILITY DENSITY FUNCTION!!
11. SOLUTION: MARKOV CHAIN MONTE CARLO SAMPLING
SIMPLE 1-PARAMETER EXAMPLE, BUT GENERALIZES TO
ANY DIMENSION
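The 1-parameter case can be illustrated with a random-walk Metropolis sampler. This is a minimal sketch, not the sampler used in the talk: the standard normal target, the function names `log_target` and `metropolis`, and the step size are all assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal (illustrative target only)."""
    return -0.5 * x * x


def metropolis(log_density, x0, n_steps, step_size, seed=0):
    """Random-walk Metropolis for one parameter; returns (chain, acceptance rate)."""
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    chain = []
    accepted = 0
    for _ in range(n_steps):
        # Propose a candidate from a symmetric Gaussian centered on the current point.
        candidate = x + rng.gauss(0.0, step_size)
        log_p_cand = log_density(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_cand - log_p)):
            x, log_p = candidate, log_p_cand
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps


chain, rate = metropolis(log_target, x0=0.0, n_steps=20000, step_size=1.0, seed=1)
```

The same accept/reject rule applies in any dimension; only the proposal and the density change.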
12. MCMC SAMPLING (CONTINUED)
13. MCMC SAMPLING (CONTINUED)
14. PERFORMANCE OF MCMC DEPENDS ON THE PROPOSAL DISTRIBUTION
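The effect of the proposal scale can be seen in a small experiment. The sketch below assumes a standard normal target and a Gaussian random-walk proposal; the helper `acceptance_rate` and the three step sizes are hypothetical and not taken from the slides.

```python
import math
import random


def acceptance_rate(step_size, n_steps=5000, seed=0):
    """Fraction of accepted random-walk Metropolis moves on a standard normal target."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        candidate = x + rng.gauss(0.0, step_size)
        # Log of the density ratio p(candidate) / p(x) for a standard normal.
        log_ratio = -0.5 * candidate * candidate + 0.5 * x * x
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = candidate
            accepted += 1
    return accepted / n_steps


# Too small: nearly every move is accepted, but the chain explores slowly.
# Too large: most candidates land in low-density regions and are rejected.
rates = {step: acceptance_rate(step) for step in (0.01, 2.4, 100.0)}
```

Neither extreme samples efficiently, which is the motivation for tuning the scale and orientation of the proposal during sampling.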
15. DREAM: DIFFERENTIAL EVOLUTION ADAPTIVE METROPOLIS
DREAM
- Continuously Updates the Scale and Orientation of the Proposal Distribution
- Maintains Detailed Balance and is Ergodic
- Efficiently Handles Multimodality and High-dimensionality
ESPECIALLY DESIGNED FOR PARALLEL COMPUTING
Vrugt et al., IJNSNS, 10(3), 2009
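The idea of letting the chains themselves set the scale and orientation of the proposal can be sketched with a plain differential evolution Markov chain update. This is a simplified illustration, not the full DREAM algorithm (it omits, for example, subspace sampling and outlier handling); the correlated 2-D Gaussian target, the names `log_target` and `de_mc`, and all settings are assumptions.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a 2-D Gaussian with correlation 0.9 (illustrative)."""
    a, b = x
    rho = 0.9
    return -0.5 * (a * a - 2.0 * rho * a * b + b * b) / (1.0 - rho * rho)


def de_mc(log_density, n_chains, dim, n_generations, seed=0):
    """Differential evolution Markov chain: jumps are built from chain differences."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2.0 * dim)  # commonly used default jump rate
    chains = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_chains)]
    log_p = [log_density(x) for x in chains]
    samples = []
    for _ in range(n_generations):
        for i in range(n_chains):
            # Pick two other chains; their difference carries the scale and
            # orientation of the current population.
            r1, r2 = rng.sample([j for j in range(n_chains) if j != i], 2)
            candidate = [
                chains[i][k]
                + gamma * (chains[r1][k] - chains[r2][k])
                + rng.gauss(0.0, 1e-6)  # small noise term
                for k in range(dim)
            ]
            log_p_cand = log_density(candidate)
            # Standard Metropolis acceptance for the symmetric jump.
            if rng.random() < math.exp(min(0.0, log_p_cand - log_p[i])):
                chains[i], log_p[i] = candidate, log_p_cand
            samples.append(list(chains[i]))
    return samples


samples = de_mc(log_target, n_chains=10, dim=2, n_generations=2000, seed=1)
```

Because each chain's update needs only the current states of the other chains, the density evaluations within a generation can be distributed over separate processors.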
16. APPLICATIONS OF DREAM
17. SURFACE HYDROLOGY
DREAM CAN SOLVE HIGH-DIMENSIONAL INVERSE PROBLEMS
→ EXPLICIT TREATMENT OF RAINFALL AND MODEL ERROR IN STREAMFLOW FORECASTING
Vrugt et al., WRR, (2008) SERRA, (2009)
18. HYMOD CONCEPTUAL WATERSHED MODEL
[Figure: HYMOD schematic; surviving labels: Cmax, bexp, Alpha, (1-Alpha), Rs, Rq]
- Cmax: Maximum storage of the watershed
- bexp: Spatial variability of soil moisture capacity
- Alpha: Distribution factor between the two reservoirs
- Rs: Residence time of the slow flow reservoir
- Rq: Residence time of the quick flow reservoir
Vrugt et al., WRR, (2008) SERRA, (2009)
19. STREAMFLOW PREDICTION UNCERTAINTY BOUNDS
RESULTS WHEN INFERRING HYMOD PARAMETERS ONLY (d = 5)
RESULTS WHEN INFERRING PARAMETER, MODEL AND FORCING ERROR (d = 63)
Vrugt et al., WRR, (2008) SERRA, (2009)
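One way to compare uncertainty bounds like these is to count how many observations they enclose. The sketch below assumes the posterior simulations are available as a list of equally long series; the helper names `prediction_bounds` and `coverage`, the 95% interval, and the toy numbers are hypothetical and not taken from the cited studies.

```python
import math


def prediction_bounds(ensemble, lower=0.025, upper=0.975):
    """Per-time-step (low, high) bounds from an ensemble of simulated series."""
    n_members = len(ensemble)
    lo_idx = math.floor(lower * (n_members - 1))
    hi_idx = math.ceil(upper * (n_members - 1))
    bounds = []
    for t in range(len(ensemble[0])):
        values = sorted(series[t] for series in ensemble)
        bounds.append((values[lo_idx], values[hi_idx]))
    return bounds


def coverage(bounds, observations):
    """Fraction of observations that fall inside the bounds."""
    inside = sum(1 for (lo, hi), obs in zip(bounds, observations) if lo <= obs <= hi)
    return inside / len(observations)


# Toy ensemble: 101 constant series with values 0, 1, ..., 100 at three time steps.
ensemble = [[float(k)] * 3 for k in range(101)]
bounds = prediction_bounds(ensemble)
fraction_inside = coverage(bounds, [50.0, 1.0, 99.0])
```

For well-calibrated 95% bounds, roughly 95% of the observations should fall inside.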
20. TREATMENT OF RAINFALL ERROR
[Figure: rainfall (mm) versus time (days), shown with the HYMOD schematic; surviving labels: Cmax, bexp, Alpha, (1-Alpha), Rs, Rq]
Kavetski et al., 2002, 2006a,b Vrugt et al.,
WRR, 2008, SERRA, 2009
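Rainfall error can be treated with one multiplier per storm event, estimated together with the model parameters. A minimal sketch follows. The storm definition (consecutive time steps with rainfall above a threshold), the function names `label_storms` and `apply_storm_multipliers`, and the numbers are assumptions made for illustration, not the procedure of the cited papers.

```python
def label_storms(rainfall, threshold=0.0):
    """Assign a storm index to each wet time step; dry steps get None."""
    storm_ids = []
    current = -1
    in_storm = False
    for rain in rainfall:
        if rain > threshold:
            if not in_storm:
                current += 1  # a new storm starts after a dry spell
                in_storm = True
            storm_ids.append(current)
        else:
            in_storm = False
            storm_ids.append(None)
    return storm_ids


def apply_storm_multipliers(rainfall, storm_ids, multipliers):
    """Scale each storm's rainfall by its own multiplier; dry steps are unchanged."""
    return [
        rain if storm is None else rain * multipliers[storm]
        for rain, storm in zip(rainfall, storm_ids)
    ]


rainfall = [0.0, 5.0, 3.0, 0.0, 0.0, 2.0, 0.0]   # mm per day (made-up values)
storm_ids = label_storms(rainfall)                # two storms: days 1-2 and day 5
corrected = apply_storm_multipliers(rainfall, storm_ids, [1.5, 0.5])
```

Each storm adds one unknown to the inverse problem, so its dimension grows with the number of storms in the record.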
21. STREAMFLOW PREDICTION UNCERTAINTY BOUNDS
RESULTS WHEN INFERRING HYMOD PARAMETERS ONLY (d = 5)
RESULTS WHEN INFERRING PARAMETER, MODEL AND FORCING ERROR (d = 63)
Vrugt et al., WRR, (2008) SERRA, (2009)
22. MARGINAL POSTERIOR PARAMETER DISTRIBUTIONS
HISTOGRAMS WHEN ONLY ESTIMATING HYMOD PARAMETERS (d = 5)
COMBINED HYMOD PARAMETER, MODEL AND RAINFALL ERROR (d = 63)
PARAMETER DISTRIBUTIONS ARE DIFFERENT!!!!
23. MARGINAL POSTERIOR PDF OF RAINFALL MULTIPLIERS
SOME STORMS ARE WELL DEFINED; OTHERS SHOW CONSIDERABLE UNCERTAINTY
24. ECOHYDROLOGY
DREAM CAN ESTIMATE PARAMETER UNCERTAINTY
→ DISTRIBUTION OF CANOPY CHARACTERISTICS AND DYNAMICS FROM OPTIMALITY OF NET CARBON PROFIT
Dekker et al., PCE, (2009)
25. MULTI-LAYER SOIL AND CANOPY MODEL
Dekker et al., PCE, (2009)
26. MULTI-LAYER SOIL AND CANOPY MODEL (2)
FOLLOWING SCHYMANSKI ET AL. (2007, 2008) WE ADOPT THE PRINCIPLE OF VEGETATION OPTIMALITY. IT IS ASSUMED THAT THE VEGETATION ORGANIZES ITSELF IN SUCH A WAY THAT NET CARBON PROFIT (NCPtot) IS MAXIMIZED. PREVIOUS CONTRIBUTIONS HAVE USED A SINGLE BIG LEAF MODEL (m = 1), WITH INFERENCE OF BEST PARAMETERS ONLY. HERE, WE USE OUR RECENTLY DEVELOPED MULTI-LAYER SOIL AND CANOPY MODEL, AND USE DREAM TO PROVIDE EXPLICIT RECOGNITION OF PARAMETER UNCERTAINTY.
WE SELECTED THE FOLLOWING PARAMETERS:
- MA: Fraction of watershed covered with vegetation (a)
- Yr: Maximum rooting depth (b)
- m: Number of canopy layers (c)
- ce: Parameters defining relationship between optimum water use and soil water (d)
- me: See (4). (e)
- Jmax25,0: Photosynthetic electron transport capacity at top of canopy (f)
Dekker et al., PCE, (2009)
27. POSTERIOR DISTRIBUTION OF PARAMETERS AND NCPtot
SIGNIFICANT VARIATION IN VEGETATION
CHARACTERISTICS AND DYNAMICS FROM ECOHYDROLOGIC
OPTIMALITY OF NET CARBON PROFIT
Dekker et al., PCE, (2009)
28. COMPARISON AGAINST MEASURED CARBON AND WATER FLUXES
Dekker et al., PCE, (2009)
29. SOIL PHYSICS
DREAM CAN SOLVE HIGH-DIMENSIONAL PROBLEMS
→ ESTIMATE PORE SIZE DISTRIBUTION OF SOIL SAMPLE FROM SIMPLE MEASUREMENTS OF AIR PERMEABILITY
Dane et al., EJSC (2009)
30. FREE-FORM ESTIMATION OF HYDRAULIC PROPERTIES: RESULTS
RESULTS FOR DEPTH INTERVAL 300-400 mm (observation and prediction)
SIMILAR RESULTS WERE FOUND AT OTHER DEPTHS
MEASURED AIR PERMEABILITY DATA CONTAIN SUFFICIENT
INFORMATION TO ESTIMATE THE HYDRAULIC FUNCTIONS
Dane et al., EJSC (2009)
31. LONG DISTANCE BIRD MIGRATION
MULTI-CRITERIA OPTIMIZATION
→ PROVIDES INSIGHTS INTO THE TRADE-OFFS OF BIRD BEHAVIOR BETWEEN MINIMIZING FLIGHT TIME AND/OR ENERGY-USE
Vrugt et al., J. Avian Biol. (2006)
32. MULTI-OBJECTIVE OPTIMIZATION
Birds are trying to optimize multiple objectives simultaneously?
Need an optimization method that can identify an ensemble of solutions that span the Pareto surface
Vrugt et al., J. Avian Biol. (2006)
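The Pareto surface is the set of solutions that no other solution beats on every objective at once. A minimal sketch of that filter follows, assuming both objectives (for example flight time and energy use) are minimized; the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Each tuple is (flight time, energy use) in arbitrary, made-up units.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

Here `(3, 4)` is dropped because `(2, 3)` is better on both objectives, and `(5, 5)` is dropped because `(1, 5)` is faster at equal energy cost.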
33. INSPIRED DEVELOPMENT OF NEW GLOBAL OPTIMIZER
AMALGAM: Ensemble optimizer in which multiple
different search strategies are run
concurrently, and learn from each other through
information exchange using a common population of
points.
Vrugt et al., PNAS, (2007) IEEE-TEVC, (2009)
34. BIRD MIGRATION MODEL: PREDICTED FLIGHT ROUTES
Vrugt et al., J. Avian Biol. (2006)
35. GROUNDWATER HYDROLOGY
DREAM AND AMALGAM HAVE BEEN DESIGNED ESPECIALLY FOR PARALLEL COMPUTING
→ SOLUTION OF COMPUTATIONALLY DEMANDING PARTICLE TRACKING MODELS FOR FLOW AND TRANSPORT THROUGH HETEROGENEOUS MEDIA
Vrugt et al., PNAS (2009)
36. MAGNETIC RESONANCE IMAGING DATASET
Many thanks to Hongkyu Yoon, Changyong Zhang and Charles Werth for data collection
Concentration of MnCl2 versus time at 53,248 voxels and 112 time snapshots. This results in 5,963,776 data points!!
Vrugt et al., PNAS (2009)
37. MRI: THE NUMERICAL FLOW AND TRANSPORT MODEL
- 1. Steady State Flow Simulation
  - Regular numerical grid at the scale of the concentration data
  - FEHM finite element simulation of steady state flow field through the heterogeneous system
- 2. Particle Tracking Transport Simulation
  - FEHM random walk particle tracking algorithm used to minimize numerical dispersion
  - 250,000 particles (1 cpu-hr on 3.4GHz processor)
  - Particle tracking results converted to normalized concentration using a numerical convolution method
Many thanks to Bruce Robinson for setting up the
forward model
Vrugt et al., PNAS (2009)
38. MAGNETIC RESONANCE IMAGING: PARAMETERS TO BE OPTIMIZED
- Permeabilities of the 5 individual zones
  - Ranges assigned so that rank order of the permeabilities of the five sands is honored
- Molecular diffusion
- Longitudinal and transverse dispersivity
- Parameter ranges assigned based on literature estimates and scientific judgment
Vrugt et al., PNAS (2009)
39. SOLUTION USING HYBRID PARALLELIZATION SCHEME
[Figure: (DREAM) algorithm and (FEHM) flow and transport code; chains shown in the x1-x2 parameter plane]
- Use N = 10 different chains
- Each chain evolves on a different node
- Each chain uses 10 other nodes for particle tracking
COMPUTATIONAL TIME REDUCED BY A FACTOR OF 100
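The two levels of parallelism can be sketched as an outer pool over chains and an inner pool that splits each model run's particles. This is an illustration only: Python threads stand in for cluster nodes, `track_particles` is a hypothetical placeholder for the FEHM particle tracking call, and the even split of particles is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

N_CHAINS = 10            # one chain per node (outer level)
WORKERS_PER_CHAIN = 10   # nodes per chain used for particle tracking (inner level)


def split_particles(n_particles, n_workers):
    """Divide the particles as evenly as possible over the workers."""
    base, extra = divmod(n_particles, n_workers)
    return [base + (1 if i < extra else 0) for i in range(n_workers)]


def track_particles(n_particles):
    """Hypothetical stand-in for one worker's share of the particle tracking."""
    return n_particles  # pretend every particle was tracked


def evaluate_model(n_particles, n_workers=WORKERS_PER_CHAIN):
    """Inner level: one forward model run, with particles split over workers."""
    shares = split_particles(n_particles, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(track_particles, shares))


def evaluate_all_chains(n_particles):
    """Outer level: all chains evaluate their candidate points at the same time."""
    with ThreadPoolExecutor(max_workers=N_CHAINS) as pool:
        return list(pool.map(evaluate_model, [n_particles] * N_CHAINS))


tracked = evaluate_all_chains(250_000)
total_workers = N_CHAINS * WORKERS_PER_CHAIN
```

With 10 chains and 10 workers per chain, 100 processors are busy at once, which matches the factor-of-100 reduction when communication overhead is ignored.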
40. OPTIMIZATION RESULTS WITH DREAM
25,000 FEHM model runs using hybrid parallelization with 100 processors
Breakthrough curves in high-flow zones are well matched, but dispersion, diffusion, or advection into lower permeability zones is under-represented
Vrugt et al., PNAS (2009)
41. INITIAL RESULTS: MULTIOBJECTIVE OPTIMIZATION
- f1: Permeability zones 1-3
- f2: Permeability zones 4-5
- RED: Original formulation with 5 parameters
- BLACK: DREAM solution
- BLUE: Modified formulation with 15 parameters (each zone has its own dispersivity values)
AMALGAM HELPS TO UNDERSTAND TRADE-OFFS IN MODEL
Vrugt et al., PNAS (2009)
42. SUMMARY AND CONCLUSIONS
- General: Efficient and robust optimization and sampling methods are a prerequisite to appropriately treat uncertainty, and reconcile environmental models with data.
- Bayesian Statistics: The Differential Evolution Adaptive Metropolis (DREAM) algorithm automatically updates the scale and orientation of the proposal distribution during sampling, and maintains detailed balance and ergodicity. This scheme is significantly more efficient than existing MCMC schemes.
- Nonlinear Optimization: Self-adaptive multi-method optimization as implemented in the AMALGAM algorithm significantly increases the efficiency of evolutionary search.
- Surface Hydrology: Explicit treatment of rainfall error in streamflow forecasting increases coverage of the uncertainty bounds and alters optimized values of model parameters.
- Surface Hydrology: Rainfall multipliers are approximately normal, but with widely varying uncertainty bounds. Some rainfall events are very well defined, and others show considerable uncertainty.
43. SUMMARY AND CONCLUSIONS (2)
- Ecohydrology: DREAM-derived results demonstrate significant variation in vegetation characteristics and dynamics from ecohydrologic optimality of Net Carbon Profit (NCP). These findings question the usefulness of NCP as a single optimality criterion, and advocate the simultaneous use of multiple non-commensurable criteria for ecohydrological parameter estimation and model evaluation.
- Soil Physics: Air permeability data contain sufficient information to retrieve soil hydraulic functions. Our approach does not impose any functional form on the retention and hydraulic conductivity functions and is especially designed to work well for a range of soils.
- Avian Biology: Differential flight patterns of migratory birds can be explained by optimality of flight time and energy-use.
- Groundwater Hydrology: Breakthrough curves in high-flow zones are well matched, but dispersion, diffusion, or advection into lower permeability zones is under-represented.
- DREAM and AMALGAM are especially designed to take full advantage of parallel computing resources.