Ratto Lecture 1
Transcript and Presenter's Notes
1
Bayesian uncertainty estimation and global sensitivity analysis
M. Ratto, IPSC-JRC, European Commission
2
Bayesian estimation
Let us consider a given deterministic model that has to be estimated according to some set of observations. The analyst has a set of prior assumptions on the model factors X, expressed as prior distributions p(X). The posterior distribution of the model factors given the data set D is, applying the Bayes chain rule,
p(X | D) = p(D | X) p(X) / p(D) ∝ L(D | X) p(X)
3
Bayesian estimation
The likelihood function L(D | X) tells us how much the experimental data support the various possible statistical hypotheses on the model parameters. It is a scalar function of the model parameters: e.g. in a standard linear regression model where the variance of the error terms is known, the log-likelihood is proportional to the (negative) sum of squared errors. The likelihood is where the model structure and the data enter the problem.
4
Bayesian estimation
If Y is the model output of interest, its posterior mean and variance are
E(Y | D) = ∫ Y(X) p(X | D) dX,   V(Y | D) = ∫ [Y(X) − E(Y | D)]² p(X | D) dX.
Bayesian parameter estimates can be given in terms of the posterior mean and covariance of X in the same way.
5
Bayesian estimation
The integrated likelihood
p(D) = ∫ L(D | X) p(X) dX
is a measure of fit for the model.
6
Bayesian hypothesis testing
When a set of models Mh, h = 1, ..., H, is available, comparisons are made considering Bayes factors
B_hk = p(D | Mh) / p(D | Mk),
obtained by integrating (not maximising) the likelihood over the parameter space.
7
Bayesian Hypothesis Testing
The Bayes factor is a "summary of the evidence
provided by the data in favour of one scientific
theory, represented by a statistical model, as
opposed to another" (Kass and Raftery, 1995).
8
Bayesian estimation
Bayesian model averaging (BMA) is an approach to
modelling in which all possible sources of
uncertainty are taken into account (model
structures, model parameters and data
uncertainty) based on Bayesian theory.
9
Bayesian methods
  • Bayesian methods are an intuitively attractive solution to the problem of accounting for the different sources of uncertainty, but they present several difficulties.
  • The integrals involved can in general be hard to compute. Options are:
  • Monte Carlo methods (acceptance sampling, importance sampling), or better,
  • Markov chain Monte Carlo methods (Gibbs sampler, Metropolis-Hastings).

10
Monte Carlo methods
The most trivial way is to sample from the prior, x(i) ~ p(X), i = 1, ..., N, and weight each run by its likelihood:
E(Y | D) ≈ Σ_i Y(x(i)) L(D | x(i)) / Σ_i L(D | x(i))
11
Monte Carlo methods
This trivial route has problems since, usually, the posterior is concentrated with respect to the prior: the rate of success of this MC strategy is very small, i.e. most of the sampled x(i)'s have very small likelihood values, implying that all inference is dominated by a few points with large likelihood values and that the algorithm converges very slowly.
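This failure mode is easy to demonstrate. The sketch below is a hypothetical toy (a one-parameter linear model y = a·t with known error variance, all numbers made up): sampling from a wide prior and weighting each run by its likelihood does recover the posterior mean, but the effective sample size collapses to a tiny fraction of the runs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model y = a * t; data generated with a_true = 2
t = np.linspace(0.0, 1.0, 20)
data = 2.0 * t + rng.normal(0.0, 0.1, size=t.size)
sigma2 = 0.1 ** 2  # known error variance

# Sample from a (deliberately wide) prior
a_prior = rng.uniform(0.0, 10.0, size=100_000)

# Gaussian likelihood of each prior sample
sse = ((data - a_prior[:, None] * t) ** 2).sum(axis=1)
lik = np.exp(-0.5 * sse / sigma2)

# Posterior mean of Y = a as a likelihood-weighted average of prior samples
w = lik / lik.sum()
a_post_mean = (w * a_prior).sum()

# Effective sample size: how many points actually carry the inference
ess = 1.0 / (w ** 2).sum()
```

The estimate lands near the true value, yet the effective sample size is orders of magnitude below the number of runs, which is exactly the inefficiency described above.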
12
Monte Carlo methods
Other methods have to be applied, Gibbs and Metropolis (MCMC) being the most advisable. With samples x(i) drawn from the posterior,
E(Y | D) ≈ (1/N) Σ_i Y(x(i)),
and the integrated likelihood can be estimated, e.g. via the harmonic mean estimator
p(D) ≈ [ (1/N) Σ_i 1 / L(D | x(i)) ]⁻¹.
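A random-walk Metropolis sampler is only a few lines. The sketch below targets a hypothetical one-parameter posterior (a Gaussian log-likelihood with a flat prior on [0, 10]), chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical log-likelihood: Gaussian in a single parameter a
def log_lik(a):
    return -0.5 * (a - 2.0) ** 2 / 0.05 ** 2

# Flat prior on [0, 10]: log-posterior = log-likelihood inside the support
def log_post(a):
    return log_lik(a) if 0.0 <= a <= 10.0 else -np.inf

# Random-walk Metropolis: propose a jump, accept with prob min(1, p'/p)
a = 5.0
lp = log_post(a)
chain = []
for _ in range(20_000):
    a_new = a + rng.normal(0.0, 0.2)
    lp_new = log_post(a_new)
    if np.log(rng.uniform()) < lp_new - lp:
        a, lp = a_new, lp_new
    chain.append(a)

burned = np.array(chain[5_000:])   # discard burn-in
posterior_mean = burned.mean()
```

Unlike sampling from the prior, every retained draw sits in the high-likelihood region, so no weights are needed and posterior expectations are simple averages over the chain.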
13
Generalised Likelihood Uncertainty Estimation
(GLUE)
GLUE can be seen as an extension of RSA or as a simplification of rigorous Bayesian approaches.
Beven K.J., Binley A., The Future of Distributed Models: Model Calibration and Uncertainty Prediction, Hydrological Processes, 6, 279-298, 1992.
Romanowicz R., Beven K., Tawn J., Evaluation of predictive uncertainty in nonlinear hydrological models using a Bayesian approach. In: Statistics for the Environment 2, Water Related Issues, Barnett V., Turkman F. (eds), pp. 297-315, Wiley, New York, 1994.
14
What is GLUE
Flow diagram: MC model runs produce outputs Y(t); these are compared with the data through a weighting function, which feeds the uncertainty estimation.
15
What is GLUE
Most typical weighting functions take the form
w_i ∝ (σ_i²)^(−N),
where σ_i² is the mean square error for the i-th run.
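A full GLUE pass fits in a few lines. Everything in the sketch below is a hypothetical toy (model y = a·t, inverse-MSE weights with exponent N = 2): weight each Monte Carlo run by (σ_i²)^(−N), then read 5%/95% prediction bounds off the weighted cumulative distribution.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: Monte Carlo runs of a toy model y = a * t vs. data
t = np.linspace(0.0, 1.0, 10)
data = 2.0 * t + rng.normal(0.0, 0.1, size=t.size)

a_samples = rng.uniform(0.0, 5.0, size=5_000)   # sample from the prior
sims = a_samples[:, None] * t                    # one row per MC run
mse = ((sims - data) ** 2).mean(axis=1)          # sigma_i^2 for each run

# GLUE-style weighting function: w_i proportional to (sigma_i^2)^(-N)
N = 2
w = mse ** (-N)
w /= w.sum()

# Uncertainty estimation: weighted 5% / 95% bounds of a prediction
pred = sims[:, -1]                               # model output at t = 1
order = np.argsort(pred)
cdf = np.cumsum(w[order])
lo = pred[order][np.searchsorted(cdf, 0.05)]
hi = pred[order][np.searchsorted(cdf, 0.95)]
```

The same weighted sample serves both purposes discussed later: the bounds (lo, hi) give the uncertainty estimation, and the weights are available for sensitivity analysis of the weighting function itself.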
16
GLUE vs. RSA
GLUE, like RSA:
- works with multiple sets of factors (via MC sampling);
- model runs are weighted and ranked on a likelihood scale via conditioning on the observations, and the weights are used to formulate a cumulative distribution of predictions.
Compared with RSA, no on-off filtering is made; instead, a smooth function is used to evaluate the behaviour of the simulations.
17
GLUE vs. Bayesian methods
  • Two main problematic aspects with GLUE:
  • the definition of the weighting function is a fundamental aspect, and the uncertainty prediction can strongly depend on that definition;
  • in a Bayesian framework, this is connected to how errors in the observations and in the model structure are represented by a statistical model; in GLUE, the 'qualitative' definition of the weights, based essentially on an inverse relationship to the mean square error, makes the procedure easier and more flexible, but a bit ambiguous.

18
GLUE vs. Bayesian methods
  • the sampling strategy of GLUE has very poor efficiency properties, which can make the statistical properties of the GLUE inference of limited significance;
  • the use of importance sampling could be a first step in the direction of improving efficiency, without introducing too much complication in the methodology (e.g. Young and Romanowicz, 2003).

19
A role for SA?
The aim of applying sensitivity analysis in the Bayesian framework is to address the problem of describing the acceptable parameter structure, i.e. to characterise the parameter calibration/estimation problem.
As in RSA, complex interaction structures are typical: complex, structured subsets of factors having similar likelihood of being simulators of reality are usually identified.
20
A role for SA?
There are at least two possibilities: performing a sensitivity analysis on the model output itself, or on the likelihood. The calibration issue is addressed by performing a sensitivity analysis of the likelihood (or of the weighting function for GLUE) (Ratto et al., 2001).
21
A role for SA?
  • When trying to map back the output uncertainty to the input factors, as for RSA:
  • conventional multivariate statistics (PCA) is not very revealing (Spear et al., 1994);
  • complex parametric interactions do not become evident from looking at univariate marginal probability densities (Spear, 1997).

22
GSA in Bayesian analysis
Methodological issues for SA of the likelihood:
  • non-monotonic input-output mapping;
  • high level of interaction between input factors.

→ variance-based Global Sensitivity Analysis (GSA): quantitative methods, without restrictions about monotonicity or additivity of the model.
23
Why global SA?
  • Global SA of the likelihood means decomposing the variance of the likelihood, i.e. we are asking ourselves:
  • which parameters drive most of the variation of the likelihood?
  • Does the analysis of the variance give information on which model parameters mainly drive the goodness of the model?

24
Why global SA?
  • Sensitivity index of the likelihood:
  • S_i = V[E(L | X_i)] / V(L)
  • measures the variation of the univariate function E(L | X_i) around the unconditional mean E(L), i.e. the integrated likelihood.
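The ratio V[E(L | X_i)]/V(L) can be estimated directly from a plain Monte Carlo sample by binning X_i and computing conditional means, a common post-processing trick. The likelihood surface below is a hypothetical stand-in chosen so that one factor clearly dominates:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical likelihood surface over two factors:
# L depends strongly on x1 and only weakly on x2
x = rng.uniform(0.0, 1.0, size=(50_000, 2))
L = np.exp(-8.0 * (x[:, 0] - 0.5) ** 2) + 0.1 * x[:, 1]

def main_effect(xi, y, bins=50):
    """S_i = V[E(y | X_i)] / V(y), estimated by binning X_i."""
    edges = np.quantile(xi, np.linspace(0.0, 1.0, bins + 1))
    idx = np.clip(np.digitize(xi, edges[1:-1]), 0, bins - 1)
    cond_means = np.array([y[idx == b].mean() for b in range(bins)])
    return cond_means.var() / y.var()

S1 = main_effect(x[:, 0], L)   # large: x1 drives the likelihood variance
S2 = main_effect(x[:, 1], L)   # small: x2 barely matters
```

The binned conditional means are exactly the "variation of the univariate function around the unconditional mean" described on the slide; here they flag x1 as the parameter driving the goodness of fit.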

25
Why global SA?
  • Note
  • Decomposing the log-likelihood instead would allow factorising the likelihood!

26
Why global SA?
  • We can call E(L | X_i) the conditional integrated likelihood.
  • If X_i has a main effect, there will be values of X_i for which E(L | X_i) is significantly smaller than the integrated likelihood E(L), and other values for which it is significantly larger.
  • If we were allowed to tune (or fix) the values of X_i, we would be able to let the integrated likelihood increase (i.e. increase the goodness of the model).
  • The same holds for groups of factors, allowing us to identify interaction structures.

27
Why GSA?
  • The determination of both main effects (as in other methods) and total effects allows a classification in terms of necessary and sufficient conditions:
  • factors with a high main effect affect the model output singularly, independently of interaction;
  • factors with a small main effect but a high total effect influence the model output mainly through interaction;
  • factors with small main and total effects have a negligible effect on the model output and can be fixed or neglected (→ more powerful than RSA!).
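This classification can be reproduced numerically. The sketch below uses the standard Saltelli two-matrix sampling scheme with the usual main- and total-effect estimators, applied to a hypothetical toy function f = x1 + x2·x3, where x1 acts alone and x2, x3 act only through interaction:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical toy model with interaction: f = x1 + x2 * x3
def f(x):
    return x[:, 0] + x[:, 1] * x[:, 2]

# Saltelli scheme: two independent sample matrices A and B
n, d = 100_000, 3
A = rng.uniform(-1.0, 1.0, size=(n, d))
B = rng.uniform(-1.0, 1.0, size=(n, d))
fA, fB = f(A), f(B)
V = np.concatenate([fA, fB]).var()

S, ST = [], []
for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]                  # replace column i of A with B's
    fABi = f(ABi)
    S.append((fB * (fABi - fA)).mean() / V)          # main effect S_i
    ST.append(0.5 * ((fA - fABi) ** 2).mean() / V)   # total effect ST_i
```

For this toy, x1 has high main and total effect (it acts singularly), while x2 and x3 have near-zero main effect but a clearly positive total effect, i.e. they influence the output only through their interaction.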

28
Why GSA?
  • Groups of factors:
  • groups of factors having a high group effect affect the model output singularly, independently of interaction with parameters outside the group.

29
How do we apply GSA in Bayesian analysis?
We should do it in a way that, with the same set of model runs:
- predictive uncertainty can be estimated,
- sensitivity indices can be computed, and
- the analysis of the interaction structure can be performed.
30
How do we apply SA in Bayesian analysis?
If we sample from the prior (as in GLUE) this is very simple: factors usually have independent prior distributions, so we can design Sobol' or FAST sampling strategies. In a general context, when a sample from the posterior is available, methods based on pattern recognition and smoothing can be applied. Moreover, the decomposition of the integrated likelihood has to deal with a harmonic mean estimator, implying specific methodological issues still under analysis.
31
A chemical system
We consider an isothermal first-order irreversible reaction in a batch system, A → B.
32
A chemical system
Model solution:
y_B(t) = y_B(0) + y_A(0) [1 − exp(−k t)]
Model parameters: the kinetic rate follows the Arrhenius law, k = k∞ exp(−E/(R T)).
33
A chemical system
Input factors: k∞, E and the initial condition.
Outputs:
1) y_B(t) (time-dependent output)
2) a weighting function, as defined in GLUE
34
A chemical system
Weighting function:
w_i ∝ (σ_i²)^(−N),
where σ_i² is the mean square error for the i-th run.
35
A chemical system
Time-dependent output: experimental time series and 5% and 95% confidence bounds for the output y_B.
36
A chemical system
Sensitivity of y_B(t):
→ main effects sum up to 1 (small interaction)
→ the kinetics mainly dominates the process
37
A chemical system
Sensitivity of the weights 1/σ²:
→ no predominance in main effects
→ the kinetic parameters have strong interaction
38
A chemical system
Sensitivity of weights (likelihood)
  • There is a relationship between:
  • the difference between total and first-order sensitivity indices,
  • the indeterminacy of the optimisation (estimation) problem, and
  • the interaction structure of the input factors in the posterior distribution after conditioning on the observations.

39
A chemical system
Scatter plots → main effect sensitivity indices → RSA
40
A chemical system
  • the initial condition is unimportant (smallest ST);
  • the chemical rate factors mainly drive the model fit to the experimental data (highest Si and SiT);
  • BUT the chemical rate factors cannot be precisely estimated, because their effect is dominated by interaction terms;
  • high correlation between the estimates of k∞ and E is usually detected.

41
A chemical system
Correlation analysis: the correlation coefficient between the kinetic parameters is positive, i.e. the couple of factors acts in the model as a quotient/difference.
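The quotient-like behaviour of the kinetic couple can be seen directly from the Arrhenius law. The sketch below uses made-up numbers (k∞ is written k_inf): at a single experimental temperature, raising k∞ and E together leaves the rate k, and hence the fit, unchanged, while observations at a second temperature would separate the two parameter sets.

```python
import numpy as np

R, T = 8.314, 300.0          # gas constant, nominal temperature (made up)
t = np.linspace(0.0, 10.0, 50)

def yB(t, k_inf, E, T):
    """First-order irreversible A -> B with yA(0) = 1: yB = 1 - exp(-k t),
    with the Arrhenius rate k = k_inf * exp(-E / (R * T))."""
    k = k_inf * np.exp(-E / (R * T))
    return 1.0 - np.exp(-k * t)

# Two very different (k_inf, E) pairs tuned to the same k = 0.5 at T = 300 K
E1 = R * T * np.log(1e5 / 0.5)
E2 = R * T * np.log(1e7 / 0.5)

# At the experimental temperature the outputs are indistinguishable ...
gap_same_T = np.abs(yB(t, 1e5, E1, 300.0) - yB(t, 1e7, E2, 300.0)).max()

# ... so only the combination k is identified, not k_inf and E separately.
# Data at a second temperature breaks the compensation:
gap_other_T = np.abs(yB(t, 1e5, E1, 350.0) - yB(t, 1e7, E2, 350.0)).max()
```

This is exactly the interaction flagged by the sensitivity indices of the weights, and it anticipates why the multi-temperature data set considered next weakens the correlation.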
42
A chemical system
Let us now consider the same chemical system, but
assume that 9 sets of observations are available
at 9 different temperatures.
43
A chemical system
Sensitivity indices and correlation analysis.
44
A chemical system
  • Comparing the same model with different experimental evidence:
  • the interaction between the kinetic factors is strongly reduced;
  • the correlation structure is now very weak;
  • a model can be over-parameterised or not according to the evidence with which it is compared, implying that the calibration and sensitivity analysis exercise is useful also when the same mathematical model is used to describe different realities.

45
Caveats
Applying global SA to the weighting/likelihood function gives a full interaction structure, which obviously is not the same as a functional representation of it. The latter is something additional with respect to the global SA itself and can in some cases be a formidable task: it usually requires computationally intensive methods and/or the formulation of hypotheses about the interaction structure, introducing a certain degree of arbitrariness into the representation.
46
Caveats
  • On the other hand, the application of a global SA provides a quantitative evaluation of fundamental aspects such as:
  • which factors are important for calibration, i.e. are somehow conditioned by the observations;
  • the degree of complexity of the interaction structure;
  • which factors are involved in the interaction structure.
  • Such information has general validity, since it is obtained without assumptions about the model structure and/or the error structure.

47
Interaction
  • Possible approaches:
  • correlation analysis of the posterior pdf (B subset)
  • variance-based GSA
  • Spear's Tree-Structured Density Estimation technique
  • HDMR, cut-HDMR
  • Morris