Making rating curves: the Bayesian approach

Slides: 26
Provided by: are85

Transcript and Presenter's Notes


1
Making rating curves - the Bayesian approach
2
Rating curves: what is wanted?
  • A best estimate of the relationship between stage
    and discharge at a given place in a river.
  • The relationship should be of the form
    $Q = C(h - h_0)^b$, or a segmented version of that,
    where $Q$ is discharge and $h$ is stage.
  • It should be possible to deal with the
    uncertainty in such estimates.
  • There should also be other statistical measures
    of the quality of such a curve.
  • These measures should be easy to interpret by
    non-statisticians.

3
Making rating curves the old-fashioned way
  • For a known zero-stage $h_0$, the rating curve can be
    written as $q = a + b x$, where $q = \log(Q)$,
    $x = \log(h - h_0)$ and $a = \log(C)$.
  • For a set of measurements, one can then do linear
    regression with q as response, x as covariate and
    a and b as unknown linear parameters. Minimize the
    sum of squares (SS) analytically (standard linear
    regression).
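
The log-linear fit for a known zero-stage can be sketched in Python with NumPy. This is a minimal illustration, not the presenter's code: the function name `fit_loglinear`, the stage values and the curve constants (C = 4, b = 2.5, h0 = 1) are made up for the example, and the data are noise-free so the fit recovers the constants.

```python
import numpy as np

def fit_loglinear(h, Q, h0):
    """Least-squares fit of q = a + b*x with q = log(Q), x = log(h - h0)."""
    x = np.log(h - h0)
    q = np.log(Q)
    X = np.column_stack([np.ones_like(x), x])   # design matrix [1, x]
    coef, *_ = np.linalg.lstsq(X, q, rcond=None)
    a, b = coef
    ss = float(np.sum((q - X @ coef) ** 2))     # residual sum of squares (SS)
    return float(a), float(b), ss

# Hypothetical, noise-free measurements generated from Q = 4 * (h - 1)^2.5
h = np.array([1.2, 1.5, 2.0, 2.8, 3.5])
Q = 4.0 * (h - 1.0) ** 2.5

a, b, ss = fit_loglinear(h, Q, h0=1.0)
C = np.exp(a)   # back-transform the intercept to the curve constant C
```

With real measurements the SS would be positive, and the residuals on the log scale would estimate the noise variance.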

4
The old approach: handling $c = -h_0$
  • The problem is that the effective bottom level,
    $h_0 = -c$, is not known.
  • Solution: Minimize SS by stepping through all
    possible values of c.
  • The advantage: This is the same as maximizing the
    likelihood for the regression problem
    $q_i = a + b \log(h_i + c) + \epsilon_i$, or
    $Q_i = C (h_i - h_0)^b E_i$, where
    $\epsilon_i \sim N(0, \sigma^2)$ is iid noise and
    $E_i = e^{\epsilon_i}$.
  • This model makes hydraulic and statistical sense!
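
Stepping through values of c can be sketched as a grid search in which each candidate c gets its own linear regression. The function name `profile_ss`, the grid bounds and the synthetic data below are assumptions for illustration only; the data are noise-free so that the true c sits exactly on the grid.

```python
import numpy as np

def profile_ss(h, Q, c_grid):
    """For each c, regress log(Q) on log(h + c) and keep the smallest SS."""
    q = np.log(Q)
    best = None
    for c in c_grid:
        x = np.log(h + c)
        X = np.column_stack([np.ones_like(x), x])
        coef, *_ = np.linalg.lstsq(X, q, rcond=None)
        ss = float(np.sum((q - X @ coef) ** 2))
        if best is None or ss < best[0]:
            best = (ss, float(c), float(coef[0]), float(coef[1]))
    return best   # (SS, c, a, b) at the minimum

# Hypothetical noise-free data from Q = 4 * (h - 1)^2.5, i.e. c = -1
h = np.linspace(1.2, 4.0, 30)
Q = 4.0 * (h - 1.0) ** 2.5

# Every candidate must satisfy h_i + c > 0, so the grid starts just above -min(h).
c_grid = np.linspace(-1.19, 2.0, 320)   # step 0.01, an arbitrary choice
ss_min, c_hat, a_hat, b_hat = profile_ss(h, Q, c_grid)
```

Because SS is minimized over c on a grid, the resolution of the estimate is limited by the grid step.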

5
Problems with the old approach
  • We have prior information about curves that we
    would like to use in the estimation.
  • Inference and statistical quality measures are
    difficult to interpret.
  • Difficult to get a grip on the discharge estimate
    uncertainty.
  • There is a chance that one gets infinite
    parameter estimates using this method!

6
Bayesian statistics
  • Frequentist: treats the parameters as fixed and
    finds estimators that will catch their values
    approximately.
  • Bayesian: treats the parameters as having a
    stochastic distribution which is derived from the
    observations and from prior knowledge.
  • Bayes' theorem:
    $f(\theta \mid D) = f(D \mid \theta) f(\theta) / f(D)$,
    where f stands for a distribution, D is the data
    set and $\theta$ is the parameter set.
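
Bayes' theorem can be made concrete on a grid for a single unknown parameter. The sketch below takes $\theta = b$ and treats a, c and the noise level as known, which is a simplification made only for this example; the prior N(2, 0.5²), the grid and the synthetic data are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from q_i = a + b*log(h_i + c) + eps_i (all values assumed).
a, c, sigma = np.log(4.0), -1.0, 0.1
b_true = 2.5
h = np.linspace(1.2, 4.0, 20)
x = np.log(h + c)
q = a + b_true * x + rng.normal(0.0, sigma, h.size)

# Grid of candidate values for the single unknown parameter theta = b.
b_grid = np.linspace(1.0, 4.0, 301)

# log f(theta): assumed normal prior b ~ N(2, 0.5^2), up to a constant.
log_prior = -0.5 * ((b_grid - 2.0) / 0.5) ** 2

# log f(D | theta): Gaussian log-likelihood at each grid value, up to a constant.
resid = q[None, :] - (a + b_grid[:, None] * x[None, :])
log_lik = -0.5 * np.sum((resid / sigma) ** 2, axis=1)

# f(theta | D) = f(D | theta) f(theta) / f(D); f(D) is the normalizing sum.
log_post = log_prior + log_lik
post = np.exp(log_post - log_post.max())   # subtract the max for stability
post /= post.sum()

b_mode = float(b_grid[np.argmax(post)])
b_mean = float(np.sum(b_grid * post))
```

The grid approach stops being practical with several parameters, which is one reason sampling methods are used for the full parameter set.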

7
Prior knowledge
  • Prior info about a and b can be obtained from
    already generated rating curves (using the
    frequentist approach) or from hydraulic
    principles.
  • Prior info about the noise can be obtained from
    knowledge about the measurements.
  • Problem: It is difficult to set the prior for the
    location parameter $h_0 = -c$, but we know it will
    not be far below the stage measurements.

8
Prior knowledge of a and b from the database
[Figure: histogram of the generated values of a from the database. The normal approximation seems OK.]
[Figure: histogram of the generated values of b from the database. The normal approximation fits less well, but is used for practical reasons.]
9
Bayesian regression
  • The distribution of the data given the parameters
    is the same here:
    $q_i = a + b \log(h_i + c) + \epsilon_i$, with
    $D = \{h_i, q_i\}_{i=1}^{n}$.
  • Problem: Even though we have prior info, this
    does not give us the form of the prior $f(\theta)$,
    $\theta = (a, b, c, \sigma^2)$.
  • If the priors are of a certain form, one can do
    Bayesian linear regression analytically:
    $q_i = a + b x_i + \epsilon_i$ with
    $x_i = \log(h_i + c)$, for a given c.
  • Same thought as for the frequentist approach:
    handle a, b and $\sigma^2$ using a linear model, and
    handle c using discretization.

10
Problems with Bayesian regression
  • While this gives us the form of $f(a, b, \sigma^2)$,
    it does not give us the form of $f(c)$.
  • We know that the stage levels are not too far
    above the zero-level. We'd like to code this
    prior info, but we don't want to use the stage
    measurements twice (both in the prior and in the
    likelihood).
  • Jeffreys priors containing the covariates are a
    general problem with the Bayesian regression
    approach! That is OK if you really are in a
    regression setting, but this is not the case here.

11
Problems with the first Bayesian approach
  • The form that makes the linear regression
    analytical is rather strange.
  • It requires a particular form of the prior for
    $\sigma^2$, which influences the priors for (a, b).
    However, prior info about these two would be
    better kept separate.
  • It is difficult for users to set the prior info.
  • Expected discharge is infinite in this approach!
    (Median will be finite.)

12
A new Bayesian regression approach
  • Using a semi-conjugate prior, $(a, b) \sim N_2$,
    independent of $\sigma^2 \sim IG$, we separate prior
    knowledge about (a, b) and $\sigma^2$.
  • We can no longer handle $(a, b, \sigma^2)$
    analytically for known c.
  • However, $(a, b, c, \sigma^2)$ can be sampled using
    MCMC methods.
  • The sampling method must be efficient, since
    users do not want to wait too long for the results.
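
For a fixed c, a semi-conjugate prior gives closed-form conditional distributions, so (a, b) and $\sigma^2$ can be drawn in turn. The sketch below is a standard Gibbs sampler under that assumption, not necessarily the presenter's implementation; the function name `gibbs_given_c`, the prior values (`m0`, `V0`, `alpha0`, `beta0`) and the synthetic data are hypothetical.

```python
import numpy as np

def gibbs_given_c(h, q, c, m0, V0, alpha0, beta0, n_iter=2000, seed=0):
    """Gibbs draws of (a, b, sigma2) for fixed c.

    Assumed prior: (a, b) ~ N2(m0, V0), sigma2 ~ IG(alpha0, beta0), independent.
    """
    rng = np.random.default_rng(seed)
    x = np.log(h + c)
    X = np.column_stack([np.ones_like(x), x])
    n = q.size
    V0_inv = np.linalg.inv(V0)
    XtX, Xtq = X.T @ X, X.T @ q
    sigma2 = 1.0                          # arbitrary starting value
    draws = np.empty((n_iter, 3))
    for t in range(n_iter):
        # (a, b) | sigma2, c is bivariate normal
        V_n = np.linalg.inv(V0_inv + XtX / sigma2)
        V_n = (V_n + V_n.T) / 2.0         # remove round-off asymmetry
        m_n = V_n @ (V0_inv @ m0 + Xtq / sigma2)
        ab = rng.multivariate_normal(m_n, V_n)
        # sigma2 | a, b, c is inverse gamma: draw the precision from a gamma
        ss = np.sum((q - X @ ab) ** 2)
        sigma2 = 1.0 / rng.gamma(alpha0 + n / 2.0, 1.0 / (beta0 + ss / 2.0))
        draws[t] = ab[0], ab[1], sigma2
    return draws

# Hypothetical data: C = 4, b = 2.5, c = -1, noise sd 0.1 on the log scale
rng = np.random.default_rng(2)
h = np.linspace(1.2, 4.0, 30)
q = np.log(4.0) + 2.5 * np.log(h - 1.0) + rng.normal(0.0, 0.1, h.size)

draws = gibbs_given_c(h, q, c=-1.0, m0=np.array([1.0, 2.0]),
                      V0=np.eye(2), alpha0=2.0, beta0=0.02)
kept = draws[500:]                        # discard a burn-in period
```

The columns of `kept` are draws of a, b and $\sigma^2$; sampling c as well requires an additional step, since its conditional distribution has no standard form.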

13
A graphical overview of the new model
[Figure: graphical model. Layers, top to bottom: hyper-parameters (symbols lost in extraction); parameters a, b and $\sigma^2$; measurements $q_i$ and $h_i$, for i in 1, ..., number of measurements.]
14
Sampling methods and efficiency
  • Naïve MCMC: The Metropolis algorithm. Problem:
    (a, b, c) are extremely mutually dependent.
  • Metropolis or independence sampler for c, Gibbs
    sampling for $(a, b, \sigma^2)$. The dependency of
    (a, b, c) makes trouble here, too.
  • Solution: Sample $(a, b, c, \sigma^2)$ together and
    then do a Metropolis-Hastings acceptance step.
    Sample c using first adaptive Metropolis, then an
    independence sampler. Sample $(a, b, \sigma^2)$ given
    c and the previous $\sigma^2$ using Gibbs-like
    sampling. Then accept or reject all four together.
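
The second scheme in the list (a Metropolis step for c combined with Gibbs steps for the other parameters) is the easiest to write down, so the sketch below implements that one. It is explicitly not the joint propose-then-accept scheme given as the solution, and it can mix slowly because of the dependency noted above. The function name `sample_rating_curve`, the uniform prior for c on (-min(h), `c_max`), the step size and the prior values are all assumptions.

```python
import numpy as np

def sample_rating_curve(h, q, m0, V0, alpha0, beta0, c_max,
                        n_iter=3000, step=0.05, seed=0):
    """Metropolis-within-Gibbs draws of (a, b, c, sigma2)."""
    rng = np.random.default_rng(seed)
    n = q.size
    c_min = -h.min()                    # h_i + c > 0 must hold for every i
    V0_inv = np.linalg.inv(V0)
    c, sigma2 = c_min + 0.5, 1.0        # arbitrary starting values
    ab = np.array(m0, dtype=float)
    draws = np.empty((n_iter, 4))
    accepted = 0

    def design(c_val):
        x = np.log(h + c_val)
        return np.column_stack([np.ones_like(x), x])

    for t in range(n_iter):
        # Random-walk Metropolis step for c given (a, b, sigma2).
        # Proposals outside the assumed uniform prior support are rejected.
        c_prop = c + step * rng.normal()
        if c_min < c_prop < c_max:
            ss_old = np.sum((q - design(c) @ ab) ** 2)
            ss_new = np.sum((q - design(c_prop) @ ab) ** 2)
            if np.log(rng.uniform()) < (ss_old - ss_new) / (2.0 * sigma2):
                c = c_prop
                accepted += 1
        # Gibbs steps for (a, b) and sigma2 given the current c.
        X = design(c)
        V_n = np.linalg.inv(V0_inv + X.T @ X / sigma2)
        V_n = (V_n + V_n.T) / 2.0
        m_n = V_n @ (V0_inv @ m0 + X.T @ q / sigma2)
        ab = rng.multivariate_normal(m_n, V_n)
        ss = np.sum((q - X @ ab) ** 2)
        sigma2 = 1.0 / rng.gamma(alpha0 + n / 2.0, 1.0 / (beta0 + ss / 2.0))
        draws[t] = ab[0], ab[1], c, sigma2
    return draws, accepted / n_iter

# Hypothetical data: C = 4, b = 2.5, c = -1, noise sd 0.05 on the log scale
rng = np.random.default_rng(5)
h = np.linspace(1.2, 4.0, 30)
q = np.log(4.0) + 2.5 * np.log(h - 1.0) + rng.normal(0.0, 0.05, h.size)

draws, acc_rate = sample_rating_curve(h, q, m0=np.array([1.0, 2.0]),
                                      V0=np.eye(2), alpha0=2.0, beta0=0.02,
                                      c_max=2.0)
```

Inspecting trace plots of the c column and the acceptance rate is a reasonable way to see the slow mixing that motivates sampling all four parameters together.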

[Figure: sampling scheme from iteration i-1 to iteration i, showing $\sigma^2_{i-1}$, $\sigma^2_i$, $(a_i, b_i)$ and $c_i$.]
15
Estimation based on simulations
  • We can estimate parameters using the sampled
    parameters by either taking the mean or the
    median.
  • We can estimate the discharge for a given stage
    value, either by mean or median discharge from
    the sampled parameters or by discharge from the
    mean or median parameters.
  • Simulations show that median is better than mean.

16
Inference based on simulations
  • Uncertainty in the parameters can be established
    by looking at the variance of sampled parameters.
  • Credibility intervals can be arrived at from the
    quantiles of the parameters.
  • Discharge uncertainty and credibility intervals
    can be obtained in the same way, from the
    discharges computed for each set of drawn
    parameters.
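
Quantile-based intervals take only a few lines once draws exist. The sketch below again uses random numbers in place of real sampler output; the helper name `credible_interval`, the 95% level and the numeric values are assumptions.

```python
import numpy as np

def credible_interval(samples, level=0.95):
    """Equal-tailed interval from the empirical quantiles of the samples."""
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(samples, [tail, 1.0 - tail])
    return float(lo), float(hi)

# Stand-in for MCMC output (values assumed for illustration).
rng = np.random.default_rng(4)
n_draws = 4000
a = rng.normal(np.log(4.0), 0.10, n_draws)
b = rng.normal(2.5, 0.05, n_draws)
c = rng.normal(-1.0, 0.02, n_draws)

# Parameter uncertainty: standard deviation and interval for b
b_sd = float(b.std(ddof=1))
b_lo, b_hi = credible_interval(b)

# Discharge uncertainty at one stage: transform every draw, then take quantiles
h_new = 3.0
Q_draws = np.exp(a + b * np.log(h_new + c))
Q_lo, Q_hi = credible_interval(Q_draws)
```

Repeating the last three lines over a range of stage values gives an uncertainty band around the whole curve.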

17
Example: rating curve with uncertainty
18
Example: prior to posterior
[Figure: prior of b.]
[Figure: posterior of b.]
19
Example: diagnostic plots
[Figure: scatter plot of simultaneous samples of a and b.] Note the extreme correlation between the
parameters.
[Figure: residuals.] Note the trumpet form. There is
heteroscedasticity here, which the model does not
catch.
20
What has been achieved
  • Discharge estimates with lower RMSE than
    frequentist estimates.
  • Measures of estimation uncertainty that are easy
    to interpret.
  • Hopefully, quality measures should be less
    difficult to understand.
  • The distribution of parameters can be used for
    decision problems. (Should we do more
    measurements?)

21
What remains
  • Multiple segmentation.
  • Need to find good quality measures in addition to
    estimation uncertainty. Possibility: Calculate
    the posterior probability of more advanced
    models.
  • Learning about the priors: a hierarchical
    approach.
  • There is still some prior knowledge that has not
    found its way into the model, namely the distance
    between zero-stage and stage measurements.
  • Heteroscedasticity ought to be removed.
  • Should have a prior on b that more closely
    reflects both prior knowledge (positive b) and the
    database collection of estimates, for example a
    log-normal prior, $b \sim \log N$. But this
    introduces problems with efficiency.

22
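A graphical view of the model and a tool for a hierarchical approach
[Figure: hierarchical graphical model. Layers, top to bottom: a distribution with or without hyper-parameters; hyper-parameters (symbols partly lost in extraction; $V_a$ and $V_b$ survive); parameters $a_j$, $b_j$ and $\sigma_j^2$, for j in 1, ..., number of stations; measurements $h_{j,i}$ and $q_{j,i}$, for i in 1, ..., number of measurements for station j.]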
23
Solution to the prior for h + c
  • Possible to go from a regression situation to a
    model that has both stochastic discharge and
    stage values.
  • Possibility: A structural model where the real
    discharge has a distribution. The real stage is a
    deterministic function of the real discharge,
    given the curve parameters (a, b, c).
    Observations, $D = \{(q_i, h_i)\}$, are the real
    values plus noise.
  • The model gives a more realistic description of
    what happens in the real world. It also codes the
    prior knowledge about the difference between
    stage measurements and zero-stage, through the
    distribution of q and the distribution of (a, b).
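
Simulating from such a structural model shows how it ties the stage measurements to the zero-stage. The sketch below is one possible reading of the slide, with several assumptions stated plainly: the latent log-discharge is normal, the noise is additive on log-discharge and on stage, and the names `simulate_structural`, `mu_q`, `sd_q`, `sd_eps` and `sd_h` and all numeric values are hypothetical.

```python
import numpy as np

def simulate_structural(n, a, b, c, mu_q, sd_q, sd_eps, sd_h, seed=0):
    """Draw latent (real) values and noisy observations.

    Assumed structure: q_true ~ N(mu_q, sd_q^2); the real stage follows from
    inverting q = a + b*log(h + c); observations add independent normal noise.
    """
    rng = np.random.default_rng(seed)
    q_true = rng.normal(mu_q, sd_q, n)              # latent log-discharge
    h_true = np.exp((q_true - a) / b) - c           # stage implied by the curve
    q_obs = q_true + rng.normal(0.0, sd_eps, n)     # measured log-discharge
    h_obs = h_true + rng.normal(0.0, sd_h, n)       # measured stage
    return q_true, h_true, q_obs, h_obs

a, b, c = np.log(4.0), 2.5, -1.0
q_true, h_true, q_obs, h_obs = simulate_structural(
    n=50, a=a, b=b, c=c, mu_q=2.0, sd_q=1.0, sd_eps=0.05, sd_h=0.01)
```

In this formulation the real stage always lies above the zero-stage $-c$, and how far above is governed by the distribution of q together with (a, b).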

24
Structural model: a graphical view
[Figure: structural graphical model. Layers, top to bottom: a distribution with or without hyper-parameters; hyper-parameters; parameters a, b, c and the noise variances; latent variables; measurements $q_i$ and $h_i$. Most symbols were lost in extraction.]
25
Advantages and problems of a structural model
  • Advantages:
  • More realistic modelling of the measurements and
    the underlying structure.
  • Codes prior knowledge about the relationship
    between stage measurements and the zero-stage.
  • Can solve heteroscedasticity.
  • Gives a more detailed picture of how measurement
    errors occur.
  • Since b cannot be sampled using Gibbs, we might
    as well use a form that ensures a positive
    exponent.
  • Problems:
  • Difficult to make an efficient algorithm.
  • More complex. Thus even if it codes more prior
    knowledge, the estimates might be more uncertain.
    This has not been tested.