1
Ensemble Methods for Assimilation and Prediction
Olivier Talagrand
Laboratoire de Météorologie Dynamique, École Normale Supérieure, Paris, France
Acknowledgements to F. Atger, G. Candille, L. Descamps and many others
Meeting of the Groupe Statistiques pour l'Analyse, la Modélisation et l'Assimilation,
Institut Pierre-Simon Laplace pour les Sciences de l'Environnement Global
Paris, 14 February 2008
3
ECMWF, Technical Report 499, 2006
4
  • As far as we can tell, weather forecasts will always be
    affected by uncertainty that is not negligible for the user.
  • It is important to quantify this uncertainty a priori
    (example: a contractor who must decide whether or not to open
    a construction site when there is a risk of frost).
  • The uncertainty varies from one situation to another.
  • The initial conditions of the forecast, which come from
    assimilation, are also affected by uncertainty, if only
    because they are defined with finite spatial resolution.
  • It is also important to quantify the uncertainty on the
    initial conditions, if only to deduce from it the resulting
    uncertainty on the forecast (butterfly effect).

5
  • This has led to the implementation of Ensemble Methods, in
    which the uncertainty on the state of the flow is
    represented, not by error bars, but by an ensemble of states
    whose dispersion is meant to sample that uncertainty.
  • Operational Ensemble Prediction at ECMWF and at NCEP (USA)
    since 1992. Other meteorological services have followed
    (Meteorological Service of Canada, UK Meteorological Office,
    Météo-France).
  • Ensemble Assimilation: Ensemble Kalman Filter (EnKF, used
    operationally at MSC, and as a research tool in many
    places), particle filters.
  • Size of ensembles: N = O(10-100)

6
  • Talagrand et al., ECMWF, 1999, T850, 6-day range

7
  • Ensemble Assimilation
  • Ensemble Prediction
  • Objective validation of Ensemble Estimation Methods

11
  • Numerical model, built on the physical laws governing the
    flow
  • Conservation of mass
  • $D\rho/Dt + \rho\,\mathrm{div}\,U = 0$
  • Conservation of energy
  • $De/Dt - (p/\rho^2)\,D\rho/Dt = Q$
  • Conservation of momentum
  • $DU/Dt + (1/\rho)\,\mathrm{grad}\,p - g + 2\,\Omega \times U = F$
  • Equation of state
  • $f(p, \rho, e) = 0$ ($p/\rho = rT$, $e = C_v T$)
  • Conservation of mass of secondary components (water for the
    atmosphere, salt for the ocean, ...)
  • $Dq/Dt + q\,\mathrm{div}\,U = S$

12
  • Purpose of assimilation: reconstruct as
    accurately as possible the state of the
    atmosphere (the ocean, or whatever the system of
    interest is), using all available appropriate
    information. The latter essentially consists of:
  • The observations.
  • The physical laws governing the system, available
    in practice in the form of a discretized, and
    necessarily approximate, numerical model.
  • Asymptotic properties of the flow, such as,
    e.g., geostrophic balance of middle latitudes.
    Although they basically are necessary
    consequences of the physical laws which govern
    the flow, these properties can usefully be
    explicitly introduced in the assimilation
    process.

13
  • Both observations and model are affected with
    some uncertainty ⇒ uncertainty on the estimate.
  • For some reason, uncertainty is conveniently
    described by probability distributions (don't
    know too well why, but it works).
  • Assimilation is a problem in Bayesian
    estimation.
  • Determine the conditional probability
    distribution for the state of the system, knowing
    everything we know (unambiguously defined if a
    prior probability distribution is defined; see
    Tarantola, 2005).

14
  • Sequential Assimilation
  • Assimilating model is integrated over period of
    time over which observations are available.
    Whenever model time reaches an instant at which
    observations are available, state predicted by
    the model is updated with new observations.
  • Variational Assimilation
  • Assimilating model is globally adjusted to
    observations distributed over observation period.
    Achieved by minimization of an appropriate scalar
    objective function measuring misfit between data
    and sequence of model states to be estimated.

15
  • Sequential assimilation. At time k:
  • - Background (coming from the past)
  • $x^b_k = x_k + \zeta^b_k$
  • - Vector of observations at time k
  • $y_k = H_k(x_k) + \varepsilon_k$
  • where $H_k$ is the known observation operator
  • ⇒ Analyzed state
  • $x^a_k = x^b_k + K\,[y_k - H_k(x^b_k)]$
  • $d_k \equiv y_k - H_k(x^b_k)$ is the innovation vector

16
After A. Lorenc
17
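  • More precisely,
  • $x^b = x + \zeta^b$, $E(\zeta^b) = 0$, $E(\zeta^b \zeta^{bT}) \equiv P^b$
  • $y = Hx + \varepsilon$, $E(\varepsilon) = 0$, $E(\varepsilon \varepsilon^T) \equiv R$ (H linear)
  • then the least-variance estimate for x is
  • $x^a = x^b + P^b H^T [H P^b H^T + R]^{-1} (y - H x^b)$
  • If errors $\zeta^b$ and $\varepsilon$ are Gaussian,
    $\zeta^b \sim N[0, P^b]$, $\varepsilon \sim N[0, R]$, this
    achieves Bayesian estimation, in the sense that the
    conditional probability distribution for x, knowing $x^b$
    and $y$, is
  • $P(x \mid x^b, y) = N[x^a, P^a]$
  • with
  • $P^a = P^b - P^b H^T [H P^b H^T + R]^{-1} H P^b$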

18
  • Temporal dimension
  • Evolution equation
  • $x_{k+1} = M_k x_k + \eta_k$
  • $E(\eta_k) = 0$, $E(\eta_k \eta_j^T) \equiv Q_k \delta_{kj}$
  • $E(\eta_k \varepsilon_j^T) = 0$
  • where $M_k$ is the known linear model, and $\eta_k$ is the
    model error
  • Then the estimation error covariance matrix evolves in time
    according to
  • $P^b_{k+1} = M_k P^a_k M_k^T + Q_k$

19
  • Sequential assimilation assumes the form of the
    Kalman filter
  • Background $x^b_k$ and associated error covariance
    matrix $P^b_k$ known
  • Analysis step
  • $x^a_k = x^b_k + P^b_k H_k^T [H_k P^b_k H_k^T + R_k]^{-1} (y_k - H_k x^b_k)$
  • $P^a_k = P^b_k - P^b_k H_k^T [H_k P^b_k H_k^T + R_k]^{-1} H_k P^b_k$
  • Forecast step
  • $x^b_{k+1} = M_k x^a_k$
  • $P^b_{k+1} = M_k P^a_k M_k^T + Q_k$
  • Optimal if errors are uncorrelated in time.
    Achieves Bayesian estimation if errors are
    Gaussian.
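
The analysis and forecast steps above can be sketched in a few lines of NumPy. This is a minimal illustration only: the function names and the two-variable toy system (state, observation operator, model and covariances) are assumptions chosen for the example, not taken from the presentation.

```python
import numpy as np

def kalman_analysis(xb, Pb, y, H, R):
    """Analysis step: update background (xb, Pb) with observation y."""
    S = H @ Pb @ H.T + R                # innovation covariance H Pb H^T + R
    K = Pb @ H.T @ np.linalg.inv(S)     # gain matrix
    xa = xb + K @ (y - H @ xb)          # analysed state
    Pa = Pb - K @ H @ Pb                # analysis error covariance
    return xa, Pa

def kalman_forecast(xa, Pa, M, Q):
    """Forecast step: propagate state and covariance with linear model M."""
    return M @ xa, M @ Pa @ M.T + Q

# Hypothetical toy system: two state variables, the first one is observed.
xb = np.array([1.0, 0.0])
Pb = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
y = np.array([3.0])
M = np.array([[1.0, 1.0], [0.0, 1.0]])
Q = 0.1 * np.eye(2)

xa, Pa = kalman_analysis(xb, Pb, y, H, R)
xb_next, Pb_next = kalman_forecast(xa, Pa, M, Q)
print(xa, xb_next)
```

With equal background and observation variances, the analysed value of the observed variable lies halfway between background and observation, and its variance is halved.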

20
  • Ensemble filters (Evensen, Anderson, ...)
  • Uncertainty is represented, not by a covariance
    matrix, but by an ensemble of point estimates in
    state space which are meant to sample the
    conditional probability distribution for the
    state of the system (dimension N = O(10-100)).
  • Ensemble is evolved in time through the full
    model, which eliminates any need for a linear
    hypothesis as to the temporal evolution.

21
  • How to update the predicted ensemble with new
    observations?
  • Predicted ensemble at time t: $\{x^b_n\}$, n = 1, ..., N
  • Observation vector at same time: $y = Hx + \varepsilon$
  • Gaussian approach
  • Produce sample of probability distribution for
    real observed quantity Hx
  • $y_n = y - \varepsilon_n$
  • where $\varepsilon_n$ is distributed according to the
    probability distribution for observation error $\varepsilon$.
  • Then use Kalman formula to produce sample of
    analysed states
  • $x^a_n = x^b_n + P^b H^T [H P^b H^T + R]^{-1} (y_n - H x^b_n)$,
    n = 1, ..., N (2)
  • where $P^b$ is the covariance matrix of the predicted
    ensemble $\{x^b_n\}$.
  • In the linear case, and if errors are Gaussian,
    (2) achieves Bayesian estimation, in the sense
    that $\{x^a_n\}$ is a sample of the conditional probability
    distribution for x, given all data up to time t.
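
A minimal sketch of update (2) with perturbed observations follows. The function name `enkf_update`, the array layout (one ensemble member per row) and the toy numbers are assumptions made for illustration; observation error is assumed Gaussian with covariance R.

```python
import numpy as np

def enkf_update(Xb, y, H, R, rng):
    """Perturbed-observation update of an ensemble Xb of shape (N, n)."""
    N = Xb.shape[0]
    # Pb: sample covariance of the predicted ensemble
    Pb = np.atleast_2d(np.cov(Xb, rowvar=False))
    K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)
    # y_n = y - eps_n, with eps_n drawn from the observation error distribution
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    Yn = y - eps
    # Kalman formula applied to each member (rows are members)
    return Xb + (Yn - Xb @ H.T) @ K.T

# Hypothetical toy case: 500 members, 2 state variables, first one observed.
rng = np.random.default_rng(0)
Xb = rng.normal(0.0, 1.0, size=(500, 2))
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
y = np.array([2.0])

Xa = enkf_update(Xb, y, H, R, rng)
print(Xa[:, 0].mean(), Xa[:, 0].var())
```

The analysed ensemble is pulled toward the observation and its spread in the observed variable is reduced, as expected from the Kalman formula.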

22
  • I. Hoteit, Doctoral Dissertation, Université
    Joseph Fourier, Grenoble, 2001

23
  • Problem: ensemble collapse and spurious
    long-distance correlations
  • For relatively small ensemble dimensions (N < 100),
    ensembles tend to collapse, leading to divergence
    of the filter (too much weight is given to the
    background, and too little to observations, with
    the consequence that the ensemble progressively
    drifts from the latter).
  • In addition, spurious correlations occur at long
    distance.
  • Ad hoc a posteriori empirical remedies
  • - Covariance inflation
  • - Localization. Termwise multiplication (Schur,
    aka Hadamard, product) of ensemble covariance
    matrix by covariance matrix with bounded range.
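
Both remedies can be illustrated on a one-dimensional grid. The sketch below is an assumption-laden toy: the multiplicative form of inflation, the triangular taper, the inflation factor and the localization length are all chosen for the example (operational systems commonly use smoother compactly supported functions such as Gaspari-Cohn).

```python
import numpy as np

def inflate(Xb, factor):
    """Multiplicative inflation: scale deviations from the ensemble mean."""
    mean = Xb.mean(axis=0)
    return mean + factor * (Xb - mean)

def localize(Pb, coords, length):
    """Schur (termwise) product of Pb with a bounded-range correlation matrix."""
    dist = np.abs(coords[:, None] - coords[None, :])
    taper = np.clip(1.0 - dist / length, 0.0, None)  # triangular, zero beyond `length`
    return Pb * taper

# Hypothetical toy case: 20 members on a 10-point one-dimensional grid.
rng = np.random.default_rng(1)
Xb = rng.normal(size=(20, 10))
coords = np.arange(10.0)

Pb = np.cov(Xb, rowvar=False)
Pb_loc = localize(Pb, coords, length=3.0)
Pb_infl = np.cov(inflate(Xb, 1.1), rowvar=False)
print(Pb[0, 9], Pb_loc[0, 9])
```

Localization leaves the variances untouched and sets covariances beyond the chosen range to zero, while inflation by a factor r multiplies the whole ensemble covariance by r².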

24
  • Ensemble Kalman Filter exists in many variants
    (Hamill, Bishop, Toth, ...).
  • Can be extended to estimation of the whole history of
    the system up to time t (Ensemble Kalman Smoother,
    Evensen and van Leeuwen). Memory requirements.
  • In any case, optimality always requires errors to
    be independent in time.

25
  • Ensemble Transform Kalman Filter (ETKF, Wang and
    Bishop, 2003)
  • Uses a predefined analysis (and is therefore not
    an assimilation method in itself). Deviations of
    the background ensemble from the analysis
    background are transformed through a
    transformation matrix T so as to produce
    deviations from the analysis with approximate
    covariance matrix
  • $P^a = P^b - P^b H^T [H P^b H^T + R]^{-1} H P^b$
  • where Pb is, as in EnKF, the background ensemble
    covariance matrix.

26
  • Exact Bayesian estimation
  • Particle filters
  • Predicted ensemble at time t: $\{x^b_n\}$, n = 1, ..., N,
    each element with its own weight (probability) $P(x^b_n)$
  • Observation vector at same time: $y = Hx + \varepsilon$
  • Bayes formula
  • $P(x^b_n \mid y) \propto P(y \mid x^b_n)\, P(x^b_n)$
  • Defines updating of weights
  • Remarks
  • Many variants exist, including possible
    regeneration of ensemble elements
  • If errors are correlated in time, explicit
    computation of P(y ? xbn) will require using past
    data that are correlated with y (same remark for
    evolution of ensemble between two observation
    times)
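
The weight update defined by the Bayes formula above can be sketched as follows, assuming a Gaussian observation error with covariance R and errors uncorrelated in time. The function name and the three-particle example are illustrative assumptions.

```python
import numpy as np

def update_weights(Xb, w, y, H, R):
    """Bayes update of particle weights: w_n <- P(y | x_n) * w_n, normalized."""
    d = y - Xb @ H.T                                   # innovations, shape (N, p)
    Rinv = np.linalg.inv(R)
    # Gaussian log-likelihood of y given each particle, up to a constant
    log_lik = -0.5 * np.einsum("ni,ij,nj->n", d, Rinv, d)
    log_w = np.log(w) + log_lik
    log_w -= log_w.max()                               # guard against underflow
    w_new = np.exp(log_w)
    return w_new / w_new.sum()

# Hypothetical toy case: three equally weighted scalar particles.
Xb = np.array([[0.0], [1.0], [2.0]])
w = np.full(3, 1.0 / 3.0)
H = np.array([[1.0]])
R = np.array([[1.0]])
y = np.array([2.0])

w_new = update_weights(Xb, w, y, H, R)
print(w_new)
```

The particle closest to the observation receives the largest weight. Working with log-weights is a common safeguard, since likelihoods in high dimension quickly underflow.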

27
  • van Leeuwen, 2003, Mon. Wea. Rev., 131, 2071-2084

28
  • Exact Bayesian estimation
  • Acceptance-rejection
  • Bayes formula
  • $f(x) \equiv P(x \mid y) = P(y \mid x)\, P(x) / P(y)$
  • defines probability density function for x.
  • Construct sample of that pdf as follows.
  • Draw randomly couple $(\xi, \psi) \in S \times [0, \sup f]$.
  • Keep $\xi$ if $\psi < f(\xi)$. $\xi$ is then distributed
    according to f(x).
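
A one-dimensional sketch of the acceptance-rejection procedure above is given below. The target density (an unnormalized Gaussian), the interval taken for S and the number of draws are assumptions for the example; note that f only needs to be known up to a constant factor, provided the bound on f is expressed on the same scale.

```python
import numpy as np

def accept_reject(f, lo, hi, f_sup, n_draws, rng):
    """Draw (xi, psi) uniformly in [lo, hi] x [0, f_sup]; keep xi if psi < f(xi)."""
    xi = rng.uniform(lo, hi, size=n_draws)
    psi = rng.uniform(0.0, f_sup, size=n_draws)
    return xi[psi < f(xi)]

# Hypothetical target: unnormalized Gaussian with mean 1 and standard deviation 0.5.
def f(x):
    return np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)

rng = np.random.default_rng(2)
sample = accept_reject(f, lo=-2.0, hi=4.0, f_sup=1.0, n_draws=200_000, rng=rng)
print(sample.size, sample.mean(), sample.std())
```

Only a fraction of the draws is kept (the ratio of the area under f to the area of the sampling box), which gives a first idea of why the method becomes costly in large dimension.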

30
Miller, Carter and Blue, 1999, Tellus, 51A,
167-194
31
  • Acceptance-rejection
  • Seems costly.
  • Requires capability of permanently interpolating the
    probability distribution defined by a finite sample to
    the whole state space.

32
  • Q. Is it possible to develop fully Bayesian
    algorithms for systems with dimensions
    encountered in meteorology and oceanography?
    Would that require totally new algorithmic
    developments?
  • Q. Is it possible to have at the same time the
    advantages of both ensemble estimation and
    variational assimilation (propagation of
    information both forward and backward in time,
    and, more importantly, possibility to take
    temporal dependence into account)?

33
  • Ensemble Prediction
  • Forecasts of the same ensemble differ through
    initial conditions, but also through specific
    features in the forecast model (stochastic
    physics at ECMWF), and more and more through the
    model itself (multi-model ensembles).
  • Initial conditions are defined through various
    procedures: singular modes (ECMWF), bred modes
    (NCEP), Ensemble Kalman Filter (MSC), Ensemble
    Transform Kalman Filter (Met Office, UK).

34
Reliability diagram, NCEP, event T850 > Tc - 4°C,
2-day range, Northern Atlantic Ocean,
December 1998 - February 1999
35
  • Statistical consistency between prediction and
    observation
  • Rain must occur with frequency 40% in the
    circumstances when it has been predicted to occur
    with probability 40%.
  • Observed frequency of occurrence p'(p) of event,
    given that it has been predicted to occur with
    probability p, must be equal to p.
  • For any p, p'(p) = p
  • Reliability
  • More generally, frequency distribution of
    observation F'(F), given that probability
    distribution F has been predicted, must be equal
    to F.
  • For any F, F'(F) = F
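
The condition p'(p) = p is what a reliability diagram checks: forecasts are grouped by predicted probability, and the observed frequency of the event is computed in each group. A minimal sketch, with a hypothetical function name, ten equal-width bins and synthetic data from a system that is reliable by construction:

```python
import numpy as np

def reliability_curve(p_pred, occurred, n_bins=10):
    """Mean predicted probability and observed frequency in each probability bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(p_pred, bins) - 1, 0, n_bins - 1)
    p_mean = np.full(n_bins, np.nan)
    freq = np.full(n_bins, np.nan)
    for b in range(n_bins):
        sel = idx == b
        if sel.any():
            p_mean[b] = p_pred[sel].mean()   # abscissa of the diagram
            freq[b] = occurred[sel].mean()   # observed frequency p'(p)
    return p_mean, freq

# Synthetic reliable system: the event occurs with exactly the predicted probability.
rng = np.random.default_rng(3)
p_pred = rng.uniform(0.0, 1.0, size=100_000)
occurred = rng.uniform(size=p_pred.size) < p_pred

p_mean, freq = reliability_curve(p_pred, occurred)
print(np.round(p_mean, 2))
print(np.round(freq, 2))
```

For a reliable system the two printed rows agree to within sampling noise, so the points of the diagram fall on the diagonal.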

36
Buizza et al., MWR, 2005, 1076-1097
37
  • More generally, for a given scalar variable,
    Reduced Centred Random Variable
    (RCRV, Candille et al., 2006)
  • $s \equiv (\xi - \mu)/\sigma$
  • where $\xi$ is the verifying observation, and $\mu$ and
    $\sigma$ are respectively the expectation and the standard
    deviation of the predicted probability distribution.
  • Over a large number of realizations of a reliable
    probabilistic prediction system
  • $E(s) = 0$, $E(s^2) = 1$
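
A sketch of the RCRV diagnostic on synthetic data is given below. It assumes that the expectation and standard deviation of the predicted distribution are estimated from the ensemble itself, and that the verifying observation is error-free; the function name and all numbers are illustrative.

```python
import numpy as np

def rcrv(obs, ens):
    """Reduced centred random variable s for each realization.

    obs: (T,) verifying observations; ens: (T, N) ensemble values.
    """
    mu = ens.mean(axis=1)
    sigma = ens.std(axis=1, ddof=1)
    return (obs - mu) / sigma

# Synthetic reliable system: observation drawn from the same distribution as the members.
rng = np.random.default_rng(4)
T, N = 20_000, 50
centre = rng.normal(size=T)                    # situation-dependent expectation
spread = rng.uniform(0.5, 2.0, size=T)         # situation-dependent standard deviation
ens = centre[:, None] + spread[:, None] * rng.normal(size=(T, N))
obs = centre + spread * rng.normal(size=T)

s = rcrv(obs, ens)
print(s.mean(), (s ** 2).mean())
```

The mean of s is close to 0 and the mean of s² close to 1. Because the two moments are estimated from a finite ensemble, the second value comes out slightly above 1 even for a perfectly reliable system.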

38
  • Rank Histograms
  • For some scalar variable x, N ensemble values,
    assumed to be N independent realizations of the
    same probability distribution, ranked in
    increasing order
  • $x_1 < x_2 < \dots < x_N$
  • Define N+1 intervals.
  • If the verifying observation $\xi$ is an (N+1)st
    independent realization of the same probability
    distribution, it must be statistically
    indistinguishable from the $x_i$'s. In particular, it
    must be uniformly distributed among the N+1
    intervals defined by the $x_i$'s.
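
The rank histogram can be computed by counting, for each realization, how many ensemble values lie below the verifying observation. The sketch below uses synthetic Gaussian data; the function name and the sizes are assumptions, and the second ensemble is made deliberately too narrow to show the characteristic U shape of an underdispersive system.

```python
import numpy as np

def rank_histogram(obs, ens):
    """Counts of the observation's position among the N+1 intervals.

    obs: (T,) verifying observations; ens: (T, N) ensemble values.
    """
    N = ens.shape[1]
    ranks = (ens < obs[:, None]).sum(axis=1)   # number of members below obs, 0..N
    return np.bincount(ranks, minlength=N + 1)

rng = np.random.default_rng(5)
T, N = 20_000, 9
ens = rng.normal(size=(T, N))
obs = rng.normal(size=T)                       # same distribution as the members

flat = rank_histogram(obs, ens)                # reliable: close to uniform
u_shaped = rank_histogram(obs, 0.5 * ens)      # spread too small: U shape
print(flat)
print(u_shaped)
```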

39
Rank histograms, T850, Northern Atlantic, winter
1998-99. Top panels: ECMWF; bottom panels: NMC
(from Candille, Doctoral Dissertation, 2003)
40
ECMWF, Europe, 6-day range, Technical Report 504,
2006
41
ECMWF, Europe, 4-day range, Technical Report 504,
2006
42
  • Two properties make the value of an ensemble
    estimation system (either for assimilation or for
    prediction)
  • Reliability is statistical consistency between
    estimated probability distributions and verifying
    observations. Is objectively and quantitatively
    measured by a number of standard diagnostics
    (among which Reduced Centred Random Variable and
    Rank Histograms, reliability component of Brier
    and Brier-like scores).
  • Resolution (semantic disagreement) is the
    property that reliably predicted probability
    distributions are useful (essentially have small
    spread). Also measured by a number of standard
    diagnostics (resolution component of Brier and
    Brier-like scores).
  • Today's message. Evaluate assimilation
    ensembles in terms of reliability and resolution.

43
  • Questions
  • - What is the appropriate size of prediction
    ensembles ? Given the choice, is it better to
    improve the quality of the forecast model, or to
    increase the size of the ensembles ?
  • - What are the limitations (if any) imposed on
    the performance of EPSs by the various sources of
    'noise' (such as, e.g., the finite size of
    predicted ensembles, the errors in the verifying
    observations, or the finite size of the verifying
    sample) ?
  • - What is the effect of model errors on the
    performance of the current EPSs ? What are the
    potential approaches to take into account the
    effects of those errors?
  • - Can ensemble prediction help in the prediction
    of rare and/or extreme events ?

44
Theoretical estimate (raw Brier score)
Impact of ensemble size on Brier Skill Score.
ECMWF, event T850 > Tc, Northern Hemisphere
(Talagrand et al., ECMWF, 1999)
45
  • All objective scores of performance of EPSs
    saturate for ensemble size N ≈ 30-50
  • Values as large as N = 500 have been suggested.
    Who will ever care whether the probability of rain
    for tomorrow is 123/500 rather than 124/500
    (or even 12/50 rather than 13/50)?
  • Can large ensembles be really useful?

46
  • Predicted probability $p_1 = 1/N$ for event E
  • How long must we wait before we can tell whether
    that prediction is reliable?
  • Probability $p_1$ must have been predicted at least
    $\alpha N$ times, with $\alpha$ of the order of (at the very
    least) a few units, and one must have verified
    that event E has actually occurred about $\alpha$ times
    over those predictions.
  • Mean time between two occurrences of E: T
  • Assume system produces one 10-day forecast every
    day, so that 10 forecasts are available every
    day. For every occurrence of E, one can expect
    the particular probability $p_1$ to be predicted
    with probability 10/N
  • ⇒ waiting time necessary to assess whether
    prediction is reliable is at least
  • $\alpha T N / 10$

47
  • Waiting time
  • $\alpha T N / 10$
  • Take $\alpha = 4$ (not very demanding). If event occurs
    4 times a year, you must wait 10 years for N =
    100, and 50 years for N = 500.
  • If event occurs once every two years, waiting
    times are 80 and 400 years respectively.
  • Reanalyses and/or reforecasts can be used for
    validation, but that will do at most for a few
    tens of years.
  • Conclusion. Reliable large-N probabilistic
    prediction of even moderately rare events is
    simply impossible.
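
The waiting times quoted above follow directly from the estimate αTN/10. A short check, with hypothetical function and parameter names, taking α = 4 and expressing the mean time T between occurrences in years:

```python
def waiting_time_years(alpha, T_years, N, forecasts_per_target_day=10):
    """Lower bound alpha * T * N / 10 on the time needed to assess reliability."""
    return alpha * T_years * N / forecasts_per_target_day

# T = 0.25 yr: event occurs 4 times a year; T = 2 yr: once every two years.
for T_years in (0.25, 2.0):
    for N in (100, 500):
        print(T_years, N, waiting_time_years(4, T_years, N))
```

This reproduces the four values of the slide: 10 and 50 years for an event occurring 4 times a year, 80 and 400 years for an event occurring once every two years.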

48
  • Is it worth using ensembles with size larger than
    N ≈ 50? Well, ...
  • On the other hand, large ensembles are required
    for predicting quantities such as, e.g.,
    variances (with N = 50, there is a
    20% probability of being off on the variance by
    at least 25%). If the spread of the ensemble is the
    prime concern, then we have a strong argument for
    using ensemble sizes well in excess of 50 (C.
    Bishop, N. Bowler).
  • But then, how does one validate prediction of
    variances?
  • It may also be beneficial to produce large
    ensembles (if they are affordable), but to
    predict probabilities with a much lower accuracy
    (e.g., to produce probabilities with accuracy
    1/20 from ensembles with size N = 100).
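
The figure quoted above for the variance (N = 50) can be checked with a small Monte Carlo experiment. The sketch assumes independent Gaussian ensemble members with unit true variance; the number of trials and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 50, 20_000

# Each row is one ensemble of N independent members drawn from N(0, 1).
members = rng.normal(0.0, 1.0, size=(trials, N))
var_hat = members.var(axis=1, ddof=1)          # unbiased sample variance

# Fraction of ensembles whose variance estimate is off by at least 25%.
p_off = np.mean(np.abs(var_hat - 1.0) >= 0.25)
print(p_off)
```

Under these assumptions the fraction comes out close to 0.2, in line with the statement on the slide.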

49
  • Why do scores saturate for N ≈ 30-50?
  • Explanations that have been suggested
  • (i) Scores have been implemented so far on
    probabilistic predictions of events or
    one-dimensional variables (e.g., temperature at
    a given point). Situation might be different for
    multivariate probability distributions (but then,
    problem with size of verification sample).
  • (ii) Probability distributions (in the case of
    one-dimensional variables) are most often
    unimodal. Situation might be different for
    multimodal probability distributions (as produced
    for instance by multi-model ensembles).
  • (iii) Saturation is due to the character of
    synoptic-scale meteorology. Situation might be
    different with mesoscale ensemble prediction.

50
  • Definition of initial ensembles
  • Three basic approaches
  • Singular modes (ECMWF)
  • Singular modes are perturbations that amplify
    most rapidly in the tangent linear approximation
    over a given period of time. ECMWF uses a
    combination of evolved singular vectors defined
    over the last 48-hour period before the forecast,
    and of future singular vectors determined over
    the first 48 hours of the forecast period.
    Mixture of past and future.
  • Bred modes (NCEP)
  • Bred modes are modes that result from
    integrations performed in parallel with the
    assimilation process. Come entirely from the
    past.
  • Perturbed observation method (MSC)
  • A form of ensemble assimilation. Comes entirely
    from the past.

51
  • L. Descamps (LMD)
  • Systematic comparison of different approaches,
    on simulated data, in as clean conditions as
    possible.

52
Descamps and Talagrand, Mon. Wea. Rev., 2007
53
  • Conclusion. If ensemble predictions are assessed
    by the accuracy with which they sample the future
    uncertainty on the state of the atmosphere, then
    the best initial conditions are those that best
    sample the initial uncertainty. Any anticipation
    on the future evolution of the flow is useless
    for the definition of the initial conditions.

54
  • Ensemble methods are the way of the future, for
    assimilation as well as for prediction.
  • Many problems remain, in particular as to the limits
    of what ensemble methods can achieve.
  • Question. Must we move toward a situation where the
    final product of prediction or of assimilation will be
    a probability distribution?