Probabilistic forecasts of rare visibility events
using neural nets
John Bjørnar Bremnes, Norwegian Meteorological Institute,
P.O. Box 43 Blindern, NO-0313 Oslo, Norway, j.b.bremnes_at_met.no
Silas Chr. Michaelides, Meteorological Service,
P.O. Box 43059, Larnaka Airport, CY-6650 Cyprus, silas_at_ucy.ac.cy
Introduction
Visibility Forecasting at Larnaca Airport 3 hours
ahead
- Data
  - Predictand
    - Runway Visual Range (five categories)
  - Predictors
    - Hourly change in Runway Visual Range (not categorised)
    - Height of cloud base
    - Dew point temperature
  - The predictors were selected by a stepwise forward search among
    16 subjectively chosen variables and were kept fixed during the
    experiments.
- Motivation
  - Rare visibility events are of considerable importance
  - Should one proceed as normal?
  - Can simple statistical approaches improve forecasting?
- Aim
  - Compare various statistical approaches for probabilistic
    forecasting of rare visibility events
Experiments
- Neural Networks
  - Neural networks are very flexible functions that, in our context,
    map predictors into probabilities. The details are:
    - One hidden layer
    - Logistic input and output functions
    - Cross-entropy loss function with quadratic weight decay
    - Number of hidden units and weight decay are selected on the
      basis of cross-validated predictions on the training set
    - 5 neural nets (with random initial values) are fitted to deal
      with local minima. The average of the estimated probabilities
      is applied.
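
The loss and the averaging step listed above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the loss is the mean negative log-probability of the observed class plus the decay parameter times the sum of squared network weights, and the function names `penalised_cross_entropy` and `ensemble_average` are hypothetical.

```python
import math


def penalised_cross_entropy(probs, labels, weights, decay):
    """Cross-entropy with quadratic weight decay (assumed form).

    probs   : list of per-case class probability lists
    labels  : list of observed class indices
    weights : flat list of network weights
    decay   : weight decay parameter (non-negative)
    """
    # Mean negative log-probability assigned to the observed class.
    ce = -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)
    # Quadratic penalty on the network weights.
    return ce + decay * sum(w * w for w in weights)


def ensemble_average(member_probs):
    """Average class probabilities over ensemble members, case by case.

    member_probs[m][i][k] is the probability of class k for case i
    predicted by ensemble member m.
    """
    n_members = len(member_probs)
    n_cases = len(member_probs[0])
    n_classes = len(member_probs[0][0])
    return [
        [sum(m[i][k] for m in member_probs) / n_members
         for k in range(n_classes)]
        for i in range(n_cases)
    ]
```

With five fitted nets, `member_probs` would hold five probability tables of the same shape, and the averaged table is what gets verified.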
Statistical Approaches
- Modelling
  - Standard
    - Neural net for probabilities of all visibility classes
  - Conditional
    - Neural net for probabilities of good and reduced visibility
      classes (2 classes) → $P(G)$ and $P(R)$
    - Neural net for probabilities of the reduced visibility classes
      given that visibility will be in one of these
      → $P(R_k \mid R)$, $k = 1, \ldots, K$
    - Use laws of probability to obtain the remaining probabilities
      → $P(R_k) = P(R_k \text{ and } R) = P(R_k \mid R)\,P(R)$,
      $k = 1, \ldots, K$
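
The combination step of the conditional approach is a single multiplication per reduced class. The sketch below assumes the two nets have already produced P(R) and the conditional probabilities P(R_k | R) for one case; the function name `combine_conditional` is hypothetical.

```python
def combine_conditional(p_reduced, p_given_reduced):
    """Return [P(G), P(R_1), ..., P(R_K)] for one case.

    p_reduced       : P(R), probability of reduced visibility
    p_given_reduced : [P(R_1 | R), ..., P(R_K | R)], must sum to 1
    """
    if not 0.0 <= p_reduced <= 1.0:
        raise ValueError("P(R) must lie in [0, 1]")
    if abs(sum(p_given_reduced) - 1.0) > 1e-9:
        raise ValueError("conditional probabilities must sum to 1")
    # P(G) = 1 - P(R); P(R_k) = P(R_k | R) * P(R).
    return [1.0 - p_reduced] + [p * p_reduced for p in p_given_reduced]
```

For example, P(R) = 0.2 with conditional probabilities 0.5, 0.25 and 0.25 gives P(G) = 0.8 and reduced-class probabilities 0.1, 0.05 and 0.05, which again sum to one.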
The experiments were repeated 15 times and only
the average scores are shown.
- Sampling and Use of Weights
  - Sampling based on predictor (x)
    - Randomly reduce unimportant cases by predictor value
  - Use of weights based on predictor (x)
    - Increase influence of important cases
  - Use of weights based on predictand (y)
    - Increase influence of cases with observed reduced visibility
    - Note: this will result in biased probabilities
  - Sampling and use of weights based on predictand (y)
    - Randomly reduce cases with observed good visibility
    - Introduce weights to account for the sampling
- Sampling
  - X: sampling based on predictor (RVR good)
  - Y: sampling based on predictand (observed RVR good)
  - 1:10: ratio of number of cases with reduced and good visibility
    is 1:10
  - 1:1: ratio of number of cases with reduced and good visibility
    is 1:1
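
Sampling to a fixed ratio of reduced to good cases might look like the sketch below. It assumes class index 0 denotes good visibility, that all reduced cases are kept, and that good cases are thinned at random; the function name `downsample_good` and the `seed` parameter are hypothetical. The same routine serves both variants: pass the categorised predictor RVR for X sampling or the observed RVR for Y sampling.

```python
import random


def downsample_good(labels, good_per_reduced, seed=0):
    """Return sorted indices of the cases kept after sampling.

    labels           : class index per case, 0 = good visibility (assumed)
    good_per_reduced : number of good cases kept per reduced case,
                       e.g. 10 for a 1:10 ratio, 1 for a 1:1 ratio
    """
    rng = random.Random(seed)
    reduced = [i for i, y in enumerate(labels) if y != 0]
    good = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more good cases than are available.
    n_keep = min(len(good), good_per_reduced * len(reduced))
    return sorted(reduced + rng.sample(good, n_keep))
```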
- Use of Weights
  - X: weights based on predictor (RVR). The weight for each case is
    one minus its prior class probability.
  - Y: weights based on predictand (observed RVR).
    - If no sampling, the weight for each case is one minus its prior
      class probability.
    - If sampling, the weights are equal to the ratio between prior
      class probability and new class probability after sampling.
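
Both weighting rules can be computed directly from class frequencies. The sketch below assumes that prior class probabilities are estimated as relative frequencies in the full training set and that the class of a case is given by an integer label; the function names are hypothetical.

```python
from collections import Counter


def class_probabilities(labels):
    """Relative frequency of each class in a list of class labels."""
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in counts.items()}


def rarity_weights(labels):
    """Weight per case: one minus the prior probability of its class."""
    prior = class_probabilities(labels)
    return [1.0 - prior[y] for y in labels]


def sampling_correction_weights(all_labels, sampled_labels):
    """Weight per sampled case: prior class probability divided by the
    class probability after sampling."""
    prior = class_probabilities(all_labels)
    new = class_probabilities(sampled_labels)
    return [prior[y] / new[y] for y in sampled_labels]
```

With 8 good and 2 reduced cases sampled down to 2 of each, the correction weights are 0.8/0.5 for good and 0.2/0.5 for reduced cases, so the weighted class shares return to the original 80 and 20 percent.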
- CV score
  - WRPS: weighted RPS, where the weight for each case is one minus
    its prior class probability.
- Summary and Recommendations
  - Sampling and use of weights based on predictand give best results,
    but mostly due to better predictions for the good visibility cases
  - Naïve approach second best
  - Standard and conditional modelling give similar results, but the
    latter is computationally faster (not shown)
  - Introducing weights in the model selection score (RPS) does not
    improve results and is not recommended
  - Using either sampling or weights based on predictand alone is not
    recommended, as it implies biased probabilities and poor scores
  - The best proposed approaches are clearly better than persistence
- Model Selection Score
  - Ranked Probability Score (RPS)
  - Weighted Ranked Probability Score (WRPS)
    - Similar to RPS, but cases are given weights based on the
      observed visibility class.
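
The two scores can be sketched as follows. The RPS is taken here in its common unnormalised form, the sum of squared differences between cumulative forecast and cumulative observed probabilities over the ordered classes; some authors additionally divide by K - 1. Normalising the weighted mean by the sum of the weights is an assumption, and the function names are hypothetical.

```python
from itertools import accumulate


def rps(probs, observed):
    """Ranked probability score for one case.

    probs    : forecast probabilities for the ordered classes
    observed : index of the observed class
    """
    cum_forecast = list(accumulate(probs))
    # Cumulative observation: 0 below the observed class, 1 from it onwards.
    cum_obs = [1.0 if k >= observed else 0.0 for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_obs))


def mean_wrps(forecasts, observations, weights=None):
    """Weighted mean RPS over cases; equal weights give the plain mean RPS."""
    if weights is None:
        weights = [1.0] * len(observations)
    total = sum(w * rps(p, y)
                for p, y, w in zip(forecasts, observations, weights))
    return total / sum(weights)
```

For the WRPS as described above, `weights` would hold one minus the prior probability of each case's observed class.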
Note that the proposed statistical approaches can
be combined and applied to other statistical
methods than neural nets.