Trend Analysis - PowerPoint PPT Presentation

Provided by: Owne655

1
Trend Analysis
  • Step vs. monotonic trends
  • Approaches to trend testing
  • Trend tests with and without exogenous
    variables
  • Dealing with seasonality
  • Introduction to time series analysis
  • Step trends

2
Testing for Trends
  • Purpose
  • To determine if a series of observations of a
    random variable is generally increasing or
    decreasing with time
  • Or, has the probability distribution changed with
    time?
  • Also, we may want to describe the amount or rate
    of change, in terms of some central value of the
    distribution such as the mean or median.

3
Monotonic Trend vs. Step Trend: Some Rules
  • Situation: type of trend to test for (monotonic
    or step)
  • Long record with a known event that naturally
    divides the period of record into a pre and
    post period: step
  • Record broken into two segments with a long
    gap between them: step
  • Unbroken or nearly unbroken long record:
    monotonic
  • Multiple records with a variety of lengths and
    timing of data gaps: monotonic
  • Unbroken record that shows a sudden jump in
    magnitude of the r.v. for no known reason:
    monotonic

4
Approaches to Monotonic Trend Testing
  • Where Y = r.v. of interest in the trend test
    (e.g. concentration, biomass, etc.)
  • X = an exogenous variable expected to affect
    Y (e.g. flow rate, etc.)
  • R = residuals from a regression or LOWESS
    of Y vs. X
  • T = time (often expressed in years)

5
Trend tests with No Exogenous Variable
  • Nonparametric Mann-Kendall test
  • Same test as Kendall's τ (discussed in the next
    few slides)
  • Test is invariant to power transformation.
  • Kendall's S statistic is computed from the Y, T
    data pairs. H0 of no change is rejected when S
    (and therefore Kendall's τ of Y vs. T) is
    significantly different from zero.
  • If H0 rejected, we conclude that there is a
    monotonic trend in Y over time T.

6
Kendall's Tau (τ)
  • Tau (τ) measures the strength of the monotonic
    relationship between X and Y. Tau is a rank-based
    procedure and is therefore resistant to the
    effect of a small number of unusual values.
  • Because τ depends only on the ranks of the data
    and not the values themselves, it can be used
    even in cases where some of the data are
    censored.
  • In general, for linear associations, τ < r.
    Strong linear correlations of r > 0.9 correspond
    to τ > 0.7.
  • Tau is easy to compute by hand, resistant to
    outliers, measures all monotonic correlations,
    and is invariant to power transformations of X
    or Y or both.

7
Computation of Tau (τ)
  • First order all data pairs by increasing x. If a
    positive correlation exists, the y's will
    increase more often than decrease as x
    increases.
  • For a negative correlation, the y's will decrease
    more often than increase.
  • If no correlation exists, the y's will increase
    and decrease about the same number of times.
  • A 2-sided test for correlation will evaluate
  • H0: no correlation exists between x and y (τ = 0)
  • Ha: x and y are correlated (τ ≠ 0)

8
  • The test statistic S measures the monotonic
    dependence of y on x
  • S = P - M
  • where P = number of (+), the number of times the
    y's increase as the x's increase, or the number
    of yi < yj for all i < j.
  • M = number of (-), the number of times the y's
    decrease as the x's increase, or the number of
    yi > yj for all i < j.
  • i = 1, 2, ..., (n-1) and j = (i+1), ..., n.
  • There are n(n-1)/2 possible comparisons to be
    made among the n data pairs. If all y values
    increase along with the x values, S = n(n-1)/2
    and τ = +1; if all y values decrease,
    S = -n(n-1)/2 and τ = -1. Therefore dividing S
    by n(n-1)/2 gives -1 ≤ τ ≤ 1.
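
The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `kendall_s_tau` and the four toy data pairs are made up for the example, and the code assumes there are no tied x or y values.

```python
from itertools import combinations

def kendall_s_tau(x, y):
    """Return (P, M, S, tau) for paired data, assuming no ties in x or y."""
    # Order the pairs by increasing x, as the slides prescribe.
    ys = [yi for _, yi in sorted(zip(x, y))]
    n = len(ys)
    # combinations() yields every (yi, yj) with i < j.
    P = sum(1 for yi, yj in combinations(ys, 2) if yi < yj)  # number of (+)
    M = sum(1 for yi, yj in combinations(ys, 2) if yi > yj)  # number of (-)
    S = P - M
    tau = S / (n * (n - 1) / 2)
    return P, M, S, tau

# Hypothetical toy data (not from the slides): one pair out of order.
P, M, S, tau = kendall_s_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(P, M, S, round(tau, 3))  # 5 1 4 0.667
```

With four pairs there are 4(3)/2 = 6 comparisons; five are increases and one is a decrease, so S = 4 and τ = 4/6.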

9
  • Hence the definition of τ is
    $$\tau = \frac{S}{n(n-1)/2}$$
  • To test for the significance of τ, S is compared
    to what would be expected when the null
    hypothesis is true. If it is further from 0 than
    expected, H0 is rejected.
  • For n ≤ 10, an exact test should be computed.
    The table of exact critical values is given in
    Table 1. For n > 10, we can use a large sample
    approximation for the test statistic.

10
(No Transcript)
11
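Large sample approximation - τ
  • The large sample approximation Zs is given by
    $$Z_s = \begin{cases} \dfrac{S-1}{\sigma_s} & \text{if } S > 0 \\ \dfrac{S+1}{\sigma_s} & \text{if } S < 0 \end{cases}$$
  • And, Zs = 0 if S = 0, and where
    $$\sigma_s = \sqrt{\frac{n}{18}(n-1)(2n+5)}$$
  • The null hypothesis is rejected at significance
    level α if |Zs| > Zcrit, where Zcrit is the critical
    value of the standard normal distribution with
    probability of exceedance of α/2.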
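
A short sketch of the large sample approximation, assuming the usual continuity-corrected form of Zs with no correction for ties. The function names `mann_kendall_z` and `two_sided_p` are illustrative only; the p-value comes from the standard normal distribution via `math.erfc`.

```python
import math

def mann_kendall_z(S, n):
    """Large-sample Z for Kendall's S with continuity correction (no ties)."""
    sigma_s = math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    if S > 0:
        return (S - 1) / sigma_s
    if S < 0:
        return (S + 1) / sigma_s
    return 0.0

def two_sided_p(z):
    """Two-sided p-value, 2 * (1 - Phi(|z|)), for a standard normal z."""
    return math.erfc(abs(z) / math.sqrt(2))

# S = 21 and n = 10 are the values of the 10-pair example in this presentation.
z = mann_kendall_z(21, 10)
print(round(z, 2), round(two_sided_p(z), 3))
```

H0 is rejected at level α when the printed p-value is smaller than α, which is equivalent to comparing |Zs| with Zcrit.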

12
Example: 10 pairs of x and y are given below,
ordered by increasing x
  x:     2    24    99   197   377   544   632  3452  6587  53170
  y:  1.22  2.20  4.80  1.28  1.97  1.46  2.64  2.34  4.84   2.96
(Scatter plot of y vs. x, with one point labelled as an outlier.)
13
  • To compute S, first compare y1 = 1.22 with all
    subsequent y's.
  • 2.20 > 1.22, hence +
  • 4.80 > 1.22, hence +, etc.
  • Move on to i = 2, and compare y2 = 2.20 to all
    subsequent y's.
  • 4.80 > 2.20, hence +
  • 1.28 < 2.20, hence -, etc.
  • For i = 2, there are 5 +'s and 3 -'s. It is
    convenient to write all +'s and -'s below their
    respective yi, as shown on the next slide.
  • In total there are 33 +'s (P = 33) and 12 -'s
    (M = 12). Therefore
  • S = 33 - 12 = 21, and there are 10(9)/2 = 45 possible
    comparisons, so τ = 21/45 = 0.47. From Table
    1, for n = 10 and S = 21, the exact p-value is
    2(0.036) = 0.072.

14
Table of + and - signs
  yi:  1.22  2.20  4.80  1.28  1.97  1.46  2.64  2.34  4.84  2.96
        +     +     -     +     -     +     -     +     -
        +     -     -     +     +     +     +     +
        +     -     -     +     +     +     +
        +     -     -     +     +     +
        +     +     -     +     +
        +     +     +     +
        +     +     -
        +     +
        +
  • 33 (+) and 12 (-), S = 33 - 12 = 21
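
The sign table can be regenerated mechanically. The sketch below takes the ten y values in the order listed in the table of signs (that is, ordered by increasing x) and prints, for each yi, the signs of its comparisons with all subsequent y's. The variable names are illustrative only.

```python
# y values of the example, in the order of the sign table (increasing x).
y = [1.22, 2.20, 4.80, 1.28, 1.97, 1.46, 2.64, 2.34, 4.84, 2.96]

# For each yi, record "+" when a later yj is larger and "-" when it is smaller.
signs = [["+" if yj > yi else "-" for yj in y[i + 1:]] for i, yi in enumerate(y)]

for yi, col in zip(y, signs):
    print(f"{yi:5.2f}  {' '.join(col)}")

P = sum(col.count("+") for col in signs)
M = sum(col.count("-") for col in signs)
S = P - M
print(P, M, S)  # 33 12 21
```

Each printed line corresponds to one column of the table, read from top to bottom.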

15
Large sample approximation
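  • The large sample approximation is
    $$Z_s = \frac{S-1}{\sigma_s} = \frac{21-1}{\sqrt{(10/18)(10-1)(20+5)}} = \frac{20}{11.18} = 1.79$$
  • From the table of the normal distribution, the
    1-sided quantile for 1.79 is 0.963, so that
    p = 2(1 - 0.963) = 0.074
  • The large sample approximation is quite good even
    for a small sample of size 10.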

16
Kendall-Theil Robust Line (Non-parametric)
  • The K-T robust line is related to Kendall's
    correlation coefficient tau (τ) and is
    applicable when Y is linearly related to X.
  • This line is not
  • dependent on the normality of residuals for the
    validity of significance tests,
  • strongly affected by outliers.
  • The Kendall-Theil line is of the form
    $$\hat{Y} = b_0 + b_1 X$$

17
  • This line is closely related to Kendall's τ, in
    that the significance of the test for H0: slope
    b1 = 0 is identical to the test for H0:
    τ = 0.
  • The slope estimate b1 is computed by comparing
    each data pair to all others in a pairwise
    fashion.
  • The median of all pairwise slopes is taken to be
    the non-parametric estimate of slope b1:
    $$b_1 = \operatorname{median}\left(\frac{Y_j - Y_i}{X_j - X_i}\right) \quad \text{for all } i < j$$
  • The intercept b0 is defined as follows
    $$b_0 = Y_{med} - b_1 X_{med}$$

18
  • Where Ymed and Xmed are the medians of Y and X.
    The formula assures that the fitted line goes
    through the point (Xmed, Ymed). This is
    analogous to OLS, where the fitted line always
    goes through the means of X and Y.

Example 1: Given the following 7 data pairs
There are n(n-1)/2 = 7(6)/2 = 21 pairwise slopes
19
Test of Significance
  • The test is identical to Kendall's τ. That is,
    first compute S, then check Table 1 if n ≤ 10, or
    use the large sample approximation for n > 10.
  • For the example, S = 20 - 1 = 19, and there are 21
    pairwise slopes. τ = 19/21 = 0.90. From Table 1,
    with n = 7 and S = 19, the exact 2-sided p-value is
    2(0.0014) ≈ 0.003
  • Note: If the Y value was 60 instead of 16, a
    clear outlier, the estimate of the slope would
    not change. This shows that the Kendall-Theil
    line is resistant to outliers.
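
The Kendall-Theil computation can be sketched as below. The seven data pairs of the example did not survive in this transcript, so the x and y values here are an assumed stand-in, chosen only to be consistent with what the slides state (n = 7, S = 20 - 1 = 19, one Y value of 16). The function name `kendall_theil_line` is likewise illustrative.

```python
from itertools import combinations
from statistics import median

def kendall_theil_line(x, y):
    """Return (intercept, slope) of the Kendall-Theil robust line."""
    # All pairwise slopes (Yj - Yi) / (Xj - Xi) for i < j, skipping tied x's.
    slopes = [(yj - yi) / (xj - xi)
              for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
              if xj != xi]
    slope = median(slopes)
    # The line is forced through the point (Xmed, Ymed).
    intercept = median(y) - slope * median(x)
    return intercept, slope

# Assumed data (not taken from the slides); 16 plays the role of the odd value.
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 16, 7]
y_out = [1, 2, 3, 4, 5, 60, 7]   # same data with 16 replaced by 60

print(kendall_theil_line(x, y))      # (0.0, 1.0)
print(kendall_theil_line(x, y_out))  # (0.0, 1.0)
```

For these assumed data, 15 of the 21 pairwise slopes equal 1, so the median slope stays at 1 however extreme the sixth Y value becomes.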

20
Parametric Regression of Y on T
  • Simple regression of Y on T is a test for trend.
  • H0 is that the slope coefficient β1 = 0.
  • All assumptions of regression must be met -
    normality of residuals, constant variance,
    linearity of relationship, and independence.
    Need to transform Y if assumptions are not met.
  • If H0 is rejected, we conclude that there is a
    linear trend in Y over time T.
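
A minimal sketch of the regression trend test, written with the standard OLS formulas for the slope and its standard error. The function name `ols_trend` and the ten yearly values are hypothetical. The returned t statistic would be compared with a Student's t critical value on n - 2 degrees of freedom (2.306 for n = 10 at the two-sided 5% level).

```python
import math

def ols_trend(t, y):
    """Simple regression of Y on T; returns (b0, b1, t statistic of b1)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    stt = sum((ti - t_bar) ** 2 for ti in t)
    sty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b1 = sty / stt                      # slope estimate
    b0 = y_bar - b1 * t_bar             # intercept estimate
    resid = [yi - (b0 + b1 * ti) for ti, yi in zip(t, y)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    se_b1 = math.sqrt(s2 / stt)                # standard error of the slope
    return b0, b1, b1 / se_b1

# Hypothetical record: T in years since the start, Y in arbitrary units.
T = list(range(10))
Y = [2.1, 2.4, 2.2, 2.9, 3.1, 3.0, 3.6, 3.5, 4.1, 4.0]
b0, b1, t_stat = ols_trend(T, Y)
print(round(b1, 4), round(t_stat, 1))
```

The sketch only computes the test statistic; checking the residuals for normality, constant variance and independence is still required before the result can be trusted.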

21
Comparison of Simple Tests for Trends
  • If regression assumptions are OK, then
    regression is best. Also good if there is more
    than one exogenous variable.
  • If assumptions of regression are not met (outliers,
    censored, non-normal, etc.), Mann-Kendall will be
    OK or better.
  • Transformation of Y will affect regression, but
    not Mann-Kendall.
  • Best to try both methods.

22
(No Transcript)
23
(No Transcript)
24
Accounting for Exogenous Variables
  • Exogenous variable - variable other than time
    trend that may have influence on Y. These
    variables are usually natural, random phenomena
    such as rainfall, temperature or streamflow.
  • By removing the variation in Y caused by these
    variables, the background variability or noise
    is reduced so that any trend signal present is
    not masked. The ability of a trend test to
    discern changes in Y with T is then increased.

25
  • The removal process involves modelling, and thus
    explaining, the effect of exogenous variables with
    regression or LOWESS.
  • When removing the effect of one or more
    exogenous variables X, the probability
    distribution of the Xs is assumed to be
    unchanged over the period of record.
  • If the probability distribution of X has
    changed, a trend in the residuals may not
    necessarily be due to a trend in Y. Need to be
    careful of what is chosen as exogenous variable.

26
Nonparametric approach - LOWESS
  • LOWESS - describes the relationship between Y
    and X without assuming linearity or normality of
    residuals.
  • LOWESS pattern should be smooth enough that it
    doesn't have several local minima and maxima, but
    not so smooth as to eliminate the true change in
    slope.
  • LOWESS residuals: R = Y - Ŷ, where Ŷ is the
    LOWESS fitted value of Y at the observed X
  • Then, the Kendall S statistic is computed from the
    R and T pairs to test for trend.

27
Mixed Approach
  • First do regression of Y on X (can have more
    than one X).
  • Check all regression assumptions: normality,
    linearity, constant variance, significant β1,
    etc.
  • Then residuals R = Y - Ŷ, where Ŷ is the fitted
    value (from regression)
  • Then Kendall's S is computed from the R, T pairs to
    test for trend.
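
The mixed approach can be sketched as follows, assuming a single exogenous variable and no ties. All data are synthetic and hypothetical: X stands in for something like flow, and Y is built as 2X + 0.5T so that a known trend is present. The helper names `kendall_s` and `ols_residuals` are illustrative only.

```python
from itertools import combinations

def kendall_s(t, v):
    """Kendall's S for values v ordered by time t (assumes no ties)."""
    ordered = [vi for _, vi in sorted(zip(t, v))]
    return sum((vj > vi) - (vj < vi) for vi, vj in combinations(ordered, 2))

def ols_residuals(x, y):
    """Residuals R = Y - (b0 + b1*X) from a simple regression of Y on X."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Synthetic, hypothetical data: 12 time steps, an erratic exogenous variable X,
# and Y driven by X plus a steady upward trend in T.
T = list(range(12))
X = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
Y = [2.0 * xi + 0.5 * ti for xi, ti in zip(X, T)]

R = ols_residuals(X, Y)          # step 1: remove the effect of X
s_raw = kendall_s(T, Y)          # trend test on the raw Y values
s_resid = kendall_s(T, R)        # step 2: trend test on the residuals
print(s_raw, s_resid)
```

For this made-up data set S is larger for the residuals than for the raw Y values, which illustrates how removing variation due to X can make the trend easier to detect.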

28
Parametric approach
  • Uses regression of Y on T and X in one go:
    $$Y = \beta_0 + \beta_1 T + \beta_2 X + \varepsilon$$
  • This tests for trend and simultaneously
    compensates for the effects of exogenous
    variables.
  • Must check the assumptions of regression. If β1
    is significantly different from zero, then there
    is a trend. β2 should be significant as well;
    otherwise there is no point including X.
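
A sketch of the parametric approach using only the Python standard library: the normal equations for a regression of Y on T and X are solved by Gaussian elimination, and t statistics are formed from the diagonal of the inverse cross-product matrix. The function names, the synthetic data and the noise values are all hypothetical. In practice a statistics package would normally be used instead.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_trend_with_exogenous(t, x, y):
    """OLS fit of Y = b0 + b1*T + b2*X; returns (coefficients, t statistics)."""
    rows = [[1.0, ti, xi] for ti, xi in zip(t, x)]
    k = 3
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (len(y) - k)      # residual variance
    # Diagonal of the inverse of X'X, obtained one unit vector at a time.
    diag = [solve(XtX, [1.0 if i == j else 0.0 for i in range(k)])[j]
            for j in range(k)]
    t_stats = [b / math.sqrt(s2 * d) for b, d in zip(beta, diag)]
    return beta, t_stats

# Synthetic, hypothetical data: true coefficients 1, 0.5 (on T) and 2 (on X).
T = [float(i) for i in range(12)]
X = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, 0.2, -0.3]
Y = [1.0 + 0.5 * ti + 2.0 * xi + e for ti, xi, e in zip(T, X, noise)]

beta, t_stats = fit_trend_with_exogenous(T, X, Y)
print([round(b, 3) for b in beta], [round(ts, 1) for ts in t_stats])
```

The second and third t statistics are the ones to compare with a Student's t critical value on n - 3 degrees of freedom (2.262 for n = 12 at the two-sided 5% level): the second tests the trend coefficient on T, the third tests whether X is worth keeping.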

29
(No Transcript)
30
(No Transcript)
31
(No Transcript)