Break-points detection with atheoretical regression trees

About This Presentation
Title:

Break-points detection with atheoretical regression trees

Description:

Break-points detection with atheoretical regression trees. Marco Reale, University of Canterbury. Universidade Federal do Parana, 27th November 2006.

Slides: 40

Transcript and Presenter's Notes


1
Break-points detection with atheoretical
regression trees
  • Marco Reale
  • University of Canterbury
  • Universidade Federal do Parana, 27th November 2006

2
Acknowledgements
  • The results presented are the outcome of joint
    work with
  • Carmela Cappelli
  • and
  • William Rea

3
Structural Breaks
  • A structural break is a statement about
    parameters in the context of a specific model.
  • A structural break has occurred if at least one
    of the model parameters has changed value at some
    point (break-point).
  • We consider time series data.

4
Relevance
  • Their detection is important for:
  • forecasting (the latest update of the data generating process, DGP);
  • analysis.
  • With regard to analysis, a recently debated issue
    is fractional integration vs structural breaks.

5
Milestones: Chow 1960
  • Test for an a priori candidate break-point.
  • Splits the sample period into two subperiods and
    tests the equality of the parameter sets with an F
    statistic.
  • It cannot be used for unknown dates: risk of
    misinformation or bias.
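
A minimal Python sketch of the test described on this slide, assuming a simple linear regression y = a + b*x within each subperiod (the slide does not fix a model). The names `rss_linear` and `chow_f` are illustrative and do not come from the talk. Under the null of equal parameters and the usual Gaussian assumptions, the statistic is compared with an F(k, n - 2k) distribution.

```python
def rss_linear(x, y):
    """Residual sum of squares of the OLS fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, split, k=2):
    """Chow F statistic for a candidate break before index `split`.

    k is the number of parameters per regime (2 for intercept + slope).
    Assumes each subperiod has more than k observations and a nonzero
    residual sum of squares.
    """
    n = len(y)
    rss_pooled = rss_linear(x, y)
    rss_split = rss_linear(x[:split], y[:split]) + rss_linear(x[split:], y[split:])
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


if __name__ == "__main__":
    # Made-up illustration: the regression line changes after observation 10.
    x = list(range(20))
    y = [(1 + 0.5 * xi if xi < 10 else 10 - 0.2 * xi) + 0.1 * (-1) ** xi for xi in x]
    print(chow_f(x, y, split=10))
```

The candidate date `split` has to be supplied from outside the data, which is the a priori requirement stated above.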

6
Milestones: Quandt 1960
  • We can compute Chow statistics for all possible
    break-points.
  • If the candidate break-point is known a priori,
    then a Chi-square statistic can be used.
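
As a rough sketch of "Chow statistics for all possible break-points", the Python below scans every admissible split of a series and keeps the largest F statistic. It assumes the simplest possible model, a constant mean in each subperiod (one parameter per regime); `sup_f` and the `trim` argument are hypothetical names chosen for this illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def sup_f(y, trim=2):
    """Largest Chow-type F statistic over all candidate break-points.

    Each regime is modelled by its mean only (an assumption of this sketch).
    `trim` is the minimum number of observations kept on each side.
    Assumes the split residual sum of squares is never exactly zero.
    """
    n = len(y)
    rss_pooled = sse(y)
    best_split, best_f = None, float("-inf")
    for split in range(trim, n - trim + 1):
        rss_split = sse(y[:split]) + sse(y[split:])
        f = (rss_pooled - rss_split) / (rss_split / (n - 2))
        if f > best_f:
            best_split, best_f = split, f
    return best_split, best_f


if __name__ == "__main__":
    # Made-up series whose mean shifts after the sixth observation.
    y = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 5.1, 4.9, 5.1, 4.9, 5.1, 4.9]
    print(sup_f(y))
```

Because the maximum is taken over many dependent statistics, it cannot be compared with the standard table used for a single a priori date.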

7
Milestones: CUSUM 1974
  • Proposed by Brown, Durbin and Evans.
  • It checks the cumulative sum of the residuals.
  • It tests the null of no breakpoints against one
    or more breakpoints.
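
A hedged Python sketch of a CUSUM check, assuming a constant-mean model so that the recursive residual at time t is the one-step-ahead error against the mean of the earlier observations. The function names are illustrative; the boundary constant a = 0.948 is the value commonly quoted for the 5% level, and the straight-line boundary form is stated here as an assumption rather than taken from the slides.

```python
import math


def cusum_path(y):
    """Standardized cumulative sum of recursive residuals (mean-only model)."""
    n = len(y)
    residuals = []
    running_sum = y[0]
    for t in range(1, n):  # t = number of observations already seen
        previous_mean = running_sum / t
        # Scale so that the residuals have constant variance under the null.
        residuals.append((y[t] - previous_mean) / math.sqrt(1 + 1 / t))
        running_sum += y[t]
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 1))
    path, total = [], 0.0
    for r in residuals:
        total += r / sigma
        path.append(total)
    return path


def crosses_boundary(path, a=0.948):
    """True if the path leaves the band +/- a*sqrt(m)*(1 + 2*j/m), j = 1..m."""
    m = len(path)
    return any(
        abs(value) > a * math.sqrt(m) * (1 + 2 * j / m)
        for j, value in enumerate(path, start=1)
    )


if __name__ == "__main__":
    # Made-up series: mean 0 for 20 observations, then mean 5.
    y = [0.5 * (-1) ** i for i in range(20)] + [5 + 0.5 * (-1) ** i for i in range(20)]
    print(crosses_boundary(cusum_path(y)))
```

A crossing signals that the null of no break-points is rejected; it does not say how many breaks there are or where they lie.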

8
Milestones: Andrews 1993
  • It exploits the Quandt statistics for a priori
    unknown break-points.

9
Bai and Perron 1998, 2003
  • It finds multiple breaks at unknown times.
  • Application of Fisher's algorithm (1958) to find
    optimal exhaustive partitions.
  • It requires a prior indication of the number of breaks.
  • Applied recursively after positive indication
    provided by CUSUM.
  • Use of AIC to decide the number of breaks.

10
Fisher's algorithm
11
Examples with G=2,3 and m=1
12
Example with G=3 and m=2
13
Bai, Perron and Fisher
  • Eventually Fisher's algorithm selects the partition
    with the minimum deviance.
  • It is a global optimizer, but on its own it is
    computationally feasible only for very small n and G
    (even with today's computers).
  • Using later results in dynamic programming, Bai
    and Perron can run the Fisher algorithm
    reasonably fast for n=1000 and any G and m.
  • Fisher's algorithm is related to regression trees.
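
The dynamic-programming idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the Bai and Perron implementation: each group is summarized by its mean, deviance is the within-group sum of squares, and `fisher_partition` is a hypothetical name. It finds the minimum-deviance partition of an ordered series into G contiguous groups in O(G n^2) steps.

```python
def fisher_partition(y, groups):
    """Minimum-deviance partition of y into `groups` contiguous segments.

    Returns (breaks, deviance), where breaks are the start indices of
    groups 2..G. Assumes 1 <= groups <= len(y).
    """
    n = len(y)
    # Prefix sums give the deviance of any segment in constant time.
    s1, s2 = [0.0], [0.0]
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def deviance(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[g][j]: best deviance for the first j points split into g groups.
    cost = [[inf] * (n + 1) for _ in range(groups + 1)]
    back = [[0] * (n + 1) for _ in range(groups + 1)]
    cost[0][0] = 0.0
    for g in range(1, groups + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):  # i = start of the last group
                candidate = cost[g - 1][i] + deviance(i, j)
                if candidate < cost[g][j]:
                    cost[g][j] = candidate
                    back[g][j] = i

    # Walk the back-pointers to recover the break positions.
    breaks, j = [], n
    for g in range(groups, 1, -1):
        j = back[g][j]
        breaks.append(j)
    return sorted(breaks), cost[groups][n]


if __name__ == "__main__":
    y = [1, 1.2, 0.8, 5, 5.1, 4.9, 9, 9.2, 8.8]  # made-up data, three levels
    print(fisher_partition(y, groups=3))
```

Unlike a tree, which fixes each split before looking for the next one, this search considers all break positions jointly.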

14
Trees (1)
  • Trees are particular kinds of directed acyclic
    graphs.
  • In particular we consider binary trees.
  • Splits to reduce heterogeneity.

15
Trees (2)
Node 1 is called root. Node 5 is called leaf. The
other nodes are called branches.
16
Regression Trees (1)
  • Regression trees are sequences of hierarchical
    dichotomous partitions of the explanatory
    variables, chosen to maximize the homogeneity
    of y within each partition.
  • y is a control or response variable.

17
Regression trees (2)
18
Regression trees optimality
  • Regression trees do not necessarily provide
    optimal partitions.

19
Atheoretical Regression Trees
  • With any artificial strictly ascending or descending
    sequence as a covariate, e.g. 1,2,3,4..., the tree
    performs optimal dichotomous partitions at each step.
  • It also works as a counter.
  • It is not a theory-based covariate, hence the name
    Atheoretical Regression Trees... yes, it's ART.
  • ART is not a global optimizer.
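
A minimal Python sketch of the idea: with the time index 1,2,3,... as the only covariate, growing a regression tree amounts to recursively cutting the series at the point that most reduces the within-segment sum of squares. The names `best_split` and `art_breaks`, and the stopping parameters `min_leaf` and `min_gain`, are assumptions of this sketch; the fixed `min_gain` threshold stands in for a proper pruning step.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(y, lo, hi, min_leaf):
    """Cut c in y[lo:hi] minimizing sse(y[lo:c]) + sse(y[c:hi]), or None."""
    best_cut, best_cost = None, sse(y[lo:hi])
    for cut in range(lo + min_leaf, hi - min_leaf + 1):
        cost = sse(y[lo:cut]) + sse(y[cut:hi])
        if cost < best_cost:
            best_cut, best_cost = cut, cost
    return best_cut, best_cost


def art_breaks(y, min_leaf=3, min_gain=1.0):
    """Greedy binary splitting on the time index; returns sorted break indices.

    Naive implementation: recomputing sse for every cut is quadratic per
    node; running sums would make each node linear.
    """
    breaks = []

    def grow(lo, hi):
        parent_cost = sse(y[lo:hi])
        cut, cost = best_split(y, lo, hi, min_leaf)
        if cut is None or parent_cost - cost < min_gain:
            return  # leaf: no worthwhile split
        breaks.append(cut)
        grow(lo, cut)
        grow(cut, hi)

    grow(0, len(y))
    return sorted(breaks)


if __name__ == "__main__":
    y = [1, 1.2, 0.8, 5, 5.1, 4.9, 9, 9.2, 8.8]  # made-up data, three levels
    print(art_breaks(y))
```

Each split is the best one given the splits already made, which is why the result need not be the globally optimal partition.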

20
Pruning the tree
  • Trees tend to oversplit so the overgrown tree
    needs a pruning procedure
  • Cross-validation is the usual procedure for
    regression trees, but it is not ideal in general
    for time series.
  • AIC (Akaike, 1973) tends to oversplit.
  • BIC (Schwarz, 1978) performs very well.
  • All the information criteria are robust to
    non-normality, especially BIC.
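
To illustrate how an information criterion can choose among candidate segmentations, here is a small Python sketch. It assumes the Gaussian form BIC = n ln(RSS/n) + p ln(n) with p counted as one mean per segment; other parameter-counting conventions exist, and `bic` is a hypothetical helper, not the pruning code used in the talk.

```python
import math


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def bic(y, breaks):
    """BIC of a piecewise-constant fit with segments starting at `breaks`.

    Assumes the total residual sum of squares is strictly positive.
    """
    n = len(y)
    edges = [0] + sorted(breaks) + [n]
    rss = sum(sse(y[a:b]) for a, b in zip(edges, edges[1:]))
    n_params = len(edges) - 1  # one mean per segment (assumed convention)
    return n * math.log(rss / n) + n_params * math.log(n)


if __name__ == "__main__":
    y = [1, 1.2, 0.8, 5, 5.1, 4.9, 9, 9.2, 8.8]  # made-up data, three levels
    for candidate in ([], [3], [3, 6], [3, 5, 6]):
        print(candidate, bic(y, candidate))
```

The segmentation with the smallest value is retained; the ln(n) penalty is what discourages the extra split inside an already homogeneous segment.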

21
Single break simulations
22
Noisy square simulations
23
CUSUM on noisy square
24
ART on noisy square
25
Some comments
  • The simulations show excellent performance.
  • However, ART performs better with long regimes.
  • With short regimes it tends to find spurious
    breaks, but the performance can be appreciably
    improved with an enhanced pruning technique
    (ETP).

26
Bai and Perron on noisy square
27
Some comments
  • BP tends to find breaks any time the CUSUM
    rejects the null.
  • It is unlikely to find spurious breaks,
  • but
  • it tends to underestimate the number of breaks.

28
Application to Michigan-Huron
  • The Michigan-Huron lakes play a very important
    role in the U.S. economy and hence they are
    regularly monitored.
  • In particular we consider the mean water level
    (over one year) time series from 1860 to 2000.

29
Michigan-Huron (2)
30
Michigan-Huron (3)
31
Michigan-Huron (4)
32
Campito Mountain
  • We applied ART to the Campito Mountain
    Bristlecone Pine data, which is an unbroken set of
    tree ring widths covering the period 3435 BC
    to 1969 AD. A series of this length can be analyzed
    by ART in a few seconds. BPP was applied to the
    series and took more than 200 hours of CPU time
    to complete. Tree ring data are used as proxies
    for past climatic conditions.

33
Campito Mountain (2)
34
Campito Mountain (3)
35
The four most recent periods
  • are
  • 1863-1969: Industrialization and global warming.
  • 1333-1862: The Little Ice Age.
  • 1018-1332: The Medieval Climate Optimum.
  • 862-1017: Extreme drought in the Sierra Nevadas.

36
(No Transcript)
37
Niceties of ART
  • Speed: ART has O(n(t)) while BP has O(nng).
  • Simplicity: it can be easily implemented or run
    with packages implementing regression trees.
  • Feasibility: it can be used with almost no
    limitation on either the number of observations
    or the number of segments.
  • Visualization: it results in a hierarchical tree
    diagram that allows a priori knowledge to be
    incorporated.

38
and
  • ... and of course you can say you're doing ART

39
Dedicated to Paulo