Time series data - a nuisance


1
Time series data - a nuisance
or the whole story?
  • personal experience
  • Nic Friggens
  • Danish Institute of Agricultural Sciences, Foulum

2
The problem with the nice theoretical
presentations is that they usually ignore the
kind of problems that the PhD student will have
to confront
Jørgensen 2002 (pers. comm.)
  • Don't use too much time on the statistical
    issues, it is the biological focus and your
    practical experience that is most important

3
The nuisance
  • When time just happens to be the medium in which
    you accumulate replicates
  • Repeated measures
  • Seriously affect test statistics
  • Much less important for estimation of effects
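
The last two points can be illustrated with a small simulation. This is a minimal sketch, not data from the presentation: the group sizes, the effect size and the variance components (`sd_animal`, `sd_week`) are all hypothetical. It compares the effect estimate and its standard error when every weekly record is treated as an independent replicate against an analysis where the animal is the replicate.

```python
import numpy as np

def compare_standard_errors(n_animals=10, n_weeks=13, seed=0):
    """Two groups of animals, each animal measured weekly (hypothetical data)."""
    rng = np.random.default_rng(seed)
    true_effect = 1.0                 # assumed difference between groups
    sd_animal, sd_week = 1.0, 0.5     # assumed between- and within-animal SDs

    def simulate(group_mean):
        # Rows are animals, columns are weeks; each animal has its own level
        animal = rng.normal(0.0, sd_animal, size=(n_animals, 1))
        noise = rng.normal(0.0, sd_week, size=(n_animals, n_weeks))
        return group_mean + animal + noise

    low, high = simulate(0.0), simulate(true_effect)

    # Effect estimate from all records and from animal means
    effect_all = high.mean() - low.mean()
    effect_means = high.mean(axis=1).mean() - low.mean(axis=1).mean()

    # Naive SE: every weekly record counted as an independent replicate
    se_naive = np.sqrt(high.var(ddof=1) / high.size + low.var(ddof=1) / low.size)

    # SE based on animal means: the animal is the replicate
    high_means, low_means = high.mean(axis=1), low.mean(axis=1)
    se_animal = np.sqrt(high_means.var(ddof=1) / n_animals
                        + low_means.var(ddof=1) / n_animals)
    return effect_all, effect_means, se_naive, se_animal

effect_all, effect_means, se_naive, se_animal = compare_standard_errors()
print(f"effect (all records)  = {effect_all:.3f}")
print(f"effect (animal means) = {effect_means:.3f}")
print(f"naive SE = {se_naive:.3f}, animal-based SE = {se_animal:.3f}")
```

With balanced data the two effect estimates coincide, while the naive standard error is much smaller than the animal-based one, which is why the test statistic is the part that suffers.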

4
First example (the good old days)
  • Effect of feed type on milk production
  • 2 feeds: high and low
  • 2 periods, each of 13 weeks
  • Dry matter intake (DMI) measured weekly
  • Cross-over design
  • i.e. there may be carry-over effects

(Friggens et al. 1998 J.Dairy Sci. 81 2228)
5
SED = 0.5
carry-over effect?
6
(No Transcript)
7
Interpreting the interaction
  • Use week data and not the period averages.
  • Make use of a repeated measures structure
  • Include week as a class variable
  • Which covariance structure?
  • Within feed x period?
  • Not easy (statisticians don't come cheap)
  • May well be necessary but not always.
  • Unsanitized examples DIAS Biometry Research Unit
    Internal Report 2001-06

8
Changes in milk with mastitis
Second example
  • K.H.M.N. Sloth, N.C. Friggens, I.R. Korsgaard, J.
    Jensen, P. Løvendahl, P.H. Andersen, K.L.
    Ingvartsen
  • Malkekoens Energioptagelse, Mobilisering og
    Sundhed (MEMO), Danmarks JordbrugsForskning
    Intern Rapport 148 87-112. (2001)

9
Experimental setup
  • 8 milk parameters
  • Measured at each milking throughout lactation
  • Effect of stage of lactation (also breed, parity,
    feed, etc) on milk composition

Cannot ignore time
BUT what is the purpose of the analysis?
10
Purpose: To assess the potential of a
multivariate analysis for improving detection and
description of udder health status
Time (stage of lactation) is NOT in itself of
interest, it is just a factor to be accounted for.
11
Simple stepwise approach to the analysis
  • Treat stage of lactation as a class variable
  • Use 5 windows equally spaced through lactation
  • Adjust values for fixed effects including stage
    of lactation (mixed REML model)
  • Generate components accounting for the variation
    in the adjusted values (PCA)
  • Compare grouping of combined milk measures
    component with independently assessed udder
    health status (Cluster analysis)
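
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the analysis used in the report: step 1 is replaced by simple subtraction of class means (the slides use a mixed REML model), the clustering is a plain k-means, and the data, including the `health` indicator, are simulated and hypothetical.

```python
import numpy as np

def adjust_for_class(values, classes):
    """Step 1 (simplified stand-in for the REML model): remove class means."""
    adjusted = values.astype(float).copy()
    for c in np.unique(classes):
        mask = classes == c
        adjusted[mask] -= adjusted[mask].mean(axis=0)
    return adjusted

def principal_components(adjusted, n_components=2):
    """Step 2: PCA by singular value decomposition of standardised values."""
    z = (adjusted - adjusted.mean(axis=0)) / adjusted.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[:n_components].T
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]

def kmeans(scores, k=2, n_iter=50, seed=0):
    """Step 3: plain k-means on the component scores."""
    rng = np.random.default_rng(seed)
    centres = scores[rng.choice(len(scores), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(scores[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old centre if a cluster happens to be empty
        centres = np.array([
            scores[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
    return labels

# Hypothetical data: 200 milkings, 8 milk parameters, 5 stage windows
rng = np.random.default_rng(1)
n = 200
stage = rng.integers(0, 5, size=n)
health = rng.integers(0, 2, size=n)       # independently assessed status
milk = rng.normal(size=(n, 8)) + 0.5 * stage[:, None] + 3.0 * health[:, None]

adjusted = adjust_for_class(milk, stage)
scores, explained = principal_components(adjusted)
labels = kmeans(scores)

# Cross-tabulate clusters (rows) against health status (columns)
table = np.array([[np.sum((labels == i) & (health == j)) for j in (0, 1)]
                  for i in (0, 1)])
print(table)
```

The cross-tabulation plays the role of the comparison in the last bullet: clusters derived only from the adjusted milk measures are set against a health status that was not used to form them.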

12
Variance reduction in step 1: Adjustment for
fixed effects
13
Step 2: Principal Component Analysis on adjusted
values
14
Step 3: Incidence of clinical mastitis in the
different clusters
15
Incidence of subclinical mastitis
16
Statistics are used by many just as a drunk uses
a lamp post - more for support than illumination
  • Think carefully about the biological
    interpretation when choosing your statistical
    approach.

17
Time series - the Whole Story
18
Time series - The Whole Story
  • Explicitly consider time effects
  • Include time as a continuous variable
  • In a biologically sensible way
  • But be prepared to fall back on more pragmatic
    solutions

19
3rd example: Changes in milk composition
immediately after calving
  • To achieve a better understanding of the
    physiological processes underlying the transition
    from colostrum to milk production
  • To identify the timepoint at which colostrum
    production ceases
  • Birgitte Madsen, Morten Dam Rasmussen (MSc thesis)

20
Milk protein
Milkings from calving
21
Options for describing time
  • Linear models
  • Simple regression
  • Can use Random Regression etc.
  • Biologically limited
  • Non-linear models
  • Difficult statistical behaviour
  • May be biologically more relevant

22
Random Regression
  • Random regression
  • Ignores time as a repeated measure
  • Y_i = (a + A_i) + (b + B_i).t + ..
  • Some very nice statistical properties
  • BUT, only with linear models
  • And..
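
As a worked note on the model above, here is the linear random regression written out with the usual distributional assumptions. The variance symbols and the residual term e_it are notation added here for illustration, not taken from the slides. Although there is no explicit repeated-measures term, the random coefficients A_i and B_i themselves imply a covariance between two records on the same animal at times s and t.

```latex
\begin{align*}
Y_{it} &= (a + A_i) + (b + B_i)\,t + e_{it}, \\
(A_i, B_i) &\sim N\!\left(0,
  \begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}\right),
  \qquad e_{it} \sim N(0, \sigma_e^2) \text{ independently}, \\
\operatorname{Cov}(Y_{is}, Y_{it}) &= \sigma_A^2 + (s + t)\,\sigma_{AB} + s\,t\,\sigma_B^2
  \qquad (s \neq t).
\end{align*}
```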

23
Beware Regressions
  • End effects
  • Adding correlated terms
  • e.g. the curve minimum moved from 16 to 12 weeks
    when the order of the (Legendre) polynomial was
    increased from 4th to 6th
  • Always check against raw data, if possible
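
A check of this kind might look like the sketch below. The data are simulated and hypothetical (a shallow minimum near week 14 plus noise), so the size of any shift here says nothing about the example quoted above; the point is only the procedure of fitting two polynomial orders and comparing the fitted minimum with the raw data.

```python
import numpy as np
from numpy.polynomial import Legendre

def fitted_minimum(t, y, degree, n_grid=1001):
    """Fit a Legendre polynomial and return the time at which it is lowest."""
    fit = Legendre.fit(t, y, degree)          # rescales t to [-1, 1] internally
    grid = np.linspace(t.min(), t.max(), n_grid)
    return grid[np.argmin(fit(grid))]

# Hypothetical weekly records with a shallow minimum and measurement noise
rng = np.random.default_rng(0)
weeks = np.arange(1, 41, dtype=float)
y = 600 - 40 * np.exp(-((weeks - 14) / 10) ** 2) + rng.normal(0, 5, weeks.size)

minima = {degree: fitted_minimum(weeks, y, degree) for degree in (4, 6)}
raw_minimum = weeks[np.argmin(y)]

for degree, week in minima.items():
    print(f"order {degree}: fitted minimum at week {week:.1f}")
print(f"lowest raw observation at week {raw_minimum:.0f}")
```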

24
  • Y = a + b.t + d.t^2: linear but meaningless
  • Y = a.exp(-b.t): linear (in log form) but
    tends to 0
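
To make "linear in log form" concrete: taking logs of Y = a.exp(-b.t) gives ln Y = ln a - b.t, a straight line in t. A minimal sketch, with hypothetical parameter values and noise-free data, requiring Y > 0:

```python
import numpy as np

def fit_exponential_decay(t, y):
    """Fit Y = a*exp(-b*t) by ordinary regression of ln(Y) on t (needs y > 0)."""
    slope, intercept = np.polyfit(t, np.log(y), 1)   # highest power first
    return np.exp(intercept), -slope

# Hypothetical noise-free data generated with a = 12, b = 0.3
t = np.arange(1.0, 11.0)
y = 12.0 * np.exp(-0.3 * t)

a_hat, b_hat = fit_exponential_decay(t, y)
print(a_hat, b_hat)
```

Fitting on the log scale also changes the error structure (errors become multiplicative on the original scale), which is worth keeping in mind when comparing with a direct non-linear fit.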

25
Milk fat
Milkings from calving
26
Last example: Predicting the effect of parity on
lactation curves
  • To develop a means to predict the potential
    lactation curve of heifers and 2nd parity cows
    from information on mature cows.
  • Note the use of the term potential
  • 40 cows each with 3 lactations

(Friggens et al., 1999. Livest. Prod. Sci. 62
1-13)
27
(No Transcript)
28
Wood's function
dY/dt = a.t^b.exp(-c.t)
  • Linear in log form, so random regression can be used
  • Robust and generally gives a good fit.
  • Does not relate well to the underlying biological
    processes.
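
The log-linear form of Wood's function is ln(dY/dt) = ln a + b.ln t - c.t, which can be fitted by ordinary least squares. The following is a sketch under assumptions: the parameter values are hypothetical, the data are noise-free, and both t and the yield must be positive.

```python
import numpy as np

def wood(t, a, b, c):
    """Wood's function for the rate of yield: a * t^b * exp(-c*t)."""
    return a * t**b * np.exp(-c * t)

def fit_wood_loglinear(t, y):
    """Least squares on ln(y) = ln(a) + b*ln(t) - c*t (needs t > 0, y > 0)."""
    design = np.column_stack([np.ones_like(t), np.log(t), -t])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    ln_a, b, c = coef
    return np.exp(ln_a), b, c

# Hypothetical lactation: days 1 to 300, a = 15, b = 0.25, c = 0.004
days = np.arange(1.0, 301.0)
daily_yield = wood(days, 15.0, 0.25, 0.004)

a_hat, b_hat, c_hat = fit_wood_loglinear(days, daily_yield)
peak_day = b_hat / c_hat      # setting the derivative to zero gives t = b/c
print(a_hat, b_hat, c_hat, peak_day)
```

Note that the time of peak depends on both b and c, so neither parameter has a clean biological reading on its own.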

29
dY/dt = a.t^b.exp(-c.t)
[Figure: curves labelled u, t^b and e^(-c.t) plotted against days post calving (0 to 240); vertical axis from 0.00 to 3.00]
30
Lactation Curves
Wood's function: dY/dt = a.t^b.exp(-c.t)
Alternative function (Emmans and Fisher,
1986): dY/dt = a.u.exp(-c.t), where
u = exp(-exp(G0 - B.t))
31
Advantages of alternative function
  • u tends to 1 as t increases. Realistic values
    for G0 and B mean that u is very close to 1 after
    peak yield. The rate of decline post-peak is
    described by only one parameter, c.
  • Can have a positive yield at t = 0
  • BUT, nonlinear - 2 step analysis
  • parameter estimation for each individual curve
  • ANOVA on the curve coefficients
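
The first two advantages can be checked numerically. This sketch only evaluates the two functions; it does not perform the two-step fitting, and all parameter values (a, G0, B, c and the Wood's parameters) are hypothetical choices made for illustration.

```python
import math

def wood_rate(t, a, b, c):
    """Wood's function: a * t^b * exp(-c*t)."""
    return a * t**b * math.exp(-c * t)

def u_term(t, g0, big_b):
    """u = exp(-exp(G0 - B*t)), which rises towards 1 as t increases."""
    return math.exp(-math.exp(g0 - big_b * t))

def alternative_rate(t, a, g0, big_b, c):
    """Alternative function: a * u * exp(-c*t)."""
    return a * u_term(t, g0, big_b) * math.exp(-c * t)

# Hypothetical parameter values
a, g0, big_b, c = 30.0, 1.0, 0.1, 0.003

yield_at_calving_wood = wood_rate(0.0, 15.0, 0.25, 0.004)      # zero at t = 0
yield_at_calving_alt = alternative_rate(0.0, a, g0, big_b, c)  # positive at t = 0
u_late = u_term(100.0, g0, big_b)                              # very close to 1

# Late in lactation the day-to-day ratio is governed by c alone
ratio_late = (alternative_rate(201.0, a, g0, big_b, c)
              / alternative_rate(200.0, a, g0, big_b, c))

print(yield_at_calving_wood, yield_at_calving_alt, u_late, ratio_late)
```

In a two-step analysis, a function like `alternative_rate` would first be fitted to each individual curve by non-linear least squares, and the resulting coefficients would then be analysed by ANOVA.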

32
Alternative functions to describe the lactation
curve.
33
(No Transcript)
34
The effect of Parity on curve coefficients
35
(No Transcript)
36
Choosing a (more) biologically meaningful
function resulted in
  • A clearer understanding of which phases of the
    lactation curve are affected by parity
  • A means to simplify the description of parity
    effects on potential milk production
  • The cost of using a more complex model was
    justified for this purpose.

37
Use your head
38
Summary
  • Careful consideration of the biological purpose
    is important in deciding how to deal with time in
    data analysis
  • Careful consideration of the biological
    interpretation of time series functions is also
    important.

39
Having the right tools is not enough!