Title: Time series data - a nuisance or the whole story?
1. Time series data - a nuisance
or the whole story?
- personal experience
- Nic Friggens
- Danish Institute of Agricultural Sciences, Foulum
2. The problem with nice theoretical
presentations is that they usually ignore the
kind of problems that the PhD student will have
to confront.
Jørgensen 2002 (pers. comm.)
- Don't spend too much time on the statistical
issues; it is the biological focus and your
practical experience that are most important
3. The nuisance
- When time just happens to be the medium in which
you accumulate replicates
- Repeated measures
- Seriously affect test statistics
- Much less important for estimation of effects
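A minimal sketch of the point above, using simulated data: treating repeated weekly records as independent replicates leaves the estimate of the mean unchanged but shrinks its standard error. The number of cows, the variances and the random seed are assumptions for illustration only, not values from the talk.

```python
import random
import statistics

random.seed(1)

N_ANIMALS = 20      # hypothetical number of cows
N_WEEKS = 13        # weekly records per cow
ANIMAL_SD = 2.0     # assumed between-cow standard deviation
RESIDUAL_SD = 1.0   # assumed week-to-week standard deviation within a cow

# Simulate weekly records: each cow has her own level plus weekly noise.
records = []
for _ in range(N_ANIMALS):
    level = random.gauss(20.0, ANIMAL_SD)
    records.append([level + random.gauss(0.0, RESIDUAL_SD) for _ in range(N_WEEKS)])

all_values = [y for cow in records for y in cow]
cow_means = [statistics.mean(cow) for cow in records]

# With balanced data the estimate of the overall mean is the same either way.
mean_from_records = statistics.mean(all_values)
mean_from_cows = statistics.mean(cow_means)

# Naive standard error: pretends every record is an independent replicate.
naive_se = statistics.stdev(all_values) / len(all_values) ** 0.5
# Standard error based on the true replicate, the cow.
cow_se = statistics.stdev(cow_means) / len(cow_means) ** 0.5

print(f"mean (records) = {mean_from_records:.3f}, mean (cows) = {mean_from_cows:.3f}")
print(f"naive SE = {naive_se:.3f}, cow-level SE = {cow_se:.3f}")
```

The naive standard error is too small, so any test statistic built on it is too large, while the estimated mean itself is unaffected.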
4. First example (the good old days)
- Effect of feed type on milk production
- 2 feeds: high and low
- 2 periods, each of 13 weeks
- Dry matter intake (DMI) measured weekly
- Cross-over design
- i.e. there may be carry-over effects
(Friggens et al., 1998. J. Dairy Sci. 81: 2228)
5. [Figure annotations: SED = 0.5; carry-over effect?]
7. Interpreting the interaction
- Use week data and not the period averages.
- Make use of a repeated measures structure
- Include week as a class variable
- Which covariance structure?
- Within feed x period?
- Not easy (statisticians don't come cheap)
- May well be necessary, but not always.
- Unsanitized examples: DIAS Biometry Research Unit,
Internal Report 2001-06
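To make the covariance-structure question concrete, here is a sketch of two commonly used candidates for repeated weekly measures: compound symmetry and first-order autoregressive, AR(1). The slide does not name the structures that were considered; these two, and the numbers used, are assumptions chosen for illustration.

```python
def compound_symmetry(n_times, variance, correlation):
    """Equal correlation between every pair of weeks."""
    return [
        [variance if i == j else variance * correlation for j in range(n_times)]
        for i in range(n_times)
    ]


def ar1(n_times, variance, correlation):
    """Correlation decays as weeks get further apart: rho ** |i - j|."""
    return [
        [variance * correlation ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


# Hypothetical values: 4 weeks, variance 1.0, correlation 0.6.
cs = compound_symmetry(4, 1.0, 0.6)
ar = ar1(4, 1.0, 0.6)

for name, matrix in (("CS", cs), ("AR(1)", ar)):
    print(name)
    for row in matrix:
        print("  " + " ".join(f"{value:.3f}" for value in row))
```

Under compound symmetry weeks 1 and 4 are as strongly correlated as weeks 1 and 2; under AR(1) the correlation falls with distance in time, which is often the more plausible assumption for weekly records on the same animal.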
8. Changes in milk with mastitis
Second example
- K.H.M.N. Sloth, N.C. Friggens, I.R. Korsgaard, J.
Jensen, P. Løvendahl, P.H. Andersen, K.L.
Ingvartsen
- Malkekoens Energioptagelse, Mobilisering og
Sundhed (MEMO), Danmarks JordbrugsForskning
Intern Rapport 148: 87-112 (2001)
9. Experimental setup
- 8 milk parameters
- Measured at each milking throughout lactation
- Effect of stage of lactation (also breed, parity,
feed, etc.) on milk composition
Cannot ignore time
BUT what is the purpose of the analysis?
10. Purpose: To assess the potential of a
multivariate analysis for improving detection and
description of udder health status
Time (stage of lactation) is NOT in itself of
interest; it is just a factor to be accounted for.
11. Simple stepwise approach to the analysis
- Treat stage of lactation as a class variable
- Use 5 windows equally spaced through lactation
- Adjust values for fixed effects including stage
of lactation (mixed REML model)
- Generate components accounting for the variation
in the adjusted values (PCA)
- Compare grouping of combined milk measures
component with independently assessed udder
health status (cluster analysis)
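The first two steps above can be sketched in a much simplified form. Here the adjustment is a plain subtraction of window means, which is only a stand-in for the mixed REML model, and the PCA is done on two measures using the closed-form eigenvalues of a 2 x 2 covariance matrix. The records are hypothetical, and the cluster step is left out.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (lactation window, measure 1, measure 2).
records = [
    (1, 5.1, 3.9), (1, 5.6, 4.4), (1, 4.9, 3.6),
    (2, 4.2, 3.3), (2, 4.8, 3.8), (2, 4.0, 3.0),
    (3, 4.4, 3.5), (3, 5.0, 4.1), (3, 4.1, 3.2),
]

# Step 1 (simplified): adjust each measure for its window mean.
by_window = defaultdict(list)
for window, x1, x2 in records:
    by_window[window].append((x1, x2))

adjusted = []
for window, values in by_window.items():
    mean1 = statistics.mean(v[0] for v in values)
    mean2 = statistics.mean(v[1] for v in values)
    adjusted.extend((x1 - mean1, x2 - mean2) for x1, x2 in values)

# Step 2: principal components of the 2 x 2 covariance matrix
# (the adjusted values already have mean zero).
n = len(adjusted)
var1 = sum(a * a for a, _ in adjusted) / (n - 1)
var2 = sum(b * b for _, b in adjusted) / (n - 1)
cov12 = sum(a * b for a, b in adjusted) / (n - 1)

half_trace = (var1 + var2) / 2
spread = (((var1 - var2) / 2) ** 2 + cov12 ** 2) ** 0.5
eigenvalues = (half_trace + spread, half_trace - spread)
explained = eigenvalues[0] / sum(eigenvalues)

print(f"PC1 explains {explained:.1%} of the adjusted variation")
```

Because stage of lactation is removed before the components are formed, the components describe variation between animals and milkings rather than the lactation curve itself, which matches the stated purpose.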
12. Variance reduction in step 1: Adjustment for
fixed effects
13. Step 2: Principal Component Analysis on adjusted
values
14. Step 3: Incidence of clinical mastitis in the
different clusters
15. Incidence of subclinical mastitis
16. Statistics are used by many just as a drunk uses
a lamp post - more for support than illumination
- Think carefully about the biological
interpretation when choosing your statistical
approach.
17. Time series - The Whole Story
18. Time series - The Whole Story
- Explicitly consider time effects
- Include time as a continuous variable
- In a biologically sensible way
- But be prepared to fall back on more pragmatic
solutions
19. 3rd example: Changes in milk composition
immediately after calving
- To achieve a better understanding of the
physiological processes underlying the transition
from colostrum to milk production
- To identify the timepoint at which colostrum
production ceases
- Birgitte Madsen, Morten Dam Rasmussen (MSc thesis)
20. [Figure: milk protein plotted against milkings from calving]
21. Options for describing time
- Linear models
- Simple regression
- Can use random regression etc.
- Biologically limited
- Non-linear models
- Difficult statistical behaviour
- May be biologically more relevant
22. Random Regression
- Random regression
- Ignores time as a repeated measure
- $Y_i = (a + A_i) + (b + B_i)\,t + \dots$
- Some very nice statistical properties
- BUT, only with linear models
- And..
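To show what the symbols in the random regression model mean, here is a two-stage sketch: a straight line is fitted to each animal, the population intercept a and slope b are taken as the averages, and the animal deviations A_i and B_i are what is left. A real random regression estimates all of these jointly in a mixed model; the two-stage shortcut and the noise-free data are assumptions made to keep the example short.

```python
import statistics


def fit_line(times, values):
    """Ordinary least squares intercept and slope for one animal."""
    mean_t = statistics.mean(times)
    mean_y = statistics.mean(values)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


# Hypothetical weekly records for three animals (noise-free for clarity).
times = [1, 2, 3, 4, 5]
animals = {
    "cow_1": [10.0 + 1.0 * t for t in times],
    "cow_2": [12.0 + 0.5 * t for t in times],
    "cow_3": [11.0 + 1.5 * t for t in times],
}

fits = {name: fit_line(times, values) for name, values in animals.items()}

# Population intercept a and slope b: averages of the individual fits.
a = statistics.mean(fit[0] for fit in fits.values())
b = statistics.mean(fit[1] for fit in fits.values())

# Animal deviations A_i and B_i from the population line.
deviations = {name: (fit[0] - a, fit[1] - b) for name, fit in fits.items()}

print(f"a = {a:.2f}, b = {b:.2f}")
for name, (dev_a, dev_b) in deviations.items():
    print(f"{name}: A_i = {dev_a:+.2f}, B_i = {dev_b:+.2f}")
```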
23. Beware Regressions
- End effects
- Adding correlated terms
- e.g. curve minimum decreased from 16 to 12 weeks
on increasing from a 4th to a 6th order (Legendre)
polynomial
- Always check against raw data, if possible
24.
- $Y = a + b\,t + d\,t^2$: linear but meaningless
- $Y = a\,e^{-b t}$: linear (in log form) but
tends to 0
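A short sketch of what "linear in log form" means for Y = a.exp(-b.t): taking logs gives ln Y = ln a - b.t, so a and b can be recovered with an ordinary straight-line fit. The data are generated from assumed values (a = 8, b = 0.3) purely for illustration.

```python
import math
import statistics

# Hypothetical data generated from Y = a * exp(-b * t) with a = 8, b = 0.3.
times = [0, 1, 2, 3, 4, 5]
values = [8.0 * math.exp(-0.3 * t) for t in times]

# Taking logs gives ln Y = ln a - b * t, a straight line in t.
log_values = [math.log(y) for y in values]

mean_t = statistics.mean(times)
mean_log = statistics.mean(log_values)
sxy = sum((t - mean_t) * (ly - mean_log) for t, ly in zip(times, log_values))
sxx = sum((t - mean_t) ** 2 for t in times)
slope = sxy / sxx
intercept = mean_log - slope * mean_t

a_hat = math.exp(intercept)
b_hat = -slope
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")

# The fitted curve decays towards 0 as t grows, whatever the biology says.
print(f"predicted Y at t = 50: {a_hat * math.exp(-b_hat * 50):.6f}")
```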
25. [Figure: milk fat plotted against milkings from calving]
26. Last example: Predicting the effect of parity on
lactation curves
- To develop a means to predict the potential
lactation curve of heifers and 2nd parity cows
from information on mature cows.
- Note the use of the term "potential"
- 40 cows each with 3 lactations
(Friggens et al., 1999. Livest. Prod. Sci. 62:
1-13)
28. Wood's Function
$dY/dt = a\,t^{b}\,e^{-c t}$
- Linear in log form, so random regression can be used
- Robust and generally gives a good fit.
- Does not relate well to the underlying biological
processes.
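Wood's function is linear in log form: ln(dY/dt) = ln a + b.ln t - c.t. The sketch below fits that linear form by ordinary least squares, solving the normal equations with a small hand-written solver. The yields are generated without noise from assumed parameter values (a = 15, b = 0.25, c = 0.004); none of these numbers come from the talk.

```python
import math


def wood(t, a, b, c):
    """Wood's function: a * t**b * exp(-c * t)."""
    return a * t ** b * math.exp(-c * t)


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


# Hypothetical noise-free yields generated with a = 15, b = 0.25, c = 0.004.
days = [5, 20, 40, 80, 120, 160, 200, 240]
yields = [wood(t, 15.0, 0.25, 0.004) for t in days]

# ln(y) = ln(a) + b * ln(t) - c * t is linear in ln(a), b and c.
design = [[1.0, math.log(t), -float(t)] for t in days]
target = [math.log(y) for y in yields]

# Normal equations (X'X) beta = X'y for ordinary least squares.
xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(3)]
log_a, b_hat, c_hat = solve(xtx, xty)
a_hat = math.exp(log_a)

print(f"a = {a_hat:.3f}, b = {b_hat:.4f}, c = {c_hat:.5f}")
print(f"day of peak yield = b / c = {b_hat / c_hat:.1f}")
```

Setting the derivative of the function to zero gives the day of peak yield as b / c, so b and c jointly control both the rise and the decline, which is part of why the parameters are hard to interpret biologically.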
29. $dY/dt = a\,t^{b}\,e^{-c t}$
[Figure: the components $t^{b}$, $e^{-c t}$ and $u$ plotted against days post calving (0 to 240); vertical axis from 0.00 to 3.00]
30. Lactation Curves
Wood's function: $dY/dt = a\,t^{b}\,e^{-c t}$
Alternative function (Emmans and Fisher,
1986): $dY/dt = a\,u\,e^{-c t}$, where
$u = \exp(-\exp(G_0 - B t))$
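A small sketch that evaluates both lactation functions side by side. The parameter values are hypothetical and chosen only to show the shapes; they are not estimates from the study.

```python
import math


def wood(t, a, b, c):
    """Wood's function: dY/dt = a * t**b * exp(-c * t)."""
    return a * t ** b * math.exp(-c * t)


def alternative(t, a, g0, big_b, c):
    """Alternative function: dY/dt = a * u * exp(-c * t),
    with u = exp(-exp(g0 - big_b * t))."""
    u = math.exp(-math.exp(g0 - big_b * t))
    return a * u * math.exp(-c * t)


# Hypothetical parameter values, chosen only to show the shape of the curves.
A_WOOD, B_WOOD, C = 15.0, 0.25, 0.004
A_ALT, G0, BIG_B = 40.0, 1.0, 0.1

for day in (0, 10, 30, 60, 120, 240):
    u = math.exp(-math.exp(G0 - BIG_B * day))
    print(
        f"day {day:3d}: Wood = {wood(day, A_WOOD, B_WOOD, C):6.2f}, "
        f"alternative = {alternative(day, A_ALT, G0, BIG_B, C):6.2f}, u = {u:.4f}"
    )
```

With b > 0, Wood's function is forced to zero at day 0, whereas the alternative function starts from a positive yield and its u term approaches 1 as lactation progresses.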
31. Advantages of alternative function
- u tends to 1 as t increases. Realistic values
for G0 and B mean that u is very close to 1 after
peak yield. The rate of decline post-peak is
described by only one parameter, c.
- Can have a positive yield at t = 0
- BUT, nonlinear, so a 2-step analysis is needed:
- parameter estimation for each individual curve
- ANOVA on the curve coefficients
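The 2-step analysis can be sketched as follows. Step 1 is simplified here: only the decline parameter c is estimated, from post-peak records, using the assumption that u is close to 1 after peak so that ln(yield) = ln a - c.t. A full analysis would fit all parameters of each curve by nonlinear estimation. Step 2 is a one-way ANOVA F statistic on the estimated coefficients. The cows, the parities and the c values are hypothetical and say nothing about the real size or direction of parity effects.

```python
import math
import statistics


def decline_rate(days, yields):
    """Step 1: estimate c for one curve from post-peak records.

    Assumes u is close to 1 after peak, so ln(yield) = ln(a) - c * t.
    """
    logs = [math.log(y) for y in yields]
    mean_t = statistics.mean(days)
    mean_log = statistics.mean(logs)
    sxy = sum((t - mean_t) * (ly - mean_log) for t, ly in zip(days, logs))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return -sxy / sxx


def one_way_f(groups):
    """Step 2: F statistic of a one-way ANOVA on the curve coefficients."""
    values = [v for group in groups for v in group]
    grand_mean = statistics.mean(values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical true c values per cow, grouped by parity (noise-free curves).
post_peak_days = [80, 120, 160, 200, 240]
true_c = {
    1: [0.0020, 0.0022, 0.0019],
    2: [0.0030, 0.0033, 0.0031],
    3: [0.0036, 0.0038, 0.0040],
}

estimates = {
    parity: [
        decline_rate(post_peak_days, [30.0 * math.exp(-c * t) for t in post_peak_days])
        for c in c_values
    ]
    for parity, c_values in true_c.items()
}

f_value = one_way_f(list(estimates.values()))
for parity, c_values in estimates.items():
    print(f"parity {parity}: mean c = {statistics.mean(c_values):.4f}")
print(f"F for parity effect on c = {f_value:.1f}")
```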
32. Alternative functions to describe the lactation
curve.
34. The effect of parity on curve coefficients
36. Choosing a (more) biologically meaningful
function resulted in
- A clearer understanding of which phases of the
lactation curve are affected by parity
- A means to simplify the description of parity
effects on potential milk production
- The cost of using a more complex model was
justified for this purpose.
37. Use your head
38. Summary
- Careful consideration of the biological purpose
is important in deciding how to deal with time in
data analysis
- Careful consideration of the biological
interpretation of time series functions is also
important.
39. Having the right tools is not enough!