Title: Analysis of multivariate transformations
1 Analysis of multivariate transformations
2 Transformation of the response in regression
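- The normalized power transformation is the Box-Cox power transformation divided by a power of the geometric mean of the response, so that residual sums of squares are comparable for different values of λ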
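In the standard Box-Cox notation, the normalized power transformation of a positive response y can be written as follows. The symbols z(λ) and \dot{y} (the geometric mean of the n observations) are notation assumed here.

```latex
z(\lambda) =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda\,\dot{y}^{\lambda - 1}}, & \lambda \neq 0,\\[2ex]
\dot{y}\,\log y, & \lambda = 0,
\end{cases}
\qquad
\dot{y} = \exp\Bigl(\tfrac{1}{n}\sum_{i=1}^{n}\log y_i\Bigr).
```

Because of the division by \dot{y}^{\lambda-1}, the Jacobian of the transformation is one, which is what allows fits for different values of λ to be compared directly.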
3 Score test for transformation
The score test Tsc of the hypothesis λ = λ0 is the t-statistic on
the constructed variable w(λ0)
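A minimal sketch of this score test in Python with NumPy is given here. It assumes that w(λ0) is the derivative of the normalized transformation z(λ) with respect to λ at λ0, and that the statistic is the t-statistic for w(λ0) when it is added to the regression of z(λ0) on the explanatory variables. The function names, the sign convention and the simulated data are illustrative assumptions, not taken from the slides.

```python
import numpy as np


def normalized_power(y, lam):
    """Normalized Box-Cox transformation z(lam) of a positive vector y."""
    log_y = np.log(y)
    gm = np.exp(log_y.mean())  # geometric mean of y
    if lam == 0:
        return gm * log_y
    return (y**lam - 1.0) / (lam * gm ** (lam - 1.0))


def constructed_variable(y, lam):
    """Derivative of z(lam) with respect to lam, evaluated at lam."""
    log_y = np.log(y)
    gm = np.exp(log_y.mean())
    if lam == 0:
        return gm * log_y * (0.5 * log_y - np.log(gm))
    denom = lam * gm ** (lam - 1.0)
    return (y**lam * log_y) / denom - ((y**lam - 1.0) / denom) * (1.0 / lam + np.log(gm))


def score_test(y, X, lam0):
    """t-statistic for w(lam0) in the regression of z(lam0) on X and w(lam0).

    Assumed sign convention: the t-statistic is negated, so that a negative
    value points to an estimate of lambda below lam0.
    """
    z = normalized_power(y, lam0)
    w = constructed_variable(y, lam0)
    A = np.column_stack([X, w])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    resid = z - A @ coef
    dof = A.shape[0] - A.shape[1]
    s2 = resid @ resid / dof               # residual mean square
    cov = s2 * np.linalg.inv(A.T @ A)      # covariance of the coefficients
    return -coef[-1] / np.sqrt(cov[-1, -1])


# Simulated data (an assumption): log(y) is linear in x, so lambda = 0 is appropriate.
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
y = np.exp(1.0 + 2.0 * x + 0.2 * rng.standard_normal(n))

t_no_transformation = score_test(y, X, 1.0)  # tests lambda0 = 1
t_log = score_test(y, X, 0.0)                # tests lambda0 = 0
print(t_no_transformation, t_log)
```

With these data the statistic for λ0 = 1 should be far larger in absolute value than the one for λ0 = 0, which is the pattern a fan plot displays along the search.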
4 Multivariate transformations
- In this case yi is a v × 1 vector of responses at
observation i, with yij the observation on
response j. The normalized transformation of yij
is the normalized power transformation with its own parameter λj,
scaled by the geometric mean of the jth response
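The column-by-column transformation can be sketched as follows. The code assumes the usual normalized Box-Cox form, (yij^λj - 1) / (λj Gj^(λj - 1)) for λj ≠ 0 and Gj log yij for λj = 0, with Gj the geometric mean of response j. The symbol Gj, the function name and the small data matrix are illustrative assumptions.

```python
import numpy as np


def normalized_transform(Y, lam):
    """Apply the normalized power transformation to each column of Y.

    Y   : (n, v) array of strictly positive responses
    lam : length-v sequence, one transformation parameter per response
    """
    Y = np.asarray(Y, dtype=float)
    log_Y = np.log(Y)
    G = np.exp(log_Y.mean(axis=0))  # geometric mean of each response
    Z = np.empty_like(Y)
    for j, l in enumerate(lam):
        if abs(l) < 1e-12:          # logarithmic limit for lambda_j = 0
            Z[:, j] = G[j] * log_Y[:, j]
        else:
            Z[:, j] = (Y[:, j] ** l - 1.0) / (l * G[j] ** (l - 1.0))
    return Z


# Toy data: first response left untransformed (lambda = 1), second logged (lambda = 0).
Y = np.array([[1.0, 2.0], [2.0, 8.0], [4.0, 32.0]])
Z = normalized_transform(Y, [1.0, 0.0])
print(Z)
```

For λj = 1 the transformation reduces to yij - 1, and for λj = 0 the logged column is multiplied by its geometric mean (8 for the second column of the toy data).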
5 Multivariate transformations
- We assume a multivariate linear regression model
of the form
6 Mult. transformations to normality
- If the transformed observations are normally distributed
with mean µi and covariance matrix Σ, the maximized
loglikelihood is given by
7 Mult. transformations to normality
- If the explanatory variables are the same
- The maximum likelihood estimator of Σ is given by
ei(λ) is a v × 1 vector of residuals for
observation i for some value of λ
8 The profile loglikelihood (i.e. maximized over µ
and Σ) is
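The formulas for the loglikelihood, the estimator of Σ and the profile loglikelihood can be sketched in LaTeX as follows. The notation is assumed: z_i(λ) is the v × 1 vector of normalized transformed responses, \hat{\mu}_i(λ) its fitted mean, and "const" collects terms that do not depend on λ. No Jacobian term appears because the normalized transformation has unit Jacobian.

```latex
\begin{align*}
2L(\lambda) &= \text{const} - n\log\lvert\Sigma\rvert
  - \sum_{i=1}^{n}\{z_i(\lambda)-\mu_i\}^{T}\,\Sigma^{-1}\,\{z_i(\lambda)-\mu_i\},\\
\hat{\Sigma}(\lambda) &= \frac{1}{n}\sum_{i=1}^{n} e_i(\lambda)\,e_i(\lambda)^{T},
  \qquad e_i(\lambda) = z_i(\lambda)-\hat{\mu}_i(\lambda),\\
2L_{\max}(\lambda) &= \text{const} - n\log\lvert\hat{\Sigma}(\lambda)\rvert - nv .
\end{align*}
```

The last line follows because, once \hat{\Sigma}(λ) is substituted, the quadratic form sums to the trace of n times the v × v identity matrix, that is nv. Comparing values of λ therefore amounts to comparing the determinants of \hat{\Sigma}(λ).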
9 Multivariate likelihood ratio test
- The multivariate generalization of Tsc is given
by
This statistic must be compared with a χ² distribution
with v degrees of freedom.
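A minimal sketch of such a test in Python with NumPy is given here. It assumes the usual form of the statistic, T_LR = n log{|Σ̂(λ0)| / |Σ̂(λ̂)|}, and replaces numerical maximization over λ by a search over a grid. The function names, the grid and the simulated data are illustrative assumptions.

```python
import itertools
import numpy as np


def normalized_transform(Y, lam):
    """Column-wise normalized power transformation (geometric-mean scaled)."""
    log_Y = np.log(Y)
    G = np.exp(log_Y.mean(axis=0))
    cols = []
    for j, l in enumerate(lam):
        if abs(l) < 1e-12:
            cols.append(G[j] * log_Y[:, j])
        else:
            cols.append((Y[:, j] ** l - 1.0) / (l * G[j] ** (l - 1.0)))
    return np.column_stack(cols)


def log_det_sigma_hat(Y, X, lam):
    """log|Sigma_hat(lam)| from the residuals of regressing Z(lam) on X."""
    Z = normalized_transform(Y, lam)
    B, *_ = np.linalg.lstsq(X, Z, rcond=None)
    E = Z - X @ B
    return np.linalg.slogdet(E.T @ E / len(Y))[1]


def lr_test(Y, X, lam0, grid):
    """Likelihood ratio statistic for H0: lambda = lam0, maximizing over a grid."""
    v = Y.shape[1]
    lam_hat = min(itertools.product(grid, repeat=v),
                  key=lambda lam: log_det_sigma_hat(Y, X, lam))
    t_lr = len(Y) * (log_det_sigma_hat(Y, X, lam0) - log_det_sigma_hat(Y, X, lam_hat))
    return t_lr, lam_hat


# Simulated bivariate lognormal data (an assumption): the log transformation is correct.
rng = np.random.default_rng(1)
n = 200
logs = rng.multivariate_normal([0.0, 0.0], [[0.36, 0.18], [0.18, 0.36]], size=n)
Y = np.exp(logs)
X = np.ones((n, 1))                 # intercept only
grid = np.linspace(-1.0, 1.0, 9)

t_identity, lam_hat = lr_test(Y, X, (1.0, 1.0), grid)  # H0: no transformation
t_log, _ = lr_test(Y, X, (0.0, 0.0), grid)             # H0: log transformation
CHI2_95_DF2 = 5.991  # 95% point of chi-squared on v = 2 degrees of freedom
print(t_identity > CHI2_95_DF2, lam_hat, t_log)
```

The grid search keeps the example dependency-free; a numerical optimizer would normally be used to find λ̂.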
10 Swiss heads: monitoring lik. ratio test for
transf., H0: λ = 1
11 Boxplot of 6 var. with univariate outliers
labelled
12 Swiss heads
- The marginal distribution of y4 had two
outliers (units 104 and 111).
- We want to test whether all the evidence for a
transformation is due to y4.
- We recalculate the likelihood ratio, but now
testing whether λ4 is equal to 1.
13 Forward plot of the lik. ratio test, H0: λ4 = 1
14 Mussels data
- 82 observations on horse mussels from New
Zealand, with five variables.
- Purpose: to see whether multivariate normality
can be obtained by joint transformation of all 5
variables
15 Mussels data: scatterplot matrix (spm)
16 Forward lik. ratio for H0: λ = 1
17 Finding a multivariate transformation with the
forward search
- With just one variable for transformation it is
extremely easy to use the fan plot from the
forward search to find satisfactory
transformations and observations which are
influential.
- With v variables there are 5^v combinations of the
5 values of λ (-1, -0.5, 0, 0.5, 1)
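The growth in the number of combinations can be illustrated with a short Python sketch. The function name is an illustrative assumption; the five standard values are those listed on the slide.

```python
import itertools

STANDARD_VALUES = (-1, -0.5, 0, 0.5, 1)


def candidate_lambdas(v):
    """Return an iterator over all vectors of v parameters taken from the 5 standard values."""
    return itertools.product(STANDARD_VALUES, repeat=v)


# Number of candidate transformations for 1, 2 and 5 responses.
counts = {v: sum(1 for _ in candidate_lambdas(v)) for v in (1, 2, 5)}
print(counts)  # {1: 5, 2: 25, 5: 3125}
```

For the five variables of the mussels data there are 3125 combinations, each requiring its own search, which is why an exhaustive fan-plot approach is impractical.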
18 Suggested procedure for finding multivariate
transformations
- Run the forward search (FS) through the untransformed data, ordering
the observations at each subset size m by the Mahalanobis distances (MD)
calculated from the untransformed observations.
- Estimate λ at each step.
- Select a preliminary set of transformation
parameters
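The ordering step of this procedure can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the function names, the rule for choosing the starting subset (the units closest to the coordinate-wise median) and the simulated data are all assumptions. Estimation of λ at each step, which would use only the rows in the current subset, is not shown.

```python
import numpy as np


def mahalanobis_sq(Y, subset):
    """Squared Mahalanobis distances of all rows of Y, using the mean and
    covariance estimated from the rows indexed by `subset`."""
    mu = Y[subset].mean(axis=0)
    S = np.cov(Y[subset], rowvar=False)
    D = Y - mu
    return np.einsum("ij,jk,ik->i", D, np.linalg.inv(S), D)


def forward_search(Y, m0):
    """Return the subset at each size m0, ..., n and the minimum MD among
    the units outside the subset at each step."""
    n = Y.shape[0]
    # Assumed starting rule: the m0 units closest to the coordinate-wise
    # median, with each variable scaled by its median absolute deviation.
    med = np.median(Y, axis=0)
    mad = np.median(np.abs(Y - med), axis=0)
    subset = np.sort(np.argsort((((Y - med) / mad) ** 2).sum(axis=1))[:m0])
    history, min_md_outside = [subset.copy()], []
    for m in range(m0, n):
        d2 = mahalanobis_sq(Y, subset)
        outside = np.setdiff1d(np.arange(n), subset)
        min_md_outside.append(np.sqrt(d2[outside].min()))
        # The next subset holds the m + 1 units with the smallest distances.
        subset = np.sort(np.argsort(d2)[: m + 1])
        history.append(subset.copy())
    return history, np.array(min_md_outside)


# Simulated data (an assumption): 60 clean bivariate units plus 3 clustered outliers.
rng = np.random.default_rng(3)
clean = rng.standard_normal((60, 2))
outliers = np.array([10.0, 10.0]) + 0.3 * rng.standard_normal((3, 2))
Y = np.vstack([clean, outliers])

history, min_md = forward_search(Y, m0=20)
print(history[-4])            # subset of size 60, just before the last 3 units enter
print(int(np.argmax(min_md)))  # step (counted from m0) with the largest minimum MD
```

In this simulated example the three planted outliers enter last, and the minimum distance among units outside the subset peaks just before the first of them enters.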
19 Monitoring of MLE of λ, H0: λ = 1
H0: λ = (0.5, 0, 0.5, 0, 0)
20 Monitoring of MLE of λ, H0: λ = (0.5, 0, 0.5, 0, 0)
21 Forward lik. ratio for H0: λ = (0.5, 0, 0.5, 0, 0)
22 Validation of the transformation
- In univariate analysis the likelihood ratio test
is
- Asymptotically the null distribution of TLR is
chi-squared on one degree of freedom.
23 Signed square root of TLR
- This test asymptotically has a N(0,1) distribution
- Including the sign of the difference between the
two values of λ (the estimate and the hypothesised value) gives an
indication of the direction of any
departure from the hypothesised value
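For reference, the univariate likelihood ratio test and its signed square root can be written as follows. The notation is assumed: R(λ) is the residual sum of squares of the normalized transformed response z(λ), \hat{\lambda} is the maximum likelihood estimate, and the label T_{SR} for the signed square root is a name introduced here.

```latex
T_{LR} = 2\{L_{\max}(\hat{\lambda}) - L_{\max}(\lambda_0)\}
       = n \log\{R(\lambda_0)/R(\hat{\lambda})\},
\qquad
T_{SR} = \operatorname{sign}(\hat{\lambda} - \lambda_0)\,\sqrt{T_{LR}} .
```

A negative value of T_{SR} then indicates that the data favour a value of λ below λ0, and a positive value one above it.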
24 Multivariate version of the signed sqrt lik. ratio
- We test just one component of λ when all others
are kept at some specified value.
- We calculate a set of tests by varying each
component of λ about λ0
25 Example: mussels data, validation of
λ0 = (0.5, 0, 0.5, 0, 0)
- Purpose: to validate in a multivariate way λ1 = 0.5
for the first variable.
- To form the likelihood ratio test we need an
estimate of λ = (λ1, ..., λv) found by maximization
only over λ1.
- The other parameters keep their values in λ0 (in
this example 0, 0.5, 0, 0).
- λ1 takes the 5 standard values of λ:
(-1, -0.5, 0, 0.5, 1)
26 Example: validation of λ1
- We perform 5 independent FS with
- λ0 = (-1, 0, 0.5, 0, 0)
- λ0 = (-0.5, 0, 0.5, 0, 0)
- λ0 = (0, 0, 0.5, 0, 0)
- λ0 = (0.5, 0, 0.5, 0, 0)
- λ0 = (1, 0, 0.5, 0, 0)
- For each search we monitor the signed square root
likelihood ratio test
27 Version for multivariate data of the signed sqrt
LR test
- λj is the parameter under test
- λSj is one of the 5 standard values of λ
- λ0j is the vector of parameter values in which λj
takes one of the 5 standard values λS while the
other parameters keep their value in λ0
- One plot for each λj, j = 1, ..., v
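A minimal Python sketch of this test, computed on a full data set rather than along a search, is given here. It assumes the statistic is sign(λ̂j - λS) times the square root of n log{|Σ̂(λ0j)| / |Σ̂(λ̂0j)|}, where λ̂0j is obtained by maximizing over λj alone with the other components fixed at λ0. The function names, the grid used for the maximization and the simulated data are illustrative assumptions.

```python
import numpy as np


def normalized_transform(Y, lam):
    """Column-wise normalized power transformation (geometric-mean scaled)."""
    log_Y = np.log(Y)
    G = np.exp(log_Y.mean(axis=0))
    cols = []
    for j, l in enumerate(lam):
        if abs(l) < 1e-12:
            cols.append(G[j] * log_Y[:, j])
        else:
            cols.append((Y[:, j] ** l - 1.0) / (l * G[j] ** (l - 1.0)))
    return np.column_stack(cols)


def log_det_sigma_hat(Y, X, lam):
    """log|Sigma_hat(lam)| from the residuals of regressing Z(lam) on X."""
    Z = normalized_transform(Y, lam)
    B, *_ = np.linalg.lstsq(X, Z, rcond=None)
    E = Z - X @ B
    return np.linalg.slogdet(E.T @ E / len(Y))[1]


def signed_sqrt_lr(Y, X, lam0, j, lam_s, grid):
    """Signed square root LR test of lambda_j = lam_s, other components fixed at lam0."""
    def with_component(value):
        lam = list(lam0)
        lam[j] = value
        return lam

    # Estimate of lambda_j alone, found on a grid.
    lam_hat_j = min(grid, key=lambda g: log_det_sigma_hat(Y, X, with_component(g)))
    t_lr = len(Y) * (log_det_sigma_hat(Y, X, with_component(lam_s))
                     - log_det_sigma_hat(Y, X, with_component(lam_hat_j)))
    return np.sign(lam_hat_j - lam_s) * np.sqrt(max(t_lr, 0.0))


# Simulated data (an assumption): response 0 is lognormal, response 1 is normal.
rng = np.random.default_rng(2)
n = 150
Y = np.column_stack([np.exp(0.5 * rng.standard_normal(n)),
                     10.0 + rng.standard_normal(n)])
X = np.ones((n, 1))
lam0 = (0.0, 1.0)
grid = np.linspace(-2.0, 2.0, 41)

# One statistic for each of the 5 standard values of lambda_1 (index j = 0).
tests = {s: signed_sqrt_lr(Y, X, lam0, 0, s, grid) for s in (-1.0, -0.5, 0.0, 0.5, 1.0)}
print(tests)
```

In a forward search the same five statistics would be recomputed at each subset size and plotted against m, giving one fan-like plot for each λj.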
28 Mussels data: validation of λ0 = (0.5, 0, 0.5, 0, 0)
29 Forward lik. ratio for H0: λ = (1/3, 1/3, 1/3, 0, 0)
30 Mussels data spm (transf. obs.)
31 Monitoring MD before transforming
32 Monitoring MD after transforming
33 Minimum MD before and after transforming
The transformation has separated the outliers
from the bulk of the data.
34 Gap before and after transforming
35 Conclusions
- This was an example of our approach to finding a
multivariate transformation in the presence of potentially
influential observations and outliers.
- Procedure: start the search with untransformed
data to suggest a transformation, and repeat the
analysis until you find an acceptable
transformation.
- In this example only 3 searches were necessary to
find a transformation which is stable throughout the
search, any changes being at the end.
36 Exercises
37 Exercise 1
- The next slide gives two sets of bivariate data.
Which of the two has to be transformed to achieve
bivariate normality?
- Consider a forward search in which you monitor
the likelihood ratio test for the hypothesis of
no transformation. Describe the plot you would
expect to get for each of the two sets of data.
38 Two sets of simulated bivariate data