Title: Inference about Several Mean Vectors

IV. Inferences about Several Mean Vectors

A. Paired Comparisons

1. Matched Samples
Random samples are taken from g populations, where every sample element in group l has one unique matched element in each other group that is similar on all relevant characteristics except for group membership. This design is often used in pre-post situations (if used on two groups, group membership of the two matched elements is usually assigned randomly).

2. Independent Samples

Unrelated random samples are taken from each of the g populations.
To make a univariate inference on two matched samples:
- Let X1j denote the jth observation from group 1 and X2j denote the jth observation from group 2.
- Create a new random variable Dj = X1j - X2j.
- Given that Dj ~ N(δ, σ_d²), we have that

  t = (\bar{D} - δ) / (s_d / \sqrt{n}) ~ t_{n-1}

where

  \bar{D} = (1/n) \sum_j D_j  and  s_d² = [1/(n-1)] \sum_j (D_j - \bar{D})².

This result leads directly to normal theory confidence intervals and hypothesis tests for the mean difference δ. Let D_1, ..., D_n be a random sample from a N(δ, σ_d²) distribution. The hypothesis H0: δ = δ0 (= 0 usually) is rejected in favor of H1: δ ≠ δ0 at a level of significance α if

  |t| = |\bar{d} - δ0| / (s_d / \sqrt{n}) > t_{n-1}(α/2).

Note this can be adapted to a one-tailed test.

We can also build a confidence interval for the mean difference δ. Let D_1, ..., D_n be a random sample from a N(δ, σ_d²) distribution. Then the 100(1 - α)% confidence interval for δ is given by

  \bar{d} ± t_{n-1}(α/2) s_d / \sqrt{n}.

Now we extend this procedure to p dimensions!
We will let X_{lji} represent the value of the ith variable taken from the jth observation of the lth group. For g = 2 groups, we can then create p new variables

  D_{ji} = X_{1ji} - X_{2ji},  i = 1, ..., p.

Let

  δ' = [δ_1, δ_2, ..., δ_p] = [E(D_{j1}), E(D_{j2}), ..., E(D_{jp})]

and assume that D_j ~ N_p(δ, Σ_d).

If D_1, ..., D_n ~ N_p(δ, Σ_d) and are independent random vectors, then

  T² = n(\bar{D} - δ)' S_d^{-1} (\bar{D} - δ)

where

  \bar{D} = (1/n) \sum_j D_j  and  S_d = [1/(n-1)] \sum_j (D_j - \bar{D})(D_j - \bar{D})',

and we know that

  T² ~ [(n-1)p/(n-p)] F_{p, n-p}.

This result leads directly to normal theory confidence intervals and hypothesis tests for the mean difference vector δ. Let d_1, ..., d_n be observed difference vectors from a N_p(δ, Σ_d) distribution. The hypothesis H0: δ = 0 is rejected in favor of H1: δ ≠ 0 at a level of significance α if

  T² = n\bar{d}' S_d^{-1} \bar{d} > [(n-1)p/(n-p)] F_{p, n-p}(α).

A 100(1 - α)% confidence region for the mean difference vector δ consists of all δ such that

  n(\bar{d} - δ)' S_d^{-1} (\bar{d} - δ) ≤ [(n-1)p/(n-p)] F_{p, n-p}(α).

100(1 - α)% simultaneous confidence intervals for the individual mean differences δ_i are given by

  \bar{d}_i ± \sqrt{[(n-1)p/(n-p)] F_{p, n-p}(α)} · \sqrt{s_{d_i}²/n}.

Bonferroni 100(1 - α)% confidence intervals for the individual mean differences δ_i are given by

  \bar{d}_i ± t_{n-1}(α/(2p)) · \sqrt{s_{d_i}²/n}.
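The two interval types differ only in their multipliers. A minimal Python sketch (illustrative only, not part of the lecture's SAS material; n, p, and α are hypothetical values matching the worked example below) comparing the T²-based simultaneous multiplier with the Bonferroni multiplier:

import numpy as np
from scipy import stats

n, p, alpha = 15, 2, 0.05  # hypothetical values matching the worked example below

# T^2-based simultaneous multiplier: sqrt( (n-1)p/(n-p) * F_{p,n-p}(alpha) )
t2_mult = np.sqrt((n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p))

# Bonferroni multiplier: t_{n-1}(alpha/(2p))
bonf_mult = stats.t.ppf(1 - alpha / (2 * p), n - 1)

print(t2_mult, bonf_mult)  # the Bonferroni intervals are narrower when only the p individual differences matter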
Example: suppose we had the following fifteen sample observations from two matched groups on two random variables X1 and X2. At a significance level of α = 0.05, do these data support the assertion that the two groups were drawn from populations with equal means? In other words, test the null hypothesis H0: δ = 0.

First we need to calculate the difference vectors, with components d_{1j} (the group 1 - group 2 difference on X1) and d_{2j} (the group 1 - group 2 difference on X2). These are the two variables of interest!

[Scatter plot of the pairs (d_{1j}, d_{2j}), with the sample centroid (\bar{d}_1, \bar{d}_2) and the hypothesized centroid (δ_1, δ_2) = (0, 0) marked.]

Do these data appear to support our null hypothesis?
Let's go through the five steps of hypothesis testing to assess the potential validity of our assertion.

- State the Null and Alternative Hypotheses: H0: δ = 0 versus H1: δ ≠ 0.

- Select the Appropriate Test Statistic: n - p = 15 - 2 = 13 is not very large, but the data appear relatively bivariate normal, so use

  T² = n\bar{d}' S_d^{-1} \bar{d}.

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: with α = 0.05 and ν1 = p = 2, ν2 = n - p = 13 degrees of freedom, we have F_{2,13}(0.05) = 3.81.

[Sketch of the F_{2,13} distribution: 95% do-not-reject region, 5% reject region in the upper tail.]

But we don't yet have a decision rule, since T² must be compared with a scaled F value. Thus our critical value is

  [(n-1)p/(n-p)] F_{2,13}(0.05) = (14 · 2/13)(3.81) ≈ 8.20.

So we have: Decision rule - do not reject H0 if T² ≤ 8.20; otherwise reject H0.

- Calculate the Test Statistic: from the sample mean difference vector \bar{d} and the sample covariance matrix S_d we have

  T² = n\bar{d}' S_d^{-1} \bar{d} = 0.0971.

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: T² = 0.0971 ≤ 8.20, so do not reject H0. The sample evidence does not support the claim that the mean difference vector differs from 0.

Note that we could certainly use the likelihood ratio test (Wilks' lambda) to evaluate the plausibility of this hypothesis.
Let's go through the five steps of hypothesis testing to use Wilks' lambda to assess the potential validity of our assertion.

- State the Null and Alternative Hypotheses: H0: δ = 0 versus H1: δ ≠ 0.

- Select the Appropriate Test Statistic: n - p = 15 - 2 = 13 is not very large, but the data appear relatively bivariate normal, so use the Wilks' lambda form of the test,

  Λ = (1 + T²/(n-1))^{-1}.

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: with α = 0.05 and ν1 = p = 2, ν2 = n - p = 13 degrees of freedom, we have F_{2,13}(0.05) = 3.81.

[Sketch of the F_{2,13} distribution: 95% do-not-reject region, 5% reject region in the upper tail.]

But we don't yet have a decision rule, since the critical value of Λ must be derived from the critical value of T². Thus our critical T² value is (14 · 2/13)(3.81) ≈ 8.20, which leads to a critical likelihood ratio value of

  Λ = (1 + 8.20/14)^{-1} ≈ 0.6307.

So we have: Decision rule - do not reject H0 if Λ ≥ 0.6307; otherwise reject H0.

- Calculate the Test Statistic: from our earlier results we have T² = 0.0971, so the calculated value of the likelihood ratio test statistic for this sample is

  Λ = (1 + 0.0971/14)^{-1} ≈ 0.9931.

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: Λ = 0.9931 ≥ 0.6307, so do not reject H0. The sample evidence does not support the claim that the mean difference vector differs from 0.
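A minimal Python sketch of the matched-samples T² test and the equivalent Wilks' lambda statistic. This is an illustration only (the simulated data and function name are my own, not the lecture's data set or SAS code); it assumes the relationships T² = n \bar{d}' S_d^{-1} \bar{d} and Λ = (1 + T²/(n-1))^{-1} given above.

import numpy as np
from scipy import stats

def paired_hotelling_t2(x1, x2, alpha=0.05):
    """Matched-samples Hotelling T^2 test of H0: delta = 0.

    x1, x2 : (n, p) arrays of matched observations from groups 1 and 2.
    Returns T^2, the scaled-F critical value, and the Wilks' lambda form.
    """
    d = np.asarray(x1, float) - np.asarray(x2, float)  # difference vectors d_j
    n, p = d.shape
    dbar = d.mean(axis=0)                              # mean difference vector
    S_d = np.cov(d, rowvar=False)                      # sample covariance of the differences
    t2 = n * dbar @ np.linalg.solve(S_d, dbar)         # T^2 = n dbar' S_d^{-1} dbar
    crit = (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)
    wilks = 1.0 / (1.0 + t2 / (n - 1))                 # Lambda = (1 + T^2/(n-1))^{-1}
    return t2, crit, wilks

# illustrative usage with simulated matched data
rng = np.random.default_rng(0)
x1 = rng.normal(size=(15, 2))
x2 = x1 + rng.normal(scale=0.4, size=(15, 2))
print(paired_hotelling_t2(x1, x2))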
SAS code for a Hypothesis Test of a Mean Difference Vector

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT x11 x12 x21 x22;
  d1=x11-x21;
  d2=x12-x22;
  d1o=0;
  d2o=0;
  d1dif=d1-d1o;
  d2dif=d2-d2o;
  LABEL x11='Group 1 Observed Values of X1'
        x12='Group 1 Observed Values of X2'
        x21='Group 2 Observed Values of X1'
        x22='Group 2 Observed Values of X2'
        d1='Group 1 - Group 2 Difference for X1'
        d2='Group 1 - Group 2 Difference for X2'
        d1o='Hypothesized Value of D1'
        d2o='Hypothesized Value of D2'
        d1dif='Difference Between Observed and Hypothesized Values of X1'
        d2dif='Difference Between Observed and Hypothesized Values of X2';
CARDS;
1.43 -0.689 2.05 -1.279
. . . . . . . . .
9.42 -7.644 9.54 -7.324
;

SAS code for a Hypothesis Test of a Mean Difference Vector (continued)

PROC MEANS DATA=stuff N MEAN STD T PRT;
  VAR d1 d2 d1dif d2dif;
  TITLE4 'Using PROC MEANS to generate univariate summary statistics';
RUN;
PROC CORR DATA=stuff COV;
  VAR d1 d2;
  TITLE4 'Using PROC CORR to generate the sample covariance matrix';
RUN;
PROC GLM DATA=stuff;
  MODEL d1dif d2dif = /nouni;
  MANOVA H=INTERCEPT;
  TITLE4 'Using PROC GLM to test a Hypothesis of a Mean Difference Vector';
RUN;
SAS output for Univariate Hypothesis Test of Mean Differences

The MEANS Procedure

Variable  Label                                                        N         Mean     Std Dev   t Value   Pr > |t|
d1        Group 1 - Group 2 Difference for X1                         15   -0.0313333   0.4176955     -0.29     0.7757
d2        Group 1 - Group 2 Difference for X2                         15    0.0044000   0.4229990      0.04     0.9684
d1dif     Difference Between Observed and Hypothesized Values of X1   15   -0.0313333   0.4176955     -0.29     0.7757
d2dif     Difference Between Observed and Hypothesized Values of X2   15    0.0044000   0.4229990      0.04     0.9684
SAS output: Sample Covariance and Correlation Matrices

The CORR Procedure
2 Variables: d1 d2

Covariance Matrix, DF = 14
                                                 d1             d2
d1   Group 1 - Group 2 Difference for X1     0.1744695238   0.0067262857
d2   Group 1 - Group 2 Difference for X2     0.0067262857   0.1789281143

Simple Statistics
Variable    N       Mean    Std Dev        Sum    Minimum    Maximum   Label
d1         15   -0.03133    0.41770   -0.47000   -0.71000    0.83000   Group 1 - Group 2 Difference for X1
d2         15    0.00440    0.42300    0.06600   -0.82300    0.59000   Group 1 - Group 2 Difference for X2

Pearson Correlation Coefficients, N = 15
Prob > |r| under H0: Rho=0
                                                 d1        d2
d1   Group 1 - Group 2 Difference for X1     1.00000   0.03807
                                                         0.8929
d2   Group 1 - Group 2 Difference for X2     0.03807   1.00000
                                              0.8929
SAS output for a Hypothesis Test of a Mean Difference Vector

The GLM Procedure
Multivariate Analysis of Variance

Characteristic Roots and Vectors of E Inverse * H, where
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix

Characteristic               Characteristic Vector V'EV=1
          Root   Percent          d1dif         d2dif
    0.00621775    100.00     0.63431394   -0.11011820
    0.00000000      0.00     0.08743178    0.62262027

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1  M=0  N=5.5

Statistic                     Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.99382067      0.04        2       13   0.9605
Pillai's Trace           0.00617933      0.04        2       13   0.9605
Hotelling-Lawley Trace   0.00621775      0.04        2       13   0.9605
Roy's Greatest Root      0.00621775      0.04        2       13   0.9605
Let's also find the 95% confidence region for the mean difference vector δ: it consists of all δ such that

  n(\bar{d} - δ)' S_d^{-1} (\bar{d} - δ) ≤ [(n-1)p/(n-p)] F_{p, n-p}(0.05) = 8.20.

We can even plot the 95% confidence region by finding and using the (eigenvalue, eigenvector) pairs of the sample covariance matrix S_d.

The directions and relative lengths of the axes of this confidence ellipsoid are determined by going

  ± \sqrt{λ_i} · \sqrt{[(n-1)p/(n(n-p))] F_{p, n-p}(α)}

units along the corresponding eigenvectors e_i. Beginning at the centroid \bar{d}, these quantities are the half-lengths of the axes of the confidence ellipsoid.

For our example, the eigenvalue-eigenvector pairs (λ_i, e_i) of the sample covariance matrix S_d give the half-lengths of the major and minor axes through the expression above.

The axes lie along the corresponding eigenvectors e_i when these vectors are plotted with the sample centroid \bar{d} as the origin.

[Scatter plot of the differences with the sample centroid (\bar{d}_1, \bar{d}_2) marked.]

Now we move ±0.466 units along the vector e_1 and ±0.447 units along the vector e_2.

The 100(1 - α)% simultaneous confidence intervals for the individual mean differences δ_i are given by

  \bar{d}_1 ± \sqrt{[(n-1)p/(n-p)] F_{p, n-p}(α)} · \sqrt{s_{d_1}²/n}

and

  \bar{d}_2 ± \sqrt{[(n-1)p/(n-p)] F_{p, n-p}(α)} · \sqrt{s_{d_2}²/n}.

[Plot of the 95% confidence region with its shadows (projections): the shadow on the d_1 axis gives the simultaneous interval for δ_1, and the shadow on the d_2 axis gives the simultaneous interval for δ_2.]
Univariate Repeated Measures Design: observations are made on a single response for the same set of elements, usually at uniform time intervals. Here the jth observation is the q × 1 vector (q, not p!)

  X_j' = [X_{j1}, X_{j2}, ..., X_{jq}].

Here we have measurements on a single response on each element at q points in time.

We usually are concerned with contrasts of the q column means (i.e., group or time period means), or sometimes contrasts of the respondent means.

A common contrast matrix is

  C = [ 1 -1  0 ...  0  0
        0  1 -1 ...  0  0
        ...
        0  0  0 ...  1 -1 ]   (successive differences)

or

  C = [ 1 -1  0 ...  0
        1  0 -1 ...  0
        ...
        1  0  0 ... -1 ]   (each mean compared with the first),

either of which allows us to test H0: μ_1 = μ_2 = ... = μ_q.

Once again, this leads directly to normal theory confidence intervals and hypothesis tests for equality of all means across respondents. Consider an N_q(μ, Σ) population, and let C be a contrast matrix. The hypothesis H0: Cμ = 0 is rejected in favor of H1: Cμ ≠ 0 at a level of significance α if

  T² = n(C\bar{x})'(C S C')^{-1}(C\bar{x}) > [(n-1)(q-1)/(n-q+1)] F_{q-1, n-q+1}(α).

Under these circumstances, a 100(1 - α)% confidence region for contrasts Cμ consists of all Cμ such that

  n(C\bar{x} - Cμ)'(C S C')^{-1}(C\bar{x} - Cμ) ≤ [(n-1)(q-1)/(n-q+1)] F_{q-1, n-q+1}(α).

Similarly, simultaneous 100(1 - α)% confidence intervals for single contrasts c'μ are given by

  c'\bar{x} ± \sqrt{[(n-1)(q-1)/(n-q+1)] F_{q-1, n-q+1}(α)} · \sqrt{c'Sc/n}.
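The two standard contrast matrices shown above are easy to generate programmatically. A small Python sketch (illustrative only; the helper names are my own and this is not part of the lecture's SAS code):

import numpy as np

def successive_difference_contrasts(q):
    """(q-1) x q contrast matrix whose rows are mu_i - mu_{i+1}."""
    C = np.zeros((q - 1, q))
    for i in range(q - 1):
        C[i, i], C[i, i + 1] = 1.0, -1.0
    return C

def baseline_contrasts(q):
    """(q-1) x q contrast matrix whose rows are mu_1 - mu_{i+1}."""
    C = np.zeros((q - 1, q))
    C[:, 0] = 1.0
    C[np.arange(q - 1), np.arange(1, q)] = -1.0
    return C

print(successive_difference_contrasts(4))
print(baseline_contrasts(4))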
Example: suppose we asked fifteen consumers to rate the overall palatability of four different yogurt blends on a 100-point scale. We have systematically varied the sugar content and acidity of the blends (each can be either low or high), so we have the following experimental design, with cells 1-4 as the treatment labels:

                             Acidity
                          Low      High
  Sugar Content   Low      1        2
                  High     3        4

There are three common contrasts of interest in this design: two main effects (sugar content and acidity) and one interaction.

For our design, the three particular contrasts of interest are

  (μ1 + μ2) - (μ3 + μ4)   sugar content main effect
  (μ1 + μ3) - (μ2 + μ4)   acidity main effect
  (μ1 + μ4) - (μ2 + μ3)   sugar content x acidity interaction.

We can now construct a contrast matrix C. With mean vector μ' = [μ1, μ2, μ3, μ4], our contrast matrix could be

  C = [ 1  1 -1 -1
        1 -1  1 -1
        1 -1 -1  1 ]

with each row vector representing an individual contrast of interest.
For the sample data provided below we have the following summary data: the sample mean vector \bar{x} and sample covariance matrix S of the four ratings. We wish to test the null hypothesis H0: Cμ = 0.

[Scatter plot with overlaid interaction plot: the green points are means at low sugar content, the yellow points are means at high sugar content. This plot suggests main effects and an interaction.]

At a level of significance α = 0.01, the null hypothesis H0: Cμ = 0 is rejected in favor of H1: Cμ ≠ 0 if

  T² = n(C\bar{x})'(C S C')^{-1}(C\bar{x}) > [(n-1)(q-1)/(n-q+1)] F_{q-1, n-q+1}(0.01)

where

  [(n-1)(q-1)/(n-q+1)] F_{q-1, n-q+1}(0.01) = (14 · 3/12) F_{3,12}(0.01) ≈ 20.834.

Thus our decision rule is: do not reject H0 if T² ≤ 20.834; otherwise reject H0. And the calculated value of the test statistic is

  T² = n(C\bar{x})'(C S C')^{-1}(C\bar{x}) = 175.702.

So by our decision rule, T² = 175.702 > 20.834 and we reject H0. The sample evidence refutes the claim that the three contrasts of the mean vector,

  (μ1 + μ2) - (μ3 + μ4)   sugar content main effect
  (μ1 + μ3) - (μ2 + μ4)   acidity main effect
  (μ1 + μ4) - (μ2 + μ3)   sugar content x acidity interaction,

are jointly equal to zero.

Note that we could convert the T² to an F statistic and find the corresponding p-value (0.000000459).
- If either of our factors (sugar content or acidity) had more than two levels, this could easily be accommodated by expanding the number of columns of the contrast matrix C (but would result in many more possible contrasts).
- This analysis could also be accomplished using a two-factor ANOVA with interaction.
- This test could be conducted using a likelihood ratio test (such as Wilks' lambda).
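As a sketch of the whole calculation, here is an illustrative Python version of the repeated measures contrast test. It assumes the T² form and the scaled-F critical value given above; the function name and the simulated ratings are my own placeholders, not the lecture's data or SAS code.

import numpy as np
from scipy import stats

def contrast_t2_test(X, C, alpha=0.01):
    """Test H0: C mu = 0 for one sample of q repeated measures.

    X : (n, q) array, one row per respondent; C : (k, q) contrast matrix.
    Returns T^2, the scaled-F critical value, and the p-value from the F form.
    """
    n, q = X.shape
    k = C.shape[0]
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    Cx = C @ xbar
    t2 = n * Cx @ np.linalg.solve(C @ S @ C.T, Cx)
    crit = (n - 1) * k / (n - k) * stats.f.ppf(1 - alpha, k, n - k)
    f_stat = t2 * (n - k) / ((n - 1) * k)          # convert T^2 to an exact F
    return t2, crit, stats.f.sf(f_stat, k, n - k)

C = np.array([[1, 1, -1, -1],
              [1, -1, 1, -1],
              [1, -1, -1, 1]], dtype=float)
rng = np.random.default_rng(1)
X = rng.normal(loc=[73, 66, 80, 60], scale=7, size=(15, 4))  # simulated ratings, not the lecture's data
print(contrast_t2_test(X, C))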
SAS code for a Univariate Repeated Measures Analysis

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT x11 x12 x21 x22;
  LABEL x11='Low Sugar/Low Acid Palatability Rating'
        x12='Low Sugar/High Acid Palatability Rating'
        x21='High Sugar/Low Acid Palatability Rating'
        x22='High Sugar/High Acid Palatability Rating';
CARDS;
65 63 72 60
. . . . . . . . .
73 59 84 64
;
PROC MEANS DATA=stuff N MEAN STD T PRT;
  VAR x11 x12 x21 x22;
RUN;
PROC CORR DATA=stuff COV;
  VAR x11 x12 x21 x22;
RUN;
PROC GLM DATA=stuff;
  MODEL x11 x12 x21 x22 = /nouni;
  MANOVA H=INTERCEPT M=(1 1 -1 -1, 1 -1 1 -1, 1 -1 -1 1);
RUN;

The M= specification is the transposed contrast matrix C (this is how SAS takes contrasts on a MANOVA statement). Do not use M= if you want a standard repeated measures hypothesis test.
SAS output for Univariate Repeated Measures Analysis

The MEANS Procedure

Variable  Label                                        N         Mean     Std Dev   t Value   Pr > |t|
x11       Low Sugar/Low Acid Palatability Rating      15   73.0000000   7.6811457     36.81     <.0001
x12       Low Sugar/High Acid Palatability Rating     15   66.4666667   6.5341593     39.40     <.0001
x21       High Sugar/Low Acid Palatability Rating     15   80.2000000   7.0730878     43.91     <.0001
x22       High Sugar/High Acid Palatability Rating    15   60.0666667   4.1656189     55.85     <.0001
SAS output for a Univariate Repeated Measures Analysis

Covariance Matrix, DF = 14
                                                        x11           x12           x21           x22
x11   Low Sugar/Low Acid Palatability Rating     59.00000000   19.28571429   42.35714286   15.78571429
x12   Low Sugar/High Acid Palatability Rating    19.28571429   42.69523810   19.40000000   15.46666667
x21   High Sugar/Low Acid Palatability Rating    42.35714286   19.40000000   50.02857143   14.91428571
x22   High Sugar/High Acid Palatability Rating   15.78571429   15.46666667   14.91428571   17.35238095
SAS output for a Univariate Repeated Measures Analysis

The CORR Procedure

Simple Statistics
Variable    Minimum    Maximum   Label
x11        59.00000   91.00000   Low Sugar/Low Acid Palatability Rating
x12        55.00000   80.00000   Low Sugar/High Acid Palatability Rating
x21        62.00000   91.00000   High Sugar/Low Acid Palatability Rating
x22        50.00000   65.00000   High Sugar/High Acid Palatability Rating

Pearson Correlation Coefficients, N = 15
Prob > |r| under H0: Rho=0
                                                    x11       x12       x21       x22
x11   Low Sugar/Low Acid Palatability Rating    1.00000   0.38426   0.77964   0.49335
                                                           0.1573    0.0006    0.0616
x12   Low Sugar/High Acid Palatability Rating   0.38426   1.00000   0.41976   0.56823
                                                0.1573              0.1193    0.0271
x21   High Sugar/Low Acid Palatability Rating   0.77964   0.41976   1.00000   0.50619
                                                0.0006    0.1193              0.0542
x22   High Sugar/High Acid Palatability Rating  0.49335   0.56823   0.50619   1.00000
                                                0.0616    0.0271    0.0542
SAS output for a Univariate Repeated Measures Analysis

The GLM Procedure
Multivariate Analysis of Variance

M Matrix Describing Transformed Variables
         x11   x12   x21   x22
MVAR1      1     1    -1    -1
MVAR2      1    -1     1    -1
MVAR3      1    -1    -1     1

Variables have been transformed by the M Matrix

Characteristic Roots and Vectors of E Inverse * H

Characteristic               Characteristic Vector V'EV=1
          Root   Percent          MVAR1         MVAR2         MVAR3
    12.2469729    100.00    -0.00043226    0.01939071   -0.02839361
     0.0000000      0.00     0.03726347    0.00063998   -0.00093711
     0.0000000      0.00     0.00453341    0.01264141    0.02452041

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
on the Variables Defined by the M Matrix Transformation
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1  M=0.5  N=5

Statistic                      Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.07548894     48.99        3       12   <.0001
Pillai's Trace            0.92451106     48.99        3       12   <.0001
Hotelling-Lawley Trace   12.24697289     48.99        3       12   <.0001
Roy's Greatest Root      12.24697289     48.99        3       12   <.0001
B. Comparing Mean Vectors From Two Populations

If independent samples are drawn from two populations, they are compared on a summary level (we have inference on the difference between two mean vectors instead of on the mean difference vector). Consider random samples of size n1 from population 1 and n2 from population 2. If we are comparing a single variable, we have the sample summary statistics \bar{x}_1 and s_1² (from sample 1) and \bar{x}_2 and s_2² (from sample 2).

We may wish to make inferences about the difference between the group means (usually H0: μ1 - μ2 = 0). To do so we will need to make the following assumptions:
- X_{11}, ..., X_{1n1} is a random sample from a population with mean μ1 and variance σ11.
- X_{21}, ..., X_{2n2} is a random sample from a population with mean μ2 and variance σ22.
- The two samples are independent of each other.

This structure is sufficient for making inferences about the scalar μ1 - μ2 when the sample sizes n1 and n2 are large; the result is that

  Z = [\bar{X}_1 - \bar{X}_2 - (μ1 - μ2)] / σ_{\bar{X}_1 - \bar{X}_2}  is approximately N(0, 1),

where

  σ²_{\bar{X}_1 - \bar{X}_2} = σ11/n1 + σ22/n2   (note: no adjustment for covariance!),

which can be approximated by

  s_1²/n1 + s_2²/n2

when the sample sizes n1 and n2 are large.

This result leads directly to large sample confidence intervals and hypothesis tests for the difference between means μ1 and μ2.

Let X_{i1}, ..., X_{ini} be independent random samples from populations with mean μi and variance σii, i = 1, 2. The hypothesis H0: μ1 - μ2 = δ0 (= 0 usually) is rejected in favor of H1: μ1 - μ2 ≠ δ0 at a level of significance α if

  |z| = |\bar{x}_1 - \bar{x}_2 - δ0| / \sqrt{s_1²/n1 + s_2²/n2} > z(α/2).

Note this can be adapted to a one-tailed test.

We can also build a confidence interval for the difference between means μ1 and μ2. Let X_{i1}, ..., X_{ini} be independent random samples from populations with mean μi and variance σii, i = 1, 2. Then the 100(1 - α)% confidence interval for μ1 - μ2 is given by

  (\bar{x}_1 - \bar{x}_2) ± z(α/2) \sqrt{s_1²/n1 + s_2²/n2}

when the sample sizes n1 and n2 are large. As usual, things are a little different when either of the sample sizes n1 or n2 is small; the Central Limit Theorem does not apply!
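A minimal Python sketch of the large-sample interval just described (illustrative only; the function name and inputs are my own, not from the lecture):

import numpy as np
from scipy import stats

def large_sample_ci_for_mean_difference(x1, x2, alpha=0.05):
    """Approximate 100(1-alpha)% CI for mu1 - mu2 from large independent samples."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    diff = x1.mean() - x2.mean()
    se = np.sqrt(x1.var(ddof=1) / x1.size + x2.var(ddof=1) / x2.size)
    z = stats.norm.ppf(1 - alpha / 2)
    return diff - z * se, diff + z * se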
When either of the sample sizes n1 or n2 is small, the conditions under which we can make valid inferences about the difference between the group means (usually H0: μ1 - μ2 = 0) are more restrictive. To do so we will need to make the following assumptions:
- Both populations are normally distributed.
- The two populations have a common variance (σ11 = σ22).
- The two samples are independent of each other.

This additional structure is sufficient for making inferences about the scalar μ1 - μ2 when either sample size n1 or n2 is small; the result is that

  t = [\bar{X}_1 - \bar{X}_2 - (μ1 - μ2)] / \sqrt{s_pooled²(1/n1 + 1/n2)} ~ t_{n1+n2-2},

where

  σ²_{\bar{X}_1 - \bar{X}_2} = σ11/n1 + σ22/n2   (note: no adjustment for covariance!),

which can be approximated by

  s_pooled²(1/n1 + 1/n2),  with  s_pooled² = [(n1 - 1)s_1² + (n2 - 1)s_2²] / (n1 + n2 - 2),

when the population variances are equal (σ11 = σ22).

This result leads directly to normal theory confidence intervals and hypothesis tests for the difference between means μ1 and μ2. Let X_{i1}, ..., X_{ini} be independent random samples from normal populations with mean μi and common variance σ11 = σ22, i = 1, 2. The hypothesis H0: μ1 - μ2 = δ0 (= 0 usually) is rejected in favor of H1: μ1 - μ2 ≠ δ0 at a level of significance α if

  |t| = |\bar{x}_1 - \bar{x}_2 - δ0| / \sqrt{s_pooled²(1/n1 + 1/n2)} > t_{n1+n2-2}(α/2).

We can also build a confidence interval for the difference between means μ1 and μ2. Let X_{i1}, ..., X_{ini} be independent random samples from normal populations with mean μi and common variance, i = 1, 2. Then the 100(1 - α)% confidence interval for μ1 - μ2 is given by

  (\bar{x}_1 - \bar{x}_2) ± t_{n1+n2-2}(α/2) \sqrt{s_pooled²(1/n1 + 1/n2)}

when either sample size n1 or n2 is small.

Note that this test is very sensitive to the assumption of common variance; we should test this assumption before using this approach to making inferences about the difference between means μ1 and μ2. If a random sample of size n is selected from a normal population, then

  (n - 1)s²/σ² ~ χ²_{n-1},

so if our two samples are independent and σ11 = σ22, then

  F = s_1²/s_2² ~ F_{n1-1, n2-1}.

This again leads to a natural hypothesis test of σ11 = σ22. For two random samples of size n1 and n2 selected from normal populations, the hypothesis H0: σ11 = σ22 is rejected in favor of H1: σ11 ≠ σ22 at a level of significance α if

  F = s_1²/s_2² > F_{n1-1, n2-1}(α/2)  or  F = s_1²/s_2² < F_{n1-1, n2-1}(1 - α/2).

Note that many other tests for homogeneity of variances (Bartlett, Levene, O'Brien, etc.) exist.
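An illustrative Python sketch of the two tests described above (the preliminary variance-ratio F test and the pooled-variance t test). It assumes independent normal samples; the function names are my own, not the lecture's SAS code.

import numpy as np
from scipy import stats

def variance_ratio_test(x1, x2, alpha=0.01):
    """Two-sided F test of H0: sigma1^2 = sigma2^2."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    f = x1.var(ddof=1) / x2.var(ddof=1)
    df1, df2 = x1.size - 1, x2.size - 1
    reject = (f > stats.f.ppf(1 - alpha / 2, df1, df2)) or (f < stats.f.ppf(alpha / 2, df1, df2))
    return f, reject

def pooled_t_test(x1, x2, alpha=0.01):
    """Pooled-variance t test of H0: mu1 = mu2."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = x1.size, x2.size
    sp2 = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
    t = (x1.mean() - x2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, abs(t) > stats.t.ppf(1 - alpha / 2, n1 + n2 - 2)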
Example: consider our previous situation (we asked fifteen consumers to rate the overall palatability of different yogurt blends on a 100-point scale). Recall that we have systematically varied the sugar content and acidity of the blends (each can be either low or high), so we have the following experimental design, with cells 1-4 as the treatment labels:

                             Acidity
                          Low      High
  Sugar Content   Low      1        2
                  High     3        4

What if we had instead taken one sample of sixty responses and randomly assigned each to one of our four groups?

For our design, the (previously identified) contrasts of interest are

  (μ1 + μ2) - (μ3 + μ4)   sugar content main effect
  (μ1 + μ3) - (μ2 + μ4)   acidity main effect
  (μ1 + μ4) - (μ2 + μ3)   sugar content x acidity interaction.

We can now test the corresponding univariate hypotheses.
We have the following raw data and summary statistics, some of which will be useful in our univariate hypothesis tests.

Recall that our scatterplot and overlaid interaction plot looked like this, and suggested both main effects and an interaction:

[Scatter plot with overlaid interaction plot: the green points are means at low sugar content, the yellow points are means at high sugar content.]

Let's go through the five steps of hypothesis testing to assess the potential validity of our assertion at α = 0.01.

- State the Null and Alternative Hypotheses: equality of the low sugar and high sugar mean palatability ratings versus inequality.

- Select the Appropriate Test Statistic: n_Low Sugar = n_High Sugar = 30 (borderline large samples) and the data appear relatively normal, but are the variances equal?

Let's first test the hypothesis of equal variances at a level of significance α = 0.01. We'll need the two group sample variances as summary statistics.

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: with α = 0.01 and n_Low Sugar - 1 = 29, n_High Sugar - 1 = 29 degrees of freedom, we have F_{29,29}(0.005) = 2.674.

[Sketch of the F_{29,29} distribution: 99.5% do-not-reject region, 0.5% reject region in the upper tail.]

So we have: Decision rule - do not reject H0 if F ≤ 2.674; otherwise reject H0.
- Calculate the Test Statistic: we have F = 2.284.

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: F = 2.284 ≤ 2.674, so do not reject H0. The sample evidence does not support the claim that the population variances differ. Use the pooled variances t-test to test the equality of the means.

- Select the Appropriate Test Statistic: n_Low Sugar = n_High Sugar = 30 (borderline large samples), but the data appear relatively normal and the population variances appear to be equal, so use

  t = (\bar{x}_Low Sugar - \bar{x}_High Sugar) / \sqrt{s_pooled²(1/n_Low Sugar + 1/n_High Sugar)}

where

  s_pooled² = [(n_Low Sugar - 1)s²_Low Sugar + (n_High Sugar - 1)s²_High Sugar] / (n_Low Sugar + n_High Sugar - 2).

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: α = 0.01, n_Low Sugar + n_High Sugar - 2 = 58 degrees of freedom, and we have a two-tailed test, so the critical values are t_{α/2} = ±2.918.

[Sketch of the t distribution with a reject region in each tail.]

Decision rule: do not reject H0 if -2.918 ≤ t ≤ 2.918; otherwise reject H0.

- Calculate the Test Statistic: from the group means, group variances, and pooled variance we have t = -0.156.

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: -2.918 ≤ -0.156 ≤ 2.918, so do not reject H0. The sample evidence does not refute the claim that the mean palatability ratings for the low sugar and high sugar content yogurts are equal.
We could test the other two contrasts in a similar manner; if we leave α = 0.01, we will need only minimal additional calculations.

- State the Null and Alternative Hypotheses: equality versus inequality of the low acid and high acid mean palatability ratings, and no interaction versus some interaction.

- Select the Appropriate Test Statistic: n_Low Acid = n_High Acid = 30 (borderline large samples) and the data appear relatively normal, but are the variances equal?

Let's test the hypotheses of equal variances at a level of significance α = 0.01. We'll need the corresponding group sample variances as summary statistics.

- Calculate the Test Statistics: we have F = 1.668 and F = 1.168 for the two comparisons.

- Use the Decision Rule to Evaluate the Test Statistics and Decide Whether to Reject or Not Reject the Null Hypotheses: F = 1.668 ≤ 2.674 and F = 1.168 ≤ 2.674, so do not reject either H0. The sample evidence does not support either claim that the population variances differ. Use the pooled variances t-test to test the corresponding hypotheses about the means.

Now we can calculate the test statistics; first we tackle the acidity main effect:

  t = (\bar{x}_Low Acid - \bar{x}_High Acid) / \sqrt{s_pooled²(1/n_Low Acid + 1/n_High Acid)} = 7.106.

We now use the decision rule to evaluate the test statistic and decide whether to reject or not reject the null hypothesis: 7.106 > 2.918, so reject H0. The sample evidence refutes the claim that the mean palatability ratings for the low acid and high acid content yogurts are equal.

For the interaction, the calculated value of the test statistic is t = -2.826.

We now use the decision rule to evaluate the test statistic and decide whether to reject or not reject the null hypothesis: -2.918 ≤ -2.826 ≤ 2.918, so do not reject H0. The sample evidence does not refute the claim that no interaction effect exists.
SAS code for a Small Independent Samples Test of the Difference Between Two Means

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT y1 y2 factor1 factor2;
  LABEL y1='Palatability Rating'
        y2='Purchase Intent'
        factor1='Low Sugar(1) vs. High Sugar(2)'
        factor2='Low Acid(1) vs. High Acid(2)';
CARDS;
65 67 1 1
63 71 1 2
72 77 2 1
60 58 2 2
72 70 1 1
. . . . . . . . .
84 92 2 1
64 68 2 2
;
PROC TTEST DATA=stuff ALPHA=0.01;
  CLASS factor1;
  VAR y1 y2;
  TITLE4 'Using PROC TTEST to test hypotheses of equal means';
RUN;
SAS output for a Small Independent Samples Test of the Difference Between Two Means

The TTEST Procedure

Statistics
                              Lower CL            Upper CL   Lower CL             Upper CL
Variable  factor1       N     Mean       Mean     Mean       Std Dev    Std Dev   Std Dev   Std Err   Minimum   Maximum
y1        1             30    65.831     69.733   73.636     5.7724     7.7546    11.528    1.4158    55        91
y1        2             30    64.235     70.133   76.031     8.7243     11.72     17.424    2.1398    50        91
y1        Diff (1-2)          -7.233     -0.4     6.4334     8.0006     9.9372    12.977    2.5658
y2        1             30    68.447     73.2     77.953     7.0304     9.4446    14.041    1.7243    50        90
y2        2             30    68.424     74.433   80.443     8.8891     11.941    17.753    2.1802    55        97
y2        Diff (1-2)          -8.636     -1.233   6.1698     8.6676     10.766    14.059    2.7797
SAS output for a Small Independent Samples Test of the Difference Between Two Means

The TTEST Procedure

T-Tests
Variable   Method          Variances   DF     t Value   Pr > |t|
y1         Pooled          Equal       58       -0.16     0.8767
y1         Satterthwaite   Unequal     50.3     -0.16     0.8767
y2         Pooled          Equal       58       -0.44     0.6589
y2         Satterthwaite   Unequal     55.1     -0.44     0.6590

Equality of Variances
Variable   Method     Num DF   Den DF   F Value   Pr > F
y1         Folded F       29       29      2.28   0.0296
y2         Folded F       29       29      1.60   0.2125
If independent samples on p variables are drawn from two populations, they are compared on a summary level (we have inference on the difference between two mean vectors instead of on the mean difference vector); we can extend the univariate tests to p dimensions. Consider random samples of size n1 from population 1 and n2 from population 2. Then we have the sample summary statistics \bar{x}_1 and S_1 (from sample 1) and \bar{x}_2 and S_2 (from sample 2).

Now we may wish to make inferences about the difference between the group mean vectors (usually H0: μ1 = μ2). To do so we will need to make the following assumptions:
- X_{11}, ..., X_{1n1} is a random sample from a p-variate population with mean vector μ1 and covariance matrix Σ1.
- X_{21}, ..., X_{2n2} is a random sample from a p-variate population with mean vector μ2 and covariance matrix Σ2.
- The two samples are independent of each other.

This structure is sufficient for making inferences about the p x 1 vector μ1 - μ2 when the sample sizes n1 and n2 are large relative to p.

When the sample sizes n1 and n2 are small relative to p, we must make additional assumptions to support inferences about the difference between the mean vectors μ1 - μ2, because the central limit theorem does not apply in these circumstances. We will also need to assume that:
- both populations are multivariate normal, and
- the populations have a common covariance matrix (Σ1 = Σ2 = Σ).

This additional structure is sufficient for making inferences about the p x 1 vector μ1 - μ2 when the sample sizes n1 and n2 are small relative to p.

We know that if Σ1 = Σ2 = Σ, then

  \sum_j (x_{1j} - \bar{x}_1)(x_{1j} - \bar{x}_1)' = (n1 - 1)S_1

is an estimate of (n1 - 1)Σ and

  \sum_j (x_{2j} - \bar{x}_2)(x_{2j} - \bar{x}_2)' = (n2 - 1)S_2

is an estimate of (n2 - 1)Σ. Consequently, we can combine (or pool) the information in both samples:

  S_pooled = [(n1 - 1)S_1 + (n2 - 1)S_2] / (n1 + n2 - 2).
Now to test the hypothesis H0: μ1 - μ2 = δ0, we consider the squared distance of the sample estimate \bar{x}_1 - \bar{x}_2 from the hypothesized difference δ0. Since E(\bar{X}_1 - \bar{X}_2) = μ1 - μ2, independence of the samples implies

  Cov(\bar{X}_1 - \bar{X}_2) = Cov(\bar{X}_1) + Cov(\bar{X}_2) = (1/n1)Σ + (1/n2)Σ = (1/n1 + 1/n2)Σ,

and since S_pooled estimates Σ, we have that

  (1/n1 + 1/n2) S_pooled

is an estimator of

  (1/n1 + 1/n2) Σ.

As a result,

  T² = [\bar{X}_1 - \bar{X}_2 - (μ1 - μ2)]' [(1/n1 + 1/n2) S_pooled]^{-1} [\bar{X}_1 - \bar{X}_2 - (μ1 - μ2)]
     ~ [(n1 + n2 - 2)p / (n1 + n2 - p - 1)] F_{p, n1+n2-p-1}.

Once again, this leads directly to normal theory confidence intervals and hypothesis tests for the difference between the two mean vectors. For independent samples from N_p(μ1, Σ) and N_p(μ2, Σ) populations, the likelihood ratio test of the hypothesis H0: μ1 - μ2 = δ0 is rejected in favor of H1: μ1 - μ2 ≠ δ0 at a level of significance α if

  T² = (\bar{x}_1 - \bar{x}_2 - δ0)' [(1/n1 + 1/n2) S_pooled]^{-1} (\bar{x}_1 - \bar{x}_2 - δ0) > [(n1 + n2 - 2)p / (n1 + n2 - p - 1)] F_{p, n1+n2-p-1}(α).

Note the reappearance of the pattern we observed when using the T² statistic to test a single mean vector:

  T² = (multivariate normal N_p(0, Σ) random vector)' [(Wishart W_{n1+n2-2}(Σ) random matrix)/df]^{-1} (multivariate normal N_p(0, Σ) random vector).

Under these circumstances, a 100(1 - α)% confidence region for the difference between mean vectors δ = μ1 - μ2 consists of all δ such that

  (\bar{x}_1 - \bar{x}_2 - δ)' [(1/n1 + 1/n2) S_pooled]^{-1} (\bar{x}_1 - \bar{x}_2 - δ) ≤ [(n1 + n2 - 2)p / (n1 + n2 - p - 1)] F_{p, n1+n2-p-1}(α).

Similarly, simultaneous 100(1 - α)% confidence intervals for single contrasts a'(μ1 - μ2) are given by

  a'(\bar{x}_1 - \bar{x}_2) ± \sqrt{[(n1 + n2 - 2)p / (n1 + n2 - p - 1)] F_{p, n1+n2-p-1}(α)} · \sqrt{(1/n1 + 1/n2) a' S_pooled a},

which will cover any a'(μ1 - μ2) with probability 1 - α.
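A minimal Python sketch of the pooled-covariance two-sample T² test described above (illustrative only; it assumes equal population covariance matrices and uses the scaled-F critical value, and the function name is my own):

import numpy as np
from scipy import stats

def two_sample_hotelling_t2(X1, X2, alpha=0.01):
    """Hotelling T^2 test of H0: mu1 = mu2 using a pooled covariance matrix.

    X1 : (n1, p) array and X2 : (n2, p) array of independent samples.
    """
    X1, X2 = np.asarray(X1, float), np.asarray(X2, float)
    n1, p = X1.shape
    n2 = X2.shape[0]
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    S_pooled = ((n1 - 1) * np.cov(X1, rowvar=False) +
                (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    t2 = diff @ np.linalg.solve((1 / n1 + 1 / n2) * S_pooled, diff)
    crit = (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * stats.f.ppf(1 - alpha, p, n1 + n2 - p - 1)
    return t2, crit, t2 > crit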
Example: consider our previous situation (we took a sample of sixty respondents and randomly assigned each to one of our four groups based on the sugar content and acidity of the yogurt blends, then asked respondents to rate the overall palatability of their assigned yogurt blends on a 100-point scale), resulting in the following experimental design, with cells 1-4 as the treatment labels:

                             Acidity
                          Low      High
  Sugar Content   Low      1        2
                  High     3        4

What if we also collected purchase intent ratings on a one hundred point scale from the respondents?

For our design, the (previously identified) contrasts of interest are

  (μ1 + μ2) - (μ3 + μ4)   sugar content main effect
  (μ1 + μ3) - (μ2 + μ4)   acidity main effect
  (μ1 + μ4) - (μ2 + μ3)   sugar content x acidity interaction.

We can now test the corresponding multivariate hypotheses.
We have the following raw data on our two variables. Let's look at the sugar content main effect.

We'll go through the five steps of hypothesis testing to assess the potential validity of our assertion at α = 0.01.

- State the Null and Alternative Hypotheses: equality versus inequality of the low sugar and high sugar mean vectors (palatability and purchase intent jointly).

- Select the Appropriate Test Statistic: n_Low Sugar - p = n_High Sugar - p = 28 (borderline large samples) and the data appear relatively normal, but are the covariance matrices equal?

Let's look at the sample covariance matrices S_Low Sugar and S_High Sugar. Johnson & Wichern suggest that assuming equality of covariance matrices is problematic if a component of one sample covariance matrix is at least four times as large as the corresponding component of the other sample covariance matrix. Our ratios are in the range of about 2.0 to 2.5, so we meet the (rather liberal) Johnson & Wichern standard and will proceed directly with our hypothesis test.

- Select the Appropriate Test Statistic: n_Low Sugar = n_High Sugar = 30 (borderline large samples), but the data appear relatively normal and the population covariance matrices appear to be (relatively) equal, so use

  T² = (\bar{x}_Low Sugar - \bar{x}_High Sugar)' [(1/n_Low Sugar + 1/n_High Sugar) S_pooled]^{-1} (\bar{x}_Low Sugar - \bar{x}_High Sugar)

where

  S_pooled = [(n_Low Sugar - 1) S_Low Sugar + (n_High Sugar - 1) S_High Sugar] / (n_Low Sugar + n_High Sugar - 2).
- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: we have α = 0.01 and p = 2, n_Low Sugar + n_High Sugar - p - 1 = 57 degrees of freedom, so F_{2,57}(0.01) = 4.998.

[Sketch of the F_{2,57} distribution: 99% do-not-reject region, 1% reject region in the upper tail.]

But we don't yet have a decision rule, since T² must be compared with a scaled F value. Thus our critical value is

  [(n_Low Sugar + n_High Sugar - 2)p / (n_Low Sugar + n_High Sugar - p - 1)] F_{2,57}(0.01) = (58 · 2/57)(4.998) ≈ 10.17.

So we have: Decision rule - do not reject H0 if T² ≤ 10.17; otherwise reject H0.

- Calculate the Test Statistic: from the two sample mean vectors and S_pooled we have T² = 0.425.

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: T² = 0.425 ≤ 10.17, so do not reject H0. The sample evidence does not refute the claim that no difference exists between the mean palatability and purchase intent ratings for the low sugar and high sugar content yogurts.
SAS code for a One-Factor MANOVA

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT y1 y2 factor1 factor2;
  LABEL y1='Palatability Rating'
        y2='Purchase Intent'
        factor1='Low Sugar(1) vs. High Sugar(2)'
        factor2='Low Acid(1) vs. High Acid(2)';
CARDS;
65 67 1 1
63 71 1 2
72 77 2 1
60 58 2 2
72 70 1 1
. . . . . . . . .
84 92 2 1
64 68 2 2
;
PROC GLM DATA=stuff;
  CLASS factor1;
  MODEL y1 y2 = factor1 /nouni;
  MANOVA H=factor1;
  TITLE4 'Using PROC GLM to test a Hypothesis of Equal Mean Vectors';
RUN;
SAS output for a One-Factor MANOVA

The GLM Procedure
Multivariate Analysis of Variance

Class Level Information
Class      Levels   Values
factor1         2   1 2

Number of observations: 60

Characteristic Roots and Vectors of E Inverse * H, where
H = Type III SSCP Matrix for factor1
E = Error SSCP Matrix

Characteristic               Characteristic Vector V'EV=1
          Root   Percent             y1            y2
    0.00733375    100.00    -0.01995203    0.02439915
    0.00000000      0.00     0.01851985   -0.00600644

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall factor1 Effect
H = Type III SSCP Matrix for factor1
E = Error SSCP Matrix
S=1  M=0  N=27.5

Statistic                     Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.99271964      0.21        2       57   0.8120
Pillai's Trace           0.00728036      0.21        2       57   0.8120
Hotelling-Lawley Trace   0.00733375      0.21        2       57   0.8120
Roy's Greatest Root      0.00733375      0.21        2       57   0.8120
Let's consider acidity. We'll go through the five steps of hypothesis testing to assess the potential validity of our assertion at α = 0.01.

- State the Null and Alternative Hypotheses: equality versus inequality of the low acidity and high acidity mean vectors (palatability and purchase intent jointly).

- Select the Appropriate Test Statistic: n_Low Acidity - p = n_High Acidity - p = 28 (borderline large samples) and the data appear relatively normal, but are the covariance matrices equal?

Let's look at the sample covariance matrices S_Low Acidity and S_High Acidity. Again, Johnson & Wichern suggest that assuming equality of covariance matrices is problematic if a component of one sample covariance matrix is at least four times as large as the corresponding component of the other sample covariance matrix. Our ratios are in the range of about 1.1 to 2.0, so we meet the (rather liberal) Johnson & Wichern standard and will proceed directly with our hypothesis test.

- Select the Appropriate Test Statistic: n_Low Acidity = n_High Acidity = 30 (borderline large samples), but the data appear relatively normal and the population covariance matrices appear to be (relatively) equal, so use

  T² = (\bar{x}_Low Acidity - \bar{x}_High Acidity)' [(1/n_Low Acidity + 1/n_High Acidity) S_pooled]^{-1} (\bar{x}_Low Acidity - \bar{x}_High Acidity)

where

  S_pooled = [(n_Low Acidity - 1) S_Low Acidity + (n_High Acidity - 1) S_High Acidity] / (n_Low Acidity + n_High Acidity - 2).

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: we have α = 0.01 and p = 2, n_Low Acidity + n_High Acidity - p - 1 = 57 degrees of freedom, so F_{2,57}(0.01) = 4.998.

[Sketch of the F_{2,57} distribution: 99% do-not-reject region, 1% reject region in the upper tail.]

But we don't yet have a decision rule, since T² must be compared with a scaled F value. Thus our critical value is

  [(n_Low Acidity + n_High Acidity - 2)p / (n_Low Acidity + n_High Acidity - p - 1)] F_{2,57}(0.01) = (58 · 2/57)(4.998) ≈ 10.17.

So we have: Decision rule - do not reject H0 if T² ≤ 10.17; otherwise reject H0.
- Calculate the Test Statistic: from the two sample mean vectors and S_pooled we have T² = 57.354.

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: T² = 57.354 > 10.17, so reject H0. The sample evidence refutes the claim that no difference exists between the mean palatability and purchase intent ratings for the low acidity and high acidity yogurts.
SAS code for a One-Factor MANOVA

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT y1 y2 factor1 factor2;
  LABEL y1='Palatability Rating'
        y2='Purchase Intent'
        factor1='Low Sugar(1) vs. High Sugar(2)'
        factor2='Low Acid(1) vs. High Acid(2)';
CARDS;
65 67 1 1
63 71 1 2
72 77 2 1
60 58 2 2
72 70 1 1
. . . . . . . . .
84 92 2 1
64 68 2 2
;
PROC GLM DATA=stuff;
  CLASS factor2;
  MODEL y1 y2 = factor2 /nouni;
  MANOVA H=factor2;
  TITLE4 'Using PROC GLM to test a Hypothesis of Equal Mean Vectors';
RUN;
SAS output for a One-Factor MANOVA

The GLM Procedure
Multivariate Analysis of Variance

Class Level Information
Class      Levels   Values
factor2         2   1 2

Number of observations: 60

Characteristic Roots and Vectors of E Inverse * H, where
H = Type III SSCP Matrix for factor2
E = Error SSCP Matrix

Characteristic               Characteristic Vector V'EV=1
          Root   Percent             y1            y2
    0.90937019    100.00     0.02299899   -0.00522470
    0.00000000      0.00    -0.02147417    0.02475409

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall factor2 Effect
H = Type III SSCP Matrix for factor2
E = Error SSCP Matrix
S=1  M=0  N=27.5

Statistic                     Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda            0.52373291     25.92        2       57   <.0001
Pillai's Trace           0.47626709     25.92        2       57   <.0001
Hotelling-Lawley Trace   0.90937019     25.92        2       57   <.0001
Roy's Greatest Root      0.90937019     25.92        2       57   <.0001
When the covariance structures are not equal (i.e., Σ1 ≠ Σ2), any measure of distance (such as T²) will depend on the unknown Σ1 and Σ2 when at least one of the sample sizes n1 and n2 is small relative to p. However, if both sample sizes n1 and n2 are large relative to p, we can avoid the complexities due to unequal covariance matrices when making inferences about the difference between the mean vectors μ1 - μ2. Under such conditions we have that

  [\bar{X}_1 - \bar{X}_2 - (μ1 - μ2)]' [S_1/n1 + S_2/n2]^{-1} [\bar{X}_1 - \bar{X}_2 - (μ1 - μ2)]

is approximately χ²_p.

Once again, this leads directly to large sample theory confidence intervals and hypothesis tests for the difference between the two mean vectors. The likelihood ratio test of the hypothesis H0: μ1 - μ2 = δ0 is rejected in favor of H1: μ1 - μ2 ≠ δ0 at a level of significance α if

  T² = (\bar{x}_1 - \bar{x}_2 - δ0)' [S_1/n1 + S_2/n2]^{-1} (\bar{x}_1 - \bar{x}_2 - δ0) > χ²_p(α).

Under these circumstances, a 100(1 - α)% confidence region for the difference between mean vectors δ = μ1 - μ2 consists of all δ such that

  (\bar{x}_1 - \bar{x}_2 - δ)' [S_1/n1 + S_2/n2]^{-1} (\bar{x}_1 - \bar{x}_2 - δ) ≤ χ²_p(α).

Similarly, simultaneous 100(1 - α)% confidence intervals for single contrasts a'(μ1 - μ2) are given by

  a'(\bar{x}_1 - \bar{x}_2) ± \sqrt{χ²_p(α)} · \sqrt{a'(S_1/n1 + S_2/n2)a},

which will cover any a'(μ1 - μ2) with probability 1 - α.
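A minimal Python sketch of these large-sample simultaneous intervals for the individual components of μ1 - μ2 (illustrative only; it assumes both samples are large relative to p, uses the chi-square critical value above with a set to the unit coordinate vectors, and the function name is my own):

import numpy as np
from scipy import stats

def large_sample_simultaneous_cis(X1, X2, alpha=0.01):
    """Simultaneous CIs for each component of mu1 - mu2 without assuming Sigma1 = Sigma2."""
    X1, X2 = np.asarray(X1, float), np.asarray(X2, float)
    n1, p = X1.shape
    n2 = X2.shape[0]
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    V = np.cov(X1, rowvar=False) / n1 + np.cov(X2, rowvar=False) / n2
    half = np.sqrt(stats.chi2.ppf(1 - alpha, p) * np.diag(V))
    return np.column_stack([diff - half, diff + half])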
C. Comparing Mean Vectors From More Than Two Populations

If independent samples are drawn from three (or more) populations, they are compared in a manner similar to Analysis of Variance (note we wish to test the equality of the mean vectors). Consider random samples of size n_l from population l, l = 1, ..., g; our data consist of the observations X_{l1}, X_{l2}, ..., X_{ln_l} for each group l. We use the multivariate analog of ANOVA (MANOVA, or Multivariate Analysis of Variance) to assess the assertion that the g groups have equal mean vectors.

Recall that in (univariate) ANOVA we assumed:
- X_{l1}, ..., X_{ln_l} is a random sample of size n_l from a population with mean μ_l, l = 1, ..., g;
- the populations have a common variance σ²;
- these random samples are taken independently from the g populations.

Thus we can decompose any observation X_{lj} in the data set as

  X_{lj} = μ + τ_l + e_{lj}

and its estimate as

  x_{lj} = \bar{x} + (\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l).

This, of course, leads to the reparameterization of the hypothesis: our null hypothesis of equal means across the g groups becomes H0: τ_1 = τ_2 = ... = τ_g = 0. Note also that, if the assumptions are met, then e_{lj} ~ N(0, σ²) and that we must impose a restriction such as

  \sum_l n_l τ_l = 0

or perhaps τ_1 = 0 to achieve unique least-squares estimates.
The decomposition of the jth observation of the lth group,

  x_{lj} = \bar{x} + (\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l),

suggests a similar sample decomposition: \bar{x} is the estimate of μ, (\bar{x}_l - \bar{x}) is the estimate of τ_l, and (x_{lj} - \bar{x}_l) is the estimate of e_{lj}.

If we subtract the overall sample mean \bar{x} from both sides and square the results,

  (x_{lj} - \bar{x})² = [(\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l)]²,

then sum both sides over the subscript j, we get

  \sum_j (x_{lj} - \bar{x})² = n_l(\bar{x}_l - \bar{x})² + \sum_j (x_{lj} - \bar{x}_l)²

(the cross-product term sums to zero). If we then sum both sides over the subscript l, the result is the classic partitioning of the sums of squares,

  SS_tot or SS_cor (total corrected SS) = SS_tr (treatment, model, or between-samples SS) + SS_res (residual, error, or within-samples SS),

which can be restated as

  SS_obs (total observation SS) = SS_mean (mean SS) + SS_tr (treatment, model, or between-samples SS) + SS_res (residual, error, or within-samples SS).
Note that MS_tr = SS_tr/(g - 1) and MS_res = SS_res/(\sum_l n_l - g) are independent estimates of the pooled population variance σ². Thus, if the treatment means are equal, both MS_tr and MS_res estimate σ², so we have

  F = MS_tr / MS_res ~ F_{g-1, \sum n_l - g},

and our test of the null hypothesis of equal means across the g groups, H0: τ_1 = τ_2 = ... = τ_g = 0, is rejected at level of significance α if

  F = MS_tr / MS_res > F_{g-1, \sum n_l - g}(α).

The results of an Analysis of Variance are usually summarized and displayed in an ANOVA table with rows for Treatments (df = g - 1), Error (df = \sum n_l - g), and Total (df = \sum n_l - 1).
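An illustrative Python sketch of this sum-of-squares partition and the resulting F test (the helper is my own, not the course's SAS code):

import numpy as np
from scipy import stats

def one_way_anova(groups, alpha=0.01):
    """One-way ANOVA built from the partition SS_tot = SS_tr + SS_res.

    groups : list of 1-D arrays, one array of observations per treatment group.
    """
    groups = [np.asarray(g, float) for g in groups]
    all_x = np.concatenate(groups)
    grand_mean = all_x.mean()
    ss_tr = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_res = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_tr, df_res = len(groups) - 1, all_x.size - len(groups)
    f = (ss_tr / df_tr) / (ss_res / df_res)
    return f, stats.f.ppf(1 - alpha, df_tr, df_res), stats.f.sf(f, df_tr, df_res)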
Example: consider our previous situation (we took a sample of sixty respondents and randomly assigned each to one of our four groups based on the sugar content and acidity of the yogurt blends, then asked respondents to rate the overall palatability and purchase intent of their assigned yogurt blends on a 100-point scale). What if we also collected data at a third (moderate) level of sugar content from thirty additional consumers? This is the resulting experimental design.

For our design, the (previously identified) contrasts of interest are still the sugar content main effect, the acidity main effect, and the sugar content x acidity interaction for either variable (palatability or purchase intent), but they are now much more complex.

We have the following raw data on our two variables. Let's look at the sugar content main effect.

We'll go through the five steps of hypothesis testing to assess the potential validity of our assertion at α = 0.01.

- State the Null and Alternative Hypotheses: equality of the low, moderate, and high sugar mean palatability ratings versus at least one difference.

- Select the Appropriate Test Statistic: we would like to do an ANOVA, but are the variances equal? We have the three group sample variances, and these don't look too dissimilar.
- Since the data appear relatively normal and the population variances appear to be (relatively) equal, we'll use the F test

  F = MS_tr / MS_res

where MS_tr = SS_tr/(g - 1) and MS_res = SS_res/(\sum n_l - g).

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: we have α = 0.01 and g - 1 = 2, n_Low Sugar + n_Moderate Sugar + n_High Sugar - g = 87 degrees of freedom, so F_{2,87}(0.01) = 4.858.

[Sketch of the F_{2,87} distribution: 99% do-not-reject region, 1% reject region in the upper tail.]

So we have: Decision rule - do not reject H0 if F ≤ 4.858; otherwise reject H0.
- Calculate the Test Statistic: first we calculate the sample treatment means and the overall mean, then use these to calculate the sample treatment sum of squares and the sample error sum of squares. Now we can calculate the mean squares and the F ratio and complete the Analysis of Variance table:

  Source         DF             Sum of Squares   Mean Square        F
  Treatments     3 - 1 = 2            13.156          6.578    0.0745
  Error          90 - 3 = 87        7684.133         88.323
  Total          90 - 1 = 89        7697.289

- Use the Decision Rule to Evaluate the Test Statistic and Decide Whether to Reject or Not Reject the Null Hypothesis: F = 0.0745 ≤ 4.858, so do not reject H0. The sample evidence does not refute the claim that no difference exists between the mean palatability ratings for the low, moderate, and high sugar content yogurts.
SAS code for a One-Factor ANOVA (PROC GLM)

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT y1 y2 factor1 factor2;
  LABEL y1='Palatability Rating'
        y2='Purchase Intent'
        factor1='Low Sugar(1) vs. Moderate Sugar(2) vs. High Sugar(3)'
        factor2='Low Acid(1) vs. High Acid(2)';
CARDS;
65 67 1 1
72 70 1 1
. . . . . .
64 68 3 2
;
PROC GLM DATA=stuff;
  CLASS factor1;
  MODEL y1 = factor1;
  TITLE4 'Using PROC GLM to test a Hypothesis of Equal Means';
RUN;
PROC GLM output for a One-Factor ANOVA

The GLM Procedure
Dependent Variable: y1   Palatability Rating

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2        13.155556      6.577778      0.07   0.9283
Error              87      7684.133333     88.323372
Corrected Total    89      7697.288889

R-Square   Coeff Var   Root MSE    y1 Mean
0.001709    13.48572   9.398051   69.68889

Source      DF     Type I SS   Mean Square   F Value   Pr > F
factor1      2   13.15555556    6.57777778      0.07   0.9283

Source      DF   Type III SS   Mean Square   F Value   Pr > F
factor1      2   13.15555556    6.57777778      0.07   0.9283
If the independent samples drawn from three (or more) populations are multivariate, they are compared in a manner similar to Analysis of Variance (note we wish to test the equality of the mean vectors). Consider random samples of size n_l from population l, l = 1, ..., g; our data consist of the p x 1 observation vectors X_{l1}, X_{l2}, ..., X_{ln_l} for each group l. We use the multivariate analog of ANOVA (MANOVA, or Multivariate Analysis of Variance) to assess the assertion that the g groups have equal mean vectors.

In MANOVA we must assume:
- X_{l1}, ..., X_{ln_l} is a random sample of size n_l from a population with mean vector μ_l, l = 1, ..., g;
- the populations have a common covariance matrix Σ;
- these random samples are taken independently from the g populations.

Thus we can decompose any observation X_{lj} in the data set as

  X_{lj} = μ + τ_l + e_{lj}

and its estimate as

  x_{lj} = \bar{x} + (\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l).

This, of course, leads to the reparameterization of the hypothesis: our null hypothesis of equal mean vectors across the g groups becomes H0: τ_1 = τ_2 = ... = τ_g = 0. Note also that, if the assumptions are met, then e_{lj} ~ N_p(0, Σ) and that we must impose a restriction such as

  \sum_l n_l τ_l = 0

to achieve unique least-squares estimates (this is the most common of such restrictions).
The decomposition of the jth observation of the lth group,

  x_{lj} = \bar{x} + (\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l),

suggests a similar sample decomposition: \bar{x} is the estimate of μ, (\bar{x}_l - \bar{x}) is the estimate of τ_l, and (x_{lj} - \bar{x}_l) is the estimate of e_{lj}.

If we subtract the overall sample mean vector \bar{x} from both sides, form the squares and cross-products, and then sum both sides over the subscripts j and l, the result is the multivariate generalization of the classic partitioning of the sums of squares:

  \sum_l \sum_j (x_{lj} - \bar{x})(x_{lj} - \bar{x})'   (SS_tot or SS_cor, the total corrected SS)
    = \sum_l n_l (\bar{x}_l - \bar{x})(\bar{x}_l - \bar{x})'   (SS_tr, the treatment, model, or between-samples SS)
    + \sum_l \sum_j (x_{lj} - \bar{x}_l)(x_{lj} - \bar{x}_l)'   (SS_res, the residual, error, or within-samples SS).

The resulting MANOVA table looks very similar to an ANOVA table, with rows for Treatments, Error, and Total; its entries are matrices, sometimes called sums of squares and cross-products (SSCP or SSP) matrices. Note the lack of mean squares - their distributions are much more complex in p dimensions.
One approach (suggested by Wilks, 1932) to testing the null hypothesis of equal mean vectors across the g groups, H0: τ_1 = τ_2 = ... = τ_g = 0, is the ratio of generalized variances

  Λ* = |SS_res| / |SS_tr + SS_res|,

i.e., the determinant of the within-samples SSCP matrix divided by the determinant of the total corrected SSCP matrix. This is related to the likelihood ratio criterion.

Under certain conditions, the exact distribution of this Wilks' lambda (Λ*) is known:

  No. of variables   No. of groups   Exact sampling distribution (multivariate normal data)
  p = 1              g ≥ 2           [(\sum n_l - g)/(g - 1)] (1 - Λ*)/Λ* ~ F_{g-1, \sum n_l - g}
  p = 2              g ≥ 2           [(\sum n_l - g - 1)/(g - 1)] (1 - \sqrt{Λ*})/\sqrt{Λ*} ~ F_{2(g-1), 2(\sum n_l - g - 1)}
  p ≥ 1              g = 2           [(\sum n_l - p - 1)/p] (1 - Λ*)/Λ* ~ F_{p, \sum n_l - p - 1}
  p ≥ 1              g = 3           [(\sum n_l - p - 2)/p] (1 - \sqrt{Λ*})/\sqrt{Λ*} ~ F_{2p, 2(\sum n_l - p - 2)}

Bartlett (1938) has developed an approximate distribution for other cases.
Bartlett (1938) showed that, when n = \sum_l n_l is large,

  -[n - 1 - (p + g)/2] ln Λ*  is approximately  χ²_{p(g-1)},

so we can reject the null hypothesis of equal mean vectors across the g groups, H0: τ_1 = τ_2 = ... = τ_g = 0, at an α level of significance if

  -[n - 1 - (p + g)/2] ln Λ* > χ²_{p(g-1)}(α).
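An illustrative Python sketch of the one-way MANOVA calculation just described: the within (SS_res) and between (SS_tr) SSCP matrices, Wilks' Λ*, and Bartlett's chi-square approximation. The function and variable names are my own, not the lecture's SAS code.

import numpy as np
from scipy import stats

def wilks_lambda_manova(groups, alpha=0.01):
    """One-way MANOVA: Wilks' lambda with Bartlett's chi-square approximation.

    groups : list of (n_l, p) arrays, one array of observation vectors per population.
    """
    groups = [np.asarray(G, float) for G in groups]
    p = groups[0].shape[1]
    g = len(groups)
    n = sum(len(G) for G in groups)
    grand_mean = np.vstack(groups).mean(axis=0)
    W = sum((len(G) - 1) * np.cov(G, rowvar=False) for G in groups)           # within-samples SSCP (SS_res)
    B = sum(len(G) * np.outer(G.mean(axis=0) - grand_mean,
                              G.mean(axis=0) - grand_mean) for G in groups)   # between-samples SSCP (SS_tr)
    lam = np.linalg.det(W) / np.linalg.det(B + W)                             # Wilks' lambda
    chi2_stat = -(n - 1 - (p + g) / 2) * np.log(lam)                          # Bartlett's statistic
    df = p * (g - 1)
    return lam, chi2_stat, stats.chi2.ppf(1 - alpha, df), stats.chi2.sf(chi2_stat, df)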
Example: consider our previous situation (we took our sample of ninety respondents, thirty at each sugar level, randomly assigned to one of our six groups based on the sugar content and acidity of the yogurt blends, then asked respondents to rate the overall palatability and purchase intent of their assigned yogurt blends on a 100-point scale). This is the resulting experimental design.

For our design, the (previously identified) contrasts of interest are still the sugar content main effect, the acidity main effect, and the sugar content x acidity interaction for either variable (palatability or purchase intent), but they are now much more complex.

We have the following raw data on our two variables. Let's look at the sugar content main effect on both variables (palatability and purchase intent) jointly.
We'll go through the five steps of hypothesis testing to assess the potential validity of our assertion at α = 0.01.

- State the Null and Alternative Hypotheses: equality of the low, moderate, and high sugar mean vectors versus at least one difference.

- Select the Appropriate Test Statistic: we would like to do a MANOVA, but are the covariance matrices equal?

Let's look at the sample covariance matrices S_Low Sugar, S_Moderate Sugar, and S_High Sugar. These do look reasonably similar (for statistical estimates), and they do pass the Johnson & Wichern suggested criterion for assuming equality of covariance matrices (our ratios are in the range of about 1.5 to 1.8), so we will proceed directly with our hypothesis test.

- Since the data appear relatively normal, the population covariance matrices appear to be (relatively) equal, and we have p = 2 variables and g = 3 groups, we'll use an exact Λ* test, where

  Λ* = |SS_res| / |SS_tr + SS_res|.

- State the Desired Level of Significance α, Find the Critical Value(s), and State the Decision Rule: we have α = 0.01 and ν1 = 3, ν2 = n_Low Sugar + n_Moderate Sugar + n_High Sugar - p - 1 = 87 degrees of freedom, so F_{3,87}(0.01) = 4.015.

[Sketch of the F distribution: 99% do-not-reject region, 1% reject region in the upper tail.]

So we have: Decision rule - do not reject H0 if F ≤ 4.015; otherwise reject H0.