Title: QNT 531 Advanced Problems in Statistics and Research Methods
1QNT 531Advanced Problems in Statistics and
Research Methods
- WORKSHOP 2
- By
- Dr. Serhat Eren
- University OF PHOENIX
2SECTION 2
- ANALYSIS OF VARIANCE AND EXPERIMENTAL DESIGN
3SECTION 2SECTION OBJECTIVES
- An Introduction to Analysis of Variance
- Analysis of Variance Testing for the Equality of
k population means - Multiple comparison procedures
- An introduction to Experimental Design
- Completely Randomized Designs
- Randomized Block Design
- Factorial Experiment
4SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- One-Way Designs The Basics
- A factor is a variable that can be used to
differentiate one group or population from
another. It is a variable that may be related to
he variable of interest. - A level is one of several possible values or
settings that the factor can assume. - The response variable is a quantitative variable
that you are measuring or observing.
5SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- These are all examples of one-way or completely
randomized designs. - An experiment has a one-way or completely
randomized design if there are several different
levels of one factor being studied and the
objects or people being observed/ measured are
randomly assigned to one of the levels of the
factor.
6SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- The term one-way refers to the fact that the
groups differ with regard to the one factor being
studied. - The term completely randomized refers to the fact
that individual observations are assigned to the
groups in a random manner.
7SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- Understanding the Total Variation
- Analysis of variance (ANOYA) is the technique
used to analyze the variation in the data to
determine if more than two population means are
equal. - A treatment is a particular setting or
combination of settings of the factor(s)
8SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- The grand mean or the overall mean is the sample
average of all the observations in the
experiment. It is labeled (x-bar-bar). - Now we can rewrite the variance calculations as
follows
9SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- The total variation or sum of squares total (SST)
is a measure of the variability in the entire
data set considered as a whole. - SST is calculated as follows
10SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- Components of Total Variation
- The between groups variation is also called the
Sum or Squares between or the Sum of Squares
Among and it measures how much of the total
variation comes from actual differences in the
treatments. - The dot-plot shown in Figure 14.3 displays the
sample average for each of the four time
treatments. These are called treatment means.
11(No Transcript)
12SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- A treatment mean is the average of the response
variable for a particular treatment. - Between Groups Variation measures how different
the individual treatment means are from the
overall grand mean. It is often called the sum of
squares between or the sum of squares among (SSA).
13SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- The formula for sum of squares among (SSA) is
- Within groups variation measures the variability
in the measurements within the groups. It is
often called sum of squares within or sum of
squares error (SSE).
14(No Transcript)
15(No Transcript)
16(No Transcript)
17(No Transcript)
18(No Transcript)
19(No Transcript)
20SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- The Mean Square Terms in the ANOVA Table
- The mean square among is labeled MSA The mean
square error is labeled MSE and the mean square
total is labeled MST. - The formulas for the mean squares are
21SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- Testing the Hypothesis of Equal Means
- In general, the null and alternative hypotheses
for a one-way designed experiment are shown
below -
- HA At least one of the population means is
different from the others.
22SECTION 2 ANALYSIS OF DATA FROM ONE-WAY DESIGNS
- The formula for the F test statistic is
calculated by taking the ratio of the two sample
variances - In ANOVA, MSA and MSE are our two sample
variances. So the F statistic is calculated as
23SECTION 2 ASSUMPTIONS OF ANOVA
- The three major assumptions of ANOVA are as
follows - The errors are random and independent of each
other. - Each population has a normal distribution.
- All of the populations have the same variance.
24SECTION 2ANALYSIS OF DATA FROM BLOCKED DESIGNS
- A block is a group or objects or people that have
been matched. Are object or person can be matched
with itself, meaning that repeated observations
are taken on that object or person and these
observations form a block? - If the realities of data collection lead you to
use blocks, then you must take this into account
in your analysis. Your experimental design is
called a randomized block design. Instead of
using a one-way ANOVA you must use a block ANOVA.
25SECTION 2ANALYSIS OF DATA FROM BLOCKED DESIGNS
- An experiment has a randomized block design if
several different levels of one factor are being
studied and the objects or people being observed/
measured have been matched. - Each object or person is randomly assigned to one
of the c levels of the factor.
26SECTION 2ANALYSIS OF DATA FROM BLOCKED DESIGNS
- Partitioning the Total Variation
- Like the approach we took with data from a
one-way design, the idea is to take the total
variability as measured by SST and break it down
into its components. - With a block design there is one additional
component the variability between the blocks. It
is called the sum of squares blocks and is
labeled SSBL.
27SECTION 2ANALYSIS OF DATA FROM BLOCKED DESIGNS
- The sum of squares blocks measures the
variability between the blocks. It is labeled
SSBL. - For a block design, the variation we see in the
data is due to one of three things the level of
the factor, the block, or the error.
28SECTION 2ANALYSIS OF DATA FROM BLOCKED DESIGNS
- Thus, the total variation is divided into three
components - SST SSA SSBL SSE
29SECTION 2ANALYSIS OF DATA FROM BLOCKED DESIGNS
- Using the ANOVA Table in a Block Design
- The ANOVA table for such a block design looks
just like the ANOVA table for a one-way design
with an additional row.
30(No Transcript)
31(No Transcript)
32(No Transcript)
33(No Transcript)
34(No Transcript)
35SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- Motivation for a Factorial Design Model
- An experimental design is called a factorial
design with two factors if there are several
different levels of two factors being studied. - The first factor is called factor A and there are
r levels of factor A. The second factor is called
factor B and there are c levels of factor B.
36SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- The design is said to have equal replication if
the same number of objects or people being
observed/measured are randomly selected from each
population. - The population is described by a specific level
for each of the two factors. Each observation is
called a replicate. - There are n' observations or replicates observed
from each population. There are n n'rc
observations in total.
37SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- Partitioning the Variation
- The sum of squares due to factor A is labeled
SSA. It measures the squared differences between
the mean of each level of factor A and the grand
mean. - The sum of squares due to factor B is labeled
SSB. It measures the squared differences between
the mean of each level of factor B and the grand
mean.
38SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- The sum of squares due to the interacting effect
of A and B is labeled SSAB. It measures the
effect of combining factor A and factor B. - The sum of squares error is labeled SSE. It
measures the variability in the measurements
within the groups. - Thus, the total variation is divided into four
components - SST SSA SSB SSAB SSE
39SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- Using the ANOVA Table in a Two-Way Design
- The ANOVA table for such a design looks just like
the ANOVA table for a one-way design with two
additional rows.
40(No Transcript)
41SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- Using the ANOVA Table in a Two-Way Design
- In a two-way ANOVA, three hypothesis tests should
be done. - To test the hypothesis of no difference due to
factor A we would have the following null and
alternative hypotheses - Ho There is no difference in the population
means due to factor A. - HA There is a difference in the population
means due to factor A.
42SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- To test the hypothesis of no difference due to
factor B we would have the following null and
alternative hypotheses - Ho There is no difference in the population
means due to factor B. - HA There is a difference in the population
means due to factor B.
43SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- To test the hypothesis of no difference due to
the interaction of factors A and B we would have
the following null and alternative hypotheses -
- Ho There is no difference in the population
means due to the interaction of factors A and B. - HA There is a difference in the population
means due to the interaction of factors A arid B.
44SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- Understanding the interaction Effect
- The easiest way to understand this effect is to
look at a graph of the sample averages for each
of the possible combinations of the two factors. - The line graph shown in Figure 14.7 displays the
20 sample means for airspace.
45(No Transcript)
46SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- From this graph you can see that the mean
airspace decreases the longer the box sits on the
shelf, regardless of from what position in the
hardroll the box was made. - The airspace behavior is affected by the
interaction of the time on the shelf and the
position in the hardroll from which it was made.
47SECTION 2ANALYSIS OF DATA FROM TWO-WAY DESIGNS
- If there were no interaction effect, the lines
connecting the sample means would be parallel as
in Figure 14.8.
48(No Transcript)
49SECTION 2 MULTIPLE COMPARISON PROCEDURE
- When we use analysis of variance to test whether
the means of k populations are equal, rejection
of the null hypothesis allows us to conclude only
that the population means are not all equal. - In some cases we will want to go a step further
and determine where the differences among means
occur.
50SECTION 2 MULTIPLE COMPARISON PROCEDURE
- The purpose of this section is to introduce two
multiple comparison procedures that can be used
to conduct statistical comparisons between pairs
of population means.
51SECTION 2 MULTIPLE COMPARISON PROCEDURE
- 2.3.1 FISHERS LSD
- Suppose that analysis of variance has provided
statistical evidence to reject the null
hypothesis of equal population means. - In this case, Fishers least significant
difference (LSD) procedure can be used to
determine where the differences occur.
52(No Transcript)
53(No Transcript)
54SECTION 2 MULTIPLE COMPARISON PROCEDURE
- Confidence Interval Estimate of the Difference
Between Two Population Means Using Fishers LSD
Procedure
55SECTION 2 MULTIPLE COMPARISON PROCEDURE
- TYPE I ERROR RATES
- We showed how Fishers LSD procedure can be used
in such cases to determine where the differences
occur. - Technically, it is referred to as a protected or
restricted LSD test because it is employed only
if we first find a significant F value by using
analysis of variance.
56SECTION 2 MULTIPLE COMPARISON PROCEDURE
- To see why this distinction is important in
multiple comparison tests, we need to explain the
difference between a comparisonwise Type I error
rate and an experimentwise Type I error rate.
57SECTION 2 MULTIPLE COMPARISON PROCEDURE
- For example, in the NCP example Fishers LSD
procedure was used to make three pairwise
comparisons.
58SECTION 2 MULTIPLE COMPARISON PROCEDURE
- In each case, we used a level of significance of
? 0.05. - Therefore, for each test, if the null hypothesis
is true, the probability that we will make a Type
I error is ? 0.05 hence, the probability that
we will not make a Type I error on each test is
1- 0.05 0.95.
59SECTION 2 MULTIPLE COMPARISON PROCEDURE
- In discussing multiple comparison procedures we
refer to this probability of a Type I error
(? 0.05) as the comparisonwise Type I error
rate comparisonwise Type I error rates indicate
the level of significance associated with a
single pairwise comparison. - Let us now consider a slightly different
question. What is the probability that in making
three pairwise comparisons, we will commit a Type
I error on at least one of the three tests?
60SECTION 2 MULTIPLE COMPARISON PROCEDURE
- To answer this question, note that the
probability that we will not make a Type I error
on any of the three tests is - (.95)(.95)(.95)0 .8574.
- Therefore, the probability of making at least one
Type I error is -
- 1-0.8574 0.1426
61SECTION 2 MULTIPLE COMPARISON PROCEDURE
- Thus, when we use Fishers LSD procedure to make
all three pairwise comparisons, the Type I error
rate associated with this approach is not .05,
but actually 0.1426 we refer to this error rate
as the overall or experimentwise Type I error
rate.