Experimental Design - PowerPoint PPT Presentation

Slides: 74
Provided by: michael1175

Transcript and Presenter's Notes

1
Experimental Design
2
  • The Big Questions . . .
  • What are the two basic ways that repeated
    measures are incorporated into experimental
    designs?
  • What are the advantages?
  • What problems await the unwary?

3
Within-Subject Designs
  • In a within-subject design, participants complete
    more than one measure. Two basic forms are used:
  • Repeated measures designs
  • Repeated treatment designs

4
Repeated Measures Design
In a repeated measures design, the manipulation
is a between-subjects factor.
5
Repeated Treatment Design
In a repeated treatment design, the treatment is
manipulated within-subjects.
6
Complex designs can have repeated measures (Time)
and repeated treatments (A) as well as treatments
manipulated between-subjects (B).
7
  • There are several distinct advantages to
    within-subject designs:
  • More economical use of subjects
  • Better control over individual differences
    (increased power)
  • Perhaps the only way to test the hypothesis

8
The savings in sample size is one of the
attractive features of these designs
9
The original measures (e.g., Time 1, Time 2,
etc.) in a within-subjects design are transformed
as part of the statistical analyses. The greater
power in within-subjects designs stems from the
particular linear combinations of the multiple
measures that are created.
10
The weights (λ) in the linear combination are
chosen to create two basic parts of the
statistical design: the between-subjects part and
the within-subjects part. The within-subjects
part of the design involves difference scores. It
is here that the greater power is gained. When a
change score is formed, many individual
difference variables--ordinarily a source of
error--are held constant.
11
In this basic 2 x 2 design, there will be a main
effect for Treatment vs. Control, a main effect
for Time, and an interaction. Those effects can
be thought of as reflecting the analysis of two
new dependent variables: the sum of Time 1 and
Time 2, and the difference between Time 1 and
Time 2.
12
The sum becomes the between-subjects part of the
design, with a t-test of the difference between
M_TSum and M_CSum (the Treatment and Control
group means on the sum score) indicating if the
two groups are different in their overall levels
of performance. In this case, the weights (λ) for
combining the two scores that each person has are
1 and 1.
13
The difference score becomes the within-subjects
part of the design, with a t-test of the
difference between M_TDiff and M_CDiff (the
Treatment and Control group means on the
difference score) indicating if the two groups
are different in their changes in performance.
If so, then this indicates an interaction between
Treatment vs. Control and Time. In this case, the
weights for combining the two original scores are
1 and -1.
14
The grand mean for the difference variable is
also meaningful. It indicates if there was any
overall change from Time 1 to Time 2. It can be
tested against the null hypothesis value of 0 and
is the main effect for Time in the
within-subjects part of the design.
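
The sum and difference scores described on the last few slides can be sketched in a few lines of Python. This is only an illustration: the scores below are invented, and the names `combine`, `group_effect`, `interaction`, and `time_effect` are hypothetical labels, not part of the original slides. The sketch computes the mean differences that the t-tests would evaluate; it does not run the tests themselves.

```python
from statistics import mean

# Hypothetical (Time 1, Time 2) scores, invented for illustration only.
treatment = [(10, 14), (12, 15), (9, 13), (11, 16)]
control = [(10, 11), (12, 12), (9, 10), (11, 11)]


def combine(scores, weights):
    """Apply the two weights to each subject's pair of scores."""
    return [weights[0] * t1 + weights[1] * t2 for t1, t2 in scores]


# Weights (1, 1): the sum score, the between-subjects part of the design.
sum_t, sum_c = combine(treatment, (1, 1)), combine(control, (1, 1))
# Weights (1, -1): the difference score, the within-subjects part.
diff_t, diff_c = combine(treatment, (1, -1)), combine(control, (1, -1))

# Difference in mean sum scores: basis of the Treatment vs. Control main effect.
group_effect = mean(sum_t) - mean(sum_c)
# Difference in mean difference scores: basis of the Treatment x Time interaction.
interaction = mean(diff_t) - mean(diff_c)
# Grand mean of the difference score (equal group sizes): main effect of Time.
time_effect = mean(diff_t + diff_c)

print(group_effect, interaction, time_effect)
```

With weights 1 and -1 the difference is Time 1 minus Time 2, so a negative value here means scores rose from Time 1 to Time 2.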
15
As more time periods are added, the possible ways
to construct the linear combinations increase and
their match to the underlying hypotheses needs to
be considered carefully. The weights reflect how
the scores are combined to create new variables
that are analyzed.
16
What are these linear combinations testing?
17
What about these?
18
The formula for the variance of a linear
combination reveals why a within-subjects design
is usually more powerful than a between-subjects
design.
In the simplest within-subjects design, the
weights will be either 1 and 1, or, 1 and -1, for
the sum and difference respectively.
19
Assuming equal variances for the two measures and
assuming standardized scores for convenience, the
variance for a sum score will be 2(1 + r12), where
r12 is the correlation between the two measures.
The variance for the difference score will be
2(1 - r12). As long as the correlation between the
two measures is greater than zero, the variance
of the difference score will be smaller than the
variance for the sum score. That will make
statistical tests of the difference score more
powerful.
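
As a quick check on the two variance formulas, the sketch below computes the variance of a weighted combination directly from a covariance matrix (the sum of w_i * w_j * cov_ij over all pairs). The correlation of 0.6 is an assumed value chosen for illustration, and `linear_combination_variance` is a hypothetical helper name.

```python
def linear_combination_variance(weights, cov):
    """Variance of sum_i w_i * X_i, computed as sum_ij w_i * w_j * cov[i][j]."""
    k = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(k) for j in range(k))


r12 = 0.6  # assumed correlation between the two standardized measures
cov = [[1.0, r12],
       [r12, 1.0]]

var_sum = linear_combination_variance((1, 1), cov)    # equals 2(1 + r12)
var_diff = linear_combination_variance((1, -1), cov)  # equals 2(1 - r12)

print(var_sum, var_diff)
```

The larger the positive correlation between the two measures, the smaller the variance of the difference score becomes relative to the sum score.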
20
Repeated measures designs have some important
statistical problems that require attention.
In designs such as this, two assumptions must be
met: equivalence of the covariance matrices
across groups, and sphericity.
21
Each group in the design has a variance-covariance
matrix for the multiple measures. These matrices
are assumed to be homogeneous across the groups.
22
The sphericity assumption is met when the pooled
variance-covariance matrix exhibits compound
symmetry, but that is unlikely to happen in a
repeated measures design.
23
Instead, measures that are spaced farther apart
are correlated less strongly than measures that
are collected more closely in time.
If Time 2 - Time 1 = Time 3 - Time 2, then
(s12 = s23) > s13, where sij is the covariance
between the measures at Time i and Time j.
24
More generally, sphericity is met when there is
homogeneity of the variances for all possible
difference scores
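
The difference-score definition of sphericity can be checked directly from a covariance matrix, because the variance of the difference between measures i and j is s_ii + s_jj - 2 * s_ij. The two matrices below are made-up examples, and `difference_variances` is a hypothetical helper name; this is a sketch of the idea rather than a formal test of sphericity.

```python
from itertools import combinations


def difference_variances(cov):
    """Variance of every pairwise difference score: s_ii + s_jj - 2 * s_ij."""
    k = len(cov)
    return {(i + 1, j + 1): cov[i][i] + cov[j][j] - 2 * cov[i][j]
            for i, j in combinations(range(k), 2)}


# Compound symmetry: equal variances and equal covariances.
compound = [[1.0, 0.5, 0.5],
            [0.5, 1.0, 0.5],
            [0.5, 0.5, 1.0]]

# Covariances that shrink as the measures move farther apart in time.
decaying = [[1.0, 0.5, 0.25],
            [0.5, 1.0, 0.5],
            [0.25, 0.5, 1.0]]

compound_vars = difference_variances(compound)
decaying_vars = difference_variances(decaying)

print(compound_vars)
print(decaying_vars)
```

Under compound symmetry every difference score has the same variance, so sphericity holds. In the second matrix the Time 1 vs. Time 3 difference has a larger variance than the adjacent differences, so sphericity is violated.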
25
  • Violations of sphericity are especially
    problematic. They can inflate the Type I error
    rate. Remedies include:
  • Single df contrasts
  • Adjusted df tests (e.g., Geisser-Greenhouse)
  • Multivariate tests (no sphericity assumption)
  • Resampling procedures (no assumptions)
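
To make the adjusted-df remedy more concrete, the sketch below computes Box's epsilon, the quantity that the Geisser-Greenhouse adjustment estimates from the sample covariance matrix. The formula used is the standard one based on the diagonal, row, and grand means of the covariance matrix. The example matrices are invented and `box_epsilon` is a hypothetical function name; statistical packages report this value directly, so this is only meant to show what the number reflects.

```python
def box_epsilon(cov):
    """Box's epsilon for a k x k covariance matrix of repeated measures.

    Equals 1 when sphericity holds and falls toward 1 / (k - 1)
    as the violation becomes more severe.
    """
    k = len(cov)
    mean_diag = sum(cov[i][i] for i in range(k)) / k
    row_means = [sum(row) / k for row in cov]
    grand_mean = sum(row_means) / k
    sum_squares = sum(x * x for row in cov for x in row)

    numerator = (k * (mean_diag - grand_mean)) ** 2
    denominator = (k - 1) * (
        sum_squares
        - 2 * k * sum(m * m for m in row_means)
        + (k * grand_mean) ** 2
    )
    return numerator / denominator


# Made-up covariance matrices for three repeated measures.
compound = [[1.0, 0.5, 0.5],
            [0.5, 1.0, 0.5],
            [0.5, 0.5, 1.0]]
decaying = [[1.0, 0.5, 0.25],
            [0.5, 1.0, 0.5],
            [0.25, 0.5, 1.0]]

eps_compound = box_epsilon(compound)
eps_decaying = box_epsilon(decaying)

print(eps_compound, eps_decaying)
```

The adjusted test multiplies both the numerator and the denominator degrees of freedom of the within-subjects F test by epsilon, which makes the test more conservative when sphericity is violated.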

26
GLM trial1 trial2 trial3 trial4
  /WSFACTOR = time 4 Polynomial
  /METHOD = SSTYPE(3)
  /PRINT = DESCRIPTIVE ETASQ OPOWER PARAMETER RSSCP
  /PLOT = RESIDUALS
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time .
32
Repeated measures designs have one other hidden
statistical problem. When the measures are not
equally spaced, the unequal spacing needs to be
taken into account. Otherwise, attempts to
separate the linear and nonlinear effects may be
flawed.
33
Example: A study examines the long-term
effectiveness of a treatment. Three measures are
collected: a post-test immediately after
treatment, one 1 month after treatment, and one
6 months after treatment. The treatment is
expected to have an immediate impact that
declines linearly over time.
34
If the hypothesis is correct, then an analysis of
the temporal trends should show a linear effect
but not a nonlinear effect. This might seem to
indicate that the following contrasts (linear
combinations) should be tested
35
If this is the actual pattern of means relative
to time of data collection, the nonlinearity will
not be appropriately analyzed by contrasts that
assume equal spacing.
[Figure: means plotted at Time 1, Time 2, and Time 3]
36
[Figure: Time 1, Time 2, and Time 3]
39
GLM time1 time2 time3
  /WSFACTOR = time 3 Polynomial
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(OVERALL)
  /EMMEANS = TABLES(time)
  /PRINT = DESCRIPTIVE ETASQ OPOWER PARAMETER RSSCP TEST(MMATRIX)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time .
42
GLM time1 time2 time3
  /WSFACTOR = time 3 Polynomial(1,2,7)
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(OVERALL)
  /EMMEANS = TABLES(time)
  /PLOT = PROFILE( time )
  /PRINT = DESCRIPTIVE ETASQ OPOWER PARAMETER RSSCP TEST(MMATRIX)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time .
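
To see what the (1,2,7) metric changes, the sketch below builds linear and quadratic contrast weights for any set of time points by centering the time values and then removing from the squared term whatever the linear term already explains. It illustrates the idea of orthogonal polynomial contrasts and is not a reproduction of SPSS's internal computations, which also normalize the weights. The function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def polynomial_contrasts(times):
    """Linear and quadratic weights, orthogonal to each other and summing to zero."""
    t = [Fraction(x) for x in times]
    n = len(t)

    def center(values):
        m = sum(values) / n
        return [v - m for v in values]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    linear = center(t)
    squared = center([x * x for x in t])
    # Remove the part of the squared term that the linear contrast explains.
    slope = dot(squared, linear) / dot(linear, linear)
    quadratic = [q - slope * l for q, l in zip(squared, linear)]
    return linear, quadratic


def to_integers(weights):
    """Rescale fractional weights to the smallest equivalent integer weights."""
    common = reduce(lambda a, b: a * b // gcd(a, b),
                    (w.denominator for w in weights))
    ints = [int(w * common) for w in weights]
    divisor = reduce(gcd, ints)
    return [i // divisor for i in ints]


equal_lin, equal_quad = (to_integers(w) for w in polynomial_contrasts([1, 2, 3]))
uneq_lin, uneq_quad = (to_integers(w) for w in polynomial_contrasts([1, 2, 7]))

print(equal_lin, equal_quad)
print(uneq_lin, uneq_quad)
```

With equal spacing the weights are the familiar -1, 0, 1 and 1, -2, 1. With the 1, 2, 7 spacing they become -7, -4, 11 and 5, -6, 1, so the same three means are partitioned into linear and nonlinear parts differently.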
43
Why is this unchanged from the previous analysis?
48
GLM time1 time2 time3 BY group
  /WSFACTOR = time 3 Polynomial
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(OVERALL)
  /EMMEANS = TABLES(group)
  /EMMEANS = TABLES(time)
  /EMMEANS = TABLES(group*time)
  /PRINT = DESCRIPTIVE ETASQ OPOWER PARAMETER RSSCP TEST(MMATRIX)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time
  /DESIGN = group .
51
GLM time1 time2 time3 BY group
  /WSFACTOR = time 3 Polynomial(1,2,7)
  /METHOD = SSTYPE(3)
  /EMMEANS = TABLES(OVERALL)
  /EMMEANS = TABLES(group)
  /EMMEANS = TABLES(time)
  /EMMEANS = TABLES(group*time)
  /PRINT = DESCRIPTIVE ETASQ OPOWER PARAMETER RSSCP TEST(MMATRIX)
  /CRITERIA = ALPHA(.05)
  /WSDESIGN = time
  /DESIGN = group .
54
Repeated treatment designs present their own
problems. Exposing subjects to more than one
treatment necessarily means the treatments have
to be given in some order. That order becomes a
potential confounding variable. Any treatment
differences might reflect the order in which the
treatments were given.
55
The remedy is to counterbalance the order of
treatments.
Counterbalancing presents no problem here: all
possible orders are present and so all order
effects can be explored.
56
More complex designs are not so easily handled.
If there are k treatments to which all subjects
will be exposed, how many possible orders are
there?
57
As the number of treatments increases, it becomes
unwieldy to use all k! possible orders. To do so
would defeat one goal of using these designs:
sample size reduction. Each new order that is
tested requires a separate group of subjects.
Example: A researcher interested in physical
attractiveness asks subjects to rate the facial
attractiveness of 10 people (varying in their
facial characteristics) depicted in photographs.
Realizing that ratings for any one photograph
might be contaminated by having viewed other
photographs before it, a complete
counterbalancing is chosen. How many different
orders need to be presented, each to a different
group of subjects?
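
The count of possible orders is k factorial, which grows very quickly. The short sketch below computes it for the 10-photograph example and lists every order for a small case; `number_of_orders` is a hypothetical helper name.

```python
from itertools import permutations
from math import factorial


def number_of_orders(k):
    """Number of distinct orders in which k treatments can be presented."""
    return factorial(k)


# Complete counterbalancing for 10 photographs.
orders_for_ten = number_of_orders(10)

# For a small k, every order can simply be listed.
all_orders_abc = ["".join(p) for p in permutations("ABC")]

print(orders_for_ten)
print(all_orders_abc)
```

Ten photographs would require 3,628,800 separate groups of subjects under complete counterbalancing, which is why a subset of orders has to be used instead.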
58
For large numbers of treatments, two other
procedures are used to keep the benefits of small
sample sizes while also addressing the order
effects problem:
  • A random sample (< k!) from the set of k! orders
  • A Latin Square design

59
The main problem with the random subset approach
is that it would be possible to get orders at
random that might still be problematic.
Example: In a photograph rating study with 5
photos (A, B, C, D, and E) to be rated, it might
happen that 5 orders drawn at random would be
Order 1: A, C, B, E, D
Order 2: A, C, E, B, D
Order 3: A, B, E, C, D
Order 4: A, E, B, C, D
Order 5: A, B, C, E, D
In every one of these orders, A is rated first
and D is rated last, so those two photographs
remain confounded with ordinal position.
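
The problem with these five orders is easier to see by tallying which photograph lands in each ordinal position. The sketch below uses the five orders from the example; `position_counts` is a hypothetical helper name.

```python
from collections import Counter

# The five randomly drawn orders from the example.
orders = ["ACBED", "ACEBD", "ABECD", "AEBCD", "ABCED"]


def position_counts(orders):
    """For each ordinal position, count how often each photograph appears there."""
    k = len(orders[0])
    return [Counter(order[pos] for order in orders) for pos in range(k)]


counts = position_counts(orders)

for pos, tally in enumerate(counts, start=1):
    print(pos, dict(tally))
```

Photograph A fills the first position in all five orders and photograph D fills the last, so for those two photographs the treatment is still completely confounded with position.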
60
  • The Latin Square design is intended to avoid this
    kind of problem by imposing some constraints on
    the orders that are used while also keeping the
    number of orders to a reasonable size. A Latin
    Square design has the following characteristics:
  • There are k orders
  • Each element to be ordered appears once in every
    ordinal position
  • Orders are chosen randomly within these
    constraints

61
  • The Latin Square design is created using the
    following steps:
  1. Create the standard square
  2. Randomly order the columns of the standard square
  3. Randomly order the rows of the square created in
     Step 2
  4. Repeat Steps 2 and 3 on the square created in
     Step 3
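
The steps above can be sketched as follows. The sketch assumes that the standard square is the cyclic square (each row is the previous row shifted by one position), which is one common choice and may differ from the square shown on the original slides. Each row is treated as one presentation order and each column as one ordinal position. All function names are hypothetical.

```python
import random


def standard_square(items):
    """Cyclic square: row r is the item list shifted left by r positions."""
    k = len(items)
    return [[items[(row + col) % k] for col in range(k)] for row in range(k)]


def shuffle_columns(square, rng):
    """Randomly reorder the columns (ordinal positions)."""
    order = list(range(len(square)))
    rng.shuffle(order)
    return [[row[c] for c in order] for row in square]


def shuffle_rows(square, rng):
    """Randomly reorder the rows (presentation orders)."""
    rows = [row[:] for row in square]
    rng.shuffle(rows)
    return rows


def random_latin_square(items, rounds=2, seed=None):
    """Steps 1-4: build the standard square, then shuffle columns and rows."""
    rng = random.Random(seed)
    square = standard_square(items)
    for _ in range(rounds):
        square = shuffle_columns(square, rng)
        square = shuffle_rows(square, rng)
    return square


def is_latin(square):
    """Check that every item appears exactly once in each row and each column."""
    k = len(square)
    items = set(square[0])
    rows_ok = all(len(row) == k and set(row) == items for row in square)
    cols_ok = all(set(col) == items for col in zip(*square))
    return len(items) == k and rows_ok and cols_ok


square = random_latin_square("ABCDE", seed=1)
for row in square:
    print(" ".join(row))
```

Shuffling whole rows and whole columns never breaks the Latin Square property, so the result always has k orders with each photograph appearing once in every ordinal position.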

62
Example: 5 photographs (A, B, etc.) to be
judged. 1. Create the standard square.
63
Example: 5 photographs to be judged. 2. Randomly
order the columns of the standard square.
64
Example: 5 photographs to be judged. 3. Randomly
order the rows of the square from Step 2.
65
Example: 5 photographs to be judged. 4. Randomly
order the columns of the square from Step 3.
66
Example: 5 photographs to be judged. 5. Randomly
order the rows of the square from the previous
step.
67
Example: 5 photographs to be judged. After
numerous rounds of this randomization, the pairs
of photographs are less likely to be coupled.
68
  • Three advantages to Latin Square
  • Economical use of subjects within the goal of
    counterbalancing
  • A well-developed set of statistics for these
    designs
  • The first element can be analyzed as an
    uncontaminated between-subjects design

69
  • Two more important points . . .
  • Counterbalancing allows separation of treatment
    and order effects but it does not eliminate
    interpretational problems entirely. Once exposed
    to the first treatment--no matter what treatment
    it is--participants may respond differently to
    subsequent treatments in ways that would not be
    true for subjects who have not been sensitized.
  • The mere fact that counterbalancing has been used
    does not relieve the investigator of the
    responsibility of investigating its impact. Order
    effects need to be modeled statistically,
    including their interaction with treatment.

70
  • Within-subjects designs are often the appropriate
    approach in research because they mimic the
    natural occurrence of the phenomenon being
    studied.
  • Examples
  • Evaluation of job candidates, products, etc.
  • Judgments of relative attractiveness (beauty
    contests)
  • Impact of multiple therapies

71
  • Other research questions probably are not best
    addressed in these designs.
  • Examples
  • Trial characteristics that affect jury decisions
  • Factors that affect impression processes
  • Factors that affect causal attributions

72
The key is whether the phenomenon under study is
inherently comparative or relative, where the
characteristics of one stimulus are expected to
have an impact on the judgment of other stimuli.
Such phenomena are appropriately examined within
the context of a within-subjects design.
Phenomena that are naturally isolated or
absolute are best studied in between-subjects
designs.
73
  • The Big Questions . . .
  • What are the two basic ways that repeated
    measures are incorporated into experimental
    designs?
  • What are the advantages?
  • What problems await the unwary?