QUANTITATIVE AND QUALITATIVE METHODS - Presentation Transcript
1
QUANTITATIVE AND QUALITATIVE METHODS
  • ANOVA and Subsidiary Analyses

2
SCIENTIFIC METHOD
  • Scientific Theory -> Predictions -> Experiment
  • Reminder: there is a crucial distinction between
    a SCIENTIFIC hypothesis (e.g. about mental
    processes) and the extremely restricted null
    and alternative hypotheses of the Neyman-Pearson
    approach to statistical inference

3
SCIENTIFIC METHOD
  • Although you should design experiments to answer
    the scientific questions you want to answer, it
    is useful to bear in mind the analytic tools
    (statistical tests) that will be available to
    process your results
  • Why statistical tests? - because of noise in most
    psychological data

4
MEASUREMENT (Recap)
  • In any experiment, the experimenter will be
    "measuring" something or other - usually several
    things.
  • Need to consider (for each thing "measured") the
    type of scale of measurement
  • nominal
  • ordinal
  • interval
  • ratio

5
EXPERIMENTAL DESIGN
  • In Experimental Psychology (though not
    necessarily other branches of psychology) the
    most common design is one that can be analysed
    with ANOVA
  • Independent
  • (experimenter controlled/selected) vs
  • Dependent variable(s)
  • Vary as a result of the experimenter's
    manipulations
  • Traditionally more than one dependent variable ->
    more than one ANOVA
  • (though possibility of MANOVA - see later in
    course)

6
EXPERIMENTAL DESIGN
  • independent variables
  • treatment
  • classification
  • usually considered as nominal, though with some
    independent variables (e.g. age, dose) it is
    possible to do post hoc trend tests that treat
    levels of independent variables as
    ordinal/interval

7
FACTORS AND COUNTERBALANCING
  • independent variables regarded as factors with
    levels
  • what counts as a factor depends, to some extent,
    on the choice of the experimenter
  • Experimental factor vs counterbalancing

8
OTHER ISSUES
  • fixed vs random (independent) variables
  • crossed and nested designs
  • factorial designs (all fixed factors crossed)
  • systematic subset designs
  • only the Latin square is common, in which only
    some of the possible orders are used
  • Latin squares are often used for counterbalancing

9
WITHIN- and BETWEEN FACTORS
  • Terminology
  •   between                 within
  •   unrelated               related
  •                           matched samples
  •                           correlated
  •   independent groups      repeated measures
  • matched samples (e.g. when the experimental group
    is hard to construct, and differs from general
    population on possibly relevant measures e.g.
    age, IQ)

10
ADVANTAGES AND DISADVANTAGES OF BETWEEN-SUBJECT
DESIGNS
  • necessary if subjects can't do one condition
    after the other (e.g. unexpected memory test)
  • necessary if one S can't be in both conditions
    (e.g. male/female, good reader/bad reader)
  • more subjects needed (to get the same number of
    observations in all conditions, and because
    between-S variance can't be partitioned out)
  • must allocate Ss randomly to conditions (problem
    of Ss who are rejected on the basis of experimental
    performance - they may not be randomly distributed
    between conditions)

11
ADVANTAGES AND DISADVANTAGES OF WITHIN-SUBJECT
DESIGNS
  • "each subject acts as own control"
  • computationally equivalent to taking out
    differences between subject means (cf. related
    groups t-test - is the difference different from
    0)
  • in some types of experiment a between-subjects
    design can make test conditions too homogeneous
    (can get around with filler items)
  • additional assumptions in within-subjects ANOVA
    (homogeneous covariance for each subject)
  • order (sequence), practice, fatigue, boredom
    effects
  • carry over effects
  • need to make a decision about blocked vs mixed
    designs

12
ANALYSIS OF VARIANCE BEFORE THE ANOVA
  • Before carrying out statistical tests
  • Think about results in simple numerical terms
  • Produce simple plots
  • (histograms, line graphs, whatever is
    appropriate)
  • with some indication of range/variability (e.g.
    standard error)
  • THEN do the stats

13
ANOVA
  • USE Regimented Experimental Designs
  • What it tells you
  • are there differences between MEAN values for
    conditions in your experiment?
  • So WHY is it called ANOVA?
  • BECAUSE it partitions (analyses) overall variance
    into various components, which can then be used
    to construct tests (F-tests) of the (statistical)
    (null) hypotheses that some set of means are
    really all the same.
  • Details of how later.
  • First, a more intuitive approach.

14
INDEPENDENT TESTS
  • Given a completely factorial design (all fixed
    factors crossed) it is possible to construct
    INDEPENDENT tests for all main effects and
    interactions
  • Statisticians can provide a mathematical proof
    that the tests are independent
  • However, some grasp of why they are independent
    can be gained by simpler means (tabular,
    graphical).

15
WHY THE TESTS ARE INDEPENDENT
  • THINKING ABOUT NUMBERS IN TABLES
  • Experiment
  • Good readers and poor readers answering questions
    about a text
  • Two types of questions - those about material
    explicit in the text and those based on inferences

16
MAIN EFFECT - GOOD vs POOR READERS
  •                good    poor
  • literal         60      30      45
  • inference       60      30      45
  •                 60      30

17
MAIN EFFECT - LITERAL vs INFERENCE QUESTIONS
  •                good    poor
  • literal         60      60      60
  • inference       30      30      30
  •                 45      45

18
INTERACTION BUT NO MAIN EFFECTS
  •                good    poor
  • literal         30      60      45
  • inference       60      30      45
  •                 45      45

19
TWO MAIN EFFECTS BUT NO INTERACTION
  •                good    poor
  • literal         60      30      45
  • inference       50      20      35
  •                 55      25

20
WHY THE TESTS ARE INDEPENDENT
  • GRAPHICAL INTERPRETATION (FOR A SIMPLE TWO x TWO
    INTERACTION)
  • one main effect is (average) slope of lines
  • other main effect is separation of lines
  • interaction is angle between lines
  • which, as the sketch below illustrates, can in
    principle vary independently of one another.
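  • A minimal sketch in Python, using the illustrative
    numbers from slide 16, of how the two main effects and
    the interaction of a 2x2 design can be read off the cell
    means separately from one another:

    # 2 x 2 cell means (reader skill x question type) from slide 16
    cells = {("literal", "good"): 60, ("literal", "poor"): 30,
             ("inference", "good"): 60, ("inference", "poor"): 30}

    good = (cells[("literal", "good")] + cells[("inference", "good")]) / 2
    poor = (cells[("literal", "poor")] + cells[("inference", "poor")]) / 2
    literal = (cells[("literal", "good")] + cells[("literal", "poor")]) / 2
    inference = (cells[("inference", "good")] + cells[("inference", "poor")]) / 2

    reader_effect = good - poor              # separation of the lines
    question_effect = literal - inference    # (average) slope of the lines
    interaction = ((cells[("literal", "good")] - cells[("literal", "poor")])
                   - (cells[("inference", "good")] - cells[("inference", "poor")]))  # angle between the lines

    print(reader_effect, question_effect, interaction)   # 30.0 0.0 0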

21
GRAPH TWO MAIN EFFECTS AND AN INTERACTION
22
HOW ANOVA WORKS
  • Consider one of the main effects or interactions
    in your design, say the main effect of good vs
    poor readers.
  • Some of the variability (i.e. numerical
    differences) in the data you collect will arise
    because you are collecting data at different
    levels of this variable (good or poor).
  • In general, differences between data collected at
    different levels of such a factor arise from two
    sources
  • random noise in the data
  • any effect of the variable itself

23
HOW ANOVA WORKS
  • With a within-subjects factor (which literal vs
    inference MIGHT be in the above design), there is
    also the possibility that individual subjects are
    affected differently by the manipulation being
    made (e.g. some subjects may show a large effect
    and others a small effect).
  • To put this another way there may be an
    interaction between subjects and conditions.
  • In a between subjects design this issue does not
    arise.
  • Subjects are assumed to be assigned at random to
    conditions, and any difference in performance
    between conditions can only be attributed to a
    condition effect.

24
HOW ANOVA WORKS
  • So, to test whether the manipulation in question
    has any effect, we need to calculate, from the
    data, an estimate of the variability in the data
    from level to level of a factor and compare that
    with an estimate of the general noisiness of the
    data. Then we can write
  • F  =  estimate which includes noise + effect of factor
         ----------------------------------------------------
             estimate which includes noise alone
  • This ratio will tend to be appreciably bigger than one
    only if the factor itself has an effect.

25
HOW ANOVA WORKS
  • For a completely between-subjects design the
    noise is estimated by pooling estimates from
    within each cell, and every main effect or
    interaction is tested against that so-called error
    term.
  • For a within-subjects factor we have the
    complication that any estimate from the data of
    the variability from level to level of a factor
    includes a component attributable to the
    different performance of the subjects in the
    different conditions. We therefore need
  • F  =  noise + effect of factor + interaction of factor with subjects
         ------------------------------------------------------------------
             noise + interaction of factor with subjects

26
HOW ANOVA WORKS
  • Fortunately we can get the bottom estimate from
    the factor X subject cells, but we will have
    different "error" terms for different main
    effects and interactions.
  • In a completely within-subjects design there is a
    different error term for each effect and
    interaction tested.
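  • A minimal sketch in Python (made-up scores) of a one-way
    within-subjects design, where the condition effect is
    tested against its own condition-by-subject interaction:

    import numpy as np

    # rows = subjects, columns = conditions (one score per cell)
    data = np.array([[55, 60, 70],
                     [40, 48, 55],
                     [62, 66, 75],
                     [50, 52, 61]])
    n_subj, n_cond = data.shape

    grand_mean = data.mean()
    ss_cond = n_subj * ((data.mean(axis=0) - grand_mean) ** 2).sum()
    ss_subj = n_cond * ((data.mean(axis=1) - grand_mean) ** 2).sum()
    ss_total = ((data - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_cond - ss_subj   # condition x subject interaction

    df_cond, df_error = n_cond - 1, (n_cond - 1) * (n_subj - 1)
    print((ss_cond / df_cond) / (ss_error / df_error))   # the F ratio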

27
ASSUMPTIONS OF ANOVA
  • Population of scores in any group is normally
    distributed.
  • Variance is the same for each group
    (homoscedasticity).
  • Each observation sampled randomly and
    independently from a normal distribution.
  • (For within-subjects designs) - some assumption
    about the covariance between measures at
    different levels of the same factor.
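  • A minimal sketch (made-up data) of checking the first two
    assumptions with standard SciPy tests - Shapiro-Wilk for
    normality and Levene's test for homogeneity of variance:

    from scipy import stats

    group_a = [60, 58, 63, 61, 59, 64]
    group_b = [31, 29, 34, 26, 33, 28]

    print(stats.shapiro(group_a))           # normality within one group
    print(stats.levene(group_a, group_b))   # equality of variances across groups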

28
ASSUMPTIONS OF ANOVA
  • "Compound symmetry" (i.e. all covariances equal)
    is a sufficient, but not necessary, assumption.
    Necessary and sufficient condition is equality of
    variances of differences between all treatment
    pairs (Huynh and Feldt, 1970). For between-subjects
    designs these covariances are assumed to be zero
    (follows from the homogeneity of variance
    assumption), but they cannot be for within-subjects designs
  • e.g. a fast subject in an RT experiment is
    likely to be fast in all conditions.

29
ASSUMPTIONS OF ANOVA
  • If these assumptions are met, then the ratios
    described above will have the distribution known
    as F, and we can calculate absolute probabilities
    (or use tables).
  • ANOVA is "robust" with respect to the first two
    assumptions (but not the third, independence).
  • E.g. ratios of 4 or 5 between variances are not
    usually a problem.
  • skewed distributions are usually ok, particularly
    if all groups show similar skew (e.g. RT data).
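  • A minimal sketch of calculating such a probability from
    SciPy's F distribution (made-up values for F and the
    degrees of freedom):

    from scipy import stats

    F_obs, df_num, df_den = 5.24, 1, 18
    print(stats.f.sf(F_obs, df_num, df_den))   # P(F >= F_obs) under the null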

30
ASSUMPTIONS OF ANOVA
  • As a rule of thumb, the simpler the design the
    more robust it is with respect to violations of
    the assumptions
  • (e.g. robustness with respect to assumption 2,
    homogeneity of variance, is weaker for unequal-N
    designs).
  • Ways of testing robustness
  • analytic (not usually straightforward)
  • randomisation/Monte Carlo
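  • A minimal Monte Carlo sketch (made-up simulation
    settings): simulate the null hypothesis with a variance
    ratio of 4 between groups and check how far the observed
    Type I error rate drifts from the nominal alpha of .05:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    alpha, n_sims, n_per_group = 0.05, 5000, 20
    false_rejections = 0
    for _ in range(n_sims):
        a = rng.normal(0, 1.0, n_per_group)   # same population means, but
        b = rng.normal(0, 2.0, n_per_group)   # a variance ratio of 4
        if stats.f_oneway(a, b).pvalue < alpha:
            false_rejections += 1
    print(false_rejections / n_sims)          # near .05 => robust in this case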

31
ANOVA - SUBSIDIARY ANALYSES
  • Statistical significance vs importance
    (extra-statistical) of differences.
  • Which of the differences between means are
    significant?
  • In addition, you need to think not only about
    errors on a single test.
  • You may be making several comparisons
  • Reminder
  • Type I - reject a true null hypothesis
  • Type II - accept a false null hypothesis (fail to
    reject)

32
ANOVA - SUBSIDIARY ANALYSES
  • In an ANOVA with lots of different Fs you might
    need to think about
  • probability of error on each test
  • number of errors per experiment
  • e.g. 1 Type I error per 20 comparisons
  • probability of at least one error.
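  • A minimal worked example of these quantities for 20
    comparisons, each tested at alpha = .05 and assumed
    independent:

    alpha, k = 0.05, 20
    errors_per_experiment = alpha * k            # expected number of Type I errors
    at_least_one_error = 1 - (1 - alpha) ** k    # probability of at least one
    print(errors_per_experiment, at_least_one_error)   # 1.0, about 0.64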

33
A PRIORI vs POST HOC (A POSTERIORI) COMPARISONS
  • The distinction is not really about whether the
    comparison is chosen before or after the data are
    collected, but about whether the differences can be
    predicted from the (scientific) hypothesis under
    investigation.
  • A priori comparisons are more powerful and are
    preferred for this reason (and others).
  • But a posteriori tests are necessary when the
    results of a study are clearly interesting, but
    different from those predicted by any theory that
    has been considered.
  • An a priori test can be significant even if the
    overall F of which it is part is not.

34
A PRIORI TESTS AND OVERALL F-RATIOS
  • There has been some disagreement in the
    literature about whether a priori comparisons
    should be made when the overall F is not
    significant.
  • A sensible view is that if the (scientific)
    hypothesis makes a specific prediction, then the
    comparison is legitimate.

35
A PRIORI TESTS
  • Comparisons between pairs of groups are sometimes
    made using t-tests.
  • For a two-group ANOVA, t-squared = F
  • With three or more groups, a different method is
    usually preferred in which the same (overall)
    estimate of the background noise in the data
    (error term) is used.
  • With a t-test, you would make this estimate from
    just the data in the two groups being compared
    (and reduced denominator df).
  • Numerically, the procedure is very simple and
    produces a new F for each comparison.
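  • A minimal sketch in Python (made-up data, three groups)
    of such a comparison: the error term is pooled over all
    groups, and for the two-group case t-squared equals F:

    import numpy as np
    from scipy import stats

    groups = [np.array([60, 58, 63, 61]),
              np.array([50, 47, 52, 55]),
              np.array([31, 29, 34, 26])]
    n = len(groups[0])

    ms_error = np.mean([g.var(ddof=1) for g in groups])   # pooled (equal n)
    df_error = sum(len(g) - 1 for g in groups)

    diff = groups[0].mean() - groups[2].mean()             # compare groups 1 and 3
    F_comp = diff ** 2 / (ms_error * (1 / n + 1 / n))      # 1-df contrast F
    print(F_comp, stats.f.sf(F_comp, 1, df_error))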

36
A PRIORI TESTS - Cont.
  • Independence of contrasts
  • remember that in the overall ANOVA all main
    effects and interactions are independent - if the
    design is completely factorial.
  • Independent tests are useful because they are
    straightforward to interpret, but each main
    effect or interaction can only be broken down
    into (numerator) df independent contrasts.
  • e.g.
  • three-level main effect -> two tests
  • 2x2 -> 1 test, i.e. the F itself
  • 3x2 -> 2, etc.
  • cf. the number of possible pairwise
    comparisons.

37
A PRIORI TESTS - Cont.
  • Criterion - assign coefficients that
  • sum to zero for any one comparison
  • have products that sum to zero for any pair of
    comparisons.
  • Coefficients can be found in tables, or see a text
    such as Howell.
  • Linear (or other) trends can be investigated
    using planned comparison technique
  • E.g. with 2 df one can make orthogonal tests for
    linear and quadratic trends - more for more df
  • Note: trends require treating the independent
    variable as being measured on an interval/ratio
    scale.
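  • A minimal sketch of the criterion for a three-level
    factor, using the standard linear and quadratic trend
    coefficients:

    import numpy as np

    linear = np.array([-1, 0, 1])
    quadratic = np.array([1, -2, 1])

    print(linear.sum(), quadratic.sum())   # both 0: each is a valid contrast
    print((linear * quadratic).sum())      # 0: the pair is orthogonal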

38
A PRIORI TESTS - Cont.
  • May want to make alpha smaller, e.g. use
         alpha for the effect/interaction
         ---------------------------------
             number of comparisons
  • Nonorthogonal comparisons - may be suggested by
    (scientific) hypothesis for a priori comparisons.
  • Generally acceptable, if the number of such
    comparisons is small
  • best if less than or equal to total number of
    orthogonal comparisons for the effect/interaction
  • beware of dangers of overinterpretation!
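  • A minimal worked example of that adjustment
    (hypothetical numbers):

    alpha_for_effect, n_comparisons = 0.05, 4
    print(alpha_for_effect / n_comparisons)   # test each comparison at 0.0125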

39
A PRIORI TESTS - Cont.
  • Can modify alpha via the Dunn/Bonferroni
    procedure - setting an error rate per experiment.
  • A very common practice, which leads to
    nonorthogonal comparisons, is testing for simple
    main effects ("carrying out subsidiary
    analyses").

40
A POSTERIORI TESTS
  • Used when data do not fit any (scientific)
    hypothesis or only a hypothesis suggested by the
    data themselves (hence post hoc)
  • Fisher's Least Significant Difference - requires
    a significant overall F before multiple t tests
    are carried out. This restricts the familywise
    error rate (probability of at least one type I
    error) to alpha, but only if the complete null
    hypothesis is true (!! i.e. no differences
    between any means !!)

41
A POSTERIORI TESTS - Cont.
  • Newman-Keuls - arrange levels in order of value
    of dependent variable and calculate different
    critical values for different step lengths along
    the series (by definition, adjacent means have
    step 2).
  • Again, holds experimentwise error rate to alpha
    (but only if complete null hypothesis is true)
  • Duncan - similar to Newman-Keuls, but sets error
    rate per degree of freedom to alpha (too liberal).
  • Tukey HSD (Honestly Significant Difference -
    Tukey's Test) - as Newman-Keuls, but uses the
    biggest critical value. Holds experimentwise
    error rate to alpha (for all possible null
    hypotheses: all means equal, or only some equal)
  • Ryan REGWQ - holds experimentwise error rate to
    alpha for all null hypotheses (as Tukey), but
    varies the critical value with the number of
    means in the set.

42
A POSTERIORI TESTS - Cont.
  • The last four tests are variants based on a
    statistic called the Studentised Range
  •        largest treatment mean - smallest treatment mean
    q  =   -------------------------------------------------
                 sqrt(mean square error / n per group)
  • (for a comparison of two means, q = t * sqrt(2))
  • Scheffé - holds experimentwise error rate to
    alpha for all linear contrasts (i.e. all
    contrasts for which the sum of the coefficients
    is 0), not just pairwise comparisons.
  • Very conservative; Scheffé himself suggests
    setting alpha = 0.1.
  • Not recommended for pairwise comparisons only.
  • Dunnett - for comparing all treatments against a
    control (more powerful than the Dunn/Bonferroni
    procedure for this case)
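  • A minimal sketch (made-up data), assuming a SciPy version
    recent enough to provide scipy.stats.studentized_range:
    compute q for the most extreme pair of means and refer it
    to the Studentised Range distribution (the Tukey HSD
    criterion):

    import numpy as np
    from scipy import stats

    groups = [np.array([60, 58, 63, 61]),
              np.array([50, 47, 52, 55]),
              np.array([31, 29, 34, 26])]
    k, n = len(groups), len(groups[0])

    ms_error = np.mean([g.var(ddof=1) for g in groups])
    df_error = sum(len(g) - 1 for g in groups)
    means = sorted(g.mean() for g in groups)

    q = (means[-1] - means[0]) / np.sqrt(ms_error / n)
    print(q, stats.studentized_range.sf(q, k, df_error))   # q and its p-value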

43
MULTIPLE COMPARISONS WITH REPEATED MEASURES
  • If you look at introductory texts you will notice
    that the procedures described are usually for
    between-subject comparisons.
  • The issue of multiple comparisons with repeated
    measures is a complicated one and I can do no
    better than refer you to Howell's discussion at
  • http://www.uvm.edu/dhowell/StatPages/More_Stuff/RepMeasMultComp/RepMeasMultComp.html

44
TESTS OF CHOICE
  • In much of the cognitive psychology literature
    (at least) the tests of choice are
  • Dunn/Bonferroni - often called Bonferroni t - for
    a priori
  • Newman-Keuls used to be recommended for post hoc
    comparisons. Howell now recommends Ryan, which
    can be readily calculated in SAS using the REGWQ
    procedure (not so easy with SPSS).