Analysis of Variance - PowerPoint PPT Presentation

About This Presentation
Title:

Analysis of Variance

Description:

If group differences are found, we can say that there is a ... RESIDUALS HIST(ZRESID) . DATA VitC; INPUT group diff; DATALINES; 1 12. 1 -2. 1 9. 1 3 ... – PowerPoint PPT presentation

Slides: 24
Provided by: martyg

Transcript and Presenter's Notes

Title: Analysis of Variance


1
Analysis of Variance
2
One-Factor (Oneway) ANOVA
  • ANOVA represents a set of techniques designed to
    investigate relationships by testing group
    differences. If group differences are found, we
    can say that there is a relationship between the
    IV and the DV.
  • Assume you have some exposure to ANOVA previously
  • Path model
  • Surprise! What does the path model look like?

3
Doing Normal Statistics
One way ANOVA (Dummy coding, IV is nm)
x1
Y (m)
x2
x3
4
(No Transcript)
5
Anova with 2 Groups
  • Anova can be used to test group differences among
    2 or more groups.
  • The Independent t test and the 2-group Anova are
    equivalent procedures: F = t^2.
  • The 7-step procedure is the same except for use
    of a different test statistic (F) and
    corresponding different rejection rule and
    critical value.
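
A minimal Python sketch of the F = t^2 equivalence, using the two-group digits data from the SPSS run later in this deck. Group 1 is taken to be Alc and group 2 Placebo, following the indicator coding on the Analyses slide. The helper names are illustrative, and the t statistic is the pooled-variance (equal variances assumed) version:

```python
from math import sqrt

# Digits scores from the two-group example (group 1 = Alc, group 2 = Placebo).
alc = [5, 6, 7, 4, 4, 3, 5, 7, 7, 6]
placebo = [7, 7, 8, 6, 7, 7, 8, 5, 6, 7]


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq(xs):
    """Sum of squared deviations from the group mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_t(a, b):
    """Independent-samples t with a pooled variance estimate."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq(a) + sum_sq(b)) / df
    se = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se


def oneway_f(*groups):
    """One-way ANOVA F = MS between / MS within."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum_sq(g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


t = pooled_t(alc, placebo)
f = oneway_f(alc, placebo)
print(f"t = {t:.4f}, t^2 = {t * t:.4f}, F = {f:.4f}")
```

Squaring the pooled t reproduces the one-way F; a separate-variances (Welch) t would not, in general.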

6
Analyses
  • Runs for independent t, 2-group anova, and a
    surprise.
  • Compare outputs
  • Path model for 2-group situation
  • Anova model
  • X = μ + A(i) + e
  • Another familiar model
  • Y = B0 + B1X1 + e
  • (Compare models and terms. Substitute values for
    x1.)
  • Overview conclusion
  • Given indicator coding for grpi where 0 = Alc and
    1 = Placebo, how do you interpret the intercept
    and slope of the regression of digits on grpi?
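
One way to explore the intercept-and-slope question is to fit the simple regression by hand. The sketch below is only an illustration (the function name `simple_ols` is made up here); it uses the digits data from the SPSS run that follows, with grpi coded 0 for Alc and 1 for Placebo:

```python
# Digits scores; grpi = 0 for Alc (group 1) and 1 for Placebo (group 2).
alc = [5, 6, 7, 4, 4, 3, 5, 7, 7, 6]
placebo = [7, 7, 8, 6, 7, 7, 8, 5, 6, 7]

x = [0] * len(alc) + [1] * len(placebo)  # indicator variable grpi
y = alc + placebo                        # dependent variable digits


def mean(xs):
    return sum(xs) / len(xs)


def simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = simple_ols(x, y)
print(f"intercept B0 = {b0:.2f} (mean of the group coded 0)")
print(f"slope     B1 = {b1:.2f} (mean of group coded 1 minus mean of group coded 0)")
```

With 0/1 coding, the intercept equals the mean of the reference group (Alc) and the slope equals the difference between the two group means.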

7
Data list free / Grp digits.
Begin data
1 5 1 6 1 7 1 4 1 4 1 3 1 5 1 7 1 7 1 6
2 7 2 7 2 8 2 6 2 7 2 7 2 8 2 5 2 6 2 7
End data.

T-TEST GROUPS=Grp(1 2)
  /VARIABLES=digits
  /CRITERIA=CI(.90).

EXAMINE VARIABLES=digits BY Grp
  /PLOT=BOXPLOT /STATISTICS=NONE /NOTOTAL.

* Now run the 2 sample problem using oneway anova.
UNIANOVA digits BY Grp
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PLOT=PROFILE(Grp)
  /CRITERIA=ALPHA(.05)
  /DESIGN=Grp.

* Recode Grp as indicator variable.
Recode grp (1=0) (2=1) into grpi.
/* Warning: recoding a variable into the same variable will destroy original. */
list /cases=to 12.
Execute.

* Scatterplot.
GRAPH /SCATTERPLOT(BIVAR)=grpi WITH digits /MISSING=LISTWISE.
GRAPH /BAR(SIMPLE)=MEAN(digits) BY grpi.

REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT digits
  /METHOD=ENTER Grpi
  /SCATTERPLOT=(*SDRESID ,*ZPRED).
8
Regression and ANOVA
  • Anova is really a special case of Multiple
    Regression (The underlying matrix-algebra based
    computational formulas are identical.)
  • Where the grouping variables are identified using
    special dummy variable coding schemes
  • Two groups, as we saw, uses one variable coded 0
    or 1 depending on the group
  • Three groups require 2 variables

9
Dummy Variables (and the Dummies who use them)
  • White is the reference category
  • By use of indicator coding (shown) and effect
    coding (not shown), the levels of one or more IVs
    can be included as predictors in a regression
    model
  • By getting R^2 values for full and reduced models,
    one can get all the needed sums of squares to get
    the terms in an ANOVA Source table
  • In terms of underlying models, regression and
    ANOVA are the same
  • All are subsumed under THE GENERAL LINEAR MODEL
  • Which is actually a big multivariate multiple
    regression model
  • ANOVA techniques and formulas came about because
    people like Fisher were thinking in terms of
    agricultural treatment to plots and using
    regression models was not convenient, so special
    ANOVA methods were developed
  • There are some who think that the regression
    approach is better and have made it easier!!
  • One is guess who?
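
The full-versus-reduced-model idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the deck's own computation: it uses the vitamin C data from the one-way example later in the deck, takes Placebo as the reference category, names the indicators LoVC and HiVC as in the SPSS syntax, and solves the normal equations with a small hand-written Gauss-Jordan routine (`solve` and `ols` are made-up helper names). The reduced model is intercept-only, so its R^2 is 0.

```python
# Change in days of cold symptoms, by group (1 = Placebo, 2 = LoVC, 3 = HiVC).
placebo = [12, -2, 9, 3, 3, 0, 3, 2, 4, 1]
lovc = [-2, -3, 3, -2, 0, -4, -3, 5, -9, -6]
hivc = [6, -7, -6, -6, -6, -4, -2, -6, 6, 5]

y = placebo + lovc + hivc
# Design matrix columns: intercept, LoVC indicator, HiVC indicator.
X = ([[1, 0, 0]] * len(placebo) + [[1, 1, 0]] * len(lovc)
     + [[1, 0, 1]] * len(hivc))


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


b = ols(X, y)
y_bar = sum(y) / len(y)
fitted = [sum(bi * xi for bi, xi in zip(b, row)) for row in X]
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r2_full = 1 - ss_resid / ss_total

# Full model R^2 minus reduced (intercept-only) R^2 of 0, times SS total,
# gives the between-groups sum of squares of the ANOVA source table.
ss_model = r2_full * ss_total
df_model, df_resid = len(b) - 1, len(y) - len(b)
f_ratio = (r2_full / df_model) / ((1 - r2_full) / df_resid)

print("B0, B_LoVC, B_HiVC:", [round(v, 3) for v in b])
print(f"R^2 = {r2_full:.4f}, SS model = {ss_model:.2f}, F = {f_ratio:.3f}")
```

The intercept is the Placebo mean, and each slope is that group's mean minus the Placebo mean, which is the sense in which regression and ANOVA share one underlying model.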

10
(No Transcript)
11
  • This Week's Citation Classic
  • http://www.garfield.library.upenn.edu/classics1982/A1982PB23900001.pdf
  • The current status
  • http://books.google.com/books?id=fuq94a8C0ioC&printsec=frontcover&dq=multiple+regression+cohen&client=firefox-a#PPP1,M1
  • Keeping in mind that Anova and multiple
    regression are part of the same analytic model,
    we will revert to the traditional (and familiar)
    Anova methods
  • We'll see the connection clearly again when we
    get to ANCOVA.

12
More Anova
  • One factor with 3 or more levels with post hoc
    tests
  • Two- or more factors, including interactions
  • Repeated measures Anova
  • Mixed Anova: between-subjects and
    within-subjects factors
  • Analysis of Covariance
  • We've already done this (almost).
  • Add a covariate to Anova
  • Actually just adding a continuous regression
    predictor

13
One-Way Anova
  • Green, L25: Example concerning change from
    baseline in number of days of cold symptoms for
    three vitamin C groups: 1 = Placebo, 2 = LoVC,
    3 = Hi dose VC.
  • Treatment structure box
  • Path diagram
  • F test of omnibus H0 of no group differences
  • Post hoc tests to find out which group has the
    most reduction in cold symptoms
  • Regression approach
  • SAS output for comparison
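
As a rough cross-check on the SPSS and SAS runs that follow, the omnibus F test can be sketched directly from the sums of squares. The variable names below are illustrative; the data are the vitamin C scores from the next slide, and eta squared is computed as SS between / SS total.

```python
# Change from baseline in days of cold symptoms, by vitamin C group.
groups = {
    "Placebo": [12, -2, 9, 3, 3, 0, 3, 2, 4, 1],
    "LoVC": [-2, -3, 3, -2, 0, -4, -3, 5, -9, -6],
    "HiVC": [6, -7, -6, -6, -6, -4, -2, -6, 6, 5],
}


def mean(xs):
    return sum(xs) / len(xs)


scores = [x for g in groups.values() for x in g]
grand = mean(scores)

# Between-groups: group size times squared distance of each mean from grand mean.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
# Within-groups: squared deviations of scores from their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

df_between = len(groups) - 1
df_within = len(scores) - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)
eta_sq = ss_between / (ss_between + ss_within)

print(f"{'Source':<10}{'SS':>10}{'df':>5}{'MS':>10}{'F':>8}")
print(f"{'Between':<10}{ss_between:>10.2f}{df_between:>5}"
      f"{ss_between / df_between:>10.2f}{f_ratio:>8.3f}")
print(f"{'Within':<10}{ss_within:>10.2f}{df_within:>5}"
      f"{ss_within / df_within:>10.2f}")
print(f"eta squared = {eta_sq:.4f}")
for name, g in groups.items():
    print(f"mean {name}: {mean(g):.2f}")
```

The omnibus F only says that the group means differ somewhere; the post hoc tests requested in the SPSS and SAS syntax are what identify which groups differ.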

14
Data list free / group diff.
Begin data
1 12 1 -2 1 9 1 3 1 3 1 0 1 3 1 2 1 4 1 1
2 -2 2 -3 2 3 2 -2 2 0 2 -4 2 -3 2 5 2 -9 2 -6
3 6 3 -7 3 -6 3 -6 3 -6 3 -4 3 -2 3 -6 3 6 3 5
End data.
execute.

* Graph the data with boxplot.
EXAMINE VARIABLES=diff BY group
  /PLOT=BOXPLOT /STATISTICS=NONE /NOTOTAL.

UNIANOVA diff BY group
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /POSTHOC=group(TUKEY LSD QREGW)
  /EMMEANS=TABLES(group)
  /PRINT=DESCRIPTIVE ETASQ HOMOGENEITY
  /CRITERIA=ALPHA(.05)
  /DESIGN=group.

* Run as regression using indicator vars.
Compute LoVC = 0.
If (group = 2) LoVC = 1.
Compute HiVC = 0.
If (group = 3) HiVC = 1.
Execute.
list.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT diff
  /METHOD=ENTER LoVC HiVC
  /SCATTERPLOT=(*SDRESID ,*ZPRED)
  /RESIDUALS HIST(ZRESID).
15
DATA VitC;
  INPUT group diff @@;  /* @@ lets several observations share one data line */
  DATALINES;
1 12 1 -2 1 9 1 3 1 3 1 0 1 3 1 2 1 4 1 1
2 -2 2 -3 2 3 2 -2 2 0 2 -4 2 -3 2 5 2 -9 2 -6
3 6 3 -7 3 -6 3 -6 3 -6 3 -4 3 -2 3 -6 3 6 3 5
;
Proc print data=vitc (obs=5); Run;

Proc boxplot data=vitc;
  plot diff*group;
run;

Proc Univariate data=vitc normal plot;  /* May be more than you want. */
  var diff;
  class group;
run;

Proc GLM data=vitc;  /* Green L25 vit c data. */
  class group;
  model diff = group / ss3;
  means group / hovtest lsd tukey regwq;
run; quit;
16
2-Factor Anova
  • Two IVs giving two main effects and interaction.
  • Interaction is usually the most interesting
    result
  • Problems/complexity come in testing simple
    effects and groups to explain the interaction.
  • Always use UNIANOVA in SPSS and GLM in SAS
    because these routines handle unbalanced designs
    correctly.
  • Brief coverage

17
Green, L26: Gender, Note Taking, Freshman GPA
  • Note taking instructions are given daily at start
    of spring semester with random assignment to
    Method 1, Method 2, and Control groups. DV is
    change from fall to spring GPA
  • Treatment structure box and path model
  • Research Questions
  • Main effects
  • Are there gender differences in freshman GPA
    improvement, overall (regardless of treatment)?
  • Are there differences in Method, regardless of
    gender?
  • Interaction? Are the methods different in
    effectiveness for males vs. females?
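
The three research questions map onto three sums of squares. The sketch below is only an illustration: it uses the GPA-improvement data from the SPSS and SAS slides later in the deck, keyed by the numeric (gender, method) codes as given there, and it relies on the design being balanced (10 scores per cell), so the simple marginal-mean formulas apply. Unbalanced data would need the Type III approach that UNIANOVA and GLM provide.

```python
# GPA improvement by (gender code, method code); 10 scores per cell.
cells = {
    (1, 1): [0.25, 0.20, 0.30, 0.30, 0.50, 0.40, 0.80, 0.50, 0.10, 0.00],
    (1, 2): [0.30, 0.20, 0.25, 0.00, 0.60, 0.50, 0.20, 0.10, 0.50, 0.40],
    (1, 3): [0.10, 0.15, 0.30, 0.20, 0.10, 0.20, 0.30, 0.40, 0.00, -0.10],
    (2, 1): [0.10, 0.00, 0.00, 0.40, 0.50, 0.20, 0.00, 0.00, 0.30, 0.20],
    (2, 2): [1.00, 0.50, 0.80, 0.60, 0.60, 0.50, 0.80, 0.60, 0.40, 0.60],
    (2, 3): [-0.10, 0.00, 0.10, 0.40, 0.25, 0.00, 0.10, 0.10, 0.20, 0.00],
}


def mean(xs):
    return sum(xs) / len(xs)


genders = sorted({g for g, _ in cells})
methods = sorted({m for _, m in cells})
n = len(cells[(1, 1)])  # common cell size in this balanced design

scores = [x for v in cells.values() for x in v]
grand = mean(scores)
cell_mean = {k: mean(v) for k, v in cells.items()}
gender_mean = {g: mean([x for (gg, _), v in cells.items() if gg == g for x in v])
               for g in genders}
method_mean = {m: mean([x for (_, mm), v in cells.items() if mm == m for x in v])
               for m in methods}

# Main effects: marginal means around the grand mean.
ss_gender = n * len(methods) * sum((gender_mean[g] - grand) ** 2 for g in genders)
ss_method = n * len(genders) * sum((method_mean[m] - grand) ** 2 for m in methods)
# Interaction: what is left in each cell mean after removing both main effects.
ss_inter = n * sum(
    (cell_mean[(g, m)] - gender_mean[g] - method_mean[m] + grand) ** 2
    for g in genders for m in methods)
ss_within = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
ss_total = sum((x - grand) ** 2 for x in scores)

df_within = len(scores) - len(cells)
ms_within = ss_within / df_within
table = [("gender", ss_gender, len(genders) - 1),
         ("method", ss_method, len(methods) - 1),
         ("gender*method", ss_inter, (len(genders) - 1) * (len(methods) - 1))]
for name, ss, df in table:
    print(f"{name:<14} SS = {ss:.4f}  df = {df}  F = {(ss / df) / ms_within:.3f}")
print(f"{'within':<14} SS = {ss_within:.4f}  df = {df_within}")
```

The printed F ratios can be compared against the UNIANOVA and GLM source tables.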

18
Doing Normal Statistics
Two-way ANOVA (Dummy coding, all IVs nm)
x1
Y (m)
x2
?
x1 x2
19
Results
  • J:\PSYCH\MARTY\4123\GreenSalkind5Dat\Lesson
    26\Lesson 26 Data File 2.sav
  • J:\PSYCH\MARTY\4123\AV2FGL26D2
  • Data arrangement
  • A graphic would be nice. 2x3 box for means and
    marginal means
  • Comparison of outputs from SAS and SPSS
  • Limited coverage
  • Interaction testing deserves more time than we
    have
  • Methods of probing interaction by testing
    appropriate cell differences are illustrated in
    both analyses.
  • The general issue is simple effect tests and
    follow-ups
  • These are covered in QM2
  • SAS and SPSS programs follow

20
Data list free / gender method gpaimpr.
Begin data
1 1 .25 2 2 1.00 1 3 .10 1 1 .20 2 2 .50 1 3 .15
1 1 .30 2 2 .80 1 3 .30 1 1 .30 2 2 .60 1 3 .20
1 1 .50 2 2 .60 1 3 .10 1 1 .40 2 2 .50 1 3 .20
1 1 .80 2 2 .80 1 3 .30 1 1 .50 2 2 .60 1 3 .40
1 1 .10 2 2 .40 1 3 .00 1 1 .00 2 2 .60 1 3 -.10
2 1 .10 1 2 .30 2 3 -.10 2 1 .00 1 2 .20 2 3 .00
2 1 .00 1 2 .25 2 3 .10 2 1 .40 1 2 .00 2 3 .40
2 1 .50 1 2 .60 2 3 .25 2 1 .20 1 2 .50 2 3 .00
2 1 .00 1 2 .20 2 3 .10 2 1 .00 1 2 .10 2 3 .10
2 1 .30 1 2 .50 2 3 .20 2 1 .20 1 2 .40 2 3 .00
End data.

EXAMINE VARIABLES=gpaimpr BY method BY gender
  /PLOT=BOXPLOT /STATISTICS=NONE /NOTOTAL.

GRAPH /LINE(MULTIPLE)=MEAN(gpaimpr) BY method BY gender.

UNIANOVA gpaimpr BY gender method
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /POSTHOC=method(TUKEY)
  /PLOT=PROFILE(gender*method)
  /EMMEANS=TABLES(gender)
  /EMMEANS=TABLES(method)
  /EMMEANS=TABLES(gender*method) COMPARE(gender) ADJ(SIDAK)
  /EMMEANS=TABLES(gender*method) COMPARE(method) ADJ(SIDAK)
  /PRINT=ETASQ OPOWER HOMOGENEITY
  /CRITERIA=ALPHA(.05)
  /DESIGN=gender method gender*method.
21
DATA L262;
  INPUT gender method gpaimpr @@;  /* @@ lets several observations share one data line */
  condition = catx('-', gender, method);  /* Creates condition variable coded for all groups. */
  DATALINES;
1 1 .25 2 2 1.00 1 3 .10 1 1 .20 2 2 .50 1 3 .15
1 1 .30 2 2 .80 1 3 .30 1 1 .30 2 2 .60 1 3 .20
1 1 .50 2 2 .60 1 3 .10 1 1 .40 2 2 .50 1 3 .20
1 1 .80 2 2 .80 1 3 .30 1 1 .50 2 2 .60 1 3 .40
1 1 .10 2 2 .40 1 3 .00 1 1 .00 2 2 .60 1 3 -.10
2 1 .10 1 2 .30 2 3 -.10 2 1 .00 1 2 .20 2 3 .00
2 1 .00 1 2 .25 2 3 .10 2 1 .40 1 2 .00 2 3 .40
2 1 .50 1 2 .60 2 3 .25 2 1 .20 1 2 .50 2 3 .00
2 1 .00 1 2 .20 2 3 .10 2 1 .00 1 2 .10 2 3 .10
2 1 .30 1 2 .50 2 3 .20 2 1 .20 1 2 .40 2 3 .00
;
Proc print data=l262 (obs=5); Run;

Proc boxplot data=L262;  /* What's going on? */
  plot gpaimpr*condition;
run;

Proc sort; by condition; run;
Proc print data=l262 (obs=30); Run;

Proc boxplot data=L262;
  plot gpaimpr*condition;
run;

Proc GLM data=L262;  /* Green L26 D2 GPA Improvement data. */
  Title "Green L26 D2 GPA Improvement";
  class gender method;
  /* gender|method requests both main effects plus the interaction */
  model gpaimpr = gender|method / ss3;
  lsmeans gender|method / PDIFF ADJUST=SIDAK;
  /* Must use GLM and LSMeans for unbalanced data!! */
  /* Watch out. This will give all possible pairwise comparisons. What's the problem? */
run; quit;

/* Plot Interactions. Based on CS, p. 216-217. */
/* First, get the needed means out, then plot them. */
Proc Means Data=L262 Nway noprint;  /* NWay restricts output to just the cell means. */
  Class gender method;
  Var gpaimpr;
  Output out=Means mean=M_GPAIMP;
run;

SYMBOL1 V=SQUARE COLOR=BLACK I=JOIN;
SYMBOL2 V=CIRCLE COLOR=BLACK I=JOIN;
PROC GPLOT DATA=Means;
  TITLE "INTERACTION PLOT";
  PLOT M_GpaIMp*METHOD=GENDER;
  Plot M_GpaImp*Gender=Method;
RUN;
22
Analysis of Covariance
  • Extension of the Anova model by adding a control
    variable to the model: a method of statistical
    control
  • Conceptually similar to partial correlation in
    the sense that Anova is conducted on the IVs and
    DV after removing the part of the relationship
    predicted by the control variable (Think about
    residuals)
  • Green, L27
  • Revisit vitamin C example except now have
    PREDAYS, a measure of base rate of cold
    symptoms the first year.
  • Vitamin C treatment program applied during second
    year, main DV is DAYS, the number of days with
    cold symptoms the second year.

23
Doing Normal Statistics
ANCOVA
X (IV)
Y (DV)
Z cov