Multiple Imputation using SAS - PowerPoint PPT Presentation

Provided by: donald60

1
Multiple Imputation using SAS
  • Don Miller
  • 812 Oswald Tower
  • miller_at_pop.psu.edu
  • 814-863-3155

2
Introduction
  • Missing values occur often in research:
    refused/don't know, attrition, skip patterns
  • Dropping cases with missing values may bias results
    (e.g. women and/or overweight people tend to
    disclose their weight less often than others)
  • Attempts are made to impute the data (fill
    in missing values)
  • Single imputation (e.g. with the mean) is biased,
    doesn't give a measure of uncertainty

3
Paris datasets
  • Open Windows Explorer (or My Computer)
  • Tools > Map Network Drive
  • Drive: P
  • Folder: \\paris\sas_data
  • For help: help_at_pop.psu.edu
  • Stat help: stat-core_at_pop.psu.edu

4
Data Setup
5
Multiple Imputation: Simple Procedure
  • 1. Impute using PROC MI
  • 2. Round off, if you want plausible values
    (caution: this will bias your results)
  • 3. Do the analysis (PROC REG, LOGISTIC, etc.)
    using by _imputation_ in the procedure
  • 4. Combine results using PROC MIANALYZE
  • For categorical variables: construct binary dummy
    variables, throwing out the reference category (e.g.
    race 1=white, 2=black, 3=other becomes
    black, other variables)
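
The four steps above can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: it assumes the bmx dataset and body-measurement variables used elsewhere in this presentation, and the dataset survey with its race variable is hypothetical, included only to illustrate dummy coding. Step 2 (rounding) is left out because of the bias caution.

```sas
/* Dummy coding for a categorical variable.
   "survey" and "race" (1=white, 2=black, 3=other) are hypothetical names.
   White is the reference category, so it gets no dummy. Missing race
   stays missing in both dummies so that PROC MI can impute them. */
data survey2;
   set survey;
   if missing(race) then do;
      black = .;
      other = .;
   end;
   else do;
      black = (race = 2);   /* 1 if black, 0 otherwise */
      other = (race = 3);   /* 1 if other, 0 otherwise */
   end;
run;

/* Step 1: impute. The output dataset stacks the completed copies
   and indexes them with the variable _Imputation_. */
proc mi data=bmx out=impdat seed=33155 nimpute=5;
   var bmxbmi bmxht bmxwt bmxarmc bmxarml;
run;

/* Step 3: fit the same model once per completed copy.
   OUTEST= with COVOUT saves the estimates and their covariance matrix. */
proc reg data=impdat outest=parmcov covout noprint;
   model bmxbmi = bmxht bmxwt bmxarmc bmxarml;
   by _imputation_;
run;

/* Step 4: combine the per-imputation results. */
proc mianalyze data=parmcov;
   modeleffects intercept bmxht bmxwt bmxarmc bmxarml;
run;
```

Setting nimpute=5 explicitly keeps the sketch independent of whatever default the installed SAS release uses.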

6
PROC MI
  • Typical syntax:
  • proc mi data=bmx out=impdat seed=33155;
  • var bmxbmi bmxht bmxwt bmxarmc bmxarml;
  • run;
  • data: 1 copy of data with missing values
  • out: 5 copies of data with imputed values (will
    be different across copies)
  • seed: random seed; you can keep it the same to
    reconstruct your results
  • var: variables with missing values you need
    imputed, variables in the model, and those that may be
    helpful with imputation

7
PROC MI Sample Output
8
PROC MI Sample Output
9
PROC MI Options
  • nimpute=5: number of imputations, default=5; 0
    gives missing patterns
  • minimum=0 0 0 0, maximum=1 1 1 90: set min and max;
    sometimes doesn't converge as well
  • round=1 1 1 0.01: round-off option
  • alpha=0.05: level for confidence limits
  • mu0=0.5 0.5 0.5 25: t test null hypothesis µ=µ0
10
PROC MI Statements
  • em maxiter=200 outem=data;
    EM algorithm, MLE for data with missing values
  • freq fweight;
    weights observations by
    frequency weight
  • mcmc (options);
    modify imputation method
  • class sex race;
    specify categorical variables (don't need
    dummies) (new / experimental)
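
A minimal sketch of the em and mcmc statements inside one PROC MI step, assuming the bmx dataset and variables from the typical-syntax slide. The output dataset name emest and the specific mcmc settings are assumptions chosen for illustration.

```sas
proc mi data=bmx out=impdat seed=33155 nimpute=5;
   /* EM algorithm: allow up to 200 iterations and save the
      maximum likelihood estimates to the dataset emest (hypothetical name). */
   em maxiter=200 outem=emest;

   /* MCMC imputation: a single chain, 200 burn-in iterations,
      100 iterations between imputations, started from the EM estimates. */
   mcmc chain=single nbiter=200 niter=100 initial=em;

   var bmxbmi bmxht bmxwt bmxarmc bmxarml;
run;
```

Increasing nbiter or niter is the usual first adjustment when the chain appears not to have settled, at the cost of longer run time.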

11
Output dataset
12
Regression
  • Fit your model as if data had no missing values,
    using by _imputation_
  • proc reg data=impdat outest=parmcov covout;
  • model bmxbmi=bmxht bmxwt bmxarmc bmxarml;
  • by _imputation_;
  • run;
  • You'll get nimpute (usually 5) sets of output
  • Estimates, covariances, errors will be combined
    in MIANALYZE (R² is just the mean)
  • Need to generate parameter estimates and
    covariance data set (varies by procedure)

13
Parameter Est. & Covariance Matrix
  • proc logistic data=impdat descending; model
    bmxbmi=bmxht bmxwt bmxarmc bmxarml /covb;
    by _imputation_; ods output
    ParameterEstimates=parmsdat CovB=covbdat;
    run;
  • proc mixed data=impdat; model
    bmxbmi=bmxht bmxwt bmxarmc bmxarml /solution
    covb; by _imputation_; ods output
    covparms=parmcov; run;
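
For PROC MIXED, PROC MIANALYZE needs the fixed-effect estimates and their covariance matrix. One pattern, following the SAS documentation example for mixed models, is to capture the SolutionF and CovB ODS tables. The sketch below assumes the impdat dataset from the imputation step; the output dataset names mixparms and mixcovb are hypothetical.

```sas
/* Fit the mixed model once per imputation and save the
   fixed-effect solution and its covariance matrix. */
proc mixed data=impdat;
   model bmxbmi = bmxht bmxwt bmxarmc bmxarml / solution covb;
   by _imputation_;
   ods output SolutionF=mixparms CovB=mixcovb;
run;

/* effectvar=rowcol tells MIANALYZE that the CovB table identifies
   effects by Row and Col variables, as PROC MIXED writes them. */
proc mianalyze parms=mixparms covb(effectvar=rowcol)=mixcovb;
   modeleffects intercept bmxht bmxwt bmxarmc bmxarml;
run;
```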

14
Parameter Est. & Covariance Matrix
  • proc genmod data=impdat; model
    bmxbmi=bmxht bmxwt bmxarmc bmxarml /covb;
    by _imputation_; ods output
    ParameterEstimates=parmsdat CovB=covbdat;
    run;
  • proc glm data=impdat; model bmxbmi=bmxht
    bmxwt bmxarmc bmxarml /inverse; by
    _imputation_; ods output
    ParameterEstimates=parmsdat InvXPX=xpxidat;
    run;

15
PROC MIANALYZE
  • Syntax depends on what procedure you used in
    the previous step
  • proc mianalyze data=parmcov; (or)
    proc mianalyze parms=parmsdat covb=covbdat;
    (or) proc mianalyze parms=parmsdat
    xpxi=xpxidat;
  • (then type this)
  • modeleffects intercept bmxht bmxwt bmxarmc
    bmxarml;
  • run;
  • Note: the var statement is now modeleffects
  • Note that the dependent variable is omitted
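
For reference, the combining step follows Rubin's rules. The notation here is an assumption, not taken from the slides: m is the number of imputations, and for a single parameter, the estimate from imputation i is written as Q-hat with variance U-hat.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(combined point estimate)} \\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)}
\end{align*}
```

The combined standard error is the square root of T. The between-imputation term B is what single imputation leaves out, which is why it understates uncertainty.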

16
PROC MIANALYZE Output