Title: Diagnostics for the evaluation of imputed data
1Diagnostics for the evaluation of imputed data
- Heather Wagstaff and Steve Rogers
- Methodology Directorate
- Office for National Statistics, U.K.
2Overview of the Presentation
- The presentation is structured as follows
- Introduction and background
- Fully controlled simulation environment
- Baseline analysis
- Limitations of selected statistical criteria
- Description of proposed heuristic
- Example application of the heuristic
- Conclusions and future work
3Introduction and Background
- Evaluating the quality of the imputation of large datasets
- mainly tabular outputs
- ensure statistical properties are maintained
- emphasis on distributional accuracy
- predictive accuracy of lesser importance.
- ONS endorsed CANCEIS as corporate edit and imputation tool
- where the data are mainly nominal
- implementation on household and individual surveys
- Registration of Life Events
- 2011 Census of Population and Households.
4Introduction and Background
- Selected Statistical Evaluation Criteria
- Distributional accuracy
- Stuart-Maxwell significance test
- Predictive accuracy
- 1. Kappa coefficient
- 2. Large sample variance of Kappa
- 3. Proportion of true values recovered
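The Stuart-Maxwell test listed above compares the marginal distribution of the true values with that of the imputed values in a square table. A minimal sketch in plain Python follows. It assumes rows hold the true category and columns the imputed category; the function names are illustrative and not taken from the presentation, and no p-value is computed (the statistic would be referred to a chi-square distribution with k - 1 degrees of freedom).

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination on Fractions."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            # No usable pivot: the matrix cannot be inverted.
            raise ValueError("singular covariance matrix: statistic undefined")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c] / aug[c][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def stuart_maxwell(table):
    """Return (statistic, degrees_of_freedom) for a square k x k table.

    table[i][j] = number of records with true category i, imputed category j.
    """
    k = len(table)
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    m = k - 1  # drop the last category; its difference is redundant
    # Differences between true and imputed marginal totals.
    d = [Fraction(row[i] - col[i]) for i in range(m)]
    # Covariance matrix built from the discordant (off-diagonal) counts.
    s = [[Fraction(row[i] + col[i] - 2 * table[i][i]) if i == j
          else Fraction(-(table[i][j] + table[j][i]))
          for j in range(m)] for i in range(m)]
    x = solve(s, d)
    return float(sum(di * xi for di, xi in zip(d, x))), m


# The 2 x 2 case reduces to McNemar's test: (5 - 3)**2 / (5 + 3) = 0.5, df = 1.
stat, df = stuart_maxwell([[10, 5], [3, 12]])
print(stat, df)
```

In this sketch the statistic is undefined when the matrix built from the discordant counts is singular, for example when one of the retained categories has no discordant responses at all. That is one way the test can break down; whether it is the instability the authors observed is not stated in the slides.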
5Fully Controlled Simulation Environment
- Construction of synthetic data
- 1. identify reference data
- 2. analyse and record statistical properties
- 3. identify further similar data
- 2001 Area Classifications
- 4. construct truth deck to mirror reference data
- hierarchical records approx. 170K households and 400K persons
- 5. introduce levels and patterns of missingness observed in reference data.
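Step 5 can be sketched as blanking values in a copy of the truth deck. The example below is a simplification under stated assumptions: it reproduces only a per-variable missingness rate, not the joint patterns observed in the reference data, and the record layout, variable names and rates are hypothetical.

```python
import random


def introduce_missingness(truth_deck, missing_rates, seed=0):
    """Return a copy of the truth deck with values blanked at given rates.

    truth_deck    : list of dicts, one per person record (hypothetical layout)
    missing_rates : dict mapping variable name -> probability of being blanked
    Blanked values are marked with None; the truth deck itself is not changed.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated exactly
    perturbed = []
    for record in truth_deck:
        copy = dict(record)
        for variable, rate in missing_rates.items():
            if rng.random() < rate:
                copy[variable] = None
        perturbed.append(copy)
    return perturbed


# Hypothetical records and rates, for illustration only.
truth_deck = [
    {"sex": "F", "marital_status": "married"},
    {"sex": "M", "marital_status": "single"},
    {"sex": "F", "marital_status": "single"},
]
perturbed = introduce_missingness(truth_deck, {"sex": 0.1, "marital_status": 0.3})
```

Because the truth deck is kept intact, every imputed value can later be compared with the value that was blanked.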
6Baseline Analysis
7Limitations of the Statistical Tests
8Limitations of Statistical Tests
9Proposed Heuristic
- Calculate the proportion of true values recovered in all cases
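A minimal sketch of the proportion of true values recovered, assuming three aligned lists of records (truth, perturbed, imputed) and that a blanked value is marked with None in the perturbed data. The function name and record layout are illustrative, not taken from the presentation.

```python
def proportion_recovered(truth, perturbed, imputed, variable):
    """Share of imputed values of `variable` that equal the true value.

    Only records whose value was blanked in `perturbed` are counted.
    Returns None when no value of the variable was imputed.
    """
    recovered = 0
    total = 0
    for t, p, i in zip(truth, perturbed, imputed):
        if p[variable] is None:  # value was missing, so it had to be imputed
            total += 1
            if i[variable] == t[variable]:
                recovered += 1
    return recovered / total if total else None


# Hypothetical data: three values were imputed, two of them correctly.
truth = [{"sex": "F"}, {"sex": "M"}, {"sex": "F"}, {"sex": "M"}]
perturbed = [{"sex": None}, {"sex": "M"}, {"sex": None}, {"sex": None}]
imputed = [{"sex": "F"}, {"sex": "M"}, {"sex": "M"}, {"sex": "M"}]
print(proportion_recovered(truth, perturbed, imputed, "sex"))  # 2 of 3 recovered
```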
10Example Application of Heuristic
- Aim is to identify the optimal imputation strategy
- search for groupings amongst variables
- apply logistic regression to all person level variables
- identify 5 key demographic variables
- self predicting set
- but predictive power weak for two
- subject to repeated imputation and compare to baseline.
11Example Application of Heuristic
12Example Application of Heuristic
13Example Application of Heuristic
14Conclusions and Future Work
- Evidence of instability in the chosen statistical criteria
- Stuart-Maxwell significance test
- threshold at which it becomes unstable is dependent on the marginal values and distribution of discordant responses.
- Kappa Coefficient
- is based on observed and expected values of concordant responses (leading diagonal)
- hence differing values of Kappa when same proportion of records recovered in differing tables.
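The Kappa point can be illustrated with a small worked example. The two tables below are invented for illustration (rows are true categories, columns are imputed categories). Both recover 80% of records on the leading diagonal, yet the expected agreement differs because the margins differ, so Kappa differs.

```python
def kappa(table):
    """Cohen's Kappa for a square table of true (rows) by imputed (columns)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of records on the leading diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Expected agreement under independence, from the row and column totals.
    p_expected = sum(
        sum(table[i]) * sum(row[i] for row in table) for i in range(k)
    ) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


balanced = [[40, 10], [10, 40]]  # margins 50/50, 80 of 100 on the diagonal
skewed = [[75, 10], [10, 5]]     # margins 85/15, 80 of 100 on the diagonal

print(round(kappa(balanced), 3))  # expected agreement 0.5, Kappa about 0.6
print(round(kappa(skewed), 3))    # expected agreement 0.745, Kappa about 0.216
```

With the balanced margins, Kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6; with the skewed margins it is (0.8 - 0.745) / (1 - 0.745), roughly 0.216, even though the proportion recovered is identical.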
15Conclusions and Future Work
- Main aim to construct set of diagnostics to facilitate 1000 repetitions of imputation process in fully controlled simulation environment.
- Evidence that the heuristic is the right approach
- Stuart-Maxwell - understand when unstable
- supported by some categorisation of the proportion of true values recovered.
- Main outcome - need to do more work!!