Matt Hutmacher - PowerPoint PPT Presentation

1

Minimum Hellinger Distance in Model Selection and
Estimation
  • Matt Hutmacher
  • Pharmacometrics, Pfizer, Inc.
  • 02MAY2006

Joint work with Anand Vidyashankar (Cornell
University) and Debu Mukherjee (Statsystem Inc.)
2
Introduction
  • Initiated by a conversation on model selection
    between non-nested models
  • Collaboration ensued
  • Need for better model selection techniques
  • Subject matter cannot always provide model form
  • Empirical models often used in exposure-response
  • Nonlinear mixed effects is flexible with respect
    to model forms
  • Marginal variance depends upon model form
  • Population modelers (pharmacometricians) are
    well-suited for developing and applying new
    data-analytic techniques to enhance decision
    making

3
Objective
  • Introduce Hellinger Distance as a principled
    methodology for selection between nonhierarchical
    models
  • Introduce minimization of the Hellinger Distance
    as an efficient yet robust estimation method,
    an alternative to maximum likelihood (MLE) or
    extended least squares (ELS)

4
Outline
  • What is Hellinger Distance (HD)?
  • HD applied to Model Selection
  • Minimum Hellinger Distance Estimation (MHDE)
  • Remarks

5
What is Hellinger Distance?
  • Definition [equation not preserved in transcript]
  • An absolute measure between two densities
  • HD = 0 when f ≡ g
  • Ranges between 0 and 2 inclusive
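
The definition itself did not survive in the transcript. One common form that matches the stated properties is HD(f, g) = ∫ (√f(x) − √g(x))² dx = 2 − 2 ∫ √(f(x) g(x)) dx, which is 0 for identical densities and 2 for densities with no overlap. The sketch below is a minimal numerical check of those two properties, assuming this form; `normal_pdf` and `hellinger` are illustrative names, not taken from the slides.

```python
import math

def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def hellinger(f, g, lo, hi, n=20001):
    """Trapezoid-rule approximation of the integral of (sqrt(f) - sqrt(g))^2 on [lo, hi].

    Assumed form of HD (0-to-2 scale); the slide's own equation was not preserved.
    """
    step = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * step
        diff = math.sqrt(f(x)) - math.sqrt(g(x))
        weight = 0.5 if k in (0, n - 1) else 1.0  # half weight at the two endpoints
        total += weight * diff * diff
    return total * step

# Identical densities: the distance is zero.
hd_same = hellinger(lambda x: normal_pdf(x, 0.0, 1.0),
                    lambda x: normal_pdf(x, 0.0, 1.0), -10.0, 10.0)

# Essentially non-overlapping densities: the distance approaches the upper bound of 2.
hd_far = hellinger(lambda x: normal_pdf(x, -20.0, 1.0),
                   lambda x: normal_pdf(x, 20.0, 1.0), -40.0, 40.0)
```

Under this assumed form, `hd_same` is 0 and `hd_far` is numerically indistinguishable from 2, which agrees with the range stated on the slide.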

6
What is Hellinger Distance?
  • HD for two univariate normal densities
    [equation not preserved in transcript]
  • Easy to extend to a multivariate normal for
    population models
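
For two normal densities the affinity ∫ √(f g) dx has a closed form, so HD can be evaluated without numerical integration. The sketch below assumes the 0-to-2 scaling HD = 2 − 2 ∫ √(f g) dx; the slide's own expression was not preserved, and `hd_normal` is an illustrative name.

```python
import math

def hd_normal(mu1, sd1, mu2, sd2):
    """HD between N(mu1, sd1^2) and N(mu2, sd2^2) on an assumed 0-to-2 scale.

    Uses the closed-form affinity of two normal densities:
    sqrt(2*sd1*sd2 / (sd1^2 + sd2^2)) * exp(-(mu1 - mu2)^2 / (4 * (sd1^2 + sd2^2))).
    """
    s = sd1 ** 2 + sd2 ** 2
    affinity = math.sqrt(2.0 * sd1 * sd2 / s) * math.exp(-((mu1 - mu2) ** 2) / (4.0 * s))
    return 2.0 - 2.0 * affinity

hd_identical = hd_normal(0.0, 1.0, 0.0, 1.0)   # same mean and variance
hd_shifted = hd_normal(0.0, 1.0, 1.0, 1.0)     # means one standard deviation apart
hd_wider = hd_normal(0.0, 1.0, 0.0, 3.0)       # same mean, different variance
```

Both a mean shift and a variance mismatch increase the distance, so the measure reacts to differences in the first two moments.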

7
HD for Model Selection
  • Likelihood based methods are popular for model
    selection
  • LRT (asymptotic χ²)
  • Akaike information criterion (AIC)
  • Schwarz's Bayesian information criterion (BIC or
    SBC)
  • All are comparative methods
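
For reference, the two criteria can be written as AIC = −2LL + 2p and BIC = −2LL + p·ln(n), where −2LL is minus twice the maximized log-likelihood, p is the number of parameters, and n is the number of observations. A minimal sketch with illustrative function names:

```python
import math

def aic(minus2ll, n_params):
    """Akaike information criterion from -2 * log-likelihood."""
    return minus2ll + 2 * n_params

def bic(minus2ll, n_params, n_obs):
    """Schwarz's Bayesian information criterion from -2 * log-likelihood."""
    return minus2ll + n_params * math.log(n_obs)

# Hypothetical values for two candidate models with the same number of parameters.
aic_a, aic_b = aic(100.0, 3), aic(104.0, 3)
bic_a, bic_b = bic(100.0, 3, 50), bic(104.0, 3, 50)
```

When two candidates have the same number of parameters, the penalty terms cancel and both criteria reduce to a comparison of −2LL.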

8
HD for Model Selection
  • AIC and BIC often applied to nonhierarchical
    model selection
  • Yet, AIC and BIC were developed under the
    hierarchical construct
  • How robust are these likelihood based procedures
    to likelihood misspecification or outliers?
  • Isn't the analyst ultimately interested in a
    model that is closest to the underlying model?

9
HD for MS (Implementation)
  • Consider selecting between f1 and f2
  • How do we proceed?

10
HD for MS (Implementation)
  • Compare models to a nonparametric assessment of
    model form
  • Estimate μg,i nonparametrically (gn)
  • Loess, kernel smoother, mean
  • gn → true μi under mild conditions
  • Estimate σg² using residuals

11
HD for MS (Implementation)
  • Compare models to a non-parametric assessment of
    model form (cont.)
  • Compute HD(f1,g) and HD(f2,g)
  • Select the f with the smallest HD

12
HD for Model Selection (Examples)
  • Example 1
  • True model: Emax model + ε (3 parameters)
  • False model: Emax model × exp(ε) (3 parameters)
  • <25% CV
  • 2000 simulations

13
HD for Model Selection (Examples)
  • Example 2
  • True model: Emax + ε (3 parameters)
  • False model: Linear + ε (2 parameters)
  • 10 outliers
  • 4-6 σ range
  • 2000 simulations

14
HD for Model Selection
  • HD for model selection
  • Targets the closest parametric model to the
    assumption-poor nonparametric model
  • Closeness defined with respect to the first two
    moments (mean, variance)
  • Currently for non-hierarchical models
  • HD will select larger model without penalty
  • Similar to −2LL in this respect
  • Working on an appropriate penalty

15
Minimum HD Estimation (MHDE)
  • Recall HD definition
  • Consider
  • Why not estimate θ?

16
Minimum HD Estimation (MHDE)
  • HD (or γ) is a well-defined objective function
    (bounded)
  • Suggests minimizing HD or maximizing γ for
    estimation of θ (Beran, 1977)
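
A minimal sketch of the idea, under assumptions that are not in the slides: the model is a normal location family N(θ, 1), the data density is estimated with a Gaussian kernel of hypothetical bandwidth 0.5, the affinity ∫ √(fθ·gn) dx is evaluated on a grid, and θ is found by a plain grid search. Maximizing this affinity is equivalent to minimizing HD when HD = 2 − 2 × affinity. The data values are made up and include one gross outlier.

```python
import math

def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def kde(x, data, bandwidth):
    """Gaussian kernel density estimate of the data at x."""
    return sum(normal_pdf(x, xi, bandwidth) for xi in data) / len(data)

def affinity(theta, grid, sqrt_g, step, model_sd=1.0):
    """Riemann-sum approximation of the integral of sqrt(f_theta * g_n)."""
    return step * sum(math.sqrt(normal_pdf(x, theta, model_sd)) * sg
                      for x, sg in zip(grid, sqrt_g))

# Hypothetical sample: eleven values near zero plus one gross outlier at 10.
data = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.7, 1.1, 1.5, 10.0]

step = 0.05
grid = [-8.0 + step * k for k in range(521)]               # integration grid, -8 to 18
sqrt_g = [math.sqrt(kde(x, data, 0.5)) for x in grid]      # sqrt of the density estimate

thetas = [-3.0 + 0.02 * k for k in range(751)]             # candidate locations, -3 to 12
mhde = max(thetas, key=lambda t: affinity(t, grid, sqrt_g, step))

sample_mean = sum(data) / len(data)                        # MLE of the location under normality
```

In this example the sample mean is pulled to about 0.92 by the single outlier, while the affinity-maximizing location stays close to the center of the remaining eleven points.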

17
Minimum HD Estimation
  • Integral evaluation
  • Integral by the strong law of large numbers
    (SLLN); Cheng & Vidyashankar (2003)

18
Minimum HD Estimation
  • Sampling
  • Calculate residuals

19
Minimum HD Estimation
  • Estimate the density of the residuals

20
Minimum HD Estimation
  • Recall
  • For the j-th term in the integral approximation
  • Sample a kernel Ki with probability pi, such as
    1/n for all i
21
Minimum HD Estimation
  • Optimization over θ is now possible
  • Optimization can be performed using SAS PROC NLP
    (or PROC MODEL)
  • These procedures support several optimization
    algorithms, convergence criteria, and symbolic
    differentiation

22
MHDE (Examples)
  • Example 3
  • N = 50

23
MHDE (Examples cont.)
  • Example 4
  • Compare MLE and MHDE
  • Sample size n = 20
  • 1000 simulations

24
MHDE (Example 4)

25
MHDE (Examples cont.)
  • Example 5
  • Compare MLE (ELS) and MHDE
  • Sample size n = 20
  • 1000 simulations

26
MHDE (Example 5)

27
Remarks
  • Properties of a good estimator
  • Efficient when model is true
  • Not much loss when model is approximately true
  • MLE (ELS)
  • Is efficient when the model is true
  • Can suffer instability under data contamination
    (inefficiencies and lack of robustness)
  • Uses a squared error with a penalty for
    increasing the variance

28
MHDE
  • MHDE is efficient with increased robustness
  • Efficient when model is true
  • Increased efficiency relative to total
    nonparametric estimation
  • MHDE's use of the empirical density
    (nonparametric kernel density) estimator reduces
    the influence of outliers and data contamination
  • Empirical density puts mass of ≈ 1/n on data (≈ is
    due to the smoothing)
  • Some experience with choosing cn is necessary to
    get comfortable with the method

29
Remarks
  • Novel methodology
  • Smoothed mixture of kernels to emulate empirical
    density of data
  • Makes MHDE generalizable to more typical
    data-analytic problems
  • Regression
  • ANOVA / ANCOVA / Partial Linear Models
  • Well-suited to adequately simulate data
  • Simulate from smoothed empirical data
    distribution

30
Remarks
  • This estimation methodology reflects a conceptual
    shift away from the likelihood and towards
    density.
  • Minimum Hellinger Distance Estimation provides an
    alternate, efficient (yet robust) inferential
    strategy and a unique data simulation methodology
  • Mixed effects methodology under development