Biostatistics 760 - PowerPoint PPT Presentation

Provided by: bios91
Transcript and Presenter's Notes


1
Biostatistics 760
  • Random Thoughts

2
Upcoming Classes
  • Bios 761 Advanced Probability and Statistical
    Inference
  • Bios 763 Generalized Linear Model Theory and
    Applications
  • Bios 767 Longitudinal Data Analysis
  • Bios 780 Theory and Methods for Survival
    Analysis
  • Bios 841 Statistical Consulting

3
Bios 760
  • Frequentist and Bayesian decision theory
  • Hypothesis testing: UMP tests, etc.
  • Bootstrap and other methods of inference
  • Stochastic processes
      • Poisson processes
      • Markov chains
      • Martingales
      • Brownian motion

4
Bios 780
  • Time-to-event data
  • Right censoring
  • Counting processes; martingales
  • Semiparametric approaches
  • Kaplan-Meier estimator
  • Log-rank statistic
  • Cox model
  • Data analysis
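
A note on the Kaplan-Meier estimator named above: it estimates the survival function from right-censored data by multiplying, at each observed event time, the fraction of at-risk subjects who survive that time. The following is a minimal sketch under stated assumptions, not course material: the function name `kaplan_meier` and the toy data are hypothetical, and subjects censored at an event time are counted as still at risk for that event (the usual convention).

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]     # drop events and censorings that occurred at t
    return curve

# Hypothetical toy data: six subjects, two of them censored (event = 0)
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
km = kaplan_meier(times, events)
for t, s in km:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects never trigger a drop in the curve; they only shrink the risk set used at later event times.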

5
Bios 841
  • Consulting versus collaboration
  • Bringing it all together to solve problems
  • Communicating about statistics
  • Three real problems
  • Three journal-style reports
  • One final oral presentation
  • Real-time problem solving
  • What is the role of statistical theory?

6
A Few War Stories
  • As a student: thesis on surrogates
  • As a postdoc: infectious diseases
  • As a new professor: cystic fibrosis (CF)
  • Working on tenure: empirical processes
  • Empirical processes and cancer
  • Chair of the DSMC for NICHD
  • Artificial intelligence and NSCLC

7
CF Neonatal Screening
  • 1992: Joined Phil Farrell's CF study team
  • 1997: Farrell, Kosorok, Laxova, et al. published
    in NEJM
  • 2004 (Oct. 15): CDC recommended CF newborn
    screening; the 1997 article was judged the only
    valid randomized trial
  • States offering CF newborn screening: 3 in 1997,
    12 in 2004, 45 today

8
What Role Did Theory Play?
  • Used state-of-the-art statistical methods that
    were robust (GEE)
  • In other CF research we have used:
      • Current status methods (parametric, robust)
      • Constrained regression estimation
      • Semiparametric bootstrap inference
      • Martingale-based survival analysis
      • New work using artificial intelligence

9
Empirical Processes and Cancer
  • Non-Hodgkin's Lymphoma Prognostic Factors Project
    (1993, NEJM)
  • Cox proportional hazards model employed to
    ascertain risks of 5 prognostic factors: Age,
    performance Status, serum lactate dehydrogenase
    Level, number of extranodal disease Sites, tumor
    Stage
  • Diagnostics show the model fits poorly
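
For reference, the Cox proportional hazards model mentioned on this slide has the standard form below. The notation ($Z$, $\beta$, $\lambda_0$) is the usual textbook convention and is an assumption here, since the slides do not give symbols.

```latex
% Hazard at time t for a subject with covariate vector Z
\lambda(t \mid Z) = \lambda_0(t)\,\exp\!\left(\beta^{\top} Z\right)
% \lambda_0(t): unspecified baseline hazard (the nonparametric part)
% \beta: regression coefficients (the parametric part);
%        here Z would hold the 5 prognostic factors listed above
```

"Proportional hazards" refers to the fact that the hazard ratio between two subjects, $\exp\{\beta^{\top}(Z_1 - Z_2)\}$, does not depend on $t$; a poor fit in diagnostics typically means this assumption is violated.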

10
What is the Problem?
  • Poor survival function prediction
  • Possibly incorrect interpretation of risk factor
    effects
  • A model that adds a single parameter to the Cox
    model was developed and fit
  • This new model fits well (Kosorok, Lee and Fine,
    2004)
  • Inference for the new model is complicated

11
What Does Theory Tell Us?
  • We can derive valid inferential tools for the new
    model: estimation and bootstrap
  • Robustness was also studied: we learn
    theoretically that the Cox model is robust to
    this kind of model misspecification
  • The direction of the regression coefficients is
    preserved
  • Should use robust variance for Cox model

12
Theory Versus Applications
  • The title implies there is conflict between
    theory and applications
  • This isn't true!
  • Theory provides a basis for correct thinking and
    problem solving for applications
  • Applications drive new theoretical development

13
Theory Can Be Impractical
  • Law of the iterated logarithm needs sample size
    of 10^8 (asymptopia).
  • Sometimes higher order approximations are needed
    before it becomes useful.
  • Sometimes computational properties of
    asymptotically optimal estimators are poor.
  • Some hard problems take years to solve.
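
As background for the first bullet, the classical (Hartman-Wintner) law of the iterated logarithm can be stated as below. This is the standard textbook result, not something taken from the slides, and the symbols $X_i$ and $S_n$ are introduced here for illustration.

```latex
% X_1, X_2, ... i.i.d. with E[X_i] = 0 and Var(X_i) = 1;  S_n = X_1 + ... + X_n
\limsup_{n \to \infty} \frac{S_n}{\sqrt{2 n \log\log n}} = 1
\quad \text{almost surely.}
% The iterated logarithm grows extremely slowly: at n = 10^8,
% \log\log n = \log(18.42\ldots) \approx 2.91, so \sqrt{2 \log\log n} \approx 2.41.
```

Because $\log\log n$ barely moves over any realistic range of $n$, the limiting behavior described by the theorem only becomes visible at very large sample sizes.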

14
Why Theory is Needed
  • Often it does work for practical sample sizes.
  • Can reveal properties that are universally valid;
    simulation studies are limited to the scenarios
    investigated.
  • Theory can lead toward methodological solutions
    (Cook and Kosorok, 2004 JASA).
  • Theory can drive scientific discovery.
  • Some results are beautiful.

15
Data Mining Versus Inference
  • Data mining is summarizing and representing data,
    no matter how complicated
  • Inference is determining valid measures of
    uncertainty
  • Patterns obtained from data mining can be
    misleading
  • Inference without data mining may miss important
    structure

16
The Core of Statistics
  • Statistics is the science of science
  • How do we learn from our world and draw
    meaningful and valid conclusions from it?
  • Need both data mining and valid inference
  • Requires a unique kind of intuition
  • Needs many different intellectual perspectives
  • One of the most challenging of all fields

17
Everyone Needs Core Literacy
  • All statisticians need to know enough theory to
    have core literacy about statistics and to be
    able to problem solve
  • All statisticians need to know enough about
    applications to know what is important
  • All biostatisticians need to know enough
    statistical methods to be useful in practice
  • The purpose of a Ph.D. in Biostatistics is to
    enable the creation of new methodology

18
Semiparametric Inference
  • The study of statistical models with parametric
    and/or nonparametric parts
  • Can achieve trade-off between scientific meaning
    and model robustness
  • Estimation and inference are often hard
  • There exists an efficiency bound for parametric
    and some nonparametric parts
  • NPMLE, testing and estimating equations

19
Empirical Processes
  • Tools for complex model inference and high
    dimensional data
  • Can determine universal properties of
    semiparametric methods:
      • Consistency
      • Rate of convergence
      • Limiting distributions
      • Valid inference (empirical process bootstrap)
  • Empirical processes are everywhere
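
To make the bootstrap idea concrete, here is a minimal sketch of the ordinary nonparametric bootstrap for a standard error. It is only an illustration of resampling, not the empirical process bootstrap theory the slide refers to; the function name `bootstrap_se`, its parameters, and the sample values are all hypothetical.

```python
import random
import statistics

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Nonparametric bootstrap estimate of the standard error of stat(data)."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Resample n observations with replacement from the observed data
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(resample))
    # The spread of the statistic across resamples approximates its standard error
    return statistics.stdev(replicates)

# Hypothetical measurements, for illustration only
sample = [4.1, 5.3, 2.8, 6.0, 4.7, 5.9, 3.3, 4.4, 5.1, 3.9]
se_median = bootstrap_se(sample, statistics.median)
print(f"bootstrap SE of the median: {se_median:.3f}")
```

The median is used because it has no simple closed-form standard error. Whether this resampling scheme gives valid inference for a given estimator is exactly the kind of question the theory is meant to answer.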

20
The Road Ahead
  • Whatever you choose to do, the core statistical
    theory classes will help you.
  • Be patient as you learn.
  • Be willing to work hard (struggle is good).
  • It takes many different kinds of thinkers with
    different learning styles.
  • There are important discoveries to be made in
    both applications and theory.