Statistical Tools: A Few Comments

Provided by: hep5
Learn more at: http://www.hep.fsu.edu

1
Statistical Tools: A Few Comments
  • Harrison B. Prosper
  • Florida State University
  • PHYSTAT Workshop 2004
  • 1-2 March 2004

2
Outline
  • Issues
  • Wish List
  • Example
  • Summary

3
Statistical Tools: Issues
  • Some difficulties with tools used in HEP
      • Difficult to express ideas cleanly and clearly
      • Tools scattered over different (typically,
        monolithic) programs
      • Interface between heterogeneous data formats and
        disparate tools is a headache
      • Histograms are tightly coupled to their viewers
      • Algebra of histograms relatively crude
      • Inadequate support for systematic study of
        ensembles

4
Issues II
  • In a systematic statistical study one may wish
    to
      • Generate different ensembles of observations,
        possibly with conditioning, and study various
        statistical properties (bias, variance, coverage,
        etc.)
      • Assess robustness with respect to prior densities
        and likelihoods
      • Study different confidence limit procedures
      • Study different optimization criteria
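
The kind of ensemble study listed above can be sketched in a few lines. This is a minimal illustration, not a tool from the talk: it assumes a Poisson counting experiment, the naive interval n ± sqrt(n), and hypothetical helper names (`generate_ensemble`, `study`). Conditioning is expressed as a predicate that selects a sub-ensemble.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson variate (Knuth's product method; fine for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def generate_ensemble(mean, size, seed=1):
    """Generate an ensemble of observed counts for a known true mean."""
    rng = random.Random(seed)
    return [poisson(rng, mean) for _ in range(size)]


def study(ensemble, true_mean, condition=lambda n: True):
    """Bias, variance and coverage of the estimate n +/- sqrt(n).

    `condition` selects the sub-ensemble to study (assumed interface).
    """
    sub = [n for n in ensemble if condition(n)]
    avg = sum(sub) / len(sub)
    variance = sum((n - avg) ** 2 for n in sub) / (len(sub) - 1)
    covered = sum(
        1 for n in sub
        if n - math.sqrt(n) <= true_mean <= n + math.sqrt(n)
    )
    return {
        "size": len(sub),
        "bias": avg - true_mean,
        "variance": variance,
        "coverage": covered / len(sub),
    }


ensemble = generate_ensemble(mean=5.0, size=20000)
full = study(ensemble, 5.0)
# Same ensemble, conditioned on having observed at least one event.
conditioned = study(ensemble, 5.0, condition=lambda n: n > 0)
print(full)
print(conditioned)
```

Swapping in a different interval procedure only requires changing the coverage test inside `study`, which is the sort of flexibility the list above asks for.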

5
Issues III
  • One may wish to study
      • Type I and type II error rates
      • Consistency: both convergence to, and rate of
        convergence to, the true answer as sample size
        increases
      • Probability densities p(z) given underlying
        distributions p(x)
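
The last item, finding p(z) given p(x), can be approximated by sampling. The sketch below assumes that z is a known function of x and that x can be drawn at random; the function name `sampled_density` and the choice z = x1 + x2 with uniform x1, x2 are illustrative assumptions, not part of the talk.

```python
import random


def sampled_density(transform, draw_x, n_samples, lo, hi, n_bins, seed=2):
    """Estimate p(z) on [lo, hi) as a normalised histogram of z = transform(x)."""
    rng = random.Random(seed)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for _ in range(n_samples):
        z = transform(draw_x(rng))
        if lo <= z < hi:
            # Guard against floating-point round-up at the upper edge.
            index = min(int((z - lo) / width), n_bins - 1)
            counts[index] += 1
    # Divide by (number of samples * bin width) to get a density.
    return [c / (n_samples * width) for c in counts]


# Illustrative case: x = (x1, x2) uniform on the unit square, z = x1 + x2.
# The exact answer is the triangular density on [0, 2].
density = sampled_density(
    transform=lambda x: x[0] + x[1],
    draw_x=lambda rng: (rng.random(), rng.random()),
    n_samples=100000,
    lo=0.0,
    hi=2.0,
    n_bins=20,
)
for i, value in enumerate(density):
    print(f"z in [{0.1 * i:.1f}, {0.1 * (i + 1):.1f}): p(z) ~ {value:.3f}")
```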

6
Wish List
  • Decoupling
      • Statistical tool separate from, and independent
        of, the environment in which it might be used.
      • However, provide bindings for different
        environments/languages (R, Root, Python, Java,
        etc.)
  • Modularity
      • Each statistical tool encapsulates a single
        coherent statistical idea. Avoid monoliths.
  • Histograms
      • Histogram and histogram viewers independent of
        each other. (A sensible idea from Marc Paterno!)
      • Elegant algebra of histograms: h = a*h1 + b*h2/h3
        etc.
      • Powerful, intuitive tools for multi-dim. data
        exploration
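
As a rough sketch of the histogram wishes above, the class below holds only bin edges and contents, knows nothing about any viewer, and supports expressions of the form h = a*h1 + b*h2/h3 (one reading of the expression on the slide). The class name and its conventions are assumptions: bin errors are not propagated, and an empty denominator bin is arbitrarily mapped to 0.

```python
class Histogram:
    """A viewer-independent 1-D histogram: bin edges and bin contents only."""

    def __init__(self, edges, contents):
        if len(contents) != len(edges) - 1:
            raise ValueError("need exactly one content per bin")
        self.edges = tuple(edges)
        self.contents = tuple(float(c) for c in contents)

    def _combine(self, other, op):
        # Bin-by-bin operation; both histograms must share the same binning.
        if self.edges != other.edges:
            raise ValueError("binning mismatch")
        paired = zip(self.contents, other.contents)
        return Histogram(self.edges, [op(x, y) for x, y in paired])

    def __add__(self, other):
        return self._combine(other, lambda x, y: x + y)

    def __sub__(self, other):
        return self._combine(other, lambda x, y: x - y)

    def __truediv__(self, other):
        # Assumed convention: an empty denominator bin yields 0.
        return self._combine(other, lambda x, y: x / y if y != 0 else 0.0)

    def __mul__(self, scalar):
        return Histogram(self.edges, [scalar * c for c in self.contents])

    __rmul__ = __mul__  # allows scalar * histogram

    def __repr__(self):
        return f"Histogram(edges={self.edges}, contents={self.contents})"


edges = [0.0, 1.0, 2.0]
h1 = Histogram(edges, [1, 2])
h2 = Histogram(edges, [4, 9])
h3 = Histogram(edges, [2, 3])
a, b = 2.0, 3.0
h = a * h1 + b * h2 / h3
print(h)
```

Because the class has no drawing code, any viewer (or none) can consume `edges` and `contents`, which is the decoupling the slide argues for.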

7
Wish List II
  • Likelihoods
      • Flexible method for reporting them, maybe as
        swarms of points generated via MCMC?
  • Frequency Methods
      • Flexible ensemble generator, which allows
        sub-ensembles to be extracted easily
      • Flexible query of ensembles (to get coverage,
        error rates, variances, bias, etc.)
  • Bayesian Methods
      • Flexible robustness studies (prior family,
        likelihood family, etc.)
      • Multi-dimensional integration (adaptive and
        Markov chain MC)
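
One way to picture a likelihood reported as a swarm of points is a plain Metropolis sampler. The sketch below is only an illustration under stated assumptions: a single Poisson count with known background, a flat prior on the signal s >= 0, and made-up numbers (`n_obs = 10`, `background = 3.0`). The function names are hypothetical.

```python
import math
import random


def log_posterior(s, n_obs, background):
    """Poisson log-likelihood in the signal s (constants dropped), flat prior s >= 0."""
    if s < 0:
        return -math.inf
    mu = s + background
    return n_obs * math.log(mu) - mu


def metropolis(log_density, start, step, n_steps, burn_in=1000, seed=3):
    """Random-walk Metropolis sampler; returns the swarm of sampled points."""
    rng = random.Random(seed)
    x, lp = start, log_density(start)
    swarm = []
    for i in range(burn_in + n_steps):
        y = x + rng.gauss(0.0, step)
        lq = log_density(y)
        # Accept with probability min(1, density(y) / density(x)).
        if rng.random() < math.exp(min(0.0, lq - lp)):
            x, lp = y, lq
        if i >= burn_in:
            swarm.append(x)
    return swarm


n_obs, background = 10, 3.0  # illustrative numbers only
swarm = metropolis(
    lambda s: log_posterior(s, n_obs, background),
    start=5.0, step=3.0, n_steps=20000,
)

# Anyone holding the swarm can query it without knowing how it was made.
mean_s = sum(swarm) / len(swarm)
upper_95 = sorted(swarm)[int(0.95 * len(swarm))]
print(f"posterior mean ~ {mean_s:.2f}, 95% upper bound ~ {upper_95:.2f}")
```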

8
Example: A Current Statistical Problem From DØ
Single Top Group
  • Set limit on σ(ppbar → t + X) given a histogram
    for each of
      • 4 signal channels
          • tq(EC), tqb(EC), tq(CC), tqb(CC)
      • 4 background sources per signal channel
          • QCD, ttbar(l+jets), ttbar(ll), W+jets
  • Some histograms are weighted, some unweighted
  • We would like to study different limit
    procedures, including Bayesian, and study their
    frequency properties. Currently using ad hoc and
    rather inflexible pieces of homegrown C++!
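
For orientation, one simple Bayesian limit procedure of the kind that might be compared is sketched below. It is not the DØ analysis: it assumes independent Poisson bins, exactly known backgrounds and acceptances, a flat prior on the cross section, and placeholder numbers that are not DØ data. Each tuple stands for one bin of one channel; weighted histograms and systematic uncertainties are ignored.

```python
import math


def log_likelihood(sigma, bins):
    """Sum of Poisson log-likelihoods over bins (constant terms dropped).

    Each bin is (observed count, expected background, acceptance * luminosity),
    so the expected count is background + acceptance * sigma.
    """
    total = 0.0
    for n_obs, background, acceptance in bins:
        mu = background + acceptance * sigma
        total += n_obs * math.log(mu) - mu
    return total


def bayes_upper_limit(bins, cl=0.95, sigma_max=50.0, n_grid=5000):
    """Upper limit on sigma from a flat-prior posterior, integrated on a grid."""
    step = sigma_max / n_grid
    grid = [(i + 0.5) * step for i in range(n_grid)]
    logs = [log_likelihood(s, bins) for s in grid]
    peak = max(logs)
    weights = [math.exp(value - peak) for value in logs]  # avoids underflow
    target = cl * sum(weights)
    running = 0.0
    for s, w in zip(grid, weights):
        running += w
        if running >= target:
            return s + 0.5 * step  # upper edge of the bin that crosses cl
    return sigma_max


# Placeholder inputs, one tuple per bin; NOT real data.
bins = [
    (12, 10.0, 0.5),
    (7, 6.5, 0.3),
    (15, 14.0, 0.6),
    (9, 8.0, 0.4),
]
limit = bayes_upper_limit(bins)
print(f"95% upper limit on sigma ~ {limit:.2f} (arbitrary units)")

# Sanity check against a case with a closed form: zero observed events
# gives a posterior proportional to exp(-sigma), so the limit is -ln(0.05).
check = bayes_upper_limit([(0, 1.0, 1.0)])
```

Studying the frequency properties of such a procedure would amount to running `bayes_upper_limit` over an ensemble of simulated counts and recording how often the limit lies above the true cross section.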

9
Summary
  • The Good
      • Lots of statistical tools already exist
      • A lot more needed: opportunity for creativity!
  • The Bad
      • Use of current tools, however, often requires
        familiarity with several frameworks/languages
  • The Ugly
      • Lack of a simple, but powerful, language for
        expression of statistical ideas. Rapid "what if"
        analyses done with C++. This is crazy! I don't
        want to think about pointers and de-referencing
        when I'm trying to think about mathematics.