Title: A Toolkit for Statistical Data Analysis
1A Toolkit for Statistical Data Analysis
- B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon,
P. Viarengo
CHEP 2004 Interlaken, 26-30 September 2004
http://www.ge.infn.it/geant4/analysis/HEPstatistics
Work supported and partially funded by the
European Space Agency (ESA) under Contract
No. 16339/02/NL/FM
2The project
A project to develop a statistical analysis
system
Provide tools for the statistical comparison of
distributions (Goodness-of-Fit tests)
- Regression testing
- Throughout the software life-cycle
- Online DAQ
- Monitoring detector behaviour w.r.t. a reference
- Simulation validation
- Comparison with experimental data
- Reconstruction
- Comparison of reconstructed vs. expected distributions
- Physics analysis
- Comparisons of experimental distributions (ATLAS vs. CMS Higgs?)
- Comparison with theoretical distributions (data vs. Standard Model)
Typical use cases in HEP
3Software tools
- Commercial products used by professional statisticians
- SPSS, NCSS...
- In HEP
- A lot of activity
- workshops/conferences (CERN, Durham, SLAC etc.)
- books (F. James et al., L. Lyons, R. Barlow etc.)
- sophisticated statistical algorithms applied in various data analyses
- ...but, in spite of the important role played by statistics in HEP, very limited availability of software tools for statistics in our field
- and in open-source software in general
4We need it, let's do the work ourselves...
A project to develop an open-source software
system for statistical analysis
Provide tools for the statistical comparison of distributions
Create a hub to aggregate expertise and collaborative contributions from scientists interested in statistical methods
5Vision: the basics
- Rigorous software process
6Architectural guidelines
- The project adopts a solid architectural approach
- to offer the functionality and the quality needed by the users
- to be maintainable over a long time scale
- to be extensible, to accommodate future evolutions of the requirements
- Component-based architecture
- to facilitate re-use and integration in diverse frameworks
- Dependencies
- adopt a standard (AIDA) for the user layer
- no dependence on any specific analysis tool
- Python
- the glue for interactivity
- The approach adopted is compatible with the
recommendations of the LCG Architecture
Blueprint Report
7Software process
- Unified Software Development Process, specifically tailored to the project
- practical guidance and tools from the RUP
- both rigorous and lightweight
- mapping onto ISO 15504
- significant experience gained in the group from other projects
- Incremental and iterative life-cycle model
8User Requirements
- User requirements elicited, analysed and formally specified
- Functional (capability) and non-functional (constraint) requirements
- User Requirements Document available from the web site
Requirement traceability
- Requirements
- Design
- Implementation
- Test and test results
- Documentation
11- Simple user layer
- Shields the user from the complexity of the underlying algorithms and design
- Only deal with AIDA objects and choice of comparison algorithm
12GoF algorithms
- Algorithms for binned distributions
- Anderson-Darling Test
- Chi-squared Test
- Fisz-Cramer-von Mises Test
- Tiku Test (Cramer-von Mises test in chi-squared approximation)
- Algorithms for unbinned distributions
- Anderson-Darling Test
- Fisz-Cramer-von Mises Test
- Goodman Test (Kolmogorov-Smirnov test in chi-squared approximation)
- Kolmogorov-Smirnov Test
- Kuiper Test
- Tiku Test (Cramer-von Mises test in chi-squared approximation)
13Chi-squared test
- Applies to binned distributions
- It can also be useful for unbinned distributions, but the data must first be grouped into classes
- Cannot be applied if the theoretical frequency in any class is less than 5
- When a class falls below this minimum, one can try to merge contiguous classes until the minimum theoretical frequency is reached
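The merging of contiguous classes described above can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: the function names, the left-to-right merging order and the handling of the leftover tail are assumptions, and expected counts are assumed to be positive.

```python
def merge_sparse_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins, left to right, until every merged bin has
    an expected count of at least min_expected (hypothetical helper)."""
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # Fold any leftover tail into the last merged bin.
    if acc_obs > 0.0 or acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi_squared_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Illustrative counts only (not data from the slides).
observed = [1, 2, 6, 12, 9, 3, 1]
expected = [1.5, 3.0, 7.0, 11.0, 7.5, 3.0, 1.0]
obs_merged, exp_merged = merge_sparse_bins(observed, expected)
print(obs_merged, exp_merged, chi_squared_statistic(obs_merged, exp_merged))
```

Merging changes the number of classes, so the number of degrees of freedom used for the p-value has to be recomputed from the merged bins.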
14Tests based on a supremum statistic
Unbinned distributions
- Goodman approximation of KS Test
D_mn
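A minimal sketch of the supremum statistic and of Goodman's chi-squared approximation, for two unbinned samples of sizes m and n. The function names are hypothetical and this is not the toolkit's code. Goodman's approximation is classically stated for the one-sided distance; the sketch applies it to the two-sided distance for simplicity, which is an assumption.

```python
def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance D_mn: the largest absolute
    difference between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    m, n = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < m and j < n:
        x = min(a[i], b[j])
        # Step past every observation equal to x in both samples (handles ties).
        while i < m and a[i] == x:
            i += 1
        while j < n and b[j] == x:
            j += 1
        d_max = max(d_max, abs(i / m - j / n))
    return d_max


def goodman_chi_squared(d_mn, m, n):
    """Goodman's approximation: 4 * D_mn^2 * m*n / (m + n), to be compared
    with a chi-squared distribution with 2 degrees of freedom."""
    return 4.0 * d_mn ** 2 * m * n / (m + n)


# Illustrative samples only (not data from the slides).
sample_a = [1.2, 2.3, 3.1, 4.8, 5.0]
sample_b = [2.9, 3.5, 4.1, 6.2, 7.7, 8.4]
d = ks_distance(sample_a, sample_b)
print(d, goodman_chi_squared(d, len(sample_a), len(sample_b)))
```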
15Tests containing a weighting function
Unbinned distributions
Binned distributions
- Fisz-Cramer-von Mises Test
- k-sample Anderson-Darling Test
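Tests of this family sum the squared difference between the two empirical distribution functions over the pooled sample, multiplied by a weighting function. The sketch below is a two-sample, unbinned illustration under stated assumptions: no tied observations, hypothetical function names, and textbook two-sample forms of the Cramer-von Mises and Anderson-Darling statistics rather than the toolkit's own k-sample or binned implementations.

```python
def weighted_edf_statistic(sample_a, sample_b, weight):
    """(m*n / N^2) * sum over pooled points of [F_m - G_n]^2 * weight(H),
    where H is the pooled empirical distribution function. Assumes no ties."""
    m, n = len(sample_a), len(sample_b)
    total = m + n
    pooled = sorted([(x, 0) for x in sample_a] + [(x, 1) for x in sample_b])
    count_a = count_b = 0
    acc = 0.0
    # The last pooled point is skipped: both EDFs equal 1 there, so the
    # squared difference is zero (and the Anderson-Darling weight diverges).
    for rank, (_, label) in enumerate(pooled[:-1], start=1):
        if label == 0:
            count_a += 1
        else:
            count_b += 1
        diff = count_a / m - count_b / n
        acc += diff ** 2 * weight(rank / total)
    return m * n / total ** 2 * acc


def cramer_von_mises_weight(h):
    """Uniform weight: every part of the range counts equally."""
    return 1.0


def anderson_darling_weight(h):
    """Weight 1 / (H * (1 - H)): emphasises the tails of the distribution."""
    return 1.0 / (h * (1.0 - h))


# Illustrative samples only (not data from the slides).
x, y = [1.0, 3.0], [2.0, 4.0]
print(weighted_edf_statistic(x, y, cramer_von_mises_weight))
print(weighted_edf_statistic(x, y, anderson_darling_weight))
```

The only difference between the two statistics in this form is the weighting function, which is why Anderson-Darling is more sensitive to the tails.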
16Comparative evaluation of tests
Anderson-Darling        High    Sensitive to tails
χ²                      Low     General
Fisz-Cramer-von Mises   High    Symmetric, right-skewed distributions
Goodman                 Medium  Approximation of K-S to χ² test statistics
Kolmogorov-Smirnov      Medium  Derives from Kolmogorov statistics
Kuiper                  Medium  Sensitive to tails and median
Tiku                    High    Converts CvM statistics to a χ²
More about the comparative evaluation of tests in the User Documentation on our web site.
Topic still subject to research activity in the domain of statistics.
17Power of tests
The power of a test is the probability of rejecting the null hypothesis when it is false
In terms of power:
- χ² loses information in a test for unbinned distributions by grouping the data into cells
- Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires n^(4/5) observations compared to n observations for χ² to attain the same power
- Cramer-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov's, since they compare the two distributions all along the range of x, rather than looking for a marked difference at one point
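To give a feeling for the n^(4/5) scaling quoted above, the short sketch below evaluates it for a few sample sizes. The function name is hypothetical, and the result is an asymptotic statement, so the numbers are indicative only.

```python
def ks_equivalent_sample_size(n_chi_squared):
    """Number of observations the Kolmogorov-Smirnov test would need,
    under the n^(4/5) scaling, to match a chi-squared test on n points."""
    return n_chi_squared ** (4.0 / 5.0)


for n in (100, 1_000, 10_000):
    print(n, round(ks_equivalent_sample_size(n)))
```

For n = 10000 this gives about 1585 observations, roughly one sixth of the sample needed by χ².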
Talk at IEEE NSS, Rome, 16-22 October 2004; paper submitted for publication, November 2004
19Unit test: χ²
Test from PICCOLO BOOK (STATISTICS - page 711)
Exact p-value: 0.200758 (expected p-value: 0.200757)
χ² test statistic: 15.8 (expected χ²: 15.8)
Binned data
Test from CRAMER BOOK (MATHEMATICAL METHODS OF STATISTICS - page 447)
Exact p-value: 0 (expected p-value: 0)
χ² test statistic: 123.203 (expected χ²: 123.203)
20Unit test: K-S Goodman
Test from PICCOLO BOOK (STATISTICS - page 711)
χ² test statistic: 3.9 (expected χ²: 3.9)
Exact p-value: 0.140974 (expected p-value: 0.140991)
Test from LANDENNA BOOK (NONPARAMETRIC TESTS BASED ON FREQUENCIES - page 287)
χ² test statistic: 1.5 (expected χ²: 1.5)
Exact p-value: 0.472367 (expected p-value: 0.472367)
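The second p-value above can be cross-checked by hand, assuming the Goodman statistic is compared with a chi-squared distribution with 2 degrees of freedom, for which the tail probability has the closed form exp(-x/2). The function name below is hypothetical, and the check is a sketch, not one of the toolkit's unit tests.

```python
import math


def chi_squared_p_value_2dof(chi_squared):
    """Upper-tail probability of a chi-squared distribution with
    2 degrees of freedom: P(X > x) = exp(-x / 2)."""
    return math.exp(-chi_squared / 2.0)


# Statistic and expected p-value quoted on the slide (LANDENNA example).
p_value = chi_squared_p_value_2dof(1.5)
assert abs(p_value - 0.472367) < 1e-6
print(p_value)
```

Applied to the displayed value 3.9, the same formula gives about 0.142 rather than 0.140974, so that statistic is presumably shown rounded on the slide.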
21Unit test: Kolmogorov-Smirnov
Test from LANDENNA BOOK (NONPARAMETRIC TESTS BASED ON FREQUENCIES - pages 318-325)
D test statistic: 0.65 (expected D: 0.65)
Cumulative
Exact p-value: 2 × 10^-19 (expected p-value: 8 × 10^-19)
This is just a sample of the test process and results!
22GPL License
Feedback from users is welcome!
23User Documentation
- Download
- Installation
- User Guide
- Statistics Reference Guide
24Example of application results
Validation of Geant4 physics models w.r.t. NIST
reference
ESA BepiColombo mission to Mercury: test beam at BESSY
Kolmogorov-Smirnov Test
Data range        Distance  p-value
-84 to -60 mm     0.38      0.23
-59 to -48 mm     0.27      0.90
-47 to 47 mm      0.43      0.19
48 to 59 mm       0.30      0.82
60 to 84 mm       0.40      0.10
Dosimetry at IST Cancer Inst.: Monte Carlo and experimental data
Distance range    Distance  Significance level
-84 to -60 mm     0.385     0.23
-59 to -48 mm     0.27      0.90
-47 to 47 mm      0.43      0.19
48 to 59 mm       0.30      0.82
60 to 84 mm       0.40      0.10
25A toolkit for modeling multi-parametric fit
problems
- F. Fabozzi, L. Lista
- INFN Napoli
- Initially developed while rewriting a FORTRAN fitter for BaBar analysis
- Simultaneous estimate of
- B(B± → J/ψ π±) / B(B± → J/ψ K±)
- direct CP asymmetry
- More control over the code was needed to account for a bias that appeared in the original fitter
New components included in the Statistical Toolkit: Toy Monte Carlo, PDF modelling, Max Likelihood Fits
Architecture open to extension and evolution
26Feel free to contact us!
27Conclusions
- A project to develop an open source, general purpose software toolkit for statistical data analysis is in progress
- to provide a product of common interest to user communities
- Rigorous software process
- to contribute to the quality of the product
- Component-based architecture, OO methods, generic programming
- to ensure openness to evolution, maintainability, ease of use
- GoF component
- Component for modeling multi-parametric fit problems
- Software released and application results available
- toolkit in use for Geant4 physics validation and in experiments
- paper published in IEEE Trans. Nucl. Sci., 3 October 2004
Thanks to Fred James (CERN) and Louis Lyons
(Oxford) for many useful suggestions,
discussions and encouragement.