Title: RooUnfold unfolding framework and algorithms
1 RooUnfold: unfolding framework and algorithms
- Tim Adye
- Rutherford Appleton Laboratory
- BaBar Statistics Working Group
- BaBar Collaboration Meeting
- 13th December 2005
2 Outline
- What is Unfolding?
  - and why might you want to do it?
- Overview of a few techniques
  - Regularised unfolding
  - Iterative method
- RooUnfold package
  - Currently implements three methods with a common interface
- Status and Plans
- References
3 Unfolding
- In other fields known as deconvolution, unsmearing
- Given a true PDF in µ, that is corrupted by detector effects, described by a response function, R, we measure a distribution in ν. In terms of histograms: $\nu_i = \sum_j R_{ij} \mu_j$
- This may involve
  - inefficiencies: lost events
  - bias and smearing: events moving between bins (off-diagonal $R_{ij}$)
- With infinite statistics, it would be possible to recover the original PDF by inverting the response matrix: $\mu = R^{-1} \nu$
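The folding and inversion steps can be sketched in a few lines of plain C++. This is only an illustration, assuming the relation ν_i = Σ_j R_ij µ_j with R[i][j] the probability that an event from true bin j is reconstructed in measured bin i. The names `fold` and `invert2x2` are made up for this sketch and are not part of RooUnfold.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Histogram = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;  // Matrix[i][j]: row i, column j

// Fold a true histogram mu through the response matrix R:
// nu_i = sum_j R[i][j] * mu_j, where R[i][j] is the probability that an
// event from true bin j ends up in measured bin i.
Histogram fold(const Matrix& R, const Histogram& mu) {
    Histogram nu(R.size(), 0.0);
    for (std::size_t i = 0; i < R.size(); ++i) {
        assert(R[i].size() == mu.size());
        for (std::size_t j = 0; j < mu.size(); ++j) {
            nu[i] += R[i][j] * mu[j];
        }
    }
    return nu;
}

// Invert a 2x2 response matrix analytically (illustration only; a real
// analysis has many more bins and would use a linear-algebra library).
Matrix invert2x2(const Matrix& R) {
    assert(R.size() == 2 && R[0].size() == 2 && R[1].size() == 2);
    const double det = R[0][0] * R[1][1] - R[0][1] * R[1][0];
    assert(det != 0.0);
    return {{ R[1][1] / det, -R[0][1] / det},
            {-R[1][0] / det,  R[0][0] / det}};
}
```

With R = {{0.6, 0.3}, {0.2, 0.6}} and a truth of (100, 50), `fold` gives (75, 50), and folding that result with `invert2x2(R)` returns the truth. The columns of R sum to less than 1, which is how inefficiency enters.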
4 Not so simple
- Unfortunately, if there are statistical fluctuations between bins this information is destroyed
  - Since R washes out statistical fluctuations, $R^{-1}$ cannot distinguish between wildly fluctuating and smooth PDFs
  - Obtain large negative correlations between adjacent bins
  - Large fluctuations in reconstructed bin contents
- Need some procedure to remove wildly fluctuating solutions
  - Give added weight to smoother solutions
  - Solve for µ iteratively, starting with a reasonable guess and truncate iteration before it gets out of hand
  - Ignore bin-to-bin fluctuations altogether
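A two-bin worked example shows the size of the effect. The numbers are chosen purely for illustration: a symmetric response with 40% migration, and 100 measured events in each bin with independent Poisson errors (variance 100 per bin).

$$
R = \begin{pmatrix} 0.6 & 0.4 \\ 0.4 & 0.6 \end{pmatrix}, \qquad
R^{-1} = \frac{1}{0.6^2 - 0.4^2} \begin{pmatrix} 0.6 & -0.4 \\ -0.4 & 0.6 \end{pmatrix}
       = \begin{pmatrix} 3 & -2 \\ -2 & 3 \end{pmatrix}
$$

$$
V_{\mathrm{meas}} = \begin{pmatrix} 100 & 0 \\ 0 & 100 \end{pmatrix}, \qquad
V_{\mu} = R^{-1} \, V_{\mathrm{meas}} \, (R^{-1})^{T}
        = \begin{pmatrix} 1300 & -1200 \\ -1200 & 1300 \end{pmatrix}
$$

The error on each unfolded bin is $\sqrt{1300} \approx 36$, compared with 10 on each measured bin, and the correlation between the two unfolded bins is $-1200/1300 \approx -0.92$.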
5 What happens if you don't smooth?
6 True Gaussian, with Gaussian smearing, systematic translation, and variable inefficiency, trained using a different Gaussian
7 Double Breit-Wigner, with Gaussian smearing, systematic translation, and variable inefficiency, trained using a single Gaussian
8 So why don't we always do this?
- If the true PDF and resolution function can be parameterised, then a Maximum Likelihood fit is usually more convenient
  - Directly returns parameters of interest
  - Does not require binning
- If the response function doesn't include smearing (i.e. it's diagonal), then apply bin-by-bin efficiency correction directly
- If result is just needed for comparison (e.g. with MC), could apply response function to MC
  - simpler than un-applying response to data
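For the diagonal case, the bin-by-bin efficiency correction can be sketched as below. This assumes the efficiency of each bin is estimated from MC as the ratio of reconstructed to generated events. The function name and the choice to leave bins with no reconstructed MC events at zero are assumptions of this sketch, not a description of any particular package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bin-by-bin correction: divide each measured bin by its efficiency,
// estimated from MC as (reconstructed in bin i) / (generated in bin i).
// Only meaningful when migration between bins is negligible.
std::vector<double> binByBinCorrect(const std::vector<double>& measured,
                                    const std::vector<double>& mcReconstructed,
                                    const std::vector<double>& mcGenerated) {
    assert(measured.size() == mcReconstructed.size());
    assert(measured.size() == mcGenerated.size());
    std::vector<double> corrected(measured.size(), 0.0);
    for (std::size_t i = 0; i < measured.size(); ++i) {
        if (mcReconstructed[i] > 0.0) {
            // efficiency_i = mcReconstructed[i] / mcGenerated[i]
            corrected[i] = measured[i] * mcGenerated[i] / mcReconstructed[i];
        }
        // Bins with no reconstructed MC events are left at zero: the
        // efficiency is unknown there, so no correction can be made.
    }
    return corrected;
}
```

For example, 40 measured events in a bin where MC reconstructs 800 of 1000 generated events (80% efficiency) are corrected to 50.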
9 When to use unfolding
- Use unfolding to recover theoretical distribution where
  - there is no a-priori parameterisation
  - this is needed for the result and not just comparison with MC
  - there is significant bin-to-bin migration of events
10 Where could we use unfolding?
- Traditionally used to extract structure functions
- Widely used outside PP for image reconstruction
- Dalitz plots
  - Cross-feed between bins due to misreconstruction
- True decay momentum distributions
  - Theory at parton level, we measure hadrons
  - Correct for hadronisation as well as detector effects
11 1. Regularised Unfolding
- Use Maximum Likelihood to fit smeared bin contents to measured data, but include a regularisation function, S
  - The regularisation parameter, α, controls the degree of smoothness (select α to, e.g., minimise mean squared error)
- Various choices of regularisation function, S, are used
- Tikhonov regularisation: minimise curvature
  - for some definition of curvature
  - RooUnfHistoSvd by Kerstin Tackmann and Heiko Lacker
    - based on GURU by Andreas Höcker and Vakhtang Kartvelishvili
    - uses Singular Value Decomposition
  - RUN by Volker Blobel
- Maximum entropy
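One common way to write the regularised fit follows the convention of Cowan's Statistical Data Analysis, cited in the references. That convention is an assumption here, with the regularisation parameter written as α; other texts attach the parameter to S instead. The unfolded histogram is the µ that maximises

$$
\phi(\mu) = \alpha \, \ln L(\mu) + S(\mu)
$$

where $L(\mu)$ is the likelihood of the measured data given the folded prediction $\nu = R\mu$. For Tikhonov regularisation with curvature defined through second differences over $M$ true bins, one choice is

$$
S(\mu) = - \sum_{i=1}^{M-2} \left( -\mu_i + 2\mu_{i+1} - \mu_{i+2} \right)^2
$$

In this convention a very large α recovers the unregularised Maximum Likelihood solution, while a small α favours the smoothest solution regardless of the data.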
12 2. Iterative method
- Uses Bayes' theorem to invert the response
- and using an initial set of probabilities, $p_i$ (e.g. flat), obtain an improved estimate
- Repeating with new $p_i$ from these new bin contents converges quite rapidly
- Truncating the iteration prevents us seeing the bad effects of statistical fluctuations
- Fergus Wilson and I have implemented this method in ROOT/C++
  - Supports 1D, 2D, and 3D cases
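The iteration can be sketched in plain C++ along the lines of D'Agostini's paper (see references). This is a minimal illustration under stated assumptions, not the RooUnfold implementation: `unfoldBayes` is a hypothetical name, `response[j][i]` is taken to be P(measured bin j | true bin i), the starting estimate is flat, and no errors are propagated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Histogram = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;

// Iterative Bayesian unfolding (D'Agostini-style sketch).
// response[j][i] = P(measured bin j | true bin i); columns may sum to less
// than 1 to account for inefficiency. Starts from a flat estimate.
Histogram unfoldBayes(const Matrix& response, const Histogram& measured,
                      int iterations) {
    const std::size_t nMeas = response.size();
    assert(nMeas > 0 && measured.size() == nMeas);
    const std::size_t nTrue = response[0].size();
    assert(nTrue > 0);

    // Efficiency of each true bin: eps_i = sum_j P(E_j | C_i).
    Histogram efficiency(nTrue, 0.0);
    for (std::size_t j = 0; j < nMeas; ++j) {
        assert(response[j].size() == nTrue);
        for (std::size_t i = 0; i < nTrue; ++i) efficiency[i] += response[j][i];
    }

    Histogram estimate(nTrue, 1.0);  // flat start; only the ratios matter
    for (int it = 0; it < iterations; ++it) {
        Histogram next(nTrue, 0.0);
        for (std::size_t j = 0; j < nMeas; ++j) {
            // Expected content of measured bin j under the current estimate;
            // this is the denominator of Bayes' theorem.
            double expected = 0.0;
            for (std::size_t i = 0; i < nTrue; ++i)
                expected += response[j][i] * estimate[i];
            if (expected <= 0.0) continue;
            // Share measured[j] among true bins according to P(C_i | E_j).
            for (std::size_t i = 0; i < nTrue; ++i)
                next[i] += measured[j] * response[j][i] * estimate[i] / expected;
        }
        // Correct for inefficiency; bins with zero efficiency stay empty.
        for (std::size_t i = 0; i < nTrue; ++i)
            next[i] = efficiency[i] > 0.0 ? next[i] / efficiency[i] : 0.0;
        estimate = next;
    }
    return estimate;
}
```

With response {{0.8, 0.2}, {0.2, 0.8}} and fluctuation-free input (90, 60), which corresponds to a truth of (100, 50), one iteration gives (84, 66) and further iterations approach the truth. With real data the number of iterations plays the role of the regularisation parameter.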
13 2D Unfolding Example: 2D smearing, bias, variable efficiency, and variable rotation
14 RooUnfold Package
- Make these different methods available as ROOT/C++ classes with a common interface to specify
  - unfolding method and parameters
  - response matrix
    - pass directly or fill from MC sample
  - measured histogram
  - return reconstructed truth histogram and errors
    - full covariance matrix
- Easy to do with multiple dimensions (when supported)
- This should make it easy to try and compare different methods in your analysis
- Could also be useful outside BaBar!
15 RooUnfold Classes
- RooUnfoldResponse
  - response matrix with various filling and access methods
  - create from MC, use on data (can be stored in a file)
- RooUnfold: unfolding algorithm base class
  - RooUnfoldBayes: Iterative method
  - RooUnfoldSvd: Interface to RooUnfHistoSvd package
  - RooUnfoldBinByBin: Simple bin-by-bin method
    - Trivial implementation, but useful to compare with full unfolding
- RooUnfoldExample: Simple 1D example
- RooUnfoldTest and RooUnfoldTest2D
  - Test with different training and unfolding distributions
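The idea of a base class with interchangeable algorithms can be illustrated without ROOT. The sketch below is a design illustration only: the class names `Unfolder`, `NoCorrection`, and `BinByBin` and the function `unfoldWithAll` are hypothetical and do not reproduce the RooUnfold class signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Histogram = std::vector<double>;

// Hypothetical base class (not the RooUnfold API): every algorithm takes a
// measured histogram and returns an estimate of the true one.
class Unfolder {
public:
    virtual ~Unfolder() = default;
    virtual Histogram unfold(const Histogram& measured) const = 0;
};

// Baseline that applies no correction at all.
class NoCorrection : public Unfolder {
public:
    Histogram unfold(const Histogram& measured) const override {
        return measured;
    }
};

// Bin-by-bin method: divide each bin by its efficiency.
class BinByBin : public Unfolder {
public:
    explicit BinByBin(Histogram efficiency)
        : efficiency_(std::move(efficiency)) {}
    Histogram unfold(const Histogram& measured) const override {
        assert(measured.size() == efficiency_.size());
        Histogram result(measured.size(), 0.0);
        for (std::size_t i = 0; i < measured.size(); ++i) {
            if (efficiency_[i] > 0.0) result[i] = measured[i] / efficiency_[i];
        }
        return result;
    }
private:
    Histogram efficiency_;
};

// Analysis code depends only on the base class, so methods can be swapped
// or compared without changing the calling code.
std::vector<Histogram> unfoldWithAll(
        const std::vector<std::unique_ptr<Unfolder>>& methods,
        const Histogram& measured) {
    std::vector<Histogram> results;
    for (const auto& method : methods) {
        results.push_back(method->unfold(measured));
    }
    return results;
}
```

The calling code never names a concrete algorithm, which is what makes it cheap to try several methods on the same measured histogram.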
16 RooUnfold Status
- Available in CVS
- Announced in Statistics HN
- See README file for details of building and running
- Interface can still be adjusted based on comments
  - I already have an idea for simplifying use in multi-dimensional case
17 Plans and possible improvements
- So far this is mostly a programming exercise
  - Would be interesting to compare the different methods for some real analysis distributions
  - But YMMV
- Add common tools, useful for all algorithms
  - Inputs and results in different formats
    - already supports histograms and ROOT vectors/matrices
  - Automatic calculation of figures of merit (e.g. χ²)
    - can also use standard ROOT functions on histograms
  - Simplify selection of regularisation parameter
- More algorithms?
  - Maximum entropy regularisation
  - Simple matrix inversion without regularisation
    - perhaps useful with large statistics
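As an example of a figure of merit, a χ² between an unfolded histogram and a reference such as the MC truth could look like the sketch below. It uses only per-bin errors and ignores bin-to-bin correlations, which is a significant simplification for unfolded results; a full treatment would use the inverse covariance matrix. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Chi-squared between an unfolded histogram and a reference (e.g. MC truth),
// using only the per-bin errors. Correlations between bins are ignored.
double chiSquaredDiagonal(const std::vector<double>& unfolded,
                          const std::vector<double>& errors,
                          const std::vector<double>& reference) {
    assert(unfolded.size() == errors.size());
    assert(unfolded.size() == reference.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < unfolded.size(); ++i) {
        if (errors[i] <= 0.0) continue;  // skip bins with no usable error
        const double pull = (unfolded[i] - reference[i]) / errors[i];
        chi2 += pull * pull;
    }
    return chi2;
}
```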
18 References - Overview
- G. Cowan, A Survey of Unfolding Methods for Particle Physics, Proc. Advanced Statistical Techniques in Particle Physics, Durham (2002)
  - http://www.ippp.dur.ac.uk/Workshops/02/statistics/
- G. Cowan, Statistical Data Analysis, Oxford University Press (1998), Chapter 11: Unfolding
- R. Barlow, SLUO Lectures on Numerical Methods in HEP (2000), Lecture 9: Unfolding
  - www-group.slac.stanford.edu/sluo/Lectures/Stat_Lectures.html
19 References - Techniques
- V. Blobel, Unfolding Methods in High Energy Physics, DESY 84-118 (1984); also CERN 85-02
- A. Höcker and V. Kartvelishvili, SVD Approach to Data Unfolding, NIM A 372 (1996) 469
  - www.lancs.ac.uk/depts/physics/staff/kartvelishvili.html
- K. Tackmann, H. Lacker, Unfolding the Hadronic Mass Spectrum in B → Xu ℓ ν Decays, BAD 894
- G. D'Agostini, A multidimensional unfolding method based on Bayes' theorem, NIM A 362 (1995) 487