Title: Difficulties in Limit setting and the Strong Confidence approach
1 Difficulties in Limit setting and the Strong Confidence approach
- Giovanni Punzi
- SNS and INFN - Pisa
- Advanced Statistical Techniques in Particle Physics - Durham, 18-22 March 2002
2 Outline
- Motivations for a Strong CL
- Summary of properties of Strong CL
- Some examples
- Limits in presence of systematic uncertainties.
3 Motivation
- The set of Neyman's bands is large, and contains all sorts of inferences, like:
- "I bought a lottery ticket. If I win, I will conclude that donkeys can fly @ 99.9999% CL."
- I want to get rid of those, but keep being frequentist.
4 Why should you care?
- Wrong reason: to make the CL look more like p(hypothesis | data).
- Right reason: you don't want to have to quote a conclusion you know is bad. If you think harder, you can do better.
- You are drawing conclusions based on irrelevant facts (like a bad fit).
- As a consequence, you are not making the best use of the information you have.
- Your results are counter-intuitive and convey little information.
- You must make sure your conclusions do not depend on irrelevant information.
5 SOLUTION: Impose a form of Likelihood Principle
- Take any two experiments whose pdfs are equal, apart from a multiplicative constant, for some subset c of observable values of x. Any valid Confidence Limits you can derive in one experiment from observing x in c must also be valid for the other experiment.
- If you require the Limits to be uniquely determined, there is no solution.
6 RESULT
[Figure labels: Non-coverage land; Neyman's CL bands; Strong bands]
Surprise: a solution exists, and gives for any experiment a well-defined, unique subset of Confidence Bands.
7 Construction of CL bands
[Figure panels: Regular; Strong]
8 Strong CL vs. standard CL
- A new parameter emerges: sCL. Every valid band @ xx% sCL is also a valid band @ xx% CL.
- You can check sCL for a band built in any other way.
- The sCL requirement effectively amounts to re-applying the usual Neyman's condition locally on every subsample of possible results. This ensures uniform treatment of all experimental results, but in a frequentist way.
- The Strong Band definition is not an ordering algorithm, and the answer is still not unique. You may need to add an ordering to obtain a unique solution.
9 Strong CL
[Equations: Neyman condition; Strong CL condition]
(CR(x) is the accepted region for µ given the observation of x; c is an arbitrary subset of x space.)
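The displayed conditions for this slide are not present in the text. As a hedged sketch only, assuming the notation of the parenthetical above (CR(x), c) and assuming the strong condition takes the form given in hep-ex/9912048 (to be checked against that paper), they can be written as:

```latex
% Neyman condition: coverage for every value of the parameter
P\big(\mu \in CR(x) \,\big|\, \mu\big) \;\ge\; CL
\qquad \forall \mu

% Strong CL condition (assumed form): the requirement is applied to every subset c of x space
P\big(\mu \notin CR(x),\; x \in c \,\big|\, \mu\big)
\;\le\; (1 - sCL)\, \sup_{\mu'} P\big(x \in c \,\big|\, \mu'\big)
\qquad \forall \mu,\ \forall c
```

Taking c to be the whole x space makes the supremum equal to 1 and gives back the Neyman condition with CL = sCL, which is consistent with every band valid at a given sCL also being valid at the same CL.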
- It is similar to conditioning, a standard practice in modern frequentist statistics.
- There is a long history of attempts to modify frequentist theory by utilizing some form of conditioning. Earlier works are summarized in Kiefer (1977), Berger and Wolpert (1988) ... Kiefer (1977) formally established the conditional confidence approach
- The first point to stress is the unreasonable nature of the unconditional test ... the unconditional test is arguably the worst possible frequentist test ... it is in some sense true that, the more one can condition, the better
- It is sometimes argued that conditioning on non-ancillary statistics will lose information, but nothing loses as much information as use of unconditional testing (J. Berger)
10 Summary of sCL properties
(see CLW proceedings and hep-ex/9912048)
- 100% frequentist, completely general.
- The only frequentist method complying with the Likelihood Principle
- Invariant for any change of variables
- No empty regions, in full generality
- No unlucky results, no need for quoting additional information on sensitivity. No pathologies.
- Robust for small changes of pdf
- More information gives tighter limits
- Easier incorporation of systematics
- Price tag:
- Overcoverage
- Heavy computation
11 Invariance for change of the observable
- All classical bands are invariant for change of variable in the parameter (unlike Bayesian limits).
- The CL definition is invariant for change of variable in the observable, too. But most rules for constructing bands break this invariance!
- Strong CL is also invariant for any change of variable.
- Likelihood Ratio is also invariant (non-advertised property?), so it is a natural choice of ordering to select a unique Strong Band.
12 Effect of changing variables
[Figure labels: Non-coverage land; Neyman's CL bands; Strong bands; LR-ordered bands]
13 Poisson + background
[Figure: upper limit @ 90% CL for n=0 versus background; curves: sCL 90%, or R.-W.; LR-ordering]
- The upper limit on µ decreases with expected background in all unconditioned approaches.
- Often criticized on the basis that for n=0 the value of b should be irrelevant.
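The dependence on b criticized above can be seen in a minimal Python sketch. It computes a plain one-sided Neyman upper limit for a Poisson count with known background. This is an illustrative stand-in for an unconditioned approach, not the LR-ordering curve of the figure, and the function names and the search bound `mu_max` are hypothetical.

```python
import math

def poisson_cdf(n, lam):
    """P(N <= n) for a Poisson variable with mean lam."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n + 1))

def classical_upper_limit(n_obs, b, cl=0.90, mu_max=50.0):
    """One-sided (unconditioned) Neyman upper limit on the signal mean mu.

    Returns the mu at which P(N <= n_obs | mu + b) drops to 1 - cl,
    or 0.0 when even mu = 0 is already excluded.
    """
    alpha = 1.0 - cl
    if poisson_cdf(n_obs, b) <= alpha:
        return 0.0
    lo, hi = 0.0, mu_max
    for _ in range(100):  # bisection: the CDF decreases monotonically with mu
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + b) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for b in (0.0, 0.5, 1.0, 2.0):
        print(f"b = {b:3.1f}  upper limit for n=0: {classical_upper_limit(0, b):.3f}")
```

For n=0 this limit is ln(10) - b until it reaches zero, so a larger expected background gives a tighter limit even though no event was seen.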
14 Behavior when new observables are added
- Do you expect limits to improve when you add extra information?
- A simple example shows that neither PO nor LRO has this property (conjecture: no ordering algorithm has it!)
- Example: comparing a signal level with gaussian noise with some fixed thresholds
- Problem: the limit loosens dramatically when adding an extra threshold measurement.
15 Example
[Figure panels: L(µ); LR(µ)]
- Unknown electrical level µ plus gaussian noise (σ = 1). Limited to µ < 0.5.
- Compare with a fixed threshold (2.5σ), get a (0,1) response.
- Observe > threshold:
- PO: empty region @ 90% CL
- LR: 0.49 < µ < 0.50 @ 90% CL
- sCL: -0.34 < µ < 0.50 @ 90% sCL
- N.B. you MUST overcover unless you want an empty region.
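The three results above can be checked with a short Python sketch. It rests on assumptions not stated on the slide: µ has no lower bound (so the largest probability of a 0 response is 1), and the Strong CL condition has the form P(µ excluded, x in c | µ) <= (1 - sCL) sup P(x in c | µ'), as understood from hep-ex/9912048. All names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
THRESHOLD = 2.5   # threshold position, in units of sigma
MU_MAX = 0.5      # upper edge of the allowed range of mu
CL = 0.90

def p_above(mu):
    """P(response = 1 | mu): level mu plus unit gaussian noise exceeds the threshold."""
    return 1.0 - STD_NORMAL.cdf(THRESHOLD - mu)

def mu_for_p_above(p):
    """Inverse of p_above: the mu at which the response is 1 with probability p."""
    return THRESHOLD - STD_NORMAL.inv_cdf(1.0 - p)

p_max = p_above(MU_MAX)   # largest probability of ever seeing a 1

# Probability ordering: for every mu the outcome 0 is the most probable one and
# already has probability >= CL, so 1 never enters an acceptance region.
po_region_is_empty = (1.0 - p_max) >= CL

# LR ordering: outcome 1 is ranked first only where
#   p_above(mu) / p_max  >  1 - p_above(mu)
# (assumption: sup of P(0 | mu) is 1, reached for mu -> -infinity).
# Where 1 is ranked first its probability is far below CL, so 0 is added too and
# mu is accepted; elsewhere 0 alone covers and observing a 1 rejects mu.
lr_lower = mu_for_p_above(p_max / (1.0 + p_max))

# Strong CL (assumed form): with c = {1}, mu may be excluded only if
#   p_above(mu) <= (1 - sCL) * p_max
scl_lower = mu_for_p_above((1.0 - CL) * p_max)

if __name__ == "__main__":
    print("PO region empty:", po_region_is_empty)
    print("LR lower edge:  ", round(lr_lower, 2))
    print("sCL lower edge: ", round(scl_lower, 2))
```

Under these assumptions the lower edges should land close to the 0.49 and -0.34 quoted above, with the upper edge fixed at 0.50 by the allowed range of µ.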
16 Add another threshold
[Figure panels: LR(µ); L(µ); annotation: 0.27 < µ < 0.5]
- Now, add a second independent threshold measurement at 0: the limit becomes much looser!
- sCL limit is unaffected
- Conjecture: no ordering algorithm can provide a sensible answer in all cases.
17 Observations
- It may be impossible to get sensible results without accepting some overcoverage. Why blame sCL for overcoverage?
- Ordering algorithms alone seem unable to prevent very strange results: the inclusion of additional (irrelevant) information may produce a dramatic worsening of limits.
18 Adding systematics to CL limits
- Problem:
- My pdf p(x|µ) is actually a p(x|µ,ν), where ν is an unknown parameter I don't care about, but it influences my measurement (nuisance)
- I may have some info on ν coming from another measurement y: q(y|ν)
- My problem is:
- p(x,y|µ,ν) = p(x|µ,ν) q(y|ν)
- Many attempts to get rid of ν: three main routes
- Integration/smearing (a la Bayes)
- Maximization (profile Likelihood)
- Projection (strictly classical)
19 Hybrid method: Bayesian Smearing
- 1) Define a new (smeared) pdf p'(x|µ) = ∫ p(x|µ,ν) p(ν|y) dν, where p(ν|y) is obtained through Bayes:
- p(ν|y) = q(y|ν) p(ν) / q(y)
- Need to assume some prior p(ν)
- 2) Use p' to obtain Conf. Limits as usual
- GOOD:
- Simple and fast
- Used in many places
- Intuitively appealing
- BAD:
- Intuitively appealing
- Interpretation: mix of Bayes and Neyman. Output results have neither coverage nor correct Bayesian probability => wasted effort of calculating a rigorous CL
- May undercover
- May exhibit paradoxical tightening of limits
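Step 1 of the smearing recipe can be sketched numerically in Python. The model here is purely illustrative and not from the talk: x is a Poisson count with mean µ + ν, and the posterior of ν is approximated by a Gaussian of mean `nu0` and width `sigma`, truncated at ν >= 0 and evaluated on a grid. All names and numbers are hypothetical.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of observing n counts for mean lam."""
    return math.exp(-lam) * lam**n / math.factorial(n)

def posterior_grid(nu0, sigma, n_points=401, n_sigma=5.0):
    """Grid of nu values with normalized weights approximating a
    Gaussian posterior truncated at nu >= 0 (assumed shape)."""
    lo = max(0.0, nu0 - n_sigma * sigma)
    hi = nu0 + n_sigma * sigma
    step = (hi - lo) / (n_points - 1)
    nus = [lo + i * step for i in range(n_points)]
    weights = [math.exp(-0.5 * ((nu - nu0) / sigma) ** 2) for nu in nus]
    total = sum(weights)
    return nus, [w / total for w in weights]

def smeared_pmf(n, mu, nus, weights):
    """Discrete version of the smearing integral:
    sum over the grid of p(n | mu, nu) times the posterior weight of nu."""
    return sum(w * poisson_pmf(n, mu + nu) for nu, w in zip(nus, weights))

if __name__ == "__main__":
    nus, weights = posterior_grid(nu0=2.0, sigma=0.5)
    for n in range(8):
        print(n, round(smeared_pmf(n, 3.0, nus, weights), 4))
```

Step 2 would then feed `smeared_pmf` into whatever band construction is used for a pdf without nuisance parameters, which is where the mixed Bayes/Neyman interpretation listed under BAD enters.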
20 A simple example: Bayes systematics
[Figure panels: LR(µ) (two panels); annotations: µ > 0.272, µ > 0.294]
- Introduce a systematic uncertainty on the actual position of the 0 threshold. Assume a flat prior in [-1,1].
- Do smearing => get tighter limits!
- No reason to expect a good behavior
21 Approximate classical method: Profile Likelihood
- 1) Define a new (profile) pdf p_prof(x|µ) = p(x,y0|µ,ν_best(µ)), where ν_best(µ) maximizes the value of: a) p(x0,y0|µ,ν_best); b) p(x,y0|µ,ν_best) (ν_best = ν_best(µ,x)!)
- This means maximizing the likelihood wrt the nuisance parameters, for each µ
- 2) Use p_prof to obtain Conf. Limits as usual
- GOOD:
- Reasonably simple and fast
- Approximation of an actual frequentist method
- BAD:
- Flip-flop in case a), non-normalized in case b)!!
- Only approximate for low statistics, which is when you need limits after all.
- You don't know how far off it is unless you explicitly calculate correct limits.
- Systematically undercovers
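The maximization over the nuisance parameter at fixed µ (case a above) can be sketched in Python for an illustrative model that is an assumption of this example, not taken from the talk: x is Poisson with mean µ + ν, and the auxiliary measurement y is Poisson with mean τν. For this model the maximizing ν is the positive root of a quadratic. The sketch assumes x0 > 0 and y0 > 0, and all names are hypothetical.

```python
import math

def log_likelihood(mu, nu, x0, y0, tau):
    """log p(x0, y0 | mu, nu) up to additive constants, for
    x ~ Poisson(mu + nu) and y ~ Poisson(tau * nu)."""
    s, b = mu + nu, tau * nu
    return x0 * math.log(s) - s + y0 * math.log(b) - b

def nu_best(mu, x0, y0, tau):
    """Value of nu that maximizes the likelihood at fixed mu.

    Setting the derivative with respect to nu to zero gives
    (1 + tau) nu^2 + ((1 + tau) mu - x0 - y0) nu - y0 mu = 0.
    """
    a = 1.0 + tau
    half_b = 0.5 * (a * mu - x0 - y0)
    return (-half_b + math.sqrt(half_b**2 + a * y0 * mu)) / a

def profile_log_likelihood(mu, x0, y0, tau):
    """Likelihood evaluated along the curve nu = nu_best(mu)."""
    return log_likelihood(mu, nu_best(mu, x0, y0, tau), x0, y0, tau)

if __name__ == "__main__":
    x0, y0, tau = 5, 4, 2.0
    for mu in (0.0, 1.0, 3.0, 6.0):
        print(mu, round(nu_best(mu, x0, y0, tau), 4),
              round(profile_log_likelihood(mu, x0, y0, tau), 4))
```

Because `nu_best` depends on the observed (x0, y0), the resulting function of µ is a likelihood curve, not a normalized pdf in x, which is the distinction between cases a) and b) above.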
22 Exact Classical Treatment of Systematics in Limits
- 1) Use p(x,y|µ,ν) = p(x|µ,ν) q(y|ν), and consider it as p((x,y) | (µ,ν))
- 2) Evaluate CR in (µ,ν) from the measurement (x0,y0)
- 3) Project on µ space to get rid of uninteresting information on ν
- It is clean and conceptually simple.
- It is well-behaved.
- No issues like Bayesian integrals. Why is it used so rarely?
- 1) It produces overcoverage
- 2) The idea is simple, but computation is heavy. Have to deal with large dimensions
- 3) Results may strongly depend on ordering algorithm, even more than usual.
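The three steps above can be sketched in Python on a coarse grid. Everything specific here is an assumption for illustration: the model (x Poisson with mean µ + ν, y Poisson with mean τν), the choice of probability ordering for the acceptance regions, the truncation of the outcome space at `n_max`, and the grids. The talk does not prescribe any of these.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of observing n counts for mean lam."""
    return math.exp(-lam) * lam**n / math.factorial(n)

def accepts(x0, y0, mu, nu, tau, cl=0.90, n_max=40):
    """True if (x0, y0) lies in the probability-ordered acceptance
    region of the joint pdf at the parameter point (mu, nu)."""
    px = [poisson_pmf(x, mu + nu) for x in range(n_max)]
    py = [poisson_pmf(y, tau * nu) for y in range(n_max)]
    p_obs = px[x0] * py[y0]
    # Total probability of outcomes ranked strictly ahead of the observed one.
    ahead = sum(a * b for a in px for b in py if a * b > p_obs)
    return ahead < cl

def projected_mu_region(x0, y0, tau, mu_grid, nu_grid, cl=0.90):
    """Projection on mu of the confidence region in (mu, nu):
    a value of mu is kept if at least one nu on the grid accepts the data."""
    return [mu for mu in mu_grid
            if any(accepts(x0, y0, mu, nu, tau, cl) for nu in nu_grid)]

if __name__ == "__main__":
    mu_grid = [0.5 * i for i in range(25)]        # 0.0 ... 12.0
    nu_grid = [0.25 * i for i in range(1, 25)]    # 0.25 ... 6.0
    region = projected_mu_region(5, 4, 2.0, mu_grid, nu_grid)
    print("projected mu region:", region[0], "to", region[-1])
```

The projected region is by construction at least as wide as the slice at any single value of ν, which is the source of the overcoverage listed above, and the nested loops over parameter points and outcomes hint at why the computation becomes heavy in more dimensions.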
23 Profile method
24 Overcoverage
- Projecting on µ effectively widens the CR => overcoverage. BUT:
- You chose to ignore information on ν: you cannot ask Neyman to give it all back to you as information on µ; the two things are just not interchangeable.
- => overcoverage is a natural consequence, not a weakness
- Q: can you find a smaller µ interval that does not undercover? (same situation with discretization)
25 Optimization issue
- You want to stretch out the CR along the ν direction as far as possible.
- BUT:
- The choice of band is constrained by the need to avoid paradoxes (empty regions, and the like)!
- No method on the market allows you to treat µ and ν in a different fashion
- Strong CL allows you to specify µ as the parameters of interest, and to obtain the narrowest µ interval
- The solution does not require constructing a multidimensional region
26 Strong CL Band with systematics
- The solution does not require explicit construction of a multidimensional region
- The narrowest µ interval compatible with Strong CL is readily found.
27 Conclusions
- Strong Confidence bands have all the good properties you may ask for.
- Systematics can be included naturally and rigorously
- They can even be actually evaluated
28 Poisson Example: n=5, b=2, A=0.02±0.006 (gaussian)
[Figure: Strong CL (arbitrary units)]
Upper limit 30% higher than from Bayesian calculations shown by Luc Demortier