Title: G. Cowan
1 Statistical Issues for Higgs Search
ATLAS Statistics Forum, CERN, 16 April 2007
Glen Cowan, Physics Department, Royal Holloway, University of London
g.cowan_at_rhul.ac.uk
www.pp.rhul.ac.uk/cowan
2 Outline
1 General framework
2 Histogram based analysis
3 LEP-style analysis
4 Fit method
5 Systematic uncertainties
6 Thoughts on Feldman-Cousins limits
3 Initial thoughts
PHYSTAT papers, LEP, FNAL, ATLAS notes etc.
already contain a lot of well worked out material
on statistics for LHC searches. Much of the draft
note I posted just summarizes well-known things
(→ skim quickly). But many areas are still not
completely clear (to me) and important choices
remain to be made.
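4 General framework
Assume N channels; the data from each are sets of
numbers $x_i$, $i = 1, \ldots, N$.
Joint set of all data: $x = (x_1, \ldots, x_N)$. Joint pdf for full
experiment:
$$f(x; \theta) = \prod_{i=1}^{N} f_i(x_i; \theta)$$
(if all channels statistically independent).
$\theta = (m, \lambda)$ is a set of parameters;
$m = m_H$ is the parameter of interest, $\lambda$ are
nuisance parameters.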
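5 Test of hypothesized mass m
The likelihood function is the joint pdf evaluated with the observed data
(nuisance parameters suppressed for now),
$$L(m) = f(x; m)$$
Define likelihood ratio
$$\lambda(m) = \frac{L(m)}{L(\hat{m})}$$
where $\hat{m}$ is the value of m that maximizes L, so $0 \le \lambda(m) \le 1$.
Can use this to construct a test of the
hypothesized value m (and then do this for all
m). Take critical region for test (region with
low compatibility with the hypothesis) to
correspond to low values of $\lambda(m)$. Set size of
critical region such that the probability for data to
be there under the null hypothesis is $\alpha$ (significance
level of test). If data fall in critical region,
reject the hypothesis m.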
6 Confidence interval from test
Now carry out the test for all m. The set of
values not rejected at significance $\alpha$ is a
confidence interval at confidence level
$1 - \alpha$. Often e.g. from a lower limit $m_{\rm lo}$ to $\infty$.
7 Discovery, p-values
To discover the Higgs, try to reject the
background-only (null) hypothesis ($H_0$). Define a
statistic t whose value reflects compatibility of
data with $H_0$.
p-value = Prob(data with equal or lesser
compatibility with $H_0$ when
compared to the data we got | $H_0$)
For example, if high values of t mean less
compatibility,
$$p = \int_{t_{\rm obs}}^{\infty} f(t \mid H_0)\, dt$$
If p-value comes out small, then this is evidence
against the background-only hypothesis →
discovery made!
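A minimal sketch of the p-value definition above, assuming (for illustration only, not the analysis in these slides) a single counting experiment in which the statistic t is the observed number of events n and $H_0$ predicts a Poisson mean b. The function names and the numbers are hypothetical.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability P(n; mu) for integer n >= 0 and mean mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def p_value_counting(n_obs, b):
    """p-value of the background-only hypothesis: P(n >= n_obs | b)."""
    # One minus the probability of observing strictly fewer than n_obs events.
    return 1.0 - sum(poisson_pmf(n, b) for n in range(n_obs))

# Hypothetical numbers: b = 3.0 expected background events, 12 observed.
p = p_value_counting(12, 3.0)
print(f"p-value = {p:.3e}")
```

Here high n means less compatibility with $H_0$, so the p-value is the upper tail of the Poisson distribution starting at the observed count.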
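8 Significance from p-value
Define significance Z as the number of standard
deviations that a Gaussian variable would
fluctuate in one direction to give the same
p-value:
$$p = \int_{Z}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx = 1 - \Phi(Z)$$
$$Z = \Phi^{-1}(1 - p)$$
ROOT functions: TMath::Prob, TMath::NormQuantile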
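The conversion between p-value and significance can be sketched with the Python standard library (an assumption made here for a self-contained example; it stands in for the ROOT functions named above). The helper names `z_from_p` and `p_from_z` are hypothetical.

```python
from statistics import NormalDist

_std_normal = NormalDist()  # standard Gaussian: mean 0, standard deviation 1

def z_from_p(p):
    """Significance Z for a one-sided p-value: Z = Phi^-1(1 - p)."""
    # Use the symmetry Phi^-1(1 - p) = -Phi^-1(p) to keep precision for tiny p.
    return -_std_normal.inv_cdf(p)

def p_from_z(z):
    """One-sided p-value for significance Z: p = 1 - Phi(Z)."""
    # By symmetry 1 - Phi(Z) = Phi(-Z), which avoids cancellation near 1.
    return _std_normal.cdf(-z)

print(z_from_p(2.87e-7))  # close to 5
print(p_from_z(5.0))      # close to 2.87e-7
```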
9 When to publish
HEP folklore is to claim discovery when $p = 2.87 \times 10^{-7}$,
corresponding to a significance Z = 5.
This is very subjective and really should
depend on the prior probability of the
phenomenon in question, e.g.,

phenomenon              reasonable p-value for discovery
$D^0\bar{D}^0$ mixing   0.05
Higgs                   $10^{-7}$ (?)
Life on Mars            $10^{-10}$
Astrology               $10^{-20}$

Note some groups have defined $5\sigma$ to refer to a
two-sided fluctuation, i.e., $p = 5.7 \times 10^{-7}$.
10 Likelihood ratio as test statistic
Take as test statistic
$$q(m) = -2 \ln \lambda(m)$$
Sampling distribution for q(m) depends on
hypothesized mass. We need e.g.
$f(q(m) \mid m')$ for $m' = m$
(signal plus background)
and
$f(q(m) \mid \text{background only})$.
Assume for now that these pdfs can be determined
with MC and clever tricks.
11 Histogram-based analysis
Unlike LEP, expect lots of background, so put data
in a histogram.
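12 Histogram-based analysis (2)
Assume $n_i \sim \text{Poisson}(s_i + b_i)$, where $s_i = s_i(m)$ and $b_i$ are the expected signal and background in bin i, so the likelihood
is
$$L(m) = \prod_{i} \frac{(s_i + b_i)^{n_i}}{n_i!}\, e^{-(s_i + b_i)}$$
or the log-likelihood (up to a constant),
$$\ln L(m) = \sum_{i} \left[ n_i \ln(s_i + b_i) - (s_i + b_i) \right]$$
For N independent channels this becomes
$$\ln L(m) = \sum_{j=1}^{N} \sum_{i} \left[ n_{ji} \ln(s_{ji} + b_{ji}) - (s_{ji} + b_{ji}) \right]$$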
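A minimal sketch of the binned Poisson log-likelihood described above, with the constant term $-\sum_i \ln n_i!$ dropped. The function names, the data layout (lists of per-bin values) and the numbers are hypothetical; in a real analysis the expected signal per bin would come from a model depending on m.

```python
import math

def log_likelihood(n, s, b):
    """ln L = sum_i [n_i ln(s_i + b_i) - (s_i + b_i)], dropping -ln(n_i!)."""
    return sum(ni * math.log(si + bi) - (si + bi)
               for ni, si, bi in zip(n, s, b))

def log_likelihood_channels(channels):
    """Sum of ln L over independent channels; each channel is a tuple (n, s, b)."""
    return sum(log_likelihood(n, s, b) for n, s, b in channels)

# Hypothetical three-bin histograms for two channels.
chan_a = ([12, 15, 9], [1.0, 4.0, 1.0], [10.0, 10.0, 10.0])
chan_b = ([3, 7, 4], [0.5, 2.0, 0.5], [4.0, 4.0, 4.0])
print(log_likelihood_channels([chan_a, chan_b]))
```

Because the channels are independent, their likelihoods multiply and the log-likelihoods simply add.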
13 Histogram-based analysis (3)
From the likelihood construct as before
$$\lambda(m) = \frac{L(m)}{L(\hat{m})}, \qquad q(m) = -2 \ln \lambda(m)$$
This is used to construct tests and intervals as
before.
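14 LEP-style analysis: CLb
Same basic idea: L(m) → λ(m) → q(m) → test of m,
etc.
For a chosen m, find the p-value of the background-only
hypothesis from the sampling distribution of q(m) under background only;
in the LEP notation this is $1 - CL_b$.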
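15 LEP-style analysis: CLs+b
Normal way to get interval would be to reject
hypothesized m if
$$CL_{s+b} < \alpha$$
where $CL_{s+b}$ is the p-value of the signal-plus-background hypothesis with mass m.
By construction this interval will cover the true
value of m with probability $1 - \alpha$.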
16 LEP-style analysis: CLs
The problem with the CLs+b method is that for
high m, the distribution of q approaches that of
the background-only hypothesis.
So a low fluctuation in the number of background
events can give $CL_{s+b} < \alpha$. This rejects a high m
value even though we are not sensitive to Higgs
production with that mass; the reason was a
low fluctuation in the background.
17 CLs
A solution is to define
$$CL_s = \frac{CL_{s+b}}{CL_b}$$
and reject the hypothesized m if
$$CL_s < \alpha$$
Since $CL_b \le 1$, one has $CL_s \ge CL_{s+b}$, so fewer values of m are rejected.
So the CLs intervals over-cover; they are
conservative.
This method avoids the unwanted exclusion of high
masses, but it is not obvious to me that there is
not a better way, i.e., intervals that have
correct (or close) coverage but are on
average more stringent. I want to think about
this more.
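The effect can be sketched for a one-bin counting experiment (an assumption made here for illustration; the slides use the full likelihood-ratio statistic q). In this simplified case fewer events means less signal-like, so $CL_{s+b} = P(n \le n_{\rm obs} \mid s + b)$ and $CL_b = P(n \le n_{\rm obs} \mid b)$. Function names and numbers are hypothetical.

```python
import math

def poisson_cdf(n_obs, mu):
    """P(n <= n_obs) for a Poisson variable with mean mu > 0."""
    return sum(math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))
               for n in range(n_obs + 1))

def cls_counting(n_obs, s, b):
    """Return (CL_sb, CL_b, CL_s) for a one-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, s + b)  # P(n <= n_obs | s + b)
    cl_b = poisson_cdf(n_obs, b)       # P(n <= n_obs | b)
    return cl_sb, cl_b, cl_sb / cl_b

# Hypothetical case: tiny expected signal, downward background fluctuation.
cl_sb, cl_b, cl_s = cls_counting(n_obs=1, s=0.5, b=5.0)
alpha = 0.05
print(f"CL_sb = {cl_sb:.4f}, excluded by CL_sb: {cl_sb < alpha}")
print(f"CL_s  = {cl_s:.4f}, excluded by CL_s:  {cl_s < alpha}")
```

With these numbers the signal hypothesis is rejected by the CLs+b criterion even though s is far too small to be seen, while the CLs criterion does not reject it.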
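18 Fit method
Treat m and s as independent parameters (not
related à la SM).
Maximize L with respect to m and s, giving $L(\hat{m}, \hat{s})$.
Now consider the background-only hypothesis, i.e., s
= 0 (m doesn't enter), giving $L(s = 0)$.
Define a test statistic from the ratio of these, e.g.
$$q = -2 \ln \frac{L(s = 0)}{L(\hat{m}, \hat{s})}$$
and find its pdf.
Use this to get p-values, limits (regions in the (m, s)
plane) as before.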
19 Systematics
Response of measurement apparatus is never
modelled perfectly.
[Figure: y (measured value) vs. x (true value), showing the model and the truth]
Model can be made to approximate better the truth
by including more free parameters.
systematic uncertainty ↔ nuisance parameters
20 Nuisance parameters
Techniques for treating nuisance parameters
discussed at recent PHYSTAT meetings (Cranmer,
Cousins, Reid, ...). Here consider two
methods:
Profile likelihood
Modified profile likelihood (essentially Cousins-Highland)
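21 Profile likelihood
Suppose the likelihood contains a parameter of
interest, m, and some number of nuisance
parameters $\lambda$. Define the profile likelihood as
$$L_p(m) = L(m, \hat{\hat{\lambda}}(m))$$
where $\hat{\hat{\lambda}}(m)$ is the value of $\lambda$ that maximizes L for the given m.
Using this construct
$$\lambda(m) = \frac{L_p(m)}{L_p(\hat{m})} = \frac{L(m, \hat{\hat{\lambda}}(m))}{L(\hat{m}, \hat{\lambda})}$$
and construct p-values, intervals, etc. as
before. See e.g. 2003 and 2005 PHYSTAT papers by
Kyle Cranmer.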
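A minimal sketch of a profile likelihood ratio, under an assumed toy model that is not taken from these slides: a signal-region count n ~ Poisson(s + b) and a control-region count n_ctrl ~ Poisson(tau b), with s playing the role of the parameter of interest and b the nuisance parameter. All names (`log_l`, `b_hat_hat`, `q_profile`, `tau`) and the numbers are hypothetical; the sketch requires n_ctrl > 0.

```python
import math

def log_l(s, b, n, n_ctrl, tau):
    """ln L for n ~ Poisson(s + b), n_ctrl ~ Poisson(tau * b), constants dropped."""
    return (n * math.log(s + b) - (s + b)
            + n_ctrl * math.log(tau * b) - tau * b)

def b_hat_hat(s, n, n_ctrl, tau):
    """Conditional ML estimate of b for fixed s (positive root of d lnL / db = 0)."""
    a = 1.0 + tau
    c = a * s - n - n_ctrl
    return (-c + math.sqrt(c * c + 4.0 * a * n_ctrl * s)) / (2.0 * a)

def q_profile(s, n, n_ctrl, tau):
    """q(s) = -2 ln [ L(s, b_hat_hat(s)) / L(s_hat, b_hat) ]."""
    s_hat = max(0.0, n - n_ctrl / tau)        # global ML estimate of s, kept >= 0
    b_hat = b_hat_hat(s_hat, n, n_ctrl, tau)  # corresponding estimate of b
    ln_num = log_l(s, b_hat_hat(s, n, n_ctrl, tau), n, n_ctrl, tau)
    ln_den = log_l(s_hat, b_hat, n, n_ctrl, tau)
    return -2.0 * (ln_num - ln_den)

# Hypothetical data: 25 events in the signal region, 30 in a control region
# that is 3 times larger (tau = 3), so b_hat = 10 and s_hat = 15.
q0 = q_profile(0.0, n=25, n_ctrl=30, tau=3.0)
print(f"q(s = 0) = {q0:.2f}")
```

For this model the conditional estimate of b has a closed form, so no numerical minimizer is needed; with more nuisance parameters one would typically maximize numerically instead.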
22 Modified profile likelihood
Treat $\lambda$ as random in Bayesian sense, i.e. having
a prior $\pi(\lambda)$
(e.g. based on other measurements).
Define modified profile likelihood
$$L'(m) = \int L(m, \lambda)\, \pi(\lambda)\, d\lambda$$
Use this to find (modified profile) likelihood
ratio, determine tests, p-values, intervals, etc.
Equivalent to having Nature repeat the experiment
by resampling each time $\lambda$ from $\pi(\lambda)$, and is
essentially (I believe) the prior predictive
ensemble approach used by CDF.
23 Modified profile likelihood (2)
This approach effectively averages over p-values,
which is essentially the Cousins-Highland
method. Kyle Cranmer has pointed out that the
intervals derived from this approach undercover,
i.e., one would need more data to exclude the
background-only hypothesis than otherwise
needed. This issue needs to be understood in
detail.
24 Extra slides
25 Likelihood ratio limits (Feldman-Cousins)
Define likelihood ratio for hypothesized
parameter value s:
$$\lambda(s) = \frac{L(s)}{L(\hat{s})}$$
Here $\hat{s}$ is the ML estimator; note $0 \le \lambda(s) \le 1$.
Critical region defined by low values of
likelihood ratio. Resulting intervals can be one-
or two-sided (depending on n).
(Re)discovered for HEP by Feldman and Cousins,
Phys. Rev. D 57 (1998) 3873.
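A minimal sketch of the likelihood-ratio ordering for a Poisson count n with known background b, with the ML estimator restricted to $s \ge 0$. It is a plain grid scan written for illustration (function names, grid step and scan range are assumptions) and omits any refinements used in published tables.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability P(n; mu), including the mu = 0 limit."""
    if mu == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def fc_accepts(n_obs, s, b, cl, n_max=60):
    """True if n_obs is in the likelihood-ratio-ordered acceptance region for s."""
    ranked = []
    for n in range(n_max + 1):
        p = poisson_pmf(n, s + b)
        s_hat = max(0.0, n - b)                # ML estimator restricted to s >= 0
        ratio = p / poisson_pmf(n, s_hat + b)  # likelihood ratio used for ordering
        ranked.append((ratio, n, p))
    ranked.sort(reverse=True)                  # highest likelihood ratio first
    total = 0.0
    for ratio, n, p in ranked:
        total += p                             # add n to the acceptance region
        if n == n_obs:
            return True
        if total >= cl:                        # region complete without n_obs
            return False
    return False

def fc_interval(n_obs, b, cl=0.90, s_max=12.0, step=0.01):
    """Scan s on a grid; return the lowest and highest accepted values."""
    grid = [i * step for i in range(int(round(s_max / step)) + 1)]
    accepted = [s for s in grid if fc_accepts(n_obs, s, b, cl)]
    return accepted[0], accepted[-1]

lo0, hi0 = fc_interval(0, 0.0)
lo4, hi4 = fc_interval(4, 3.0)
print(f"n = 0, b = 0: [{lo0:.2f}, {hi0:.2f}]")
print(f"n = 4, b = 3: [{lo4:.2f}, {hi4:.2f}]")
```

Whether the resulting interval is one-sided (lower edge at s = 0) or two-sided comes out of the construction itself and depends on the observed n.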
26 More on intervals from LR test (Feldman-Cousins)
Caveat with coverage: suppose we find $n \gg b$.
Usually one then quotes a measurement:
$$\hat{s} = n - b, \qquad \hat{\sigma}_{\hat{s}} = \sqrt{n}$$
If, however, n isn't large enough to claim
discovery, one sets a limit on s. FC pointed out
that if this decision is made based on n,
then the actual coverage probability of the
interval can be less than the stated confidence
level ("flip-flopping"). FC intervals remove
this, providing a smooth transition from 1- to
2-sided intervals, depending on n. But, suppose
FC gives e.g. 0.1 < s < 5 at 90% CL; the p-value of
s = 0 is still substantial. Part of upper limit
"wasted"?
27 Properties of upper limits
Example: take b = 5.0 and confidence level 0.95.
[Figures: upper limit $s_{\rm up}$ vs. n; mean upper limit vs. s]
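The dependence of the upper limit on n for this example can be sketched with the classical recipe, i.e. solving $P(n \le n_{\rm obs} \mid s_{\rm up} + b) = 1 - \text{CL}$ by bisection. This is an illustrative assumption about which limit is meant, and the function names are hypothetical. The raw solution is returned, so the limit can come out negative for small n.

```python
import math

def poisson_cdf(n_obs, mu):
    """P(n <= n_obs) for a Poisson variable with mean mu > 0."""
    return sum(math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))
               for n in range(n_obs + 1))

def upper_limit(n_obs, b, cl=0.95):
    """Classical upper limit on s: solve P(n <= n_obs | s_up + b) = 1 - cl."""
    alpha = 1.0 - cl
    lo, hi = 1e-9, 10.0 + 5.0 * (n_obs + 1)  # bracket for the total mean s + b
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > alpha:   # the cdf decreases as the mean grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - b                # can be negative for small n_obs

for n in range(0, 11, 2):
    print(n, round(upper_limit(n, b=5.0), 2))
```

For n = 0 the solution for the total mean is $-\ln 0.05 \approx 3.0$, so subtracting b = 5 gives a negative upper limit on s.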
28 Upper limit versus b
Feldman & Cousins, PRD 57 (1998) 3873
[Figure: upper limit vs. b]
If n = 0 observed, should upper limit depend on
b? Classical: yes. Bayesian: no. FC: yes.
29 Coverage probability of confidence intervals
Because of discreteness of Poisson data,
probability for interval to include true value is in
general > confidence level ("over-coverage").
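Over-coverage can be checked numerically. The sketch below assumes the classical Poisson upper limit with known background b (an illustrative choice; function names are hypothetical) and sums the probabilities of all n for which the true s is not rejected.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability P(n; mu) for mean mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def poisson_cdf(n_obs, mu):
    """P(n <= n_obs) for a Poisson variable with mean mu > 0."""
    return sum(poisson_pmf(n, mu) for n in range(n_obs + 1))

def covers(n, s, b, cl):
    """True if s lies below the classical upper limit obtained from n."""
    return poisson_cdf(n, s + b) > 1.0 - cl   # s is not rejected by the test

def coverage(s, b, cl=0.95, n_max=60):
    """Probability that the interval contains the true s."""
    return sum(poisson_pmf(n, s + b)
               for n in range(n_max + 1) if covers(n, s, b, cl))

for s_true in (0.5, 1.0, 2.0, 5.0):
    print(s_true, round(coverage(s_true, b=5.0), 4))
```

The coverage jumps as s changes because n takes only integer values; it never drops below the stated confidence level for this construction.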
30 Discussion on limits
Different sorts of limits answer different
questions. A frequentist confidence interval
does not (necessarily) answer, "What do we
believe the parameter's value is?"
Coverage: nice, but crucial?
Look at sensitivity, e.g., $E[s_{\rm up} \mid s = 0]$.
Consider also politics, need
for consensus/conventions, convenience and
ability to combine results, ...
For any result,
consumer will compute (mentally or otherwise)
$$p(\theta \mid \text{result}) \propto L(\text{result} \mid \theta)\, \pi(\theta)$$
where $\pi(\theta)$ is the consumer's prior.
Need likelihood (or summary thereof).
31 Cousins-Highland method
Regard b as random, characterized by pdf
$\pi_b(b)$. Makes sense in Bayesian approach, but in
frequentist model b is constant (although
unknown). A measurement $b_{\rm meas}$ is random but this
is not the mean number of background events,
rather, b is. Compute anyway
$$P(n; s) = \int P(n; s, b)\, \pi_b(b)\, db$$
This would be the probability for n if Nature
were to generate a new value of b upon repetition
of the experiment according to $\pi_b(b)$. Now e.g. use this
P(n; s) in the classical recipe for upper limit at
CL = $1 - \beta$:
$$\beta = P(n \le n_{\rm obs}; s_{\rm up})$$
Result has hybrid Bayesian/frequentist character.
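A minimal sketch of the smeared probability P(n; s), assuming a Gaussian $\pi_b(b)$ truncated at $b \ge 0$ and a simple midpoint-rule integration. The Gaussian form, the function names and the numbers are assumptions made for illustration, not choices stated in these slides.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability P(n; mu) for mean mu > 0."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def smeared_pmf(n, s, b_mean, sigma_b, steps=400):
    """P(n; s) = integral of Poisson(n; s + b) pi_b(b) db (midpoint rule)."""
    lo = max(0.0, b_mean - 6.0 * sigma_b)   # truncate the prior at b >= 0
    hi = b_mean + 6.0 * sigma_b
    width = (hi - lo) / steps
    num = norm = 0.0
    for i in range(steps):
        b = lo + (i + 0.5) * width          # midpoint of the i-th slice
        w = math.exp(-0.5 * ((b - b_mean) / sigma_b) ** 2)  # unnormalized Gaussian
        num += w * poisson_pmf(n, s + b)
        norm += w
    return num / norm                        # normalize the truncated prior

# Hypothetical numbers: s = 2, background 5.0 +/- 1.0.
for n in (3, 7, 12):
    print(n, round(smeared_pmf(n, 2.0, 5.0, 1.0), 4),
          round(poisson_pmf(n, 7.0), 4))
```

The smeared distribution is broader than the Poisson distribution with fixed b = 5, which is what loosens the resulting upper limit.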
32 Integrated likelihoods
Consider again signal s and background b, suppose
we have uncertainty in b characterized by a prior
pdf $\pi_b(b)$. Define integrated likelihood as
$$L'(s) = \int L(s, b)\, \pi_b(b)\, db$$
also called modified profile likelihood, in any
case not a real likelihood.
Now use this to construct likelihood ratio test
and invert to obtain confidence intervals.
Feldman-Cousins + Cousins-Highland (FHC2), see
e.g. J. Conrad et al., Phys. Rev. D67 (2003)
012002 and Conrad/Tegenfeldt PHYSTAT05
talk. Calculators available (Conrad, Tegenfeldt,
Barlow).
33 Interval from inverting profile LR test
Suppose we have a measurement $b_{\rm meas}$ of b. Build
the likelihood ratio test with profile
likelihood:
$$\lambda(s) = \frac{L(s, \hat{\hat{b}}(s))}{L(\hat{s}, \hat{b})}$$
and use this to construct confidence
intervals. See PHYSTAT05 talks by Cranmer,
Feldman, Cousins, Reid.