Title: Data-driven background determination in 1-lepton SUSY
1 Data-driven background determination in 1-lepton SUSY
- A. Koutsman, F. Koetsveld, W. Verkerke
- 21 April 2008
SUSY CSC Note12 is ready!
2 One-lepton mode SUSY
- Dominant backgrounds
- Top pair (separate 1l, 2l)
- Wjets
- QCD
- Zjets
- TARGET: develop methods to discover/exclude SUSY with 1 fb-1
- GOAL: estimate and understand backgrounds from data
3 Seeking SUSY
- SUSY abundant at high missing ET
- Problem
- SM shape at high missing ET also unknown
- no direct measurement possible (if SUSY exists)
- MC possibly unreliable → data-driven estimation
- Possible solution
- measure at (low missing ET, high MT) and at (high missing ET, low MT); both regions practically SUSY-free from kinematic considerations
- combine info and extrapolate to high missing ET, MT
- SUSY contamination in low missing ET, MT
4 Combined fit method
- Main Idea
- Define a signal region (SUSY rich) and a control region
- Construct a model for each background sample
- Use the combined model for extrapolation
- Explicitly account for SUSY contamination in control region
- Observables
- Missing ET
- MT: transverse mass of missing ET + lepton
- Mtop: invariant mass of the 3-jet system with the highest summed pT
Key issues to understand
- Shape of each SM background
- Amount and shape of SUSY in control region
- Amount of correlations between observables
5 Fitting the background
- In absence of correlations, simple factorizing multi-dimensional models
- E.g. Pttbar(MT,ET,mtop) = P1(MT) · P2(ET) · P3(mtop)
- Combined fit model describing combined background in control region → extrapolate to signal region
- Ptotal(MT,ET,mtop) = Ntt1l · Ptt1l(MT,ET,mtop)
  + Ntt2l · Ptt2l(MT,ET,mtop)
  + Nwnj · Pwnj(MT,ET,mtop)
  + Nsusy · Psusy(MT,ET,mtop) (Ansatz model)
- Ansatz model for SUSY contamination in control region
- For low values of ET, SUSY distribution described by a gentle slope
- For low values of MT, SUSY distribution is flat
N.B. All plots for 1 fb-1
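The combined model on this slide is a sum of yields times factorising pdfs. A minimal Python sketch of that structure follows. The yields are the truth values quoted on the fixed-shape slide; the observable ranges, the helper names (`expo`, `gauss`, `flat`, `factorised`) and every shape parameter are illustrative assumptions, not the shapes used in the note.

```python
import math

def expo(slope, lo, hi):
    """exp(-slope * x), normalised to unit area on [lo, hi]."""
    norm = (math.exp(-slope * lo) - math.exp(-slope * hi)) / slope
    return lambda x: math.exp(-slope * x) / norm if lo <= x <= hi else 0.0

def gauss(mean, sigma, lo, hi):
    """Gaussian, normalised to unit area on [lo, hi]."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))
    norm = sigma * math.sqrt(2.0 * math.pi) * (cdf(hi) - cdf(lo))
    return lambda x: math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm if lo <= x <= hi else 0.0

def flat(lo, hi):
    """Uniform shape on [lo, hi]."""
    return lambda x: 1.0 / (hi - lo) if lo <= x <= hi else 0.0

def factorised(p_mt, p_et, p_mtop):
    """P(MT, ET, mtop) = P1(MT) * P2(ET) * P3(mtop); valid only without correlations."""
    return lambda mt, et, mtop: p_mt(mt) * p_et(et) * p_mtop(mtop)

# Assumed observable ranges in GeV (hypothetical).
MT, ET, MTOP = (0.0, 400.0), (100.0, 600.0), (0.0, 500.0)

# name -> (yield, pdf). All shape parameters below are made up for illustration.
COMPONENTS = {
    "tt1l":  (502.0, factorised(gauss(70, 25, *MT), expo(0.020, *ET), gauss(170, 30, *MTOP))),
    "tt2l":  (70.0,  factorised(expo(0.010, *MT), expo(0.015, *ET), gauss(200, 80, *MTOP))),
    "wjets": (173.0, factorised(gauss(75, 20, *MT), expo(0.018, *ET), gauss(220, 90, *MTOP))),
    # SUSY Ansatz as described on this slide: flat in MT, gentle slope in ET.
    "susy":  (271.0, factorised(flat(*MT), expo(0.004, *ET), gauss(250, 120, *MTOP))),
}

def total_density(mt, et, mtop, components=COMPONENTS):
    """Ptotal = sum_i N_i * P_i: expected events per unit (MT, ET, mtop) volume."""
    return sum(n * pdf(mt, et, mtop) for n, pdf in components.values())
```

Because each component pdf integrates to one, the integral of `total_density` over the full ranges equals the sum of the yields, which is what lets a fit float the yields independently of the shapes.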
6 Structure
- 1) Combined fit with FIXED SHAPES
- Uses most MC information
- Proof of SUSY Ansatz concept
- 2) Combined fit with FLOATING SHAPES
- Ideally float all shape parameters
- 3) Combined fit with EXTRAPOLATION
- Test of complete procedure
N.B. All plots for 1 fb-1
7 Combined fit with FIXED SHAPES
- Include generic SUSY contribution in fit (Gaussian in ET, gentle slope in MT, Landau in mtop) and fit to data with SUSY (SU3) contamination
- Fix shapes from pre-fits to each background component and float only the component fractions
- Combined fit with SUSY on 1 fb-1 of data
[Figure: fit projections on missing ET, MT and Mtop; components: TTbar Semileptonic, TTbar Dileptonic, Wjets, SU3]
Yield    Fit         Truth
Ntt2l    15 ± 30     70
Ntt1l    509 ± 35    502
Nwjets   194 ± 35    173
Nsu3     293 ± 24    271
Next step: does the fit work in an unbiased way?
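To make the Fit/Truth comparison on this slide quantitative, the short Python sketch below computes the pull of each fitted yield. It assumes that the three numbers per row are the fitted value, its uncertainty and the truth value, in that order; the names `RESULTS` and `pull` are introduced here for illustration only.

```python
# (fit value, fit error, truth) per yield, as read from the slide's Fit/Truth table.
RESULTS = {
    "Ntt2l": (15.0, 30.0, 70.0),
    "Ntt1l": (509.0, 35.0, 502.0),
    "Nwjets": (194.0, 35.0, 173.0),
    "Nsu3": (293.0, 24.0, 271.0),
}

def pull(fit, err, truth):
    """Deviation of the fitted yield from truth, in units of the fit error."""
    return (fit - truth) / err

pulls = {name: pull(*row) for name, row in RESULTS.items()}

for name, value in pulls.items():
    print(f"{name:7s} pull = {value:+.2f}")
```

Under this reading all four pulls are below 2 in magnitude. A single fit cannot establish the absence of bias, which is why the next slide turns to toy-MC pull distributions.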
8 Combined fit with FIXED SHAPES
- Fix shapes from pre-fits to each background component and float only the component fractions
- Run fit 1000 times on toy-MC samples drawn from combined background p.d.f. fitted to MC-data
- → pull distributions for the yield of SUSY sample
[Figure: panels for missing ET, MT and Mtop; components: TTbar Semileptonic, TTbar Dileptonic, Wjets, SU3]
Fit is unbiased! All studied SUSY points work using the Ansatz!
Next step: floating shapes
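The toy-MC pull test described above can be sketched in a few lines of Python. This is a deliberately reduced version under stated assumptions: one observable on [0, 1] instead of three, two components with fixed shapes (a falling exponential as background, a flat shape standing in for the signal Ansatz), and a fitted signal fraction instead of yields. The slope `K`, the function names and the default of 200 toys (the slides use 1000) are all choices made here for illustration.

```python
import math
import random

K = 3.0  # assumed slope of the background shape (hypothetical)

def bkg_pdf(x):
    """Falling exponential, normalised on [0, 1]."""
    return K * math.exp(-K * x) / (1.0 - math.exp(-K))

def sig_pdf(x):
    """Flat shape on [0, 1]."""
    return 1.0

def generate(n, f_sig, rng):
    """Draw n events from f_sig * sig_pdf + (1 - f_sig) * bkg_pdf."""
    events = []
    for _ in range(n):
        u = rng.random()
        if rng.random() < f_sig:
            events.append(u)  # uniform signal
        else:
            # inverse-CDF sampling of the truncated exponential
            events.append(-math.log(1.0 - u * (1.0 - math.exp(-K))) / K)
    return events

def fit_fraction(events):
    """Maximum-likelihood signal fraction and its error, with shapes fixed."""
    terms = [(sig_pdf(x) - bkg_pdf(x), bkg_pdf(x)) for x in events]

    def dlogl(f):  # derivative of the log-likelihood with respect to f
        return sum(d / (b + f * d) for d, b in terms)

    lo, hi = 0.0, 1.0
    for _ in range(40):  # bisection works because log L is concave in f
        mid = 0.5 * (lo + hi)
        if dlogl(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    f_hat = 0.5 * (lo + hi)
    curvature = sum((d / (b + f_hat * d)) ** 2 for d, b in terms)
    return f_hat, 1.0 / math.sqrt(curvature)

def pull_study(n_toys=200, n_events=400, f_true=0.3, seed=1):
    """Return mean and width of the pull distribution over n_toys toy fits."""
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        f_hat, err = fit_fraction(generate(n_events, f_true, rng))
        pulls.append((f_hat - f_true) / err)
    mean = sum(pulls) / len(pulls)
    width = math.sqrt(sum((p - mean) ** 2 for p in pulls) / (len(pulls) - 1))
    return mean, width
```

An unbiased fit with correctly estimated errors gives a pull distribution with mean close to 0 and width close to 1; a shifted mean signals bias and a width different from 1 signals mis-estimated errors.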
9 Combined fit with CORRELATIONS
- A fixed shape fit with factorizing pdfs for each component can be compared to the simple sideband subtraction method (assumption: shape in ET is the same for all MT)
- Refining of the method using conditional pdfs to describe correlations (gradual change of ET as function of MT)
- Example of parameter correlation
[Figure labels: fit result, sliced data]
- Combined fit with all non-negligible correlation coefficients floating (shape parameters fixed)
- Effect of correlations is not huge → Aim to introduce a subset of most significant correlations in final model
Next step: floating shape parameters
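A conditional pdf of the kind mentioned on this slide can be illustrated as follows. The sketch assumes an exponential ET shape whose scale parameter drifts linearly with MT; the functional form, the ET range and the parameters `tau0` and `c` are hypothetical and only show how a single correlation coefficient enters the model.

```python
import math

ET_LO, ET_HI = 100.0, 600.0  # assumed ET range in GeV (hypothetical)

def et_given_mt(et, mt, tau0=80.0, c=0.1):
    """P(ET | MT): exponential in ET whose scale tau = tau0 + c * MT changes with MT.

    Setting c = 0 recovers the factorising case, where the ET shape is the
    same for all MT.
    """
    if not ET_LO <= et <= ET_HI:
        return 0.0
    tau = tau0 + c * mt
    norm = tau * (math.exp(-ET_LO / tau) - math.exp(-ET_HI / tau))
    return math.exp(-et / tau) / norm

def joint_density(mt, et, p_mt, tau0=80.0, c=0.1):
    """P(MT, ET) = P(ET | MT) * P(MT), with p_mt any normalised 1D pdf in MT."""
    return et_given_mt(et, mt, tau0, c) * p_mt(mt)
```

The conditional pdf is normalised in ET separately for every value of MT, so multiplying it by a normalised MT shape keeps the joint pdf normalised. Floating `c` in the fit while keeping the other shape parameters fixed corresponds to the "correlation coefficients floating, shape parameters fixed" configuration.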
10 Combined fit with FLOATING SHAPES
- The goal is to be able to float all SM shape parameters in the final combined fit → data-driven (N.B. SUSY shape is the Ansatz)
- Combined fit with floating shapes on 1 fb-1 of data
- The fit procedure works fine with all the Wjets and tt1l shapes floating, returning the correct yields of the SM backgrounds and SUSY signal
- Floating the MT,ET-shapes of the tt2l sample makes the fit unstable and the parameters highly correlated. The fit with these parameters floating is not possible without additional information
[Figure: panels for missing ET, MT and Mtop; components: TTbar Semileptonic, TTbar Dileptonic, Wjets, SU3]
Next step: extrapolation to signal region
11 Combined fit with EXTRAPOLATION
- Define two SideBands and the signal region
- MT > 150, ET > 200
- Fit the combined model in the sidebands and extrapolate to signal region (errors and correlations propagated correctly)
[Figure: (MT, ET) plane divided into SB1, SB2 and SIGNAL; shown for TTbar Semileptonic]
In SIG   Fit           Truth
Ntt2l    4.7 ± 7.9     5
Ntt1l    -1.1 ± 3.9    0
Nwjets   -1.2 ± 2.7    2
Nsu3     95.6 ± 4.0    91
- Combined fit in the control region has enough information to constrain and correctly extrapolate the background samples to the signal region
- A generic SUSY model in the control region accounts correctly for the contamination and is extrapolated to find the right yield of SUSY events in the signal region
- Can do with minimal MC input
- Extrapolate to full parameter space
In FULL  Fit           Truth
Ntt2l    17 ± 54       70
Ntt1l    485 ± 59      502
Nwjets   227 ± 68      173
Nsu3     287 ± 38      271
Next step: check other SUSY points
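For a factorising pdf, extrapolating a fitted yield into the signal region reduces to multiplying the yield by the fraction of each 1D shape above its cut. The Python sketch below assumes the signal region is MT > 150 and ET > 200 as the slide suggests, uses truncated exponentials as stand-in shapes, and propagates only the yield uncertainty; the function names and all numerical shape parameters are hypothetical. Propagating shape-parameter errors and correlations, as the slides do, would require the full fit covariance matrix.

```python
import math

def expo_fraction_above(cut, slope, lo, hi):
    """Fraction of exp(-slope * x), truncated to [lo, hi], that lies above cut."""
    cut = min(max(cut, lo), hi)
    total = math.exp(-slope * lo) - math.exp(-slope * hi)
    return (math.exp(-slope * cut) - math.exp(-slope * hi)) / total

def extrapolate(n_full, n_err, frac_mt, frac_et):
    """Yield and error inside the signal region for a factorising pdf.

    Only the yield error is scaled; shape and correlation uncertainties
    are ignored in this sketch.
    """
    frac = frac_mt * frac_et
    return n_full * frac, n_err * frac

# Hypothetical shapes: MT on [0, 400] with slope 0.010, ET on [100, 600] with slope 0.015.
frac_mt = expo_fraction_above(150.0, 0.010, 0.0, 400.0)
frac_et = expo_fraction_above(200.0, 0.015, 100.0, 600.0)
n_sig, n_sig_err = extrapolate(70.0, 54.0, frac_mt, frac_et)
```

The same function applied with the cut at the lower edge of the range returns a fraction of one, which corresponds to the "extrapolate to full parameter space" case.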
12 Combined fit with EXTRAPOLATION
- Fit the combined model in the sidebands and extrapolate to signal region (errors and correlations propagated correctly)
- Validated the procedure for various SUSY points
- SM backgrounds and SUSY are correctly extrapolated in an unbiased way!
Last step: study systematics
13 Systematic uncertainties
- How stable is the complete procedure under energy scale variations?
- Study of relative variation of measured SUSY cross-section ssel
- Uncertainties due to MC statistics listed as these might be dominated by small event counts (e.g. the number of events that migrate due to systematic check)
- Both the shape fit stability (1st column) and extrapolation stability (2nd column) studied
- Variations of order 5% → procedure is stable
14 Summary of work in CSC Note
- All of the preceding work has been included in the CSC note.
- Have enough information with 1 fb-1 in MT,ET,mtop to constrain SM background (tt1l,tt2l,Wjets) with a combined fitting procedure → Allows for clean determination of SUSY excess in data
- Can do with minimal MC input (→ data driven)
- Checked validity of procedure for various SUSY points
- Stability of fit studied under possible energy scale variations
- More information from additional control samples to get a handle on the tt2l shape parameters
- In the final model introduce significant correlations
15 Floating tt2l shapes
ONGOING WORK
- Floating the MT,ET-shapes of the tt2l sample makes the fit unstable and the parameters highly correlated. The fit with these parameters floating is not possible without additional information
- Get additional information on the tt2l shape by performing a simultaneous fit to a tt2l-enriched sample
- Why is tt2l in the 1-lepton (electron) mode after SUSY selection?
- What tt2l events make it through?
0τ: 40%, 1τ: 51%, 2τ: 9%
16 tt2l events in 1-lepton mode
- What happens with the 2nd lepton for 0τ events?
mis-ID: 83%, other: 17%
N.B. tt → ee truth info
- How does tau decay in 1τ events?
2-particle: 72%, 3-particle: 28%
17 Using tt1l for tt2l shape
- Main idea
- W-decay into qqbar and lnu has the same kinematics.
- Use the tt1l sample to simulate the shape of tt2l in ET,MT
- → Take a pure sample of tt1l
- → find the 2 jets from W-decay
- → substitute these jets by a lepton and a neutrino
- τ: 2-particle decay (1 jet + 1 neutrino)
- mis-ID: e → jet
- → apply SUSY selection
- → calculate ET,MT and you have the tt2l-shape
[Diagram labels: τ / mis-ID e]
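The last step of the substitution, recomputing ET and MT after a W-decay jet has been turned into a neutrino, can be sketched in Python as below. The transverse-mass formula is the standard one for a lepton plus missing transverse momentum. How the substitution is actually implemented in the analysis is not given on the slide, so the vector treatment here (adding the jet's transverse momentum to the missing ET, as in the mis-ID case where the other jet simply stays a jet) and the function names are assumptions.

```python
import math

def transverse_mass(lep, met):
    """MT of a lepton and the missing transverse momentum, both given as (px, py)."""
    lep_pt = math.hypot(lep[0], lep[1])
    met_pt = math.hypot(met[0], met[1])
    # 2 * pT * MET * (1 - cos(dphi)), written with the dot product
    m2 = 2.0 * (lep_pt * met_pt - (lep[0] * met[0] + lep[1] * met[1]))
    return math.sqrt(max(m2, 0.0))

def substitute_jet_by_neutrino(met, jet):
    """Treat a W-decay jet as a neutrino: its transverse momentum becomes missing."""
    return (met[0] + jet[0], met[1] + jet[1])

# Hypothetical event, momenta in GeV.
lepton = (40.0, 0.0)
met = (0.0, 30.0)
w_jet = (0.0, 50.0)

new_met = substitute_jet_by_neutrino(met, w_jet)
new_et = math.hypot(new_met[0], new_met[1])
new_mt = transverse_mass(lepton, new_met)
```

In this example the substitution raises both missing ET and MT, which is the mechanism by which dileptonic top events populate the high-ET, high-MT region.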
- Selection of pure tt1l sample
- > 3 light jets, > 0 b-jets (pT > 20)
- missing ET > 20
- 1 good lepton with pT > 20
- 150 < mtop < 190 (highest pT sum of 1 b-jet and 2 light-jets)
- 70 < mW < 90 (gives the 2 correct light jets)
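The selection of the pure tt1l sample listed above can be written as a single predicate. In this Python sketch the event is a plain dictionary with hypothetical keys, the thresholds are read literally from the slide (strictly more than 3 light jets and more than 0 b-jets), and all momenta and masses are assumed to be in GeV since the slide gives no units.

```python
def is_pure_tt1l(event, pt_min=20.0):
    """Apply the pure-tt1l selection to one event.

    Expected keys (names are assumptions made for this sketch):
      light_jet_pts, b_jet_pts, lepton_pts : lists of pT values (good leptons only)
      missing_et : missing transverse energy
      m_top      : mass of the 1 b-jet + 2 light-jet combination with highest pT sum
      m_w        : mass of the 2 light jets in that combination
    """
    n_light = sum(pt > pt_min for pt in event["light_jet_pts"])
    n_b = sum(pt > pt_min for pt in event["b_jet_pts"])
    n_lep = sum(pt > pt_min for pt in event["lepton_pts"])
    return (
        n_light > 3                      # thresholds taken literally from the slide
        and n_b > 0
        and event["missing_et"] > 20.0
        and n_lep == 1                   # exactly one good lepton
        and 150.0 < event["m_top"] < 190.0
        and 70.0 < event["m_w"] < 90.0
    )
```

The two mass windows do double duty: they purify the sample and they identify which two light jets come from the W decay, which are the jets the substitution method then replaces.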
18 Results with substitution method
- Statistics is the limiting factor for this method: very few tt1l→tt2l substituted events survive SUSY selection
- Stringent selection for pure tt1l sample
- One hard jet → neutrino leaves 3 hard jets
- SUSY selection requires 4 hard jets
- Events with at least 5 original hard jets needed to survive substitution method
- Hard gluon must radiate off b- or t-quarks, otherwise 3-jet hadronic top mass not reconstructed
- Possible solutions
- Investigate method on truth level
- Tune down SUSY selection criteria to allow more events
- Use ATLFAST to produce more ttbar events
19 Outlook for combined fit method
- Continue studies to constrain and handle the tt2l background in 1-lepton SUSY
- Substitution method with tuned down SUSY selection
- Develop selection criteria for an unbiased tt2l-enriched sample to be used in a simultaneous fit (use Top group knowledge)
- Change model parametrisation to distinguish SUSY from tt2l
- Use extra variables in the combined fit
- Repeat the whole procedure for 1-muon SUSY (so far 1-electron)
- Study the combined fit method on 100 pb-1 with possibility of putting tt2l into tt1l combinatorics
- Use FDR2 data with SU4 (low-mass high cross section point) model as practice for first data
20 Outlook for combined fit method
- CSC Study targeted at 1 fb-1 of data. Will have much less data in 2008 (2009)
- Now turning to studies on what can be learned from 10-100 pb-1 of data
- Study backgrounds with relaxed selection
- See to which extent high cross-section SUSY points (e.g. SU4) can be seen or excluded with limited data (FDR2)
21 Back-up slides
22 Multidimensional method
- Old MT Method
- extrapolate Wjets/ttbar bkg from control region (low MT) to signal region (high MT)
- Main Idea: Improve method
- Try to use additional observables for extrapolation (e.g. mtop)
- Explicitly account for SUSY contamination in control region
Overestimated by factor 2.5
Key issues to understand
- Amount of correlations between observables and significance of correlations
- Amount and shape of SUSY in control region
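One common way to extrapolate from a low-MT control region to a high-MT signal region is a ratio of event counts in four regions of the (MT, ET) plane. The Python sketch below shows that generic form; whether the old MT method was implemented exactly this way is an assumption, and the function name and all event counts are hypothetical.

```python
def mt_method_estimate(n_low_mt_high_et, n_high_mt_low_et, n_low_mt_low_et):
    """Background estimate for the signal region (high MT, high ET).

    Assumes the ET shape is the same at low and high MT (no correlation)
    and that all three control regions contain background only.
    """
    if n_low_mt_low_et <= 0:
        raise ValueError("normalisation region is empty")
    return n_low_mt_high_et * n_high_mt_low_et / n_low_mt_low_et

# Hypothetical counts: background only, then with SUSY leaking into the
# (low MT, high ET) control region.
clean = mt_method_estimate(200.0, 50.0, 1000.0)
contaminated = mt_method_estimate(200.0 + 100.0, 50.0, 1000.0)
```

Any SUSY events in a control region enter the numerator as if they were background, so the estimate is inflated. This is the effect that the combined fit addresses by carrying an explicit SUSY component in the control region.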
23 Shapes of backgrounds
24 Triggers
- Trigger-efficiencies (e mode)
- 4jet50 and missing ET trigger have high efficiency
- Single electron trigger efficiency 80-90%
- Single muon trigger in e-mode 0.01-0.05
- Inclusive efficiency very high
- 95-100%
- results comparable to previously shown
25 Results with substitution method
tt2l events in 1-lepton SUSY
tt1l→tt2l (tau-decay substituted) events in 1-lepton SUSY
tt1l→tt2l (misID substituted) events in 1-lepton SUSY
26 Release 12 Samples
- W + 0,1,2,3,4,5 partons
- WenunJets (n = 2..5): 5223-5226
- WmununJets (n = 3..5): 8203-8205
- WtaununJets (n = 2..5): 8208-8211
- T1 (MC_at_NLO): 5200
- separate at truth-level between
- semi-leptonic (e, mu, tau)
- di-leptonic (ee, mumu, tautau, emu, etau, mutau)
- SUSY
- SU1: 5401
- SU2: 5402
- SU3: 5403
- SU4: 6400
- SU6: 5404
- SU8: 5406