Title: Introduction to Analysis Methods Part I
1. Introduction to Analysis Methods Part I
2. Outline
- Steps of Data Analysis
- Signal significance
- Cut based and advanced analysis methods
3. Analysis and Analysis Methods
- Extraction of physical parameters from data
- Mass
- Charge
- Cross section
- Branching fraction
- What is the best method to use?
- Unbiased: we can usually correct for a bias if it is known
- Efficient: smallest variance or uncertainty
- A biased estimator usually has less sensitivity
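To make the remark about correcting a known bias concrete, here is a minimal sketch using the variance estimate that divides by n, whose bias factor (n - 1)/n is known. The function names, number of toy experiments, and seed are illustrative assumptions, not part of the slides.

```python
import random

def variance_biased(xs):
    """Variance estimate dividing by n; its expectation is (n - 1) / n times the truth."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def mean_estimate(n, n_toys=20000, seed=3):
    """Average the biased estimate over many toy samples of size n (unit true variance)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_toys):
        total += variance_biased([rng.gauss(0.0, 1.0) for _ in range(n)])
    return total / n_toys

n = 5
raw = mean_estimate(n)          # close to (n - 1) / n = 0.8
corrected = raw * n / (n - 1)   # known bias removed, close to 1.0
print(raw, corrected)
```

Because the bias factor is known exactly here, the correction is a simple rescaling; in a real analysis the bias usually has to be estimated from simulation.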
4. Data Analysis Steps
All Data
- High efficiency for data
- Detector/trigger problems
- Not calibrated
Cleaned up Calibrated Data
- Reject events with known detector problems
- Reject unnormalizable data
- Reject cosmic ray events
- Apply detector calibration
Preselection
- Select on triggers, loose selection
- high efficiency signal
- moderate background rejection
- Understand background
Final Selection
5. Data Analysis Steps
- Data clean up
- Reject events with known detector problems
- Reject unnormalizable data: problem with luminosity measurement (usually due to triggering problems)
- Cosmic ray event rejection
- Apply calibration/correction
- Tracker: known energy loss is accommodated in software. Additional correction is usually necessary
- Calorimeter: extremely important, raw energy is unusable.
- For data/MC comparisons, an efficiency correction is made to the simulation
6. Data Analysis Steps: Preselection
- Trigger selection
- For the signal sample, use unprescaled triggers
- Trigger efficiencies are measured on an unbiased, well-known sample
- Usually dictated by final state particles and kinematics
- Offline selection
- Triggered objects are necessarily dirty (priority is speed and efficiency)
- Objects from offline reconstruction are closer to physics
- Kinematic, geometric criteria
- Topological (angular)
[Figure: triggered object vs. reconstructed object]
7. Data Analysis Steps: Selection
- After preselection, data is dominated by backgrounds
- This data should be used to check simulations of backgrounds extensively, assuming the signal is small
- The aim of selection is to reject background while keeping as much signal as possible
- Isolate signal
- How much is good? The best measurement is what we want, or the best possibility for discovery
8. Toy Example
- A mix of signal (Gaussian) and background (flat) in some phase space
- Hypothetical signal is at 50, Gaussian width of 5
- Ratio of signal to background 3:100
- With 1000 events you don't see anything
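The toy sample can be sketched with a few lines of standard-library Python. The x range of 0 to 100, the 50 bins, and the function names are assumptions made for illustration; only the peak position, width, and signal-to-background ratio come from the slide.

```python
import random

def make_toy(n_events, s_over_b=3 / 100, lo=0.0, hi=100.0,
             mean=50.0, width=5.0, seed=1):
    """Return a list of x values: Gaussian signal on a flat background."""
    rng = random.Random(seed)
    p_signal = s_over_b / (1.0 + s_over_b)  # fraction of events that are signal
    xs = []
    for _ in range(n_events):
        if rng.random() < p_signal:
            xs.append(rng.gauss(mean, width))
        else:
            xs.append(rng.uniform(lo, hi))
    return xs

def histogram(xs, n_bins=50, lo=0.0, hi=100.0):
    """Count entries per equal-width bin; values outside [lo, hi) are dropped."""
    counts = [0] * n_bins
    bin_width = (hi - lo) / n_bins
    for x in xs:
        if lo <= x < hi:
            counts[int((x - lo) / bin_width)] += 1
    return counts

# Compare the bin contents around x = 50 for a small and a 10 times larger sample.
for n in (1000, 10000):
    print(n, histogram(make_toy(n)))
```

Plotting the two lists of bin contents side by side shows how the same S/B looks with different sample sizes.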
9. Toy Example with more statistics
- With enough data, even with poor S/B the bump will eventually be seen
- With 10 times more data you clearly see things
10. Background Prediction
- Don't define the signal region by looking at the data first!
- Expected background events in the signal region are obtained by
- Sideband method
- MC simulated events
- ⇒ We get the mean number of expected bkg events
- Expect that 68% of the time the actual number of bkg events is within B ± sqrt(B)
[Figure: signal window]
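The sideband method can be sketched as follows, assuming the background is flat in x so that the event density measured outside the signal window can be scaled to the window width. The function name, the window, and the sideband ranges are hypothetical choices for illustration.

```python
import math

def sideband_background(xs, window, sidebands):
    """Estimate the mean background in `window` from the event density in `sidebands`.

    Assumes a flat background, so the sideband density scales with width.
    Returns the mean B and sqrt(B), the expected Poisson spread of the count.
    """
    w_lo, w_hi = window
    n_side = sum(1 for x in xs for lo, hi in sidebands if lo <= x < hi)
    side_width = sum(hi - lo for lo, hi in sidebands)
    b = n_side * (w_hi - w_lo) / side_width
    return b, math.sqrt(b)

xs = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]  # hypothetical, evenly spread events
b, spread = sideband_background(xs, window=(40, 60), sidebands=[(0, 40), (60, 100)])
print(b, spread)
```

The events inside the window are never used in the estimate, which is the point of the method: the prediction does not depend on what the signal region contains.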
11. Figure of Merit: Signal Significance
- Let's say the total number of events from data in the signal window is N
- Goal of event selection: make N sufficiently greater than the expected background (B)
Signal significance: (N - B) / sqrt(N)
[Figure: signal window and expected background B]
12. Signal Significance and Measurement
- Let's say our measurement is a cross section
- Fractional error on the cross section measurement
- If we can neglect other errors
- ⇒ Maximizing signal significance minimizes measurement error.
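The argument can be sketched as below. It assumes a counting measurement with efficiency \epsilon and integrated luminosity \mathcal{L} (symbols introduced here for illustration), and the significance form (N - B)/\sqrt{N} that matches the toy-example arithmetic in these slides. Neglecting the uncertainties on B, \epsilon and \mathcal{L}, only the Poisson error \delta N = \sqrt{N} remains:

```latex
\sigma = \frac{N - B}{\epsilon\,\mathcal{L}}
\quad\Longrightarrow\quad
\frac{\delta\sigma}{\sigma} \simeq \frac{\delta N}{N - B}
= \frac{\sqrt{N}}{N - B}
= \frac{1}{\text{significance}}
```

Under these assumptions the fractional error is the inverse of the significance, so a 5 sigma signal corresponds to roughly a 20% statistical error on the cross section.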
13. Signal Significance in Our Toy Example
- 15 bins × 193 ≈ 3000
- Sqrt(3300) ≈ 57
- (3300 - 3000)/57 ≈ 5
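The arithmetic above can be checked directly. This is a minimal sketch; the function name is an assumption, and the formula is the (N - B)/sqrt(N) form implied by the numbers on this slide.

```python
import math

def significance(n_observed, b_expected):
    """Signal significance (N - B) / sqrt(N), the form used in the toy example."""
    return (n_observed - b_expected) / math.sqrt(n_observed)

print(round(significance(3300, 3000), 1))  # 5.2
```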
14. With Larger S/B
- Same 300 signal events
- Bkg
- 15 × 94 = 1410
- Sqrt(1700) ≈ 41
- Significance: 270 / 41 ≈ 6.6
- Optimal selection
- If two different selections have different B for the same S, then the one with smaller B is more optimal
- Real-life selection reduces both B and S. In this case, the one with the greatest signal significance is the one to use
Compare errors with the previous page
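Since a real selection changes S and B together, choosing it becomes a scan over candidate cuts. The sketch below scans the half-width of a window centred on the peak, assuming 300 Gaussian signal events of width 5 and a flat background of 100 events per unit of x (an assumed density, roughly 10,000 events over a range of 0 to 100). The function name and parameters are illustrative.

```python
import math

def expected_significance(half_width_sigma, s_total=300.0, sigma=5.0, b_per_unit=100.0):
    """Expected S / sqrt(S + B) for a window of +- half_width_sigma around the peak."""
    s = s_total * math.erf(half_width_sigma / math.sqrt(2.0))  # Gaussian fraction inside
    b = b_per_unit * 2.0 * half_width_sigma * sigma             # flat background inside
    return s / math.sqrt(s + b)

# Scan half-widths from 0.1 to 4.0 sigma in steps of 0.1.
widths = [k / 10.0 for k in range(1, 41)]
best = max(widths, key=expected_significance)
print(best, round(expected_significance(best), 2))
```

A very narrow window throws away signal and a very wide one lets in background, so the best window lies in between, at a width that depends on the assumed background level.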
15. Signal Significance and Discovery
- With more luminosity, signal significance improves
- For searches, we usually deal with S/B and sometimes you see the significance defined as follows
16. Cut Based Analysis
- Series of selections to increase signal significance
- From the simple to complicated topological variables
- Finally, a distribution of the most sensitive variable is made
- Simple counting
- Distribution fitting
- Problems of cut-based analysis
- Throws away signal
- Which variables to cut on?
- Which order to cut?
17. Samples with Different S/B
- Phase space occupied by data is multidimensional
- Measurements: pT, η, φ of leptons and jets
- In this space, there are regions with different S/B
- What is the best way to use this data to make the best measurement? A: Use them separately
18. Optimal Analysis
- Optimal decision theory tells us to analyze data after binning the data with different S/B or some monotonic function of S/B
- Of course, we need ways to calculate S/B or S/(S+B) given an event
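A small numerical sketch of why regions with different S/B are better analyzed separately. It assumes independent regions whose counting significances S/sqrt(S+B) add in quadrature, a common approximation rather than something stated in the slides. The (S, B) values are hypothetical.

```python
import math

def counting_significance(s, b):
    """Simple counting significance S / sqrt(S + B)."""
    return s / math.sqrt(s + b)

def combined_significance(bins):
    """Add per-bin significances in quadrature (independent bins assumed)."""
    return math.sqrt(sum(counting_significance(s, b) ** 2 for s, b in bins))

bins = [(50, 50), (50, 950)]  # hypothetical (S, B) in a clean and a dirty region
inclusive = counting_significance(sum(s for s, _ in bins), sum(b for _, b in bins))
separate = combined_significance(bins)
print(round(inclusive, 2), round(separate, 2))  # 3.02 5.24
```

Lumping the two regions together lets the dirty region dilute the clean one; keeping them apart never does worse under this approximation.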
19. Signal and Background Likelihoods
- Given a set of measurements in an event, we ask: what is the probability that this event is due to signal?
- LS (LB) is called the signal (background) likelihood
- Connection with Bayes' theorem
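The connection can be written out as below. P(S) and P(B) denote prior probabilities for signal and background, symbols introduced here as an assumption. The equal-prior form on the right is the one consistent with the example on the next slide, where a signal likelihood twice the background likelihood gives 2/3.

```latex
P(S \mid x) = \frac{L_S(x)\,P(S)}{L_S(x)\,P(S) + L_B(x)\,P(B)}
\;\xrightarrow{\;P(S) = P(B)\;}\;
\frac{L_S(x)}{L_S(x) + L_B(x)}
```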
20. Example
Absolute value of likelihood is meaningless
[Figure: axis x with points x1 and x2 marked]
- x1: PS(x1) = 0 ⇒ P(S|x1) = 0
- x2: PS(x2) = 2 PB(x2) ⇒ P(S|x2) = 2/3
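The two cases can be reproduced numerically with a short sketch. It assumes equal prior probabilities for signal and background by default; the function name and the `prior_s` parameter are illustrative.

```python
def posterior_signal(p_s, p_b, prior_s=0.5):
    """P(S|x) from signal/background likelihood values via Bayes' theorem."""
    num = p_s * prior_s
    den = num + p_b * (1.0 - prior_s)
    return num / den

print(posterior_signal(0.0, 1.0))  # x1: no signal likelihood, posterior 0
print(posterior_signal(2.0, 1.0))  # x2: PS = 2 PB, posterior 2/3
print(posterior_signal(0.2, 0.1))  # same ratio at a different scale, same posterior
```

The last call shows why the absolute value of the likelihood is meaningless: only the ratio of signal to background likelihood enters the result.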
21. Optimal Analysis in Multidimensions
- Calculating the likelihoods in multidimensions is difficult
- We must rely on multivariate analysis techniques
- Multidimensional likelihoods
- Neural networks
- Boosted decision trees
- Support vector machines
-
- What do they do?
- They try to reconstruct P(S|x) from training samples
- We treat multivariate analysis tools as mostly black boxes
- ROOT v5 includes TMVA, which implements many of these methods
22. Multivariate Analysis
- All we really need to know is the likelihood ratio, or posterior probability for signal given data
- A general method is given by the Artificial Neural Network (ANN)
- With 1 hidden layer, it can approximate any continuous function
- With 2 hidden layers, it can approximate even discontinuous functions
23. NN Training
- Important parts are the weights and thresholds of hidden nodes
- And sometimes the non-linear response function g
- We train to get these parameters
- Backpropagation training
- Prepare a training sample and a test sample of signal and background
- For signal, desired output is 1. For bkg, 0
- Minimize training error
- Numerical methods are used
- There are many implementations of ANN
24. Neural Network Example
- 2-D
- Signal points: r < 1
- Background points: 1 < r < 2
- NN architecture: 1 hidden node
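The training procedure can be sketched on this 2-D example with a small pure-Python network. This is an illustration under stated assumptions, not the implementation used for the slides: a sigmoid response function, squared-error loss, per-event backpropagation updates, and hypothetical choices for sample size, learning rate, epochs, and seed.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_sample(n, rng):
    """Signal: uniform in the disc r < 1 (target 1). Background: ring 1 < r < 2 (target 0)."""
    data = []
    for i in range(n):
        is_signal = i % 2 == 0
        r_lo, r_hi = (0.0, 1.0) if is_signal else (1.0, 2.0)
        r = math.sqrt(rng.uniform(r_lo ** 2, r_hi ** 2))  # uniform in area
        phi = rng.uniform(0.0, 2.0 * math.pi)
        data.append((r * math.cos(phi), r * math.sin(phi), 1.0 if is_signal else 0.0))
    return data

class Net:
    def __init__(self, n_hidden, rng):
        # Each hidden node holds [w_x, w_y, threshold]; the output node holds
        # one weight per hidden node plus its own threshold (last entry).
        self.hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
        self.out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x, y):
        h = [sigmoid(w[0] * x + w[1] * y + w[2]) for w in self.hidden]
        o = sigmoid(sum(wo * hj for wo, hj in zip(self.out, h)) + self.out[-1])
        return h, o

    def error(self, data):
        """Mean squared difference between network output and desired output."""
        return sum((self.forward(x, y)[1] - t) ** 2 for x, y, t in data) / len(data)

    def train_epoch(self, data, rate):
        """One pass of backpropagation, updating the parameters after every event."""
        for x, y, t in data:
            h, o = self.forward(x, y)
            delta_o = (o - t) * o * (1.0 - o)  # gradient at the output node
            for j, hj in enumerate(h):
                delta_h = delta_o * self.out[j] * hj * (1.0 - hj)  # propagated back
                self.out[j] -= rate * delta_o * hj
                w = self.hidden[j]
                w[0] -= rate * delta_h * x
                w[1] -= rate * delta_h * y
                w[2] -= rate * delta_h
            self.out[-1] -= rate * delta_o

rng = random.Random(0)
train, test = make_sample(200, rng), make_sample(200, rng)
net = Net(n_hidden=7, rng=rng)
error_before = net.error(test)
for _ in range(150):
    net.train_epoch(train, rate=0.5)
error_after = net.error(test)
print(error_before, error_after)
```

Changing `n_hidden` to 1 or 2 gives a rough feel for how the number of hidden nodes limits the shape of the region the network can carve out; the error is evaluated on the separate test sample, as the training slide recommends.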
25. 1 Hidden Node Training Result
26.
- A large portion of the background is mistaken for signal
27. 2 Hidden Nodes
- Training error is reduced
28. Example
30. 7 nodes
31. 7 Hidden Node Result
- How many hidden nodes?
- Depends on how complicated the signal and background distributions in the multidimensional space are
32. Higgs Search Example at D0
33. Conclusion
- In a cut-based analysis, the goal is not to improve S/B, but to improve signal significance
- For optimal analysis, we should make use of all data where signal is present, no matter how small
- Selection should be minimal; leave the rest up to multivariate tools
34. References
- Particle Data Group's Statistics Review: http://pdg.lbl.gov/2007/reviews/statrpp.pdf
- Probability and Statistics in Experimental Physics, Springer-Verlag
- Look up TMVA in the ROOT reference, v5.12 or higher: http://root.cern.ch/root/Reference.html
- Phystat 2003 conference proceedings: http://www.slac.stanford.edu/econf/C030908/
- Advanced material