Ballistics DNA - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Ballistics DNA


2
Ballistics DNA
Alain Beauchamp, Ph.D.
3
The path to a ballistic probability model
  • PART I: Correlation scores and probability
  • PART II: Ballistic probability model
  • PART III: How could we implement a probability
    model in a ballistic system?
  • Conclusion and future work

4
Part I: Correlation scores and probability
  • Strengths and limitations of the current
    correlation score
  • Why are correlation scores hard to interpret?
  • Benefits of a probability score

5
Strengths and limitations of the correlation score
  • For the last 15 years, the correlation score has
    been at the core of FT's ballistic systems
  • Strengths of a correlation score
  • Useful as a ranking tool
  • Score values computed with the same reference A
    (and the same type of mark) can be compared
  • Score(A against B) > Score(A against C) means
    that B looks more similar to A than C does

6
Strengths and limitations of the correlation score
(Cont'd)
  • Limitations of a correlation score
  • The correlation score is hard to interpret
  • Not useful as an intrinsic similarity measure
  • Examples
  • Score values computed with different references
    (same type of mark) cannot be compared
  • Score(A-B) > Score(C-D) DOES NOT mean that the
    A-B pair looks more similar than the C-D pair
  • Score values computed from different marks
    cannot be compared
  • Score(A-B) for the firing pin > Score(A-B) for
    the breech face DOES NOT mean that B looks more
    similar to A on the firing pin than on the
    breech face

7
The score is hard to interpret. Why?
  • 5 reasons
  • 1. Different algorithms for different marks
  • Characteristics of the correlatable features and
    the geometry are very different
  • FP/BF: circular contour and a wide variety of
    features
  • Ejector/Rimfire: polygonal contour
  • Bullets: striae only
  • 2. Algorithms change over time

8
The score is hard to interpret. Why? (Cont'd)
  • 3. No unique cartridge or bullet score
  • More than 1 score per exhibit
  • Cartridge cases: BF/FP/Ejector scores
  • Bullets (Land): MaxPhase2D, PeakPhase2D,
    PeakScore2D and 3DScore
  • The number of scores per exhibit is expected to
    increase in the future
  • Cartridge cases: 3D scores
  • Bullets: added 3D Land score
  • GEA scores?

9
The score is hard to interpret. Why? (Cont'd)
  • 4. Effect of the database size
  • As the database size increases, the probability
    of finding non matches that look similar to a
    given reference increases
  • The probability of finding a known match in the
    Top 10 decreases even if its score does not
    change
  • The score value alone is not sufficient: the
    database size is an important factor as well
  • This is a universal law, not specific to
    ballistic systems
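
The database-size effect can be illustrated with a small simulation. This is only a sketch: the non match scores are assumed here to be independent draws from a standard normal distribution, which is an illustrative choice and not the distribution measured in the study. The function name and parameters are hypothetical.

```python
import random

def mean_top_nonmatch_score(db_size, trials=100, seed=0):
    """Average of the highest score among db_size simulated non match scores."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # ASSUMPTION: non match scores are independent standard normal draws.
        total += max(rng.gauss(0.0, 1.0) for _ in range(db_size))
    return total / trials

for db_size in (100, 1000, 5000):
    print(db_size, round(mean_top_nonmatch_score(db_size), 2))
```

The best non match score grows with the database size, so a known match with a fixed score is more likely to be outranked in a larger database.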

10
The score is hard to interpret. Individual Score
Response
  • 5. Each reference has its own score response.
  • Example
  • If two cartridges A and B are correlated against
    the same large database (with no match in it),
    we sometimes get two very different lists of
    scores
  • For example, the scores associated with A could
    be greater than the scores associated with B

11
The score is hard to interpret. Individual Score
Response (Cont'd)
  • Experiment: correlate 9LG bullets against the
    same large database (800 non matches) with
    BulletTRAX-3D
  • Compare their non match score distributions
  • Significant differences
  • in the high score region
  • in the position of the peak
  • Each bullet has its own statistical distribution
    of non match scores
  • No universal score response common to all
    bullets

[Figure: non match score distributions of 9LG Bullet A and 9LG Bullet B]
12
Solution: Convert scores into probabilities
  • Each of the previous problems can be solved using
    probabilities (in principle)
  • Different algorithms: probability is a common
    concept for all score types
  • Algorithms change over time: the probability
    value may still change, but only slightly
  • Distinct score response for each bullet/cartridge:
    probability is a common concept for all exhibits
  • Effect of database size: statistical models based
    on relevant data could quantify this effect
  • More than 1 score per bullet/cartridge: compute a
    probability for each score and combine them into
    a unique probability for the bullet/cartridge

13
How could we combine probabilities? Cartridge case
  • Assume that
  • we have a BF and an FP score for a pair of
    cartridge cases, AND
  • the following 2 probabilities are known
  • P(FP): probability of a confirmed match
    according to FP
  • P(BF): probability of a confirmed match
    according to BF
  • 4 possible scenarios
  • Confirmed match according to BOTH FP and BF
  • Confirmed match according to FP ONLY
  • Confirmed match according to BF ONLY
  • Not a confirmed match

14
How could we combine probabilities? Cartridge
case (Cont'd)
  • FP/BF marks provide independent information
  • A combined probability is computed by assuming
    independent information
  • $P_{\text{Combined}} = 1 - (1 - P(BF))(1 - P(FP))$
  • Results
  • A mark with a low probability has almost no
    effect on the combined probability
  • As we add marks, the combined probability
    increases
  • Easy to generalize to 3 independent marks
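
The combination rule above can be sketched in a few lines of Python. The function name and the probability values are hypothetical, chosen only to illustrate the three results listed on the slide.

```python
def combine_independent(probabilities):
    """Probability that at least one mark confirms the match,
    assuming the marks provide independent information."""
    p_none = 1.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        p_none *= 1.0 - p  # probability that no mark so far confirms the match
    return 1.0 - p_none

# Hypothetical values, for illustration only.
print(combine_independent([0.80, 0.05]))        # strong BF, weak FP
print(combine_independent([0.80, 0.60]))        # two informative marks
print(combine_independent([0.80, 0.60, 0.50]))  # three independent marks
```

With these values the results are roughly 0.81, 0.92 and 0.96: the weak mark barely moves the result, and each added mark raises it.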

15
How could we combine probabilities? Bullets
  • The 4 bullet scores are not computed from
    independent information
  • They are computed from the same areas on the
    bullet
  • A combined probability cannot be computed by
    assuming independent information
  • Keep the highest probability only (conservative)
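
A minimal sketch of the conservative rule for bullets, with a hypothetical function name and made-up probabilities for the four score types:

```python
def combine_dependent(probabilities):
    """Conservative combination for scores computed from the same
    information: keep the single highest probability."""
    if not probabilities:
        raise ValueError("need at least one probability")
    return max(probabilities)

# Hypothetical probabilities for MaxPhase2D, PeakPhase2D, PeakScore2D, 3DScore.
bullet_probabilities = [0.30, 0.35, 0.40, 0.70]
print(combine_dependent(bullet_probabilities))
```

For these values the independence formula would give about 0.92, which overstates the evidence if the four scores largely repeat the same information; keeping the maximum (0.70) avoids that.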

16
Conclusion: Part I
  • The probability of being a match is a more
    meaningful concept than the correlation score
  • Using probabilities solves, in principle, all the
    problems found with the interpretation of
    correlation scores
  • Probabilities of individual marks can be combined
    into a single probability per exhibit
  • Challenge: compute the probability of being a
    match for individual marks
  • Two main unknowns
  • How to deal with the individual score response of
    each cartridge/bullet
  • How to predict the effect of database size

17
Part II: Ballistic Probability Model
  • Goal and constraints of the model
  • Hypothesis
  • Tests and results

18
Statistical model of scores: Goal and constraints
  • Project started in 2003
  • Goal: develop a model which converts the
    correlation score of a mark into a probability
    of being a match
  • Current constraints
  • We only have a database of sister pairs
  • Tests use BulletTRAX-3D scores
  • The model should reproduce the performance
    observed in the large database study
  • As the database size increases, the probability
    of finding a known match in the first position
    should decrease

19
Ballistic Statistical Model: Hypothesis
  • Any mathematical or physical model starts with a
    small number of hypotheses/laws/axioms
  • Need hypotheses for the (3D bullet) ballistic
    model
  • Need to find something common to all bullet score
    distributions
  • However, each bullet has its own score response

20
Hypothesis (Cont'd)
Non match statistical distribution
  • Experiment already discussed
  • Correlate 9LG bullets against the same large
    database (800 non matches)
  • Compare their non match score distributions (3D)
  • Differences
  • in the high score region
  • in the position of the peak
  • Similarity
  • The distributions have a similar shape

[Figure: non match score distributions of 9LG Bullet 1 and 9LG Bullet 2]
21
Hypothesis (Cont'd)
  • Core hypothesis
  • The non match score distribution of all bullets
    has the same universal shape (up to a shift and
    a stretch factor)
  • This shape is independent of caliber, material
    and quality of the marks
  • Can be broken into two hypotheses
  • Hypothesis I
  • The non match score distribution of each bullet
    is fully characterized by only two parameters
  • its mean (position of the peak)
  • its width
  • Hypothesis II
  • If we remove the effect of these 2 parameters,
    the non match score distributions of all bullets
    are strictly identical
  • The effect of the 2 parameters is removed as
    follows
  • Shift the overall distribution to the same peak
    position for every bullet
  • Shrink or expand the overall distribution to get
    the same width for every bullet
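
The shift-and-stretch step amounts to a standardization of each bullet's non match scores. The sketch below assumes that "width" means the standard deviation, which the slides do not state; the function name and the score lists are hypothetical.

```python
import statistics

def normalize_scores(nonmatch_scores):
    """Return (mean, width, normalized scores) for one reference bullet.

    ASSUMPTION: 'width' is taken to be the population standard deviation.
    """
    mean = statistics.fmean(nonmatch_scores)
    width = statistics.pstdev(nonmatch_scores)
    if width == 0:
        raise ValueError("all scores are identical; width is zero")
    return mean, width, [(s - mean) / width for s in nonmatch_scores]

# Two hypothetical bullets with very different raw score ranges.
bullet_1 = [10.0, 12.0, 14.0, 16.0, 18.0]
bullet_2 = [100.0, 110.0, 120.0, 130.0, 140.0]
for scores in (bullet_1, bullet_2):
    mean, width, z = normalize_scores(scores)
    print(round(mean, 2), round(width, 2), [round(v, 2) for v in z])
```

After normalization the two lists coincide even though their raw scores differ by an order of magnitude, which is the behavior Hypothesis II expects of real non match distributions.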

22
Hypothesis (Cont'd)
  • The effect of the 2 parameters is removed as
    follows
  • Shift the mean to 0
  • Shrink to unit width
  • The resulting distributions are very similar
  • Small variations remain due to limited data

[Figure: normalized non match score distributions of 9LG Bullet 1 and 9LG Bullet 2]
23
Ballistic Statistical Model: Testing the model
  • 4 steps
  • 1. Compute 3D correlation scores from a large
    database study with BulletTRAX-3D (4 calibers,
    2 materials/compositions)
  • 2. Compute the individual parameters of each
    bullet (Hypothesis I): the mean and width of its
    non match score distribution
  • 3. Determine a universal non match score
    distribution (Hypothesis II)
  • 4. By simulation, predict the performance of the
    correlation algorithm as a function of database
    size

24
Testing the model: Database general information
Pittsburgh bullets database (Allegheny County
Coroner's Office, Forensic Laboratory Division)
25
Testing the model: Compute individual parameters
  • For each bullet
  • The mean and width of its non match score
    distribution are computed
  • The distribution is shifted so that its mean is 0
  • The distribution is rescaled to unit width
  • The scores are normalized by this process
  • For each bullet, the result is an approximation
    of the universal distribution (Hypothesis II)
26
Testing the model: Define a universal non match
distribution
  • Add up the approximated universal distributions
    found for all bullets
  • The result has a smooth shape, even in the high
    score region
  • This is the universal normalized distribution for
    non match scores
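
Adding up the per-bullet distributions can be sketched as pooling all normalized scores into one histogram. This is an illustration under assumptions: "width" is taken as the standard deviation, the bin width of 0.5 is arbitrary, and the names and data are hypothetical.

```python
import math
import statistics
from collections import Counter

def normalized(scores):
    """Shift to zero mean and rescale to unit width (standard deviation)."""
    mean = statistics.fmean(scores)
    width = statistics.pstdev(scores)
    return [(s - mean) / width for s in scores]

def universal_histogram(score_lists, bin_width=0.5):
    """Pool the normalized scores of all bullets into one
    relative-frequency histogram keyed by the left edge of each bin."""
    pooled = [z for scores in score_lists for z in normalized(scores)]
    counts = Counter(math.floor(z / bin_width) for z in pooled)
    return {index * bin_width: count / len(pooled)
            for index, count in sorted(counts.items())}

# Hypothetical non match score lists for two bullets.
bullet_1 = [10.0, 12.0, 14.0, 16.0, 18.0]
bullet_2 = [100.0, 110.0, 120.0, 130.0, 140.0]
for left_edge, fraction in universal_histogram([bullet_1, bullet_2]).items():
    print(f"[{left_edge:+.1f}, {left_edge + 0.5:+.1f}): {fraction:.2f}")
```

Pooling many bullets is what fills in the sparsely populated high score region that a single bullet's 800 or so scores cannot resolve.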

27
Testing the model: Simulations
  • The simulation reproduces the operations done in
    a real large database study
  • Real study (with sister pairs)
  • For each reference bullet
  • Introduce its known match into the database of
    size N
  • Compute all correlation scores between the
    reference and the (N+1) bullets in the database
  • Find the rank of the known match
  • Compute the performance of the correlation
    algorithm (number of known matches in the first
    position)

28
Testing the model: Simulations (Cont'd)
  • Simulation
  • For each reference bullet
  • Randomly select N non match (normalized)
    correlation scores from the universal score
    distribution
  • Normalize the (known) score of its known match by
    using the reference's individual parameters (mean
    and width of its non match score distribution)
  • Introduce the normalized score of its known match
    into the (generated) non match score list
  • Find the rank of the known match
  • Compute the simulated performance of the
    correlation algorithm
  • Repeat the same process for several database
    sizes N
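
The simulation steps above can be sketched as follows. Everything numeric here is a stand-in: the universal distribution is replaced by standard normal draws, and the known match scores and reference parameters are invented, so the printed rates illustrate the procedure only and say nothing about BulletTRAX-3D performance. All function and variable names are hypothetical.

```python
import random

def normalize_known_score(score, ref_mean, ref_width):
    """Express a known match score in the reference's normalized units."""
    return (score - ref_mean) / ref_width

def first_position_rate(normalized_match_scores, universal_sample, db_size, rng):
    """Fraction of references whose known match outranks db_size non matches."""
    hits = 0
    for match_score in normalized_match_scores:
        # Draw db_size normalized non match scores (with replacement).
        nonmatch = rng.choices(universal_sample, k=db_size)
        if match_score > max(nonmatch):  # known match is ranked first
            hits += 1
    return hits / len(normalized_match_scores)

rng = random.Random(1)
# ASSUMPTION: stand-in for the universal distribution, not the study's data.
universal_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
# Hypothetical (raw known match score, reference mean, reference width) triples.
references = [(rng.gauss(135.0, 10.0), 100.0, 10.0) for _ in range(200)]
match_scores = [normalize_known_score(*ref) for ref in references]

for db_size in (50, 500, 2000):
    print(db_size, first_position_rate(match_scores, universal_sample, db_size, rng))
```

Because the non match scores are generated rather than measured, the same set of known matches can be tested against any database size N.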

29
Testing the model: Simulations (Cont'd)
[Figure: probability that the sister is in the
first position as a function of its normalized
score S]
  • Dark circles: experimental data
  • Dark curve: result from the model
  • Gray curves: predictions for other database sizes
30
Testing the model: Simulations (Cont'd)
  • Summary of the figure
  • If the sister has a normalized score of 8
  • The probability of being in the first position is
  • 90% for N = 500
  • 70% for N = 2K
  • 20% for N = 10K
  • If we want the sister to be in the first position
    with a 95% probability,
  • its score must be
  • 9 for N = 500
  • 10 for N = 2K
  • 12 for N = 10K
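
The three probabilities quoted for a normalized score of 8 can be checked for internal consistency. The sketch assumes that the quoted values are percentages and that the N non match scores behave as independent draws, so that P(first) = (1 - tail)^N, where "tail" is the chance that a single non match scores above the sister. That relation is an assumption of this sketch, not something stated on the slides.

```python
def implied_tail_probability(p_first, db_size):
    """Chance that one non match outscores the sister, assuming
    P(first) = (1 - tail) ** db_size with independent non match scores."""
    return 1.0 - p_first ** (1.0 / db_size)

# (database size, probability of first position) as quoted for score 8.
for db_size, p_first in [(500, 0.90), (2000, 0.70), (10000, 0.20)]:
    tail = implied_tail_probability(p_first, db_size)
    print(db_size, p_first, f"{tail:.2e}")
```

The three implied tail probabilities land close together, on the order of 2 in 10,000. Since the quoted percentages are rounded readings from a figure, that is about as consistent as could be expected with a single universal non match distribution.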

31
Part II: Summary
  • A statistical model of non match scores was built
    using
  • a database of 2000 bullets, 4 calibers, 2
    compositions/materials
  • 3D correlation on BulletTRAX-3D
  • Hypothesis
  • The non match score distribution has the same
    shape for all bullets (except for a shift and a
    stretch factor)
  • The model computes the probability that a sister
    with a given score is in the first position
  • The prediction agrees with the actual performance
    in the large database study
  • Performance decreases as the database size
    increases

32
Part III
  • How could we implement a probabilistic model in a
    ballistic system?

33
How could we implement a probabilistic model in a
ballistic system?
  • Correlate a given bullet against a large database
  • From the (large) list of scores, compute the two
    characteristic parameters of the reference
    bullet: the mean and width of its non match score
    distribution
  • Compute the probability that the bullet in the
    first position is a match by using
  • the universal non match score distribution
  • the two characteristic parameters computed
    previously
  • the actual score of the bullet in the first
    position
  • information about match score distributions
    (not yet known)

34
How could we implement a probabilistic model in a
ballistic system? (Cont'd)
  • Repeat the same process for all score types
  • MaxPhase2D, PeakPhase2D, PeakScore2D
  • 3DScore
  • Combine the 4 probabilities into a unique
    probability for the bullet

35
Future work
  • Improve the model with new large database
    studies (new calibers)
  • Test on cartridges
  • Get a better knowledge of sister score
    distributions
  • The current study was done with sister pairs only
  • Use the model to improve correlation algorithms
