Sampling Based Clerical Review Methods in Probabilistic Matching

Description:

Sampling Based Clerical Review Methods in Probabilistic Matching ... Power (1 - β), probability of sending a reviewable batch to clerical inspection ...

1
Sampling Based Clerical Review Methods in
Probabilistic Matching
2
Clerical Review
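[Diagram: records from File A and File B are compared, and each record pair receives one of three outcomes]
  • Automatically assign as a non-link
  • Send for manual clerical review
  • Automatically assign as a link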
3
Output of Record Pair Comparisons
  • Set of matched records with an associated
    comparison weight
  • There are many of these: some with high weights,
    some with low weights, and some in between
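The three outcomes on the Clerical Review slide are conventionally driven by two cut-offs on the comparison weight. The sketch below illustrates that idea only; the function name `classify_pair`, the cut-off values, and the example weights are all hypothetical and do not come from the presentation.

```python
from typing import Literal

Decision = Literal["non-link", "clerical review", "link"]


def classify_pair(weight: float, lower: float, upper: float) -> Decision:
    """Assign a record pair to one of three outcomes by its comparison weight."""
    if lower > upper:
        raise ValueError("lower cut-off must not exceed upper cut-off")
    if weight < lower:
        return "non-link"          # too dissimilar: reject automatically
    if weight > upper:
        return "link"              # very similar: accept automatically
    return "clerical review"       # in between: a person decides


# Hypothetical weights and cut-offs, for illustration only.
weights = [-4.2, 3.1, 7.5, 12.8, 18.0]
decisions = [classify_pair(w, lower=5.0, upper=15.0) for w in weights]
print(decisions)
```

Widening the gap between the two cut-offs sends more pairs to clerical review, which is the workload the sampling approach tries to reduce.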

4
[Figure: Frequency plotted against Comparison Weight]
5
... but
  • Clerical review can be time-consuming
  • Thousands or tens of thousands of clerical review
    pairs
  • A high level of repetitive VDU-based tasks can lead
    to health and safety issues

6
[Figure: Frequency plotted against Comparison Weight]
7
Acceptance Sampling
  • Allows quantification of uncertainty in sampling
  • Methods
      • AS 1199 Sampling Procedures and Tables for
        Inspection by Attributes
      • DIY calculations

proc power;
  onesamplemeans mean = 5 10 ntotal = 150 stddev = 30 50 power = .;
  plot x=n min=100 max=200;
run;
8
(No Transcript)
9
Producer's Risk (α): risk of having to review a
batch with a large number of non-matches
AQL: match rate for automatic rejection
10
RQL: match rate at which it is unacceptable to
automatically accept a batch as non-links
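A "DIY calculation" for these quantities might look like the sketch below. It assumes a single sampling plan: draw n pairs from a batch, count the matches, and send the batch for manual review when the count exceeds an acceptance number c. It also assumes a binomial model (batch large relative to the sample) and treats β as the risk of automatically rejecting a batch whose match rate is at the RQL. The numeric values of AQL, RQL, α and β are hypothetical, as are the function names.

```python
from math import comb
from typing import Optional, Tuple


def binom_cdf(c: int, n: int, p: float) -> float:
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def p_review(n: int, c: int, match_rate: float) -> float:
    """Probability a batch is sent for manual review (more than c matches seen)."""
    return 1.0 - binom_cdf(c, n, match_rate)


def find_plan(aql: float, rql: float, alpha: float, beta: float,
              max_n: int = 500) -> Optional[Tuple[int, int]]:
    """Smallest sample size n (with an acceptance number c) meeting both risks."""
    for n in range(1, max_n + 1):
        for c in range(n):
            producer_ok = p_review(n, c, aql) <= alpha      # few needless reviews
            power_ok = p_review(n, c, rql) >= 1.0 - beta    # reviewable batches caught
            if producer_ok and power_ok:
                return n, c
    return None


# Hypothetical design values, for illustration only.
plan = find_plan(aql=0.05, rql=0.20, alpha=0.05, beta=0.10)
if plan is not None:
    n, c = plan
    print(f"sample {n} pairs per batch; review the batch if more than {c} match")
```

For small batches a hypergeometric model would be the more exact choice; the binomial is used here only to keep the sketch short.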
11
[Figure: P(send for manual review), from 0 to 100, plotted against Actual Quality Level, from 0 to 100]
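A curve of this kind can be tabulated directly. The sketch below assumes a binomial single sampling plan in which a batch is sent for manual review when more than c of the n sampled pairs are matches. The sample size of 65 echoes the per-batch sample in the case study, but the acceptance number c = 6 is purely hypothetical, as is the function name.

```python
from math import comb


def p_send_for_review(match_rate: float, n: int = 65, c: int = 6) -> float:
    """P(more than c matches among n sampled pairs) under a binomial model."""
    p_auto_reject = sum(
        comb(n, k) * match_rate**k * (1 - match_rate) ** (n - k)
        for k in range(c + 1)
    )
    return 1.0 - p_auto_reject


# Tabulate the curve at the tick marks used on the plot axes (0 to 100 percent).
curve = [(pct, p_send_for_review(pct / 100)) for pct in range(0, 101, 20)]
for pct, prob in curve:
    print(f"actual match rate {pct:3d}% -> P(review) {100 * prob:5.1f}%")
```

Raising c shifts the curve to the right (fewer batches reviewed), while raising n makes it steeper, so the plan discriminates more sharply between low and high match rates.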
12
(No Transcript)
13
Setting a Single Cut-off
  • Sometimes there are not enough fields to do
    meaningful clerical review
  • Particularly, when we are not using names and
    addresses
  • In these cases we want to meaningfully set a
    single cut-off

14
[Figure: Estimated Cumulative Matches Linked and Estimated Cumulative Non-Matches Linked, plotted against Comparison Weight]
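One way to read such a plot when choosing a single cut-off is sketched below. It assumes that pairs are grouped into comparison-weight bins and that a match rate has been estimated for each bin from a clerically reviewed sample. The bin values, the `WeightBin` type, and the rule "lowest cut-off whose estimated share of non-matches among linked pairs stays under a tolerance" are all hypothetical illustrations, not the presenter's procedure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WeightBin:
    weight: float       # comparison weight of the bin
    pairs: int          # number of record pairs in the bin
    match_rate: float   # estimated from a clerically reviewed sample


def cumulative_links(bins: List[WeightBin]) -> List[Tuple[float, float, float]]:
    """Estimated matches and non-matches linked if the cut-off is set at each bin."""
    rows = []
    matches = non_matches = 0.0
    for b in sorted(bins, key=lambda b: b.weight, reverse=True):
        matches += b.pairs * b.match_rate
        non_matches += b.pairs * (1 - b.match_rate)
        rows.append((b.weight, matches, non_matches))
    return rows


def choose_cutoff(bins: List[WeightBin], max_false_share: float) -> Optional[float]:
    """Lowest cut-off whose estimated false-link share stays within the tolerance."""
    best = None
    for weight, matches, non_matches in cumulative_links(bins):
        linked = matches + non_matches
        if linked > 0 and non_matches / linked <= max_false_share:
            best = weight
    return best


# Hypothetical bins, for illustration only.
bins = [
    WeightBin(20.0, 1000, 0.99),
    WeightBin(15.0, 800, 0.95),
    WeightBin(10.0, 600, 0.70),
    WeightBin(5.0, 900, 0.20),
]
cutoff = choose_cutoff(bins, max_false_share=0.05)
print(cutoff)
```

Lowering the cut-off links more matches but also more non-matches, so the tolerance makes that trade-off explicit.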
15
Case Study
  • Migrants Settlement Database to Census 2006
    linkage
  • 131,000 records identified for clerical review
  • The sampling scheme used 50 batches, from each of
    which 65 record pairs were selected
  • Only 39 batches were actually inspected, for a
    total of 2,535 record pairs
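The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the share of pairs inspected is a derived figure rather than one stated in the presentation.

```python
PAIRS_FOR_REVIEW = 131_000   # record pairs identified for clerical review
BATCHES_PLANNED = 50
BATCHES_INSPECTED = 39
SAMPLE_PER_BATCH = 65        # record pairs selected from each batch

planned = BATCHES_PLANNED * SAMPLE_PER_BATCH        # 3250 pairs in the full scheme
inspected = BATCHES_INSPECTED * SAMPLE_PER_BATCH    # 2535 pairs actually inspected
share = inspected / PAIRS_FOR_REVIEW                # fraction of all review pairs

print(f"planned sample:   {planned}")
print(f"inspected sample: {inspected}")
print(f"share of clerical review pairs inspected: {100 * share:.1f}%")
```

On these numbers, roughly 2 percent of the pairs identified for clerical review were inspected by hand.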

16
Final Remarks
  • Sampling based clerical review can very
    significantly reduce the amount of clerical
    review
  • Can be used to rigorously set cut-offs
  • Provides information on linkage quality
  • Can introduce missed and false links, but the
    extent of these can be estimated

17
Thank you for your attention
Any Questions?