Title: Data Quality Indicators
1Data Quality Indicators (DQIs): What are they, and
how do they affect me?
2DQIs Defined
- DQIs are quantitative and qualitative measures of
principal quality attributes
- precision
- bias
- representativeness
- comparability
- completeness
- sensitivity
- Quantitative DQIs
- precision, bias, and sensitivity
- Qualitative DQIs
- representativeness, comparability, and
completeness
3The Hierarchy of Quality Terms
- DQOs: qualitative and quantitative study objectives
- Attributes: descriptive qualitative and quantitative aspects
of collected data
- DQIs: indicators of the quality attributes
- MQOs: acceptance criteria for the quality attributes
measured by project DQIs
4DQIs in the Project Life Cycle
5Precision
- Precision is the measure of agreement among
repeated measurements of the same property under
identical or substantially similar conditions
- properties in environmental studies
- concentration of a contaminant
- physical measurement of some media
- A precision DQI is a quantitative indicator of
the random errors or fluctuations in the
measurement process
- e.g., standard deviation or variance
6Bias
- Bias is systematic or persistent distortion of a
measurement process that causes error in one
direction
- A bias DQI is a quantitative indicator of the
magnitude of systematic error resulting from
- biased sampling design
- calibration errors
- response factor shifts
- unaccounted-for interferences
- chronic sample contamination
- e.g., instrument reads XX mg/L too high
7Accuracy
- Accuracy is composed of precision and bias
- Accuracy is a measure of the overall agreement
of a measurement to a known value
- when random errors are tightly controlled, bias
dominates the overall accuracy
- when random errors predominate, variance
dominates the overall accuracy
- EPA policy is to use bias and precision as separate
measures rather than accuracy
8Influence of Bias and Imprecision on Overall
Accuracy
Precise and biased
Imprecise and biased
Imprecise and unbiased
Precise and unbiased
9Representativeness
- Representativeness is the measure of the degree
to which data suitably represent a characteristic
of a population, parameter variations at a
sampling point, a process condition, or an
environmental condition
- representativeness DQIs are qualitative and
quantitative statements regarding the degree to
which data reflect the true characteristics of a
well-defined population
- e.g., these samples are representative of surface
soil to be found in a specific area of XX square
meters
10Comparability
- Comparability is a qualitative expression of the
measure of confidence that two or more data sets
may contribute to a common analysis
- a comparability DQI is a qualitative indicator of
the similarity of attributes of data sets
- e.g., groundwater data sets are comparable as
they share a common preparation and analytical
method operated under similar conditions
11Completeness
- Completeness is a measure of the amount of valid
data obtained from a measurement system,
expressed as a percentage of the number of valid
measurements that should have been collected
- the DQI for completeness is often expressed as a
percentage
- e.g., the percentage of valid samples for which
data for all analytes of interest were reported
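The completeness percentage described above can be sketched in a few lines. This is only an illustration: the function name `percent_completeness` and the counts (47 valid results out of 50 planned) are hypothetical and do not come from the course material.

```python
def percent_completeness(valid_count: int, planned_count: int) -> float:
    """Return valid measurements as a percentage of those planned."""
    if planned_count <= 0:
        raise ValueError("planned_count must be positive")
    return 100.0 * valid_count / planned_count


# Hypothetical project: 47 valid results out of 50 planned measurements.
completeness = percent_completeness(47, 50)
print(f"Completeness: {completeness:.1f}%")
```

A project would compare this figure against whatever completeness MQO was set during planning.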
12Sensitivity
- Sensitivity is the capability of a method or
instrument to discriminate between measurement
responses representing different levels of the
variable of interest
- Sensitivity can be regarded as detection limit
- but this term is often used without defining what
is intended (minimum detection or quantitation)
- A sensitivity DQI describes the capability of
measuring a constituent at low levels
- a Practical Quantitation Level (PQL) describes the
ability to quantify a constituent with known
certainty
- e.g., a PQL of 0.05 mg/L for mercury represents
the level where a precision of +/- 15% can be
obtained
13Verification
- Data verification refers to the procedures needed
to ensure that a set of data is a faithful
reflection of all the processes and procedures
used to generate the data
- verification entails the examination of objective
evidence that the specified method, procedures,
and contractual requirements were fulfilled
14Validation
- Data validation is an analyte and sample
matrix-specific process to determine the
analytical quality of a specific data set
- validation entails the inspection of data
handling practices for deviations from
consistency, the review of quality control (QC)
information for deviations, assessment of
deviations, and assignment of data qualification
codes
- Validation can entail the examination of the data
with respect to the QA Project Plan
15Integrity
- Lack of integrity affects all aspects of data
interpretation, especially data used for decision
making
- Lack of integrity includes
- manipulation of QC measurements
- drylabbing (complete falsification of data)
- manipulation of results during analysis
- failure to conduct required analytical steps
- post-analysis alteration of results
16After Verification and Validation
- The data set is then analyzed by comparing
the results to the original objectives. In many
cases this is a comparison of the results to the
project's DQOs using data quality assessment
- Data quality assessment, a five-step process
- review of DQOs and sample design
- preliminary data review
- selection of statistical test
- verification of assumptions
- drawing conclusions from the data
- ...but that's another course!
17Representativeness: Statistical and
Conceptual Model-Based Approaches
18Representativeness
- Representativeness is the measure of the degree
to which data suitably represent a characteristic
of a population, parameter variations at a
sampling point, a process condition, or an
environmental condition
- Representativeness DQIs are qualitative and
quantitative statements regarding the degree to
which data reflect the true characteristics of a
well-defined population
19What Does "Representativeness" Mean?
- Very vaguely defined in working English
- "seal of approval" by simple statement of writer
- there is an absence of biasing forces
- it is a miniature or replica of the population
- it is a typical or ideal case
- there is wide coverage of a population
- it enables good estimation
- it is good enough for the purposes of the study
- statistically based sampling method
20"Representativeness" at EPA
- "...expected to exhibit the average properties of
the universe or whole" (RCRA, 40 CFR 260.10)
- "...at locations representative of the air
entering the abatement site" (TSCA, 40 CFR 763)
- "...should be selected on the basis of spatial
and temporal representativeness" (Air Programs,
40 CFR 51, Appendix W)
- "...samples should be representative of daily
operations" (Water Programs, 40 CFR 403.12(b))
21Achieving Representativeness Involves a Process
- Planning, design, and assessment
- careful attention to measurement and analysis
process
- consideration of the size (amount of material)
and method for sample collection and handling
- determination of adequate type, location, timing,
and number of samples to be taken
- defensible approach for drawing inferences from
sample data to the target population
- Sample design and measurement processes should
minimize unintentional bias
22The Process Involves Evaluating Both Micro and
Macro Scales
- Micro scale
- how well measurements taken within a sampling
unit reflect that unit
- (e.g., "parameter variations at a sampling point")
- Macro scale
- degree to which measurements from a set of
sampling units reflect the population of interest
- (e.g., "accurately and precisely represent a
characteristic of a population")
23Micro Scale (Within-Sampling-Unit)
Representativeness
- An appropriate quality system to ensure quality
implementation and sample integrity
- Carefully defined sampling units with correct
sampling procedures and equipment
- Adequate sample support (amount of material) to
make inferences about the characteristics within
the sampling unit
- Appropriate analytical methods (including sample
preparation), designed to achieve MQOs for
measurement precision, bias, and sensitivity
24What is a Sampling Unit?
- A sampling unit (SU) can be defined as the
portion of the environment for which a
measurement has meaning for its intended use
- Defining SUs for a project allows us to
communicate more clearly about components of
total-study precision
25Specifying Sampling Units
- SUs can vary depending on the specific problem;
they can be
- as small as the physical sample itself
- something encompassing multiple physical samples
- something much larger
- In classical survey design (e.g., opinion survey)
the SU is typically an individual
- SUs are less well defined in other types of
surveys (e.g., in a survey to determine blood
lead levels)
- in this case, a blood sample is much smaller than
the individual: is the sampling unit the sample
of blood, or the person?
- Consider how data will be used
- average over multiple units, the spatial
distribution of units, or some combination?
26Alternative Sampling Unit Definitions
- Default SU Definition
- equivalent to the physical sample (specimen)
taken
- Alternative SU Definitions
- units comprised of multiple specimens in order to
obtain enough of the medium to perform all
desired analyses
- units of a size adequate to collect multiple
specimens (such as collocated samples)
- units defined to include a group of specimens
when individual specimens are not the unit of
interest
27Choice of Sampling Unit - What Does a Sample
Represent?
A small farm
6-in. core
1/8-acre area
28Choice of Sampling Unit - What Does a Sample
Represent?
A cove or other discrete area
25-ft area around a boat
Single ponar grab sample
29Sampling Theory Within-SU Error
- To what degree is heterogeneity within a sampling
unit inherent?
- Gy refers to this as the constitution
heterogeneity. No amount of mixing or
homogenization can reduce this.
- Constitution heterogeneity leads to fundamental
error
- Fundamental errors are negligible for liquids and
gases without suspended solids.
30Sampling Theory Within-SU Error
- What is the distribution and variance between
small increments of the media?
- Gy refers to this as distribution heterogeneity,
which reflects the distribution of groups of some
number of neighboring fragments.
- Grouping and segregation errors result from
distribution heterogeneity
- minimize these errors by taking more increments
to form a sample of the required weight
31Heterogeneity of Pollutants Can Lead to Sampling
Errors
- h1: small scale (random fluctuations)
- h2: large scale (trends, nonrandom, bias)
- h3: cyclic phenomena
- h = h1 + h2 + h3
- Each of these components of heterogeneity leads to
errors
- Experiments to characterize these components
(using variograms) allow one to optimize a design
32Controlling Sampling Errors
- Ensure the field sampling protocol does not
distort or bias the sample
- should be capable of ensuring all parts of the
media (e.g., all particle sizes) have the same
probability of being included in the increment
obtained to form a sample
- Ensure the laboratory subsamples represent all
the particle-size fractions
- subsamples must be large enough (optimal sample
weight) to accommodate the range of particle sizes
- samples and subsamples should be comprised of as
many correctly obtained increments as possible
33Sampling Theory Raises Important Questions
Related to Within-SU Error
- What is the correct scale at which to sample?
- What is the correct protocol for obtaining
increments to form samples of the media of
interest?
- pilot studies needed to determine the nature of
the heterogeneity
- If contaminants are highly clustered on a scale
smaller than the scale of real concern, small
grabs will reveal varied results
- If homogenization and subsampling do not remove
clustering, representation of a sampling unit
from a single sample will not be achievable
- sampling protocols should be selected that do not
alter the characteristics of the media (e.g.,
particle-size composition)
34Empirical versus Model-Based Approaches
- Empirical statistical approach
- Involves a probability-based design to represent
the target population empirically
- Conceptual model-based approach
- Involves developing a plan to link the results of
statistical tests about portions of the physical
system (or processes) to the target population
- Both approaches are concerned with the
degree to which meaningful inferences can be
drawn from the results of one or more studies
35Conceptual Model-Based Approach
- Establish a conceptual site model (CSM)
- Specify hypotheses about CSM to be tested
- subpopulations to be compared
- size of the sampling units
- Develop a sound sampling plan
- choose statistical or judgmental design
- ensure sampling units are representative of
subpopulations to be compared
- Evaluate process for drawing inferences
- determine the degree of scientific judgment
required to draw inferences to population and
subpopulations of interest
- inferences based on outcome of statistical tests
may not be valid if a non-representative sample
is used
36Conceptual Model-Based Example
Hypotheses
- Higher levels of metals are associated with fine
particles
- The proportion of fine particles is greatest in
overbanks, medium in sandbars, and least in the
open channel
- Concentrations within a feature type decrease
with distance from source
Figure labels: Source, Overbank, Sandbar, Open Channel
37Conceptual Model-Based Example
- Physical conceptual model
- source contaminants introduced to stream system
- metals are associated with particulates in the
stream
- particles sort out by complex physical processes
and tend to settle out in a nonrandom manner
- concentrations expected to be higher in some
feature types and locations, and to decrease with
distance from source
- Study design options to support risk assessment
- empirical probabilistic design of population of
interest
- series of studies that represent the physical
process
- test and confirm specific hypotheses
- narrow the definition of population of interest
- perform focused sampling of specific areas
38Conceptual Model-Based Design
- Stream divided into three reaches: upper, middle,
and lower (concentrations expected to decrease)
- Samples representative of each geomorphic feature
(active channel, sand bar, and overbank) within
the three reaches will be taken
- three depth increments: shallow, deep, very deep
- samples will be sieved and weighed relative to
pre-sieved samples
- concentrations of metals of concern in each
fraction
- Sample sizes estimated based on number of samples
needed to yield a 90% confidence of detecting a
20% difference between features, size fractions,
and reaches
39Classical Statistical Approach
- Define the population of interest
- spatial and temporal boundaries
- sampling units
- Develop a statistical sampling plan
- probability-based design, every sampling unit has
a known probability of inclusion
- Evaluate process for drawing inferences from data
- how well the sampling units selected represent
the available population
- how data will be used to estimate target
population parameters such as the mean and
variance
- how well the sampled population allows inferences
to be made about the target population
40Strategies for Improving Within-Sampling-Unit
Representativeness
- Utilize within-sampling-unit replication
- can reduce the standard deviation of the average of
n replicates by a factor of $n^{-1/2}$
- Utilize within-sampling-unit compositing
- increasing the number of increments in the sample
reduces the variability of the unit average
- Increase the sample support area or volume
- expanding the definition of what area or volume
the analytical measurement will represent can
alleviate small-scale (or short-term) fluctuations
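The effect of replication on the variability of a unit average can be sketched as follows. The sketch assumes the replicates are independent and share the same standard deviation, which the slides do not state explicitly. The function name `sd_of_mean` and the starting value of 0.15 are hypothetical.

```python
import math


def sd_of_mean(sd_single: float, n: int) -> float:
    """Standard deviation of the average of n independent replicates."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd_single / math.sqrt(n)


# Hypothetical within-unit standard deviation of a single measurement.
sd_single = 0.15
for n in (1, 2, 4, 9):
    print(n, round(sd_of_mean(sd_single, n), 4))
```

Because the gain scales with the square root of n, each additional replicate buys less than the one before it. This is one reason to weigh replication against compositing or larger sample support.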
41Statistical Strategies for Improving
Between-Sampling-Unit Representativeness
- Statistical sampling schemes
- simple random sampling
- systematic (grid) sampling
- stratified random sampling
- ranked set sampling
- cluster sampling
- between-sampling-unit composite sampling
- Where to find information on environmental
applications of these schemes?
- Guidance on Choosing a Sampling Design for
Environmental Data Collection (EPA QA/G-5S)
42Balanced Design to Achieve Representativeness
- Understanding the relative contribution of
within-sampling-unit and between-sampling-unit
variance
- focus on components of variance to which the
total variability is most sensitive
More samples to lower between-sampling-unit variance
More precise measurements to lower
within-sampling-unit variance
43Assessing Representativeness
- Evaluating existing data
- representativeness affects the degree to which a
data set can be used for a purpose other than
originally intended
- Use of a checklist
- promotes a thorough evaluation of the attributes
of representativeness
- Use of quality assessment samples such as
collocates, splits, or other replicates
- can assist in answering questions about
within-sampling-unit representativeness
44Important Attributes (Micro-level)
- Was a rationale provided to support the selection
of sampling equipment and handling procedures?
- correct choice of equipment and handling
procedures directly affects the degree to which the
increments and samples reflect the
characteristics of the matrix
- Was the rationale to support selection of
analytical methods provided?
- choice of sample preparation and analytical
instrument is critical
- Were samples collected from all selected sampling
units?
- incomplete sampling, if biased due to the lack of
completeness, can lead to spurious conclusions
45Important Attributes (Macro-level)
- Were study objectives adequately defined using
the DQO process or equivalent planning process?
- intended use of data provides the context for
evaluating representativeness
- Was the population of interest clearly defined?
- probability-based designs require the population
to be defined as a set of sampling units
- Was the statistical basis for the sampling plan
explained (number of samples, their allocation)?
- representativeness hinges on adequate number of
samples
- different sample allocation approaches can
maximize effectiveness
46Case Study Waste Stream Characterization
- Heterogeneous, radioactive debris
- Mixture of media including
- Plastic
- Rubber
- HEPA filter
- Metal
- Combustible/Rags
47Packaging of the Waste Stream
- Waste items were initially placed into individual
bags that were taped and numbered.
- The individual bags were placed into lined drums
referred to as parent drums.
- Parent drums contained 1-15 bags of waste.
48Repackaging of the Waste Stream
- Some parent drums had to be repackaged in order
to reduce the amount of nuclear material in each
drum to meet transportation requirements.
- During repackaging, some or all of the drum
contents were placed into one or more daughter
drums.
- If requirements were not met for a daughter
drum, this drum was repackaged into
granddaughter drums.
49Repackaging Illustration
50Drums And Bags For Sample Selection
51Intent and Method of Sampling
- To determine if any RCRA-listed waste was present
at levels exceeding the TCLP limits for burial in
a non-mixed-waste facility
- Confirm existing process knowledge
- Random selection of
- Drums within waste stream
- Bags within drums
- Media within bags
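The three-stage random selection (drums, then bags, then media) can be sketched as below. This is a simplified illustration, not the procedure used in the case study. The inventory, the drum and bag identifiers, and the function name `select_multistage` are all hypothetical, and the sketch enforces neither the per-medium quota nor any special handling of repackaged drums.

```python
import random

# Hypothetical inventory: drum ID -> bag ID -> media found in that bag.
inventory = {
    "D-001": {"B-01": ["plastic", "rubber"], "B-02": ["metal"]},
    "D-002": {"B-03": ["HEPA filter", "metal"]},
    "D-003": {"B-04": ["combustible/rags"], "B-05": ["plastic"]},
}


def select_multistage(inventory, n_drums, rng):
    """Pick drums at random, then one bag per drum, then one medium per bag."""
    drums = rng.sample(sorted(inventory), n_drums)
    selections = []
    for drum in drums:
        bag = rng.choice(sorted(inventory[drum]))
        medium = rng.choice(inventory[drum][bag])
        selections.append((drum, bag, medium))
    return selections


# A fixed seed lets the random selection be documented and repeated.
rng = random.Random(2024)
selections = select_multistage(inventory, n_drums=2, rng=rng)
for drum, bag, medium in selections:
    print(drum, bag, medium)
```

Recording the seed and the inventory used at the time of selection is one way to document that the selection was random.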
52Constraints on Sample Selection
- Must minimize number of samples due to high
radiation levels.
- Samples must be selected randomly.
- Parent drums may not exist due to repackaging.
Other generational (daughter, granddaughter,
etc.) drums must be included for possible
selection.
- At least 2 samples must be taken from each
medium.
53Estimating Variability (Variance)
- To determine the number of samples necessary, an
estimate of the variability of the inorganics
concentrations was needed.
- Similar media, without radiation contamination
(cold), were analyzed for inorganics.
- Sample size calculations for actual waste stream
were based on these variance estimates.
- Combination of random, multistage, and quota
sampling methods was selected.
54Case Study Conclusions
- Well-documented representativeness of the samples
to the population made this study nearly
irrefutable.
- Just because an SOP/QAPP is written doesn't mean
it will be read or properly followed.
- When every sample is potentially dangerous to the
collector, transporter, and chemist, it is worth
every effort to minimize the risks due to
sampling.
55Precision Indicators Reflective of the Data
Collection Life Cycle
Planning
Implementation
Assessment
56Precision
- "Precision is the measure of agreement among
repeated measurements of the same property under
identical or substantially similar conditions."
- properties in environmental studies
- concentration of a contaminant
- physical measurement (e.g., grain size) of some
media
- a precision DQI is a quantitative indicator of
the random errors or fluctuations in the
measurement process
57Common Indicators of Precision
- Range
- difference between largest and smallest values
- Variance or standard deviation
- a statistical measure of the spread of data
calculated from two or more measured values
- the standard deviation is the square root of the
variance
- Relative range
- the range divided by the mean of the data set
- Relative standard deviation (CV)
- the standard deviation calculated from two or
more values divided by the mean of those values
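The indicators listed above can be computed directly from a set of replicate values. The sketch below uses the sample (n - 1) form of the variance, which is an assumption because the slides do not say which form is intended. The function name `precision_indicators` and the replicate values are hypothetical.

```python
import statistics


def precision_indicators(values):
    """Range, variance, SD, relative range, and RSD for replicate values."""
    if len(values) < 2:
        raise ValueError("need at least two measured values")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("relative indicators are undefined for a zero mean")
    value_range = max(values) - min(values)
    variance = statistics.variance(values)  # sample variance (n - 1 divisor)
    sd = statistics.stdev(values)           # square root of the variance
    return {
        "range": value_range,
        "variance": variance,
        "sd": sd,
        "relative_range": value_range / mean,
        "rsd": sd / mean,
    }


# Hypothetical replicate results (mg/L) for one sample.
replicates = [10.0, 12.0, 11.0]
result = precision_indicators(replicates)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The relative forms (relative range and RSD) are unitless, so they are the ones that can be compared across analytes measured at different concentration levels.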
58Framework for Evaluating Indicators of Precision
- A simple model allows us to evaluate the
components and indicators of total-study
variability
- within-sampling-unit variability
- measurement process
- small-scale variability
- sample acquisition
- between-sampling-unit variability
- inherent spatial variability
- sampling design error
Total-Study Variability
Within-Sampling-Unit Variability
Between-Sampling-Unit Variability
59Simple Total-Study Variability Model
Total-Study Variability
Between- Sampling-Unit Variability
Within- Sampling-Unit Variability
Small-Scale Variability (within unit)
Inherent Spatial Variability (among units)
Sampling Design Error
Sample Collection and Measurement Process
Variability
60Sampling Units
- A sampling unit (SU) can be defined as the
portion of the environment for which a
measurement has meaning for its intended use
- Defining SUs for a project allows us to
communicate more clearly about components of
total-study precision
61Specifying Sampling Units
- SUs can vary depending on the specific problem;
they can be
- as small as the physical sample itself
- something encompassing multiple physical samples
- something much larger
- In classical survey design (e.g., opinion survey)
the SU is typically an individual
- SUs are less well defined in other types of
surveys (e.g., in a survey to determine blood
lead levels)
- in this case, a blood sample is much smaller than
the individual: is the sampling unit the sample
of blood, or the person?
- Consider how data will be used
- average over multiple units, the spatial
distribution of units, or some combination?
62Alternative Sampling Unit Definitions
- Default SU Definition
- equivalent to the physical sample (specimen)
taken
- Alternative SU Definitions
- units comprised of multiple specimens in order to
obtain enough of the medium to perform all
desired analyses
- units of a size adequate to collect multiple
specimens (such as collocated samples)
- units uniquely defined to measure properties of
interest when a specimen is not the unit of
interest, nearby specimens are highly correlated,
or there is an explicit desire to control the
precision within the unit
63Evaluating Sampling Unit Definitions
- Defining SUs larger than the physical specimen
has some potential benefits
- clarifies whether collocated samples should be
treated as additional field samples or replicates
- forces the consideration of the scale at which
measurements have meaning
- facilitates a more comprehensive consideration of
sources of error affecting our understanding of
properties of interest in the environment, and
sources of variability affecting individual
measurements
- Most study designs do not account for
within-sampling-unit variability in any explicit
way
- tradeoffs between fewer precise measurements
versus more imprecise measurements begin to
address this issue
64Sampling Theory Raises Important Questions
Related to Within-SU Error
- What is the correct scale at which to sample?
- What is the correct protocol for obtaining
specimens?
- pilot studies needed to determine the nature of
the heterogeneity
- if contaminants are highly clustered on a scale
smaller than the scale of real concern, small
grabs will reveal varied results
- if homogenization and subsampling do not remove
clustering, representation of a sampling unit
from a single sample will not be achievable
- sampling protocols should be selected that do not
alter the characteristics of the media (e.g.,
particle-size composition)
65Components of Within-Sampling Unit Precision
Within- Sampling-Unit Variability
Small-Scale Variability (within unit)
Sample Collection and Measurement Process
Measurement Method Imprecision
Within Sample Variability
Inherent small-scale variability
Analytical instrument
Sample handling and preparation
Subsampling or homogenization
66QA Samples Used to Evaluate Components of
Total-Study Variability
67Total Within-Sampling-Unit Precision Pyramid
68Components of Total-Study Variability Example
QC Sample | Components of Variability Captured | Estimated Standard Deviation
Instrument replicate | Instrument response (IR) | 0.046
Laboratory replicate | IR + subsampling and extraction/digestion (SE) | 0.11
Laboratory split | IR + SE + lab homogenization (LH) | 0.12
Field split | IR + SE + LH + sample handling (SH) | 0.12
Collocated samples | IR + SE + LH + SH + field sample acquisition and small-scale variability (ASV) | 0.15
Field samples | IR + SE + LH + SH + ASV + between-sampling-unit variability | 1.11
69A Simple Additive Variance Model
$s_t^2 = s_b^2 + s_w^2$
$s_t^2 = s_b^2 + s_m^2 + s_s^2$
where t = total study, w = within-sampling-unit, b =
between-sampling-unit, m = measurement, s =
small-scale variability
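If the components are taken to be independent and additive, as this model assumes, the cumulative standard deviations from the QC-sample example can be differenced to isolate each component. The sketch below is illustrative only. The function name `incremental_variances` is hypothetical, and negative differences (which can arise from estimation noise) are clamped to zero as a simplifying choice.

```python
# Cumulative standard deviations from the QC-sample example, ordered from the
# narrowest QC sample (instrument replicate) to the broadest (field samples).
cumulative_sd = [
    ("instrument response", 0.046),                      # instrument replicate
    ("subsampling and extraction/digestion", 0.11),      # laboratory replicate
    ("lab homogenization", 0.12),                        # laboratory split
    ("sample handling", 0.12),                           # field split
    ("acquisition and small-scale variability", 0.15),   # collocated samples
    ("between-sampling-unit variability", 1.11),         # field samples
]


def incremental_variances(levels):
    """Difference successive cumulative variances to isolate each component."""
    components = {}
    previous_variance = 0.0
    for name, sd in levels:
        variance = sd ** 2
        # Clamp at zero: a negative difference would only reflect noise.
        components[name] = max(variance - previous_variance, 0.0)
        previous_variance = variance
    return components


components = incremental_variances(cumulative_sd)
total_variance = sum(components.values())
for name, variance in components.items():
    print(f"{name}: {variance:.4f} ({100 * variance / total_variance:.1f}% of total)")
```

Expressing each component as a share of the total makes it easy to see which source dominates and therefore where an MQO or additional samples would matter most.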
70Visualizing the Contribution of Components of
Total-Study Variance
Figure labels: total-study variability st (field samples);
within-sampling-unit variability sw (collocates);
between-sampling-unit variability sb
71Visualizing the Within-Sampling-Unit Components
of Variability
Figure labels:
- Analytical-instrument variability (instrument replicates): 0.046
- Sample-preparation variability: 0.10
- Measurement variability (lab replicates): 0.11
- Small-scale variability: 0.10
- Within-sampling-unit variability sw (collocates): 0.15
- Between-sampling-unit variability sb
- Total-study variability st (field samples)
72Calculating Variance
- Total variance
- Within-sampling-unit variance
- for duplicates for multiple replicates
- Between-sampling-unit variance
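The formulas for this slide did not survive in the text. The block below gives standard estimators as a hedged reconstruction, not necessarily the exact expressions shown in the original. They are consistent with the lead values in the worked example that follows. Here $x_i$ are the $n$ individual results, $d_j$ is the difference between the two results of duplicate pair $j$ (of $k$ pairs), and $s_j^2$ is the variance of replicate group $j$ with $n_j$ results.

```latex
% Total-study variance (sample variance of all results)
s_t^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

% Within-sampling-unit variance from k duplicate pairs
s_w^2 = \frac{1}{2k} \sum_{j=1}^{k} d_j^2

% Within-sampling-unit variance from groups of multiple replicates (pooled)
s_w^2 = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{\sum_{j=1}^{k} (n_j - 1)}

% Between-sampling-unit variance by difference (additive model)
s_b^2 = s_t^2 - s_w^2
```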
73Calculating Components of Variance from an
Existing Data Set
Sample ID | Arsenic (ppm) | Cadmium (ppm) | Lead (ppm) | Sample Type
99-7510 | 3.9 | <0.07 | 22 |
99-7510 | 2.2 | <0.06 | 20 | lab replicate
99-7511 | 3.1 | <0.07 | 104 |
99-7512 | 2.6 | 5.3 | 37.6 |
99-7512a | 2 | 5.9 | 34.8 | field split
99-7513 | 2.4 | <0.07 | 782 |
99-7513 | 4.6 | <0.07 | 829 | lab replicate
99-7514 | 2.5 | 0.47 | 35.9 |
99-7514a | 3.1 | <0.07 | 37.7 | field split
99-7515 | 4.4 | 1.4 | 17.5 |
99-7515a | 2.9 | 1.6 | 28.3 | field split
99-7516 | 3.2 | 4.5 | 55.2 |
99-7517 | 2.8 | 5.1 | 921 |
99-7517 | 3.5 | 5.6 | 902 | lab replicate
Standard deviation | 0.78 | 2.5 | 390 |
74Calculating Components of Variance from an
Existing Data Set (cont.)
- Within-sampling-unit variance for lead may be
calculated using the field split data
- Based on all samples, the total-study variance
for lead is estimated by the sample variance of
all results
- Between-sampling-unit variance for lead is
calculated as the total-study variance minus the
within-sampling-unit variance
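The lead calculation can be reproduced from the example data set. The sketch below assumes the duplicate-pair estimator (sum of squared differences divided by twice the number of pairs) for the within-sampling-unit variance and the ordinary sample variance of all 14 results for the total. With those assumptions it gives the lead values reported in the summary table (21.3, 148,876, and 148,897). The function name `within_unit_variance` is hypothetical.

```python
import statistics

# Lead results (ppm) for all 14 analyses in the example data set.
lead_all = [22, 20, 104, 37.6, 34.8, 782, 829,
            35.9, 37.7, 17.5, 28.3, 55.2, 921, 902]

# Field-split pairs (original, split) for lead: samples 7512, 7514, and 7515.
lead_field_splits = [(37.6, 34.8), (35.9, 37.7), (17.5, 28.3)]


def within_unit_variance(pairs):
    """Variance from duplicate pairs: sum of squared differences / (2 * k)."""
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


s2_total = statistics.variance(lead_all)           # total-study variance
s2_within = within_unit_variance(lead_field_splits)
s2_between = s2_total - s2_within                  # additive model, by difference

print(round(s2_within, 1), round(s2_between), round(s2_total))
```

The same steps do not obviously reproduce the arsenic and cadmium entries from the field splits alone, so those columns may rest on a different choice of QC pairs or a different treatment of non-detects.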
75Using Indicators of Variance
Component | Arsenic | Cadmium | Lead
$s_w^2$ | 0.25 | 0.13 | 21.3
$s_b^2$ | 0.36 | 6.01 | 148,876
$s_t^2$ | 0.61 | 6.14 | 148,897
- For cadmium and lead, the total variability is
swamped by between-sampling-unit variability
(i.e., site heterogeneity)
- Arsenic probably near background; most
variability comes from measurement process
76Establishing MQOs
- Decomposing total-study variance facilitates the
identification of the relative importance of
components of total error
- this exercise also helps determine what kind of
QA samples to employ
- Total-study variance estimates are plugged
directly into sample-size calculations
- Individual Measurement Quality Objectives (MQOs)
should be established for components of variance
that primarily drive the total variability
- MQOs on specific measurement components must
reflect the requirements for total-study error
77Strategies for Reducing Within-Sampling-Unit
Variance
- Replication
- Small-scale compositing
- Increasing sample support
- More precise measurement method
78Strategies for Reducing Between-Sampling-Unit
Variance
- Statistical sampling design
- value of pilot studies to obtain relevant
estimates of variance
- value of more samples versus expending effort to
reduce within-sampling-unit variance
- use of field screening samples in conjunction
with traditional fixed-laboratory analyses
- See Guidance on Choosing a Sampling Design for
Environmental Data Collection (EPA QA/G-5S)
79Bias Analysis and Prevention
80Bias
- $\text{Bias} = \text{measured result} - \text{true value}$
- $\text{Relative bias} = \dfrac{\text{measured result} - \text{true value}}{\text{true value}}$
- When dealing with recovery rates
- $\text{Recovery} = 1 + \dfrac{\text{measured result} - \text{true value}}{\text{true value}}$,
expressed as a percentage
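These three quantities can be sketched for a single spiked sample as follows. The function names and the spike values (0.50 mg/L added, 0.46 mg/L measured) are hypothetical and chosen only to illustrate the arithmetic.

```python
def bias(measured: float, true_value: float) -> float:
    """Measured result minus true value, in measurement units."""
    return measured - true_value


def relative_bias(measured: float, true_value: float) -> float:
    """Bias as a fraction of the true value."""
    if true_value == 0:
        raise ValueError("true_value must be nonzero")
    return (measured - true_value) / true_value


def percent_recovery(measured: float, true_value: float) -> float:
    """Recovery = 1 + relative bias, expressed as a percentage."""
    return 100.0 * (1.0 + relative_bias(measured, true_value))


# Hypothetical spiked sample: 0.50 mg/L added, 0.46 mg/L measured.
spike_true, spike_measured = 0.50, 0.46
print(f"bias: {bias(spike_measured, spike_true):+.2f} mg/L")
print(f"relative bias: {relative_bias(spike_measured, spike_true):+.1%}")
print(f"recovery: {percent_recovery(spike_measured, spike_true):.0f}%")
```

Note that recovery simplifies to the measured result divided by the true value, so a recovery below 100% corresponds to negative bias.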
81Principal Causes of Bias
- Incomplete data
- Analytical
- calibration error
- sample contamination
- matrix effects
- interferences
- Sampling
- incorrect location identification
- judgmental sampling scheme
82Bias Due to Incompleteness
- Example: the objective is to estimate the
percentage of correctly documented permits for
exemption
- Data obtained by requesting the holders of
permits to respond; 60% responded to the request
- Of these responses, 70% were correctly
documented; does the 40% non-response rate really
matter?
- $\text{True percentage} = (\text{respondents} \times \text{their percentage}) + (\text{non-respondents} \times \text{their percentage})$
- $\text{Bias} = \text{non-respondents} \times (\text{difference in percentages})$
83Bias Due to Incomplete Response
- If non-responses were 70% correctly documented
- Bias = 0, correct estimate is 70%
- If non-responses were 50% correctly documented
- Bias = 8 percentage points, correct estimate is 62%
- If non-responses were 30% correctly documented
- Bias = 16 percentage points, correct estimate is 54%
- If non-responses were 10% correctly documented
- Bias = 24 percentage points, correct estimate is 46%
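The four scenarios follow from the permit example, in which 60% of holders responded and 70% of responses were correctly documented. A short sketch of the arithmetic is given below. The function names are hypothetical, and the non-respondent percentages are the assumed scenarios from the slide rather than observed values.

```python
def true_percentage(response_rate, respondent_pct, nonrespondent_pct):
    """Weighted average of respondent and non-respondent percentages."""
    return (response_rate * respondent_pct
            + (1.0 - response_rate) * nonrespondent_pct)


def nonresponse_bias(response_rate, respondent_pct, nonrespondent_pct):
    """Non-respondent share times the difference in percentages."""
    return (1.0 - response_rate) * (respondent_pct - nonrespondent_pct)


response_rate = 0.60   # 60% of permit holders responded
respondent_pct = 70.0  # 70% of responses were correctly documented

for nonrespondent_pct in (70.0, 50.0, 30.0, 10.0):
    b = nonresponse_bias(response_rate, respondent_pct, nonrespondent_pct)
    t = true_percentage(response_rate, respondent_pct, nonrespondent_pct)
    print(f"non-respondents at {nonrespondent_pct:.0f}%: "
          f"bias = {b:.0f} points, correct estimate = {t:.0f}%")
```

The bias grows linearly with both the non-response rate and the gap between the two groups, which is why incompleteness only matters when the missing data differ systematically from the data in hand.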
84Calibration Errors Leading to Bias
85Sample Contamination Leading to Bias
- Field and laboratory
- contamination of volatile organics with vehicle
exhaust
- contamination of metals with materials used in
sample collection
- incorrect sample containers and unlined sample
tops
- reagent water, standards contamination from
laboratory solvents (methanol and methylene
chloride), and spent membranes (periodic
maintenance)
- contamination from other samples in storage
(refrigerators)
86Matrix Effects Leading to Bias
- The composition of the matrix can influence both
preparation and analysis
- Non-ideal chemical behavior influences samples
differently than standards
- high ionic strength water enhances purging of
volatile organic chemicals (VOCs) (bias high)
- natural buffering in soil influences leaching of
lead in TCLP extraction
- x-ray fluorescence can result in high (secondary
excitation) or low (matrix absorbs greater than
analyte) bias
87Method of Standard Addition
Concentration in sample
88Interferences Leading to Bias
- Atomic Absorption
- spectral: cannot resolve analyte from other
species
- chemical: chemical processes alter absorption
characteristics of analyte
- Possible resolution
- successive serial dilutions
- matrix modification
- method of standard addition (MSA)
- Gas Chromatography
- coelution
- Possible resolution
- use MS
- confirmation column: two species should elute at
different times
89Loss of Sample During its Introduction into the
Instrument
- Loss of VOCs/SVOCs due to leaks/evaporation
- internal standard and surrogate areas should
provide evidence of this
- Loss of reactive analytes (e.g., DDT, Endrin)
- breakdown studies prior to analysis should
provide evidence of this
No breakdown
Significant breakdown
90Sample Handling Errors Leading to Bias
- Loss of sample during collection, storage
- inadequate preservation (acid, darkness, cooling) or excessive holding time
- Examples
- VOCs can undergo microbial degradation if not acidified
- metals require acidification to prevent/minimize precipitation and adsorption to sample container
91Why Bias, Why Not Accuracy?
- Accuracy includes both precision (random error that could be positive or negative for each individual reading) and bias (systematic error that is either positive or negative for all readings)
- Accuracy (mean square error) = variance + bias^2
- Precision is estimated through replicate measurements
- Bias is estimated by comparison of the mean of replicate measurements to a known standard
- Without standards, bias cannot be estimated with confidence; only a reduction in bias is possible
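The decomposition of mean square error into variance and squared bias can be checked numerically. The sketch below uses hypothetical replicate readings of a standard with a known value of 10.0; the function name and data are illustrative only. The identity is exact when the variance is computed as a population variance (dividing by n).

```python
from statistics import fmean, pvariance


def accuracy_components(readings, known_value):
    """Return (bias, variance, mean square error) for replicate readings
    of a standard whose true value is known."""
    mean = fmean(readings)
    bias = mean - known_value                 # systematic error
    variance = pvariance(readings)            # random error about the mean
    mse = fmean([(r - known_value) ** 2 for r in readings])
    return bias, variance, mse


# Hypothetical replicate measurements of a 10.0 mg/L standard.
readings = [10.2, 10.4, 10.3, 10.5, 10.1]
bias, variance, mse = accuracy_components(readings, 10.0)
print(f"bias = {bias:.3f}, variance = {variance:.3f}, MSE = {mse:.3f}")
print(f"variance + bias^2 = {variance + bias ** 2:.3f}")
```

Reporting bias and variance separately, rather than the single MSE figure, shows which kind of error dominates.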
92Bias Hidden as Variability
[Figure: dot plots of data sets A and B on a vertical scale from 0 to 50]
Is data set A or B a better representation of the
population?
93Bias Hidden as Variability (cont.)
[Figure: dot plots of data sets A and B on a vertical scale from 0 to 50, with m = 38.5 marked]
Both data sets have similar variability. Data
set B is a biased representation of the
population of interest.
94Sensitivity: Discerning the Signal in the Noise
[Figure: instrument response plotted against concentration]
95 Sensitivity
- Sensitivity is the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest
- the term "detection limit" is often used without consideration of what is really meant
- there are several sensitivity DQIs, including IDLs, MDLs, and PQLs
- A sensitivity DQI describes the capability of measuring a constituent at low levels
- a PQL describes the ability to quantify a constituent with known certainty
- e.g., a PQL of 0.05 mg/L for mercury represents the level where a precision of +/- 15% can be obtained
96Calibration Standards
- Samples containing the analytes of interest are prepared, generally in a clean matrix, to develop a relationship between concentration and instrument response. This can be a problem if the matrix under investigation interacts with the analyte differently than a clean matrix does.
- The relationship between concentration and instrument response is used to predict the unknown concentration found in samples of interest
- Calibration allows for the determination of theoretical detection and quantification limits
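As a rough illustration of how a calibration relationship is used to predict an unknown concentration, the sketch below fits a straight line to hypothetical standards and inverts it for a sample response. The data, function names, and the assumption of a purely linear response are illustrative; a real calibration would also need to respect the linear range of the instrument.

```python
def fit_calibration(concs, responses):
    """Least-squares line: response = slope * concentration + intercept."""
    n = len(concs)
    mean_x = sum(concs) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, responses))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_concentration(response, slope, intercept, low, high):
    """Invert the calibration line; also report whether the result lies
    inside the calibrated range [low, high]."""
    conc = (response - intercept) / slope
    return conc, low <= conc <= high


# Hypothetical standards (mg/L) and averaged instrument responses.
standards = [1.0, 2.0, 3.0, 4.0]
responses = [105.0, 205.0, 305.0, 405.0]
slope, intercept = fit_calibration(standards, responses)
unknown, in_range = predict_concentration(
    255.0, slope, intercept, min(standards), max(standards)
)
print(f"Predicted concentration: {unknown:.2f} mg/L (in range: {in_range})")
```

A predicted value outside the calibrated range is an extrapolation and deserves less confidence than one bracketed by standards.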
97Calibration Standards - graphically
- This graph shows
- Instrument response for six calibration levels
- Each response represents an average of multiple runs
- Where are the quantification levels?
[Figure: calibration levels at 1, 2, 3, 3.5, 4, and 4.5 mg/L]
98Calibration Curve from Standards
- By graphing the calibration standards we can see three regions of interest in the relationship
- below the linear range
- the linear range
- above the linear range
[Figure: calibration curve plotted against concentration (mg/L)]
99Relationship of Instrument Calibration Curve and
Analyte Detection/Quantification
[Figure: instrument response versus concentration, with IDL, MDL, PQL, and LOL marked on the concentration axis]
- Region 1: unknown identification and quantitation
- Region 2: less certain identification
- Region 3: less certain quantification
- Region 4: known quantification
- Region 5: less certain quantification
IDL = instrument detection limit; MDL = method detection limit; PQL = practical quantitation limit; LOL = limit of linearity
100Commonly Used Sensitivity Indicators
Sensitivity Indicator | Numerical Definition | Definition | Common Use
Instrument Detection Limit (IDL) | Usually 3 times the instrument noise level | Lowest value that the instrument can distinguish from zero | Provides basis for determining an MDL
Method Detection Limit (MDL) | MDL = t(n-1, 0.99) x s, where s = standard deviation for 7 aliquots and t(n-1, 0.99) = 3.14 | Defined in 40 CFR Part 136, Appendix B | Determines the theoretical detection limit
Practical Quantitation Limit (PQL) | PQL = 5 x MDL or PQL = 10 x MDL (more precisely defined as the lowest standard on the instrument calibration curve) | "the lowest concentration of an analyte that can be reliably measured within specified limits of precision and accuracy during routine laboratory operating conditions" | Provides numerical lower limit for critical data
Reporting Limit (RL) | Laboratory defined (often RL = PQL) | Lowest value reported by laboratory without a "J" flag | Laboratory basis for data reporting
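The MDL formula in the table, MDL = t(n-1, 0.99) x s, can be worked through with a short sketch. The replicate values below are hypothetical, and the small lookup table holds standard one-sided 99% Student's t values for 6 to 9 degrees of freedom (3.143 for the usual 7 aliquots, rounded to 3.14 on the slide).

```python
from statistics import stdev

# One-sided 99% Student's t values, keyed by degrees of freedom (n - 1).
T_99 = {6: 3.143, 7: 2.998, 8: 2.896, 9: 2.821}


def method_detection_limit(replicates):
    """MDL = t(n-1, 0.99) * s, with s the sample standard deviation
    of replicate low-level spiked aliquots."""
    degrees_of_freedom = len(replicates) - 1
    return T_99[degrees_of_freedom] * stdev(replicates)


# Hypothetical results (ug/L) for 7 aliquots spiked at 2.0 ug/L.
replicates = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.0]
mdl = method_detection_limit(replicates)
print(f"s = {stdev(replicates):.3f} ug/L, MDL = {mdl:.2f} ug/L")
```

Note that the result depends only on the scatter of the replicates, not on their mean, which is one reason the choice of spiking level matters.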
101MDL Controls the Type I Error
40 CFR definition of MDL: "99% confidence that analyte concentration is greater than zero." By accepting a 1% chance of type I error (false positive), we define the MDL as MDL = t x st. dev.
Type I error (false positive): concluding that the analyte is present when in fact it is absent (zero).
102MDL Does Not Control the Type II Error
If the MDL is chosen as the reporting limit, then by default there is a 50% probability of a type II (false negative) error. This means a sample that is truly at the MDL will be considered below the MDL 50% of the time.
Type II error (false negative): concluding that the analyte is absent when in fact it is present.
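The 50 percent false negative rate at the MDL follows from symmetry: if measurement error is centered on the true value, half of the readings for a sample truly at the MDL fall below it. The sketch below illustrates this under the assumption of normally distributed error with a constant standard deviation; the function name and the value s = 1.0 are illustrative.

```python
from statistics import NormalDist


def false_negative_probability(true_conc, mdl, s):
    """Probability that a single measurement falls below the MDL,
    assuming normal measurement error with standard deviation s."""
    return NormalDist(mu=true_conc, sigma=s).cdf(mdl)


s = 1.0              # hypothetical standard deviation
mdl = 3.14 * s       # MDL for 7 replicates
p_at_mdl = false_negative_probability(mdl, mdl, s)
p_at_twice_mdl = false_negative_probability(2 * mdl, mdl, s)
print(f"True concentration at MDL:     {p_at_mdl:.2f}")
print(f"True concentration at 2 x MDL: {p_at_twice_mdl:.4f}")
```

Under these assumptions the false negative rate drops well below 1 percent once the true concentration reaches about twice the MDL.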
103PQL Relationship to MDL
PQL: multiple definitions, generally 5-10 times the MDL, or the lowest point on the calibration curve. Requiring 5-10 times the MDL usually provides precision of less than 20% RSD (quantification).
If the PQL is 5 times the MDL, and the MDL is
about 3 times standard deviation, then the PQL is
approximately 15 times the standard deviation.
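The arithmetic above also gives a rough feel for the relative standard deviation (RSD) at the PQL. The sketch below assumes, as a simplification, that the standard deviation stays constant over the low concentration range; the function name is hypothetical.

```python
def rsd_percent_at(multiple_of_mdl, t_value=3.14):
    """Approximate RSD (%) at a concentration equal to a multiple of the MDL,
    where MDL = t_value * s and s is assumed constant."""
    # concentration = multiple * t * s, so RSD = s / concentration
    return 100.0 / (multiple_of_mdl * t_value)


for multiple in (5, 10):
    print(f"PQL = {multiple} x MDL -> RSD of about {rsd_percent_at(multiple):.1f}%")
```

With the MDL taken as roughly 3 standard deviations, a PQL at 5 times the MDL corresponds to about 1/15, or roughly 7 percent RSD, which is consistent with the stated precision of less than 20 percent.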
104Quality Associated with Calibration Regions
[Figure: instrument signal in standard deviation units, starting from zero analyte concentration (matrix/method blank), with the LOD at 3s (approximate MDL level) and the LOQ at 10s (approximate PQL level)]
Regions labeled: high uncertainty; certain detection; less certain quantification; certain quantification
LOD = limit of detection; LOQ = limit of quantitation; s = population standard deviation
105Project Management Perspective
- There are major differences between the various sensitivity DQIs (RL, IDL, MDL, and PQL)
- even when you specify a particular indicator, it is imperative to get a precise definition, a description of the process, and the formula to know what it really means!
- it is important to know if the indicator reflects the detection limit in a clean matrix or in an actual sample
- The PQL is usually the most useful indicator when selecting an analytical method or laboratory
106Project Management Perspective (cont.)
- Important to specify what results should be reported and how
- all results above the MDL should be reported
- values that fall between the MDL and PQL should be "flagged" to indicate uncertainty in the value
- values below the MDL should be reported as non-detects, with the MDL included
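These reporting rules can be expressed as a small function. This is only a sketch: the output format, the use of "J" as the flag, and the treatment of a value exactly equal to the MDL as a flagged detection are assumptions, and a project would follow the conventions in its own QA Project Plan.

```python
def report_result(value, mdl, pql):
    """Format a single result according to the MDL/PQL reporting rules."""
    if value < mdl:
        # Below the MDL: report as a non-detect and include the MDL.
        return f"ND (<{mdl})"
    if value < pql:
        # Between the MDL and PQL: report the value, flagged as uncertain.
        return f"{value} J"
    # At or above the PQL: report the value without a flag.
    return f"{value}"


# Hypothetical limits and results, all in the same units.
for result in (0.1, 1.0, 3.5):
    print(report_result(result, mdl=0.4, pql=2.0))
```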
107What Drives These "Detection Limits"?
- Regulatory requirements
- primary drinking water MCLs
- risk based goals
- hazardous waste characterization
- ambient air quality standards
- Background values for comparison
- project-wide
- site-specific
108Examples of Other Sensitivity Indicators
- Detection limit (DL)
- commonly seen in regulations, no rigorous definition
- Limit of detection (LOD)
- similar to MDL, but with a modified statistical formula (set at 3 times st. dev.)
- Reliable detection limit (RDL)
- level where detection is extremely likely (set at 6 times st. dev.)
- Minimum level (ML)
- CWA defines this as 3.18 times the MDL, equivalent to 10 times st. dev. of a low level standard
- Limit of quantification (LOQ)
- set at 10 times st. dev.
- Estimated quantitation limit (EQL)
- SW-846 defines this similar to the PQL
- Contract required detection limits (CRDL)
- Contract required quantification limits (CRQL)
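Several of the indicators above are fixed multiples of a standard deviation, so they can be compared side by side. The sketch below uses the multipliers stated on this slide and a t value of 3.143 (7 replicates) for the MDL; the function name and the value of s are hypothetical.

```python
def sensitivity_indicators(s, t_value=3.143):
    """Return common sensitivity indicators derived from a standard
    deviation s of low-level replicate measurements."""
    mdl = t_value * s
    return {
        "LOD": 3 * s,        # limit of detection
        "MDL": mdl,          # method detection limit
        "RDL": 6 * s,        # reliable detection limit
        "ML": 3.18 * mdl,    # minimum level
        "LOQ": 10 * s,       # limit of quantification
    }


indicators = sensitivity_indicators(s=0.5)
for name, value in indicators.items():
    print(f"{name}: {value:.2f}")
```

The ML and LOQ come out nearly identical because 3.18 times 3.143 is very close to 10.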
109Common Problem Analytes: Mercury
- Problem: traditional methods do not provide the sensitivity required for CWA
- CVAA detection limits are roughly 50-200 ng/L, but the CWA action level is 12 ng/L
- Solution: Method 1631
- purge onto a gold trap, then thermal desorption into an atomic fluorescence spectrometer
- MDL 0.2 ng/L
110Common Problem Analytes: PAHs
- Benzo(a)pyrene, B(a)P, generally determines the overall required sensitivity
- GC/MS (standard method) has a PQL of 10 ug/L
- HPLC (Method 8310) has an MDL of 23 ng/L
- Example Problems
- Massachusetts MCL is 200 ng/L, so the more complex HPLC must be used in analysis
- For marine capping of contaminated sediments, the mandatory QA Project Plan requires MDLs in the range 0.42-20 ng/L, so an even more sensitive method must be used
- Single Ion Monitoring (SIM) eliminates most interferences and can reach an MDL of 0.78 ng/L
111Controversies
- The 40 CFR Part 136, Appendix B definition of MDL is statistically confusing. However, it remains the most widely documented DQI for sensitivity and one of the simplest ways to calculate a detection limit
- There are too many definitions and indicators
- DL, MDL, IDL, CRDL, PQL, EQL, ML, ...
- Some action levels were chosen based on the MDLs available at the time instead of on health-based limits
112Controversies (cont.)
- There are still more indicators proposed
- quality control level: choose an acceptable bias and relative standard deviation prior to study
- Kimbrough and Wakakuwa, EST Vol. 28, No. 2, 1994
- Censoring of low level values
- results in lost information: data below a sensitivity indicator (e.g., PQL) that are not reported could otherwise have been used in decision making
- The correct approach is dependent upon how the data will be interpreted and used
113Techniques for Lowering the Method Detection
Limit
- Use more sample material
- VOC analysis: 25 ml versus 5 ml
- Improve detector sensitivity
- ICP emission spectrometers: axial viewing is 8-10 times more sensitive than radial viewing
- for samples with high dissolved solids, use radial viewing
- ECD detector is more sensitive than MS
- but subject to interferences
[Figure labels: 5 ml, 25 ml, 1 ug/L]
114Techniques for Lowering the Method Detection
Limit (cont.)
- Reduce interferences
- preparation
- column cleanup (gel permeation chromatography)
- sulfuric acid/potassium permanganate cleanup destroys interfering pesticides in PCB analysis
- instrumentation
- selective GC detectors: ECD, NPD, GC/MS-SIM
- selective AA: graphite furnace versus flame
- atomization techniques in GFAA
- matrix modifiers in AA to alter volatility
115Case Study 1: Calculation of MDL
- Preliminary sampling has resulted in a request for a lower detection limit for 1-chloro-1,1,2-trifluoroethane
- What are the options?
- more sample material
- reduced interferences
- more sensitive detector
- Elect to choose more sensitive detector
- electrolytic conductivity detector (ELCD): halogen specific, more sensitive than MS
116Initial Calibration of More Sensitive Detector
(ELCD)
[Figure: initial calibration plotted against concentration (mg/L)]
117Select the Spiking Level
- Evaluate the calibration data
- The calculated MDL must be less than the spike level
- The spike level should not be greater than 10 times the calculated MDL (prefer spike 1-5 times MDL)
- Optional pick a spike at level where
signal/nois