Title: Rearranging the equation gives $ts/\sqrt{n}$
3.2 Hypothesis Testing continued - confidence limits and null hypotheses
3.2.1 The comparison of an experimental mean with
an accepted value.
When testing the accuracy of a method, for example by analysing an SRM, the null hypothesis is that the experimental mean is not significantly different from the accepted value: any difference is due to random errors. The probability that the accepted value lies within the range of the experimental value can be assessed by using $\mu = \bar{x} \pm ts/\sqrt{n}$ or $\mu = \bar{x} \pm z\sigma/\sqrt{n}$, depending on whether s is a good estimate of $\sigma$ or not.
Rearranging the equation gives $(\mu - \bar{x}) = \pm ts/\sqrt{n}$.
Using the t values from t-tables we can calculate the critical value and then compare that to the observed difference between the experimental and accepted values. If $|\mu - \bar{x}|_{observed} > ts/\sqrt{n}$ then the null hypothesis is rejected.
The critical value is $ts/\sqrt{n}$; t depends on DOF and CL.
$(\bar{x}_1 - \mu)_{crit} = \pm ts/\sqrt{N}$
3.2.1 The comparison of an experimental mean with
an accepted value continued
- Problem: Consider the lead data.
- a) If the data of set 1 related to an SRM whose accepted Pb is 409 ppm, are the data accurate?
- b) If only one analysis had been done, giving a result of 429 ppm, is the result accurate?
- c) If all the data are included and the true value was 429 ppm, are the results accurate?
95% confidence level, 5 DOF, t = 2.57
Pb (ppm), set 1: 398, 399, 429, 397, 393, 413
$\bar{x}_1 = 405$, $s_1 = 13.6$, $n_1 = 6$
a) Critical value $= \pm ts/\sqrt{N} = \pm 2.57(13.6)/\sqrt{6} = \pm 14.3$. $|\bar{x}_1 - \mu|_{obs} = |405 - 409| = 4$.
The observed difference is less than the critical value, so the null hypothesis is retained. If there is a systematic error (bias) it cannot be detected at the 95% confidence level.
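The comparison of a mean with an accepted value can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `compare_with_accepted` is hypothetical, the t value is the table value quoted above, and the same function is reused for the case where the accepted value is taken as 429 ppm.

```python
import math
import statistics

def compare_with_accepted(data, accepted, t_crit):
    """Return (observed difference, critical difference, reject_null)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)              # sample standard deviation (n - 1)
    critical = t_crit * s / math.sqrt(n)    # t s / sqrt(n)
    observed = abs(mean - accepted)
    return observed, critical, observed > critical

set1 = [398, 399, 429, 397, 393, 413]       # Pb, ppm (set 1)
T_95_5DOF = 2.57                            # from t-tables: 95 % CL, 5 DOF

obs_a, crit_a, reject_a = compare_with_accepted(set1, 409, T_95_5DOF)
obs_c, crit_c, reject_c = compare_with_accepted(set1, 429, T_95_5DOF)
print(f"accepted 409: observed {obs_a:.1f}, critical {crit_a:.1f}, reject: {reject_a}")
print(f"accepted 429: observed {obs_c:.1f}, critical {crit_c:.1f}, reject: {reject_c}")
```

Because the mean and s are computed from the unrounded data, the printed values can differ from the hand calculation in the last digit.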
3.2.1 The comparison of an experimental mean with an accepted value continued
b) For the 429 ppm datum point alone: DOF 1, CL 95% (P = 0.05), t = 12.7. $(\bar{x}_1 - \mu)_{crit} = \pm ts/\sqrt{N} = \pm 12.7(13.6)/\sqrt{1} = \pm 173$. $|\bar{x}_1 - \mu|_{obs} = |429 - 409| = 20$. Again the null hypothesis is retained.
c) All set 1 data, accepted value 429 ppm: DOF 5, CL 95%, t = 2.57. $(\bar{x}_1 - \mu)_{crit} = \pm ts/\sqrt{N} = \pm 14.3$. $|\bar{x}_1 - \mu|_{obs} = |405 - 429| = 24$. The null hypothesis has to be rejected. A systematic error(s) has been detected.
SWH pages 51-53, example 4.4, problems 4.12 and 4.13. MM section 3.2, problem 2.
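3.2.2. Comparing two experimental means.
Critical value: $(\bar{x}_1 - \bar{x}_2)_{crit} = \pm t\,s_{pooled}\sqrt{(N_1 + N_2)/(N_1 N_2)}$ (for $\mu_1$, $\mu_2$, $\sigma$: z replaces t)
$s_{pooled} = \sqrt{\left[\sum(x_i - \bar{x}_1)^2 + \dots + \sum(x_i - \bar{x}_f)^2\right]/(n_1 + \dots + n_f - n_t)}$, where $n_t$ is the number of data sets.
$s_{pooled}$ can only be used when the $s_i$ are not significantly different.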
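3.2.2. Comparing two experimental means continued.
Considering the two sets of Pb data: a) show that the two means are not significantly different at the 90% confidence level. b) At what level of confidence are they significantly different?
Pb (ppm), set 1: 398, 399, 429, 397, 393, 413. $\bar{x}_1 = 405$, $s_1 = 13.6$, $n_1 = 6$
Pb (ppm), set 2: 414, 435, 404, 391, 409, 405, 395. $\bar{x}_2 = 409$, $s_2 = 14.5$, $n_2 = 7$
Critical value: $(\bar{x}_1 - \bar{x}_2)_{crit} = \pm t\,s_{pooled}\sqrt{(N_1 + N_2)/(N_1 N_2)}$ (for $\mu_1$, $\mu_2$, $\sigma$: z replaces t)
a) $s_{pooled} = \sqrt{\left[\sum(x_i - \bar{x}_1)^2 + \sum(x_i - \bar{x}_2)^2\right]/(n_1 + n_2 - n_t)} = 14.1$ ppm (see 2.3.1). DOF = (6 + 7 - 2) = 11; at CL 90%, t = 1.80 (12 DOF: t = 1.78; 10 DOF: t = 1.81). $(\bar{x}_1 - \bar{x}_2)_{crit} = \pm 1.80(14.1)\sqrt{13/42} = \pm 14.1$, while $|\bar{x}_1 - \bar{x}_2|_{obs} = 4$. The null hypothesis is retained.
Recall: $s_i = \sqrt{\sum(x_i - \bar{x})^2/(n_i - 1)}$, so $s_i^2 = \sum(x_i - \bar{x})^2/(n_i - 1)$ and $(n_i - 1)s_i^2 = \sum(x_i - \bar{x})^2$.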
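As a check on part a), the pooled standard deviation and the critical difference can be computed from the summary statistics quoted on the slide. This is only a sketch: `pooled_std` and `critical_difference` are hypothetical helper names, and t = 1.80 is the interpolated table value given above.

```python
import math

def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation from two sample standard deviations."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def critical_difference(t_crit, s_pooled, n1, n2):
    """t * s_pooled * sqrt((n1 + n2) / (n1 * n2))."""
    return t_crit * s_pooled * math.sqrt((n1 + n2) / (n1 * n2))

# Summary statistics for the two Pb data sets, as quoted above
s_pooled = pooled_std(13.6, 6, 14.5, 7)
crit = critical_difference(1.80, s_pooled, 6, 7)   # 90 % CL, 11 DOF
observed = abs(405 - 409)

print(f"s_pooled = {s_pooled:.1f} ppm, critical = {crit:.1f}, observed = {observed}")
print("reject null hypothesis:", observed > crit)
```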
3.2.2. Comparing two experimental means continued.
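b) Since $|\bar{x}_1 - \bar{x}_2|_{obs} = 4$, $(\bar{x}_1 - \bar{x}_2)_{crit}$ must be less than 4 for the means to test as different: $4 = t_{crit}(14.1)\sqrt{13/42}$, so $t_{crit} = 0.51$. This is below the tabulated t for even the 50% confidence level at 11 DOF, i.e. in practice they will never test to be different.
Example 4-8, SWH, 7th: Use the test of two means to calculate a detection limit for a method. Measure $\bar{x}$ and s for a blank and then ask what the minimum result would have to be for it to be significantly greater than the blank.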
SWH problems 4.14, 4.16.
3.2.3. Comparison of the Precisions of Measurements.
Test on variances
F test: $F_{observed} = s_1^2/s_2^2$ ($s_1 > s_2$); $F_{critical}$ from tables. If $F_{observed} > F_{critical}$ then reject the null hypothesis.
- Has a change in the analytical method produced more precise data?
- Does a new analyst produce more precise results than an existing analyst?
- Has the precision of the results from an analytical method changed over time?
$s_{new} < s_{old}$: a 1 tailed test
$s_{new} < s_{existing}$ or $s_{new} > s_{existing}$: a 2 tailed test
See t-tables for how to handle 1 vs 2 tailed t-tests.
MM, 7th, A-3 (DOF)
3.2.3. Comparison of the Precisions of Measurements continued.
Problem: Is the precision for the first Pb data set significantly less than that for the second set at the 95% confidence level? Are the two precisions significantly different at the 95% confidence level?
F test: $F_{observed} = s_1^2/s_2^2$ ($s_1 > s_2$); $F_{critical}$ from tables. If $F_{observed} > F_{critical}$ then reject the null hypothesis.
Pb (ppm), set 1: 398, 399, 429, 397, 393, 413. $\bar{x}_1 = 405$, $s_1 = 13.6$, $n_1 = 6$
Pb (ppm), set 2: 414, 435, 404, 391, 409, 405, 395. $\bar{x}_2 = 409$, $s_2 = 14.5$, $n_2 = 7$
$s_1 = 13.6$ (5 DOF), $s_2 = 14.5$ (6 DOF). ∴ $F = s_2^2/s_1^2 = 14.5^2/13.6^2 = 1.14$.
For the one tailed test, with 6 DOF in the numerator and 5 DOF in the denominator, $F_{critical} = 4.95$. ∴ the precision of the 1st set is not significantly less than that for the 2nd set at the 95% confidence level.
For the two tailed test, with 6 DOF in the numerator and 5 DOF in the denominator, $F_{critical} = 6.98$. ∴ the precisions are not significantly different at the 95% confidence level.
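The F test above reduces to a single ratio and a table lookup. A minimal sketch follows; `f_observed` is a hypothetical helper that always places the larger variance in the numerator, and the critical values are the table values quoted on the slide rather than computed ones.

```python
def f_observed(s_a, s_b):
    """Variance ratio with the larger variance in the numerator."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return larger**2 / smaller**2

F_CRIT_ONE_TAILED = 4.95   # 95 % CL, 6 and 5 DOF, from tables
F_CRIT_TWO_TAILED = 6.98   # 95 % CL, 6 and 5 DOF, from tables

f_obs = f_observed(13.6, 14.5)   # s1 and s2 for the two Pb data sets
print(f"F observed = {f_obs:.2f}")
print("one tailed: reject null hypothesis:", f_obs > F_CRIT_ONE_TAILED)
print("two tailed: reject null hypothesis:", f_obs > F_CRIT_TWO_TAILED)
```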
Examples/Problems: SWH examples 4.5, 4.6, problem 4.16. MM prob. 3.4, 6, 8, 13.
3.2.4. Comparing two experimental means revisited (MM p47).
Critical value: $(\bar{x}_1 - \bar{x}_2)_{crit} = \pm t\,s_{pooled}\sqrt{(N_1 + N_2)/(N_1 N_2)}$ (for $\mu_1$, $\mu_2$, $\sigma$: z replaces t)
$s_{pooled} = \sqrt{\left[\sum(x_i - \bar{x}_1)^2 + \dots + \sum(x_i - \bar{x}_f)^2\right]/(n_1 + \dots + n_f - n_t)}$
$s_{pooled}$ can only be used when the $s_i$ are not significantly different. Before comparing experimental means the standard deviations must be tested for significant differences.
When the standard deviations are significantly different, the critical value is given by $(\bar{x}_1 - \bar{x}_2)_{calc} = \pm t\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
If $s_1 = s_2 = s$: $(\bar{x}_1 - \bar{x}_2)_{calc} = \pm ts\sqrt{1/n_1 + 1/n_2}$, as before.
DOF $= \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{s_1^4/(n_1^2(n_1 - 1)) + s_2^4/(n_2^2(n_2 - 1))}$, rounded to the nearest integer.
$s_i/\sqrt{n_i}$ = standard error in the mean
3.2.4. Comparing two experimental means revisited continued.
- Consider problem 4, chapter 3, MM, 4th edition.
- The following data give the recovery of bromide from spiked samples of vegetable matter, measured by using a gas-liquid chromatographic method. The same amount of bromide was added to each sample.
- Tomato: 777, 790, 759, 790, 770, 758, 764 µg g$^{-1}$.
- Cucumber: 782, 773, 778, 765, 789, 797, 782 µg g$^{-1}$.
- Do the recoveries from the 2 vegetables have variances which differ significantly?
- Do the mean recovery rates differ significantly?
- a) F test on variances: $F_{observed} = s_i^2/s_j^2$ where $s_i > s_j$ and $s_i = \sqrt{\sum(x_i - \bar{x})^2/(n - 1)}$
- $s_1 = 13.6 \Rightarrow s_1^2 = 184.0$ and $s_2 = 10.4 \Rightarrow s_2^2 = 108.5$.
- $F_{observed} = 184.0/108.5 = 1.70$.
- Degrees of freedom: 6 in both sets.
- $F_{critical}$ (2 tailed, 95% CL or P = 0.05) = 5.82.
- $F_{observed} < F_{critical}$: retain the null hypothesis.
- The variances are not significantly different.
3.2.4. Comparing two experimental means revisited: problem 3.4, MM p71, continued.
$s_1 = 13.6$, $s_1^2 = 184.0$; $s_2 = 10.4$, $s_2^2 = 108.5$; $\bar{x}_1 = 772.6$, $\bar{x}_2 = 780.9$
a) $F_{observed} < F_{critical}$
b) t-test: $(\bar{x}_1 - \bar{x}_2)_{critical} = \pm t\,s_{pooled}\sqrt{(n_1 + n_2)/(n_1 n_2)}$
$s_{pooled}^2 = [(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2) = [(7 - 1) \times 184.0 + (7 - 1) \times 108.5]/(7 + 7 - 2) = 146.2$, so $s_{pooled} = 12.1$.
For 95% CL and (7 + 7 - 2) = 12 DOF, 2 tailed, t = 2.18: $(\bar{x}_1 - \bar{x}_2)_{critical} = \pm(2.18 \times 12.1)\sqrt{14/49} = \pm 14.1$
$|\bar{x}_1 - \bar{x}_2|_{observed} = |772.6 - 780.9| = 8.3$. Retain the null hypothesis.
($t_{observed} = 8.3/(12.1 \times \sqrt{14/49}) = 1.28$, which is less than 2.18)
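The pooled t-test for the bromide recoveries can also be run directly on the raw data. The sketch below is illustrative only: `pooled_t` is a hypothetical helper, $t_{observed}$ is taken as the observed difference divided by $s_{pooled}\sqrt{(n_1 + n_2)/(n_1 n_2)}$, and t = 2.18 is the table value for 12 DOF quoted above.

```python
import math
import statistics

tomato = [777, 790, 759, 790, 770, 758, 764]     # µg/g
cucumber = [782, 773, 778, 765, 789, 797, 782]   # µg/g

def pooled_t(a, b):
    """Return (s_pooled, observed difference, t_observed) for two samples."""
    n1, n2 = len(a), len(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)
    s_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    diff = abs(statistics.mean(a) - statistics.mean(b))
    t_obs = diff / (s_pooled * math.sqrt((n1 + n2) / (n1 * n2)))
    return s_pooled, diff, t_obs

T_95_12DOF = 2.18   # two tailed, 95 % CL, 12 DOF, from t-tables

s_pooled, diff, t_obs = pooled_t(tomato, cucumber)
print(f"s_pooled = {s_pooled:.1f}, difference = {diff:.1f}, t_observed = {t_obs:.2f}")
print("reject null hypothesis:", t_obs > T_95_12DOF)
```

Comparing $t_{observed}$ with the tabulated t is equivalent to comparing the observed difference with the critical difference.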
3.2.4. Comparing two experimental means revisited: problem 3.4, MM p71, continued.
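$s_1 = 13.6$, $\bar{x}_1 = 772.6$; $s_2 = 10.4$, $\bar{x}_2 = 780.9$
a) $F_{observed} < F_{critical}$. b) $(\bar{x}_1 - \bar{x}_2)_{critical} = \pm 14.1$, $|\bar{x}_1 - \bar{x}_2|_{observed} = 8.3$. Retain the null hypotheses.
If $s_1 = 13.6$, what would $s_2$ have to be for the $s_i$ to be significantly different? At 95% CL, with 6 and 6 DOF, 2 tailed: $F_{critical} = 5.82 = 13.6^2/s_2^2$, so $s_2^2 = 13.6^2/5.82 = 31.8$ and $s_2 < 5.6$.
If the cucumber recoveries were 781, 777, 779, 772, 784, 789, 781 µg g$^{-1}$ ($\bar{x} = 780.4$, s = 5.35), then $s_1$ and $s_2$ are significantly different and ∴ we use $(\bar{x}_1 - \bar{x}_2)_{calc} = \pm t\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
$s_1^2/s_2^2 = 13.6^2/5.35^2 = 6.46 > F_{critical}$. DOF = 7.8, rounded to 8, for which t = 2.31 (95% CL, 2 tailed). Then $(\bar{x}_1 - \bar{x}_2)_{calc} = \pm t\sqrt{s_1^2/n_1 + s_2^2/n_2} = \pm 2.31\sqrt{13.6^2/7 + 5.35^2/7} = \pm 12.8$. $|\bar{x}_1 - \bar{x}_2|_{observed} = |772.6 - 780.4| = 7.8$. Still not significantly different.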
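For the unequal-variance case, the degrees of freedom and the critical difference can be sketched as follows. The helper name `welch_dof` is hypothetical, and t = 2.31 is the standard two tailed 95% table value for 8 DOF, which is an assumption of this sketch rather than a value computed here.

```python
import math
import statistics

tomato = [777, 790, 759, 790, 770, 758, 764]         # µg/g
cucumber_alt = [781, 777, 779, 772, 784, 789, 781]   # µg/g, the modified set

def welch_dof(s1, n1, s2, n2):
    """DOF for unequal variances: (a + b)^2 / (a^2/(n1-1) + b^2/(n2-1))."""
    a, b = s1**2 / n1, s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

s1, s2 = statistics.stdev(tomato), statistics.stdev(cucumber_alt)
n1, n2 = len(tomato), len(cucumber_alt)

dof = welch_dof(s1, n1, s2, n2)
T_95_8DOF = 2.31                                      # assumed table value
crit = T_95_8DOF * math.sqrt(s1**2 / n1 + s2**2 / n2)
observed = abs(statistics.mean(tomato) - statistics.mean(cucumber_alt))

print(f"DOF = {dof:.1f} (use {round(dof)}), critical = {crit:.1f}, observed = {observed:.1f}")
print("reject null hypothesis:", observed > crit)
```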
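Topics 3.1 and 3.2 Summary
3.1 Confidence limits and confidence intervals: $\mu = \bar{x} \pm ts/\sqrt{n}$, $\mu = \bar{x} \pm z\sigma/\sqrt{n}$. Null hypotheses and type I and II errors.
3.2 Means with accepted values: reject the null hypothesis if $|\mu - \bar{x}|_{observed} > ts/\sqrt{n}$.
Means with means: critical value $(\bar{x}_1 - \bar{x}_2)_{crit} = \pm t\,s_{pooled}\sqrt{(N_1 + N_2)/(N_1 N_2)}$ (for $\mu_1$, $\mu_2$, $\sigma$: z replaces t)
$s_{pooled} = \sqrt{\left[\sum(x_i - \bar{x}_1)^2 + \dots + \sum(x_i - \bar{x}_f)^2\right]/(n_1 + \dots + n_f - n_t)}$
$s_{pooled}^2 = [(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)$
$s_{pooled}$ can only be used when the $s_i$ are not significantly different.
F test: $F_{observed} = s_1^2/s_2^2$ ($s_1 > s_2$); $F_{critical}$ from tables. If $F_{observed} > F_{critical}$ then reject the null hypothesis.
$s_i = \sqrt{\sum(x_i - \bar{x})^2/(n_i - 1)}$, so $s_i^2 = \sum(x_i - \bar{x})^2/(n_i - 1)$ and $(n_i - 1)s_i^2 = \sum(x_i - \bar{x})^2$.
When $s_1 \neq s_2$: $(\bar{x}_1 - \bar{x}_2)_{critical} = \pm t\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
DOF $= \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{s_1^4/(n_1^2(n_1 - 1)) + s_2^4/(n_2^2(n_2 - 1))}$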
3.3 Detection of Gross Errors
Three types of errors:
- Systematic errors: poor accuracy.
- Random errors: poor precision.
- Gross errors: arise from mistakes (spills, use of the wrong pipette, incorrect recording of weights or dilution factors, mixing up of samples, ...).
- Precautions during analyses
  - Carefully record all observations (not conclusions from observations, e.g. "10 ml pipette to 100 ml flask" and not "10 fold dilution").
  - Carefully label all containers and keep glassware and equipment logically organized.
  - If an error has been noted (spills, initial burette reading not recorded, missed end point) or suspected (flasks mixed up, reagent not added, weight read incorrectly, ...) then don't continue. Note the problem.
- Precautions during analyses cont.
  - Monitor data as they are obtained.
  - Repeat analyses should agree within the known precision of the analytical method. If not, why not?
  - Do more repeats if necessary.
- Precautions when considering data
  - Always be cautious: the data may be correct and your expectations wrong.
  - Do not use statistical rejection tests on small (n ≤ 5) data sets.
  - Make sure that your decisions make sense. Think. Justify decisions.
3.3 Detection of Gross Errors continued
3.3.1. Dixon's Q test for rejection of outliers. (SWH pages 57-59; MM 54-57)
Null hypothesis: the data are not significantly different.
$Q_{10}(expt) = (x_n - x_{n-1})/(x_n - x_1)$ (n = 3 to 7)
$Q_{11}(expt) = (x_n - x_{n-1})/(x_n - x_2)$ (n = 8 to 10)
$Q_{21}(expt) = (x_n - x_{n-2})/(x_n - x_2)$ (n = 11 to 13)
$Q_{22}(expt) = (x_n - x_{n-2})/(x_n - x_3)$ (n = 14 to 25)
$Q_{crit}$ from table for the relevant n and CL.
$x_n$: the point being considered; $x_{n-1}$: the point closest to $x_n$; $x_{n-2}$: the point next closest to $x_n$, etc. $x_1$, $x_2$, $x_3$ are counted from the opposite end of the sorted data. Sort the data (in ascending or descending order) and then apply the test.
Example: Cr(VI) determined by colourimetry gave the following data: 0.0893, 0.1439, 0.0809, 0.1035, 0.1042, 0.1062, 0.1037, 0.1034, 0.1073, 0.0968. Should the smallest value be rejected? Should the largest be rejected?
- Sort the data: 0.0809, 0.0893, 0.0968, 0.1034, 0.1035, 0.1037, 0.1042, 0.1062, 0.1073, 0.1439.
- $x_n = 0.0809$, 10 datum points.
- ∴ $Q_{11}(expt) = (x_n - x_{n-1})/(x_n - x_2)$
- $= (0.0809 - 0.0893)/(0.0809 - 0.1073)$
- $= 0.318$.
- $Q_{crit}$ at the 95% confidence level = 0.477.
- $Q_{expt} < Q_{crit}$: retain the null hypothesis and retain the datum point.
- For $x_n = 0.1439$: $x_{n-1} = 0.1073$, $x_2 = 0.0893$. $Q_{expt} = 0.670$, $Q_{crit} = 0.521$: reject the point.
- Problems: SWH 4.3, 4.17, 4.18; MM 3.3.
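The choice of Q ratio by sample size can be sketched in Python as below. The function name `dixon_q` and its `suspect` argument are hypothetical. The function only computes $Q_{expt}$; $Q_{crit}$ must still be taken from a table for the relevant n and CL.

```python
def dixon_q(data, suspect="high"):
    """Return Dixon's Q for the highest or lowest value in data."""
    n = len(data)
    if not 3 <= n <= 25:
        raise ValueError("the ratios are defined for n = 3 to 25")
    # Order the data so that the suspect point x_n is last and
    # x_1, x_2, x_3 are counted from the opposite end.
    x = sorted(data, reverse=(suspect == "low"))
    near = -2 if n <= 10 else -3          # x_(n-1) for n <= 10, else x_(n-2)
    if n <= 7:
        far = 0                           # x_1
    elif n <= 13:
        far = 1                           # x_2
    else:
        far = 2                           # x_3
    return (x[-1] - x[near]) / (x[-1] - x[far])

cr_vi = [0.0893, 0.1439, 0.0809, 0.1035, 0.1042,
         0.1062, 0.1037, 0.1034, 0.1073, 0.0968]

q_low = dixon_q(cr_vi, suspect="low")     # tests 0.0809
q_high = dixon_q(cr_vi, suspect="high")   # tests 0.1439
print(f"Q(lowest) = {q_low:.3f}, Q(highest) = {q_high:.3f}")
```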
C20J Laboratory Management and Statistics. Topic 4: Monitoring the Performance of the Laboratory.
1. The reason for doing the analyses (depends on laboratory type: medical, environmental, forensic, research, industrial, ...)
- Select and verify methods: LOD, LOQ (LOD = 3$s_{dl}$)
- Obtain samples
- Apply methods
- (x ± y) units and reports
2. Errors
- Systematic errors: types, detect and eliminate.
  - Types: method, instrumental, personal.
  - Detect: RMs, spike recoveries, inter- and intra-comparisons, statistics.
  - Eliminate: training, method development, calibrations, rejections.
- Gross errors: avoid or reject.
- Random errors: quantify and minimize.
  - Confidence intervals: (x ± y), y = 2s.
  - Replicates: means, medians, standard deviations, ranges.
3. Quality assurance programme (quality control analyses)
Topic 4: Monitoring the Performance of the Laboratory.
Quality Assurance Programme (ISO 9001, ISO 17025)
- Document the Quality
- Document procedures and activities.
- Conduct Quality Control analyses and activities.
- Store data appropriately.
- Monitor quality control results.
[Figure: flow from Samples through Methods to Report, (x ± 2s) units]
Topic 4: Monitoring the Performance of the Laboratory.
SRMs, IHRMs, spike recoveries, inter- and intra-comparison results, detection limits, duplicates, calibration data, ...
Quality Control Charts: plots of data against time to illustrate whether or not analytical results (production) meet specifications.
Shewhart Charts: plot data on a chart which shows the expected value and the 95% and 99% confidence limits.
- Setting up the chart
  - Inspect the data and reject outliers (Q test, knowledge).
  - Estimate µ and s from present or previous data. Plot µ.
  - Set the control limits:
    - UAL = $\mu + 3s/\sqrt{n}$
    - UWL = $\mu + 2s/\sqrt{n}$
    - LWL = $\mu - 2s/\sqrt{n}$
    - LAL = $\mu - 3s/\sqrt{n}$
  - Plot the data.
where n is the number of observations used to determine the value to be plotted. UWL, UAL: upper warning and action limits. LWL, LAL: lower warning and action limits.
[Figure: Shewhart chart showing UAL, UWL, mean, LWL and LAL. Mean = 95.0, s = 5.8, n = 2]
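The control limits listed above can be computed directly. The sketch below uses the mean, s and n quoted for the example chart; the function name `shewhart_limits` is hypothetical.

```python
import math

def shewhart_limits(mu, s, n):
    """Warning (2 s/sqrt(n)) and action (3 s/sqrt(n)) limits about mu."""
    se = s / math.sqrt(n)   # standard error of each plotted value
    return {
        "UAL": mu + 3 * se,
        "UWL": mu + 2 * se,
        "LWL": mu - 2 * se,
        "LAL": mu - 3 * se,
    }

limits = shewhart_limits(mu=95.0, s=5.8, n=2)
for name, value in limits.items():
    print(f"{name} = {value:.1f}")
```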
Topic 4: Monitoring the Performance of the Laboratory.
Observations
- $\mu \pm 3s/\sqrt{n}$: 99% confidence limits (action limits). $\mu \pm 2s/\sqrt{n}$: 95% confidence limits (warning limits).
- Some data fall above (below) the warning and action limits. This is expected statistically, but only about 5 and 1 times per 100 measurements respectively.
- If above (below) the warning limits, consider the analyses for possible systematic or gross errors and take action if indicated. If above (below) a WL twice in a row, stop and take action.
- If outside the action limits, stop the analyses, reject the data and identify the error.
[Figure: Shewhart chart showing UAL, UWL, mean, LWL and LAL. Mean = 95.0, s = 5.8, n = 2. Annotation: -ve bias in SRP]
[Figure: Shewhart chart showing UAL, UWL, mean, LWL and LAL. Mean = 0.129, s = 0.0087, n = 1]
- $0.129\ \mathrm{dm^3\,\mu mol^{-1}\,(5\,cm)^{-1}} = 25800\ \mathrm{dm^3\,mol^{-1}\,cm^{-1}}$
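The warning and action rules above can be sketched as a simple classifier. Everything in this example beyond the rules themselves is an assumption: the function name `classify`, the labels it returns, the reading of "twice in a row" as two consecutive points beyond the same warning limit, and the five data values, which are hypothetical and only chosen to exercise each rule. The mean and s are those quoted for the second chart (n = 1).

```python
import math

def classify(values, mu, s, n=1):
    """Label each plotted value as ok, warning or stop."""
    se = s / math.sqrt(n)
    labels = []
    previous_side = 0                      # +1 / -1 if the last point was beyond UWL / LWL
    for x in values:
        dev = (x - mu) / se                # deviation in units of s/sqrt(n)
        side = 1 if dev > 0 else -1
        if abs(dev) > 3:
            labels.append("stop: outside action limits")
            previous_side = 0
        elif abs(dev) > 2:
            if previous_side == side:
                labels.append("stop: two warnings in a row")
            else:
                labels.append("warning")
            previous_side = side
        else:
            labels.append("ok")
            previous_side = 0
    return labels

# Hypothetical results, not taken from the chart
results = [0.130, 0.148, 0.150, 0.125, 0.100]
labels = classify(results, mu=0.129, s=0.0087, n=1)
for x, label in zip(results, labels):
    print(f"{x:.3f}: {label}")
```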