Title: STATISTICAL TESTS AND ERROR ANALYSIS
3 PRECISION AND ACCURACY
PRECISION: Reproducibility of the result
ACCURACY: Nearness to the true value
5 How sure are you that the experimentally obtained
value is close to the true value? How close is
it?
Finding errors
Experimental error: the uncertainty present in every experiment (measurement)
6 SYSTEMATIC ERROR / DETERMINATE ERROR
- Reproducible under the same conditions in the same experiment
- Can be detected and corrected for
- It is always positive or always negative
- To detect a systematic error:
  - Use Standard Reference Materials
  - Run a blank sample
  - Use different analytical methods
  - Participate in round robin experiments (different labs and people running the same analysis)
7 RANDOM ERROR / INDETERMINATE ERROR
- Uncontrolled variables in the measurement
- Can be positive or negative
- Cannot be corrected for
- Random errors are independent of each other
- Random errors can be reduced by:
  - Better experiments (equipment, methodology, training of analyst)
  - Large number of replicate samples
Random errors show a Gaussian distribution for a large number of replicates.
They can be described using statistical parameters.
8 For a large number of experiment replicates the results approach an ideal smooth curve called the GAUSSIAN or NORMAL DISTRIBUTION CURVE.
Characterised by:
- The mean value $\bar{x}$, which gives the center of the distribution
- The standard deviation $s$, which measures the width of the distribution
9 The mean or average, $\bar{x}$ = the sum of the measured values ($x_i$) divided by the number of measurements ($n$):
$\bar{x} = \frac{\sum_i x_i}{n}$
The standard deviation, $s$, measures how closely the data are clustered about the mean (i.e. the precision of the data):
$s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$
NOTE: The quantity $n - 1$ = degrees of freedom
10 Other ways of expressing the precision of the data:
- Variance = $s^2$
- Relative standard deviation (RSD) = $s / \bar{x}$
- Percent RSD / coefficient of variation = $(s / \bar{x}) \times 100$
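The mean, standard deviation and the derived precision measures can be computed in a few lines. This is a minimal sketch using Python's standard `statistics` module; the four values are the mercury results quoted in the confidence interval example later in these slides, reused here only for illustration.

```python
import statistics

# Replicate results (ppm Hg), borrowed from the mercury example in these slides
values = [1.80, 1.58, 1.64, 1.49]

mean = statistics.mean(values)      # sum of x_i divided by n
s = statistics.stdev(values)        # sample standard deviation, uses n - 1
variance = s ** 2                   # variance = s^2
rsd = s / mean                      # relative standard deviation
percent_rsd = 100 * rsd             # percent RSD / coefficient of variation

# Round only when reporting, never in the intermediate steps
print(f"mean = {mean:.4f}, s = {s:.4f}, variance = {variance:.6f}, %RSD = {percent_rsd:.2f}")
```

Note that `statistics.stdev` divides by n - 1 (degrees of freedom), whereas `statistics.pstdev` divides by n and is meant for a complete population.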
11 POPULATION DATA
The experiment that produces the smaller standard deviation is more precise. Remember, greater precision does not imply greater accuracy.
12 The Gaussian curve equation:
$y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$
The factor $1/(\sigma\sqrt{2\pi})$ guarantees that the area under the curve is unity.
Probability of measuring a value in a certain range = area below the graph over that range
The Gaussian curve whose area is unity is called a normal error curve.
$\mu = 0$ and $\sigma = 1$
13 The standard deviation measures the width of the Gaussian curve. (The larger the value of $\sigma$, the broader the curve.)
Range              Percentage of measurements
$\mu \pm 1\sigma$  68.3
$\mu \pm 2\sigma$  95.5
$\mu \pm 3\sigma$  99.7
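The percentages in this table can be checked numerically. The sketch below assumes an ideal Gaussian and uses the standard result that the fraction of measurements within k standard deviations of the mean is erf(k / sqrt(2)); the function name `fraction_within` is introduced here for illustration only.

```python
import math

def fraction_within(k):
    """Fraction of a Gaussian population lying within mu +/- k*sigma."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {100 * fraction_within(k):.2f} %")
```

The value for 2 sigma comes out as 95.45 %, which the table quotes as 95.5.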
14 EXAMPLE: Replicate results were obtained for the analysis of lead in blood. Calculate the mean and the standard deviation of this set of data.
15 NB: Do not round a standard deviation calculation until the very end.
16 Also:
Variance = $s^2$
17 Lead is readily absorbed through the gastrointestinal tract. In blood, 95% of the lead is in the red blood cells and 5% in the plasma. About 70-90% of the lead assimilated goes into the bones, then liver and kidneys. Lead readily replaces calcium in bones. The symptoms of lead poisoning depend upon many factors, including the magnitude and duration of lead exposure (dose), chemical form (organic is more toxic than inorganic), the age of the individual (children and the unborn are more susceptible) and the overall state of health (Ca, Fe or Zn deficiency enhances the uptake of lead).
European Community Environmental Quality Directive: 50 µg/L in drinking water
World Health Organisation recommended tolerable intake of Pb per day for an adult: 430 µg
- Where does Pb come from?
  - Motor vehicle emissions
  - Lead plumbing
  - Pewter
  - Lead-based paints
  - Weathering of Pb minerals
Foodstuffs               < 2 mg/kg Pb
Next to highways         20-950 mg/kg Pb
Near battery works       34-600 mg/kg Pb
Metal processing sites   45-2714 mg/kg Pb
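18 CONFIDENCE INTERVALS
The confidence interval is given by:
$\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$
where $t$ is the value of Student's t taken from the table, $s$ is the standard deviation and $n$ is the number of measurements.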
19 A t test is used to compare sets of measurements. Usually 95% probability is good enough.
20 Example: The mercury content in fish samples was determined as follows: 1.80, 1.58, 1.64, 1.49 ppm Hg. Calculate the 50% and 90% confidence intervals for the mercury content.
$\bar{x} = 1.63$ ppm, $s = 0.13$ ppm, $n = 4$
50% confidence: $t = 0.765$ for $n - 1 = 3$
There is a 50% chance that the true mean lies between 1.58 and 1.68 ppm.
21 90% confidence: $t = 2.353$ for $n - 1 = 3$
There is a 90% chance that the true mean lies between 1.48 and 1.78 ppm.
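The mercury example can be reproduced as follows. This is a sketch that assumes the usual form of the interval, mean ± t·s/√n, with the t values for 3 degrees of freedom typed in by hand from a Student's t table (0.765 for 50%, 2.353 for 90%); the function name `confidence_interval` is introduced here for illustration.

```python
import math
import statistics

def confidence_interval(values, t):
    """Return (low, high) for mean +/- t*s/sqrt(n); t comes from a Student's t table."""
    n = len(values)
    mean = statistics.mean(values)
    half_width = t * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width

hg = [1.80, 1.58, 1.64, 1.49]              # ppm Hg, from the example above

ci50 = confidence_interval(hg, t=0.765)    # approx. (1.578, 1.677)
ci90 = confidence_interval(hg, t=2.353)    # approx. (1.474, 1.781)
print(ci50, ci90)
```

Carrying full precision gives a 90% lower limit of about 1.47 ppm; the 1.48 quoted on the slide follows from rounding the mean to 1.63 and the half-width to 0.15 before subtracting.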
22 Confidence intervals - experimental uncertainty
23 APPLYING STUDENT'S T
1) COMPARISON OF MEANS
Comparison of a measured result with a known (standard) value:
$t_{calc} = \frac{|\bar{x} - \text{known value}|}{s} \sqrt{n}$
$t_{calc} > t_{table}$ at the 95% confidence level
⇒ results are considered to be different
⇒ the difference is significant!
Statistical tests give only probabilities. They do not relieve us of the responsibility of interpreting our results!
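A sketch of case 1, assuming the usual form t_calc = |mean − known value| · √n / s. The measured values are the mercury results from these slides; the "known" value of 1.50 ppm is purely hypothetical and chosen only to show the mechanics. The table value 3.182 is Student's t for 95% confidence and 3 degrees of freedom.

```python
import math
import statistics

def t_vs_known(values, known):
    """t_calc for comparing a measured mean with a known (standard) value."""
    n = len(values)
    return abs(statistics.mean(values) - known) * math.sqrt(n) / statistics.stdev(values)

hg = [1.80, 1.58, 1.64, 1.49]      # measured values, ppm Hg
known_value = 1.50                 # HYPOTHETICAL reference value, for illustration only
T_TABLE_95_DF3 = 3.182             # Student's t, 95% confidence, n - 1 = 3

t_calc = t_vs_known(hg, known_value)
significant = t_calc > T_TABLE_95_DF3
print(f"t_calc = {t_calc:.3f}, significant at 95%: {significant}")
```

With these numbers t_calc is below the table value, so the difference would not be called significant; the interpretation of that outcome is still up to the analyst.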
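24 2) COMPARISON OF REPLICATE MEASUREMENTS
For 2 sets of data with number of measurements $n_1$, $n_2$ and means $\bar{x}_1$, $\bar{x}_2$:
$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
where $s_{pooled}$ = pooled standard deviation from both sets of data:
$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$
$t_{calc} > t_{table}$ at the 95% confidence level
⇒ the difference between results is significant.
Degrees of freedom = $n_1 + n_2 - 2$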
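A sketch of case 2, assuming the standard pooled form: s_pooled² = [s1²(n1 − 1) + s2²(n2 − 1)] / (n1 + n2 − 2) and t_calc = |x̄1 − x̄2| / s_pooled · √(n1·n2 / (n1 + n2)). The two data sets are hypothetical values made up for illustration, and `pooled_t` is a name introduced here.

```python
import math
import statistics

def pooled_t(set1, set2):
    """Return (t_calc, degrees_of_freedom) for two sets of replicate measurements."""
    n1, n2 = len(set1), len(set2)
    s1, s2 = statistics.stdev(set1), statistics.stdev(set2)
    dof = n1 + n2 - 2
    s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / dof)
    t_calc = abs(statistics.mean(set1) - statistics.mean(set2)) / s_pooled
    t_calc *= math.sqrt(n1 * n2 / (n1 + n2))
    return t_calc, dof

# HYPOTHETICAL replicate results from two runs, for illustration only
run_a = [10.1, 10.3, 10.2, 10.4]
run_b = [10.5, 10.6, 10.4]

t_calc, dof = pooled_t(run_a, run_b)
print(f"t_calc = {t_calc:.3f} with {dof} degrees of freedom")
```

The returned degrees of freedom tell you which row of the t table to read before comparing t_calc with t_table.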
25 3) COMPARISON OF INDIVIDUAL DIFFERENCES
Use two different analytical methods, A and B, to make single measurements on several different samples.
Perform a t test on the individual differences between results:
$t_{calc} = \frac{|\bar{d}|}{s_d} \sqrt{n}$, with $s_d = \sqrt{\frac{\sum_i (d_i - \bar{d})^2}{n - 1}}$
where $\bar{d}$ = the average difference between methods A and B, and $n$ = number of pairs of data
$t_{calc} > t_{table}$ at the 95% confidence level
⇒ the difference between results is significant.
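A sketch of case 3, assuming the usual paired form t_calc = |d̄| · √n / s_d, where s_d is the sample standard deviation of the individual differences and the degrees of freedom are n − 1. The method A and B results are hypothetical, and `paired_t` is a name introduced here.

```python
import math
import statistics

def paired_t(method_a, method_b):
    """Return (t_calc, degrees_of_freedom) for paired single measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    n = len(differences)
    d_mean = statistics.mean(differences)
    s_d = statistics.stdev(differences)      # uses n - 1
    return abs(d_mean) / s_d * math.sqrt(n), n - 1

# HYPOTHETICAL results for five samples measured once by each method
method_a = [2.52, 3.13, 4.33, 2.25, 2.79]
method_b = [2.50, 3.09, 4.37, 2.20, 2.77]

t_calc, dof = paired_t(method_a, method_b)
print(f"t_calc = {t_calc:.3f} with {dof} degrees of freedom")
```

Pairing removes the sample-to-sample variation, so only the differences between the two methods enter the test.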
27 $t_{table}$ for 95% confidence
Is $t_{calc} > t_{table}$?
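28 F TEST
COMPARISON OF TWO STANDARD DEVIATIONS
$F_{calc} = \frac{s_1^2}{s_2^2}$, with the larger standard deviation in the numerator so that $F_{calc} \geq 1$
$F_{calc} > F_{table}$ at the 95% confidence level
⇒ the standard deviations are considered to be different
⇒ the difference is significant.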
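A sketch of the F calculation, assuming the usual convention that F_calc is the ratio of the two variances with the larger one on top. The standard deviations are hypothetical; F_table must still be looked up for the relevant degrees of freedom and is deliberately not hard-coded here.

```python
def f_calc(s1, s2):
    """Ratio of variances with the larger standard deviation in the numerator."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return larger**2 / smaller**2

# HYPOTHETICAL standard deviations from two methods, for illustration only
s_method_1 = 0.20
s_method_2 = 0.10

F = f_calc(s_method_1, s_method_2)
print(f"F_calc = {F:.2f}")   # compare with F_table at the 95% confidence level
```

Because the function orders its arguments itself, `f_calc(0.10, 0.20)` gives the same result.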
30 Q TEST FOR BAD DATA
$Q_{calc} = \frac{\text{gap}}{\text{range}}$
The range is the total spread of the data. The gap is the difference between the bad point and the nearest value.
Example: 12.2  12.4  12.5  12.6  12.9
Gap = 12.9 - 12.6 = 0.3
Range = 12.9 - 12.2 = 0.7
If $Q_{calc} > Q_{table}$ ⇒ discard the questionable point
31 EXAMPLE: The following replicate analyses were obtained when standardising a solution: 0.1067 M, 0.1071 M, 0.1066 M and 0.1050 M. One value appears suspect. Determine if it can be ascribed to accidental error at the 90% confidence level.
Arrange in increasing order: 0.1050 M (suspect)  0.1066 M  0.1067 M  0.1071 M
Gap = 0.1066 - 0.1050 = 0.0016
Range = 0.1071 - 0.1050 = 0.0021
$Q_{calc} = \frac{0.0016}{0.0021} = 0.7619$
$Q_{table} = 0.76$
$Q_{calc} > Q_{table}$ ⇒ can reject
BUT these values are very close ⇒ rather do another analysis to confirm!
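The Q calculation is easy to script. This sketch assumes the suspect point is whichever end of the sorted data sits furthest from its neighbour; the function name `q_calc` is introduced here, and the table value 0.76 is the one quoted on the slide for four values at 90% confidence.

```python
def q_calc(values):
    """Q = gap / range, taking the suspect point as the more isolated extreme."""
    data = sorted(values)
    data_range = data[-1] - data[0]        # total spread of the data
    gap_low = data[1] - data[0]            # gap if the lowest value is suspect
    gap_high = data[-1] - data[-2]         # gap if the highest value is suspect
    return max(gap_low, gap_high) / data_range

molarities = [0.1067, 0.1071, 0.1066, 0.1050]
Q_TABLE_90_N4 = 0.76                       # value quoted on the slide

q = q_calc(molarities)
print(f"Q_calc = {q:.4f}, reject suspect point: {q > Q_TABLE_90_N4}")
```

A result this close to the table value is exactly the case where another replicate is worth more than the test itself.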
32 STATISTICS OF SAMPLING
A chemical analysis can only be as meaningful as the sample!
Sampling: the process of collecting a representative sample for analysis
OVERALL VARIANCE = ANALYTICAL VARIANCE + SAMPLING VARIANCE
$s_o^2 = s_a^2 + s_s^2$
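33 Where does the sampling variance come from?
34 How much of the sample should be analysed?
For $n$ particles drawn from a mixture of two kinds of particles, the standard deviation in the number of particles of one kind is
$\sigma_n = \sqrt{npq}$
where $p$, $q$ = fractions of each kind of particle present.
Relative standard deviation: $R = \frac{\sigma_n}{n} = \sqrt{\frac{pq}{n}}$
⇒ $nR^2 = pq$
The mass of sample ($m$) is proportional to the number of particles ($n$) drawn, therefore $K_s = mR^2$
where $R$ = RSD as a % and $K_s$ (sampling constant) = mass of sample required to reduce the relative sampling standard deviation to 1%.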
35 How many samples/replicates to analyse?
Rearranging Student's t equation:
$\mu - \bar{x} = \pm \frac{t s_s}{\sqrt{n}} \Rightarrow n = \frac{t^2 s_s^2}{e^2}$
$\mu$ = true population mean, $\bar{x}$ = measured mean, $n$ = number of samples needed, $s_s^2$ = variance of the sampling operation, $e$ = sought-for uncertainty
Since the number of degrees of freedom is not known at this stage, the value of $t$ for $n \to \infty$ is used to estimate $n$. The process is then repeated a few times until a constant value for $n$ is found.
36 Example: In analysing a lot with random sample variation, there is a sampling deviation of ±5%. Assuming negligible error in the analytical procedure, how many samples must be analysed to give 90% confidence that the error in the mean is within ±4% of the true value?
For 90% confidence, with $n = t^2 s_s^2 / e^2 = t^2 (5)^2 / (4)^2$:
$t_\infty = 1.645$   n = 4.23 (n ≈ 4)
$t_3 = 2.353$   n = 8.65 (n ≈ 9)
$t_8 = 1.860$   n = 5.41 (n ≈ 5)
$t_4 = 2.132$   n = 7.10 (n ≈ 7)
$t_6 = 1.943$   n = 5.90 (n ≈ 6)
$t_5 = 2.015$   n = 6.34 (n ≈ 6)
⇒ n = 6
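The iteration above can be written as a short loop. This sketch assumes n = t²·s_s²/e² is rounded to the nearest whole number at each step and that t is then re-read for n − 1 degrees of freedom. The dictionary holds 90% Student's t values typed in by hand, and the names `T_90`, `T_90_INF` and `samples_needed` are introduced here for illustration.

```python
# Student's t at 90% confidence, keyed by degrees of freedom (entered by hand)
T_90 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
        6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812}
T_90_INF = 1.645   # value for infinite degrees of freedom, used as the first guess

def samples_needed(s_s, e, max_iter=20):
    """Iterate n = t^2 * s_s^2 / e^2 until n stops changing."""
    t = T_90_INF
    n_previous = None
    for _ in range(max_iter):
        n = round(t**2 * s_s**2 / e**2)
        if n == n_previous:
            return n
        n_previous = n
        # Degrees of freedom outside the small table fall back to the infinite value
        t = T_90.get(n - 1, T_90_INF)
    return n_previous   # not converged within max_iter; return the last estimate

n = samples_needed(s_s=5, e=4)   # sampling deviation 5%, target uncertainty 4%
print(n)
```

For the example above the loop passes through 4, 9, 5, 7, 6 and settles at 6, the same sequence as the hand calculation.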
38 SAMPLE STORAGE
Not only are the sampling and sample preparation important, but the sample storage is also critical.
- The composition of the sample may change with time due to, for example, the following:
  - reaction with air
  - reaction with light
  - absorption of moisture
  - interaction with the container
Glass is a notorious ion exchanger which can alter the concentration of trace ions in solution. Thus plastic (especially Teflon) containers are frequently used. Ensure all containers are clean to prevent contamination.
39 EXAMPLE (for you to do): Consider a random mixture containing 4.00 g of Na2CO3 ($\rho$ = 2.532 g/mL) and 96.00 g of K2CO3 ($\rho$ = 2.428 g/mL) with an approximated uniform spherical radius of 0.075 mm. How many particles of Na2CO3 are in the mixture? And K2CO3?
Na2CO3: 4.00 g at 2.532 g/mL = 1.58 mL = 1.58 cm³
K2CO3: 96.00 g at 2.428 g/mL = 39.54 mL = 39.54 cm³
40 V(Na2CO3) = 1.58 cm³, V(K2CO3) = 39.54 cm³
Particles: r = 0.075 mm = 0.0075 cm, so the volume of one particle is $\frac{4}{3}\pi r^3 = 1.767 \times 10^{-6}$ cm³
n(Na2CO3) = $\frac{1.58}{1.767 \times 10^{-6}} = 8.94 \times 10^5$ particles
n(K2CO3) = $\frac{39.54}{1.767 \times 10^{-6}} = 2.24 \times 10^7$ particles
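The particle counts can be checked with a short script. This sketch assumes every particle is a sphere of the stated radius, so the count is the total volume (mass divided by density) divided by the volume of one sphere; `particle_count` is a name introduced here.

```python
import math

def particle_count(mass_g, density_g_per_ml, radius_cm):
    """Number of identical spherical particles in a given mass of material."""
    total_volume = mass_g / density_g_per_ml             # cm^3, since 1 mL = 1 cm^3
    particle_volume = 4 / 3 * math.pi * radius_cm**3     # volume of one sphere, cm^3
    return total_volume / particle_volume

RADIUS_CM = 0.0075   # 0.075 mm

n_na2co3 = particle_count(4.00, 2.532, RADIUS_CM)    # approx. 8.94e5
n_k2co3 = particle_count(96.00, 2.428, RADIUS_CM)    # approx. 2.24e7
print(f"{n_na2co3:.3g} Na2CO3 particles, {n_k2co3:.3g} K2CO3 particles")
```

Keeping the radius in centimetres matches the cm³ volumes, which avoids a factor-of-1000 slip between mm³ and cm³.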
41 EXAMPLE: Consider a random mixture containing 4.00 g of Na2CO3 ($\rho$ = 2.532 g/mL) and 96.00 g of K2CO3 ($\rho$ = 2.428 g/mL) with an approximated uniform spherical radius of 0.075 mm. What is the expected number of particles in 0.100 g of the mixture?
$8.94 \times 10^5 + 2.24 \times 10^7 = 2.33 \times 10^7$ particles in a 4.00 + 96.00 = 100.00 g sample
⇒ $2.33 \times 10^4$ particles in a 0.1 g sample
Also: $8.94 \times 10^2$ particles of Na2CO3 and $2.24 \times 10^4$ particles of K2CO3 in a 0.1 g sample
42 EXAMPLE: Calculate the relative standard deviation in the number of particles for each type in the 0.100 g sample of the mixture.
$n = 2.33 \times 10^4$ particles
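One way to set up this calculation is sketched below. It assumes the standard binomial sampling relation, standard deviation = √(npq), which is consistent with the nR² = pq relation on slide 34, and it takes the particle counts for the 0.100 g sample from slide 41. The variable names are introduced here for illustration.

```python
import math

# Expected particle counts in the 0.100 g sample (from slide 41)
n_na2co3 = 8.94e2
n_k2co3 = 2.24e4
n_total = n_na2co3 + n_k2co3

p = n_na2co3 / n_total            # fraction of Na2CO3 particles
q = n_k2co3 / n_total             # fraction of K2CO3 particles

# Assumed binomial sampling: the same absolute sigma applies to both counts
sigma = math.sqrt(n_total * p * q)

rsd_na2co3 = 100 * sigma / n_na2co3   # approx. 3.28 %
rsd_k2co3 = 100 * sigma / n_k2co3     # approx. 0.131 %
print(f"RSD Na2CO3 = {rsd_na2co3:.2f} %, RSD K2CO3 = {rsd_k2co3:.3f} %")
```

The minor component carries the much larger relative uncertainty, because the same absolute spread is divided by a far smaller particle count.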