Title: Statistical Treatment of Data
- Statistical Treatment of Data
- Statistics tell us whether we can be confident in a result at a given confidence level.
- Remember, an experiment is repeated because errors exist.
- Thus, there are no absolutes when reporting a "determined value."
- We can only reasonably repeat an experiment a small number of times, not an infinite number of times.
- Question: So, what information can we get from finite sampling?
- Answer: Sample parameters (vs. population parameters)
- Arithmetic mean (average): $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- n = number of values
- x_i = individual values
- Standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$; s is a measure of random error
- Variance = $s^2$ (c.f. Measurements and Errors)
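As a minimal sketch of these sample parameters, the following uses Python's standard `statistics` module. The five values are borrowed from the confidence-interval example later in these slides, and the variable names are illustrative only.

```python
import statistics

# Replicate measurements (taken from the worked example later in the slides).
values = [10.19, 9.89, 9.98, 10.35, 10.41]

n = len(values)
mean = statistics.mean(values)          # arithmetic mean: sum(x_i) / n
s = statistics.stdev(values)            # sample standard deviation (n - 1 denominator)
variance = statistics.variance(values)  # sample variance, s**2

print(f"n = {n}, mean = {mean:.3f}, s = {s:.3f}, variance = {variance:.4f}")
```

Note that `statistics.stdev` uses the n - 1 denominator appropriate for a finite sample, whereas `statistics.pstdev` would treat the data as the whole population.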
- So if s is small, the values are close to the mean $\bar{x}$, and thus confidence in a measurement is high.
- Also, on the topic of sample parameters:
- Median: the value about which there are an equal number of points above and below.
- Range = highest value - lowest value
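The median and range can be sketched the same way. The data set here is the one used in the Q-test example later in these slides, and the names are illustrative.

```python
import statistics

# Five replicate values (taken from the Q-test example later in the slides).
values = [5.96, 5.83, 5.89, 5.68, 5.85]

median = statistics.median(values)      # middle value of the sorted data
data_range = max(values) - min(values)  # highest value - lowest value

print(f"median = {median}, range = {data_range:.2f}")
```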
- Now, where does the Gaussian Distribution (or Normal Distribution) come from?
- Example from Davies and Goldsmith, "Statistical Methods in Research and Production": "Analysis of Carbon in a Powder"
- If an infinite number of samples could be taken, we would obtain a smooth curve.
- The smooth curve is known as the Normal Distribution Curve or Gaussian Curve.
- It can be described by the equation
- $y = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$
- σ determines the breadth of the curve.
- σ1 < σ2 < σ3 < σ4
- If we express the abscissa in units of σ, life is easier:
- $z = \frac{x - \mu}{\sigma}$
- In all of the curves above and those we will work with, the total area under the curve is unity (probability of finding any value = 1).
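The following sketch assumes the standard form of the normal curve, y = exp(-(x - µ)²/(2σ²)) / (σ√(2π)), and checks numerically that the area stays at unity as σ changes. The function names and the σ values are hypothetical, chosen only for illustration.

```python
import math

def gaussian(x, mu, sigma):
    """Height of the normal distribution curve at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def z_score(x, mu, sigma):
    """Express x as a number of standard deviations away from the mean."""
    return (x - mu) / sigma

def area_under_curve(mu, sigma, steps=10_000):
    """Trapezoidal estimate of the area from mu - 8*sigma to mu + 8*sigma."""
    lo, hi = mu - 8.0 * sigma, mu + 8.0 * sigma
    h = (hi - lo) / steps
    total = 0.5 * (gaussian(lo, mu, sigma) + gaussian(hi, mu, sigma))
    for i in range(1, steps):
        total += gaussian(lo + i * h, mu, sigma)
    return total * h

# Hypothetical sigmas: a larger sigma lowers and broadens the curve,
# but the total area remains 1.
for sigma in (0.5, 1.0, 2.0):
    peak = gaussian(0.0, 0.0, sigma)
    area = area_under_curve(0.0, sigma)
    print(f"sigma = {sigma}: peak height = {peak:.4f}, area = {area:.6f}")
```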
- The area under the curve, or under any portion of it, is directly related to the probability of finding a value between the defined limits:
- µ ± 1σ
- µ ± 2σ
- µ ± 3σ
- (Note: σ (or s) is unique for a given data set.)
- So, the probability of measuring z in a certain range is given by the area under the curve over that range.
- This allows us to assign some confidence to determined values; thus, Confidence Limits can be used for data.
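For a normal curve, the area between µ - kσ and µ + kσ can be written in terms of the error function as erf(k/√2). The short sketch below evaluates that expression for the three limits listed above; the function name `area_within` is illustrative.

```python
import math

def area_within(k):
    """Fraction of the normal curve between mu - k*sigma and mu + k*sigma."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {100 * area_within(k):.2f}% of the area")
```

These are the familiar figures of roughly 68.3%, 95.5%, and 99.7%.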
- Evaluation of Data
- Student's t: expresses the degree of confidence that the true mean µ is likely to fall within a particular interval around the measured mean $\bar{x}$, otherwise known as a Confidence Interval: $\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$
- Values of t are tabulated in Harris.
- Let's do a Confidence Interval calculation.
- Data: 10.19, 9.89, 9.98, 10.35, 10.41
- n = 5, $\sqrt{n}$ = 2.236, $\bar{x}$ = 10.164
- $s = \left[ \frac{(10.19 - 10.16)^2 + (9.89 - 10.16)^2 + (9.98 - 10.16)^2 + (10.35 - 10.16)^2 + (10.41 - 10.16)^2}{5 - 1} \right]^{1/2}$
- s = 0.226
- t at 95% confidence (n = 5, so 4 degrees of freedom) is 2.776
- $\mu = 10.164 \pm (2.776)(0.226)/\sqrt{5}$
- µ = 10.16 ± 0.28 at 95% confidence
- and at 99% confidence, µ = 10.16 ± 0.47
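The calculation above can be reproduced with a short sketch. The t values are hard-coded rather than computed: 2.776 is the value quoted in the slides, while 4.604 is the standard tabulated value for 99% confidence and 4 degrees of freedom, which is an assumption here because the slides only show the resulting ±0.47. The helper name is hypothetical.

```python
import math
import statistics

data = [10.19, 9.89, 9.98, 10.35, 10.41]

# Student's t for n - 1 = 4 degrees of freedom.
# 95%: quoted in the slides. 99%: standard table value (assumed here).
T_TABLE_4_DOF = {95: 2.776, 99: 4.604}

def confidence_half_width(values, t):
    """Return t * s / sqrt(n), the +/- part of the confidence interval."""
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    return t * s / math.sqrt(len(values))

mean = statistics.mean(data)
half_95 = confidence_half_width(data, T_TABLE_4_DOF[95])
half_99 = confidence_half_width(data, T_TABLE_4_DOF[99])

print(f"mu = {mean:.2f} +/- {half_95:.2f} at 95% confidence")
print(f"mu = {mean:.2f} +/- {half_99:.2f} at 99% confidence")
```

Demanding higher confidence widens the interval, because t grows while s and n stay fixed.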
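- t-Test: If two procedures are used to analyze for a particular quantity (giving means $\bar{x}_1$ and $\bar{x}_2$) and we want to see if the two means are different, use the "t Test":
- $t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
- pooled s: $s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$
- If $t_{calc} > t_{tab}$, the two results are significantly different at the confidence level in question.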
- t-Test Example: Analysis of a solution of Cl- using two different methods
- Trial         1       2       3       4       5
- Method 1 (M)  0.2876  0.2871  0.2878  0.2880  0.2875
- Method 2 (M)  0.2843  0.2848  0.2839
- $\bar{x}_1$ = 0.2876, $\bar{x}_2$ = 0.2843
- $s_{pooled} = 3.8 \times 10^{-4}$
- $t_{calc} = \frac{0.2876 - 0.2843}{3.8 \times 10^{-4}} \sqrt{\frac{5 \times 3}{5 + 3}} \approx 12$
- For 6 degrees of freedom ($n_1 + n_2 - 2$) at the 95% confidence level, $t_{tab}$ = 2.447
- Because $t_{calc} > t_{tab}$, the two methods are significantly different at the 95% level.
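A sketch of the same comparison in code, assuming the usual pooled-standard-deviation form of the t test, t = |x̄1 - x̄2| / s_pooled × √(n1 n2 / (n1 + n2)). The function names are hypothetical, and the tabulated t is hard-coded from the slides. Because the code carries unrounded intermediate values, its output can differ slightly in the last digit from a hand calculation.

```python
import math
import statistics

method_1 = [0.2876, 0.2871, 0.2878, 0.2880, 0.2875]
method_2 = [0.2843, 0.2848, 0.2839]

def pooled_std(a, b):
    """Pooled standard deviation of two data sets."""
    ss_a = statistics.variance(a) * (len(a) - 1)  # sum of squared deviations, set a
    ss_b = statistics.variance(b) * (len(b) - 1)  # sum of squared deviations, set b
    return math.sqrt((ss_a + ss_b) / (len(a) + len(b) - 2))

def t_calc(a, b):
    """Calculated t for comparing the means of two data sets."""
    n1, n2 = len(a), len(b)
    diff = abs(statistics.mean(a) - statistics.mean(b))
    return diff / pooled_std(a, b) * math.sqrt(n1 * n2 / (n1 + n2))

T_TAB_95_6_DOF = 2.447  # tabulated t, 95% confidence, 6 degrees of freedom (from the slides)

s_p = pooled_std(method_1, method_2)
t = t_calc(method_1, method_2)
significantly_different = t > T_TAB_95_6_DOF

print(f"s_pooled = {s_p:.1e}, t_calc = {t:.1f}, different at 95%: {significantly_different}")
```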
- Bad Data -- Get rid of it? The Q-test
- Question: Is the datum (one point) inconsistent enough to neglect it in future calculations?
- Answer: The best rejection test is known as the Q-test.
- $Q_{calc} = \frac{\text{gap}}{\text{range}} = \frac{|\text{nearest point} - \text{bad point}|}{\text{high} - \text{low}}$
- If $Q_{calc} > Q_{tab}$, reject the point (at the 90% confidence limit).
- Example: 5.96, 5.83, 5.89, 5.68, 5.85
- The suspect point is 5.68: $Q_{calc}$ = (5.83 - 5.68)/(5.96 - 5.68) = 0.15/0.28 = 0.54
- For n = 5, $Q_{tab}$(90%) = 0.64.
- $Q_{tab} > Q_{calc}$, so keep the point.
- (Never apply any statistical test with removal to fewer than 4 points; 3 must remain.)
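The Q-test can be sketched as a small function. This version assumes the suspect point is whichever extreme value sits farther from its nearest neighbour; the function name and return shape are hypothetical, and the tabulated Q must be supplied by the caller.

```python
def q_test(values, q_tab):
    """Return (suspect, q_calc, reject) for the most extreme point in values."""
    if len(values) < 4:
        raise ValueError("need at least 4 points so that 3 remain after a rejection")
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")

    gap_low = ordered[1] - ordered[0]     # gap between lowest point and its neighbour
    gap_high = ordered[-1] - ordered[-2]  # gap between highest point and its neighbour
    if gap_low >= gap_high:
        suspect, gap = ordered[0], gap_low
    else:
        suspect, gap = ordered[-1], gap_high

    q_calc = gap / data_range
    return suspect, q_calc, q_calc > q_tab

data = [5.96, 5.83, 5.89, 5.68, 5.85]
suspect, q_calc, reject = q_test(data, q_tab=0.64)  # Q_tab for n = 5 at 90%, from the slides
print(f"suspect = {suspect}, Q_calc = {q_calc:.2f}, reject = {reject}")
```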
- Calibration of Instrument Responses with Analyte Properties: Linear Least-Squares Regression Method
- Calibration Curves
- Make standard samples of known analyte amounts
- Make the amounts in each standard different
- Measure the response of each standard sample
- Compensate for any background responses (non-zero y-intercept)
- Plot response versus amount in each standard
- Thus, for any future response (an unknown), we can obtain the property of that sample if and only if we have a mathematical relationship.
- The Linear Least-Squares Regression Method fits a line through the data points by minimizing the vertical (ordinate, or y) deviations between the points and the calculated line.
- Linear Least-Squares Method
- Based on minimizing the sum of the squares of the ordinate deviations (y-residual values)
- Minimization of the residuals leads to equations for the best slope and intercept
- Also, a litmus-test number, the Correlation Coefficient r, can be calculated
- r should be > 0.99 for a truly linear correlation
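A minimal sketch of a least-squares calibration, using the standard closed-form expressions for slope, intercept, and r that result from minimizing the sum of squared y-residuals. The calibration data and the unknown's response below are hypothetical, invented only to illustrate the steps; they are not from the slides.

```python
import math

def linear_least_squares(x, y):
    """Return (slope, intercept, r) for the best-fit line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient
    return slope, intercept, r

# Hypothetical standards: known analyte amounts and measured responses.
amounts = [0.0, 1.0, 2.0, 3.0, 4.0]
responses = [0.1, 2.1, 3.9, 6.1, 7.9]

slope, intercept, r = linear_least_squares(amounts, responses)

# Invert the calibration line to estimate the amount in an unknown sample.
unknown_response = 5.0  # hypothetical measurement
unknown_amount = (unknown_response - intercept) / slope

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.4f}")
print(f"unknown amount = {unknown_amount:.2f}")
```

The non-zero intercept plays the role of the background response, and subtracting it before dividing by the slope is how the unknown is corrected for it.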
- A Few Bits and Pieces
- You may have heard about the Limit of Detection and Sensitivity when it comes to Calibration Plots.
- SIGNAL Limit of Detection (SLOD): the smallest amount of analyte giving a response significantly different from a blank or background response
- ANALYTE Limit of Detection
- Sensitivity of Response Curve (calibration plot): the slope of the response curve
- Also, what is that thing called the Matrix?
- All of the other things in the sample
- Causes background responses
- Can alter the response of the analyte
- Can interfere with the analyte response