Statistical Methods For UO Lab Part 1

1
Statistical Methods For UO Lab Part 1

Calvin H. Bartholomew
Chemical Engineering
Brigham Young University

2
Background

Statistics is the science of problem-solving in
the presence of variability (Mason 2003).
Statistics enables us to
Assess the variability of measurements
Avoid bias from unconsidered causes of variation
Determine probability of factors, risks
Build good models
Obtain best estimates of model parameters
Improve chances of making correct decisions
Make most efficient and effective use of
resources

3
Some U.S. Cultural Statistics

58.4% of us have called in to work sick when we weren't.
3 out of 4 of us store our dollar bills in rigid
order with singles leading up to higher
denominations.
50% admit they regularly sneak food into movie
theaters to avoid the high prices of snack foods.
39% of us peek in our host's bathroom cabinet.
17% have been caught by the host.
81.3% would tell an acquaintance to zip his
pants.
29% of us ignore RSVP.
35% give to charity at least once a month.
71.6% of us eavesdrop.

4
Population vs. Sample Statistics

Population statistics
Characterizes the entire population, which is
generally the unknown information we seek
Mean generally designated μ
Variance and standard deviation generally
designated as σ² and σ, respectively

Sample statistics
Characterizes a random, hopefully representative,
sample (typically the data) from which we infer
population statistics
Mean generally designated x̄
Variance and standard deviation generally
designated as s² and s, respectively
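
As a minimal sketch of how the sample statistics above are computed, the following uses Python's standard library on a few made-up temperature readings (the values are hypothetical, not data from the lab). `statistics.variance` and `statistics.stdev` divide by n − 1, which is the convention for a sample; `pvariance` and `pstdev` divide by n and apply only when the data are the whole population.

```python
import statistics

# Hypothetical temperature readings (illustrative values only).
readings = [24.1, 25.3, 23.8, 24.9, 25.0]

x_bar = statistics.mean(readings)     # sample mean, an estimate of mu
s2 = statistics.variance(readings)    # sample variance s^2 (divides by n - 1)
s = statistics.stdev(readings)        # sample standard deviation s

print(f"mean = {x_bar:.2f}, s^2 = {s2:.3f}, s = {s:.3f}")
```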

5
Point vs. Model Estimation

Model development
Characterizes a function of dependent variables
Complexity of parameter estimation and
statistical analysis depend on model complexity
Parameter estimation and especially statistics
are somewhat ambiguous

Point estimation
Characterizes a single, usually global
measurement
Generally simple mathematical and statistical
analysis
Procedures are unambiguous

6
Overall Approach

Use sample statistics to estimate population
statistics
Use statistical theory to indicate the accuracy
with which the population statistics have been
estimated
Use linear or nonlinear regression
methods/statistics to fit data to a model and to
determine goodness of fit
Use trends indicated by theory to optimize
experimental design

7
Sample Statistics

Estimate properties of probability distribution
function (PDF), i.e., mean and standard deviation
using Gaussian statistics
Use Student's t-test to determine variance and
confidence interval
Estimate random errors in the measurement of data
For variables that are geometric functions of
several basic variables, use the propagation of
errors approach to estimate (a) probable error (PE)
and (b) maximum possible error (MPE)
PE and MPE can be estimated by the differential
method; MPE can also be estimated by the brute force
method
Determine systematic errors (bias)
Comparing estimated errors from measurements with
calculated errors from statistics will reveal
whether the methods of measurement or the quantity of
data is limiting

8
Random Error: Single Variable (e.g., T)
Questions

Several measurements
are obtained for a
single variable (e.g., T).
What is the true value?
How confident are you?
Is the value different on
different days?

9
How do you determine bounds of μ?

Let's assume a normal (Gaussian) distribution
For small sample: s is known (we'll pursue this approach)
For large sample: σ is assumed (use z tables for this approach)
10
Example 1
11
Properties of a Normal PDF

About 68.26%, 95.44%, and 99.74% of data lie
within 1, 2, and 3 standard deviations of the
mean, respectively.
When mean is zero and standard deviation is 1, it
is referred to as a standard normal distribution.
Plays fundamental role in statistical analysis
because of the Central Limit Theorem.
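
The coverage figures for a normal distribution can be checked directly from the error function. This is a small sketch using only the standard library; the helper name `fraction_within` is made up for illustration. The computed values agree with the figures quoted above to within about 0.01 percentage points, the difference being rounding in older tables.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.2f}%")
```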

12
Central Limit Theorem

Distribution of means calculated from a large
data set is approximately normal
Becomes more accurate with larger number of
samples
Sample mean approaches true mean as n → ∞
Assumes distributions are not peaked close to a
boundary and variances are finite
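
A quick simulation can make the Central Limit Theorem concrete. The sketch below assumes a uniform(0, 1) population (clearly not normal) and arbitrary choices of n = 30 and 2000 trials; none of these numbers come from the slides. The spread of the sample means should land close to σ/√n.

```python
import random
import statistics

random.seed(0)   # fixed seed so the sketch is repeatable

n = 30           # observations averaged per sample (assumed value)
trials = 2000    # number of sample means to generate (assumed value)

# Uniform(0, 1) population: mean 0.5, variance 1/12.
sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
predicted_spread = (1 / 12) ** 0.5 / n ** 0.5   # sigma / sqrt(n)

print(f"mean of sample means   = {mean_of_means:.4f}")
print(f"spread of sample means = {spread_of_means:.4f}")
print(f"sigma / sqrt(n)        = {predicted_spread:.4f}")
```

Plotting a histogram of `sample_means` would show the familiar bell shape even though the underlying population is flat.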

13
Student t-Distribution

Widely used in hypothesis testing and determining
confidence intervals
Equivalent to normal distribution for large
sample size
"Student" is a pseudonym, not an adjective; the actual
name was W. S. Gosset, who published in the early
1900s.

14
Student t-Distribution

Used to compute confidence intervals according to
μ = x̄ ± t·s/√n
Assumes mean and variance are estimated by sample
values
Value of t decreases as DOF (or number of data
points n) increases, and increases with increasing confidence

15
Student t-test (determine error from s)
α = 1 − probability, r = n − 1, error = t·s/n^0.5
e.g., from Example 1: n = 7, s = 3.27
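
The error formula can be sketched in a few lines. The t values below are two-tailed critical values copied from standard tables (only a few rows are included), and the function name `t_error` is made up for illustration. The slide does not state which confidence level was used for Example 1, so both 95% and 99% are shown as assumptions.

```python
import math

# Two-tailed critical values of Student's t from standard tables.
# Inner keys are degrees of freedom r = n - 1; only a few rows are included.
T_TABLE = {
    0.95: {4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365},
    0.99: {4: 4.604, 5: 4.032, 6: 3.707, 7: 3.499},
}

def t_error(s: float, n: int, confidence: float) -> float:
    """Half-width of the confidence interval on the mean: t * s / sqrt(n)."""
    t = T_TABLE[confidence][n - 1]
    return t * s / math.sqrt(n)

# Example 1 values from the slide: n = 7, s = 3.27.
for confidence in (0.95, 0.99):
    err = t_error(3.27, 7, confidence)
    print(f"{confidence:.0%} confidence: error = {err:.2f}")
```

The error grows with the confidence level because t grows; it shrinks with more data both through the smaller t and through the √n in the denominator.
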
16
Values of Student t Distribution

Depend on both confidence level desired and
amount of data.
Degrees of freedom are n − 1, where n = number of
data points (assumes mean and variance are
estimated from data).
This table assumes two-tailed distribution of
area.

17
Example 2

Five data points with sample mean and standard
deviation of 713.6 and 107.8, respectively.
The estimated population mean and 95% confidence
interval are (from previous table, tα = 2.77645)
μ = 713.6 ± 2.77645 × 107.8/√5 ≈ 713.6 ± 133.9

18
Example 3: Comparing Averages
Day 1 Day 2
What is your confidence that μx ≠ μy?
99% confident different, 1% confident same
DOF = nx + ny − 2
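
The data for this example did not survive in the transcript, so the sketch below uses made-up readings for the two days. It assumes the comparison is a pooled (equal-variance) two-sample t-test, which is the usual test that has nx + ny − 2 degrees of freedom; the function name `pooled_t` is hypothetical.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (t statistic, degrees of freedom) for two samples,
    assuming both come from populations with equal variance."""
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    # Pooled variance weights each sample variance by its own DOF.
    sp2 = ((nx - 1) * statistics.variance(x)
           + (ny - 1) * statistics.variance(y)) / dof
    diff = statistics.mean(x) - statistics.mean(y)
    return diff / math.sqrt(sp2 * (1 / nx + 1 / ny)), dof

# Hypothetical readings for two days (illustrative values only).
day1 = [10.0, 12.0, 11.0, 13.0]
day2 = [14.0, 16.0, 15.0, 17.0]

t_stat, dof = pooled_t(day1, day2)
T_99_TWO_TAILED_DOF6 = 3.707   # standard t table, 6 degrees of freedom

print(f"t = {t_stat:.3f}, DOF = {dof}")
print("different at 99% confidence:", abs(t_stat) > T_99_TWO_TAILED_DOF6)
```

If |t| exceeds the tabulated value for the chosen confidence level and DOF, the two means are judged different at that level.
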
19
Error Propagation: Multiple Variables
Obtain value (e.g., from model) using multiple
input variables. What is the uncertainty of your
value? Each input variable has its own error.
Example: How much ice cream do you buy for
the AIChE event? Ice cream = f(time of day, tests, …)
Example: You take measurements of ρ, A, v to
determine m = ρAv. What is the
range of m and its associated uncertainty?
20
Value and Uncertainty

Values are used to make decisions by managers;
uncertainty of a value must be specified
Ethics and societal impact of values are
important
How do you determine the uncertainty of a value?

Sources of uncertainty
Estimation- we guess!
Discrimination- device accuracy (single data
point)
Calibration- may not be exact (error of curve
fit)
Technique- e.g., measure ID rather than OD
Constants and data- not always exact!
Noise- which reading do we take?
Model and equations- e.g., ideal gas law vs real
gas
Humans- transposing,

21
Estimates of Error (δ) for Input
Variable (Methods or rules)

1. Measured variable (as we just did): measure
multiple times, obtain s;
δ ≈ 2.57 s (t chart shows > 2.57 s for 99%
confidence);
e.g., s = 2.3 ºC for thermocouple, δ = 5.8 ºC
2. Tabulated variable: δ = 2.57 times last
reported significant digit (e.g., ρ = 1.0 g/ml
at 0 ºC, δ = 0.257 g/ml)

22
Estimates of Error (δ) for Variable

3. Manufacturer specs: use given accuracy data
(e.g., pump is ±1 ml/min, δ = 1 ml/min)
4. Variable from regression (i.e., calibration
curve): δ = standard error (e.g., velocity from
equation with std error of 2 m/s)
5. Judgment for a variable: use judgment for δ
(e.g., graph gives pressure to 1 psi, δ = 1 psi)

23
Calculating Maximum or Probable Error

Maximum error can be calculated as shown
previously
Brute force method
Differential method
Probable error is more realistic: positive and
negative errors can partially cancel and lower the error. You need
standard deviations (σ or s) to calculate
probable error (PE) (see previous
example). PE = δ = 2.57 σ

y ± 1.96 √(σy²) (95%)
y ± 2.57 √(σy²) (99%)
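
The transcript does not show how the variance of y is obtained. One common choice, stated here as an assumption rather than as the slide's own formula, is the first-order propagation-of-errors expression for independent input variables with small errors:

```latex
\sigma_y^2 = \sum_i \left( \frac{\partial f}{\partial x_i} \right)^2 \sigma_{x_i}^2,
\qquad
\mathrm{PE}_{99\%} = 2.57 \sqrt{\sigma_y^2}
```

Because the terms add in quadrature, the probable error is never larger than the maximum error obtained by adding the same terms directly.
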
24
Calculating Maximum (Worst) Error
1. Brute force method: substitute upper and
lower limits of all x's into function to get
max and min values of y. Range of y is
between ymin and ymax.
2. Differential method:
from a given model
y = f(a, b, c, …, x1, x2, x3, …)
a, b, c, …: exact constants
x1, x2, x3, …: independent variables
dy = |∂f/∂x1| δ1 + |∂f/∂x2| δ2 + |∂f/∂x3| δ3 + …
Range of y = y ± dy
25
Example 4: Differential method
m = ρ A v
y = x1 x2 x3
x1 = ρ = 2.0 g/cm³ (table), x2 = A = 3.4 cm²
(measured avg), x3 = v = 2 cm/s (calibration)
δ1 = 0.257 g/cm³ (Rule 2), δ2 = 0.2 cm² (Rule
1), δ3 = 0.1 cm/s (Rule 4)
y = (2.0)(3.4)(2) = 13.6 g/s
dy = (6.8)(0.257) + (4.0)(0.2) + (6.8)(0.1) ≈ 3.2 g/s
m = 13.6 ± 3.2 g/s
Which product term contributes the most to
uncertainty?
This method works only if errors are symmetrical.
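
The numbers in Example 4 can be reproduced, and compared with the brute force method, in a short sketch. The function name `mass_flow` and the variable names are made up for illustration; the values and error estimates are the ones given in the example.

```python
import itertools

def mass_flow(rho, area, velocity):
    """m = rho * A * v, in g/s for rho in g/cm^3, A in cm^2, v in cm/s."""
    return rho * area * velocity

values = (2.0, 3.4, 2.0)      # rho, A, v from Example 4
deltas = (0.257, 0.2, 0.1)    # error estimates for rho, A, v from Example 4

y = mass_flow(*values)

# Differential method: dy = sum of |df/dx_i| * delta_i.
# For a pure product, df/dx_i = y / x_i.
terms = [abs(y / x) * d for x, d in zip(values, deltas)]
dy = sum(terms)

# Brute force method: evaluate every combination of lower and upper limits.
limits = [(x - d, x + d) for x, d in zip(values, deltas)]
corners = [mass_flow(*combo) for combo in itertools.product(*limits)]
y_min, y_max = min(corners), max(corners)

print(f"y = {y:.1f} g/s, dy = {dy:.1f} g/s")
print("terms:", [round(t, 3) for t in terms])
print(f"brute force range: {y_min:.2f} to {y_max:.2f} g/s")
```

The first entry of `terms` (the density term) is the largest. The brute force range is not symmetric about 13.6 g/s, whereas the differential estimate is, which is one way to see why the differential method assumes symmetric errors.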
