Title: Statistics Presentation
1. Statistics Presentation: Ch En 475 Unit Operations
2. Quantifying variables (i.e., answering a question with a number)
- Directly measure the variable: referred to as a "measured" variable. Ex.: temperature measured with a thermocouple.
- Calculate the variable from measured or tabulated variables: referred to as a "calculated" variable. Ex.: flow rate m = ρ A v (each measured or tabulated).
Each has some error or uncertainty.
3. A. Error of Measured Variable
Several measurements are obtained for a single variable (e.g., T).
Questions:
- What is the true value?
- How confident are you?
- Is the value different on different days?
4. How do you determine the error?
- Let's assume a normal (Gaussian) distribution.
- For small sampling, s is known: we'll pursue this approach. Use t tables for this approach.
- For large sampling, s is assumed: we don't often have this much data. Use z tables for this approach.
5. Example
n Temp
1 40.1
2 39.2
3 43.2
4 47.2
5 38.6
6 40.4
7 37.7
6. Standard Deviation Summary (normal distribution)
- 40.9 ± (3.27): 1s, 68.3% of data are within this range
- 40.9 ± (3.27 × 2): 2s, 95.4% of data are within this range
- 40.9 ± (3.27 × 3): 3s, 99.7% of data are within this range
If normal distribution is questionable, use Chebyshev's inequality:
- At least 50% of the data are within 1.4 s of the mean.
- At least 75% of the data are within 2 s of the mean.
- At least 89% of the data are within 3 s of the mean.
The above ranges don't state how accurate the mean is, only the % of data within the given range.
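As a rough check of the numbers on slides 5 and 6, the sample mean, sample standard deviation, and the 1s/2s/3s ranges can be computed directly. This is a minimal Python sketch using only the standard library; the function and variable names are illustrative, and the Chebyshev bound uses the standard 1 − 1/k² expression.

```python
import statistics

# Temperature readings from the example on slide 5
temps = [40.1, 39.2, 43.2, 47.2, 38.6, 40.4, 37.7]

mean = statistics.mean(temps)   # sample mean
s = statistics.stdev(temps)     # sample standard deviation (n - 1 in the denominator)

def data_range(k):
    """Range mean +/- k*s."""
    return (mean - k * s, mean + k * s)

def chebyshev_fraction(k):
    """Minimum fraction of data within k*s of the mean, for any distribution."""
    return 1.0 - 1.0 / k**2

print(f"mean = {mean:.1f}, s = {s:.2f}")
for k in (1, 2, 3):
    lo, hi = data_range(k)
    print(f"{k}s range: {lo:.1f} to {hi:.1f}")
for k in (1.4, 2, 3):
    print(f"Chebyshev, k = {k}: at least {chebyshev_fraction(k):.0%}")
```

Because 1.4 is √2 rounded, the first Chebyshev bound prints as 49% rather than exactly 50%.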
7. Student t-test (gives confidence of where μ (not data) is located)
$\mu = \bar{x} \pm t \, s / \sqrt{n}$
where $\bar{x}$ is the measured mean, μ is the true mean, 2α = 1 − probability (2-tail), and ν = n − 1 = 6.
Prob.   α      t       ± t s/√n
90%     .05    1.943   2.40
95%     .025   2.447   3.02
99%     .005   3.707   4.58
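The last column of the table can be reproduced from the example data. The sketch below assumes the interval half-width is t·s/√n and copies the two-tail t values for ν = 6 from the table rather than computing them; the names are illustrative.

```python
import math
import statistics

temps = [40.1, 39.2, 43.2, 47.2, 38.6, 40.4, 37.7]
n = len(temps)
mean = statistics.mean(temps)
s = statistics.stdev(temps)

# Two-tail t values for nu = n - 1 = 6, copied from the table above
t_table = {90: 1.943, 95: 2.447, 99: 3.707}

def half_width(t):
    """Half-width of the confidence interval for the true mean: t * s / sqrt(n)."""
    return t * s / math.sqrt(n)

for prob, t in t_table.items():
    print(f"{prob}%: mu = {mean:.1f} +/- {half_width(t):.2f}")
```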
8. T-test Summary
- μ: exact mean
- 40.9 is the sample mean
- 41 ± 2: 90% confident μ is somewhere in this range
- 41 ± 3: 95% confident μ is somewhere in this range
- 41 ± 5: 99% confident μ is somewhere in this range
9. Comparing averages of measured variables
Day 1 (x) vs. Day 2 (y)
What is your confidence that μx ≠ μy (i.e., they are different)?
- Larger t: more likely different
- 1-tail: 1 − α confident different, α confident same
- ν = nx + ny − 2
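One common form of the t statistic that is consistent with ν = nx + ny − 2 is the pooled-variance two-sample statistic. It is sketched here as an assumption, not necessarily the exact expression used in the course; the sample means, the sample standard deviations sx and sy, and the pooled standard deviation sp are notation introduced for this sketch.

```latex
s_p^2 = \frac{(n_x - 1)\,s_x^2 + (n_y - 1)\,s_y^2}{n_x + n_y - 2},
\qquad
t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}},
\qquad
\nu = n_x + n_y - 2
```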
10. Example
- Calculate average and s for both sets of data.
- Find the range in which 95.4% of the data falls (for each set).
- Determine the range for μ for each set at 95% probability.
- At what confidence are the pressures different each day?
Data point   Pressure Day 1   Pressure Day 2
1            750              730
2            760              750
3            752              762
4            747              749
5            754              737
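The last question in this example can be explored numerically. The sketch below assumes a pooled-variance t statistic with ν = nx + ny − 2 (equal variances assumed) and, instead of a t table, integrates the Student t density with Simpson's rule to get the 1-tail confidence. The helpers `pooled_t`, `t_pdf`, and `t_cdf` are hypothetical names written for this illustration.

```python
import math
import statistics

day1 = [750, 760, 752, 747, 754]
day2 = [730, 750, 762, 749, 737]

def pooled_t(x, y):
    """Pooled-variance t statistic and degrees of freedom (assumes equal variances)."""
    nx, ny = len(x), len(y)
    nu = nx + ny - 2
    sp2 = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / nu
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nu

def t_pdf(t, nu):
    """Probability density of the Student t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_cdf(t, nu, steps=2000):
    """P(T <= t) by Simpson's rule on [0, |t|], using the symmetry of the distribution."""
    h = abs(t) / steps
    total = t_pdf(0.0, nu) + t_pdf(abs(t), nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

t, nu = pooled_t(day1, day2)
confidence_different = t_cdf(abs(t), nu)  # 1-tail: 1 - alpha
print(f"t = {t:.2f}, nu = {nu}, 1-tail confidence = {confidence_different:.1%}")
```

A t table for ν = 8 gives the same answer by bracketing t between tabulated values; the numerical integration only avoids the lookup.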
11. B. Uncertainty of Calculated Variable
Calculate a variable from multiple input (measured, tabulated, ...) variables (e.g., m = ρAv). What is the uncertainty of your calculated value? Each input variable has its own error.
Example: You take measurements of ρ, A, v to determine m = ρAv. What is the range of m and its associated uncertainty?
Details are provided in Applied Engineering Statistics, Chapters 8 and 14 (R.M. Bethea and R.R. Rhinehart, 1991).
12. To obtain uncertainty of calculated variable
- DO NOT just calculate the variable for each set of data and then average and take the standard deviation.
- DO calculate uncertainty using error from input variables: use "uncertainty" for calculated variables and "error" for input variables.
Plan: obtain max error (δ) for each input variable, then obtain uncertainty of calculated variable.
- Method 1: Propagation of max error, brute force
- Method 2: Propagation of max error, analytical
- Method 3: Propagation of variance, analytical
- Method 4: Propagation of variance, brute force (Monte Carlo simulation)
13. Value and Uncertainty
- Values are used to make decisions: need to know the uncertainty of the value
- Potential ethical and societal impact
- How do you determine the uncertainty of the value?
- Sources of uncertainty (from Rhinehart, Applied Engineering Statistics, 1991):
-- Estimation: we guess!
-- Discrimination: device accuracy (single data point)
-- Calibration: may not be exact (error of curve fit)
-- Technique: e.g., measure ID rather than OD
-- Constants and data: not always exact!
-- Noise: which reading do we take?
-- Model and equations: e.g., ideal gas law vs. real gas
-- Humans: transposing, ...
14. Estimates of Error (δ) for input variables (δ's are propagated to find uncertainty)
- Measured: measure multiple times, obtain s; δ = 2.5s
-- Reason: 99% of data is within ±2.5s
-- Example: s = 2.3 °C for thermocouple, δ = 5.8 °C
- Tabulated: δ = 2.5 times the last reported significant digit (with the digit taken as 1)
-- Reason: assumes the last digit is ±2.5 (±0 assumes perfect, ±5 assumes the next digit to the left is fuzzy)
-- Example: ρ = 1.3 g/ml at 0 °C, δ = 0.25 g/ml
-- Example: People = 127,000, δ = 2500 people
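The two rules above reduce to one-line calculations. The sketch below is a minimal illustration; `delta_measured` and `delta_tabulated` are hypothetical helper names, and the "place value" argument is an assumption about how to read "last reported significant digit" (0.1 for 1.3 g/ml, 1000 for 127,000).

```python
def delta_measured(s):
    """Rule for measured variables: delta = 2.5 * s."""
    return 2.5 * s

def delta_tabulated(last_digit_place):
    """Rule for tabulated values: 2.5 times the place value of the last reported digit."""
    return 2.5 * last_digit_place

print(delta_measured(2.3))     # thermocouple, s = 2.3 degC
print(delta_tabulated(0.1))    # density 1.3 g/ml, last digit in the tenths place
print(delta_tabulated(1000))   # 127,000 people, last significant digit in the thousands place
```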
15. Estimates of Error (δ) for input variables
- Manufacturer spec or calibration accuracy: use given spec or accuracy data
-- Example: Pump spec is ±1 ml/min, δ = 1 ml/min
- Variable from regression (i.e., calibration curve): δ = 2.5 × standard error (std error is stdev of residual)
-- Example: Velocity is slope with std error = 2 m/s
- Judgment for a variable: use judgment for δ
-- Example: Read pressure to ±1 psi, δ = 1 psi
16. Estimates of Error (δ) for input variables
If none of the above rules apply, give your best guess.
Example: Data from a computer show that the flow rate is 562 ml/min ± 3 ml/min (stdev of computer noise). Your calibration shows 510 ml/min ± 8 ml/min (stdev). What flow rate do you use and what is δ?
In the following propagation methods, it's assumed that there is no bias in the values used. Let's assume this for all lab projects.
17. Method 1: Propagation of max error, brute force
- Brute force method: obtain upper and lower limits of all input variables (from maximum errors); plug into the equation to get the uncertainty of the calculated variable (y).
- Uncertainty of y is between ymin and ymax.
- This method works for both symmetry and asymmetry in errors (e.g., 10 psi + 3 psi or − 2 psi).
18. Example: Propagation of max error, brute force
m = ρ A v
ρ = 2.0 g/cm3 (table); A = 3.4 cm2 (measured avg); v = 2 cm/s (slope of graph)
Additional information: sA = 0.1 cm2; std. error (v) = 0.1 cm/s
What is δ for each input variable?
Brute force method: evaluate m for all combinations of the max and min of each input.
       max    min
ρ
A
v
mmin < m < mmax
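The brute-force evaluation can be sketched in a few lines of Python. The δ values below are not given on this slide; they are assumptions obtained by applying the earlier rules (2.5 × last reported digit for the tabulated ρ, 2.5 × s for A, 2.5 × std. error for v), which gives 0.25 for each input.

```python
from itertools import product

# Nominal values and assumed max errors (delta) for each input
rho, d_rho = 2.0, 0.25   # g/cm3; tabulated, delta = 2.5 * 0.1 (last reported digit)
A, d_A = 3.4, 0.25       # cm2; measured, delta = 2.5 * s_A = 2.5 * 0.1
v, d_v = 2.0, 0.25       # cm/s; regression slope, delta = 2.5 * std. error = 2.5 * 0.1

def mass_flow(rho, A, v):
    """m = rho * A * v, in g/s for the units above."""
    return rho * A * v

# Evaluate m at every combination of lower/upper limits (2^3 = 8 cases)
limits = [(rho - d_rho, rho + d_rho), (A - d_A, A + d_A), (v - d_v, v + d_v)]
values = [mass_flow(*combo) for combo in product(*limits)]

m_min, m_max = min(values), max(values)
print(f"{m_min:.1f} g/s < m < {m_max:.1f} g/s (nominal {mass_flow(rho, A, v):.1f} g/s)")
```

Asymmetric errors fit the same pattern: replace a `(x - d, x + d)` pair with the actual lower and upper limits.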
19. Method 2: Propagation of max error, analytical
- Propagation of error: utilizes the maximum error of each input variable (δ) to estimate the uncertainty range of the calculated variable (y)
- Uncertainty of y: y = yavg ± δy, with $\delta y = \sum_i \left| \partial y / \partial x_i \right| \delta x_i$
- Assumptions:
-- input errors are symmetric
-- input errors are independent of each other
-- equation is linear (works o.k. for non-linear equations if input errors are relatively small)
Remember to take the absolute value of each partial derivative!!
20. Example: Propagation of max error, analytical
m = ρ A v (y = x1 x2 x3)
∂m/∂ρ = Av, ∂m/∂A = ρv, ∂m/∂v = ρA
ρ = 2.0 g/cm3 (table); A = 3.4 cm2 (measured avg); v = 2 cm/s (slope of graph)
Additional information: sA = 0.1 cm2; std. error (v) = 0.1 cm/s
m = mavg ± δm = ρAv ± δm = 13.6 ± 4.4 g/s
ferror,ρ = (3.4)(2)(0.25) / (4.4) = 0.39
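The 13.6 ± 4.4 g/s result and the 0.39 error fraction for ρ can be reproduced with a short sketch. It assumes δ = 0.25 for every input (2.5 × 0.1 in each case, following the earlier rules) and sums |partial derivative| × δ over the inputs; the dictionary names are illustrative.

```python
rho, d_rho = 2.0, 0.25   # g/cm3
A, d_A = 3.4, 0.25       # cm2
v, d_v = 2.0, 0.25       # cm/s

# |partial derivative| * delta for each input of m = rho * A * v
terms = {
    "rho": abs(A * v) * d_rho,
    "A": abs(rho * v) * d_A,
    "v": abs(rho * A) * d_v,
}

m_avg = rho * A * v
d_m = sum(terms.values())
# Fraction of the total max error contributed by each input
fractions = {name: term / d_m for name, term in terms.items()}

print(f"m = {m_avg:.1f} +/- {d_m:.1f} g/s")
for name, f in fractions.items():
    print(f"f_error,{name} = {f:.2f}")
```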
21. Propagation of max error
- If linear equation, symmetric errors, and input errors are independent → brute force and analytical are the same.
- If non-linear equation, symmetric errors, and input errors are independent → brute force and analytical are close if errors are small. If errors are large (i.e., >10% or more than an order of magnitude), brute force is more accurate.
- Must use brute force if errors are dependent on each other and/or asymmetric.
- Analytical method is easier to assess if there are lots of inputs. It also gives info on the contribution from each error.
22. Method 3: Propagation of variance, analytical
- Maximum error can be calculated from max errors of input variables as shown previously:
-- Brute force
-- Analytical
- Probable error is more realistic
-- Errors are independent (some may be + and some −). Not all will be in the same direction.
-- Errors are not always at their largest value.
-- Thus, propagate variance rather than max error.
- You need the variance (s²) of each input to propagate variance. If s (stdev) is unknown, estimate s = δ/2.5.
23. Method 3: Propagation of variance, analytical
$s_y^2 = \sum_i \left( \partial y / \partial x_i \right)^2 s_{x_i}^2$
- gives propagated variance of y, or (stdev)²
$y = y_{avg} \pm 1.96 \sqrt{s_y^2}$ (95%)
$y = y_{avg} \pm 2.57 \sqrt{s_y^2}$ (99%)
- gives probable error of y and associated confidence
- error should be <10% (linear approximation)
- use propagation of max error if not much data; use propagation of variance if lots of data
24. Method 4: Monte Carlo Simulation (propagation of variance, brute force)
- Choose N (N is very large, e.g., 100,000) random δi from a normal distribution of standard deviation si for each variable and add to the mean to obtain N values with errors.
-- rnorm(N, µ, s) in Mathcad generates N random numbers from a normal distribution with mean µ and std dev s.
- Find N values of the calculated variable using the generated xi values.
- Determine the mean and standard deviation of the N calculated variables.
25. Monte Carlo Simulation Example
- Estimate the uncertainty in the critical compressibility factor of a fluid if Tc = 514 ± 2 K, Pc = 61.37 ± 0.6 bar, and Vc = 0.168 ± 0.002 m3/kmol.
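A Monte Carlo estimate for this example might look like the following Python sketch, which uses `random.Random.gauss` in place of Mathcad's rnorm. Two assumptions are made that the example does not state: the ± values are treated as standard deviations, and the critical compressibility factor is taken as Zc = Pc·Vc/(R·Tc) with R = 0.083145 bar·m3/(kmol·K).

```python
import random
import statistics

R = 0.083145  # bar*m3/(kmol*K), universal gas constant

def z_c(Tc, Pc, Vc):
    """Critical compressibility factor, Zc = Pc*Vc / (R*Tc)."""
    return Pc * Vc / (R * Tc)

def monte_carlo_zc(N=100_000, seed=0):
    """Return mean and standard deviation of Zc over N randomly perturbed inputs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    samples = []
    for _ in range(N):
        Tc = rng.gauss(514.0, 2.0)     # K; +/- value assumed to be a stdev
        Pc = rng.gauss(61.37, 0.6)     # bar; assumed stdev
        Vc = rng.gauss(0.168, 0.002)   # m3/kmol; assumed stdev
        samples.append(z_c(Tc, Pc, Vc))
    return statistics.mean(samples), statistics.stdev(samples)

zc_mean, zc_std = monte_carlo_zc()
print(f"Zc = {zc_mean:.4f}, stdev = {zc_std:.4f}")
```

If the ± values are instead maximum errors (δ), the same code applies after converting each one with s = δ/2.5.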
26. Example: Propagation of variance
Calculate ρ and its 95% probable error.
All independent variables were measured multiple times (Rule 1); averages and s are given:
M = 5.0 kg, s = 0.05 kg
L = 0.75 m, s = 0.01 m
D = 0.14 m, s = 0.005 m
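The example does not state how ρ depends on M, L, and D. The sketch below assumes a solid cylinder, ρ = 4M/(πD²L), purely for illustration, and propagates variance as the sum of (partial derivative × s)² over independent inputs, then reports the 95% probable error as 1.96 × the propagated standard deviation.

```python
import math

# Averages and standard deviations from the example
M, s_M = 5.0, 0.05     # kg
L, s_L = 0.75, 0.01    # m
D, s_D = 0.14, 0.005   # m

# ASSUMPTION (not stated in the example): solid cylinder, rho = 4 M / (pi D^2 L)
rho = 4 * M / (math.pi * D**2 * L)

# Partial derivatives of rho with respect to each input
d_rho_dM = rho / M
d_rho_dL = -rho / L
d_rho_dD = -2 * rho / D

# Propagated variance for independent inputs
var_rho = (d_rho_dM * s_M) ** 2 + (d_rho_dL * s_L) ** 2 + (d_rho_dD * s_D) ** 2
probable_error_95 = 1.96 * math.sqrt(var_rho)

print(f"rho = {rho:.0f} +/- {probable_error_95:.0f} kg/m3 (95%)")
```

Under this assumed geometry the diameter term dominates the variance, because D enters squared and has the largest relative standard deviation.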
27. Propagation of Errors
28. Monte Carlo
29. Overall Summary
- measured variables: use average, std dev (data range), and Student t-test (mean range and mean comparison)
- calculated variable: determine uncertainty
-- Max error: propagating error with brute force
-- Max error: propagating error analytically
-- Probable error: propagating variance analytically
-- Probable error: propagating variance with brute force (Monte Carlo)
30. Data and Statistical Expectations
- Summary of raw data (table format)
- Sample calculations, including statistical calculations
- Summary of all calculations (table format is helpful)
-- If measured variable: average and standard deviation for all, confidence of mean for at least one variable
-- If calculated variable: 1 of the 4 methods. Please state which in the report. If the equation is messy, you may show 1 of the 4 methods for a small part and then just average (with std dev.) the value (although this is not the best method).