2DS00

Slides: 51

Provided by: adibucc

1
2DS00

Statistics 1 for Chemical Engineering

2
Lecturers

Dr. A. Di Bucchianico
Department of Mathematics,
Statistics group
HG 9.24
phone (040) 247 2902
a.d.bucchianico_at_tue.nl
Ir. G.D. Mooiweer,
Department of Mathematics
ICTOO
HG 9.12
phone 040 247 4277 (Thursdays)
g.d.mooiweer_at_tue.nl

Dr. R.W. van der Hofstad
Department of Mathematics,
Statistics group
HG 9.04
phone (040) 247 2910
rhofstad_at_win.tue.nl

3
Goals of this course

to prepare students for (first-year) laboratory
assignments
to teach students how to perform basic
statistical analyses of experiments
to teach students how to use software for data
analysis
to teach students how to avoid pitfalls in
analysing measurements

4
Important to remember

Web site for this course: www.win.tue.nl/sandro/2DS00/
No textbook, but handouts (Word) and PowerPoint sheets through the web site
Bring notebook to both lectures and self-study
(Optional) buy lecture notes 2256 Statgraphics voor regulier onderwijs
(Optional) buy lecture notes 2218 Statistisch Compendium

5
How to study

read lecture notes briefly before lecture
ask questions during lecture
study lecture notes carefully after lecture
do exercises during guided self-study
reread lecture notes after guided self-study
try out previous examinations shortly before the
examination
N.B. Lecture notes (pdf documents) ≠ PowerPoint files

6
Week schedule

Week 1 Measurement and statistics
Week 2 Error propagation
Week 3 Simple linear regression analysis
Week 4 Multiple linear regression analysis
Week 5 Nonlinear regression analysis

7
Detailed contents of week 1

measurement errors
graphical displays of data
summary statistics
normal distribution
confidence intervals
hypothesis testing

8
Measurements and statistics

perfect measurements do not exist
possible sources of measurement errors
reading
environment
temperature
humidity
...
impurities
...

9
Necessity of good measurement system
10
Three experiments
11
Types of measurement errors

Random errors
always present
reduce influence by averaging repeated
measurements
Systematic errors
require adjustment/repair of measuring devices
Outliers
recording errors
mistakes in applying procedures
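
The claim that averaging repeated measurements reduces the influence of random errors can be illustrated with a small simulation. This is only a sketch: the true value, the error size and the number of repeats are hypothetical, and the random errors are assumed to be normally distributed.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)   # fixed seed so the run is repeatable
TRUE_VALUE = 5.0         # hypothetical true value
SIGMA = 0.1              # hypothetical standard deviation of the random error

def measure():
    """One simulated reading with purely random error."""
    return rng.gauss(TRUE_VALUE, SIGMA)

def averaged_reading(repeats):
    """Average of `repeats` simulated readings."""
    return mean([measure() for _ in range(repeats)])

single = [measure() for _ in range(2000)]
averaged = [averaged_reading(10) for _ in range(2000)]

print("spread of single readings: ", round(stdev(single), 4))
print("spread of 10-reading means:", round(stdev(averaged), 4))
```

The spread of the averages is smaller by roughly a factor of the square root of the number of repeats. A systematic error would shift both lists by the same amount, so averaging would not remove it.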

12
Illustration of measurement concepts
13
Accuracy
difference between average of measured values and
true value
14
Accuracy

relates to systematic errors
absolute error
relative error
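
A minimal sketch of the two accuracy measures, assuming the true value is known (for instance from a reference sample). The function names and the readings are hypothetical.

```python
from statistics import mean

def absolute_error(measurements, true_value):
    """Average of the measured values minus the true value."""
    return mean(measurements) - true_value

def relative_error(measurements, true_value):
    """Absolute error expressed as a fraction of the true value."""
    return absolute_error(measurements, true_value) / true_value

# Hypothetical readings of a reference sample whose true value is 5.00
readings = [5.02, 5.05, 4.99, 5.04, 5.00]
print("absolute error:", round(absolute_error(readings, 5.00), 4))
print("relative error:", round(relative_error(readings, 5.00), 4))
```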

15
Location statistics

mean
median
trimmed means
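
The three location statistics can be compared on a small data set. The sketch below uses hypothetical data with one deliberately extreme value, and `trimmed_mean` is an illustrative helper, not a Statgraphics function.

```python
from statistics import mean, median

def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(data)
    k = int(len(ordered) * proportion)   # number of values cut per side
    return mean(ordered[k:len(ordered) - k])

# Hypothetical measurements; 9.0 plays the role of an outlier
data = [4.9, 5.0, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1, 9.0]

print("mean:            ", round(mean(data), 3))
print("median:          ", median(data))
print("10% trimmed mean:", round(trimmed_mean(data, 0.10), 3))
```

The mean is pulled towards the outlier, while the median and the trimmed mean stay close to the bulk of the data.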

16
Precision

the degree to which consistent results are
obtained

17
Accurate and precise
18
Statistics for precision standard deviation co

standard deviation
standard error
variation coefficient
variance
range
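
The statistics listed above can be computed together. This is a sketch with hypothetical data; it uses the sample (n − 1) versions of variance and standard deviation, which is an assumption about the convention intended in the course.

```python
from math import sqrt
from statistics import mean, stdev, variance

def precision_summary(data):
    """Common precision statistics for one series of repeated measurements."""
    n = len(data)
    s = stdev(data)                          # sample standard deviation
    return {
        "variance": variance(data),          # s squared
        "standard deviation": s,
        "standard error": s / sqrt(n),       # standard deviation of the mean
        "variation coefficient": s / mean(data),
        "range": max(data) - min(data),
    }

data = [2, 4, 4, 4, 5, 5, 7, 9]              # hypothetical measurements
for name, value in precision_summary(data).items():
    print(f"{name}: {value:.4f}")
```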

19
Robust statistics for precision

robust statistics
less sensitive to outliers
difficult mathematical theory
requires use of statistical software
interquartile range
IQR = 75% quantile − 25% quantile = 3rd quartile − 1st quartile
mean absolute deviation
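
A short sketch of the two robust statistics named above. Quartile conventions differ between software packages; the code uses the default method of Python's `statistics.quantiles`, which need not match Statgraphics. The data are hypothetical.

```python
from statistics import mean, quantiles, stdev

def interquartile_range(data):
    """3rd quartile minus 1st quartile."""
    q1, _, q3 = quantiles(data, n=4)   # default 'exclusive' method
    return q3 - q1

def mean_absolute_deviation(data):
    """Average absolute distance to the mean."""
    centre = mean(data)
    return mean([abs(x - centre) for x in data])

clean = [1, 2, 3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 5, 6, 70]   # last value replaced by an outlier

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(label,
          "| IQR:", interquartile_range(data),
          "| mean abs. dev.:", round(mean_absolute_deviation(data), 2),
          "| st. dev.:", round(stdev(data), 2))
```

In this example the interquartile range is the same for both data sets, while the standard deviation changes drastically.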

20
Graphical displays

always make graphical displays for first
impression
one picture says more than 1000 words

2 3.1 4 1.9 2.8
21
Basic graphical displays

scatter plot
watch out for scale (automatic resizing)
time sequence plot
for detecting time effects like warming up
Box-and-Whisker plot
outliers
quartiles
skewness

22
Time sequence plot
23
Box-and-Whisker plot
24
(No Transcript)
25
Probability theory

(cumulative) distribution function
density
density to distribution function

26
The concept of probability density
density function
(Figure: density function with points a and b marked on the horizontal axis)
area denotes probability that observation falls
between a and b
27
Normal distribution
28
Normal distribution

bell shaped curve
Important because of Central Limit Theorem
Normal distribution
symmetric around µ (location of centre)
spread parametrised by σ²
http://www.win.tue.nl/marko/statApplets/functionPlots.html
http://www-stat.stanford.edu/naras/jsm/NormalDensity/NormalDensity.html
µ = 0 and σ² = 1: standard normal distribution Z

29
More on normal distribution

Area between
µ ± 0.67σ is 0.500
µ ± 1.00σ is 0.683
µ ± 1.645σ is 0.900
µ ± 1.96σ is 0.950
µ ± 2.00σ is 0.954
µ ± 2.33σ is 0.980
µ ± 2.58σ is 0.990
µ ± 3.00σ is 0.997
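
The area under a normal density within k standard deviations of the centre does not depend on the parameters and equals erf(k/√2). A minimal sketch for checking such areas (the function name is illustrative):

```python
from math import erf, sqrt

def central_area(k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normally distributed X."""
    return erf(k / sqrt(2))

for k in (0.67, 1.00, 1.645, 1.96, 2.00, 2.33, 2.58, 3.00):
    print(f"within {k:5.3f} standard deviations: {central_area(k):.3f}")
```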

30
Standardisation

if X is normally distributed with parameters µ and σ²,
then (X − µ)/σ is standard normal
suppose
µ = 3
σ² = 4
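
A worked sketch of standardisation, assuming the slide's example reads µ = 3 and σ² = 4 (so σ = 2). The value x = 5 is a hypothetical choice for illustration.

```python
from statistics import NormalDist

mu, sigma = 3.0, 2.0          # sigma squared = 4, so sigma = 2
X = NormalDist(mu, sigma)     # the original variable
Z = NormalDist()              # standard normal: mean 0, standard deviation 1

x = 5.0                       # hypothetical value of interest
z = (x - mu) / sigma          # standardised value

# Both routes give the same probability P(X <= x)
p_direct = X.cdf(x)
p_standardised = Z.cdf(z)
print("z =", z)
print("P(X <= 5) directly:        ", round(p_direct, 4))
print("P(Z <= z) after standardising:", round(p_standardised, 4))
```

Standardising lets a single table (or function) for Z serve every normal distribution.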

31
Testing normality

many statistical procedures implicitly assume
normality
if data are not normally distributed, then
outcome of procedure may be completely wrong
user is always responsible for checking
assumptions of statistical procedures
Graphical checks
normal probability plot
density trace
Formal check
Shapiro-Wilks test

32
Estimation of density function: histogram
curve: normal distribution with sample mean and
variance as parameters
33
Drawbacks of the histogram

misused for investigating normality
time ordering of data is lost
shape depends heavily on bin width and bin location

(Figure: histograms for strength, same data set; horizontal axis: strength, 24 to 54; vertical axis: frequency, 0 to 5)

shape is stable for data sets of size 75 or
larger
optimal number of bins ≈ √n
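
How strongly a histogram depends on bin location can be seen by counting the same data twice with shifted bins. The data and the helper `histogram_counts` are hypothetical; this is a sketch, not the Statgraphics procedure.

```python
def histogram_counts(data, start, width, bins):
    """Counts per bin for bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * bins
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < bins:          # values outside the bins are ignored
            counts[index] += 1
    return counts

data = [1.9, 2.1, 2.9, 3.1, 3.9, 4.1]  # hypothetical measurements

# Same bin width, bins shifted by half a bin
print(histogram_counts(data, start=1.0, width=1.0, bins=4))
print(histogram_counts(data, start=1.5, width=1.0, bins=4))
```

The first call suggests a peak in the middle and the second a flat shape, although the data are identical.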

34
Alternative to histogram: Density Trace

Density Trace (also called naive density
estimator)
use moving bins instead of fixed bins
choose bin width (automatically in Statgraphics)
count number of observations in bin at each
point
divide by length of bin
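
The steps above can be sketched as follows. Besides dividing by the bin length, the count is also divided by the number of observations so that the result is a density; the choice of a half-open bin and all names are assumptions made for illustration.

```python
def density_trace(data, width, grid):
    """Naive density estimate on the points in `grid`.

    For each grid point t, count the observations in the moving bin
    (t - width/2, t + width/2] and divide by n * width.
    """
    n = len(data)
    half = width / 2
    return [
        sum(1 for x in data if t - half < x <= t + half) / (n * width)
        for t in grid
    ]

data = [2, 3, 3, 4]                     # hypothetical observations
grid = [0, 2, 3]                        # points at which to evaluate the trace
print(density_trace(data, width=1.0, grid=grid))
```

Evaluating the function on a fine grid and connecting the points gives the curve; a smaller `width` makes it more jagged, a larger one smoother.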

35
Density Trace

Example dataset

(Figure: density trace; vertical axis from 1/9 to 4/9, horizontal axis from 1 to 6)

36
Choice of bin widths in density trace

too small bin width yields too fluctuating curve
too large bin width yields too smooth curve

37
Patterns in distribution: normal curve

Depicted by a bell-shaped curve
Indicates that measurement process is running
normally

38
Patterns in distribution: bi-modal curve

Distribution appears to have two peaks
May indicate that data from more than one process
are mixed together

39
Patterns in distribution: saw-toothed

Also commonly referred to as a comb distribution,
appears as an alternating jagged pattern
Often indicates a measuring problem
improper gauge readings
gauge not sensitive enough for readings

40
Testing normality
41
Normal Probability Plot
42
Normally distributed?
43
Normal Probability Plot of not normally
distributed data
44
Test for Normality: Shapiro-Wilks

statistical test for Normality: Shapiro-Wilks
idea: sophisticated regression analysis in the
spirit of normal probability plot
makes Normal Probability Plot objective
check outliers (measurement error? normality is
sometimes disturbed by a single observation)
analyse if not normally distributed
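
The regression idea behind the normal probability plot can be sketched by correlating the ordered data with standard normal quantiles. This is not the Shapiro-Wilks W statistic itself, only a simplified illustration in the same spirit; the plotting positions (i − 0.375)/(n + 0.25) are one common convention, chosen here as an assumption, and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Coordinates of a normal probability plot: (normal quantile, ordered value)."""
    n = len(data)
    z = NormalDist()
    theoretical = [z.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return theoretical, sorted(data)

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def straightness(data):
    """Close to 1 when the normal probability plot is nearly a straight line."""
    return pearson(*normal_plot_points(data))

roughly_normal = [4.8, 4.9, 4.9, 5.0, 5.0, 5.0, 5.1, 5.1, 5.2, 5.3]  # hypothetical
skewed = [1, 1, 1, 1, 2, 2, 3, 5, 9, 30]                              # hypothetical
print("roughly normal:", round(straightness(roughly_normal), 3))
print("skewed:        ", round(straightness(skewed), 3))
```

A formal test such as Shapiro-Wilks turns this kind of straightness measure into a P-value, which a plain correlation does not provide.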

45
Statgraphics: Shapiro-Wilks
Tests for Normality for width
Computed Chi-Square goodness-of-fit statistic = 254.667
P-Value = 0.0
Shapiro-Wilks W statistic = 0.921395
P-Value = 0.000722338

Interpretation
value of the statistic itself need not be
interpreted
P-value indicates how well the data agree with a
normal distribution (a small P-value is evidence
against normality)
use α = 0.01 as critical value in order to avoid
too strict rejections of normality

46
Dixon's test

Box-and-Whisker plot: graphical test for outliers
if data are normally distributed, then a formal
test may be used

47
Disadvantages of point estimators
48
Confidence intervals

95% confidence interval for µ: probability 0.95
that interval contains true value µ
more observations ⇒ narrower interval (effect in
particular for n < 20)
higher confidence ⇒ wider interval
example: α = 0.05 ⇒ ...
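
A minimal sketch of a confidence interval for µ based on the Student t distribution, assuming normally distributed measurements. The readings are hypothetical, and the critical values are the standard table values for 9 degrees of freedom, so the sketch only handles n = 10.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided Student t critical values for 9 degrees of freedom (n = 10),
# taken from a standard t table
T_CRIT_DF9 = {0.95: 2.262, 0.99: 3.250}

def confidence_interval(data, confidence=0.95):
    """t-based confidence interval for the mean; sketch for n = 10 only."""
    n = len(data)
    if n != 10:
        raise ValueError("this sketch only tabulates t values for n = 10")
    half_width = T_CRIT_DF9[confidence] * stdev(data) / sqrt(n)
    centre = mean(data)
    return centre - half_width, centre + half_width

# Hypothetical repeated measurements
readings = [4.98, 5.01, 5.03, 4.97, 5.00, 4.99, 5.02, 5.01, 4.96, 5.03]
for level in (0.95, 0.99):
    low, high = confidence_interval(readings, level)
    print(f"{level:.0%} interval: ({low:.4f}, {high:.4f})")
```

The 99% interval is wider than the 95% interval for the same data, in line with the rule that higher confidence gives a wider interval.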

49
Confidence intervals: example
50
Hypothesis testing

example: test whether there is a systematic
error

Hypothesis Tests for meting
Sample mean = 4.994
Sample median = 5.01
t-test
------
Null hypothesis: mean = 5.0
Alternative: not equal
Computed t statistic = -0.155011
P-Value = 0.880233
Do not reject the null hypothesis for alpha = 0.05.
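
The t statistic in such output is the distance between the sample mean and the hypothesised value, measured in standard errors. The sketch below uses hypothetical readings (not the data behind the Statgraphics output) and the standard table value 2.262 for a two-sided test at alpha = 0.05 with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(data, mu0):
    """One-sample t statistic for the null hypothesis mean = mu0."""
    standard_error = stdev(data) / sqrt(len(data))
    return (mean(data) - mu0) / standard_error

# Hypothetical readings of a reference sample with nominal value 5.0
readings = [5.02, 5.05, 4.99, 5.04, 5.00, 5.03, 5.01, 5.06, 4.98, 5.02]
T_CRIT = 2.262   # two-sided, alpha = 0.05, 9 degrees of freedom (t table)

t = t_statistic(readings, 5.0)
print("t statistic:", round(t, 4))
if abs(t) > T_CRIT:
    print("Reject the null hypothesis: evidence of a systematic error.")
else:
    print("Do not reject the null hypothesis.")
```

Statistical software reports a P-value instead of comparing with a table value; the decision is the same when the P-value is compared with alpha.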
