Chapter 5: The Normal Approximation for Data


Slides: 31

Provided by: University354


Transcript and Presenter's Notes

Title: Chapter 5: The Normal Approximation for Data

1
Chapter 5 The Normal Approximation for Data
2
Chapter 5 The Normal Approximation for Data
1. The Normal Curve
Features
1. The graph is symmetric around 0.
2. The total area under the curve = 100%.
3. The curve is always above the horizontal axis.
4. The horizontal axis is scaled in standard units.
5. The curve is bell shaped.
3
Standard Scores

Standard scores:
how far above/below the average a score is, in standard deviation units
standard scores state the exact location of a score in the distribution
the most common form is the z-score

HANES data for women: mean height = 63.5, standard deviation = 2.5
A height of 63.5 (the mean) is 0 standard deviations above the mean; thus in standard units, the mean = 0
4
Converting to standard scores/z-scores
HANES data for women: mean height = 63.5, standard deviation = 2.5
Simple formula: standard score = (score - average) / standard deviation
(63.5 - 63.5) / 2.5 = 0
(58.5 - 63.5) / 2.5 = -2
(62.25 - 63.5) / 2.5 = -0.50
(68.5 - 63.5) / 2.5 = 2
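The simple formula above can be sketched in a few lines of Python. This is only an illustration: the function name `standard_score` is made up for this example, and the mean and standard deviation are the HANES values quoted on the slide.

```python
def standard_score(score, average, sd):
    """Return how many SDs `score` lies above (+) or below (-) `average`."""
    return (score - average) / sd


# HANES women's heights, as quoted on the slide.
AVERAGE = 63.5
SD = 2.5

for height in (63.5, 62.25, 68.5):
    print(height, "->", standard_score(height, AVERAGE, SD))
```

A positive result means the score is above the average; a negative result means it is below.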
5
Converting to original scores from z-scores
Simple formula: raw score = (z-score x standard deviation) + average
(-1 x 2.5) + 63.5 = 61
(3 x 2.5) + 63.5 = 71
(-2.5 x 2.5) + 63.5 = 57.25
(0.50 x 2.5) + 63.5 = 64.75
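Going back from z-scores to original scores reverses the earlier formula. A minimal sketch, assuming the same HANES mean and standard deviation; `raw_score` is a name chosen for this example only.

```python
def raw_score(z, average, sd):
    """Convert a z-score back to the original units."""
    return z * sd + average


# HANES women's heights, as quoted on the slide.
AVERAGE = 63.5
SD = 2.5

for z in (-1, 3, -2.5, 0.5):
    print(z, "->", raw_score(z, AVERAGE, SD))
```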
6
Finding areas under the curve
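Recall from the previous chapter: we stated that certain percentages fall between the various standard deviations.
Standard units: -3 -2 -1 0 1 2 3
About 68% of the area falls between -1 and 1
About 95% of the area falls between -2 and 2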
7
Finding areas under the curve
Statisticians have created tables that provide the exact percentages that fall within areas under the normal curve: z tables (OH: Table A105, p. A105).
In terms of standard scores, the percentages are as follows:
0 to 1 (and -1 to 0): 34.13% each
1 to 2 (and -2 to -1): 13.59% each
2 to 3 (and -3 to -2): 2.14% each
above 3 (and below -3): 0.13% each
8
Exercise 1
Find the area between -1.50 and 1.50.
Table A105: 86.64%
9
Exercise 2
Find the area between -2.30 and 2.30.
Table A105: 97.86%
10
Exercise 3
Find the area between -1.50 and 2.30.
0 to 2.30 = half of (-2.30 to 2.30) = 0.50(97.86%) = 48.93%
-1.50 to 0 = half of (-1.50 to 1.50) = 0.50(86.64%) = 43.32%
48.93% + 43.32% = 92.25%
11
Exercise 4
Find the area between -2 and -0.50.
-2 to 0 = half of (-2 to 2) = 0.50(95.45%) = 47.73%
-0.50 to 0 = half of (-0.50 to 0.50) = 0.50(38.29%) = 19.15%
47.73% - 19.15% = 28.58%
12
Exercise 5
Find the area outside of -2.50 and 2.50.
-2.50 to 2.50 = 98.76%
100% - 98.76% = 1.24%
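The table lookups in Exercises 1 to 5 can be cross-checked numerically. The sketch below is not the table method from the slides: it swaps in Python's standard-library `statistics.NormalDist` for Table A105, and the helper names `area_between` and `area_outside` are invented for this example.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, SD 1.
STANDARD_NORMAL = NormalDist()


def area_between(z_low, z_high):
    """Percent of the area under the normal curve between two z-scores."""
    return 100 * (STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low))


def area_outside(z_low, z_high):
    """Percent of the area below z_low plus the area above z_high."""
    return 100 - area_between(z_low, z_high)


print(round(area_between(-1.50, 1.50), 2))   # Exercise 1
print(round(area_between(-2.30, 2.30), 2))   # Exercise 2
print(round(area_between(-1.50, 2.30), 2))   # Exercise 3
print(round(area_between(-2.00, -0.50), 2))  # Exercise 4
print(round(area_outside(-2.50, 2.50), 2))   # Exercise 5
```

Small differences in the last decimal place against a printed table are possible, since tables are rounded before the halving and adding steps.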
13
Using the normal approximation
Many histograms (types of data) follow the normal curve; thus the average pins down the center and the standard deviation gives the spread.
If histograms/data follow the normal curve, then we can estimate the percentage of data points/people that fall within a certain interval.
Using the normal approximation:
Step 1: convert raw scores to standard scores
Step 2: find the corresponding area under the normal curve using Table A105
ALWAYS DRAW THE DIAGRAMS
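The two steps can be sketched as follows. The interval of 61 to 66 is a made-up example, not a question from the slides; the mean and SD are the HANES figures quoted earlier, and `statistics.NormalDist` stands in for Table A105. The estimate is only as good as the assumption that the data follow the normal curve.

```python
from statistics import NormalDist


def percent_between(low, high, average, sd):
    """Estimate the percent of data between `low` and `high`, assuming
    the histogram follows the normal curve."""
    # Step 1: convert raw scores to standard scores.
    z_low = (low - average) / sd
    z_high = (high - average) / sd
    # Step 2: find the corresponding area under the normal curve.
    curve = NormalDist()
    return 100 * (curve.cdf(z_high) - curve.cdf(z_low))


# Hypothetical question: what percent of women are between 61 and 66 tall?
pct = percent_between(61, 66, average=63.5, sd=2.5)
print(round(pct, 2))
```

Here 61 and 66 sit exactly 1 SD below and above the mean, so the answer should land near the familiar 68%.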
14
Percentiles
Average + standard deviation: good for normally distributed data, not satisfactory for non-normal/skewed data
Percentile scale: 0 to 100
15
Percentiles
Percentile scale: 0 to 100
25th percentile: 25% of the subjects fall below it, 75% above
50th percentile = median: 50% fall below it, 50% above
75th percentile: 75% fall below it, 25% above
In skewed distributions like this, the median is preferable to the mean, as it is not influenced by extreme data points.
Interquartile range = 75th percentile - 25th percentile; used as a measure of spread in skewed distributions, and also not as influenced by extreme cases.
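A small sketch of these ideas, using made-up data with one extreme value (the numbers are hypothetical, not from the slides). It uses `statistics.quantiles` from the Python standard library; the `method="inclusive"` setting is one of several common percentile conventions, chosen here as an assumption.

```python
import statistics

# Hypothetical skewed data: one extreme value (40) in a small sample.
data = [1, 2, 2, 3, 3, 4, 5, 6, 40]

# Quartiles: the 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("25th percentile:", q1)
print("median:", q2)
print("75th percentile:", q3)
print("interquartile range:", iqr)
print("mean:", statistics.mean(data))
```

The single extreme value pulls the mean well above the median, while the median and interquartile range stay close to the bulk of the data.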
16
Chapter 6 Measurement Error
17
Measurement Error
Repeated measurements of an object do NOT produce the same result; the observed differences are due to chance error.
Example: weigh yourself on a bathroom scale.
Questions about chance errors: where do they come from? How big are they?
We can assess how large these are by replicating the measurement.
The National Bureau of Standards in Washington holds the national standards for weights and measures (K20 (kilogram 20), NB 10 (10 grams)), used to calibrate weights in the U.S.
These have been repeatedly weighed under the same conditions (room, apparatus, people, procedure, air pressure, temperature), in an attempt to control all factors that may influence the weight.
The NB 10 measurements agree to the first 3 decimal places, then they differ (Table 1, p. 99).
18
Estimating chance error
The standard deviation (roughly, the typical size of the differences from the average) estimates the likely size of the chance error in a single measurement.
Note: individual measurement = exact value + chance error
(the exact value is a constant; the chance error varies)
Repeated individual measurements differ due to chance error variability; their standard deviation estimates the chance error or variability for any single measurement.
standard deviation = R.M.S. of the deviations from the average
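The R.M.S. definition can be spelled out step by step. The readings below are hypothetical, chosen so the arithmetic is easy to follow; they are not the NB 10 data. The function name `sd_rms` is made up for this example.

```python
import math


def sd_rms(values):
    """Standard deviation as the root-mean-square of the deviations
    from the average."""
    average = sum(values) / len(values)
    deviations = [v - average for v in values]
    mean_square = sum(d * d for d in deviations) / len(values)
    return math.sqrt(mean_square)


# Hypothetical repeated measurements of the same object.
readings = [2, 4, 4, 4, 5, 5, 7, 9]
print("average:", sum(readings) / len(readings))
print("SD (likely size of chance error):", sd_rms(readings))
```

Note that this divides by the number of measurements, matching the R.M.S. definition on the slide, rather than by one less than that number as some software does by default.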
19
Outliers
Extreme scores that are not the result of errors.
Table 1 data: entries 36, 86 and 94 are very extreme numbers that seldom occur.
#94: z = -5; #36: z = 3; #86: z = 5
z = 3: 99.87% below, 0.13% at/above
z = 4: 99.997% below, 0.003% at/above
(OH: fig. 2, p. 102)
With these in: mean = 405 micrograms below 10 grams, SD = 6 micrograms, 86% fall within 1 SD of the average.
Effect of outliers: they inflate the average and the SD.
With these out: mean = 404 micrograms below 10 grams, SD = 4 micrograms, closer to 68% within 1 SD of the average.
20
Bias
A systematic influence in the same direction; NOT random.
random/chance: sometimes positive, sometimes negative
systematic: always in only one direction
Now: individual measurement = exact value + bias + chance error
Without bias: the long-run average of repeated measurements = exact value, known as the EXPECTED VALUE
With bias: the long-run average will be off in the same direction as the bias
Note: bias does not equal chance error
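The difference between bias and chance error shows up clearly in a simulation. Everything in this sketch is an assumption made for illustration: the exact value of 100, the bias of 1.5, and the choice to draw chance errors from a normal curve centered at 0 with SD 2. The slides do not specify any of these.

```python
import random


def long_run_average(exact_value, bias, chance_sd, n, seed=0):
    """Average n simulated measurements, each modeled as
    exact value + bias + chance error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n):
        chance_error = rng.gauss(0, chance_sd)  # sometimes +, sometimes -
        total += exact_value + bias + chance_error
    return total / n


EXACT = 100.0  # hypothetical exact value
no_bias = long_run_average(EXACT, bias=0.0, chance_sd=2.0, n=10_000)
with_bias = long_run_average(EXACT, bias=1.5, chance_sd=2.0, n=10_000)

print("long-run average without bias:", round(no_bias, 2))
print("long-run average with bias:   ", round(with_bias, 2))
```

The chance errors largely cancel in the long run, so the unbiased average settles near the exact value, while the biased average stays off by about the size of the bias no matter how many measurements are taken.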
21
Chapter 6 Concept Review

All measurements, no matter how carefully made, may differ; this reflects chance error.
So: individual measurement = true value + chance error
Researchers need to estimate the likely size of chance error before relying on a single measurement; the best method is via replication.
The likely size of chance error in a single measurement is estimated by the standard deviation of a series of replicated measurements.
Bias/systematic error causes measurements to be systematically too high/low:
individual measurement = true value + bias + chance error
Even in careful measurement, we can expect a small percentage of outliers; these can strongly influence the average and the SD.

22
Chapter 7 Plotting Points and Lines
23
Chapter 7 Slope
Take any 2 points (A and B) on a line.
Moving from A to B:
X changes in some way (the run)
Y changes in some way (the rise)
Figure: line through A and B, with Rise = 1 and Run = 2
slope = rise / run
Slope = the rate at which y changes for each unit change in x (how much does y change as x changes one unit?)
Here the slope = 1/2 = 0.5
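The rise-over-run rule can be written as a short function. The coordinates of A and B below are hypothetical (the slide gives only Rise = 1 and Run = 2), and `slope` is a name chosen for this example.

```python
def slope(point_a, point_b):
    """Slope of the line through two points: rise divided by run."""
    (x_a, y_a), (x_b, y_b) = point_a, point_b
    rise = y_b - y_a
    run = x_b - x_a
    if run == 0:
        raise ValueError("a vertical line has no defined slope")
    return rise / run


# Hypothetical points chosen so that Rise = 1 and Run = 2, as on the slide.
A = (0, 0)
B = (2, 1)
print(slope(A, B))
```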
24
Positive, Negative, 0 Slope Values
0 change in y = 0 slope
+ change in y = + slope
- change in y = - slope
25
Intercept
the height of the line where the line crosses the Y axis
Examples shown:
Intercept = 0.50
Intercept = 4
Intercept = -3
26
Plotting Lines
Plot the line passing through the point (1, 1) with a slope of 2/3.
Construction point: Run = 4, Rise = 2.7
The next step involves the slope: slope = rise/run, which here = 2/3.
We chose the run = 4; the rise should be positive (the slope is not -2/3).
Rise = slope x run = 2/3 x 4 = 2.67, or about 2.7
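The construction-point step can be sketched in Python. The function name `construction_point` is invented for this example; the starting point, slope and run are the ones used on the slide.

```python
def construction_point(start, slope, run):
    """Return a second point on the line: move `run` units to the right
    of `start` and `slope * run` units up (or down, if negative)."""
    x0, y0 = start
    rise = slope * run
    return (x0 + run, y0 + rise)


# From the slide: the line through (1, 1) with slope 2/3, using run = 4.
x, y = construction_point((1, 1), slope=2 / 3, run=4)
print(x, round(y, 2))
```

Drawing a straight line through (1, 1) and the construction point gives the required line. Any other run would work equally well, since the slope is the same everywhere on the line.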
28
Algebraic Equation for a line
rule for computing the y coordinate of a point from its x coordinate
y = 0.25x + 2 OR y = 2 + 0.25x
(0.25 = slope, x = x coordinate, 2 = intercept)
General form: y = mx + b (m = slope, b = intercept)
29
Algebraic equation for a line: the rule y = .25x + 2
x              4            8            12            16
y = .25x + 2   .25(4) + 2   .25(8) + 2   .25(12) + 2   .25(16) + 2
y              3            4            5             6
Rise = 2
Run = 8
Note: slope = rise/run = 2/8 = 1/4 = 0.25; all points fall on the line; the line = the graph of the equation.
The graph of the equation y = mx + b is a straight line with slope m and intercept b.
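The rule y = mx + b can be applied to the x values on this slide with a one-line function. A minimal sketch; `line_y` is a made-up name, and the slope 0.25 and intercept 2 are the values from the slide.

```python
def line_y(x, m, b):
    """y coordinate of the point on the line y = mx + b at a given x."""
    return m * x + b


xs = [4, 8, 12, 16]
ys = [line_y(x, m=0.25, b=2) for x in xs]

for x, y in zip(xs, ys):
    print(x, y)
```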
30
Algebraic equation for a line: the rule y = -.75x + 4
x               0             2             4             6
y = -.75x + 4   -.75(0) + 4   -.75(2) + 4   -.75(4) + 4   -.75(6) + 4
y               4             2.5           1             -0.5
Run = 2
Rise = -1.5
Note: slope = rise/run = -1.5/2 = -3/4 = -0.75; all points fall on the line.
The graph of the equation y = mx + b is a straight line with slope m and intercept b.
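Working in the other direction, the slope and intercept can be read back from the tabulated points. This sketch takes the (x, y) pairs from the table on this slide; the helper name `slope_between` is an assumption for illustration.

```python
def slope_between(p, q):
    """Rise over run between two points."""
    return (q[1] - p[1]) / (q[0] - p[0])


# (x, y) pairs from the table on the slide.
points = [(0, 4), (2, 2.5), (4, 1), (6, -0.5)]

# The slope between each pair of neighboring points.
slopes = [slope_between(p, q) for p, q in zip(points, points[1:])]

# The intercept is the height of the line where x = 0.
intercept = next(y for x, y in points if x == 0)

print("slopes:", slopes)
print("intercept:", intercept)
```

Every neighboring pair gives the same slope, which is what it means for all the points to fall on one straight line.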
