Title: 3.3 Density Curves and Normal Distributions
- Density Curves
- Measuring Center and Spread for Density Curves
- Normal Distributions
- The 68-95-99.7 Rule
- Standardizing Observations
- Using the Standard Normal Table
- Inverse Normal Calculations
- Normal Quantile Plots
Exploring Quantitative Data
We now have a kit of graphical and numerical
tools for describing distributions. We also have
a strategy for exploring data on a single
quantitative variable. Now, we'll add one more
step to the strategy.
- Always plot your data: make a graph.
- Look for the overall pattern (shape, center, and
spread) and for striking departures such as
outliers.
- Calculate a numerical summary to briefly describe
center and spread.
- Sometimes, the overall pattern of a large number
of observations is so regular that we can
describe it by a smooth curve.
Density curves
A density curve is a mathematical model of a
distribution. The total area under the curve, by
definition, is equal to 1, or 100%. The area
under the curve for a range of values is the
proportion of all observations for that range.
Area under the density curve corresponds to the
relative frequency of the histogram.
[Figure: histogram of a sample (left) with the smoothed
density curve (right) describing theoretically the population.]
Relative frequency in the left histogram: 287/947 = .303
Area under the right curve: .293
Density Curves
- A density curve is a curve that:
  - Is always on or above the horizontal axis
  - Has an area of exactly 1 underneath it
- A density curve describes the overall pattern of
a distribution. The area under the curve and
above any range of values on the horizontal axis
is the proportion of all observations that fall
in that range.
Density curves come in many shapes. Some are well
known mathematically and others aren't, but they
all lie on or above the horizontal axis and have total
area 1.
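The two defining properties can be checked numerically. The sketch below uses a made-up triangular density on [0, 2] (not one from the slides) and a simple midpoint-rule sum; the function names are illustrative only. The total area should come out very close to 1, and the area over 0.5 to 1.5 is the proportion of observations in that range.

```python
def triangular_density(x):
    """Hypothetical density: rises linearly on [0, 1], falls linearly on [1, 2]."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


def area_under(density, a, b, steps=100_000):
    """Approximate the area under `density` between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width


# Property check: the whole curve has area 1.
total_area = area_under(triangular_density, 0.0, 2.0)

# Area above a range = proportion of observations in that range.
proportion = area_under(triangular_density, 0.5, 1.5)

print(round(total_area, 4), round(proportion, 4))
```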
Density Curves
- Our measures of center and spread apply to
density curves as well as to actual sets of
observations.
Distinguishing the Median and Mean of a Density
Curve
- The median of a density curve is the equal-areas
point: the point that divides the area under the
curve in half.
- The mean of a density curve is the balance point,
at which the curve would balance if made of solid
material.
- The median and the mean are the same for a
symmetric density curve. They both lie at the
center of the curve. The mean of a skewed curve
is pulled away from the median in the direction
of the long tail.
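To see the mean being pulled toward the long tail, here is a minimal sketch that uses a hypothetical right-skewed density, f(x) = e^(-x) for x >= 0 (chosen for illustration, not taken from the slides). The mean is computed as the balance point (the integral of x times f(x)) and the median as the equal-areas point, found by bisection.

```python
import math


def density(x):
    """Hypothetical right-skewed density: f(x) = e^(-x) for x >= 0."""
    return math.exp(-x) if x >= 0 else 0.0


def integrate(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


UPPER = 40.0  # the area beyond this point is negligible for this density

# Mean: the balance point, integral of x * f(x).
mean = integrate(lambda x: x * density(x), 0.0, UPPER)

# Median: the point with area 0.5 to its left, located by bisection.
lo, hi = 0.0, UPPER
for _ in range(40):
    mid = (lo + hi) / 2
    if integrate(density, 0.0, mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2

print(round(mean, 3), round(median, 3))
```

For this curve the mean lands to the right of the median, on the side of the long right tail.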
Density Curves
- The mean and standard deviation computed from
actual observations (data) are denoted by x̄ (x-bar) and
s, respectively, and are called the sample mean
and sample standard deviation.
- The mean and standard deviation of the idealized
distribution represented by the density curve are
denoted by µ (mu) and σ (sigma),
respectively, and are sometimes called the
population mean and population standard
deviation.
Normal Distributions
- One particularly important class of density
curves is the Normal curves, which describe
Normal distributions.
- All Normal curves are symmetric, single-peaked,
and bell-shaped.
- A specific Normal curve is described by giving
its mean µ and standard deviation σ.
Normal Distributions
- A Normal distribution is described by a Normal
density curve. Any particular Normal
distribution is completely specified by two
numbers: its mean µ and standard deviation σ.
- The mean of a Normal distribution is the center
of the symmetric Normal curve.
- The standard deviation is the distance from the
center to the change-of-curvature points on
either side, the points of inflection of the
density.
- We abbreviate the Normal distribution with mean µ
and standard deviation σ as N(µ,σ).
Normal distributions
Normal, or Gaussian, distributions are a family
of symmetrical, bell-shaped density curves
defined by a mean µ (mu) and a standard deviation
σ (sigma): N(µ,σ).
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
e = 2.71828..., the base of the natural logarithm
π = pi = 3.14159...
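The Normal density can be evaluated directly from its standard closed form. The sketch below is only an illustration: `normal_density` is a name chosen here, and the mean of 15 with standard deviations 2, 4, and 6 are example values. It prints the height of the peak for each spread.

```python
import math


def normal_density(x, mu, sigma):
    """Height of the N(mu, sigma) density curve at x.

    f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-0.5 * ((x - mu) / sigma) ** 2)
    """
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Same mean, different spreads: a larger sigma gives a lower, wider curve.
for sigma in (2, 4, 6):
    print(sigma, round(normal_density(15, 15, sigma), 4))
```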
A family of density curves
Here, means are the same (µ = 15) while standard
deviations are different (σ = 2, 4, and 6).
Here, means are different (µ = 10, 15, and 20)
while standard deviations are the same (σ = 3).
The 68-95-99.7 Rule
- In the Normal distribution with mean µ and
standard deviation σ:
  - Approximately 68% of the observations fall within
  σ of µ.
  - Approximately 95% of the observations fall within
  2σ of µ.
  - Approximately 99.7% of the observations fall
  within 3σ of µ.
Here's an N(64.5, 2.5) distribution of heights
of college-aged females.
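The rule can be checked against exact Normal areas. This sketch assumes Python 3.8+ for `statistics.NormalDist`; `proportion_within` is a helper name made up here. It uses the N(64.5, 2.5) height distribution above.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)  # heights in inches


def proportion_within(dist, k):
    """Return (low, high, proportion) for the interval mean +/- k standard deviations."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return low, high, dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low, high, p = proportion_within(heights, k)
    print(f"within {k} sd: {low:.1f} to {high:.1f} inches, proportion {p:.4f}")
```

The exact proportions are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule says "approximately".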
The 68-95-99.7 Rule
- The distribution of Iowa Test of Basic Skills
(ITBS) vocabulary scores for 7th-grade students
in Gary, Indiana, is close to Normal. Suppose
the distribution is N(6.84, 1.55).
- Sketch the Normal density curve for this
distribution.
- What percent of ITBS vocabulary scores are less
than 3.74?
- What percent of the scores are between 5.29 and
9.94?
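One way to approach the two percent questions is to count how many standard deviations each cutoff lies from the mean and then apply the 68-95-99.7 rule. The sketch below does that and, as a cross-check, also computes the exact areas with `statistics.NormalDist` (Python 3.8+). The variable and function names are illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 6.84, 1.55
itbs = NormalDist(MU, SIGMA)


def sds_from_mean(x):
    """Number of standard deviations x lies from the mean (rounded to 2 places)."""
    return round((x - MU) / SIGMA, 2)


z_374 = sds_from_mean(3.74)  # 2 standard deviations below the mean
z_529 = sds_from_mean(5.29)  # 1 standard deviation below the mean
z_994 = sds_from_mean(9.94)  # 2 standard deviations above the mean

# 68-95-99.7 rule: 95% lie within 2 sd, so half of the remaining 5% is below -2 sd.
rule_below = (100 - 95) / 2
# Between -1 sd and +2 sd: half of the 68% band plus half of the 95% band.
rule_between = 68 / 2 + 95 / 2

# Exact areas for comparison (the rule is only approximate).
exact_below = 100 * itbs.cdf(3.74)
exact_between = 100 * (itbs.cdf(9.94) - itbs.cdf(5.29))

print(rule_below, round(exact_below, 2))
print(rule_between, round(exact_between, 2))
```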
Standardizing Observations
All Normal distributions are the same if we
measure in units of size σ from the mean µ as
center. If x is an observation from N(µ,σ), its
standardized value is z = (x - µ)/σ.
The standard Normal distribution is the Normal
distribution with mean 0 and standard deviation
1. That is, the standard Normal distribution is
N(0,1). It is represented by Z, and we write
Z ~ N(0,1).
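The claim that all Normal distributions look the same after standardizing can be illustrated with two distributions from earlier slides. In this sketch (`standardize` is a name chosen here), a height of 67 inches in N(64.5, 2.5) and an ITBS score of 8.39 in N(6.84, 1.55) both sit one standard deviation above their means, so the area to their left is the same as the area left of z = 1 under N(0,1).

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """z-score: how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


z_height = standardize(67.0, 64.5, 2.5)
z_score = standardize(8.39, 6.84, 1.55)

p_height = NormalDist(64.5, 2.5).cdf(67.0)
p_score = NormalDist(6.84, 1.55).cdf(8.39)
p_standard = NormalDist().cdf(1.0)  # standard Normal, N(0, 1)

print(round(z_height, 2), round(z_score, 2))
print(round(p_height, 4), round(p_score, 4), round(p_standard, 4))
```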
The Standard Normal Table
Because all Normal distributions are the same
when we standardize, we can find areas under any
Normal curve from a single table.
The Standard Normal Table: Table A is a table of
areas under the standard Normal curve. The table
entry for each value z is the area under the
curve to the left of z.
The Standard Normal Table
Suppose we want to find the proportion of
observations from the standard Normal
distribution that are less than 0.81. We can
use Table A:
P(z < 0.81) = 0.7910
z     0.00    0.01    0.02
0.7   0.7580  0.7611  0.7642
0.8   0.7881  0.7910  0.7939
0.9   0.8159  0.8186  0.8212
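Table A entries are values of the standard Normal cumulative distribution function, so the lookup can be reproduced in software. This is a sketch using `statistics.NormalDist` from the Python standard library (3.8+). `table_a_entry` is a hypothetical helper name, and rounding to four places mimics the printed table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal, N(0, 1)


def table_a_entry(z):
    """Area under the standard Normal curve to the left of z, to 4 decimal places."""
    return round(Z.cdf(z), 4)


print(table_a_entry(0.81))

# Rebuild the excerpt of the table shown above: rows 0.7-0.9, columns .00-.02.
for row in (0.7, 0.8, 0.9):
    entries = [f"{table_a_entry(row + col):.4f}" for col in (0.00, 0.01, 0.02)]
    print(f"{row:.1f}", *entries)
```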
Tips on using Table A
To calculate the area between two z-values, first
get the area under N(0,1) to the left of each
z-value from Table A.
Then subtract the smaller area from the larger
area.
A common mistake made by students is to subtract
the two z-values. It is the areas that are
subtracted, not the z-scores!
area between z1 and z2 = (area left of z1) - (area
left of z2), where z1 is the larger z-value
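A short sketch of the tip, with the common mistake shown alongside for contrast. The z-values 0.70 and 0.92 are example inputs picked from the table excerpt, and `area_between` is an illustrative name.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal, N(0, 1)


def area_between(z_a, z_b):
    """Area under N(0,1) between two z-values: larger left-area minus smaller left-area."""
    left_a = Z.cdf(z_a)
    left_b = Z.cdf(z_b)
    return max(left_a, left_b) - min(left_a, left_b)


correct = area_between(0.70, 0.92)  # 0.8212 - 0.7580 from the table
mistake = Z.cdf(0.92 - 0.70)        # subtracting z-scores first gives a different number

print(round(correct, 4), round(mistake, 4))
```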
Normal Calculations
How to Solve Problems Involving Normal
Distributions
- Express the problem in terms of the observed
variable X.
- Draw a picture of the distribution of X and shade
the area of interest under the curve.
- Perform calculations.
  - Standardize X to restate the problem in terms of
  a standard Normal variable Z.
  - Use Table A and the fact that the total area
  under the curve is 1 to find the required area
  under the standard Normal curve.
- Write your conclusion in the context of the
problem.
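The calculation steps can be sketched on a made-up question: what proportion of college-aged women, with heights N(64.5, 2.5) as on the earlier slide, are taller than 68 inches? The 68-inch cutoff is a hypothetical value chosen for illustration.

```python
from statistics import NormalDist

MU, SIGMA = 64.5, 2.5  # N(64.5, 2.5) heights in inches
x = 68.0               # hypothetical cutoff, for illustration only

# Step 1: express the problem in terms of X: we want P(X > 68).
# Step 2: standardize to restate it in terms of Z.
z = (x - MU) / SIGMA

# Step 3: Table A gives the area to the LEFT of z ...
area_left = NormalDist().cdf(z)
# ... and the total area is 1, so the area to the right is the complement.
area_right = 1.0 - area_left

# Step 4: conclusion in context.
print(f"z = {z:.2f}; about {100 * area_right:.1f}% of these women are taller than {x:.0f} inches")
```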
Inverse Normal Calculations
- According to the Health and Nutrition Examination
Study of 1976-1980, the heights X (in inches) of
adult men aged 18-24 are N(70, 2.8).
How tall must a man be to be in the
lower 10% for men aged 18-24?
N(70, 2.8)
Inverse Normal Calculations
How tall must a man be to be in the lower 10% for men
aged 18-24?
Look up the probability closest to 0.10
in the table. Find the corresponding z, the
standardized score. The value you seek is that
many standard deviations from the mean.
z      0.07    0.08    0.09
-1.3   0.0853  0.0838  0.0823
-1.2   0.1020  0.1003  0.0985
-1.1   0.1210  0.1190  0.1170
z = -1.28
Normal Calculations
How tall must a man be to be in the lower 10% for men
aged 18-24?
z = -1.28
We need to unstandardize the z-score to find
the observed value (x):
x = 70 + z(2.8) = 70 + (-1.28)(2.8)
  = 70 + (-3.58) = 66.42
A man would have to be approximately 66.42 inches
tall or less to place in the lower 10% of all men
in the population.
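The same inverse calculation can be done in software, which avoids rounding z to two decimal places. This sketch uses `statistics.NormalDist.inv_cdf` (Python 3.8+), and the variable names are illustrative. The table-based answer and the software answer differ slightly in the second decimal place.

```python
from statistics import NormalDist

MU, SIGMA = 70.0, 2.8  # heights of men aged 18-24, in inches

# z-value with area 0.10 to its left under N(0, 1).
z_exact = NormalDist().inv_cdf(0.10)

# Unstandardize: x = mu + z * sigma.
x_exact = MU + z_exact * SIGMA

# Same steps using the two-decimal z read from Table A.
x_table = MU + (-1.28) * SIGMA

# Or ask the N(70, 2.8) distribution directly.
x_direct = NormalDist(MU, SIGMA).inv_cdf(0.10)

print(round(z_exact, 4), round(x_exact, 2), round(x_table, 2), round(x_direct, 2))
```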
Normal Quantile Plots
- One way to assess if a distribution is indeed
approximately Normal is to plot the data on a
Normal quantile plot.
- The data points are ordered from smallest to
largest and their percentile ranks are converted
to z-scores with Table A. These z-scores are then
plotted against the data to create a Normal
quantile plot.
- If the distribution is indeed Normal, the plot
will show a straight line, indicating a good
match between the data and a Normal distribution
(in JMP, the points fall within the dotted lines).
- Systematic deviations from a straight line
indicate a non-Normal distribution. Outliers
appear as points that are far away from the
overall pattern of the plot (some points fall
outside the dotted lines in JMP).
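The construction can be sketched by hand in a few lines. Everything specific here is an assumption for illustration: the ten data values are made up, and the percentile rank (i - 0.5)/n is just one common convention (JMP and other packages may use slightly different ones). The sketch also fits a least-squares line through the points; for roughly Normal data its intercept is close to the sample mean and its slope close to the sample standard deviation.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample (e.g., heights in inches), for illustration only.
data = [64.9, 61.2, 66.1, 63.5, 68.0, 64.4, 62.8, 66.9, 64.0, 65.3]

ordered = sorted(data)
n = len(ordered)

# Percentile rank of the i-th smallest value, converted to a z-score.
# (i - 0.5) / n is an assumed convention, not necessarily what JMP uses.
z_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Points of the Normal quantile plot: z-score on one axis, data value on the other.
points = list(zip(z_scores, ordered))

# Least-squares line: value = intercept + slope * z.
z_bar = mean(z_scores)
x_bar = mean(ordered)
slope = sum((z - z_bar) * (x - x_bar) for z, x in points) / sum(
    (z - z_bar) ** 2 for z in z_scores
)
intercept = x_bar - slope * z_bar

for z, x in points:
    print(f"{z:6.2f}  {x:5.1f}")
print(f"intercept {intercept:.2f} (mean {x_bar:.2f}), slope {slope:.2f} (s {stdev(ordered):.2f})")
```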
Normal Quantile Plots
Good fit to a straight line: the distribution of
rainwater pH values is close to Normal. The
intercept of the line approximates the mean of the data and the
slope of the line approximates the s.d. of the data.
Curved pattern: the data are not Normally
distributed. Instead, the plot shows a right skew: a
few individuals have particularly long survival
times.
Normal quantile plots are complex to do by hand,
but they are easy to do in JMP: under the red
triangle, choose Normal Quantile Plot. But
notice the difference when compared to the above
plots.
HW
- Finish reading section 3.3
- Work over all the examples!
- I've put up some videos on computing Normal
Probabilities as an online assignment due 9/19
9/22 at 9:00am
- Work on 3.82-3.93, 3.95, 3.100-3.102,
3.104, 3.106-3.109, 3.110-3.128. Do as many
of these problems as you can in order to really
understand what's going on here!
- Quiz in class on Monday 9/22
- Test 1 on October 1, covering Chapts. 1-4