Title: Normal Probability Distributions
1Normal Probability Distributions
2Overview
3Normal Distribution
- If a continuous random variable has a
distribution with a graph that is symmetric and
bell-shaped, and it can be described by the
equation $y = \dfrac{e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}}{\sigma\sqrt{2\pi}}$, we say that it has a normal
distribution.
4The Normal Distribution
- The curve is bell-shaped and symmetric.
5The Standard Normal Distribution
6Uniform Distribution
- A continuous random variable has a uniform
distribution if its values spread evenly over the
range of possibilities. The graph of a uniform
distribution results in a rectangular shape.
7Uniform Distribution
- The uniform distribution is symmetric and
rectangular.
8Requirements for a Probability Distribution
- $\sum P(x) = 1$, where x assumes all possible
values.
- $0 \le P(x) \le 1$ for every individual value of
x.
9Density Curve
- A density curve is a graph of a continuous
probability distribution. It must satisfy the
following properties:
- The total area under the curve must equal 1.
- Every point on the curve must have a vertical
height that is 0 or greater. (That is, the curve
cannot fall below the x-axis.)
10Relationship Between Area Under the Curve and
Probability
- Because the total area under a density curve is
equal to 1, there is a correspondence between
area and probability.
11Example
- Suppose that the continuous random variable X has
a uniform distribution over the interval from 0
to 5. Find the probability that a randomly
selected value of X is
- More than 4.
- Less than 2.
- Between 1 and 4.
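As a rough check on this example, the sketch below treats probability as rectangle area: the density has height 1/(5 - 0) = 0.2, so each probability is the interval width times 0.2. The helper name `uniform_prob` is an illustrative assumption, not something from the slides.

```python
def uniform_prob(lo, hi, a=0.0, b=5.0):
    """P(lo < X < hi) for X uniform on [a, b].

    The density is a rectangle of height 1/(b - a), so the probability
    is the width of the overlap with [a, b] times that height.
    """
    lo, hi = max(lo, a), min(hi, b)      # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)   # width * height

p_more_than_4 = uniform_prob(4, 5)
p_less_than_2 = uniform_prob(0, 2)
p_between_1_and_4 = uniform_prob(1, 4)

print(p_more_than_4, p_less_than_2, p_between_1_and_4)
```

The three results should be close to 0.2, 0.4, and 0.6.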
12Standard Normal Distribution
- The standard normal distribution is a normal
probability distribution that has a mean of 0 and
a standard deviation of 1, and the total area
under its density curve is equal to 1.
13Standard Normal Distribution
- The standard normal distribution
14Probability
- A probability of falling in an interval is just
the area under the curve.
15Probability
16Probabilities and a Continuous Probability
Distribution
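- For continuous numerical variables and any
particular numbers a and b, $P(x \le a) = P(x < a)$, $P(x \ge b) = P(x > b)$, and $P(a < x < b) = P(a \le x \le b)$, because the probability of any single exact value is 0.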
17Calculating Probabilities Given a z Score
- Table A-2 is designed only for the standard
normal distribution, which has a mean of 0 and a
standard deviation of 1.
- Table A-2 is on two pages, with one page for
negative z scores and the other page for positive
z scores.
- Each value in the body of the table is a
cumulative area from the left up to a vertical
boundary above the specific z score.
18Calculating Probabilities Given a z Score
- When working with a graph, avoid confusion
between z scores and areas.
- z score: Distance along the horizontal scale of
the standard normal distribution; refer to the
leftmost column and top row of Table A-2.
- Area: Region under the curve; refer to the values
in the body of Table A-2.
- The part of the z score denoting hundredths is
found across the top row of Table A-2.
19Example
- Find the area under the standard normal
distribution to the left of 1.5.
- Find the area under the standard normal
distribution to the right of -2.
- Find the area under the standard normal
distribution between -2 and 1.5.
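The three areas can also be computed without Table A-2. The sketch below uses `statistics.NormalDist` from the Python standard library, whose `cdf` method returns the cumulative area from the left, the same quantity the table body reports. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, sd 1

# Cumulative area from the left up to z = 1.5
area_left_of_1_5 = std_normal.cdf(1.5)

# Area to the right is 1 minus the cumulative area from the left
area_right_of_neg2 = 1 - std_normal.cdf(-2)

# Area between two z scores is the difference of two cumulative areas
area_between = std_normal.cdf(1.5) - std_normal.cdf(-2)

print(round(area_left_of_1_5, 4),
      round(area_right_of_neg2, 4),
      round(area_between, 4))
```

The results should agree with the table values 0.9332, 0.9772, and 0.9104 to about four decimal places.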
20Example
- Let z denote a random variable that has a
standard normal distribution. Determine each of
the following probabilities -
-
-
21Calculating a z Score Given a Probability
- Draw a bell-shaped curve and identify the region
under the curve that corresponds to the given
probability. If that region is not a cumulative
region from the left, work instead with a known
region that is a cumulative region from the left.
- Using the cumulative area from the left, locate
the closest probability in the body of Table A-2
and identify the corresponding z score.
22Example
- Determine the z value that separates
- the smallest 10% of all the z values from the
others,
- the largest 5% of all the z values from the
others.
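Going from a probability back to a z score is the inverse lookup. As a sketch, `NormalDist.inv_cdf` takes a cumulative area from the left and returns the z score, so the "largest 5%" case is first rewritten as a left area of 0.95, as the procedure above recommends.

```python
from statistics import NormalDist

std_normal = NormalDist()  # defaults to mean 0, sd 1

# Smallest 10%: cumulative area from the left is 0.10
z_smallest_10pct = std_normal.inv_cdf(0.10)

# Largest 5%: the region is on the right, so use the left area 1 - 0.05
z_largest_5pct = std_normal.inv_cdf(1 - 0.05)

print(round(z_smallest_10pct, 3), round(z_largest_5pct, 3))
```

The values should be near -1.28 and 1.645, matching what a closest-entry search in the table body gives.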
23Applications of Normal Distributions
24Standardizing Scores
- If we convert values to z scores using
$z = \dfrac{x - \mu}{\sigma}$, then
procedures with all normal distributions are the
same as those for the standard normal
distribution.
25Converting Values in a Nonstandard Normal
Distribution to z Scores
- Sketch a normal curve, label the mean and the
specific x values, then shade the region
representing the desired probability.
- For each relevant value x that is a boundary for
the shaded region, use the z score formula to
convert that value to the equivalent z score.
- Refer to Table A-2 and use the z scores to find
the area of the shaded region. This area is the
desired probability.
26Example
- The serum cholesterol levels of 17-year-olds
follow a normal distribution with a mean of 176
mg/dL and a standard deviation of 30 mg/dL. If
a 17-year-old is selected at random, what is the
probability he/she has a serum cholesterol level
- of 156 mg/dL or less?
- of more than 216 mg/dL?
- between 121 mg/dL and 186 mg/dL?
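One way to work this example is to convert each boundary to a z score and then take cumulative areas under the standard normal curve. The sketch below follows those steps, with `statistics.NormalDist` standing in for Table A-2; the helper `z_score` is an illustrative name.

```python
from statistics import NormalDist

mu, sigma = 176, 30          # mean and standard deviation from the example
std_normal = NormalDist()    # standard normal, mean 0 and sd 1

def z_score(x):
    """Convert a cholesterol value x to its z score."""
    return (x - mu) / sigma

# (1) 156 or less: cumulative area to the left of z = (156 - 176)/30
p_156_or_less = std_normal.cdf(z_score(156))

# (2) more than 216: area to the right of z = (216 - 176)/30
p_more_than_216 = 1 - std_normal.cdf(z_score(216))

# (3) between 121 and 186: difference of two cumulative areas
p_between = std_normal.cdf(z_score(186)) - std_normal.cdf(z_score(121))

print(round(p_156_or_less, 4), round(p_more_than_216, 4), round(p_between, 4))
```

Because the z scores are not rounded to two decimals here, the results (roughly 0.25, 0.09, and 0.60) can differ from table-based answers in the third decimal place.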
27z Scores and Area
- Don't confuse z scores and areas.
- Choose the correct side of the graph.
- A z score must be negative whenever it is located
in the left half of the normal distribution.
- Areas (or probabilities) are positive or zero
values, but they are never negative.
28Finding Values From Known Areas
- Sketch a normal distribution curve, enter the
given probability or percentage in the
appropriate region of the graph, and identify the
x value(s) being sought.
- Use Table A-2 to find the z score corresponding
to the cumulative left area bounded by x. Refer
to the body of Table A-2 to find the closest
area, then identify the corresponding z score.
29Finding Values From Known Areas
- Using the formula $x = \mu + (z \cdot \sigma)$, enter the values for $\mu$, $\sigma$,
and the z score found in Step 2, then solve for
x.
- Refer to the sketch of the curve to verify that
the solution makes sense in the context of the
graph and in the context of the problem.
30Example (continued)
- The serum cholesterol levels of 17-year-olds
follow a normal distribution with a mean of 176
mg/dL and a standard deviation of 30 mg/dL.
Find
- the 80th percentile.
- the 25th percentile.
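The percentile questions reverse the earlier calculation: find the z score for the cumulative left area, then solve x = mean + z × (standard deviation). A minimal sketch, again using `statistics.NormalDist` in place of the table and an illustrative helper name `percentile`:

```python
from statistics import NormalDist

mu, sigma = 176, 30
std_normal = NormalDist()

def percentile(p):
    """Value x with cumulative left area p under Normal(mu, sigma)."""
    z = std_normal.inv_cdf(p)   # z score for the given left area
    return mu + z * sigma       # convert the z score back to an x value

p80 = percentile(0.80)
p25 = percentile(0.25)

print(round(p80, 1), round(p25, 1))
```

As a sanity check against the sketch of the curve, the 80th percentile should land above the mean of 176 and the 25th percentile below it.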
31Sampling Distributions and Estimators
32Sampling Distribution of the Mean
- The sampling distribution of the mean is the
probability distribution of sample means, with
all samples having the same sample size n.
33Example
- Suppose our population consists of the three
values 1, 2, and 5.
- (a) Calculate the mean, median, range, variance and
standard deviation for the population.
- (b) Find all possible samples of 2 values.
- (c) Calculate the mean, median, range, variance and
standard deviation for each sample.
- (d) Calculate the mean of the sample means, sample
medians, sample ranges, sample variances, and
sample standard deviations.
- (e) Compare the results of (d) with the results of (a).
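The enumeration in this example is small enough to do by machine. The sketch below assumes the samples are drawn with replacement and that order matters, giving 9 samples; the slide does not say this, so treat it as an assumption. Sample variance and standard deviation use the n - 1 divisor, and the population versions use the N divisor.

```python
from itertools import product
from statistics import mean, median, pstdev, pvariance, stdev, variance

population = [1, 2, 5]

# Assumption: samples of size 2 drawn WITH replacement, order kept -> 9 samples
samples = list(product(population, repeat=2))

def spread(values):
    """Range of a collection: largest value minus smallest value."""
    return max(values) - min(values)

# Population parameters
pop = {
    "mean": mean(population),
    "median": median(population),
    "range": spread(population),
    "variance": pvariance(population),   # divides by N
    "sd": pstdev(population),
}

# Mean of each statistic over all possible samples
avg = {
    "mean": mean([mean(s) for s in samples]),
    "median": mean([median(s) for s in samples]),
    "range": mean([spread(s) for s in samples]),
    "variance": mean([variance(s) for s in samples]),  # divides by n - 1
    "sd": mean([stdev(s) for s in samples]),
}

for name in pop:
    print(f"{name:9s} population={pop[name]:.4f}  mean over samples={avg[name]:.4f}")
```

Under these assumptions the averaged sample means and sample variances match the population values, while the median, range, and standard deviation do not.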
34Example (continued)
35Sampling Variability
- The value of a statistic, such as the sample mean
$\bar{x}$, depends on the particular values included
in the sample, and it generally varies from
sample to sample. This variability of a statistic
is called sampling variability.
36Sampling Distribution of the Proportion
- The sampling distribution of the proportion is
the probability distribution of sample
proportions, with all samples having the same
sample size n.
37Properties of the Distribution of Sample
Proportions
- Sample proportions tend to target the value of
the population proportion.
- Under certain conditions, the distribution of
sample proportions approximates a normal
distribution.
38Biased and Unbiased Estimators
- A sample statistic is an unbiased estimator of a
population parameter if it targets the
population parameter.
- A sample statistic is a biased estimator of a
population parameter if it does not target the
population parameter.
39Estimators Good and Bad
- Statistics that target population parameters:
Mean, Variance, Proportion
- Statistics that do not target population
parameters: Median, Range, Standard Deviation
40The Central Limit Theorem
41Example
- The serum cholesterol levels of 17-year-olds
follow a normal distribution with a mean of 176
mg/dL and a standard deviation of 30 mg/dL. If
a 17-year-old is selected at random, what is the
probability he/she has a serum cholesterol level
of 156 mg/dL or less?
42The Central Limit Theorem and the Sampling
Distribution of $\bar{x}$
- Given:
- The random variable x has a distribution (which
may or may not be normal) with mean $\mu$ and
standard deviation $\sigma$.
- Simple random samples all of the same size n are
selected from the population. (The samples are
selected so that all possible samples of size n
have the same chance of being selected.)
43The Central Limit Theorem and the Sampling
Distribution of $\bar{x}$
- Conclusions:
- The distribution of sample means will, as the
sample size increases, approach a normal distribution.
- The mean of all sample means is the population
mean $\mu$. (That is, the normal distribution from
Conclusion 1 has mean $\mu$.)
- The standard deviation of all sample means is
$\sigma/\sqrt{n}$. (That is, the normal distribution from
Conclusion 1 has standard deviation $\sigma/\sqrt{n}$.)
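A small simulation can make these conclusions concrete. The sketch below is illustrative only: it assumes a skewed, clearly non-normal population (exponential with mean 1 and standard deviation 1), a sample size of 40, and 5000 repeated samples, none of which come from the slides. It then compares the mean and standard deviation of the simulated sample means with the population mean and with σ/√n.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)              # fixed seed so the run is repeatable

pop_mean, pop_sd = 1.0, 1.0  # exponential(rate=1) has mean 1 and sd 1
n = 40                       # assumed sample size
num_samples = 5000           # assumed number of repeated samples

# Draw many samples of size n and record each sample mean
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)
predicted_sd = pop_sd / sqrt(n)   # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {pop_mean})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {predicted_sd:.3f})")
```

The simulated values will not match the theoretical ones exactly, but they should be close, and a histogram of `sample_means` would look far more bell-shaped than the exponential population itself.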
44The Central Limit Theorem and the Sampling
Distribution of $\bar{x}$
- Practical Rules Commonly Used:
- If the original population is not itself normally
distributed, here is a common guideline: For
samples of size n greater than 30, the
distribution of the sample means can be
approximated reasonably well by a normal
distribution. (There are exceptions, such as
populations with very non-normal distributions
requiring sample sizes much larger than 30, but
such exceptions are relatively rare.) The
approximation gets better as the sample size n
becomes larger.
- If the original population is itself normally
distributed, then the sample means will be
normally distributed for any sample size n (not
just the values of n larger than 30).
45Notation for the Sampling Distribution of $\bar{x}$
- If all possible random samples of size n are
selected from a population with mean $\mu$ and
standard deviation $\sigma$, the mean of the sample
means is denoted by $\mu_{\bar{x}}$, so $\mu_{\bar{x}} = \mu$. Also, the
standard deviation of the sample means is
denoted by $\sigma_{\bar{x}}$, so $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. $\sigma_{\bar{x}}$ is often called the
standard error of the mean.
46Example
- The serum cholesterol levels of 17-year-olds
follow a normal distribution with a mean of 176
mg/dL and a standard deviation of 30 mg/dL. If
a random sample of ten 17-year-olds is selected,
what is the probability that the sample mean is
156 mg/dL or less?
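The only change from the single-individual version of this problem is the standard deviation: for a sample mean it is σ/√n rather than σ. The sketch below computes both probabilities side by side so the effect is visible; because the population is stated to be normal, the sample means are normal even with n = 10.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 176, 30, 10
std_normal = NormalDist()

# Individual value: z = (x - mu) / sigma
z_individual = (156 - mu) / sigma
p_individual = std_normal.cdf(z_individual)

# Sample mean: z = (xbar - mu) / (sigma / sqrt(n))
standard_error = sigma / sqrt(n)
z_mean = (156 - mu) / standard_error
p_sample_mean = std_normal.cdf(z_mean)

print(f"individual:  z = {z_individual:.2f}, P = {p_individual:.4f}")
print(f"sample mean: z = {z_mean:.2f}, P = {p_sample_mean:.4f}")
```

The probability for the sample mean should come out far smaller (under 0.02) than for one individual (about 0.25), since averages vary less than single values.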
47The Central Limit Theorem - Bottom Line
- As the sample size increases, the sampling
distribution of sample means approaches a normal
distribution.
48Example
- If the mean and standard deviation of serum iron
values for healthy men are 120 and 15 micrograms
per 100 mL, respectively, what is the probability
that a random sample of 50 healthy men will yield
a sample mean between 115 and 125 micrograms per
100 mL?
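Here the population is not stated to be normal, but n = 50 exceeds 30, so the usual guideline lets the sample mean be treated as approximately normal with standard deviation σ/√n. A sketch of the calculation under that assumption:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 120, 15, 50
standard_error = sigma / sqrt(n)   # sd of the sample means

# Approximate sampling distribution of the mean (relies on n > 30)
sampling_dist = NormalDist(mu, standard_error)

# P(115 < xbar < 125) as a difference of cumulative areas
p_between = sampling_dist.cdf(125) - sampling_dist.cdf(115)

print(f"standard error = {standard_error:.3f}, P = {p_between:.4f}")
```

The z scores of the two boundaries are about ±2.36, so the probability should be roughly 0.98.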
49Applying the Central Limit Theorem
- When working with an individual value from a
normally distributed population, use the methods
of Section 5.3. Use $z = \dfrac{x - \mu}{\sigma}$.
- When working with a mean for some sample (or
group), be sure to use the value $\sigma/\sqrt{n}$ for
the standard deviation of the sample means. Use
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
50Interpreting Results
- Rare Event Rule: If, under a given assumption, the
probability of a particular observed event is
exceptionally small, we conclude that the
assumption is probably not correct.
51Correction for a Finite Population
- When sampling without replacement and the sample
size n is greater than 5% of the finite
population size N (that is, $n > 0.05N$),
adjust the standard deviation of the sample means
by multiplying it by the finite population
correction factor $\sqrt{\dfrac{N - n}{N - 1}}$.
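As a sketch of how the correction might be applied in practice, the function below returns σ/√n and multiplies it by the factor √((N - n)/(N - 1)) only when a population size N is supplied and n exceeds 5% of N. The function name and the population sizes in the usage lines are hypothetical, chosen only to show both branches.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard deviation of sample means.

    Applies the finite population correction only when a population
    size N is given and the sample is more than 5% of it.
    """
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))   # finite population correction factor
    return se

# Hypothetical population sizes, for illustration only
se_no_population = standard_error(30, 10)          # no N given: no correction
se_large_pop = standard_error(30, 10, N=1000)      # 10 is 1% of 1000: no correction
se_small_pop = standard_error(30, 10, N=100)       # 10 is 10% of 100: corrected

print(round(se_no_population, 3), round(se_large_pop, 3), round(se_small_pop, 3))
```

The correction always shrinks the standard error, which matches the intuition that a sample covering a sizable share of the population leaves less room for sampling variability.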
52Normal as Approximation to the Binomial
53Recall
- A binomial probability distribution results from
a procedure that meets all the following
requirements:
- The procedure has a fixed number of trials.
- The trials must be independent.
- Each trial must have all outcomes classified into
two categories.
- The probabilities must remain constant for each
trial.
54Binomial Distributions: p = 0.5; n = 3, n = 4, n = 5, n = 6
55Binomial Distributions: p = 0.5; n = 10, n = 20, n = 30, n = 40
56Binomial Distributions: p = 0.3; n = 10, n = 20, n = 30, n = 40
57Normal Distribution as Approximation to Binomial
Distribution
- When working with a binomial distribution, if
$np \ge 5$ and $nq \ge 5$ (where $q = 1 - p$), then the binomial random
variable has a probability distribution that can
be approximated by a normal distribution with the
mean and standard deviation given as $\mu = np$ and $\sigma = \sqrt{npq}$.
58Definition
- When we use the normal distribution (which is a
continuous probability distribution) as an
approximation to the binomial distribution (which
is discrete), a continuity correction is made to
a discrete whole number x in the binomial
distribution by representing the single x value
by the interval from $x - 0.5$ to $x + 0.5$
(that is, by adding and subtracting 0.5).
59Example
- In a certain population of mussels (Mytilus
edulis), 80% of the individuals are infected with
an intestinal parasite. A marine biologist plans
to examine 100 randomly chosen mussels from the
population. Find the probability that at least 85
of the mussels will be infected.
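A possible way to work this example: check that np and nq are both at least 5, use mean np and standard deviation √(npq), and apply the continuity correction so that "at least 85" becomes the area to the right of 84.5. The sketch below also sums the exact binomial probabilities for comparison; that comparison is an addition for illustration and is not part of the slide.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.80
q = 1 - p

# Requirement check for the normal approximation
requirements_ok = (n * p >= 5) and (n * q >= 5)

mu = n * p                 # mean of the approximating normal distribution
sigma = sqrt(n * p * q)    # its standard deviation

# "At least 85" covers 85, 86, ..., so with continuity correction
# the region starts at 85 - 0.5 = 84.5
p_approx = 1 - NormalDist(mu, sigma).cdf(84.5)

# Exact binomial probability, summed term by term, for comparison
p_exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(85, n + 1))

print(requirements_ok, round(p_approx, 4), round(p_exact, 4))
```

The approximate answer is about 0.13, and the exact sum should be close to it.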
60Assessing Normality
61Definition
- A normal quantile plot (or normal probability
plot) is a graph of the points (x, y) where each
x value is from the original set of sample data,
and each y value is the corresponding z score
that is a quantile value expected from the
standard normal distribution.
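The points of a normal quantile plot can be computed directly. The sketch below uses one common convention, which is an assumption here since the slide does not give the construction: sort the data, assign the i-th smallest of n values the cumulative area (2i - 1)/(2n), and convert each area to a z score. The sample values are made up for illustration and are not the bear measurements from Data Set 6.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (x, z) pairs for a normal quantile plot.

    The i-th smallest of n values is paired with the standard normal
    z score whose cumulative left area is (2i - 1) / (2n).
    """
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    return [
        (x, std_normal.inv_cdf((2 * i - 1) / (2 * n)))
        for i, x in enumerate(xs, start=1)
    ]

# Hypothetical sample, for illustration only
sample = [62.0, 58.5, 70.5, 65.0, 67.5]
points = normal_quantile_points(sample)

for x, z in points:
    print(f"x = {x:5.1f}   z = {z:6.3f}")
```

Plotting these pairs and checking whether they fall near a straight line is the visual test the definition describes.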
62Procedure for Determining Whether Data Have a
Normal Distribution
- Histogram: Construct a histogram. Reject
normality if the histogram departs dramatically
from a bell shape.
- Outliers: Identify outliers. Reject normality if
there is more than one outlier present.
- Normal quantile plot: If the histogram is
basically symmetric and there is at most one
outlier, construct a normal quantile plot.
Examine the normal quantile plot using these
criteria:
- If the points do not lie close to a straight
line, or if the points exhibit some systematic
pattern that is not a straight-line pattern, then
the data appear to come from a population that
does not have a normal distribution.
- If the pattern of the points is reasonably close
to a straight line, then the data appear to come
from a population that has a normal distribution.
63Example
- Recall our study of bears. The data for the
lengths of bears are given in Data Set 6 of
Appendix B. Determine whether the requirement of
a normal distribution is satisfied. Assume that
this requirement is loose in the sense that the
population distribution need not be exactly
normal, but it must be a distribution that is
basically symmetric with only one mode.
64Example (continued)
65Example (continued)
66Example (continued)
- Using the weights of bears (given in Data Set 6
of Appendix B), determine whether the requirement
of a normal distribution is satisfied. Assume
that this requirement is loose in the sense that
the population distribution need not be exactly
normal, but it must be a distribution that is
basically symmetric with only one mode.
67Example (continued)
68Example (continued)
69Data Transformations
- For data sets where the distribution is not
normal, we can transform the data so that the
modified values have a normal distribution.
Common transformations include -
-
-
-