The Normal - PowerPoint PPT Presentation

Slides: 40
Provided by: stug
Transcript and Presenter's Notes

1
Chapter 7 The Normal Probability Distribution
Section 7.1 Properties of the Normal Distribution
Recall that a continuous random variable is a random
variable with infinitely many possible values that
cannot be counted. To find probabilities for
continuous random variables, we do not use
probability distribution functions (as we did for
discrete random variables). Instead, we use
probability density functions (pdfs).
2
  • Probability Density Function
  • A probability density function is an equation
    used to compute probabilities of continuous
    random variables that must satisfy the following
    two properties.
  • The area under the graph of the equation over all
    possible values of the random variable must equal
    one.
  • The graph of the equation must be greater than or
    equal to zero for all possible values of the
    random variable. That is, the graph of the
    equation must lie on or above the horizontal axis
    for all possible values of the random variable.

[Figure: paired graphs labeled Discrete and Continuous]
3
The area under the graph of a density function
over some interval represents the probability of
observing a value of the random variable in that
interval.
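As a rough numerical illustration of the two density properties and of "area equals probability", the sketch below integrates the standard normal density with a simple trapezoid rule. The function names and the choice of the interval from 0 to 1 are illustrative assumptions, not part of the slides.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Property 1: the total area is (numerically) one.
# The range -10 to 10 stands in for "all possible values".
total_area = area_under(normal_pdf, -10.0, 10.0)

# Property 2: the density is never negative (checked on a grid of points).
never_negative = all(normal_pdf(x / 10.0) >= 0.0 for x in range(-100, 101))

# Area over an interval = probability of observing a value in that interval.
prob_0_to_1 = area_under(normal_pdf, 0.0, 1.0)

print(round(total_area, 4), never_negative, round(prob_0_to_1, 4))
```

The trapezoid rule is only one of several ways to approximate the area; it is used here because it needs nothing beyond the standard library.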
4
A continuous random variable is normally
distributed, or has a normal probability
distribution, if its relative frequency histogram
has the shape of a normal curve (bell-shaped and
symmetric).
  • Properties of the Normal Density Curve
  • It is symmetric about its mean, μ.
  • The highest point occurs at x = μ.
  • The area under the curve is one.
  • The area under the curve to the right of μ
    equals the area under the curve to the left of μ
    equals ½.
  • As x increases without bound (gets larger and
    larger), the graph approaches, but never equals,
    zero. As x decreases without bound (gets larger
    and larger in the negative direction) the graph
    approaches, but never equals, zero.
[Figure: normal curve over the x-axis with area ½ on each side of the mean]
5
  • The Empirical Rule
  • Approximately 68% of the area under the normal
    curve is between x = μ − σ and x = μ + σ.
  • Approximately 95% of the area under the normal
    curve is between x = μ − 2σ and x = μ + 2σ.
  • Approximately 99.7% of the area under the normal
    curve is between x = μ − 3σ and x = μ + 3σ.

See p. 42
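The Empirical Rule percentages can be checked numerically. The sketch below uses Python's `statistics.NormalDist` and works in units of standard deviations, so it applies to any mean and standard deviation; the helper name `area_within` is an illustrative assumption.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The computed areas are slightly more precise than the rounded 68%, 95%, and 99.7% figures, which is why the rule says "approximately".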
6
  • The Area under a Normal Curve
  • Suppose a random variable X is normally
    distributed with mean μ and standard
    deviation σ. Notation: X ~ N(μ, σ). The area
    under the normal curve for any range of values of
    the random variable X represents either
  • The proportion of the population with the
    characteristics described by the range, or
  • The probability that a randomly selected
    individual from the population will have the
    characteristics described by the range.

e.g., 30% of ----- are between a and b
e.g., x has a 0.3 probability of being between a and b
7
Standardizing a Normal Random Variable
Suppose the random variable X is normally distributed
with mean μ and standard deviation σ. Then the
random variable Z = (X − μ)/σ is normally distributed
with mean μ = 0 and standard deviation σ = 1.
The random variable Z is said to have the
standard normal distribution. Notation: Z ~ N(0, 1)
See p 43
For any given x, we can calculate the associated
Z-score using the formula above.
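A minimal sketch of the standardization step follows. The numbers (mean 266, standard deviation 16, x = 260) are borrowed from the pregnancy example later in the deck; the function name `z_score` is an illustrative assumption.

```python
def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

z = z_score(260, mu=266, sigma=16)
print(z)  # -0.375
```

A negative Z-score means the value lies below the mean; here 260 days is 0.375 standard deviations below 266 days.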
8
Section 7.2 The Standard Normal Distribution
Z ~ N(0, 1)
  • Properties of the Standard Normal Curve
  • It is symmetric about its mean, μ = 0.
  • The highest point occurs at μ = 0.
  • The area under the curve is one. This
    characteristic is required in order to satisfy
    the requirement that the sum of all probabilities
    in a legitimate probability distribution equals
    1.
  • The area under the curve to the right of μ = 0
    equals the area under the curve to the left of
    μ = 0 equals ½.
  • As z increases without bound (gets larger and
    larger), the graph approaches, but never equals,
    zero. As z decreases without bound (gets larger
    and larger in the negative direction) the graph
    approaches, but never equals, zero.
[Figure: standard normal curve over the z-axis with area ½ on each side of 0]
9
  • The Empirical Rule
  • Approximately 0.68 = 68% of the area under the
    standard normal curve is between −1 and 1.
  • Approximately 0.95 = 95% of the area under the
    standard normal curve is between −2 and 2.
  • Approximately 0.997 = 99.7% of the area under
    the standard normal curve is between −3 and 3.

10
Notation for the Probability of a Standard Normal
Random Variable
P(a < Z < b) represents the probability a standard
normal random variable is between a and b.
P(Z > a) represents the probability a standard
normal random variable is greater than a.
P(Z < b) represents the probability a standard
normal random variable is less than b.
The notation z_α (pronounced z sub alpha) is the
Z-score such that the area under the standard
normal curve to the right of z_α is α.
[Figure: standard normal curve with area α shaded to the right of z_α]
11
Table II at the back of the text is referred to
as a Z-table. It tabulates the area to the left
of a given Z-score.
12
Z = −1.33 → P(Z < −1.33) = 0.0918
Z = 1.33 → P(Z < 1.33) = 0.9082
Area to the right of Z = −1.33: α = 1 − 0.0918 = 0.9082
Area to the right of Z = 1.33: α = 1 − 0.9082 = 0.0918
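The table lookups on this slide can be reproduced in software. The sketch below uses `statistics.NormalDist`, whose `cdf` gives the area to the left of a Z-score (the same quantity the Z-table tabulates) and whose `inv_cdf` goes the other way. The value α = 0.05 is a hypothetical choice for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area to the left, as tabulated in a Z-table.
left_of_neg = Z.cdf(-1.33)
left_of_pos = Z.cdf(1.33)

# Area to the right is one minus the area to the left.
right_of_neg = 1 - left_of_neg
right_of_pos = 1 - left_of_pos

# z_alpha: the Z-score with area alpha to its right (alpha = 0.05 is illustrative).
alpha = 0.05
z_alpha = Z.inv_cdf(1 - alpha)

print(round(left_of_neg, 4), round(left_of_pos, 4))
print(round(right_of_neg, 4), round(right_of_pos, 4))
print(round(z_alpha, 3))
```

By symmetry, the area to the right of 1.33 equals the area to the left of −1.33.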
13
Normal Curves on Board
14
Section 7.3 Applications of the Normal
Distribution
  • Finding the Area under any Normal Curve
  • 1. Draw a normal curve with the desired area
    shaded.
  • 2. Convert the values of X to Z-scores, using
    Z = (X − μ)/σ.
  • 3. Draw a standard normal curve with the desired
    area shaded.
  • 4. Find the area under the standard normal
    curve. This is the area under the normal curve
    drawn in Step 1.
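The computational part of this procedure (converting to Z-scores, then finding the area under the standard normal curve) might be sketched as follows. The distribution N(266, 16) comes from the pregnancy example later in the deck; the interval from 250 to 282 days and the function name `normal_area` are hypothetical.

```python
from statistics import NormalDist

def normal_area(a, b, mu, sigma):
    """Area under the N(mu, sigma) curve between a and b, via Z-scores."""
    z_a = (a - mu) / sigma          # convert each endpoint to a Z-score
    z_b = (b - mu) / sigma
    std = NormalDist()              # standard normal curve
    return std.cdf(z_b) - std.cdf(z_a)

area = normal_area(250, 282, mu=266, sigma=16)

# Cross-check: work directly with the unstandardized distribution.
X = NormalDist(266, 16)
direct = X.cdf(282) - X.cdf(250)

print(round(area, 4), round(direct, 4))
```

Both routes give the same area, which is the point of Step 4: the area under the standard normal curve equals the area under the original curve.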
15
  • Procedure for Finding the Value of a Normal
    Random Variable Corresponding to a Specified
    Proportion or Probability
  • Draw a standard normal curve with the area
    corresponding to the proportion or probability
    shaded.
  • Use the Z-table to find the Z-score that
    corresponds to the shaded area.
  • Obtain the normal value from the fact that
    X = μ + Zσ.
  • We will take a look at some examples.
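As one possible example of this reverse procedure, the sketch below finds the value with a given area to its left by looking up the Z-score and then unstandardizing with X = μ + Zσ. The 90th percentile and the function name are hypothetical choices; N(266, 16) is taken from the pregnancy example later in the deck.

```python
from statistics import NormalDist

def value_for_left_area(p, mu, sigma):
    """Value x of an N(mu, sigma) variable with area p to its left."""
    z = NormalDist().inv_cdf(p)   # Z-score for the shaded area (replaces the Z-table)
    return mu + z * sigma         # unstandardize: X = mu + Z * sigma

x90 = value_for_left_area(0.90, mu=266, sigma=16)
print(round(x90, 1))
```

If the specified area lies to the right of the value instead, pass `1 - area` as `p`.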

[Figure: standard normal curve with the shaded area and the Z-score z marked]
16
Normal Curves on Board
17
Section 7.4 Assessing Normality
Suppose that we obtain a simple random sample
from a population whose distribution is unknown.
Many of the statistical tests that we perform on
small data sets (sample size less than 30)
require that the population from which the sample
is drawn be normally distributed. So, how do we
know if a data set comes from a normal
distribution?
We will use a normal probability plot to answer
the above question. This plot is also called a
normal quantile plot. A normal probability plot
plots observed data versus normal scores. A
normal score is the expected Z-score of the data
value if the distribution of the random variable
is normal. If sample data are taken from a
population that is normally distributed, a normal
probability plot of the actual values versus the
expected Z-scores will be approximately linear.
(Fat pencil test)
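To make "normal scores" concrete, here is a minimal sketch that computes expected Z-scores and measures how linear the plot would be. It assumes the plotting position f_i = (i − 0.375)/(n + 0.25), one common convention that software packages may or may not follow exactly; the sample data and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean

def normal_scores(n):
    """Expected Z-scores for the ordered values of a sample of size n.

    Assumes the plotting position f_i = (i - 0.375) / (n + 0.25).
    """
    std = NormalDist()
    return [std.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Linear correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample; sort it so the i-th value pairs with the i-th score.
sample = sorted([31, 35, 36, 38, 40, 41, 43, 46, 49, 52])
scores = normal_scores(len(sample))
r = pearson_r(scores, sample)

for value, score in zip(sample, scores):
    print(f"{value:5d}  {score:7.3f}")
print("correlation:", round(r, 3))
```

A correlation close to 1 corresponds to points that would fall inside the "fat pencil"; the plot itself is still the recommended way to judge.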
18
The book explains in detail how to draw a normal
probability plot manually. We will not do this
by hand. We will use JMP to draw these plots for
us.
How to Obtain a Normal Probability Plot from JMP
  • Click on Analyze and then Distribution.
  • Select a column heading for Y columns.
  • Click OK. You will obtain a histogram and other
    output.
  • On the output screen, find the red down
    triangle and select Normal Quantile Plot.
This will yield a plot that can be used to test
for normality.
19
Example Use the following normal probability
plots to assess whether the sample data could
have come from a population that is normally
distributed.
[Figure: two normal probability plots, labeled "Normal" and "Evidence NOT Normal"]
20
[Figure: two more normal probability plots, labeled "Normal" and "Evidence NOT Normal"]
21
Section 7.5 Sampling Distributions And The
Central Limit Theorem
In general, the sampling distribution of a
statistic is a probability distribution (such as
the normal distribution) for all possible values
of the statistic computed from a sample of size
n. The sampling distribution of the sample mean x̄
is the probability distribution of all possible
values of the random variable x̄ computed from a
sample of size n from a population with mean μ
and standard deviation σ.
22
  • The idea behind obtaining the sampling
    distribution of the mean is as follows
  • 1. Obtain a simple random sample of size n.
  • 2. Compute the sample mean.
  • 3. Assuming that we are sampling from a finite
    population, repeat steps 1 and 2 until all
    simple random samples of size n have been
    obtained.

If population size N = 100 and sample size n = 5,
what is the number of possible samples of size 5?
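The count can be computed directly. This sketch assumes samples are drawn without replacement and that order does not matter, so the answer is the number of combinations of 100 items taken 5 at a time.

```python
import math

N = 100  # population size
n = 5    # sample size

# Number of distinct simple random samples (order ignored, no replacement).
number_of_samples = math.comb(N, n)
print(number_of_samples)  # 75287520
```

With more than 75 million possible samples, listing them all by hand is impractical, which motivates describing the sampling distribution through its mean, standard deviation, and shape instead.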
23
Since each sample of size n will have an observed
value of x̄, and not all observed values will
be exactly the same, x̄ is a random variable.
Since x̄ is a random variable, we can ask
the following questions: What is E(x̄)?
What is Var(x̄)? What is the
distribution of x̄?
24
The Mean and Standard Deviation of the Sampling
Distribution of x̄. Suppose that a simple
random sample of size n is drawn from a
population with mean μ and standard deviation σ.
The sampling distribution of x̄ will have
mean μ_x̄ = μ and standard deviation σ_x̄ = σ/√n. The
standard deviation of the sampling distribution
of x̄, σ_x̄, is called the standard error of
the mean.
Now we have answered the questions:
What is E(x̄)? E(x̄) = μ (population mean)
What is Var(x̄)? Var(x̄) = σ²/n (population variance / sample size)
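These two results can be verified by brute force on a tiny population. The sketch below enumerates every possible sample and assumes sampling with replacement (equivalently, independent draws), which is the setting in which Var(x̄) = σ²/n holds exactly; the five-value population is hypothetical.

```python
from itertools import product
from statistics import mean, pvariance

population = [2, 4, 6, 8, 10]   # hypothetical population
n = 2                           # sample size

mu = mean(population)           # population mean
sigma2 = pvariance(population)  # population variance

# Every possible ordered sample of size n, drawn with replacement.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)           # E(x-bar)
variance_of_means = pvariance(sample_means)  # Var(x-bar)

print(mu, mean_of_means)
print(sigma2 / n, variance_of_means)
```

When sampling without replacement from a small finite population, the variance of x̄ is somewhat smaller than σ²/n; the difference becomes negligible when the population is large relative to the sample.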
25
What about the distribution of x̄?
What happens if the distribution of X is not
normal?
26
CENTRAL LIMIT THEOREM Suppose a random variable
X has a population mean μ and standard deviation
σ and that a random sample of size n is taken
from this population. Then the sampling
distribution of x̄
becomes approximately normal as the sample size
n increases. The mean of the distribution is
μ_x̄ = μ and the standard deviation is
σ_x̄ = σ/√n.
Let us visualize this.
27
[Figure panels: 100 random draws; n = 5; n = 25; n = 100]
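A small simulation can stand in for the picture. This sketch is not the simulation behind the slide's figure: it assumes a skewed population (exponential with mean 1 and standard deviation 1), a fixed seed, and 2000 repetitions per sample size, all of which are illustrative choices.

```python
import math
import random
from statistics import mean, pstdev

def simulate_sample_means(n, repetitions=2000, seed=0):
    """Draw `repetitions` samples of size n and return their sample means.

    The population is exponential with mean 1 and standard deviation 1,
    chosen here only because it is clearly not normal.
    """
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(repetitions)]

results = {n: simulate_sample_means(n) for n in (5, 25, 100)}

for n, means in results.items():
    print(f"n = {n:3d}: mean of x-bar = {mean(means):.3f}, "
          f"sd of x-bar = {pstdev(means):.3f}, "
          f"sigma / sqrt(n) = {1 / math.sqrt(n):.3f}")
```

Plotting a histogram of each list in `results` would show the shape moving from right-skewed toward bell-shaped as n grows, while the spread shrinks roughly like σ/√n.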
28
  • When is n large enough to assume normality?
  • The required size of n depends on how close to
    normal the original population is.
  • If the population is normal, n = 1 is large
    enough.
  • As a rule of thumb, we will use n = 30 as
    sufficiently large.

Hence, when n ≥ 30 the sampling distribution of
x̄ will be approximately normal.
29
Example: The length of human pregnancies is
approximately normally distributed with a mean of
266 days and standard deviation of 16 days.
  • 1. What is the probability a randomly selected
    pregnancy lasts less than 260 days?
  • 2. What is the probability that a random sample
    of 20 pregnancies has a mean gestation period of
    260 days or less?
  • 3. What is the probability that a random sample
    of 50 pregnancies has a mean gestation period of
    260 days or less?
  • 4. What might you conclude if a random sample of
    50 pregnancies resulted in a mean gestation
    period of 260 days or less?

30
  • 1. What is the probability a randomly selected
    pregnancy lasts less than 260 days?
  • 2. What is the probability that a random sample
    of 20 pregnancies has a mean gestation period of
    260 days or less?

31
  • 3. What is the probability that a random sample
    of 50 pregnancies has a mean gestation period of
    260 days or less?
  • 4. What might you conclude if a random sample of
    50 pregnancies resulted in a mean gestation
    period of 260 days or less?
We might conclude that the population from which
the 50 pregnancies were drawn has a mean
gestation period less than 266 days. (Only a 0.4%
chance)
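The worked answers for this example are not in the transcript, so the following is a sketch of how the three probabilities could be computed rather than a reproduction of the slides. It uses the stated mean of 266 days and standard deviation of 16 days, with standard error σ/√n for a sample mean; the function name is an illustrative assumption.

```python
import math
from statistics import NormalDist

mu, sigma = 266, 16  # population mean and standard deviation, in days

def prob_mean_at_most(x, n):
    """P(x-bar <= x) for a sample of size n from N(mu, sigma)."""
    standard_error = sigma / math.sqrt(n)
    return NormalDist(mu, standard_error).cdf(x)

p_single = prob_mean_at_most(260, n=1)   # one randomly selected pregnancy
p_n20 = prob_mean_at_most(260, n=20)     # mean of 20 pregnancies
p_n50 = prob_mean_at_most(260, n=50)     # mean of 50 pregnancies

print(round(p_single, 4), round(p_n20, 4), round(p_n50, 4))
```

The third probability is about 0.004, which agrees with the 0.4% chance quoted on the slide. Because the population itself is approximately normal, the normal model for x̄ is reasonable even for n = 20.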
32
Section 7.6 The Normal Approximation To the
Binomial Probability Distribution
  • Criteria for a Binomial Probability Experiment
  • A probability experiment is said to be a binomial
    experiment if all the following are true
  • The experiment is performed n independent
    times. Each repetition of the experiment is
    called a trial.
  • Independence means that the outcome of one
    trial will not affect the outcome of the other
    trials.
  • For each trial, there are two mutually exclusive
    outcomes, success or failure.
  • The probability of success, p, is the same for
    each trial of the experiment.

33
When we were dealing with probabilities for the
binomial distribution, we only set up the
expression, since evaluating it is
mathematically tedious. However, we now have a
way to approximate those probabilities.
p.101
As the number of trials n in a binomial
experiment increases, the probability
distribution of the random variable X becomes
more nearly symmetric and bell-shaped. As a
general rule of thumb, if np > 5 and nq > 5, then
the probability distribution will be
approximately symmetric and bell-shaped.
(or as in the text book, npq ≥ 10)
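The rule of thumb is easy to check in code. The sketch below evaluates both criteria mentioned on this slide, reading the textbook version as npq ≥ 10 (an assumption, since the comparison symbol did not survive in the transcript). The function name and dictionary keys are hypothetical.

```python
def normal_approx_checks(n, p):
    """Evaluate the rule-of-thumb conditions for the normal approximation."""
    q = 1 - p
    return {
        "np": n * p,
        "nq": n * q,
        "npq": n * p * q,
        "np_and_nq_gt_5": n * p > 5 and n * q > 5,
        # Assumed reading of the textbook criterion: npq >= 10.
        "npq_ge_10": n * p * q >= 10,
    }

# n and p taken from examples elsewhere in the deck.
softball = normal_approx_checks(100, 0.45)
comparison = normal_approx_checks(200, 0.1)
# A hypothetical case where the approximation should not be trusted.
too_small = normal_approx_checks(20, 0.1)

print(softball)
print(comparison)
print(too_small)
```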
34
The Normal Approximation to the Binomial
Probability Distribution. If np > 5 and nq > 5,
where q = 1 − p, then the binomial random variable
X is approximately normally distributed with mean
μ_X = np and standard deviation σ_X = √(npq).
What is the major difference between a binomial
random variable and a normal random variable? A
binomial random variable is a discrete random
variable and a normal random variable is a
continuous random variable.
Therefore, since we are using a continuous
density function to approximate a discrete
probability, we must apply a correction for
continuity. The continuity correction says that
we subtract 0.5 from and add 0.5 to every value
of x.
35
Continuity Correction
P(X = x) ≈ P(x − 0.5 < X < x + 0.5)
36
[Figure: exact binomial probability compared with its normal approximation]
If n = 200 and p = 0.1, then Exact = 0.6146 and
Approx = 0.6256.
37
Example: Suppose a softball player safely
reaches base 45% of the time. Assuming at-bats
are independent events, use the normal
approximation to the binomial to approximate the
probability that, in the next 100 at-bats,
where X ~ Bin(100, 0.45):
  • The player reaches base safely exactly 50 times.
  • The player reaches base safely 60 or more times.
  • The player reaches base safely 50 or fewer times.
  • The player reaches base safely between 60 and 90
    times, inclusive.

38
  • The player reaches base safely exactly 50 times.
  • The player reaches base safely 60 or more times.

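[Figure: normal curve centered at 45 with the region around x = 50 shaded, z = 0.91 to z = 1.11]
[Figure: normal curve centered at 45 with the region from x = 60 upward shaded, z = 2.92]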
39
  • 3. The player reaches base safely 50 or fewer
    times.
  • 4. The player reaches base safely between 60 and
    90 times, inclusive.
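The four softball questions can be worked through in code. The sketch below is not the slides' worked solution: it applies the continuity correction with mean np and standard deviation √(npq), and compares each approximation with the exact binomial sum. The helper names are hypothetical.

```python
import math
from statistics import NormalDist

n, p = 100, 0.45
mu = n * p                             # mean of X: np
sigma = math.sqrt(n * p * (1 - p))     # standard deviation of X: sqrt(npq)
approx_dist = NormalDist(mu, sigma)

def binom_pmf(k):
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def exact_between(lo, hi):
    """Exact P(lo <= X <= hi)."""
    return sum(binom_pmf(k) for k in range(lo, hi + 1))

def approx_between(lo, hi):
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    return approx_dist.cdf(hi + 0.5) - approx_dist.cdf(lo - 0.5)

# "60 or more" and "50 or fewer" are bounded by 100 and 0 here; the
# neglected tail area beyond those bounds is negligible.
questions = {
    "exactly 50": (50, 50),
    "60 or more": (60, 100),
    "50 or fewer": (0, 50),
    "between 60 and 90": (60, 90),
}

answers = {name: (approx_between(lo, hi), exact_between(lo, hi))
           for name, (lo, hi) in questions.items()}

for name, (approx, exact) in answers.items():
    print(f"{name:18s} approx = {approx:.4f}   exact = {exact:.4f}")
```

The Z-scores behind the first two approximations are close to the 0.91, 1.11, and 2.92 shown on the slides; small differences can arise from rounding the standard deviation before dividing.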