Title: Using the 68-95-99.7 Rule; Normal Quantile Plots
1Using the 68-95-99.7 Rule; Normal Quantile Plots
2Learning Objectives
- By the end of this lecture, you should be able to:
  - Do various calculations involving areas under the density curve using the 68-95-99.7 rule
  - Identify the mathematical technique used to help confirm (though not guarantee!) that our distribution is indeed Normal.
3A few numbers worth memorizing (though not just yet)
- Because we use the Normal distribution SO much, it is worth memorizing the approximate areas from the Normal table that correspond to a few different z-scores.
- I say approximate, because the values are rounded off.
- Look at the areas shown here (each is the area to the left of the given z-score), but don't memorize them just yet.
  - z = -2 → about 2.3%
  - z = -1 → about 16%
  - z = 1 → about 84%
  - z = 2 → about 98%
- What I do want you to memorize are the 3 numbers shown in a famous rule on the next slide.
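These table values can also be reproduced in software. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist` class; the lecture itself uses the printed Normal table, so the variable names here are illustrative only.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Area to the LEFT of each z-score, expressed as a percentage.
areas_below = {z: 100 * standard_normal.cdf(z) for z in (-2, -1, 1, 2)}

for z, pct in areas_below.items():
    print(f"z = {z:>2}: about {pct:.1f}% of observations lie below")
```

Rounding these percentages gives the approximate values listed on the slide.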
4The 68-95-99.7 Rule for Normal Distributions
- This is essentially a shortcut for a mental ballpark of the areas under the Normal curve. It is definitely worth memorizing.
  - The area between -1 and 1 standard deviations corresponds to about 68% of the observations.
  - The area between -2 and 2 standard deviations corresponds to about 95% of the observations.
  - The area between -3 and 3 standard deviations corresponds to about 99.7% of the observations.
You WILL be asked to use these numbers on quizzes and exams. Please note that on your exams you will not be provided with the three numbers (68, 95, 99.7).
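To see where the three numbers come from, here is a small sketch that computes the area within 1, 2, and 3 standard deviations of the mean. It assumes Python's standard-library `statistics.NormalDist`; the helper name `percent_within` is made up for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0, standard deviation 1

def percent_within(k):
    """Percentage of a Normal distribution within k standard deviations of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))

rule_check = {k: percent_within(k) for k in (1, 2, 3)}

for k, pct in rule_check.items():
    print(f"within {k} SD of the mean: {pct:.2f}%")
```

The exact areas are slightly different from 68, 95, and 99.7, which is why the rule is described as an approximation.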
5Examples: The 68-95-99.7 Shortcut Rule for Normal Distributions
The z = 0 line (black line) is very helpful in doing many of these calculations.
- Now let's play around with these numbers by answering some questions. All numbers refer to z-scores (i.e. standard deviations).
- What percentage of observations lie between -1 and 1?
  - Answer: As we just discussed, the percentage of observations between -1 and 1 standard deviations is 68%.
- What percentage lie between 0 and 1?
  - Answer: Recall that z = 0 marks the 50% point, and the curve is symmetric around it. So, if -1 to 1 is 68%, then 0 to 1 is half of that, which is 34%.
  - This is an important one. Make sure you understand how to do it!!
  - There are a few ways to think of it: look at the area between z = 0 (the black line) and z = 1. Note that it is half of the area between -1 and 1.
  - If you need to visualize it (and you should!!), then shade in the area between z = 0 and z = 1.
- What percentage of observations lie below 1?
  - Answer: To do this, look at your z = 0 line. Make sure you recognize that the area to the left of z = 0 represents 50% of observations. Now, how many observations are between 0 and 1? Recall from the previous question that this is 34%. Therefore, from 0 to 1 is 34%, and below 0 is 50%, so the area to the left of 1 represents 84% of observations.
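The midline reasoning above can be written out as a short sketch. It uses only the three memorized numbers, not a z-table; the names `HALF_AREA`, `percent_between`, and `percent_below` are invented for this illustration and only whole-number z-scores from -3 to 3 are supported.

```python
# Percent of observations between z = 0 and z = k, taken from the 68-95-99.7 rule.
# Each entry is half of the corresponding symmetric area.
HALF_AREA = {0: 0.0, 1: 68 / 2, 2: 95 / 2, 3: 99.7 / 2}

def signed_area_from_zero(z):
    """Area between 0 and z: positive to the right of 0, negative to the left."""
    half = HALF_AREA[abs(z)]
    return half if z >= 0 else -half

def percent_between(low, high):
    """Shortcut-rule percentage between two integer z-scores in [-3, 3]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return signed_area_from_zero(high) - signed_area_from_zero(low)

def percent_below(z):
    """Shortcut-rule percentage to the left of an integer z-score in [-3, 3]."""
    return 50 + signed_area_from_zero(z)

print(percent_between(-1, 1))  # between -1 and 1
print(percent_between(0, 1))   # between 0 and 1
print(percent_below(1))        # below 1
```

The same two functions handle any other pair of whole-number z-scores in that range, because every such area is a sum or difference of the half-areas around z = 0.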
6Examples: The 68-95-99.7 Shortcut Rule for Normal Distributions
- More examples
- What percentage of observations lies between -2 and 1?
  - Answer: Use your midline! I would solve this by adding the area between -2 and 0 (half of 95%) to the area between 0 and 1 (half of 68%): 47.5% + 34% = 81.5%.
- What percentage of observations lies between 0 and 3?
  - Answer: Half of the area between -3 and 3 (99.7%), which is 49.85%.
- What percentage of observations lies below -2?
  - Answer: While this too can be answered in a few different ways, I would like you to make sure you can do it this way:
    - Look at the area between -2 and 2. Our shortcut tells us that this contains 95% of observations.
    - This means that the area above 2 and the area below -2 together comprise 5% of observations. Because the curve is symmetric, the area above 2 comprises 2.5% of observations, and the area below -2 also comprises 2.5% of observations.
    - Answer: 2.5%
- What percentage of observations lies above 3?
  - Answer: Use the same technique as was just discussed:
    - Between -3 and 3 makes up 99.7%.
    - Therefore below -3 and above 3 together make up 0.3%.
    - Therefore below -3 is 0.15% and above 3 is 0.15%.
7Examples: The 68-95-99.7 Shortcut Rule for Normal Distributions
- One more!
- What percentage of observations lies below 2 standard deviations?
  - Answer: Repeat the process from before to determine the area in each tail (above 2 and below -2). That value was 2.5%. If 2.5% of values lie above 2, then 97.5% of observations lie below it.
  - Answer: 97.5%
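The tail technique used in these answers (take what is left outside the central area, then split it evenly between the two tails) can be sketched as follows. The names `WITHIN`, `tail_percent`, and `percent_below_positive_z` are hypothetical helpers for this illustration.

```python
# Percent of observations within ±k standard deviations, from the 68-95-99.7 rule.
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}

def tail_percent(k):
    """Percent in ONE tail beyond k standard deviations (k = 1, 2, or 3)."""
    outside_both_tails = 100 - WITHIN[k]
    # The Normal curve is symmetric, so each tail gets half of what is left over.
    return outside_both_tails / 2

def percent_below_positive_z(k):
    """Percent below z = +k: everything except the upper tail."""
    return 100 - tail_percent(k)

print(f"below -2: {tail_percent(2):.2f}%")
print(f"above  3: {tail_percent(3):.2f}%")
print(f"below  2: {percent_below_positive_z(2):.2f}%")
```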
8The 68-95-99.7 Shortcut Rule for Normal Distributions
Women's heights, in inches: mean µ = 64.5, standard deviation σ = 2.5, so N(µ, σ) = N(64.5, 2.5). On the curve, the inflection points lie one standard deviation on either side of the mean.
- What percentage of women are between 62 and 67 inches tall?
  - Answer: Corresponds to -1 to 1 SDs, that is, about 68%.
- What is the range of heights between which about 95% of women fall?
  - Answer: About -2 to 2 SDs, so about 59.5 to 69.5 inches tall.
- What is the range of heights between which nearly all (over 99%) of women fall?
  - Answer: A quick answer would simply be to pick the -3 to 3 SD range (57 to 72 inches).
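The height ranges in these answers come from stepping out whole standard deviations from the mean. A minimal sketch of that arithmetic, using the mean and standard deviation given on the slide (the function names are illustrative):

```python
MEAN_HEIGHT = 64.5  # inches, from the slide
SD_HEIGHT = 2.5     # inches, from the slide

def height_range(k):
    """Heights lying within k standard deviations of the mean, as (low, high)."""
    return (MEAN_HEIGHT - k * SD_HEIGHT, MEAN_HEIGHT + k * SD_HEIGHT)

def z_score(height):
    """How many standard deviations a height lies above (+) or below (-) the mean."""
    return (height - MEAN_HEIGHT) / SD_HEIGHT

for k, pct in ((1, "68%"), (2, "95%"), (3, "99.7%")):
    low, high = height_range(k)
    print(f"about {pct} of women are between {low} and {high} inches tall")

print(z_score(62), z_score(67))  # the z-scores of 62 and 67 inches
```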
9The 68-95-99.7 Shortcut Rule for Normal Distributions
- More Examples
- What percentage are taller than 67 inches?
  - Answer: If 68% of all women are between 62 and 67 inches tall, this means that 32% are outside of that range. In other words, 16% are shorter than 62 inches, and 16% are taller than 67.
- What percentage are shorter than 59.5 inches?
  - Answer: If 95% of all women are between 59.5 and 69.5, then 5% are outside of that range. In other words, 2.5% are shorter than 59.5 and 2.5% are taller than 69.5.
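For comparison with the shortcut answers, the same two tail areas can be computed directly from the N(64.5, 2.5) distribution. This is a sketch assuming Python's standard-library `statistics.NormalDist`; it is a check on the ballpark figures, not a replacement for the reasoning above.

```python
from statistics import NormalDist

# Women's heights in inches, with the mean and standard deviation from the slide.
heights = NormalDist(mu=64.5, sigma=2.5)

# Upper tail: everything to the right of 67 inches.
pct_taller_than_67 = 100 * (1 - heights.cdf(67))

# Lower tail: everything to the left of 59.5 inches.
pct_shorter_than_59_5 = 100 * heights.cdf(59.5)

print(f"taller than 67 in:    {pct_taller_than_67:.2f}%  (shortcut: 16%)")
print(f"shorter than 59.5 in: {pct_shorter_than_59_5:.2f}%  (shortcut: 2.5%)")
```

The computed values land close to 16% and 2.5% but not exactly on them, since the rule rounds the underlying areas.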
10Shortcut Rule or Z-Table?
- Students have often been confused as to which should be used.
- Whenever possible, use your z-table, as you will get a much more accurate result. In particular, if you are given z-scores that are not anywhere near whole numbers (e.g. 2.332), then there is no shortcut to use! The shortcut can only be used with whole (integer) numbers between -3 and 3.
- The main purpose of learning the shortcut rule (in addition to the fact that it comes up on all kinds of exams) is to encourage you to develop an understanding of what you are trying to do rather than just jumping to calculators and z-tables.
- For this course, you will be asked to do both.
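As a sketch of the decision described above: the shortcut applies only to whole-number z-scores from -3 to 3, and anything else needs a table lookup (or its software equivalent). The code assumes Python's standard-library `statistics.NormalDist` as a stand-in for the z-table, and both function names are hypothetical.

```python
from statistics import NormalDist

def shortcut_applies(z):
    """True only for whole-number z-scores from -3 to 3, as stated on the slide."""
    return float(z).is_integer() and -3 <= z <= 3

def percent_below_exact(z):
    """Area to the left of z under the standard Normal curve, as a percentage."""
    return 100 * NormalDist().cdf(z)

z = 2.332  # the example z-score from the slide
print(shortcut_applies(z))  # False: no shortcut for this value
print(f"{percent_below_exact(z):.2f}% of observations lie below z = {z}")
```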
11Is the distribution truly Normal?
- Deciding whether data does indeed show a Normal (or close to Normal) distribution is a very important question.
- All the examples we've been discussing above involving z-scores assume that the data is Normal. If the data were not Normal, all of our answers and calculations would be flawed.
- Recall that there are many other types of distributions that are not Normal. Some examples include skewed, bimodal, Binomial (later in the quarter), Poisson, etc.
- Each type of distribution has its own characteristic formulas, calculations, inference techniques, etc. Again, because the Normal distribution is one of the most commonly encountered distributions, we will spend lots of time discussing it.
- So how do you decide if a distribution is Normal?
  - You might be tempted to say "look at a graph." And this is not entirely false: when examining data, a chart is a great (if not the BEST) place to start!
  - However, as humans, we are easily fooled. There are many histograms (and related density curves) that look Normal, but in fact are not.
  - Fortunately, we do have a statistical test that can help confirm (though not guarantee) that our dataset does indeed appear to be Normal.
12Normal Quantile Plot
- The Normal Quantile plot is a graph that helps us determine if a distribution is indeed Normal.
- It is a mathematical plot that we can create using our statistical software package of choice.
- Here is the method (which is provided for interest only):
  - The data points are ranked and the percentile ranks are converted to z-scores with Table A. The z-scores are then used for the x axis against which the data are plotted on the y axis of the normal quantile plot.
  - If the distribution is indeed Normal, the plot will show a fairly straight line, indicating a good match between the data and a Normal distribution.
  - Systematic deviations from a straight line indicate a non-Normal distribution. Outliers appear as points that are far away from the overall pattern of the plot.
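The method described above can be sketched in a few lines. This is an illustration only: it uses Python's standard-library `NormalDist.inv_cdf` in place of Table A, it assumes the percentile rank of the i-th smallest of n values is (i - 0.5) / n (statistical packages use several slightly different conventions), and the pH readings are made-up numbers, not the rainwater data from the lecture.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (z, value) pairs: the x and y coordinates of a Normal quantile plot."""
    ordered = sorted(data)  # step 1: rank the data points
    n = len(ordered)
    standard_normal = NormalDist()
    points = []
    for rank, value in enumerate(ordered, start=1):
        # Step 2: percentile rank of this observation (assumed convention).
        percentile = (rank - 0.5) / n
        # Step 3: convert the percentile to a z-score (the job Table A does by hand).
        z = standard_normal.inv_cdf(percentile)
        # Step 4: z goes on the x axis, the data value on the y axis.
        points.append((z, value))
    return points

# Hypothetical pH readings, invented for this example.
sample_ph = [5.1, 4.8, 5.6, 5.3, 5.4, 5.9, 5.7, 6.2]
points = normal_quantile_points(sample_ph)

for z, value in points:
    print(f"z = {z:6.3f}   value = {value}")
```

Passing these pairs to any plotting tool as x and y coordinates gives the quantile plot; a roughly straight pattern is what the slide describes as a good match to a Normal distribution.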
13Normal quantile plot shows a good fit to a straight line: the distribution of rainwater pH values is close to Normal.
Normal quantile plot is not a straight line. This tells us that the data do not follow a Normal distribution.
Normal quantile plots are complex to do by hand, but they are standard features in most statistical software.
14The normal quantile test supports normality, but does NOT guarantee it!
- Two key points here:
  - If the plot is NOT straight, then your data is NOT Normal!
  - If the plot IS straight, then you have supported the idea that your dataset is Normal. However, you have NOT guaranteed it!
- This concept (confirming / supportive tests) will come up with various other statistical concepts down the road. Whenever you encounter them, you should be sure to make use of them.
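One rough way to put a number on "how straight" a quantile plot is, offered here only as a companion to looking at the plot and not as something the lecture prescribes, is the correlation between the z-scores and the sorted data. The sketch below assumes the (i - 0.5) / n percentile convention and uses two invented datasets; no cutoff value is implied, and a correlation near 1 supports Normality without guaranteeing it.

```python
from math import sqrt
from statistics import NormalDist

def quantile_plot_correlation(data):
    """Correlation between Normal z-scores and the sorted data (1.0 = perfectly straight)."""
    ordered = sorted(data)
    n = len(ordered)
    # z-score for each rank, using the assumed (i - 0.5) / n percentile convention.
    zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_z = sum(zs) / n
    mean_x = sum(ordered) / n
    covariance = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, ordered))
    spread_z = sum((z - mean_z) ** 2 for z in zs)
    spread_x = sum((x - mean_x) ** 2 for x in ordered)
    return covariance / sqrt(spread_z * spread_x)

# Both datasets are hypothetical, made up for this illustration.
roughly_symmetric = [61, 62.5, 63.5, 64, 64.5, 65, 65.5, 66.5, 68]
right_skewed = [1, 1, 2, 2, 3, 4, 6, 10, 25]

print(f"roughly symmetric data: {quantile_plot_correlation(roughly_symmetric):.3f}")
print(f"right-skewed data:      {quantile_plot_correlation(right_skewed):.3f}")
```

The skewed dataset bends away from a straight line on the plot, and its correlation comes out noticeably lower than that of the symmetric dataset.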