Title: Continuous Distributions
1Continuous Distributions
2General Properties of Continuous Random Variables
- Discrete random variables
- Can assume only a finite or countably infinite
number of distinct values.
- Typically involve counting something, such as the
number of occurrences.
- We can list all the possible outcomes.
- It is meaningful to consider the probability that a
particular value will occur.
3General Properties of Continuous Random Variables
- Continuous random variables
- Can assume an uncountably infinite number of
values (we cannot list all the possible outcomes
because there is always a number in between any
two of its values).
- Can assume any value in the interval between two
points a and b (a < x < b).
- Typically involve measuring attributes such as
weight, length, time, or temperature.
- The probability that a continuous random variable
will assume any particular value is zero.
4Continuous Probability Distributions
- A continuous random variable has an uncountably
infinite number of values in the interval (a,b).
As the number of values increases the probability
of each value decreases. This is so because the
sum of all the probabilities remains 1.
- The probability that a continuous variable X will
assume any particular value is zero. Why?
When the number of values approaches infinity
(because X is continuous) the probability of
each value approaches 0.
[Figure: the probability of each value when a total probability of 1 is split evenly: two values get 1/2 each, three values get 1/3 each, four values get 1/4 each.]
5Continuous Probability Distributions
- The probability distribution of a continuous
random variable, X, is described by a continuous
probability density function, f(x). A
probability density function must satisfy 2
requirements.
- Requirement 1
- $f(x) \ge 0$ for all possible x values, where the
range of possible values is $-\infty$ to $\infty$.
- The cumulative probability distribution function,
F(x), is defined as
- $F(x) = \int_{-\infty}^{x} f(u)\,du$
- which gives the probability that the random
variable will take a value less than or equal
to x.
- Consider Figure 4.1
- Suppose f(x) is the curve. The probability of any
event is equal to the area above the x-axis and
below the f(x) curve over the interval that defines the event.
- Consider the event A = {13 < x < 15}: the shaded
area is the probability of event A.
6Continuous Probability Distributions
- Requirement 2
- The total area under f(x) over the range of
possible values must equal 1:
- $\int_{-\infty}^{\infty} f(x)\,dx = 1$
- Thus, if the possible values lie between a and b,
the area under f(x) from a to b must equal 1.
- Other differences between discrete and continuous
probability distribution functions
- For a continuous random variable, f(x) measures the
intensity or density of the probability mass at x
and in the "neighborhood" of x.
7Continuous Probability Distributions
- What does this mean?
- The higher the value of f(x), the more likely it is
that values in the neighborhood of x will occur.
- What about before we perform the experiment?
- The probability that a continuous random variable
X will assume any particular value is zero.
- When we define an event, we must define a
continuous interval of length greater than zero.
This is different from discrete random variables,
which have probability mass defined only at
specific and unique x values.
8Continuous Probability Distributions
- Other unique features of continuous random variables
- There are an uncountably infinite number of
outcomes over any interval of length greater than
zero. Why?
- Because continuous random variables are defined
over a continuum. So, the only meaningful events
for a continuous random variable are intervals.
- It is meaningful only to talk about the
probability that the value assumed by X will fall
within some interval of values.
9Continuous Probability Distributions
- When dealing with continuous data, we attempt to
find a function f(x), called a probability
density function, whose graph approximates the
relative frequency polygon for the population.
- A probability density function f(x) must satisfy
2 conditions
- f(x) is nonnegative.
- The total area under the curve representing f(x)
equals 1.
- Note that f(x) is not a probability.
- The probability that X will take on any specific
value is zero.
- Consider the following figure (7.2 from Keller)
- The area under the graph of f(x) between the two
values a and b is the probability that X will
take a value between a and b.
- $P(a < X < b) = \int_a^b f(x)\,dx$
10- To calculate probabilities we define a
probability density function f(x).
[Figure: density curve f(x) with total area 1; the shaded area between a and b is P(a < x < b).]
11Uniform Distribution
- A random variable X is said to be uniformly
distributed if its density function is
$f(x) = \frac{1}{b - a}$ for $a \le x \le b$.
12Uniform Distribution
- The expected value of X is the midpoint of the
domain of X.
- The values of a uniform random variable X are
evenly distributed.
- Intervals of equal size within the domain of X
are equally likely to contain the value that the
variable X will assume.
- The area under the uniform density function f(x)
equals 1.
[Figure: uniform density f(x), a rectangle of height 1/(b - a) between x = a and x = b, with area A = 1.]
13Example 7.1
- The time that elapses between the placement of an
order and its delivery is uniformly
distributed between 100 and 180 minutes.
- Define the graph and the density function.
- What proportion of orders takes between 2 and 2.5
hours (120 to 150 minutes) to deliver?
f(x) = 1/80 for $100 \le x \le 180$
P(120 < X < 150) = (base)(height) = (150 - 120)(1/80) = .375
[Figure: uniform density of height 1/80 between x = 100 and x = 180, with the area between 120 and 150 shaded.]
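The base-times-height calculation in Example 7.1 can be checked with a few lines of Python. This is a minimal sketch; the helper name `uniform_interval_prob` is an assumption made for illustration and does not come from the slides.

```python
def uniform_interval_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on [a, b], computed as base times height."""
    lo, hi = max(lo, a), min(hi, b)  # clip the interval to the domain of X
    if hi <= lo:
        return 0.0
    return (hi - lo) * (1.0 / (b - a))

# Delivery time is uniform between 100 and 180 minutes (Example 7.1).
p = uniform_interval_prob(120, 150, a=100, b=180)
print(round(p, 3))  # slide value: .375
```

Clipping the interval to [a, b] means a query that reaches outside the domain still returns a valid probability.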
14The Normal Distribution: a bell-shaped
distribution, symmetrical around $\mu$
- The function is symmetric about the mean, $\mu$.
- 68.26% of all events occur within 1 standard
deviation of the mean,
- 95.44% within $2\sigma$,
- 99.74% within $3\sigma$.
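The three percentages can be reproduced from the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `within` is an assumption for illustration. The slide figures come from a four-decimal table, so the computed values differ from them in the last digit.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within(k):
    """P(-k < Z < k): probability of falling within k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within(k), 2))
```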
15The Normal Distribution
- The normal is certainly the most well known
continuous distribution, but why is it so
important?
- The normal distribution occurs naturally: there
are many things in the physical world that are
distributed normally, such as
- The heights or weights of a group of people
- The total annual sales of a company
- The grades of a class of students
- The measurement errors that arise in the
performance of an experiment.
- Certain other random variables can be
approximated by a normal distribution.
- Some random variables that are not even
approximately normally distributed can easily be
transformed into normally distributed random
variables.
16The Normal Distribution
- Many results and analysis techniques that are
useful in statistical work are strictly correct
only when the associated random variables are
normally distributed.
- Even if the distribution of the original
population is far from normal, the distribution
associated with the sample averages from this
population tends to become normal, under a wide
variety of conditions, as the size of the sample
increases.
- The normal distribution is the cornerstone of
statistical inference, representing the distribution of
possible estimates of a population parameter that
may arise from different samples.
- We can see from the formula for the probability
density function that a normal distribution is
completely determined once the parameters $\mu$ and
$\sigma^2$ are specified.
- All normal distributions have the same
bell-shaped curve, but the location of the mean
can change or the variance can change.
17Normal Distribution
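- A random variable X with mean $\mu$ and variance $\sigma^2$
is normally distributed if its probability
density function is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$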
As is the case with any other probability density
function, the value of f(x) is not the
probability that X assumes the value x, but an
expression of the height of the curve at the
value x. The entire area under the curve
depicting f(x) must equal 1.
18Normal random variable
1. A random variable that is normally distributed
2. Can assume any real value from $-\infty$ to $\infty$
How do we use the normal distribution?
1. We must determine that the process to be considered is
approximately normal.
2. Find the various normal
probabilities, which are represented by areas
under the normal curve.
19Two methods of solving "normal" problems
1. Computer software
2. The standard normal distribution, where $\mu = 0$ and $\sigma = 1$
To use the standard normal, we use the z transform
$Z = (X - \mu)/\sigma$
Thus, Z is distributed according to the standard
normal distribution: the standard normal random
variable Z has a mean of $\mu = 0$ and a standard
deviation of $\sigma = 1$.
Restrictions: only when X is normally distributed
can we conclude that Z is normally distributed.
Why does this conversion work? The shape of the
normal curve is completely determined by the
standard deviation, and the mean only fixes its
location.
20How does the standard deviation affect the shape
of f(x)?
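[Figure: normal curves with $\sigma$ = 2, 3, and 4; a larger $\sigma$ gives a flatter, wider curve.]
How does the expected value affect the location
of f(x)?
[Figure: normal curves with $\mu$ = 10, 11, and 12; a larger $\mu$ shifts the curve to the right.]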
21Finding Normal Probabilities
- Two facts help calculate normal probabilities
- The normal distribution is symmetrical.
- Any normal distribution can be transformed into a
specific normal distribution called the
Standard Normal Distribution.
- Example
- The time it takes to write a standard entrance
exam is normally distributed, with a mean of 60
minutes and a standard deviation of 8 minutes.
- What is the probability that a student will
finish the exam in between 60 and 70 minutes?
22Solution
- If X denotes the time taken to write the exam, we
seek the probability P(60 < X < 70).
- This probability can be calculated by creating a
new normal variable, the standard normal variable Z.
Every normal variable with some $\mu$ and $\sigma$ can be
transformed into this Z:
$Z = \frac{X - \mu}{\sigma}$, with E(Z) = 0 and V(Z) = 1
Therefore, once probabilities for Z are
calculated, probabilities of any normal variable
can be found.
23Example - continued
$P(60 < X < 70) = P\left(\frac{60 - 60}{8} < \frac{X - \mu}{\sigma} < \frac{70 - 60}{8}\right) = P(0 < Z < 1.25)$
To complete the calculation we need to compute
the probability under the standard normal
distribution.
24Standard normal probabilities have been
calculated and are provided in a table.
The tabulated probabilities correspond to the
area between Z = 0 and some Z = z0 > 0.
[Figure: standard normal curve with the area between Z = 0 and Z = z0 shaded.]
25In this example z0 = 1.25, and the table gives
P(0 < Z < 1.25) = 0.3944
26Example of Normal Distribution
From our example, Z = (X - 60)/8. In order to find
the desired probability, P(60 < X < 70), we must
first determine the interval of z-values
corresponding to the x-values of interest, 60 < X < 70:
(60 - 60)/8 < (X - 60)/8 < (70 - 60)/8
0 < Z < 1.25
Now we can look up the answer in
Table A.3, pp. 364-365:
P(60 < X < 70) = P(0 < Z < 1.25) = .8944 - .5000 = .3944
(x0 - $\mu$) expresses how
far x0 is from the mean; the corresponding
z-value, z0, tells us how many standard deviations
x0 is from the mean. Thus, the value 70 is 1.25
standard deviations from the mean (positive, so to
the right of the mean); that is, 70 = 60 + 1.25(8).
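As a cross-check on the table lookup, the same probability can be computed directly. This is a sketch using `statistics.NormalDist` from the Python standard library; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

exam_time = NormalDist(mu=60, sigma=8)  # minutes, from the example

# Route 1: work with X directly.
p_direct = exam_time.cdf(70) - exam_time.cdf(60)

# Route 2: standardize first, then use the standard normal.
z_hi = (70 - 60) / 8                   # 1.25 standard deviations above the mean
p_via_z = NormalDist().cdf(z_hi) - 0.5  # area between Z = 0 and Z = 1.25

print(round(p_direct, 4), round(p_via_z, 4))  # slide value: .3944
```

Both routes give the same number, which is the point of the z transform: one table serves every normal distribution.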
27Example of Normal Distribution
- Example (example 7.2 in Keller p.267)
- P(Z > 1.47) = .0708
- P(-2.25 < Z < 1.85) = .9556
- P(.65 < Z < 1.36) = .1709
28- The symmetry of the normal distribution makes it
possible to calculate probabilities for negative
values of Z using the table as follows
P(-z0 < Z < 0) = P(0 < Z < z0)
[Figure: standard normal curve with the equal areas between -z0 and 0 and between 0 and z0 shaded.]
29Example 7.2
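- Determine the following probabilities
P(Z > 1.47) = ?
The area to the right of 0 is 0.5, so
P(Z > 1.47) = 0.5 - P(0 < Z < 1.47) = 0.5 - 0.4292 = 0.0708
[Figure: standard normal curve with the right tail beyond Z = 1.47 shaded.]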
30P(-2.25 < Z < 1.85) = ?
P(-2.25 < Z < 0) = P(0 < Z < 2.25) = .4878 (by symmetry)
P(0 < Z < 1.85) = .4678
P(-2.25 < Z < 1.85) = 0.4878 + 0.4678 = 0.9556
[Figure: standard normal curve with the area between -2.25 and 1.85 shaded.]
31P(.65 < Z < 1.36) = ?
P(0 < Z < 1.36) = .4131
P(0 < Z < .65) = .2422
P(.65 < Z < 1.36) = .4131 - .2422 = .1709
[Figure: standard normal curve with the area between .65 and 1.36 shaded.]
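The three parts of Example 7.2 can be verified without a table. The sketch below relies on the cumulative distribution function of `statistics.NormalDist`; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

p_right_tail = 1 - Z.cdf(1.47)          # P(Z > 1.47)
p_straddle = Z.cdf(1.85) - Z.cdf(-2.25)  # P(-2.25 < Z < 1.85)
p_one_side = Z.cdf(1.36) - Z.cdf(0.65)   # P(.65 < Z < 1.36)

# Slide values: .0708, .9556, .1709
print(round(p_right_tail, 4), round(p_straddle, 4), round(p_one_side, 4))
```

The cumulative function gives the area to the left of a point, so every interval probability is a difference of two such areas; the table's "area from 0" convention is just that difference with 0.5 already subtracted.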
32Example 7.3
- The rate of return (X) on an investment is
normally distributed with a mean of 30% and a
standard deviation of 10%.
- What is the probability that the return will
exceed 55%?
P(X > 55) = P(Z > (55 - 30)/10) = P(Z > 2.5)
= .5 - P(0 < Z < 2.5) = .5 - .4938 = .0062
33- What is the probability that the return will be
less than 22%?
P(X < 22) = P(Z < (22 - 30)/10) = P(Z < -.8)
= P(Z > .8) = 0.5 - P(0 < Z < .8) = 0.5 - .2881 = .2119
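Both parts of Example 7.3 can be checked by letting the distribution object carry the mean and standard deviation, so no manual standardization is needed. A minimal sketch with `statistics.NormalDist`; returns are in percentage points and the variable names are assumptions.

```python
from statistics import NormalDist

rate_of_return = NormalDist(mu=30, sigma=10)  # percentage points

p_exceeds_55 = 1 - rate_of_return.cdf(55)  # right tail beyond z = 2.5
p_below_22 = rate_of_return.cdf(22)        # left tail below z = -0.8

print(round(p_exceeds_55, 4), round(p_below_22, 4))  # slide values: .0062, .2119
```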
34Example 7.4
- If Z is a standard normal variable, determine the
value z for which P(Z < z) = .6331.
- Since .6331 > .5, z is positive and
P(0 < Z < z) = .6331 - .5 = .1331.
- The table gives z = .34.
35Example 7.5
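- Determine $z_{.025}$
- Solution
- $z_A$ is defined as the z value for which the area
to the right of $z_A$ under the standard normal
curve is A.
- The area between 0 and $z_{.025}$ is 0.5 - 0.025 = 0.475,
so the table gives $z_{.025} = 1.96$. By symmetry,
$-z_{.025} = -1.96$ leaves 0.025 in the left tail.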
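Examples 7.4 and 7.5 read the table backwards, from an area to a z value. In software this is the inverse cumulative distribution function. The sketch below uses `NormalDist.inv_cdf`; the helper name `z_upper` is an assumption for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Example 7.4: the z with P(Z < z) = .6331
z_74 = Z.inv_cdf(0.6331)

def z_upper(A):
    """z_A: the value with area A to its right under the standard normal curve."""
    return Z.inv_cdf(1 - A)

# Example 7.5: z_.025
z_025 = z_upper(0.025)

print(round(z_74, 2), round(z_025, 2))  # slide values: .34 and 1.96
```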
36Approximating the binomial distribution with the
normal
- The normal approximation to the binomial is
useful when the number of trials n is so large
that the binomial tables cannot be used.
- Because the normal distribution is symmetrical,
it best approximates binomial distributions that
are reasonably symmetrical.
- Therefore, since a binomial distribution is
symmetrical when the probability p of success
equals .5, the best approximation is obtained
when p is reasonably close to .5. The farther p
is away from .5, the larger n must be in order
for a good approximation to result.
- In short, the approximation works well when the
number of trials is large and the
probability of success is not near 0 or 1.
37Approximating the binomial distribution with the
normal
The approximation is reasonably good as long as
there is a very small probability that the
approximating normal random variable will assume
a value outside the binomial range ($0 \le X \le n$).
As a rule of thumb, this holds when
np > 5 when p < .5, or n(1 - p) > 5 when p > .5.
Recall the following: given a binomial
distribution with n trials and probability p of
success on any trial (and q = 1 - p),
mean of the binomial: $\mu = np$
variance of the binomial: $\sigma^2 = npq$
We therefore choose the normal distribution
with $\mu = np$ and $\sigma^2 = npq$ to be the approximating
distribution.
38Approximating the binomial distribution with the
normal
Example: consider a binomial with n = 20 and p = .5.
We approximate the binomial probabilities by
using the normal distribution with
$\mu = np = (20)(.5) = 10$
$\sigma^2 = npq = (20)(.5)(.5) = 5$
$\sigma = 2.24$
Let X denote the binomial random variable
and let Y denote the normal random variable. The
binomial probability P(X = 10), represented by the
height of the line above x = 10 in the graph
(figure 7.15), is equal to the area of the
rectangle erected above the interval from 9.5 to
10.5.
39Approximating the binomial distribution with the
normal
This area (or probability) is approximated by the
area under the normal curve between 9.5 and 10.5.
This relationship is expressed as
$P(X = 10) \approx P(9.5 < Y < 10.5)$
The .5 that is added to and
subtracted from 10 is called the continuity
correction factor; it corrects for the fact that
we are using a continuous distribution to
approximate a discrete distribution. To check
the accuracy of this particular approximation, we
can use the binomial tables to obtain P(X = 10) = .176.
The normal approximation is
P(9.5 < Y < 10.5) = P((9.5 - 10)/2.24 < Z < (10.5 - 10)/2.24)
= P(-.22 < Z < .22)
= 2(.0871) = .1742
40Approximating the binomial distribution with the
normal
The approximation for any other value of X would
proceed in the same manner. In general, the
binomial probability P(X = x0) is approximated by
the area under the normal curve between (x0 - .5)
and (x0 + .5). Suppose, with the present
example, that we want to approximate the binomial
probability $P(5 \le X \le 12)$. This probability
would be approximated by the area under the
normal curve between 4.5 and 12.5.
$P(5 \le X \le 12) \approx P(4.5 < Y < 12.5)$
= P((4.5 - 10)/2.24 < Z < (12.5 - 10)/2.24)
= P(-2.46 < Z < 1.12)
= P(0 < Z < 2.46) + P(0 < Z < 1.12)
= .4931 + .3686 = .8617
As a check, the binomial table
yields $P(5 \le X \le 12) = .862$.
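The quality of the approximation can be examined by computing the exact binomial probability alongside the continuity-corrected normal area. This is a sketch under the example's values (n = 20, p = .5); the function names `binom_exact` and `binom_approx` are assumptions for illustration. It uses the unrounded standard deviation, so the approximations differ slightly from the table-based slide values.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

def binom_exact(lo, hi):
    """Exact binomial P(lo <= X <= hi)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

def binom_approx(lo, hi):
    """Normal approximation with the continuity correction of .5 on each side."""
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

for lo, hi in [(10, 10), (5, 12)]:
    print(lo, hi, round(binom_exact(lo, hi), 4), round(binom_approx(lo, hi), 4))
```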
41Approximating the binomial distribution with the
normal
The continuity correction factor becomes less
important as n becomes larger. When n is greater
than 25, the continuity correction factor is
ignored.
42Exponential Distribution
- The exponential distribution can be used to model
- the length of time between telephone calls
- the length of time between arrivals at a service
station
- the lifetime of electronic components.
- When the number of occurrences of an event
follows the Poisson distribution, the time
between occurrences follows the exponential
distribution.
43Exponential
The exponential distribution can be used to
measure the time that elapses between occurrences
of an event.
Example: the exponential
distribution can be used to model the length of
time before the first telephone call is received
or the length of time between arrivals at a
service station.
Probability density function:
$f(x) = \lambda e^{-\lambda x}$, where $x \ge 0$
e = 2.71828...
$\lambda$ = parameter of the distribution ($\lambda > 0$)
The exponential distribution is a one-parameter
distribution: it is completely
specified once the value of the parameter $\lambda$ is
known.
44Exponential
The probability that an exponential variable
exceeds the number a:
$P(X > a) = e^{-\lambda a}$
The total area under f(x) must equal 1.
The probability that X will take a value between two
numbers a and b:
$P(a < X < b) = P(X < b) - P(X < a) = e^{-\lambda a} - e^{-\lambda b}$
[Figure: exponential density curves for $\lambda$ = .5, 1, and 2.]
45Exponential Example 1
- The lifetime of a transistor is exponentially
distributed, with a mean of 1,000 hours.
- What is the probability that the transistor will
last between 1,000 and 1,500 hours?
- Solution
- Let X denote the lifetime of a transistor (in
hours).
- $\lambda = 1/\mu = 1/1000 = .001$
- $f(x) = \lambda e^{-\lambda x}$
- $P(1000 < X < 1500) = e^{-(.001)(1000)} - e^{-(.001)(1500)}$
= .3679 - .2231 = .1448
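The transistor calculation is a direct use of the interval formula for the exponential distribution. A minimal sketch; the function name `expon_between` is an assumption for illustration.

```python
from math import exp

def expon_between(a, b, lam):
    """P(a < X < b) for an exponential variable with rate lam: e^(-lam*a) - e^(-lam*b)."""
    return exp(-lam * a) - exp(-lam * b)

lam = 1 / 1000  # rate per hour, the reciprocal of the 1,000-hour mean
p = expon_between(1000, 1500, lam)
print(round(p, 4))  # slide value, from rounded terms: .1448
```

The term for the lower limit a comes first: survival past a is more likely than survival past b, so the difference is positive.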
46Exponential Example 2
- A tollbooth operator has observed that cars
arrive randomly and independently at an average
rate of 360 cars per hour.
- a) Use the exponential distribution to find the
probability that the next car will not arrive
within 30 seconds.
- Let X denote the time in minutes that will elapse
before the next car arrives. It is important
that X and $\lambda$ be defined in terms of the same
units. Thus, $\lambda$ is the average number of cars
arriving per minute:
- $\lambda = 360/60 = 6$
- $P(X > a) = e^{-\lambda a}$
- $P(X > .5) = e^{-6(.5)} = e^{-3} = .0498$
[Figure: exponential density with the area between a and b shaded, $P(a < x < b) = e^{-\lambda a} - e^{-\lambda b}$.]
47Exponential Example 2
- What is the probability that no car will arrive
within the next half minute?
- Solution
- If Y counts the number of cars that will arrive
in the next half minute, then Y is a Poisson
variable with $\mu = (.5)(6) = 3$ cars per half
minute. Using the formula for the Poisson
probability:
- $P(Y = 0 \mid \mu = 3) = \frac{e^{-3}\,3^{0}}{0!} = .0498$
- Comment: if the first car does not arrive
within the next half minute, then no car
arrives within the next half minute. Therefore,
not surprisingly, the probability found here is
exactly the probability found in the previous
question.
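The agreement between the exponential and Poisson answers can be confirmed numerically. The sketch below assumes the tollbooth rate of 6 cars per minute; `poisson_pmf` is a hypothetical helper written for this illustration.

```python
from math import exp, factorial

lam = 360 / 60  # cars per minute
t = 0.5         # half a minute

# Exponential view: the waiting time for the next car exceeds t.
p_exponential = exp(-lam * t)

def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson variable with mean mu."""
    return exp(-mu) * mu**k / factorial(k)

# Poisson view: zero arrivals in an interval of length t.
p_poisson = poisson_pmf(0, lam * t)

print(round(p_exponential, 4), round(p_poisson, 4))  # both about .0498
```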
48Chi-square
- F(x) see page 72 in Barnes
- A chi-square random variable with n degrees of
freedom is the sum of the squares of n independent
standard normal random variables,
$\chi^2 = \sum_{i=1}^{n} Z_i^2$.
- 1. The number of terms in the sum determines the
degrees of freedom.
- 2. There is a separate and unique chi-square
probability distribution for each value of n.
- 3. The chi-square distribution approaches
symmetry only for large degrees of freedom
($n \ge 30$).
- 4. $\chi^2$ replicates itself under the
addition of statistically independent $\chi^2$ random
variables.
- When do we use the chi-square?
- To test conjectures about the variance or
dispersion of the probability distribution
function of a normal random variable.
49- The sample variance $s^2$ is an unbiased, consistent
and efficient point estimator for $\sigma^2$.
- The statistic $(n - 1)s^2/\sigma^2$ has a
distribution called chi-squared, if the
population is normally distributed.
[Figure: chi-squared density curves for d.f. = 1, 5, and 10.]
50Student t Distribution
- F(x) see page 72 in Barnes
- A random variable governed by a t distribution
with n degrees of freedom
- T
- Each value of n yields a unique and different
probability distribution function.
- When do we use the student t?
- To test hypothetical statements about the mean of
a normal random variable when the variance of the
distribution is unknown and the sample size is
small.
51When the sampled population is normally
distributed, the statistic t is Student t
distributed.
The degrees of freedom, a function of the
sample size, determine how spread out
the distribution is (compared to the normal
distribution).
The t distribution is mound-shaped and
symmetrical around zero.
[Figure: two t density curves centered at 0, with d.f. = n1 and d.f. = n2, where n1 < n2.]
52Probability Calculations For The t Distribution
- The t table provides critical values for various
probabilities of interest.
- For a given number of degrees of freedom, and for a
predetermined right-hand tail probability A, the
entry in the table is the corresponding $t_A$.
- These values are used in computing interval
estimates and performing hypothesis tests.
53[Table: critical values $t_A$ by degrees of freedom, with columns $t_{.100}$, $t_{.05}$, $t_{.025}$, $t_{.01}$, and $t_{.005}$.]
54Student t Distribution
- Characteristics of the student t
- 1. Symmetric about its mean
- 2. Usually compared to the standardized normal
distribution N(0,1)
- 3. At lower values of n, the t is flatter, with a
larger variance and heavier tails.
- 4. As n becomes large, the t approaches the
N(0,1).
- 5. When n > 30, the N(0,1) and t are
approximately the same.
- 6. The extent to which the student t distribution
is more spread out than the normal distribution
is determined by a function of the sample size
called the degrees of freedom, which varies by the
t-statistic.
55The F Distribution
- The ratio of two independent chi-squared
variables divided by their degrees of freedom is
F distributed.
- F(x) see p. 72 in Barnes
- Where
- (a,b) see p. 72 in Barnes
- The F random variable
- F see p. 72 in Barnes
- The F distribution has n degrees of freedom in
the numerator and m in the denominator.
- Each different possible pairing of n and m yields
a unique and different probability distribution.
- Variables that are F distributed range from 0 to
$\infty$.
- Show figure 4-7, p. 77
56The F Distribution
- When do we use the F distribution?
- Comparing the variances taken from two different
normal distributions which we will consider in
Chapter 10.
57Gamma Distribution
- F(x) see p. 72 in Barnes
- If we set $\lambda = 1/2$ and $r = n/2$, the gamma becomes
the $\chi^2$ distribution; thus the $\chi^2$ is a special
case of the gamma. The exponential distribution
is a special case of the gamma when r = 1:
- $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
- When do we use the gamma and exponential?
- When a population is decaying: when we want to
characterize the portion of a population still
surviving under the condition of a constant
failure rate.
- The exponential distribution is memoryless: if
the time to failure T is an exponential random
variable, then the probability of T being less
than t minutes, given that it has already lasted
$t_0$ minutes, is equal to the
probability of T being less than $t - t_0$
minutes when the experiment has just begun.
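The memoryless property can be checked numerically from the survival function $P(T > t) = e^{-\lambda t}$. In the sketch below the rate and the two times are hypothetical values chosen only for illustration, and `survival` is a helper name introduced here, not taken from the slides.

```python
from math import exp

def survival(t, lam):
    """P(T > t) for an exponential time to failure with rate lam."""
    return exp(-lam * t)

lam = 0.5          # hypothetical failure rate per minute
t0, t = 2.0, 3.0   # hypothetical: already lasted t0 minutes, asking about t > t0

# P(T < t | T > t0) = P(t0 < T < t) / P(T > t0)
conditional = (survival(t0, lam) - survival(t, lam)) / survival(t0, lam)

# Probability of failing within t - t0 minutes for a brand-new unit.
fresh = 1 - survival(t - t0, lam)

print(round(conditional, 6), round(fresh, 6))
```

The two numbers agree because the factor $e^{-\lambda t_0}$ cancels in the conditional probability.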
58Weibull Distribution
- f(x) see p. 72 in Barnes
- The Weibull distribution can represent all three
types of component failure: early burnout,
chance failure, and the wear-out mode.
59Lognormal Distribution
- f(x) see p. 72 in Barnes
- A lognormal random variable X is one
whose natural logarithm, ln (log base e), is
distributed as a normal random variable:
- Y = ln X
60Normal Example 1
- 4.9 The length of a drive shaft is made up of
three parts, A, B, and C. The lengths (in
inches) of A, B, and C are statistically independent and
normally distributed with the following means and
variances (means 3, 5, and 4; variances .0025,
.0049, and .0036, respectively).
- What is the probability that the total length,
Y = A + B + C, will exceed 12.2 inches?
- Application of the additive property of the normal
distribution
- $\mu_Y = 3 + 5 + 4 = 12$
- $\sigma_Y^2 = .0025 + .0049 + .0036 = .011$
61Normal Example 1
- $\sigma_Y = \sqrt{.011} = .10488$
- $Z = (Y - \mu_Y)/\sigma_Y = (12.2 - 12)/.10488 = 1.9069$
- P(Z < 1.9069) = .9717
- P(Y > 12.2) = 1 - .9717 = .0283
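The additive property used here (means add, and variances add for independent parts) can be traced step by step in code. A sketch using the sums shown in the solution; the variable names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

means = [3, 5, 4]                     # inches, parts A, B, C
variances = [0.0025, 0.0049, 0.0036]  # square inches

mu_y = sum(means)               # mean of the total length
sigma_y = sqrt(sum(variances))  # variances add for independent parts

z = (12.2 - mu_y) / sigma_y
p_exceeds = 1 - NormalDist(mu_y, sigma_y).cdf(12.2)

print(round(sigma_y, 5), round(z, 4), round(p_exceeds, 4))
```

Note that the standard deviations themselves do not add: the sum of the three part standard deviations (.05 + .07 + .06) would overstate the spread of the total.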
62Normal Example 2
- The lifetime of a certain brand of tires is
approximately normally distributed, with a mean
of 45,000 miles and a standard deviation of
4,000 miles. The tires carry a warranty for
35,000 miles.
- a) What proportion of the tires will fail
before the warranty expires?
- b) At what mileage should warranty life be set so
that fewer than 5% of tires must be replaced
under warranty?
63Normal Example 2
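- a) Let X be the lifetime of a tire, where
X is normally distributed with $\mu = 45{,}000$ and
$\sigma = 4{,}000$. Then
P(X < 35,000) = P(Z < (35,000 - 45,000)/4,000) = P(Z < -2.5) = .5 - .4938 = .0062
64Normal Example 2
- b) Let x0 represent the desired mileage. Then
P(X < x0) = .05, so (x0 - 45,000)/4,000 = -1.645
and x0 = 45,000 - 1.645(4,000) = 38,420.
The warranty life should be set at 38,420 miles
(or less).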
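Both parts of the tire example can be reproduced with the cumulative distribution function and its inverse. This is a sketch with `statistics.NormalDist`, assuming the 45,000-mile mean that the 38,420-mile answer implies; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

tire_life = NormalDist(mu=45_000, sigma=4_000)  # miles

# a) Proportion failing before the 35,000-mile warranty expires.
p_fail = tire_life.cdf(35_000)

# b) Mileage below which only 5% of tires fail.
x0 = tire_life.inv_cdf(0.05)

print(round(p_fail, 4), round(x0))
```

The inverse function uses an unrounded z value, so the computed mileage lands within a mile of the table-based 38,420.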
65Exponential Example 3
- Let X be an exponential random variable with a
mean of .5. Find the following probabilities
- P(X > 1)
- P(X > 2)
- P(X < .5)
- P(X < .4)
66Exponential Example 3
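- With a mean of .5, $\lambda = 1/.5 = 2$.
- $P(X > 1) = e^{-2(1)} = .1353$
- $P(X > 2) = e^{-2(2)} = .0183$
- $P(X < .5) = 1 - e^{-2(.5)} = .6321$
- $P(X < .4) = 1 - e^{-2(.4)} = .5507$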
67Exponential Example 4
- Let X be an exponential random variable with
$\lambda = 4$. Find the probability that X will take a value
within 1.2 standard deviations of its mean.
69Exponential Example 4
- The expected value of an exponential random
variable, X, is $1/\lambda$. Find the probability that X
will take a value that is less than its expected
value.
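One way to work both Example 4 questions is sketched below. It relies on the fact that an exponential variable's mean and standard deviation both equal $1/\lambda$, and that X cannot be negative, so a lower limit below zero contributes nothing. The helper name `expon_cdf` is an assumption for illustration.

```python
from math import exp

def expon_cdf(x, lam):
    """P(X < x) for an exponential variable with rate lam; zero for x <= 0."""
    return 1 - exp(-lam * x) if x > 0 else 0.0

lam = 4
mean = sd = 1 / lam  # both equal 1/lambda for the exponential

# Within 1.2 standard deviations of the mean: from -0.05 to 0.55.
lo, hi = mean - 1.2 * sd, mean + 1.2 * sd
p_within = expon_cdf(hi, lam) - expon_cdf(lo, lam)

# Below the expected value: 1 - e^(-1), whatever the rate.
p_below_mean = expon_cdf(mean, lam)

print(round(p_within, 4), round(p_below_mean, 4))
```

The second answer does not depend on $\lambda$, because $\lambda \cdot (1/\lambda) = 1$ in the exponent.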
71Exponential Example 5
- Airplanes arrive at an airport according to the
Poisson model, with a mean time between arrivals
of 5 minutes.
- a) Find the probability that a plane will arrive
within the next 5 minutes.
- b) Find the probability that no planes will arrive
during a given 30-minute period.
- c) Find the probability that no more than one plane
will arrive during a given 30-minute period.
72Exponential Example 5
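- a) Let T be the time in minutes before the next
arrival. Then T is exponentially distributed
with $\lambda = 1/5$ per minute, so $P(T < 5) = 1 - e^{-5/5} = 1 - e^{-1}$.
b) No plane arrives in 30 minutes exactly when T > 30: $P(T > 30) = e^{-30/5} = e^{-6}$.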
73Exponential Example 5
- c) If X is the number of arrivals during a
30-minute period, then X is a Poisson random
variable with $\mu = 30/5 = 6$, so $P(X \le 1) = e^{-6}(1 + 6)$.
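The three airport probabilities can be evaluated numerically as follows. This is a sketch that assumes a rate of one arrival per 5 minutes, as implied by the stated mean time between arrivals; the variable names are assumptions for illustration.

```python
from math import exp, factorial

lam = 1 / 5  # arrivals per minute (mean time between arrivals is 5 minutes)

# a) A plane arrives within the next 5 minutes (exponential waiting time).
p_a = 1 - exp(-lam * 5)

# b) No plane arrives during a 30-minute period.
p_b = exp(-lam * 30)

# c) No more than one plane in 30 minutes: Poisson with mean lam * 30.
mu = lam * 30
p_c = sum(exp(-mu) * mu**k / factorial(k) for k in (0, 1))

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Part b can be read either way, as an exponential waiting time longer than 30 minutes or as a Poisson count of zero, and both give the same value.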