Title: Topic 4: Discrete Random Variables and Probability Distributions
1 Topic 4: Discrete Random Variables and
Probability Distributions
- CEE 11 Spring 2002
- Dr. Amelia Regan
These notes draw liberally from the class text,
Probability and Statistics for Engineering and
the Sciences by Jay L. Devore, Duxbury 1995 (4th
edition)
2 Definition
- For a given sample space S of some experiment, a
random variable is any rule that associates a
number with each outcome in S
- Random variables may take on finite or infinite
sets of values
- Examples: 0/1 outcomes of a single coin toss, or
the number of sequential tosses of a coin until
the outcome head is observed
- A set is discrete either if it consists of a
finite number of elements or if its elements may
be listed in sequence so that there is a first
element, a second element, a third element, and
so on, in the list
- A random variable is said to be discrete if its
set of possible values is a discrete set.
3 Definition
- The probability distribution or probability mass
function (pmf) of a discrete random variable X is
defined for every number x by p(x) = P(X = x)
- The cumulative distribution function (cdf) F(x)
of a discrete rv X with pmf p(x) is defined for
every number x by F(x) = P(X ≤ x), the sum of
p(y) over all possible values y ≤ x
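As a minimal sketch of these two definitions (the pmf values here are made up for illustration), a pmf can be stored as a dictionary and the cdf computed by summing it:

```python
# Hypothetical pmf for a discrete rv X taking values 0, 1, 2
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def cdf(x, pmf):
    """F(x) = P(X <= x): accumulate pmf over all values y <= x."""
    return sum(prob for y, prob in pmf.items() if y <= x)

print(cdf(1, pmf))   # F(1) = p(0) + p(1)
```

Note that the cdf jumps at each possible value and is flat in between; evaluating it at a point below every possible value gives 0, and at or above the largest value gives 1.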
4 Examples
5 Definition
- Let X be a discrete rv with set of possible
values D and pmf p(x). The expected value or
mean value of X, denoted E(X) or μX, is given by
E(X) = Σ x·p(x), summing over all x in D
- If the rv X has a set of possible values D and
pmf p(x), then the expected value of any function
h(X), denoted by E[h(X)] or μh(X), is computed by
E[h(X)] = Σ h(x)·p(x), summing over all x in D
- Note: according to Ross (1988) p. 255 this is
known as the law of the unconscious statistician
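A short sketch of both formulas, using a made-up pmf and h(x) = x² as the function:

```python
# Hypothetical pmf over D = {1, 2, 3}
pmf = {1: 0.25, 2: 0.50, 3: 0.25}

# E(X) = sum of x * p(x) over D
mean = sum(x * p for x, p in pmf.items())

# Law of the unconscious statistician: E[h(X)] is computed directly
# from the pmf of X, without ever finding the pmf of h(X) itself
h = lambda x: x ** 2
e_h = sum(h(x) * p for x, p in pmf.items())

print(mean, e_h)   # 2.0 and 4.5
```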
6 Example
- Let X be a discrete rv with the following pmf and
corresponding cdf
7 Class exercise
- Let X be a discrete rv with the following pmf
- Calculate the cdf of X
- Now let h(X) = 3X³ − 100. Calculate E[h(X)]
8 Definition
- For linear functions of X we have simple rules:
E(aX + b) = a·E(X) + b and Var(aX + b) = a²·Var(X)
9 Variance of a random variable
- If X is a random variable with mean μ, then the
variance of X, denoted by Var(X), is defined by
Var(X) = E[(X − μ)²] = Σ (x − μ)²·p(x), summing
over all x in D
- Recall that we previously defined the variance of
a population as the average of the squared
deviations from the mean. The expected value is
nothing other than the average or mean, so this
form corresponds exactly to the one we used
earlier.
10 Variance of a random variable
- It is often convenient to use a different form of
the variance which, applying the rules of
expected value we just learned and remembering
that E(X) = μ, we derive in the following way:
Var(X) = E[(X − μ)²] = E(X² − 2μX + μ²)
       = E(X²) − 2μ·E(X) + μ² = E(X²) − μ²
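A quick numerical check that the definition and the shortcut form agree, on a made-up pmf:

```python
# Hypothetical pmf to check Var(X) = E[(X - mu)^2] = E(X^2) - mu^2
pmf = {0: 0.3, 1: 0.4, 2: 0.3}

mu = sum(x * p for x, p in pmf.items())                       # E(X)
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())      # definition
var_short = sum(x * x * p for x, p in pmf.items()) - mu ** 2  # shortcut

print(mu, var_def, var_short)
```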
11 The Bernoulli distribution
- Any random variable whose only two possible
values are 0 and 1 is called a Bernoulli random
variable.
- Example: Suppose a set of buildings is examined
in a western city for compliance with new,
stricter earthquake engineering specifications.
After 25% of the city's buildings are examined at
random, 12% are found to be out of code while
88% are found to conform to the new
specifications, so it is supposed that buildings
in the region have a 12% likelihood of being out
of code.
- Let X = 1 if the next randomly selected building
is within code and X = 0 otherwise.
- The code status of a randomly selected building
is a Bernoulli random variable with parameters
p = 0.88 and 1 − p = 0.12.
12 The Bernoulli distribution
13 The Bernoulli distribution
- In general form, α is the parameter for the
Bernoulli distribution, but we usually refer to
this parameter as p, the probability of success
- The mean of the Bernoulli distribution with
parameter p is E(X) = 0·(1 − p) + 1·p = p
- The variance of the Bernoulli distribution with
parameter p is Var(X) = E(X²) − p² = p − p²
= p(1 − p)
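A sketch computing both quantities directly from the two-point pmf, using p = 0.88 from the building example:

```python
# Bernoulli mean and variance, with p = 0.88 as in the building example
p = 0.88
pmf = {0: 1 - p, 1: p}

mean = sum(x * q for x, q in pmf.items())                # equals p
var = sum(x * x * q for x, q in pmf.items()) - mean**2   # equals p*(1-p)

print(mean, var)
```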
14 The Binomial distribution
- Now consider a random variable that is made up of
successive (independent) Bernoulli trials, and
define the random variable X as the number of
successes among n trials.
- The Binomial distribution has the following
probability mass function:
b(x; n, p) = C(n, x)·p^x·(1 − p)^(n−x),
x = 0, 1, ..., n
- Remembering what we learned about combinations,
this makes intuitive sense. The binomial
coefficient C(n, x) represents the number of ways
to distribute the x successes among n trials. p^x
represents the probability that we have x
successes in the n trials, while (1 − p)^(n−x)
represents the probability that we have n − x
failures in the n trials.
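The pmf translates almost directly into code; as a sanity check, its values sum to 1 over x = 0, ..., n:

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): x successes in n independent Bernoulli(p) trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# The pmf must sum to 1 over all possible numbers of successes
total = sum(binom_pmf(x, 6, 0.4) for x in range(7))
print(total)
```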
15 The Binomial distribution
- Computing the mean and the variance of the
binomial distribution is straightforward. First
remember that the binomial random variable is the
sum of the number of successes in n consecutive
Bernoulli trials. Therefore E(X) = np and
Var(X) = np(1 − p).
16 The Binomial distribution
17 The Binomial distribution
18 The Binomial distribution
19 Class exercise
- A software engineer has historically delivered
completed code to clients on schedule 40% of the
time. If her performance record continues, the
probability distribution of the number of
on-schedule completions in the next 6 jobs can be
described by the binomial distribution.
- Calculate the probability that exactly four jobs
will be completed on schedule
- Calculate the probability that at most 5 jobs
will be completed on schedule
- Calculate the probability that at least two jobs
will be completed on schedule
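A sketch of the three calculations with n = 6 and p = 0.40, using complements for the "at most" and "at least" events:

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 6, 0.40
p_exactly_4 = binom_pmf(4, n, p)
p_at_most_5 = 1 - binom_pmf(6, n, p)   # complement of "all six on schedule"
p_at_least_2 = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)

print(p_exactly_4, p_at_most_5, p_at_least_2)
```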
20 The Hypergeometric distribution
- The binomial distribution is made up of
independent trials in which the probability of
success does not change.
- Another distribution in which the random variable
X represents the number of successes among n
trials is the hypergeometric distribution.
- The hypergeometric distribution assumes a fixed
population in which the proportion or number of
successes is known.
- We think of the hypergeometric distribution as
involving trials without replacement and the
binomial distribution as involving trials with
replacement.
- The classic illustration of the difference
between the hypergeometric and the binomial
distribution is that of black and white balls in
an urn. Assume the proportion of black balls is
p. The distribution of the number of black balls
selected in n trials is binomial(x; n, p) if we
put the balls back in the urn after selection and
hypergeometric(x; n, M, N) if we set them aside
after selection. (Engineering examples to come.)
21 The Hypergeometric distribution
- The probability mass function of the
hypergeometric distribution is given by the
following, where M is the number of possible
successes in the population, N is the total size
of the population, and x is the number of
successes in the n trials:
h(x; n, M, N) = C(M, x)·C(N − M, n − x) / C(N, n)
22 The Hypergeometric distribution
- Example: Suppose that out of 100 bridges in a
region, 30 have been recently retrofitted to be
more secure during earthquakes. Ten bridges are
selected randomly from the 100 for inspection.
What is the probability that at least three of
these bridges will be retrofitted?
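A sketch of this calculation: compute the hypergeometric pmf directly and take the complement of "fewer than three":

```python
from math import comb

def hyper_pmf(x, n, M, N):
    """h(x; n, M, N): x successes when drawing n without replacement."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# N = 100 bridges, M = 30 retrofitted, n = 10 inspected
p_at_least_3 = 1 - sum(hyper_pmf(x, 10, 30, 100) for x in range(3))
print(p_at_least_3)
```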
23 The Hypergeometric distribution
- The mean and variance of the hypergeometric
distribution are given below:
E(X) = n·(M/N)
Var(X) = n·(M/N)·(1 − M/N)·((N − n)/(N − 1))
24 The Hypergeometric and the Binomial distributions
- Note the following
- Sometimes we refer to M/N, the proportion of the
population with a particular characteristic, as p.
- With that substitution the mean and variance
become E(X) = np and Var(X) = np(1 − p)·
((N − n)/(N − 1)). This is very close to
E(X) = np and Var(X) = np(1 − p), which are the
mean and variance of the Binomial distribution.
In fact we can think of (N − n)/(N − 1) as a
correction term which accounts for the fact
that in the hypergeometric distribution we sample
without replacement.
- Question: Should the variance of the
hypergeometric distribution with proportion p be
greater than or less than the variance of the
binomial with parameter p? Can you give some
intuitive reason why this is so?
25 The Hypergeometric and the Binomial distributions
- Example: Binomial with p = 0.40, three trials vs.
hypergeometric with M/N = 2/5 = 0.40, three trials
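A side-by-side sketch of this comparison. With only N = 5 balls and n = 3 draws, the correction term (N − n)/(N − 1) = 1/2 cuts the variance in half, and some outcomes (here x = 3 with only M = 2 successes available) become impossible:

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def hyper_pmf(x, n, M, N):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

n, p, M, N = 3, 0.40, 2, 5
var_binom = n * p * (1 - p)                      # np(1-p)
var_hyper = var_binom * (N - n) / (N - 1)        # with the correction term

for x in range(n + 1):
    print(x, binom_pmf(x, n, p), hyper_pmf(x, n, M, N))
```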
26 The Hypergeometric and the Binomial distributions
- If n, the number of trials, is small relative to
N, the number in the population, then we can
approximate the hypergeometric distribution with
the binomial distribution in which p = M/N.
- This used to be very important. As recently as
five years ago computers and hand calculators
could not calculate large factorials. The
calculator I use cannot calculate 70!. A few
years ago 50! was out of the question.
- Despite improvements in calculators it is still
important to know that if the ratio of n to N is
less than 5% (we are sampling 5% of the
population) then we can approximate the
hypergeometric distribution with the binomial.
- Check your calculators now -- what is the maximum
factorial they can handle?
27 Class exercise
- The system administrator in charge of a campus
computer lab has identified 9 machines out of 80
with defective motherboards.
- While he is on vacation the lab is moved to a
different room, in order to make room for
graduate student offices.
- The administrator kept notes about which machines
were bad, based on their location in the old lab.
During the move the computers were placed on
their new desks in a random fashion, so all of
the machines must be checked again.
- If the administrator checks three of the machines
for defects, what is the probability that exactly
one of the three will be defective?
- Calculate using both the hypergeometric
distribution and the binomial approximation to
the hypergeometric.
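A sketch of both calculations; since n/N = 3/80 is well under 5%, the two answers should be close:

```python
from math import comb

def hyper_pmf(x, n, M, N):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# N = 80 machines, M = 9 defective, n = 3 checked
exact = hyper_pmf(1, 3, 9, 80)
approx = binom_pmf(1, 3, 9 / 80)   # binomial approximation with p = M/N

print(exact, approx)
```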
28 The Geometric distribution
- The geometric distribution refers to the random
variable represented by the number of consecutive
Bernoulli trials until a success is achieved.
- Suppose that independent trials, each having
probability p of being a success, 0 < p < 1, are
performed until a success occurs. If we let X
equal the number of trials prior to the success,
then p(x) = (1 − p)^x · p, x = 0, 1, 2, ...
- The above definition is the one used in our
textbook. A more common definition is the
following: let X equal the number of trials
including the last trial, which by definition is
a success. Then we get
p(x) = (1 − p)^(x−1) · p, x = 1, 2, 3, ...
29 The geometric distribution
- The expected value and variance of the geometric
distribution are given for the first form by
E(X) = (1 − p)/p and Var(X) = (1 − p)/p²
- For the second form, the expected value
E(X) = 1/p has more intuitive appeal. Can you
convince yourself that the value is correct?
- Please note that the variance is the same in both
cases.
- Explain why this is so.
30 The geometric distribution
- We can derive E(X) in the following way
31 The geometric distribution
These steps are so that we can work with the
geometric series 1 + x + x² + x³ + ... = 1/(1 − x),
so p(1 + q + q² + ...) = p·(1/(1 − q)). Here we
just substitute p for 1 − q.
32 The geometric distribution
Try deriving the variance of the geometric
distribution by finding E(X²)
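As a numerical check of the formulas above (second form shown, with the infinite sums truncated at a point where the remaining terms are negligible):

```python
p, q = 0.3, 0.7
K = 2000   # truncation point; q**K is vanishingly small

# Second form: P(X = k) = q**(k-1) * p for k = 1, 2, ...
mean = sum(k * q**(k - 1) * p for k in range(1, K))
ex2 = sum(k * k * q**(k - 1) * p for k in range(1, K))
var = ex2 - mean**2

print(mean, var)   # close to 1/p and (1 - p)/p**2
```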
33 Poisson Distribution
- One of the most useful distributions for many
branches of engineering is the Poisson
distribution.
- The Poisson distribution is often used to model
the number of occurrences during a given time
interval or within a specified region.
- The time interval involved can have a variety of
lengths, e.g., a second, minute, hour, day, year,
and multiples thereof.
- Poisson processes may be temporal or spatial.
The region in question can also be a line
segment, an area, a volume, or some n-dimensional
space.
- Poisson processes or experiments have the
following characteristics:
34 Poisson Distribution
- 1. The number of outcomes occurring in any given
time interval or region is independent of the
number of outcomes occurring in any other
disjoint time interval or region.
- 2. The probability of a single outcome occurring
in a very short time interval or very small
region is proportional to the length of the time
interval or the size of the region. This value is
not affected by the number of outcomes occurring
outside this particular time interval or region.
- 3. The probability of more than one outcome
occurring in a very short time interval or very
small region is negligible.
- Taken together, the first two characteristics are
known as the memoryless property of Poisson
processes.
- Transportation engineers often assume that the
number of vehicles passing by a particular point
on a road is approximately Poisson distributed.
- Do you think that this model is more appropriate
for a rural highway or a city street?
35 Poisson Distribution
- The pmf of the Poisson distribution is the
following:
p(x; λ) = e^(−λ)·λ^x / x!, x = 0, 1, 2, ...
- The parameter λ is equal to αt, where α is the
intensity of the process (the average number of
events per time unit) and t is the number of time
units in question. In a spatial Poisson process
α represents the average number of events per
unit of space and t represents the number of
spatial units in question.
- For example, the number of vehicles crossing a
bridge in a rural area might be modeled as a
Poisson process. If the average number of
vehicles per hour during the hours of 10:00 AM
to 3:00 PM is 20, we might be interested in the
probability that fewer than three vehicles cross
from 12:30 to 12:45 PM. In this case λ = (20
per hour)·(0.25 hours) = 5.
- The expected value and the variance of the
Poisson distribution are identical and are equal
to λ.
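A sketch of the bridge example: evaluate the Poisson pmf with λ = 5 and sum over x = 0, 1, 2:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x; lambda) = e**(-lambda) * lambda**x / x!"""
    return exp(-lam) * lam**x / factorial(x)

# Bridge example: lambda = (20 per hour) * (0.25 hours) = 5
p_fewer_than_3 = sum(poisson_pmf(x, 5.0) for x in range(3))
print(p_fewer_than_3)
```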
36 Class exercise
- An urban planner believes that the number of gas
stations in an urban area is approximately
Poisson distributed with parameter α = 3 per
square mile. Let's assume she is correct in her
assumption.
- Calculate the expected number of gas stations in
a four square mile region of the urban area, as
well as the variance of this number.
- Calculate the probability that this region of
four square miles has fewer than six gas stations.
- Calculate the probability that, in four adjacent
regions of one square mile each, at least two of
the four regions contain more than three gas
stations.
- Do you think the situation is accurately modeled
by a Poisson process?
- Why or why not?
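A sketch of the three calculations (reading "more than three" strictly, i.e. four or more stations per region, and treating the four regions as independent by the Poisson assumptions):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# Four square miles at alpha = 3 per square mile: lambda = 3 * 4 = 12
lam = 3 * 4
mean, var = lam, lam                     # both equal lambda for a Poisson rv
p_fewer_than_6 = sum(poisson_pmf(x, lam) for x in range(6))

# One region: q = P(more than 3 stations) with lambda = 3
q = 1 - sum(poisson_pmf(x, 3) for x in range(4))
# At least 2 of 4 independent regions: complement of 0 or 1 under binomial(4, q)
p_at_least_2 = 1 - (1 - q)**4 - 4 * q * (1 - q)**3

print(mean, p_fewer_than_6, p_at_least_2)
```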
37 Some random variables that typically obey the
Poisson probability law (Ross, p. 130)
- The number of misprints on a page (or group of
pages) of a book
- The number of people in a community living to 100
years of age
- The number of wrong telephone numbers that are
dialed in a day
- The number of packages of dog biscuits sold in a
particular store each day
- The number of customers entering a post office
(bank, store) in a given time period
- The number of vacancies occurring during a year
in the Supreme Court
- The number of particles discharged in a fixed
period of time from some radioactive material
- WHAT ABOUT ENGINEERING?
- Poisson processes are at the heart of queueing
theory -- which is one of the most important
topics in transportation and logistics. Lots of
other applications too -- water, structures,
geotech, etc.
38 The Poisson Distribution as an approximation to
the Binomial
- When n is large and p is small, the Poisson
distribution with parameter λ = np is a very good
approximation to the binomial (the number of
successes in n independent trials when the
probability of success is equal to p for each
trial).
- Example: Suppose that the probability that an
item produced by a certain machine will be
defective is 0.10. Find the probability that a
sample of 10 items will contain at most one
defective item. Here λ = np = (0.10)(10) = 1.0.
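A sketch comparing the exact binomial answer with the Poisson approximation for this example; even at n = 10 the two agree to about three decimal places:

```python
from math import comb, exp, factorial

n, p = 10, 0.10
lam = n * p   # 1.0

# Exact binomial: P(X <= 1) = P(0) + P(1)
binom = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(2))
# Poisson approximation with lambda = np
poisson = sum(exp(-lam) * lam**x / factorial(x) for x in range(2))

print(binom, poisson)
```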
39 References
- Ross, S. (1988), A First Course in Probability,
Macmillan Press.