Title: Review of Probability and Statistics
Outline
- Uses of Probability and Statistics in Simulation
- Experiments, Sample Spaces and Events
- Probability
- Random Variables (Discrete and Continuous)
- Probability Distribution Functions
- Expected Value and Variance
- Joint Probability Functions
- Covariance, Correlation
- Central Limit Theorem
- Sample Mean and Sample Variance
- Confidence Intervals
- Hypothesis Tests
- Exercise: Confidence Intervals and Hypothesis Tests
Uses of Probability and Statistics in Simulation
- Representing variability of input parameters of models (fitting distributions)
- Generating samples from given distributions (generating random numbers)
- Determining initial conditions for simulation runs and when to start collecting data
- Determining run length and number of replications
- Summarizing output data from model
- Comparing outputs from various simulation runs (hypothesis testing, confidence intervals, selecting the best of several systems, experimental design, optimization)
Experiments, Sample Spaces, Events
- An experiment is a well-defined action whose outcome is not known with certainty.
- The sample space for an experiment is the set of all possible outcomes. The outcomes themselves are called the sample points in the sample space.
- An event is some subset of the sample space.
- Note that for any particular experiment, usually several different types of sample spaces can be defined.
- Example 1: Experiment: toss a coin 3 times. Sample space: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. Event (exactly two heads): {HHT, HTH, THH}.
- Example 2: Experiment: toss two dice. Sample space: {(1,1), (1,2), (1,3), (1,4), ..., (6,6)}. Event (sum equal to 3): {(1,2), (2,1)}.
- Example 3: Experiment: run a simulation model of a particular design for a manufacturing facility for a one-week period. Sample space: the number of units that can be produced. Event: the number of units produced is less than 200.
- Running a simulation model is like running an actual experiment. The output will be a random variable.
Probability and Random Variables
- The probability of occurrence of an event is the ratio of the number of sample points in the event to the number of sample points in the sample space.
- A random variable is a function which assigns a real number to each point in the sample space. (Note that we usually think of the random variable as the range of the function, and not the function itself.) A random variable can be either discrete (e.g., number of defective items) or continuous (e.g., time to failure).
- Example: In tossing a coin 3 times, the discrete random variable X corresponding to the number of heads that occur is given by the following function:

    Domain:       HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
    Range:          3    2    2    1    2    1    1    0

    Event:        0 heads  1 head  2 heads  3 heads
    Probability:     1/8     3/8      3/8      1/8
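The coin-toss example above can be reproduced directly by enumerating the sample space; a minimal sketch (not from the original slides):

```python
# Enumerate the sample space for three coin tosses and derive the pmf of
# X = number of heads, matching the table above.
from itertools import product
from fractions import Fraction
from collections import Counter

sample_space = ["".join(t) for t in product("HT", repeat=3)]  # 8 equally likely outcomes
X = {outcome: outcome.count("H") for outcome in sample_space}  # the random variable as a function

counts = Counter(X.values())
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}
print(pmf[0], pmf[1], pmf[2], pmf[3])  # 1/8 3/8 3/8 1/8
```

Treating X explicitly as a dictionary from sample points to real numbers mirrors the slide's point that a random variable is a function on the sample space.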
Probability Distribution Functions
- Source: (Walpole and Myers, 1972)
- The function f(x) is a probability distribution of the discrete random variable X if, for each possible outcome x: f(x) >= 0, the sum of f(x) over all x equals 1, and P(X = x) = f(x).
- The function F(x) is the cumulative distribution function of a discrete random variable X with probability distribution f(x) if F(x) = P(X <= x) = the sum of f(t) over all t <= x.
- The function f(x) is a probability density function for the continuous random variable X if f(x) >= 0 for all x, the integral of f(x) over the whole real line equals 1, and P(a < X < b) is the integral of f(x) from a to b.
- The function F(x) is the cumulative distribution function of the continuous random variable X with density function f(x) if F(x) = P(X <= x) = the integral of f(t) from -infinity to x.
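These defining conditions can be checked numerically. A minimal sketch, using the coin-toss pmf from the earlier example for the discrete case and assuming an exponential density for the continuous case (the slides do not specify a particular density):

```python
# Verify the pmf/cdf conditions for a discrete distribution, and the
# pdf normalization condition for a continuous one.
import math

# Discrete: f(x) >= 0 and sum_x f(x) = 1 (coin-toss pmf from the example above)
f = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
assert all(p >= 0 for p in f.values()) and math.isclose(sum(f.values()), 1.0)

# Cumulative distribution: F(x) = sum of f(t) over t <= x
F = lambda x: sum(p for t, p in f.items() if t <= x)
assert math.isclose(F(1), 0.5)  # P(X <= 1) = 1/8 + 3/8

# Continuous (assumed example): f(x) = lam * exp(-lam * x) on [0, inf)
# integrates to 1; approximate with a crude Riemann sum.
lam, dx = 2.0, 1e-4
integral = sum(lam * math.exp(-lam * (k * dx)) * dx for k in range(200_000))
assert abs(integral - 1.0) < 1e-3
```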
Expected Value and Variance of a Random Variable
- The expected value (or mean) of a random variable is a measure of central tendency: E(X) is the sum of x f(x) over all x for discrete X, or the integral of x f(x) for continuous X.
- The variance of a random variable is a measure of the spread of that random variable: Var(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2.
- Example: Consider a discrete random variable X with a given distribution function.
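The slide's own example distribution did not survive extraction, so as a stand-in, a minimal sketch computing E(X) and Var(X) for the three-coin-toss random variable from the earlier slide:

```python
# E(X) and Var(X) for X = number of heads in 3 coin tosses.
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

mean = sum(x * p for x, p in pmf.items())      # E(X) = sum of x f(x)
ex2 = sum(x * x * p for x, p in pmf.items())   # E(X^2)
var = ex2 - mean ** 2                          # Var(X) = E(X^2) - [E(X)]^2

print(mean, var)  # 1.5 0.75
```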
Covariance and Correlation
- The covariance of two random variables X and Y is given by Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)] = E(XY) - mu_X mu_Y. Note that covariance, which is a measure of linear dependence, is symmetric (i.e., Cov(X, Y) = Cov(Y, X)).
- The correlation between two random variables X and Y is given by rho(X, Y) = Cov(X, Y) / (sigma_X sigma_Y). Correlation, instead of covariance, is the primary measure used in determining linear dependence, since correlation is dimensionless.
Example: Covariance and Correlation
- Consider the jointly discrete random variables X and Y with a given joint probability mass function.
- The marginal probability mass functions, expected values, and variances for X and Y follow from the joint probability mass function.
- Cov(X, Y) = E(XY) - mu_X mu_Y = 0.85 - (0.85)(0.8) = 0.17 > 0 (i.e., X and Y are positively correlated).
Central Limit Theorem (Ostle, 1963)
- If a population has a finite variance sigma^2 and mean mu, then the distribution of the sample mean approaches the normal distribution with variance sigma^2/n and mean mu as the sample size n increases.
- Example: The distribution of the sample mean of the production rate attained for n independent runs of a simulation model approaches a normal distribution as n increases.
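A minimal sketch illustrating the theorem, assuming an exponential population (skewed and non-normal) and run counts chosen purely for illustration: sample means of n i.i.d. draws cluster around mu with variance close to sigma^2/n.

```python
# Empirical check of the CLT: means of n exponential draws have mean ~ mu
# and variance ~ sigma^2 / n (mu = sigma^2 = 1 for rate-1 exponential).
import random
import statistics

random.seed(42)
n, replications = 30, 2000

means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(replications)]

print(round(statistics.fmean(means), 3))     # close to mu = 1.0
print(round(statistics.variance(means), 3))  # close to sigma^2 / n = 1/30
```

A histogram of `means` would look roughly bell-shaped even though the underlying exponential population is strongly skewed.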
Sample Mean and Sample Variance
- Suppose that we have n independent, identically distributed random variables (observations) with finite mean mu and finite population variance sigma^2, denoted X1, X2, ..., Xn (e.g., these might be observations from n independent runs of a simulation model of a particular design). Then the sample mean Xbar(n) = (X1 + ... + Xn)/n and the sample variance S^2(n) = [sum of (Xi - Xbar(n))^2] / (n - 1) are unbiased point estimators of mu and sigma^2, respectively.
- Also, Var(Xbar(n)) = sigma^2 / n.
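The two estimators above can be sketched directly (the `data` values are arbitrary illustration, not from the slides):

```python
# Point estimators: Xbar(n) = (1/n) * sum(Xi) and
# S^2(n) = sum((Xi - Xbar)^2) / (n - 1), checked against the stdlib.
import statistics

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    xbar = sample_mean(xs)
    # Dividing by n - 1 (not n) makes this an unbiased estimator of sigma^2.
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

data = [3.0, 5.0, 4.0, 8.0]
assert sample_mean(data) == statistics.mean(data)
assert abs(sample_variance(data) - statistics.variance(data)) < 1e-12
```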
Confidence Intervals (for the mean of n i.i.d. observations)
- If n is sufficiently large, an approximate 100(1 - alpha) percent confidence interval for mu is given by Xbar(n) +/- z_{1-alpha/2} * sqrt(S^2(n)/n), where z_{1-alpha/2} is the upper 1 - alpha/2 critical point for a standard normal random variable.
- Law and Kelton (1991) interpretation of a confidence interval: if one constructs a very large number of independent 100(1 - alpha) percent confidence intervals, each based on n observations, where n is sufficiently large, the proportion of these confidence intervals that contain (cover) mu should be 1 - alpha. We call this proportion the coverage for the confidence interval.
Confidence Intervals (for the mean of n i.i.d., normally distributed observations)
- Observations (i.i.d. and normally distributed): X1, X2, ..., Xn.
- An exact 100(1 - alpha) percent confidence interval for mu is given by Xbar(n) +/- t_{n-1, 1-alpha/2} * sqrt(S^2(n)/n), where t_{n-1, 1-alpha/2} is the upper 1 - alpha/2 critical point for the t distribution with n - 1 degrees of freedom.
- Law and Kelton (1991) note that typically the Xi's will not be normally distributed, so the t-confidence interval given above will also be approximate. However, the t-confidence interval will be more accurate (larger) than the normal-based interval given on the previous page.
Example: Computation of Sample Mean, Sample Variance, and Confidence Interval
- Suppose that you have built a simulation model of a hospital emergency department and have made 10 independent runs of the model for a particular system design, in order to estimate the average time in the system for patients, in minutes. The results of the 10 replications are:
  145.2, 134.9, 142.0, 149.1, 139.0, 150.9, 141.1, 148.2, 154.0, 142.1
- From these, compute the 95% confidence interval for the average time in the system for patients over a simulation run (assuming that the 10 individual observations are normally distributed).
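A sketch of the computation for the 10 replications above. The critical value t_{9, 0.975} = 2.262 is the standard tabulated value, hardcoded here to keep the example dependency-free:

```python
# Sample mean, sample variance, and 95% t-confidence interval for the
# 10 emergency-department replications.
import math
import statistics

obs = [145.2, 134.9, 142.0, 149.1, 139.0, 150.9, 141.1, 148.2, 154.0, 142.1]

n = len(obs)
xbar = statistics.mean(obs)        # Xbar(10) = 144.65
s2 = statistics.variance(obs)      # S^2(10) ~ 34.70
half = 2.262 * math.sqrt(s2 / n)   # t_{9, 0.975} * sqrt(S^2(n) / n)

print(f"{xbar:.2f} +/- {half:.2f}")  # 144.65 +/- 4.21
```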
Example: Variation in Confidence for C.I.
- Suppose that we are only interested in a 90% confidence interval.
- Note that as the confidence level decreases, the size of the confidence interval decreases.
- Various t-confidence intervals can be computed from the same sample in this way.
Example: Increasing the Number of Replications
- Suppose that we make 10 more independent runs of our simulation model to reduce the variance of our estimate. The observations for these runs are:
  149.3, 158.8, 140.5, 149.0, 139.3, 142.6, 132.5, 139.2, 144.5, 139.0
- Recomputing the sample mean and sample variance over all 20 observations gives updated point estimates, and the new confidence intervals follow from the same t-interval formula.
Example: Summary of Results for Various Confidences and Sample Sizes
- The half-length of the confidence interval decreases as the confidence level decreases and as the sample size increases.
Example: Estimate of Var(Xbar(n)) as n Increases
- Making additional runs of the simulation model resulted in a 39% decrease in the variance estimator S^2(n)/n.
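The claimed decrease can be checked directly from the two sets of replications given earlier:

```python
# Estimate Var(Xbar) = S^2(n)/n with the first 10 runs and with all 20,
# and compute the relative decrease.
import statistics

first = [145.2, 134.9, 142.0, 149.1, 139.0, 150.9, 141.1, 148.2, 154.0, 142.1]
extra = [149.3, 158.8, 140.5, 149.0, 139.3, 142.6, 132.5, 139.2, 144.5, 139.0]

v10 = statistics.variance(first) / len(first)                   # S^2(10) / 10
v20 = statistics.variance(first + extra) / len(first + extra)   # S^2(20) / 20

print(round(1 - v20 / v10, 2))  # 0.39, i.e., a 39% decrease
```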