Title: SIMULATION OUTPUT ANALYSIS
1SIMULATION OUTPUT ANALYSIS
- Week 8
- Kelton text (Ch. 6) and prepared notes
- GE-703 Kumpaty
2Statistical Analysis of Simulation Output
- Based on inferential and descriptive statistics
- The population is the entire set of items or outcomes. Applied to simulation, it is the entire set of observations or outcomes that a simulation model can produce.
- A sample is a set of one or more unbiased observations drawn from the population. The sample size is the number of observations that have been collected.
3Statistical Analysis of Simulation Output
- Statistical concepts
- Samples are generated by running the experiment.
- The population size is often very large or infinite.
- A random variable is used to express each possible outcome of an experiment as a continuous or discrete number.
- Experimental samples (replications) are independent.
4Simulation Replications
- One run constitutes one replication of the experiment. To obtain a sample of size n, we need to run n independent replications of the experiment.
- Random Number Generation (Ch. 3): Initial Seed Values
- With an adequate collection of independent observations, statistical methods can be applied to estimate the output response.
5Point Estimates
- Population Mean
- Mean of a Distribution
- Continuous case
- $\mu = \int_{-\infty}^{\infty} x \, p(x) \, dx$, where $p(x)$ is the probability density function
- Discrete case
- $\mu = \sum_{i=1}^{n} p_i x_i$, where $p_i$ is the probability of the $i$th outcome
6Point Estimates
- Population Variance
- Variance calculated from a distribution
- Continuous Case
- $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 \, p(x) \, dx$, where $p(x)$ is the probability density function
- Discrete Case
- $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \, p_i$, where $p_i$ is the probability of the $i$th outcome
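The discrete-case formulas for the population mean and variance can be checked with a short sketch. The fair six-sided die and the function names below are illustrative assumptions, not part of the course material.

```python
from fractions import Fraction

def discrete_mean(outcomes, probs):
    """Population mean: sum of p_i * x_i over all outcomes."""
    return sum(p * x for x, p in zip(outcomes, probs))

def discrete_variance(outcomes, probs):
    """Population variance: sum of (x_i - mu)^2 * p_i over all outcomes."""
    mu = discrete_mean(outcomes, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu = discrete_mean(faces, probs)          # 7/2
sigma2 = discrete_variance(faces, probs)  # 35/12
print(mu, sigma2)
```

Exact fractions are used so the result can be compared with a hand calculation without rounding.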
7Estimators for the population
- Sample
- Collection of n observations (outcomes of a process) $X_1, X_2, X_3, \ldots, X_n$
- Observations have randomness that can be modeled by a normal distribution
- Note: n may not be large enough to show that the n observations follow a normal distribution
8Estimators for the population
- Sample Mean
- It is an estimator for the population mean (i.e. the mean value of the probability distribution that is modeling the outcome)
- $\bar{X} = \sum_{i=1}^{n} X_i / n$, where $X_i$ is the $i$th observation
- Sample Variance
- It is an estimator for the population variance from n observations $X_1, X_2, X_3, \ldots, X_n$
- $s^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)$
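As a quick illustration of the two estimators, the sketch below computes the sample mean and the sample variance (n-1 divisor) by hand and compares them with Python's standard `statistics` module. The five observation values are made up for the example.

```python
import statistics

# Hypothetical output values from n = 5 replications.
observations = [12.0, 15.0, 11.0, 14.0, 13.0]
n = len(observations)

# Sample mean: sum of X_i divided by n.
x_bar = sum(observations) / n

# Sample variance: squared deviations from the sample mean, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in observations) / (n - 1)

# The standard library uses the same n - 1 divisor in statistics.variance.
print(x_bar, statistics.mean(observations))   # 13.0 13.0
print(s2, statistics.variance(observations))  # 2.5 2.5
```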
9Theorem
- If a population has a finite mean $\mu$ and finite variance $\sigma^2$, then the distribution of the sample mean $\bar{X}$ approaches a normal distribution with mean $\mu$ and variance $\sigma^2/n$ as the sample size n increases.
- The central limit theorem states that if a variable is defined as the sum of several independent and identically distributed (IID) random values (observations), the variable representing the sum of the observations tends to be normally distributed.
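The theorem can be checked empirically with a small experiment. The sketch below is only an illustration under assumed settings (uniform(0, 1) observations, n = 30, 5000 replications, a fixed seed): it compares the variance of the simulated sample means with the predicted value, the population variance divided by n.

```python
import random
import statistics

def simulate_sample_means(n, replications, seed=12345):
    """Draw `replications` samples of size n from uniform(0, 1); return their means."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    means = []
    for _ in range(replications):
        sample = [rng.random() for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

n = 30
means = simulate_sample_means(n, replications=5000)

observed_mean = statistics.fmean(means)
observed_var = statistics.variance(means)
theoretical_var = (1 / 12) / n  # uniform(0, 1) has mean 0.5 and variance 1/12

print(observed_mean, observed_var, theoretical_var)
```

The observed mean should land near 0.5 and the observed variance near 1/360, even though the individual observations are uniform rather than normal.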
10Properties of $\bar{X}$ and $s^2$
- If $X_1, X_2, X_3, \ldots, X_n$ are samples (observations or outcomes) from a normally distributed population with mean $\mu$ and variance $\sigma^2$, then:
- The sample mean $\bar{X} = \sum_{i=1}^{n} X_i / n$ is normally distributed with mean $\mu$ and variance $\sigma^2/n$
- With $s^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)$, the scaled quantity $(n-1)s^2/\sigma^2$ follows a chi-square distribution with $n-1$ degrees of freedom ($\nu$)
- The quantity $t_\nu = (\bar{X} - \mu)/(s/\sqrt{n})$ follows Student's t distribution with $n-1$ degrees of freedom ($\nu$)
11Range of $\mu$
- The expression for $t_\nu$ can be rewritten as
- $\mu = \bar{X} - t_\nu (s/\sqrt{n})$
- An item of interest is the calculation of the upper and lower bounds of the population mean $\mu$ for a given probability, called the confidence level. A typical confidence level is 95%, i.e. probability $P = 0.95 = 1 - \alpha$, where $\alpha$ is the significance level.
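12t Distribution and Range of $\mu$
[Figure: t distribution curve. P is the area under the curve between the two arrows at $-t_{\nu,(1-P)/2}$ and $+t_{\nu,(1-P)/2}$; each tail has area $(1-P)/2$.]
Range of $\mu$: $\bar{X} - (s/\sqrt{n}) \, t_{\nu,(1-P)/2} \le \mu \le \bar{X} + (s/\sqrt{n}) \, t_{\nu,(1-P)/2}$
$t_{\nu,(1-P)/2}$ can be obtained from a t distribution table.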
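A minimal sketch of the interval calculation, assuming ten hypothetical replication outputs and a critical value read from a t table (2.262 for 9 degrees of freedom at 95% confidence). The function name and data are illustrative only.

```python
import math
import statistics

def t_confidence_interval(observations, t_crit):
    """Return (lower, upper, half_width) of the interval for the population mean."""
    n = len(observations)
    x_bar = statistics.mean(observations)
    s = statistics.stdev(observations)  # sample standard deviation, n - 1 divisor
    hw = t_crit * s / math.sqrt(n)      # half-width of the interval
    return x_bar - hw, x_bar + hw, hw

# Hypothetical outputs from n = 10 independent replications.
outputs = [50.0, 52.0, 48.0, 51.0, 49.0, 53.0, 47.0, 50.0, 52.0, 48.0]

T_9_025 = 2.262  # t table value: 9 degrees of freedom, tail area 0.025

lower, upper, hw = t_confidence_interval(outputs, T_9_025)
print(round(lower, 2), round(upper, 2), round(hw, 2))
```

For this data the sample mean is 50 and s is 2, so the half-width is about 1.43 and the 95% interval is roughly 48.57 to 51.43.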
13Number of Replications (Sample size)
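- Use the relation $t_\nu = (\bar{X} - \mu)/(s/\sqrt{n})$
- Set $\nu = \infty$ and $|\bar{X} - \mu| = \text{Range}/2$ (the half-width hw, or absolute error e); select $\alpha$
- From the t distribution table, determine $t_{\infty,\alpha/2}$ (with $P = 1 - \alpha$, the tail area $(1-P)/2$ equals $\alpha/2$)
- Compute $n = (s \, t_{\infty,\alpha/2} / hw)^2$
- This is the number of replications that will provide an adequate sample size for meeting the desired absolute error (e or hw) at significance level $\alpha$ (confidence level $1 - \alpha$).
- For the relative error version, use $\frac{re}{1+re}\bar{X}$ in place of e.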
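The replication-count formula can be sketched as follows. With infinite degrees of freedom the t value equals the standard normal quantile, so the sketch uses `statistics.NormalDist` from the standard library instead of a table lookup. The values of s, hw, the sample mean, and re are hypothetical pilot-run numbers.

```python
import math
from statistics import NormalDist

def replications_needed(s, hw, alpha=0.05):
    """n = (s * t / hw)^2 with t taken at infinite degrees of freedom (normal quantile)."""
    t_inf = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return math.ceil((s * t_inf / hw) ** 2)      # round up to a whole replication

def replications_needed_relative(s, x_bar, re, alpha=0.05):
    """Relative-error version: use re / (1 + re) * x_bar in place of the absolute error."""
    e = re / (1 + re) * x_bar
    return replications_needed(s, e, alpha)

# Hypothetical pilot-run estimates: s = 2.0, sample mean = 50.0.
print(replications_needed(2.0, hw=0.5))                        # 62
print(replications_needed_relative(2.0, x_bar=50.0, re=0.02))  # 16
```

Because s comes from a pilot sample, the result is an estimate; the confidence interval should be recomputed after the extra replications are run.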
14Sample Calculations
15Statistical Issues with Simulation Output
- Assumptions that must be met regarding the sample of observations used to construct the confidence interval:
- Observations are independent, so that no correlation exists between consecutive observations.
- Observations are identically distributed throughout the entire duration of the process (they are time invariant).
- Observations are normally distributed.
16Terminating Simulations
- A terminating simulation is one in which the simulation starts at a defined state or time and ends when it reaches some other defined state or time.
- 1. Select the initial model state
- 2. Select a terminating event
- 3. Determine the number of replications
- Usually, start with ten replications and then construct a confidence interval for the performance measure of interest. If you are satisfied with the width of the confidence interval, stop and report your results; otherwise, continue running additional replications until the confidence interval is reduced to the desired width.
17Nonterminating Simulations
- A nonterminating simulation is one in which the steady-state (long-term average) behavior of the system is analyzed.
- 1. Determine and eliminate the initial warm-up bias.
- Figure 9.3. Welch moving average (Table 9.4)
- 2. Obtain sample observations.
- By running replications or batch intervals
- Lag-1 autocorrelation (between -0.20 and 0.20)
- 3. Determine run length.
18Nonterminating Simulations
- Lag-1 autocorrelation (between -0.20 and 0.20)
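A minimal sketch of the lag-1 autocorrelation check, assuming the common estimator that divides the sum of products of consecutive deviations by the total sum of squared deviations. The function names and the batch-mean values are hypothetical.

```python
def lag1_autocorrelation(values):
    """Estimate the correlation between consecutive observations."""
    n = len(values)
    mean = sum(values) / n
    # Products of deviations for each pair of neighbours (x_i, x_{i+1}).
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

def looks_independent(values, limit=0.20):
    """True when the lag-1 autocorrelation lies between -limit and +limit."""
    return -limit <= lag1_autocorrelation(values) <= limit

# Hypothetical batch means from one long run.
batch_means = [5.0, 7.0, 6.0, 4.0, 6.0, 8.0, 5.0, 5.0]
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # strongly correlated series

print(lag1_autocorrelation(batch_means), looks_independent(batch_means))
print(lag1_autocorrelation(trending), looks_independent(trending))
```

The first series gives about -0.18, inside the band; the trending series gives 0.5, which would call for longer batch intervals.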
19COMPARING SYSTEMS
20Hypothesis Testing
- Two strategies: identify the one that maximizes production throughput
- Null hypothesis, $H_0$: the value of $\mu_1$ is not significantly different from $\mu_2$ at the $\alpha$ level of significance.
- $H_0$: $\mu_1 = \mu_2$
- $H_1$: $\mu_1 \ne \mu_2$ (alternate hypothesis)
21Errors in Hypothesis Testing
- Type I error
- Rejecting $H_0$ in favor of $H_1$ when in fact $H_0$ is true
- $\alpha$, the level of significance, is the probability of making a Type I error. Typical value: 0.05.
- Type II error
- Failing to reject $H_0$ in favor of $H_1$ when in fact $H_1$ is true
- $\beta$ is the probability of making a Type II error. It increases when $\alpha$ decreases. (Caution: don't make $\alpha$ too small.)
22Confidence Interval method
- Equivalent to conducting a two-tailed test of hypothesis
- $P[(\bar{X}_1 - \bar{X}_2) - hw \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + hw] = 1 - \alpha$
- If the two population means are the same, then $\mu_1 - \mu_2 = 0$, which is the null hypothesis.
- If the confidence interval includes zero, we fail to reject $H_0$. Conclusion: the value of $\mu_1$ is not significantly different from the value of $\mu_2$ at the $\alpha$ level of significance.
23Comparing two alternative system designs
- Recall the assumptions: observations are independent and normally distributed.
- Two common methods for constructing a confidence interval for evaluating hypotheses:
- 1. Welch Confidence Interval method (modified two-sample-t confidence interval)
- 2. Paired-t Confidence Interval method
24Welch C.I. method
- Requires that observations from each population be normally distributed and independent within a population and between populations
- Does not require that the number of samples from one population ($n_1$) equal the number of samples from the other population ($n_2$)
- Does not require that the two populations have equal variances
25Welch C.I. method continued..
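- $P[(\bar{X}_1 - \bar{X}_2) - hw \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + hw] = 1 - \alpha$
- The Welch C.I. is given as above with
- $hw = t_{df,\alpha/2} \sqrt{s_1^2/n_1 + s_2^2/n_2}$
- where $s_1^2$, $s_2^2$ are the sample variances and $n_1$, $n_2$ are the sample sizes
- df is estimated by
- $df \approx \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1-1) + (s_2^2/n_2)^2/(n_2-1)}$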
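A sketch of the Welch calculation using the standard Welch-Satterthwaite expressions (restated in the code comments). The two samples are hypothetical, the estimated df is rounded down before the table lookup (a conservative choice assumed here), and the critical value 2.131 is the t table entry for 15 degrees of freedom at 95% confidence.

```python
import math
import statistics

def welch_terms(sample1, sample2):
    """Return (difference of means, standard error, estimated df)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # s1^2 / n1
    v2 = statistics.variance(sample2) / n2  # s2^2 / n2
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    # df = (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, math.sqrt(v1 + v2), df

# Hypothetical throughput observations; unequal sample sizes are allowed.
strategy1 = [50.0, 52.0, 48.0, 51.0, 49.0, 53.0, 47.0, 50.0, 52.0, 48.0]
strategy2 = [54.0, 56.0, 55.0, 53.0, 57.0, 55.0, 54.0, 56.0]

diff, std_err, df = welch_terms(strategy1, strategy2)
T_15_025 = 2.131  # t table value for floor(df) = 15, tail area 0.025

hw = T_15_025 * std_err
lower, upper = diff - hw, diff + hw
reject_h0 = not (lower <= 0.0 <= upper)  # reject when the interval excludes zero
print(round(df, 2), round(lower, 2), round(upper, 2), reject_h0)
```

Here df comes out near 15.5 and the interval is roughly -6.67 to -3.33, which excludes zero, so the two means would be judged significantly different at the 0.05 level.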
26Paired-t C.I.method
- Requires that observations from each population be normally distributed and independent within a population, but this method does not require that observations between populations be independent
- Does require that the number of samples from one population ($n_1$) equal the number of samples from the other population ($n_2$)
- Does not require that the two populations have equal variances
27Paired-t C.I. method continued..
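- $P[\bar{X}_{(1-2)} - hw \le \mu_{(1-2)} \le \bar{X}_{(1-2)} + hw] = 1 - \alpha$
- Find the sample mean $\bar{X}_{(1-2)}$ and sample standard deviation $s_{(1-2)}$ of the paired sample (the differences between paired sample values)
- The paired-t C.I. is given as above with $hw = t_{n-1,\alpha/2} \, s_{(1-2)} / \sqrt{n}$
- $H_0$: $\mu_{(1-2)} = 0$
- $H_1$: $\mu_{(1-2)} \ne 0$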
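A sketch of the paired-t interval: form the differences between paired observations, then build an ordinary t interval on those differences. The paired data are hypothetical, and 2.262 is the t table value for n - 1 = 9 degrees of freedom at 95% confidence.

```python
import math
import statistics

def paired_t_interval(sample1, sample2, t_crit):
    """Return (mean difference, half-width) for paired observations."""
    if len(sample1) != len(sample2):
        raise ValueError("paired-t requires equal sample sizes")
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)  # sample mean of the differences
    s_d = statistics.stdev(diffs)   # sample S.D. of the differences
    hw = t_crit * s_d / math.sqrt(n)
    return d_bar, hw

# Hypothetical paired results, e.g. replication j of each design run on the same inputs.
design1 = [50.0, 52.0, 48.0, 51.0, 49.0, 53.0, 47.0, 50.0, 52.0, 48.0]
design2 = [53.0, 54.0, 50.0, 55.0, 52.0, 55.0, 50.0, 54.0, 54.0, 51.0]

T_9_025 = 2.262  # t table value: 9 degrees of freedom, tail area 0.025

d_bar, hw = paired_t_interval(design1, design2, T_9_025)
lower, upper = d_bar - hw, d_bar + hw
reject_h0 = not (lower <= 0.0 <= upper)
print(round(d_bar, 2), round(lower, 2), round(upper, 2), reject_h0)
```

The mean difference is -2.8 with a half-width of about 0.56, so the interval excludes zero and H0 is rejected for this made-up data.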
28Comparing more than two alternative system designs
- 1. The Bonferroni Approach (for comparing 3 to 5 designs)
- 2. Analysis of Variance (ANOVA) in conjunction with a multiple comparison test
29Bonferroni Approach
- Construct a series of confidence intervals to compare all system designs to each other (all pairwise comparisons)
- For three designs: (1-2), (2-3), (1-3)
- For K designs, the number of C.I.s is K(K-1)/2
- Significance level of each interval: $\alpha_i = \alpha / (\text{number of C.I.s})$ (the individual $\alpha_i$ need not all be equal, but $\sum \alpha_i = \alpha$)
- $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu$
- $H_1$: $\mu_1 \ne \mu_2$ or $\mu_1 \ne \mu_3$ or $\mu_2 \ne \mu_3$
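The bookkeeping for the Bonferroni approach can be sketched as below, assuming the overall significance level is split equally across the pairwise intervals. The function name and the overall level of 0.06 are illustrative choices.

```python
from itertools import combinations

def bonferroni_plan(k, alpha):
    """List all pairwise comparisons of k designs and the equal per-interval alpha."""
    pairs = list(combinations(range(1, k + 1), 2))  # K(K-1)/2 pairs
    alpha_i = alpha / len(pairs)                    # equal split, so the alpha_i sum to alpha
    return pairs, alpha_i

pairs, alpha_i = bonferroni_plan(k=3, alpha=0.06)
print(pairs)    # [(1, 2), (1, 3), (2, 3)]
print(alpha_i)  # each interval is built at about the 0.02 level

pairs5, _ = bonferroni_plan(k=5, alpha=0.10)
print(len(pairs5))  # 10
```

The count grows quickly with K, and each interval gets wider as its share of the significance level shrinks, which is one reason the approach is suggested only for a handful of designs.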
30Advanced Statistical Models (ANOVA and multiple
comparison test)
- Analysis of Variance (ANOVA) in conjunction with a multiple comparison test
- $H_0$: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_K = \mu$ for K alternative systems
- $H_1$: $\mu_i \ne \mu_j$ for at least one pair $(i, j)$
- Number of factor levels = number of alternative system designs = K
- Number of observations for each factor level = n
- Total number of observations N = nK
31Analysis of Variance (ANOVA test)
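- Analysis of Variance (ANOVA): Single Factor
- F calc is tested against F critical
- Do the Excel sheet
- 3 strategies, 10 observations each
- DFT = K - 1 = 2 (treatments), DFE = N - K = 27 (error)
- Find SS (sum of squares): SSE (error), SST (treatments)
- MSE = SSE/DFE, MST = SST/DFT, Fcalc = MST/MSE
- F critical: from the F table with (DFT, DFE) degrees of freedom at significance level $\alpha$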
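As an alternative to the spreadsheet, the single-factor ANOVA quantities can be computed in a few lines. This is a sketch on made-up data (3 strategies, 10 observations each); SST denotes the treatment (between-strategy) sum of squares, and 3.35 is the F table value for (2, 27) degrees of freedom at the 0.05 level.

```python
def one_way_anova(groups):
    """Return the single-factor ANOVA quantities for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Treatment sum of squares: spread of the group means around the grand mean.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Error sum of squares: spread of observations around their own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    dft, dfe = k - 1, n_total - k
    mst, mse = sst / dft, sse / dfe
    return {"SST": sst, "SSE": sse, "DFT": dft, "DFE": dfe,
            "MST": mst, "MSE": mse, "F": mst / mse}

# Hypothetical throughput data for three strategies.
strategies = [
    [50.0, 52.0, 48.0, 51.0, 49.0, 53.0, 47.0, 50.0, 52.0, 48.0],
    [53.0, 54.0, 50.0, 55.0, 52.0, 55.0, 50.0, 54.0, 54.0, 51.0],
    [51.0, 53.0, 49.0, 52.0, 50.0, 54.0, 48.0, 51.0, 53.0, 49.0],
]

result = one_way_anova(strategies)
F_CRITICAL = 3.35  # F table value for (2, 27) degrees of freedom, alpha = 0.05
reject_h0 = result["F"] > F_CRITICAL
print(result, reject_h0)
```

For this data Fcalc is about 5.15, which exceeds F critical, so at least one strategy would be judged different from the others.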
32Multiple Comparison Test
- Do this after confirming from ANOVA that at least one strategy performs differently than the other strategies.
- Fisher's Least Significant Difference (LSD) Test
- If $|\bar{X}_1 - \bar{X}_2| > LSD(\alpha)$, then $\mu_1$ and $\mu_2$ are significantly different at the $\alpha$ level of significance.
- $LSD(\alpha) = t_{(dfe,\,\alpha/2)} \sqrt{2\,MSE/n}$
- Do the calculations on the Excel sheet
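The LSD comparison can also be sketched in code. The sample means, MSE, and n below are hypothetical values of the kind an ANOVA table would supply, and 2.052 is the t table value for 27 error degrees of freedom with tail area 0.025.

```python
import math
from itertools import combinations

def fisher_lsd(t_crit, mse, n):
    """LSD(alpha) = t(dfe, alpha/2) * sqrt(2 * MSE / n)."""
    return t_crit * math.sqrt(2 * mse / n)

# Hypothetical inputs: three strategy means, MSE from the ANOVA, n per strategy.
means = {1: 50.0, 2: 52.8, 3: 51.0}
mse, n = 3.9111, 10
T_27_025 = 2.052  # t table value: 27 degrees of freedom, tail area 0.025

lsd = fisher_lsd(T_27_025, mse, n)

# A pair differs significantly when the gap between its means exceeds the LSD.
significant = [(i, j) for i, j in combinations(sorted(means), 2)
               if abs(means[i] - means[j]) > lsd]
print(round(lsd, 3), significant)
```

With these numbers the LSD is about 1.815, so only strategies 1 and 2 (gap 2.8) differ significantly; the gap of 1.8 between strategies 2 and 3 falls just short.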