Altiok / Melamed Simulation Modeling and Analysis with Arena - PowerPoint PPT Presentation

Provided by: elsayeda

Learn more at: https://courses.vcu.edu

Transcript and Presenter's Notes

1
SIMULATION MODELING AND ANALYSIS WITH ARENA
T. Altiok and B. Melamed
Chapter 7: Input Analysis
2
Input Analysis Activities

Input Analysis activities consist of the following stages:
  Stage 1: data collection
  Stage 2: data analysis
  Stage 3: modeling time series data
  Stage 4: goodness-of-fit testing
Random variables with negligible variability are simplified and modeled as deterministic quantities.
Unknown distributions are postulated to have a particular functional form that incorporates any available partial information.

3
Data Collection

To illustrate data collection activities, consider modeling a painting station, where
  jobs arrive at random and wait in the buffer until the sprayer is available
  having been sprayed, they leave the station
Suppose that the spray nozzle can get clogged, an event that results in a stoppage during which the nozzle is cleaned or replaced.
Suppose further that the measure of interest is the expected job delay in the buffer.
The data collection activity in this simple case would consist of the following tasks:
  collection of job inter-arrival times
  collection of painting times
  collection of times between nozzle clogging events
  collection of nozzle cleaning/replacement times

4
Data Analysis

Data Analysis deals with statistics of empirical data:
  statistics related to moments (mean, standard deviation, coefficient of variation, etc.)
  statistics related to distributions (histograms)
  statistics related to temporal dependence (autocorrelations within an empirical time series, or cross-correlations among two or more distinct time series)
For example, consider the sample of 100 repair time observations:
12.9 27.7 13.5 13.7 22.2
20.9 26.6 29.1 22.4 10.7
30.0 27.4 18.8 25.3 15.0
17.0 21.7 13.7 15.5 23.2
11.0 27.5 22.5 27.1 25.2
10.3 18.0 11.5 14.1 24.0
10.9 27.0 24.2 25.6 22.4
21.0 21.3 23.1 15.8 13.2
22.8 25.9 22.4 13.8 16.6
10.8 10.3 15.1 19.0 27.9
20.5 19.4 10.9 24.1 10.9
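
The three kinds of statistics listed above can be sketched in a few lines of Python. This is only an illustrative sketch: it uses the 55 values that are visible in this transcript (not the full sample of 100), the function names are hypothetical, and the cell layout (10 equal cells on [10, 30]) is an assumption. The order of the values here need not be the order in which they were observed, so the lag-1 autocorrelation is shown purely to illustrate the computation.

```python
import math

# Repair-time values as they appear in this transcript (55 values).
repair_times = [
    12.9, 27.7, 13.5, 13.7, 22.2,
    20.9, 26.6, 29.1, 22.4, 10.7,
    30.0, 27.4, 18.8, 25.3, 15.0,
    17.0, 21.7, 13.7, 15.5, 23.2,
    11.0, 27.5, 22.5, 27.1, 25.2,
    10.3, 18.0, 11.5, 14.1, 24.0,
    10.9, 27.0, 24.2, 25.6, 22.4,
    21.0, 21.3, 23.1, 15.8, 13.2,
    22.8, 25.9, 22.4, 13.8, 16.6,
    10.8, 10.3, 15.1, 19.0, 27.9,
    20.5, 19.4, 10.9, 24.1, 10.9,
]

def summary_statistics(xs):
    """Moment-related statistics: mean, standard deviation, coefficient of variation."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)  # unbiased sample variance
    std = math.sqrt(variance)
    return {"n": n, "mean": mean, "std": std, "cv": std / mean}

def histogram_counts(xs, low, high, cells):
    """Distribution-related statistic: counts in equal-width cells covering [low, high]."""
    width = (high - low) / cells
    counts = [0] * cells
    for x in xs:
        j = min(int((x - low) / width), cells - 1)  # x == high goes to the last cell
        counts[j] += 1
    return counts

def autocorrelation(xs, lag):
    """Temporal-dependence statistic: sample autocorrelation at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    numer = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return numer / denom

stats = summary_statistics(repair_times)
counts = histogram_counts(repair_times, 10.0, 30.0, 10)
rho1 = autocorrelation(repair_times, 1)
print(stats, counts, rho1)
```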

5
Data Analysis Example

Data analysis of the repair time data produced the histogram and summary statistics shown below.

6
Modeling Time Series Data

Independent observations are modeled as a renewal time series, namely, a sequence of iid random variables. In this case, the analyst's task is merely to identify (fit) a good distribution and its parameters to the empirical data.
  Arena provides built-in facilities for fitting distributions to empirical data.
Dependent observations are modeled as random processes with temporal dependence. In this case, the analyst's task is to identify (fit) a good probability law to the empirical data. This is a far more difficult task than the previous one, and often requires advanced mathematics.
  Arena does not provide facilities for fitting dependent random processes.
  An advanced method is described, however, in Chapter 10.
Examples:
  Observed sequences of arrival times to a queue are often modeled as iid exponential inter-arrival times (i.e., Poisson processes).
  For an observed sequence of times to failure and the corresponding repair times, the associated uptimes may be modeled as a Poisson process, and the downtimes as a renewal process or as a dependent process (e.g., a Markov process).
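
To make the first example concrete, the following Python sketch builds arrival times by accumulating iid exponential inter-arrival times, which yields a Poisson arrival process. The function name, the rate of 2.0 arrivals per time unit, the horizon and the seed are all hypothetical choices made for illustration.

```python
import random

def poisson_arrival_times(rate, horizon, rng):
    """Arrival times on [0, horizon] obtained by summing iid exponential gaps."""
    times = []
    t = rng.expovariate(rate)      # first inter-arrival time, mean 1/rate
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)  # next independent exponential gap
    return times

rng = random.Random(42)             # fixed seed so the run is repeatable
arrivals = poisson_arrival_times(rate=2.0, horizon=100.0, rng=rng)
gaps = [b - a for a, b in zip([0.0] + arrivals[:-1], arrivals)]
print(len(arrivals), sum(gaps) / len(gaps))  # count and mean gap (about 1/rate)
```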

7
Modeling Empirical Distributions

The simplest approach is to construct a histogram from the empirical data (sample), and then normalize it to a step pdf or a pmf, depending on the underlying state space. The obtained pdf or pmf is then declared to be the fitted distribution.
  The main advantage of this approach is that no assumptions are required on the functional form (shape) of the fitted distribution.
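
As a rough sketch of this histogram approach, the Python code below normalizes cell counts so that the resulting step function integrates to one. The sample consists of the first ten repair times listed earlier in this transcript; the helper names and the choice of five equal cells on [10, 30] are assumptions made for the example.

```python
def step_pdf(sample, low, high, cells):
    """Return (edges, heights) of a histogram normalized to a step pdf."""
    width = (high - low) / cells
    counts = [0] * cells
    for x in sample:
        if not (low <= x <= high):
            raise ValueError("observation outside the histogram range")
        j = min(int((x - low) / width), cells - 1)  # x == high goes to the last cell
        counts[j] += 1
    n = len(sample)
    heights = [c / (n * width) for c in counts]      # relative frequency / cell width
    edges = [low + j * width for j in range(cells + 1)]
    return edges, heights

def pdf_value(x, edges, heights):
    """Evaluate the fitted step pdf at x (zero outside the histogram range)."""
    for j, h in enumerate(heights):
        if edges[j] <= x < edges[j + 1]:
            return h
    return heights[-1] if x == edges[-1] else 0.0

sample = [12.9, 27.7, 13.5, 13.7, 22.2, 20.9, 26.6, 29.1, 22.4, 10.7]
edges, heights = step_pdf(sample, 10.0, 30.0, 5)
area = sum(h * (edges[j + 1] - edges[j]) for j, h in enumerate(heights))
print(heights, area)
```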
The histogram approach may reveal (by inspection) that the histogram pdf has a particular functional form (e.g., decreasing, bell-shaped, etc.). The analyst may then try to obtain a better fit by postulating a particular class of distributions having that shape, and then proceeding to estimate (fit) its parameters from the sample, using such common techniques as the method of moments and the maximum likelihood estimation (MLE) method. This approach can be further generalized to multiple functional forms by searching for the best fit among a number of postulated classes of distributions.
The Arena Input Analyzer provides facilities for both fitting approaches.

8
Method of Moments

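The method of moments fits the moments of a candidate model to sample moments, using appropriate empirical statistics as constraints on the candidate model parameters.
As an example, consider a random variable $X$ and a data sample whose first two moments, $m_1 = E[X]$ and $m_2 = E[X^2]$, are estimated as $\hat{m}_1$ and $\hat{m}_2$.
  Write the formulas for the mean and variance of a gamma distribution, connecting the first two moments of a gamma distribution with its parameters, $\alpha$ (shape) and $\beta$ (scale), namely
    $\alpha\beta = m_1, \qquad \alpha\beta^2 = m_2 - m_1^2$
  Substitute into the above the previous estimates:
    $\alpha\beta = \hat{m}_1, \qquad \alpha\beta^2 = \hat{m}_2 - \hat{m}_1^2$
  Solve the above equations to obtain
    $\hat{\beta} = \dfrac{\hat{m}_2 - \hat{m}_1^2}{\hat{m}_1}, \qquad \hat{\alpha} = \dfrac{\hat{m}_1^2}{\hat{m}_2 - \hat{m}_1^2}$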
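
A minimal Python sketch of this gamma fit follows, assuming the shape/scale parameterization in which the mean is alpha*beta and the variance is alpha*beta^2. The function names and the moment values 6.0 and 48.0 are hypothetical; they are not the estimates used on the original slide.

```python
def sample_moments(xs):
    """First two sample moments: the average of x and the average of x squared."""
    n = len(xs)
    return sum(xs) / n, sum(x * x for x in xs) / n

def fit_gamma_by_moments(m1, m2):
    """Solve alpha*beta = m1 and alpha*beta**2 = m2 - m1**2 for (alpha, beta)."""
    variance = m2 - m1 ** 2
    if m1 <= 0 or variance <= 0:
        raise ValueError("moments are not compatible with a gamma distribution")
    beta = variance / m1         # scale parameter
    alpha = m1 ** 2 / variance   # shape parameter
    return alpha, beta

# Hypothetical moment estimates, chosen so the arithmetic is easy to follow.
alpha_hat, beta_hat = fit_gamma_by_moments(6.0, 48.0)
print(alpha_hat, beta_hat)  # the variance is 48 - 36 = 12, so alpha = 3 and beta = 2
```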

9
Maximal-likelihood Estimation (MLE)

The Maximal-likelihood Estimation (MLE) method postulates a particular class of distributions (e.g., normal, uniform, exponential, etc.), and then estimates their parameters from the sample, such that the resulting parameters give rise to the maximal likelihood (highest probability or density) of obtaining the sample. More precisely,
  Let $f(x; \theta)$ be the postulated pdf, as a function of its ordinary argument, $x$, as well as the unknown parameter $\theta$ (possibly a vector of parameters, but here assumed to be a scalar for simplicity).
  Let $(x_1, \ldots, x_N)$ be a sample of independent observations.
  The MLE method estimates $\theta$ via the likelihood function
    $L(x_1, \ldots, x_N; \theta) = \prod_{i=1}^{N} f(x_i; \theta)$
  by choosing the estimate $\hat{\theta}$ that maximizes it.

10
MLE Method Examples

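For the exponential distribution Expo($\lambda$) with rate parameter $\lambda$ (pdf $f(x; \lambda) = \lambda e^{-\lambda x}$),
  the corresponding likelihood function is
    $L(x_1, \ldots, x_N; \lambda) = \lambda^N \exp\left(-\lambda \sum_{i=1}^{N} x_i\right)$
  the log-likelihood function is
    $\ln L(x_1, \ldots, x_N; \lambda) = N \ln \lambda - \lambda \sum_{i=1}^{N} x_i$
  the value of $\lambda$ that maximizes $\ln L(x_1, \ldots, x_N; \lambda)$ is obtained by differentiating it with respect to $\lambda$ and setting the derivative to zero, that is
    $\dfrac{N}{\lambda} - \sum_{i=1}^{N} x_i = 0$
  solving the above for $\lambda$ yields the maximal likelihood estimate
    $\hat{\lambda} = \dfrac{N}{\sum_{i=1}^{N} x_i} = \dfrac{1}{\bar{x}}$
For the uniform distribution Unif(a,b), a similar computation yields the maximal likelihood estimates
  $\hat{a} = \min\{x_1, \ldots, x_N\}, \qquad \hat{b} = \max\{x_1, \ldots, x_N\}$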
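
The two estimates can be checked numerically with a short Python sketch. It assumes the exponential distribution is parameterized by its rate (Arena's EXPO takes the mean, which would be the reciprocal of the rate estimate); the sample values and function names are hypothetical.

```python
import math

def expo_log_likelihood(lam, xs):
    """Log-likelihood N*ln(lam) - lam*sum(xs) of an exponential sample with rate lam."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def expo_mle_rate(xs):
    """Closed-form maximizer of the log-likelihood: N / sum(xs)."""
    return len(xs) / sum(xs)

def uniform_mle(xs):
    """MLE of Unif(a, b): the smallest and largest observations."""
    return min(xs), max(xs)

sample = [2.0, 0.5, 1.5, 4.0]      # hypothetical observations, sum = 8
lam_hat = expo_mle_rate(sample)    # 4 / 8 = 0.5
a_hat, b_hat = uniform_mle(sample)

# The closed-form estimate should not be beaten by nearby rates.
best = expo_log_likelihood(lam_hat, sample)
neighbours = [expo_log_likelihood(lam, sample) for lam in (0.4, 0.45, 0.55, 0.6)]
print(lam_hat, a_hat, b_hat, best >= max(neighbours))
```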

11
The Arena Input Analyzer

The Arena Input Analyzer is a tool that fits a distribution to sample data.

Distribution   Arena Name   Arena Parameters
Exponential    EXPO         Mean
Normal         NORM         Mean, StdDev
Triangular     TRIA         Min, Mode, Max
Uniform        UNIF         Min, Max
Erlang         ERLA         ExpoMean, k
Beta           BETA         Beta, Alpha
Gamma          GAMM         Beta, Alpha
Johnson        JOHN         G, D, L, X
Log Normal     LOGN         LogMean, LogStdDev
Poisson        POIS         Mean
Weibull        WEIB         Beta, Alpha
Continuous     CONT         P1, V1, ...
Discrete       DISC         P1, V1, ...
12
Best-fit uniform distribution for the repair time data
13
Best-fit beta distribution for the repair time data
14
Best-fit gamma distribution for a sample of lead time data
15
Fit All Summary for a sample of lead time data
Goodness-of-Fit Tests for Distributions

Tests of goodness-of-fit for distributions assess the likelihood that an empirical sample is drawn from a given distribution:
  a statistical hypothesis is formulated
  a statistic is computed from the empirical data
  the distribution of the statistic is assumed known under the null hypothesis, allowing the computation of the probability that it exceeds the observed value
  rejection or acceptance decisions can be taken at a given significance level, but these are subject to Type I and Type II statistical errors
Common goodness-of-fit tests for distributions:
  Chi-Square test
  Kolmogorov-Smirnov test

17
Chi-Square Test

The Chi-Square test compares the empirical histogram density, constructed from sample data, to a candidate theoretical density.
  Assume that the empirical sample $\{x_1, \ldots, x_N\}$ is a set of $N$ iid realizations from an underlying (unknown) random variable, $X$.
  This sample is used to construct an empirical histogram with $J$ cells, where cell $j$ corresponds to the interval $[l_j, r_j)$.
  The estimator of the probability $p_j$ of cell $j$ is $\hat{p}_j = N_j / N$, where $N_j$ is the number of observations in cell $j$ (it is commonly suggested to take $N_j \geq 5$ for statistical reliability).

18
Chi-Square Test (Cont.)

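Let $F_X(x)$ be some theoretical candidate distribution of the random variable $X$ whose goodness-of-fit is to be assessed.
  Compute the corresponding theoretical probabilities $p_j = P\{X \in [l_j, r_j)\}$, $j = 1, \ldots, J$, where $[l_j, r_j)$ is the interval of histogram cell $j$.
  For continuous data we have $p_j = \int_{l_j}^{r_j} f_X(x)\,dx = F_X(r_j) - F_X(l_j)$, where $f_X(x)$ is the density of $F_X(x)$.
  The Chi-Square test statistic is then given by
    $\chi^2 = \sum_{j=1}^{J} \dfrac{(N_j - N p_j)^2}{N p_j}$
  where $N_j$ is the number of observations in cell $j$ and $N$ is the sample size.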
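
The steps above (cell counts, theoretical cell probabilities from a cdf, and the statistic) can be sketched in Python as follows. The helper names, the candidate distribution Unif(0,1), the cell edges and the four-point sample are all hypothetical and kept tiny so the arithmetic can be checked by hand; a real test would need far more observations per cell.

```python
def cell_counts(sample, edges):
    """Number of observations in each cell [edges[j], edges[j+1])."""
    counts = [0] * (len(edges) - 1)
    for x in sample:
        for j in range(len(counts)):
            is_last = j == len(counts) - 1
            # the last cell also includes its right end point
            if edges[j] <= x < edges[j + 1] or (is_last and x == edges[-1]):
                counts[j] += 1
                break
    return counts

def cell_probabilities(cdf, edges):
    """Theoretical probability of each cell: F(r_j) - F(l_j)."""
    return [cdf(edges[j + 1]) - cdf(edges[j]) for j in range(len(edges) - 1)]

def chi_square_statistic(counts, probs):
    """Sum over cells of (N_j - N*p_j)**2 / (N*p_j)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))

def uniform01_cdf(x):
    """cdf of the hypothetical candidate distribution Unif(0, 1)."""
    return min(max(x, 0.0), 1.0)

sample = [0.1, 0.2, 0.3, 0.6]   # hypothetical data
edges = [0.0, 0.5, 1.0]         # two cells
counts = cell_counts(sample, edges)              # [3, 1]
probs = cell_probabilities(uniform01_cdf, edges)  # [0.5, 0.5]
chi2 = chi_square_statistic(counts, probs)        # (3-2)^2/2 + (1-2)^2/2 = 1.0
print(counts, probs, chi2)
```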

19
Chi-Square Test Example

As an example, consider the repair time sample data of size N = 100, given earlier, for which a histogram with J = 10 cells was constructed by the Input Analyzer.
The table below displays the elements of the Chi-Square test for the repair data.

Cell Number   Cell Interval   Number of Observations   Relative Frequency   Theoretical Probability
1             [10,12)         13                       0.13                 0.10
2             [12,14)          9                       0.09                 0.10
3             [14,16)          8                       0.08                 0.10
4             [16,18)          9                       0.09                 0.10
5             [18,20)         12                       0.12                 0.10
6             [20,22)          8                       0.08                 0.10
7             [22,24)         13                       0.13                 0.10
8             [24,26)         10                       0.10                 0.10
9             [26,28)         10                       0.10                 0.10
10            [28,30)          8                       0.08                 0.10
20
Chi-Square Test Example (Cont.)

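The histogram of the repair data suggests that a uniform distribution Unif(a,b) is an acceptably good fit to the sample repair data.
  The parameters of the uniform distribution are estimated as $\hat{a} = 10$ and $\hat{b} = 30$.
  With the observation counts $N_j$ and the theoretical probabilities $p_j = 0.10$ from the table of the repair data, and $N = 100$, the Chi-Square statistic computation yields
    $\chi^2 = \sum_{j=1}^{10} \dfrac{(N_j - N p_j)^2}{N p_j} = \dfrac{36}{10} = 3.6$
  A Chi-Square table gives the critical value $c$ for the chosen significance level $\alpha$ and the corresponding number of degrees of freedom.
  Since the test statistic computed above does not exceed the critical value $c$, we accept the null hypothesis that the uniform distribution Unif(10,30) is an acceptably good fit to the sample repair data.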
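
The statistic for this example can be recomputed directly from the observation counts and theoretical probabilities in the table of the repair data. The Python sketch below does only that arithmetic; the helper `is_accepted` is a hypothetical name, and the critical value must still be looked up in a Chi-Square table for the chosen significance level and degrees of freedom, so no value is assumed here.

```python
# Observation counts and theoretical cell probabilities from the repair-data table.
counts = [13, 9, 8, 9, 12, 8, 13, 10, 10, 8]
probs = [0.10] * 10

n = sum(counts)  # sample size, 100
terms = [(c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs)]
chi2 = sum(terms)  # squared deviations 9+1+4+1+4+4+9+0+0+4 = 36, divided by 10

def is_accepted(statistic, critical_value):
    """Accept the null hypothesis when the statistic does not exceed the critical value."""
    return statistic <= critical_value

print(n, round(chi2, 6))
```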

21
Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov (K-S) test compares the empirical cdf to a theoretical counterpart.
  While the Chi-Square test requires a considerable amount of data (at least enough to set up a reasonably smooth histogram), the K-S test can get away with smaller samples, since it does not require a histogram.
The K-S test procedure proceeds as follows:
  sort the sample $\{x_1, \ldots, x_N\}$ in ascending order as $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(N)}$
  construct the empirical cdf
    $\hat{F}_X(x) = \dfrac{1}{N} \max\{j : x_{(j)} \leq x\}$
  construct the K-S test statistic
    $KS = \max_x \left| \hat{F}_X(x) - F_X(x) \right|$
The smaller the observed value of KS, the better the fit.
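
A minimal Python sketch of the K-S procedure follows. Because the empirical cdf is a step function, the largest gap to a continuous candidate cdf occurs at a sample point, either just before or just after the jump, so it suffices to check both sides of each jump. The three-point sample and the Unif(0,1) candidate are hypothetical, chosen so the result can be verified by hand.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical cdf and a continuous candidate cdf."""
    xs = sorted(sample)                 # step 1: ascending order
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # empirical cdf equals i/n at x and (i-1)/n just below x
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform_cdf(x, a, b):
    """cdf of Unif(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

sample = [0.75, 0.25, 0.5]  # hypothetical observations
ks = ks_statistic(sample, lambda x: uniform_cdf(x, 0.0, 1.0))
print(ks)  # gaps at 0.25, 0.5, 0.75 are 0.25, 1/6, 0.25, so KS = 0.25
```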

22
Multi-Modal Distributions

A mode of a distribution is that value of the argument of its associated pdf or pmf at which the respective function attains a maximal value.
  A uni-modal distribution has exactly one mode.
A multi-modal distribution is one whose associated pdf or pmf satisfies one of the following conditions:
  It has more than one mode.
  It has only one mode, but it is either not monotone increasing to the left of its mode, or not monotone decreasing to the right of its mode.
Thus, a multi-modal distribution has a pdf or pmf with multiple humps.
One approach to Input Analysis of multi-modal samples is:
  Separate the sample into mutually exclusive uni-modal sub-samples.
  Fit a separate distribution to each sub-sample.
  Combine the fitted models into a final model according to the relative frequency of each sub-sample.

23
Multi-Modal Distribution Example

Consider a sample of $N = N_1 + N_2$ observations such that
  $N_1$ observations appear to form a uni-modal distribution in an interval $I_1$
  $N_2$ observations appear to form a uni-modal distribution in an interval $I_2$
Suppose that the theoretical distributions, $F_1(x)$ and $F_2(x)$, are fitted separately to the respective sub-samples.
The combined distribution to be fitted to the entire sample is defined by
  $F_X(x) = \dfrac{N_1}{N_1 + N_2} F_1(x) + \dfrac{N_2}{N_1 + N_2} F_2(x)$
The distribution above is a legitimate distribution, formed as a probabilistic mixture of the two distributions, $F_1(x)$ and $F_2(x)$.
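
The mixture construction can be sketched in Python as below: the combined cdf weights each fitted cdf by the relative size of its sub-sample, and a draw from the mixture first picks a component with those same weights. The sub-sample sizes (30 and 70) and the two fitted distributions, Unif(0,10) and Unif(20,30), are hypothetical stand-ins for whatever distributions are actually fitted.

```python
import random

def uniform_cdf(x, a, b):
    """cdf of Unif(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def mixture_cdf(x, n1, cdf1, n2, cdf2):
    """Mixture cdf with weights n1/(n1+n2) and n2/(n1+n2)."""
    n = n1 + n2
    return (n1 / n) * cdf1(x) + (n2 / n) * cdf2(x)

def draw_from_mixture(n1, draw1, n2, draw2, rng):
    """Pick a component by relative frequency, then sample from it."""
    if rng.random() < n1 / (n1 + n2):
        return draw1(rng)
    return draw2(rng)

# Hypothetical sub-sample sizes and fitted component distributions.
N1, N2 = 30, 70
cdf1 = lambda x: uniform_cdf(x, 0.0, 10.0)
cdf2 = lambda x: uniform_cdf(x, 20.0, 30.0)

rng = random.Random(7)
draws = [
    draw_from_mixture(N1, lambda r: r.uniform(0.0, 10.0),
                      N2, lambda r: r.uniform(20.0, 30.0), rng)
    for _ in range(1000)
]
print(mixture_cdf(15.0, N1, cdf1, N2, cdf2))  # all of component 1, none of component 2
```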