Altiok / Melamed Simulation Modeling and Analysis with Arena

Provided by: elsayeda
Learn more at: https://courses.vcu.edu

1
SIMULATION MODELING AND ANALYSIS WITH ARENA
T. Altiok and B. Melamed
Chapter 7: Input Analysis
2
Input Analysis Activities
  • Input Analysis activities consist of the following stages:
  • Stage 1: data collection
  • Stage 2: data analysis
  • Stage 3: modeling time series data
  • Stage 4: goodness-of-fit testing
  • Random variables with negligible variability are simplified and modeled as deterministic quantities.
  • Unknown distributions are postulated to have a particular functional form that incorporates any available partial information.

3
Data Collection
  • To illustrate data collection activities, consider modeling a painting station, where
  • jobs arrive at random and wait in the buffer until the sprayer is available
  • having been sprayed, they leave the station
  • suppose that the spray nozzle can get clogged, an event that results in a stoppage during which the nozzle is cleaned or replaced
  • suppose further that the measure of interest is the expected job delay in the buffer
  • The data collection activity in this simple case would consist of the following tasks:
  • collection of job inter-arrival times
  • collection of painting times
  • collection of times between nozzle clogging
  • collection of nozzle cleaning/replacement times

4
Data Analysis
  • Data Analysis deals with statistics of empirical
    data
  • statistics related to moments (mean, standard
    deviation, coefficient of variation, etc.)
  • statistics related to distributions (histograms)
  • statistics related to temporal dependence
    (autocorrelations within an empirical time
    series, or cross-correlations among two or more
    distinct time series)
  • For example, consider the sample of 100 repair time observations, of which the following values appear in this transcript:

    12.9  27.7  13.5  13.7  22.2
    20.9  26.6  29.1  22.4  10.7
    30.0  27.4  18.8  25.3  15.0
    17.0  21.7  13.7  15.5  23.2
    11.0  27.5  22.5  27.1  25.2
    10.3  18.0  11.5  14.1  24.0
    10.9  27.0  24.2  25.6  22.4
    21.0  21.3  23.1  15.8  13.2
    22.8  25.9  22.4  13.8  16.6
    10.8  10.3  15.1  19.0  27.9
    20.5  19.4  10.9  24.1  10.9
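
The moment-related and distribution-related statistics listed above can be computed directly. The Python sketch below uses only the 55 values that appear on this slide, so its results describe that subset rather than the full sample of 100. The choice of 10 cells of width 2 on [10, 30] is an assumption borrowed from the Chi-Square example on a later slide.

```python
import statistics

# The 55 repair-time values that appear on the slide
# (the full sample of 100 is not available here).
repair_times = [
    12.9, 27.7, 13.5, 13.7, 22.2,
    20.9, 26.6, 29.1, 22.4, 10.7,
    30.0, 27.4, 18.8, 25.3, 15.0,
    17.0, 21.7, 13.7, 15.5, 23.2,
    11.0, 27.5, 22.5, 27.1, 25.2,
    10.3, 18.0, 11.5, 14.1, 24.0,
    10.9, 27.0, 24.2, 25.6, 22.4,
    21.0, 21.3, 23.1, 15.8, 13.2,
    22.8, 25.9, 22.4, 13.8, 16.6,
    10.8, 10.3, 15.1, 19.0, 27.9,
    20.5, 19.4, 10.9, 24.1, 10.9,
]

n = len(repair_times)
mean = statistics.mean(repair_times)
std_dev = statistics.stdev(repair_times)  # sample standard deviation (n - 1 divisor)
cv = std_dev / mean                       # coefficient of variation


def histogram(data, low, high, cells):
    """Count observations in equal-width cells [low, high); the value
    `high` itself is placed in the last cell."""
    width = (high - low) / cells
    counts = [0] * cells
    for x in data:
        j = min(int((x - low) / width), cells - 1)
        counts[j] += 1
    return counts


cell_counts = histogram(repair_times, 10.0, 30.0, 10)

print(f"n={n}  mean={mean:.2f}  std={std_dev:.2f}  cv={cv:.2f}")
print("cell counts:", cell_counts)
```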

5
Data Analysis Example
  • Data Analysis of the repair time data produced a histogram and summary statistics
    [Figure: histogram and summary statistics of the repair time data; not reproduced in this transcript]

6
Modeling Time Series Data
  • Independent observations are modeled as a renewal time series, namely, a sequence of iid random variables. In this case, the analyst's task is merely to identify (fit) a good distribution and its parameters to the empirical data.
  • Arena provides built-in facilities for fitting distributions to empirical data.
  • Dependent observations are modeled as random processes with temporal dependence. In this case, the analyst's task is to identify (fit) a good probability law to empirical data. This is a far more difficult task than the previous one, and often requires advanced mathematics.
  • Arena does not provide facilities for fitting dependent random processes. An advanced method is described, however, in Chapter 10.
  • Examples:
  • Observed sequences of arrival times to a queue are often modeled as iid exponential inter-arrival times (i.e., Poisson processes)
  • For an observed sequence of times to failure and the corresponding repair times, the associated uptimes may be modeled as a Poisson process, and the downtimes as a renewal process or as a dependent process (e.g., Markov process)
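
To make the first example concrete, here is a minimal Python sketch that generates arrival times from iid exponential inter-arrival times and estimates the lag-1 autocorrelation of the gaps, one simple statistic for judging whether an independence assumption is plausible. The rate, sample size, seed and function names are illustrative assumptions, not values from the slides.

```python
import random
from itertools import accumulate


def poisson_arrival_times(rate, count, seed=None):
    """Arrival times of a Poisson process: cumulative sums of iid
    exponential inter-arrival times with the given rate."""
    rng = random.Random(seed)
    gaps = [rng.expovariate(rate) for _ in range(count)]
    return list(accumulate(gaps)), gaps


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation; values near 0 are consistent
    with (but do not prove) independence."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return cov / var


arrivals, gaps = poisson_arrival_times(rate=0.5, count=1000, seed=42)
rho1 = lag1_autocorrelation(gaps)
print(f"mean gap = {sum(gaps) / len(gaps):.3f}, lag-1 autocorrelation = {rho1:.3f}")
```

For an empirical time series, a lag-1 autocorrelation far from zero would suggest that a renewal model is not adequate and that a dependent process should be considered.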

7
Modeling Empirical Distributions
  • The simplest approach is to construct a histogram from the empirical data (sample), and then normalize it to a step pdf or a pmf, depending on the underlying state space. The obtained pdf or pmf is then declared to be the fitted distribution. The main advantage of this approach is that no assumptions are required on the functional form (shape) of the fitted distribution.
  • The previous approach may reveal (by inspection) that the histogram pdf has a particular functional form (e.g., decreasing, bell shape, etc.). The analyst may then try to obtain a better fit by postulating a particular class of distributions having that shape, and then proceeding to estimate (fit) its parameters from the sample, using such common techniques as the method of moments and the maximum likelihood estimation (MLE) method. This approach can be further generalized to multiple functional forms by searching for the best fit among a number of postulated classes of distributions.
  • The Arena Input Analyzer provides facilities for both fitting approaches.
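
The first approach (normalizing a histogram to a step pdf) can be sketched in a few lines of Python. The ten sample values are a small subset of the repair time data used purely for illustration, and the interval [10, 30] with 5 cells is an assumption.

```python
def step_pdf(sample, low, high, cells):
    """Build a step pdf from a histogram with equal-width cells.

    Each cell height is count / (n * width), so the heights times the
    cell width sum to 1.
    """
    width = (high - low) / cells
    counts = [0] * cells
    for x in sample:
        j = min(int((x - low) / width), cells - 1)  # right edge goes to last cell
        counts[j] += 1
    n = len(sample)
    heights = [c / (n * width) for c in counts]

    def pdf(x):
        if x < low or x > high:
            return 0.0
        return heights[min(int((x - low) / width), cells - 1)]

    return pdf, heights, width


# Illustrative subset of the repair time data
sample = [10.3, 12.9, 13.5, 15.0, 18.8, 21.7, 22.4, 25.3, 27.7, 30.0]
pdf, heights, width = step_pdf(sample, 10.0, 30.0, 5)

print("cell heights:", heights)
print("total probability:", sum(h * width for h in heights))
```

No functional form is assumed here; the fitted density is simply whatever shape the histogram has.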

8
Method of Moments
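  • The method of moments fits the moments of a candidate model to sample moments, using appropriate empirical statistics as constraints on the candidate model parameters.
  • As an example, consider a random variable $X$ and a data sample whose first two moments, $m_1 = E[X]$ and $m_2 = E[X^2]$, are estimated as $\hat{m}_1$ and $\hat{m}_2$.
  • Write the formulas for the mean and variance of a gamma distribution, connecting the first two moments of a gamma distribution with its parameters, $\alpha$ and $\beta$, namely
    $\mu_X = \alpha\beta, \qquad \sigma_X^2 = \alpha\beta^2$
  • Substitute into the above the previous estimates
    $\alpha\beta = \hat{m}_1, \qquad \alpha\beta^2 = \hat{m}_2 - \hat{m}_1^2$
  • Solve the above equations to obtain
    $\hat{\beta} = \dfrac{\hat{m}_2 - \hat{m}_1^2}{\hat{m}_1}, \qquad \hat{\alpha} = \dfrac{\hat{m}_1^2}{\hat{m}_2 - \hat{m}_1^2}$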
9
Maximal-likelihood Estimation (MLE)
  • The Maximal-likelihood Estimation (MLE) method postulates a particular class of distributions (e.g., normal, uniform, exponential, etc.), and then estimates their parameters from the sample, such that the resulting parameters give rise to the maximal likelihood (highest probability or density) of obtaining the sample. More precisely,
  • Let $f(x; \theta)$ be the postulated pdf, as a function of its ordinary argument, $x$, as well as the unknown parameter $\theta$ (possibly a vector of parameters, but here assumed to be a scalar for simplicity)
  • Let $\{x_1, \ldots, x_N\}$ be a sample of independent observations
  • The MLE method estimates $\theta$ as the value $\hat{\theta}$ that maximizes the likelihood function
    $L(x_1, \ldots, x_N; \theta) = \prod_{n=1}^{N} f(x_n; \theta)$
10
MLE Method Examples
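  • For the exponential distribution Expo($\lambda$) with parameter $\lambda$ (pdf $f(x; \lambda) = \lambda e^{-\lambda x}$, $x \ge 0$),
  • the corresponding likelihood function is
    $L(x_1, \ldots, x_N; \lambda) = \lambda^N \exp\left(-\lambda \sum_{n=1}^{N} x_n\right)$
  • the log-likelihood function is
    $\ln L(x_1, \ldots, x_N; \lambda) = N \ln \lambda - \lambda \sum_{n=1}^{N} x_n$
  • the value of $\lambda$ that maximizes $\ln L$ is obtained by differentiating it with respect to $\lambda$ and setting the derivative to zero, that is
    $\dfrac{d}{d\lambda} \ln L = \dfrac{N}{\lambda} - \sum_{n=1}^{N} x_n = 0$
  • solving the above in $\lambda$ yields the maximal likelihood estimate
    $\hat{\lambda} = \dfrac{N}{\sum_{n=1}^{N} x_n} = \dfrac{1}{\bar{x}}$
  • For the uniform distribution Unif(a,b), a similar computation yields the MLE estimates
    $\hat{a} = \min\{x_1, \ldots, x_N\}, \qquad \hat{b} = \max\{x_1, \ldots, x_N\}$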
11
The Arena Input Analyzer
  • The Arena Input Analyzer is a tool that fits a
    distribution to sample data.

Distribution   Arena Name   Arena Parameters
Exponential    EXPO         Mean
Normal         NORM         Mean, StdDev
Triangular     TRIA         Min, Mode, Max
Uniform        UNIF         Min, Max
Erlang         ERLA         ExpoMean, k
Beta           BETA         Beta, Alpha
Gamma          GAMM         Beta, Alpha
Johnson        JOHN         G, D, L, X
Log Normal     LOGN         LogMean, LogStdDev
Poisson        POIS         Mean
Weibull        WEIB         Beta, Alpha
Continuous     CONT         P1, V1, ...
Discrete       DISC         P1, V1, ...
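
For readers who want to reproduce a few of these distributions outside Arena, the sketch below maps some of the table entries onto Python's standard `random` module. It assumes that in the Arena parameter lists Beta is a scale parameter and Alpha a shape parameter; the wrapper names are hypothetical and are not part of Arena or of Python. Note that the argument order differs between the two conventions.

```python
import random


def expo(mean, rng):
    # random.expovariate takes a rate, i.e. 1 / mean
    return rng.expovariate(1.0 / mean)


def norm(mean, std_dev, rng):
    return rng.gauss(mean, std_dev)


def tria(minimum, mode, maximum, rng):
    # random.triangular expects (low, high, mode)
    return rng.triangular(minimum, maximum, mode)


def unif(minimum, maximum, rng):
    return rng.uniform(minimum, maximum)


def gamm(beta, alpha, rng):
    # random.gammavariate expects (shape, scale); assumes Beta = scale, Alpha = shape
    return rng.gammavariate(alpha, beta)


def weib(beta, alpha, rng):
    # random.weibullvariate expects (scale, shape); assumes Beta = scale, Alpha = shape
    return rng.weibullvariate(beta, alpha)


rng = random.Random(1)
print(expo(5.0, rng), tria(2.0, 4.0, 9.0, rng), gamm(2.0, 3.0, rng))
```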
12
Best-fit uniform distribution for the repair
time data
13
Best-fit beta distribution for the repair time
data
14
Best-fit gamma distribution for a sample of lead
time data
15
Fit All Summary for a sample of lead time data
16
Goodness-of-Fit Tests for Distributions
  • Tests of goodness-of-fit for distributions
    determine the likelihood that an empirical
    sample is drawn from a given distribution
  • a statistical hypothesis is formulated
  • a statistic is computed from the empirical data
  • the distribution of the statistic is assumed known under the null hypothesis, allowing the computation of the probability that it exceeds the observed value
  • rejection or acceptance decisions can be taken at
    a given significance level, but these are subject
    to Type I and Type II statistical errors
  • Common goodness-of-fit tests for distributions
  • Chi-Square test
  • Kolmogorov-Smirnov test

17
Chi-Square Test
  • The Chi-Square test compares the empirical histogram density, constructed from sample data, to a candidate theoretical density
  • assume that the empirical sample $\{x_1, \ldots, x_N\}$ is a set of $N$ iid realizations from an underlying (unknown) random variable, $X$
  • this sample is used to construct an empirical histogram with $J$ cells, where cell $j$ corresponds to the interval $[l_j, r_j)$
  • The estimator of the probability of cell $j$ is
    $\hat{p}_j = \dfrac{N_j}{N}, \qquad j = 1, \ldots, J$
  • $N_j$ is the number of observations in cell $j$
  • (it is commonly suggested to keep each $N_j$ reasonably large for statistical reliability; a usual rule of thumb is at least 5 observations per cell)

18
Chi-Square Test (Cont.)
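  • Let $F_X(x)$ be some theoretical candidate distribution of the random variable $X$ whose goodness-of-fit is to be assessed
  • Compute the corresponding theoretical probabilities
    $p_j = \Pr\{l_j \le X < r_j\}, \qquad j = 1, \ldots, J$
  • for continuous data we have
    $p_j = \int_{l_j}^{r_j} f_X(x)\,dx = F_X(r_j) - F_X(l_j)$
  • where $f_X(x)$ is the density of $X$
  • The Chi-Square test statistic is then given by
    $\chi^2 = \sum_{j=1}^{J} \dfrac{(N_j - N p_j)^2}{N p_j}$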
19
Chi-Square Test Example
  • As an example, consider the repair time sample data of size N = 100, given earlier, for which a histogram with J = 10 cells was constructed by the Input Analyzer
  • The table below displays the elements of the
    Chi-Square test for the repair data

Cell Number   Cell Interval   Number of Observations   Relative Frequency   Theoretical Probability
1             [10,12)         13                       0.13                 0.10
2             [12,14)         9                        0.09                 0.10
3             [14,16)         8                        0.08                 0.10
4             [16,18)         9                        0.09                 0.10
5             [18,20)         12                       0.12                 0.10
6             [20,22)         8                        0.08                 0.10
7             [22,24)         13                       0.13                 0.10
8             [24,26)         10                       0.10                 0.10
9             [26,28)         10                       0.10                 0.10
10            [28,30)         8                        0.08                 0.10
20
Chi-Square Test Example (Cont.)
  • The histogram of the repair data suggests that a uniform distribution Unif(a,b) is an acceptably good fit to the sample repair data
  • The parameters of the uniform distribution are estimated as $\hat{a} = 10$ and $\hat{b} = 30$
  • The Chi-Square statistic computation, using the observed counts $N_j$ in the table and the expected count of 10 per cell, yields
    $\chi^2 = \sum_{j=1}^{10} \dfrac{(N_j - 10)^2}{10} = 3.6$
  • A Chi-Square table gives the critical value $c$ for the chosen significance level $\alpha$ and the applicable degrees of freedom
  • Since the test statistic computed above is below the critical value, $\chi^2 = 3.6 < c$, we accept the null hypothesis that the uniform distribution Unif(10,30) is an acceptably good fit to the sample repair data
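
The arithmetic of this example can be reproduced from the table of cell counts. The Python sketch below recomputes the statistic for Unif(10,30). The significance level and degrees of freedom used on the slide are not available here, so the comparison uses standard Chi-Square table values at the 0.05 level for 7, 8 and 9 degrees of freedom as an assumption; the function names are illustrative.

```python
def uniform_cell_probabilities(edges, a, b):
    """Probability of each cell [edges[j], edges[j+1]) under Unif(a, b)."""
    def cdf(x):
        return min(max((x - a) / (b - a), 0.0), 1.0)
    return [cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:])]


def chi_square_statistic(observed, probabilities):
    """Sum over cells of (N_j - N*p_j)^2 / (N*p_j)."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probabilities))


# Observed counts per cell, copied from the table in the example
observed = [13, 9, 8, 9, 12, 8, 13, 10, 10, 8]
edges = [10 + 2 * j for j in range(11)]  # 10, 12, ..., 30
probs = uniform_cell_probabilities(edges, 10.0, 30.0)
chi2 = chi_square_statistic(observed, probs)

# Standard table values of the 0.95 quantile (significance level 0.05).
# Which degrees of freedom apply is an assumption, so several are listed.
CRITICAL_095 = {7: 14.067, 8: 15.507, 9: 16.919}

print(f"chi-square statistic = {chi2:.2f}")
for dof, c in CRITICAL_095.items():
    verdict = "do not reject" if chi2 < c else "reject"
    print(f"dof = {dof}: critical value = {c}, {verdict} the uniform fit")
```

The statistic is well below each of the listed critical values, which agrees with the conclusion that Unif(10,30) is an acceptable fit.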

21
Kolmogorov-Smirnov Test
  • The Kolmogorov-Smirnov (K-S) test compares the empirical cdf to a theoretical counterpart
  • while the Chi-Square test requires a considerable amount of data (at least enough to set up a reasonably smooth histogram), the K-S test can get away with smaller samples, since it does not require a histogram
  • The K-S test procedure proceeds as follows:
  • sort the sample $\{x_1, \ldots, x_N\}$ in ascending order as $\{x_{(1)}, \ldots, x_{(N)}\}$
  • construct the empirical cdf
    $\hat{F}_X(x) = \dfrac{\#\{n : x_{(n)} \le x\}}{N}$
  • construct the K-S test statistic
    $KS = \max_x \left| \hat{F}_X(x) - F_X(x) \right|$
  • The smaller the observed value of KS, the better the fit
22
Multi-Modal Distributions
  • A mode of a distribution is a point at which its associated pdf or pmf attains a maximal value
  • A uni-modal distribution has exactly one mode
  • A multi-modal distribution is one whose associated pdf or pmf is of one of the following forms:
  • It has more than one mode
  • It has only one mode, but it is either not monotone increasing to the left of its mode, or not monotone decreasing to the right of its mode
  • Thus, a multi-modal distribution has a pdf or pmf with multiple humps
  • One approach to Input Analysis of multi-modal
    samples is
  • Separate the sample into mutually exclusive
    uni-modal sub-samples
  • Fit a separate distribution to each sub-sample
  • The fitted models are then combined into a final
    model according to the relative frequency of each
    sub-sample

23
Multi-Modal Distribution Example
  • Consider a sample of $N$ observations such that
  • $N_1$ observations appear to form a uni-modal distribution in an interval $I_1$
  • $N_2$ observations appear to form a uni-modal distribution in an interval $I_2$
  • Suppose that the theoretical distributions, $F_1(x)$ and $F_2(x)$, are fitted separately to the respective sub-samples
  • The combined distribution to be fitted to the entire sample is defined by
    $F_X(x) = \dfrac{N_1}{N} F_1(x) + \dfrac{N_2}{N} F_2(x), \qquad N = N_1 + N_2$
  • The distribution above is a legitimate distribution, formed as a probabilistic mixture of the two distributions, $F_1(x)$ and $F_2(x)$
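
A probabilistic mixture of two fitted distributions, weighted by the relative frequency of each sub-sample, can be sketched in Python as below. The sub-sample sizes (60 and 40) and the two uniform components are hypothetical; any pair of fitted distributions could be substituted.

```python
import random


def uniform_cdf(a, b):
    """Return the cdf of Unif(a, b) as a function."""
    def cdf(x):
        return min(max((x - a) / (b - a), 0.0), 1.0)
    return cdf


def mixture_cdf(cdf1, cdf2, n1, n2):
    """Mixture cdf with weights n1/(n1+n2) and n2/(n1+n2)."""
    w1 = n1 / (n1 + n2)
    w2 = n2 / (n1 + n2)
    return lambda x: w1 * cdf1(x) + w2 * cdf2(x)


def sample_mixture(draw1, draw2, n1, n2, rng):
    """Pick component 1 with probability n1/(n1+n2), else component 2,
    then draw one value from the chosen component."""
    if rng.random() < n1 / (n1 + n2):
        return draw1(rng)
    return draw2(rng)


# Hypothetical fit: 60 observations on [0, 4] and 40 observations on [6, 10]
n1, n2 = 60, 40
F = mixture_cdf(uniform_cdf(0.0, 4.0), uniform_cdf(6.0, 10.0), n1, n2)

rng = random.Random(7)
draws = [
    sample_mixture(lambda r: r.uniform(0.0, 4.0), lambda r: r.uniform(6.0, 10.0), n1, n2, rng)
    for _ in range(1000)
]
share_low = sum(1 for x in draws if x <= 4.0) / len(draws)
print(f"F(5) = {F(5.0):.2f}, share of draws in [0, 4] = {share_low:.2f}")
```

Because the weights are non-negative and sum to 1, the mixture cdf rises from 0 to 1 and is therefore a legitimate distribution.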