Title: Agent-Based Enterprise Simulation Validation Plan
1. Agent-Based Enterprise Simulation Validation Plan
Eighth Annual Navy Workforce Conference, May 5-7, 2008
John Schmid (CSC), John Sauter (New Vectors/TechTeam), Sanjay Nayar (CSC), Rick Loffredo (CSC), Dr. Colin Osterman (NPRST), Rodney Myers (NPRST), Kimberly Crayton (NPRST)
2. Agenda
- Atypical Validation Challenges
- Methodology
- Quantitative/Qualitative Methods
- Determine Number of Simulation Runs
- Acquiring Subject Matter Expert (SME) Inputs
- Set Up and Execution of Functional Area (FA) Simulation Experiments
- Set Up/Execution of the Validation Experiment with Historical Data
- Statistical Procedures for Validating with Historical Data
- Conclusions
3. Atypical Validation Challenges
- Agent-Based Enterprise Simulation to be validated is a Navy MPTE prototype workforce analysis model, such as COMPASS
- Scope and complexity of the system and its subsystems (functional areas)
- No existing version of the system
4. Quantitative/Qualitative Methods
- Objectives
- Do simulation predictions reasonably compare to SME expectations?
- Do simulation predictions compare to historical observations?
- Phased Validation
- Qualitative validation of functional area simulation predictions
- Qualitative validation of system simulation predictions
- Quantitative validation of system simulation predictions
5. Prerequisites to Running the Validation Experiments
- Sensitivity evaluation of the model to random seed effects
- 1) Ensure that the model produces identical results when run with the same random number seed
- 2) Determine the number of model replications required for statistical significance
- 3) Determine how the variance of the output responses changes at design points to guide the choice of input factors and output responses
6. 1) Ensure identical results when run with the same random number seed
- Test Setup
- Two design points, each with a different random seed
- A high and low value for each of three different input parameters
- Test Execution
- Run with the same random number seed five times for each design point and collect the output metrics for analysis
- Test Analysis
- Review the metric outcomes
- Are results identical and unaffected by input parameter values?
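A minimal sketch of this reproducibility check, assuming a hypothetical run_simulation(design_point, seed) entry point that returns a dict of output metrics (the slides do not show the actual COMPASS interface):

def check_seed_reproducibility(run_simulation, design_points, seed, repeats=5):
    # For each design point, run the model several times with the same seed
    # and confirm every replication produces identical output metrics.
    for point in design_points:
        runs = [run_simulation(point, seed=seed) for _ in range(repeats)]
        if any(metrics != runs[0] for metrics in runs[1:]):
            return False  # any difference means the model is not seed-deterministic
    return True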
7. 2) Determine Required Number of Simulation Runs: Background
- Not possible to determine a priori the number of runs required for the results to be statistically significant
- The required number of runs depends on the desired confidence in the estimate of the response mean and the variance of the response
- Calculate the sample mean of a response by running the experiment n times and collecting the output metric, Xi, for each run
- If the Xi are independent and identically normally distributed, the actual mean, µ, will fall within the interval I with probability α
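The sample-mean and interval formulas referenced here are not legible in this copy; a standard reconstruction consistent with the definitions above (an assumption about the original slide) is

\bar{X}(n) = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad I = \bar{X}(n) \pm t_{\alpha, n-1} \sqrt{\frac{S^2(n)}{n}}

where t(α, n-1) and S²(n) are defined on the next slide.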
8. 2) Determine Required Number of Simulation Runs: Background
- σ², the true variance of the response, must be estimated by S²(n), the sample variance
- t(α, n-1) is the upper critical value of the t distribution
- If the desired relative error in the sample mean is γ or less, the number of runs, n, will be chosen such that
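The criterion itself is missing from this copy; in the standard form (an assumption, following the usual relative-error rule for simulation output analysis) it reads

S^2(n) = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X}(n) \right)^2, \qquad \frac{t_{\alpha, n-1} \sqrt{S^2(n)/n}}{\left| \bar{X}(n) \right|} \le \gamma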
9. 2) Determine Required Number of Simulation Runs: Background
- Since the experiment must be run a few times before the sample mean X(n) and sample variance S²(n) can be calculated, an initial number of runs, n0, will be chosen
- The sample mean and variance will be used to estimate an appropriate number of runs
- The required number of runs will be approximately
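The approximation is not legible in this copy; a standard form consistent with the quantities defined above (an assumption about the original) is

n \approx \left\lceil \left( \frac{t_{\alpha, n_0 - 1} \, S(n_0)}{\gamma \, \bar{X}(n_0)} \right)^2 \right\rceil

where S(n0) and the sample mean are computed from the initial n0 runs.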
10. 2) Determine Required Number of Simulation Runs: Experiment
- Test Setup
- The first value of n must be an estimate, say 5
- Test Execution
- Run the experiment the given number of times determined above with different random number seeds and collect the sample mean and variance for the metrics
- Test Analysis
- Given the data set with these points, compute and adjust the number of runs accordingly
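A minimal sketch of this procedure, reusing the hypothetical run_simulation() interface from above and the t-based relative-error estimate; the metric argument and default values are illustrative:

import math
from statistics import mean, stdev
from scipy.stats import t

def estimate_required_runs(run_simulation, design_point, metric, n0=5, alpha=0.05, rel_error=0.10):
    # Pilot runs with different seeds give an initial sample mean and standard deviation.
    values = [run_simulation(design_point, seed=s)[metric] for s in range(n0)]
    xbar, s = mean(values), stdev(values)
    # Approximate number of runs needed for the desired relative error in the sample mean.
    t_crit = t.ppf(1 - alpha / 2, n0 - 1)
    return max(n0, math.ceil((t_crit * s / (rel_error * abs(xbar))) ** 2))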
11. 3) Evaluate Variance in Response
- Test Setup
- Two design points: a high and a low value for each of two different input parameters
- Test Execution
- Run the experiment the required number of times with different random number seeds and collect the sample mean and variance for the metrics
- Test Analysis
- Use the variance in response to parameter changes to assess whether the metrics are sensitive to parameter changes
- If the change in the mean is not statistically significant for a given metric, run an additional experiment using the midpoint to determine whether the metric response is non-linear
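A sketch of the analysis step; the slides do not name a specific test, so a two-sample (Welch) t-test is assumed here as one reasonable way to judge whether a metric's mean changes significantly between the low and high design points:

from scipy.stats import ttest_ind

def mean_change_significant(low_runs, high_runs, alpha=0.05):
    # low_runs / high_runs: one metric's values across replications at the two design points.
    # Welch's variant avoids assuming equal variances at the two design points.
    _, p_value = ttest_ind(low_runs, high_runs, equal_var=False)
    return p_value < alpha  # if False, add a midpoint design point and re-test for a non-linear response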
12. Qualitative Validation of Functional Area Simulations
Based on Subject Matter Expert Opinion
- Subject Matter Experts (SMEs) in each Functional Area (FA) provide opinions on how they expect metrics to change in value from FY0 to FY2 in response to the policy value changes
- Validation of the FA is confirmed by the extent to which the metric predictions of the simulation conform to the expectations of the SMEs
13. Acquiring SME Inputs - Example
14. Acquiring SME Inputs - Example
15. Set Up and Execution of Functional Area Simulation Experiment
- Validation using SME opinions
- Experiment Setup
- Set inputs for the FA simulation experiment
- Observable behavior metrics
- Input parameters (policy or observed)
- Policy values for FY0, FY1, and FY2
16. Set Up and Execution of Functional Area Simulation Experiment
- Experiment Execution
- Run the FA simulation experiment the required number of times
- Experiment Analysis
- Populate SME input form
- Policy values for FY0, FY1, and FY2
- Predicted metric value for FY0
- Solicit SME expectations for the metric values in FY1 and FY2
- Produce a composite validation measure
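An illustrative sketch of what one SME input record might contain; the field names and values are assumptions for illustration only (the actual form appears on the "Acquiring SME Inputs - Example" slides):

# Hypothetical SME input form record; field names and values are illustrative only.
sme_input_form = {
    "functional_area": "Training",
    "metric": "Average Time to Train (months)",
    "policy_values": {"FY0": 10, "FY1": 8, "FY2": 8},                   # policy values shown to the SME
    "predicted_metric_FY0": 14.2,                                       # simulation prediction for the base year
    "sme_expected_range": {"FY1": (13.0, 15.0), "FY2": (12.0, 14.0)},   # SME-expected value ranges
    "sme_confidence": 0.8,                                              # SME's self-reported confidence weight
}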
17. Compute Composite Validation Measure for FA Simulation
- Compare SME expectations for the metric values in FY1 and FY2 to the simulation-predicted metrics
- Determine the extent to which simulation-predicted
  - mean metric values by FY agree in direction and magnitude with qualitative expectations of the SME
  - mean metric values by FY and by month fall within the range of metric values expected by the SME
  - value ranges (mean ± std dev) by FY and by month fall within the value ranges expected by the SME
- Award validation points to the FA simulation in proportion to the extent that its predictions compare favorably to the metric value expectations of each SME
- Weight by confidence level of the SMEs
- Points awarded for all responses by all participating SMEs are combined to produce a composite validation measure
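A minimal computational sketch of the composite measure just described; the point-award scheme and confidence weighting shown here are assumptions, since the slides give the approach but not the exact scoring rules:

def composite_validation_measure(responses):
    # responses: one dict per SME response, e.g.
    #   {"points_awarded": 3, "points_possible": 4, "confidence": 0.8}
    # Points reflect how well the simulation's predictions matched that SME's expectations;
    # each response is weighted by the SME's stated confidence level.
    weighted_awarded = sum(r["points_awarded"] * r["confidence"] for r in responses)
    weighted_possible = sum(r["points_possible"] * r["confidence"] for r in responses)
    return weighted_awarded / weighted_possible if weighted_possible else 0.0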
18. Compute Composite Validation Measure for FA Simulation
19. Qualitative Validation of the System Simulation
- Validation using SME opinions of overall simulation behavior
- Experiment Setup
- Set inputs for the System simulation experiment
- Parameter sweep of values for the selected policy parameter
- Policy values for FY0, FY1, and FY2 for each experiment
- Metrics to be collected include
  - Total Number of Training Attritions
  - Average Time to Train (months)
  - Training Dollars Spent this FY ($M)
  - Sea Manning (%)
  - Shore Manning (%)
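An illustrative experiment-setup structure for such a sweep; the sweep values are placeholders, and the policy parameter name is borrowed from the historical-data slides later in the deck:

# Hypothetical System-simulation experiment setup; sweep values are placeholders.
system_experiment = {
    "policy_parameter": "Advancement Zone Bottom",
    "sweep_values": [8, 10, 12],                    # placeholder policy values to sweep
    "fiscal_years": ["FY0", "FY1", "FY2"],
    "metrics": [
        "Total Number of Training Attritions",
        "Average Time to Train (months)",
        "Training Dollars Spent this FY ($M)",
        "Sea Manning (%)",
        "Shore Manning (%)",
    ],
}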
20. Qualitative Validation of the System Simulation
- Experiment Execution
- Run the system simulation experiment the required number of times with different random number seeds and collect the metric values
- Experiment Analysis
- Populate SME input form
- Policy values for FY0, FY1, and FY2
- Predicted metric value in FY0
- Solicit SME expectations for the metric values in FY1 and FY2
- Produce a composite validation measure for the System simulation
21. Validation Experiment with Historical Data
- Quantitatively validate the System simulation by comparing simulation-predicted outcomes to actual history
- Experiment Setup
- Three different starting points, EFY04 to EFY06
- Input policy parameters match the initial conditions for each starting FY
- Actual policy parameters (Advancement Zone Bottom)
- Actual observed proxy for the policy parameters (NHSG)
- Simulation outcomes will include metrics at different levels of aggregation
- Experiment Execution
- Run the experiment the required number of times with different random number seeds and collect the sample mean and variance for the metrics
- Experiment Analysis
- Each historical measurement results in one single observation
- Statistical comparison of the historical measurement to the observation means produced by multiple runs of the System simulation
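A sketch of that comparison for one metric, assuming (consistent with the statistical procedures slide below) that the check is whether the single historical observation falls inside the t-based confidence interval around the simulation sample mean:

from statistics import mean, stdev
from scipy.stats import t

def historical_obs_within_ci(sim_outputs, historical_obs, alpha=0.05):
    # sim_outputs: one metric's values from n replications of the System simulation.
    # historical_obs: the single historical measurement (R1) for that metric.
    n = len(sim_outputs)
    xbar, s = mean(sim_outputs), stdev(sim_outputs)
    half_width = t.ppf(1 - alpha / 2, n - 1) * s / n ** 0.5
    return abs(historical_obs - xbar) <= half_width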
22. Validation Experiment with Historical Data
- Metric comparison of COMPASS with historical outcomes
23. Statistical Procedures for Validating with Historical Data
- R1 is the historical observation
- M1, M2, ..., Mn are the observed simulation outputs
- The sample mean of the Mi is given by
- The confidence interval is given by
- The constant, t(α, n-1), is the critical value of the t distribution for level α and n-1 degrees of freedom
- S²(n) is the sample variance of the COMPASS model observations and is calculated as
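The formulas themselves are not legible in this copy; standard reconstructions consistent with the definitions above (an assumption about the originals) are

\bar{M}(n) = \frac{1}{n} \sum_{i=1}^{n} M_i, \qquad S^2(n) = \frac{1}{n-1} \sum_{i=1}^{n} \left( M_i - \bar{M}(n) \right)^2, \qquad \bar{M}(n) \pm t_{\alpha, n-1} \sqrt{\frac{S^2(n)}{n}}

and the validation check is whether the historical observation R1 falls inside this confidence interval.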
24. Conclusions
- The multipronged approach presented addresses the atypical challenges to validation of an Agent-Based Enterprise Simulation, such as COMPASS
- Validation of each FA simulation is followed by System-level validation
- Qualitative validation experiments employ opinions from SMEs with specific FA experience; no one person has all the subject matter expertise necessary to assess the entire system
- Quantitative System validation experiments compare System simulation outcomes against historical outcomes
- Uses actual or proxy policy parameters and actual outcomes (for FY04, FY05, FY06 and FY07 by EFY, quarter, and month)
25. Contact Information
- New Vectors/TechTeam
- John Sauter (734.302.4682), john.sauter@newvectors.net
- CSC
- Navy Personnel Planning and Policy Analysis Group, Federal Sector Defense Group
- John Schmid (352.671.2761), Sanjay Nayar (703.461.2075), Rick Loffredo (703.461.2168)
- jschmid21, snayar, rloffredo@csc.com
- NPRST
- Dr. Colin Osterman (901.874.4643)
- Mr. Rodney Myers (901.874.4925)
- Ms. Kimberly Crayton (901.874.2498)
- colin.j.osterman, rodney.myers, kimberly.crayton@navy.mil