1Tutorial: Parallel and Distributed Simulation Systems
From Chandy/Misra to the High Level Architecture and Beyond
- Richard M. Fujimoto
- College of Computing
- Georgia Institute of Technology
- Atlanta, GA 30332-0280
- fujimoto_at_cc.gatech.edu
2References
- R. Fujimoto, Parallel and Distributed Simulation Systems, Wiley Interscience, 2000.
  - (see also http://www.cc.gatech.edu/classes/AY2000/cs4230_spring)
- HLA
  - F. Kuhl, R. Weatherly, J. Dahmann, Creating Computer Simulation Systems: An Introduction to the High Level Architecture for Simulation, Prentice Hall, 1999.
  - (http://hla.dmso.mil)
3Outline
- Part I: Introduction
- Part II: Time Management
- Part III: Distributed Virtual Environments
4Parallel and Distributed Simulation
5Reasons to Use Parallel / Distributed Simulation
- Enable the execution of time-consuming simulations that could not otherwise be performed (e.g., simulation of the Internet)
  - Reduce model execution time (proportional to the number of processors)
  - Ability to run larger models (more memory)
- Enable simulation to be used as a forecasting tool in time-critical decision-making processes (e.g., air traffic control)
  - Initialize simulation to current system state
  - Faster-than-real-time execution for what-if experimentation
  - Simulation results may be needed in seconds
- Create distributed virtual environments, possibly including users at distant geographical locations (e.g., training, entertainment)
  - Real-time execution capability
  - Scalable performance for many users and simulated entities
6Geographically Distributed Users/Resources
- Geographically distributed users and/or resources are sometimes needed
  - Interactive games over the Internet
  - Specialized hardware or databases
7Stand-Alone vs. Federated Simulation Systems
8Principal Application Domains
- Distributed Virtual Environments (DVEs)
  - Networked interactive, immersive environments
  - Scalable, real-time performance
  - Create virtual worlds that appear realistic
  - Typical applications
    - Training
    - Entertainment
    - Social interaction
- Parallel Discrete Event Simulation (PDES)
  - Discrete event simulation to analyze systems
  - Fast model execution (as-fast-as-possible)
  - Produce same results as a sequential execution
  - Typical applications
    - Telecommunication networks
    - Computer systems
    - Transportation systems
    - Military strategy and tactics
9Historical Perspective
[Figure: timeline of three communities]
- High Performance Computing Community
- Defense Community: SIMulator NETworking (SIMNET) (1983-1990); Distributed Interactive Simulation (DIS) and Aggregate Level Simulation Protocol (ALSP) (1990 - 1997ish); High Level Architecture (1996 - today)
- Internet Gaming Community: Dungeons and Dragons board games; Adventure (Xerox PARC); Multi-User Dungeon (MUD) games; multi-user video games
10Part II: Time Management
- Parallel discrete event simulation
- Conservative synchronization
- Optimistic synchronization
- Time Management in the High Level Architecture
11Time
- physical time: time in the physical system
  - Noon, December 31, 1999 to noon, January 1, 2000
- simulation time: representation of physical time within the simulation
  - floating point values in the interval [0.0, 24.0]
- wallclock time: time during the execution of the simulation, usually output from a hardware clock (e.g., GPS)
  - 9:00 to 9:15 AM on September 10, 1999
12Paced vs. Unpaced Execution
- Modes of execution
  - As-fast-as-possible execution (unpaced): no fixed relationship necessarily exists between advances in simulation time and advances in wallclock time
  - Real-time execution (paced): each advance in simulation time is paced to occur in synchrony with an equivalent advance in wallclock time
  - Scaled real-time execution (paced): each advance in simulation time is paced to occur in synchrony with S times an equivalent advance in wallclock time (e.g., 2x wallclock time)
- Here, the focus is on as-fast-as-possible execution; such a simulation can be paced to run in real time (or scaled real time) by inserting delays
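As a rough illustration of pacing by inserting delays, the Python sketch below holds each event until its wallclock deadline. The function name `run_paced`, the `scale` parameter, and the injectable `clock` and `sleep` arguments are assumptions made for this example. Here `scale` plays the role of S, read as "simulation time advances S times as fast as wallclock time"; that reading is an assumption.

```python
import time


def run_paced(event_times, scale=1.0, clock=time.monotonic, sleep=time.sleep):
    """Process events in simulation-time order, paced against wallclock time.

    event_times: sorted simulation time stamps (hypothetical input format).
    scale: assumed meaning of S; 2.0 means simulation time runs 2x wallclock.
    Returns a list of (simulation time, elapsed wallclock time) pairs.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    log = []
    if not event_times:
        return log
    wall_start = clock()
    sim_start = event_times[0]
    for sim_time in event_times:
        # Wallclock instant at which this simulation time is "due".
        deadline = wall_start + (sim_time - sim_start) / scale
        delay = deadline - clock()
        if delay > 0:
            sleep(delay)  # the inserted delay that turns unpaced into paced
        log.append((sim_time, clock() - wall_start))
    return log
```

Passing a fake `clock` and `sleep` makes the pacing logic testable without real waiting; an unpaced run simply skips the delay.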
13Discrete Event Simulation Fundamentals
- Discrete event simulation: computer model for a system where changes in the state of the system occur at discrete points in simulation time.
- Fundamental concepts
  - system state (state variables)
  - state transitions (events)
  - simulation time: totally ordered set of values representing time in the system being modeled (physical system)
  - simulator maintains a simulation time clock
A discrete event simulation computation can be viewed as a sequence of event computations. Each event computation contains a (simulation time) time stamp indicating when that event occurs in the physical system. Each event computation may (1) modify state variables, and/or (2) schedule new events into the simulated future.
14A Simple DES Example
- Simulator maintains event list
- Events processed in simulation time order
- Processing events may generate new events
- Complete when event list is empty (or some other
termination condition)
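The event-list loop described above can be sketched in a few lines of Python. This is a minimal sketch, not the tutorial's own code: the names `run_sequential`, `on_arrival`, and `on_departure`, the two-unit service delay, and the `(time stamp, kind)` event format are all assumptions for illustration.

```python
import heapq
import itertools


def run_sequential(initial_events, handlers, state, end_time=float("inf")):
    """Process (time stamp, kind) events in time stamp order.

    handlers[kind](state, now) may modify state and returns a list of
    newly scheduled (time stamp, kind) events.
    """
    counter = itertools.count()  # tie-breaker so equal time stamps stay FIFO
    pending = [(ts, next(counter), kind) for ts, kind in initial_events]
    heapq.heapify(pending)
    processed = []
    while pending:  # complete when the event list is empty...
        now, _, kind = heapq.heappop(pending)
        if now > end_time:  # ...or another termination condition holds
            break
        processed.append((now, kind))
        for ts, new_kind in handlers[kind](state, now):
            if ts < now:
                raise ValueError("events may only be scheduled into the future")
            heapq.heappush(pending, (ts, next(counter), new_kind))
    return processed


def on_arrival(state, now):
    state["in_system"] += 1
    return [(now + 2.0, "departure")]  # hypothetical fixed service delay


def on_departure(state, now):
    state["in_system"] -= 1
    return []
```

For example, two arrivals at times 1.0 and 2.0 produce departures at 3.0 and 4.0, all processed in time stamp order.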
15Parallel Discrete Event Simulation
A parallel discrete event simulation program can be viewed as a collection of sequential discrete event simulation programs executing on different processors that communicate by sending time stamped messages to each other. Sending a message is synonymous with scheduling an event.
16Parallel Discrete Event Simulation Example
17The Rub
- Golden rule for each logical process
- Thou shalt process incoming messages in time
stamp order!!! (local causality constraint)
18A Simple PDES Example
[Figure: Simulator A and Simulator B running on two processors (Processor 1, Processor 2)]
19Parallel Discrete Event Simulation Example
20The Synchronization Problem
- Local causality constraint: events within each logical process must be processed in time stamp order
- Observation: adherence to the local causality constraint is sufficient to ensure that the parallel simulation will produce exactly the same results as the corresponding sequential simulation, provided events with the same time stamp are processed in the same order as in the sequential execution
- Synchronization (Time Management) Algorithms
  - Conservative synchronization: avoid violating the local causality constraint (wait until it is safe)
    - 1st generation: null messages (Chandy/Misra/Bryant)
    - 2nd generation: time stamp of next event
  - Optimistic synchronization: allow violations of local causality to occur, but detect them at runtime and recover using a rollback mechanism
    - Time Warp (Jefferson)
    - approaches limiting the amount of optimistic execution
22Chandy/Misra/Bryant Null Message Algorithm
- Assumptions
  - logical processes (LPs) exchanging time stamped events (messages)
  - static network topology, no dynamic creation of LPs
  - messages sent on each link are sent in time stamp order
  - network provides reliable delivery, preserves order
- Observation: the above assumptions imply that the time stamp of the last message received on a link is a lower bound on the time stamp (LBTS) of subsequent messages received on that link
Goal: ensure each LP processes events in time stamp order
23A Simple Conservative Algorithm
Algorithm A (executed by each LP)
Goal: ensure events are processed in time stamp order
WHILE (simulation is not over)
    wait until each FIFO contains at least one message
    remove smallest time stamped event from its FIFO
    process that event
END-LOOP
- Example (figure): the LP must wait until a message is received from H2
- Observation: Algorithm A is prone to deadlock!
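One step of Algorithm A can be sketched as follows. The function name `next_safe_event` and the representation of the incoming links as a dictionary of deques holding time stamps are assumptions for this example. Returning `None` stands in for "wait", which is exactly where a cycle of waiting LPs can deadlock.

```python
from collections import deque


def next_safe_event(fifos):
    """Return (link, time stamp) of the next safe event, or None if blocked.

    fifos maps each incoming link name to a deque of time stamps, which the
    sender is assumed to have sent in non-decreasing order.
    """
    if any(len(queue) == 0 for queue in fifos.values()):
        return None  # some link is empty: an earlier message could still arrive
    link = min(fifos, key=lambda name: fifos[name][0])
    return link, fifos[link].popleft()
```

With a time stamp 7 message queued from H3 but nothing from H2, the LP is blocked; once H2 delivers a time stamp 9 message, the time stamp 7 event becomes safe.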
24Deadlock Example
[Figure: H1 is waiting on H2, H2 is waiting on H3, and H3 is waiting on H1; the queued messages carry time stamps 7, 8, 9, 10, and 15]
- A cycle of LPs forms where each is waiting on the next LP in the cycle.
- No LP can advance; the simulation is deadlocked.
25Deadlock Avoidance Using Null Messages
Break deadlock by having each LP send null
messages indicating a lower bound on the time
stamp of future messages it could send.
- H1 may now process message with time stamp 7
26Deadlock Avoidance Using Null Messages
Null Message Algorithm (executed by each LP)
Goal: ensure events are processed in time stamp order and avoid deadlock
WHILE (simulation is not over)
    wait until each FIFO contains at least one message
    remove smallest time stamped event from its FIFO
    process that event
    send null messages to neighboring LPs with time stamp indicating a lower bound on future messages sent to that LP (current time plus lookahead)
END-LOOP
The null message algorithm relies on a lookahead (minimum delay).
27Lookahead Creep
H1 can process time stamp 7 message
Five null messages to process a single event
- If lookahead is small, there may be many null
messages!
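To make lookahead creep concrete, the sketch below counts null messages in a deliberately simplified model: a ring of blocked LPs, each with one incoming link from its predecessor and one pending event. The ring topology, the function name, and the numbers in the usage note are assumptions; this is not the full null message algorithm. Each null message carries the sender's safe time plus the lookahead.

```python
def null_messages_until_progress(next_event, link_clock, lookahead):
    """Count null messages sent around a ring before any LP can proceed.

    next_event[i]: time stamp of the pending event at LP i.
    link_clock[i]: time stamp of the last message LP i received on its
                   single incoming link (from LP i-1).
    An LP may process its event once next_event[i] <= link_clock[i].
    """
    if lookahead <= 0:
        raise ValueError("zero lookahead never breaks the cycle")
    clocks = list(link_clock)
    count = len(next_event)
    sent = 0
    sender = 0
    while not any(next_event[k] <= clocks[k] for k in range(count)):
        # Lower bound on anything this LP can send later: safe time + lookahead.
        bound = min(clocks[sender], next_event[sender]) + lookahead
        receiver = (sender + 1) % count
        clocks[receiver] = max(clocks[receiver], bound)
        sent += 1
        sender = receiver
    return sent
```

With pending events at 7, 15, and 9, all link clocks at 5, and a lookahead of 0.5, six null messages circulate before the time stamp 7 event becomes safe; a lookahead of 2.0 needs only two.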
28Preventing Lookahead Creep Next Event Time
Information
[Figure: H1 is waiting on H2, H2 is waiting on H3, and H3 is waiting on H1; the queued messages carry time stamps 7, 8, 9, 10, and 15]
Observation: if all LPs are blocked, they can immediately advance to the time of the minimum time stamp event in the system
29Lower Bound on Time Stamp
No null messages; assume any LP can send messages to any other LP. When an LP blocks, compute a lower bound on the time stamp (LBTS) of messages it may later receive; events with time stamp < LBTS are safe to process.
LBTS = min (6, 10, 7) (assume zero lookahead)
- Given a snapshot of the computation, LBTS is the minimum among
  - Time stamp of any transient messages (sent, but not yet received)
  - Unblocked LPs: current simulation time + lookahead
  - Blocked LPs: time of next event + lookahead
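The three-way minimum above translates directly into code. In this sketch the function name and the tuple-based input format are assumptions, and the numbers in the usage note are illustrative rather than taken from the slide's figure.

```python
def lbts_from_snapshot(transient_timestamps, unblocked, blocked):
    """Lower bound on time stamp, computed from one global snapshot.

    transient_timestamps: time stamps of messages sent but not yet received.
    unblocked: (current simulation time, lookahead) for each unblocked LP.
    blocked:   (time of next event, lookahead) for each blocked LP.
    """
    candidates = list(transient_timestamps)
    candidates += [now + lookahead for now, lookahead in unblocked]
    candidates += [next_event + lookahead for next_event, lookahead in blocked]
    return min(candidates, default=float("inf"))
```

For instance, one transient message at time 7, an unblocked LP at time 6, and a blocked LP whose next event is at time 10 give an LBTS of 6 under zero lookahead.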
30Lower Bound on Time Stamp (LBTS)
LBTS can be computed asynchronously using a distributed snapshot algorithm (Mattern)
- cut point: an instant dividing a process's computation into past and future
- cut: set of cut points, one per process
- cut message: a message that was sent in the past and received in the future
- consistent cut: cut + all cut messages
- cut value: minimum among (1) local minimum at each cut point and (2) time stamp of cut messages; non-cut messages can be ignored
It can be shown that LBTS = cut value
31A Simple LBTS Algorithm
- Initiator broadcasts "start LBTS computation" message to all LPs
- Each LP sets cut point, reports local minimum back to initiator
- Account for transient (cut) messages
  - Identify transient messages, include time stamp in minimum computation
    - Color each LP (color changes with each cut point); message color = color of sender
    - An incoming message is transient if message color equals previous color of receiver
    - Report time stamp of transient to initiator when one is received
  - Detecting when all transients have been received
    - For each color, LP_i keeps counter of messages sent (Send_i) and received (Receive_i)
    - At cut point, send counters to initiator: number of transients = sum over all i of (Send_i - Receive_i)
    - Initiator detects termination (all transients received), broadcasts global minimum
32Another LBTS Algorithm
An LP initiates an LBTS computation when it blocks
- Initiator: broadcasts "start LBTS" message to all LPs
- LP_i: places cut point, reports local minimum and (Send_i - Receive_i) back to initiator
- Initiator: after all reports received
    if (sum over all i of (Send_i - Receive_i) = 0)
        LBTS = global minimum, broadcast LBTS value
    else
        repeat broadcast/reply, but do not establish a new cut
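The initiator's decision in this algorithm can be sketched as a single function. The name `initiator_round` and the report format `(local minimum, messages sent, messages received)` are assumptions for this example. A `None` result means transient messages are still in flight and the broadcast/reply round must be repeated.

```python
def initiator_round(reports):
    """Decide whether an LBTS value can be announced after one round.

    reports: one (local_minimum, sent_count, received_count) tuple per LP.
    Returns the LBTS value, or None if some messages are still in transit.
    """
    in_flight = sum(sent - received for _, sent, received in reports)
    if in_flight != 0:
        return None  # repeat broadcast/reply without establishing a new cut
    return min(local_minimum for local_minimum, _, _ in reports)
```

In a repeated round, an LP that has since received a transient message reports a higher receive count and, if needed, a smaller local minimum that includes the transient's time stamp.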
33Synchronous Algorithms
34Topology Information
- Global LBTS algorithm is overly conservative: does not exploit topology information
  - Lookahead = minimum flight time to another airport
- Can the two events be processed concurrently?
  - Yes, because the event @ 10:00 cannot affect the event @ 10:45
- Simple global LBTS algorithm
  - LBTS = 10:30 (10:00 + 0:30)
  - Cannot process event @ 10:45 until next LBTS computation
35Distance Between LPs
- Associate a lookahead with each link: L_AB is the lookahead on the link from LP_A to LP_B
  - Any message sent on the link from LP_A to LP_B must have a time stamp of at least T_A + L_AB, where T_A is the current simulation time of LP_A
- A path from LP_A to LP_Z is defined as a sequence of LPs: LP_A, LP_B, ..., LP_Y, LP_Z
- The lookahead of a path is the sum of the lookaheads of the links along the path
- D_AB, the minimum distance from LP_A to LP_B, is the minimum lookahead over all paths from LP_A to LP_B
- The distance from LP_A to LP_B is the minimum amount of simulated time that must elapse for an event in LP_A to affect LP_B
36Distance Between Processes
The distance from LP_A to LP_B is the minimum amount of simulated time that must elapse for an event in LP_A to affect LP_B
- An event in LP_Y with time stamp T_Y depends on an event in LP_X with time stamp T_X if T_X + D_X,Y < T_Y
- In the figure, the time stamp 15 event depends on the time stamp 11 event; the time stamp 13 event does not.
37Computing LBTS
- Assuming all LPs are blocked and there are no transient messages
  - LBTS_i = min(N_j + D_ji) (all j), where N_i = time of next event in LP_i
LBTS_A = 15 = min (11+4, 13+5)
LBTS_B = 14 = min (11+3, 13+4)
LBTS_C = 12 = min (11+1, 13+2)
LBTS_D = 14 = min (11+3, 13+4)
- Need to know time of next event of every other LP
- Distance matrix must be recomputed if lookahead changes
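A possible way to obtain the distance matrix and the per-LP bounds is sketched below. The all-pairs shortest-path method used here (Floyd-Warshall) is a choice made for this example, not something the slides prescribe, and the three-LP topology in the usage note is hypothetical. The diagonal starts at infinity so that the distance from an LP to itself becomes its shortest cycle, which matters because the minimum runs over all j, including j = i.

```python
import math


def distance_matrix(lookahead):
    """All-pairs minimum path lookahead; lookahead[a][b] is a link lookahead."""
    nodes = list(lookahead)
    dist = {a: {b: lookahead[a].get(b, math.inf) for b in nodes} for a in nodes}
    for k in nodes:
        for a in nodes:
            for b in nodes:
                via_k = dist[a][k] + dist[k][b]
                if via_k < dist[a][b]:
                    dist[a][b] = via_k
    return dist


def lbts_per_lp(next_event, dist):
    """LBTS_i = min over all j of (N_j + D_ji), all LPs assumed blocked."""
    return {
        i: min(next_event[j] + dist[j][i] for j in next_event)
        for i in dist
    }
```

With links A to B (3), B to C (1), C to A (2), and A to C (5), and next events at 11 in A and 13 in B (none in C), the bounds come out as 16 for A, 14 for B, and 14 for C.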
38Example
- Using distance information
  - D_SAN,JFK = 6:30
  - LBTS_JFK = 16:30 (10:00 + 6:30)
  - Event @ 10:45 can be processed this iteration
  - Concurrent processing of events at times 10:00 and 10:45
39Lookahead
40Speedup of Central Server Queueing Model
Simulation
Deadlock Detection and Recovery Algorithm (5
processors)
Exploiting lookahead is essential to obtain good
performance
41Summary Conservative Synchronization
- Each LP must process events in time stamp order
- Must compute lower bound on time stamp (LBTS) of future messages an LP may receive to determine which events are safe to process
- 1st generation algorithms: LBTS computation based on current simulation time of LPs and lookahead
  - Null messages
  - Prone to lookahead creep
- 2nd generation algorithms: also consider time of next event to avoid lookahead creep
- Other information, e.g., LP topology, can be exploited
- Lookahead is crucial to achieving concurrent processing of events, good performance
42Conservative Algorithms
- Pro
  - Good performance reported for many applications containing good lookahead (queueing networks, communication networks, wargaming)
  - Relatively easy to implement
  - Well suited for federating autonomous simulations, provided there is good lookahead
- Con
  - Cannot fully exploit available parallelism in the simulation because they must protect against a worst case scenario
  - Lookahead is essential to achieve good performance
  - Writing simulation programs to have good lookahead can be very difficult or impossible, and can lead to code that is difficult to maintain
44Time Warp Algorithm (Jefferson)
- Assumptions
  - logical processes (LPs) exchanging time stamped events (messages)
  - dynamic network topology, dynamic creation of LPs OK
  - messages sent on each link need not be sent in time stamp order
  - network provides reliable delivery, but need not preserve order
- Basic idea
  - process events w/o worrying about messages that will arrive later
  - detect out of order execution, recover using rollback
[Figure: logical process H3 receives messages from H1 and H2; its queued events carry time stamps 2, 4, 5, 8, and 9]
process all available events (2, 4, 5, 8, 9) in time stamp order
45Optimistic Protocols
- Offer several advantages over conservative mechanisms
  - fuller exploitation of parallelism
  - less reliant on lookahead information
  - greater transparency of synchronization protocol
  - easier to develop and maintain application code
- Caveat: although lookahead is not essential, it generally does improve performance, so should be exploited when available
- Potential liabilities
  - state saving required to enable rollback
    - can significantly reduce performance
    - may be cumbersome to implement
  - possibility of excessive rollbacks, rollback thrashing
  - memory requirements may be large
46Time Warp (Jefferson)
Each LP processes events in time stamp order, like a sequential simulator, except (1) do NOT discard processed events and (2) add a rollback mechanism
- Adding rollback
  - a message arriving in the LP's past initiates rollback
  - to roll back an event computation we must undo
    - changes to state variables performed by the event
    - message sends
47Anti-Messages
- Used to cancel a previously sent message
- Each positive message sent by an LP has a corresponding anti-message
- Anti-message is identical to positive message, except for a sign bit
- When an anti-message and its matching positive message meet in the same queue, the two annihilate each other (analogous to matter and anti-matter)
- To undo the effects of a previously sent (positive) message, the LP need only send the corresponding anti-message
- Message send: in addition to sending the message, leave a copy of the corresponding anti-message in a data structure in the sending LP called the output queue.
48Rollback Receiving a Straggler Message
49Processing Incoming Anti-Messages
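- Case I: corresponding message has not yet been processed
  - annihilate message/anti-message pair
- Case II: corresponding message has already been processed
  - roll back to a time prior to processing the message (secondary rollback)
  - annihilate message/anti-message pair
- Case III: corresponding message has not yet been received
  - queue anti-message
  - annihilate message/anti-message pair when message is received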
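The rollback and anti-message handling described above can be sketched as a toy logical process. Everything here is an assumption for illustration: the class and field names, the integer payloads, the running-sum state, and the fixed delay of 1.0 on outgoing messages. Matching messages by value stands in for the unique identifiers a real implementation would need. The sketch covers stragglers and anti-messages arriving before, after, or without their positive message having been processed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    timestamp: float
    payload: int
    sign: int = +1  # +1 positive message, -1 anti-message

    def anti(self):
        return Message(self.timestamp, self.payload, -self.sign)


class TimeWarpLP:
    def __init__(self):
        self.state = 0          # toy state: sum of processed payloads
        self.pending = []       # unprocessed positive messages
        self.processed = []     # (message, state before, messages sent)
        self.orphan_antis = []  # anti-messages that arrived first
        self.outbox = []        # every message "sent", anti-messages included

    def receive(self, msg):
        if msg.sign < 0:
            self._receive_anti(msg)
        elif msg.anti() in self.orphan_antis:
            self.orphan_antis.remove(msg.anti())  # annihilate on arrival
        else:
            last = self.processed[-1][0].timestamp if self.processed else None
            if last is not None and msg.timestamp < last:
                self._rollback(msg.timestamp)  # straggler in the LP's past
            self.pending.append(msg)

    def _receive_anti(self, anti):
        positive = anti.anti()
        if positive in self.pending:  # not yet processed: annihilate
            self.pending.remove(positive)
        elif any(record[0] == positive for record in self.processed):
            self._rollback(positive.timestamp)  # already processed: undo it
            self.pending.remove(positive)
        else:  # positive message not yet received: queue the anti-message
            self.orphan_antis.append(anti)

    def _rollback(self, to_time):
        # Undo processed events with time stamp >= to_time, newest first.
        while self.processed and self.processed[-1][0].timestamp >= to_time:
            msg, state_before, sent = self.processed.pop()
            self.state = state_before                    # restore saved state
            self.outbox.extend(m.anti() for m in sent)   # cancel message sends
            self.pending.append(msg)                     # will be re-executed

    def process_next(self):
        if not self.pending:
            return None
        msg = min(self.pending, key=lambda m: m.timestamp)
        self.pending.remove(msg)
        state_before = self.state
        self.state += msg.payload
        out = Message(msg.timestamp + 1.0, self.state)
        self.outbox.append(out)
        self.processed.append((msg, state_before, [out]))
        return msg
```

Saving the state before every event is the simplest form of state saving; it keeps the rollback loop short at the cost of memory, which is the trade-off the liabilities list points to.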
50Global Virtual Time and Fossil Collection
- A mechanism is needed to
  - reclaim memory resources (e.g., old state and events)
  - perform irrevocable operations (e.g., I/O)
- Observation: a lower bound on the time stamp of any rollback that can occur in the future is needed.
- Global Virtual Time (GVT) is defined as the minimum time stamp of any unprocessed (or partially processed) message or anti-message in the system. GVT provides a lower bound on the time stamp of any future rollback.
  - storage for events and state vectors older than GVT (except one state vector) can be reclaimed
  - I/O operations with time stamp less than GVT can be performed.
- GVT algorithms are similar to LBTS algorithms in conservative synchronization
- Observation: the computation corresponding to GVT will not be rolled back, guaranteeing forward progress.
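A minimal sketch of GVT and fossil collection follows, assuming a global view of all unprocessed and in-transit time stamps is available (a real system would obtain this with a distributed algorithm). The function names and the `(time stamp, state)` list format are assumptions for this example.

```python
def compute_gvt(unprocessed_timestamps, transient_timestamps=()):
    """Minimum time stamp over all unprocessed messages and anti-messages,
    including those still in transit."""
    return min([*unprocessed_timestamps, *transient_timestamps],
               default=float("inf"))


def fossil_collect(saved_states, gvt):
    """Reclaim state vectors older than GVT, except the newest of them.

    saved_states: list of (time stamp, state) sorted by time stamp.
    The one retained older state is what a rollback to GVT would restore.
    """
    older = [entry for entry in saved_states if entry[0] < gvt]
    current = [entry for entry in saved_states if entry[0] >= gvt]
    return older[-1:] + current
```

With saved states at times 1, 3, 5, and 8 and a GVT of 6, only the states at 5 and 8 survive.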
51Time Warp and Chandy/Misra Performance
- eight processors
- closed queueing network, hypercube topology
- high priority jobs preempt service from low priority jobs (1 high priority)
- exponential service time (poor lookahead)
52Other Optimistic Algorithms
- Principal goal: avoid excessive optimistic execution
- A variety of protocols have been proposed, among them
  - window-based approaches
    - only execute events in a moving window (simulated time, memory)
  - risk-free execution
    - only send messages when they are guaranteed to be correct
  - add optimism to conservative protocols
    - specify optimistic values for lookahead
  - introduce additional rollbacks
    - triggered stochastically or by running out of memory
  - hybrid approaches
    - mix conservative and optimistic LPs
  - scheduling-based
    - discriminate against LPs rolling back too much
  - adaptive protocols
    - dynamically adjust protocol during execution as workload changes
53Summary of Optimistic Algorithms
- Pro
  - Good performance reported for a variety of applications (queueing networks, communication networks, combat models, transportation systems)
  - Good transparency: offers the best hope for general purpose parallel simulation software (more resilient to poor lookahead than conservative methods)
- Con
  - Memory requirements may be large
  - Implementation is generally more complex and difficult to debug than conservative mechanisms
    - System calls, memory allocation ...
    - Must be able to recover from exceptions
  - Use in federated simulations requires adding rollback capability to existing simulations
55High Level Architecture (HLA)
- based on a composable "system of systems" approach
  - no single simulation can satisfy all user needs
  - support interoperability and reuse among DoD simulations
- federations of simulations (federates)
  - pure software simulations
  - human-in-the-loop simulations (virtual simulators)
  - live components (e.g., instrumented weapon systems)
- mandated as the standard reference architecture for all M&S in the U.S. Department of Defense (September 1996)
- The HLA consists of
  - rules that simulations (federates) must follow to achieve proper interaction during a federation execution
  - Object Model Template (OMT): defines the format for specifying the set of common objects used by a federation (federation object model), their attributes, and relationships among them
  - Interface Specification (IFSpec): provides the interface to the Run-Time Infrastructure (RTI), which ties together federates during model execution
56HLA Federation
Interconnecting autonomous simulators
[Figure: a federation of simulations (federates), each connected through the Interface Specification to the Runtime Infrastructure (RTI)]
Runtime Infrastructure (RTI)
- Services to create and manage the execution of the federation
  - Federation setup / tear down
  - Transmitting data among federates
  - Synchronization (time management)
57Interface Specification
Category
Functionality
58A Typical Federation Execution
- initialize federation
  - Create Federation Execution (Federation Mgt)
  - Join Federation Execution (Federation Mgt)
- declare objects of common interest among federates
  - Publish Object Class (Declaration Mgt)
  - Subscribe Object Class Attribute (Declaration Mgt)
- exchange information
  - Update/Reflect Attribute Values (Object Mgt)
  - Send/Receive Interaction (Object Mgt)
  - Time Advance Request, Time Advance Grant (Time Mgt)
  - Request Attribute Ownership Assumption (Ownership Mgt)
  - Modify Region (Data Distribution Mgt)
- terminate execution
  - Resign Federation Execution (Federation Mgt)
  - Destroy Federation Execution (Federation Mgt)
59HLA Message Ordering Services
- The HLA provides two types of message ordering
  - receive order (unordered): messages passed to federate in an arbitrary order
  - time stamp order (TSO): sender assigns a time stamp to message; successive messages passed to each federate have non-decreasing time stamps
- receive order minimizes latency, does not prevent temporal anomalies
- TSO prevents temporal anomalies, but has somewhat higher latency
60Related Object Management Services
- Sending and Receiving Messages
  - Update Attribute Values / Reflect Attribute Values*
  - Send Interaction / Receive Interaction*
- Message Order (Receive Order or Time Stamp Order)
  - Preferred Order Type: default order type specified in "fed file" for each attribute and interaction
  - Sent Message Order Type
    - TSO if preferred order type is TSO and the federate is time regulating and a time stamp was used in the Update Attribute Values or Send Interaction call
    - RO otherwise
  - Received Message Order Type
    - TSO if sent message order type is TSO and receiver is time constrained
    - RO otherwise
* indicates callback to federate
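The order-type rules above reduce to two small predicates. The function and parameter names below are made up for this sketch and are not RTI API calls.

```python
def sent_order_type(preferred_is_tso, sender_is_time_regulating, timestamp_supplied):
    """Order type with which a message leaves the sending federate."""
    if preferred_is_tso and sender_is_time_regulating and timestamp_supplied:
        return "TSO"
    return "RO"


def received_order_type(sent_type, receiver_is_time_constrained):
    """Order type with which a message is passed to the receiving federate."""
    if sent_type == "TSO" and receiver_is_time_constrained:
        return "TSO"
    return "RO"
```

A message is delivered in time stamp order only if every condition holds on both sides; any missing condition silently downgrades it to receive order.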
61Federated vs. RTI Initiated Services
- Some services are initiated by the federate, others by the RTI
  - Federate invoked services
    - Publish, subscribe, register, update
    - Not unlike calls to a library
    - Procedures defined in the RTI ambassador
  - RTI invoked services
    - Discover, reflect
    - Federate defined procedures, in Federate Ambassador
[Figure: the federate calls Update on the RTI ambassador; the RTI calls Reflect on the federate ambassador]
62Example Receiving a Message
Tick(): transfer execution to RTI to perform RTI functions, perform callbacks

/* code sketch, receiving messages */
Boolean Waiting4Message;

Waiting4Message = TRUE;
while (Waiting4Message) Tick();

/* Federate ambassador */
Proc ReflectAttributeValues ()
{
    save incoming message in buffer;
    Waiting4Message = FALSE;
}
63Advancing Logical Time
- HLA TM services define a protocol for federates to advance logical time; logical time only advances when that federate explicitly requests an advance
  - Time Advance Request: time stepped federates
  - Next Event Request: event stepped federates
  - Time Advance Grant: RTI invokes to acknowledge logical time advances
If the logical time of a federate is T, the RTI guarantees no more TSO messages will be passed to the federate with time stamp < T
Federates are responsible for pacing logical time advances with wallclock time in real-time executions
64HLA Time Management Services
65Time Regulating and Time Constrained Federates
- Federates must declare their intent to utilize time management services by setting their time regulating and/or time constrained flags
  - Time regulating federates: can send TSO messages
    - Can prevent other federates from advancing their logical time
    - Enable Time Regulation / Time Regulation Enabled*
    - Disable Time Regulation
  - Time constrained federates: can receive TSO messages
    - Time advances are constrained by other federates
    - Enable Time Constrained / Time Constrained Enabled*
    - Disable Time Constrained
- Each federate in a federation execution can be
  - Time regulating only (e.g., message source)
  - Time constrained only (e.g., Stealth)
  - Both time constrained and regulating (common case for analytic simulations)
  - Neither time constrained nor regulating (e.g., DIS-style training simulations)
* indicates callback to federate
66Synchronizing Message Delivery
- Goal: process all events (local and incoming messages) in time stamp order. To support this, the RTI will
  - Deliver messages in time stamp order (TSO)
  - Synchronize delivery with simulation time advances
[Figure: TSO messages queued in the RTI and local events in the federate on a common logical time axis, marking the current time, the next TSO message (T'), and the next local event (T)]
- Federate: next local event has time stamp T
  - If no TSO messages w/ time stamp < T, advance to T, process local event
  - If there is a TSO message w/ time stamp T' ≤ T, advance to T' and process TSO message
67Next Event Request (NER)
- Federate invokes Next Event Request (T) to request its logical time be advanced to time stamp of next TSO message, or T, whichever is smaller
- If next TSO message has time stamp T' ≤ T
  - RTI delivers next TSO message, and all others with time stamp T'
  - RTI issues Time Advance Grant (T')
- Else
  - RTI advances federate's time to T, invokes Time Advance Grant (T)
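The RTI-side decision for a Next Event Request can be sketched as below. This models only the grant logic and is not an RTI API: the function name and the heap of `(time stamp, message)` pairs are assumptions, and the sketch presumes the RTI has already established that no TSO message with a smaller time stamp can still arrive.

```python
import heapq


def next_event_request(requested_time, tso_queue):
    """Return (grant time, delivered messages) for a Next Event Request.

    tso_queue: heap of (time stamp, message) pairs, modified in place.
    """
    if tso_queue and tso_queue[0][0] <= requested_time:
        grant_time = tso_queue[0][0]
        delivered = []
        # Deliver the next TSO message and all others with the same time stamp.
        while tso_queue and tso_queue[0][0] == grant_time:
            delivered.append(heapq.heappop(tso_queue)[1])
        return grant_time, delivered
    return requested_time, []  # no earlier TSO message: grant the request
```

A federate whose next local event is at time 7 and whose TSO queue holds two messages at time 5 is granted time 5 and receives both; repeating the request then yields a grant to 7.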
68Code Example Event Stepped Federate
/* sequential simulator */
T = current simulation time
PES = pending event set

While (simulation not complete)
    T = time of next event in PES
    process next event in PES
End-While

/* federated simulator */
While (simulation not complete)
    T = time of next event in PES
    PendingNER = TRUE
    NextEventRequest(T)
    while (PendingNER) Tick()
    process next event in PES
End-While

/* the following federate-ambassador procedures are called by the RTI */
Procedure ReflectAttributeValues ()
    place event in PES
Procedure TimeAdvanceGrant ()
    PendingNER = False
69Lookahead in the HLA
- Each federate must declare a non-negative lookahead value
- Any TSO message sent by a federate must have time stamp at least the federate's current time plus its lookahead
- Lookahead can change during the execution (Modify Lookahead)
  - increases take effect immediately
  - decreases do not take effect until the federate advances its logical time
70Federate/RTI Guarantees
- Federate at logical time T (with lookahead L)
  - All outgoing TSO messages must have time stamp ≥ T+L (L > 0)
- Time Advance Request (T)
  - Once invoked, federate cannot send messages with time stamp less than T plus lookahead
- Next Event Request (T)
  - Once invoked, federate cannot send messages with time stamp less than T plus the federate's lookahead unless a grant is issued to a time less than T
- Time Advance Grant (T) (after TAR or NER service)
  - All TSO messages with time stamp less than or equal to T have been delivered
71Minimum Next Event Time and LBTS
[Figure: example for federate 0 with LBTS_0 = 10 and MNET_0 = 8]
- LBTS_i: Lower Bound on Time Stamp of TSO messages that could later be placed into the TSO queue for federate i
  - TSO messages w/ TS ≤ LBTS_i eligible for delivery
  - RTI ensures logical time of federate i never exceeds LBTS_i
- MNET_i: Minimum Next Event Time is a lower bound on the time stamp of any message that could later be delivered to federate i.
  - Minimum of LBTS_i and minimum time stamp of messages in TSO queue
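The relationship between the two bounds can be sketched as follows. The function names are assumptions, and the eligibility test treats messages with time stamp at most LBTS as deliverable, which is itself an assumed reading of the rule above.

```python
def minimum_next_event_time(lbts, tso_queue_timestamps):
    """MNET: minimum of LBTS and the smallest time stamp in the TSO queue."""
    return min([lbts, *tso_queue_timestamps])


def eligible_for_delivery(lbts, tso_queue_timestamps):
    """Queued TSO time stamps that may be delivered now (assumed: TS <= LBTS)."""
    return sorted(ts for ts in tso_queue_timestamps if ts <= lbts)
```

With an LBTS of 10 and queued messages at 8 and 12, MNET is 8 and only the time stamp 8 message is eligible for delivery.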
72Simultaneous Events
- Simultaneous events are events containing the same time stamp
- Ordering of simultaneous events is often important
  - RTI does not have sufficient information to intelligently order simultaneous events
- HLA: ordering simultaneous events left to the federate
  - Grant to time T (after TAR/NER): all events with time stamp T delivered to federate
  - Simultaneous events delivered to federate in an arbitrary order (may vary from one execution to the next)
  - Federate must have a deterministic way to order simultaneous events to ensure repeatable executions (e.g., sort events by type or other event properties)
73Zero Lookahead
- Zero lookahead: a federate at time T can send TSO messages with time stamp T
- If zero lookahead is allowed, a Time Advance Grant to time T cannot guarantee delivery of all events with time stamp equal to T
74Zero Lookahead in the HLA: TAR Available and NER Available
- Zero lookahead allowed in HLA federations
- Next Event Request (NER), Time Advance Request (TAR)
  - grant to time T guarantees delivery of all TSO messages w/ time stamp T (or less)
  - constraint: once a Time Advance Grant to time T is issued for these requests, subsequent messages sent by the federate must have time stamp strictly greater than T.
- Two new services: Time Advance Request Available (TARA) and Next Event Request Available (NERA)
  - TARA (NERA) advance logical time, similar to TAR (NER)
  - federate can send zero lookahead messages after receiving grant
  - grant to time T does not guarantee all messages with time stamp T have been delivered, only those available at the time of the call
  - order that TSO messages are delivered to the federate is arbitrary
75Zero Lookahead Example
- Two federate types: vehicle, commander
- Vehicle federate
  - When sensors indicate position of another vehicle has changed, notify commander via a zero lookahead interaction
76Representation of Logical Time
- Time represented as a federation-defined abstract data type
- Federation must agree upon a common representation of logical time during federation development
  - Time value (e.g., 0900, 12 May 1973)
  - Time duration (e.g., 30 minutes)
- Federation specifies
  - Data type of time values and duration (structure types allowed)
  - Comparison (time values) and addition (value and duration) operators
- Example 1: simple data type
  - Value: Float; Duration: Float; Comparison: <; Addition: +
- Example 2: simple data type + tie breaking field
  - Value: (Float, Integer); Duration: (Float)
  - Comparison: (X1,X2) < (Y1,Y2): if (X1 = Y1) then (X2 < Y2) else (X1 < Y1)
    - (1.3, 1) < (2.1, 0); (6.1, 2) < (6.1, 4)
  - Addition: (X1,X2) + L: if (L = 0) then (X1,X2), else (X1+L, 0)
    - (1.3, 1) + 2.2 = (3.5, 0); (1.3, 1) + 0.0 = (1.3, 1)
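The second example, a time value with a tie-breaking field, maps naturally onto a small value type. The class name `LogicalTime` and its field names are assumptions for this sketch. The generated ordering compares the float first and the integer only on ties, and addition resets the tie-breaker unless the duration is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class LogicalTime:
    value: float        # compared first
    tie_break: int = 0  # compared only when the values are equal

    def __add__(self, duration):
        if duration == 0:
            return self  # zero duration keeps the tie-breaking field
        return LogicalTime(self.value + duration, 0)
```

For example, `LogicalTime(6.1, 2) < LogicalTime(6.1, 4)` holds because the values tie, and `LogicalTime(1.5, 1) + 2.0` gives `LogicalTime(3.5, 0)`.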
77Example Complex Time Type
- Vehicle federate
  - When sensors indicate position of another vehicle has changed, notify commander via interaction and wait for reply interaction message
- Message types: 0 = position update, 1 = notify interaction, 2 = reply interaction
- Time format: (time value, message type)
78Event Retraction
- Previously sent events can be unsent via the Retract service
  - Update Attribute Values and Send Interaction return a handle to the scheduled event
  - Handle can be used to Retract (unschedule) the event
  - Can only retract event if its time stamp > current time + lookahead
  - Retracted event never delivered to destination (unless Flush Queue used)
79Optimistic Time Management
- Mechanisms to ensure events are processed in time
stamp order
- conservative: block to avoid out-of-order event
processing
- optimistic: detect out-of-order event processing,
recover (e.g., Time Warp)
- Primitives for optimistic time management
- Optimistic event processing
- Deliver (and process) events without time stamp
order delivery guarantee
- HLA: Flush Queue Request
- Rollback
- Deliver message w/ time stamp T, other
computations already performed at times > T
- Must roll back (undo) computations at logical
times > T
- HLA: (local) rollback mechanism must be
implemented within the federate
- Anti-messages and secondary rollbacks
- Anti-message: message sent to cancel (undo) a
previously sent message
- Causes rollback at destination if cancelled
message already processed
- HLA: Retract service; deliver retract request if
cancelled message already delivered
- Global Virtual Time
- Lower bound on future rollback; used to commit I/O
operations, reclaim memory
- HLA: Query Next Event Time service gives current
value of GVT
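The rollback and GVT ideas can be illustrated with a toy logical process. This is a sketch under simplifying assumptions, not Time Warp in full: it uses state saving for rollback, omits anti-messages, and the class `OptimisticProcess` and its state update rule are invented for the example.

```python
class OptimisticProcess:
    """Toy logical process that processes events optimistically."""

    def __init__(self):
        self.state = 0
        # Processed events in time stamp order:
        # (time stamp, value, state before the event).
        self.processed = []

    def _execute(self, timestamp, value):
        # State saving: remember the state before the event so it can be undone.
        self.processed.append((timestamp, value, self.state))
        self.state = self.state * 2 + value  # order-sensitive toy model

    def receive(self, timestamp, value):
        """Process an event; returns how many events were rolled back."""
        undone = []
        # A straggler (time stamp T) undoes all computations at times > T.
        while self.processed and self.processed[-1][0] > timestamp:
            ts, val, before = self.processed.pop()
            self.state = before
            undone.append((ts, val))
        self._execute(timestamp, value)
        # Re-execute the undone events in time stamp order.
        for ts, val in reversed(undone):
            self._execute(ts, val)
        return len(undone)

    def fossil_collect(self, gvt):
        """Reclaim saved history older than GVT; returns entries reclaimed."""
        keep = [entry for entry in self.processed if entry[0] >= gvt]
        reclaimed = len(self.processed) - len(keep)
        self.processed = keep
        return reclaimed
```

Receiving events with time stamps 1, 3, and then 2 triggers one rollback, and the final state matches what time stamp order processing would have produced.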
80Optimistic Time Management in the HLA
- HLA Support for Optimistic Federates
- federations may include conservative and/or
optimistic federates
- federates are not aware of the local time management
mechanism of other federates (optimistic or
conservative)
- optimistic events (events that may be later
canceled) will not be delivered to conservative
federates that cannot roll back
- optimistic events can be delivered to other
optimistic federates
- individual federates may be sequential or
parallel simulations
Flush Queue Request: similar to NER except (1)
deliver all messages in RTI's local message
queues, (2) need not wait for other federates
before issuing a Time Advance Grant
81Summary HLA Time Management
- Functionality
- allows federates with different time management
requirements (and local TM mechanisms) to be
combined within a single federation execution
- DIS-style training simulations
- simulations with hard real-time constraints
- event-driven simulations
- time-stepped simulations
- optimistic simulations
- HLA Time Management services
- Event order
- receive order delivery
- time stamp order delivery
- Logical time advance mechanisms
- TAR/TARA: unconditional time advance
- NER/NERA: advance depending on message time stamps
82Part III: Distributed Virtual Environments
- Introduction
- Dead reckoning
- Data distribution
83Example Distributed Interactive Simulation (DIS)
"The primary mission of DIS is to define an
infrastructure for linking simulations of various
types at multiple locations to create realistic,
complex, virtual worlds for the simulation of
highly interactive activities" [DIS Vision, 1994].
- developed in U.S. Department of Defense,
initially for training
- distributed virtual environments widely used in
DoD; growing use in other areas (entertainment,
emergency planning, air traffic control)
84A Typical DVE Node Simulator
- Execute every 1/30th of a second
- receive incoming messages and user inputs, update
state of remote vehicles
- update local display
- for each local vehicle
- compute (integrate) new state over current time
period
- send messages (e.g., broadcast) indicating new
state
Reproduced from Miller, Thorpe (1995), SIMNET:
The Advent of Simulator Networking, Proceedings
of the IEEE, 83(8): 1114-1123.
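One frame of the node loop listed above might look roughly like the following. This is a sketch with invented names (`Vehicle`, `run_frame`), one-dimensional positions, and the display step left as a comment; it is not taken from SIMNET or DIS code.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    position: float
    velocity: float


def run_frame(local, remote, inbox, dt=1 / 30):
    """Run one frame; returns the state messages to broadcast."""
    # 1. Receive incoming messages and update the state of remote vehicles.
    for name, position, velocity in inbox:
        remote[name] = Vehicle(position, velocity)

    # 2. Update the local display (omitted in this sketch).

    # 3. For each local vehicle, integrate its state over the frame
    #    and queue a message announcing the new state.
    outgoing = []
    for name, vehicle in local.items():
        vehicle.position += vehicle.velocity * dt
        outgoing.append((name, vehicle.position, vehicle.velocity))
    return outgoing
```

A real node would call `run_frame` from a timer firing every 1/30th of a second and hand `outgoing` to the network layer.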
85Typical Sequence
86DIS Design Principles
- Autonomy of simulation nodes
- simulations broadcast events of interest to other
simulations; need not determine which others need
information
- receivers determine if information is relevant to
them, and model local effects of new information
- simulations may join or leave exercises in
progress
- Transmission of "ground truth" information
- each simulation transmits absolute truth about
state of its objects
- receiver is responsible for appropriately
degrading information (e.g., due to
environment, sensor characteristics)
- Transmission of state change information only
- if behavior stays the same (e.g., straight and
level flight), state updates drop to a
predetermined rate (e.g., every five seconds)
- Dead reckoning algorithms
- extrapolate current position of moving objects
based on last reported position
- Simulation time constraints
- many simulations are human-in-the-loop
- humans cannot distinguish temporal differences <
100 milliseconds
- places constraints on communication latency of
simulation platform
87Part III: Distributed Virtual Environments
- Introduction
- Dead reckoning
- Data distribution
88Distributed Simulation Example
- Virtual environment simulation containing two
moving vehicles
- One vehicle per federate (simulator)
- Each vehicle simulator must track location of
other vehicle and produce local display (as seen
from the local vehicle)
- Approach 1: Every 1/60th of a second
- Each vehicle sends a message to other vehicle
indicating its current position
- Each vehicle receives message from other vehicle,
updates its local display
89Limitations
- Position information corresponds to location when
the message was sent; doesn't take into account
delays in sending message over the network
- Requires generating many messages if there are
many vehicles
90Dead Reckoning
- Send position update messages less frequently
- local dead reckoning model predicts the position
of remote entities between updates
91Re-synchronizing the DRM
- When are position update messages generated?
- Compare DRM position with exact position, and
generate an update message if error is too large
- Generate updates at some minimum rate, e.g., every 5
seconds (heartbeats)
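The two update triggers (error threshold and heartbeat) can be combined into one sender-side check. The function name and the default values for `max_error` and `heartbeat` are assumptions made for this sketch.

```python
def should_send_update(true_pos, drm_pos, now, last_update_time,
                       max_error=1.0, heartbeat=5.0):
    """Decide whether the sender should issue a position update."""
    # Trigger 1: the dead reckoning estimate has drifted too far
    # from the exact position.
    error_too_large = abs(true_pos - drm_pos) > max_error
    # Trigger 2: heartbeat, i.e. too much time has passed since the last update.
    heartbeat_due = (now - last_update_time) >= heartbeat
    return error_too_large or heartbeat_due
```

The sender runs the same dead reckoning model as the receivers, so `drm_pos` is what remote nodes currently believe.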
92Dead Reckoning Models
- $P(t)$: precise position of entity at time $t$
- Position update messages: $P(t_1)$, $P(t_2)$, $P(t_3)$
- $v(t_i)$, $a(t_i)$: $i$th velocity, acceleration update
- DRM: estimate $D(t)$, position at time $t$
- $t_i$: time of last update preceding $t$
- $\Delta t = t - t_i$
- Zeroth order DRM
- $D(t) = P(t_i)$
- First order DRM
- $D(t) = P(t_i) + v(t_i)\,\Delta t$
- Second order DRM
- $D(t) = P(t_i) + v(t_i)\,\Delta t + 0.5\,a(t_i)\,(\Delta t)^2$
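The three models translate directly into code. The sketch below assumes one-dimensional positions and takes the time step as the time elapsed since the last update (t minus t_i); the function name and argument names are chosen for illustration.

```python
def dead_reckon(t, t_i, p_i, v_i=0.0, a_i=0.0, order=1):
    """Estimate position at time t from the last update issued at time t_i."""
    dt = t - t_i  # time elapsed since the last update
    if order == 0:
        return p_i
    if order == 1:
        return p_i + v_i * dt
    if order == 2:
        return p_i + v_i * dt + 0.5 * a_i * dt ** 2
    raise ValueError("order must be 0, 1, or 2")
```

For a last update at t_i = 1.0 with position 10.0, velocity 3.0, and acceleration 4.0, the estimates at t = 2.0 are 10.0, 13.0, and 15.0 for orders 0, 1, and 2.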
93DRM Example
- Potential problems
- Discontinuity may occur when position update
arrives; may produce "jumps" in display
- Does not take into account message latency
94Time Compensation
- Taking into account message latency
- Add time stamp to message when update is
generated (sender time stamp)
- Dead reckon based on message time stamp
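A sketch of time compensation with a first order model: the receiver extrapolates from the sender's time stamp rather than from the arrival time, so the network delay is covered by the extrapolation. It assumes sender and receiver clocks are synchronized; the function name is hypothetical.

```python
def compensated_position(msg_pos, msg_vel, send_time, now):
    """First order dead reckoning anchored at the sender's time stamp."""
    # (now - send_time) includes the network latency, so the estimate
    # accounts for how far the entity moved while the message was in flight.
    return msg_pos + msg_vel * (now - send_time)
```

An update stamped 10.0 with position 100.0 and velocity 20.0, evaluated at receiver time 10.25, yields 105.0 rather than the stale 100.0.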
95Smoothing
- Reduce discontinuities after updates occur
- phase in position updates
- After update arrives
- Use DRM to project next k positions
- Interpolate position of next update
Accuracy is reduced to create a more natural
display
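One possible reading of the smoothing steps is sketched below: project the dead reckoning model k frames ahead from the new update, then interpolate linearly from the currently displayed position to that projected point. The function name, the linear interpolation, and the first order projection are assumptions of this sketch.

```python
def smoothing_path(displayed, update_pos, update_vel, k, frame_dt):
    """Positions to display over the next k frames after an update arrives."""
    # Where the new dead reckoning model says the entity will be in k frames.
    target = update_pos + update_vel * k * frame_dt
    # Move the displayed position toward the target in k equal steps
    # instead of jumping to the updated position at once.
    return [displayed + (target - displayed) * step / k
            for step in range(1, k + 1)]
```

For example, `smoothing_path(0.0, 10.0, 0.0, 4, 0.1)` spreads a 10-unit correction over four frames.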
96Part III: Distributed Virtual Environments
- Introduction
- Dead reckoning
- Data distribution
97Data Distribution Management
- A federate sending a message (e.g., updating its
current location) cannot be expected to
explicitly indicate which other federates should
receive the message
- Basic problem: which federates receive messages?
- broadcast update to all federates (SimNet, early
versions of DIS)
- does not scale to large numbers of federates
- grid sectors
- OK for distribution based on spatial proximity
- routing spaces (STOW-RTI, HLA)
- generalization of grid sector idea
- Content-based addressing: in general, federates
must specify
- Name space: vocabulary used to specify what data
is of interest and to describe the data contained
in each message (HLA: routing space)
- Interest expressions: indicate what information a
federate wishes to receive (subset of name space;
HLA: subscription region)
- Data description expression: characterizes data
contained within each message (subset of name
space; HLA: publication region)
98HLA Routing Spaces
- Federate 1 (sensor): subscribe to S1
- Federate 2 (sensor): subscribe to S2
- Federate 3 (target): update region U
- update messages by target are sent to federate 1,
but not to federate 2
- HLA Data Distribution Management
- N dimensional normalized routing space
- interest expressions are regions in routing space
(S1 = [0.1, 0.5], [0.2, 0.5])
- each update associated with an update region in
routing space
- a federate receives a message if its subscription
region overlaps with the update region
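The overlap rule can be sketched as an interval test per dimension. Regions are represented here as one (low, high) pair per dimension of the normalized routing space. The extents of S1 follow one reading of the slide; the extents of S2 and U are made up for the example.

```python
def overlaps(region_a, region_b):
    """True if two axis-aligned regions overlap in every dimension."""
    return all(lo_a < hi_b and lo_b < hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(region_a, region_b))


# Two-dimensional routing space; each region is ((x_lo, x_hi), (y_lo, y_hi)).
S1 = ((0.1, 0.5), (0.2, 0.5))   # subscription region of federate 1
S2 = ((0.7, 0.9), (0.2, 0.5))   # hypothetical subscription region of federate 2
U = ((0.4, 0.6), (0.3, 0.4))    # hypothetical update region of federate 3

receivers = [name for name, region in (("federate 1", S1), ("federate 2", S2))
             if overlaps(region, U)]
```

With these extents, only federate 1 ends up in `receivers`.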
99Implementation Grid Based
U publishes to 12, 13; S1 subscribes to
6, 7, 8, 11, 12, 13; S2 subscribes to 11, 12, 16, 17
- subscription region: Join each group overlapping
subscription region
- attribute update: send Update to each group
overlapping update region
- need additional filtering to avoid unwanted
messages, duplicates
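The mapping from regions to grid cells can be sketched as follows, assuming one multicast group per cell. The figure is not available here, so the 5x5 grid, the row-major cell numbering starting at 0, and all region extents are assumptions chosen to reproduce the cell lists above.

```python
import math

GRID = 5  # assumed: 5x5 grid over the unit routing space, one group per cell


def cells_overlapping(region, grid=GRID):
    """Set of cell numbers (row-major, from 0) that a 2-D region touches."""
    (x_lo, x_hi), (y_lo, y_hi) = region
    cols = range(math.floor(x_lo * grid),
                 min(math.floor(x_hi * grid), grid - 1) + 1)
    rows = range(math.floor(y_lo * grid),
                 min(math.floor(y_hi * grid), grid - 1) + 1)
    return {row * grid + col for row in rows for col in cols}


# Hypothetical extents: ((x_lo, x_hi), (y_lo, y_hi)).
S1 = ((0.25, 0.75), (0.25, 0.55))
S2 = ((0.25, 0.55), (0.45, 0.75))
U = ((0.57, 0.75), (0.45, 0.55))

s1_groups = cells_overlapping(S1)   # groups federate 1 joins
s2_groups = cells_overlapping(S2)   # groups federate 2 joins
u_groups = cells_overlapping(U)     # groups an update is sent to

# Federate 1 shares two cells with U, so it receives the update twice.
duplicates_for_s1 = len(s1_groups & u_groups)
# Federate 2 shares a cell with U although the regions do not overlap,
# so it receives an unwanted message that must be filtered out.
unwanted_for_s2 = s2_groups & u_groups
```

This shows why the grid scheme needs the extra filtering step: cell sharing is a coarser test than region overlap.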
100Changing a Subscription Region
- Change subscription region
- issue Leave operations for (cells in old region -
cells in new region)
- issue Join operations for (cells in new region -
cells in old region)
- Observation: each processor can autonomously
join/leave groups whenever its subscription
region changes, w/o communication.
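The Leave and Join sets are plain set differences, which is why a federate can compute them locally. A small sketch, with a hypothetical function name and example cell numbers:

```python
def membership_changes(old_cells, new_cells):
    """Return (groups to join, groups to leave) for a changed subscription."""
    leave = old_cells - new_cells   # cells in old region but not in new region
    join = new_cells - old_cells    # cells in new region but not in old region
    return join, leave


# Example: the subscription region shifts one cell to the right.
join, leave = membership_changes({6, 7, 8, 11, 12, 13},
                                 {7, 8, 9, 12, 13, 14})
```

Cells present in both regions need no operation at all.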
101Another Approach Region-Based Implementation
- Multicast group associated with U
- Membership of U: S1
- Modify U to U'
- determine subscription regions overlapping with
U'
- Modify membership accordingly
- Associate a multicast group with each update
region
- Membership: all subscription regions overlapping
with update region
- Changing subscription (update) region
- Must match new subscription (update) region
against all existing update (subscription)
regions to determine new membership
- No duplicate or extra messages
- Change subscription: typically requires
interprocessor communication
- Can use grids (in addition to regions) to reduce
matching cost
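In the region-based scheme, group membership is recomputed by matching an update region against every subscription region. The sketch below uses brute-force matching; the function names and all region extents are hypothetical.

```python
def overlaps(region_a, region_b):
    """True if two axis-aligned regions overlap in every dimension."""
    return all(lo_a < hi_b and lo_b < hi_a
               for (lo_a, hi_a), (lo_b, hi_b) in zip(region_a, region_b))


def group_membership(update_region, subscriptions):
    """Subscribers whose regions overlap the update region."""
    return {name for name, region in subscriptions.items()
            if overlaps(region, update_region)}


subscriptions = {
    "S1": ((0.1, 0.5), (0.2, 0.5)),
    "S2": ((0.7, 0.9), (0.2, 0.5)),
}
before = group_membership(((0.4, 0.6), (0.3, 0.4)), subscriptions)   # region U
after = group_membership(((0.6, 0.8), (0.3, 0.4)), subscriptions)    # modified U
```

Each change costs one pass over all subscription regions, which is the matching cost that grids can help reduce.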
102Distributed Virtual Environments Summary
- Perhaps the most dominant application of
distributed simulation technology to date
- Human in the loop: training, interactive,
multi-player video games
- Hardware in the loop
- Managing interprocessor communication is the key
- Dead reckoning techniques
- Data distribution management
- Real-time execution essential
- Many other issues
- Terrain databases; consistent, dynamic terrain
- Real-time modeling and display of physical
phenomena
- Synchronization of hardware clocks
- Human factors
103Future Research Directions
104The good news...
- Parallel and distributed simulation technology is
seeing widespread use in real-world applications
and systems
- HLA, standardization (OMG, IEEE 1516)
- Other defense systems
- Interactive games (no time management yet,
though!)
105Interoperability and Reuse
Reuse: the principal reason distributed simulation
technology is seeing widespread use (scalability
still important!)
"Sub-objective 1-1: Establish a common high level
simulation architecture to facilitate
interoperability and reuse of M&S components."
- U.S. DoD Modeling and Simulation Master Plan
"critical capability that underpin the IMTR
vision is... total, seamless model
interoperability. Future models and simulations
will be transparently compatible, able to
plug-and-play"
- Integrated Manufacturing Technology Roadmap
- Enterprise Management
- Circuit design, embedded systems
- Transportation
- Telecommunications
106Research Challenges
- Shallow (communication) vs. deep (semantics)
interoperability: How should I develop
simulations today, knowing they may be used
tomorrow for entirely different purposes?
- Multi-resolution modelling
- Smart, adaptable interfaces?
- Time management: A lot of work completed already,
but still not a solved problem!
- Conservative: zero lookahead simulations will
continue to be the common case
- Optimistic: automated optimistic-ization of
existing simulations (e.g., state saving)
- Challenge basic assumptions used in the past: New
problems and solutions, not new solutions to the
old problems
- Relationship of interoperability and reuse in
simulation to other domains (e.g., e-commerce)?
107Real-Time Systems
- Interactive distributed simulations for immersive
virtual environments represent fertile new
ground for distributed simulation research
- The distinction between the virtual and real
world is disappearing
108Research Challenges
- Real-time time management
- Time managed, real-time distributed simulations?
- Predictable performance
- Performance tunable (e.g., synchronization
overheads)
- Mixed time management (e.g., federations with
both analytic and real-time training simulations)
- Distributed simulation over WANs, e.g., the
Internet
- Server vs. distributed architectures?
- PDES over unreliable transport and nodes
- Distributed simulations that work!
- High quality, robust systems
- Data Distribution Management
- Implementation architectures
- Managing large numbers of multicast groups
- Join/leave times
- Integration and exploitation of network QoS
109Simulation-Based Decision Support
- Battle management
- Enterprise control
- Transportation Systems
110Research Challenges
- Ultra fast model execution
- Much, much faster-than-real-time execution
- Instantaneous model execution: Spreadsheet-like
performance
- Ultra fast execution of multiple runs
- Automated simulation analyses: smart,
self-managed devices and systems?
- Ubiquitous simulation?
111Closing Remarks
- After years of being largely an academic
endeavor, parallel and distributed simulation
technology is beginning to thrive.
- Many major challenges remain to be addressed
before the technology can achieve its fullest
potential.
112The End