A Framework for Collaborative Distributed Simulation over the Grid


1
A Framework for Collaborative Distributed
Simulation over the Grid

Stephen John Turner
Parallel Distributed Computing Centre
Nanyang Technological University
Singapore

2
Project Funding
SMA Inter-University Project
Wentong CAI (Nanyang Technological Univ)
Stephen J TURNER (Nanyang Technological Univ)
Yong Meng TEO (National Univ of Singapore)
Rassul AYANI (Royal Institute of Technology, Sweden)

UK e-Science Sister Project
Georgios THEODOROPOULOS (Univ of Birmingham)
Brian LOGAN (Univ of Nottingham)
Stephen J TURNER (Nanyang Technological Univ)
Wentong CAI (Nanyang Technological Univ)

3
Outline

Background
Distributed Simulation
Grid Computing
Motivation
Research Challenges
HLA-based Distributed Simulation
Grid Services and Service Discovery
Load Management System
Grid Enabled HLA/RTI
Conclusions

4
Distributed Simulation

Provides a way of linking simulation components
(federates) of various types at possibly
different locations to create a common virtual
environment (federation)

5
Example Application Areas

Battlefield Simulation
Linking different types of forces at multiple
physical locations to create a realistic and
complex virtual world
Supply Chain Simulation
Managing material and information flow, from
manufacturers through distributors to customers
Air Traffic Control
Simulating airports and airspace sectors to
provide faster than real-time simulation for
what-if analysis
Multi-player Internet Games
Involving massively multi-player (10,000) virtual
worlds

6
High Level Architecture
7
High Level Architecture

Features of High Level Architecture
Each federate has a simulation object model (SOM)
defining the data to be shared with other
federates allowing reuse in different federations
The federation (set of federates) has a common
federation object model (FOM)
HLA supports distributed simulations linking the
federates of a federation over a LAN or the
Internet
Time Management can be used to ensure the correct
ordering of events
HLA is an IEEE (1516) and OMG standard

8
Ambassador Paradigm
9
Grid Computing

Grid technology is the next step in the evolution
of computing, enabling new forms of collaboration
through the seamless sharing of distributed
computing and data resources

Communities can share geographically distributed
resources for their common purpose
10
Grid Computing
Web Services
Grid Services
OGSA
OGSI
Globus Toolkit
11
Motivation

Collaborative Simulation Development
The development of complex simulations usually
requires collaborative effort from analysts with
different domain knowledge and expertise,
possibly at different locations
Sharing of Computing Resources
Simulation systems often require huge computing
resources and the participants in the simulation
and/or data sets required may also be
geographically distributed

12
Motivation

HLA-based Distributed Simulation on the Grid
HLA defines a standard for reuse and
interoperability
Grid technologies enable collaboration and the
use of distributed computing resources

Collaborative
Distributed
Complex Multi-dimensional

13
Simulation Life Cycle
14
Research Challenges

Service/Model Discovery
Based on requirements, suitable component
models are selected to form an overall simulation
Research Issues
How are simulation models registered as grid
services?
How are simulation models discovered?
How are the interfaces defined?
Are the simulation models HLA compliant?
Do they conform to any standard reference models
(e.g. HLA-CSPIF)?

15
Research Challenges

Service/Model Composition
Checking semantic interoperability between
individual component simulation models from
different sources
Research Issues
Can the output of one simulation model feed into
the input of another?
How is the work flow of the configuration
described?
What are the mechanisms for verifying the
correctness of the simulation?

16
Research Challenges

Security
Simulation partners should be allowed to specify
selective access to their simulation models
Research Issues
Does a user have access to a particular
simulation model or data?
Can a user selectively share sensitive data with
different partners?
Does the simulation model originate from a
trusted partner?
Must the model be executed on a particular
resource?

17
Research Challenges

Execution
Simulation partners may obtain computing
resources from the Grid to supplement their needs
Research Issues
How can the different simulation runs be
partitioned onto the available computing
resources?
What mechanisms should be used for scheduling and
load management of simulations on the Grid?
What kind of fault tolerance mechanisms are
required?

18
Simulation Life Cycle
Semantic Interfaces
Resource Management
Workflow
Policies
19
HLA-based Distributed Simulation

Discovery and Composition of Models

Discovery of Resources

Management of Simulation Execution

20
Grid Services and Service Discovery

1. Query Index Service for RTI Service handle for
federation
2. Create RtiExec if necessary and get endpoint used
by RtiExec
3. Query Index Service for Federate Factory Service
handle
4. Create Federate Service and Federate Process
5. Federate Processes join federation
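
The discovery steps above can be sketched with an in-memory registry. Every class and method name here (`IndexService`, `RtiService`, `FederateFactoryService`, `start_federation`, the `rtiexec://` endpoint format) is hypothetical and stands in for the corresponding Grid service; none of it is Globus or RTI API.

```python
class IndexService:
    """Hypothetical in-memory registry standing in for a Grid index service."""

    def __init__(self):
        self._handles = {}

    def register(self, name, handle):
        self._handles[name] = handle

    def query(self, name):
        return self._handles.get(name)


class RtiService:
    def __init__(self):
        self._endpoints = {}

    def get_or_create_rtiexec(self, federation):
        # Create the RtiExec only if this federation does not have one yet.
        if federation not in self._endpoints:
            self._endpoints[federation] = f"rtiexec://{federation}"
        return self._endpoints[federation]


class FederateProcess:
    def __init__(self, name):
        self.name = name
        self.joined_endpoint = None

    def join(self, endpoint):
        self.joined_endpoint = endpoint


class FederateFactoryService:
    def create_federate(self, name):
        return FederateProcess(name)


def start_federation(index, federation, federate_names):
    rti = index.query("RtiService")                        # step 1
    endpoint = rti.get_or_create_rtiexec(federation)       # step 2
    factory = index.query("FederateFactoryService")        # step 3
    federates = [factory.create_federate(n) for n in federate_names]  # step 4
    for federate in federates:                             # step 5
        federate.join(endpoint)
    return endpoint, federates


index = IndexService()
index.register("RtiService", RtiService())
index.register("FederateFactoryService", FederateFactoryService())
endpoint, federates = start_federation(index, "SupplyChain", ["factory", "depot"])
print(endpoint, [f.name for f in federates])
```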

21
Grid Services and Service Discovery

3. Query Index Service for Federate Factory Service
handle
4. Create Federate Service and Federate Process
4a. Federate Service can query Index Service for
RtiExec endpoint
5. Federate Processes join federation

22
Load Management System

Use Grid software for
Authentication,
Resource Discovery, Allocation and Monitoring, and
Facilitating Federate Migration

23
Load Management System
[Diagram labels: Resource Discovery, Allocation and Monitoring; Globus; Run Time Infrastructure]
24
Problems

Developing a Grid-enabled, HLA-based simulation
requires a large effort
Check-pointing and state saving are application
dependent and are very difficult in general
Federate migration may require federation-wide
synchronization, an expensive operation
Messages may be delayed or lost in transit during
federate migration

25
Objectives

Develop a framework that allows the modeler to
concentrate on the simulation
Provide an application-independent federate
execution model
Hide details of HLA/RTI interface and load
management realization from simulation designer
Make federate state saving easier and more
modular and simplify federate migration design
Achieve dynamic load balancing of HLA-based
distributed simulation over a Grid environment

26
SimKernel

Simulation code extended with two interfaces
One for communicating with Runtime
Infrastructure (RTI)
One for communicating with Load Management
System (LMS)

27
SimKernel
Design
Implementation
Execution
28
Federate

Each federate contains two threads: a simulation
thread (SimKernel) and a load management thread
(LMClient)
SimKernel processes simulation events as defined
by the user and communicates with RTI
LMClient works with Load Manager (LM) to perform
federate migration
receive instruction from LM
stop SimKernel
get SimKernel execution state
transfer SimKernel configuration and execution
state
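
The LMClient steps above can be sketched as follows. This is an illustration under stated assumptions, not the SimKernel implementation: the state layout, the `"migrate"` instruction string, the JSON encoding and the `send` callable are all hypothetical.

```python
import json


class SimKernel:
    """Hypothetical simulation thread with a serializable execution state."""

    def __init__(self, config):
        self.config = config
        self.running = True
        self.state = {"sim_time": 0.0, "pending_events": []}

    def stop(self):
        self.running = False

    def get_execution_state(self):
        return dict(self.state)


class LMClient:
    """Hypothetical load management thread living in the same federate."""

    def __init__(self, kernel, send):
        self.kernel = kernel
        self.send = send  # assumed transport: any callable that accepts bytes

    def on_instruction(self, instruction):
        # 1. receive instruction from LM
        if instruction != "migrate":
            return False
        # 2. stop SimKernel
        self.kernel.stop()
        # 3. get SimKernel execution state
        state = self.kernel.get_execution_state()
        # 4. transfer SimKernel configuration and execution state
        message = {"config": self.kernel.config, "state": state}
        self.send(json.dumps(message).encode("utf-8"))
        return True


outbox = []
kernel = SimKernel(config={"federate": "depot"})
kernel.state["sim_time"] = 12.5
client = LMClient(kernel, outbox.append)
client.on_instruction("migrate")
print(json.loads(outbox[0]))
```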

29
Load Manager

Load Manager
Constantly monitors and collects load information
of each individual participating computing node
Runs load balancing algorithm to determine which
federate should migrate from which host to which
destination
Communicates with the LMClients at both the
source and destination hosts until migration
succeeds
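
The slides do not say which load balancing algorithm the Load Manager runs. As one possible illustration, the sketch below uses a simple greedy rule that moves a federate from the most loaded host to the least loaded one when the gap exceeds a threshold; the function name, the load scale and the threshold are assumptions.

```python
def choose_migration(host_loads, placement, threshold=0.2):
    """Pick (federate, source, destination), or None if no move is worthwhile.

    host_loads: host name -> load in [0, 1] (assumed metric)
    placement:  federate name -> host it currently runs on
    """
    source = max(host_loads, key=host_loads.get)
    destination = min(host_loads, key=host_loads.get)
    # Skip migration when the hosts are already roughly balanced.
    if host_loads[source] - host_loads[destination] < threshold:
        return None
    candidates = sorted(f for f, h in placement.items() if h == source)
    if not candidates:
        return None
    return candidates[0], source, destination


loads = {"nodeA": 0.9, "nodeB": 0.1, "nodeC": 0.5}
placement = {"fed1": "nodeA", "fed2": "nodeC"}
print(choose_migration(loads, placement))
```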

30
Migration Approaches

Federation-wide synchronization

[Diagram: all federates take part in]
Federation-Wide Save
Federate Migration
Federation-Wide Restore
Costly Operation!
31
Migration Approaches

Communication among federates
Messages may be lost in transit during migration

[Diagram labels: publish, subscribe, unsubscribe, resign, join, msg, network]
32
Our Approach

We developed an algorithm aiming to
Provide transparent migration, and
Minimize the migration overhead
Run two instances of the migrating federate until
event integrity is ensured
No synchronization or FTP communication is
required
Implementation is specific to federates based on
SimKernel
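
The slides give no detail on how event integrity is checked while the two instances overlap. The sketch below shows one way the idea could work, assuming every message carries a unique identifier: events the old instance still holds are merged with those the new instance has already received, so nothing is lost or duplicated. The function name and the use of message IDs are assumptions, not the published protocol.

```python
def merge_pending(old_pending, new_received):
    """Combine unprocessed events from the old instance with new arrivals.

    old_pending:  message IDs the migrating instance received but has not
                  processed yet, in arrival order
    new_received: message IDs the restarting instance has received since it
                  joined the federation, in arrival order
    """
    seen = set(new_received)
    # Messages that reached only the old instance would otherwise be lost.
    missing = [m for m in old_pending if m not in seen]
    return missing + list(new_received)


# Old instance still holds messages 3 and 4; new instance joined in time
# to receive 4, 5 and 6, so only message 3 has to be forwarded.
print(merge_pending([3, 4], [4, 5, 6]))
```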

33
Federate Migration
[Sequence diagram. Participants: migrating federate, LMClient at source, Load Manager, RTI, LMClient at destination, restarting federate.
Messages shown (order not recoverable from the transcript): Req_migrate, requestInformation, returnInformation, getMsgCount, recvMsgCount, suspend, collect, sendOutgoingEvents, flushQueueRequest, receivedInteraction, missingMsg, notifyMissingMsg, returnStatus, resignFederationExec, new, restore, joinFederation, resume, migrationSucceeded, pub/sub Interaction.
The diagram also marks a latency period.]
34
Experimental Results
35
Grid Enabled HLA/RTI
[Diagram: Client 1 ... Client n connected through the Grid Network to Federation 1 ... Federation m]
36
Design
Grid Services: indexing, discovery, resource
management, monitoring services
[Diagram labels: Client, Resource, Simulation Code, Proxy, Proxies/Federates, Grid-enabled HLA API, HLA API, Grid Services, Globus, RTI on LAN, Grid Network]
37
Client Proxy Communication
[Diagram labels: Federate, Proxy, My FedAmb (notification sink), Proxy RTIamb (Grid Service), Proxy FedAmb notification, RTIamb call to Grid Service, Grid Network]
38
Proxy RTI Communication
[Diagram labels: Proxy, Proxy RTIamb (Grid Service), Proxy FedAmb notification, RTIamb, FedAmb]
39
Discussion

Advantages
Avoids firewall issues as client communicates
with proxy via grid services
Client application code can run on heterogeneous
platforms
Provides easy migration of client code, since the
proxy does not need to be migrated
Disadvantages
Overhead of communication as all simulation
events use grid services
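
The client/proxy arrangement discussed above can be sketched in a few classes. This is a simplified illustration, not the project's code: `LocalRti`, `Proxy` and `GridEnabledRtiAmbassador` are hypothetical names, and plain method calls stand in for the Grid service invocations and notifications that would cross the network.

```python
class LocalRti:
    """Hypothetical stand-in for an RTI running on the proxy's LAN."""

    def __init__(self):
        self._callbacks = {}

    def join(self, federate, callback):
        self._callbacks[federate] = callback

    def send_interaction(self, sender, name, params):
        # Deliver to every joined federate except the sender.
        for federate, callback in self._callbacks.items():
            if federate != sender:
                callback(name, params)


class Proxy:
    """Runs next to the RTI and acts as the federate on the client's behalf."""

    def __init__(self, federate, rti, notify_client):
        self.federate = federate
        self.rti = rti
        # Federate-ambassador callbacks are relayed to the client as notifications.
        rti.join(federate, notify_client)

    def send_interaction(self, name, params):
        self.rti.send_interaction(self.federate, name, params)


class GridEnabledRtiAmbassador:
    """Client-side API: every call is forwarded to the proxy."""

    def __init__(self, proxy):
        self.proxy = proxy

    def send_interaction(self, name, params):
        # In a real deployment this would be a remote Grid service call,
        # which is the source of the communication overhead noted above.
        self.proxy.send_interaction(name, params)


rti = LocalRti()
inbox_a, inbox_b = [], []
amb_a = GridEnabledRtiAmbassador(Proxy("A", rti, lambda n, p: inbox_a.append((n, p))))
amb_b = GridEnabledRtiAmbassador(Proxy("B", rti, lambda n, p: inbox_b.append((n, p))))
amb_a.send_interaction("Landing", {"flight": "SQ1"})
print(inbox_a, inbox_b)
```

Because the client only holds a reference to its proxy, the client code can move to another host without the proxy leaving the federation.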

40
Conclusions

Work Done
Developed a simple prototype using Globus for
resource discovery, allocation and federate
deployment (DS-RT 02)
Developed SimKernel framework to allow the modeler
to concentrate on the simulation, rather than
implementation (DS-RT 03)
Developed a federate migration protocol without
using federation synchronization (ICCS 04)
Developed Grid Service and Service Discovery
Framework (submitted to DS-RT 04)

41
Conclusions

Future Work
Service/model discovery
Service/model composition
Grid workflow languages
Grid enabled HLA/RTI
Performance measurement
Alternative communication mechanisms
Migration and fault tolerance
Integration of sub-projects
Convert to GT4 (WS-RF)

42
Thank you for your attention!
The HLA defines a standard for the
construction of large-scale distributed
simulations, while Grid technologies enable
collaboration and the use of distributed
computing resources and facilitate
access to geographically distributed data sets.