Title: QoS-driven Lifecycle Management of Service-oriented Distributed Real-time
1 QoS-driven Lifecycle Management of
Service-oriented Distributed Real-time Embedded
Systems
Aniruddha Gokhale
a.gokhale_at_vanderbilt.edu
www.dre.vanderbilt.edu/gokhale
Assistant Professor
ISIS, Dept. of EECS
Vanderbilt University
Nashville, Tennessee
February 16th, 2006
www.dre.vanderbilt.edu
2 Service-oriented Style of Distributed Real-time Embedded Systems
- Regulating and adapting to (dis)continuous changes in runtime environments
  - e.g., online prognostics, dependable upgrades
- Satisfying tradeoffs between multiple (often conflicting) QoS demands
  - e.g., secure, real-time, reliable, etc.
- Satisfying QoS demands in the face of fluctuating and/or insufficient resources
  - e.g., mobile ad hoc networks (MANETs)
3 Characteristics of SOA-style DRE Systems
- Manifestation of Service-Oriented Architectures (SOA) in the distributed real-time embedded (DRE) systems space
- Applications composed of one or more operational strings of services
- A service is a component or an assembly of components
- Dynamic (re)deployment of services into operational strings is necessary
- New class of QoS (performance and survivability) requirements
- Realized using enabling component middleware technologies, e.g., CCM, .NET and J2EE
4 QoS Issues for SOA-style DRE Systems
- Per-component concern: choice of implementation
  - Depends on resources, compatibility with other components in assembly
- Communication concern: choice of communication mechanism used
- Assembly concerns: what components to assemble dynamically? What order? What configurations end-to-end are valid?
- Failure recovery concern: what is the unit of failover?
- Sharing concern: shared components will need proactive survivability since they affect several services simultaneously
- Availability concern: what is the degree of redundancy? What replication styles to use? Does it apply to the whole assembly?
- Deployment concern: how to select resources? Risk alleviation?
5 Tangled Concerns in SOA-style DRE Systems
- Demonstrates numerous tangled para-functional concerns
- Significant sources of variability that affect end-to-end QoS (performance and survivability)
Separation of Concerns and Managing Variability is the Key
6 (1) Design-time Variability Management in SOA-style DRE Systems
- Focus on Separation of Concerns
- What if Analysis
- Analytical methods
- Simulation methods
- Model-driven generative programming for what if
- Understanding the impact of individual concerns
- Students involved
- Krishnakumar Balasubramanian, Jaiganesh
Balasubramanian, Gan Deng, Amogh Kavimandan,
James Hill, Sumant Tambe, Arundhati Kogekar,
Dimple Kaul
Work partly supported by DARPA PCES program (PI),
DARPA ARMS Program, PI on subcontracts from
Lockheed Martin ATL, NSF CSR-SMA Program, PI
7 Separation of Concerns using CoSMIC
- Project Lead and PI, DARPA PCES program
- CoSMIC project focuses on separation of deployment and configuration concerns
- Model-driven generative programming framework
- Complementary technology to CIAO and DAnCE middleware
- www.dre.vanderbilt.edu/cosmic
- CoSMIC tools, e.g., PICML, used for separation of concerns in operational strings
- Captures the data model of the OMG D&C specification
- Synthesis of static deployment plans for DRE components
- New capabilities being added for static deployment planning
Work supported by DARPA PCES Program, PI
8 Case Study for What if Analysis: Virtual Router
- Network services need support for efficient (de)multiplexing, dispatching and routing/forwarding
- e.g., VPN service provided by a virtual router (VR)
- Provides differentiated services to customers, e.g., prioritized service
- VPN setup messages must be efficiently (de)multiplexed, serviced and forwarded
- Implemented using middleware
- Need to estimate capacity of the system at design-time
Problem boils down to capacity planning and estimating performance of configured middleware
9 Performance Analysis of Reactor Pattern in VR
- Customers send VPN setup messages to router
- VPN setup messages manifest as events at the VR
- VR must service these events (e.g., resource allocation) and honor the prioritized service, if any
- Accepted messages are forwarded
- Events could be dropped in overload conditions
The Reactor architectural pattern allows event-driven applications to demultiplex and dispatch service requests that are delivered to an application from one or more clients.
- Reactor pattern decouples the detection, demultiplexing, and dispatching of events from the handling of events
- Participants include the Reactor, Event handle, Event demultiplexer, abstract and concrete event handlers
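The decoupling described above can be sketched in a few lines of Python. This is a minimal illustration, not the middleware implementation analyzed in the talk: it uses the standard `selectors` module as the synchronous event demultiplexer, and the names `Reactor`, `EchoUpperHandler` and `demo` are hypothetical.

```python
import selectors
import socket


class Reactor:
    """Single-threaded reactor: waits on registered handles and dispatches
    each ready handle to the handler registered for it."""

    def __init__(self):
        # Synchronous event demultiplexer (select/poll/epoll, chosen by the OS).
        self._demux = selectors.DefaultSelector()

    def register(self, handle, handler):
        self._demux.register(handle, selectors.EVENT_READ, handler)

    def unregister(self, handle):
        self._demux.unregister(handle)

    def handle_events(self, timeout=None):
        """Run one loop iteration; return the number of events dispatched."""
        ready = self._demux.select(timeout)
        for key, _mask in ready:
            key.data(key.fileobj)  # dispatch to the concrete event handler
        return len(ready)

    def close(self):
        self._demux.close()


class EchoUpperHandler:
    """Hypothetical concrete event handler: replies with upper-cased input."""

    def __call__(self, sock):
        data = sock.recv(1024)
        sock.sendall(data.upper())


def demo():
    # A connected socket pair stands in for a client and the router side.
    client, server = socket.socketpair()
    reactor = Reactor()
    reactor.register(server, EchoUpperHandler())
    client.sendall(b"vpn setup")
    dispatched = reactor.handle_events(timeout=1.0)
    reply = client.recv(1024)
    reactor.unregister(server)
    reactor.close()
    client.close()
    server.close()
    return dispatched, reply
```

The reactor knows nothing about what a handler does with an event; it only detects readiness and dispatches, which is the separation the pattern is meant to provide.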
10 Modeling VR Capabilities in a Reactor
- Consider VPN service for two customer classes
- Reactor accepts and handles two types of input events
- Differentiated services for two classes
- Events are handled in prioritized order
- Each event type has a separate queue to hold the incoming events. Buffer capacity for events of type one is N1 and of type two is N2.
- Event arrivals are Poisson for type one and type two events with rates λ1 and λ2, resp.
- Event service time is exponential for type one and type two events with rates μ1 and μ2, resp.
Model of a single-threaded, select-based reactor implementation
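The queueing assumptions above can be explored with a small stochastic simulation. The sketch below is a simplification and not the model used in the talk: it assumes non-preemptive priority for type-one events, assumes the event in service does not occupy a buffer slot, and ignores any snapshot behaviour of a select-based reactor. The function name `simulate_reactor` and its parameters are hypothetical.

```python
import random


def simulate_reactor(lam1, lam2, mu1, mu2, n1, n2, horizon, seed=0):
    """Simulate two event classes sharing one server with finite buffers."""
    rng = random.Random(seed)
    lam = {1: lam1, 2: lam2}   # Poisson arrival rates
    mu = {1: mu1, 2: mu2}      # exponential service rates
    cap = {1: n1, 2: n2}       # buffer capacities (waiting events only)
    waiting = {1: 0, 2: 0}
    in_service = None          # class currently being served, if any
    arrived = {1: 0, 2: 0}
    dropped = {1: 0, 2: 0}
    served = {1: 0, 2: 0}
    now = 0.0
    while True:
        # Competing exponential clocks: two arrival streams plus, if the
        # server is busy, the completion of the event in service.
        events = [(lam[1], "arrival", 1), (lam[2], "arrival", 2)]
        if in_service is not None:
            events.append((mu[in_service], "done", in_service))
        total = sum(rate for rate, _, _ in events)
        now += rng.expovariate(total)
        if now > horizon:
            break
        # Choose which clock fired, with probability proportional to its rate.
        pick = rng.random() * total
        for rate, kind, cls in events:
            if pick < rate:
                break
            pick -= rate
        if kind == "arrival":
            arrived[cls] += 1
            if in_service is None:
                in_service = cls
            elif waiting[cls] < cap[cls]:
                waiting[cls] += 1
            else:
                dropped[cls] += 1          # buffer full: event is lost
        else:
            served[cls] += 1
            if waiting[1] > 0:             # type one has priority
                waiting[1] -= 1
                in_service = 1
            elif waiting[2] > 0:
                waiting[2] -= 1
                in_service = 2
            else:
                in_service = None
    return {"arrived": arrived, "dropped": dropped, "served": served,
            "waiting": waiting, "in_service": in_service}
```

Dividing `dropped` by `arrived` for each class gives an empirical loss fraction, and `served` divided by `horizon` gives an empirical throughput, which can be compared across buffer sizes.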
11 Performance Metrics of Interest for Reactor
- Throughput
  - Number of events that can be processed per unit time
  - Applications such as telecommunications call processing.
- Queue length
  - Queuing for the event handler queues.
  - Appropriate scheduling policies for applications with real-time requirements.
- Total number of events
  - Total number of events in the system.
  - Scheduling decisions.
  - Resource provisioning required to sustain system demands.
- Probability of event loss
  - Events discarded due to lack of buffer space.
  - Safety-critical systems.
  - Levels of resource provisioning.
- Response time
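Several of these metrics are tied together by standard steady-state identities. As a hedged illustration (the symbols below are introduced here and do not appear in the slides): for event type i with arrival rate \lambda_i and loss probability P_{loss,i}, the throughput T_i of accepted events and, by Little's law, the mean response time R_i of accepted events satisfy

```latex
T_i = \lambda_i \,\bigl(1 - P_{\mathrm{loss},i}\bigr),
\qquad
R_i = \frac{E[N_i]}{T_i}
```

where E[N_i] is the mean total number of type-i events in the system. Given any three of the quantities for a class, the fourth follows, which is useful for cross-checking solver output.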
12 Performance Analysis using Stochastic Reward Nets
[Figure: SRN notation legend showing place, token, transition, immediate transition, and inhibitor arc]
- Stochastic Reward Nets (SRNs) are an extension of Generalized Stochastic Petri Nets (GSPNs), which are an extension of Petri Nets.
- Extend the modeling power of GSPNs by allowing
  - Guard functions
  - Marking-dependent arc multiplicities
  - General transition probabilities
  - Reward rates at the net level
- Allow model specification at a level closer to intuition.
- Solved using tools such as SPNP (Stochastic Petri Net Package).
13 Modeling the Reactor using SRN (1/2)
[Figure: SRN model of the Reactor, annotated with event arrival, drop events on overflow, service queue, prioritized service, servicing the event, and service completion]
- Models arrivals, queuing, and prioritized service of events.
- Transitions A1 and A2: event arrivals.
- Places B1 and B2: buffers/queues.
- Places S1 and S2: service of the events.
- Transitions Sr1 and Sr2: service completions.
- Inhibitor arcs: from place B1 to transition A1 with multiplicity N1 (likewise B2, A2, N2)
  - Prevents firing of transition A1 when there are N1 tokens in place B1.
- Inhibitor arc from place S1 to transition Sr2
  - Offers prioritized service to an event of type one over an event of type two.
  - Prevents firing of transition Sr2 when there is a token in place S1.
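The inhibitor-arc semantics above can be made concrete with a small enabling check. This is a sketch of standard Petri net enabling rules, not SPNP input; the ordinary arcs shown (A1 depositing into B1, Sr2 consuming from S2) are my reading of the model, since the figure is not available, and the value of N1 is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    inputs: dict = field(default_factory=dict)      # place -> tokens consumed
    outputs: dict = field(default_factory=dict)     # place -> tokens produced
    inhibitors: dict = field(default_factory=dict)  # place -> arc multiplicity


def is_enabled(t, marking):
    """Enabled if every input place holds enough tokens and every inhibitor
    place holds fewer tokens than the inhibitor arc's multiplicity."""
    has_inputs = all(marking.get(p, 0) >= n for p, n in t.inputs.items())
    not_inhibited = all(marking.get(p, 0) < n for p, n in t.inhibitors.items())
    return has_inputs and not_inhibited


def fire(t, marking):
    """Return the marking reached by firing t; the input marking is unchanged."""
    if not is_enabled(t, marking):
        raise ValueError(f"transition {t.name} is not enabled")
    new = dict(marking)
    for p, n in t.inputs.items():
        new[p] = new.get(p, 0) - n
    for p, n in t.outputs.items():
        new[p] = new.get(p, 0) + n
    return new


N1 = 2  # hypothetical buffer capacity for type-one events

# Arrival of a type-one event: blocked once B1 holds N1 tokens.
A1 = Transition("A1", outputs={"B1": 1}, inhibitors={"B1": N1})

# Type-two service completion: blocked while a type-one event is in S1.
Sr2 = Transition("Sr2", inputs={"S2": 1}, inhibitors={"S1": 1})
```

For example, firing `A1` twice from an empty marking fills `B1`, after which `is_enabled(A1, marking)` is false, mirroring the overflow behaviour described in the bullets.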
14 Modeling the Reactor using SRN (2/2)
- Process of taking successive snapshots
- Reactor waits for new events when currently enabled events are handled
- Sn1 enabled: token in StSnpSht, tokens in B1, no token in S1.
- Sn2 enabled: token in StSnpSht, tokens in B2, no token in S2.
- T_SrvSnpSht enabled: token in S1 and/or S2.
- T_EndSnpSht enabled: no token in S1 and S2.
- Sn1 and Sn2 have same priority
- T_SrvSnpSht has lower priority than Sn1 and Sn2
15 VR SRN Performance Estimates
- SRN model solved using Stochastic Petri Net Package (SPNP) to obtain estimates of performance metrics.
- Parameter values: λ1 = 0.5/sec, λ2 = 0.5/sec, μ1 = 2.0/sec, μ2 = 2.0/sec.
- Two cases: N1 = N2 = 1, and N1 = N2 = 5.
- Observations
  - Probability of event loss is higher when the buffer space is 1
  - Total number of events of type two is higher than type one.
  - Events of type two stay in the system longer than events of type one.
  - May degrade the response time of event requests for class 2 customers compared to requests from class 1 customers
16 VR SRN Sensitivity Analysis
- Analyze the sensitivity of performance metrics to variations in input parameter values.
- Vary λ1 from 0.5/sec to 2.0/sec.
- Values of other parameters: λ2 = 0.5/sec, μ1 = 2.0/sec, μ2 = 2.0/sec, N1 = N2 = 5.
- Compute performance measures for each one of the input values.
- Observations
  - Throughput of event requests from customer class 1 increases, but rate of increase declines.
  - Throughput of event requests from customer class 2 remains unchanged.
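The shape of the first observation can be reproduced with a much simpler model. The sketch below is not the SRN: it treats class 1 in isolation as a single-class M/M/1/K queue, ignoring class 2 and prioritization, and takes K as the total capacity including the event in service. The function names and the choice K = 5 are assumptions made for illustration.

```python
def mm1k_loss(lam, mu, k):
    """Blocking probability of an M/M/1/K queue with total capacity k."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (k + 1)          # limiting value when rho == 1
    return (1.0 - rho) * rho**k / (1.0 - rho**(k + 1))


def throughput_sweep(lams, mu, k):
    """Accepted-event throughput lam * (1 - P_loss) for each arrival rate."""
    return [lam * (1.0 - mm1k_loss(lam, mu, k)) for lam in lams]


# Sweep the arrival rate over the range used in the sensitivity analysis.
rates = [0.5, 1.0, 1.5, 2.0]
sweep = throughput_sweep(rates, mu=2.0, k=5)

# Successive throughput gains; each gain is smaller than the previous one.
gains = [b - a for a, b in zip(sweep, sweep[1:])]
```

Throughput keeps rising with the arrival rate, but each increment is smaller than the last because a growing share of arrivals finds the buffer full.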
17 Middleware Pattern Simulations in OMNeT++
- OMNeT++ is a discrete event simulator for networked systems
- Developers write C++ code for simulation
- www.omnetpp.org
18 The Simulation Model for Reactor
[Figure: simulation model components: Event Generator, Reactor, Synchronous Event Demultiplexer, Event Handlers with queues, Statistics Collector]
19 Addressing Middleware Variability Challenges
Although middleware provides reusable building blocks that capture commonalities, these blocks and their compositions incur variabilities that impact performance in significant ways.
- Compositional Variability
  - Incurred due to variations in the compositions of these building blocks
  - Need to address compatibility in the compositions and individual configurations
  - Dictated by needs of the domain
  - E.g., Leader-Follower makes no sense in a single-threaded Reactor
- Per-Block Configuration Variability
  - Incurred due to variations in implementations and configurations for a patterns-based building block
  - E.g., single-threaded versus thread-pool based reactor implementation dimension that crosscuts the event demultiplexing strategy (e.g., select, poll, WaitForMultipleObjects)
20 Automation Goals for What if Analysis
Applying design-time performance analysis techniques to estimate the impact of variability in middleware-based DRE systems
- Build and validate performance models for invariant parts of middleware building blocks
- Weaving of variability concerns manifested in a building block into the performance models
- Compose and validate performance models of building blocks mirroring the anticipated software design of DRE systems
- Estimate end-to-end performance of composed system
- Iterate until design meets performance requirements
[Figure: variability is woven into the invariant model of a pattern to yield refined models of the pattern, which are composed into the composed system]
21 Automating and Scaling the What if Process
- Model-driven generative technologies
- Developed the SRN Modeling Language (SRNML) in GME
- Applied C-SAW framework (from Univ. of Alabama, Birmingham) for model scalability
R&D supported by NSF CSR-SMA Program in collaboration with Dr. Jeff Gray (UAB) and Dr. Swapna Gokhale (UConn)
22 Analyzing Impact of Individual Concerns
Engineering Mechanics (Statics and Dynamics) for analyzing impact of concerns?
- Borrow concepts from physical systems to analyze the impact of individual concerns on end-to-end system
- Method of joints, method of sections, free body diagrams, equilibrium conditions
23 Engineering Mechanics for DRE Systems
- A concern is viewed as a force
- Challenges
  - Directionality: are concerns vectors?
  - Rigidity: are assemblies rigid or deformable?
  - Force distribution: does a concern have components along Cartesian axes?
  - Well-defined structures: do software components have properties like trusses?
  - Second order effects: transient effects showing up elsewhere
  - Notion of friction: these are probably the capacities of resources
24 (2) Deployment-time Intelligence
- Near optimal deployment planning decisions
- Specialized middleware stacks
- Students involved
- Arvind Krishna (graduated), Jaiganesh
Balasubramanian, Gan Deng, Dimple Kaul, Arundhati
Kogekar, Amogh Kavimandan
Work partly supported by DARPA ARMS Program, PI
on subcontracts from Lockheed Martin ATL
25 Deployment Challenges
- Service workloads and resource capacity issues: service placement depends on workloads and available resources
- Component accessibility patterns: component survivability depends on its sharing degree
- Differentiated levels of service: affects resource provisioning and survivability strategies
- Service failover: different failover possibilities, e.g., as a whole or part assembly or one component at a time
- Resource sharing: increases the risk of component(s) requiring proactive survivability strategy
- No one-size-fits-all dependability strategy: cannot dictate one FT strategy on all services
26 Service Placement Problem
- A resource configuration is a tuple RC = (C, D, HC, EC) where
  - C is a set of computation nodes, each attributed by
    - PI(c): processing index (capacity)
    - MI(c): memory index
    - RI(c): reliability index
  - D is a set of data access units of types in Ai, Sj
  - HC: C → ℘(D) is a map associating each c in C with a set of data access units
  - EC ⊆ C × C is a set of comm. links, each attributed by
    - BI(e): bandwidth index
    - RI(e): reliability index
- System performance can be measured in a variety of ways. Considering a task assignment TA: T → C
  - Resource utilization for processing is defined as the average of all task processing utilization, given as [equation missing in source]
  - Memory utilization MU(TA) and link utilization LU(TA) can be defined similarly
  - System utilization factor: the weighted sum percentage of utilizing the system resources
- Reliability is trickier to measure. In general, the reliability of a given computation string is the product of the reliability indices of the underlying nodes and communication edges.
- The reliability factor RF(TA) for a given task assignment, TA, depends on
  - The reliability of all its computation strings.
  - The group reliability of the underlying nodes (taking into account their relative distances).
  - The resource utilization of the system. The more the system hardware is utilized, the less reliable it is.
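The computation-string reliability described above can be sketched directly from the definitions. This is an illustrative data model only: the class and function names are hypothetical, D and HC are omitted for brevity, the reliability indices are assumed to lie in [0, 1], and a string is assumed to visit distinct nodes connected by a link between each consecutive pair.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Node:
    name: str
    pi: float  # PI(c): processing index (capacity)
    mi: float  # MI(c): memory index
    ri: float  # RI(c): reliability index, assumed to lie in [0, 1]


@dataclass(frozen=True)
class Link:
    bi: float  # BI(e): bandwidth index
    ri: float  # RI(e): reliability index, assumed to lie in [0, 1]


@dataclass
class ResourceConfig:
    nodes: dict  # node name -> Node            (the set C)
    links: dict  # (src name, dst name) -> Link (the set EC)


def string_reliability(rc, path):
    """Product of the reliability indices of the nodes on the path and of
    the links joining consecutive nodes, per the definition in the slide."""
    node_part = prod(rc.nodes[name].ri for name in path)
    link_part = prod(rc.links[(a, b)].ri for a, b in zip(path, path[1:]))
    return node_part * link_part


# Hypothetical two-node configuration with one link.
rc = ResourceConfig(
    nodes={"n1": Node("n1", pi=100.0, mi=512.0, ri=0.99),
           "n2": Node("n2", pi=80.0, mi=256.0, ri=0.98)},
    links={("n1", "n2"): Link(bi=10.0, ri=0.995)},
)
reliability = string_reliability(rc, ["n1", "n2"])
```

Because the result is a product of values no greater than one, every additional node or link a string traverses can only lower its reliability, which is one reason placement and reliability have to be considered together.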
27 Specializations via Generative Programming
- GME-based POSAML language for POSA2 pattern language
- Generative programming to synthesize FOCUS and AspectC++ rules
- Synthesize specialized middleware stacks for distributed deployment of operational strings.
28 Run-time QoS-aware Mechanisms
- Focus on Autonomic Mechanisms
- Survivability and fault tolerance
- Students involved
- Jaiganesh Balasubramanian, Sumant Tambe, Jules
White, Nishanth Shankaran
Work supported by DARPA ARMS Program, PI on
subcontracts from Lockheed Martin ATL, BBN
Technologies, Telcordia
29 Distributed Virtual Container Approach
- Virtual Container Concept for Component M/W
  - Based on a virtualization idea
  - Spans boundaries across all the replicas, which could be placed on different physical nodes
  - Provides a single point for resource provisioning and component programming
  - Seamless environment for configuring FT, LB, online swapping
  - Handles fine-grained checkpointing across all the replicas in virtual container
  - Reliable multicast state synchronization confined to a virtual container
  - Maintains information about how the replicas are connected to the external component assemblies
- Salient features
  - Provides an operating context for the components/assemblies requiring QoS
  - Relieves programmer from having to configure the middleware for QoS support
  - Clients are oblivious to replication
  - Normal container programming model
  - Middleware hides the virtualization details
[Figure: virtual container spanning primary and secondary replicas]
30 Open R&D Issues in FT for Comp MW
[Figure: virtual container spanning a primary assembly and a secondary assembly]
- What types of faults to consider? Single versus multiple, software versus hardware, or both?
- Can we detect port, component, container, node application and assembly failures?
- Do we allow recovery at port level or component level or container level or node application level?
- Does assembly recovery imply wholesale recovery of entire assembly?
- What does semi-active mean for assemblies?
- Does Distributed FT container span entire assemblies?
31 Run-time QoS Survivability Mechanisms
- A configurable approach to survivability including micro- (infrastructure) and macro- (assembly and operational string) level strategies
- Micro-level strategies monitor infrastructure state to make proactive decisions at
  - Component level (swapping and migration)
  - Middleware level (configurations)
  - Component Server Level (process resource allocations)
  - Node level (multiple components)
- Macro-level strategies monitor assembly health to make failover decisions
  - Failover based on type of failover unit
  - Affects service placement decisions
  - May involve load balancing
  - State synchronization issues
  - Replication styles (hidden by FT strategies)
- Initial prototype developed using Component-Integrated ACE ORB (CIAO) and Deployment and Configuration Engine (DAnCE) (www.dre.vanderbilt.edu)
32 R&D Mentoring and Advising
- Primary Advisees
  - Amogh Kavimandan (PhD): hybrid simulation technologies, D&C for heterogeneous systems, pervasive systems, BEEP
  - James Hill (PhD): emulation technologies
  - Sumant Tambe (PhD): Variability management at modeling level, FT
  - Dimple Kaul (MS): MDD/AOSD techniques for middleware specialization, POSAML
  - Arundhati Kogekar (MS): MDD techniques for analysis and simulations, POSAML
- Co-Advisees
  - Arvind Krishna (PhD, Graduated): Middleware specializations (FOCUS)
  - Jaiganesh Balasubramanian (PhD): Survivable systems, Deployment Planning, SwapCIAO, Load Balancing
  - Krishnakumar Balasubramanian (PhD): MDD/CoSMIC, assembly optimizations
  - Gan Deng (PhD): D&C framework, ReDAC
  - Jules White (PhD): Autonomic Computing, BEEP
  - Nishanth Shankaran (PhD): Control Theory in Middleware
- Undergraduates
  - Matthew Heineke: Network data collection from routers for traffic analysis (for IAB)
33 Research Summary
R&D in new, holistic approaches to end-to-end QoS management in services-enabled distributed real-time embedded systems
Research Challenge | Research Approach | Benefits
Managing problem space variability | Model-driven generative approach to separation of concerns | Enhance the state-of-art in MDD and AOSD technologies
Design-time What-if analysis using generative prog | Variety of analysis techniques including non traditional mechanisms | Generative technologies for automated analysis; Application of Engineering Mechanics
Deployment-time intelligent decisions | New applications of constraints optimization theory; Middleware specializations | Near optimal deployment; Specialized middleware stacks
Run-time Mechanisms | Multilevel, proactive QoS mgmt schemes; Virtualization ideas | Largely autonomic; Survivable systems