Property Assurance in Middleware for Distributed Real-Time Systems*


1
Property Assurance in Middleware for Distributed Real-Time Systems

Christopher Gill
cdgill_at_cse.wustl.edu
Department of Computer Science and Engineering
Washington University, St. Louis, MO

Seminar at the Coordinated Sciences Laboratory
University of Illinois Urbana-Champaign
Thursday, May 24, 2007
Research supported in part by NSF grants CCF-0615341 (EHS) and CCF-0448562 (CAREER)
2
A Motivating Example: Real-Time Image Transmission

Chains of end-to-end tasks
E.g., compress, transmit,
decompress, analyze, and then
display images
Property assurance is crucial
Soft real-time constraints
Deadlock freedom
Many applications have similar needs
Is correct reuse of middleware possible?

[Figure: image transmission from Camera to Console]
Gill et al., Integrated Adaptive QoS Management in Middleware: An Empirical Case Study (RTAS '04)
Wang et al., CAMRIT: Control-based Adaptive Middleware for Real-time Image Transmission (RTAS '04)
3
Middleware for Distributed Real-Time Systems

Layered stacks of mechanisms
thread, port, socket, timer
reactor, monitor
client, server, gateway, ORB
Task chains span multiple hosts
may be initiated asynchronously
Limited host resources
used by multiple task chains

A Distributed System Software Stack
4
One Widely Used Mechanism: Reactor
[Figure: Application, Reactor, and Socket layers. On data arrival, select() on the read handle set returns, the reactor calls handle_input() on the registered event handler, and the handler calls read() on the socket]
Reactor abstraction has many variations
select() vs. WaitForMultipleObjects()
single thread vs. thread pool
unsynchronized vs. mutex vs. readers-writer
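
As a minimal sketch of the reactor interaction described above, the following Python example registers an event handler for a socket, waits for data arrival, and then upcalls handle_input(), which reads from the socket. It uses Python's standard selectors module rather than ACE; the names Reactor, RecordingHandler, and demo() are hypothetical and cover only the single-threaded, unsynchronized variant.

```python
import selectors
import socket


class Reactor:
    """Single-threaded reactor: demultiplex ready handles, then upcall handlers."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def register_handler(self, sock, handler):
        # Add the socket to the read handle set, remembering its handler.
        self._selector.register(sock, selectors.EVENT_READ, handler)

    def remove_handler(self, sock):
        self._selector.unregister(sock)

    def handle_events(self, timeout=None):
        # Block until at least one handle is readable or the timeout expires.
        ready = self._selector.select(timeout)
        for key, _mask in ready:
            key.data.handle_input(key.fileobj)  # upcall into the event handler
        return len(ready)


class RecordingHandler:
    """Event handler that reads whatever arrived and stores it."""

    def __init__(self):
        self.received = []

    def handle_input(self, sock):
        self.received.append(sock.recv(4096))


def demo():
    sender, receiver = socket.socketpair()
    reactor = Reactor()
    handler = RecordingHandler()
    reactor.register_handler(receiver, handler)
    sender.sendall(b"image-chunk")            # data arrival
    dispatched = reactor.handle_events(1.0)   # select(), then handle_input()
    reactor.remove_handler(receiver)
    sender.close()
    receiver.close()
    return dispatched, handler.received
```

The thread-pool and synchronized variations listed on the slide would change how handle_events() is shared among threads, which this sketch does not attempt.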
5
An Illustration of Inherent Complexity

Wait-on-Reactor
Handler waits in reactor for reply
E.g., set read_mask, call select() again
Other requests can be processed while replies are
still pending
For efficiency, call stack remembers handler
continuation
Intervening requests may delay reply processing
(LIFO semantics)

Wait-on-Connection
Handler waits on socket connection for the reply
Blocking call to recv()
One less thread listening on the Reactor for new
requests
Exclusive handling of the reply
However, may cause deadlocks if reactor
upcalls are nested

Two essential research questions:
How can we represent and analyze such diverse behavior?
How can we enforce properties that span hosts, efficiently?
6
Essential Technical Objectives

A principled basis for middleware verification
Model each mechanism's inherent complexity accurately
Remove unnecessary complexity through abstraction
Compose models tractably and with high fidelity
to system itself
New protocols and mechanisms for property
enforcement
Exploit call graph structure and other
domain-specific information
Develop efficient local mechanisms for end-to-end
enforcement
Design frameworks to support entire families of
related protocols
Practical extensions to preemption and control
semantics
Leverage existing theory to address part of the
problem space
Identify and exploit domain-specific problem
structure
Develop decidable and tractable representations
of other behavior

7
Model Architecture in IF for ACE

Network/OS layer: inter-process communication abstractions
Middleware layer: ACE pattern-oriented abstractions
Application layer: application-specific semantics within ACE event handlers

8
Modeling Threads

Challenge
No native constructs for threads in model
checkers that currently support timed automata
Option 1: model all thread actions as a single automaton
Suitable for high level modeling of application semantics
Option 2: model a thread as multiple interacting automata
Interactions model the flow of control
This option better abstracts the nuances of
ACE-level mechanisms

[Figure: automata Foo and Bar interacting through input and output signals method_request and method_result]
9
Modeling Thread Scheduling Semantics (1/4)

Easy to achieve with one automaton per thread
Specify to model checker directly
E.g., using IF priority rules
More difficult with more than one automaton per
thread
Thread of control spans interactions among
automata
Need to express thread scheduling in terms of
execution control primitives provided by the
model checker

[Figure labels: Activity1, Activity2, Update Display, Control Flow Rate; 1 automaton per thread]
prio_rule: pid1 < pid2 if pid1 instanceof Activity1 and pid2 instanceof Activity2
[Figure: automata Foo and Bar interacting through input and output signals m_req and m_result]
10
Modeling Thread Scheduling Semantics (2/4)

Solution
Introduce a thread id that is propagated along
automata interactions
Thread id acts as an index to a storage area
which holds each thread's scheduling parameters

Resulting Behavior
[Figure: automata Foo1, Bar1, Foo2, Bar2 executed by threads with ids 1 and 2; priority labels Prio 5 and Prio 8]
thread_schedule: pid1 < pid2 if pid1 instanceof Foo1 and pid2 instanceof Bar1
and ((Foo1pid1).threadid <> (Bar1pid2).threadid
and (Thread((Foo1pid1).threadid)).prio < (Thread((Bar1pid2).threadid)).prio)
Hint to the model checker
Give higher preference to the automaton whose
thread (pointed to by thread id) has higher
priority
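
The hint above can be sketched outside the model checker as follows. This is not IF syntax: it is a small Python illustration in which each automaton carries a thread id, the id indexes a table of scheduling parameters, and only the enabled automata whose thread is most urgent remain candidates. The names Automaton and pick_enabled are hypothetical, and the convention that a numerically smaller prio value is more urgent is an assumption that the slide does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Automaton:
    name: str
    thread_id: int  # propagated along automata interactions


def pick_enabled(enabled, thread_prio):
    """Keep only the enabled automata whose thread is the most urgent.

    thread_prio maps thread id -> priority value (the "storage area").
    Assumption: a smaller value means a more urgent thread.
    """
    best = min(thread_prio[a.thread_id] for a in enabled)
    return [a for a in enabled if thread_prio[a.thread_id] == best]


def demo():
    thread_prio = {1: 5, 2: 8}  # priority labels taken from the slide figure
    enabled = [Automaton("Foo1", 1), Automaton("Bar1", 2)]
    return [a.name for a in pick_enabled(enabled, thread_prio)]
```

If two enabled automata belong to threads with equal priority, both stay in the result, which is the case the next slides address.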
11
Modeling Thread Scheduling Semantics (3/4)

What if two threads have the same priority?
In an actual implementation, run-to-completion
(SCHED_FIFO) may control the possible
interleavings
How can we model run-to-completion?

[Figure: tree of possible interleavings of Foo1, Foo2, Foo3 and Bar1, Bar2, Bar3 for threads of equal priority]
How do we prune out this space?
12
Modeling Thread Scheduling Semantics (4/4)

Solution
Record id of currently executing thread
Update when executing actions in each automaton

[Figure: example execution of Foo1, Foo2, Foo3 and Bar1, Bar2, Bar3, annotated with the value of Current (nil, 1, or 2) after each action]
Hint to the model checker
Give higher preference to the automaton whose thread is the currently running thread.
Non-deterministic choice if Current is nil
13
Problem: Over-constraining Concurrency
Hint to the model checker
Give higher preference to the automaton whose thread is the currently running thread.
Non-deterministic choice if Current is nil
[Figure: the same execution, continued past the point where time progresses while Current is still 2]
After time progresses with Current = 2, Bar3 is always chosen to run, never Foo3
14
Solution: Idle Catcher Automaton

Key idea: lowest priority catcher runs when all others are blocked
E.g., catcher thread in middleware group scheduling (RTAS '05)
Here, idle catcher automaton
runs when all other automata are idle (not enabled), but before time progresses
Resets value of current id to nil

[Figure: Foo3 and Bar3 are blocked with Current = 2; the idle catcher runs and sets Current = nil before time progresses]
Foo3 or Bar3 could be chosen to run.
Over-constraining eliminated
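
The run-to-completion hint and the idle catcher can be sketched together. The Python below is an illustration, not the IF model: candidates() returns the set of automata a checker would still explore given Current, and idle_catcher() resets Current when nothing is enabled. The function names, and the assignment of Foo3 to thread 1 and Bar3 to thread 2, are assumptions made for the example.

```python
def candidates(enabled, current):
    """Automata the checker may run next, as (name, thread_id) pairs.

    Prefer automata of the currently running thread; if Current is nil
    (None) or that thread has nothing enabled, any enabled automaton may run.
    """
    if current is not None:
        preferred = [(name, tid) for name, tid in enabled if tid == current]
        if preferred:
            return preferred
    return list(enabled)


def idle_catcher(enabled, current):
    """Runs when no other automaton is enabled, before time progresses."""
    return None if not enabled else current


def demo():
    after_time_progress = [("Foo3", 1), ("Bar3", 2)]
    # Without the idle catcher, Current is still 2 after time progresses.
    stale = candidates(after_time_progress, current=2)
    # With it, Current is reset while everything is blocked.
    current = idle_catcher(enabled=[], current=2)
    fresh = candidates(after_time_progress, current)
    return stale, fresh
```

In demo(), the stale case leaves only Bar3 as a candidate, while the reset case leaves both Foo3 and Bar3, matching the over-constraining problem and its fix.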
15
Problem: Tractability
[Figure: scale of model checking wait times: right away, in a minute, get coffee, go for an espresso, maybe tomorrow?]

Model checking can suffer from state space
explosion
State space reduction, live variable analysis can
help
But even good model checkers don't fully solve
this
Need to think of modeling as a design issue, too
Does the model represent what it needs to
represent?
Can the model be re-factored to help the checker?
Can domain specific information help avoid
unnecessary checking?

16
Optimization 1: Leader Election

Leader/Followers concurrency
Threads in a reactor thread pool take turns
waiting on the reactor
One thread gets the token to access the reactor: the leader
All other threads wait for the token: the followers
It does not matter which thread gets selected as leader in a thread pool
The model checker is not aware of these domain-specific semantics
For the BASIC-P protocol example, this saved a factor of 50 in state space and a factor of 20 in time

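[Figure: when the token to access the reactor is available, any of T1, T2, T3 may take it; the branches that differ only in which thread becomes leader are marked]
Prune this out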
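
To make the pruning idea concrete, here is a toy state-space exploration in Python. It is only a sketch under simplifying assumptions (identical threads, one event handled at a time, a state made of the leader's id and the number of events left); explore() and its parameters are hypothetical names, and the counts it produces are unrelated to the factor of 50 reported for BASIC-P.

```python
from collections import deque


def explore(num_threads, num_events, prune_symmetric_leaders):
    """Count reachable states of a toy leader/followers model.

    State: (leader, remaining). leader is None while the token is free,
    otherwise the id of the thread that holds it.
    """
    start = (None, num_events)
    seen = {start}
    frontier = deque([start])
    while frontier:
        leader, remaining = frontier.popleft()
        if leader is None:
            if remaining == 0:
                continue  # nothing left to handle
            # Any follower may take the token; with pruning, explore only one
            # representative because the threads are interchangeable.
            choices = [0] if prune_symmetric_leaders else range(num_threads)
            successors = [(t, remaining) for t in choices]
        else:
            # The leader handles one event and releases the token.
            successors = [(None, remaining - 1)]
        for state in successors:
            if state not in seen:
                seen.add(state)
                frontier.append(state)
    return len(seen)
```

For three threads and two events, explore(3, 2, False) visits 9 states and explore(3, 2, True) visits 5; the gap widens as the thread pool grows.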
17
Optimization 2: System Initialization

Similar idea, but different technique
Iff ok to establish initial object relations in
any order, can optimize away
E.g., 2 server automata, each of which creates a
reactor automaton
Useful when modeling object systems in model
checkers with dynamic automaton creation
capability (e.g., IF)
State space reduction depends on application
Factor of 250 for a deadlock scenario with 2
reactors and 3 threads in each reactor

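[Figure: servers S1 and S2 each create a reactor R; the order "S1 creates R, then S2 creates R" and the order "S2 creates R, then S1 creates R" lead to the same configuration, so one ordering is marked]
Prune this out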
18
Verification of a Real-Time Gateway
[Figure: Gateway connecting Supplier1 and Supplier2 to Consumer1, Consumer2, Consumer3, and Consumer4]

An exemplar of many realistic ACE-based
applications
We modified the Gateway example to add new
capabilities
E.g., Real time, Reliability, Control-Push-Data-Pull
Value added service in Gateway before forwarding to a consumer
E.g., using consumer specific information to customize data stream
Different design, configuration choices become
important
E.g., number of threads, dispatch lanes, reply
wait strategies

19
Model Checking/Experiment Configuration
[Figure: experiment configuration. Suppliers S1 and S2 send through the Gateway to consumers C1, C2, C3, and C4; annotations give the period, the relative deadline, and the value-added execution (and its cost) for each stream]

Gateway is theoretically schedulable under RMA
Utilization: 80%
Schedulable utilization: 100% for harmonic periods
Assumption: messages from the 50ms supplier are given higher preference than messages from the 100ms supplier
ACE models let us verify scheduling enforcement
IN THE ACTUAL SYSTEM IMPLEMENTATION

      Deadline   Exec time
C1    100ms      20ms
C2    100ms      20ms
C3    50ms       10ms
C4    50ms       10ms
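
The 80% figure can be checked with a few lines of Python. This is a sketch that takes each consumer's period to equal its deadline (the later timeline slide lists them as equal) and uses exact fractions; TASKS, utilization(), and periods_are_harmonic() are hypothetical names.

```python
from fractions import Fraction

# name -> (period in ms, execution time in ms); period assumed equal to deadline
TASKS = {"C1": (100, 20), "C2": (100, 20), "C3": (50, 10), "C4": (50, 10)}


def utilization(tasks):
    """Total processor utilization: sum of exec_time / period."""
    return sum(Fraction(cost, period) for period, cost in tasks.values())


def periods_are_harmonic(tasks):
    """True if every period divides each larger period."""
    periods = sorted({period for period, _ in tasks.values()})
    return all(longer % shorter == 0
               for shorter, longer in zip(periods, periods[1:]))
```

Here utilization(TASKS) is 4/5, that is 20/100 + 20/100 + 10/50 + 10/50, and the periods 50 and 100 are harmonic, so the task set fits under the 100% bound quoted on the slide.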
20
Real-time Gateway: Single Thread
[Figure: a Supplier and a Consumer connected through the Gateway, which contains a Reactor, SupplierHandlers, and ConsumerHandlers]

Single reactor thread dispatches incoming events
I/O (reactor) thread same as dispatch thread
I/O thread responsible for value added service

21
Real-time Gateway: Dispatch Lanes
[Figure: a Supplier and a Consumer connected through the Gateway, which contains a Reactor, SupplierHandlers, and ConsumerHandlers]

Single reactor thread again dispatches events to
gateway handlers
I/O (reactor) thread puts message into dispatch
lanes
Lane threads perform value added service,
dispatch to consumers
DO QUEUES HELP OR HURT TIMING PREDICTABILITY?

22
Model/Actual Traces for Real-time Gateway
[Figure: model and actual execution traces over time (0 to 100 ms) for the single threaded Gateway and for the Gateway with dispatch lanes; legend: execution in the context of lane threads, execution in the context of reactor thread; arrivals from S1 and S2 are marked]
Deadline miss for Consumer4 because of blocking delay at reactor

Expected execution timeline with RMS
      Period   Exec time   Deadline
C1    100ms    20ms        100ms
C2    100ms    20ms        100ms
C3    50ms     10ms        50ms
C4    50ms     10ms        50ms
[Figure: expected RMS timeline from 0 to 100 ms. C1, C2, C3, and C4 are released together, and C3 and C4 are released again later; execution order C3, C4, C1, C2, C3, C4, C2]
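
The expected timeline can be reproduced with a small fixed-priority simulation. The sketch below steps in 1 ms ticks, gives priority to the shorter period, and breaks ties by task name; the tie-break rule, the simultaneous release at time 0, and the name simulate_rms are assumptions made for the illustration.

```python
def simulate_rms(tasks, horizon):
    """Preemptive rate-monotonic schedule in 1 ms steps.

    tasks: name -> (period_ms, exec_ms), with deadline equal to period.
    Returns (timeline, missed): timeline is a list of [name, start, end]
    segments, missed lists (name, time) for jobs unfinished at a deadline.
    """
    order = sorted(tasks, key=lambda n: (tasks[n][0], n))  # shorter period first
    remaining = {n: 0 for n in tasks}
    timeline, missed = [], []
    for t in range(horizon + 1):
        for n in order:
            period, cost = tasks[n]
            if t % period == 0:
                if remaining[n] > 0:
                    missed.append((n, t))  # previous job overran its deadline
                if t < horizon:
                    remaining[n] = cost    # release the next job
        ready = [n for n in order if remaining[n] > 0]
        if t == horizon or not ready:
            continue
        running = ready[0]                 # highest-priority ready task
        remaining[running] -= 1
        if timeline and timeline[-1][0] == running and timeline[-1][2] == t:
            timeline[-1][2] = t + 1        # extend the current segment
        else:
            timeline.append([running, t, t + 1])
    return timeline, missed


TASKS = {"C1": (100, 20), "C2": (100, 20), "C3": (50, 10), "C4": (50, 10)}
```

With these parameters, simulate_rms(TASKS, 100) runs C3, C4, C1, C2, then C3, C4 again after the release at 50 ms, and finally the rest of C2, finishing at 80 ms with no deadline misses. That order agrees with the sequence shown in the expected timeline.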
23
Essential Technical Objectives

A principled basis for middleware verification
Model each mechanism's inherent complexity accurately
Remove unnecessary complexity through abstraction
Compose models tractably and with high fidelity
to system itself
New protocols and mechanisms for property
enforcement
Exploit call graph structure and other
domain-specific information
Develop efficient local mechanisms for end-to-end
enforcement
Design frameworks to support entire families of
related protocols
Practical extensions to preemption and control
semantics
Leverage existing theory to address part of the
problem space
Identify and exploit domain-specific problem
structure
Develop decidable and tractable representations
of other behavior

24
Properties, Protocols, and Call Graphs

Many real-time systems have static call
graphs
even distributed ones
helps feasibility analysis
intuitive to program
Exploit this to design efficient protocols
pre-parse graph and assign static
attributes to its nodes
Resource dependence, prioritization
maintain local state about use
enforce properties according to (static)
attributes and local state
Guard: a(f_i) < t_Rj
Decrement, increment t_Rj

[Figure: call graph with nodes f1, f2, f3, f4 placed on Reactor 1 and Reactor 2; t_R1 = 2, t_R2 = 1; a(f1) = 1, a(f2) = 0, a(f3) = 0, a(f4) = 0]
Subramonian et al., HICSS '04
Sanchez et al., FORTE '05, IPDPS '06, EMSOFT '06, OPODIS '06
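
The guard and the decrement/increment steps can be sketched as a small Python class. This is an illustration of the rule as stated on the slide, not the published implementation: BasicPReactor, try_enter(), and leave() are hypothetical names, and placing f1 and f4 on Reactor 1 is an assumption read from the figure layout.

```python
class BasicPReactor:
    """Local state of one reactor under a BASIC-P style guard."""

    def __init__(self, threads):
        self.available = threads  # t_R: threads not currently in an upcall

    def try_enter(self, annotation):
        """Admit an upcall for a node with static annotation a(f)."""
        if annotation < self.available:  # Guard: a(f) < t_R
            self.available -= 1          # decrement t_R on entry
            return True
        return False                     # the upcall has to be delayed

    def leave(self):
        self.available += 1              # increment t_R on exit


def demo():
    reactor1 = BasicPReactor(threads=2)  # t_R1 = 2 from the figure
    a_f1, a_f4 = 1, 0                    # a(f1) = 1, a(f4) = 0 from the figure
    results = [
        reactor1.try_enter(a_f1),  # 1 < 2: admitted, t_R1 becomes 1
        reactor1.try_enter(a_f1),  # 1 < 1 fails: delayed
        reactor1.try_enter(a_f4),  # 0 < 1: admitted, t_R1 becomes 0
    ]
    reactor1.leave()
    reactor1.leave()
    return results, reactor1.available
```

The second f1 request is held back even though a thread is free, which keeps that thread available for a request with annotation 0 such as f4.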
25
Property Enforcement Mechanisms

Protocol enforcement has a common structure
pre-invocation method
invocation up-call
post-invocation method
Specialized strategies implement each protocol
BASIC-P: annotation, variable
k-EFFICIENT-P: annotation, array
LIVE-P: annotation, balanced binary tree
All of these protocols work by delaying upcalls
Constitutes a side effect that model checker
should evaluate
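
The common structure above (pre-invocation, up-call, post-invocation) with a pluggable strategy can be sketched as follows. The dispatch() function, the AdmitWhenTokensRemain strategy, and the log are hypothetical and stand in for whichever protocol is configured; the sketch only shows where a delayed upcall arises.

```python
from collections import deque


class AdmitWhenTokensRemain:
    """Hypothetical strategy: admit an upcall only while tokens remain."""

    def __init__(self, tokens):
        self.tokens = tokens

    def pre_invocation(self, handler):
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True

    def post_invocation(self, handler):
        self.tokens += 1


def dispatch(strategy, handler, delayed, log):
    """Run pre-invocation, the up-call, and post-invocation in order."""
    if not strategy.pre_invocation(handler):
        delayed.append(handler)  # the protocol delays this upcall
        log.append("delayed")
        return False
    log.append("pre")
    try:
        handler(log)             # invocation up-call
    finally:
        strategy.post_invocation(handler)
        log.append("post")
    return True


def demo():
    log, delayed = [], deque()

    def upcall(trace):
        trace.append("upcall")

    dispatch(AdmitWhenTokensRemain(1), upcall, delayed, log)
    dispatch(AdmitWhenTokensRemain(0), upcall, delayed, log)
    return log, len(delayed)
```

Swapping in a different strategy object changes the admission rule without touching dispatch(), and the delayed queue is the side effect that the slide says a model checker should evaluate.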

26
Timing Traces for BASIC-P Protocol
[Figure: timing traces for Flow1, Flow2, and Flow3 across reactors R1 and R2, with event handlers EH11 through EH33]
Model checking and actual timing traces show the BASIC-P protocol's regulation of threads' use of resources (no deadlock)
27
BASIC-P Blocking Delay Comparison
[Figure: blocking delay for Client2 and for Client3, actual execution vs. model execution]
28
Overhead of ACE TP/DA reactor with BASIC-P
Negligible overhead with no DA protocol
Overhead increases linearly with the number of event
handlers due to suspend/resume actions on handlers
at BASIC-P entry/exit
29
Essential Technical Objectives

A principled basis for middleware verification
Model each mechanism's inherent complexity accurately
Remove unnecessary complexity through abstraction
Compose models tractably and with high fidelity
to system itself
New protocols and mechanisms for property
enforcement
Exploit call graph structure and other
domain-specific information
Develop efficient local mechanisms for end-to-end
enforcement
Design frameworks to support entire families of
related protocols
Practical extensions to preemption and control
semantics
Leverage existing theory to address part of the
problem space
Identify and exploit domain-specific problem
structure
Develop decidable and tractable representations
of other behavior

30
Concurrency, Resources, and the System State Space
Aswathanarayana et al., RTAS05

Example concurrency architecture: processing pipelines
Thread per pipeline stage (image codec,
filtering, analysis, etc.)
Resource itself constrains timed state space
Pipelines' progress can only diverge within total
resource bound
Even off-the-shelf scheduling further limits
state space
Interleaving of resource allocation further
bounds divergence

31
Scheduling and Preemption
Joint work with Douglas Niehaus and Noah Watkins
(University of Kansas) and Venkita Subramonian
(AT&T Labs)

Compare to classic static scheduling policies
like RMS
In those approaches preemption occurs at well
defined times
Even if release times are out of phase, can tag
and bound early
Yet, more nuanced policies are often needed in
practice
E.g., fair-progress based, with variable
execution time per job
More difficult to bound, though similarly
quasi-cyclic

32
Model Composition and State Space Exploration

Problems
No common framework for checking models of timed
component interactions with preemption and
alternative concurrency semantics
Decidability/tractability of models with
preemption
Solution approaches
Develop new component-based modeling semantics
Huang-Ming Huang: component automata model,
algorithm, checker
Reduce state space using domain-specific
information
Terry Tidwell: exploit scheduling-induced
quasi-cyclic structure

33
Solution Approach: Timed ? Time Domain Automata

Exploit scheduler's enforcement of fairness
Can parameterize time and state
Likely to result in a quasi-cyclic structure

Tidwell et al., WUSTL CSE Technical Report
2007-34
34
Solution Approach: Time/Progress Bounds

Bounded fairness gives a particularly nice case
Captures behavior of fair-progress scheduled
systems
Leads to a quasi-cyclic state space, allows
analysis
Notice convergence in the limit to a common state

Tidwell et al., WUSTL CSE Technical Report
2007-34
35
A Brief Survey of Closely Related Work

Vanderbilt University and UC Irvine
GME, CoSMIC, PICML, Semantic Mapping
UC Irvine
DREAM
UC Santa Cruz
Code Aware Resource Management
UC Berkeley
Ptolemy, E-machine, Giotto
Kansas State University and University of
Nebraska
Bogor, Cadena, Kiasan

36
Concluding Remarks

Timed automata models of middleware building
blocks
Are useful to verify middleware concurrency and
timing semantics
Domain-specific model checking refinements
Help improve fidelity of the models (run to
completion, priorities)
Can achieve significant reductions in state space
Can make otherwise intractable problems checkable
Property protocols
Reduce what must be checked by provable
enforcement
Also benefit from model checking (due to side
effects)
Future work: extend modeling approach beyond
real-time concerns to cyber-physical system
concerns
Model dependence, interference, faults and
failure modes
Potential solution approach: linear hybrid
automata augmented with new domain-aware
techniques for constraining complexity