Title: Feedback Control of QoS
1Feedback Control of QoS
- Tarek Abdelzaher
- Department of Computer Science
- University of Virginia
2The Web QoS Group
- Group was formed in 1999
- Projects
- Web performance
- Deeply embedded sensor networks
- Real-time systems
- Students
- Ying Lu, Chenyang Lu, Sagnik Bhattacharya, Seejo
Sebastine
3Performance Control in Server End systems
- How to design adaptive services which meet
pre-specified performance requirements?
- How to model the effects of feedback (adaptation)
in software architectures for QoS guarantees?
- How to use software feedback to achieve
performance requirements?
4Observation
- Physical and engineering sciences have a
well-developed analytic foundation for performance
control in physical systems
- No such unified foundation exists for performance
control of software services
- The objective of this research effort is to
establish such a foundation based on control
theory and scheduling theory
5Why Control Theory?
- Successful track record in physical process
control
- Performance guarantees in the face of
uncertainty, non-linearities, time-variations,
etc.
- Does not require accurate system models
- Utilizes feedback to improve performance
- Performance of software services is governed by
queuing dynamics which may be expressed by
differential equations akin to those of physical
systems
6Feedback Control Versus Queuing Theory
- Queuing theory
- Off-line predictive analysis
- Assumptions about the arrival process
- Difficult to analyze some distributions
- Control theory
- On-line input/output difference equations
- No assumptions about the arrival process
- Utilize run-time feedback for error correction
7Feedback Control Versus Optimization
- Optimization
- Works better if the performance problem is
formulated as one of maximizing or minimizing
some metric
- Control theory
- Works well if the performance problem is one of
maintaining an invariant, or is a tradeoff
between two conflicting metrics
8Software Performance Control
- Control theory
- Robust guarantees on aggregate state and global
performance metrics (e.g., average delay, total
utilization, etc.)
- Scheduling theory
- Guarantees on microscopic performance metrics
(e.g., individual response times)
- Conditions on aggregate state
9Theoretical Elements of a QoS Control Methodology
[Figure: block diagram of the methodology. Labels: Computing Tasks, Resource Queues, Modeling, Difference Equation Models, Desired Performance, Feedback Control, Control Theory, Resource Scheduling, Scheduling Theory, Fine-grained Performance Guarantees]
10Potential Applications
- Performance-assured services (e-commerce, online
trading)
- Service differentiation
- Contractual satisfaction guarantees
- Overload control
11Example: Illustrating the Methodology
- Consider the problem of delay differentiation
between two classes of traffic in a multi-class
web server
- It is desired to control server resource
allocation to the two traffic classes such that a
desired average delay ratio is observed
12Run-time Server Modeling
- A server can be modeled as a dynamic system
- Queues give rise to difference equations
- Current performance (output) depends on a finite
history of resource allocation decisions (inputs)
- Server model
- V(m): measured relative delay in the m-th sampling
window
- U(m): resource allocation in the m-th sampling window
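
The idea that current output depends on a finite history of inputs can be illustrated with a small simulation. The sketch below assumes a linear difference equation of the form V(m) = sum_j a_j V(m-j) + sum_j b_j U(m-j); the function name `simulate_server` and the coefficient values are hypothetical and are not taken from the talk.

```python
from collections import deque


def simulate_server(a, b, inputs):
    """Simulate V(m) = sum_j a[j-1]*V(m-j) + sum_j b[j-1]*U(m-j).

    Assumes len(a) == len(b) and zero initial conditions.
    """
    n = len(a)
    v_hist = deque([0.0] * n, maxlen=n)  # V(m-1), ..., V(m-n)
    u_hist = deque([0.0] * n, maxlen=n)  # U(m-1), ..., U(m-n)
    outputs = []
    for u in inputs:
        v = sum(aj * vj for aj, vj in zip(a, v_hist))
        v += sum(bj * uj for bj, uj in zip(b, u_hist))
        outputs.append(v)
        v_hist.appendleft(v)  # newest sample first; oldest falls off the end
        u_hist.appendleft(u)
    return outputs


# Hypothetical second-order coefficients with steady-state gain
# (0.3 + 0.1) / (1 - 0.4 - 0.2) = 1, driven by a constant allocation.
step_response = simulate_server([0.4, 0.2], [0.3, 0.1], [1.0] * 200)
print(step_response[:5], step_response[-1])
```

The output lags the input by one sampling window, which is why a controller acting on V(m) has to account for the server's dynamics rather than treat it as a static gain.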
13A Model of Delay Differentiation
- Model parameters: a_j, b_j, 1 ≤ j ≤ n
- Delay differentiation
- Input: assigned process ratio
- Output: delay ratio
[Figure: model estimation setup. Labels: white-noise generator, least squares estimator, monitor, B0, B1, C0, C1, TCP connection requests, Connection Scheduler, TCP listen queue, HTTP requests, HTTP response]
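
As a rough illustration of the estimation step, the sketch below drives a synthetic second-order system with white-noise input and recovers its parameters by ordinary least squares. Everything here is an assumption made for illustration: the "true" coefficients, the noise-free synthetic data, and the helper names `solve_linear` and `fit_arx`. The real experiment measured a running web server, not a simulated model.

```python
import random


def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_arx(u, v, n=2):
    """Least-squares fit of V(m) = sum_j a_j V(m-j) + sum_j b_j U(m-j)."""
    rows, targets = [], []
    for m in range(n, len(v)):
        past_v = [v[m - j] for j in range(1, n + 1)]
        past_u = [u[m - j] for j in range(1, n + 1)]
        rows.append(past_v + past_u)
        targets.append(v[m])
    p = 2 * n
    # Normal equations: (Phi^T Phi) theta = Phi^T y
    A = [[sum(r[i] * r[k] for r in rows) for k in range(p)] for i in range(p)]
    y = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    theta = solve_linear(A, y)
    return theta[:n], theta[n:]


# Synthetic experiment: white-noise input, hypothetical true parameters.
rng = random.Random(0)
true_a, true_b = [0.4, 0.2], [0.3, 0.1]
u = [rng.gauss(0.0, 1.0) for _ in range(500)]
v = [0.0, 0.0]
for m in range(2, len(u)):
    v.append(true_a[0] * v[m - 1] + true_a[1] * v[m - 2]
             + true_b[0] * u[m - 1] + true_b[1] * u[m - 2])

a_hat, b_hat = fit_arx(u, v)
print(a_hat, b_hat)
```

White noise is a convenient excitation because it exercises the system across all frequencies, which keeps the normal equations well conditioned.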
14Model Estimation Results
A second-order difference equation fits the
Apache server well
15Controller design
- PI control, designed with the root-locus method
- Relative delay controller
- Settling time T_S: 4.5 min
- Steady-state error E_S: 0
[Figure: root locus showing the closed-loop poles]
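
To make the zero steady-state error property concrete, here is a minimal sketch of a discrete PI controller closing the loop around a second-order difference-equation model. The plant coefficients and the gains `kp` and `ki` are hypothetical values chosen only so that the loop is stable; they do not come from the root-locus design in the talk, and `run_pi_loop` is an illustrative name.

```python
def run_pi_loop(reference, kp, ki, steps, a=(0.4, 0.2), b=(0.3, 0.1)):
    """Simulate a PI controller driving a second-order server model.

    Plant (assumed): V(m) = a1 V(m-1) + a2 V(m-2) + b1 U(m-1) + b2 U(m-2)
    Controller (incremental PI form):
        U(m) = U(m-1) + kp * (e(m) - e(m-1)) + ki * e(m)
    """
    v1 = v2 = u1 = u2 = 0.0  # V(m-1), V(m-2), U(m-1), U(m-2)
    e_prev = 0.0
    trace = []
    for _ in range(steps):
        v = a[0] * v1 + a[1] * v2 + b[0] * u1 + b[1] * u2  # measured output
        e = reference - v                                   # tracking error
        u = u1 + kp * (e - e_prev) + ki * e                 # new allocation
        trace.append(v)
        v2, v1 = v1, v
        u2, u1 = u1, u
        e_prev = e
    return trace


# Track a desired ratio of 2.0 with hypothetical gains.
trace = run_pi_loop(reference=2.0, kp=0.2, ki=0.3, steps=300)
print(trace[:5], trace[-1])
```

The integral term keeps adjusting the allocation as long as any error remains, which is what drives the steady-state error to zero; the choice of gains then determines how quickly the output settles.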
16Server Feedback Control
[Figure: server feedback control loop. Labels: W_k, C_k, B_k (0 ≤ k < N), monitor, Controllers, TCP connection requests, Connection Scheduler, TCP listen queue, HTTP requests, HTTP response]
17Experimental Data (relative delay)
[Figure: delay ratio, reference, and process ratio versus time (sec) as the number of premium users goes from 100 to 200]
- (a) Adaptive server: the designed settling time is marked on the plot
- (b) Non-adaptive server: basic users get shorter delays than premium users!
18Middleware for QoS Control
- API for plug-in performance sensors and actuators
- Common sensor/actuator library
- Engine for mapping QoS specifications to control
loops
- Run-time enforcement of QoS guarantees
[Figure: middleware architecture. Labels: Controlled System, Server, Proxy, Plug-in Actuators, Plug-in Sensors, Sensor API, Control API, QoS API, Performance Control Middleware, Loop Configuration, Common Platforms]
19The Middleware Suite
- Run-time modeling tools
- Automated profiling (RTAS 00)
- Capacity planning and resource assignment
- Overload/throughput control (CDC 00, IWQoS 99)
- Performance isolation (IEEE TPDS 01)
- Service differentiation tools
- Server delay differentiation (RTAS 01)
- Cache hit ratio differentiation (ICDCS 01)
- Router delay differentiation (submitted to Infocom 02)
- Prioritization (IWQoS 99)
- Absolute delay guarantee tools (RTAS 01)
20Middleware Example: Service Differentiation Tools
- Proportional Differentiated Web Services
Architecture
21Differentiated Services
- Problem statement
- N classes of users/traffic
- Average delay of class j is $D_j$
- It is required that
- $D_1 : D_2 : \dots : D_N = K_1 : K_2 : \dots : K_N$
- $K_1, K_2, \dots, K_N$ are specified weights
- Control-theoretical formulation?
22Control-Theoretical Formulation
- The differentiation objective
- $D_1 : D_2 : \dots : D_N = K_1 : K_2 : \dots : K_N$
- One feedback loop per class
- The feedback control objective
- Error ei
23Control Loop Output
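- Adjust the resource allocation of each class j by $\Delta R_j$
- $\Delta R_j = f(e_j)$, where
- f is linear
- $f(0) = 0$
- The resource conservation property
- $\sum_j \Delta R_j = 0$
- Proof (by linearity of f, and because the errors sum to zero)
- $\sum_j \Delta R_j = \sum_j f(e_j) = f\left(\sum_j e_j\right) = f(0) = 0$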
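
The conservation property can be checked numerically. The sketch below assumes a specific error definition that the slides do not spell out: each class's error is its measured share of total delay minus its target share K_j / sum(K). With that assumption the errors sum to zero, so any linear f yields adjustments that sum to zero. The function name and the gain value are hypothetical.

```python
def allocation_adjustments(delays, weights, gain):
    """Return per-class resource adjustments dR_j = gain * e_j.

    Assumed error: e_j = D_j / sum(D) - K_j / sum(K), i.e. measured
    relative delay minus target relative delay. f(e) = gain * e is
    linear with f(0) = 0.
    """
    total_delay = sum(delays)
    total_weight = sum(weights)
    errors = [d / total_delay - k / total_weight
              for d, k in zip(delays, weights)]
    return [gain * e for e in errors]


# Three classes with target delay ratio 1 : 2 : 3 (hypothetical numbers).
adjustments = allocation_adjustments(
    delays=[3.0, 1.0, 1.0], weights=[1.0, 2.0, 3.0], gain=10.0)
print(adjustments, sum(adjustments))
```

Class 0 is slower than its target share, so it receives a positive adjustment, and the other classes give up exactly that amount. No separate normalization step is needed to keep total allocation constant.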
24Application: Differentiated Web Caching
Goal: different content classes receive
different hit ratios
25Experimental Setup 1
- Web clients
- Surge: a tool that generates references matching
empirical measurements
- Servers
- Apache
- Cache
- Squid; cache size to file population is roughly 1
to 30
26Performance
27Experimental Setup 2
- Clients
- replay NLANR sanitized access logs
- class0 html files
- class1 non-html files
- Servers
- real servers on the internet
28Latency Reduction
- Backbone latency reduction
- ? includes all the pages that hit in the cache
- ? includes all the requested pages
29Software Performance Control
- Control theory
- Middleware solution for robust guarantees on
aggregate performance metrics (e.g., average
delay, total utilization, etc.)
- Scheduling theory
- Guarantees on microscopic performance metrics
(e.g., individual response times)
- Conditions on aggregate state
30Role of Scheduling Theory: Absolute Delay
Guarantees
- A constant-time admission test based on current
server utilization
- All admitted tasks are guaranteed to meet their
deadlines
- Arbitrary number of traffic classes
- No assumptions about task arrival process
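
A constant-time admission test of this kind might look like the sketch below. It rests on assumptions not stated on the slide: each request is taken to contribute cost/deadline to a running utilization counter, and the bound is passed in by the caller rather than derived here. The class and method names are hypothetical.

```python
class UtilizationAdmissionController:
    """Admit a request only if utilization stays within a given bound."""

    def __init__(self, bound):
        self.bound = bound       # utilization bound supplied by the caller
        self.utilization = 0.0   # sum of cost/deadline over admitted requests

    def try_admit(self, cost, deadline):
        """O(1) test: one addition and one comparison per arrival."""
        demand = cost / deadline
        if self.utilization + demand > self.bound:
            return False
        self.utilization += demand
        return True

    def release(self, cost, deadline):
        """Remove a request's contribution (assumed to happen when the
        request finishes or its deadline passes)."""
        self.utilization = max(0.0, self.utilization - cost / deadline)


controller = UtilizationAdmissionController(bound=0.5)
decisions = [controller.try_admit(cost=0.2, deadline=1.0) for _ in range(3)]
controller.release(cost=0.2, deadline=1.0)
readmitted = controller.try_admit(cost=0.2, deadline=1.0)
print(decisions, readmitted)
```

Because the test only inspects an aggregate counter, its cost does not grow with the number of admitted requests or traffic classes, and it needs no model of the arrival process.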
31Main Results
- All arrivals will meet their deadlines under an
optimal fixed-priority scheduling policy if
utilization stays below a bound
- Deadline monotonic scheduling is the optimal
fixed-priority scheduling policy
32Main Idea of Derivation
- Minimize, over all arrival patterns $\zeta$, the
maximum $U_\zeta(t)$ that precedes a missed deadline
[Figure: $U_\zeta(t)$ versus t, marking the maximum $U_\zeta(t)$ and the missed deadline]
33Evaluation
- Deadline miss ratio depends on CPU utilization
- Aperiodic (non-stationary) service requests meet
their deadlines when utilization is below the
bound
- The utilization bound can serve as a control set
point
34The Future Vision
- An analytic foundation for performance control
- Putting it all together
35Putting it all Together: Step 1 - Feasibility
Bounds
- Efficient QoS feasibility tests based on
aggregate measurements
[Figure: three utilization plots (axes from 0 to 100) dated 1973, 2001, and 2003. Labels: Schedulable bound, Generalized schedulable bound, Schedulable region, Generalized schedulable region, Periodic Load, Random Load, Relaxed Periodicity, Connectivity, Distributed System]
36Putting it all Together: Step 2 - Aggregate Models
- System models without load knowledge
[Figure: aggregate modeling diagram. Labels: QoS Guarantees on Aggregate Behavior, Assumptions about Load Arrival Process, Closed Loop Feedback Control Dynamics, Aggregate Queuing Models, Server Difference Equation Models, Individual Requests (Microscopic Models), Aggregate Service Profiles]
37Putting it all Together: Step 3 - Feedback
Control
- Distributed control to maintain global sufficient
conditions for desired behavior
[Figure: feasible region in the space of aggregate state variables (Var1, Var2, Var3). Labels: State Control Loops, Desired Aggregate Behavior, Feasible Region, Aggregate Performance Guarantees, Aggregate State Variables]
38Conclusions
- A first step towards an underlying analytic
foundation and design methodology for performance
control in software systems
- A middleware library that embodies the control
loop prototypes
- Theory to relate aggregate state to fine-grained
performance guarantees
39Future Work
- Study the characteristic features of software
feedback control systems
- Establish a better understanding of the
limitations of control theory
- Integrate control theory with real-time
scheduling theory for robust fine-grained
guarantees on temporal behavior and QoS
- Implement successful performance control
mechanisms in the QoS control middleware
40Acknowledgements
- I would like to acknowledge
- Chenyang Lu, for his work on delay
differentiation in web servers and for
contributing slides to this talk
- Ying Lu and Avneesh Saxena, for their work on
differentiated caching services
- Jack Stankovic, Sang Son, Gang Tao, Nina Bhatti,
Kang Shin, Kevin Skadron, and Jorg Liebeherr, for
their collaboration and help