Title: Autonomic System Design
1 Autonomic System Design
- Visa Holopainen, visa_at_netlab.hut.fi
2 Enabling autonomic behavior in systems software with hot swapping, J. Appavoo et al., 2003
- Focus on object-oriented systems software
- With hot swapping, new algorithms and monitoring code can be added to a running system without disruption
- Hot swapping is accomplished either by interpositioning of code or by replacement of code
  - Interpositioning involves inserting a new component between two existing ones. This enables more detailed monitoring when problems occur, while minimizing run-time costs when the system is performing acceptably
  - Replacement allows an active component to be switched with a different implementation of that component while the system is running
- Triggering hot swapping
  - In many cases an object is expected to trigger a replacement itself (autonomously)
    - For example, if an object is designed to support small files and it registers an increase in file size, then the object can trigger a hot swap with an object that supports large files
  - In other cases, the system infrastructure is expected to determine the need for an object replacement through a hot swap. Monitoring is required for this purpose.
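
The two mechanisms on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (`Handle`, `SmallFileStore`, `LargeFileStore`, `WriteCounter`) and the 4096-byte threshold are hypothetical, and the sketch ignores concurrency, which a real system must deal with when swapping a live component.

```python
class Handle:
    """Indirection point: clients hold the handle, never the component itself."""

    def __init__(self, target=None):
        self.target = target

    def swap(self, new_target):
        # Replacement: later calls go to the new component.
        self.target = new_target

    def write(self, chunk):
        self.target.write(chunk)

    def kind(self):
        return self.target.kind()


class SmallFileStore:
    """Hypothetical component tuned for small files (one contiguous buffer)."""
    LIMIT = 4096  # bytes; illustrative threshold, not taken from the paper

    def __init__(self, handle):
        self.handle = handle
        self.data = b""

    def write(self, chunk):
        self.data += chunk
        # The object monitors itself and triggers its own replacement.
        if len(self.data) > self.LIMIT:
            self.handle.swap(LargeFileStore(self.data))

    def kind(self):
        return "small"


class LargeFileStore:
    """Hypothetical component for large files (a list of blocks)."""

    def __init__(self, data=b""):
        # State carried over from the component being replaced.
        self.blocks = [data] if data else []

    def write(self, chunk):
        self.blocks.append(chunk)

    def kind(self):
        return "large"


class WriteCounter:
    """Interpositioning: sits in front of a component and counts calls to it."""

    def __init__(self, inner):
        self.inner = inner
        self.writes = 0

    def write(self, chunk):
        self.writes += 1
        self.inner.write(chunk)

    def kind(self):
        return self.inner.kind()


f = Handle()
f.swap(SmallFileStore(f))
f.write(b"x" * 1000)      # still served by the small-file component
f.write(b"x" * 5000)      # crosses the threshold: the component swaps itself out
monitor = WriteCounter(f.target)
f.swap(monitor)           # interpose monitoring only when it is wanted
f.write(b"y")
print(f.kind(), monitor.writes)
```

Clients only ever call the handle, so neither the replacement nor the interposed counter requires any change on the client side.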
3 Adaptive code vs. hot swapping
- Among other features, hot swapping allows systems software to react to changes in its environment
- A more traditional approach to handling varying environments is to use adaptive code
- In a system using adaptive code, all possible configurations must be built into the system beforehand
- Adaptive code has many problematic features (presented below)
4 Illustration of adaptive code vs. hot swapping
- An adaptive code implementation (A) vs. a hot-swapping implementation (B) of the same function
- The adaptive code approach is monolithic and includes monitoring code that collects the data needed by the adaptive algorithm to choose a particular code path
- With hot swapping, each algorithm is implemented independently (resulting in reduced complexity per component) and is hot swapped in when needed
5 Benefits of hot swapping
- Hot swapping can be beneficial in at least the following respects:
- Optimizing for the (non-)common case
  - Dynamic replacement allows efficient implementations of common paths to be used when suitable, and less-efficient, less-common implementations to be switched in when necessary
- Optimizing for a wide range of file attribute values
  - For example, although the vast majority of files accessed are small (< 4 KB), OSs must also support large files
- Access patterns
  - Researchers have shown up to 30 percent fewer cache misses by using the appropriate cache management policy
- Multiprocessor optimizations
  - Some applications perform better when distributed across many processors, while others perform better when run on a single processor
- Enabling client-specific customization
- Exporting system structure information
  - Always gathering the necessary profiling information increases overhead
6 Testing system
- A research operating system (K42) has been developed to test the hot swapping approach
- Runs on PowerPC and MIPS architectures (soon available for x86 also)
- K42 scales well to multiprocessor systems
- Performance advantages of hot swapping have been demonstrated in K42
- K42 is available at http://www.research.ibm.com/K42
7 Adding Autonomic Functionality to object-oriented applications, M. Schanne, W. Tichy, T. Gelhausen, 2003
- The goal is to separate autonomic functionality from applications (similar to hot swapping)
- This is accomplished by creating a system based on class renaming and proxy/wrapper generation
- A list of the proxy objects is kept in a registry
- A proxy object always holds a pointer to the latest version of the actual object and has access to its member functions
  - This is accomplished with the Byte Code Engineering Library (BCEL)
- Wrapper functions ensure synchronization of variables
- The design ensures that users need not adapt their source code in any way or even restart the program
- Supported environments: the Java 2 platform and the like
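
The registry-plus-proxy idea can be illustrated with a small Python analogue. This is not the authors' BCEL-based implementation, which rewrites Java bytecode; the names `Proxy`, `registry`, `create`, `upgrade`, `CounterV1` and `CounterV2` are all hypothetical, and the sketch assumes that copying the instance fields is enough to carry state from the old version to the new one.

```python
class Proxy:
    """Forwards attribute access to the latest version of the real object."""

    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Only called for names the proxy itself does not define.
        return getattr(object.__getattribute__(self, "_target"), name)

    def __setattr__(self, name, value):
        setattr(object.__getattribute__(self, "_target"), name, value)


registry = []  # every proxy handed out to the application


def create(cls, *args):
    """Create an object and return a registered proxy instead of the object."""
    proxy = Proxy(cls(*args))
    registry.append(proxy)
    return proxy


def upgrade(old_cls, new_cls):
    """Re-point every proxy of old_cls at a new_cls instance with the same fields."""
    for proxy in registry:
        old = object.__getattribute__(proxy, "_target")
        if type(old) is old_cls:
            new = new_cls.__new__(new_cls)       # skip __init__, keep old state
            new.__dict__.update(old.__dict__)
            object.__setattr__(proxy, "_target", new)


class CounterV1:
    def __init__(self, start):
        self.value = start

    def step(self):
        self.value += 1
        return self.value


class CounterV2:
    def __init__(self, start):
        self.value = start

    def step(self):
        self.value += 10   # new behavior introduced at run time
        return self.value


counter = create(CounterV1, 0)
counter.step()                    # served by CounterV1
upgrade(CounterV1, CounterV2)     # the application keeps using the same reference
counter.step()                    # now served by CounterV2
print(counter.value)
```

The application code holding `counter` is never edited or restarted; only the object behind the proxy changes.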
8 Usable Autonomic Computing Systems: the Administrator's Perspective, R. Barrett, P. Maglio, E. Kandogan, J. Bailey, 2004
- Autonomic computing seeks to solve the problem of increasingly complex configurations through increased automation
- However, the AC strategy of managing complexity through automation runs the risk of making management harder (more powerful commands)
- This is why autonomic systems should
  - Provide facilities that make rehearsing and planning easy
  - Be designed to allow administrators to quickly undo changes, making operations (whether on production systems or test systems) less risky and therefore easier
  - Inform the administrator if undoing a command will not be (easily) possible
  - Have enhanced capabilities for testing complex end-to-end systems so that administrators can be confident that their changes are not having unintended consequences
  - Provide access to arbitrary levels of configuration detail if need be
- Autonomic systems should also
  - Contain a command line interface (in addition to a GUI)
9 An Architectural Approach to Autonomic Computing, S. White, J. Hanson, I. Whalley, D. Chess, J. Kephart, 2004
- An autonomic system can be decomposed into 1) interfaces, 2) interactions and 3) design patterns
- A somewhat RFC-style paper with MUST and SHOULD statements about Autonomic Elements (AE)
- MUST examples
  - An AE MUST be self-managing
  - An AE MUST handle problems locally whenever possible
  - An AE MUST be capable of establishing and maintaining relationships with other autonomic elements
- SHOULD examples
  - An AE SHOULD ask for a realistic set of requirements when requesting a service from another element
  - An AE SHOULD offer a range of performance, reliability, availability and security associated with its service
  - An AE SHOULD protect itself against inappropriate service requests and responses
10 Use of policies
- The use of policies is essential for autonomic systems
- Three (3) policy levels are presented
- Action policies (IF condition THEN action)
  - An AE employing action policies MUST measure and/or synthesize the quantities stated in the condition
- Goal policies (e.g. "Response time must not exceed 2 sec.")
  - AEs employing goal policies MUST possess sufficient modeling or planning capabilities to translate goals into actions
- Utility function policies (automatically determine the most valuable goal in any situation)
  - AEs employing utility function policies MUST have sophisticated modeling and optimization capabilities to translate utility functions into actions
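
The difference between the three policy levels shows up clearly in a toy example. The sketch below assumes a made-up performance model (`predicted_response_time`) and a made-up utility function; none of the names or numbers come from the paper. The point is only that each level requires more modeling capability from the element than the one before.

```python
def predicted_response_time(servers, load):
    """Hypothetical model: response time in seconds for a given allocation."""
    return load / (10.0 * servers)


# 1) Action policy: IF condition THEN action.
#    The element only has to measure the quantity named in the condition.
def action_policy(measured_response_time, servers):
    if measured_response_time > 2.0:
        return servers + 1
    return servers


# 2) Goal policy: "response time must not exceed 2 s".
#    The element needs a model to work out which action reaches the goal.
def goal_policy(load, max_response_time=2.0, max_servers=20):
    for servers in range(1, max_servers + 1):
        if predicted_response_time(servers, load) <= max_response_time:
            return servers
    return max_servers


# 3) Utility function policy: every state gets a value, and the element
#    optimizes over the states. The value and cost terms are assumptions.
def utility(servers, load, cost_per_server=1.0):
    response_time = predicted_response_time(servers, load)
    return 20.0 / (1.0 + response_time) - cost_per_server * servers


def utility_policy(load, max_servers=20):
    return max(range(1, max_servers + 1), key=lambda s: utility(s, load))


print(action_policy(3.0, 4), goal_policy(100), utility_policy(100))
```

With these particular numbers the goal policy and the utility policy choose different allocations, because the utility function also weighs the cost of each server instead of treating the 2-second limit as a hard constraint.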
11 Interfaces
- Making a system autonomic requires additional interfaces to be added to the system
- Monitoring and test interfaces
  - Enable an element to be monitored by any other element that has established the appropriate administrative relationships with it
- Lifecycle interfaces
  - Enable administrative elements to determine the lifecycle state of an element (e.g. starting, paused), to cause a state change, and to determine the lifecycle model that applies to the element
- Policy interfaces
  - Enable administrative elements to send new policies to an element, and to determine the policies currently in use by the element
- Negotiation and binding interfaces
  - Permit an element to request a service from other elements, or to request to provide a service
12 Relationships
- When an AE has agreed to provide service to another AE, those two elements have a relationship
- Relationships are typically formed at run-time
- Autonomic systems are built out of relationships
- A request-response paradigm is used to form relationships
13 From autonomic elements to autonomic systems
- Assembling an autonomic system requires
  - A collection of AEs that implement the desired function
  - Additional autonomic elements to implement system functions that enable the needed system-level behaviors (infrastructure elements)
  - Design patterns for system self-management
- An infrastructure element can be a
  - Registry (provides mechanisms for elements to find one another)
  - Sentinel (provides monitoring services to other elements)
  - Aggregator (combines two or more existing elements and uses them to provide improved service)
  - Broker (facilitates interaction)
  - Negotiator (assists elements with complex negotiations)
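
As a rough sketch of how a registry and request-response relationships fit together, consider the Python below. The classes `Registry` and `AutonomicElement`, the `capacity` field and the service name are hypothetical simplifications; the paper specifies behavior, not this API.

```python
class Registry:
    """Infrastructure element: lets elements find one another."""

    def __init__(self):
        self._providers = {}

    def register(self, service, element):
        self._providers.setdefault(service, []).append(element)

    def lookup(self, service):
        return list(self._providers.get(service, []))


class AutonomicElement:
    def __init__(self, name, capacity=0):
        self.name = name
        self.capacity = capacity     # how many clients it is willing to serve
        self.relationships = []      # (role, service, peer name)

    def handle_request(self, service, requester):
        """Response half of the request-response exchange."""
        if self.capacity <= 0:
            return False             # refuse requests it cannot honor
        self.capacity -= 1
        self.relationships.append(("provider", service, requester.name))
        return True

    def acquire(self, service, registry):
        """Request half: ask known providers until one agrees."""
        for provider in registry.lookup(service):
            if provider.handle_request(service, self):
                self.relationships.append(("client", service, provider.name))
                return provider
        return None


registry = Registry()
db1 = AutonomicElement("db1", capacity=1)
db2 = AutonomicElement("db2", capacity=1)
registry.register("storage", db1)
registry.register("storage", db2)

web = [AutonomicElement(f"web{i}") for i in range(3)]
chosen = [w.acquire("storage", registry) for w in web]
print([p.name if p else None for p in chosen])
```

The relationships are formed at run time, and the third requester finds no provider because both storage elements have already committed their capacity.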
14 Towards Requirements-Driven Autonomic Systems Design, A. Lapouchnian, S. Liaskos, J. Mylopoulos, Y. Yu, 2005
- There are three basic ways to make a system autonomic
  - Design the system to support a space of possible behaviors
  - Equip the system with planning and social capabilities so that it can delegate tasks to external software components (agents)
  - Build the system so that it has evolutionary capabilities (like biological systems)
- The first approach was studied in the paper
- Requirements engineering
  - Development of a framework for capturing and analyzing stakeholder intentions to generate functional and non-functional requirements
15 Illustration of requirements engineering goal model
- Top-level hard goal
  - Schedule meeting
  - AND-composed of lower-level hard goals
- 4 top-level softgoals
  - Good quality schedule, Minimal effort, Minimal disturbances, Accurate constraints
- Lower-level softgoals can be related to higher levels by help (+), hurt (-), make (++) or break (--) relationships
- 6 alternative ways to fulfill the goal "Schedule meeting"
- An autonomic system should address all the different ways of fulfilling the top-level goals
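
A goal model of this kind can be treated as an AND/OR tree whose leaves are tasks. The Python sketch below enumerates the alternatives and ranks them against one softgoal. The decomposition, the leaf names and the contribution labels are invented for illustration and chosen so that six alternatives result, as on the slide; the paper's actual model may be structured differently. Mapping make/help/hurt/break to +2/+1/-1/-2 is also an assumption, since the labels are qualitative.

```python
from itertools import product

CONTRIB = {"make": 2, "help": 1, "hurt": -1, "break": -2}  # assumed numeric scale


def leaf(name):
    return ("LEAF", name, [])


def alternatives(goal):
    """Return every set of leaf tasks that satisfies the goal."""
    kind, name, children = goal
    if kind == "LEAF":
        return [frozenset([name])]
    child_alts = [alternatives(child) for child in children]
    if kind == "OR":
        # Any alternative of any child satisfies an OR goal.
        return [alt for alts in child_alts for alt in alts]
    # AND: pick one alternative per child and combine them.
    return [frozenset().union(*combo) for combo in product(*child_alts)]


# Hypothetical decomposition of the top-level hard goal.
model = ("AND", "schedule meeting", [
    ("OR", "collect timetables",
     [leaf("by email"), leaf("by phone"), leaf("from calendars")]),
    ("OR", "choose schedule",
     [leaf("manually"), leaf("automatically")]),
])

# Hypothetical softgoal contributions of each leaf task.
effects = {
    "by email": {"minimal effort": "hurt"},
    "by phone": {"minimal effort": "break", "accurate constraints": "help"},
    "from calendars": {"minimal effort": "make", "accurate constraints": "hurt"},
    "manually": {"minimal effort": "hurt", "good quality schedule": "help"},
    "automatically": {"minimal effort": "help"},
}


def score(alternative, softgoal):
    return sum(CONTRIB[effects[task][softgoal]]
               for task in alternative
               if softgoal in effects.get(task, {}))


alts = alternatives(model)
best = max(alts, key=lambda alt: score(alt, "minimal effort"))
print(len(alts), sorted(best))
```

An element that keeps such a model in its knowledge base can switch to another alternative when the current one stops satisfying the softgoals.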
16 Goal model -> Feature model -> Component Connector model
17 Goal model is integrated into the knowledge of an autonomic element
18 Architectural Design of a Distributed Application with Autonomic Quality Requirements, D. Weyns, K. Schelfthout and T. Holvoet, 2005
- A reference architecture for situated multi-agent systems (situated MAS) was developed
- This reference architecture was applied to a real-world software system
- The architecture
  - A situated MAS consists of an environment populated with agents (autonomous entities)
  - Intelligence in a situated MAS originates from the interaction between agents, rather than from their individual capabilities
  - The architecture holds three abstractions: agents, ongoing activities and the environment
19 High-level model view of the architecture
- The Perception module maps the local state of the environment onto a percept for the agent
- The Consumption module handles the effects of environment changes that affect the agent
- The Decision module is responsible for action selection
20 The application
- A system in which robots transport loads from one place to another within a warehouse and recharge themselves whenever needed
- Old system: a centralized server controlled the robots
  - Main problem: inflexibility; robots can't adapt to changing situations
- Improvement: robots are agents acting in a MAS
  - Drawback: a more complicated system
21 Module view of the application
- Two kinds of agents: transport agents and AGV agents
- Transport agents are managers: they determine the priority of the transport, assign transports to AGVs and ensure that the transport succeeds
- AGV agents are responsible for executing the assigned transport
22 Architecture of the environment
- To cope with the complexity of the environment, it is presented through a layered architecture
- The virtual environment uses a middleware layer that enables agents to communicate with each other
- The virtual environment enables agent routing and prevents collisions
  - An agent observes a 3-5 meter circle of the virtual environment at a time
  - In this circle the agent marks the path it is going to use and removes this path when leaving the circle
  - This way collisions can be avoided
- Transport agents use the virtual environment to locate AGV agents
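
The mark-and-remove scheme can be sketched as a shared reservation table. The grid cells, the all-or-nothing marking rule and the class name `VirtualEnvironment` below are assumptions made for illustration; the real system works with a 3-5 meter circle around each vehicle rather than a grid.

```python
class VirtualEnvironment:
    """Shared structure in which agents mark the path they intend to use."""

    def __init__(self):
        self.marks = {}  # cell -> id of the agent that marked it

    def try_mark(self, agent, cells):
        """Mark all cells for the agent, or none if another agent holds one."""
        if any(self.marks.get(cell, agent) != agent for cell in cells):
            return False
        for cell in cells:
            self.marks[cell] = agent
        return True

    def unmark(self, agent):
        """Remove the agent's marks once it has left the area."""
        for cell in [c for c, owner in self.marks.items() if owner == agent]:
            del self.marks[cell]


env = VirtualEnvironment()
path_a = [(0, 0), (0, 1), (0, 2)]
path_b = [(1, 1), (0, 1), (-1, 1)]       # crosses path_a at (0, 1)

first = env.try_mark("agv-a", path_a)    # agv-a marks its path
blocked = env.try_mark("agv-b", path_b)  # agv-b sees the mark and has to wait
env.unmark("agv-a")                      # agv-a leaves the area
retry = env.try_mark("agv-b", path_b)    # now the crossing is free
print(first, blocked, retry)
```

No central server decides who goes first; each vehicle only reads and writes marks in its own neighborhood.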
23 A Control Theory Foundation for Self-Managing Computing Systems, Y. Diao, J. Hellerstein, S. Parekh, R. Griffith, G. Kaiser, D. Phung, 2005
- Control theory is used as a way to identify a number of requirements for and challenges in building self-managing systems
- What does control theory bring to the table in terms of self-management?
  - Autonomic computing and control theory have slightly different points of focus: autonomic computing focuses on the specification and construction of management components that interoperate well, while control theory focuses on analyzing and/or developing components and algorithms so that the resulting system achieves the control objectives
  - For example, control theory provides design techniques for determining the values of parameters in commonly used control algorithms so that the resulting control system is stable and settles quickly in response to disturbances
24 Feedback Control Theory
- Reference input (I/P): the desired output (O/P), as specified by the human
- Control error: Reference I/P - Measured O/P
- Control input: parameters that affect the behavior of the system
- Disturbance I/P: affects the Control I/P
- Controller: changes the Control I/P to achieve the Reference I/P
- Measured O/P: a measurable feature of the system
- Noise I/P: affects the Measured O/P
- Transducer: transforms the Measured O/P so that it can be compared with the Reference I/P
25 Properties of Control Systems
- SASO
- Stable
  - Bounded input produces bounded output
  - Unstable systems are not usable in mission-critical work
- Accurate
  - Measured output converges to the reference (desired) input
- Short settling times
  - Converges to the stable value quickly
- No overshoot
  - Achieves objectives in a steady manner
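
Three of the four SASO properties can be quantified from a recorded step response. The function below is a sketch using one common set of definitions (a 2 % band for settling, overshoot relative to the final value); the function name, the tolerance and the sample trace are assumptions, and it presumes the trace has already settled at a positive value. Stability itself cannot be established from a finite trace and is not checked here.

```python
def saso_metrics(trace, reference, tolerance=0.02):
    """Estimate accuracy, settling time and overshoot from a step response.

    trace     -- measured output, one sample per time step
    reference -- reference input the output is supposed to reach
    """
    final = trace[-1]                      # taken as the steady-state value

    # Accurate: how far the steady state is from the reference input.
    steady_state_error = reference - final

    # Short settling time: first step from which the output stays
    # within the tolerance band around the steady-state value.
    band = tolerance * abs(final)
    settling_time = 0
    for k, y in enumerate(trace):
        if abs(y - final) > band:
            settling_time = k + 1

    # No overshoot: largest excursion above the steady state, as a fraction.
    overshoot = max(0.0, (max(trace) - final) / final)

    return {
        "steady_state_error": steady_state_error,
        "settling_time": settling_time,
        "overshoot": overshoot,
    }


# Hand-made example trace, not measured data.
example = [0.0, 0.6, 1.2, 1.1, 0.95, 1.01, 1.0, 1.0]
metrics = saso_metrics(example, reference=1.0)
print(metrics)
```

In this hand-made trace the output overshoots before settling, so a controller producing it would be accurate but would violate the "no overshoot" property.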
26 Control Analysis and Design
- Transfer functions and the Z-transform are used to model and control response times and settling times
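
A short worked example may make the role of the transfer function concrete. It assumes a first-order model of the target system and an integral controller; neither is stated on the slide, and the symbols a, b and K_I are introduced here only for illustration.

```latex
% Assumed first-order model of the target system and its transfer function
\[
  y(k+1) = a\,y(k) + b\,u(k)
  \quad\Longrightarrow\quad
  G(z) = \frac{Y(z)}{U(z)} = \frac{b}{z - a}
\]

% Assumed integral controller acting on the control error e(k) = r(k) - y(k)
\[
  u(k) = u(k-1) + K_I\,e(k)
  \quad\Longrightarrow\quad
  K(z) = \frac{U(z)}{E(z)} = \frac{K_I\,z}{z - 1}
\]

% Closed loop from reference input to measured output
\[
  F(z) = \frac{K(z)\,G(z)}{1 + K(z)\,G(z)}
       = \frac{K_I\,b\,z}{z^{2} - (1 + a - K_I b)\,z + a}
\]
```

The closed loop is stable when both poles of F(z) lie inside the unit circle, which for |a| < 1 holds exactly when 0 < K_I b < 2(1 + a). Since F(1) = 1, the steady-state error for a constant reference is zero, and the closer the largest pole magnitude is to 1, the longer the settling time.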
27 Example: control theory approach to web server management
- Objective: CPU utilization < 50%
- Measured output: CPU utilization
- Control input: MaxClients
- During the first 300 s, the system operates without feedback control. When the controller is turned on, a reference input of 0.5 is used. At this point, the system begins to oscillate and the amplitude of the oscillations increases. This is a result of a controller design that overreacts to the stochastics in the CPU utilization measurement.
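
The growing oscillation can be reproduced with a toy simulation. Everything numeric below is an assumption: the server is modeled as a first-order linear system with made-up coefficients `a` and `b`, the controller is a plain integral controller, and saturation of MaxClients is ignored. For this model the loop is stable only while `gain * b` stays below `2 * (1 + a)`, that is, for gains below 3000.

```python
import random


def simulate(gain, steps=60, control_on=10, a=0.5, b=0.001,
             reference=0.5, noise_std=0.0, seed=0):
    """Return the measured CPU utilization of a hypothetical server model."""
    rng = random.Random(seed)
    max_clients = 300.0     # control input, fixed until the controller starts
    cpu = 0.0               # true utilization of the modeled server
    trace = []
    for k in range(steps):
        measured = cpu + rng.gauss(0.0, noise_std)   # noise input
        trace.append(measured)
        if k >= control_on:
            error = reference - measured             # control error
            max_clients += gain * error              # integral control law
        cpu = a * cpu + b * max_clients              # assumed server model
    return trace


calm = simulate(gain=500)       # gain well inside the stable range
wild = simulate(gain=3500)      # gain beyond the stability limit of 3000

print(round(calm[-1], 4))
print(abs(wild[-1] - 0.5) > abs(wild[15] - 0.5))
```

With the moderate gain the utilization settles at the reference input, while the excessive gain makes every correction larger than the error it responds to, so the oscillation grows from step to step. Passing a non-zero `noise_std` adds measurement noise for the controller to react to.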
28 <username>, I Need You! Initiative and Interaction in Autonomic Systems, P. Kaminski, P. Agrawal, H. Kienle, H. Müller, 2005
- Autonomic job requirements
  - If I hired a person instead, what qualities would I look for?
  - Attention to detail, strong communication skills, initiative tempered by job boundaries, self-knowledge and willingness to seek help
  - Treat users as partners, not masters
- Basic idea
  - The system has an optimization engine that decides whether the preferred mode of action in a given situation is to 1) contact a human or 2) try to repair the system
  - Decision based on 1) explicit instructions and 2) learning
  - Balance: match, bother, rush, risk
  - The system learns from human actions and becomes more competent in solving problems on its own
  - Balance initiative and interaction
  - Send messages via e-mail, instant messenger, etc.
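
One way to picture the optimization engine is as a comparison of two scores. The sketch below is speculative: the slide names the four factors (match, bother, rush, risk) but not how they are combined, so the formulas, the interpretation of each factor, the learning rule and the class name `InitiativeEngine` are all assumptions.

```python
class InitiativeEngine:
    """Decides between repairing on its own and contacting a human."""

    def __init__(self):
        # problem type -> learned confidence that self-repair works
        self.confidence = {}

    def decide(self, problem, match, bother, rush, risk):
        """Return 'repair' or 'contact'. All factors are taken to lie in [0, 1].

        match  -- how well the available human fits the problem (assumed meaning)
        bother -- cost of interrupting that human (assumed meaning)
        rush   -- urgency of the problem (assumed meaning)
        risk   -- damage a failed self-repair could cause (assumed meaning)
        """
        p = self.confidence.get(problem, 0.0)
        repair_value = p * (1.0 + rush) - (1.0 - p) * risk
        contact_value = match - bother
        return "repair" if repair_value > contact_value else "contact"

    def observe_human_fix(self, problem, rate=0.5):
        """Learning: each observed human fix raises confidence for that problem."""
        p = self.confidence.get(problem, 0.0)
        self.confidence[problem] = p + rate * (1.0 - p)


engine = InitiativeEngine()
factors = dict(match=0.8, bother=0.3, rush=0.2, risk=0.5)

before = engine.decide("disk full", **factors)   # nothing learned yet
engine.observe_human_fix("disk full")
engine.observe_human_fix("disk full")
after = engine.decide("disk full", **factors)    # confidence has grown
print(before, after)
```

The same situation leads to a different decision once the system has watched a human handle it, which is the shift from interaction to initiative that the slide describes.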
29 A human (operator) is added to the traditional autonomic computing cycle