Title: Algon: from interchangeable distributed algorithms to interchangeable middleware
2 Introduction
- Challenges of distributed system development
  - Resolving issues present in distributed execution, e.g. non-determinism, contention and synchronisation
  - Finding algorithms to solve these issues
  - Availability of exact implementations is limited
  - Existing implementations may not be suited to present system requirements
  - Judging which algorithm to apply requires domain knowledge
  - Developing own implementation can be fraught with difficulty, requiring advanced programming skills
3 Introduction
- Separation of concerns technique
  - Decompose software development projects into manageable parts
  - Programmers work on individual parts separately
  - Dependencies between parts resolved with abstractions (interfaces)
  - Thus business logic is not muddied with complexities of orthogonal functional and non-functional aspects
4 Introduction
- Algon applies separation of concerns in distributed systems development
  - Algorithmic complexity is hidden from programmers in separate component levels
- Algon includes:
  - Library of distributed algorithm (DA) implementations
  - Framework for integration between applications and DAs
  - Performance metric evaluation tool
5 Architectural Evolution I
[Figure: Application containing Thread1, Thread2, Thread3 and Thread4, with Synchronisation between them]
6 Architectural Evolution II
7 Architectural Evolution III
[Figure: three Application/Middleware pairs and a Central Controller]
Central Controller: deals with synchronisation, non-determinism and contention concerns
8 Architectural Evolution IV
[Figure: three Application/Middleware pairs]
Centralised controller logic is duplicated in each application. This logic may, however, be spread throughout application code, making it difficult to modify and maintain.
9 Algon
- Distributed algorithms and scheduling code that was injected in diverse locations in applications following Architectural Evolution IV are now centralised in one component layer
- Algon component layer contains:
  - At least one Scheduler
  - At least one distributed algorithm Interface
  - At least one coded algorithm, implementing Interface
10 Algon Architecture
[Figure: three nodes, each a stack of Application, Scheduler, Interface, Algorithm and Middleware layers]
11 Example: Mutual Exclusion
- System has n (n > 1) nodes requiring shared access to some resource
[Figure: layer stack for one node]
Application: Reading/Writing code
Scheduler: Mutual Exclusion Scheduler (MEScheduler)
Interface: Interface for Non-Token-based Mutual Exclusion Algorithms (MENT)
Algorithm: Ricart-Agrawala, Maekawa
Middleware: Java/RMI
Application or administrator may switch between algorithms to adapt to different load scenarios
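The layering on this slide can be sketched in Java as follows. This is only an illustration under stated assumptions: `MutexAlgorithmSketch` stands in for the MENT interface, `MutexSchedulerSketch` for MEScheduler, and none of the method names are taken from Algon. The two algorithm classes do not implement the protocols; they only carry approximate textbook light-load message counts so that switching has a visible effect.

```java
/** Hypothetical stand-in for the MENT interface: what the scheduler needs from any algorithm. */
interface MutexAlgorithmSketch {
    String name();
    /** Approximate messages per critical-section entry with n nodes (light-load textbook figures). */
    int messagesPerEntry(int n);
}

final class RicartAgrawalaSketch implements MutexAlgorithmSketch {
    @Override public String name() { return "Ricart-Agrawala"; }
    @Override public int messagesPerEntry(int n) {
        return 2 * (n - 1); // (n - 1) requests plus (n - 1) replies
    }
}

final class MaekawaSketch implements MutexAlgorithmSketch {
    @Override public String name() { return "Maekawa"; }
    @Override public int messagesPerEntry(int n) {
        int quorum = (int) Math.ceil(Math.sqrt(n)); // quorum size is roughly sqrt(n)
        return 3 * quorum; // request, reply and release per quorum member (approximation)
    }
}

/** Hypothetical scheduler: the application only talks to this class, never to an algorithm. */
final class MutexSchedulerSketch {
    private MutexAlgorithmSketch algorithm;

    MutexSchedulerSketch(MutexAlgorithmSketch initial) { this.algorithm = initial; }

    /** Swap the algorithm without touching application code. */
    void switchTo(MutexAlgorithmSketch next) { this.algorithm = next; }

    String currentAlgorithm() { return algorithm.name(); }

    int costOfEntry(int n) { return algorithm.messagesPerEntry(n); }
}
```

An application or administrator would call `switchTo` when the load scenario changes; the reading/writing code itself stays the same.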
12 Terminology
- Scheduler: class that
  - Provides transparent access to lower-level Algon features
  - Performs configuration
  - Is specific to algorithm family used
- Interface
  - Generalises distributed algorithm families
  - Each DA implements one
13 Separation of Concerns: Schedulers and Interfaces
- Schedulers allow algorithm-dependent state information to be maintained outside of application logic
- Interfaces abstract behaviour common to functionally equivalent algorithms
- Benefits
  - Minimal alteration is required to add one or more algorithms to an application
  - Painless interchange of algorithms facilitates eliciting more desirable application performance
14 Architectural Issues
- Two shortcomings identified in original Algon architecture
  - Automatic site discovery: applications on distributed hosts were identified by static config files
  - Support for monitoring: initial design did not cater for performance measurement, so first-attempt implementation was ad hoc and inefficient
15 Solutions
- Dynamic unique name assignment
  - Name server tool (AlgonNameServer) added to framework
  - Scheduler registers with AlgonNameServer
  - Algorithm information, host IP address and unique identifier recorded
  - Algorithm request sets constructed by Scheduler-initiated name queries
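A minimal in-process sketch of the registration and query steps above. The class and method names (`NameServerSketch`, `register`, `requestSetFor`) are hypothetical and are not the AlgonNameServer API; the rule that a request set contains every other site running the same algorithm is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** One recorded registration: unique identifier, algorithm information and host IP address. */
final class RegistrationSketch {
    final int id;
    final String algorithm;
    final String hostIp;

    RegistrationSketch(int id, String algorithm, String hostIp) {
        this.id = id;
        this.algorithm = algorithm;
        this.hostIp = hostIp;
    }
}

/** Hypothetical name server: hands out unique identifiers and answers name queries. */
final class NameServerSketch {
    private final AtomicInteger nextId = new AtomicInteger(1);
    private final List<RegistrationSketch> entries = new ArrayList<>();

    /** Called by a scheduler at start-up; the identifier is assigned dynamically. */
    synchronized RegistrationSketch register(String algorithm, String hostIp) {
        RegistrationSketch entry =
                new RegistrationSketch(nextId.getAndIncrement(), algorithm, hostIp);
        entries.add(entry);
        return entry;
    }

    /** Name query: every other registered site that runs the same algorithm. */
    synchronized List<RegistrationSketch> requestSetFor(RegistrationSketch self) {
        List<RegistrationSketch> result = new ArrayList<>();
        for (RegistrationSketch entry : entries) {
            if (entry.id != self.id && entry.algorithm.equals(self.algorithm)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

Because identifiers are assigned at registration time, no static configuration file has to list the participating hosts in advance.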
16 Solutions
- Integrated status reporting mechanism
  - OutputDisplay class instantiated and registered by AlgonNameServer
  - Scheduler sends status information to Reporter class paired with it
  - Reporter maintains all status info, and sends reports to UpdateQueue running on separate thread
  - Data on UpdateQueue forwarded to OutputDisplay by Dispatch thread
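The reporting path above (Scheduler to Reporter to UpdateQueue to Dispatch to OutputDisplay) can be sketched with a standard `BlockingQueue`. Everything here is an assumption for illustration: the class name `StatusPipelineSketch`, the use of plain strings as reports, the list standing in for OutputDisplay, and the sentinel used to stop the dispatch thread.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical reporter + update queue + dispatch thread, collapsed into one class. */
final class StatusPipelineSketch {
    private static final String STOP = "__stop__"; // sentinel; real reports must differ from it

    private final BlockingQueue<String> updateQueue = new LinkedBlockingQueue<>();
    private final List<String> display = new CopyOnWriteArrayList<>(); // stands in for OutputDisplay
    private final Thread dispatch = new Thread(this::drain, "dispatch");

    void start() { dispatch.start(); }

    /** Called on the scheduler's side: enqueue and return at once, never wait for the display. */
    void report(String status) { updateQueue.add(status); }

    /** Ask the dispatch thread to finish once it has forwarded everything queued so far. */
    void stop() {
        updateQueue.add(STOP);
        try {
            dispatch.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> displayed() { return display; }

    /** Dispatch loop: runs on its own thread and forwards reports in arrival order. */
    private void drain() {
        try {
            while (true) {
                String status = updateQueue.take();
                if (status.equals(STOP)) {
                    return;
                }
                display.add(status);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The point of the detached queue is that `report` costs the scheduler one enqueue; any slowness in the display is absorbed by the dispatch thread.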
17 Current Architecture
[Figure: Application, Scheduler, Interface and Algorithm layers alongside Dispatch, Reporter and UpdateQueue; Algon NameServer; Performance Display; Output Display]
18 OutputDisplay
19 Performance Measurement
- OutputDisplay provides coarse-grained status information
- Data is not useful for realistic comparison of algorithms
- PerformanceDisplay was created to solve this problem
- Also receives status data via detached queue and Dispatch mechanism, to minimise impact
20 PerformanceDisplay
21 Performance Metrics: Mutual Exclusion
- Response Time: time interval between critical section (CS) request message transmission and end of CS execution
- Synchronisation Delay (sd): time interval between end of one site's CS and start of another
- Number of messages: count of messages required for entering CS
- System Throughput: rate at which CS requests are processed
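As a worked example of these definitions, the sketch below computes three of the metrics from per-request timestamps. The record layout (`requestSent`, `csStart`, `csEnd`, in milliseconds) and all class and method names are assumptions; how PerformanceDisplay actually collects and aggregates its data is not stated on the slides.

```java
import java.util.List;

/** Hypothetical timestamps (milliseconds) for one critical-section request. */
final class CsRecordSketch {
    final long requestSent;
    final long csStart;
    final long csEnd;

    CsRecordSketch(long requestSent, long csStart, long csEnd) {
        this.requestSent = requestSent;
        this.csStart = csStart;
        this.csEnd = csEnd;
    }
}

final class MutexMetricsSketch {
    /** Response time: from request transmission to the end of CS execution. */
    static long responseTime(CsRecordSketch r) {
        return r.csEnd - r.requestSent;
    }

    /** Mean synchronisation delay over consecutive CS executions (list ordered by csStart). */
    static double meanSyncDelay(List<CsRecordSketch> ordered) {
        long total = 0;
        for (int i = 1; i < ordered.size(); i++) {
            total += ordered.get(i).csStart - ordered.get(i - 1).csEnd;
        }
        return (double) total / (ordered.size() - 1);
    }

    /** Throughput: completed CS requests per second over the observed interval. */
    static double throughputPerSecond(List<CsRecordSketch> ordered) {
        long elapsed = ordered.get(ordered.size() - 1).csEnd - ordered.get(0).requestSent;
        return ordered.size() * 1000.0 / elapsed;
    }
}
```

For three requests with timestamps (0, 10, 30), (5, 40, 60) and (20, 65, 100), the first response time is 30 ms, the mean synchronisation delay is (10 + 5) / 2 = 7.5 ms, and the throughput is 3 requests in 100 ms, i.e. 30 per second.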
22 Dealing with Failure
- Distributed failure is dealt with by specifically developed algorithms
- Algon layers (e.g. PerformanceDisplay, OutputDisplay and AlgonNameServer) must not introduce new (unhandled) failure points
23 Dealing with Failure
- PerformanceDisplay and OutputDisplay
  - If Dispatch fails, reports will not be forwarded
  - Scheduler is unaffected because Reporter and UpdateQueue are not affected
  - Dispatch will continue removing reports from queue, but discards them immediately
  - Thus system integrity is not compromised
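One possible reading of the bullets above is that, once forwarding to a display fails, the dispatch side keeps draining the queue and throws the reports away so that the queue cannot grow and the scheduler is never blocked. The single-threaded sketch below illustrates that reading only; `FailSafeDispatchSketch` and its counters are hypothetical, and the display is modelled as a `Consumer<String>` that may throw.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical dispatch step that contains display failures instead of propagating them. */
final class FailSafeDispatchSketch {
    private final Queue<String> updateQueue = new ArrayDeque<>();
    private final Consumer<String> display;
    private boolean displayFailed = false;
    private int forwarded = 0;
    private int discarded = 0;

    FailSafeDispatchSketch(Consumer<String> display) { this.display = display; }

    void enqueue(String report) { updateQueue.add(report); }

    /** Empty the queue; after the first display failure, reports are discarded immediately. */
    void drain() {
        String report;
        while ((report = updateQueue.poll()) != null) {
            if (displayFailed) {
                discarded++;
                continue;
            }
            try {
                display.accept(report);
                forwarded++;
            } catch (RuntimeException e) {
                displayFailed = true; // contain the failure: the scheduler never sees it
                discarded++;
            }
        }
    }

    int forwarded() { return forwarded; }
    int discarded() { return discarded; }
    int queued() { return updateQueue.size(); }
}
```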
24 Dealing with Failure
- AlgonNameServer
  - Failure during setup results in overall system failure
  - Single point of failure is system weakness, but benefits of dynamic design outweigh it
  - Failure after setup does not compromise system integrity
  - Should it be required, it is straightforward to bootstrap name server from any site
  - Redundant name server replication is also feasible
25 Dealing with Failure
- Algon philosophy
  - Keep core framework functionality alive by containing failures; sacrifice non-functional components if need be
  - Controlling application must be kept running if at all possible
26 Configuring Algon
- Configuration file defines:
  - IP address of master site
  - Number of participating nodes
  - Class name of specific algorithm(s) to be used
- Environment variables define:
  - Whether or not to dump debug information to console
  - Whether or not application should hook up to performance measuring components
  - The middleware to use
  - The destination for normal Algon output
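A sketch of how the two configuration sources listed above could be read in Java. The slides do not give the file format or the variable names, so every key here (`master.ip`, `node.count`, `algorithm.class`, `ALGON_DEBUG`, `ALGON_PERFORMANCE`, `ALGON_MIDDLEWARE`, `ALGON_OUTPUT`) and the defaults are invented for illustration; a `java.util.Properties` file is assumed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

/** Hypothetical configuration holder; key names and defaults are assumptions. */
final class AlgonConfigSketch {
    final String masterIp;
    final int nodeCount;
    final String algorithmClass;
    final boolean debug;
    final boolean performance;
    final String middleware;
    final String output;

    /** fileContents is the text of the configuration file; env would be System.getenv() in practice. */
    AlgonConfigSketch(String fileContents, Map<String, String> env) {
        Properties file = new Properties();
        try {
            file.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        masterIp = file.getProperty("master.ip");
        nodeCount = Integer.parseInt(file.getProperty("node.count", "1").trim());
        algorithmClass = file.getProperty("algorithm.class");

        debug = Boolean.parseBoolean(env.getOrDefault("ALGON_DEBUG", "false"));
        performance = Boolean.parseBoolean(env.getOrDefault("ALGON_PERFORMANCE", "false"));
        middleware = env.getOrDefault("ALGON_MIDDLEWARE", "rmi");
        output = env.getOrDefault("ALGON_OUTPUT", "console");
    }
}
```

Keeping the per-deployment facts (master site, node count, algorithm) in the file and the per-run switches (debugging, monitoring, middleware, output) in the environment mirrors the split on the slide.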
27 Application of Algon: Deadlock Detection
- Wait-for graph: edge T -> U is inserted into graph when T is blocked on request for resource that U holds (T waits for U)
- Deadlock detection: if wait-for graph contains cycle (T -> U -> ... -> T), system is deadlocked
- Detection can be performed at each graph insertion, or at timeouts, or at user request
- Classic example: Dining Philosophers
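The wait-for graph test described above can be sketched as a depth-first search for a cycle. The class and method names are hypothetical and the graph is held in one process; this is the centralised textbook check, not the distributed algorithm Algon uses.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical wait-for graph: an edge T -> U means T waits for a resource that U holds. */
final class WaitForGraphSketch {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addEdge(String t, String u) {
        waitsFor.computeIfAbsent(t, key -> new HashSet<>()).add(u);
    }

    /** True if some chain T -> U -> ... -> T exists. */
    boolean hasCycle() {
        Set<String> onPath = new HashSet<>(); // nodes on the current DFS path
        Set<String> done = new HashSet<>();   // nodes fully explored without finding a cycle
        for (String start : waitsFor.keySet()) {
            if (visit(start, onPath, done)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String node, Set<String> onPath, Set<String> done) {
        if (onPath.contains(node)) {
            return true; // reached a node already on the current path: cycle
        }
        if (done.contains(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : waitsFor.getOrDefault(node, Collections.emptySet())) {
            if (visit(next, onPath, done)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

Calling `hasCycle()` after each `addEdge` corresponds to detection at each graph insertion; calling it from a timer or on demand corresponds to the other two options.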
28 Application of Algon: Deadlock Detection
- Many algorithms exist to detect deadlock
- We focus on diffusion computation category (Chandy-Misra-Haas OR request model [Chandy et al 83])
- Other categories are path-pushing and edge-chasing
29 OR Request Model: Diffusion Computation Algorithm
- Initiate diffusion computation for blocked process P_i
  - Send query(i, i, j) to all processes P_j in the dependent set DS_i of P_i
  - num_i(i) = |DS_i|; wait_i(i) = true
- When blocked process P_k receives query(i, j, k)
  - If this is engaging query for process P_k (the first query it receives for initiator P_i)
    - Send query(i, k, m) to all P_m in DS_k
    - num_k(i) = |DS_k|; wait_k(i) = true
  - Else if wait_k(i) then
    - Send reply(i, k, j) to P_j
30 OR Request Model: Diffusion Computation Algorithm
- When process P_k receives reply(i, j, k)
  - If wait_k(i)
    - num_k(i) = num_k(i) - 1
    - If num_k(i) = 0
      - If i = k, declare deadlock
      - Else
        - Send reply(i, k, m) to process P_m which sent engaging query
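The query and reply rules above can be exercised with a small single-process simulation. This is a sketch under several assumptions: messages travel through one FIFO queue instead of a network, the set of blocked processes does not change during detection (so wait_k(i) is simply "P_k has been engaged"), active processes discard queries, and the names `OrModelDeadlockSketch`, `block` and `detect` are invented here.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical simulation of the OR-model diffusion computation for one initiator. */
final class OrModelDeadlockSketch {
    private static final int QUERY = 0;
    private static final int REPLY = 1;

    // Blocked process k -> dependent set DS_k; processes that are absent are active.
    private final Map<Integer, Set<Integer>> dependentSet = new HashMap<>();

    /** Mark process k as blocked until any one of the listed processes answers its request. */
    void block(int k, Integer... waitsOnAnyOf) {
        dependentSet.put(k, new HashSet<>(Arrays.asList(waitsOnAnyOf)));
    }

    /** Run a diffusion computation initiated by P_i; true means P_i is deadlocked. */
    boolean detect(int i) {
        if (!dependentSet.containsKey(i)) {
            return false; // an active process cannot be deadlocked
        }
        Map<Integer, Integer> num = new HashMap<>();     // num_k(i); a key is present iff wait_k(i)
        Map<Integer, Integer> engager = new HashMap<>(); // sender of P_k's engaging query
        Deque<int[]> messages = new ArrayDeque<>();      // each message: {kind, sender j, receiver k}

        num.put(i, dependentSet.get(i).size());
        for (int j : dependentSet.get(i)) {
            messages.add(new int[] {QUERY, i, j});
        }
        while (!messages.isEmpty()) {
            int[] message = messages.poll();
            int j = message[1];
            int k = message[2];
            if (!dependentSet.containsKey(k)) {
                continue; // active processes discard queries
            }
            if (message[0] == QUERY) {
                if (!num.containsKey(k)) { // engaging query for P_k
                    engager.put(k, j);
                    num.put(k, dependentSet.get(k).size());
                    for (int m : dependentSet.get(k)) {
                        messages.add(new int[] {QUERY, k, m});
                    }
                } else { // wait_k(i) already holds: answer at once
                    messages.add(new int[] {REPLY, k, j});
                }
            } else {
                int remaining = num.merge(k, -1, Integer::sum);
                if (remaining == 0) {
                    if (k == i) {
                        return true; // initiator got a reply for every query it sent
                    }
                    messages.add(new int[] {REPLY, k, engager.get(k)});
                }
            }
        }
        return false;
    }
}
```

With P1 waiting on {2}, P2 on {3} and P3 on {1}, `detect(1)` reports deadlock. With P1 waiting on {2, 3}, P2 on {1} and P3 active, it does not, because P3 can still satisfy P1's OR request even though P1 and P2 wait on each other.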
31 Incorporating DD
[Figure: layer stack for one Philosopher (Ph) node]
Application: Ph, with locally-held resources RS1, RS2 and RS3; resource acquisition performed here
Scheduler: DDScheduler
Interface: Interface for deadlock detecting algorithms (DD)
Algorithm: CMO, the Chandy-Misra-Haas OR request model algorithm implementation
Middleware
32 Concerning separation
- Framework maintains references from DDScheduler to each Philosopher
- Breaks separation of concerns adherence
- For DD to work, application-specific information must be accessible to determine current state
33 Resource acquisition
- Dining philosopher model is guaranteed to deadlock because of resource acquisition approach

    public void run() {
        think();
        // every philosopher takes the right resource first, then the left,
        // so a circular wait can form
        right.get(identity);
        left.get(identity);
        eat();
        right.put(identity);
        left.put(identity);
    }
34 Setting Up DD
35 Operation
- Normal operation involves Philosophers acquiring and releasing resources
  - Deadlock detection algorithm does not participate
- When situation demands investigation into possible deadlock:
  - detectDeadlock method called on Scheduler
  - Scheduler on current node builds dependency set using other Schedulers
  - Situation is analysed and diagnosis made by algorithm
36 Testing for Deadlock
37 Operation
- Deadlock detecting algorithms simply report on deadlocks; they do not resolve them
- Deadlock resolution strategies may be employed should system deadlock
- Detection may proceed in parallel with normal application processing
38 Interchangeable Middleware
- Making Algon middleware independent
  - Eliminates reliance of core classes on specific middleware features/behaviour
  - Improves its extensibility
  - Provides insight into impact of middleware on algorithm implementations
  - Enables reimplementation in different languages for different environments
39 Challenges
- Implications of using Java RMI as middleware layer
  - Syntactic rules
    - Stub classes must extend UnicastRemoteObject
    - Remote objects must nominally implement Remote interface
    - All methods intended for remote invocation must declare that they throw RemoteException
  - Semantic rules
    - All non-remote parameters are passed via deep copy (serialisation)
    - Failure must be handled in application logic
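The rules above can be seen in a few lines of standard Java. The two RMI types below (`RemoteCounterSketch`, `RemoteCounterImplSketch`) are hypothetical and only show the syntactic rules. The `deepCopy` helper imitates the semantic rule without any network by pushing an object through Java serialisation, which is how RMI marshals non-remote arguments: the receiver works on a copy, so its changes never reach the caller's object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Syntactic rule: the remote interface extends Remote and every method declares RemoteException. */
interface RemoteCounterSketch extends Remote {
    int increment() throws RemoteException;
}

/** Syntactic rule: the implementation extends UnicastRemoteObject, whose constructor exports it. */
final class RemoteCounterImplSketch extends UnicastRemoteObject implements RemoteCounterSketch {
    private static final long serialVersionUID = 1L;
    private int value;

    RemoteCounterImplSketch() throws RemoteException {
        super();
    }

    @Override
    public synchronized int increment() throws RemoteException {
        return ++value;
    }
}

/** Semantic rule, shown locally: a serialised argument arrives as an independent deep copy. */
final class DeepCopySketch {
    static <T extends Serializable> T deepCopy(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An algorithm written for local calls, where callee and caller share the argument object, therefore behaves differently once the same call crosses RMI.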
40 Example: Refactoring Mutual Exclusion
- MENT interface

    import java.io.Serializable;

    public interface Ment extends Serializable {
        void sendRequests(SchedulerInterface si);
        void reply();
        void request(long time, SchedulerInterface si, Ment m);
        void getRequestSet();
        void sendRelease();
        void release();
    }
41 Example: Refactoring Mutual Exclusion
42 Example: Refactoring Mutual Exclusion
- Middleware dependence has shifted to lowest possible level
- Flaws in this solution
  - RicartAgrawalaRmi cannot extend both RicartAgrawala and UnicastRemoteObject
  - RicartAgrawala methods must still throw RemoteException
43 Example: Refactoring Mutual Exclusion
44 Example: Refactoring Mutual Exclusion
- Solution to inheritance problem
  - Use delegation, whereby place-holder RmiMentAlgorithmImpl is middleware stub
  - Place-holder forwards algorithm calls to RicartAgrawala
- Solution to exception problem
  - Impossible to remove all exceptions, so Ment interface methods throw Exception, the superclass of all checked exception classes
  - Thus any exceptions thrown by Java-based middleware methods are catered for
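A reduced sketch of the delegation idea. `MentSketch` is a two-method stand-in for the Ment interface, `RicartAgrawalaCoreSketch` contains bookkeeping only (no protocol logic), and `RmiMentPlaceholderSketch` plays the role the slides give to RmiMentAlgorithmImpl; all of these shapes are assumptions. The in-process `LocalMentPlaceholderSketch` is added here only to show the same algorithm object behind a second, non-RMI place-holder.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical cut-down Ment: methods declare Exception, so no middleware type is named. */
interface MentSketch {
    void request(long time) throws Exception;
    int pendingRequests() throws Exception;
}

/** Middleware-free algorithm class: extends nothing and throws nothing middleware-specific. */
final class RicartAgrawalaCoreSketch implements MentSketch {
    private int pending;

    @Override public void request(long time) { pending++; }
    @Override public int pendingRequests() { return pending; }
}

/** RMI accepts this because Exception is a superclass of RemoteException. */
interface RemoteMentSketch extends MentSketch, Remote { }

/** RMI place-holder: uses its single superclass slot for RMI and delegates the real work. */
final class RmiMentPlaceholderSketch extends UnicastRemoteObject implements RemoteMentSketch {
    private static final long serialVersionUID = 1L;
    private final transient MentSketch target;

    RmiMentPlaceholderSketch(MentSketch target) throws RemoteException {
        super();
        this.target = target;
    }

    @Override public void request(long time) throws Exception { target.request(time); }
    @Override public int pendingRequests() throws Exception { return target.pendingRequests(); }
}

/** In-process place-holder for a hypothetical second "middleware": same forwarding, no RMI. */
final class LocalMentPlaceholderSketch implements MentSketch {
    private final MentSketch target;

    LocalMentPlaceholderSketch(MentSketch target) { this.target = target; }

    @Override public void request(long time) throws Exception { target.request(time); }
    @Override public int pendingRequests() throws Exception { return target.pendingRequests(); }
}
```

Since the algorithm class no longer inherits from or mentions any RMI type, swapping the place-holder is enough to move it onto different middleware.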
45 Tying Up Loose Ends
- Middleware interface added to framework to abstractly cater for algorithm discovery, access and manipulation
- MiddlewareException class added to uniformly signal middleware-related failures
- Toolset added to generate middleware-dependent place-holders and stubs from algorithm interfaces
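The slides name a Middleware interface and a MiddlewareException class but do not show their signatures, so the sketch below is a guess at the general shape rather than the Algon API. `MiddlewareSketch`, its `publish` and `lookup` methods, and the in-memory implementation are hypothetical; the `MiddlewareException` defined here is a stand-in that merely shares the name.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the uniform failure type: wraps whatever the concrete middleware threw. */
class MiddlewareException extends Exception {
    private static final long serialVersionUID = 1L;

    MiddlewareException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical middleware abstraction covering algorithm discovery and access. */
interface MiddlewareSketch {
    void publish(String name, Object algorithm) throws MiddlewareException;
    Object lookup(String name) throws MiddlewareException;
}

/** Trivial in-process implementation; an RMI version would wrap its own exceptions the same way. */
final class InMemoryMiddlewareSketch implements MiddlewareSketch {
    private final Map<String, Object> published = new HashMap<>();

    @Override
    public void publish(String name, Object algorithm) throws MiddlewareException {
        if (published.containsKey(name)) {
            throw new MiddlewareException("name already in use: " + name, null);
        }
        published.put(name, algorithm);
    }

    @Override
    public Object lookup(String name) throws MiddlewareException {
        Object algorithm = published.get(name);
        if (algorithm == null) {
            throw new MiddlewareException("no algorithm published as " + name, null);
        }
        return algorithm;
    }
}
```

Framework code written against such an interface catches one exception type regardless of which middleware is plugged in.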
46 Related Work
- Classifying Algon
  - Similar to reflective systems [Maes 87], but not equivalent
  - Applies separation of concerns techniques
  - Specialised programmer tool
- Algon classified: tool that applies separation of concerns technique to algorithmic concerns
47 Related Work
- Technique research [Guerraoui et al 97, Kiczales et al 2001]
- Technique application
  - Real-time constraints [Aksit et al 94]
  - Distribution and replication [Guerraoui et al 97]
  - Exception handling [Dellarocas 97]
  - Location control [Okamura & Ishikawa 94]
  - Synchronisation [Lu et al 2001]
48 Related Work
- Approaches to separation of concerns
  - Identify concerns and specify in separate objects
    - Compile-time proxies [Renaud 2001]
    - Runtime reflection [Welch & Stroud 2001]
  - Treat concern as orthogonal, thus someone else's problem [Prentzeis 2000]
  - Use (3rd party) library that encapsulates complexity [Guerraoui et al 97]
49 Related Work
- Algon matches third approach best
  - Adds tailoring options and additional level of choice
- Garf [Guerraoui et al 97] also uses this approach to address distribution issues
  - Implemented in Smalltalk
  - Does not provide alternative algorithms for same behaviour
50 Future Work
- Further exploration and incorporation of representative implementations of agreement protocols, resource management techniques and failure recovery techniques
- Incorporating other middleware products
- Extend performance evaluation
  - New algorithm classes
  - Measuring middleware impact
- Translating Algon to C
51 References
- Aksit, M., Bosch, J., Van der Sterren, W. and Bergmans, L., Real-time specification inheritance anomalies and real-time filters, in Tokoro, M. and Pareschi, R. (eds), Object Oriented Programming, Proceedings of the 8th European Conference, ECOOP 94, Bologna, Italy, Lecture Notes in Computer Science 821, Springer, 1994, pp. 386-407.
- Bishop, J.M., Renaud, K.V. and Worrall, B., Composition of distributed software with Algon: concepts and possibilities, in Elsevier ENCS 65, ETAPS SC 2002, Grenoble, France, 2002.
- Dellarocas, C., Toward exception handling infrastructures for component-based systems, in CBSE at SIGMETRICS, ACM, Kyoto, Japan, 1998, pp. 141-150.
52 References
- Guerraoui, R., Garbinato, B. and Mazouni, K.R., Garf: a tool for programming reliable distributed applications, IEEE Concurrency 5 (1997), pp. 32-39.
- Kiczales, G., Hilsdale, E., Hugunin, J., Kersten, M., Palm, J. and Griswold, W., An overview of AspectJ, in ECOOP, Budapest, Hungary, 2001, pp. 327-353.
- Lu, J., Zhang, M., Xu, M. and Yang, D., A two-layered class approach for the reuse of synchronization code, Information and Software Technology 43 (2001), pp. 287-294.
- Maes, P., Concepts and experiments in computational reflection, in OOPSLA, ACM, Orlando, Florida, 1987, pp. 147-155.
53 References
- Okamura, H. and Ishikawa, Y., Object location control using meta-level programming, in Tokoro, M. and Pareschi, R. (eds), Object Oriented Programming, Proceedings of the 8th European Conference, ECOOP 94, Bologna, Italy, Lecture Notes in Computer Science 821, Springer, 1994, pp. 299-319.
- Prentzeis, T., Management of long-running high-performance persistent object stores, Ph.D. thesis, Department of Computing Science, University of Glasgow (2000).
- Renaud, K., Experience with statically generated proxies for facilitating Java runtime specialisation, IEE Proceedings Software 149 (2001), pp. 169-178.
54 References
- Renaud, K., Lo, J., Bishop, J., Van Zyl, P. and Worrall, B., Algon: a framework for supporting comparison of distributed algorithm performance, in 11th Euromicro PNDP03, Genoa, Italy, 2003, pp. 425-432.
- Welch, I. and Stroud, R.J., Kava: using byte code rewriting to add behavioural reflection to Java, in COOTS 01, San Antonio, Texas, USA, 2001, pp. 119-130.