Department of Computer Science - PowerPoint PPT Presentation

About This Presentation

Title:

Department of Computer Science

Description:

'Reverse lookup' for masking (state-less) servers failures. Towards highly available servers ... A 'reverse' lookup returns the name of a given wire connection ... – PowerPoint PPT presentation

Slides: 21

Provided by: zmke7

Transcript and Presenter's Notes

1
Filterfresh: Hot Replication of Java RMI Server Objects
Arash Baratloo, P. Emerald Chung, Yennun Huang, Sampath Rangarajan, and Shalini Yajnik

Department of Computer Science
Courant Institute of Mathematical Sciences
New York University

Bell Laboratories, Lucent Technologies
2
Filterfresh Goals

Support highly available RMI services in the presence of failures
Handle crash failures
Transparent failure masking
Easily integrate into Java RMI

3
Roadmap

Goals
RMI Registry architecture and crash failures
RMI architecture and crash failures
Process group approach to fault tolerance
Highly available registry service
Reverse lookup for masking (state-less) server failures
Towards highly available servers
Conclusions

4
RMI in a nutshell

Step 1: A server object registers with the RMI registry running on the local host
Steps 2-3: Clients get the server's remote reference by performing a lookup operation at a known registry
Step 4: Given a remote reference, clients invoke the server's methods through RMI

5
Limitations of RMI Registry

Single point of failure
Clients need to know a priori which registry to
contact
Does not allow multiple RMI servers to register
under the same service name
Not suited for replicated, highly available RMI server objects

6
Desirable properties of RMI Registry

Distributed to remove the single point of failure
Ability to dynamically add registries, and to
detect and remove failed processes
Highly available
Replicated to remove the a priori requirement
Replication strategy to maintain a consistent
global state
Support for multiple RMI servers to register
under the same service name
Thus, to provide high availability for RMI server objects, we need a highly available registry service!

7
RMI Architecture

The programmer writes the client and server application code
The RMI compiler (rmic) generates the client stub and server skeleton
The RMI package implements the RRL (remote reference layer) and transport layers
Transparent masking of failures must occur below the stub/skeleton level

8
A unified solution

Fault-tolerance based on process group approach
Non-faulty processes form a logical group
Members interact using a set of group primitives
Group primitives are guaranteed to be reliable --
all or nothing
Group primitives are guaranteed to be ordered
Group members have a consistent view of other
group members
Applications built on process groups view events
in a synchronous fashion
The group view changes for all members as though
it is instantaneous -- synchronous
Events (e.g., send and receive of multicasts) occur in a logical order, within the same view
Members have the same view of the group

9
Strong Virtual Synchrony

Progress: a joining process will eventually become part of the group view (or be suspected of failure)
Failure detection: a crashed process will eventually be detected and removed from the group view
Reliability: messages sent by a member that remains in the group view will be delivered by the others
Order: messages will be delivered by the others in the view in which they were sent
Consistency: all surviving members of a view agree on the set of messages delivered within that view
Synchrony: between two consecutive views, no message is delivered
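
The properties above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the ViewGroup class and its method names are hypothetical, crashes are announced explicitly rather than detected, and a view change is modeled as an atomic step so that no message can fall between two views. Each delivered message is tagged with the view in which it was sent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of view-synchronous delivery; not Filterfresh code.
class ViewGroup {
    private int viewId = 0;
    // member name -> messages that member has delivered, each tagged "viewId:message"
    private final Map<String, List<String>> logs = new LinkedHashMap<>();

    public static void main(String[] args) {
        ViewGroup group = new ViewGroup();
        group.join("a");
        group.join("b");
        group.multicast("a", "hello");
        group.crash("b");
        System.out.println(group.members() + " in view " + group.viewId());
    }

    int viewId() { return viewId; }

    List<String> members() { return new ArrayList<>(logs.keySet()); }

    // A join installs a new view for all members at once.
    void join(String member) {
        if (logs.containsKey(member)) throw new IllegalStateException("already a member");
        logs.put(member, new ArrayList<>());
        viewId++;
    }

    // Assumption: the crash is reported to the group; real systems rely on timeouts.
    void crash(String member) {
        if (logs.remove(member) != null) viewId++;
    }

    // Every current member delivers the message in the view in which it was sent.
    void multicast(String sender, String message) {
        if (!logs.containsKey(sender)) throw new IllegalStateException("sender not in view");
        for (List<String> log : logs.values()) log.add(viewId + ":" + message);
    }

    List<String> delivered(String member) {
        List<String> copy = new ArrayList<>();
        List<String> log = logs.get(member);
        if (log != null) copy.addAll(log);
        return copy;
    }
}
```

Because membership changes and multicasts are serialized through one object, surviving members always hold identical logs; a real implementation has to reach the same outcome with a distributed protocol.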

10
Fortunately

Process group approach is
Well studied
Well defined protocols
Process group approach has been used in building general-purpose fault-tolerant:
Middleware systems, such as Horus/Ensemble, Transis, etc.
Services, such as FT directory and file servers
OO systems, such as ISIS+ORBIX, Electra, Orca
Java middleware systems, such as iBus
Seems a good candidate for FT RMI services

11
Unfortunately

Process Group Membership is
As hard as distributed consensus
Impossible in purely asynchronous systems with
crash failures
Our implementation
Based on the timeout assumption
Correctness is guaranteed once the protocol terminates
Ack-based protocol for simplicity

12
Basis for process groups

A GroupManager Class
100% Pure Java
Built on top of UDP/IP
Implements
Group creation
Join operation (with atomic state transfer)
Leave operation
Group multicast operation
Failure detection and recovery
All events are reliable and totally ordered

13
Performance of group multicast

Pentium Pro 200, Linux 2.0.30, Fast Ethernet connected by a hub
JDK1.1.1
Did threads and object serialization influence the performance?

14
Roadmap

Goals
RMI Registry architecture and crash failures
RMI architecture and crash failures
Process group approach to fault tolerance
Highly available registry service
Reverse lookup for masking (state-less) server failures
Towards highly available servers
Conclusions

15
FT Registry architecture

Embedded a GroupManager class to ensure reliable
ordered events
Reliable and ordered group operations ensure
consistent state
Replicated registry service for high availability
Supports dynamic joins w/state transfer
Detects and removes failed registry servers

16
Bind operation

Bind operations are sent to every replica
Reliable multicast ensures every replica receives
the event
Ordered group operation ensures consistency even
if a new replica joins

17
Lookup operation

Lookup operations are handled locally
Provides location transparency to clients
Clients are able to locate servers registered at unknown hosts
No need to have a priori knowledge of the server's host
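
The bind and lookup behavior described on the last two slides can be sketched as follows. This is a minimal in-memory model, not the Filterfresh implementation: RegistryGroup and RegistryReplica are hypothetical names, a plain loop stands in for the reliable ordered multicast, and server addresses are plain strings rather than remote references.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a replicated registry; a loop replaces reliable ordered multicast.
class RegistryGroup {
    private final List<RegistryReplica> replicas = new ArrayList<>();

    public static void main(String[] args) {
        RegistryGroup group = new RegistryGroup();
        RegistryReplica first = group.join();
        group.bind("calc", "hostA:1099");
        RegistryReplica second = group.join();
        System.out.println(first.lookup("calc") + " " + second.lookup("calc"));
    }

    // A joining replica receives the current table before it sees new binds (state transfer).
    RegistryReplica join() {
        RegistryReplica fresh = new RegistryReplica();
        if (!replicas.isEmpty()) fresh.install(replicas.get(0).snapshot());
        replicas.add(fresh);
        return fresh;
    }

    // Bind events go to every replica, in the same order everywhere.
    void bind(String name, String serverAddress) {
        for (RegistryReplica replica : replicas) replica.applyBind(name, serverAddress);
    }
}

class RegistryReplica {
    // service name -> every server address bound under that name
    private final Map<String, List<String>> table = new HashMap<>();

    void applyBind(String name, String serverAddress) {
        table.computeIfAbsent(name, k -> new ArrayList<>()).add(serverAddress);
    }

    // Lookups read only the local table; no group communication is needed.
    List<String> lookup(String name) {
        List<String> result = new ArrayList<>();
        List<String> found = table.get(name);
        if (found != null) result.addAll(found);
        return result;
    }

    Map<String, List<String>> snapshot() {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : table.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return copy;
    }

    void install(Map<String, List<String>> state) {
        table.clear();
        table.putAll(state);
    }
}
```

A client can query whichever replica it reaches and see the same bindings, including several servers registered under one service name.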

18
Performance of FT Registry

Pentium Pro 200, Linux 2.0.30, Fast Ethernet connected by a hub
JDK1.1.1

19
Roadmap

Goals
RMI Registry architecture and crash failures
RMI architecture and crash failures
Process group approach to fault tolerance
Highly available registry service
Reverse lookup for masking (state-less) server failures
Towards highly available servers
Conclusions

20
RMI FT Registry

Supports multiple replicated servers to register
under the same service name
Object references remain valid after the
associated object has failed

21
In the event of server failure

The failure is detected below the stub level, and
...

22
Failure recovery for state-less servers

A reverse lookup returns the name of a given
wire connection
The old connection is patched with a connection
to a non-faulty server
The operation is re-attempted
Transparent to the client: the illusion of a valid object reference
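
The recovery steps above can be sketched in a few lines. This is an illustrative model under stated assumptions, not the Filterfresh code: FailoverReference and its methods are hypothetical, servers are identified by plain strings, and a crash is flagged explicitly instead of being detected below the stub level.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical single-process model of reverse lookup and connection patching.
class FailoverReference {
    private final Map<String, List<String>> nameToServers = new HashMap<>(); // forward table
    private final Map<String, String> serverToName = new HashMap<>();        // reverse table
    private final Set<String> crashed = new HashSet<>();
    private String connection; // stands in for the wire connection behind the client's reference

    public static void main(String[] args) {
        FailoverReference ref = new FailoverReference();
        ref.bind("echo", "s1");
        ref.bind("echo", "s2");
        ref.connect("echo");
        ref.crash("s1");
        System.out.println(ref.invoke("ping"));
    }

    void bind(String name, String server) {
        nameToServers.computeIfAbsent(name, k -> new ArrayList<>()).add(server);
        serverToName.put(server, name);
    }

    // Assumption: failures are reported here; the real system detects them itself.
    void crash(String server) { crashed.add(server); }

    void connect(String name) { connection = firstLive(name); }

    // Reverse lookup: from a connection endpoint back to the service name.
    String reverseLookup(String server) { return serverToName.get(server); }

    String invoke(String request) {
        if (connection == null) throw new IllegalStateException("not connected");
        if (crashed.contains(connection)) {
            String name = reverseLookup(connection); // which service was this connection for?
            connection = firstLive(name);            // patch in a non-faulty server
        }
        return connection + " handled " + request;   // the operation is re-attempted
    }

    private String firstLive(String name) {
        List<String> servers = nameToServers.get(name);
        if (servers != null) {
            for (String s : servers) {
                if (!crashed.contains(s)) return s;
            }
        }
        throw new IllegalStateException("no live server bound under " + name);
    }
}
```

Blind retry of this kind is safe only because the servers are state-less; a retried request on a stateful server could be applied twice.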

23
FT server Architecture

Client has the illusion of a single server
In reality, a group of servers processes clients' requests
Operations are performed at each server, in the
same order for consistency
Replicated servers for high availability

24
Highly available server objects

GroupManager ensures reliable ordering of events
across all servers
Guarantee consistent server state
Automatic detection and removal of failed server
objects
State transfer provides the ability to dynamically add new server objects
In combination with FT Registry and reverse
lookup, clients have the illusion of a single
reliable server object
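
To see why ordered delivery is enough to keep replicas consistent, consider the sketch below. It is a hypothetical model, not the Filterfresh API: ServerGroup and ServerReplica are invented names, the server state is a single long, and a loop replaces the GroupManager multicast. The operations are deliberately non-commutative, so applying them in different orders at different replicas would produce different states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongUnaryOperator;

// Hypothetical model of replicated servers fed by one totally ordered stream of operations.
class ServerGroup {
    private final List<ServerReplica> replicas = new ArrayList<>();

    public static void main(String[] args) {
        ServerGroup group = new ServerGroup();
        ServerReplica a = group.join();
        ServerReplica b = group.join();
        group.invoke(x -> x + 5);
        group.invoke(x -> x * 3);
        System.out.println(a.state + " " + b.state);
    }

    // State transfer: a new replica starts from an existing member's state.
    ServerReplica join() {
        ServerReplica fresh = new ServerReplica();
        if (!replicas.isEmpty()) fresh.state = replicas.get(0).state;
        replicas.add(fresh);
        return fresh;
    }

    // Removal of a failed replica; detection itself is outside this sketch.
    void remove(ServerReplica failed) { replicas.remove(failed); }

    // Every replica applies the operation; the client sees a single result.
    long invoke(LongUnaryOperator operation) {
        if (replicas.isEmpty()) throw new IllegalStateException("no live replica");
        for (ServerReplica replica : replicas) replica.apply(operation);
        return replicas.get(0).state;
    }
}

class ServerReplica {
    long state = 0;

    void apply(LongUnaryOperator operation) { state = operation.applyAsLong(state); }
}
```

The model assumes deterministic operations; anything that depends on local time or randomness would break the guarantee even with perfect ordering.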

25
Conclusions and future work