Herald: Achieving a Global Event Notification Service

Slides: 29

Provided by: marvint

1
Herald: Achieving a Global Event Notification
Service

Luis Felipe Cabrera, Michael B. Jones, Marvin
Theimer
Microsoft Research

2
Global Event Notification Services

Communication via event notification (also called
publish/subscribe) is well-suited for
loosely-coupled eCommerce applications, as well
as Internet-scale distributed applications (e.g.
instant messaging and multi-player games).
General event notification systems currently
scale to tens of thousands of clients and
do not have global reach.

3
Internet-scale Issues

Scaling requirements are in the millions and
billions, perhaps more.
There will (probably) not be a single
organization that owns the entire event
notification infrastructure. Hence a federated
design is required.
Global reach implies that failures and network
partitions will be common-place.

4
Focus on the Basic Distributed Systems Primitives

Focus on the scalability of basic message
delivery and distributed state management
capabilities.
Employ a very simple message-oriented design and
assume until proven otherwise that richer
event notification semantics can be layered on
top.

5
Herald Event Notification Model

1 Create Rendezvous Point
2 Subscribe
3 Publish
4 Notify
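The four steps of the model can be sketched in a few lines of Python. This is a minimal single-process illustration, not Herald's actual interface: the class name `HeraldSketch`, its method names, and the callback signature are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set, Tuple

# Hypothetical callback type: (rendezvous point name, event payload).
Callback = Callable[[str, str], None]


class HeraldSketch:
    """Toy in-memory stand-in for the four-step model (hypothetical API)."""

    def __init__(self) -> None:
        self._known: Set[str] = set()
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def create_rendezvous_point(self, name: str) -> None:
        # Step 1: a creator registers a named rendezvous point.
        self._known.add(name)

    def subscribe(self, name: str, callback: Callback) -> None:
        # Step 2: a subscriber registers interest in an existing rendezvous point.
        if name not in self._known:
            raise KeyError(f"unknown rendezvous point: {name}")
        self._subscribers[name].append(callback)

    def publish(self, name: str, event: str) -> int:
        # Step 3: a publisher sends an event to the rendezvous point.
        if name not in self._known:
            raise KeyError(f"unknown rendezvous point: {name}")
        # Step 4: the rendezvous point notifies every current subscriber.
        callbacks = list(self._subscribers[name])
        for callback in callbacks:
            callback(name, event)
        return len(callbacks)


def demo() -> List[Tuple[str, str]]:
    received: List[Tuple[str, str]] = []
    herald = HeraldSketch()
    herald.create_rendezvous_point("RP1")
    herald.subscribe("RP1", lambda rp, ev: received.append((rp, ev)))
    herald.publish("RP1", "price changed")
    return received
```

Publishers and subscribers never refer to each other directly; they share only the rendezvous point name, which is what keeps the two sides loosely coupled.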
6
Design Criteria

The usual criteria
Scalability
Resilience
Self-administration
Timeliness
Additional criteria
Heterogeneous federation
Security
Support for disconnection
Partitioned operation

7
Scalability

10^11 Rendezvous Points (RPs)
10^11 publishers & subscribers in aggregate
10^10 publishers & subscribers per RP
10^10 federation members
10^2 events/sec/RP

8
Resilience

Fail last, fail least semantics.
Correct operation in the presence of
malicious/corrupt participants.

9
Self-administration

System decides where to place state and how to
propagate information about state changes.
System dynamically adapts to changing loads and
the presence of faults and network partitions.
No manual tuning.

10
Timeliness

Event notification should normally take seconds,
not hours.

11
Heterogeneous Federation

Federation of machines within cooperating but
mutually suspicious domains of trust.
Federated parties may include both small and
large domains.

12
Security

Support restricted access to Herald facilities.
Support concepts such as groups and roles.

13
Support for Disconnection

Eventual delivery to disconnected subscribers.
Event histories to allow a posteriori examination
of the past.
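One way to picture an event history is as a bounded log that a reconnecting subscriber can replay from the last event it saw. The sketch below is illustrative only: the `EventHistory` class, the use of sequence numbers, and the bound by event count (rather than by age) are assumptions, not details given in the slides.

```python
from collections import deque
from typing import Deque, List, Tuple


class EventHistory:
    """Bounded per-rendezvous-point log (hypothetical design)."""

    def __init__(self, max_events: int) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._events: Deque[Tuple[int, str]] = deque(maxlen=max_events)
        self._next_seq = 0

    def append(self, event: str) -> int:
        # Record an event and return the sequence number assigned to it.
        seq = self._next_seq
        self._events.append((seq, event))
        self._next_seq += 1
        return seq

    def since(self, last_seen_seq: int) -> List[Tuple[int, str]]:
        # Events a subscriber missed while disconnected, oldest first.
        return [(seq, ev) for seq, ev in self._events if seq > last_seen_seq]


def history_demo() -> List[Tuple[int, str]]:
    history = EventHistory(max_events=3)
    for event in ["a", "b", "c", "d"]:
        history.append(event)
    # A subscriber that last saw sequence number 1 reconnects.
    return history.since(1)
```

Because the log is bounded, a subscriber that stays away too long can only recover the events still retained; anything older has already been dropped.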

14
Partitioned Operation

Continued operation on both sides of a network
partition.
Eventual (out-of-order) delivery after partition
healing.
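A rough sketch of partition healing follows, assuming each side keeps a log keyed by a globally unique event id. The `(origin, sequence number)` id scheme and the `heal` function are hypothetical; the slides do not say how Herald identifies or reconciles events.

```python
from typing import Dict, List, Tuple

# Hypothetical event id: (name of the side that accepted the event, local sequence number).
EventId = Tuple[str, int]


def heal(
    side_a: Dict[EventId, str], side_b: Dict[EventId, str]
) -> Tuple[List[EventId], List[EventId]]:
    """Exchange missing events between two logs after a partition heals.

    Returns (ids newly delivered to side A, ids newly delivered to side B).
    Both logs are updated in place and end up with the same contents.
    """
    missing_on_a = sorted(eid for eid in side_b if eid not in side_a)
    missing_on_b = sorted(eid for eid in side_a if eid not in side_b)
    for eid in missing_on_a:
        side_a[eid] = side_b[eid]
    for eid in missing_on_b:
        side_b[eid] = side_a[eid]
    return missing_on_a, missing_on_b


def heal_demo() -> Tuple[List[EventId], List[EventId]]:
    # ("A", 0) was published before the partition, so both sides already have it.
    side_a = {("A", 0): "x", ("A", 1): "y"}
    side_b = {("A", 0): "x", ("B", 0): "z"}
    return heal(side_a, side_b)
```

Events exchanged this way reach the other side after events it has already delivered locally, which is why delivery after healing is eventual and out of order unless an ordering layer is added on top.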

15
Non-Goals

What's the best way to do:
Naming
Filtering
Complex subscription queries
In-order delivery (except as layered on top)

16
Applying Lessons of the Internet and Web

Assume things are broken
Mutual suspicion and no dependence on correct
behavior by others.
Don't try to fix everything
All distributed state is maintained in a
weakly-consistent soft-state manner and is aged.
All distributed state is incomplete and may be
inaccurate.
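Aged soft state can be illustrated with a table whose entries disappear unless they are refreshed. This is a minimal sketch under assumed names (`SoftStateTable`, `refresh`, `lookup`) and an assumed fixed time-to-live; the clock is passed in explicitly so the example is deterministic.

```python
from typing import Dict, Optional, Tuple


class SoftStateTable:
    """Entries expire unless refreshed within `ttl` time units (hypothetical design)."""

    def __init__(self, ttl: float) -> None:
        self._ttl = ttl
        # key -> (value, time at which the entry expires)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def refresh(self, key: str, value: str, now: float) -> None:
        # Installing and renewing are the same operation: restart the entry's timer.
        self._entries[key] = (value, now + self._ttl)

    def lookup(self, key: str, now: float) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # Aged out: discard the entry rather than trust stale state.
            del self._entries[key]
            return None
        return value


def soft_state_demo() -> Tuple[Optional[str], Optional[str]]:
    table = SoftStateTable(ttl=30)
    table.refresh("Sub1", "host-a", now=0)
    # Still present at time 10, gone at time 30 because nobody refreshed it.
    return table.lookup("Sub1", now=10), table.lookup("Sub1", now=30)
```

No explicit delete message is needed: a participant that crashes or is cut off simply stops refreshing, and its state ages out on its own.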

17
Design Overview

We think we only need these mechanisms:
Replication.
Overlay distribution networks.
Time contracts.
Event histories.
Administrative rendezvous points.

18
Replication
19
Overlay Distribution Networks
[Figure label: Herald@L1]

20
Time Contracts

[Figure: RP1 with labeled values Creator 60, Pub1 10, Sub1 30]
21
Event Histories

[Figure: RP1 with labeled values Creator 60, Pub1 10, Sub1 30, History 50]
22
Administrative Rendezvous Points
1. Subscribe RP1_at_
2. Notify(change)
23
Engineering Research Issues

Baseline scalability numbers
Dynamic system reconfiguration
Federation and security

24
Baseline Scalability Numbers

How scalable are single-node servers and server
clusters?
What are multicast-style delivery systems
actually capable of, especially in aggregate?

25
Dynamic System Reconfiguration

Reconfiguring distributed RP state in response to
aggregate workloads and global state changes.
Dealing with flash crowd loads.
Placement of RP state to minimize the effects of
network partitions and disconnection.
Placement of RP state to enable efficient
implementations of higher-level pub/sub semantics.

26
Federation and Security

Can we define simple, open protocols?
Will we need heavy-weight mechanisms to deal with
malicious/corrupt servers?
How should anonymity and privacy be dealt
with/supported?

27
Related Work

Non-global event notification systems (Gryphon,
Ready, Siena, ...)
Netnews
P2P systems such as Gnutella and Farsite
Overlay multicast networks
CDNs
OceanStore

28
Conclusion

Global event notification is emerging as a key
Internet technology.
Herald is exploring scalability of the basic
message and distributed state management aspects
of an event notification system.
Gain engineering experience with scalable pub/sub
systems.
Explore dynamic system reconfiguration.
Understand the implications of federation and
security.