Herald: Achieving a Global Event Notification Service

Provided by: marvint
Transcript and Presenter's Notes


1
Herald: Achieving a Global Event Notification
Service
  • Luis Felipe Cabrera, Michael B. Jones, Marvin
    Theimer
  • Microsoft Research

2
Global Event Notification Services
  • Communication via event notification (also called
    publish/subscribe) is well-suited for
    loosely-coupled eCommerce applications, as well
    as Internet-scale distributed applications (e.g.
    instant messaging and multi-player games).
  • General event notification systems currently:
      • scale to tens of thousands of clients,
      • do not have global reach.

3
Internet-scale Issues
  • Scaling requirements are millions and billions,
    perhaps more.
  • There will (probably) not be a single
    organization that owns the entire event
    notification infrastructure. Hence a federated
    design is required.
  • Global reach implies that failures and network
    partitions will be commonplace.

4
Focus on the Basic Distributed Systems Primitives
  • Focus on the scalability of basic message
    delivery and distributed state management
    capabilities.
  • Employ a very simple message-oriented design and
    assume, until proven otherwise, that richer
    event notification semantics can be layered on
    top.

5
Herald Event Notification Model
1. Create Rendezvous Point
2. Subscribe
3. Publish
4. Notify
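The four steps of the model can be sketched in a few lines of Python. This is only a single-process illustration, not Herald's interface: the class name `HeraldSketch`, its method names, and the callback signature are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Assumed callback shape: (rendezvous point name, event payload).
Callback = Callable[[str, str], None]


class HeraldSketch:
    """Toy stand-in for the four-step model (hypothetical API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = {}

    def create_rendezvous_point(self, name: str) -> None:
        # Step 1: a creator establishes a named rendezvous point.
        self._subscribers.setdefault(name, [])

    def subscribe(self, name: str, callback: Callback) -> None:
        # Step 2: a subscriber registers interest at the rendezvous point.
        self._subscribers[name].append(callback)

    def publish(self, name: str, event: str) -> int:
        # Step 3: a publisher sends an event to the rendezvous point.
        # Step 4: the rendezvous point notifies every current subscriber.
        for callback in self._subscribers[name]:
            callback(name, event)
        return len(self._subscribers[name])


received = []
herald = HeraldSketch()
herald.create_rendezvous_point("rp1")
herald.subscribe("rp1", lambda rp, event: received.append((rp, event)))
delivered = herald.publish("rp1", "price-changed")
```

Publishers and subscribers never refer to each other here; they share only the rendezvous point name, which is what keeps the two sides loosely coupled.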
6
Design Criteria
  • The usual criteria:
      • Scalability
      • Resilience
      • Self-administration
      • Timeliness
  • Additional criteria:
      • Heterogeneous federation
      • Security
      • Support for disconnection
      • Partitioned operation

7
Scalability
  • 10^11 Rendezvous Points (RPs)
  • 10^11 publishers and subscribers in aggregate
  • 10^10 publishers and subscribers per RP
  • 10^10 federation members
  • 10^2 events/sec/RP

8
Resilience
  • Fail last, fail least semantics.
  • Correct operation in the presence of
    malicious/corrupt participants.

9
Self-administration
  • System decides where to place state and how to
    propagate information about state changes.
  • System dynamically adapts to changing loads and
    the presence of faults and network partitions.
  • No manual tuning.

10
Timeliness
  • Event notification should normally take seconds,
    not hours.

11
Heterogeneous Federation
  • Federation of machines within cooperating but
    mutually suspicious domains of trust.
  • Federated parties may include both small and
    large domains.

12
Security
  • Support restricted access to Herald facilities.
  • Support concepts such as groups and roles.

13
Support for Disconnection
  • Eventual delivery to disconnected subscribers.
  • Event histories to allow a posteriori examination
    of the past.

14
Partitioned Operation
  • Continued operation on both sides of a network
    partition.
  • Eventual (out-of-order) delivery after partition
    healing.
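One way to picture partitioned operation is with two replicas of a rendezvous point that keep accepting events independently and exchange what they missed once the partition heals. The sketch below assumes every event carries a globally unique identifier so duplicates can be dropped; the `Event` and `RendezvousReplica` names are hypothetical and the slides do not say how Herald actually reconciles replicas.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    event_id: str  # assumed globally unique, e.g. "<publisher>:<sequence>"
    payload: str


class RendezvousReplica:
    """One side of a partitioned rendezvous point (hypothetical)."""

    def __init__(self) -> None:
        self._events: Dict[str, Event] = {}
        self.delivered: List[str] = []  # event ids in local delivery order

    def accept(self, event: Event) -> bool:
        # Deliver an event once; later copies of the same id are ignored.
        if event.event_id in self._events:
            return False
        self._events[event.event_id] = event
        self.delivered.append(event.event_id)
        return True

    def events(self) -> List[Event]:
        return list(self._events.values())

    def heal_from(self, other: "RendezvousReplica") -> int:
        # After the partition heals, pull in whatever the other side saw.
        return sum(self.accept(event) for event in other.events())


east, west = RendezvousReplica(), RendezvousReplica()
for replica in (east, west):
    replica.accept(Event("pub1:1", "before partition"))

# During the partition each side keeps operating on its own.
east.accept(Event("pub1:2", "seen only in the east"))
west.accept(Event("pub2:1", "seen only in the west"))

new_at_east = east.heal_from(west)
new_at_west = west.heal_from(east)
```

After healing, both replicas hold the same set of events but delivered them in different orders, which matches the eventual, out-of-order delivery described on the slide.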

15
Non-Goals
  • What's the best way to do:
  • Naming
  • Filtering
  • Complex subscription queries
  • In-order delivery (except as layered on top)

16
Applying Lessons of the Internet and Web
  • Assume things are broken:
      • Mutual suspicion and no dependence on correct
        behavior by others.
  • Don't try to fix everything:
      • All distributed state is maintained in a
        weakly-consistent soft-state manner and is aged.
      • All distributed state is incomplete and may be
        inaccurate.

17
Design Overview
  • We think we only need these mechanisms:
      • Replication.
      • Overlay distribution networks.
      • Time contracts.
      • Event histories.
      • Administrative rendezvous points.

18
Replication

19
Overlay Distribution Networks
[Figure label: Herald@L1]

20
Time Contracts
[Figure: RP1 with labeled values Creator 60, Pub1 10, Sub1 30]
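A time contract can be read as a lease: the rendezvous point keeps state for a party only for an agreed duration unless the contract is renewed. The sketch below reuses the values 60, 10, and 30 from the figure, but the time unit, the `TimeContracts` class, and the renewal behavior are assumptions for illustration, not details given on the slide. The clock is passed in explicitly so the example runs deterministically.

```python
from typing import Dict, List


class TimeContracts:
    """State held by one rendezvous point, each entry under a time contract."""

    def __init__(self) -> None:
        self._expires_at: Dict[str, float] = {}

    def grant(self, party: str, duration: float, now: float) -> None:
        # Granting again before expiry renews the contract (assumed behavior).
        self._expires_at[party] = now + duration

    def active(self, now: float) -> List[str]:
        return sorted(p for p, t in self._expires_at.items() if t > now)

    def expire(self, now: float) -> List[str]:
        # Reclaim state whose contract has run out; no teardown message needed.
        lapsed = sorted(p for p, t in self._expires_at.items() if t <= now)
        for party in lapsed:
            del self._expires_at[party]
        return lapsed


rp1 = TimeContracts()
rp1.grant("Creator", 60, now=0)
rp1.grant("Pub1", 10, now=0)
rp1.grant("Sub1", 30, now=0)

rp1.grant("Pub1", 10, now=8)          # Pub1 renews; contract now runs to 18
lapsed_at_20 = rp1.expire(now=20)     # Pub1 did not renew again
lapsed_at_30 = rp1.expire(now=30)     # Sub1's contract runs out
still_active = rp1.active(now=45)     # only the creator's contract remains
```

Because state disappears on its own when a contract lapses, a crashed or partitioned party never has to be cleaned up explicitly, which fits the soft-state approach described earlier in the deck.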
21
Event Histories
[Figure: RP1 with labeled values Creator 60, Pub1 10, Sub1 30, History 50]
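An event history can be sketched as a log with bounded retention that a reconnecting subscriber replays from the point where it left off. The retention of 50 mirrors the History value in the figure; the time unit, the `EventHistory` class, and the trimming policy are assumptions for the example.

```python
from typing import List, Tuple


class EventHistory:
    """Bounded-retention log kept at a rendezvous point (hypothetical)."""

    def __init__(self, retention: float) -> None:
        self.retention = retention
        self._log: List[Tuple[float, str]] = []  # (publish time, event)

    def record(self, event: str, now: float) -> None:
        self._log.append((now, event))
        self._trim(now)

    def replay_since(self, last_seen: float, now: float) -> List[str]:
        # Return the retained events published after the subscriber's last contact.
        self._trim(now)
        return [event for t, event in self._log if t > last_seen]

    def _trim(self, now: float) -> None:
        # Drop events older than the retention window.
        cutoff = now - self.retention
        self._log = [(t, event) for t, event in self._log if t >= cutoff]


history = EventHistory(retention=50)
history.record("e1", now=0)
history.record("e2", now=20)
history.record("e3", now=40)

# A subscriber last heard from at time 10 reconnects at time 60.
caught_up = history.replay_since(last_seen=10, now=60)

# A subscriber that stays away too long finds older events already aged out.
too_late = history.replay_since(last_seen=0, now=80)
```

The second call shows the trade-off: delivery to disconnected subscribers is eventual only within the history's retention window.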
22
Administrative Rendezvous Points
1. Subscribe RP1@
2. Notify(change)
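The two steps in the figure suggest that interested parties subscribe to an administrative channel associated with a rendezvous point and are notified when that rendezvous point changes. The sketch below is one possible reading: the `RendezvousPoint` class, its method names, and the choice of subscriber-list updates as the "change" being reported are all assumptions.

```python
from typing import Callable, List


class RendezvousPoint:
    """Rendezvous point with an administrative channel (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.subscribers: List[str] = []
        self._admin_watchers: List[Callable[[str], None]] = []

    def subscribe_admin(self, watcher: Callable[[str], None]) -> None:
        # Step 1: subscribe to the administrative channel for this RP.
        self._admin_watchers.append(watcher)

    def add_subscriber(self, who: str) -> None:
        self.subscribers.append(who)
        self._notify_admin(f"{self.name}: added {who}")

    def remove_subscriber(self, who: str) -> None:
        self.subscribers.remove(who)
        self._notify_admin(f"{self.name}: removed {who}")

    def _notify_admin(self, change: str) -> None:
        # Step 2: Notify(change) goes to every administrative watcher.
        for watcher in self._admin_watchers:
            watcher(change)


changes: List[str] = []
rp1 = RendezvousPoint("RP1")
rp1.subscribe_admin(changes.append)
rp1.add_subscriber("Sub1")
rp1.remove_subscriber("Sub1")
```

Reusing the ordinary subscribe and notify path for administrative information means no separate management protocol is needed in this sketch.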
23
Engineering Research Issues
  • Baseline scalability numbers
  • Dynamic system reconfiguration
  • Federation and security

24
Baseline Scalability Numbers
  • How scalable are single-node servers and server
    clusters?
  • What are multicast-style delivery systems
    actually capable of, especially in aggregate?

25
Dynamic System Reconfiguration
  • Reconfiguring distributed RP state in response to
    aggregate workloads and global state changes.
  • Dealing with flash crowd loads.
  • Placement of RP state to minimize the effects of
    network partitions and disconnection.
  • Placement of RP state to enable efficient
    implementations of higher-level pub/sub semantics.

26
Federation and Security
  • Can we define simple, open protocols?
  • Will we need heavy-weight mechanisms to deal with
    malicious/corrupt servers?
  • How should anonymity and privacy be dealt
    with/supported?

27
Related Work
  • Non-global event notification systems (Gryphon,
    Ready, Siena, ...)
  • Netnews
  • P2P systems such as Gnutella and Farsite
  • Overlay multicast networks
  • CDNs
  • OceanStore

28
Conclusion
  • Global event notification is emerging as a key
    Internet technology.
  • Herald is exploring scalability of the basic
    message and distributed state management aspects
    of an event notification system.
  • Gain engineering experience with scalable pub/sub
    systems.
  • Explore dynamic system reconfiguration.
  • Understand the implications of federation and
    security.