Mercury: Building Distributed Applications with Publish-Subscribe

Transcript and Presenter's Notes

1
Mercury: Building Distributed Applications with
Publish-Subscribe
  • Ashwin Bharambe
  • Carnegie Mellon University
  • Monday Seminar Talk

2
Quick Terminology Recap
  • Basics
  • Publishers inject data/events/publications
  • Subscribers register interests/subscriptions
  • Brokers match subscriptions with publications
    and deliver to subscribers
  • Mercury: a distributed publish-subscribe system
  • Performs matching and content routing in a
    distributed fashion
  • Data model

Publication: Name = ashwin, Age = 23, X = 192.3, Y = 223.4
Subscription: Name, Age > 35, X > 100, X < 180
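
As a rough illustration of the data model above, a publication can be read as a set of attribute values and a subscription as a conjunction of predicates over those attributes. The sketch below is an assumption made for illustration, not Mercury's actual API: the `matches` function and the tuple encoding of predicates are hypothetical, and the unconstrained Name attribute of the slide's subscription is simply left out.

```python
import operator

# Hypothetical encoding: a publication maps attribute names to values;
# a subscription is a list of (attribute, operator, constant) predicates.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def matches(publication, subscription):
    """Return True if the publication satisfies every predicate."""
    for attr, op, const in subscription:
        if attr not in publication:
            return False
        if not OPS[op](publication[attr], const):
            return False
    return True

publication = {"Name": "ashwin", "Age": 23, "X": 192.3, "Y": 223.4}
subscription = [("Age", ">", 35), ("X", ">", 100), ("X", "<", 180)]

print(matches(publication, subscription))  # False: Age 23 is not > 35
```

With the values from the slide, the publication does not match, because its Age and X values both fall outside the subscribed ranges.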
3
Virtual reality example
[Figure: a virtual world arena showing a user, events, and the user's interests; labeled points (50,250), (100,200), and (150,150)]
4
Mercury goals
  • Implement distributed publish-subscribe
  • Support range queries
  • Avoid hot-spots in the system
  • Flooding anything is bad
  • Avoid publication flooding completely
  • Avoid subscription flooding as much as
    possible
  • Consider queries like SELECT from RECORDS
  • Peer-to-peer scenario
  • No dedicated brokers
  • Highly dynamic network

5
Talk Contents
  • Mercury Architecture
  • Overlay construction
  • Routing guarantees
  • Overlay properties
  • How randomness is useful
  • Load balancing and histogram maintenance
  • Application Design

6
Attribute Hubs
  • Each attribute range is divided into bins
  • Each node is responsible for a range of attribute values
  • The range is assigned when the node joins and can
    change dynamically
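
One way to picture an attribute hub is as a sorted list of range boundaries with one node per range. The sketch below assumes that layout; the class `AttributeHub` and its methods `node_for` and `join` are hypothetical names chosen for illustration and are not taken from Mercury.

```python
import bisect

class AttributeHub:
    """One hub per attribute; each node owns a contiguous range of values."""

    def __init__(self, attribute, boundaries, nodes):
        # boundaries[i] is the inclusive lower bound of nodes[i]'s range;
        # nodes[i] owns [boundaries[i], boundaries[i + 1]).
        assert len(boundaries) == len(nodes)
        assert list(boundaries) == sorted(boundaries)
        self.attribute = attribute
        self.boundaries = list(boundaries)
        self.nodes = list(nodes)

    def node_for(self, value):
        """Return the node responsible for the given attribute value."""
        i = bisect.bisect_right(self.boundaries, value) - 1
        return self.nodes[max(i, 0)]

    def join(self, new_boundary, new_node):
        """A joining node takes over the upper part of an existing range."""
        i = bisect.bisect_right(self.boundaries, new_boundary)
        self.boundaries.insert(i, new_boundary)
        self.nodes.insert(i, new_node)

hub = AttributeHub("x", [0, 100, 200], ["n0", "n1", "n2"])
print(hub.node_for(150))  # n1
hub.join(150, "n3")       # n3 now owns [150, 200)
print(hub.node_for(150))  # n3
```

Keeping the boundaries sorted makes the lookup a binary search, and a join only changes the range of the one node whose range was split.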

7
Routing
[Figure: routing a subscription (Name, X > 100, X < 180) from its generating point; hubs shown for x, y, age, and name, with S marks]
  • Send a subscription to one hub
  • Which one? Interesting question in itself!
  • Determine query selectivity; send to the most
    selective hub

8
Routing (contd.)
[Figure: routing a publication (Name = ashwin, Age = 23) from its generating point; hubs shown for x, y, age, and name, with P marks]
  • We must send publications to all hubs
  • Ensures matching
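
The two routing rules (a subscription goes to one hub, a publication goes to all hubs) can be sketched as below. This is a simplified in-memory model under stated assumptions: each hub is a plain list rather than a ring of nodes, and `Overlay`, `subscribe`, and `publish` are hypothetical names, not Mercury's interface.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def matches(pub, sub):
    """True if the publication satisfies every (attr, op, const) predicate."""
    return all(a in pub and OPS[op](pub[a], c) for a, op, c in sub)

class Overlay:
    def __init__(self, attributes):
        # One hub per attribute; here a hub is just a list of subscriptions.
        self.hubs = {a: [] for a in attributes}

    def subscribe(self, sub_id, sub, hub):
        # A subscription is stored at exactly one hub, which must belong
        # to an attribute that the subscription constrains.
        assert hub in {a for a, _, _ in sub}
        self.hubs[hub].append((sub_id, sub))

    def publish(self, pub):
        # A publication visits every hub, so it meets each subscription
        # no matter which hub that subscription was stored at.
        matched = []
        for stored in self.hubs.values():
            for sub_id, sub in stored:
                if matches(pub, sub):
                    matched.append(sub_id)
        return sorted(matched)

overlay = Overlay(["Name", "Age", "X", "Y"])
overlay.subscribe("s1", [("X", ">", 100), ("X", "<", 180)], hub="X")
overlay.subscribe("s2", [("Age", ">", 20), ("X", ">", 100)], hub="Age")
print(overlay.publish({"Name": "ashwin", "Age": 23, "X": 150.0, "Y": 223.4}))
# ['s1', 's2']
```

Because every publication reaches every hub, the choice of hub for a subscription affects only cost, not correctness.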

9
Routing illustrated
10
Hub structure and routing (Symphony)
  • Naïve routing along the circle scales linearly
  • Utilize the small-world phenomenon (Kleinberg
    2000)
  • Know thy neighbors and one random person, and you
    can contact anybody quickly
  • Routing policy: choose the link that gets you
    closest to the destination
  • Performance
  • Average hop length: O(log^2(n)/k) with k
    random links

Need to be careful when node ranges are not
uniform
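
The routing policy above can be tried out in a small simulation. The sketch below assumes evenly spaced nodes on a ring, a harmonic (1/d) distribution for long-link lengths, and clockwise-only greedy routing; `build_ring` and `route` are hypothetical helpers, and the code is not the Symphony or Mercury implementation.

```python
import random

def build_ring(n, k, rng):
    """Ring of n nodes; each has a successor link plus k random long links."""
    links = []
    for node in range(n):
        neighbors = {(node + 1) % n}
        for _ in range(k):
            # n ** u with u uniform in [0, 1) gives lengths with density ~ 1/d.
            d = min(n - 1, max(1, int(n ** rng.random())))
            neighbors.add((node + d) % n)
        links.append(sorted(neighbors))
    return links

def route(links, src, dst):
    """Greedy routing: always take the link that ends closest to dst."""
    n = len(links)
    hops, cur = 0, src
    while cur != dst:
        # Clockwise distance; links that overshoot dst wrap around and lose.
        cur = min(links[cur], key=lambda v: (dst - v) % n)
        hops += 1
    return hops

rng = random.Random(0)
n = 1024
for k in (0, 1, 6):
    links = build_ring(n, k, rng)
    pairs = [(rng.randrange(n), rng.randrange(n)) for _ in range(200)]
    avg = sum(route(links, s, t) for s, t in pairs) / len(pairs)
    print(f"k={k}: average hops {avg:.1f}")
```

With k = 0 the route simply walks the circle, which is the linear case from the first bullet; adding long links should shorten the average path noticeably.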
11
Caching
  • O(log^2(n)) is good, but each hop is still an
    application-level hop
  • Latency can be quite large if the overlay is
    not optimized
  • For distributed applications like games, this is
    far from optimal
  • Exploit locality in the access patterns of an
    application
  • In addition to the k random links, keep cached
    links
  • Store nodes that were the rendezvous points for
    recent publications
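
A minimal sketch of the cached-link idea follows, assuming a bounded cache with least-recently-used eviction. The slides do not say which eviction policy Mercury uses, so LRU is an assumption here, and `LinkCache` with its `record` and `lookup` methods is a hypothetical name.

```python
from collections import OrderedDict

class LinkCache:
    """Bounded LRU cache of recently used rendezvous nodes (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # node_id -> (lo, hi) value range

    def record(self, node_id, lo, hi):
        """Remember the node that was the rendezvous point for a publication."""
        self.entries[node_id] = (lo, hi)
        self.entries.move_to_end(node_id)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

    def lookup(self, value):
        """Return a cached node whose range covers value, or None."""
        for node_id, (lo, hi) in list(self.entries.items()):
            if lo <= value < hi:
                self.entries.move_to_end(node_id)
                return node_id
        return None

cache = LinkCache(capacity=2)
cache.record("a", 0, 10)
cache.record("b", 10, 20)
print(cache.lookup(5))   # a
cache.record("c", 20, 30)  # evicts b, the least recently used entry
print(cache.lookup(15))  # None
```

A cache hit lets the sender contact the rendezvous node directly; on a miss it falls back to routing over the regular links.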

12
Performance (Uniform workload)
Long links: 6; cache links: log(n)
Publications were generated from a uniform
distribution
13
Performance (Skewed workload)
Long links: 6; cache links: log(n)
Publications were generated from a highly skewed Zipf
distribution
14
Performance (Memory reference trace)
Long links: 6; cache links: log(n)
Publications were generated from the memory
references of a SPEC2000 benchmark
15
Two Problems
  • 1. Load Balancing
  • A concern because publication values need not
    follow a uniform, or a priori known, distribution
  • Node ranges are assigned when the nodes join

16
Problems (contd.)
  • 2. Hub Selectivity
  • Recall: a subscription is sent to one randomly
    chosen hub!
  • Ideally, it should be sent to the most
    selective hub
  • Need to estimate selectivity of a subscription

17
Hail randomness
  • Randomized construction of the network gives
    additional benefits!
  • It turns out that this network is an expander with
    high probability
  • Random walks mix rapidly, i.e., they approach
    the stationary distribution rapidly
  • Uniform sampling is non-trivial
  • Node ranges are not uniform across nodes
  • Random walks are an efficient way of sampling
  • No explicit hierarchy required (as in RanSub,
    USITS '03)
  • In general, several statistics about a very
    dynamic network can be efficiently maintained
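
Sampling by random walk can be sketched in a few lines. The example assumes the overlay is given as an adjacency list, and `random_walk_sample` is a hypothetical helper. Note that on an undirected graph a simple walk like this one converges to a distribution proportional to node degree, so it is uniform only when all nodes have the same degree; the correction needed otherwise is not shown.

```python
import random
from collections import Counter

def random_walk_sample(neighbors, start, ttl, rng):
    """Walk ttl steps from start, picking a uniformly random neighbor each step."""
    node = start
    for _ in range(ttl):
        node = rng.choice(neighbors[node])
    return node

# A small 4-regular graph: ring of 16 nodes plus links to nodes 5 steps away.
n = 16
neighbors = {
    i: [(i - 1) % n, (i + 1) % n, (i + 5) % n, (i - 5) % n] for i in range(n)
}

rng = random.Random(0)
counts = Counter(random_walk_sample(neighbors, 0, 12, rng) for _ in range(4000))
print(sorted(counts.items()))
```

Every node in this example graph has degree 4, so the sample counts should come out roughly even across the 16 nodes.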

18
Hub Selectivity (ideas)
  • Use sampling to build approximate histograms
  • Approach 1 (Push)
  • Each Rendezvous point selects publications with
    a certain probability and sends them off with
    specific TTL
  • log2(n) length random walk ensures good mixing
  • Traffic overhead / publications
  • Approach 2 (Pull)
  • Perform uniform random sampling periodically
  • Each sample: the histogram of the sampled node
  • Question: how to combine histograms?
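
Once an approximate histogram exists for each attribute, choosing a hub reduces to estimating how many publications each range predicate would match. The sketch below assumes histograms are lists of (bin_lo, bin_hi, count) and that values are spread uniformly inside a bin; `estimate_selectivity` and `most_selective_hub` are hypothetical names, and the counts are made-up example data.

```python
def estimate_selectivity(histogram, lo, hi):
    """Estimate the fraction of publications whose value falls in [lo, hi).

    histogram is a list of (bin_lo, bin_hi, count) with non-overlapping bins;
    values are assumed to be uniform within each bin.
    """
    total = sum(count for _, _, count in histogram)
    if total == 0:
        return 1.0  # no information: assume the range matches everything
    covered = 0.0
    for bin_lo, bin_hi, count in histogram:
        overlap = min(hi, bin_hi) - max(lo, bin_lo)
        if overlap > 0:
            covered += count * overlap / (bin_hi - bin_lo)
    return covered / total

def most_selective_hub(histograms, ranges):
    """Pick the attribute whose range should match the fewest publications."""
    return min(ranges, key=lambda a: estimate_selectivity(histograms[a], *ranges[a]))

# Made-up histograms for two attributes.
histograms = {
    "Age": [(0, 50, 90), (50, 100, 10)],
    "X": [(0, 100, 50), (100, 200, 50)],
}
ranges = {"Age": (35, 100), "X": (100, 180)}  # Age > 35, 100 < X < 180

for attr, (lo, hi) in ranges.items():
    print(attr, estimate_selectivity(histograms[attr], lo, hi))
print(most_selective_hub(histograms, ranges))  # Age
```

In this example the Age range is estimated to match 37% of publications and the X range 40%, so the subscription would be sent to the Age hub.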

19
Load balancing (ideas)
  • Sample average load in the system
  • Utilize the histograms to quickly know high/low
    load areas
  • Strategy 1
  • A lightly loaded node gracefully leaves the overlay
  • It re-inserts itself into a high-load area
  • Strategy 2
  • Use load diffusion: heavy nodes shed load to
    neighbors
  • Only if the neighbor is light
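
Strategy 2 can be sketched as one round of diffusion on a ring. The thresholds below (a node is heavy above twice the average load and light below the average) and the choice to equalize load with each light neighbor are assumptions made for illustration; the slides do not specify them, and `diffuse` is a hypothetical function.

```python
def diffuse(loads, heavy_factor=2.0):
    """One round of load diffusion on a ring of nodes.

    loads[i] is the load of node i; its neighbors are i - 1 and i + 1 (mod n).
    A node counts as heavy if its load exceeds heavy_factor * average and as
    light if its load is below the average. Both thresholds are assumptions.
    """
    n = len(loads)
    avg = sum(loads) / n
    new = list(loads)
    for i in range(n):
        if new[i] <= heavy_factor * avg:
            continue  # not heavy: nothing to shed
        for j in ((i - 1) % n, (i + 1) % n):
            if new[j] < avg:  # shed only to a light neighbor
                transfer = (new[i] - new[j]) / 2
                new[i] -= transfer
                new[j] += transfer
    return new

print(diffuse([10, 1, 1, 1, 1]))  # [3.25, 3.25, 1, 1, 5.5]
```

Total load is conserved in each round, and repeated rounds spread a hot spot further around the ring.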

20
Distributed Game Design
  • Current implementation: a distributed version of
    the Asteroids game!
  • Questions
  • How is state distributed across the system?
  • How is consistency handled in the system?
  • Cheating???

21
Conclusion
  • Distributed publish-subscribe system supporting:
  • Range queries
  • Scalable routing and matching
  • Randomized network construction
  • Provides routing guarantees
  • Also yields an elegant way of sampling in a
    distributed system
  • Exports an API for applications
  • Implemented and deployed on Emulab
  • Distributed game using Mercury
  • Almost done
  • To be deployed on PlanetLab soon