Sven Bittner, 28 November 2006 - PowerPoint PPT Presentation

1
Talk at the 3rd International Middleware Doctoral
Symposium (MDS 2006):
Supporting Arbitrary Boolean Subscriptions in
Distributed Publish/Subscribe Systems
  • Sven Bittner, 28 November 2006
  • Department of Computer Science
  • The University of Waikato, New Zealand

This research is partially funded by the NZ
Government under the New Zealand International
Doctoral Research Scholarships (NZIDRS) programme.
2
Structure of Talk
  • Motivation: Publish/Subscribe
  • Problem Description
  • Filtering in Central Components
  • Routing in the Distributed System
  • Summary and Outlook

Sven Bittner Supporting Arbitrary Boolean
Subscriptions in Distributed Pub/Sub Systems
3
Publish/Subscribe Systems
[Figure: a pub/sub system connecting publishers and subscribers through brokers B1 to B9]
Motivation Problem Definition Central
Filtering Routing Optimizations Summary
4
Messages & Subscriptions
  • Event messages
  • Describe a state change/real-world event
  • Attribute-value pairs
  • Subscriptions
  • Describe interests
  • Arbitrary Boolean combination of predicates

[Example subscription: predicates on title (Harry Potter), endingWithin (6 hours), condition (new), and price (15.00); operators not recoverable]
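As a minimal sketch of the two notions on this slide, an event message can be modeled as a dictionary of attribute-value pairs and a subscription as a tree of Boolean operators over predicates. The tuple encoding, the `matches` helper, and the concrete operators and nesting in the example are assumptions made for illustration; they are not taken from the talk.

```python
import operator

# Comparison operators available to predicates (an illustrative choice).
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def matches(node, event):
    """Evaluate a subscription tree against an event (attribute-value pairs)."""
    kind = node[0]
    if kind == "pred":
        _, attr, op, value = node
        # A predicate on an attribute missing from the event is not fulfilled.
        return attr in event and OPS[op](event[attr], value)
    if kind == "and":
        return all(matches(child, event) for child in node[1:])
    if kind == "or":
        return any(matches(child, event) for child in node[1:])
    if kind == "not":
        return not matches(node[1], event)
    raise ValueError(f"unknown node type: {kind}")

# Hypothetical subscription: operators and nesting are assumed, not from the slide.
subscription = (
    "and",
    ("pred", "title", "=", "Harry Potter"),
    ("pred", "endingWithin", "<", 6),
    ("or",
     ("pred", "condition", "=", "new"),
     ("pred", "price", "<", 15.00)),
)

event = {"title": "Harry Potter", "endingWithin": 2,
         "condition": "used", "price": 12.50}
print(matches(subscription, event))
```

The disjunction inside the conjunction is what a purely conjunctive subscription language cannot express directly.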
5
Context: Filtering
  • Filtering algorithm
  • Determination of all subscriptions matching an
    incoming event message (messages not stored)
  • Indexing of subscriptions and predicates
  • Support of required subscription language
    (Boolean)

6
Context: Routing
  • Routing algorithm
  • Determination of all brokers with matching
    subscriptions
  • Distribution of subscriptions to build event
    routing tables
  • Subscriptions as routing entries (where to route
    messages)
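The role of subscriptions as routing entries can be sketched as follows. The `Broker` class, its method names, and the modeling of subscriptions as plain Python predicates are hypothetical choices for illustration, not the design presented in the talk.

```python
from collections import defaultdict

class Broker:
    """Hypothetical broker holding subscriptions as routing entries."""

    def __init__(self, name):
        self.name = name
        # Routing table: neighbor (broker or local subscriber) -> subscriptions.
        self.routing_table = defaultdict(list)

    def add_routing_entry(self, neighbor, subscription):
        """Record that events matching `subscription` must go to `neighbor`."""
        self.routing_table[neighbor].append(subscription)

    def route(self, event, came_from=None):
        """Return the neighbors that have at least one matching subscription."""
        return sorted(
            neighbor
            for neighbor, subs in self.routing_table.items()
            if neighbor != came_from and any(sub(event) for sub in subs)
        )

# Subscriptions are modeled as plain predicates over an event dictionary.
cheap = lambda e: e.get("price", float("inf")) < 15.00
potter = lambda e: e.get("title") == "Harry Potter"

b1 = Broker("B1")
b1.add_routing_entry("B2", cheap)
b1.add_routing_entry("B3", potter)

event = {"title": "Harry Potter", "price": 20.00}
print(b1.route(event))  # ['B3']
```

An event is forwarded only to neighbors whose entries match, so the size and complexity of the stored entries directly affect routing cost.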

7
Context: Routing Optimization
  • Optimization goal
  • Improvement of routing process, e.g.,
  • Higher throughput
  • Less memory for routing tables
  • Manipulation of routing entries

[Figure: routing table with entries B2: S1 and B3: S9]


8
Structure of Talk
  • Motivation: Publish/Subscribe
  • Problem Description
  • Filtering in Central Components
  • Routing in the Distributed System
  • Summary and Outlook

9
Current Approaches (1)
  • Observations
  • Current systems only support conjunctive
    subscriptions
  • Restrictions exploited in
  • Filtering algorithms
  • Routing optimizations
  • → No consideration of other operators in
    subscriptions

10
Current Approaches (2)
  • Motivation
  • Arbitrary Boolean subscriptions can be converted
    to DNF (exponential in size)
  • Every conjunction is handled as separate
    subscription
  • → Approach as in database management systems
    (conversion of query restrictions)
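The exponential size of the DNF conversion can be illustrated with a small sketch. It assumes a hypothetical subscription that is a conjunction of n two-way disjunctions; the function name `cnf_to_dnf` and the predicate names are made up for the example.

```python
from itertools import product

def cnf_to_dnf(clauses):
    """Distribute a conjunction of disjunctions into a disjunction of conjunctions.

    `clauses` is a list of lists of predicate names; the result holds one
    conjunction (a tuple of predicates) per combination of choices.
    """
    return list(product(*clauses))

# Hypothetical subscription: (a1 OR b1) AND (a2 OR b2) AND ... AND (an OR bn).
for n in (1, 2, 5, 10):
    clauses = [[f"a{i}", f"b{i}"] for i in range(1, n + 1)]
    dnf = cnf_to_dnf(clauses)
    # Columns: n, predicates in the original, conjunctions after conversion.
    print(n, 2 * n, len(dnf))
```

The original subscription holds 2n predicates, while the converted form holds 2^n conjunctions of n predicates each, and every one of them is registered as a separate subscription.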

11
Publish/Subscribe vs. DBMS (1)
  • Important differences
  • Number of simultaneous data requests
  • DBMSs: relatively small number of queries
  • Pub/sub systems: large number of subscriptions
  • → Even higher load after conversion
  • Query processing
  • DBMSs: query optimization on canonical form based
    on known data (access plans, cost estimation,
    etc.)
  • Pub/sub systems: events are unknown, no
    optimization applied

12
Publish/Subscribe vs. DBMS (2)
  • Considering data storage
  • Subscriptions ≠ queries
  • Data (base) ≙ subscription (base)
  • Queries ≙ event messages
  • Messages are in canonical form (attribute-value
    pairs)
  • So why convert subscriptions as well?
  • → Questionable whether to take the conversion
    approach in pub/sub (problem size explosion)

13
Hypothesis
  • The internal support of arbitrary Boolean
    subscriptions
  • reduces the memory requirements
  • compared to current conjunctive solutions
  • without degrading the system efficiency.

14
Steps to Take
  • Development and analysis of
  • Filtering algorithm (central broker components)
  • Routing optimization (distributed system)

15
Structure of Talk
  • Motivation: Publish/Subscribe
  • Problem Description
  • Filtering in Central Components
  • Routing in the Distributed System
  • Summary and Outlook

16
Steps Undertaken (1)
  • 1. Application scenario analysis [BH06b]: online
    auctions
  • Analysis of distributions on eBay
  • Identification of typical subscription classes
  • Semi-realistic data set
  • Used in later analysis

17
Steps Undertaken (2)
  • 2. Filtering algorithm for arbitrary Boolean
    subscriptions [BH05a]
  • Generic solution
  • Extends general-purpose conjunctive counting
    algorithm [YGM94, AJL02]
  • Filters on conjunctions the same way as the
    counting approach
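For orientation, the general idea behind counting-based filtering of conjunctive subscriptions can be sketched as below. This is not the Boolean extension of [BH05a]; it is a simplified illustration that assumes equality predicates only, and the class and method names are hypothetical.

```python
from collections import defaultdict

class CountingFilter:
    """Sketch of counting-based filtering for conjunctive subscriptions."""

    def __init__(self):
        self.predicate_index = defaultdict(set)  # predicate -> subscription ids
        self.required = {}                       # subscription id -> predicate count

    def subscribe(self, sub_id, predicates):
        """Register a conjunction given as (attribute, value) equality predicates."""
        predicates = set(predicates)
        self.required[sub_id] = len(predicates)
        for predicate in predicates:
            self.predicate_index[predicate].add(sub_id)

    def match(self, event):
        """Return the ids of all subscriptions whose predicates are all fulfilled."""
        hits = defaultdict(int)
        # Look up each fulfilled predicate and count hits per subscription.
        for predicate in event.items():
            for sub_id in self.predicate_index.get(predicate, ()):
                hits[sub_id] += 1
        # A conjunction matches when its hit count equals its number of predicates.
        return sorted(s for s, count in hits.items() if count == self.required[s])

f = CountingFilter()
f.subscribe("s1", [("title", "Harry Potter"), ("condition", "new")])
f.subscribe("s2", [("title", "Harry Potter")])
print(f.match({"title": "Harry Potter", "condition": "used"}))  # ['s2']
```

Because matching reduces to comparing a counter against a fixed number, the scheme only works directly for conjunctions, which is why other operators need either conversion or an extended algorithm.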

18
Steps Undertaken (3)
  • 3. Characterization scheme and memory analysis
    [BH05b]
  • Description of subscription patterns
  • Analysis of counting, cluster [HCKW90, FJL01],
    and Boolean approach
  • Determination of the point where the Boolean
    approach requires less memory
  • Even a single disjunction might favor the Boolean
    approach

19
Steps Undertaken (4)
  • 4. Practical verification/efficiency analysis
  • Confirmation of theoretical results
  • Efficiency is similar to the counting approach
  • Summary: Boolean solution
  • More space-efficient filtering
  • Similar time efficiency properties
  • Scheme helps decide between Boolean and
    conjunctive algorithm

20
Structure of Talk
  • Motivation: Publish/Subscribe
  • Problem Description
  • Filtering in Central Components
  • Routing in the Distributed System
  • Summary and Outlook

21
Subscription Pruning (1)
  • Idea of pruning [BH06a]
  • Remove parts of subscription trees
  • → Creates a more general subscription
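A minimal sketch of the pruning idea, under the assumption that only a child of an AND node is removed (which can only widen the set of matching events). The tuple encoding, the helper names, and the `format` attribute are hypothetical; the talk's actual pruning rules are not reproduced here.

```python
def matches(node, event):
    """Evaluate a subscription tree of equality predicates against an event."""
    kind = node[0]
    if kind == "pred":
        _, attr, value = node
        return event.get(attr) == value
    results = [matches(child, event) for child in node[1:]]
    return all(results) if kind == "and" else any(results)

def prune_conjunct(node, index):
    """Remove the child at `index` from an AND node, yielding a more general tree."""
    if node[0] != "and" or len(node) < 3:
        raise ValueError("this sketch only prunes AND nodes with two or more children")
    children = list(node[1:])
    del children[index]
    # A single remaining child replaces the AND node itself.
    return children[0] if len(children) == 1 else ("and", *children)

subscription = (
    "and",
    ("pred", "title", "Harry Potter"),
    ("or", ("pred", "condition", "new"), ("pred", "format", "hardcover")),
)
pruned = prune_conjunct(subscription, 1)  # drop the disjunction

event = {"title": "Harry Potter", "condition": "used", "format": "paperback"}
print(matches(subscription, event), matches(pruned, event))  # False True
```

The pruned tree is smaller and cheaper to evaluate, but it also accepts the event above, which the original subscription rejects; such events are forwarded as false positives.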

22
Subscription Pruning (2)
  • Consequences of pruning
  • Less complex (time and space) subscriptions (+)
  • More events forwarded (-)

23
Application of Pruning (1)
  • Pruning of routing entries

24
Application of Pruning (2)
  • Pruning of routing entries

No pruning in local broker → Ensure correct
filtering
25
Application of Pruning (3)
  • Pruning of routing entries

  • Less complex subscriptions
  • More time and space efficient routing

26
Application of Pruning (4)
  • Pruning of routing entries

  • But more general subscriptions
  • More forwarded event messages (false positives)
  • More event messages to route/process

27
Practical Pruning
  • Question
  • What subscription and what part of its
    subscription tree should be pruned first?
  • Answer
  • Four heuristics [BH06c] based on influence on
  • Memory usage
  • Filter efficiency
  • Network load (selectivity)
  • Network load (selectivity and popularity)

28
Experiments (In Progress)
  • Evaluation in online auction setting
  • Different distributions in subscriptions
  • Applicability to conjunctive subscriptions
  • Comparison to conjunctive routing optimization

29
Structure of Talk
  • Motivation: Publish/Subscribe
  • Problem Description
  • Filtering in Central Components
  • Routing in the Distributed System
  • Summary and Outlook

30
Summary
  • Motivation
  • Publish/subscribe systems
  • Online auction scenario
  • Need for arbitrary Boolean subscriptions
  • Problem definition
  • Only conjunctions supported
  • Conversion adopted from DBMSs
  • Does conversion make sense?
  • Hypothesis: No, if disjunctions occur!

31
Summary
  • Central broker components
  • Boolean filtering algorithm
  • Characterization scheme
  • Analysis, comparison, and verification
  • → Boolean approach is favorable
  • Distributed system
  • Novel optimization: subscription pruning
  • First experiments: valuable optimization
  • Memory requirements ↓, throughput ↑

32
Future Work
  • Future work
  • Writing up
  • Finish experiments
  • Heuristic based on multicriteria optimization
  • Future work (not within PhD)
  • Analyze other applications
  • Optimize solutions
  • Open source prototype

33
Thank you for your attention!
Selected further reading
  • BH05a S. Bittner and A. Hinze. On the Benefits of
    Non-Canonical Filtering in Publish/Subscribe
    Systems. In Proceedings of the 25th IEEE
    International Conference on Distributed Computing
    Systems Workshops (ICDCSW '05), Columbus, USA,
    June 2005.
  • BH05b S. Bittner and A. Hinze. A Detailed
    Investigation of Memory Requirements for Pub/Sub
    Filtering Algorithms. In Proceedings of the 13th
    International Conference on Cooperative
    Information Systems (CoopIS 2005), Agia Napa,
    Cyprus, 31 October-4 November, 2005.
  • BH06a S. Bittner and A. Hinze. Pruning
    Subscriptions in Distributed Pub/Sub Systems. In
    Proc. of the 29th Austral. Computer Science
    Conference (ACSC 2006), Hobart, Australia, 16-19
    January, 2006.
  • BH06b S. Bittner and A. Hinze. Event Distributions
    in Online Book Auctions. Technical Report 03/2006.
    Computer Science Department, Waikato University,
    New Zealand, February 2006.
  • BH06c S. Bittner and A. Hinze. Dimension-Based
    Subscription Pruning for Publish/Subscribe
    Systems. In Proceedings of the 26th IEEE
    International Conference on Distributed Computing
    Systems Workshops (ICDCSW '06), Lisbon, Portugal,
    July 2006.
  • BH06d S. Bittner and A. Hinze. Optimizing Pub/Sub
    Systems by Advertisement Pruning. In Proceedings
    of the 8th International Symposium on Distributed
    Objects and Applications (DOA 2006), Montpellier,
    France, 30 October-1 November 2006.

Sven Bittner, s.bittner_at_cs.waikato.ac.nz
Talk: Supporting Arbitrary Boolean Subscriptions
in Distributed Publish/Subscribe Systems
34
Selected Other References
  • AJL02 G. Ashayer, H.-A. Jacobsen, and H. Leung.
    Predicate Matching and Subscription Matching in
    Publish/Subscribe Systems. In Proceedings of the
    22nd IEEE International Conference on Distributed
    Computing Systems Workshops (ICDCSW '02), Vienna,
    Austria, July 2-5 2002.
  • CRW01 A. Carzaniga, D. S. Rosenblum, and A. L.
    Wolf. Design and Evaluation of a Wide-Area Event
    Notification Service. ACM Transactions on
    Computer Systems (TOCS), 19(3):332-383, 2001.
  • FJL01 F. Fabret, A. Jacobsen, F. Llirbat, J.
    Pereira, K. Ross, and D. Shasha. Filtering
    Algorithms and Implementation for Very Fast
    Publish/Subscribe Systems. In Proc. of the 2001
    ACM SIGMOD Intern. Conference on Management of
    Data (SIGMOD 2001), USA, May 2001.
  • HCKW90 E. N. Hanson, M. Chaabouni, C.-H. Kim,
    and Y.-W. Wang. A Predicate Matching Algorithm
    for Database Rule Systems. In Proceedings of the
    1990 ACM SIGMOD International Conference on
    Management of Data (SIGMOD 1990), Atlantic City,
    USA, May 23-25 1990.
  • LHJ05 G. Li, S. Hou, and H.-A. Jacobsen. A
    Unified Approach to Routing, Covering and Merging
    in Publish/Subscribe Systems based on Modified
    Binary Decision Diagrams. In Proc. of the 25th
    IEEE Intern. Conference on Distributed Computing
    Systems (ICDCS '05), USA, June 2005.
  • MF01 G. Muehl and L. Fiege. Supporting Covering
    and Merging in Content-Based Publish/Subscribe
    Systems Beyond Name/Value Pairs. IEEE
    Distributed Systems Online (DSOnline), 2(7),
    2001.
  • TE04 P. Triantafillou and A. Economides.
    Subscription Summarization: A New Paradigm for
    Efficient Publish/Subscribe Systems. In
    Proceedings of the 24th IEEE International
    Conference on Distributed Computing Systems
    (ICDCS '04), Tokyo, Japan, March, 2004.
  • WQV04 Y.-M. Wang, L. Qiu, C. Verbowski, D.
    Achlioptas, G. Das, and P. Larson. Summary-based
    Routing for Content-based Event Distribution
    Networks. ACM SIGCOMM Computer Communication
    Review, 34(5):59-74, 2004.
  • YGM94 T. W. Yan and H. Garcia-Molina. Index
    Structures for Selective Dissemination of
    Information Under the Boolean Model. ACM
    Transactions on Database Systems (TODS),
    19(2):332-364, 1994.