Load Shedding in XML Streams - PowerPoint PPT Presentation

About This Presentation
Title:

Load Shedding in XML Streams

Description:

Push both the join and predicate down to the data source. ... Propose a divided join algorithm for sharing the load in a distributive way ... – PowerPoint PPT presentation

Provided by: mingz5
Tags: xml | join | load | shedding | streams


Transcript and Presenter's Notes

Adding Intelligence to the Optimization in Data Integration System
Di Wang
Advisor: Prof. Murali Mani
Database Systems Research Group (DSRG), Department of Computer Science
Our Approach: Divide the join operation
Intuitively, many join algorithms naturally have two phases: partition and probe.
Data Integration System
Typical Data Integration System Architecture
  • Application areas of data integration
  • Enterprise information integration
  • Data sources on the web
  • Scientific data sharing

[Architecture diagram, upper part: Application, Query/Browser, Metadata/Catalog, Mediator]
Rough Idea: The wrappers do the partition phase; the mediator does the probe phase.
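The rough idea above can be sketched with a hash join, one join algorithm that naturally splits into a partition phase and a probe phase. This is only an illustration under stated assumptions, not the poster's actual algorithm: the function names `wrapper_partition` and `mediator_probe`, the tuple row format, and the CRC32 bucketing are all hypothetical choices made for this example.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple                        # assumption: a row is a plain tuple
Partitions = Dict[int, List[Row]]  # bucket id -> rows in that bucket


def bucket_of(key, num_partitions: int) -> int:
    """Deterministic bucket id, so separate wrappers agree on bucket placement."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_partitions


def wrapper_partition(rows: Iterable[Row], key_index: int,
                      num_partitions: int) -> Partitions:
    """Partition phase (hypothetically run at each wrapper):
    bucket the source's rows by the hash of the join key."""
    partitions: Partitions = defaultdict(list)
    for row in rows:
        partitions[bucket_of(row[key_index], num_partitions)].append(row)
    return dict(partitions)


def mediator_probe(left: Partitions, right: Partitions,
                   left_key: int, right_key: int) -> List[Row]:
    """Probe phase (hypothetically run at the mediator):
    only rows in buckets with the same id can match."""
    results: List[Row] = []
    for bucket_id, left_rows in left.items():
        right_rows = right.get(bucket_id, [])
        if not right_rows:
            continue
        # Build a small lookup table for this bucket, then probe it.
        table = defaultdict(list)
        for row in left_rows:
            table[row[left_key]].append(row)
        for row in right_rows:
            for match in table.get(row[right_key], []):
                results.append(match + row)
    return results


# Usage: two wrappers partition their own data; the mediator only probes.
left_parts = wrapper_partition([(1, "a"), (2, "b"), (3, "c")], 0, 4)
right_parts = wrapper_partition([(2, "x"), (3, "y"), (3, "z"), (4, "w")], 0, 4)
joined = sorted(mediator_probe(left_parts, right_parts, 0, 0))
```

In this sketch the mediator never hashes whole inputs; it receives buckets that the wrappers have already formed, which is one way the join load could be shared.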
  • Many heterogeneous database systems
  • Information Manifold
  • TSIMMIS
  • Garlic
  • Tukwila

[Architecture diagram, lower part: four Wrappers, one over each data source: Relation Database, Complex Object Repository, Document System, Bio-Info Database]
  • Target Scenarios
  • Some non-database sources cannot do
    probing, only basic operations
  • The mediator/wrapper is already
    heavily loaded, yet it is required to perform the join

As part of data integration, our focus is on
system performance and optimization
Motivation: Two classes of optimization,
based on the assumptions of the integration system
Tight-federated data sources: Smart mediator
  • Assumes that the optimizer can have accurate information about the
    sources, with wrapper support
  • The optimizer selects the best query plan based on a
    series of cost formulas and computation
Network-bound data sources: Thin mediator
  • Assumes that statistics of the data sources are
    unavailable and data arrival is unpredictable
  • A range of adaptive techniques is used to optimize the query
    plan during execution
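The contrast between the two classes can be illustrated with a toy planner. This is a minimal sketch under loud assumptions: the "plan" is reduced to an ordering of sources, the cost formula is simply "fewest estimated rows first", and the names `initial_plan` and `adapt_plan` are hypothetical. None of this comes from the poster or from any of the systems it names.

```python
from typing import Dict, List, Optional


def initial_plan(sources: List[str],
                 estimated_rows: Optional[Dict[str, int]] = None) -> List[str]:
    """Pick the initial order in which sources are processed.

    Smart mediator: statistics are available, so order by estimated size
    (a stand-in for a real cost formula).
    Thin mediator: no statistics, so the given order is kept blindly.
    """
    if estimated_rows is None:
        return list(sources)
    return sorted(sources, key=lambda s: estimated_rows[s])


def adapt_plan(current_order: List[str],
               observed_rows: Dict[str, int]) -> List[str]:
    """Adaptive step during execution: reorder by row counts seen so far.

    Sources with no observations yet sort last; sorted() is stable, so
    they keep their relative order.
    """
    return sorted(current_order,
                  key=lambda s: observed_rows.get(s, float("inf")))


sources = ["A", "B", "C"]
blind = initial_plan(sources)                                 # thin mediator
costed = initial_plan(sources, {"A": 1000, "B": 10, "C": 100})  # smart mediator
adapted = adapt_plan(blind, {"C": 5, "A": 50})                # runtime correction
```

The costed plan needs the statistics up front, while the blind plan starts immediately and is corrected only once data has arrived.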
Drawbacks of class 1 (smart mediator): relies heavily on wrappers; the cost of costing
Drawbacks of class 2 (thin mediator): could produce a blind or inefficient initial plan
Question: How to trade off the cost of cost-model
computation against the inefficiency of a blind
initial plan?