Provided by: ChrisO1
Transcript and Presenter's Notes

Title: CONTROL:


1
CONTROL
  • Continuous Output and Navigation Technology with
    Refinement On-Line

CONTROL group: Joe Hellerstein, Ron Avnur,
Christian Hidber, Bruce Lo, Chris Olston,
Vijayshankar Raman, Tali Roth, Kirk Wylie (UC
Berkeley)
2
Batch vs. On-Line Processing
  • Batch Processing
  • Gives 100% accurate answers, but users must wait
    for the entire query to finish . . .
  • On-Line Processing
  • Gives progressively refining answers as the query
    runs!
  • Allows users to control processing.
  • Applications of On-Line Processing
  • Large, ad-hoc queries in domains where
    approximate answers are acceptable (big
    picture)

3
Demo Outline
  • On-Line Aggregation
  • Refining estimates
  • Statistics give confidence
  • User Control
  • The user can speed up the processing of certain
    groups
  • The user can stop the processing at any time
  • On-Line Visualization
  • Displays an approximation of an image based on
    data while the data is being fetched
  • Shows the estimated density and distribution of
    data
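The "refining estimates" and "statistics give confidence" bullets can be illustrated with a running average whose confidence interval tightens as more tuples arrive. This is only a minimal sketch, not the CONTROL implementation: the function name `online_average`, the AVG aggregate, the normal-approximation interval (z = 1.96), and the synthetic data are all assumptions made for illustration, and the sketch relies on tuples arriving in random order.

```python
import math
import random


def online_average(values, z=1.96):
    """Yield (n, running_mean, half_width) after each value is seen.

    Hypothetical helper: uses Welford's update for the variance and a
    normal-approximation interval, assuming values arrive in random order.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n > 1:
            std_err = math.sqrt(m2 / (n - 1) / n)
            half_width = z * std_err
        else:
            half_width = float("inf")  # no variance estimate from one value
        yield n, mean, half_width


# Synthetic data (an assumption), shuffled to mimic random delivery.
random.seed(0)
data = [random.gauss(50.0, 10.0) for _ in range(10_000)]
random.shuffle(data)

snapshots = list(online_average(data))
for n, mean, half_width in (snapshots[99], snapshots[999], snapshots[-1]):
    print(f"after {n:>6} tuples: {mean:.2f} +/- {half_width:.2f}")
```

A user watching these snapshots could stop the query as soon as the interval is narrow enough for the decision at hand.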

4
On-Line Agg. Query Processing
  • New Access Methods
  • Randomly delivered data.
  • Index Striding
  • We can take advantage of B-Trees to access the
    groups
  • Heap Striding
  • More generally, on-line permutation
  • Non-blocking Join Algorithms
  • Ripple Join Family
  • RIPL: Rectangles of Increasing Perimeter Length
  • Join progressively larger samples of two tables

5
Access Methods for On-Line Agg.
  • Index Stride
  • Round-robin through the groups to get a fair
    sample
  • Works with an index on the grouping column
  • Heap Stride (On-Line Permutation)
  • Reorder tuples on the fly to get a fair sample
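The round-robin idea behind index striding can be sketched as follows. This is an illustrative sketch only: the per-group queues stand in for an index on the grouping column, and the function name `index_stride`, the `weights` parameter (a stand-in for a user speeding up certain groups), and the sample input string are hypothetical.

```python
from collections import defaultdict, deque


def index_stride(rows, group_key, weights=None):
    """Yield rows by cycling round-robin over the groups.

    Hypothetical sketch: per-group queues play the role of an index on
    the grouping column. `weights` maps a group to the number of rows
    fetched from it per round (default 1).
    """
    groups = defaultdict(deque)
    for row in rows:
        groups[group_key(row)].append(row)
    weights = weights or {}
    while groups:
        # Iterate over a copy of the keys so exhausted groups can be dropped.
        for key in list(groups):
            queue = groups[key]
            for _ in range(max(1, weights.get(key, 1))):
                if not queue:
                    break
                yield queue.popleft()
            if not queue:
                del groups[key]


# A skewed input: group A dominates the stored order.
rows = [(g, i) for i, g in enumerate("AAABABACDCDAAA")]

fair = "".join(g for g, _ in index_stride(rows, group_key=lambda r: r[0]))
print(fair)  # ABCDABCDAAAAAA

boosted = "".join(
    g for g, _ in index_stride(rows, group_key=lambda r: r[0], weights={"A": 2})
)
print(boosted)  # AABCDAABCDAAAA
```

In the fair ordering every group is represented early, so small groups get usable estimates long before a sequential scan would reach them.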

6
Multi-Table On-Line Aggregation
  • Progressively refining join: Ripple Join
  • Ever-larger rectangles in R × S
  • Comes in naive, block, and hash flavors
  • Benefits
  • sample from both relations simultaneously
  • gives better statistical confidences much faster
  • intimate relationship between delivery and
    estimation
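A naive, square ripple join can be sketched as below: each step draws one new tuple from each table, joins it against everything seen so far from the other table, and rescales the running match count to the full cross product. This is a sketch under stated assumptions, not the authors' code: the COUNT aggregate, equal-sized inputs delivered in random order, equal sampling rates, and the names `ripple_join_count` and `match` are all chosen for illustration.

```python
import random


def ripple_join_count(R, S, match):
    """Yield (k, estimated_join_size) after each step of a square ripple join.

    Hypothetical sketch: assumes len(R) == len(S) and random delivery order.
    """
    seen_r, seen_s = [], []
    matches = 0
    total_pairs = len(R) * len(S)
    for k, (r, s) in enumerate(zip(R, S), start=1):
        # Growing the (k-1) x (k-1) square to k x k adds an L-shaped border:
        # new r against old S tuples, old R tuples against new s, plus (r, s).
        matches += sum(match(r, t) for t in seen_s)
        matches += sum(match(t, s) for t in seen_r)
        matches += match(r, s)
        seen_r.append(r)
        seen_s.append(s)
        # Scale the matches seen in k*k pairs up to the full cross product.
        yield k, matches * total_pairs / (k * k)


# Synthetic join keys (an assumption), already in random order.
random.seed(1)
R = [random.randrange(20) for _ in range(200)]
S = [random.randrange(20) for _ in range(200)]

estimates = list(ripple_join_count(R, S, match=lambda r, s: r == s))
exact = sum(r == s for r in R for s in S)
for k in (10, 50, 200):
    print(f"step {k:>3}: estimate {estimates[k - 1][1]:.0f} (exact {exact})")
```

Once both inputs are exhausted the square covers all of R × S, so the final estimate equals the exact join size.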

7
On-Line Aggregation User Interface
[Screenshot; labeled regions: Estimates for Each Group, User Controls, Graph of Estimates w/ Confidence Intervals]
8
On-Line Visualization: CLOUDS
  • CLOUDS displays an approximation of an image
    based on data while the data is being fetched

[Figure panels: Conventional Algorithm, CLOUDS Algorithm, CLOUDS (with Index)]
Note that CLOUDS predicts the high density of cities in the Midwest.
9
Quantifying the benefit of CLOUDS
CLOUDS gives a better approximate image faster
than the conventional algorithm
[Plot: Error vs. Time (seconds), comparing Conventional and CLOUDS]