Speculative Query Processing - PowerPoint PPT Presentation

1 / 35
About This Presentation
Title:

Speculative Query Processing

Description:

Train on recent actions of the user. Large learning rate. Probability of joins ... Web queries. Multi-user database. Client collaboration. Use speculation in ... – PowerPoint PPT presentation

Number of Views:50
Avg rating:3.0/5.0
Slides: 36
Provided by: alk3
Category:

less

Transcript and Presenter's Notes

Title: Speculative Query Processing


1
Speculative Query Processing
  • Alkis Polyzotis
  • Yannis Ioannidis

2
Speculation
  • Definition (Webster dict.)
  • to assume a business risk in hope of gain
  • Success Stories
  • Computer Architecture
  • Operating Systems
  • Database Systems

3
Motivation
Timeline
Select
name
From employee
From
employee
Where agelt25
Where
agelt25
Select name
GO!
GO!
4
Applying Speculation
Select From employee Where agelt25 Into T
Select name From T
5
Roadmap
  • Introduction
  • Speculative Processing Framework
  • Speculative Manipulations
  • Cost Optimization
  • Learning the User
  • Experimental Results
  • Conclusions

6
Setting
  • Interactive exploration
  • Ad-hoc queries
  • Long sessions
  • Read-only database

Final Query
DB
Query Interface
Partial Query
7
Framework
Query Interface
DB
Speculative Actions
Speculative Query
Partial Query
Possible Manipulations
Cost Optimizer
Actions
User Model
Cost Model
8
Example Application
Query Interface
DB
Select From employee Where agelt25 Into T1
Speculative Actions
Select From employee Where agelt25
Materialize agelt25

Cost Optimizer
Speculator

Cost Model
Learner
9
Roadmap
  • Introduction
  • Speculative Processing Framework
  • Speculative Manipulations
  • Speculative Actions
  • Composition Rules
  • Cost Optimization
  • Learning the User
  • Experimental Results
  • Conclusions

10
Speculative Actions
  • Prepare DB for speculated query
  • Types
  • Data Staging
  • Histogram Creation
  • Index Creation
  • Materialization
  • Simple
  • Complex

11
Data Staging
  • Pre-fetch pages of relations
  • Objective warm up the buffer pool
  • Low risk
  • Difficult to implement!

12
Histogram Creation
  • Create histograms on predicate fields
  • Objective Help DB Optimizer
  • Low Risk

13
Index Creation
  • Create indices on predicate fields
  • Objectives
  • Provide additional access methods
  • Provide better statistics
  • Low Risk

14
Simple Materializations
  • Materialize predicates in new tables
  • High gain
  • Increased risk

S
R
15
Complex Materializations
  • Materialize joins in new tables
  • Very high gain
  • Very high risk!

Y
W
R
S
16
Roadmap
  • Introduction
  • Speculative Processing Framework
  • Speculative Manipulations
  • Speculative Actions
  • Composition Rules
  • Cost Optimization
  • Learning the User
  • Experimental Results
  • Conclusions

17
Composition Rules
S
R
18
Composition Rules
  • May cancel previous manipulations

19
Roadmap
  • Introduction
  • Speculative Processing Framework
  • Speculative Manipulations
  • Cost Optimization
  • Learning the User
  • Experimental Results
  • Conclusions

20
Cost Optimization
  • Select most promising manipulation
  • Evaluate cost in terms of
  • Gain vs. risk
  • Possible final query
  • Users behavior

21
Roadmap
  • Introduction
  • Speculative Processing Framework
  • Speculative Manipulations
  • Cost Optimization
  • Learning the User
  • Experimental Results
  • Conclusions

22
Survival of predicates
  • Will s appear in the final query?
  • Perceptron learner
  • Temporal Differences learning
  • Train on recent actions of the user
  • Large learning rate

23
Probability of joins
  • Will R,S be joined in the final query?
  • Bayesian learner
  • Re-train on previous n queries

24
Roadmap
  • Introduction
  • Speculative Processing Framework
  • Speculative Manipulations
  • Cost Optimization
  • Learning the User
  • Experimental Results
  • Conclusions

25
Methodology
  • Collect actual traces
  • Users explore a TPC-H dataset
  • Query interface records actions timestamps
  • Replay traces
  • Normal Processing
  • Speculative Processing

26
Performance Evaluation
  • Metric Absolute Improvement
  • Metric calculated as a histogram
  • Source parameter execution time

27
Configuration
  • Dual Pentium-III 500 MHz
  • 1GB physical memory
  • Linux RH 6.2
  • Oracle Server 8.1.6
  • DB uses files, not raw partition

Thank you Remzi!
28
Experiment I Comparing Gain
Buffer 32MB
Data Set 100MB
No Indices No Histograms
29
Experiment IComparing Gain
Buffer 32MB
Data Set 100MB
No Indices No Histograms
30
Experiment II Using the Cost Model
Buffer 32MB
Data Set 100MB
Indices Histograms
31
Experiment III Performance
Buffer 32MB
Data Set 100MB
Indices Histograms
32
Experiment III Performance
Buffer 320MB
Data Set 100MB
Indices Histograms
33
Experiment III Performance
Buffer 32MB
Data Set 300MB
Indices Histograms
34
Conclusions
  • Speculative Processing Framework
  • Speculative Manipulations
  • Cost Optimization
  • Experimental Results
  • Materializations offer most gain
  • Cost model avoids penalties
  • Performance improvement up to 40

35
Future Work
  • Evaluate speculation on
  • larger dataset
  • bigger schema
  • Apply speculation in
  • Search queries
  • Web queries
  • Multi-user database
  • Client collaboration
  • Use speculation in the server
Write a Comment
User Comments (0)
About PowerShow.com