ICS 214A: Database Management Systems Winter 2004 - PowerPoint PPT Presentation

1 / 35
About This Presentation
Title:

ICS 214A: Database Management Systems Winter 2004

Description:

WHERE birthdate LIKE 60' ICS214A. Notes 10. 6. Example: Parse Tree ... name MovieStar birthDate 60' ICS214A. Notes 10. 7. Example: Logical Query Plan ... – PowerPoint PPT presentation

Number of Views:21
Avg rating:3.0/5.0
Slides: 36
Provided by: che7
Category:

less

Transcript and Presenter's Notes

Title: ICS 214A: Database Management Systems Winter 2004


1
ICS 214A Database Management Systems Winter 2004
  • Lecture 10
  • Part III Query Processing and Optimization
  • Professor Chen Li

2
Big picture
We are here!
SQL statements
Access plan
Query Processor
optimizer
Read write records, scan relations
Get page containing tuples
Indexes
Record-oriented file system
Buffer manager
Basic file system
Read/write file pages
Hardware
3
Query Processor Optimizer
  • Overview
  • Query processor
  • Query optimizer

4
Overview query processing steps
  • Parsing
  • ? Validation
  • ? Optimization
  • ? Compilation
  • ? Execution

5
Example SQL query
  • Find the movies with stars born in 1960
  • SELECT title
  • FROM StarsIn
  • WHERE starName IN (
  • SELECT name
  • FROM MovieStar
  • WHERE birthdate LIKE 1960
  • )

6
Example Parse Tree
ltQuerygt
ltSFWgt
SELECT ltSelListgt FROM ltFromListgt
WHERE ltConditiongt
ltAttributegt ltRelNamegt
ltTuplegt IN ltQuerygt
title StarsIn
ltAttributegt ( ltQuerygt )
starName ltSFWgt
SELECT ltSelListgt FROM ltFromListgt
WHERE ltConditiongt
ltAttributegt ltRelNamegt
ltAttributegt LIKE ltPatterngt
name MovieStar
birthDate 1960
7
Example Logical Query Plan
?title
?starNamename
?
StarsIn ?name
?birthdate LIKE 1960
MovieStar
8
Example Improved Logical Query Plan
?title
starNamename
StarsIn ?name
?birthdate LIKE 1960
MovieStar
9
Example One Physical Plan

Hash join
SEQ scan
index scan
StarsIn MovieStar
10
Logical vs Physical Plan
  • Logical Plan
  • Physical Plan

Nested Loop
Merge Join
File scan R3
Sort
Sort
File scan R1
File scan R2
11
Role of Optimizer
  • Convert a logical plan to an optimal physical
    plan w.r.t. one or more performance metrics.
  • Performance Metrics
  • I/O,
  • CPU,
  • Communication,
  • Power Consumption,
  • Cache misses,
  • of internet queries,
  • .

12
Two topics
  • Query processor
  • Relational algebra level
  • Physical operator level
  • estimate costs
  • Query optimizer

13
Operators in Relational Model
  • Tuple-level unary operators ?,?
  • can be evaluated immediately
  • tuple at a time without looking at entire
    relation
  • Relation-level unary operators
  • S - duplicate removal
  • VL- aggregations, grouping
  • E - sorting
  • Binary operators

14
Example
R(A,B,C), S(C,D,E)
  • Select B,D
  • From R,S
  • Where R.A c ? S.E 2 ? R.CS.C

15
R A B C S C D E a 1 10 10 x 2 b 1 20 2
0 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50
y 3
16
Plan I
  • ?B,D
  • sR.Ac? S.E2 ? R.CS.C
  • x
  • R S

OR ?B,D sR.Ac? S.E2 ? R.C S.C (R x S)
17
R x S R.A R.B R.C S.C S.D S.E a 1 10 10
x 2 a 1 10 20 y 2 . .
C 2 10 10 x 2 . .
18
Plan II
  • ?B,D
  • sR.A c sS.E 2
  • R S

natural join
19
R S A B C s (R) s(S) C D
E a 1 10 A B C C D E 10
x 2 b 1 20 c 2 10 10 x 2 20 y
2 c 2 10 20 y 2 30 z 2 d 2
35 30 z 2 40 x 1 e 3 45
50 y 3
20
Plan III
  • Use R.A and S.C Indexes
  • (1) Use R.A index to select R tuples with R.A
    c
  • (2) For each R.C value found, use S.C index to
    find matching tuples
  • (3) Eliminate S tuples S.E ? 2
  • (4) Join matching R,S tuples, project B,D
    attributes and place in result

21
R S A B C C D E a 1
10 10 x 2 b 1 20 20
y 2 c 2 10 30 z 2 d 2 35
40 x 1 e 3 45
50 y 3
A
C
I1
I2
22
Query Processor physical level
  • Implements a bunch of algorithms that form the
    building blocks of the physical plan.
  • Unary operators
  • sort, hash, select, project, aggregation, group
    by.
  • binary operators
  • join, union, intersection, difference.
  • Important Issue
  • Query processor architecture
  • how is the execution of different physical
    operators synchronized and how is data passed
    form one operator to another?

23
Query Processor Architecture
  • Consider a simple join operator and query
    consisting of 2 joins. How should data be passed
    form one join to another?
  • Create temporary files.
  • Create a process for each operator and use IPC
    (e.g, shared memory, pipes).
  • OS does scheduling of these processes.

24
Query Processor Architecture (cont)
  • Take a query plan and convert as much of it as
    possible into a block that can be executed and
    one single iterative program with loops and other
    control structures
  • Implement operators as iterators with open,
    next, close as a single process easy to
    parallelize

25
Iterators
  • Demand-driven evaluation of query tree.
  • Operators exchange data in units such as records
  • Each operator supports the following interfaces
  • open
  • next
  • close
  • open() at top of tree results in cascade of opens
    down the tree.
  • An operator getting a next() call may recursively
    make next() calls from within to produce its next
    answer.
  • close() at top of tree results in cascade of
    close down the tree

26
Benefits of Iterators
  • Query executed as one process
  • Items produced one at a time
  • No temporary files
  • amenable to parallelization

27
Sample Iterators
  • Print
  • Open()
  • open input
  • Next()
  • call next on input format the item to screen
  • Close()
  • close input

28
Sample Iterators (cont)
  • Scan
  • open
  • open file
  • next
  • read next item
  • close
  • close file

29
Sample Iterators (cont)
  • Sort (merge sort)
  • open
  • open input
  • build all initial run files calling next on
    input
  • close input
  • merge run files
  • next
  • determine next output item read new item from
    correct run file
  • close
  • destroy remaining run files

30
Tuple-level unary operator
  • selection and projection
  • simple to implement using a scan
  • relation scan
  • index scan
  • Memory requirement is small
  • Disk IOs depends on
  • Records clustered?
  • Indexes available?

31
Relation-level unary operators
  • Duplicate elimination
  • Keep seen records in a buffer
  • Check if each record is in the buffer
  • To speedup the search, use a hash table
  • Grouping (MIN/MAX, COUNT, SUM, AVG)
  • Scan the records
  • Accumulate values

32
Binary operators Union
  • Could use set semantics or bag semantics
  • Assuming set semantics R ? S
  • Assume S is smaller than R
  • Read S into buffers
  • Build a search structure using the entire tuple
    as the key
  • Output S
  • For each tuple in R, output it only if it was not
    seen in the structure
  • How about Bag?

33
Binary operators Intersection
  • Set or bag
  • Assuming set semantics R ? S
  • Read S into buffers
  • Build a search structure using the entire tuple
    as the key
  • For each tuple in R, output it only if it is in
    the structure

34
Binary operators Difference
  • Set or bag
  • Assuming set semantics R - S
  • Read S into buffers
  • Build a search structure using the entire tuple
    as the key
  • For each tuple in R, output it only if it is NOT
    in the structure

35
Next
  • Binary operators Join
  • sort based
  • hash based
  • nested loop based
  • indexed based
Write a Comment
User Comments (0)
About PowerShow.com