The Query Compiler - PowerPoint PPT Presentation

1
The Query Compiler
  • Prepared by
  • Ankit Patel (226)

2
References
  • H. Garcia-Molina, J. Ullman, and J. Widom,
    Database Systems: The Complete Book, second
    edition, pp. 897-913, Prentice Hall, New Jersey,
    2008.

3
Compilation of Queries
  • Compilation means turning a query into a physical
    query plan that the query engine can execute.
  • Steps of query compilation:
      • Parsing
      • Semantic checking
      • Selection of the preferred logical query plan
      • Generation of the best physical query plan

4
The Parser
  • The first step of SQL query processing.
  • Generates a parse tree
  • Nodes in the parse tree correspond to SQL
    constructs.
  • Similar to the parser in a compiler for a
    programming language.

5
View Expansion
  • A critical part of query compilation.
  • Replaces each view reference in the query tree
    with the view's definition.
  • Provides opportunities for query optimization.
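
View expansion can be sketched as a recursive substitution over the query tree. This is a minimal illustration, not how any particular system does it: the nested-tuple plan format, the ParamountMovies view, and the function name expand_views are all assumptions made for the example.

```python
# Hypothetical plan representation (an assumption for this sketch):
#   ("rel", name), ("select", condition, child), ("project", attrs, child)
VIEWS = {
    # Assumed view definition:
    #   CREATE VIEW ParamountMovies AS
    #   SELECT title, year FROM Movies WHERE studio = 'Paramount'
    "ParamountMovies": (
        "project", ("title", "year"),
        ("select", "studio = 'Paramount'", ("rel", "Movies")),
    ),
}


def expand_views(plan, views=VIEWS):
    """Replace every ("rel", name) leaf that names a view with the view's plan."""
    op = plan[0]
    if op == "rel":
        name = plan[1]
        if name in views:
            # Expand recursively: a view may be defined over other views.
            return expand_views(views[name], views)
        return plan
    # Unary operators: keep the operator and its argument, expand the child.
    return (op, plan[1], expand_views(plan[2], views))


# SELECT title FROM ParamountMovies WHERE year = 1979
query = ("project", ("title",),
         ("select", "year = 1979", ("rel", "ParamountMovies")))
expanded = expand_views(query)
```

After expansion the two selections sit in the same tree, which is what lets the optimizer combine or reorder them.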

6
Semantic Checking
  • Checks the semantics of a SQL query by examining
    its parse tree.
  • Checks:
      • Attributes
      • Relation names
      • Types
  • Resolves attribute references.
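
The checks listed above might look roughly like the following. This is a sketch only: the SCHEMA dictionary, the relation and attribute names, and both function names are hypothetical, and a real checker works on the parse tree rather than on loose strings.

```python
# Hypothetical catalog: relation name -> {attribute name: Python type}.
SCHEMA = {
    "Movies": {"title": str, "year": int, "studio": str},
    "StarsIn": {"movieTitle": str, "starName": str},
}


def resolve_attribute(attr, relations, schema=SCHEMA):
    """Return (relation, attribute) for attr; raise ValueError if it is
    unknown or ambiguous among the relations in the FROM clause."""
    for rel in relations:
        if rel not in schema:
            raise ValueError(f"unknown relation: {rel}")
    if "." in attr:
        # Qualified reference such as Movies.title.
        rel, name = attr.split(".", 1)
        if rel not in relations or name not in schema[rel]:
            raise ValueError(f"unknown attribute: {attr}")
        return rel, name
    owners = [rel for rel in relations if attr in schema[rel]]
    if not owners:
        raise ValueError(f"unknown attribute: {attr}")
    if len(owners) > 1:
        raise ValueError(f"ambiguous attribute: {attr}")
    return owners[0], attr


def check_comparison(attr, literal, relations, schema=SCHEMA):
    """Type check for a condition of the form attr <op> literal."""
    rel, name = resolve_attribute(attr, relations, schema)
    expected = schema[rel][name]
    if not isinstance(literal, expected):
        raise ValueError(f"type mismatch: {rel}.{name} is {expected.__name__}")
    return rel, name
```

For example, resolve_attribute("year", ["Movies", "StarsIn"]) resolves the bare name to Movies.year, while comparing year with the string "1979" is rejected as a type error.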

7
Conversion to a logical query plan
  • Converts the semantically checked parse tree to a
    relational-algebra expression.
  • Conversion is straightforward, except that
    subqueries in conditions need special handling.
  • A two-argument selection can be used as an
    intermediate form for such subqueries.
8
Algebraic transformation
  • Algebraic transformations can turn a logical query
    plan into many different equivalent plans.
  • The laws used for these transformations:
      • Commutative and associative laws
      • Laws involving selection
      • Pushing selections
      • Laws involving projection
      • Laws about joins and products
      • Laws involving duplicate elimination
      • Laws involving grouping and aggregation
9
Estimating sizes of relations
  • Running time cannot be known before a plan is
    executed, so it is estimated from the sizes of
    the relations involved when selecting the best
    logical plan.
  • The two statistics that matter most when
    estimating the size of a relation:
      • Size of each relation (number of tuples)
      • Number of distinct values for each attribute
        of each relation
  • Histograms are used by some systems.
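
The two statistics above feed simple size formulas. A minimal sketch, writing T(R) for the number of tuples in R and V(R, a) for the number of distinct values of attribute a: the formulas below are the commonly used textbook estimates, which assume values are uniformly distributed; the function names and the sample numbers are assumptions for illustration.

```python
def estimate_equality_selection(t_r, v_r_a):
    """Estimated size of the selection a = constant on R: T(R) / V(R, a)."""
    return t_r / v_r_a


def estimate_natural_join(t_r, t_s, v_r_y, v_s_y):
    """Estimated size of R join S on shared attribute y:
    T(R) * T(S) / max(V(R, y), V(S, y))."""
    return t_r * t_s / max(v_r_y, v_s_y)


# Assumed statistics: R(a, b) with T(R) = 1000 and V(R, b) = 20;
# S(b, c) with T(S) = 2000 and V(S, b) = 50.
selection_size = estimate_equality_selection(1000, 20)   # 1000 / 20
join_size = estimate_natural_join(1000, 2000, 20, 50)    # 1000 * 2000 / 50
```

With these numbers the selection is estimated at 50 tuples and the join at 40,000 tuples. A histogram, where available, replaces the uniformity assumption with actual value frequencies.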

10
Cost based optimizing
  • The best physical query plan is the one with the
    least estimated cost.
  • Factors that decide the cost of a query plan:
      • The order and grouping of operations such as
        joins, unions, and intersections.
      • The join algorithm used, e.g. nested-loop join
        or hash join.
      • Scanning and sorting operations.
      • Storing intermediate results.
11
Plan enumeration strategies
  • Common approaches for searching the space of
    physical plans for the best one:
      • Dynamic programming: tabulate the best plan
        for each subexpression.
      • Selinger-style dynamic programming: also
        record the sort order of the results as part
        of the table.
      • Greedy approaches: make a series of locally
        optimal decisions.
      • Branch-and-bound: start from an initial plan
        and discard any plan that costs more than the
        best plan found so far.
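
The dynamic-programming approach can be sketched for join ordering as follows. Everything specific here is an assumption made for illustration: the cost of a plan is taken to be the total estimated size of its intermediate results, sizes are estimated from hypothetical per-pair selectivities, and only left-deep orders are considered to keep the code short.

```python
from itertools import combinations
from math import prod


def subset_size(rels, sizes, selectivity):
    """Estimated tuples in the join of rels: product of the relation sizes
    times the selectivity of every joined pair (1.0 if the pair has none)."""
    size = prod(sizes[r] for r in rels)
    for a, b in combinations(sorted(rels), 2):
        size *= selectivity.get((a, b), 1.0)
    return size


def best_left_deep_order(sizes, selectivity):
    """Tabulate the cheapest join order for every subset of relations."""
    names = sorted(sizes)
    # best[subset] = (cost, join order); a single stored relation costs nothing.
    best = {frozenset([r]): (0.0, (r,)) for r in names}
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            subset = frozenset(combo)
            candidates = []
            for last in combo:
                rest = subset - {last}
                cost, order = best[rest]
                # Joining `last` onto `rest` makes `rest` an intermediate
                # result, unless `rest` is a single stored relation.
                if len(rest) > 1:
                    cost += subset_size(rest, sizes, selectivity)
                candidates.append((cost, order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(names)]


# Hypothetical statistics; selectivity keys are alphabetically ordered pairs.
sizes = {"R": 1000, "S": 2000, "T": 500}
selectivity = {("R", "S"): 0.001, ("S", "T"): 0.0005}
cost, order = best_left_deep_order(sizes, selectivity)
```

Here the table entry for each two-relation subset is reused when the three-relation plan is built, so no subplan is costed twice. With the numbers above, joining S and T first gives the smallest intermediate result, so the chosen order starts with those two relations.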

12
Left-Deep join trees
  • Left-deep join trees are binary trees with a
    single spine down the left edge and with leaves
    as right children.
  • The search is restricted to left-deep join trees
    when picking a grouping and order for the join of
    several relations.
  • This restriction reduces the number of plans to
    be considered for the best physical plan.
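
How much the restriction helps can be seen by counting trees. A small sketch, assuming the usual counting argument: the number of binary tree shapes with n leaves is a Catalan number, each shape admits n! assignments of relations to leaves, and there is only one left-deep shape. The function names are made up for the example.

```python
from math import comb, factorial


def tree_shapes(n):
    """Number of binary tree shapes with n leaves: the Catalan number C(n - 1)."""
    return comb(2 * (n - 1), n - 1) // n


def all_join_trees(n):
    """Every shape combined with every assignment of n relations to its leaves."""
    return tree_shapes(n) * factorial(n)


def left_deep_join_trees(n):
    """One left-deep shape, so only the order of the n relations varies."""
    return factorial(n)
```

For six relations, all_join_trees(6) is 42 * 720 = 30240, while left_deep_join_trees(6) is 720.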

13
Physical Plans for Selection
  • A selection can be broken into an index-scan of
    the relation, followed by a filter operation.
  • The filter then examines the tuples retrieved by
    the index-scan.
  • It passes only those tuples that meet the
    remaining portions of the selection condition.
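
The index-scan plus filter plan might be sketched like this for a condition such as studio = 'Disney' AND year > 1990. The in-memory dictionary standing in for an index, the sample rows, and all function names are assumptions for illustration.

```python
from collections import defaultdict


def build_index(rows, attr):
    """Toy stand-in for an index: attribute value -> list of matching rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[attr]].append(row)
    return index


def index_scan(index, value):
    """Yield only the tuples whose indexed attribute equals value."""
    yield from index.get(value, [])


def filter_rows(rows, predicate):
    """Pass on only the tuples that meet the rest of the condition."""
    for row in rows:
        if predicate(row):
            yield row


# Hypothetical data.
movies = [
    {"title": "A", "studio": "Disney", "year": 1985},
    {"title": "B", "studio": "Disney", "year": 1994},
    {"title": "C", "studio": "Fox", "year": 1999},
]
studio_index = build_index(movies, "studio")

# The index handles studio = 'Disney'; the filter handles year > 1990.
candidates = index_scan(studio_index, "Disney")
result = list(filter_rows(candidates, lambda row: row["year"] > 1990))
```

Only the two Disney tuples reach the filter, and only movie B passes it; the Fox tuple is never examined.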

14
Pipelining versus Materializing
  • An operator usually consumes the result of
    another operator.
  • Pipelining: the result is passed from one
    operator to the next through main memory, a
    tuple at a time, without being stored.
  • Materialization: the intermediate result is
    stored, typically on disk, until the consuming
    operator needs it, which frees main memory for
    other operators.
  • The physical query plan generator should
    consider both pipelining and materialization.
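
The difference can be illustrated with two toy operators. This is a sketch under stated assumptions: Python generators stand in for pipelined operators, a list stands in for a materialized intermediate result (which a real system would typically write to disk), and the log only records the order in which the operators touch each tuple.

```python
def scan(rows, log):
    """Producer operator: emits one tuple at a time."""
    for row in rows:
        log.append(f"scan {row}")
        yield row


def select_even(rows, log):
    """Consumer operator: keeps even values."""
    for row in rows:
        if row % 2 == 0:
            log.append(f"select {row}")
            yield row


def run_pipelined(rows):
    log = []
    # The consumer pulls tuples from the producer one by one;
    # no intermediate result is ever stored in full.
    result = list(select_even(scan(rows, log), log))
    return result, log


def run_materialized(rows):
    log = []
    # The whole intermediate result is stored before the consumer starts.
    intermediate = list(scan(rows, log))
    result = list(select_even(intermediate, log))
    return result, log
```

Both functions return the same result for the same input. In the pipelined log the scan and select entries interleave, while in the materialized log every scan entry precedes the first select entry.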