QUERY PROCESSING - PowerPoint PPT Presentation

1 / 27
About This Presentation
Title:

QUERY PROCESSING

Description:

5 London branches; no indexes or sort keys; ... (1) (position='Manager') (city='London') (Staff.branchNo=Branch.branchNo) (Staff X Branch) ... – PowerPoint PPT presentation

Number of Views:88
Avg rating:3.0/5.0
Slides: 28
Provided by: jaydp
Category:

less

Transcript and Presenter's Notes

Title: QUERY PROCESSING


1
QUERY PROCESSING
  • Topic Number 16
  • AANAL PATEL

2
Objectives
  • Needs of Query Processing.
  • Overview of Query Processing and Optimization.
  • Dynamic versus Static Optimization.
  • Phases of Query Decomposition.
  • Methods of Query Optimization.
  • How to create a relational algebra tree to
    represent a query.
  • Questions.
  • Quiz.

3
Introduction
  • Database grows larger and larger as time passes,
    one of the most important characteristics of a
    database is its ability to maintain a consistent
    and acceptable level of performance.
  • The principal mechanism to obtain optimal level
    of performance of database is database query
    processing.
  • In network and hierarchical DBMSs, low level
    procedural language is generally embedded in
    high-level programming language.
  • In languages like SQL, user only specifies the
    data to be retrieved but not how to retrieve
    that.
  • Gives DBMS more control over system performance.

4
Overview of Query Processing
  • Definition
  • The activities involved in parsing,
    validating, optimizing, and executing a query.





  • Aim
  • Transform query written in high-level
    language (e.g. SQL), into correct and efficient
    execution strategy expressed in low-level
    language (implementing Relational Algebra).

High level user query
Query Processor
Low level data manipulation commands
5
Query Optimization
  • Defination
  • The activity of choosing an efficient execution
    strategy for processing a query.
  • Aim
  • To choose a transformation that minimizes
    resource usage.
  • Reduce total execution time of query.
  • May also reduce response time of query.
  • Problem computationally intractable with large
    number of relations, so strategy adopted is
    reduced to finding near optimum solution.

6
Example Comparison of different processing
strategies
  • Find all Managers who work at a London branch.
  • Query in SQL
  • SELECT FROM Staff s, Branch b
  • WHERE s.branchNo b.branchNo AND
  • (s.position Manager AND b.city London)
  • Assumptions
  • 1000 tuples in Staff
  • 50 tuples in Branch
  • 50 Managers
  • 5 London branches
  • no indexes or sort keys
  • results of any intermediate operations stored on
    disk
  • cost of the final write is ignored
  • tuples are accessed one at a time.

7
  • Three equivalent relational algebra queries are
  • (1) ?(position'Manager') ? (city'London') ?
    (Staff.branchNoBranch.branchNo) (Staff
    X Branch)
  • (2) ?(position'Manager') ? (city'London')(Staff
    X Staff.branchNoBranch.branchNo Branch)
  • (3) (?position'Manager'(Staff)) X
    Staff.branchNoBranch.branchNo (?city'London'(Bra
    nch))
  • Cost (in disk accesses) are
  • C1((1000 50) 2(1000 50)) 101 050
  • C2(21000 (1000 50)) 3 050
  • C3 (1000 250 5 (50 5)) 1 160
  • If we compare all three above costs C1, C2, C3,
    than best case is C3 with ration of 871

8
Dynamic versus Static Optimization
  • Dynamic Optimization Static
    Optimization
  • Information required to select an optimum
    strategy is up to date.
  • Performance of query is affected.
  • Time may limit finding optimum strategy.
  • The runtime overhead is removed, so gets more
    time to find optimum strategy.
  • Chosen optimum strategy may no longer be optimal
    when query is run.
  • A hybrid approach could be used to overcome this.

9
Phases of Query Processing
  • Query Processing can be divided into four main
    phases
  • Query Decomposition
  • Query Optimization
  • Code Generation
  • Runtime Query Execution

10
Query Decomposition
  • Aims are to transform high-level query into
    Relational Algebra query and check that query is
    syntactically and semantically correct.
  • Typical stages are
  • analysis
  • normalization
  • semantic analysis
  • simplification
  • query restructuring

11
Analysis
  • Analyze query lexically and syntactically using
    compiler techniques.
  • Verify relations and attributes exist.
  • Verify operations are appropriate for object
    type.
  • Example
  • SELECT studentNo
  • FROM Student
  • WHERE grade gt 80
  • In analysis stage query would be rejected on two
    points
  • 1. In the select list, the attribute studentNo is
    not defined for the Student relation. It may be
    defined as studentId instead of studentNo.
  • 2. In the WHERE clause, the comparison is
    incompatible with the data type grade, which is a
    variable character string like A,B,C,D,Fa
    il.

12
Query Tree
  • A Query tree is a tree structure that corresponds
    to a relational algebra expression such that
  • - A Leaf node created for each base relation.
  • - A Non-leaf node created for each
    intermediate relation produced ? by RA
    operation.
  • - The Root of tree represents query
    result.
  • - The Sequence is directed from leaves
    to root.

13
Normalization
  • Converts query into a normalized form for easier
    manipulation.
  • Predicate can be converted into one of two forms
  • Conjunctive normal form
  • (position 'Manager' ? salary gt 20000) ?
    (branchNo 'B003')
  • Disjunctive normal form
  • (position 'Manager' ? branchNo 'B003' ) ?
  • (salary gt 20000 ? branchNo 'B003')

14
Semantic Analysis
  • Objective of semantic analysis is to rejects
    normalized queries that are incorrectly
    formulated or contradictory.
  • Query is incorrectly formulated if components do
    not contribute to generation of result.
  • Query is contradictory if its predicated can not
    be satisfied by any tuple.
  • Algorithms to determine correctness exits only
    for queries that do not contain disjunction and
    negation. For these queries we could construct
  • 1. A relation connection graph.
  • 2. Normalized attribute connection graph.
  • A Relation Connection Graph
  • Create node for each relation and node for
    result.
  • Create edges between two nodes that represent a
    join, and edges between nodes that represent
    projection.
  • If not connected, query is incorrectly
    formulated.

15
  • Example Checking Semantic Correctness.
  • SELECT p.propertyNo, p.street
  • FROM Client c, Viewing v, PropertyForRent p
  • WHERE c.clientNo v.clientNo AND c.maxRent gt
    500
  • AND c.prefType Flat AND
    p.ownerNo CO93

16
  • Normalized Attribute Connection Graph
  • Create node for each reference to an attribute,
    or constant 0.
  • Create directed edges between nodes that
    represent a join, and directed edge between
    attribute node and 0 nodes that represents
    selection.
  • Weight edges a ? b with value c, if it represents
    inequality condition (a ? b c) weight edges 0
    ? a with c, if it represents inequality
    condition (a ? c).
  • If graph has cycle for which valuation sum is
    negative , query is contradictory.
  • Example Checking Semantic Correctness.
  • SELECT p.propertyNo, p.street
  • FROM Client c, Viewing v, PropertyForRent p
  • WHERE c.maxRent gt 500 AND c.clientNo
    v.clientNo AND v.propertyNo p.propertyNo
    AND
  • c.prefType Flat AND c.maxRent lt 200

17
(No Transcript)
18
Simplification
  • The objectives of the simplification stage are to
    deduct redundant qualifications, eliminate common
    subexperssions, and transform the query into more
    easily and efficiently computed form.
  • Typically, access restrictions, view definitions,
    and integrity constraints are considered.
  • Assuming user has appropriate access privileges,
    first apply well-known idempotency rules of
    Boolean algebra.
  • Query restructuring
  • The query is restructured to provide a more
    efficient implementation of query processing.

19
Methods of Query Optimization
  • 1. Heuristic based query optimization.
  • Perform Selection operations as early as
    possible.
  • Combine Cartesian product with subsequent
    selection whose predicate represents join
    condition into a Join operation.
  • Use associatively of binary operations to
    rearrange leaf nodes so leaf nodes with most
    restrictive Selection operations executed first.
  • Perform Projections operations as early as
    possible.
  • Eliminate duplicate computations.

20
  • 2. Cost based query optimization.
  • Objective of Cost-based query optimization is
    estimate the cost of different equivalent query
    expressions and chose the execution plan with the
    lowest cost.
  • Cost of query is concerned on disk access, which
    are slow compared to memory access.
  • Many of cost estimations are based on the
    cardinality of the relation.
  • DBMS holds the statistical information in its
    system catalog.
  • DBMS updates the statistical information daily
    or periodically as per organization requirements.

21
Transformation rules for the RA operations
  • Conjunctive Selection operations can cascade into
    individual Selection operations (and vice
    versa).
  • ?p?q?r(R) ?p(?q(?r(R)))
  • Sometimes referred to as cascade of Selection.
  • Example ?branchNo'B003' ? salarygt15000(Staff)
    ?branchNo'B003'(?salarygt150
    00(Staff))
  • Commutativity of Selection.
  • ?p(?q(R)) ?q(?p(R))
  • Example ?branchNo'B003'(?salarygt15000(Staff))
    ?salarygt15000(?branchNo'B003'(Staff))

22
  • In a sequence of Projection operations, only the
    last in the sequence is required.
  • ?L?M ?N(R) ?L (R)
  • Example ?lName?branchNo, lName(Staff) ?lName
    (Staff)
  • Commutativity of Selection and Projection.
  • If predicate p involves only attributes in
    projection list, Selection and Projection
    operations commute
  • ?Ai, , Am(?p(R)) ?p(?Ai, , Am(R)) ,where p?
    A1, A2, , Am
  • Example ?fName, lName(?lNameSmith'(Staff))

    ?lNameSmith'(?fName,lName(Staff))
  • Commutativity of Theta Join
  • R Xp S S Xp R
  • R X S S X R

23
  • Rule also applies to Equijoin and Natural join.
  • Example Staff X staff.branchNobranch.branchNo
    Branch Branch X staff.branchNobranch.branchN
    o Staff
  • Commutativity of Selection and Theta join (or
    Cartesian product).
  • If selection predicate involves only attributes
    of one of join
    relations, Selection
    and Join (or Cartesian product) operations
    commute as
  • ?p(R Xr S) (?p(R)) Xr S
  • ?p(R X S) (?p(R)) X S where p? A1,
    A2, , An

24
  • Commutativity of Projection and Theta join (or
    Cartesian product).
  • If projection list is of form L L1 ? L2, where
    L1 only has attributes of R, and L2 only has
    attributes of S, provided join condition only
    contains attributes of L, Projection and Theta
    join commute
  • ?L1?L2(R Xr S) (?L1(R)) Xr (?L2(S))
  • If join condition contains additional attributes
    not in L (M M1 ? M2 where M1 only has
    attributes of R, and M2 only has attributes of
    S), a final projection operation is require

?L1?L2(R Xr S) ?L1?L2( (?L1?M1(R)) Xr
(?L2?M2(S)))
Example?position,city,branchNo(StaffXStaff.bran
chNoBranch.branchNo Branch) (?position,
branchNo(Staff)) XStaff.branchNo
Branch.branchNo (?city, branchNo (Branch))
25
Questions
  • ???

26
  • 1. According to presentation how we can define
    Query Processing?
  • a) the Parsing, Planning, Optimizing, and
    Translating of a query
  • b) the Parsing, Analyzing, Customizing, and
    Execution of a query
  • c) the Parsing, Validating, Optimizing, and
    Execution of a query
  • d) the Formulating, Entering, Optimizing, and
    Translating of a query
  • e) none of above
  • 2. In which Phase of Query Processing are
    Database Statistics are used?
  • a) Runtime Query Execution
  • b) Query Optimization
  • c) Query Decomposition
  • d) Code Generation
  • e) both b) and c)
  • 3. Objective of Semantic Analysis is to reject
    normalized queries that are

27
  • 1. According to presentation how we can define
    Query Processing?
  • a) the Parsing, Planning, Optimizing, and
    Translating of a query
  • b) the Parsing, Analyzing, Customizing, and
    Execution of a query
  • c) the Parsing, Validating, Optimizing, and
    Execution of a query
  • d) the Formulating, Entering, Optimizing, and
    Translating of a query
  • e) none of above
  • 2. In which Phase of Query Processing are
    Database Statistics are used?
  • a) Runtime Query Execution
  • b) Query Optimization
  • c) Query Decomposition
  • d) Code Generation
  • e) both b) and c)
  • 3. Objective of Semantic Analysis is to reject
    normalized queries that are
Write a Comment
User Comments (0)
About PowerShow.com