Optimization of Distributed Queries - PowerPoint PPT Presentation

1 / 41
About This Presentation
Title:

Optimization of Distributed Queries

Description:

Title: Data Mining Engineering Author: Peter Brezany Last modified by: brezany Created Date: 8/2/1995 10:08:02 AM Document presentation format: A4 Paper (210x297 mm) – PowerPoint PPT presentation

Number of Views:102
Avg rating:3.0/5.0
Slides: 42
Provided by: PeterB243
Category:

less

Transcript and Presenter's Notes

Title: Optimization of Distributed Queries


1
Optimization of Distributed Queries
  • Univ.-Prof. Dr. Peter Brezany
  • Institut für Scientific Computing
  • Universität Wien
  • Tel. 4277 39425
  • Sprechstunde Di, 13.00-14.00
  • LV-Portal www.par.univie.ac.at/brezany/teach/gck
    fk/300658.html

2
Layers of Query Processing
3
Query Optimization Process
4
Search Space
  • For a given query, the search space can be
    defined as the set of equivalent operator trees,
    that can be produced using transformation rules.
  • It is useful to concentrate on join trees,
    operator trees whose operators are join or
    Cartesian product.
  • This is because permutations of the join order
    have the most important effect on performance of
    relational queries.
  • Next example illustrates 3 equivalent join trees,
    which are obtained by exploiting the
    associativity of binary operators. Join tree (c)
    which starts with a Cartesian product may have a
    much higher cost than other join trees.

5
Search Space - Example
Example SELECT ENAME, RESP FROM EMP,
ASG, PROJ WHERE EMP.ENOASG.ENO AND ASG.PNOP
ROJ.PNO
6
Search Space Shape of the Join Tree
A linear tree at least one operand of each
operand node is a base relation. A bushy tree is
more general and may have operators whose both
operands are intermediate operators. In a
distributed environment, bushy trees are useful
in exhibiting parallelism.
7
Distributed Cost Model
  • An optimizers cost model includes
  • Cost functions to predict the cost of operators
  • Statistics and base data and formulas to evaluate
    the sizes of intermediate results.
  • Cost Functions can be expressed with respect to
    either the total time or the response time.The
    total time is the sum of all time (cost)
    components, the response time is the elapsed time
    from the initiation to the completion of the
    query.
  • Total_time TCPU insts TI/O I/Os TMSG
    msgs TTR bytes
  • TCPU the time of a CPU instruction
  • TI/O - the time of a disk I/O
  • TMSG - the fixed time of initiating and receiving
    a message
  • TTR - the time it takes to transmit a data unit
    from one site to another
  • Costs are generally expressed in terms of time
    units, which in turn, can be translated into
    other units (e.g., dollars).

8
Cost Function (cont.)
  • When the response time of the query is the
    objective function of the optimizer, parallel
    local processing and parallel communications must
    also be considered.
  • Response_time TCPU seq_insts TI/O
    seq_I/Os TMSG seq_msgs TTR seq_bytes
  • Example

Site 1
x units
Site 3
Site 2
y units
Most early distributed DBMSs designed for wide
area networks have ignored the local processing
cost and concentrate on minimizing the
communication cost. Total_time 2 TMSG TTR
(x y) Respone_time max TMSG TTR x,
TMSG TTR y since the transfers can be done
in parallel.
9
Cost Function (cont.)
  • Minimizing response time is achieved by
    increasing the degree of parallel execution.
  • This does not imply that the total time is also
    minimized.
  • On contrary, it can increase the total time, for
    example by having more parallel local processing
    (often includes synchronization overhead) and
    transmissions.
  • Minimizing the total time implies that the
    utilization of the resources improves, thus
    increasing the system throughput.
  • In practice, a compromise between the total and
    response times is desired.

10
Database Statistics
  • The main factor affecting the performance is the
    size of the intermediate relations that are
    produced during the execution.
  • When a subsequent operation is located at a
    different site, the intermediate relation must be
    transmitted over the network. ? It is of prime
    interest to estimate the size of the intermediate
    results in order to minimize the size of data
    transfers.
  • The estimation is based on statistical
    information about the base relations and formulas
    to predict the cardinalities of the results of
    the relational operations. ? the more precise
    statistics being the more costly.
  • For a relation R defined over the attributes A
    A1, A2, ..., An and fragmented as R1, R2, ...,
    Rr, the statistical data are the following

11
Database Statistics (cont.)
12
Database Statistics (cont.)
13
Database Statistics (cont.)
14
Database Statistics (cont.)
15
Centralized Query Optimization- INGRES Algorithm
  • Why reviewing centralized optimizations?
  • A distributed query is is translated into local
    queries, each of which is processed in a
    centralized way.
  • Distributed techniques are extensions of
    centralized ones
  • Centralized optimization is a simpler problem
    the minimization of communication costs makes
    distributed query optimization more complex.
  • INGRES is a popular relational DB system and it
    has a distributed version whose optimization
    algorithms are extensions of the centralized
    version.

16
INGRES Algorithm
  • It uses a dynamic query optimization algorithm
    that recursively breaks-up a calculus query into
    smaller pieces.
  • It combines calculus-algebra decomposition and
    optimization.
  • A query is first decomposed into a subsequence of
    queries having a unique relation in common. Then
    each monorelation query is processed by a
    one-variable query processor (OVQP).
  • The OVQP optimizes the access to a single
    relation by selecting the best access method to
    that relation (e.g., index, sequential scan).

17
INGRES Algorithm (cont.)
  • By q qi-1 ? qi we denote a query q decomposed
    into 2 subqueries qi-1 and qi, where qi-1 is
    executed first and its result is consumed by qi.
  • Given an n-relation query q, the INGRES query
    processor decomposes q into n subqueries q1 ? q2
    ? ... ? qi. This decomposition uses two basic
    techniques detachment and substitution.
  • Detachment is used first it breaks q into q
    ? q, based on a common relation that is the
    result of q.
  • A more detailed explanation and examples
    follow.

18
INGRES Algorithm - Detachment
If the query q is expressed in SQL is of the form
19
Database Example
20
INGRES Algorithm Detachment - Example
Running Example for INGRES To illustrate the
detachment technique, we apply it to the
following query Names of employees working
on the CAD/CAM project This query can be
expressed by the following query q1 on our
example engineering DB.
21
INGRES Algorithm - Substitution
22
INGRES Algorithm Substitution - Example
OVQP one-variable query processor
23
INGRES Algorithm (formalized)
24
JOIN ORDERING IN FRAGMENT QUERIES
  • Ordering joins is an important aspect of
    centralized query optimization.
  • Join ordering in a distributed context is even
    more important since joins between fragments may
    increase the communication time.
  • 2 basic approaches exist to order joins in
    fragment queries
  • direct optimization of the ordering of joins
    (e.g. in the Distributed INGRES algorithm).
  • replacement of joins by combination of semijoins
    in order to minimize communication costs.

25
Join Ordering
  • Note To simplify notation, we use the term
    relation to designate a fragment stored at a
    particular site.
  • Lets the query is R ? S, where R and S are
    relations stored at different sites and ? denotes
    the join operator. The obvious choice is to send
    the smaller relation to the site of the larger
    one.

if size(R) lt size(S)
R
S
if size(R) gt size(S)
More interesting is the case where there are more
than 2 relations to join. The objective of the
join ordering algorithm is to transmit smaller
operands. The difficulty the join operations may
reduce or increase the size of intermediate
results ? estimating the size of joint results
is mandatory, but difficult.
26
Join Ordering - Example
Example Consider the following query expressed
in relat. alg. PROJ ?PNO
EMP ?ENO ASG whose join graph is below
Site 2
ASG
ENO
PNO
EMP
PROJ
Site 3
Site 1
This query can be executed in at least 5
different ways. We describe them by the programs
introduced in the next slide.
27
Join Ordering Example (cont.)
28
Semijoin Based Algorithms
  • The join of 2 relations R and S over attribute A,
    stored at sites 1 and 2 respectively, can be
    computed by replacing one or both operand
    relations by a semijoin with the other relation,
    using the following rules

Strategy 1
Strategy 1
29
Horizontal fragmentation
30
Vertical fragmentation
31
Hybrid (Mixed) Fragmentation
In most cases a simple horizontal or vertical
fragmentation will not be sufficient to satisfy
the requirements of user applications. In this
case a vertical fragmentation may be followed by
a horizontal one, or vice versa, producing a
tree-structured partitioning.
Hybrid fragmentation
Reconstruction of hybrid fragmentation
32
Hybrid Fragmentation (cont.)
Relation A
Fragment H1
Fragment H2V2
Fragment H2V1H1
Fragment H2V1H2
Fragment H2V1H3
33
Distributed INGRES Algorithm
  • Only horizontal fragmentation is handled.
  • General and broadcast (the same data unit can be
    transmitted from one site to all the other sites
    in a simple transfer) network topology
    considered.
  • E.g., broadcasting is used to replicate fragments
    and then to maximize the degree of parallelism.
  • The input is a query expressed in tuple
    relational calculus and schema information the
    network type the location and size of each
    fragment.
  • The algorithm is executed by the site, called the
    master site, where the query is initiated.
  • The algorithm called D-INGRES-QOA is given in
    Algorithm 9.3. in the next slide.

34
Distributed INGRES Algorithm
MRQ_list
35
Distributed INGRES Algorithm (cont.)
  • All monorelation queries (e.g. selection and
    projection) that can be detached are first
    processed locally Step (1)
  • Reduction algorithm is applied to the original
    query Step (2). (Reduction is a technique that
    isolates all irreducible subqueries and
    monorelation subqueries by detachment.)
    Monorelation queries are ignored because they
    have already been processed in step (1). Thus the
    REDUCE procedure produces a sequence of
    irreducible subqueries q1 ? q2 ? ... ? qn, with
    at most one relation in common between two
    consecutive subqueries. Our Running Example in
    the previous slides, which illustrated the
    detachment technique, also illustrates what the
    REDUCE procedure would produce.
  • Based on the list of irreducible queries isolated
    in step (2) and the size of each fragment, the
    next subquery MRQ, which has at least 2
    variables, is chosen at Step (3.1) and Steps
    (3.2), (3.3), and 3.4 are applied to it.
  • Step 3.2 selects the best strategy to process the
    query MRQ. This strategy is described by a list
    of pairs (F, S), in which F is a fragment to
    transfer to the processing site S.
  • Step 3.3 transfers all the fragments to their
    processing sites.
  • Step 3.4 executes MRQ.
  • If there are remaining subqueries, the algorithm
    goes back to step (3) and performs the next
    iteration. Otherwise, the algorithm terminates.

36
Distributed INGRES Algorithm (cont.)
  • Optimization occurs in steps (3.1) and (3.2). The
    algorithm has produced subqueries with several
    components and their dependency order (similar to
    one given by a relational algebra tree).
  • At step (3.1) a simple choice for the next
    subquery is to take the next one having no
    predecessor and involving the smaller fragments.
    This minimize the size of the intermediate
    results.
  • E.g., if a query q has the subqueries q1, q2, and
    q3, with dependencies q1 ? q3, q2 ? q3, and if
    the fragments referred to by q1 are smaller than
    those referred to by q2, then q1 is selected.
  • The subquery selected must then be executed.
    Since the relation involved in a subquery may be
    stored at different sites and even fragmented,
    the subquery may nevertheless be further
    subdivided.

37
Distributed INGRES Algorithm (cont.) - Example
Assume that relations EMP, ASG, and PROJ of the
query of our Running Example are stored as
follows, where relation EMP is fragmented.
38
Distributed INGRES Algorithm (cont.)
  • At step (3.2), the next optimization problem is
    to determine how to execute the subquery by
    selecting the fragments that will be moved and
    the sites where the processing will take place.
  • For an n-relation subquery, fragments from n-1
    relations must be moved to the site(s) of
    fragments of the remaining relation, say Rp, and
    then replicated there. Also, the remaining
    relation may be further partitioned into k
    equalized fragments in order to increase
    parallelism. This method is called
    fragment-and-replicate and performs a
    substitution of fragments rather than of tuples
    as in centralized INGRES.
  • The selection of the remaining relation and of
    the number of processing sites k on which it
    should be partitioned is based on the objective
    function and the topology of the network.
    (Replication is cheaper in broadcast networks
    than in point-to-point networks).

39
Distributed INGRES Algorithm (cont.) - Example
40
Architecture of Distributed DBMS Revisited
Detailed Model of the Distributed
Execution Monitor
Components of a Distributed DBMS
41
Architecture Revisited (cont.)
  • The Transaction Manager (TM) is responsible for
    coordinating the execution of the DB operations
    on behalf of an application.
  • The Scheduler (SC) is responsible for the
    implementation of a specific concurrency control
    algorithm for synchronizing access to the
    database.
  • Each transaction originates at one site its
    originating site. The execution of the database
    operations of a transaction is coordinated by the
    TM at that transactions originating site.
  • The TMs implement an interface for the
    applications programs which consists of 5
    commands
  • Begin_transaction. This is an indicator to the TM
    that a new transaction is starting. The TM does
    some bookkeeping, such as recording the
    transactions name, the originating application,
    etc.
  • Read. If the data item x is stored locally, ist
    value is read and returned to the transaction.
    Otherwise, the TM selects one copy of x and
    requests ist copy to be returned.
  • Write. The TM coordinates the updating ofn xs
    value at each site where it resides.
  • Commit. The TM coordinates the physical updating
    of all databases that contain copies of each data
    item for which a previous write was issued.
  • Abort. The TM makes sure that no effects of the
    transaction are reflected in the DB.
  • In providing these services, a TM can
    communicate with SCs and data processors at the
    same or at different sites.
Write a Comment
User Comments (0)
About PowerShow.com