Title: Communicating with the Outside
1. Communicating with the Outside
2. Overview
3. Package several SQL statements within one call to the database server
- Use an embedded procedural language (Transact-SQL) with control-flow facilities.
- Use User-Defined Functions (UDFs) when they filter out a high number of records.
- Do not use positioned updates, which update rows obtained as the result of a query and therefore force updates one row at a time.
4. Packing SQL statements into one call to the DB server
create table Temp1(parent varchar(200))
create table Temp2(parent varchar(200))
create table Ancestor(person varchar(200))

/* Temp2 will hold the latest generation discovered. */
INSERT INTO Temp1
  SELECT parent FROM Parental WHERE child = 'Nicholas Bennet'

WHILE EXISTS (SELECT * FROM Temp1)
BEGIN
  INSERT INTO Ancestor SELECT * FROM Temp1
  INSERT INTO Temp2 SELECT * FROM Temp1
  DELETE FROM Temp1
  INSERT INTO Temp1
    SELECT Parental.parent FROM Parental, Temp2
    WHERE Parental.child = Temp2.parent
  DELETE FROM Temp2
END

IF EXISTS (SELECT * FROM Ancestor WHERE person = 'Carol Diane')
  PRINT 'Carol Diane is an ancestor of Nicholas Bennet.'
ELSE
  PRINT 'Carol Diane is not an ancestor of Nicholas Bennet.'
T-SQL code that determines whether Carol Diane is an ancestor of Nicholas Bennet. It assumes a genealogical database containing at least the relation Parental(parent, child).
5. User-Defined Functions
- Example: a function that computes the number of working days between two dates.
- The function can be executed either on the database site (as a UDF) or on the application site.
- Applying the UDF yields good performance when it significantly reduces the amount of data sent back to the application (see the sketch below).
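A minimal Transact-SQL sketch of such a function, under simplifying assumptions: it counts only weekdays and ignores holidays, and the Shipments table and its columns are hypothetical (not part of the original slides). Evaluating the function in the WHERE clause lets the server discard non-qualifying rows instead of shipping them all to the application.

-- Hypothetical sketch: number of weekdays (Mon-Fri) between two dates, holidays ignored.
-- Assumes an English-language session, since DATENAME returns localized weekday names.
CREATE FUNCTION dbo.WorkingDays (@start date, @finish date)
RETURNS int
AS
BEGIN
    DECLARE @days int = 0;
    DECLARE @d date = @start;
    WHILE @d <= @finish
    BEGIN
        IF DATENAME(weekday, @d) NOT IN ('Saturday', 'Sunday')
            SET @days = @days + 1;
        SET @d = DATEADD(day, 1, @d);
    END
    RETURN @days;
END
GO

-- Server-side filtering: only the slow shipments cross the application interface.
SELECT shipment_id
FROM Shipments
WHERE dbo.WorkingDays(ship_date, receipt_date) > 10;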
6. Retrieve Needed Columns Only
- Avoid transferring unnecessary data.
- Selecting all columns might prevent the use of a covering index, i.e., an index-only scan (see the sketch below).
- In the experiment the subset contains ¼ of the attributes.
- Reducing the amount of data that crosses the application interface yields a significant performance improvement.
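A rough illustration of both points, using a hypothetical Employee table and index (none of these names come from the original experiment): selecting only the needed columns reduces the transferred data, and an index that covers exactly those columns can answer the query with an index-only scan.

-- Hypothetical index covering the two columns the application needs (SQL Server syntax).
CREATE INDEX idx_emp_dept ON Employee (dept_id) INCLUDE (salary);

-- Covered query: served by an index-only scan, and only two columns cross the interface.
SELECT dept_id, salary
FROM Employee
WHERE dept_id = 42;

-- SELECT * defeats the covering index and transfers columns the application never uses.
SELECT *
FROM Employee
WHERE dept_id = 42;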
7. Retrieve Needed Rows Only
- If the user is only viewing a small subset of a very large result set, it is best to
  - Only transfer that subset
  - Only compute that subset
- When users only want a feel for the data, use TOP or FETCH FIRST to fetch just a few rows (see the sketch below).
- Applications that allow the formulation of ad-hoc queries (an ad-hoc query is one that cannot be determined prior to the moment it is issued) should permit users to cancel them.
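Both forms below express the same idea: TOP is the Transact-SQL spelling, FETCH FIRST is the SQL-standard spelling (supported by DB2 and by recent Oracle and PostgreSQL releases). The Orders table and its columns are placeholders, not part of the original slides.

-- Transact-SQL: give the user a feel for the data with a handful of rows.
SELECT TOP (20) order_id, order_date, total_price
FROM Orders
ORDER BY order_date DESC;

-- SQL standard:
SELECT order_id, order_date, total_price
FROM Orders
ORDER BY order_date DESC
FETCH FIRST 20 ROWS ONLY;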
8. Minimize the Number of Query Compilations
- Prepared execution yields better performance when the query is executed more than once:
  - No compilation
  - No access to the catalog
- Prepared execution plans become obsolete if indexes are added or the size of the relation changes.
Experiment performed on Oracle 8i EE on Windows 2000.
Direct = ODBC direct execution. Prepared = prepare the ODBC command once, then execute it repeatedly (see the sketch below).
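The experiment uses ODBC prepare-and-execute; a related way to see plan reuse from plain Transact-SQL is sp_executesql, which compiles the parameterized statement once and reuses the cached plan across executions with different parameter values. The Employee table and the @d parameter are only illustrative.

-- Parameterized statement: compiled once, plan reused on subsequent executions.
EXEC sp_executesql
     N'SELECT emp_id, emp_name FROM Employee WHERE hire_date > @d',
     N'@d date',
     @d = '2000-01-01';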
9. Tuning the Application Interface
- Avoid user interaction within a transaction
- Minimize the number of roundtrips between the application and the database
- Retrieve needed columns only
- Retrieve needed rows only
- Minimize the number of query compilations
10. Bulk Loading Data
- Purpose: loading large volumes of data into a database
- Tools
  - SQL Server: bcp and the Transact-SQL command BULK INSERT (sketched after this list)
  - Oracle: SQL*Loader
  - DB2: the Load utility
- Tool parameters
  - Bypass the query engine
  - Avoid logging
  - No index update
  - No constraint check
  - Frequency of commits
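A minimal Transact-SQL sketch of the SQL Server route (the file path is a placeholder; TPC-H lineitem data is pipe-delimited). It maps onto several of the tool parameters above: CHECK constraints and triggers are skipped unless explicitly requested, and TABLOCK trades concurrency for load speed.

BULK INSERT lineitem
FROM 'C:\Data\lineitem.tbl'
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR   = '\n',
    TABLOCK                -- table-level lock instead of row locks
    -- add CHECK_CONSTRAINTS / FIRE_TRIGGERS only if those checks are wanted during the load
);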
11. Use Direct Path for Bulk Loading
sqlldr direct=true control=load_lineitem.ctl data=E:\Data\lineitem.tbl

Control file load_lineitem.ctl:

load data
infile 'lineitem.tbl'
into table LINEITEM append
fields terminated by '|'
(
  L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER,
  L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX,
  L_RETURNFLAG, L_LINESTATUS,
  L_SHIPDATE DATE "YYYY-MM-DD",
  L_COMMITDATE DATE "YYYY-MM-DD",
  L_RECEIPTDATE DATE "YYYY-MM-DD",
  L_SHIPINSTRUCT, L_SHIPMODE, L_COMMENT
)
12. Direct Path
- Loading 600,000 records into the lineitem relation from TPC-H.
- Direct-path loading bypasses the query engine and the storage manager. It is orders of magnitude faster than conventional-path loading (with a commit every 100 records) and than inserts (with a commit for each record).
Experiment performed on Oracle 8i EE on Windows
2000.
13. Batch Size
- Bulk load of 600,000 records.
- Throughput increases steadily as the batch size grows to 100,000 records; it remains constant afterwards.
- Trade-off between performance and the amount of data that has to be reloaded in case of a problem (see the BATCHSIZE sketch after the experiment note).
Experiment performed on SQL Server 2000 on
Windows 2000.
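On SQL Server the commit frequency of a bulk load is controlled by the batch size option; a minimal sketch (the path is a placeholder). Each batch commits as a unit, so after a failure only the current batch has to be reloaded, which is the trade-off described above.

BULK INSERT lineitem
FROM 'C:\Data\lineitem.tbl'
WITH (
    FIELDTERMINATOR = '|',
    BATCHSIZE = 100000,   -- commit every 100,000 rows
    TABLOCK
);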
14. Storage Engine Parameters
- Bulk load of 600,000 records.
- As expected:
  - Turning off logging helps.
  - Collecting statistics hurts.
  - Maintaining indexes incrementally hurts a lot.
Experiment performed on IBM DB2 UDB V7.1 on
Windows 2000.
15. Connecting to Multiple Databases
- Shared connection to reduce start-up cost
  - Connection pooling
- Pass-through statements when performance is CPU-bound
  - E.g., dblink in Oracle (see the sketch below)
  - Eliminates query rewriting, translation, and postprocessing to adapt to a specific SQL dialect
- Transfer large blocks of data to minimize roundtrips when performance is network-bound.
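A rough Oracle sketch of the dblink mechanism (the link name, credentials, and TNS alias are placeholders): statements that reference the link are shipped to and evaluated on the remote server, avoiding local rewriting of the query.

-- Create a link to the remote database.
CREATE DATABASE LINK remote_sales
  CONNECT TO app_user IDENTIFIED BY app_password
  USING 'remote_tns_alias';

-- Tables on the remote database are referenced with @link; the work happens remotely.
SELECT COUNT(*) FROM orders@remote_sales;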
16. Summary