Parallel%20Execution%20Plans - PowerPoint PPT Presentation

About This Presentation
Title:

Parallel%20Execution%20Plans

Description:

Allows single query to use multiple processors ... Inserts are grouped into intermediate size batches. Data Population Scripts 2 ... – PowerPoint PPT presentation

Number of Views:123
Avg rating:3.0/5.0
Slides: 48
Provided by: joec9
Category:

less

Transcript and Presenter's Notes

Title: Parallel%20Execution%20Plans


1
Parallel Execution Plans
  • Joe Chang
  • jchang6_at_yahoo.com
  • www.sql-server-performance.com/joe_chang.asp

2
Parallel Execution Plans
  • Allows single query to use multiple processors
  • Query should run faster but may consume more
    resources
  • Example
  • 1 thread 10 sec run time, 10 CPU-sec
  • 2 threads 6 sec run time, 12 CPU-sec

3
Parallel Execution Configuration
  • Cost Threshold For Parallelism
  • Minimum query plan threshold for considering
    queries for parallel execution
  • Default 5 Considering increasing to 20-50 for
    new systems
  • Max Degree of Parallelism
  • Default 0 Can use all available processors
  • SQL Server determines level based on available
    memory and recent CPU usage

4
Parallel Plan Operators
The Distribute Streams operator consumes a single
input stream of records and produces multiple
output streams. The record contents and format
are not changed. Each record from the input
stream appears in one of the output streams. This
operator automatically preserves the relative
order of the input records in the output streams.
Usually, hashing is used to decide to which
output stream a particular input record belongs.
The Repartition Streams operator consumes
multiple streams and produces multiple streams of
records. The record contents and format are not
changed. Each record from an input stream is
placed into one output stream. If this operator
is order-preserving, then all input streams must
be ordered and merged into several ordered output
streams.
The Gather Streams operator consumes several
input streams and produces a single output stream
of records by combining the input streams. The
record contents and format are not changed. If
this operator is order-preserving, then all input
streams must be ordered.
5
Execution Plan Cost Formulas
Index Seek Plan Formula I/O Cost 0.006328500
0.000740741 per additional page (1GB)
0.003203425 0.000740741 per additional page
(gt1GB) CPU Cost 0.000079600 0.000001100 per
additional row
Bookmark Lookup May have changed ? I/O Cost
multiple of 0.006250000 (1GB)
multiple of 0.003124925 (gt1GB) CPU Cost
0.0000011 per row
Table Scan or Index Scan I/O 0.0375785
0.0007407 per page CPU 0.0000785 0.0000011
per row
Table Scan or Index Scan IUD I/O Cost 0.01002
0.01010 (gt100 rows) IUD CPU Cost 0.000001 per
row
6
Cost Interpretation
  • Time in seconds? CPU time?

Too fast for 7200RPM disk random I/Os.
0.0062500sec -gt 160/sec 0.000740741 -gt1350/sec
(8KB) -gt169/sec(64K)-gt 10.8MB/sec
About right for 1997 sequential disk transfer
rate?
S2K BOL Administering SQL Server, Managing
Servers, Setting Configuration Options cost
threshold for parallelism Opt Query cost refers
to the estimated elapsed time, in seconds,
required to execute a query on a specific
hardware configuration.
7
Test Table
CREATE TABLE M3A_20 ( GroupID int NOT NULL, ID
int NOT NULL, ID2 int NOT NULL, ID3 int NOT
NULL, ID4 int NOT NULL, sID smallint NOT
NULL, bID1 bigint NOT NULL, bID2 bigint NOT
NULL, bID3 bigint NOT NULL, rMoney money NOT
NULL, rDate datetime NOT NULL, rReal real NOT
NULL, rDecimal decimal (9,4) NOT
NULL, CONSTRAINT PK_M3A_20 PRIMARY KEY
CLUSTERED ( GroupID, ID ) WITH FILLFACTOR
100 ) GO
8
Data Population Script 1
SET NOCOUNT ON DECLARE _at_BatchTotal int,
_at_BatchSize int, _at_TotalRows int, _at_BatchStart int,
_at_BatchEnd int, _at_BatchRow int, _at_I int,
_at_RowsPerPage bigint , _at_Card int , _at_DistinctValues
int SELECT _at_BatchStart1, _at_BatchEnd1000,
_at_BatchTotal1000, _at_BatchSize100000,
_at_RowsPerPage100, _at_Card100000 SELECT
_at_TotalRows_at_BatchTotal_at_BatchSize SELECT
_at_I(_at_BatchStart-1)_at_BatchSize1,
_at_DistinctValues_at_TotalRows/_at_Card WHILE
_at_BatchStart lt _at_BatchEnd BEGIN BEGIN
TRANSACTION SELECT _at_BatchRow
_at_BatchStart_at_BatchSize WHILE _at_I lt _at_BatchRow
BEGIN INSERT M3A_20 (GroupID, ID, ID2, ID3,
ID4, sID, bID1, bID2, bID3, rMoney, rDate, rReal,
rDecimal) VALUES ( 1, _at_I, _at_TotalRows-_at_I1,
(_at_I-1)/_at_Card1, (_at_TotalRows-_at_I)_at_Card1,
_at_I32768, _at_I, (_at_I-1)_at_Card1,
1(_at_I-1)_at_RowsPerPage/_at_TotalRows((_at_I-1)_at_RowsPerP
age)_at_TotalRows, 10000rand(),
DATEADD(hour,_at_I3000000,'1900-01-01'),
10000rand(), 10000rand() ) IF _at__at_ERROR gt 0
BEGIN GOTO B END SET _at_I _at_I1 END COMMIT
TRANSACTION CHECKPOINT PRINT CONVERT(varchar,GETDA
TE(),121) ', row ' CONVERT(varchar,_at_BatchRow)
SET _at_BatchStart _at_BatchStart1 END B IF
_at__at_TRANCOUNT gt 0 COMMIT TRANSACTION PRINT '01
Complete ' CONVERT(varchar,GETDATE(),121) ',
row ' CONVERT(varchar,_at_BatchRow) ',
Trancount ' CONVERT(varchar(10),_at__at_TRANCOUNT)
9
Data Population Script 1 Notes
  • Double While Loop
  • Each Insert/Update/Delete statement is an
    implicit transaction
  • Gets separate transaction log entry
  • Explicit transaction generates a single
    transaction log write (max 64KB per IO)
  • Single TRAN for entire loop requires excessively
    large log file
  • Inserts are grouped into intermediate size batches

10
Data Population Scripts 2
Primary table populated using single row inserts
in a WHILE loop, Additional tables populated with
INSERT / SELECT statement Single row inserts
20-30K rows/sec INSERT / SELECT statement 100K
rows/sec
DECLARE _at_L int SELECT _at_L 1 WHILE _at_L lt 3 BEGIN
INSERT M3A_11 (GroupID,ID,ID2,ID3,ID4,sID,bID1,bID
2,bID3,rMoney,rDate,rReal, rDecimal) SELECT TOP
500000 GroupID, ID, 1500001-ID, ID3, ID4, sID,
bID1, bID2, bID3, rMoney, rDate, rReal, rDecimal
FROM M3A_20 WHERE GroupID 1 AND ID BETWEEN
(_at_L-1)5000001 AND _at_L500000 SELECT _at_L _at_L 1
CHECKPOINT PRINT '11 Step '
CONVERT(varchar,_at_L) ', ' CONVERT(varchar,GETDA
TE(),121) END UPDATE STATISTICS M3A_01
(PK_M3A_01) WITH FULLSCAN CREATE STATISTICS ST_01
ON M3A_01 (ID) WITH FULLSCAN, NORECOMPUTE
11
Index Seek Plans
Many rows returned, Non-parallel plan
Cost 9.34
Parallel Execution disabled
Cost 9.82
Cost 4.94
Parallel Plan
12
Index Seek Details
Non-parallel plan
Parallel plan
13
Index Seek Non-parallel
Cost assigned to SELECT
Index Seek, 1M rows in 11,115 pages (81
bytes/row, 90 Fill) I/O cost is 8.2365 CPU Cost
is 1.1000785 Cost sub-tree Cost is correct, I/O
CPU is ½ of correct value
14
Index Seek Parallel Plan
No cost assigned to SELECT
Index Seek I/O and CPU cost ½ of non-parallel
plan
15
Index Seek with Aggregate
1
2
3
4
16
Index Seek Aggregate Parallel Plan Details
2
4
1
3
17
Table Scan
Cost 9.01
Cost 8.26
18
Table Scan Details
Non-parallel plan
Parallel plan
I/O cost same CPU cost ½ of non parallel plan
19
Table Scan Details
Non-parallel plan
Parallel plan
No cost
No cost on Select
I/O cost same CPU cost ½ of non parallel plan
20
Parallel Plan Cost Formulas Patterns
  • CPU costs are ½ of non-parallel plan
  • Index Seek I/O cost are also ½
  • Scan I/O cost is same as non-parallel plan
  • Parallel plan costs are based on 2 processors
  • Actual number of processors determined at runtime
  • Overhead operations
  • Distribute, Repartition Gather Streams

21
Hash Join
200,000 rows 15 byte OS row size
Cost 6.50
Cost 4.79
22
Hash Join Details
Non-parallel plan
Parallel plan
23
Hash Join Details
Non-parallel plan
Parallel plan
24
Hash Join Non-parallel plan
25
Hash Join Parallel Plan
2
3
4
1
2
4
1
3
26
Hash Join with I/O Cost
900,000 rows MAXDOP 1
Cost 85.1
Cost 74.1
27
Hash Join Join I/O Cost
730,000 rows
740,000 rows
28
Hash Join - Bitmap
29
Hash Join Cost Formula
Index Seek Plan Formula I/O Cost 0.006328500
0.000740741 per additional page (1GB)
0.003203425 0.000740741 per additional page
(gt1GB) CPU Cost 0.000079600 0.000001100 per
additional row
Hash Join CPU Cost 0.017750000 base
0.0000001749 (2-30 rows) 0.0000000720 (100
rows) 0.000015091 per row 0.000015857
(parallel) 0.000001880 per row per 4 bytes
in OS 0.000005320 per additional row in
IS I/O Cost 0.000042100 per row over 64MB (Row
Size8) 0.0000036609 per 4 byte over 15B
30
Parallel Cost Formula
Base Cost 0.028500 Repartition Stream Cost per
row 0.0000024705 Base (15 Bytes)
0.000000759 per additional 4 Bytes Gather
Stream Cost per row 0.0000018735 Base(15)
0.000000759 per additional 4 Bytes Dispatch
31
Loop Join
32
Loop Join Details
Non-parallel plan Outer Source
Parallel plan Outer Source
33
Loop Join Details
Inner Source cost identical for both non-parallel
and parallel plans
34
Loop Join Details
Non-parallel plan
Parallel plan
35
Merge Join
36
Merge Join Details
Non-parallel plan
Parallel plan
37
Merge Join Details
Non-parallel plan
Parallel plan
38
Merge Join Details
Non-parallel plan
Parallel plan
39
Index Seek Aggregate Test
Xeon 2.4GHz/512K
Opteron2.2GHz 1M
40
Index Seek Aggregate Test, Itanium 2
Itanium 2 1.5GHz/6M
41
Index Seek Aggregate Test, SUM(INT)
Itanium 2 1.5GHz/6M
42
Index Seek Aggregate Test, NULL
Itanium 2 1.5GHz/6M
43
Loop Join, COUNT()
Itanium 2 1.5GHz/6M
44
Hash Join, COUNT()
Itanium 2 1.5GHz/6M
45
Merge Join, COUNT()
Itanium 2 1.5GHz/6M
46
General Recommendations
  • Useful in DW, ETL, and maintenance activities
  • Use judgment on transactions processing
  • Is throughput more important
  • Or faster expensive queries
  • Increase Cost Threshold from 5 to 20-50
  • Limit MAXDOP to 4
  • Verify or limit parallelism on Xeon systems with
    Hyper-Threading enabled

47
Additional Information
www.sql-server-performance.com/joe_chang.asp SQL
Server Quantitative Performance Analysis Server
System Architecture Processor Performance Direct
Connect Gigabit Networking Parallel Execution
Plans Large Data Operations Transferring
Statistics SQL Server Backup Performance with
Imceda LiteSpeed jchang6_at_yahoo.com
Write a Comment
User Comments (0)
About PowerShow.com