www'SageLogix'Com - PowerPoint PPT Presentation

1 / 42
About This Presentation
Title:

www'SageLogix'Com

Description:

Pseudo-randomizing such data with REVERSE key indexes relieves this performance problem ... All buckets assumed to have the same number of rows ... – PowerPoint PPT presentation

Number of Views:58
Avg rating:3.0/5.0
Slides: 43
Provided by: TimGo6
Category:
Tags: sagelogix | com | www

less

Transcript and Presenter's Notes

Title: www'SageLogix'Com


1
International Oracle Users Group Live 2004
Technical Session 549Understanding
IndexesTim GormanPrincipal - SageLogix,
Inc.Email tim_at_sagelogix.com
  • www.SageLogix.Com

2
Agenda
  • BTree index architecture
  • Issues with BTree indexes in Oracle
  • Real issues
  • Sparsely-populated indexes
  • Contention on INSERT
  • Uneven data distribution
  • Low data cardinality (a.k.a. selectivity)
  • Imagined or mythical issues
  • Indexes needing rebalancing
  • ???
  • Some interesting options for indexing
  • Function-based, descending, compression

3
BTree index architecture
  • Oracle implements balanced BTree indexes as its
    primary indexing method
  • Tree structure consisting of root, branch, and
    leaf nodes
  • each node is one database block in Oracle
  • root node is the top-level branch node
  • starting point for searches into the index
  • branch nodes are connectors
  • point to other branch nodes or to leaf nodes
  • leaf nodes
  • contain data values and pointers to rows

4
BTree index architecture
  • Branch entries contain
  • Max data value piece
  • DBA of next level branch or leaf
  • Leaf entries contain
  • Data value
  • ROWID of table row

Index PK_X
root

branch
branch
branch
branch

leaf
leaf
leaf
leaf
leaf
leaf
leaf


Table X
5
BTree index architecture
  • Index read I/O is sequential in nature
  • Oracle wait-event for single-block I/O requests
    used during indexed access
  • db file sequential read
  • Used with RANGE, UNIQUE, and FULL scans
  • FAST FULL index-scans are an exception
  • Uses sequential multiblock I/O similar to FULL
    table-scans
  • Conventional wisdom that tables and indexes must
    be separated to different tablespaces to enable
    some form of parallel I/O is a myth
  • Think of how a treasure hunt is performed.

6
BTree index architecture
  • When indexes are built on populated tables using
    CREATE INDEX or ALTER INDEX xxx REBUILD
  • Table is scanned and column values are sorted
  • Leaf blocks of index are populated first
  • Then supporting branch levels are built to
    support those leaves
  • All blocks are filled up to PCTFREE threshold

7
BTree index architecture
  • When indexes are built transactionally using
    conventional-path INSERT or UPDATE statements
  • How the indexes grow is dependent upon two
    possible design decisions
  • Optimize either for
  • random data values
  • Or
  • sequentially ascending data values

8
BTree index architecture
  • If optimizing for random data values
  • As blocks fill, split them in half, move upper
    half to new block, leave lower half behind
  • Leaves behind half-full blocks
  • Allows back-filling as lower data values occur
  • If optimizing for sequential data values
  • As blocks fill, simply overflow into a new block
  • Leaves behind full blocks
  • No need to worry about back-filling as lower data
    values could never occur

9
BTree index architecture
Grows like this
Overflow
  • Overflowing uses space more efficiently in a
    heap (non-sorted) data structure
  • Also uses space more efficiently when optimizing
    sequential data values

Split
Grows like this
  • Splitting anticipates back-filling in a sorted
    data structure
  • Therefore uses space more efficiently when
    optimizing random data values

10
BTree index architecture
  • If the last index entry inserted is the highest
    data value in the block
  • Then data will overflow into the next block on
    the same level
  • If the last index entry inserted is not the
    highest data value in the block
  • Then the current block will split, with the lower
    set of values remaining in place and the higher
    set of values moving to the next block on the
    same level

11
BTree index architecture
  • If the branch block above cannot accommodate
    another child
  • Then this algorithm is applied recursively first
  • If this algorithm iterates recursively all the
    way back to the root block, and the root becomes
    full
  • Then the root splits/overflows to generate a new
    branch level (BLEVEL)
  • Root block stays in place, generates two new
    blocks at a new 1st BLEVEL below

12
BTree index architecture
  • Balanced BTree indexes grow balanced by design,
    automatically
  • New levels are added to the tree structure above,
    dynamically, as needed
  • Indexes do not shrink automatically when
    deletions occur
  • Instead, empty index entries are left in place
  • For possible reuse by newly inserted data
  • ROWID (pointer to data) portion is set to NULL
  • Deletions may cause an index to become sparse and
    thus less efficient over time

13
Issues with BTree indexes
  • BTree indexes optimize
  • Both the random and sequential data values!
  • Even distribution of distinct data values
  • High data cardinality or selectivity of values
  • As a result, issues with BTree indexes are
  • Sparseness
  • Resulting from data deletions
  • Block contention when inserting sequential data
  • Uneven distribution of data values
  • Popular and unpopular data values
  • Low data cardinality

14
Sparsely-populated indexes
  • Indexes never become unbalanced over time
  • If so, then why does performance sometimes
    deteriorate over time?
  • Instead, indexes can become sparsely populated
    due to
  • Deletions of row data
  • Unfortunate patterns of inserted random data
  • Sparseness simply means that more I/O is required
    to perform the same work

15
Sparsely-populated indexes
  • How is this condition detected?
  • ANALYZE INDEX VALIDATE STRUCTURE command
  • Populate session-private view named INDEX_STATS
    with only one row
  • Column PCT_USED is the average percentage of
    space utilized in the blocks belonging to the
    index
  • Derived from the ratio of the value of the
    columns USED_SPACE and BTREE_SPACE
  • Expect PCT_USED to be 90 by default
  • Lesser values may indicate sparseness developing
  • Best to watch values over time

16
Sparsely-populated indexes
  • How is this condition detected?
  • If PCT_USED is much less than 90
  • Non-zero values in the column DEL_LF_ROWS
  • Cause of sparseness is probably row deletions, if
    the value in the column DEL_LF_ROWS is a large
    percentage of value in column LF_ROWS

17
Sparsely-populated indexes
  • Rebuild or coalesce are two possible solutions
    for index sparseness
  • Dont guess!!!
  • Make certain using ANALYZE INDEX VALIDATE
    STRUCTURE
  • Be aware that the ANALYZE command locks the
    index, however
  • To rebuild an index
  • Use ALTER INDEX REBUILD
  • To coalesce space within an index
  • Use ALTER INDEX COALESCE

18
Sparsely-populated indexes
  • ALTER INDEX REBUILD is the most commonly-used
    solution for sparsely-populated indexes
  • Unlike CREATE INDEX, uses the existing index as
    the source, which is faster because
  • Index is usually smaller than the table (less
    I/O)
  • Index is already sorted (some sorting still
    needed)
  • Other features of ALTER INDEX REBUILD include
  • Parallel execution (throw more resources at task)
  • Direct-path operations (no rollback/undo
    generated)
  • NOLOGGING (no redo generated)
  • COMPUTE STATISTICS (better CBO stats cheaply)
  • ONLINE (allows full usage of index during task)

19
Sparsely-populated indexes
  • ALTER INDEX COALESCE is another alternative to
    rebuilding
  • Merges unused space within indexes to free blocks
    for reuse
  • Slower than ALTER INDEX REBUILD because it is a
    transactional operation, not a bulk operation
  • No direct-path, parallel, nologging, etc
  • But COALESCE is implicitly an ONLINE operation

20
Contention on INSERT
  • Multiple concurrently-executing INSERTs into an
    index on column(s) with sequential data values
    can also bottle-neck on buffer busy waits
    (a.k.a. block-level contention)
  • Pseudo-randomizing such data with REVERSE key
    indexes relieves this performance problem
  • Instead of many processes attempting to insert
    index entries into the right-most leaf block
  • Insertions are evenly scattered over all of the
    available leaf blocks

21
Contention on INSERT
  • CREATE ALTER INDEX REVERSE
  • Pseudo-randomizes non-random data
  • By flipping or reversing the physical order
    of the data value during storage
  • Causes sequential data values to become more
    random by simply reversing data
  • 123456 becomes 654321
  • 123457 becomes 754321, etc...
  • Index data values are transparently converted
    (flipped) and unconverted (unflipped) upon insert
    and retrieval, respectively

22
Contention on INSERT
  • Adverse impacts of using REVERSE indexes
  • only equivalence operations will use the index
  • , !, ltgt, IN, and NOT IN
  • range-scans will not use the index
  • gt, gt, lt, lt, LIKE, BETWEEN
  • But, is this really a problem?
  • Depends on how the sequential data values are
    used
  • Sequence numbers or surrogate system-generated
    keys are usually sought using equivalence
    operations
  • No problem here
  • But timestamps are often range-scanned
  • There is likely a problem with these situations

23
Uneven distribution of data values
  • Basic rule of thumb for using indexes
  • Use indexes when selecting unpopular data
  • Otherwise, use FULL table-scan
  • Or a different index? Or partitioning? Or
  • By default, the Oracle cost-based optimizer (CBO)
    assumes even data distribution
  • Only LOVAL, HIVAL, and DISTINCT_KEYS gathered
    during ANALYZE or DBMS_STATS
  • Real-life often invalidates this assumption
  • Selectively gathering column-level statistics
    provides the CBO with information on data values
    which are popular and which are unpopular

24
Uneven distribution of data values
  • There are two commands for gathering CBO
    statistics
  • ANALYZE TABLE FOR ALL INDEXED COLUMNS
  • DBMS_STATS.GATHER_COLUMN_STATS
  • Populates data dictionary views
    DBA_TAB_HISTOGRAMS and DBA_PART_HISTOGRAMS with
    rows representing buckets of data values
  • All buckets assumed to have the same number of
    rows
  • Each bucket is defined by its highest value
  • By default, only one row populated

25
Uneven distribution of data values
  • Please dont gather column-level statistics
    unless the problem is proven
  • Gathering column-level statistics is not a good
    choice as a default operation
  • Perform SQL tuning with SQL Trace/TKPROF or
    STATSPACK to provide proof of a problem
  • If the CBO fails to choose an index
  • it could be because it has detected low
    cardinality on average
  • If the CBO chooses an index
  • it could perform poorly if the data value is
    overly popular

26
Illustrating the importance of data
  • SQLgt create table t1
  • 2 ( c1 varchar2(30),
  • 3 c2 number,
  • 4 c3 number
  • 5 ) tablespace tools
  • Table created.
  • SQLgt begin
  • 2 for i in 1..100000 loop
  • 3 insert into t1
  • 4 values(to_char(mod(i,187)), i,
    mod(i,187))
  • 5 end loop
  • 6 end
  • 7 /
  • PL/SQL procedure successfully completed.

27
Illustrating the importance of data
  • SQLgt create index i1 on t1(c1) tablespace tools
  • Index created.
  • SQLgt analyze table t1 compute statistics
  • Table analyzed.
  • SQLgt set autotrace on
  • SQLgt select c1 from t1 where c1 '10000'
  • no rows selected
  • Execution Plan
  • --------------------------------------------------
    ---
  • 0 SELECT STATEMENT OptimizerCHOOSE
    (Cost37 Card535 Bytes1605)
  • 1 0 TABLE ACCESS (FULL) OF 'T1' (Cost37
    Card535 Bytes1605)

28
Illustrating the importance of data
  • SQLgt select num_rows, blocks
  • 2 from user_tables
  • 3 where table_name 'T1'
  • NUM_ROWS BLOCKS
  • ------------- ---------
  • 100,000 232
  • SQLgt select num_rows, distinct_keys,
  • 2 avg_leaf_blocks_per_key, avg_data_blocks_per_
    key
  • 3 from user_indexes where index_name 'I1'
  • Avg Leaf Avg Data
  • Distinct Blocks Blocks
  • Nbr Rows Key Per Key Per Key
  • -------- -------- ---------- ---------
  • 100,000 187 1 231

29
Illustrating the importance of data
  • SQLgt truncate table t1
  • Table truncated.
  • SQLgt begin
  • 2 for i in 1..100000 loop
  • 3 insert into t1
  • 4 values(to_char(round(i/187,0)), i,
  • 5 round(i/187,0))
  • 6 end loop
  • 7 end
  • 8 /
  • PL/SQL procedure successfully completed.

30
Illustrating the importance of data
  • SQLgt create index i1 on t1(c1) tablespace tools
  • Index created.
  • SQLgt analyze table t1 compute statistics
  • Table analyzed.
  • SQLgt set autotrace on
  • SQLgt select c1 from t1 where c1 '10000'
  • no rows selected
  • Execution Plan
  • --------------------------------------------------
    ---
  • 0 SELECT STATEMENT OptimizerCHOOSE (Cost2
    Card187 Bytes561)
  • 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'
    (Cost2 Card187 Bytes561)
  • 2 1 INDEX (RANGE SCAN) OF 'I1' (NON-UNIQUE)
    (Cost1 Card187)

31
Illustrating the importance of data
  • SQLgt select num_rows, blocks
  • 2 from user_tables
  • 3 where table_name 'T1'
  • NUM_ROWS BLOCKS
  • ------------- ---------
  • 100000 242
  • SQLgt select num_rows, distinct_keys,
  • 2 avg_leaf_blocks_per_key, avg_data_blocks_per_
    key
  • 3 from user_indexes where index_name 'I1'
  • Avg Leaf Avg Data
  • Distinct Blocks Blocks
  • Nbr Rows Key Per Key Per Key
  • -------- -------- ---------- ---------
  • 100000 536 1 1

32
Low data cardinality
  • BTree indexes work best with high cardinality
  • Think of the example of a telephone book
  • Would you use the standard indexing method to
    find all occurances of a name comprising 25 of
    the book?
  • Or, would it be better to just scan through the
    book?

33
Low data cardinality
  • Bitmap indexes are extremely compact
  • That is their primary advantage
  • No real magic involved
  • Less I/O is faster, plain and simple
  • Still, no matter how compact they are
  • a FULL table scan is still likely faster than a
    single bitmap index on low cardinality data
  • However, merging two or more bitmap indexes
    brings out the real power of the mechanism
  • Scanning of merged bitmaps using fast low-level
    bitmasking operations
  • Sifts through huge numbers of rows quickly

34
Low data cardinality
  • Bitmap indexes are designed for quickly scanning
    low cardinality data
  • each database block is comprised of two bitmap
    segments
  • each bitmap segment is comprised of a bitmap, a
    ROWID list, and a list of distinct data values
  • ROWID list is a forward-compressed list of
    ROWIDs, positioned according to bits in bitmap

ROWID list
Distinct values list
Bitmap
  • Bitmap is organized into columns for distinct
    data values and rows of bits
  • representing rows

ROWID list
Distinct values list
Bitmap
  • More distinct data values means less room for
    rows within each data value

35
Low data cardinality
  • Bitmap indexes extremely slow during DML
  • INSERT, UPDATE, and DELETE
  • For each change, entries within the bitmap
    segments must be
  • Expanded/decoded
  • Manipulated
  • Re-encoded
  • For multiple concurrently-executing DML changes
  • Slower individual changes cause queueing
  • Only two bitmap segments per block increases
    contention
  • Because of this, bitmap indexes are most feasible
    on partitioned tables in non-transactional
    applications
  • Data changes should be performed in bulk using
    some variation on the technique of EXCHANGE
    PARTITION

36
Function-based indexes
  • Indexes can be based on functions and expressions
  • create index xxx_ix1 on xxx(upper(c1))
  • create index xxx_ix2 on xxx(upper(c1)yadda_yadda
    (c2))

37
Function-based indexes
  • User-defined functions usable along with standard
    built-in functions
  • Restrictions for using user-defined functions
  • But functions and expressions must be
    deterministic
  • repeatable given the same inputs, the function
    or expression must always provide the same
    results
  • user-defined functions cannot access tables or
    package variables
  • PRAGMA RESTRICT_REFERENCES(, RNDS, WNDS, RNPS,
    WNPS)
  • CREATE FUNCTION DETERMINISTIC

38
Function-based indexes
  • Function-based indexes must have statistics
    gathered before they can be used
  • System permissions
  • GLOBAL QUERY REWRITE needed to CREATE or
    ALTER REBUILD
  • Parameter QUERY_REWRITE_ENABLED TRUE to use

39
Descending indexes
  • Implemented as a form of function-based indexes
  • CREATE INDEX DESC
  • data values in index leaf blocks are sorted in
    descending order
  • must have statistics before it will be utilized
  • not usable with rule-based optimizer
  • system permission QUERY REWRITE or GLOBAL QUERY
    REWRITE needed to CREATE or ALTER REBUILD
  • parameter QUERY_REWRITE_ENABLED not necessary to
    use

40
Index compression
  • Since Oracle8i, compression has been available on
    BTree indexes
  • Reduces repeated storage of prefix column data
    values
  • Index compression
  • COMPRESS -prefix-cols NOCOMPRESS
  • UNIQUE default -prefix-cols is (-cols)-1
  • NONUNIQUE default -prefix-cols is -cols
  • Same syntax for IOT in USING clause as for index
    compression
  • DML is supported on compressed indexes, but it
    becomes much slower
  • Index compression is best used for read-mostly or
    read-only situations
  • Mixing of compressed and uncompressed index
    partitions is possible

41
Solutions for BTree indexes
  • Dont assume that there is anything wrong with a
    BTree index
  • Prove it first!!!
  • To address sparse index structures
  • ALTER INDEX REBUILD COALESCE
  • To address block contention on INSERTs
  • ALTER INDEX REVERSE
  • To address uneven data distribution
  • Gather column-level statistics or histograms
  • To address low cardinality
  • Use bitmap indexes or no indexes at all
  • Additional groovy stuff
  • Function-based, descending, and compressed indexes

42
Technical Session 549 Slides, paper, and
scripts downloadable from http//www.SageLogix.com
and http//www.EvDBT.com/papers.htm Email
tim_at_sagelogix.com
Write a Comment
User Comments (0)
About PowerShow.com