Title: www'SageLogix'Com
1International Oracle Users Group Live 2004
Technical Session 549Understanding
IndexesTim GormanPrincipal - SageLogix,
Inc.Email tim_at_sagelogix.com
2Agenda
- BTree index architecture
- Issues with BTree indexes in Oracle
- Real issues
- Sparsely-populated indexes
- Contention on INSERT
- Uneven data distribution
- Low data cardinality (a.k.a. selectivity)
- Imagined or mythical issues
- Indexes needing rebalancing
- ???
- Some interesting options for indexing
- Function-based, descending, compression
3BTree index architecture
- Oracle implements balanced BTree indexes as its
primary indexing method - Tree structure consisting of root, branch, and
leaf nodes - each node is one database block in Oracle
- root node is the top-level branch node
- starting point for searches into the index
- branch nodes are connectors
- point to other branch nodes or to leaf nodes
- leaf nodes
- contain data values and pointers to rows
4BTree index architecture
- Branch entries contain
- Max data value piece
- DBA of next level branch or leaf
- Leaf entries contain
- Data value
- ROWID of table row
Index PK_X
root
branch
branch
branch
branch
leaf
leaf
leaf
leaf
leaf
leaf
leaf
Table X
5BTree index architecture
- Index read I/O is sequential in nature
- Oracle wait-event for single-block I/O requests
used during indexed access - db file sequential read
- Used with RANGE, UNIQUE, and FULL scans
- FAST FULL index-scans are an exception
- Uses sequential multiblock I/O similar to FULL
table-scans - Conventional wisdom that tables and indexes must
be separated to different tablespaces to enable
some form of parallel I/O is a myth - Think of how a treasure hunt is performed.
6BTree index architecture
- When indexes are built on populated tables using
CREATE INDEX or ALTER INDEX xxx REBUILD - Table is scanned and column values are sorted
- Leaf blocks of index are populated first
- Then supporting branch levels are built to
support those leaves - All blocks are filled up to PCTFREE threshold
7BTree index architecture
- When indexes are built transactionally using
conventional-path INSERT or UPDATE statements - How the indexes grow is dependent upon two
possible design decisions - Optimize either for
- random data values
- Or
- sequentially ascending data values
8BTree index architecture
- If optimizing for random data values
- As blocks fill, split them in half, move upper
half to new block, leave lower half behind - Leaves behind half-full blocks
- Allows back-filling as lower data values occur
- If optimizing for sequential data values
- As blocks fill, simply overflow into a new block
- Leaves behind full blocks
- No need to worry about back-filling as lower data
values could never occur
9BTree index architecture
Grows like this
Overflow
- Overflowing uses space more efficiently in a
heap (non-sorted) data structure - Also uses space more efficiently when optimizing
sequential data values
Split
Grows like this
- Splitting anticipates back-filling in a sorted
data structure - Therefore uses space more efficiently when
optimizing random data values
10BTree index architecture
- If the last index entry inserted is the highest
data value in the block - Then data will overflow into the next block on
the same level - If the last index entry inserted is not the
highest data value in the block - Then the current block will split, with the lower
set of values remaining in place and the higher
set of values moving to the next block on the
same level
11BTree index architecture
- If the branch block above cannot accommodate
another child - Then this algorithm is applied recursively first
- If this algorithm iterates recursively all the
way back to the root block, and the root becomes
full - Then the root splits/overflows to generate a new
branch level (BLEVEL) - Root block stays in place, generates two new
blocks at a new 1st BLEVEL below
12BTree index architecture
- Balanced BTree indexes grow balanced by design,
automatically - New levels are added to the tree structure above,
dynamically, as needed - Indexes do not shrink automatically when
deletions occur - Instead, empty index entries are left in place
- For possible reuse by newly inserted data
- ROWID (pointer to data) portion is set to NULL
- Deletions may cause an index to become sparse and
thus less efficient over time
13Issues with BTree indexes
- BTree indexes optimize
- Both the random and sequential data values!
- Even distribution of distinct data values
- High data cardinality or selectivity of values
- As a result, issues with BTree indexes are
- Sparseness
- Resulting from data deletions
- Block contention when inserting sequential data
- Uneven distribution of data values
- Popular and unpopular data values
- Low data cardinality
14Sparsely-populated indexes
- Indexes never become unbalanced over time
- If so, then why does performance sometimes
deteriorate over time? - Instead, indexes can become sparsely populated
due to - Deletions of row data
- Unfortunate patterns of inserted random data
- Sparseness simply means that more I/O is required
to perform the same work
15Sparsely-populated indexes
- How is this condition detected?
- ANALYZE INDEX VALIDATE STRUCTURE command
- Populate session-private view named INDEX_STATS
with only one row - Column PCT_USED is the average percentage of
space utilized in the blocks belonging to the
index - Derived from the ratio of the value of the
columns USED_SPACE and BTREE_SPACE - Expect PCT_USED to be 90 by default
- Lesser values may indicate sparseness developing
- Best to watch values over time
16Sparsely-populated indexes
- How is this condition detected?
- If PCT_USED is much less than 90
- Non-zero values in the column DEL_LF_ROWS
- Cause of sparseness is probably row deletions, if
the value in the column DEL_LF_ROWS is a large
percentage of value in column LF_ROWS
17Sparsely-populated indexes
- Rebuild or coalesce are two possible solutions
for index sparseness - Dont guess!!!
- Make certain using ANALYZE INDEX VALIDATE
STRUCTURE - Be aware that the ANALYZE command locks the
index, however - To rebuild an index
- Use ALTER INDEX REBUILD
- To coalesce space within an index
- Use ALTER INDEX COALESCE
18Sparsely-populated indexes
- ALTER INDEX REBUILD is the most commonly-used
solution for sparsely-populated indexes - Unlike CREATE INDEX, uses the existing index as
the source, which is faster because - Index is usually smaller than the table (less
I/O) - Index is already sorted (some sorting still
needed) - Other features of ALTER INDEX REBUILD include
- Parallel execution (throw more resources at task)
- Direct-path operations (no rollback/undo
generated) - NOLOGGING (no redo generated)
- COMPUTE STATISTICS (better CBO stats cheaply)
- ONLINE (allows full usage of index during task)
19Sparsely-populated indexes
- ALTER INDEX COALESCE is another alternative to
rebuilding - Merges unused space within indexes to free blocks
for reuse - Slower than ALTER INDEX REBUILD because it is a
transactional operation, not a bulk operation - No direct-path, parallel, nologging, etc
- But COALESCE is implicitly an ONLINE operation
20Contention on INSERT
- Multiple concurrently-executing INSERTs into an
index on column(s) with sequential data values
can also bottle-neck on buffer busy waits
(a.k.a. block-level contention) - Pseudo-randomizing such data with REVERSE key
indexes relieves this performance problem - Instead of many processes attempting to insert
index entries into the right-most leaf block - Insertions are evenly scattered over all of the
available leaf blocks
21Contention on INSERT
- CREATE ALTER INDEX REVERSE
- Pseudo-randomizes non-random data
- By flipping or reversing the physical order
of the data value during storage - Causes sequential data values to become more
random by simply reversing data - 123456 becomes 654321
- 123457 becomes 754321, etc...
- Index data values are transparently converted
(flipped) and unconverted (unflipped) upon insert
and retrieval, respectively
22Contention on INSERT
- Adverse impacts of using REVERSE indexes
- only equivalence operations will use the index
- , !, ltgt, IN, and NOT IN
- range-scans will not use the index
- gt, gt, lt, lt, LIKE, BETWEEN
- But, is this really a problem?
- Depends on how the sequential data values are
used - Sequence numbers or surrogate system-generated
keys are usually sought using equivalence
operations - No problem here
- But timestamps are often range-scanned
- There is likely a problem with these situations
23Uneven distribution of data values
- Basic rule of thumb for using indexes
- Use indexes when selecting unpopular data
- Otherwise, use FULL table-scan
- Or a different index? Or partitioning? Or
- By default, the Oracle cost-based optimizer (CBO)
assumes even data distribution - Only LOVAL, HIVAL, and DISTINCT_KEYS gathered
during ANALYZE or DBMS_STATS - Real-life often invalidates this assumption
- Selectively gathering column-level statistics
provides the CBO with information on data values
which are popular and which are unpopular
24Uneven distribution of data values
- There are two commands for gathering CBO
statistics - ANALYZE TABLE FOR ALL INDEXED COLUMNS
- DBMS_STATS.GATHER_COLUMN_STATS
- Populates data dictionary views
DBA_TAB_HISTOGRAMS and DBA_PART_HISTOGRAMS with
rows representing buckets of data values - All buckets assumed to have the same number of
rows - Each bucket is defined by its highest value
- By default, only one row populated
25Uneven distribution of data values
- Please dont gather column-level statistics
unless the problem is proven - Gathering column-level statistics is not a good
choice as a default operation - Perform SQL tuning with SQL Trace/TKPROF or
STATSPACK to provide proof of a problem - If the CBO fails to choose an index
- it could be because it has detected low
cardinality on average - If the CBO chooses an index
- it could perform poorly if the data value is
overly popular
26Illustrating the importance of data
- SQLgt create table t1
- 2 ( c1 varchar2(30),
- 3 c2 number,
- 4 c3 number
- 5 ) tablespace tools
- Table created.
- SQLgt begin
- 2 for i in 1..100000 loop
- 3 insert into t1
- 4 values(to_char(mod(i,187)), i,
mod(i,187)) - 5 end loop
- 6 end
- 7 /
- PL/SQL procedure successfully completed.
27Illustrating the importance of data
- SQLgt create index i1 on t1(c1) tablespace tools
- Index created.
- SQLgt analyze table t1 compute statistics
- Table analyzed.
- SQLgt set autotrace on
- SQLgt select c1 from t1 where c1 '10000'
- no rows selected
- Execution Plan
- --------------------------------------------------
--- - 0 SELECT STATEMENT OptimizerCHOOSE
(Cost37 Card535 Bytes1605) - 1 0 TABLE ACCESS (FULL) OF 'T1' (Cost37
Card535 Bytes1605)
28Illustrating the importance of data
- SQLgt select num_rows, blocks
- 2 from user_tables
- 3 where table_name 'T1'
- NUM_ROWS BLOCKS
- ------------- ---------
- 100,000 232
- SQLgt select num_rows, distinct_keys,
- 2 avg_leaf_blocks_per_key, avg_data_blocks_per_
key - 3 from user_indexes where index_name 'I1'
- Avg Leaf Avg Data
- Distinct Blocks Blocks
- Nbr Rows Key Per Key Per Key
- -------- -------- ---------- ---------
- 100,000 187 1 231
29Illustrating the importance of data
- SQLgt truncate table t1
- Table truncated.
- SQLgt begin
- 2 for i in 1..100000 loop
- 3 insert into t1
- 4 values(to_char(round(i/187,0)), i,
- 5 round(i/187,0))
- 6 end loop
- 7 end
- 8 /
- PL/SQL procedure successfully completed.
30Illustrating the importance of data
- SQLgt create index i1 on t1(c1) tablespace tools
- Index created.
- SQLgt analyze table t1 compute statistics
- Table analyzed.
- SQLgt set autotrace on
- SQLgt select c1 from t1 where c1 '10000'
- no rows selected
- Execution Plan
- --------------------------------------------------
--- - 0 SELECT STATEMENT OptimizerCHOOSE (Cost2
Card187 Bytes561) - 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'
(Cost2 Card187 Bytes561) - 2 1 INDEX (RANGE SCAN) OF 'I1' (NON-UNIQUE)
(Cost1 Card187)
31Illustrating the importance of data
- SQLgt select num_rows, blocks
- 2 from user_tables
- 3 where table_name 'T1'
- NUM_ROWS BLOCKS
- ------------- ---------
- 100000 242
- SQLgt select num_rows, distinct_keys,
- 2 avg_leaf_blocks_per_key, avg_data_blocks_per_
key - 3 from user_indexes where index_name 'I1'
- Avg Leaf Avg Data
- Distinct Blocks Blocks
- Nbr Rows Key Per Key Per Key
- -------- -------- ---------- ---------
- 100000 536 1 1
32Low data cardinality
- BTree indexes work best with high cardinality
- Think of the example of a telephone book
- Would you use the standard indexing method to
find all occurances of a name comprising 25 of
the book? - Or, would it be better to just scan through the
book?
33Low data cardinality
- Bitmap indexes are extremely compact
- That is their primary advantage
- No real magic involved
- Less I/O is faster, plain and simple
- Still, no matter how compact they are
- a FULL table scan is still likely faster than a
single bitmap index on low cardinality data - However, merging two or more bitmap indexes
brings out the real power of the mechanism - Scanning of merged bitmaps using fast low-level
bitmasking operations - Sifts through huge numbers of rows quickly
34Low data cardinality
- Bitmap indexes are designed for quickly scanning
low cardinality data - each database block is comprised of two bitmap
segments - each bitmap segment is comprised of a bitmap, a
ROWID list, and a list of distinct data values
- ROWID list is a forward-compressed list of
ROWIDs, positioned according to bits in bitmap
ROWID list
Distinct values list
Bitmap
- Bitmap is organized into columns for distinct
data values and rows of bits - representing rows
ROWID list
Distinct values list
Bitmap
- More distinct data values means less room for
rows within each data value
35Low data cardinality
- Bitmap indexes extremely slow during DML
- INSERT, UPDATE, and DELETE
- For each change, entries within the bitmap
segments must be - Expanded/decoded
- Manipulated
- Re-encoded
- For multiple concurrently-executing DML changes
- Slower individual changes cause queueing
- Only two bitmap segments per block increases
contention - Because of this, bitmap indexes are most feasible
on partitioned tables in non-transactional
applications - Data changes should be performed in bulk using
some variation on the technique of EXCHANGE
PARTITION
36Function-based indexes
- Indexes can be based on functions and expressions
- create index xxx_ix1 on xxx(upper(c1))
- create index xxx_ix2 on xxx(upper(c1)yadda_yadda
(c2))
37Function-based indexes
- User-defined functions usable along with standard
built-in functions - Restrictions for using user-defined functions
- But functions and expressions must be
deterministic - repeatable given the same inputs, the function
or expression must always provide the same
results - user-defined functions cannot access tables or
package variables - PRAGMA RESTRICT_REFERENCES(, RNDS, WNDS, RNPS,
WNPS) - CREATE FUNCTION DETERMINISTIC
38Function-based indexes
- Function-based indexes must have statistics
gathered before they can be used - System permissions
- GLOBAL QUERY REWRITE needed to CREATE or
ALTER REBUILD - Parameter QUERY_REWRITE_ENABLED TRUE to use
39Descending indexes
- Implemented as a form of function-based indexes
- CREATE INDEX DESC
- data values in index leaf blocks are sorted in
descending order - must have statistics before it will be utilized
- not usable with rule-based optimizer
- system permission QUERY REWRITE or GLOBAL QUERY
REWRITE needed to CREATE or ALTER REBUILD - parameter QUERY_REWRITE_ENABLED not necessary to
use
40Index compression
- Since Oracle8i, compression has been available on
BTree indexes - Reduces repeated storage of prefix column data
values - Index compression
- COMPRESS -prefix-cols NOCOMPRESS
- UNIQUE default -prefix-cols is (-cols)-1
- NONUNIQUE default -prefix-cols is -cols
- Same syntax for IOT in USING clause as for index
compression - DML is supported on compressed indexes, but it
becomes much slower - Index compression is best used for read-mostly or
read-only situations - Mixing of compressed and uncompressed index
partitions is possible
41Solutions for BTree indexes
- Dont assume that there is anything wrong with a
BTree index - Prove it first!!!
- To address sparse index structures
- ALTER INDEX REBUILD COALESCE
- To address block contention on INSERTs
- ALTER INDEX REVERSE
- To address uneven data distribution
- Gather column-level statistics or histograms
- To address low cardinality
- Use bitmap indexes or no indexes at all
- Additional groovy stuff
- Function-based, descending, and compressed indexes
42Technical Session 549 Slides, paper, and
scripts downloadable from http//www.SageLogix.com
and http//www.EvDBT.com/papers.htm Email
tim_at_sagelogix.com