Indexing OLAP Data Sunita Sarawagi - PowerPoint PPT Presentation

About This Presentation
Title:

Indexing OLAP Data Sunita Sarawagi

Description:

Title: Indexing OLAP Data Sunita Sarawagi
Author: Monowar Hossain
Last modified by: Monowar Hossain
Created Date: 11/14/2002 8:41:51 AM
Document presentation format: PowerPoint PPT presentation

Slides: 12

Transcript and Presenter's Notes


1
Indexing OLAP Data, Sunita Sarawagi
  • Monowar Hossain
  • York University

2
Agenda
  • Requirements on Indexing methods
  • Existing indexing methods
  • Optimization of R-Tree for OLAP data
  • R-Tree vs. Bit-mapped Indices
  • Conclusion

3
Requirements on Indexing methods
  • Symmetric partial match queries
  • Continuous, e.g. time between Jan and July 94
  • Discontinuous e.g. first month of each year
  • Indexing at multiple levels of aggregation
  • Pre-computed group-bys
  • Indexing summary data
  • Handling multiple traversal orders
  • Efficient batch update
  • Handling sparse data efficiently
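
To make the two kinds of partial match query concrete, here is a minimal Python sketch. The fact rows, the `partial_match` helper, and its parameter names are assumptions made up for illustration; they are not from the presentation. A restriction left as `None` means that dimension is unrestricted, which is what makes the query a partial match.

```python
from datetime import date

# Hypothetical fact rows: (product, store, sale date, amount).
rows = [
    ("soap", "S1", date(1994, 1, 15), 10.0),
    ("soap", "S2", date(1994, 8, 3), 7.5),
    ("milk", "S1", date(1995, 1, 20), 4.0),
    ("milk", "S2", date(1994, 6, 30), 2.5),
]

def partial_match(rows, product=None, store=None, when=None):
    """Return rows satisfying every restriction that is given.

    A restriction left as None leaves that dimension unrestricted.
    """
    return [
        r for r in rows
        if (product is None or r[0] == product)
        and (store is None or r[1] == store)
        and (when is None or when(r[2]))
    ]

# Continuous restriction: January to July 1994.
continuous = partial_match(
    rows, when=lambda d: date(1994, 1, 1) <= d <= date(1994, 7, 31)
)

# Discontinuous restriction: the first month of each year.
discontinuous = partial_match(rows, when=lambda d: d.month == 1)
```

The sketch scans every row; an index satisfying the requirements above would have to answer both shapes of query without a full scan, and equally well whichever dimension is restricted.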

4
Existing methods
  • Multidimensional array-based methods
  • Works efficiently when data is dense
  • Essbase's schema
  • E.g. a four-dimensional cube: product and store
    (sparse), time and scenario (dense)
  • B-tree on Product and Store
  • Two-dimensional array on time and scenarios
  • Evaluation of Essbase's schema
  • May cause multiple searches.
  • E.g. searching for a given store on the
    product-store index
  • Performance depends on ability to find enough
    dense dimensions.
  • Efficient batch update
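
The layout described on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not Essbase's actual implementation: a Python dict stands in for the B-tree on the sparse dimensions, and the dimension values (`TIMES`, `SCENARIOS`) and function names are hypothetical.

```python
# Dense dimensions: every (time, scenario) cell of a block is allocated.
TIMES = ["Q1", "Q2", "Q3", "Q4"]
SCENARIOS = ["actual", "budget"]

# Stand-in for the B-tree on the sparse dimensions: only (product, store)
# combinations that actually occur get an entry.
index = {}

def insert(product, store, time, scenario, value):
    """Store a value, allocating the dense 2-D block on first use."""
    key = (product, store)
    if key not in index:
        index[key] = [[0.0] * len(SCENARIOS) for _ in TIMES]
    index[key][TIMES.index(time)][SCENARIOS.index(scenario)] = value

def lookup(product, store, time, scenario):
    """Fully specified query: one index probe plus one array access."""
    block = index.get((product, store))
    if block is None:
        return None
    return block[TIMES.index(time)][SCENARIOS.index(scenario)]

def lookup_store(store, time, scenario):
    """Query restricted on store only: visits one entry per product."""
    t, s = TIMES.index(time), SCENARIOS.index(scenario)
    return {p: block[t][s] for (p, st), block in index.items() if st == store}

insert("soap", "S1", "Q1", "actual", 10.0)
insert("milk", "S1", "Q1", "actual", 3.0)
insert("soap", "S2", "Q2", "budget", 7.0)
```

`lookup_store` illustrates the "multiple searches" point: with product leading the composite key, a restriction on store alone cannot be answered with a single probe.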

5
Existing methods Cont...
  • Bit mapped indices
  • Pros
  • For low-cardinality data, bit-maps are both space
    and retrieval efficient.
  • Supports bitwise operations
  • Access to data is clustered
  • All dimensions are handled symmetrically
  • Cons
  • Poor support for range queries
  • Increased space overhead of storing the bit-maps,
    especially for high-cardinality data
  • Expensive batch update as all bit mapped indices
    have to be modified even for a single row
    insertion
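
A minimal sketch of a bit-mapped index, assuming a toy table whose row position serves as the row id. Python integers are used as bit vectors; the column values and helper names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (region, product class); the list position is the row id.
rows = [("east", "A"), ("west", "B"), ("east", "B"), ("east", "A")]

def build_bitmaps(rows, column):
    """One bitmap per distinct value; bit i is set when row i holds that value."""
    bitmaps = defaultdict(int)
    for row_id, row in enumerate(rows):
        bitmaps[row[column]] |= 1 << row_id
    return bitmaps

def row_ids(bitmap):
    """Decode a bitmap back into the sorted list of row ids it selects."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

region = build_bitmaps(rows, 0)
product = build_bitmaps(rows, 1)

# Conjunctions and disjunctions of restrictions are single bitwise operations.
east_and_a = region["east"] & product["A"]
west_or_a = region["west"] | product["A"]
```

Each indexed column keeps its own set of bitmaps, so appending one row means extending the bitmaps of every indexed column, which is the batch-update cost listed under Cons.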

6
Existing methods Cont...
  • Bit-mapped indices variants
  • Compression
  • Hybrid
  • Dynamic Bit-maps

7
Existing methods Cont...
  • Hierarchical Indices
  • Example: Product - Store
  • Index Product first; also store summaries at the
    product level.
  • For each product value, create an index on Store
    and store summaries at the product-store level
  • Pros
  • Allows faster access to higher-level data
  • Dimensions are symmetrically handled
  • Cons
  • Increased index storage overhead
  • The average retrieval efficiency can suffer
    because of large indexing structure
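
The Product - Store example can be sketched with nested dictionaries standing in for the two index levels. The facts and field names (`total`, `stores`) are hypothetical and chosen only for illustration.

```python
# Hypothetical sales facts: (product, store, amount).
facts = [("soap", "S1", 10.0), ("soap", "S2", 5.0), ("milk", "S1", 2.0)]

# The top level indexes product. Each entry keeps a product-level summary
# and a nested index on store that holds product-store summaries.
index = {}
for product, store, amount in facts:
    entry = index.setdefault(product, {"total": 0.0, "stores": {}})
    entry["total"] += amount
    entry["stores"][store] = entry["stores"].get(store, 0.0) + amount

# Product-level summary: answered at the first level of the hierarchy.
soap_total = index["soap"]["total"]

# Product-store summary: one more step down the hierarchy.
soap_s2 = index["soap"]["stores"]["S2"]
```

Summaries are materialized at every level, which is where the extra storage comes from.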

8
Existing methods Cont...
  • Multidimensional indices
  • Use of the indexing methods designed for
    spatial data
  • E.g. R-Tree, Grid Files, etc.

9
Optimized R-Tree for OLAP data
  • Rectangular dense regions (store only the
    boundaries of regions that contain more than a
    threshold number of points)
  • Each contains a pointer to a variable-length
    array of TIDs or of the tuples themselves
  • Points in sparse regions
  • Finding dense regions
  • Ask an expert?
  • Use of a clustering algorithm (similar to the
    algorithms used in image analysis)
  • Need evaluation!!
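
A rough sketch of the leaf-level idea only, assuming the dense regions are already known (for example, supplied by hand). It does not build an actual R-Tree: a linear scan over regions stands in for tree traversal, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DenseRegion:
    lo: tuple   # lower corner, one coordinate per dimension
    hi: tuple   # upper corner (inclusive)
    # Variable-length array of (point, TID) pairs that fall in the region.
    entries: list = field(default_factory=list)

    def contains(self, point):
        return all(l <= x <= h for l, x, h in zip(self.lo, point, self.hi))

def in_box(point, qlo, qhi):
    return all(ql <= x <= qh for ql, x, qh in zip(qlo, point, qhi))

def overlaps(region, qlo, qhi):
    return all(l <= qh and ql <= h
               for l, h, ql, qh in zip(region.lo, region.hi, qlo, qhi))

def build(points, regions):
    """Put each (point, tid) in the first dense region containing it;
    points outside every dense region are returned as sparse points."""
    sparse = []
    for point, tid in points:
        for region in regions:
            if region.contains(point):
                region.entries.append((point, tid))
                break
        else:
            sparse.append((point, tid))
    return sparse

def range_query(regions, sparse, qlo, qhi):
    """Return the sorted TIDs of all points inside the box [qlo, qhi]."""
    hits = [tid for point, tid in sparse if in_box(point, qlo, qhi)]
    for region in regions:
        if overlaps(region, qlo, qhi):   # whole region skipped otherwise
            hits += [tid for point, tid in region.entries
                     if in_box(point, qlo, qhi)]
    return sorted(hits)
```

Only the boundary of a dense region needs to be indexed, so a query that misses the region skips all of its points with a single rectangle test.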

10
R-Tree vs. Bit-mapped indices
  • R-Tree Pros
  • Allows range queries
  • Smaller space overhead
  • Update is more efficient
  • Bit-mapped Pros
  • Faster bitwise operations
  • Efficient for low cardinality, few restricted
    dimensions, and sparse data.
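
One way to see why range queries favor the R-Tree: with a plain bit-mapped index, a range restriction has to read and OR together one bitmap per distinct value in the range. The sketch below uses a made-up month column and a hypothetical `range_bitmap` helper; it illustrates a simple (uncompressed, value-per-bitmap) scheme only.

```python
# Hypothetical column of months (1-12), one entry per row.
months = [1, 3, 3, 7, 12, 5, 1, 6]

# One bitmap per distinct month; bit i is set when row i has that month.
bitmaps = {}
for row_id, m in enumerate(months):
    bitmaps[m] = bitmaps.get(m, 0) | (1 << row_id)

def range_bitmap(bitmaps, lo, hi):
    """OR together the bitmap of every value in [lo, hi].

    Returns the combined bitmap and the number of bitmaps that were read.
    """
    result, read = 0, 0
    for value in range(lo, hi + 1):
        if value in bitmaps:
            result |= bitmaps[value]
            read += 1
    return result, read

result, read = range_bitmap(bitmaps, 1, 7)
matching_rows = [i for i in range(len(months)) if (result >> i) & 1]
```

The number of bitmaps read grows with the width of the range and the cardinality of the column, whereas a rectangle-based index describes the same range with a single pair of bounds.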

11
Conclusion
  • High level overview
  • Recommended readings
  • MOLAP VS OLAP
  • R-Tree and variants
  • R-Tree alternatives
  • Computation of multidimensional aggregates
  • And more...