Compressing Relations And Indexes

1
Compressing Relations And Indexes
  • Jonathan Goldstein, Raghu Ramakrishnan
  • Uri Shaft
  • Department of Computer Sciences, University of
    Wisconsin-Madison
  • June 18, 1997

2
Agenda
  • Introduction
  • Compressing A Relation
  • Compression Applied to Rectangle-Based Indexes
  • Performance Evaluation
  • Questions and Remarks

3
Introduction
  • Page level Compression
  • Performance Study
  • Application to B-trees and R-trees
  • Multidimensional bulk loading algorithm

4
Introduction
5
Introduction
6
Compressing A relation
  • Frames Of Reference
  • Non numeric attributes
  • File level compression
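
The slides name frame-of-reference compression without spelling it out. As a rough illustration only, the sketch below assumes the usual reading of the idea: each page stores a per-attribute minimum (its frame of reference), and each value is kept as a small offset from that minimum, so it needs only as many bits as the page's value range requires. The function names `compress_page` and `decompress_tuple` are hypothetical and not taken from the presentation.

```python
def compress_page(tuples):
    """Encode a page of integer tuples relative to per-attribute minima.

    Returns (lows, bits, offsets):
      lows    - the minimum of each attribute on this page (the frame of reference)
      bits    - how many bits each attribute's offsets would need
      offsets - the tuples rewritten as offsets from `lows`
    """
    dims = len(tuples[0])
    lows = [min(t[d] for t in tuples) for d in range(dims)]
    highs = [max(t[d] for t in tuples) for d in range(dims)]
    # An offset in 0..(high - low) fits in (high - low).bit_length() bits.
    bits = [(hi - lo).bit_length() for lo, hi in zip(lows, highs)]
    offsets = [tuple(t[d] - lows[d] for d in range(dims)) for t in tuples]
    return lows, bits, offsets


def decompress_tuple(lows, offsets, i):
    """Recover tuple i alone, without touching the rest of the page."""
    return tuple(lo + off for lo, off in zip(lows, offsets[i]))


# Usage: three 2-attribute tuples that sit close together on one page.
page = [(1000, 52), (1003, 50), (1001, 57)]
lows, bits, offsets = compress_page(page)
second = decompress_tuple(lows, offsets, 1)
```

Because each tuple is decoded from the page header plus its own offsets, a single tuple can be read without decompressing the whole page, which is the property the later "Tuple-Level Decompression" bullet refers to.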

7
Frames of Reference
8
Lossy Compression
  • Point approximation in lossy compression

9
Compressing an indexing structure
  • Compressing a B-tree
  • Compressing a rectangle based indexing structure
  • Compression oriented Bulk Loading

10
Rectangle Based indexing qualities
11
Changing the frame of reference
12
Bulk-Loading Algorithm
  • Input. A set of points in some
    n-dimensional space.
  • Output. A partition of the input into subsets.
  • Requirements. The partition should group points
    that are close to each other in the same group as
    much as possible.
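
The input/output contract above can be illustrated with a generic partitioner. The sketch below is not the GB-Pack algorithm from the following slides; it is a plain recursive median split (in the style of a k-d tree), chosen only because it satisfies the stated requirements in a few lines. The `capacity` parameter (maximum points per group) is an assumption added for the example.

```python
def partition(points, capacity):
    """Split points into groups of at most `capacity` nearby points.

    Recursively cuts the set at the median of whichever dimension
    has the largest spread, so each group stays spatially compact.
    """
    if len(points) <= capacity:
        return [points]
    dims = len(points[0])

    def spread(d):
        return max(p[d] for p in points) - min(p[d] for p in points)

    # Cut along the dimension where the points are most spread out.
    d = max(range(dims), key=spread)
    ordered = sorted(points, key=lambda p: p[d])
    mid = len(ordered) // 2
    return partition(ordered[:mid], capacity) + partition(ordered[mid:], capacity)


# Usage: two well-separated clusters end up in separate groups.
points = [(0, 0), (1, 1), (0, 1), (100, 100), (101, 100), (100, 101)]
groups = partition(points, capacity=3)
```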

13
GB-Pack compression oriented bulk loading
14
GB-Pack compression oriented bulk loading
  • Qualities
  • trading off some tree quality for increased
    compression.
  • number of entries per page is data-dependent.
  • cutting a dimension at a value boundary in the
    data.
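
The last bullet can be read as follows: when a dimension is cut, the cut falls between two distinct data values, so equal values never straddle two groups. The sketch below illustrates only that one reading; it is an assumption about the intent of the slide, not the GB-Pack procedure itself, and `boundary_cut` is a hypothetical helper name.

```python
def boundary_cut(values, target):
    """Pick a cut index near `target` that lies on a value boundary.

    `values` must be sorted. A valid cut index i satisfies
    values[i - 1] != values[i], so values[:i] and values[i:] share
    no value. Returns None if all values are equal (no boundary).
    """
    candidates = [i for i in range(1, len(values)) if values[i - 1] != values[i]]
    if not candidates:
        return None
    # Choose the boundary closest to where we wanted to cut.
    return min(candidates, key=lambda i: abs(i - target))


# Usage: we would like to cut at position 5, but that would split the 2s.
values = [1, 1, 2, 2, 2, 2, 5, 5]
cut = boundary_cut(values, target=5)
left, right = values[:cut], values[cut:]
```

Snapping cuts to value boundaries is also one way the number of entries per group becomes data-dependent, as the second bullet notes: the group sizes follow the data rather than a fixed count.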

15
GB-Pack compression oriented bulk loading
16
GB-Pack compression oriented bulk loading
17
GB-Pack compression oriented bulk loading
18
Performance Evaluation
  • Relational Compression Experiments.
  • CPU vs. I/O Costs.
  • Comparison With Techniques in commercial systems.
  • Importance of Tuple-Level Decompression.
  • R-tree Compression Experiments.

19
Synthetic Data Sets
  • Size: The number of tuples in the relation.
  • Dimensionality: The number of attributes of the
    relation.
  • Range: The range of values for the attributes.
  • Distribution: Uniform (worst case) or exponential.
  • Partition strategy.
  • Page size.
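
To make the first four parameters concrete, here is a minimal generator sketch. It is not the authors' test harness: the function name `synthetic_relation`, the fixed seed, and the mean chosen for the exponential case (one tenth of the range) are all assumptions made for illustration. Partition strategy and page size apply to the compressor rather than the data, so they are left out.

```python
import random


def synthetic_relation(size, dimensionality, value_range,
                       distribution="uniform", seed=0):
    """Generate `size` integer tuples with `dimensionality` attributes.

    value_range is an inclusive (low, high) pair. The exponential case
    uses a hypothetical mean of one tenth of the range and clamps
    values to `high`.
    """
    rng = random.Random(seed)
    lo, hi = value_range

    def draw():
        if distribution == "uniform":
            return rng.randint(lo, hi)
        if distribution == "exponential":
            # expovariate takes 1/mean; mean assumed to be (range / 10).
            x = lo + int(rng.expovariate(10.0 / (hi - lo + 1)))
            return min(x, hi)
        raise ValueError(f"unknown distribution: {distribution}")

    return [tuple(draw() for _ in range(dimensionality)) for _ in range(size)]


# Usage: a small 3-attribute relation for each distribution.
uniform_rel = synthetic_relation(100, 3, (0, 999))
skewed_rel = synthetic_relation(100, 3, (0, 999), distribution="exponential")
```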

20
Sales Data Set
  • Sales data set: compression achieved versus
    dimensionality.

21
CPU vs. I/O Costs
22
R-tree Compression Experiments
  • Testing the quality of R-trees on Sales Data Set.

23
Questions And Remarks