Indexing Data Relationships

Description:

Indexing Data Relationships. Michael J. Franklin, University of California, Berkeley & RightOrder Inc. Overview: Data relationships can be complex. Hierarchical views ...

Slides: 22

1
Indexing Data Relationships

Michael J. Franklin
University of California, Berkeley
RightOrder Inc.

2
Overview

Data relationships can be complex.
Hierarchical views: XML, LDAP, ...
Semistructured: dynamic schema
Approach: Encode paths as tagged strings
raw paths encode structure
refined paths accelerate lookups
Index strings in a highly compact structure.
Lives on top of, next to, or inside the DBMS.
Benefits
Performance, scalability, adaptivity
Leverages mature DBMS technology
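
To make the idea of encoding paths as tagged strings concrete, here is a minimal sketch in Python. It assumes a hypothetical designator table that maps each element name to a single character; the table, the sample document, and the function name encode_raw_paths are illustrative and not taken from the slides.

```python
import xml.etree.ElementTree as ET

def encode_raw_paths(xml_text, designators):
    """Return one tagged string per root-to-leaf path that ends in text."""
    root = ET.fromstring(xml_text)
    keys = []

    def walk(node, prefix):
        # Append this element's designator to the path prefix.
        prefix = prefix + designators[node.tag]
        children = list(node)
        text = (node.text or "").strip()
        if not children and text:
            # Leaf element: designator path, then the data value.
            keys.append(prefix + " " + text)
        for child in children:
            walk(child, prefix)

    walk(root, "")
    return keys

# Hypothetical designator assignment (an assumption for this example).
DESIGNATORS = {"invoice": "I", "buyer": "B", "seller": "S", "name": "N"}

doc = (
    "<invoice>"
    "<buyer><name>ABC Corp</name></buyer>"
    "<seller><name>Acme Inc</name></seller>"
    "</invoice>"
)
keys = encode_raw_paths(doc, DESIGNATORS)
print(keys)
```

Each resulting string carries both the structure (the designator prefix) and the value, so an ordinary string index can answer path queries by prefix.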

3
Raw paths w/Designators
4
Refined paths

Optimize specific access paths

Find invoices where X sold to Y
Find invoices where X bought Y and Z
Find invoices where a buyer bought X, Y and Z
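
A refined path for the first query above could be sketched as follows. This is only an illustration: the designator "Z", the key layout, and the record fields are assumptions, and a plain dictionary stands in for the string index.

```python
def refined_sold_to_keys(invoices, designator="Z"):
    """Build one key per invoice for the query 'X sold to Y'."""
    index = {}
    for inv in invoices:
        # Dedicated designator, then seller and buyer in query order.
        key = f"{designator} {inv['seller']}|{inv['buyer']}"
        index.setdefault(key, []).append(inv["id"])
    return index

# Hypothetical sample data.
invoices = [
    {"id": 1, "seller": "Acme Inc", "buyer": "ABC Corp"},
    {"id": 2, "seller": "Acme Inc", "buyer": "XYZ Ltd"},
    {"id": 3, "seller": "Acme Inc", "buyer": "ABC Corp"},
]
index = refined_sold_to_keys(invoices)

# "Find invoices where Acme Inc sold to ABC Corp" becomes one key lookup.
hits = index.get("Z Acme Inc|ABC Corp", [])
print(hits)
```

The point of the sketch is that a frequent multi-step query is precomputed into a single key, so answering it costs one index lookup instead of a join over raw paths.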
5
Index Fabric

An index structure for long strings.
Provides fast lookups
Handles long strings
Ideal substrate for designated keys
Based on Patricia tries
Highly compressed string representation
Cost in index independent of string length
But the trie needs to be balanced.

6
Patricia tries
Indexes first point of difference between keys
greenbeans
greentea
D. R. Morrison. PATRICIA: Practical Algorithm To Retrieve Information
Coded in Alphanumeric.
J. ACM, 15 (1968), pp. 514-534.
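
The following is a minimal character-level sketch of a Patricia trie, written to show how each internal node records only the first position where keys differ. It is not the Index Fabric implementation; the class and function names are hypothetical, and it assumes keys never contain the "\0" terminator character.

```python
END = "\0"  # terminator so that no key is a prefix of another (assumption)

class Leaf:
    def __init__(self, key):
        self.key = key

class Branch:
    def __init__(self, pos):
        self.pos = pos        # index of the character this node tests
        self.children = {}    # character -> Leaf or Branch

def _char(key, pos):
    return key[pos] if pos < len(key) else END

def _any_leaf(node):
    while isinstance(node, Branch):
        node = next(iter(node.children.values()))
    return node

def lookup(root, key):
    node = root
    while isinstance(node, Branch):
        node = node.children.get(_char(key, node.pos))
    # Skipped characters were never compared, so verify the full key.
    return node is not None and node.key == key

def insert(root, key):
    if root is None:
        return Leaf(key)
    # 1. Descend to find a stored key sharing the longest prefix with `key`.
    node = root
    while isinstance(node, Branch):
        nxt = node.children.get(_char(key, node.pos))
        node = nxt if nxt is not None else _any_leaf(node)
    other = node.key
    if other == key:
        return root
    # 2. Find the first point of difference.
    diff = 0
    while _char(key, diff) == _char(other, diff):
        diff += 1
    # 3. Descend again to where a test on position `diff` belongs.
    parent, node = None, root
    while isinstance(node, Branch) and node.pos < diff:
        parent, node = node, node.children[_char(key, node.pos)]
    if isinstance(node, Branch) and node.pos == diff:
        node.children[_char(key, diff)] = Leaf(key)
        return root
    branch = Branch(diff)
    branch.children[_char(key, diff)] = Leaf(key)
    branch.children[_char(other, diff)] = node
    if parent is None:
        return branch
    parent.children[_char(key, parent.pos)] = branch
    return root

root = None
for k in ["greenbeans", "greentea"]:
    root = insert(root, k)
print(root.pos)  # position of the first difference between the two keys
```

For greenbeans and greentea the root tests position 5, the character after the shared prefix "green"; the shared prefix itself is never stored in the trie, which is why a lookup must compare the full key at the leaf.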
7
Multiple Hierarchical Views

Can store multiple permutations of relationships
Find animals and the plants they eat
Find plants and the animals that eat them
Represent as a new set of keys
Store data once using permutation records
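
One way to picture permutation keys is the sketch below. It assumes hypothetical designators "A" (animal) and "P" (plant), stores each record once, and emits one key per ordering; a sorted list with binary search stands in for the string index.

```python
import bisect

def build_views(facts):
    records = {}  # record id -> data, stored once
    keys = []     # (tagged key, record id), one entry per permutation
    for rid, (animal, plant) in enumerate(facts):
        records[rid] = {"animal": animal, "plant": plant}
        keys.append((f"A {animal} P {plant}", rid))  # animal-first view
        keys.append((f"P {plant} A {animal}", rid))  # plant-first view
    keys.sort()
    return records, keys

def prefix_search(keys, prefix):
    """Return record ids of all keys that start with `prefix`."""
    start = bisect.bisect_left(keys, (prefix,))
    out = []
    for key, rid in keys[start:]:
        if not key.startswith(prefix):
            break
        out.append(rid)
    return out

# Hypothetical sample data.
facts = [("cow", "grass"), ("cow", "corn"), ("deer", "grass")]
records, keys = build_views(facts)

# Find plants and the animals that eat them (plant-first view).
grass_eaters = sorted(records[r]["animal"] for r in prefix_search(keys, "P grass "))
# Find animals and the plants they eat (animal-first view).
cow_food = sorted(records[r]["plant"] for r in prefix_search(keys, "A cow "))
print(grass_eaters, cow_food)
```

Both views resolve to the same record ids, so the data itself is not duplicated; only the keys are.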

8
Example
[Figure: trie diagram; only the node labels a, b, c, o, w survive]
9
Example
[Figure: trie diagram; only the node labels a, b, c, o, w survive]
10
Balancing Patricia tries
11
Balancing Patricia tries
Step 1 divide trie into blocks
12
Balancing Patricia tries
Step 2 build another layer
[Figure: two-layer trie, Layer 1 and Layer 0; surviving node labels: g, e]
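
The effect of the two steps can be approximated with a much simpler structure. The sketch below does not use tries: it swaps in sorted key blocks for Layer 0 and a list of block separator keys for Layer 1, purely to show how a search touches one block per layer. The block size and function names are assumptions.

```python
import bisect

def build_layers(keys, block_size):
    keys = sorted(keys)
    # Step 1: divide the keys into fixed-size blocks (Layer 0).
    layer0 = [keys[i:i + block_size] for i in range(0, len(keys), block_size)]
    # Step 2: build another layer that holds the first key of each block.
    layer1 = [block[0] for block in layer0]
    return layer1, layer0

def search(layer1, layer0, key):
    # Layer 1 picks the single Layer 0 block that could hold the key.
    i = bisect.bisect_right(layer1, key) - 1
    if i < 0:
        return False
    block = layer0[i]
    j = bisect.bisect_left(block, key)
    return j < len(block) and block[j] == key

layer1, layer0 = build_layers(
    ["grass", "corn", "cow", "greenbeans", "greentea"], block_size=2
)
print(search(layer1, layer0, "cow"), search(layer1, layer0, "cash"))
```

However many keys are stored, a lookup reads one block in each layer, which is the balance property the layering is meant to provide; more layers can be stacked in the same way as the key count grows.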
13
Balancing Patricia tries
Search for cash
[Figure: Layer 1 and Layer 0; surviving labels: g, e, greenbeans]
14
Balancing Patricia tries
Search for cash
[Figure: Layer 1 and Layer 0 of the trie; leaf keys: grass, corn, cow, greenbeans, greentea]
15
Balancing Patricia tries
Search for cash
[Figure: Layer 1 and Layer 0 of the trie; leaf keys: grass, corn, cow, greenbeans, greentea]
16
Balancing Patricia tries
17
Performance