Title: Indexing Semistructured Data
1Indexing Semistructured Data
- J. McHugh, J. Widom, S. Abiteboul, Q. Luo, and A. Rajaraman
- Stanford University
- January 1998
- http://www-db.stanford.edu/lore/
EECS 684, 02/21/2000. Presented by Weiming Zhou
2Outline
- Introduction
- - Data Model
- - Query Language
- Indexes in Lore
- Query plans using indexes
- Conclusions
3Data Model - Object Exchange Model (OEM)
4The Lorel Query Language (Lorel)
Example 1:
  select DB.Movie.Title
  where DB.Movie.Actor.Name = "Harrison Ford"
Example 2:
  select T
  from DB.Movie M, M.Title T
  where exists A in M.Actor : exists N in A.Name : N = "Harrison Ford"
5Indexes In Lore
- Value index
- Text index
- Link index
- Path index
- Edge index
6Value index
Similar to attribute indexes in a relational DBMS.
Example: Suppose we create a Value Index for DB.Movie.Year. A lookup for DB.Movie.Year = 1956 returns object 12.
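A minimal sketch of the lookup above, using nested Python dictionaries as a stand-in for the index. The object ids 7 and 12 and the value 1956 come from the slides; the value 1977 for object 7 and all function names are made up for illustration, and nothing here reflects how Lore actually stores the index.

```python
from collections import defaultdict

# Hypothetical atomic objects: (object id, incoming label, value).
# Ids 7 and 12 and the year 1956 follow the slides; 1977 is invented.
ATOMS = [(7, "Year", 1977), (12, "Year", 1956)]

def build_value_index(atoms):
    """Map label -> value -> list of object ids carrying that value."""
    index = defaultdict(lambda: defaultdict(list))
    for oid, label, value in atoms:
        index[label][value].append(oid)
    return index

def lookup(index, label, value):
    """Return ids of atomic objects with the given incoming label and value."""
    return index.get(label, {}).get(value, [])

value_index = build_value_index(ATOMS)
print(lookup(value_index, "Year", 1956))  # [12]
```

This sketch handles equality only; range comparisons would need an ordered structure instead of a hash map.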
7Text Index
- An information-retrieval style keyword search.
- Restricted by incoming labels.
- Locates string values containing specific words.
- Useful for strings containing a significant amount of text.
- Implementation
- Inverted lists: map a given word w and label l to a list of atomic values with incoming edge l that contain word w.
- Example
- Lookup for all objects with an atomic string value containing the word "Ford" and an incoming edge Name.
- Results: <17, 2>, <21, 2>.
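The inverted-list idea can be sketched as a dictionary keyed by (word, label). This is an illustration only: the function names are hypothetical, lowercasing the words is a choice made here, and reading the second number of each result pair as the 1-based word position inside the string is an assumption that happens to match the slide's results.

```python
from collections import defaultdict

# Hypothetical string objects: (object id, incoming label, string value).
# Ids 17 and 21 and the name "Harrison Ford" follow the slides.
STRINGS = [(17, "Name", "Harrison Ford"), (21, "Name", "Harrison Ford")]

def build_text_index(strings):
    """Map (word, label) -> list of (object id, word position) pairs."""
    index = defaultdict(list)
    for oid, label, text in strings:
        for pos, word in enumerate(text.split(), start=1):
            # Assumption: positions are 1-based and matching ignores case.
            index[(word.lower(), label)].append((oid, pos))
    return index

def text_lookup(index, word, label):
    """Find string objects with incoming edge `label` that contain `word`."""
    return index.get((word.lower(), label), [])

text_index = build_text_index(STRINGS)
print(text_lookup(text_index, "Ford", "Name"))  # [(17, 2), (21, 2)]
```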
8Link Index
- Locates parents of a given object.
- Serves as back-pointers
- Implementation
- Extendible hashing
- One Link Index for the entire database graph
- Example
- The Link Index lookup for object 17 returns
parent object 6, and the lookup for object 21
returns object 13.
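As a rough illustration of the back-pointer lookup, the sketch below uses a Python dictionary in place of extendible hashing. The edges are assumed from the object ids used across the slides (2 and 4 as movies, 6 and 13 as actors); the function names are hypothetical.

```python
from collections import defaultdict

# Assumed edges (parent, label, child), pieced together from the slide examples.
EDGES = [(2, "Actor", 6), (6, "Name", 17), (4, "Actor", 13), (13, "Name", 21)]

def build_link_index(edges):
    """Map child id -> list of (label, parent id): one index for the whole graph."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[child].append((label, parent))
    return index

link_index = build_link_index(EDGES)

def parents_of(oid):
    """Return the parents of an object, i.e. follow its back-pointers."""
    return [parent for _, parent in link_index.get(oid, [])]

print(parents_of(17))  # [6]
print(parents_of(21))  # [13]
```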
9Path Index
Locates all objects reachable by a given labeled path.
Provided by the DataGuide.
Example: select DB.Movie.Title
Use the Path Index to directly locate all objects reachable via DB.Movie.Title.
Results: 5, 9, 14.
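One way to picture a path lookup is a table from label paths to object ids, filled in by a single traversal from the root. The sketch below is not a DataGuide implementation: it assumes tree-shaped data with no cycles, treats object 1 as the root named DB, and uses an edge list inferred from the ids on the slides.

```python
from collections import defaultdict

# Assumed edges (parent, label, child); object 1 as root "DB" is an assumption.
EDGES = [
    (1, "Movie", 2), (1, "Movie", 3), (1, "Movie", 4),
    (2, "Title", 5), (3, "Title", 9), (4, "Title", 14),
]

def build_path_index(edges, root, root_name):
    """Map each label path (a tuple) to the objects reachable by it."""
    children = defaultdict(list)
    for parent, label, child in edges:
        children[parent].append((label, child))
    index = defaultdict(list)
    stack = [((root_name,), root)]
    while stack:  # terminates only because the data is assumed acyclic
        path, oid = stack.pop()
        index[path].append(oid)
        for label, child in children[oid]:
            stack.append((path + (label,), child))
    return index

path_index = build_path_index(EDGES, root=1, root_name="DB")
print(sorted(path_index[("DB", "Movie", "Title")]))  # [5, 9, 14]
```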
10Edge Index
Locates all parent-child pairs connected via a specified label.
Example: Look up label Year in the Edge Index.
Results: 2-7, 3-12.
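The Edge Index lookup can be sketched as a dictionary from label to parent-child pairs. The Year edges follow the slide; the Title edge is added here only so the index holds more than one label, and the function names are hypothetical.

```python
from collections import defaultdict

# (parent, label, child); the two Year edges come from the slide example.
EDGES = [(2, "Year", 7), (3, "Year", 12), (2, "Title", 5)]

def build_edge_index(edges):
    """Map label -> list of (parent id, child id) pairs joined by that label."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[label].append((parent, child))
    return index

def edge_lookup(index, label):
    """Return every parent-child pair connected via `label`."""
    return index.get(label, [])

edge_index = build_edge_index(EDGES)
print(edge_lookup(edge_index, "Year"))  # [(2, 7), (3, 12)]
```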
11Query Plans Using Indexes
- Top-Down
- Bottom-Up
- Hybrid
- Example:
    select T
    from DB.Movie M, M.Title T
    where exists A in M.Actor : exists N in A.Name : N = "Harrison Ford"
12Top-Down Query Plan
Exhaustive top-down traversal of DB.Movie.Actor.Name, testing for "Harrison Ford": 17, 21
Link Index: 17 -> 2, 21 -> 4
DB.Movie.Title: 5, 14
13Bottom-Up Query Plan
Look up the Value Index for DB.Movie.Actor.Name = "Harrison Ford": 17, 21
Link Index: 17 -> 2, 21 -> 4
DB.Movie.Title: 5, 14
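The bottom-up steps can be sketched end to end on a small graph. Everything below is illustrative: the edge list is inferred from the object ids on the slides, object 1 is assumed to be the root DB, plain dictionaries stand in for the Value and Link indexes, and the function names are hypothetical. Note that 17 -> 2 takes two upward steps (17 -> 6 -> 2).

```python
from collections import defaultdict

ROOT = 1  # assumption: object 1 is the root named DB
EDGES = [
    (1, "Movie", 2), (1, "Movie", 3), (1, "Movie", 4),
    (2, "Title", 5), (2, "Actor", 6), (6, "Name", 17),
    (3, "Title", 9),
    (4, "Title", 14), (4, "Actor", 13), (13, "Name", 21),
]
VALUES = {17: "Harrison Ford", 21: "Harrison Ford"}

parents = defaultdict(list)   # Link Index stand-in: child -> [(label, parent)]
children = defaultdict(list)  # forward edges: parent -> [(label, child)]
for p, label, c in EDGES:
    parents[c].append((label, p))
    children[p].append((label, c))

def climb(oids, label):
    """Follow incoming edges with the given label one step upward."""
    return sorted({p for o in oids for (l, p) in parents[o] if l == label})

def bottom_up_titles(name):
    # Step 1: stand-in for the Value Index lookup on Name = name.
    hits = sorted(o for o, v in VALUES.items()
                  if v == name and any(l == "Name" for l, _ in parents[o]))
    # Step 2: climb the path in reverse using the Link Index stand-in.
    actors = climb(hits, "Name")      # 17 -> 6, 21 -> 13
    movies = climb(actors, "Actor")   # 6 -> 2, 13 -> 4
    movies = [m for m in movies if ("Movie", ROOT) in parents[m]]
    # Step 3: fetch the Title children of the qualifying movies.
    return sorted(c for m in movies for (l, c) in children[m] if l == "Title")

print(bottom_up_titles("Harrison Ford"))  # [5, 14]
```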
14Hybrid Query Plan
select X from A.B X where exists Y in X.C : Y = 5
Bottom-up: Value Index lookup for A.B.C = 5
Top-down: traverse A.B
Intersect the two results.
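A small sketch of the hybrid idea, on an entirely hypothetical graph (the slide gives no object ids for this query). The bottom-up side finds parents of C objects with value 5 without checking the rest of the path, the top-down side finds every A.B object, and the intersection keeps only objects that satisfy both. Object 1 stands for A, and all names are made up for illustration.

```python
# Hypothetical edges (parent, label, child) and atomic values.
EDGES = [
    (1, "B", 2), (1, "B", 3), (1, "D", 4),
    (2, "C", 10), (3, "C", 11), (4, "C", 12),
]
VALUES = {10: 5, 11: 8, 12: 5}

def bottom_up_candidates(label, value):
    """Parents of atomic objects with incoming edge `label` and the given value."""
    return {p for (p, l, c) in EDGES if l == label and VALUES.get(c) == value}

def top_down(root, labels):
    """Objects reached from `root` by following `labels` in order."""
    frontier = {root}
    for label in labels:
        frontier = {c for (p, l, c) in EDGES if p in frontier and l == label}
    return frontier

# Object 4 has a C child equal to 5 but is reached via D, not B, so it drops out.
result = bottom_up_candidates("C", 5) & top_down(1, ["B"])
print(sorted(result))  # [2]
```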
15Conclusions
- Presents Lore's indexing structures: Value Index, Text Index, Link Index, Path Index, and Edge Index.
- Query plans using indexes
- Preliminary performance results: at least an order of magnitude improvement when indexes are used for query processing.