Title: METAXPath
1METAXPath
- Curtis Dyreson
- E.E. and Computer Science
- Washington State University
- USA
- Michael Böhlen and Christian S. Jensen
- Computer Science
- Aalborg University
- Denmark
- Nykredit Center for Database Research
- Aalborg University, Denmark
2Outline
- Data
- Data model
- XML
- Query language
- XPath
- Metadata
- METAXPath
- Future work
3An XML Database Architecture
Database
Client (HTTP browser)
HTTP server
XML data and metadata
4Database Data Model Evolution
- 60s - Hierarchical data model
- 70s - Network data model
- 80s - Relational data model
- 90s - Object-oriented data model
- 00s - Unstructured/semistructured/XML
- Innovators
- Unstructured data models (UPenn)
- UnQL/Strudel (AT&T)
- OEM and Lore (Stanford)
- XML (W3C)
5Object Exchange Model (OEM)
- Heterogeneous OODBs
- Exchange objects
- Text description
[Figure: object 1 and object 2 exchanged between "my database" and "your database" as text (XML)]
6Object Representation in XML
- Use names and values
- Ignore types
- X denotes object X
<!ATTLIST person id ID #REQUIRED>
<!ELEMENT person (name, age)>
// A person class
class Person { String name; int age; }
// A person object
Person joe = new Person("Joe Doe", 25);
<person id="1">
<name>Joe Doe</name>
<age>25</age>
</person>
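The mapping on this slide (use names and values, ignore types) can be sketched in a few lines of Python. This is only an illustration: the `Person` dataclass mirrors the slide's class, while the `to_xml` helper and the choice to lowercase the class name for the element tag are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def to_xml(obj, obj_id):
    """Serialize an object using only field names and values; types are dropped."""
    elem = ET.Element(type(obj).__name__.lower(), {"id": str(obj_id)})
    for f in fields(obj):
        child = ET.SubElement(elem, f.name)
        child.text = str(getattr(obj, f.name))  # the int 25 becomes the text "25"
    return elem


joe = Person("Joe Doe", 25)
xml_text = ET.tostring(to_xml(joe, 1), encoding="unicode")
print(xml_text)
```

Note that the receiver of the text cannot tell that `age` was an integer, which is the sense in which types are ignored.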
7XML (XPath) Data Model
- Each element or attribute is a node
- Edges indicate nesting
- Nodes contain information
- Tree is ordered
XML
<person id="1" name="Joe">
<age>25</age>
</person>
XPath
root
person element
id="1" attribute
name="Joe" attribute
\n text
age element
25 text
\n text
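The points above (every element or attribute is a node, the tree is ordered, whitespace is kept as text nodes) can be observed with Python's standard `xml.dom.minidom`. DOM is used here only as a stand-in for the XPath data model, which it resembles but does not match exactly; the `describe` helper is hypothetical.

```python
from xml.dom import minidom

doc = minidom.parseString('<person id="1" name="Joe">\n<age>25</age>\n</person>')
person = doc.documentElement


def describe(node):
    """Return a (kind, content) pair for element and text nodes."""
    if node.nodeType == node.ELEMENT_NODE:
        return ("element", node.tagName)
    if node.nodeType == node.TEXT_NODE:
        return ("text", node.data)
    return ("other", node.nodeName)


# Children are kept in document order, including the whitespace-only text nodes.
children = [describe(c) for c in person.childNodes]
# Attributes are nodes attached to the element, not children of it.
attributes = sorted(person.attributes.items())
print(children)
print(attributes)
```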
8Semistructured Data Model
- Each element or attribute is a node
- Edges indicate nesting
- Edges are labeled
XML
<person id="1" name="Joe">
<age>25</age>
</person>
Semistructured
person
1
name
Joe
age
25
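A minimal sketch of the edge-labeled view, assuming that the `id` attribute names the node and that every other attribute or child element becomes a labeled edge. The `to_edges` function and the `(source, label, target)` triple layout are illustrative choices, not part of any specification.

```python
import itertools
import xml.etree.ElementTree as ET


def to_edges(xml_text):
    """Convert XML into a set of (source, label, target) edges."""
    counter = itertools.count(1)
    edges = set()

    def visit(elem, parent):
        attrs = {k: v for k, v in elem.attrib.items() if k != "id"}
        if not attrs and len(elem) == 0:
            # Leaf: the edge points straight at the (whitespace-stripped) value.
            edges.add((parent, elem.tag, (elem.text or "").strip()))
            return
        node = elem.get("id") or f"n{next(counter)}"
        edges.add((parent, elem.tag, node))
        for name, value in attrs.items():
            edges.add((node, name, value))  # attribute and element look the same
        for child in elem:
            visit(child, node)

    visit(ET.fromstring(xml_text), "root")
    return edges


as_attribute = to_edges('<person id="1" name="Joe">\n<age>25</age>\n</person>')
as_element = to_edges('<person id="1"><age>25</age><name>Joe</name></person>')
print(as_attribute == as_element)
```

Because the edges form an unordered set, two documents that differ only in whitespace, child order, or attribute-versus-element encoding produce the same graph.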
9Data Models Compared
- Semistructured
- Insensitive to text order, whitespace, and attributes vs. elements
- Directed graph (many roots, can contain cycles)
- XPath
- Captures text order, whitespace, and attributes and elements
- A tree (single root, no cycles)
[Figure: the semistructured graph and the XPath tree for the person example, side by side]
10Outline
- Data
- Data model
- XML
- Query language
- XPath
- Metadata
- XML - METAXPath
- Future work
11XPath
- W3C Recommendation 1999
- Used in XQuery, XSLT, and XPointer
- Language for selecting locations in an XML document
- Query
- Sequence of location steps separated by /
- Location step
- axis::node_test[predicate1]...[predicateN]
- Evaluated with respect to a context node
- Results in a node-set (actually a list of nodes!)
- Step continues from nodes reached in previous step
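The evaluation strategy described above (each step starts from the nodes reached by the previous step) can be sketched as a small loop. This is a toy evaluator over `xml.etree.ElementTree`, not a real XPath engine: it handles only element nodes, two axes, and name tests. The sample document is an assumed reconstruction of the deck's person example, and the synthetic `#root` element stands in for the XPath root node.

```python
import xml.etree.ElementTree as ET

DOC = ('<person SSN="99"><dateOfBirth><month>January</month><year>1981</year>'
       '</dateOfBirth><name initial="S"><first>Susan</first><last>Douglas</last>'
       '</name></person>')

# ElementTree has no separate root node, so a synthetic one stands in for it.
root = ET.Element("#root")
root.append(ET.fromstring(DOC))

AXES = {
    "child": lambda node: list(node),
    "descendant": lambda node: [d for d in node.iter() if d is not node],
}


def evaluate(context, steps):
    """Apply each (axis, name) step to the nodes reached by the previous step."""
    nodes = [context]
    for axis, name in steps:
        reached = []
        for node in nodes:
            for candidate in AXES[axis](node):
                matches = name == "*" or candidate.tag == name
                if matches and not any(candidate is r for r in reached):
                    reached.append(candidate)
        nodes = reached
    return nodes


# /child::person/child::*/child::last
last_nodes = evaluate(root, [("child", "person"), ("child", "*"), ("child", "last")])
print([n.text for n in last_nodes])
```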
12Descendant Axis Example
root
person element
SSN="99" attribute
dateOfBirth element
This comment
name element
last element
first element
month element
year element
initial="S" attribute
1981 text
January text
Douglas text
Susan text
13Axes that Partition a Tree
- Ancestor, descendant, following, preceding, and self partition a tree.
ancestor
self
preceding
following
descendant
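The partition claim can be checked mechanically. The sketch below computes the five axes for one context node in a small made-up tree (element nodes only; in XPath, attribute and namespace nodes fall outside these axes) and confirms that every node lands in exactly one of them. The `axes` helper and the tree are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/><d/></b><e><f/><g><h/></g></e><i/></a>")
order = list(root.iter())  # all elements in document order
parent = {child: p for p in order for child in p}


def axes(node):
    """Return the five axes that partition the tree around node."""
    ancestors = []
    current = node
    while current in parent:
        current = parent[current]
        ancestors.append(current)
    descendants = [n for n in node.iter() if n is not node]
    pos = order.index(node)
    # preceding: earlier in document order, excluding ancestors
    preceding = [n for n in order[:pos] if n not in ancestors]
    # following: later in document order, excluding descendants
    following = [n for n in order[pos + 1:] if n not in descendants]
    return {"ancestor": ancestors, "self": [node], "preceding": preceding,
            "following": following, "descendant": descendants}


g = root.find(".//g")
parts = axes(g)
for axis, nodes in parts.items():
    print(axis, [n.tag for n in nodes])
```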
14XPath Node Test and Predicates
- Each node in result-set must pass node test
- Is this an element node named person?
- person
- Is this an element node?
- *
- Predicates are further tests (about other nodes)
- Does node have a ssn attribute?
- [attribute::ssn]
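Node tests and predicates can be tried out with the XPath subset built into Python's `xml.etree.ElementTree`. That subset uses the abbreviated syntax, so `[@ssn]` below stands for `[attribute::ssn]`. The sample document is made up for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<people><person ssn="99"><name>Susan</name></person>'
    '<person><name>Joe</name></person><pet/></people>'
)

by_name = doc.findall("person")         # node test: element node named person
any_element = doc.findall("*")          # node test: any element node
with_ssn = doc.findall("person[@ssn]")  # predicate: node has an ssn attribute

print([e.tag for e in any_element])
print([e.find("name").text for e in with_ssn])
```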
15Example /child::person/child::*/child::last
root
person element
SSN="99" attribute
dateOfBirth element
This comment
name element
last element
first element
month element
year element
initial="S" attribute
1981 text
January text
Douglas text
Susan text
16XPath Examples
- The dateOfBirth children of person nodes
- /descendant::person/child::dateOfBirth
- The last text node
- /descendant::text()[position()=last()]
17Abbreviated Syntax
- Think of file path specifications in Unix
- Year child of dateOfBirth
- child::dateOfBirth/child::year
- dateOfBirth/year
- name siblings
- parent::node()/child::name
- ../name
- All year nodes
- /descendant-or-self::node()/child::year
- //year
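The abbreviated forms can be exercised with `xml.etree.ElementTree`, which accepts a subset of this syntax. Two caveats apply to this sketch: ElementTree rejects absolute paths on an element, so `.//year` is used in place of `//year`, and `..` only works for parents inside the subtree being searched. The document is an assumed reconstruction of the deck's person example.

```python
import xml.etree.ElementTree as ET

DOC = ('<person SSN="99"><dateOfBirth><month>January</month><year>1981</year>'
       '</dateOfBirth><name initial="S"><first>Susan</first><last>Douglas</last>'
       '</name></person>')
person = ET.fromstring(DOC)

# child::dateOfBirth/child::year, abbreviated
year_of_birth = person.findall("dateOfBirth/year")

# descendant-or-self::node()/child::year, abbreviated and relative to person
all_years = person.findall(".//year")

# From dateOfBirth, go up to the parent and down to its name children
name_siblings = person.findall("dateOfBirth/../name")

print([e.text for e in year_of_birth], [e.tag for e in name_siblings])
```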
18Outline
- Data
- Data model
- XML
- Query language
- XPath
- Metadata
- XML - METAXPath
- Future work
19Metadata
- Database metadata
- Schema, security, transaction time (versions)
- Web metadata
- Author, language, subject, privacy
- Web metadata recommendations
- RDF, RDD, P3P
- Features
- Descriptive, but also exclusionary
- Irregular
- Multiple
- Ad-hoc
20A Movie Database
- Movie data
- Bruce Willis stars in Colour of Night.
- Colour of Night premiered 1/Jul/1995.
- Publication meta-data
- language: English
- URL: http://www.auc.dk
- publication date: 2/Apr/1997
- privacy/security: over 18
- publication history: v1.2, modified 31/Jul/1998
- subject: Film, Suspense, Thriller
- namespace: http://www.auc.dk/movieDataDTD.xml
21Movie Database Queries
- Metadata only
- Retrieve information published at Danish web sites.
- Metadata compared to data
- Find reviews published in the first week of the movie's release.
- Metadata and data, but independent
- Get suspense films starring Bruce Willis.
22Properties of a Metadata Data Model
- Goal: Same query language for data and metadata
- User learns one language
- Compiler/optimization reuse
- Challenges: Data and metadata in different dataspaces
- Query on data should not accidentally query metadata
- Meta-metadata
- Metadata for metadata
- Metadata has semantics
- Data with/without metadata
23METAXPath Data Model
- Data model
- Reuse XPath data model
- Meta attribute points to metadata tree
- Right angle data model
- Features
- Minimal extension of XPath
- Backwards-compatible
24Example
- Data
- <?xml version="1.0"?>
- <person ssn="234">
- <name>Ichiro</name>
- </person>
- URL metadata
- <source URL="www.wsu.edu/p.htm"/>
- Language metadata of person element
- <language>English</language>
- Author meta-metadata (author of the language metadata)
- <author name="Suzuki"/>
25Type root
Type element Value person Attributes (ssn, 234)
<?xml version="1.0"?>
<person ssn="234">
<name>Ichiro</name>
</person>
Type element Value name Attributes
Type text Value \n\t
Type text Value \n
Type text Value Ichiro
26Type root Meta
Type root
Type element Value source Attributes (URL, www.wsu.edu/p.htm)
Type element Value person Attributes (ssn, 234)
Type element Value name Attributes
Type text Value \n\t
Type text Value \n
Type text Value Ichiro
<source URL="www.wsu.edu/p.htm"/>
27Type root Meta
Type root
Type element Value source Attributes (URL, www.wsu.edu/p.htm)
Type element Value person Attributes (ssn, 234) Meta
Type root
Type element Value language Attributes
Type element Value name Attributes
Type text Value \n\t
Type text Value \n
Type text Value English
Type text Value Ichiro
<language>English</language>
28Type root Meta
Type root
Type element Value source Attributes (URL, www.wsu.edu/p.htm)
Type element Value person Attributes (ssn, 234) Meta
Type root Meta
Type root
Type element Value author Attributes (name, Suzuki)
Type element Value language Attributes
Type element Value name Attributes
Type text Value \n\t
Type text Value \n
Type text Value English
Type text Value Ichiro
<author name="Suzuki"/>
29Sharing and Excluding Metadata
- Meta property points to metadata for a node
- Shared pointers => shared metadata
- To share with child
- Copy pointer
- To exclude from child
- Duplicate excluded portion
- Copy remaining shared pointers
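The sharing and excluding rules above can be sketched with a tiny in-memory model. Everything here is hypothetical: the `Node` class, the helper names, and in particular the reading of "exclude" as "duplicate the metadata root, keep pointers to its children, and drop its own Meta pointer". It is one plausible interpretation of the slides, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: str = ""
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None  # pointer to the root of a metadata tree


def share_with_descendants(node):
    """Sharing: each descendant gets a copy of the pointer, not of the metadata."""
    for child in node.children:
        child.meta = node.meta
        share_with_descendants(child)


def exclude_meta_metadata(node):
    """Excluding: duplicate the metadata root, keep shared pointers to its children."""
    old = node.meta
    node.meta = Node(old.type, old.value, list(old.children), meta=None)


author_root = Node("root", children=[Node("element", "author")])
language = Node("element", "language", [Node("text", "English")])
language_root = Node("root", children=[language], meta=author_root)

ichiro = Node("text", "Ichiro")
name = Node("element", "name", [ichiro])
person = Node("element", "person", [name], meta=language_root)

share_with_descendants(person)
shared = name.meta is person.meta and ichiro.meta is person.meta

# The Ichiro text keeps the language metadata but not its author.
exclude_meta_metadata(ichiro)
print(shared, ichiro.meta is language_root, ichiro.meta.children[0] is language)
```

After the exclusion, only the root of the metadata tree has been duplicated; the `language` subtree is still a single shared object.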
30Type root Meta
Type root
Type element Value source Attributes (URL, www.wsu.edu/p.htm)
Type element Value person Attributes (ssn, 234) Meta
Type root Meta
Type root
Type element Value author Attributes (name, Suzuki)
Type element Value language Attributes Meta
Type element Value name Attributes Meta
Type text Value \n\t Meta
Type text Value \n Meta
Type text Value English Meta
Type text Value Ichiro Meta
Share metadata with descendants
31Type root Meta
Type root
Type element Value source Attributes (URL, www.wsu.edu/p.htm)
Type element Value person Attributes (ssn, 234) Meta
Type root Meta
Type root
Type element Value author Attributes (name, Suzuki)
Type element Value language Attributes Meta
Type element Value name Attributes Meta
Type text Value \n\t Meta
Type text Value \n Meta
Type text Value English Meta
Ichiro text not authored by Suzuki
Type text Value Ichiro Meta
Type root Meta
32METAXPath Queries
- XPath plus level shift operation
- meta axis
- in abbreviated syntax
- Example - Locate data nodes with URL metadata of p.htm
- /descendant-or-self::node()/meta::node()/child::source[attribute::URL="p.htm"]
- In abbreviated syntax
- //source[@URL="p.htm"]
- Example - Locate the URL metadata
- //source/@URL
- Example - Locate data that has metadata authored by Suzuki (meta-metadata)
- ////author[@name="Suzuki"]
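A rough sketch of the level shift: the meta axis steps from a node to the root of its metadata tree, and ordinary axes never cross that boundary. The `Node` class and the helper functions are hypothetical stand-ins for illustration, and the example matches on the full URL from the earlier slides rather than the shortened `p.htm`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: str = ""
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None


def descendant_or_self(node):
    """Ordinary axis: stays inside the data tree, never follows meta pointers."""
    yield node
    for child in node.children:
        yield from descendant_or_self(child)


def meta_axis(node):
    """Level shift: step from a node to the root of its metadata tree, if any."""
    return [node.meta] if node.meta is not None else []


def has_child(roots, value, attr, expected):
    return any(c.value == value and c.attributes.get(attr) == expected
               for r in roots for c in r.children)


source_root = Node("root", children=[Node("element", "source", {"URL": "www.wsu.edu/p.htm"})])
author_root = Node("root", children=[Node("element", "author", {"name": "Suzuki"})])
language_root = Node("root", children=[Node("element", "language")], meta=author_root)
person = Node("element", "person", {"ssn": "234"},
              [Node("element", "name", children=[Node("text", "Ichiro")])],
              meta=language_root)
data_root = Node("root", children=[person], meta=source_root)

# Data nodes whose metadata has a source child with the given URL
with_url = [n for n in descendant_or_self(data_root)
            if has_child(meta_axis(n), "source", "URL", "www.wsu.edu/p.htm")]

# Data nodes whose metadata has metadata (two level shifts) authored by Suzuki
by_suzuki = [n for n in descendant_or_self(data_root)
             if has_child([mm for m in meta_axis(n) for mm in meta_axis(m)],
                          "author", "name", "Suzuki")]

data_values = [n.value for n in descendant_or_self(data_root)]
print([n.type for n in with_url], [n.value for n in by_suzuki])
```

Because `descendant_or_self` ignores the `meta` pointer, a query over the data tree cannot reach `source`, `language`, or `author` by accident.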
33Outline
- Data
- Data model
- XML
- Query language
- XPath
- Metadata
- XML - METAXPath
- Future work
34Metadata Semantics
2
name: movie
3
name: title; trans. time: 2/Apr/1997 - 31/Jul/1998
Color of Night
35AUCQL Collapse Example
- PropertyCollapse for name is concatenation; for trans. time it is temporal intersection.
1
name: reviewed; trans. time: 1/Sep/1999 - uc
2
name: movie
3
name: title; trans. time: 1/Aug/1998 - uc
name: title; trans. time: 2/Apr/1997 - 31/Jul/1998
Colour of Night
Color of Night
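The collapse rule on this slide can be illustrated with a short sketch. Several things are assumptions here: "uc" is read as "until changed" and modeled as the largest representable date, an edge without a transaction time is treated as unconstrained, names are joined with a dot, and the edges are taken to form the paths reviewed, movie, title. None of this is taken from the AUCQL implementation.

```python
from datetime import date

UC = date.max              # stands in for "uc", read here as "until changed"
ALWAYS = (date.min, UC)    # assumed default for an edge with no trans. time


def intersect(a, b):
    """Temporal intersection of two closed intervals, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def collapse(path):
    """Collapse the edges on a path: concatenate names, intersect trans. times."""
    names, time = [], ALWAYS
    for edge in path:
        names.append(edge["name"])
        if time is not None:
            time = intersect(time, edge.get("trans_time", ALWAYS))
    return {"name": ".".join(names), "trans_time": time}


reviewed = {"name": "reviewed", "trans_time": (date(1999, 9, 1), UC)}
movie = {"name": "movie"}
title_new = {"name": "title", "trans_time": (date(1998, 8, 1), UC)}
title_old = {"name": "title", "trans_time": (date(1997, 4, 2), date(1998, 7, 31))}

current = collapse([reviewed, movie, title_new])
obsolete = collapse([reviewed, movie, title_old])
print(current)
print(obsolete)
```

Under these assumptions the path through the newer title edge is valid from 1/Sep/1999 onward, while the path through the older title edge has an empty transaction time, since that edge ended before the review edge began.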
36AUCQL Additional Operations
- Coalesce - compute a distributed property value
37Thin Layer Implementation
result
METAXPath query
Metadata encoding
METAXPath Compiler
XPath query
XPath Compiler
DB
38Prototype Implementation
result
METAXPath query
METAXPath Compiler
Perl
Evaluation Tree
Query Evaluation Engine
Perl
DBM
Database API
39Summary
- METAXPath website
- http://www.eecs.wsu.edu/cdyreson/pub/MetaXPath
- AUCQL website
- VLDB 99
- Implemented research prototype
- Free, downloadable, Unix environment
- http://www.eecs.wsu.edu/cdyreson/pub/AUCQL
- Interactive query engine
- Tutorials