Query Languages - PowerPoint PPT Presentation

Title:

Query Languages

Slides: 40
Provided by: lkh8
Transcript and Presenter's Notes

1
Query Languages
  • Why a query language? Extracting, Restructuring,
    Integration, Browsing
  • XML-QL
  • http://www.w3.org/TR/NOTE-xml-ql
  • http://db.cis.upenn.edu/XML-QL/
  • XPATH (part of a query language)
  • http://www.w3.org/TR/xpath
  • XSLT
  • http://www.w3.org/TR/xslt
  • http://www.mulberrytech.com/quickref/XSLTquickref.pdf

2
XQuery -- likely to gain acceptance...
  • and, like other things in W3C, not necessarily
    the best. XQuery is a group of projects.
  • General blather and informal specifications:
    http://www.w3.org/XML/Query
  • Ingredients:
  • A concrete syntax: http://www.w3.org/TR/xquery
  • Based on XPath: http://www.w3.org/TR/xpath.html
  • A formal semantics and algebra:
    http://www.w3.org/TR/query-semantics/
  • Some test cases: http://www.w3.org/TR/xmlquery-use-cases

3
Query Languages and DTDs
  • The DOM does not interact with DTDs (later
    levels may do this)
  • For query languages there is almost no
    interaction (only to find out the names of IDs
    and IDREFs)
  • XDUCE (developed at Penn) is the only
    well-developed language that uses DTDs as a type
    system. www.cis.upenn.edu/hahosoya/xduce/

4
XML-QL (XML Query Language)
  • W3C proposal, August 1998
  • Authors:
  • Mary Fernandez, AT&T
  • Dana Florescu, INRIA
  • Alon Levy, Univ. of Washington
  • Dan Suciu, AT&T
  • Alin Deutsch, Univ. of Pennsylvania

5
Address Book Revisited
  • Caesar
  • Caesar Imperator
  • The Capitol
  • Rome, OH 98765
  • (321) 786 2543
  • (321) 786 2543
  • (321) 786 2543
  • jc@forum.rome.org

6
XML-QL Pattern Matching
Find Caesar's e-mail address:
where ... Caesar ... $e ...
  in http://db.cis.upenn.edu/peter/address.xml
construct $e
Result: jc@forum.rome.org
(Data Extraction)
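The extraction query above can be approximated in Python with the standard library's ElementTree. This is only a sketch: the element names (addressbook, person, name, email) and the inline document are assumptions made for illustration, since the tags of address.xml are not shown in this transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book; element names are assumed, not taken from the slides.
ADDRESS_XML = """
<addressbook>
  <person><name>Caesar</name><email>jc@forum.rome.org</email></person>
  <person><name>Brutus</name><email>mb@philippi.com</email></person>
</addressbook>
"""


def emails_of(root, wanted_name):
    """Match every person whose name equals wanted_name (the 'where' part)
    and return the bound e-mail values (the 'construct' part)."""
    return [
        person.findtext("email")
        for person in root.iter("person")
        if person.findtext("name") == wanted_name
    ]


root = ET.fromstring(ADDRESS_XML)
caesar_emails = emails_of(root, "Caesar")
print(caesar_emails)
```

The list comprehension plays the role of the pattern: the filter corresponds to the constant Caesar in the pattern, and the returned value corresponds to the variable $e.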
7
XML-QL Constructing New XML Data
Whom can we contact electronically?
where ... $g ... $e ...
  in http://...
construct ... $g ... $e ...
Result:
Caesar Imperator  jc@forum.rome.org
Brutus  mb@philippi.com
...
(Data Restructuring)
8
XML-QL Joins
Who of our contacts was involved in a movie?
where ... $g ... $e ...
  in http://address.xml,
  ... $t ... $g ...
  in http://www.imdb.com
construct ... $g ... $t ... $e ...
9
XML-QL Joins (contd)
Caesar Imperator  jc@forum.rome.org  Asterix and Cleopatra
Dr. Strangelove  strangelov@love.the.bomb  Dr. Strangelove or How I Stopped ...
...
(Data Integration)
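A join of this kind can be sketched in Python as a nested loop over two documents that share the name variable. Both inline documents and the element names person, name, email, movie, title, character and contacts are hypothetical; only cine-contact, who and movie are hinted at by the page description of this deck.

```python
import xml.etree.ElementTree as ET

# Both documents are hypothetical stand-ins for address.xml and the movie source.
ADDRESSES = ET.fromstring(
    "<addressbook>"
    "<person><name>Caesar</name><email>jc@forum.rome.org</email></person>"
    "<person><name>Brutus</name><email>mb@philippi.com</email></person>"
    "</addressbook>"
)
MOVIES = ET.fromstring(
    "<movies>"
    "<movie><title>Asterix and Cleopatra</title><character>Caesar</character></movie>"
    "</movies>"
)


def join_contacts_with_movies(addresses, movies):
    """Nested-loop join on the shared variable g (the person's name)."""
    result = ET.Element("contacts")
    for person in addresses.iter("person"):
        g = person.findtext("name")
        e = person.findtext("email")
        for movie in movies.iter("movie"):
            if g in [c.text for c in movie.iter("character")]:
                contact = ET.SubElement(result, "cine-contact")
                ET.SubElement(contact, "who").text = g
                ET.SubElement(contact, "movie").text = movie.findtext("title")
                ET.SubElement(contact, "email").text = e
    return result


joined = join_contacts_with_movies(ADDRESSES, MOVIES)
print(ET.tostring(joined, encoding="unicode"))
```

Brutus is dropped because no movie mentions him, which is the usual inner-join behaviour.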
10
XML-QL Data Model
  • Directed, labeled graph
  • Tags represented as edge labels
  • Sets of attribute name-value pairs as node labels
  • Two models: ordered and unordered
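To make the graph model concrete, here is a minimal Python sketch that turns an XML fragment into an edge-labeled graph: tags become edge labels and attribute name-value pairs become node labels. The integer node ids, the separate values table for leaf text, and the sample fragment are assumptions of this sketch, not part of XML-QL.

```python
import xml.etree.ElementTree as ET


def to_graph(root):
    """Return (edges, attrs, values) for an XML tree.

    edges  : list of (source node, edge label, target node); the tag is the label
    attrs  : node id -> dict of attribute name-value pairs (the node label)
    values : leaf node id -> text value
    """
    edges = []
    attrs = {0: {}}  # node 0 is an artificial graph root
    values = {}
    counter = 0

    def visit(element, parent_id):
        nonlocal counter
        counter += 1
        node_id = counter
        edges.append((parent_id, element.tag, node_id))
        attrs[node_id] = dict(element.attrib)
        text = (element.text or "").strip()
        if text and len(element) == 0:
            values[node_id] = text
        for child in element:
            visit(child, node_id)

    visit(root, 0)
    return edges, attrs, values


doc = ET.fromstring(
    '<person id="p1"><name>Caesar</name><tel>(321) 786 2543</tel></person>'
)
edges, attrs, values = to_graph(doc)
print(edges)
```

Keeping edges in a list preserves document order (the ordered model); treating the same list as a set would give the unordered model.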

11
XML-QL Data Model (contd)
  • Caesar
  • Caesar Imperator
  • The Capitol
  • Rome, OH 98765
  • (321) 786 2543
  • (321) 786 2543
  • (321) 786 2543
  • jc@forum.rome.org

12
XSL (Extensible Stylesheet Language)
  • W3C working draft by Adobe Systems
  • Original purpose was to specify rendering of XML
    documents (mainly by Web browsers HTML)
  • Consists of two parts:
  • an XML transformation language
  • a formatting vocabulary denoting typographic
    abstractions (paragraph, page, rule, footer,
    etc.)

13
The XSL Processor
(Figure: processing pipeline)
original document + stylesheet
  -> transformer
  -> restructured document (elements are formatting objects)
  -> formatter (plug in your favorite)
  -> presentation (other formats possible)
14
Formatting Example
original document:
An example
This is a test. This is another test.
transformed document:
An example
This is a test. This is another test.
15
Template Rules
  • ly-templates/

16
Template Rules (contd)

where ... $n ... $e ...
construct ... $n ... $e ...
17
XSL vs. XML-QL
XML-QL and XSL compared on:
  • XML output
  • no schema required
  • data extraction
  • data restructuring
  • data integration
  • schema browsing
  • relational complete
18
  • XPATH and XQuery

19
URLs -- XPath
  • http://www.w3.org/TR/xpath
  • This is the recommendation. Dense. Few
    examples. Difficult to extract the big picture
    from the morass of detail.
  • http://www.zvon.org/xxl/XPathTutorial/General/examples.html
  • A tutorial with some simple examples. Maybe too
    simple. There are lots of tutorials on the web.

20
URLs -- XQuery
  • http://www.w3.org/TR/xquery/
  • The basic recommendation. Plenty of examples,
    so work through these first.
  • http://www.w3.org/TR/query-semantics/
  • A formal semantics for XQuery. Despite its
    forbidding title, it is remarkably readable. It
    also discusses a type system for XQuery.
  • http://www.w3.org/TR/xmlquery-use-cases
  • A bunch of example queries and their solutions in
    XQuery (not surprising, since XQuery is
    Turing-complete!)

21
How to Identify nodes in a Tree -- Regular Path
Expressions
In the normal syntax of regular expressions:
  db.emps.emp
  db.(depts.dept.mgr | emps.emp)
  db._.name
Mary
Bill
John
N.B. Regular path expressions have nothing to do
with regular expressions in DTDs.
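One way to evaluate such expressions is to translate them into ordinary regular expressions over root-to-node label paths. The sketch below assumes that "." is concatenation, "|" is alternation, "_" matches any single label and "*" is Kleene star; the sample database and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

TOKEN = re.compile(r"[A-Za-z][\w-]*|_|[.|()*]")


def compile_path(expr):
    """Translate a regular path expression into a regex over 'a/b/c/' strings."""
    parts = []
    for tok in TOKEN.findall(expr.replace(" ", "")):
        if tok == ".":
            continue  # concatenation is implicit in a regex
        elif tok == "_":
            parts.append(r"(?:[^/]+/)")  # wildcard: any single label
        elif tok in "|()*":
            parts.append("(?:" if tok == "(" else tok)
        else:
            parts.append("(?:" + re.escape(tok) + "/)")
    return re.compile("".join(parts))


def select(root, expr):
    """Return, in document order, the nodes whose label path matches expr."""
    pattern = compile_path(expr)
    hits = []

    def walk(node, prefix):
        path = prefix + node.tag + "/"
        if pattern.fullmatch(path):
            hits.append(node)
        for child in node:
            walk(child, path)

    walk(root, "")
    return hits


# Hypothetical database shaped after the expressions on the slide.
DB = ET.fromstring(
    "<db>"
    "<depts><dept><mgr><name>Mary</name></mgr></dept></depts>"
    "<emps><emp><name>Bill</name></emp><emp><name>John</name></emp></emps>"
    "</db>"
)
names = [n.text for n in select(DB, "db._*.name")]
print(names)
```

Here db._*.name reaches name elements at any depth below db, while db.(depts.dept.mgr | emps.emp) selects the manager and both employees.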
22
More examples
With the DTD MOTHER)
the regular path expression
(PERSON.MOTHER)* identifies matrilineal
ancestry. XPath is a superset of a subset of
regular path expressions. (It cannot express
this set of nodes.) However, it is not limited
to moving down the tree.
23
XPath
  • Primary goal: to give access to some nodes
    of a given document
  • XPath main construct: axis navigation
  • An XPath path consists of one or more navigation
    steps, separated by /
  • A navigation step is a triplet: axis, node-test,
    list of predicates
  • Examples:
  • /descendant::node()/child::author
  • /descendant::node()/child::author[parent::*/attribute::booktitle = "XML2"]
  • XPath also offers some shortcuts:
  • no axis means child
  • // ≡ /descendant-or-self::node()/

24
XPath- child axis navigation
  • author is shorthand for child::author. Examples:
  • aaa -- all the child nodes labeled aaa (1,3)
  • aaa/bbb -- all the bbb grandchildren of aaa
    children (4)
  • */bbb -- all the bbb grandchildren of any child
    (4,6)
  • . -- the context node
  • / -- the root node

25
XPath- child axis navigation (cont)
  • /doc -- all the doc children of the root
  • ./aaa -- all the aaa children of the context node
    (equivalent to aaa)
  • text() -- all the text children of the context
    node
  • node() -- all the children of the context node
    (includes text nodes; attribute nodes are not children)
  • .. -- parent of the context node
  • .// -- the context node and all its descendants
  • // -- the root node and all its descendants
  • //para -- all the para nodes in the document
  • //text() -- all the text nodes in the document
  • @font -- the font attribute node of the context node
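Several of these abbreviated paths can be tried out with Python's ElementTree, which implements a limited subset of XPath. The sketch below uses a hypothetical document. ElementTree has no text() or @attr node selection, so attribute access goes through Element.get instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the context node is the doc element.
DOC = ET.fromstring(
    '<doc font="serif">'
    "<aaa><bbb>one</bbb></aaa>"
    "<ccc><bbb>two</bbb></ccc>"
    "<aaa><para>three</para></aaa>"
    "</doc>"
)
context = DOC

aaa_children = context.findall("aaa")         # aaa: children labeled aaa
aaa_bbb = context.findall("aaa/bbb")          # aaa/bbb: bbb grandchildren via aaa
any_bbb = context.findall("*/bbb")            # */bbb: bbb grandchildren via any child
all_para = context.findall(".//para")         # .//para: para descendants
parents_of_bbb = context.findall(".//bbb/..")  # ..: parent of each bbb
font = context.get("font")                    # stands in for @font

print([e.tag for e in parents_of_bbb], font)
```

All paths are evaluated relative to the element on which findall is called, so that element plays the role of the context node.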

26
Predicates
  • *[2] -- the second child node of the context node
  • chapter[5] -- the fifth chapter child of the
    context node
  • *[last()] -- the last child node of the context
    node
  • chapter[title="introduction"] -- the chapter
    children of the context node that have one or
    more title children whose string-value is
    introduction (the string-value is the
    concatenation of all the text on descendant text
    nodes)
  • person[.//firstname = "joe"] -- the person
    children of the context node that have in their
    descendants a firstname element with string-value
    joe
  • From the XPath specification:
  • NOTE: If $x is bound to a node-set, then
    $x = "foo" does not mean the same as
    not($x != "foo").
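Positional and value predicates can be tried with ElementTree's XPath subset, and the node-set comparison note can be modelled with any(). The document and the helper names below are hypothetical. ElementTree requires a tag name before a positional predicate, so chapter[2] is used rather than a wildcard.

```python
import xml.etree.ElementTree as ET

BOOK = ET.fromstring(
    "<book>"
    "<chapter><title>introduction</title></chapter>"
    "<chapter><title>data <em>model</em></title></chapter>"
    "<chapter><title>queries</title></chapter>"
    "</book>"
)


def title_of(chapter):
    """String-value of the title: concatenation of all descendant text."""
    return "".join(chapter.find("title").itertext())


second = BOOK.findall("chapter[2]")
last = BOOK.findall("chapter[last()]")
intro = BOOK.findall("chapter[title='introduction']")
nested = BOOK.findall("chapter[title='data model']")  # text spans a nested element


# Comparing a node-set with a string is existential: true if SOME node qualifies.
def some_equal(values, s):
    return any(v == s for v in values)


def some_unequal(values, s):
    return any(v != s for v in values)


x = ["foo", "bar"]  # string-values of a hypothetical node-set bound to $x
print(some_equal(x, "foo"), not some_unequal(x, "foo"))
```

With x holding both foo and bar, the equality test succeeds but the negated inequality fails, which is why the two forms are not interchangeable.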

27
Unions of Path Expressions
  • employee | consultant -- the union of the
    employee and consultant nodes that are children
    of the context node
  • For some reason person/(employee|consultant) -- as
    in regular path expressions -- is not allowed
  • However person/node()[boolean(employee|consultant)]
    is allowed!!
  • From the XPath specification:
  • The boolean function converts its argument to a
    boolean as follows:
  • a number is true if and only if it is neither
    positive or negative zero nor NaN
  • a node-set is true if and only if it is non-empty
  • a string is true if and only if its length is
    non-zero
  • an object of a type other than the four basic
    types is converted to a boolean in a way that is
    dependent on that type
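The conversion rules quoted above translate almost directly into Python. This is a sketch only: node-sets are modelled as Python lists, and xpath_boolean and union_children are hypothetical helper names.

```python
import math
import xml.etree.ElementTree as ET


def xpath_boolean(value):
    """Convert a value to a boolean following the rules quoted from the spec."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # true iff neither positive/negative zero nor NaN
        is_nan = isinstance(value, float) and math.isnan(value)
        return value != 0 and not is_nan
    if isinstance(value, str):
        return len(value) > 0  # true iff the length is non-zero
    if isinstance(value, (list, tuple)):
        return len(value) > 0  # node-set: true iff non-empty
    raise TypeError("conversion depends on the type: " + type(value).__name__)


def union_children(context, *tags):
    """employee | consultant: matching children, in document order."""
    return [child for child in context if child.tag in tags]


PERSON = ET.fromstring("<person><employee/><pet/><consultant/></person>")
staff = union_children(PERSON, "employee", "consultant")
print([e.tag for e in staff], xpath_boolean(staff))
```

Note that the string "false" converts to true, because only the length of the string matters.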

28
Axis navigation
  • So far, nearly all our expressions have moved us
    down the tree by moving to child nodes. Exceptions
    were:
  • . -- stay where you are
  • / -- go to the root
  • // -- all descendants of the root
  • .// -- all descendants of the context node
  • All other expressions have been abbreviations for
    child::, e.g. child::para. child is an example of
    an axis.
  • XPath has several axes: ancestor,
    ancestor-or-self, attribute, child, descendant,
    descendant-or-self, following, following-sibling,
    namespace, parent, preceding, preceding-sibling,
    self
  • Some of these (self, parent) describe single
    nodes, others describe sequences of nodes.

29
XPath Navigation Axes (merci, Arnaud Sahuguet)
(Figure: the axes relative to the context node: ancestor,
following-sibling, preceding-sibling, self, child, attribute,
following, preceding, namespace, descendant)
30
XPath abbreviated syntax
(nothing)   child::
@           attribute::
//          /descendant-or-self::node()/
.           self::node()
.//         descendant-or-self::node()/
..          parent::node()
/           (document root)
31
XPath
  • Reasonably widely adopted -- in XML-Schema and
    query languages.
  • Neither more expressive nor less expressive than
    regular path expressions (can't do (ab)*)
  • Particularly messy in some areas:
  • defining order of results
  • overloading of operations,
  • e.g. chapter/title = "Introduction"
  • why not "Introduction" IN chapter/title ?

32
XQuery
  • proposed by Chamberlin, Robie and Florescu
  • (from the authors' slides)
  • Leverage the most effective features of several
    existing and proposed query languages
  • Design a small, clean, implementable language
  • Cover the functionality required by all the XML
    Query use cases in a single language
  • Write queries that fit on a slide

33
XQuery = XPath + comprehension syntax
  • XML-QL
  • Quilt

34
XQuery

35
Examples from XQuery
List the titles of books published by Morgan
Kaufmann in 1998.
FOR $b IN document("bib.xml")//book
WHERE $b/publisher = "Morgan Kaufmann"
  AND $b/@year = "1998"
RETURN $b/title
XPath expressions in orange
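The FOR/WHERE/RETURN shape of this query maps closely onto a Python comprehension. The sketch below runs over a small hypothetical bib.xml; the book data is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB = ET.fromstring(
    "<bib>"
    '<book year="1998"><title>Book A</title>'
    "<publisher>Morgan Kaufmann</publisher><price>50</price></book>"
    '<book year="1999"><title>Book B</title>'
    "<publisher>Morgan Kaufmann</publisher><price>30</price></book>"
    '<book year="1998"><title>Book C</title>'
    "<publisher>Addison-Wesley</publisher><price>65</price></book>"
    "</bib>"
)

titles = [
    b.findtext("title")                                # RETURN $b/title
    for b in BIB.iter("book")                          # FOR $b IN ...//book
    if b.findtext("publisher") == "Morgan Kaufmann"    # WHERE $b/publisher = ...
    and b.get("year") == "1998"                        #   AND $b/@year = "1998"
]
print(titles)
```

This is the comprehension syntax that XQuery adds on top of XPath: iteration, a filter and a result expression.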
36
DTD for Sample Document
  • publisher, price )

37
Examples from XQuery (cont)
List each publisher and the average price of its
books.
FOR $p IN distinct(document("bib.xml")//publisher)
LET $a := avg(document("bib.xml")//book[publisher = $p]/price)
RETURN ... $p/text() ... $a ...
LET binds a variable to a value. It does not
cause an iteration. Does this create a
(well-formed) XML document?
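A Python sketch of the same query follows, again over hypothetical data. The wrapper element results and the output element names publisher, name and avgprice are assumptions. A single wrapper is used because a well-formed XML document needs exactly one root element, which a bare sequence of sibling elements does not have.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><publisher>Morgan Kaufmann</publisher><price>50</price></book>"
    "<book><publisher>Morgan Kaufmann</publisher><price>30</price></book>"
    "<book><publisher>Addison-Wesley</publisher><price>65</price></book>"
    "</bib>"
)


def average_price_by_publisher(bib):
    result = ET.Element("results")  # assumed single root for well-formedness
    # FOR $p IN distinct(...//publisher): iterate over distinct publisher names
    for p in sorted({pub.text for pub in bib.iter("publisher")}):
        # LET $a := avg(...): one binding per iteration, no extra loop variable
        prices = [
            float(b.findtext("price"))
            for b in bib.iter("book")
            if b.findtext("publisher") == p
        ]
        a = sum(prices) / len(prices)
        entry = ET.SubElement(result, "publisher")
        ET.SubElement(entry, "name").text = p
        ET.SubElement(entry, "avgprice").text = f"{a:.2f}"
    return result


averages = average_price_by_publisher(BIB)
print(ET.tostring(averages, encoding="unicode"))
```

The LET step binds the whole list of prices once per publisher, which mirrors the remark that LET does not cause an iteration.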
38
Examples from XQuery (cont)
List the publishers who have published more than
100 books.
FOR $p IN distinct(document("bib.xml")//publisher)
LET $b := document("bib.xml")//book[publisher = $p]
WHERE count($b) > 100
RETURN $p

What about efficiency?
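On the efficiency question: read literally, the query rescans the whole document once per distinct publisher. One possible alternative, sketched here in Python under the assumption that grouping by publisher name is acceptable, counts all books in a single pass. The data and the threshold of 1 (instead of 100) are chosen only to keep the example small.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><publisher>Morgan Kaufmann</publisher></book>"
    "<book><publisher>Morgan Kaufmann</publisher></book>"
    "<book><publisher>Addison-Wesley</publisher></book>"
    "</bib>"
)


def prolific_publishers(bib, threshold):
    """Publishers with more than `threshold` books, using one pass over the books."""
    counts = Counter(b.findtext("publisher") for b in bib.iter("book"))
    return sorted(p for p, n in counts.items() if n > threshold)


prolific = prolific_publishers(BIB, 1)
print(prolific)
```

A query processor could perform the same rewrite itself, turning the nested scan into a group-by.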
39
Examples from XQuery (cont)
Invert the structure of the input document so
that each distinct author element contains a
sequence of book-titles.
FOR $a IN distinct(document("bib.xml")//author)
RETURN ... $a/text() ...
  FOR $b IN document("bib.xml")//book[author = $a]
  RETURN $b/title ...
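The inversion can be sketched in Python as a nested loop: the outer loop ranges over distinct authors, the inner one over the books of each author. The input data and the output element names authors, author, name and title are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>Book A</title><author>Ann</author><author>Bob</author></book>"
    "<book><title>Book B</title><author>Ann</author></book>"
    "</bib>"
)


def invert(bib):
    result = ET.Element("authors")
    # distinct(...//author), keeping first-occurrence order
    names = []
    for a in bib.iter("author"):
        if a.text not in names:
            names.append(a.text)
    for name in names:                      # FOR $a IN distinct(...)
        author = ET.SubElement(result, "author")
        ET.SubElement(author, "name").text = name
        for b in bib.iter("book"):          # FOR $b IN ...//book[author = $a]
            if name in [x.text for x in b.findall("author")]:
                ET.SubElement(author, "title").text = b.findtext("title")
    return result


inverted = invert(BIB)
print(ET.tostring(inverted, encoding="unicode"))
```

Each book title appears once under every one of its authors, so Book A is listed under both Ann and Bob.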