Bottom-up Evaluation of XPath Queries

Learn more at: https://cse.buffalo.edu
1
Bottom-up Evaluation of XPath Queries
  • Stephanie H. Li
  • Zhiping Zou

2
Outline
  • Overview of XPath
  • Motivation
  • Algorithms: bottom-up evaluation
  • Design and implementation

3
Introduction: Overview
  • Overview of XPath
  • XPath is a query language designed for
    addressing nodes of XML documents.
  • Data model
  • Syntax
  • Expressions
      • Location paths
      • Operators
      • Functions
  • Evaluation (context)

4
Data Model
  • An XML document is modeled as a tree of nodes
  • 7 kinds of nodes:
      • Element
      • Attribute
      • Text
      • Namespace
      • Processing-instruction
      • Comment
      • Document (root)

5
Data Model(Example)
  <a>
    <b/>
    <b/>
    <b/>
    <b/>
  </a>

6
Expression
  • XPath uses expressions to select nodes from XML
    documents
  • The main types of expressions are
  • Location Paths, Functions and operators

7
Location Paths
  • Although there are many different kinds of XPath
    expressions, the one that's of primary use in
    Java programs is the location path.
  • Example location path:
    /child::movies/child::movie[position()=5]
  • Each step has the form axis::nodetest[predicate];
    a location path is a sequence of steps
    separated by "/".

8
Location Step
  • A location step has the form axis::nodetest[predicates]
  • The axis chooses the direction to move from the
    context node
  • The node test determines what kinds of nodes will be
    selected along that axis
  • Predicates further filter the node-set.
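The three parts of a location step can be pulled apart mechanically. The following is a minimal Python sketch, not the presenters' Java parser: the names `Step`, `parse_step` and `parse_location_path` are hypothetical, only unabbreviated steps are handled, and it assumes that no "/" appears inside a predicate.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Step(NamedTuple):
    axis: str
    node_test: str
    predicate: Optional[str]  # None when the step has no [predicate]


# axis::nodetest, optionally followed by one [predicate]
STEP_RE = re.compile(r"(?P<axis>[a-z-]+)::(?P<test>[^\[\]]+)(?:\[(?P<pred>.*)\])?")


def parse_step(text: str) -> Step:
    """Split one unabbreviated location step into its three parts."""
    m = STEP_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not an unabbreviated location step: {text!r}")
    return Step(m["axis"], m["test"], m["pred"])


def parse_location_path(path: str) -> Tuple[bool, List[Step]]:
    """Return (is_absolute, steps). Assumes no '/' occurs inside a predicate."""
    absolute = path.startswith("/")
    return absolute, [parse_step(s) for s in path.strip("/").split("/")]


absolute, steps = parse_location_path("/child::movies/child::movie[position()=5]")
for step in steps:
    print(step)
```

Splitting on "/" is only safe because of the stated assumption; a full parser would have to tokenize predicates first.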

9
XPath Axis
  • Axis: the main navigator for an XML document
  • ancestor: nodes along the path to the root
  • ancestor-or-self: same, but including the
    context node
  • child: children of the context node
  • descendant: descendants of the context node
  • descendant-or-self: same, but including the
    context node
  • following: nodes after the context node in
    document order, excluding descendants
  • following-sibling: following siblings of the
    context node
  • parent: the parent of the context node
  • preceding: nodes before the context node in
    document order, excluding ancestors
  • preceding-sibling: preceding siblings of the
    context node
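The axis definitions above translate almost directly into code. This is a rough Python sketch over `xml.etree.ElementTree`, restricted to element nodes; the helper names `build_index` and `axis` are hypothetical and the lists are returned in a convenient order (nearest ancestor first), not as proper XPath node-sets.

```python
import xml.etree.ElementTree as ET


def build_index(root):
    """Document order and a child -> parent map (element nodes only)."""
    order = list(root.iter())
    parent = {child: p for p in order for child in p}
    return order, parent


def axis(name, n, order, parent):
    """Nodes reached from context node n along the named axis."""
    ancestors = []
    m = n
    while m in parent:
        m = parent[m]
        ancestors.append(m)  # nearest ancestor first
    descendants = [d for d in n.iter() if d is not n]
    i = order.index(n)
    if name == "child":
        return list(n)
    if name == "parent":
        return ancestors[:1]
    if name == "ancestor":
        return ancestors
    if name == "ancestor-or-self":
        return [n] + ancestors
    if name == "descendant":
        return descendants
    if name == "descendant-or-self":
        return [n] + descendants
    if name == "following":
        return [m for m in order[i + 1:] if m not in descendants]
    if name == "preceding":
        return [m for m in order[:i] if m not in ancestors]
    if name in ("following-sibling", "preceding-sibling"):
        if n not in parent:
            return []  # the root element has no siblings
        siblings = list(parent[n])
        j = siblings.index(n)
        return siblings[j + 1:] if name == "following-sibling" else siblings[:j]
    raise ValueError(f"unsupported axis: {name}")


root = ET.fromstring('<a><b id="1"/><b id="2"/><b id="3"/><b id="4"/></a>')
order, parent = build_index(root)
second_b = order[2]
print([n.get("id") for n in axis("following-sibling", second_b, order, parent)])
```

The `id` attributes are added only so that the four `b` elements can be told apart in the printed output.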

10
Node Test
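  • Node type test
  • Example (for the document on slide 5, with root r,
    element a, and its children b1, ..., b4):
  • T(root()) = {r}
  • T(element()) = {a, b1, ..., b4}
  • T(element(a)) = {a}
  • T(element(b)) = {b1, ..., b4}
  • Node name test
  • Element node name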
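To make the node-test function T concrete, here is a small Python sketch on the example tree with root r, element a and four b children. The `Node` class, the labels and the restriction to `root()` and `element(...)` tests are assumptions made for illustration, not part of the presented implementation.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    label: str            # name used to refer to the node, e.g. "b1"
    kind: str             # "root" or "element" in this sketch
    name: str = ""        # element name; empty for the root node
    children: List["Node"] = field(default_factory=list)


def all_nodes(n):
    """Yield n and all of its descendants in document order."""
    yield n
    for c in n.children:
        yield from all_nodes(c)


def T(test, root):
    """Labels of all nodes satisfying a node test such as element(b)."""
    m = re.fullmatch(r"(root|element)\((\w*)\)", test)
    if m is None:
        raise ValueError(f"unsupported node test: {test}")
    kind, name = m.groups()
    return {n.label for n in all_nodes(root)
            if n.kind == kind and (name == "" or n.name == name)}


bs = [Node(f"b{i}", "element", "b") for i in range(1, 5)]
r = Node("r", "root", children=[Node("a", "element", "a", bs)])
print(sorted(T("element(b)", r)))
```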

11
Operators and Functions
  • Arithmetic operators
  • Operators for comparisons and boolean logic:
    <, >, <=, >=, =, !=, or, and
  • Functions
  • position()
  • last()

12
XPath Query Evaluation
  • Query evaluation is a major algorithmic problem
  • The main construct is the expression
  • Each expression is evaluated to yield an object
    of one of these four types:
  • Node-set (an unordered collection of nodes
    without duplicates)
  • Boolean (true or false)
  • Number (a floating-point number)
  • String

13
Context
  • All XPath expressions are evaluated with respect to a
    context, which consists of
  • A context node
  • A context position (an integer)
  • A context size (an integer)
  • The input context for query evaluation is chosen
    by the user.

14
Motivation
  • Claim
  • The way XPath is defined in the W3C XPath
    recommendation motivates an inefficient
    (exponential-time) implementation.
  • The paper proposes a more efficient
    (polynomial-time) approach.

15
Basic query evaluation strategy
  • procedure process-location-step(n0, Q)
  • /* n0 is the context node;
    query Q is a list of location steps */
  • begin
  •   node set S := apply Q.first to node n0;
  •   if (Q.tail is not empty) then
  •     for each node n ∈ S do
  •       process-location-step(n, Q.tail);
  • end
  • Worst-case running time, for document size |D|
    and |Q| location steps:
    Time(|Q|) = |D| · Time(|Q| - 1), i.e. |D|^|Q|, when |Q| > 0
    Time(|Q|) = 1 when |Q| = 0

The algorithm recursively evaluates each
remaining step for each matching node of the
current step.
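The blow-up of this strategy is easy to observe by counting recursive calls. The sketch below is an illustrative Python transcription of the procedure, with a hypothetical `Node` class and a call counter added; the document `<a><b/><b/></a>` and the query shape `child::b/parent::a/child::b/...` are chosen as an assumption to trigger the exponential behaviour.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def child_b(n):
    """Location step child::b."""
    return [c for c in n.children if c.name == "b"]


def parent_a(n):
    """Location step parent::a."""
    return [n.parent] if n.parent is not None and n.parent.name == "a" else []


def process_location_step(n0, q, counter):
    """Naive strategy: re-evaluate the remaining steps for every node found."""
    counter[0] += 1
    s = q[0](n0)                      # apply the first step to the context node
    if len(q) > 1:
        for n in s:
            process_location_step(n, q[1:], counter)


def count_calls(k):
    """Calls needed for child::b followed by k repetitions of /parent::a/child::b."""
    a = Node("a")
    Node("b", a)
    Node("b", a)
    query = [child_b] + [parent_a, child_b] * k
    counter = [0]
    process_location_step(a, query, counter)
    return counter[0]


for k in (1, 2, 5, 10):
    print(k, count_calls(k))
```

Each added `/parent::a/child::b` pair roughly doubles the number of calls, even though the document has only three nodes.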
16
XPath Evaluation in PTime
  • Theorem: Let e be an arbitrary XPath expression.
    Then, for context node x, position k, and size n,
    the value of e is v, where v is the unique value
    such that ⟨x, k, n, v⟩ ∈ E↑[[e]].
  • The main principle that the paper proposes to
    obtain an XPath evaluation algorithm with PTime
    complexity is the notion of a context-value
    table (CVT).

17
Context-value table Principle
  • Given an expression e, the CVT of e specifies all
    valid combinations of contexts c = ⟨x, k, n⟩ and
    values v such that e evaluates to v in context c.
  • Such a table for expression e is obtained by
    first computing the CVTs of the direct
    subexpressions of e and then combining them into
    the CVT for e.
  • The size of each of the CVTs has a polynomial
    bound.
  • Each of the combination steps can be carried out in
    PTime.
  • Thus, query evaluation as a whole under this
    principle also has a PTime bound.
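A context-value table can be pictured as a map from contexts to values. The Python sketch below builds such tables for `position()` and `last()` and combines them for a comparison; representing a CVT as a `dict`, the helper names, and the restriction of contexts to 1 <= k <= n <= |dom| are assumptions made for illustration.

```python
import operator


def contexts(dom):
    """All contexts (x, k, n) with x in dom and 1 <= k <= n <= |dom|."""
    size = len(dom)
    return [(x, k, n) for x in dom
            for n in range(1, size + 1)
            for k in range(1, n + 1)]


def cvt_position(dom):
    """CVT of position(): the value is the context position k."""
    return {c: c[1] for c in contexts(dom)}


def cvt_last(dom):
    """CVT of last(): the value is the context size n."""
    return {c: c[2] for c in contexts(dom)}


def combine(op, left, right):
    """CVT of (e1 op e2) from the CVTs of e1 and e2, context by context."""
    return {c: op(left[c], right[c]) for c in left}


dom = ["a", "b1", "b2"]
table = combine(operator.ne, cvt_position(dom), cvt_last(dom))
print(len(table))            # number of rows in the combined table
print(table[("a", 1, 3)])    # position() != last() in context (a, 1, 3)
```

With this context domain a table has |dom| · |dom| · (|dom| + 1) / 2 rows, which is where the polynomial size bound for scalar-valued expressions comes from.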

18
Bottom-up evaluation of XPath
19
Bottom-up evaluation of XPath
  • Algorithm (bottom-up algorithm for XPath)
  • Input: an XPath query Q
  • Output: E↑[[Q]]
  • Method:
  • let Tree(Q) be the parse tree of query Q;
  • R := ∅;
  • for each atomic expression l ∈ leaves(Tree(Q)) do
  •   compute table E↑[[l]] and add it to R;
      (note: we use JDOM to do this)
  • while E↑[[root(Tree(Q))]] ∉ R do
  • begin
  •   take an Op(l1, ..., ln) ∈ nodes(Tree(Q))
      s.t. E↑[[l1]], ..., E↑[[ln]] ∈ R;
  •   compute E↑[[Op(l1, ..., ln)]] using E↑[[l1]], ..., E↑[[ln]];
  •   add E↑[[Op(l1, ..., ln)]] to R;
  • end;
  • return E↑[[root(Tree(Q))]].

By a bottom-up algorithm we mean a method of
processing XPath while traversing the parse tree
of the query from its leaves up to its root.
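The control flow of the algorithm (fill R from the leaves, then repeatedly pick a node whose children are all in R) can be sketched in Python as follows. This is a simplified illustration, not the presenters' Java code: it only covers scalar expressions built from `position()`, `last()` and a few binary operators, and the names `Expr`, `LEAVES`, `OPS` and `bottom_up` are hypothetical. Location steps would additionally need node-set-valued tables.

```python
import operator


class Expr:
    """Parse-tree node: an operator name plus child expressions (leaves have none)."""
    def __init__(self, op, *children):
        self.op = op
        self.children = children


def tree_nodes(e):
    yield e
    for c in e.children:
        yield from tree_nodes(c)


def contexts(dom):
    """All contexts (x, k, n) with 1 <= k <= n <= |dom| (assumed context domain)."""
    return [(x, k, n) for x in dom
            for n in range(1, len(dom) + 1)
            for k in range(1, n + 1)]


LEAVES = {"position()": lambda c: c[1], "last()": lambda c: c[2]}
OPS = {"!=": operator.ne, "=": operator.eq, "+": operator.add}


def bottom_up(query, dom):
    """Return the context-value table of the root of the parse tree."""
    ctxs = contexts(dom)
    nodes = list(tree_nodes(query))
    R = {}                                         # parse-tree node -> its table
    for leaf in (e for e in nodes if not e.children):
        R[leaf] = {c: LEAVES[leaf.op](c) for c in ctxs}
    while query not in R:
        # take an operator node whose operand tables are already in R
        e = next(e for e in nodes
                 if e not in R and all(ch in R for ch in e.children))
        f = OPS[e.op]
        R[e] = {c: f(*(R[ch][c] for ch in e.children)) for c in ctxs}
    return R[query]


q = Expr("!=", Expr("position()"), Expr("last()"))
table = bottom_up(q, ["x", "y"])
print(table[("x", 1, 2)], table[("x", 2, 2)])
```

Every table is computed exactly once, so the work is bounded by the number of parse-tree nodes times the cost of one combination step.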
20
Bottom-up evaluation of XPath
  • Example
  • XML

<?xml version="1.0"?>
<people>
  <person born="1912" died="1954" id="p342">
    <name>Alan Turing</name>
    <!-- Did the word computer scientist exist in Turing's day? -->
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
    <homepage>href="http://www.turing.org.uk/"</homepage>
  </person>
  <person born="1918" died="1988" id="p4567">
    <name>Richard M. Feynman</name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>
21
Example XML Doc Tree
22
Example XPath Query tree
  • Parse tree of the XPath query
  • descendant::profession/following-sibling::*[position() != last()]
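As a cross-check of what this query should return on the example document, the following Python sketch evaluates it by hand with `xml.etree.ElementTree`. It assumes the query reads `descendant::profession/following-sibling::*[position() != last()]`, that the context node is the `people` element, and that only element nodes count (the default parser drops the comment); it is a direct evaluation, not the bottom-up algorithm itself.

```python
import xml.etree.ElementTree as ET

DOC = """<people>
  <person born="1912" died="1954" id="p342">
    <name>Alan Turing</name>
    <!-- Did the word computer scientist exist in Turing's day? -->
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
    <homepage>href="http://www.turing.org.uk/"</homepage>
  </person>
  <person born="1918" died="1988" id="p4567">
    <name>Richard M. Feynman</name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>"""


def evaluate(root):
    """Nodes selected by the example query, in document order."""
    parent = {child: p for p in root.iter() for child in p}
    doc_order = {e: i for i, e in enumerate(root.iter())}
    selected = set()
    for prof in root.iter("profession"):              # descendant::profession
        siblings = list(parent[prof])
        following = siblings[siblings.index(prof) + 1:]   # following-sibling::*
        last = len(following)
        for position, node in enumerate(following, start=1):
            if position != last:                       # [position() != last()]
                selected.add(node)
    return sorted(selected, key=doc_order.get)


for node in evaluate(ET.fromstring(DOC)):
    print(node.tag, node.text)
```

Collecting the results in a set mirrors the node-set semantics: the cryptographer element is reached from two different profession nodes but is reported once.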

23
Example Evaluate subexpressions
24
Example Evaluate subexpressions
25
Example Evaluate subexpressions
26
Design and Implementation
  • Environment
  • Java, JDK 1.5.0
  • JDOM 1.0
  • XPath 1.0
  • Features
  • Only element nodes are queried
  • Abbreviated XPath expressions are not supported
  • Location steps inside predicates are not supported

27
System Structure
  • User input (MyDriver.java): XML file, query, context node
  • Query parser (Parser.java, BinaryTree.java, Node.java),
    producing the query tree
  • JDOM XML parser (org.jdom.input.SAXBuilder),
    producing the XML document tree
  • Evaluator (QueryEval.java)
  • Context-value tables (ContextValTable.java and
    others)
  • Result for the full XPath query
28
Conclusion
  • An XPath query evaluation algorithm that runs in
    polynomial time with respect to the size of both
    the data and the query (linear in the size of
    queries and quadratic in the size of data)
  • No optimization; the implementation strictly
    adheres to the specification given in the paper

29
References
  • G. Gottlob, C. Koch, and R. Pichler. "XPath
    Processing in a Nutshell". In Proceedings of the
    19th IEEE International Conference on Data
    Engineering (ICDE'03), Bangalore, India, Mar.
    2003.
  • G. Gottlob, C. Koch, and R. Pichler. "Efficient
    Algorithms for Processing XPath Queries". In
    Proceedings of the 28th International Conference
    on Very Large Data Bases (VLDB'02), Hong Kong,
    China, Aug. 2002.
  • G. Gottlob, C. Koch, and R. Pichler. "XPath Query
    Evaluation: Improving Time and Space Efficiency".
    In Proceedings of the 19th IEEE International
    Conference on Data Engineering (ICDE'03),
    Bangalore, India, Mar. 2003.
  • http://www.ibiblio.org/xml/books/xmljava/chapters/ch16.html