Title: An Introduction to XPath
1. An Introduction to XPath
Nick Davies
Tuesday 20th January 2004
2. XPath Introduction
- XPath is a W3C Recommendation (1999).
- According to the W3C, it is a language for
  addressing parts of an XML document.
- It can be thought of as a query language, like
  SQL; however, it operates on XML documents, not
  databases, by specifying path expressions in
  order to identify nodes in the document.
- XPath models an XML document as a tree of nodes.
  Its data model is similar to, but distinct from,
  the Document Object Model (DOM); document order
  is maintained in the tree.
- XPath was primarily intended as a component
  that can be used by other specifications
  (such as XPointer and XSLT).
3. XPath Syntax
XPath uses path expressions to locate nodes within
XML documents. Here is a simple XML document
representing a DVD collection:

<?xml version="1.0" encoding="ISO-8859-1"?>
<collection>
  <dvd title="Fawlty Towers">
    <genre>comedy</genre>
    <year>1975</year>
    <cert>PG</cert>
  </dvd>
  <dvd title="Quadrophenia">
    <genre>cult</genre>
    <year>1979</year>
    <length>114</length>
  </dvd>
  <dvd title="Goldfinger">
    <year>1964</year>
    <cert>PG</cert>
    <length>105</length>
  </dvd>
</collection>
4. XPath Syntax
We can now describe the XPath syntax with
examples from the previous XML document.
Locating nodes: /collection/dvd/cert
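As a rough illustration, the path above can be tried with Python's standard-library ElementTree module, which supports only a limited subset of XPath. This is a sketch under assumptions: the variable names are illustrative, the XML declaration is omitted for brevity, and because ElementTree evaluates paths relative to an element, the absolute path /collection/dvd/cert is written as dvd/cert from the collection element.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""

root = ET.fromstring(DVD_XML)  # root is the <collection> element

# Equivalent of /collection/dvd/cert, evaluated from <collection>.
certs = [cert.text for cert in root.findall("dvd/cert")]
print(certs)  # ['PG', 'PG']
```

Only two cert elements are returned because Quadrophenia has none.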
5. XPath Syntax
Selecting unknown elements using wildcards (*):
/collection/dvd/*
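A minimal sketch of the wildcard step, again assuming Python's standard-library ElementTree (a limited XPath subset) and the same sample document. The path dvd/* is evaluated relative to the collection element and selects every child element of every dvd, whatever its name.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""

root = ET.fromstring(DVD_XML)

# Equivalent of /collection/dvd/*: all element children of each dvd,
# returned in document order.
tags = [child.tag for child in root.findall("dvd/*")]
print(tags)
```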
6. XPath Syntax
Selecting branches using square brackets:
/collection/dvd[1]
/collection/dvd[last()]
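The positional forms can be sketched with Python's standard-library ElementTree, which accepts a 1-based index and last() in square brackets. The variable names are illustrative, and paths are written relative to the collection element.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""

root = ET.fromstring(DVD_XML)

first = root.find("dvd[1]")      # first dvd child (XPath positions start at 1)
last = root.find("dvd[last()]")  # last dvd child

print(first.get("title"))  # Fawlty Towers
print(last.get("title"))   # Goldfinger
```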
7. XPath Syntax
Selecting several paths using the | operator:
/collection/dvd/genre | /collection/dvd/cert
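Python's standard-library ElementTree does not implement the union operator, so the sketch below emulates it rather than evaluating it. The helper name union is hypothetical: it merges the matches of several relative paths, removes duplicates, and returns the result in document order, which is how an XPath node-set union is usually presented.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""


def union(context, *paths):
    """Emulate path1 | path2 | ...: merge matches and keep document order."""
    wanted = set()
    for path in paths:
        wanted.update(context.findall(path))
    # iter() walks the tree in document order, so the result is ordered too.
    return [node for node in context.iter() if node in wanted]


root = ET.fromstring(DVD_XML)
selected = union(root, "dvd/genre", "dvd/cert")
print([(node.tag, node.text) for node in selected])
```

A full XPath engine would evaluate the | expression directly; the emulation is only needed because of ElementTree's restricted syntax.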
8. XPath Syntax
Selecting attributes using the @ prefix:
//dvd[@title='Fawlty Towers']
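A short sketch of the attribute test using Python's standard-library ElementTree. The leading dot is an ElementTree requirement (paths are relative to the element searched from); otherwise the expression matches the one on the slide.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""

root = ET.fromstring(DVD_XML)

# Equivalent of //dvd[@title='Fawlty Towers'].
matches = root.findall(".//dvd[@title='Fawlty Towers']")
for dvd in matches:
    print(dvd.get("title"), dvd.findtext("genre"), dvd.findtext("year"))
```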
9. XPath Location Paths
- One of the most important kinds of XPath
  expression is the location path.
- A location path expression results in a node-set.
- Each location path consists of one or more
  location steps. Each location step has
  - an axis
  - a node test
  - zero or more predicates (optional)
- Location Step Syntax and Example
  - axisname::nodetest[predicate]
  - child::dvd[genre='comedy']
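To make the three parts of a location step concrete, here is a small sketch that splits a step string into its axis, node test, and predicates. The function name parse_step and the regular expression are assumptions made for illustration; this is not a full XPath parser (nested brackets inside predicates, for example, are not handled). An omitted axis defaults to child, as in XPath's abbreviated syntax.

```python
import re

# axis:: (optional), then the node test, then zero or more [predicate] groups.
STEP_RE = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?P<preds>(?:\[[^\]]*\])*)$"
)


def parse_step(step):
    """Split one location step into (axis, node_test, [predicates])."""
    match = STEP_RE.match(step)
    if match is None:
        raise ValueError(f"not a recognisable location step: {step!r}")
    axis = match.group("axis") or "child"  # child is the default axis
    predicates = re.findall(r"\[([^\]]*)\]", match.group("preds"))
    return axis, match.group("test"), predicates


print(parse_step("child::dvd[genre='comedy']"))
# ('child', 'dvd', ["genre='comedy'"])
print(parse_step("dvd"))
# ('child', 'dvd', [])
```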
10. XPath Location Paths: Axes
There are 13 axes in XPath, each of which
selects a different subset of the nodes in the
document, depending on the context node. They are:
ancestor, ancestor-or-self, attribute, child,
descendant, descendant-or-self, following,
following-sibling, namespace, parent, preceding,
preceding-sibling, self.
[Figure: two tree diagrams (nodes: collection, dvd, year, cert, time) illustrating:
- the child axis with context node collection
- the preceding-sibling axis with context node time]
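The two axes illustrated above can be sketched by hand in Python. ElementTree elements do not know their parent, so the sketch builds a parent lookup first. The function names are hypothetical, and the sample document's length element stands in for the time node of the figure.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""

root = ET.fromstring(DVD_XML)

# Map every element to its parent so sibling axes can be computed.
parent_of = {child: parent for parent in root.iter() for child in parent}


def child_axis(node):
    """child:: axis: the element children of the context node."""
    return list(node)


def preceding_sibling_axis(node):
    """preceding-sibling:: axis: earlier children of the same parent."""
    parent = parent_of.get(node)
    if parent is None:  # the top element has no siblings
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]


length = root.find("dvd[@title='Goldfinger']/length")
print([n.tag for n in child_axis(root)])                # ['dvd', 'dvd', 'dvd']
print([n.tag for n in preceding_sibling_axis(length)])  # ['year', 'cert']
```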
11. XPath Location Paths: Node Tests
- Node Test
  - Determines what kinds of nodes are selected
    along a given axis. The node test is applied
    to each node on the axis.
  - If the test succeeds the node is kept; if not,
    the node is discarded.
- There are 7 types of node
  - root nodes
  - element nodes
  - text nodes
  - attribute nodes
  - namespace nodes
  - processing instruction nodes
  - comment nodes
- Example Location Paths
  - child::foo - is there a foo child element?
  - parent::text() - is the parent a text node?
  - descendant::comment() - is there a comment
    node below?
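A node test can be pictured as a filter on node type applied to every node an axis produces. The sketch below uses Python's minidom purely as a convenient stand-in for XPath's node types; the helper names descendants and node_test are hypothetical, and the small document with a comment in it is made up for the example.

```python
from xml.dom import Node, minidom

# A made-up one-line document containing a comment, an element and text.
DOC = minidom.parseString(
    "<collection><!-- my DVDs -->"
    "<dvd title='Goldfinger'><year>1964</year></dvd></collection>"
)


def descendants(node):
    """descendant:: axis: every node below `node`, in document order."""
    for child in node.childNodes:
        yield child
        yield from descendants(child)


def node_test(nodes, kind):
    """Keep the nodes whose type passes the test; discard the rest."""
    return [n for n in nodes if n.nodeType == kind]


top = DOC.documentElement
comments = node_test(descendants(top), Node.COMMENT_NODE)  # descendant::comment()
texts = node_test(descendants(top), Node.TEXT_NODE)        # descendant::text()
elements = node_test(descendants(top), Node.ELEMENT_NODE)  # descendant::*

print([c.data for c in comments])
print([t.data for t in texts])
print([e.tagName for e in elements])
```

Attributes do not appear in childNodes, which matches XPath's rule that attribute nodes are reached through the attribute axis rather than the child or descendant axes.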
12. XPath Location Paths: Predicates
- Predicates
  - The predicate is a further test to retain or
    eliminate nodes. It filters the node-set into a
    new node-set. Each node is evaluated in turn:
    if the predicate is true for a given node, it is
    kept in the new node-set; if false, it is removed.
  - Consists of a well-formed expression made up of
    - boolean operators
    - functions
    - numbers, strings
    - comparison operators
- Examples
  - child::dvd[genre='comedy']
  - descendant::dvd[position()=2]
  - descendant::collection[attribute::type='title']
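The filtering behaviour described above can be sketched as a plain Python function that tests each node in turn. The helper apply_predicate is hypothetical: it passes each node and its 1-based position to a Python callable, which plays the role of the predicate expression. The first two examples are emulated against the DVD collection.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""


def apply_predicate(nodes, predicate):
    """Return the nodes for which predicate(node, position) is true."""
    return [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos)]


root = ET.fromstring(DVD_XML)

# child::dvd[genre='comedy'], with <collection> as the context node.
comedies = apply_predicate(
    root.findall("dvd"),
    lambda n, pos: any(g.text == "comedy" for g in n.findall("genre")),
)

# descendant::dvd[position()=2]: the second dvd descendant in document order.
second = apply_predicate(list(root.iter("dvd")), lambda n, pos: pos == 2)

print([d.get("title") for d in comedies])  # ['Fawlty Towers']
print([d.get("title") for d in second])    # ['Quadrophenia']
```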
13. XPath Expressions
- An expression describes a value computed from a
  document, such as a set of nodes. We've already
  seen one kind: the location path.
- Path expressions may be absolute or relative
  - absolute: begin from the root node,
    e.g. /collection/dvd/length
  - relative: begin from the context node,
    e.g. dvd/genre
- XPath supports four types of expressions, each
  with its own operators
  - Numerical: + - * div mod
  - Equality: = !=
  - Relational: < <= > >=
  - Boolean: or and
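The numerical operators div and mod differ slightly from their nearest Python counterparts, which the following sketch illustrates. The function names are hypothetical. XPath numbers are double-precision floats, div is floating-point division, and mod takes the sign of the dividend (a truncating remainder), which corresponds to math.fmod rather than Python's % operator. Division by zero is left out: XPath yields Infinity or NaN, whereas these Python helpers raise an exception.

```python
import math


def xpath_div(a, b):
    """XPath `a div b`: floating-point division (b must be non-zero here)."""
    return float(a) / float(b)


def xpath_mod(a, b):
    """XPath `a mod b`: remainder of a truncating division."""
    return math.fmod(a, b)


print(xpath_div(5, 2))   # 2.5
print(xpath_mod(5, 2))   # 1.0
print(xpath_mod(-5, 2))  # -1.0  (Python's -5 % 2 would give 1)
print(xpath_mod(5, -2))  # 1.0   (Python's 5 % -2 would give -1)
```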
14. XPath Functions
XPath defines a number of useful functions for
converting and translating data. Some take
arguments, some don't. Those that don't generally
operate on the context node. The XPath 1.0 core
library defines 27 functions. Here are only a few
examples:
- Node-Set Functions
  - count() - returns the number of nodes in a
    node-set
- String Functions
  - concat() - returns the concatenation of all its
    arguments
  - contains() - returns true if the second of two
    given strings is contained in the first
- Number Functions
  - sum() - returns the total value of a set of
    numeric values in a node-set
- Boolean Functions
  - not() - returns true if the argument is false,
    and false if the argument is true
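Python's standard-library ElementTree cannot evaluate XPath functions, so the sketch below emulates the five functions listed above with ordinary Python over the DVD collection. The XPath expression each line is meant to mirror is given in a comment; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The DVD collection from the earlier slide (XML declaration omitted).
DVD_XML = """<collection>
  <dvd title="Fawlty Towers"><genre>comedy</genre><year>1975</year><cert>PG</cert></dvd>
  <dvd title="Quadrophenia"><genre>cult</genre><year>1979</year><length>114</length></dvd>
  <dvd title="Goldfinger"><year>1964</year><cert>PG</cert><length>105</length></dvd>
</collection>"""

root = ET.fromstring(DVD_XML)
dvds = root.findall("dvd")
first = dvds[0]

# count(/collection/dvd)
dvd_count = len(dvds)

# concat(@title, ' (', year, ')') with the first dvd as context node
label = "".join([first.get("title"), " (", first.findtext("year"), ")"])

# contains(@title, 'Towers') with the first dvd as context node
has_towers = "Towers" in first.get("title")

# sum(/collection/dvd/length)
total_length = sum(float(n.text) for n in root.findall("dvd/length"))

# /collection/dvd[not(cert)]: dvds that have no cert child
uncertified = [d.get("title") for d in dvds if not d.findall("cert")]

print(dvd_count, label, has_towers, total_length, uncertified)
```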
15. XPath Summary
- To sum up
  - XPath is a straightforward declarative language
    for selecting particular subsets of nodes from an
    XML document.
  - XPath has a syntax for the selection of nodes,
    unknown elements, branches, multiple paths and
    attributes.
  - XPath queries are written as expressions, which
    can return node-sets, booleans, numbers (doubles)
    or strings and allow for arithmetic and
    relational operations on data types.
  - Location paths are an important subset of
    expressions; they consist of one or more location
    steps, each of which consists of an axis, a node
    test, and (optionally) one or more predicates.
  - XPath also provides a core set of functions for
    operating on the data types.
16. Thanks For Your Time. Any Questions?