Title: XPath
XPath
- identifies parts of an XML document
- XPath is not written in XML
- resembles (Unix) directory and filename path structures
- also represents simple data structures
  - Numbers
  - Booleans
  - Strings
XPath node types
- root node
  - denoted /
  - absolute XPath expressions derive from /
  - NOT the same as the document root element
- element nodes
  - one for each XML element in the document
  - nesting gives the usual parent-child tree structure of elements
- text nodes
  - character data (text) content of an element
  - entity references are resolved by the parser before the text node is created
- attribute nodes
  - element nodes are parents of their attribute nodes
  - attribute nodes are NOT children of their parents!
- comment nodes
  - one for each comment in the XML document
- processing instruction nodes
  - one for each processing instruction
- namespace nodes
  - attached to each element node, one for every namespace in scope on that element
- not included as node types
  - CDATA sections
    - simply merged into text nodes
  - entity references
    - e.g. &lt;
    - simply merged in as literal strings
  - Document Type Declarations
    - default attributes defined by DTDs merged into attribute nodes by the parser
<?xml version="1.0"?>
<?xml-stylesheet type="application/xml" href="people.xsl"?>
<!DOCTYPE people [
  <!ELEMENT person (name, profession, homepage?)>
  <!ATTLIST person born CDATA #IMPLIED>
  <!ATTLIST person died CDATA #IMPLIED>
  ...
]>
<people>
  <person born="1912" died="1954">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <!-- Did the word computer scientist exist in Turing's day? -->
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
    <homepage xlink:href="http://www.Turing.org.uk/"/>
  </person>
  <person born="1918" died="1988">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>&#x50;</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>playing the bongoes</hobby>
  </person>
</people>
XPath Tree Structure
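
The tree structure of the sample document can be approximated in text. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a trimmed copy of the sample document (second person only); the helper name `outline` is hypothetical. It lists only element nodes and their attributes, so the XPath root node above people, the text nodes, and the comment node are not shown.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document (second person only).
PEOPLE_XML = """
<people>
  <person born="1918" died="1988">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>P</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>playing the bongoes</hobby>
  </person>
</people>
"""


def outline(element, depth=0):
    """Return one indented line per element; attributes are shown with @."""
    attrs = "".join(f" @{name}={value}" for name, value in element.attrib.items())
    lines = ["  " * depth + element.tag + attrs]
    for child in element:  # children come back in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(PEOPLE_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one parent-child step in the tree.
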
location paths
- identifies a set of nodes in the document
  - may be empty
  - may be of different types
- constructed from location steps
  - each step is evaluated with respect to the current context node
- root location path
  - /
child element location steps
- single element name
  - identifies all child elements of the current context node with that name
- example
  - profession
    - identifies 3 nodes if the context is the first person node (Turing)
    - identifies 1 node if the context is the second person node (Feynman)
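
The effect of the context node on the profession step can be checked with a short script. This is a sketch only, assuming Python's standard xml.etree.ElementTree module (which implements a subset of XPath) and a trimmed copy of the sample document; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document (homepage and comment omitted).
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
    <profession>physicist</profession>
    <hobby>playing the bongoes</hobby>
  </person>
</people>
"""

root = ET.fromstring(PEOPLE_XML)
turing, feynman = root.findall("person")

# The same step, "profession", evaluated against two different context nodes.
turing_professions = turing.findall("profession")
feynman_professions = feynman.findall("profession")
print(len(turing_professions), len(feynman_professions))
```
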
the context node
- the node that is currently matched when the XPath expression is evaluated
<xsl:template match="person">
  <p>
    <xsl:value-of select="name"/>
  </p>
</xsl:template>
attribute location steps
- syntax
  - @attribute_name
<xsl:template match="person">
  <tr>
    <td><xsl:value-of select="name"/></td>
    <td><xsl:value-of select="@born"/></td>
    <td><xsl:value-of select="@died"/></td>
  </tr>
</xsl:template>
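
The table-row template can be mimicked outside XSLT. The sketch below assumes Python's standard xml.etree.ElementTree module, whose path language cannot return attribute nodes, so `Element.get()` stands in for the @born and @died steps. For readability it reports only last_name instead of the full string value of name; that substitution and the variable names are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
  </person>
</people>
"""

root = ET.fromstring(PEOPLE_XML)

# One tuple per person, playing the role of one <tr> in the template.
rows = [
    (
        person.findtext("name/last_name"),  # stands in for select="name"
        person.get("born"),                 # stands in for select="@born"
        person.get("died"),                 # stands in for select="@died"
    )
    for person in root.findall("person")
]
for row in rows:
    print(row)
```
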
other location steps
- comment()
  - identifies all children that are comment nodes
- text()
  - identifies all children that are text nodes
- processing-instruction()
  - identifies all children that are processing instruction nodes
  - parameters can be used to specify precisely which nodes are identified
- namespace nodes have no location step of this kind (XPath 1.0 selects them only through the namespace axis)
examples
<xsl:template match="comment()">
  <i>Comment deleted</i>
</xsl:template>
<xsl:template match="processing-instruction('xml-stylesheet')">
  <h1>a stylesheet is attached</h1>
</xsl:template>
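
The comment() step can also be imitated outside XSLT. This is a rough sketch, assuming Python 3.8 or later and the standard xml.etree.ElementTree module. ElementTree's path language has no comment() test, so the code filters children by tag instead; the default parser discards comments, which is why a TreeBuilder with `insert_comments=True` is used. Processing instructions are not covered here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first person element, including its comment.
PERSON_XML = """<person born="1912" died="1954">
  <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
  <!-- Did the word computer scientist exist in Turing's day? -->
  <profession>computer scientist</profession>
</person>"""

# Keep comments in the tree instead of silently dropping them.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
person = ET.fromstring(PERSON_XML, parser=parser)

# Rough equivalent of the step comment() with person as the context node.
comments = [child for child in person if child.tag is ET.Comment]
comment_texts = [comment.text.strip() for comment in comments]
print(comment_texts)

# Rough equivalent of a template that drops every matched comment.
for comment in comments:
    person.remove(comment)
remaining_tags = [child.tag for child in person]
print(remaining_tags)
```
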
wildcards
- match different element and node types at the same time
- *
  - identifies all elements in context
- node()
  - identifies all nodes in context
- @*
  - identifies all attributes in context
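
Two of the wildcards can be tried with a small script. The sketch assumes Python's standard xml.etree.ElementTree module, which supports the element wildcard in paths but exposes attributes as a dictionary, so `Element.attrib` stands in for the attribute wildcard. The node() test has no counterpart in this module and is left out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the second person element.
PERSON_XML = """<person born="1918" died="1988">
  <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
  <profession>physicist</profession>
  <hobby>playing the bongoes</hobby>
</person>"""

person = ET.fromstring(PERSON_XML)

# "*" selects every child element of the context node, whatever its name.
child_element_names = [child.tag for child in person.findall("*")]

# Stand-in for "@*": every attribute of the context node.
attributes = dict(person.attrib)

print(child_element_names)
print(attributes)
```
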
multiple matches
- use | to separate multiple location steps
  - identifies all nodes in context matching at least one of the steps
- examples
  - profession|hobby
  - first_name|middle_initial|last_name
  - @id|@xlink:type
  - *|@*
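
The union of the profession and hobby steps can be emulated in a few lines. Python's standard xml.etree.ElementTree module does not implement the union operator, so this sketch filters child elements against a set of names instead; the names `WANTED` and `matches` are hypothetical, and the document is a trimmed copy of the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
    <profession>physicist</profession>
    <hobby>playing the bongoes</hobby>
  </person>
</people>
"""

WANTED = {"profession", "hobby"}  # the alternatives of the union

root = ET.fromstring(PEOPLE_XML)
matches = {}
for person in root.findall("person"):
    last_name = person.findtext("name/last_name")
    # Keep each child that matches at least one alternative, in document order.
    matches[last_name] = [child.tag for child in person if child.tag in WANTED]

print(matches)
```
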
compound location paths
- combine location steps with / to form compound location paths
  - each step is relative to the last
- / identifies the root path
  - absolute paths start with /
  - relative paths start from the context node
- /people/person/name/first_name
- person/@born
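
Both example paths can be tried against the sample data. This is a sketch assuming Python's standard xml.etree.ElementTree module, where paths are evaluated relative to an element: with the people element as context, the relative path person/name/first_name reaches the same nodes as the absolute path in the example. Attribute values are read with `Element.get()` because this module cannot select attribute nodes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
  </person>
</people>
"""

people = ET.fromstring(PEOPLE_XML)  # the people element is the context node

# Three child steps chained with "/"; each step starts from the previous result.
first_names = [e.text for e in people.findall("person/name/first_name")]

# Stand-in for person/@born: step to each person, then read the attribute.
birth_years = [person.get("born") for person in people.findall("person")]

print(first_names, birth_years)
```
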
selecting from descendants
- // selects from the context node and all its descendants
- examples
  - //name
  - /people/person//name
  - //@id
  - person//@id
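
Descendant selection can be tried as follows. The sketch assumes Python's standard xml.etree.ElementTree module, where a path on an element must be relative, so .//name plays the role of //name with the people element as the starting point. The sample is a trimmed copy of the document, and the attribute examples are not reproduced because this module cannot select attribute nodes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
    <profession>mathematician</profession>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
    <profession>physicist</profession>
  </person>
</people>
"""

people = ET.fromstring(PEOPLE_XML)

# All name elements at any depth below the people element.
all_names = people.findall(".//name")

# A descendant step in the middle of a path: last_name anywhere below each person.
last_names = [e.text for e in people.findall("person//last_name")]

print(len(all_names), last_names)
```
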
selecting the parent element
- .. identifies the parent of the current node
- examples
  - //@id/..
    - identifies all elements in the document with an attribute named id
  - //middle_initial/../first_name
    - identifies all first_name elements that are siblings of middle_initial elements
    - <first_name>Richard</first_name>
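
The sibling example can be reproduced with a short script. This is a sketch assuming Python's standard xml.etree.ElementTree module, which supports the parent step inside a path. Because that module cannot select attribute nodes, the first example is approximated with an attribute predicate on the born attribute, an assumption made here since the trimmed sample carries no id attributes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
  </person>
  <person born="1918" died="1988">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>P</middle_initial>
      <last_name>Feynman</last_name>
    </name>
  </person>
</people>
"""

people = ET.fromstring(PEOPLE_XML)

# Down to middle_initial, up to its parent, then down to the sibling first_name.
siblings = [e.text for e in people.findall(".//middle_initial/../first_name")]

# Approximation of //@born/..: every element that carries a born attribute.
with_born = [e.tag for e in people.findall(".//*[@born]")]

print(siblings, with_born)
```
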
selecting the context node
- . indicates the context node
- example
<xsl:template match="comment()">
  <span class="comment">
    <xsl:value-of select="."/>
  </span>
</xsl:template>
predicates
- Boolean expressions used to provide more discrimination in identifying nodes
- may attach a predicate to each location step
- syntax
  - locationStep[predicate]
- relational operators
  - <, <=, =, >, >=, !=
- Boolean operators
  - and, or, not()
predicate examples
- //profession[.="physicist"]
  - . here means the string content of the profession element
- //person[profession='physicist']
  - single quotes can be used to avoid parse errors
- //person[@id="p4567"]
- <xsl:apply-templates select="//person[@born&lt;1976]"/>
  - <, > etc. have to be escaped in well-formed XML documents
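
The equality predicates can be tried directly. This is a sketch assuming Python 3.7 or later and the standard xml.etree.ElementTree module, which supports equality tests on child text, on the context node, and on attributes. It has no relational operators, so the numeric comparison on born is done in Python; the id example is replaced by a test on born because the trimmed sample has no id attributes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
    <profession>mathematician</profession>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
    <profession>physicist</profession>
  </person>
</people>
"""

people = ET.fromstring(PEOPLE_XML)

# person elements that have a profession child with the text "physicist".
physicists = people.findall("person[profession='physicist']")

# profession elements whose own string content is "physicist".
physicist_professions = people.findall(".//profession[.='physicist']")

# Attribute equality test.
born_1912 = people.findall("person[@born='1912']")

# Stand-in for the @born < 1976 predicate: filter in Python.
born_before_1976 = [
    person for person in people.findall("person[@born]")
    if int(person.get("born")) < 1976
]

print(physicists[0].findtext("name/last_name"), len(born_before_1976))
```
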
general XPath expressions
- Numbers
  - 3.141, 22, 3 - 4 17 div 6 mod 2
- Booleans
  - true(), false(), 32.5 < 76.2
- Strings
  - "this is a string value"
  - '<so is this />'
XPath functions
- return Booleans, Numbers, Strings or Node-sets
  - no void functions
- example
<xsl:template match="person">
  Person<xsl:value-of select="position()"/>
  <xsl:value-of select="name"/>
</xsl:template>
function examples
<xsl:template match="person">
  Person<xsl:value-of select="position()"/>
  <xsl:value-of select="name"/>
</xsl:template>
- produces a list: Person1, name1, Person2, name2, from name elements
<xsl:apply-templates select="name[starts-with(last_name, 'T')]"/>
- applies a template to all name elements whose last_name element has string content beginning with the character T
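
The behaviour of position() and starts-with() can be imitated in plain code. Python's standard xml.etree.ElementTree module does not implement XPath functions, so this sketch substitutes `enumerate` (starting at 1, since position() is 1-based) and `str.startswith`. The output format and variable names are assumptions of this example, and the document is a trimmed copy of the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document.
PEOPLE_XML = """
<people>
  <person born="1912" died="1954">
    <name><first_name>Alan</first_name><last_name>Turing</last_name></name>
  </person>
  <person born="1918" died="1988">
    <name><first_name>Richard</first_name><last_name>Feynman</last_name></name>
  </person>
</people>
"""

people = ET.fromstring(PEOPLE_XML)

# Stand-in for position(): number the person elements from 1 in document order.
listing = [
    f"Person{position} {person.findtext('name/last_name')}"
    for position, person in enumerate(people.findall("person"), start=1)
]

# Stand-in for name[starts-with(last_name, 'T')].
names_starting_with_t = [
    name for name in people.findall("person/name")
    if name.findtext("last_name", "").startswith("T")
]

print(listing)
print([name.findtext("first_name") for name in names_starting_with_t])
```
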
Schedule Change
- Week commencing 15th October
  - Weds 17th October BC6 1100-1300
    - Double Lecture, different venue
  - no change to Monday lecture
  - No Friday Lab
- Week commencing 22nd October
  - No lectures
  - Thursday Lab only available
- Week commencing 29th October
  - Weds 31st October BC6 1100-1300
    - Double Lecture, different venue
  - no Monday lecture