Introduction to XSLT

1
Introduction to XSLT
Evan Lenz, XML/XSLT Consultant
http://xmlportfolio.com
evan@evanlenz.net
August 2, 2005
O'Reilly Open Source Convention, August 1 - 5, 2005
2
Who is this guy?
  • Evan Lenz
  • Majored in music
  • Over 5 years ago, read Michael Kay's XSLT
    Programmer's Reference cover-to-cover while
    sitting by his newborn son's hospital bed
  • Participated on the XSL Working Group for a
    couple years
  • Wrote XSLT 1.0 Pocket Reference (due out this
    month)
  • Preparing for entrance to a Ph.D. program in
    Digital Arts and Experimental Media

3
Why does he like XSLT?
  • XSLT is
  • Powerful
  • Small
  • Beautiful
  • In high demand
  • Fun to learn
  • Fun to teach

4
What should I expect this afternoon?
  • Fasten your seatbelts
  • A variety of interactive exercises and
    traditional presentation
  • Feel free to feel overwhelmed
  • You're learning more than you think!
  • Try your best while you're here and it will be
    time well spent
  • Have fun!

5
What's with the handouts?
  • The big handout is a late-stage draft of XSLT 1.0
    Pocket Reference, due out this month
  • If you would like a complimentary copy of the
    final book, put your name and mailing address on
    the sign-up sheet
  • The smaller handout contains exercises that we
    will be using today

6
XSLT from 30,000 feet
  • High-level overview

7
What is XSLT?
  • XSL Transformations
  • A language for transforming XML documents into
    other XML documents
  • W3C Recommendation
  • http://www.w3.org/TR/xslt
  • Version 1.0: 1999-11-16

8
OK, then what is XSL?
  • Extensible Stylesheet Language
  • A language for expressing stylesheets
  • W3C Recommendation
  • http://www.w3.org/TR/xsl
  • Version 1.0: 2001-10-15
  • Has 2 parts
  • XSLT
  • Refactored out of XSL so that it could proceed
    independently
  • XSL-FO
  • Formatting Objects

9
What is XPath?
  • XML Path Language
  • A language for addressing parts of an XML
    document
  • W3C Recommendation
  • http://www.w3.org/TR/xpath
  • Version 1.0: 1999-11-16
  • Released on the same day as XSLT 1.0
  • The expression language used in XSLT

10
A relationship of subsets
  • XPath is part of XSLT
  • XSLT is part of XSL
  • Today we are concerned only with the inner two
    circles
  • XSLT and XPath
  • XSL, a.k.a. XSL-FO, is out of scope for today

11
What is XSLT used for?
  • Common applications
  • Stylesheets for converting XML to HTML
  • Generating Web pages or whole websites
  • DocBook -> HTML
  • Transformations from one document type to another
  • *ML to *ML: as many potential applications as
    there are XML document types
  • RSS, SVG, UBL, LegalXML, HR-XML, XBRL
  • Office applications
  • SpreadsheetML, WordML, Keynote XML, OOo XML,
    PowerPoint (in next version), Access XML, etc.
  • Extracting data from documents
  • Modifying or fixing up documents

12
Where is XSLT used?
  • Every platform
  • Windows, Linux, Mac, UNIX, Java
  • Many browsers support XSLT natively
  • Firefox/Mozilla, Internet Explorer, Safari
  • Many frameworks use or support XSLT
  • .NET, Java, LAMP
  • PHP5 now uses libxslt
  • Cocoon, 4Suite, Amazon web services, Google
    appliance, Cisco routers, etc., etc.
  • XSLT IS EVERYWHERE!!

13
Interoperable implementations?
  • In terms of interoperability, XSLT is unmatched
    among languages having multiple implementations
  • Java
  • Saxon: http://saxon.sf.net (open-source)
  • Xalan-J: http://xml.apache.org/xalan-j/
    (open-source)
  • Windows
  • MSXML: fast, fully conformant
  • Python
  • 4xslt: http://www.4suite.org (open-source)
  • C
  • libxslt: http://xmlsoft.org (open-source; used
    in Firefox, Safari, PHP5, etc.)
  • Xalan-C: http://xml.apache.org/xalan-c/
    (open-source)

14
Enough already, let's see some code!
15
Example XML file
  • INPUT: names.xml

<people>
  <person>
    <givenName>Joe</givenName>
    <familyName>Johnson</familyName>
  </person>
  <person>
    <givenName>Jane</givenName>
    <familyName>Johnson</familyName>
  </person>
  <person>
    <givenName>Jim</givenName>
    <familyName>Johannson</familyName>
  </person>
  <person>
    <givenName>Jody</givenName>
    <familyName>Johannson</familyName>
  </person>
</people>

16
A very simple stylesheet, names.xsl
17
OUTPUT: the result of the transformation
saxon names.xml names.xsl >names.html
18
Or we could open the XML directly in the browser
  • Oops, we must first add a processing instruction
    (PI) to the top, like this

<?xml-stylesheet type="text/xsl" href="names.xsl"?>
<people>
  <!-- ... -->
</people>
19
That's better. Displays as HTML but viewing
source shows it's just XML.
20
One more example for now
  • INPUT: article.xml

<?xml-stylesheet type="text/xsl" href="article.xsl"?>
<article>
  <heading>This is a short article</heading>
  <para>This is the <emphasis>first</emphasis> paragraph.</para>
  <para>This is the <strong>second</strong> paragraph.</para>
</article>
21
A rule-oriented stylesheet
  • article.xsl

22
A rule-oriented stylesheet, cont.
  • article.xsl, cont.

23
OUTPUT: article.xml transformed to HTML
24
See a pattern here?
25
XPath in a nutshell
26
How XPath fits in XSLT
  • XPath expressions appear in attribute values,
    e.g.
  • <xsl:for-each select="/people/person"/>
  • <xsl:value-of select="givenName"/>
  • <xsl:apply-templates select="/article/para"/>
  • What these mean
  • /people/person
  • Select all person child elements of all people
    child elements of the root node
  • givenName
  • Select all givenName child elements of the
    context node
  • /article/para
  • Select all para child elements of all article
    child elements of the root node
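
As a rough illustration of what the first two select expressions pick out, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which implements only a small subset of XPath. It is not XSLT; the loop merely mimics xsl:for-each followed by xsl:value-of. The sample data is a shortened copy of names.xml, and the function name given_names is made up for this sketch. Paths are written relative to the document element because ElementTree does not evaluate absolute paths on an element.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for names.xml (only two of the four people).
NAMES_XML = """<people>
  <person><givenName>Joe</givenName><familyName>Johnson</familyName></person>
  <person><givenName>Jane</givenName><familyName>Johnson</familyName></person>
</people>"""


def given_names(xml_text):
    """Collect the givenName of every person, in document order."""
    people = ET.fromstring(xml_text)  # the <people> document element
    names = []
    # Roughly: <xsl:for-each select="/people/person">
    for person in people.findall("person"):
        # Roughly: <xsl:value-of select="givenName"/>, with person as context node
        names.append(person.findtext("givenName"))
    return names


print(given_names(NAMES_XML))
```

Inside the loop, each person element plays the role of the context node, which is why the relative path givenName is enough.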

27
The skinny on XPath
  • XPath is an expression language
  • The only thing you can do with XPath is write
    expressions
  • When we say expression, we mean XPath
    expression
  • Every expression returns a value
  • XPath 1.0 has just four data types
  • Node-set (the most important)
  • String
  • Number
  • Boolean
  • All expressions are evaluated in a context
  • Understanding context is crucial to understanding
    XPath

28
Path expressions
  • Expressions that return node-sets are sometimes
    called path expressions
  • A node-set is
  • An unordered collection of zero or more nodes
  • Every expression is evaluated relative to exactly
    one context node
  • The context node is analogous to the current
    directory in a filesystem
  • On a CLI, dir/* expands to all the files in the
    dir directory inside the current directory
  • As an XPath expression, dir/* would select all
    the element children of all the dir element
    children of the context node

29
A filesystem analogy
  • Addressing files
  • Relative
  • dir/*
  • ../file
  • Absolute
  • /home/elenz/file.txt
  • Addressing XML nodes
  • Relative
  • body/p
  • ../table
  • Absolute
  • /html/body/p

30
QUIZ 1: You have 5 minutes
  • Ready?
  • Set...

31
Go! Use this cheat sheet
  1. para selects the para element children of the
    context node
  2. * selects all element children of the context
    node
  3. node() selects all children of the context node
  4. @name selects the name attribute of the context
    node
  5. @* selects all the attributes of the context node
  6. para[1] selects the first para child of the
    context node
  7. para[last()] selects the last para child of the
    context node
  8. */para selects all para grandchildren of the
    context node
  9. /doc/chapter[5]/section[2] selects the second
    section of the fifth chapter of the doc
  10. chapter//para selects the para element
    descendants of the chapter element children of
    the context node
  11. //para selects all the para descendants of the
    document root and thus selects all para elements
    in the same document as the context node
  12. . selects the context node
  13. .//para selects the para element descendants of
    the context node
  14. .. selects the parent of the context node
  15. title
  16. ../@lang selects the lang attribute of the parent
    of the context node
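
Several of the cheat-sheet expressions can be tried out with Python's standard xml.etree.ElementTree, which supports a limited subset of abbreviated XPath (no attribute selection, no absolute paths on elements, no node() test). The following is a minimal sketch; the sample document and the helper name texts are invented for illustration, and the doc element serves as the context node.

```python
import xml.etree.ElementTree as ET

# Invented sample document; <doc> is the context node for every path below.
DOC = """<doc>
  <para>one</para>
  <para>two</para>
  <chapter><para>three</para></chapter>
</doc>"""

doc = ET.fromstring(DOC)


def texts(path):
    """Return the text of each element selected by path, relative to <doc>."""
    return [element.text for element in doc.findall(path)]


print(texts("para"))          # para children of the context node
print(texts("para[1]"))       # first para child
print(texts("para[last()]"))  # last para child
print(texts("*/para"))        # para grandchildren
print(texts(".//para"))       # para descendants
print([element.tag for element in doc.findall("*")])  # all element children
```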

32
XPath is all about trees
  • A venture into the abstract world of the XPath
    data model
  • Start filling out the NOTES page

33
The XPath data model
  • An abstraction of an XML document, after parsing
  • In XSLT, models the source tree, stylesheet tree,
    result tree
  • An XML document is a tree of nodes
  • There are 7 kinds of nodes (memorize these!)
  • Root node
  • Element node
  • Attribute node
  • Text node
  • Comment node
  • Processing Instruction (PI) node
  • Namespace node

34
Root nodes
  • Every XML document has exactly one root node
  • An invisible container for the whole document
  • The XPath expression / selects the root node of
    the same document as the context node
  • The root node is not an element
  • Instead, the document element or root element
    is a child of the root node
  • It can also contain
  • Processing instruction (PI) nodes
  • Comment nodes
  • XSLT extension to XPath data model
  • Root node may contain text nodes
  • Root node may contain more than one element node

35
Element nodes
  • There is one element node for each element that
    appears in a document. (Duh.)
  • Example: <foo><bar/></foo>
  • There are two element nodes above: foo and bar.
  • The foo element contains the bar element node.
  • Element nodes can contain
  • Text nodes
  • Other element nodes
  • Comment nodes
  • Processing instruction (PI) nodes

36
Node property: children
  • Applies only to
  • Element nodes
  • Root nodes
  • Consists of
  • Ordered list of zero or more other nodes
  • 4 kinds of nodes can be children (memorize this
    subset!)
  • Element nodes
  • Text nodes
  • Comment nodes
  • Processing instruction (PI) nodes
  • Instead of Lions, Tigers, and Bears, Oh My,
    chant
  • Elements, comments, text, PIs! Elements,
    comments, text, PIs!
  • Example: <foo><bar/> <!-- hi --> </foo>
  • The foo element's children consist of four nodes,
    in order
  • 1) element, 2) text, 3) comment, 4) text
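
The child list in this example can be checked with Python's standard xml.dom.minidom. This is only a sketch: the DOM is a different model from the XPath data model (it keeps CDATA sections as separate nodes, for instance), but for this small document the children line up. The KIND table and the function name child_kinds are made up for illustration.

```python
from xml.dom import minidom, Node

# Labels for the four node types that can be children.
KIND = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "PI",
}


def child_kinds(xml_text):
    """List the kind of each child of the document element, in order."""
    root = minidom.parseString(xml_text).documentElement
    return [KIND[child.nodeType] for child in root.childNodes]


print(child_kinds("<foo><bar/> <!-- hi --> </foo>"))
# prints ['element', 'text', 'comment', 'text']
```

The two text children are the single spaces on either side of the comment.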

37
Why should I memorize that subset of four?
  • Knowing what types of nodes can be children is
    crucial to understanding what this little,
    unassuming instruction does (as we shall see)
  • <xsl:apply-templates/>
  • So remember
  • Elements, comments, text, PIs!
  • Elements, comments, text, PIs!

38
How to access the children
  • Use the child axis, e.g. (in non-abbreviated
    form)
  • child::node()
  • Selects all children of the context node
  • child::*
  • Selects all child elements of the context node
  • child::paragraph
  • Selects all child elements named paragraph
  • child::xyz:foo
  • Selects all child elements named foo in the
    namespace designated by the xyz prefix
  • child::xyz:*
  • Selects all child elements that are in the
    namespace designated by the xyz prefix

39
Attribute nodes
  • There is one attribute node for each attribute
    that appears in a document. (Duh again.)
  • Example: <foo bar="bat" bang="baz"/>
  • There are two attribute nodes in the above
    example
  • bar and bang

40
Node property: attributes
  • Applies only to
  • Element nodes
  • Consists of
  • Unordered list of zero or more attribute nodes
  • For example
  • <doc lang="en"/>
  • The doc element's attributes property consists of
    one lang attribute

41
How to select attributes
  • Use the attribute axis, e.g. (in abbreviated
    form)
  • @lang
  • Selects the attribute named lang
  • @* or @node()
  • Selects all attributes of the context node
  • @abc:foo
  • Selects the attribute named foo in the namespace
    designated by the abc prefix
  • @abc:*
  • Selects all attributes that are in the namespace
    designated by the abc prefix

42
Text nodes
  • There is one text node for each contiguous
    sequence of character data in a document
  • Text nodes are never adjacent siblings to each
    other
  • Adjacent text nodes are always automatically
    merged into one text node (e.g., when creating
    the result tree in XSLT)
  • Lexical details are thrown away
  • The XPath data model knows nothing about
  • CDATA sections, entity references, or character
    references
  • Example: <foo>&lt;</foo>
  • There is one text node in the above document (a <
    character)
  • Example: <foo><![CDATA[<]]></foo>
  • Identical to the first example, as far as XPath
    is concerned

43
Text node quiz
  • Example

<foo>
  <bar>Hello world.</bar>
</foo>
  • How many text nodes are in the above document?

44
Text node quiz ANSWER
  • Example

<foo>
  <bar>Hello world.</bar>
</foo>
  • How many text nodes in the above document?
  • ANSWER: 3
  • 1: Linefeed, space, space
  • 2: Hello world.
  • 3: Linefeed
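
The quiz answer can be reproduced with a short Python sketch based on the standard xml.dom.minidom. It assumes the quiz document is laid out over three lines with a two-space indent, as the answer implies; the helper name text_nodes is invented. The DOM is not the XPath data model, but for this document both see the same three text nodes.

```python
from xml.dom import minidom, Node

# The quiz document: linefeed + two spaces, then <bar>, then a linefeed.
QUIZ_XML = "<foo>\n  <bar>Hello world.</bar>\n</foo>"


def text_nodes(node):
    """Yield every text node below node, in document order."""
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            yield child
        else:
            yield from text_nodes(child)


found = [text.data for text in text_nodes(minidom.parseString(QUIZ_XML))]
print(len(found), found)
# prints 3 ['\n  ', 'Hello world.', '\n']
```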

45
How to select text nodes
  • Use the text() node test
  • text()
  • Short for child::text()
  • descendant::text()
  • Selects all text nodes that are descendants of
    the context node

46
Comment nodes
  • There is one comment node for each comment
  • Example
  • <!--This is a comment node-->

47
How to select comments
  • Use the comment() node test on the child axis
  • comment()
  • Short for child::comment()

48
Processing instruction (PI) nodes
  • There is one PI node for each PI
  • The XML declaration is not a PI
  • <?xml version="1.0"?> is not a PI
  • (It's not a node at all but just a lexical detail
    that XPath knows nothing about.)
  • Example
  • (This is a PI.)
  • <?xml-stylesheet type="text/xsl" href="a.xsl"?>

49
How to select processing instructions
  • Use the processing-instruction() node test
  • Any PI
  • processing-instruction()
  • Selects all PI children of the context node
  • Short for child::processing-instruction()
  • PI with a specific target
  • processing-instruction('xml-stylesheet')
  • Selects all xml-stylesheet processing instruction
    children of the context node

50
Namespace nodes
  • There is one namespace node for each in-scope
    namespace URI/prefix binding for each element in
    a document. (No duh... er... what?)
  • Always includes this (implicit) binding (used by
    reserved attributes xml:lang and xml:space,
    etc.)
  • Prefix: xml
  • URI: http://www.w3.org/XML/1998/namespace
  • Example: <foo/>
  • There is one namespace node in the above document
  • Example: <foo xmlns="http://example.com"/>
  • There are two namespace nodes in the above
    document
  • The implicit xml one (see above)
  • And this one
  • Prefix: ""
  • URI: http://example.com

51
Node property: namespace nodes
  • As with the attributes property, applies only to
  • Element nodes
  • Consists of
  • Unordered list of zero or more namespace nodes
  • For example
  • <foo xmlns:xyz="http://example.com"/>
  • The foo element's namespace nodes property
    consists of two namespace nodes (one for xyz and
    one for xml)

52
How to select namespace nodes
  • Use the namespace axis
  • namespace::*
  • Selects all of the context node's namespace nodes
  • namespace::node()
  • Same as above
  • namespace::xyz
  • Select the context node's namespace node that
    declares the xyz prefix.

53
Node property: parent
  • Applies to
  • All node types except root node
  • Element nodes
  • Text nodes
  • Comment nodes
  • Processing instruction (PI) nodes
  • Attribute nodes
  • Namespace nodes
  • Consists of
  • Exactly one other node
  • Root node, or
  • Element node
  • Example: <foo bar="bat"/>
  • The bar attribute's parent is the foo element

54
How to access the parent node
  • Use the parent axis
  • ..
  • Selects the parent node of the context node
  • Short for parent::node()
  • parent::doc
  • Select the parent node of the context node
    provided that it is an element named doc
  • (otherwise return an empty node-set)
  • parent::*
  • Select the parent node of the context node
    provided that it is an element

55
A riddle
  • You have a parent but you are not a child.
  • What are you?

56
A riddle, cont.
  • You have a parent but you are not a child.
  • What are you?
  • Hint
  • Only 4 node types are children, but 6 node types
    have parents
  • 6 - 4 = 2 ...

57
The answer
  • You have a parent but you are not a child.
  • What are you?
  • Hint
  • 4 node types are children, but 6 node types have
    parents
  • 6 - 4 = 2 ...
  • ANSWER
  • A namespace node or an attribute node of course!
  • Embracing the asymmetry and moving on...

58
Derived node relationships
  • A node's descendants consist of the transitive
    closure of the children property
  • A fancy way of saying
  • My children and my grandchildren and my great
    grandchildren and their kids and so on
  • A node's ancestors consist of the transitive
    closure of the parent property
  • A fancy way of saying
  • My parent and my grandparent and my great
    grandparent and its parent and so on

59
Shooting blanks is okay
  • QUIZ: How many nodes will each of the following
    expressions return?
  • parent::comment()
  • attribute::text()
  • ancestor::processing-instruction()
  • namespace::xyz:*

60
Shooting blanks is okay
  • QUIZ: How many nodes will each of the following
    expressions return?
  • parent::comment()
  • attribute::text()
  • ancestor::processing-instruction()
  • namespace::xyz:*
  • ANSWER: 0, by definition
  • These expressions are perfectly legal; they're
    just guaranteed to return empty
61
Node property: string-value
  • Applicable to
  • All node types
  • Root: concatenation of all descendant text node
    string-values
  • Element: concatenation of all descendant text
    node string-values
  • Attribute: normalized attribute value
  • Text: character data (always at least one
    character)
  • Comment: the content of the comment
  • PI: text following the PI target and whitespace
  • e.g., type="text/xsl" href="style.xsl" is the
    string-value of an example stylesheet PI
  • Namespace node: the namespace URI
  • Use <xsl:value-of/> to insert the string-value of
    a node into the result tree
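
The string-value of an element can be sketched in Python with the standard xml.etree.ElementTree: itertext() walks the element's descendant text in document order, and joining the pieces gives the concatenation described above. The function name string_value is made up for this sketch, and the sample is the first para from article.xml.

```python
import xml.etree.ElementTree as ET

# The first paragraph from article.xml.
PARA = "<para>This is the <emphasis>first</emphasis> paragraph.</para>"


def string_value(element):
    """Concatenate all descendant text of element, in document order."""
    return "".join(element.itertext())


para = ET.fromstring(PARA)
print(string_value(para))
# prints This is the first paragraph.
```

Note that the emphasis markup disappears: only the text nodes contribute to the string-value.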

62
Node property: expanded-name
  • Applicable to
  • Elements and attributes
  • Local part: local name of node, returned by
    local-name()
  • URI part: namespace name (URI) of node, returned
    by namespace-uri()
  • PIs
  • Local part: the PI target, e.g., xml-stylesheet
  • URI part: (always null)
  • Namespace nodes
  • Local part: the namespace prefix, e.g., xml or
    xyz or empty string ("") in the case of a default
    namespace
  • URI part: (always null)
  • Root, text, and comment nodes do not have names

63
Document order
  • There is an ordering for all nodes in a document
    called document order
  • The root node is always the first node in a
    document
  • The rest are ordered according to where their XML
    representation begins
  • Except that the relative order of attributes and
    namespace nodes on the same element is
    implementation-defined
  • Why should I care about document order?
  • Because it's the default order in which nodes are
    processed by both <xsl:for-each> and
    <xsl:apply-templates>

64
Quiz: counting nodes
  • How many nodes are in the following XML document?

65
Answer
  • How many nodes are in the following XML document?
  • 15!

66
The first 14 nodes in the QUIZ 1 example
67
Quiz review
  • para
  • Short for child::para
  • *
  • Short for child::*
  • node()
  • Short for child::node()
  • @name
  • Short for attribute::name
  • @*
  • Short for attribute::*

68
Quiz review
  • para[1]
  • Short for child::para[1]
  • Equivalent to child::para[position() = 1]
  • para[last()]
  • Short for child::para[last()]
  • Equivalent to child::para[position() = last()]
  • */para
  • Short for child::*/child::para
  • /doc/chapter[5]/section[2]
  • Short for
    /child::doc/child::chapter[5]/child::section[2]

69
Quiz review
  • chapter//para
  • Short for
  • child::chapter/descendant-or-self::node()/child::para
  • //para
  • Short for
  • /descendant-or-self::node()/para
  • .
  • Short for
  • self::node()
  • .//para
  • Short for
  • self::node()/descendant-or-self::node()/child::para

70
Quiz review
  • ..
  • Short for parent::node()
  • title
  • Short for child::title
  • ../@lang
  • Short for parent::node()/attribute::lang

71
Summary of abbreviations
  • XPath has five abbreviations. They are
  • . is short for self::node()
  • .. is short for parent::node()
  • @ is short for attribute::
  • // is short for /descendant-or-self::node()/
  • foo is short for child::foo

72
XPath, the language
  • Descending from the clouds
  • Keep filling out that NOTES page

73
XPath basics review
  • XPath is an expression language
  • Every expression returns a value
  • XPath 1.0 has just four data types (write these
    down!)
  • Node-set (the most important)
  • String
  • Number
  • Boolean
  • All expressions are evaluated in a context
  • Understanding context is crucial to understanding
    XPath

74
XPath context
  • All XPath expressions (whether in XSLT or not)
    are evaluated in a context
  • The context consists of 6 parts
  • The context node
  • The context size (an integer 1 or higher)
  • Returned by the last() function
  • The context position (an integer 1 or higher)
  • Returned by the position() function
  • A set of namespace/prefix declarations in scope
    for the expression
  • Used to evaluate QNames in the expression, e.g.,
    xyz:foo/xyz:bar
  • A set of variable bindings
  • A function library

75
XPath context, cont.
  • The context comprises the entire world for an
    XPath expression, so to speak.
  • Other than its context, there is no input to an
    XPath expression. It consists of everything
    outside the expression itself that may affect the
    resulting value of the expression.
  • The context indicates
  • Where you are
  • Where in the tree
  • Context node
  • Where in processing
  • Context size (the size of an arbitrary list of
    nodes being processed)
  • Context position (the position of the current
    node in that list)
  • What is available to you
  • What variables you can reference
  • What namespace prefixes you can use
  • What functions you can call

76
XPath syntax overview
  • XPath supports these kinds of expressions
  • Variable references
  • $foo, $bar, etc.
  • Function calls
  • starts-with($str,'a')
  • true()
  • round($num)
  • Parenthesized expressions
  • (//para)
  • (foo | bar)
  • String literals
  • "foo", 'bar', etc.
  • Numbers
  • 13, 24.7, .007, etc.
  • cont...

77
XPath syntax overview, cont.
  • Node-set expressions
  • /html/body/p[2]/text()
  • //@person | //person
  • (.//note | para/fnote)[1]
  • $ns[@id='xyz']
  • Arithmetic expressions
  • (($x - 5) * 2) div -3
  • $pos mod 2
  • Boolean expressions
  • $is-good and $is-valid
  • $x > 4
  • position() != last()

78
Node-set expressions
  • A node-set is
  • An unordered collection of zero or more nodes
  • Node-set expressions include
  • Location paths (the most important kind of
    expression!)
  • foo/bar[3]
  • Union expressions (union of two node-set
    expressions using the | operator)
  • $set | foo/bar[3]
  • Filtered expressions (a predicate applied to any
    expression using the [] predicate operator)
  • $set[.='good']
  • Path expressions (any expression composed with a
    location path using the / or // operators)
  • $set//bar

79
Location paths: a formal definition
  • A location path
  • Is the most important kind of XPath expression
  • Returns a node-set
  • Can be absolute or relative
  • Relative
  • One or more steps separated by /
  • foo
  • foo/bar
  • Absolute
  • /
  • Selects the root node of the document that
    contains the context node
  • The only location path that doesn't have any
    steps in it
  • / followed by a relative location path
  • /foo
  • /foo/bar

80
Location path steps
  • A location path step has 3 parts
  • An axis specifier
  • A node test
  • Zero or more predicates
  • For example: child::paragraph[string-length(.) > 100]
  • The above is equivalent to this abbreviated form
  • paragraph[string-length(.) > 100]
  • (because child is the default axis)
  • It selects each paragraph child whose
    string-value is greater than 100 characters in
    length

81
How a step is evaluated
  • Moving from left to right
  • The axis identifies a set of nodes relative to
    the context node.
  • The node test acts as a filter on that set.
  • Each of any number of optional predicates in turn
    acts as a filter on the set identified by the
    preceding predicates and node test to its left.
  • For example
  • child::paragraph[string-length(.) > 100]
  • The child axis identifies all the children of the
    context node.
  • Among those, the paragraph node test selects only
    the elements named paragraph.
  • Among those, the string-length(.) > 100 predicate
    filters out all but the nodes whose string-value
    is greater than 100 characters long.
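
The three filtering stages can be spelled out in plain Python. This is a sketch of the idea, not how an XPath engine is actually built; the function names and the sample document are invented. One simplification to keep in mind: iterating an ElementTree element yields only its element children, so text, comment and PI children never appear on the simulated child axis.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """String-value of an element: all descendant text, concatenated."""
    return "".join(element.itertext())


def long_paragraphs(context, min_length=100):
    """Mimic child::paragraph[string-length(.) > min_length], stage by stage."""
    on_axis = list(context)                                         # 1. axis
    tested = [node for node in on_axis if node.tag == "paragraph"]  # 2. node test
    return [node for node in tested
            if len(string_value(node)) > min_length]                # 3. predicate


# Invented sample: a long title, a short paragraph and a 101-character paragraph.
doc = ET.fromstring(
    "<doc><title>{}</title><paragraph>short</paragraph>"
    "<paragraph>{}</paragraph></doc>".format("t" * 150, "x" * 101)
)
selected = long_paragraphs(doc)
print(len(selected), len(string_value(selected[0])))
# prints 1 101
```

The long title is dropped by the node test and the short paragraph by the predicate, so only one node survives.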

82
The axis
  • That's this part
  • Can be any one of 13 axes
  • child
  • self
  • parent
  • descendant
  • descendant-or-self
  • ancestor
  • ancestor-or-self
  • following
  • following-sibling
  • preceding
  • preceding-sibling
  • attribute
  • namespace

83
The 13 XPath axes
  • What each axis contains
  • child
  • The children of the context node.
  • descendant
  • The descendants of the context node (children,
    children's children, etc.).
  • parent
  • The parent of the context node (empty if context
    node is root node).
  • ancestor
  • The ancestors of the context node (parent,
    parent's parent, etc.).

84
The 13 XPath axes, cont.
  • What each axis contains, cont.
  • attribute
  • The attributes of the context node (empty if
    context node is not an element).
  • namespace
  • The namespace nodes of the context node (empty if
    context node is not an element).
  • self
  • Just the context node itself.
  • descendant-or-self
  • The context node and descendants of the context
    node.
  • ancestor-or-self
  • The context node and ancestors of the context
    node.

85
The 13 XPath axes, cont.
  • What each axis contains, cont.
  • following-sibling
  • All nodes with the same parent as the context
    node that come after the context node in document
    order (empty if context node is an attribute or
    namespace node).
  • preceding-sibling
  • All nodes with the same parent as the context
    node that come before the context node in
    document order (excluding attributes and
    namespace nodes).
  • following
  • All nodes after the context node in document
    order, excluding descendants, attributes, and
    namespace nodes.
  • preceding
  • All nodes before the context node in document
    order, excluding ancestors, attributes, and
    namespace nodes.

86
A little observation about siblings
  • The types of nodes that can be siblings are the
    same as the types of nodes that can be children
  • Elements, comments, text, PIs!
  • That's because attributes and namespace nodes are
    by definition not siblings to anyone, not even to
    each other. They're just attached to their
    parent.

87
The node test
  • That's this part
  • A node test is a filter on an axis
  • There are two kinds of node test
  • Node type tests
  • Any node: node()
  • Specific node type: text(), comment(),
    processing-instruction()
  • Specific PI target: processing-instruction('foo')
  • Name tests
  • Wildcard (any name): *
  • Namespace-qualified wildcard (any local name
    within a particular namespace): xyz:*, abc:*,
    etc.
  • QName (a specific expanded-name): foo, xyz:foo,
    the highlighted example, etc.
  • Node type tests are not functions
  • They're special forms that only happen to look
    like functions

88
Node type tests
  • The most inclusive node test: node()
  • It includes every node regardless of its name or
    node type
  • Thus, it effectively selects all nodes on the
    given axis
  • child::node() selects all children
  • ancestor::node() selects all ancestors
  • etc.
  • Specific node type
  • text(), comment(), processing-instruction()
  • Selects only the nodes of the given type from the
    given axis
  • descendant::text() selects all descendant text
    nodes
  • following::comment() selects all following
    comment nodes
  • preceding::processing-instruction() selects all
    preceding PI nodes
  • Specific PI target
  • child::processing-instruction('xml-stylesheet')

89
Name tests
  • Name tests only select nodes of one type at a
    time, depending on the axis
  • This is called the principal node type for an
    axis
  • Attributes while on the attribute axis
  • Namespace nodes while on the namespace axis
  • Element nodes while on every other axis
  • The wildcard: *
  • Selects all nodes of the principal node type
  • child::* selects all element nodes on the child
    axis
  • attribute::* selects all attribute nodes on the
    attribute axis
  • effectively no different than attribute::node()
    because the attribute axis can only ever contain
    attribute nodes

90
Name tests, cont.
  • Namespace-qualified wildcards: xyz:*
  • Selects all nodes of the principal node type
    whose expanded-name has a particular URI part
  • child::xyz:* selects all element nodes on the
    child axis that are in the namespace designated
    by the xyz prefix
  • QNames: foo, xyz:foo, etc.
  • Selects all nodes of the principal node type
    whose expanded-name has a particular local part
    and a particular URI part
  • child::foo selects all element nodes on the child
    axis that have local name foo and that are not in
    a namespace
  • ancestor::xyz:foo selects all element nodes on
    the ancestor axis that have local name foo and
    that are in the namespace designated by the xyz
    prefix

91
How multiple steps are evaluated
  • The rightmost step indicates what nodes are
    returned
  • table/tr/td
  • The above location path returns a node-set of
    zero or more td elements
  • /doc/section[5]/para[text() and @*]
  • What does the above location path return?
  • Each step is evaluated once for each node
    returned by the step to its left, using that node
    as the context node for the evaluation
  • The result is the union of the node-sets returned
    by all the evaluations of the rightmost step

92
Predicates
  • A predicate filters a node-set to produce a new
    node-set
  • price[. > 5]
  • Of all the price child elements, return only
    those whose string-value, when converted to a
    number, is greater than 5
  • The predicate expression is evaluated once for
    each node in the node-set to be filtered, using
    that node as the context node for the evaluation
  • The result is converted to a boolean (if
    necessary)
  • If true, the node is retained in the result
  • If false, the node is excluded from the result

93
Numeric predicates
  • When the predicate expression evaluates to a
    number
  • It is interpreted in a special way, such that
  • foo[5] is short for foo[position()=5]
  • foo[last()] is short for foo[position()=last()]

94
Context size in predicates
  • The last() function returns the context size
  • The number of nodes returned by the step (or
    arbitrary node-set expression) to its left
  • foo[last()] is equivalent to foo[5] if there are a
    total of 5 foo elements

95
Context position in predicates
  • The position() function returns the context
    position
  • The proximity position of the context node for
    the current predicate evaluation
  • The relative position of the node among all the
    nodes being filtered in document order
  • foo5 returns the 5th foo element in document
    order
  • Unless the step uses one of the four reverse
    axes
  • preceding
  • preceding-sibling
  • ancestor
  • ancestor-or-self
  • ancestor::node()[1] is equivalent to ..
  • preceding-sibling::foo[1] returns the first foo
    element in reverse document order
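A short sketch contrasting context position on a forward axis with a reverse axis. The list and foo element names and values are hypothetical.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <list><foo>one</foo><foo>two</foo><foo>three</foo></list>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Forward axis (child): positions count in document order. -->
    <xsl:value-of select="list/foo[2]"/>
    <xsl:text> </xsl:text>
    <!-- Make the last foo the current node, then look backwards. -->
    <xsl:for-each select="list/foo[last()]">
      <!-- Reverse axis: position 1 is the nearest preceding sibling. -->
      <xsl:value-of select="preceding-sibling::foo[1]"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="preceding-sibling::foo[2]"/>
      <xsl:text> </xsl:text>
      <!-- ancestor::node()[1] is the parent, the same node as ".." -->
      <xsl:value-of select="name(ancestor::node()[1])"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, this should output: two two one list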

96
Step filters vs. Expression filters
  • A predicate can be used to filter two different
    things
  • A location path step
  • An arbitrary node-set expression
  • node-set[@foo='bar']
  • Gotchas
  • //para[1] vs. (//para)[1]
  • ancestor::*[1] vs. (ancestor::*)[1]
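A minimal sketch of the first gotcha, assuming a hypothetical document with two sections. In //para[1] the predicate filters the child step, once per parent; in (//para)[1] it filters the whole node-set.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <doc>
       <section><para>A</para><para>B</para></section>
       <section><para>C</para></section>
     </doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Step filter: every para that is the first para child of its parent -->
    <xsl:value-of select="count(//para[1])"/>
    <xsl:text> </xsl:text>
    <!-- Expression filter: the first para in the whole document -->
    <xsl:value-of select="count((//para)[1])"/>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, this should output 2 1: the step filter keeps paras A and C, while the expression filter keeps only A.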

97
Comparisons with node-sets
  • price > 20
  • True if there are any price element children
    whose string-value when converted to a number is
    greater than 20
  • foo[bar = bat]
  • Select all foo elements that have any bar element
    child and any bat element child that have the
    same string-value
  • Comparisons with empty node-sets always return
    false
  • Gotcha
  • foo != 2
  • foo != bar
  • Use not() for the true complement
  • not(foo = 2)
  • not(foo = bar)
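The gotcha can be sketched with a two-element node-set. The data document below is a hypothetical example; bar is deliberately absent, to show comparisons against an empty node-set.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <data><foo>1</foo><foo>2</foo></data>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="data">
    <!-- True: at least one foo equals 2 -->
    <xsl:text>foo = 2: </xsl:text>
    <xsl:value-of select="foo = 2"/>
    <!-- Also true: at least one foo (the first) differs from 2 -->
    <xsl:text>&#10;foo != 2: </xsl:text>
    <xsl:value-of select="foo != 2"/>
    <!-- False: the true complement of foo = 2 -->
    <xsl:text>&#10;not(foo = 2): </xsl:text>
    <xsl:value-of select="not(foo = 2)"/>
    <!-- Both false: bar is an empty node-set -->
    <xsl:text>&#10;bar = 2: </xsl:text>
    <xsl:value-of select="bar = 2"/>
    <xsl:text>&#10;bar != 2: </xsl:text>
    <xsl:value-of select="bar != 2"/>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, foo = 2 and foo != 2 are both true, which is why != is not the complement of = when node-sets are involved.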

98
Functions overview
  • String functions
  • string(), concat(), starts-with(), contains(),
    substring-before(), substring-after(),
    substring(), string-length(), normalize-space(),
    translate()
  • Node-set functions
  • last(), position(), count(), id(), local-name(),
    namespace-uri(), name()
  • Boolean functions
  • boolean(), not(), true(), false(), lang()
  • Number functions
  • number(), sum(), floor(), ceiling(), round()
  • XSLT adds
  • document(), key(), generate-id(),
    system-property(), format-number(), current(),
    element-available(), function-available(),
    unparsed-entity-uri()
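A few of the functions listed above, sketched with literal arguments so the stylesheet does not depend on the input document. The argument values are arbitrary examples; the comments give the results these calls should produce under XPath 1.0 and XSLT 1.0.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- XSLT -->
    <xsl:value-of select="concat('XS', 'LT')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- name -->
    <xsl:value-of select="substring-before('name=value', '=')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- a b (leading, trailing, and repeated spaces collapsed) -->
    <xsl:value-of select="normalize-space('  a   b ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- ABC -->
    <xsl:value-of select="translate('abc', 'abc', 'ABC')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- 1,234.50 (format-number is one of the XSLT additions) -->
    <xsl:value-of select="format-number(1234.5, '#,##0.00')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```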

99
XSLT element overview
100
XSLT elements, by use case
  • Creating nodes: xsl:element, xsl:attribute,
    xsl:text, xsl:comment, xsl:processing-instruction
  • Copying nodes: xsl:copy-of, xsl:copy
  • Repetition (looping): xsl:for-each
  • Sorting: xsl:sort
  • Conditional processing: xsl:choose, xsl:if
  • Computing or extracting a value: xsl:value-of
  • Defining variables and parameters: xsl:variable,
    xsl:param
  • Defining and calling subprocedures (named
    templates): xsl:template, xsl:call-template
  • Defining and applying template rules:
    xsl:template, xsl:apply-templates,
    xsl:apply-imports
  • Numbering and number formatting: xsl:number,
    xsl:decimal-format
  • Debugging: xsl:message
101
XSLT elements, cont.
  • Combining stylesheets (modularization):
    xsl:import, xsl:include
  • Compatibility: xsl:fallback
  • Building lookup indexes: xsl:key
  • XSLT code generation: xsl:namespace-alias
  • Output formatting: xsl:output
  • Whitespace stripping: xsl:strip-space,
    xsl:preserve-space
102
XSLT's processing model
103
The end: construct a result tree
104
The means: process lists
  • If XPath is about trees, then XSLT is about lists
  • Populate arbitrary nodes from the source tree
    into lists
  • Iterate over those lists
  • For each node in the list, create part of the
    result tree
  • Source tree -> List processing -> Result tree
  • Thus, there is always
  • a current node list, and
  • a current node

105
Two mechanisms for iterating over lists
  • xsl:apply-templates and xsl:for-each
  • They both iterate over the nodes of a given
    node-set
  • Supplied by the XPath expression in the select
    attribute
  • For example
  • <xsl:apply-templates select="para"/>
  • Populate the current node list with para
    elements, sorted in document order. For each para
    element, invoke the best-matching template rule.
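A minimal sketch showing both iteration mechanisms over the same node-set. The doc and para input and the out, p, and li result elements are hypothetical names chosen for illustration.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <doc><para>First</para><para>Second</para></doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="doc">
    <out>
      <!-- xsl:apply-templates: for each para, invoke the
           best-matching template rule (the one below). -->
      <xsl:apply-templates select="para"/>

      <!-- xsl:for-each: for each para, instantiate the
           content written inline here. -->
      <xsl:for-each select="para">
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </out>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```

In both cases the current node list holds the two para elements in document order; the difference is where the per-node behavior is defined.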

106
All XSLT processing begins with...
  • A virtual call to
  • <xsl:apply-templates select="/"/>
  • The current node list initially consists of just
    one node
  • The root node of the source tree
  • In other words, the XSLT processor invokes the
    template rule that matches the root node
  • This call constructs the entire result tree
  • Nothing happens before it
  • Nothing happens after it

107
Your job as an XSLT stylesheet author...
  • ...is to define, using template rules, what
    happens when the XSLT processor executes this
    instruction
  • <xsl:apply-templates select="/"/>

108
Template rules
  • An XSLT stylesheet contains a set of template
    rules
  • Two kinds of template rule
  • Those you define
  • Those that XSLT defines for you
  • These are called the built-in template rules.
  • There is a built-in template rule for each of the
    7 types of node
  • Ensures that all calls to xsl:apply-templates
    will never fail to find a matching template rule
  • Even if your stylesheet contains no explicit
    template rules at all

109
The empty stylesheet
  • Consider this stylesheet

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>
  • If you apply the above stylesheet to the example
    XML from QUIZ 1...
  • What will the result be?

110
The result
xsltproc empty.xsl quiz1.xml
<?xml version="1.0"?>
This is a simple XML
document You can do it! There's nothing
to it! Go fast! This will be
interesting Here we go... sub-chapter
Who ever heard of nested chapters?!
another sub-chapter End of sub-chapter
No more nested chapters for now...
111
Template rules that you define
  • When you define template rules, you override the
    default behavior
  • An explicit template rule is
  • An xsl:template element that has a match
    attribute
  • For example

<xsl:template match="foo">
  <!-- construct part of the result tree -->
  <xsl:apply-templates/>
  <!--...-->
</xsl:template>
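A small sketch of overriding the default behavior for just two element types, leaving everything else to the built-in rules. The doc, title, and note names and the h1 result element are hypothetical.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <doc><title>Hi</title><note>skip me</note></doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Explicit rule: wrap the title's content in an h1 element. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- Explicit rule with an empty body: note elements produce nothing. -->
  <xsl:template match="note"/>
</xsl:stylesheet>
```

With the assumed input, the result tree should contain an h1 element holding "Hi" and nothing from the note element. The doc element has no explicit rule, so the built-in rule simply applies templates to its children.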
112
Applying template rules
  • <xsl:apply-templates/>
  • Short for
  • <xsl:apply-templates select="node()"/>
  • Process all child nodes of the context node

113
Applying template rules: an OOP analogy
  • <xsl:apply-templates/>
  • For each item in the list
  • Invoke the same polymorphic function
  • Each template rule is an implementation of that
    polymorphic function

114
Patterns
  • The value of the match attribute is a pattern
  • Looks like an XPath expression
  • Uses a subset of XPath syntax
  • But has a more passive role
  • Does the current node match this pattern? Yes or
    no.
  • When xsl:apply-templates is invoked, for each
    node in the list, the XSLT processor searches all
    the patterns of the stylesheet for the
    best-matching one

115
Example patterns
  • Example patterns
  • /
  • /doc[@format='simple']
  • bar
  • foo/bar
  • section//para
  • @foo
  • @*
  • node()
  • text()
  • xyz:*

116
Does the pattern match?
  • Informal
  • If this pattern were an expression, would the
    node in question ever be selected by it?
  • Formal
  • A node matches a pattern if the node is a member
    of the result of evaluating the pattern as an
    expression with respect to some possible context
    node.

117
Template rules with multiple patterns
  • Separate the alternative patterns with |
  • <xsl:template match="foo | bar">...
  • Is short for

<xsl:template match="foo">
  <!--...-->
</xsl:template>
<xsl:template match="bar">
  <!--...-->
</xsl:template>
118
What about conflicts?
  • A foo element would match both of these template
    rules

<xsl:template match="foo">
  <!--...-->
</xsl:template>
<xsl:template match="*">
  <!--...-->
</xsl:template>
  • Which one gets invoked by <xsl:apply-templates
    select="foo"/>?

119
Two steps to resolving conflicts
  • When more than one template rule matches
  • Eliminate rules with lower import precedence.
  • Eliminate rules with lower priority.
  • Only one rule should be left; otherwise it is an
    error (a processor may recover by choosing the
    rule that occurs last in the stylesheet)
  • Import precedence depends on what file the rule
    occurs in
  • Where it occurs in the import tree (via
    xsl:import)
  • Priority depends on
  • The priority attribute of the xsl:template
    element, or
  • The default priority (when priority attribute is
    absent)

120
Default priority
  • Priority is a positive or negative decimal number
  • The higher the number, the higher the priority
  • There are four default priorities
  • -.5
  • -.25
  • 0
  • .5
  • From lowest to highest: -.5, -.25, 0, .5

121
Default priority depends on...
  • ...the syntax of the match pattern
  • The most common pattern format has a priority of
    0
  • 0
  • Match a particular name
  • foo, xyz:foo, @foo, @xyz:foo,
    processing-instruction('foo')
  • .5
  • The highest default priority
  • Any pattern with a predicate or multiple steps
  • foo/bar, foo[2], foo[@good='yes']

122
The lower default priorities are...
  • -.25
  • One-step wildcards within a namespace
  • xyz:*, @xyz:*
  • -.5
  • The lowest default priority
  • One-step wildcards regardless of name
  • *, @*, text(), comment(), processing-instruction(),
    node()
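The default priorities above can be sketched with three rules that all match the same element. The doc, foo, and bar names and the good attribute are hypothetical, and no priority attribute is used, so only the default priorities decide.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <doc><foo good="yes">1</foo><foo>2</foo><bar>3</bar></doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="doc/*"/>
  </xsl:template>

  <!-- Default priority -.5: one-step wildcard -->
  <xsl:template match="*">
    <xsl:text>wildcard&#10;</xsl:text>
  </xsl:template>

  <!-- Default priority 0: a particular name -->
  <xsl:template match="foo">
    <xsl:text>name&#10;</xsl:text>
  </xsl:template>

  <!-- Default priority .5: pattern with a predicate -->
  <xsl:template match="foo[@good='yes']">
    <xsl:text>predicate&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, the first foo matches all three rules and gets "predicate", the second foo matches two rules and gets "name", and bar matches only the wildcard rule.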

123
Modes
  • Modes allow you to process the same node again
    but do something different this time
  • <xsl:apply-templates select="heading"
    mode="toc"/>
  • <xsl:template match="heading" mode="toc">...
  • When the mode attribute is absent, that means the
    default (unnamed) mode
  • You can segment your template rules into sets
    organized by concern
  • What they generate in the result tree
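A sketch of the table-of-contents use case suggested by the mode name on the slide. The doc, heading, and para input and the HTML result elements are assumptions for illustration.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <doc>
       <heading>Intro</heading><para>Text</para>
       <heading>Usage</heading><para>More text</para>
     </doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="doc">
    <html>
      <body>
        <!-- First pass over the headings: build the contents list. -->
        <ul>
          <xsl:apply-templates select="heading" mode="toc"/>
        </ul>
        <!-- Second pass, in the default (unnamed) mode: the body. -->
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Same nodes, different output, selected by mode. -->
  <xsl:template match="heading" mode="toc">
    <li><xsl:value-of select="."/></li>
  </xsl:template>

  <xsl:template match="heading">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Each heading is processed twice: once as a list item in the toc mode and once as an h2 in the default mode.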

124
The built-in template rules
  • For elements and root nodes
  • Apply templates to children

<xsl:template match="/ | *">
  <xsl:apply-templates/>
</xsl:template>
  • For text nodes and attribute nodes
  • Output the string-value of the node

<xsl:template match="text() | @*">
  <xsl:value-of select="."/>
</xsl:template>
125
The built-in template rules
  • For processing instructions and comments
  • Do nothing

<xsl:template match="comment() |
  processing-instruction()"/>
  • For namespace nodes
  • Do nothing

126
Template rule content
  • Three kinds of elements
  • XSLT instructions
  • Any element in the XSLT namespace, e.g.,
    <xsl:value-of/>
  • Literal result elements
  • Any element in any other namespace, or no
    namespace
  • Creates a shallow copy of itself in the result
    tree
  • Extension elements
  • Any element in a namespace that's declared as an
    extension namespace (using the
    extension-element-prefixes attribute on the
    xsl:stylesheet element)

127
Attribute value templates
  • Attributes on literal result elements can contain
    dynamic values, delimited by curly braces
  • <para class="{@format}">...
  • To include a literal curly brace, double it
  • <foo bar="{{not interpreted as XPath}}"/>
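A minimal sketch of both cases in one literal result element. The p input element and the data-raw attribute are hypothetical names used only for illustration.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <doc><p format="note">Hello</p></doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="p">
    <!-- class: {@format} is evaluated as XPath against the current p.
         data-raw: doubled braces produce literal braces instead. -->
    <para class="{@format}" data-raw="{{@format}}">
      <xsl:apply-templates/>
    </para>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, the resulting para element should have class="note" and data-raw="{@format}".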

128
Miscellaneous topics...
130
The template rule engine
  • A lot goes on behind-the-scenes
  • xsl:apply-templates is the most important
    instruction in XSLT
  • <xsl:apply-templates select="para"/>
  • This means: "Apply templates to the para element
    children of the context node."

131
But what does "apply templates" mean?
  • <xsl:apply-templates select="para"/>
  • Let the para node-set populate the current node
    list (in document order)
  • For each node in the list, invoke the
    best-matching template rule
  • A template rule

134
Namespace node quiz
  • Example
  • <foo xmlns:e="http://example.com"><bar/></foo>
  • Quiz: How many namespace nodes are in the above
    document?

135
Answer
  • Example
  • <foo xmlns="http://example.com"><bar/></foo>
  • Quiz: How many namespace nodes are in the above
    document?
  • ANSWER: 4
  • Two namespace nodes for each element
  • As we'll see, namespace nodes are a property of
    the element for which they're in scope.
  • Doesn't that make for a huge proliferation of
    namespace nodes?
  • Yes.
  • Should I care?
  • Hardly ever.
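One way to check the answer is to count the nodes on the namespace axis. This is a sketch that assumes a processor supporting the namespace axis, as XPath 1.0 requires; some browser-based XSLT implementations do not.

```xml
<!-- Input document (the quiz example):
     <foo xmlns:e="http://example.com"><bar/></foo>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Each element has one namespace node for the declared
         namespace and one for the implicit xml prefix. -->
    <xsl:value-of select="count(//namespace::*)"/>
  </xsl:template>
</xsl:stylesheet>
```

A conforming processor should output 4: two namespace nodes on foo and two on bar. The count is the same if the namespace is declared as a default namespace instead.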

136
QNames in XPath/XSLT
  • QNames are expanded
  • Into local and URI parts
  • Using a set of namespace/prefix declarations
  • Supplied in the XPath expression context
  • This does not include a default namespace
    declaration (declared by xmlns)
  • Thus, if you want to select nodes in a particular
    namespace, then you must use a prefix
  • In other words, a QName without a prefix always
    designates a node that is not in a namespace
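A minimal sketch of the consequence for documents that use a default namespace. The prefix ex is an arbitrary choice made in the stylesheet; only the namespace URI has to match the one in the input.

```xml
<!-- Hypothetical input document (an assumption for illustration):
     <foo xmlns="http://example.com"><bar>hi</bar></foo>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ex="http://example.com">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- No prefix: looks for foo and bar in no namespace. -->
    <xsl:value-of select="count(foo/bar)"/>
    <xsl:text> </xsl:text>
    <!-- With a prefix bound to the same URI as the input. -->
    <xsl:value-of select="count(ex:foo/ex:bar)"/>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, this should output 0 1: the unprefixed path selects nothing, because the elements in the input are in the http://example.com namespace.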