Title: What's New in XSLT 2.0
1. What's New in XSLT 2.0
- Jeni Tennison
- http://www.jenitennison.com
2. Overview
- Grouping
- Function Definitions
- Result Documents
  - Multiple Result Documents
  - Output Serialisation
- Temporary Trees
- Sequences
- Text Parsing
- Typing
3. Grouping
- Perennial requirement
  - usually use Muenchian Method (keys) in XSLT 1.0
- XSLT 2.0 has <xsl:for-each-group>
  - select attribute identifies items to group
  - grouping by value calculates grouping key
    - group-by groups all items
    - group-adjacent groups adjacent items
  - grouping in sequence identifies start/end of group
    - group-starting-with identifies start of group
    - group-ending-with identifies end of group
- Use current-group() to get members of current group, current-grouping-key() to get value for current group
4. Grouping by Value

Source:

    <paper>
      <title>XML and PDF in Publishing Workflows</title>
      <author>Myers, Charles</author>
    </paper>
    <paper>
      <title>On the Way to XML</title>
      <author>Parsons, Jonathan</author>
      <author>Caisley, Phil</author>
    </paper>

Result:

    <author>
      <name>Caisley, Phil</name>
      <paper>On the Way to XML</paper>
    </author>
    <author>
      <name>Myers, Charles</name>
      <paper>XML and PDF in Publishing Workflows</paper>
    </author>
    <author>
      <name>Parsons, Jonathan</name>
      <paper>On the Way to XML</paper>
    </author>
5. Grouping by Value

    <xsl:for-each-group select="paper" group-by="author">
      <xsl:sort select="current-grouping-key()" />
      <author>
        <name>
          <xsl:value-of select="current-grouping-key()" />
        </name>
        <xsl:for-each select="current-group()">
          <paper>
            <xsl:value-of select="title" />
          </paper>
        </xsl:for-each>
      </author>
    </xsl:for-each-group>
6. Grouping in Sequence
- Can use to do grouping by position
- Or to "levitate" structure from flat documents
  - group the content of paragraphs into text or block-level elements

    <xsl:for-each-group select="paper"
                        group-starting-with="paper[position() mod 10 = 1]">
      <xsl:result-document href="papers{position()}.html">
        <xsl:apply-templates select="current-group()" />
      </xsl:result-document>
    </xsl:for-each-group>
7. Implications for XSLT 2.0 Use
- No more Muenchian Grouping!
  - easier to create indexes
  - easier to create summaries/roll-ups
  - easier to create paginated documents
- Much easier to convert from flat to hierarchical structures
  - processing XHTML to DocBook (or XHTML 2.0)
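The flat-to-hierarchical conversion mentioned above could look roughly like the following. This is only a sketch: the input (a body element with a flat run of h1 and p children, no namespace) and the output names (article, section, title, para) are assumptions made for illustration, and the syntax follows the final XSLT 2.0 Recommendation rather than the working draft.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- Assumed input: a <body> document element whose children are a flat
       run of <h1> and <p> elements, in no namespace. -->
  <xsl:template match="body">
    <article>
      <!-- Each h1 starts a new group holding that h1 and everything up to
           the next h1. Anything before the first h1 forms its own group. -->
      <xsl:for-each-group select="*" group-starting-with="h1">
        <section>
          <xsl:apply-templates select="current-group()" />
        </section>
      </xsl:for-each-group>
    </article>
  </xsl:template>

  <!-- Hypothetical target vocabulary, loosely modelled on DocBook. -->
  <xsl:template match="h1">
    <title><xsl:value-of select="." /></title>
  </xsl:template>

  <xsl:template match="p">
    <para><xsl:value-of select="." /></para>
  </xsl:template>

</xsl:stylesheet>
```

Deeper nesting (h2 inside h1 sections, and so on) would need the same grouping applied again to each group, which is left out here.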
8. Function Definitions
- Use XSLT code to create new functions
  - no facility to use scripting languages such as JavaScript
  - similar to <func:function> from EXSLT
- Function must be in a namespace
- All parameters are required
  - but can have multiple definitions with different numbers of arguments
  - supports optional arguments, not polymorphic functions
- Need parameters for context item, position
  - can't default to using context node as argument
- Body of function is result of function
  - similar to named templates
  - use <xsl:sequence> instead of <func:result>
9. Function Definition

    <xsl:function name="str:align">
      <xsl:param name="string" />
      <xsl:param name="padding" />
      <xsl:param name="alignment" />
    </xsl:function>

    <xsl:function name="str:align">
      <xsl:param name="string" />
      <xsl:param name="padding" />
      <xsl:sequence select="str:align($string, $padding, 'left')" />
    </xsl:function>
10. Implications for XSLT 2.0 Use
- General replacement for named templates
- Particularly useful where XSLT instructions can't be used
  - creating a value to sort by using <xsl:sort>
  - creating a value to index by using <xsl:key>
  - creating a value to group by using <xsl:for-each-group>
  - carrying out complex tests on nodes, for use in match patterns in templates or keys

    <xsl:template match="*[html:is-heading(.)]">
      ...
    </xsl:template>
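The slide uses an is-heading function without showing its definition. One possible definition is sketched below; the test it applies (XHTML elements named h1 to h6) and the heading output element are assumptions for illustration, not the author's code. The syntax follows the final XSLT 2.0 Recommendation.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:html="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="xs html">

  <!-- Hypothetical definition: true for XHTML elements named h1 to h6.
       The node is passed explicitly because a function body has no
       context item to fall back on. -->
  <xsl:function name="html:is-heading" as="xs:boolean">
    <xsl:param name="node" as="node()" />
    <xsl:sequence
        select="exists($node/self::html:*[matches(local-name(), '^h[1-6]$')])" />
  </xsl:function>

  <!-- The function is called from a match pattern, where a named
       template could not be used. -->
  <xsl:template match="html:*[html:is-heading(.)]">
    <heading level="{substring(local-name(), 2)}">
      <xsl:value-of select="." />
    </heading>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the test in one function means the same definition can be reused in keys, sort keys and grouping keys.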
11. Multiple Result Documents
- Many XSLT 1.0 processors have extension elements to create multiple output documents
- XSLT 2.0 has <xsl:result-document>

    <xsl:for-each select="section">
      <xsl:result-document href="{@id}.html">
        <xsl:apply-templates select="." mode="html" />
      </xsl:result-document>
    </xsl:for-each>
12. Multiple Result Documents
- Document is associated with href URI
  - accessible via API
  - should enable client-side support
- Makes it easier to create
  - paginated output
    - page per chapter
    - page per 20 records
  - pages using HTML frames
  - supplementary files
    - CSS stylesheets
    - SVG graphics
13. Output Serialisation
- Several changes to <xsl:output>
  - output definitions are named
    - referenced from <xsl:result-document> elements
  - additional xhtml output method
  - extra attributes to control HTML/XHTML serialisation
    - escape-uri-attributes controls URI-escaping of attributes
    - include-content-type controls addition of <meta> element
  - normalize-unicode attribute provides Unicode normalisation
  - character substitution provides an alternative for disable-output-escaping
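A named output definition referenced from a result document might be written as below. This is a sketch against the final XSLT 2.0 Recommendation, where the reference is made with the format attribute; the definition name "page" and the book/chapter/title input structure are assumptions for illustration.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Unnamed definition: applies to the principal result document. -->
  <xsl:output method="xml" indent="yes" />

  <!-- Named definition (the name "page" is hypothetical). -->
  <xsl:output name="page" method="xhtml" indent="yes"
              include-content-type="yes"
              escape-uri-attributes="yes" />

  <!-- Assumed input: <book> containing <chapter id="..."> elements,
       each with a <title> child. -->
  <xsl:template match="/book">
    <index>
      <xsl:for-each select="chapter">
        <entry href="{@id}.html" />
        <!-- Each chapter page is serialised with the "page" definition. -->
        <xsl:result-document href="{@id}.html" format="page">
          <html xmlns="http://www.w3.org/1999/xhtml">
            <head><title><xsl:value-of select="title" /></title></head>
            <body><h1><xsl:value-of select="title" /></h1></body>
          </html>
        </xsl:result-document>
      </xsl:for-each>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

The principal result (the index) and the chapter pages can then be serialised differently within one transformation.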
14. Character Substitution
- Map of characters-to-strings
- On serialisation, each character in a text node or attribute is replaced by the relevant string
- Simple use is to force use of entities

    <xsl:character-map name="html">
      <xsl:output-character character="&#160;" string="&amp;nbsp;" />
    </xsl:character-map>
    <xsl:output use-character-maps="html" />

Result tree:

    <eg>blah&#160;blah</eg>

Serialised as:

    <eg>blah&nbsp;blah</eg>
15. Character Substitution
- Complex use is to create non-well-formed output
  - use characters from private use areas to represent illegal sequences of characters

    <xsl:character-map name="jsp">
      <!-- JSP start -->
      <xsl:output-character character="&#xE001;" string="&lt;%" />
      <!-- JSP end -->
      <xsl:output-character character="&#xE002;" string="%&gt;" />
    </xsl:character-map>

Result tree:

    &#xE001;@ page language="java" &#xE002;

Serialised as:

    <%@ page language="java" %>
16. Implications for XSLT 2.0 Use
- Much more control over output
  - control over automatic serialisation
    - addition of <meta> element
    - Unicode normalisation
  - control over character serialisation
    - which entities get used
    - what form of character references
- No more d-o-e (disable-output-escaping)?
  - character substitution is better
    - works in attribute values
    - supported by all (serialising) processors
    - persists through variables
  - d-o-e is still easier to use
17. Temporary Trees and RTFs
- XSLT 1.0 had Result Tree Fragments
  - created when using the content of <xsl:variable>
  - tree that couldn't be accessed with a location path
  - most processors have xxx:node-set() extension function
    - converts result tree fragment to node tree
- In XSLT 2.0, have temporary trees
  - can copy in same way as RTFs
  - can access without using extension function
18. Example Temporary Tree

    <xsl:variable name="menus">
      <menu name="File">
        <menuItem name="New..." shortcut="Ctrl-N" />
        <menuItem name="Open..." shortcut="Ctrl-O" />
        <menuItem name="Save..." shortcut="Ctrl-S" />
        ...
      </menu>
      ...
    </xsl:variable>

XSLT 1.0:

    <xsl:value-of select="exsl:node-set($menus)/menu/menuItem[@shortcut = $shortcut]/@name" />

XSLT 2.0:

    <xsl:value-of select="$menus/menu/menuItem[@shortcut = $shortcut]/@name" />
19. Implications for XSLT 2.0 Use
- Break up complex transformations into several steps
  - filter, sort, annotate nodes in early steps
  - later steps are easier to write
- Create lookup tables
  - translate from codes or numbers to labels
  - similar to arrays or matrices
- Iteratively process a document until it fulfils some test
  - add content until a document is valid
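A lookup table held in a temporary tree could be sketched as follows. The country codes, the labels and the address element with its country attribute are all invented for illustration; the syntax follows the final XSLT 2.0 Recommendation.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" />

  <!-- Lookup table as a temporary tree (hypothetical codes and labels). -->
  <xsl:variable name="countries">
    <country code="uk">United Kingdom</country>
    <country code="fr">France</country>
    <country code="de">Germany</country>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:apply-templates select="//address" />
  </xsl:template>

  <!-- Assumed input: <address country="uk"> elements. -->
  <xsl:template match="address">
    <xsl:variable name="code" select="@country" />
    <!-- Path directly into the variable, no node-set() extension needed.
         The sequence falls back to 'Unknown' when no entry matches. -->
    <xsl:value-of
        select="($countries/country[@code = $code], 'Unknown')[1]" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For large tables an <xsl:key> over the temporary tree would probably be the better choice.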
20. Sequences in XPath 2.0
- New fundamental type in XPath 2.0
  - everything is a sequence
  - similar to node sets, but
    - ordered
    - allow duplicates
    - can contain atomic values as well as nodes
- Sequences are flat
  - for structured data, use XML
- Singleton sequences are the same as the single value they contain
21. Using Sequences in XSLT 2.0
- Iterate over a sequence

    <xsl:for-each select="1 to 3">
      <tr><td colspan="4" /></tr>
    </xsl:for-each>

Result:

    <tr><td colspan="4" /></tr>
    <tr><td colspan="4" /></tr>
    <tr><td colspan="4" /></tr>

- Create a text node from a sequence

    <xsl:value-of select="author/surname" separator=", " />

Result:

    Thompson, Tobin
22. Creating Sequences in XSLT 2.0
- Every sequence of instructions creates a sequence of items
- When a sequence is added to a node
  - atomic values are converted to text nodes
  - spaces added between atomic values
  - nodes are copied to create children
[Diagram: sequence of any items -> sequence of new nodes -> children of node]
23. Temporary Trees
- Variables can be set using select attribute or using content
- When setting value using content
  - if as attribute not present, create temporary tree
  - if as attribute present, create sequence

    <xsl:variable name="tree"> 42 </xsl:variable>
    <xsl:variable name="sequence" as="xs:integer"> 42 </xsl:variable>
24. Creating Sequences in XSLT 2.0
- New instruction <xsl:sequence>
  - adds existing nodes or new atomic values to a sequence
- Select the line item with highest subtotal

    <xsl:variable name="max-expenditure" as="element(lineitem)">
      <xsl:for-each select="lineitem">
        <xsl:sort select="@price * @quantity" order="descending" />
        <xsl:if test="position() = 1">
          <xsl:sequence select="." />
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>
25. Implications for XSLT 2.0 Use
- Less need for recursive templates
  - use integer sequences to iterate a number of times
  - use <xsl:sequence> to build node sequences by iteration rather than recursion
- Less need for temporary elements
  - atomic values don't need to be wrapped in an element in order to be passed around in a list
26. Text Parsing
- XPath 2.0 has regular expression support
  - matches(string, regex, flags?) returns true if a regular expression matches a substring
  - replace(string, regex, replacement, flags?) returns the string with all occurrences of the regular expression replaced using the replacement string
  - tokenize(string, regex, flags?) returns a sequence of strings created by splitting the string on every occurrence of the regular expression
- Can do more complex regular expression processing using XSLT 2.0
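The three functions above can be tried out with a small stylesheet like the one below. It is a sketch only: the sample string and the results/matches/replace/tokens element names are made up for illustration, and the function names are those of the final XPath 2.0 function library.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- Can be run against any source document, or started at the named
       template "main" where the processor supports that. -->
  <xsl:template name="main" match="/">
    <!-- Sample string (note the double space after "a"). -->
    <xsl:variable name="text" select="'Mary had a  little lamb'" />
    <results>
      <!-- true if the regex matches anywhere in the string -->
      <matches><xsl:value-of select="matches($text, 'l[ai]')" /></matches>
      <!-- collapse each run of whitespace to a single space -->
      <replace><xsl:value-of select="replace($text, '\s+', ' ')" /></replace>
      <!-- split on runs of whitespace, one <token> per word -->
      <tokens>
        <xsl:for-each select="tokenize($text, '\s+')">
          <token><xsl:value-of select="." /></token>
        </xsl:for-each>
      </tokens>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

Because tokenize() returns a sequence of strings, the inner xsl:for-each iterates over atomic values rather than nodes.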
27. Analysing Strings
- XSLT 2.0 has <xsl:analyze-string> instruction
  - select attribute selects string
  - regex attribute holds regular expression
- string split into a sequence of matching and non-matching substrings
- processed in turn by
  - <xsl:matching-substring> for matching
  - <xsl:non-matching-substring> for non-matching
28. Example String Analysis

Source:

    <poem>
    Mary had a little lamb,
    Its fleece was white as snow
    And everywhere that Mary went
    The lamb was sure to go.
    </poem>

Stylesheet:

    <xsl:template match="poem">
      <poem>
        <xsl:analyze-string select="." regex="\S.*" flags="m">
          <xsl:matching-substring>
            <line><xsl:value-of select="." /></line>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </poem>
    </xsl:template>

Result:

    <poem>
      <line>Mary had a little lamb,</line>
      <line>Its fleece was white as snow</line>
      <line>And everywhere that Mary went</line>
      <line>The lamb was sure to go.</line>
    </poem>
29. More Text Parsing
- Within <xsl:analyze-string>, use regex-group() function to get value of matched subexpression

    <xsl:template match="@date">
      <xsl:attribute name="date">
        <xsl:variable name="UK-date-regex"
                      select="'(\d{2})\\(\d{2})\\(\d{2})'" />
        <xsl:analyze-string select="." regex="{$UK-date-regex}">
          <xsl:matching-substring>
            <xsl:sequence select="concat('20', regex-group(3),
                                         '-', regex-group(2),
                                         '-', regex-group(1))" />
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:attribute>
    </xsl:template>
30. Implications for XSLT 2.0 Use
- XSLT 2.0 also allows access to text files with unparsed-text() function
  - works in a similar way to document()
- Potential to process any text format
  - comma-delimited and fixed-format files
  - CSS files
  - HTML? Java code?
    - these are hard because of matching tags/braces
- XSLT could be used for up-conversion to XML
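Up-conversion of a comma-delimited file might be sketched like this. The file name people.csv, the csv-uri parameter and the rows/row/field element names are assumptions for illustration, and the sketch deliberately ignores quoted fields that contain commas. The syntax follows the final XSLT 2.0 Recommendation.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- Hypothetical default; a relative URI is resolved against the
       stylesheet's base URI. -->
  <xsl:param name="csv-uri" select="'people.csv'" />

  <xsl:template name="main" match="/">
    <rows>
      <!-- Split the file into lines and skip blank ones. -->
      <xsl:for-each
          select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space(.)]">
        <row>
          <!-- Naive split on commas: quoted fields are NOT handled. -->
          <xsl:for-each select="tokenize(., ',')">
            <field><xsl:value-of select="normalize-space(.)" /></field>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

Formats with nested structure (the HTML and Java cases above) would need more than tokenize(), which is the difficulty the slide points out.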
31. Strong Typing in XSLT 2.0
- XPath 2.0 is strongly typed
  - the type of a value determines how it is treated by some operators
  - if the wrong type of value is passed to a function, you will get an error
- Similarly, in XSLT 2.0
  - the type of a sort key determines how values are sorted
  - if the wrong type of value is passed as a parameter to a template, you will get an error
32. Declaring Types of Variables
- Declare the type of variables and parameters with as attribute
  - holds a SequenceType
    - item test plus occurrence indicator
    - xs:integer+ means "one or more integers"
    - element()? means "an optional element"
- Error if value doesn't comply with type

    <xsl:function name="math:power">
      <xsl:param name="base" as="xs:decimal" />
      <xsl:param name="power" as="xs:integer" />
    </xsl:function>
33. Declaring Type of Functions
- Declare the return type of functions and templates with as attribute
  - holds a SequenceType
- Generated sequence will be converted to atomic sequence type if possible

    <xsl:function name="math:power" as="xs:decimal">
      <xsl:param name="base" as="xs:decimal" />
      <xsl:param name="power" as="xs:integer" />
    </xsl:function>

    <xsl:template match="@date" as="attribute(@date, *)">
      <xsl:attribute name="date"></xsl:attribute>
    </xsl:template>
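The slides leave the body of the power function empty. One way it might be filled in, with the declared types doing the checking, is sketched below. The recursive body, the math namespace URI and the treatment of non-positive powers as zero are assumptions for illustration; the syntax follows the final XSLT 2.0 Recommendation.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:math="http://example.com/math"
                exclude-result-prefixes="xs math">

  <xsl:output method="text" />

  <!-- Hypothetical body: repeated multiplication by recursion.
       Powers of zero or less simply return 1 in this sketch. -->
  <xsl:function name="math:power" as="xs:decimal">
    <xsl:param name="base" as="xs:decimal" />
    <xsl:param name="power" as="xs:integer" />
    <xsl:sequence select="if ($power le 0)
                          then 1.0
                          else $base * math:power($base, $power - 1)" />
  </xsl:function>

  <xsl:template name="main" match="/">
    <!-- The integer 2 is accepted because xs:integer is derived from
         xs:decimal. This call outputs 1024. -->
    <xsl:value-of select="math:power(2, 10)" />
    <!-- By contrast, math:power('2', 10) is rejected with a type error,
         because a string does not match the declared xs:decimal. -->
  </xsl:template>

</xsl:stylesheet>
```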
34. Node Typing in XSLT 2.0
- In XPath 2.0, every element and attribute has a type
  - can select nodes based on their type
    - //attribute(@*, xs:date) selects all the attributes in the document that hold dates
- Similarly, in XSLT 2.0
  - can match nodes based on their type
    - attribute(@*, xs:date) matches all attributes that hold dates
  - can create elements and attributes of particular types
35. Creating Elements and Attributes
- Use xsl:type attribute to indicate type of element/attribute
  - can use this to type-annotate documents without schema validation

    <xsl:template match="@date" as="attribute(@date, xs:date)">
      <xsl:attribute name="date" type="xs:date"></xsl:attribute>
    </xsl:template>

    <xsl:template match="employee" as="element(person, xs:token)">
      <person xsl:type="xs:token">
        <xsl:value-of select="name" />
      </person>
    </xsl:template>
36. Importing Schemas
- Need to import schemas to use
  - user-defined types
  - substitution groups
- Import with <xsl:import-schema>
  - namespace identifies target namespace
  - schema-location locates schema
- Enables validation of result tree

    <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
                       schema-location="xhtml.xsd" />

    <xsl:template match="element(_inline)">
    </xsl:template>
37. Implications for XSLT 2.0 Use
- Easy to get errors from a stylesheet unless you're rigorous in keeping track of types
  - declare types of variables and parameters
  - cast elements/attributes to particular types
- Well-designed schemas become a useful tool
  - substitution groups and appropriate types help reduce number of templates
  - check whether the result conforms to the schema while generating it
38. Conclusions
- XSLT 2.0 introduces a lot of new features
- Many stylesheets can be simpler
  - multi-step processing with temporary trees
  - grouping using <xsl:for-each-group>
  - user-defined functions for repetitive code
- Stylesheet applications can be simpler
  - multiple result documents should reduce need for client-side scripting
- XSLT 2.0 expands into text parsing
- Using schemas/types well will make things easier; not using them will make things harder
39. For Details
- XPath 2.0 Working Draft
  - http://www.w3.org/TR/xpath20
- XSLT 2.0 Working Draft
  - http://www.w3.org/TR/xslt20
- Saxon implementation
  - http://saxon.sourceforge.net
- Comments please!
  - public-qt-comments@w3.org