Title: Introduction to XML Part 2
1 Introduction to XML Part 2
- Zdeněk Žabokrtský
- (using examples from www.kosek.cz)
2 Namespaces
- a single document may mix tags described in different places, e.g. in different DTDs:
- fragments of HTML-formatted text embedded in another XML document
- tags of transformation instructions mixed with tags of the transformed document, etc.
- the meaning of a tag must not be ambiguous; name clashes between elements and attributes from different schemas have to be resolved
- Namespaces in XML: W3C recommendation, 1999
- provides a mechanism for identifying the namespaces used in a document
- makes it possible to tell which namespace a given element or attribute belongs to
3 Namespaces, cont.
- a URL is used as the namespace identifier
- uniqueness
- expected informative value (more information about the namespace can be found at the URL)
- Caution: formally it is still only an identifier! Applications will not require internet access, and the URL does not even have to exist!
- fully qualified name (of an attribute/element)
- prefix:name
- introducing a prefix: the attribute xmlns:prefix
- if one namespace dominates, an empty prefix can be used
- <kniha xmlns="http://ufal.mff.cuni.cz/kniha" xmlns:bib="http://www.book.org/bibliography"> ... <kapitola> ... <bib:book><bib:author>..
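The effect of the default namespace and the bib prefix can be checked with a short script. This is only a sketch in Python's standard xml.etree.ElementTree module (the slides themselves use Perl later on); the document is modeled on the fragment above, and the author name "J. Novak" and the query prefixes "k" and "b" are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Document modeled on the slide; the element content is invented for illustration.
DOC = """<kniha xmlns="http://ufal.mff.cuni.cz/kniha"
       xmlns:bib="http://www.book.org/bibliography">
  <kapitola>
    <bib:book><bib:author>J. Novak</bib:author></bib:book>
  </kapitola>
</kniha>"""

root = ET.fromstring(DOC)

# ElementTree expands every name to "{namespace-URI}local-name".
print(root.tag)

# Prefixes used in queries are local to this mapping; only the URIs must match
# the ones declared in the document. Nothing is ever fetched from these URLs.
ns = {
    "k": "http://ufal.mff.cuni.cz/kniha",
    "b": "http://www.book.org/bibliography",
}
author = root.find("k:kapitola/b:book/b:author", ns)
print(author.tag, author.text)

# An unqualified name does not match an element in the default namespace.
unqualified = root.find("kapitola")
print(unqualified)
```

Note that the prefixes in the query ("k", "b") differ from those in the document: the namespace URI is what identifies an element, not the prefix.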
4 Navigation in XML
- Document addressing
- URL (Uniform Resource Locator)
- i.e. the usual protocol://computer:port/path/resource
- Navigation inside a document
- Using identifiers: relation between ID and IDREF
- XPath
- XLink, XPointer
5 XPath
- The result of an XPath query applied to a given XML document is the set of nodes of the document matching the query
- The XML document is understood as a tree structure containing nodes of three main types: elements, attributes, text
- XPath syntax: very similar to addressing in file systems
- Examples
- /book : root element named book
- /book/chapter : all elements named chapter in the root element book
- /book/* : all elements in the book root element
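The three example paths can be tried out with a small script. This is a sketch using Python's standard xml.etree.ElementTree, which implements only a subset of XPath and evaluates paths relative to an element, so each absolute path from the slide is written relative to the root element. The sample document (including the para and appendix elements) is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document; para and appendix are invented for this sketch.
BOOK = """<book>
  <chapter id="k1"><title>Intro...</title><para>bla bla</para></chapter>
  <chapter id="k2"><title>Concl...</title><para>bla bla</para></chapter>
  <appendix/>
</book>"""

book = ET.fromstring(BOOK)               # /book : the root element

chapters = book.findall("./chapter")     # /book/chapter
children = book.findall("./*")           # /book/*

print(book.tag)
print([c.get("id") for c in chapters])
print([c.tag for c in children])

# The three node types from the slide, seen on the first chapter:
first = chapters[0]
print(first.tag)                  # element
print(first.attrib)               # attributes
print(first.find("title").text)   # text
```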
6 XPath cont.
- Examples
- /*/*/*/para : elements para in the 4th level
- //chapter : elements chapter anywhere in the document
- //bold | //italic : all elements bold and italic, anywhere in the document
- /book/chapter[2] : second chapter
- /book/chapter[last()] : last chapter
- //chapter[@id='k2'] : chapter whose id attribute is k2
- ../@lang : attribute lang of the parent of the current node
- //*[count(para)=3] : all elements that contain exactly three elements para
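Several of these expressions can be tried directly, again as a sketch with Python's standard xml.etree.ElementTree. That module supports positional and attribute predicates but not the union operator, attribute selection, or count(), so those three are emulated in plain Python below. The sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document invented for this sketch.
BOOK = """<book lang="en">
  <chapter id="k1"><title>Intro</title><para>a</para><para>b</para><para>c</para></chapter>
  <chapter id="k2"><title>Concl</title><para>d</para><bold>e</bold></chapter>
  <chapter id="k3"><title>Index</title><italic>f</italic></chapter>
</book>"""
book = ET.fromstring(BOOK)

second = book.find("./chapter[2]")        # /book/chapter[2]
last = book.find("./chapter[last()]")     # /book/chapter[last()]
k2 = book.find(".//chapter[@id='k2']")    # //chapter[@id='k2']

# //bold | //italic : no union operator here, so concatenate two queries.
# (A real XPath union returns nodes in document order; this list does not guarantee it.)
styled = book.findall(".//bold") + book.findall(".//italic")

# ../@lang : elements keep no parent pointer, so build a child -> parent map.
parents = {child: parent for parent in book.iter() for child in parent}
lang = parents[second].get("lang")

# //*[count(para)=3] : count the direct para children of every element.
three_paras = [e.get("id") for e in book.iter() if len(e.findall("para")) == 3]

print(second.get("id"), last.get("id"), k2.get("id"))
print([e.tag for e in styled], lang, three_paras)
```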
7 Transformations of XML docs.
- motivation
- e.g. presenting the content of XML documents (users don't want to see the tags, they want the content to be nicely formatted...)
- style (style sheet)
- Relation between logical structure and presentation form
- XSL: eXtensible Stylesheet Language
- Description of transformations of XML docs
- XSLT: XSL Transformations
- The transformation is carried out by an XSLT processor
- xsltproc test.xsl test.xml > test.html
8 XSL
- Stylesheet declaration: an XML document using the xsl namespace
- <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
- declaration of templates
- </xsl:stylesheet>
- Style body: at least one transformation rule (called a template)
- matching expression for finding the nodes to which the template should be applied
- output: the found nodes will be substituted by the specified output
- <xsl:template match="matching_expression">
- output
- </xsl:template>
9 XSL - Example
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="title">
    <h1>
      <xsl:number/>. <xsl:value-of select="."/> (id <xsl:value-of select="../@id"/>)
    </h1>
  </xsl:template>
</xsl:stylesheet>

<book>
  <chapter id="k1">
    <title>Intro...</title>
    bla bla
  </chapter>
  <chapter id="k2">
    <title>Concl...</title>
    bla bla
  </chapter>
</book>
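To make the processing model behind the two templates more concrete, here is a rough Python imitation. It is not an XSLT processor: it is a sketch that mimics the rule for "/", the rule for title, and the default behaviour of passing text through and descending into children. The function names (title_rule, process, apply_templates, transform) are hypothetical, and the numbering done by xsl:number is deliberately left out.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

BOOK = """<book>
  <chapter id="k1"><title>Intro...</title>bla bla</chapter>
  <chapter id="k2"><title>Concl...</title>bla bla</chapter>
</book>"""

def title_rule(node, parent):
    # Counterpart of <xsl:template match="title">, without <xsl:number/>.
    parent_id = parent.get("id", "") if parent is not None else ""
    return f"<h1>{escape(node.text or '')} (id {escape(parent_id)})</h1>"

# Maps an element name to its template; elements without an entry get the default rule.
RULES = {"title": title_rule}

def process(node, parent):
    # Use the matching template if there is one, otherwise the default rule.
    rule = RULES.get(node.tag)
    return rule(node, parent) if rule else apply_templates(node)

def apply_templates(node):
    # Default rule: copy text through and process child elements in order.
    parts = [escape(node.text or "")]
    for child in node:
        parts.append(process(child, node))
        parts.append(escape(child.tail or ""))
    return "".join(parts)

def transform(xml_text):
    # Counterpart of <xsl:template match="/">: wrap the result in an HTML skeleton.
    root = ET.fromstring(xml_text)
    return "<html><head></head><body>" + process(root, None) + "</body></html>"

html = transform(BOOK)
print(html)
```

The chapter text ("bla bla") reaches the output even though no rule mentions it, which mirrors how text is copied when no template in the stylesheet matches it.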
10 Application interface for XML
- (API)
- SAX: Simple API for XML
- a document is read serially and its contents are reported as callbacks
- fast and efficient to implement, but user applications have to keep track of which part of the document is being processed
- DOM: Document Object Model
- allows for navigation of the entire document represented as a tree of objects
- language-neutral interface to an in-memory representation of an XML document
11 SAX Event Handlers in Perl
- Each type of event is processed by a procedure with a dedicated name
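The same pattern, one dedicated procedure per event type, can be sketched with Python's standard xml.sax module (used here instead of Perl only so that the example is self-contained). The handler method names startElement, endElement and characters come from that module; the class name PathTracker and the sample document are assumptions. The stack shows the bookkeeping an application has to do itself to know where in the document it currently is.

```python
import xml.sax

BOOK = b"""<book>
  <chapter id="k1"><title>Intro...</title>bla bla</chapter>
</book>"""

class PathTracker(xml.sax.ContentHandler):
    """Records every event together with the path of currently open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []    # names of the elements that are currently open
        self.events = []   # (kind, path, data) tuples in document order

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.events.append(("start", "/".join(self.stack), dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", "/".join(self.stack), None))
        self.stack.pop()

    def characters(self, content):
        # Whitespace-only chunks between elements are ignored in this sketch.
        if content.strip():
            self.events.append(("text", "/".join(self.stack), content.strip()))

handler = PathTracker()
xml.sax.parseString(BOOK, handler)
for event in handler.events:
    print(event)
```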
12 DOM
- The DOM presents an XML document as a tree structure (a node tree), with the elements, attributes, and text defined as nodes.
- The entire document is a document node
- Every XML tag is an element node
- The texts contained in the XML elements are text nodes
- Every XML attribute is an attribute node
- Comments are comment nodes
- Example
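The node kinds listed above can be observed by walking a small DOM tree. The following is a sketch using Python's standard xml.dom.minidom; the one-line sample document and the helper name count_nodes are made up for illustration. In this implementation attributes are not part of childNodes, so they are counted through each element's attributes map.

```python
from collections import Counter
from xml.dom import minidom, Node

# Written on one line so that no whitespace-only text nodes appear.
DOC = '<book><!-- draft --><chapter id="k1"><title>Intro...</title>bla bla</chapter></book>'

KIND = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}

def count_nodes(node, counts):
    """Recursively count the nodes of the tree by kind."""
    counts[KIND.get(node.nodeType, "other")] += 1
    if node.nodeType == Node.ELEMENT_NODE:
        # Attribute nodes hang off the element, outside childNodes.
        counts["attribute"] += node.attributes.length
    for child in node.childNodes:
        count_nodes(child, counts)
    return counts

counts = count_nodes(minidom.parseString(DOC), Counter())
print(dict(counts))
```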
13 DOM in Perl
<kniha>
  <kapitola id="k1">
    <nazev>Uvod</nazev>
    bla bla
  </kapitola>
  <kapitola id="k2">
    <nazev>Zaver</nazev>
    bla bla
  </kapitola>
</kniha>
- XML::DOM or XML::LibXML
- Example
#!/usr/bin/perl
use XML::DOM;    # also exports node type constants such as ELEMENT_NODE

my $parser = XML::DOM::Parser->new();
my $doc = $parser->parsefile('test.xml');

# Visit every <kapitola> element in the document.
foreach my $kap ($doc->getElementsByTagName('kapitola')) {
    print "Attribute id contains ".$kap->getAttribute("id")."\n";
    # Among the children, report only element nodes (text nodes are skipped).
    foreach my $child ($kap->getChildNodes) {
        my $type = $child->getNodeType();
        if ($type == ELEMENT_NODE) {
            print "Element ".$child->getTagName." contains ".
                $child->getFirstChild->getNodeValue."\n";
        }
    }
}
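For comparison, the same traversal can be written against Python's standard xml.dom.minidom, which exposes the same DOM method names. This is a sketch, not part of the original slides: the document is parsed from a string instead of test.xml so that the example is self-contained, and the function name describe is hypothetical.

```python
from xml.dom import minidom, Node

DOC = """<kniha>
  <kapitola id="k1"><nazev>Uvod</nazev>bla bla</kapitola>
  <kapitola id="k2"><nazev>Zaver</nazev>bla bla</kapitola>
</kniha>"""

def describe(xml_text):
    """Return the lines that the Perl script above would print."""
    doc = minidom.parseString(xml_text)
    lines = []
    for kap in doc.getElementsByTagName("kapitola"):
        lines.append("Attribute id contains " + kap.getAttribute("id"))
        for child in kap.childNodes:
            # Skip text nodes; like the Perl version, this assumes that
            # every child element starts with a text node.
            if child.nodeType == Node.ELEMENT_NODE:
                lines.append("Element " + child.tagName
                             + " contains " + child.firstChild.nodeValue)
    return lines

lines = describe(DOC)
print("\n".join(lines))
```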