Introduction to XML Part 2 - PowerPoint PPT Presentation

Provided by: ufalMf
1
Introduction to XML, Part 2
  • Zdeněk Žabokrtský
  • (using examples from www.kosek.cz)

2
Namespaces
  • a single document can mix tags described, for
    example, in different DTDs
  • fragments of HTML-formatted text embedded in
    another XML document
  • tags for transformation instructions mixed with
    tags of the transformed document, etc.
  • the meaning of a tag must not be ambiguous; name
    clashes between elements and attributes from
    different schemas have to be resolved
  • Namespaces in XML: W3C Recommendation, 1999
  • provides a mechanism for identifying the
    namespaces used in a document
  • makes it possible to identify which namespace a
    given element or attribute belongs to

3
Namespaces, cont.
  • a URL is used as the namespace identifier
  • uniqueness
  • expected informative value (more information
    about the namespace can be found at the URL)
  • Caution: formally it is still only an identifier!
    Applications will not require internet access,
    and the URL need not actually exist!
  • fully qualified name (of an attribute/element)
  • prefix:name
  • a prefix is introduced by the attribute
    xmlns:prefix
  • if one namespace dominates, an empty prefix can
    be used
  • <kniha xmlns="http://ufal.mff.cuni.cz/kniha"
    xmlns:bib="http://www.book.org/bibliography"> ...
    <kapitola> ... <bib:book><bib:author>..
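The prefix mechanism above can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch: the document below completes the slide's fragment with a made-up author value, and the prefix "b" used in the query is an arbitrary local choice, which shows that only the namespace URL matters, not the prefix.

```python
import xml.etree.ElementTree as ET

# Illustrative document; "Hypothetical Author" is invented for this sketch.
DOC = """<kniha xmlns="http://ufal.mff.cuni.cz/kniha"
       xmlns:bib="http://www.book.org/bibliography">
  <kapitola>
    <bib:book><bib:author>Hypothetical Author</bib:author></bib:book>
  </kapitola>
</kniha>"""

root = ET.fromstring(DOC)

# ElementTree expands every name to the form {namespace-URL}local-name.
qualified_names = [el.tag for el in root.iter()]

# The prefix used for searching is local to this mapping; it does not
# have to be "bib" as in the document.
ns = {"b": "http://www.book.org/bibliography"}
authors = [a.text for a in root.findall(".//b:author", ns)]

print(qualified_names)
print(authors)
```

Note that kniha and kapitola end up in the default (unprefixed) namespace, while book and author belong to the bibliography namespace.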

4
Navigation in XML
  • Document addressing
  • URL (Uniform Resource Locator)
  • i.e. the usual
    protocol://computer:port/path/resource
  • Navigation inside a document
  • Using identifiers: the relation between ID and
    IDREF
  • XPath
  • XLink, XPointer
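As a small illustration of the protocol://computer:port/path/resource pattern, the following sketch splits a URL into its parts using Python's standard urllib.parse module. The address itself is a made-up example, not one taken from the slides.

```python
from urllib.parse import urlsplit

# Hypothetical address used only for illustration.
parts = urlsplit("http://www.example.org:8080/docs/book.xml")

scheme = parts.scheme    # protocol
host = parts.hostname    # computer
port = parts.port        # port (an int, or None if absent)
path = parts.path        # path/resource

print(scheme, host, port, path)
```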

5
XPath
  • The result of an XPath query applied to a given
    XML document is the set of nodes of the document
    matching the query
  • An XML document is understood as a tree
    structure; the three most important node types
    are elements, attributes and text
  • XPath syntax is very similar to addressing in
    file systems
  • Examples
  • /book: the root element named book
  • /book/chapter: all elements named chapter in
    the root element book
  • /book/*: all elements in the book root element
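The examples above can be approximated in Python with xml.etree.ElementTree. This is a sketch under two assumptions: the sample document is invented, and ElementTree implements only a subset of XPath in which paths are relative to the element being searched, so the absolute path /book/chapter is written as "chapter" evaluated on the root element.

```python
import xml.etree.ElementTree as ET

# Invented sample document for this sketch.
DOC = """<book>
  <title>Sample</title>
  <chapter id="k1"><para>one</para></chapter>
  <chapter id="k2"><para>two</para><para>three</para></chapter>
</book>"""

root = ET.fromstring(DOC)            # the root element, i.e. /book
root_name = root.tag

chapters = root.findall("chapter")   # corresponds to /book/chapter
children = root.findall("*")         # corresponds to /book/*

chapter_ids = [c.get("id") for c in chapters]
child_tags = [c.tag for c in children]

print(root_name, chapter_ids, child_tags)
```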

6
XPath cont.
  • Examples
  • /*/*/*/para: para elements on the 4th level
  • //chapter: chapter elements anywhere in the
    document
  • //bold | //italic: all bold and italic elements,
    anywhere in the document
  • /book/chapter[2]: the second chapter
  • /book/chapter[last()]: the last chapter
  • //chapter[@id='k2']: the chapter whose id
    attribute equals 'k2'
  • ../@lang: attribute lang of the parent of the
    current node
  • //*[count(para)=3]: all elements that contain
    exactly three para elements
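Some of these queries can be tried with Python's xml.etree.ElementTree, which supports positional and attribute predicates but not the union operator or count(). In the sketch below (the sample document is invented) those two cases are emulated in plain Python, so it is an approximation and not a full XPath evaluation.

```python
import xml.etree.ElementTree as ET

# Invented sample document for this sketch.
DOC = """<book>
  <chapter id="k1"><para>a</para><para>b</para><para>c</para></chapter>
  <chapter id="k2"><para>d</para><bold>e</bold></chapter>
  <chapter id="k3"><italic>f</italic></chapter>
</book>"""

root = ET.fromstring(DOC)

# //chapter
anywhere = [c.get("id") for c in root.findall(".//chapter")]
# /book/chapter[2] and /book/chapter[last()]
second = root.find("chapter[2]").get("id")
last = root.find("chapter[last()]").get("id")
# //chapter[@id='k2']
by_id = root.find(".//chapter[@id='k2']")

# //bold | //italic: no union operator in ElementTree, so two searches
# are concatenated (unlike XPath, this does not restore document order).
emphasised = [e.text for e in root.findall(".//bold") + root.findall(".//italic")]

# //*[count(para)=3]: count() is not supported, so child para elements
# are counted by hand for every element in the tree.
three_paras = [e.get("id") for e in root.iter() if len(e.findall("para")) == 3]

print(anywhere, second, last, by_id.get("id"), emphasised, three_paras)
```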

7
Transformations of XML docs.
  • motivation
  • e.g. presenting the content of XML documents
    (users don't want to see the tags, they want the
    content to be nicely formatted...)
  • style (style sheet)
  • Relation between logical structure and
    presentation form
  • XSL: eXtensible Stylesheet Language
  • Description of transformations of XML documents
  • XSLT: XSL Transformations
  • The transformation is carried out by an XSLT
    processor
  • xsltproc test.xsl test.xml > test.html

8
XSL
  • Stylesheet declaration: an XML document using
    the xsl namespace
  • <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  • declaration of templates
  • </xsl:stylesheet>
  • Stylesheet body: at least one transformation
    rule (called a template)
  • matching expression: used for finding the nodes
    to which the template should be applied
  • output: the nodes found will be substituted by
    the specified output
  • <xsl:template match="matching_expression">
  • output
  • </xsl:template>

9
XSL - Example
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="title">
    <h1>
      <xsl:number/>. <xsl:value-of select="."/>
      (id <xsl:value-of select="../@id"/>)
    </h1>
  </xsl:template>
</xsl:stylesheet>

<book>
  <chapter id="k1">
    <title>Intro...</title>
    bla bla
  </chapter>
  <chapter id="k2">
    <title>Concl...</title>
    bla bla
  </chapter>
</book>

10
Application interfaces for XML (APIs)
  • SAX: Simple API for XML
  • a document is read serially and its contents are
    reported as callbacks
  • fast and efficient to implement, but user
    applications have to keep track of what part of
    the document is being processed
  • DOM: Document Object Model
  • allows for navigation of the entire document
    represented as a tree of objects
  • language-neutral interface to an in-memory
    representation of an XML document

11
SAX Event Handlers in Perl
  • Each type of event is processed by a procedure
    with a dedicated name
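The slide's Perl listing is not part of this transcript. As a stand-in, here is a minimal sketch of the same callback idea using Python's standard xml.sax module instead of Perl. The handler method names (startElement, endElement, characters) are those of Python's ContentHandler and may differ from the names used in the original Perl example. The class name, the sample document, and the path-tracking list are assumptions made for illustration.

```python
import xml.sax


class ChapterHandler(xml.sax.ContentHandler):
    """Records events; each event type has its own dedicated method."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.path = []  # the application itself tracks where it is

    def startElement(self, name, attrs):
        self.path.append(name)
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.path.pop()
        self.events.append(("end", name))

    def characters(self, content):
        # May be called several times for one text node; skip whitespace.
        if content.strip():
            self.events.append(("text", "/".join(self.path), content.strip()))


# Invented sample document for this sketch.
DOC = b'<book><chapter id="k1"><title>Intro</title></chapter></book>'

handler = ChapterHandler()
xml.sax.parseString(DOC, handler)

for event in handler.events:
    print(event)
```

The path list illustrates the point made earlier about SAX: the parser reports events one by one, and remembering the current position in the document is left to the application.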

12
DOM
  • The DOM presents an XML document as a
    tree-structure (a node tree), with the elements,
    attributes, and text defined as nodes.
  • The entire document is a document node
  • Every XML tag is an element node
  • The texts contained in the XML elements are text
    nodes
  • Every XML attribute is an attribute node
  • Comments are comment nodes
  • Example

13
DOM in Perl
<kniha>
  <kapitola id="k1">
    <nazev>Uvod</nazev>
    bla bla
  </kapitola>
  <kapitola id="k2">
    <nazev>Zaver</nazev>
    bla bla
  </kapitola>
</kniha>
  • XML::DOM or XML::LibXML
  • Example:
#!/usr/bin/perl
use XML::DOM;
my $parser = XML::DOM::Parser->new();
my $doc = $parser->parsefile('test.xml');
# visit every <kapitola> element in the document
foreach my $kap ($doc->getElementsByTagName('kapitola')) {
    print "Attribute id contains " . $kap->getAttribute("id") . "\n";
    # child nodes include text nodes, so filter for elements
    foreach my $child ($kap->getChildNodes) {
        my $type = $child->getNodeType();
        if ($type == ELEMENT_NODE) {
            print "Element " . $child->getTagName . " contains " .
                $child->getFirstChild->getNodeValue . "\n";
        }
    }
}
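For readers without the Perl modules installed, the same traversal can be sketched with Python's standard xml.dom.minidom. This is an approximate equivalent and not part of the original slides. The document is parsed from a string here instead of from test.xml so that the sketch is self-contained.

```python
from xml.dom import minidom, Node

# Same document as on the slide, inlined instead of read from test.xml.
DOC = """<kniha>
  <kapitola id="k1"><nazev>Uvod</nazev> bla bla </kapitola>
  <kapitola id="k2"><nazev>Zaver</nazev> bla bla </kapitola>
</kniha>"""

doc = minidom.parseString(DOC)
lines = []

for kap in doc.getElementsByTagName("kapitola"):
    lines.append("Attribute id contains " + kap.getAttribute("id"))
    # childNodes also contains text nodes, so filter for element nodes.
    for child in kap.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            lines.append("Element " + child.tagName + " contains "
                         + child.firstChild.nodeValue)

print("\n".join(lines))
```

The method and property names mirror the language-neutral DOM interface, which is why the Python and Perl versions look nearly identical.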