XML Tools - PowerPoint PPT Presentation

Slides: 19
Provided by: lambd
Learn more at: https://lambda.uta.edu

Transcript and Presenter's Notes


1
XML Tools
  • Leonidas Fegaras

2
XML Processing
  • Diagram: XML document -> document parser
    (well-formedness checks, reference expansion) ->
    XML infoset -> document validator (using a DTD or
    XML schema) -> XML infoset (annotated) ->
    application. A storage system also appears in the
    diagram.

3
DOM
  • The Document Object Model (DOM) is a platform-
    and language-neutral interface that allows
    programs and scripts to dynamically access and
    update the content and structure of XML
    documents. The following is part of the DOM
    interface:

    public interface Node {
        public String getNodeName ();
        public String getNodeValue ();
        public NodeList getChildNodes ();
        public NamedNodeMap getAttributes ();
    }
    public interface Element extends Node {
        public NodeList getElementsByTagName ( String name );
    }
    public interface Document extends Node {
        public Element getDocumentElement ();
    }
    public interface NodeList {
        public int getLength ();
        public Node item ( int index );
    }
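
To make the interface above concrete, here is a minimal sketch that parses a small in-memory document and walks it with getChildNodes, getNodeName, and getElementsByTagName. The class name DomWalk, its helper methods, and the sample XML are illustrative assumptions, not part of the slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class DomWalk {
    // Parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Collect the tag names of the element children of a node, in document order.
    static List<String> childElementNames(Node parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node c = children.item(i);
            if (c.getNodeType() == Node.ELEMENT_NODE)   // skip text nodes
                names.add(c.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse("<department><deptname>CS</deptname><gradstudent/></department>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getNodeName());
        System.out.println(childElementNames(root));
        System.out.println(root.getElementsByTagName("deptname").getLength());
    }
}
```

Note that getChildNodes also returns text nodes, which is why the helper filters on the node type.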

4
DOM Example
    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;

    class Test {
        public static void main ( String[] args ) throws Exception {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new File("depts.xml"));
            NodeList nodes = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node n = nodes.item(i);
                NodeList ndl = n.getChildNodes();
                for (int k = 0; k < ndl.getLength(); k++) {
                    Node m = ndl.item(k);
                    // strings are compared with equals(), not ==
                    if ( m.getNodeName().equals("dept")
                         && m.getFirstChild() != null
                         && "cse".equals(m.getFirstChild().getNodeValue()) ) {
                        NodeList ncl = ((Element) m).getElementsByTagName("tel");
                        for (int j = 0; j < ncl.getLength(); j++)
                            ;   // the loop body is cut off in the transcript
                    }
                }
            }
        }
    }

5
Better Programming
    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import java.util.Vector;

    class Sequence extends Vector {
        Sequence () { super(); }

        Sequence ( String filename ) throws Exception {
            super();
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new File(filename));
            add((Object) doc.getDocumentElement());
        }

        // the children of every node in this sequence that are named tagname
        Sequence child ( String tagname ) {
            Sequence result = new Sequence();
            for (int i = 0; i < size(); i++) {
                Node n = (Node) elementAt(i);
                NodeList c = n.getChildNodes();
                for (int k = 0; k < c.getLength(); k++)
                    if (c.item(k).getNodeName().equals(tagname))
                        result.add((Object) c.item(k));
            }
            return result;
        }

        void print () {
            for (int i = 0; i < size(); i++)
                System.out.println(elementAt(i).toString());
        }
    }

    class DOM {
        public static void main ( String[] args ) throws Exception {
            (new Sequence("cs.xml")).child("gradstudent").child("name").print();
        }
    }

6
SAX
  • SAX is the Simple API for XML. It lets you
    process a document as it is being read (in
    contrast to DOM, which requires the entire
    document to be read before it takes any action).
  • The SAX API is event based.
  • The XML parser sends events, such as the start or
    the end of an element, to an event handler, which
    processes the information.
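
As a small illustration of the event-based style described above, the following sketch counts elements while the document is being read, without ever building a tree. The class name ElementCounter, the method countElements, and the sample input are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, used only for illustration.
class ElementCounter extends DefaultHandler {
    int count = 0;   // number of start-element events seen so far

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        count++;     // called by the parser while the input is still being read
    }

    // Count the elements in an XML string without building a DOM tree.
    static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter handler = new ElementCounter();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.count;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The sample has four elements: a, b, b, c.
        System.out.println(countElements("<a><b/><b><c/></b></a>"));
    }
}
```

The handler keeps only a counter, so memory use does not grow with the size of the document.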

7
Parser Events
  • Receive notification of the beginning of a
    document:
        void startDocument ()
  • Receive notification of the end of a document:
        void endDocument ()
  • Receive notification of the beginning of an
    element:
        void startElement ( String namespace, String localName,
                            String qName, Attributes atts )
  • Receive notification of the end of an element:
        void endElement ( String namespace, String localName,
                          String qName )
  • Receive notification of character data:
        void characters ( char[] ch, int start, int length )
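
The callbacks listed above can be observed with a handler that simply records each event as it arrives. This is a sketch under stated assumptions: the class name EventTrace, the abbreviations it records (SD, SE, C, EE, ED), and the choice to skip whitespace-only character data are all illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that logs every parser event it receives.
class EventTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("SD"); }
    @Override public void endDocument()   { events.add("ED"); }

    @Override
    public void startElement(String namespace, String localName, String qName, Attributes atts) {
        events.add("SE " + qName);
    }

    @Override
    public void endElement(String namespace, String localName, String qName) {
        events.add("EE " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice ch[start .. start+length) belongs to this event.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty())
            events.add("C " + text);
    }

    // Parse an XML string and return the recorded events in order.
    static List<String> trace(String xml) {
        try {
            EventTrace handler = new EventTrace();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String e : trace("<a><b>hi</b></a>"))
            System.out.println(e);
    }
}
```

A parser is allowed to deliver one run of text in several characters calls, so a handler that needs the whole text should accumulate it until the matching endElement.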

8
SAX Example: a Printer
    import java.io.FileReader;
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.*;

    class Printer extends DefaultHandler {
        public Printer () { super(); }
        public void startDocument () {}
        public void endDocument () { System.out.println(); }
        public void startElement ( String uri, String name,
                                   String tag, Attributes atts ) {
            System.out.print("<" + tag + ">");
        }
        public void endElement ( String uri, String name, String tag ) {
            System.out.print("</" + tag + ">");
        }
        public void characters ( char[] text, int start, int length ) {
            System.out.print(new String(text,start,length));
        }
    }

9
The Child Handler
    class Child extends DefaultHandler {
        DefaultHandler next;   // the next handler in the pipeline
        String ptag;           // the tagname of the child
        boolean keep;          // are we keeping or skipping events?
        short level;           // the depth level of the current element

        public Child ( String s, DefaultHandler n ) {
            super();
            next = n; ptag = s;
            keep = false; level = 0;
        }
        public void startDocument () throws SAXException {
            next.startDocument();
        }
        public void endDocument () throws SAXException {
            next.endDocument();
        }
        // the class continues on the next slide

10
The Child Handler (cont.)
        public void startElement ( String nm, String ln, String qn,
                                   Attributes a ) throws SAXException {
            if (level++ == 1)
                keep = ptag.equals(qn);
            if (keep)
                next.startElement(nm,ln,qn,a);
        }
        public void endElement ( String nm, String ln, String qn )
                throws SAXException {
            if (keep)
                next.endElement(nm,ln,qn);
            if (--level == 1)
                keep = false;
        }
        public void characters ( char[] text, int start, int length )
                throws SAXException {
            if (keep)
                next.characters(text,start,length);
        }
    }   // end of class Child

11
Forming the Pipeline
    class SAX {
        public static void main ( String[] args ) throws Exception {
            SAXParserFactory pf = SAXParserFactory.newInstance();
            SAXParser parser = pf.newSAXParser();
            DefaultHandler handler
                = new Child("gradstudent",
                      new Child("name",
                          new Printer()));
            parser.parse(new InputSource(new FileReader("cs.xml")),
                         handler);
        }
    }

  • Diagram: SAX parser -> Child:gradstudent ->
    Child:name -> Printer

12
Example
  • Input Stream:
        <department>
          <deptname>
            Computer Science
          </deptname>
          <gradstudent>
            <name>
              <lastname>
                Smith
              </lastname>
              <firstname>
                John
              </firstname>
            </name>
          </gradstudent>
          ...
        </department>
  • SAX Events (SD = start document, SE = start
    element, C = characters, EE = end element, ED =
    end document):
        SD, SE department, SE deptname, C Computer Science,
        EE deptname, SE gradstudent, SE name, SE lastname,
        C Smith, EE lastname, SE firstname, C John,
        EE firstname, EE name, EE gradstudent, ...,
        EE department, ED
  • Diagram: the event stream passes through
    Child gradstudent, then Child name, then Printer.

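
The pipeline in this example can be tried on an in-memory document. The following is a compact sketch, not the code from the slides: the class name PipelineDemo, the Collector stage (which gathers output in a buffer instead of printing it), and the whitespace-free sample input are assumptions made so that the result is easy to inspect.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical wrapper class, used only for illustration.
class PipelineDemo {
    // Last stage: appends every event it receives to a buffer as XML text.
    static class Collector extends DefaultHandler {
        final StringBuilder out = new StringBuilder();
        @Override public void startElement(String u, String l, String q, Attributes a) {
            out.append('<').append(q).append('>');
        }
        @Override public void endElement(String u, String l, String q) {
            out.append("</").append(q).append('>');
        }
        @Override public void characters(char[] t, int s, int n) {
            out.append(t, s, n);
        }
    }

    // Filter stage: forwards only the subtrees of depth-1 elements named ptag.
    static class Child extends DefaultHandler {
        final DefaultHandler next;
        final String ptag;
        boolean keep = false;   // are we forwarding events right now?
        int level = 0;          // depth of the current element
        Child(String ptag, DefaultHandler next) { this.ptag = ptag; this.next = next; }

        @Override public void startElement(String u, String l, String q, Attributes a)
                throws SAXException {
            if (level++ == 1) keep = ptag.equals(q);   // decide at the child level
            if (keep) next.startElement(u, l, q, a);
        }
        @Override public void endElement(String u, String l, String q) throws SAXException {
            if (keep) next.endElement(u, l, q);
            if (--level == 1) keep = false;            // the child subtree has ended
        }
        @Override public void characters(char[] t, int s, int n) throws SAXException {
            if (keep) next.characters(t, s, n);
        }
    }

    // Run the two-stage filter over an XML string and return what reaches the end.
    static String run(String xml) {
        try {
            Collector sink = new Collector();
            DefaultHandler handler = new Child("gradstudent", new Child("name", sink));
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return sink.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<department><deptname>Computer Science</deptname>"
            + "<gradstudent><name><lastname>Smith</lastname><firstname>John</firstname>"
            + "</name></gradstudent></department>"));
    }
}
```

Each Child stage drops its own matching element and passes on only what is inside it, so the first stage removes the department level and the second removes the gradstudent level.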
13
XSL Transformation
  • A stylesheet specification language for
    converting XML documents into various forms (XML,
    HTML, plain text, etc.).
  • Can transform each XML element into another
    element, add new elements to the output file,
    or remove elements.
  • Can rearrange and sort elements, and test and
    make decisions about which elements to display.
  • Based on XPath.

    <xsl:stylesheet version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <students>
          <xsl:copy-of select="//student/name"/>
        </students>
      </xsl:template>
    </xsl:stylesheet>
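
A stylesheet of this kind can be applied from Java with the standard javax.xml.transform API. The sketch below is illustrative: the class name CopyNames, the helper transform, and the sample input are assumptions, and the stylesheet wraps the literal students element in a template matching the document root so that it is a complete stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for illustration.
class CopyNames {
    // Copies every //student/name element into a new <students> element.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'>"
      + "<students><xsl:copy-of select='//student/name'/></students>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply a stylesheet (given as a string) to an XML string.
    static String transform(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<school><student><name>Ann</name><age>20</age></student>"
                   + "<student><name>Bob</name></student></school>";
        System.out.println(transform(XSL, xml));
    }
}
```

Only the name elements are copied; the age element is not selected and so does not appear in the result.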

14
XSLT Templates
  • XSLT uses XPath to define the parts of the source
    document that match one or more predefined
    templates.
  • When a match is found, XSLT transforms the
    matching part of the source document into the
    result document.
  • Parts of the source document that match no
    explicit template are handled by the default
    templates, so only their text content reaches
    the result document.
  • Form:
        <xsl:template match="XPath expression">
        </xsl:template>
  • The default (implicit) templates visit all nodes
    and strip out all tags:
        <xsl:template match="*|/">
          <xsl:apply-templates/>
        </xsl:template>
        <xsl:template match="text()|@*">
          <xsl:value-of select="."/>
        </xsl:template>
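
The effect of the default templates can be seen by running a stylesheet that defines no templates at all. This is a sketch under assumptions: the class name DefaultRules, the helper textOnly, and the sample input are illustrative, and an xsl:output element is added so that the result is produced as plain text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for illustration.
class DefaultRules {
    // A stylesheet with no templates: only the built-in rules apply.
    static final String EMPTY_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "</xsl:stylesheet>";

    // Transform an XML string using only the built-in template rules.
    static String textOnly(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(EMPTY_XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Tags are stripped; the text of b and c is what remains.
        System.out.println(textOnly("<a k='v'><b>x</b><c>y</c></a>"));
    }
}
```

The attribute value does not appear in the output: the built-in rule for elements applies templates to child nodes only, and attributes are not children.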

15
Other XSLT Elements
  • <xsl:value-of select="XPath expression"/>
    Select the value of an XML element and add it to
    the output stream of the transformation, e.g.
    <xsl:value-of select="//books/book/author"/>.
  • <xsl:copy-of select="XPath expression"/>
    Copy the entire XML element to the output stream
    of the transformation.
  • <xsl:apply-templates select="XPath expression"/>
    Apply the template rules to the elements selected
    by the XPath expression.
  • <xsl:element name="{XPath expression}">
    </xsl:element>
    Add an element to the output with a tag name
    derived from the XPath expression.
  • Example:
        <xsl:stylesheet version="1.0"
              xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="employee">
            <b> <xsl:apply-templates select="node()"/> </b>
          </xsl:template>
          <xsl:template match="surname">
            <i> <xsl:value-of select="."/> </i>
          </xsl:template>
        </xsl:stylesheet>

16
Copy the Entire Document
    <xsl:stylesheet version="1.0"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <xsl:apply-templates/>
      </xsl:template>
      <xsl:template match="text()">
        <xsl:value-of select="."/>
      </xsl:template>
      <xsl:template match="*">
        <xsl:element name="{name(.)}">
          <xsl:apply-templates/>
        </xsl:element>
      </xsl:template>
    </xsl:stylesheet>
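
To see what this stylesheet does, it can be run on a small input. The sketch below assumes the reading of the slide in which the third template matches every element and rebuilds it with xsl:element; the class name CopyDocument, the helper copy, and the sample input are illustrative.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for illustration.
class CopyDocument {
    // Rebuilds every element under its own name and copies every text node.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'><xsl:apply-templates/></xsl:template>"
      + "<xsl:template match='text()'><xsl:value-of select='.'/></xsl:template>"
      + "<xsl:template match='*'>"
      + "<xsl:element name='{name(.)}'><xsl:apply-templates/></xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the copying stylesheet to an XML string.
    static String copy(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copy("<a id='1'><b>x</b><c/></a>"));
    }
}
```

Elements and text are reproduced, but attributes are not, because apply-templates without a select visits child nodes only. A stylesheet that should also keep attributes would need a template for them.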

17
More on XSLT
  • Conflict resolution: more specific templates
    override more general templates. Templates are
    assigned default priorities, but these can be
    overridden using priority="n" in a template.
  • Modes can be used to group templates together. A
    template with no mode belongs to the empty mode.
        <xsl:template match="..." mode="A">
          <xsl:apply-templates mode="B"/>
        </xsl:template>
  • Conditional and loop statements:
        <xsl:if test="XPath predicate"> body </xsl:if>
        <xsl:for-each select="XPath"> body </xsl:for-each>
  • Variables can be used to name data:
        <xsl:variable name="x"> value </xsl:variable>
  • Variables are used as $x in XPaths.
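
The following sketch combines xsl:variable, xsl:for-each, and xsl:if in one small stylesheet and runs it from Java. The class name ListItems, the helper join, the element names list and item, and the use of the select attribute to give the variable its value are assumptions made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for illustration.
class ListItems {
    // Prints the items separated by commas, followed by their count.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:variable name='n' select='count(//item)'/>"        // name the count
      + "<xsl:for-each select='//item'>"                          // loop over the items
      + "<xsl:if test='position() != 1'>,</xsl:if>"               // comma before all but the first
      + "<xsl:value-of select='.'/>"
      + "</xsl:for-each>"
      + "<xsl:text> (</xsl:text><xsl:value-of select='$n'/><xsl:text>)</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the stylesheet to an XML string and return the text output.
    static String join(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(join("<list><item>a</item><item>b</item><item>c</item></list>"));
    }
}
```

The variable is bound once and then read as $n inside an XPath expression, as described in the last bullet above.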

18
Using XSLT
    import javax.xml.parsers.*;
    import org.xml.sax.*;
    import org.w3c.dom.*;
    import javax.xml.transform.*;
    import javax.xml.transform.dom.*;
    import javax.xml.transform.stream.*;
    import java.io.*;

    class XSLT {
        public static void main ( String[] argv ) throws Exception {
            File stylesheet = new File("x.xsl");
            File xmlfile = new File("a.xml");
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document document = db.parse(xmlfile);
            StreamSource stylesource = new StreamSource(stylesheet);
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer(stylesource);
            DOMSource source = new DOMSource(document);
            // the transcript ends here; the rest of the slide is missing
        }
    }
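
The transcript stops before the transformation is actually run. One plausible way to finish, using only standard javax.xml.transform calls, is to pass the DOMSource and a StreamResult to Transformer.transform. The sketch below is an assumption about the missing step, not the original slide: it reads both inputs from strings instead of files, and the class name XsltFromDom and the helper apply are illustrative.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for illustration.
class XsltFromDom {
    // Parse xml into a DOM tree, then transform that tree with the stylesheet xsl.
    static String apply(String xsl, String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);   // recommended when the tree is fed to XSLT
            Document document = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            // Assumed final step: write the result of the transformation to a stream.
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xsl =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:template match='employee'><b><xsl:apply-templates select='node()'/></b></xsl:template>"
          + "<xsl:template match='surname'><i><xsl:value-of select='.'/></i></xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(apply(xsl, "<employee><surname>Smith</surname></employee>"));
    }
}
```

To print to the console instead of a string, new StreamResult(System.out) can be used as the result.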