XML Overview - PowerPoint PPT Presentation

1
XML Overview
CSc 335 Object Oriented Programming Design
  • Jerod Wilkerson

2
Contents
  • Introduction
  • Purpose/Objectives
  • Background
  • Overview
  • XML Example
  • Uses of XML
  • XML Basics
  • Elements
  • Attributes
  • Comments
  • Well-formed XML
  • Validation
  • DTDs
  • Schemas
  • The DOCTYPE Declaration
  • Parsing
  • XML Parsers
  • Parsing APIs
  • DOM Parsing

3
Purpose/Objectives
  • Any Java developer will use XML
  • Becoming more important
  • Explain motivation for XML
  • Provide an overview and basic understanding
  • Provide a starting point for learning additional
    XML related topics
  • Teach what you need to know for programming
    assignment
  • Can only scratch the surface

4
Background
  • HTML
      • Application of Standard Generalized Markup Language
        (SGML)
      • Document language that combines data with
        display formatting information
      • Used for document display/publishing
      • Combination of data and formatting info limits
        usefulness
      • Browser-specific interpretations of formatting
        tags
  • Cascading Style Sheets (CSS)
      • Allowed the separation of formatting information
        from HTML documents
      • Allowed web pages and web sites to be more
        flexible and easier to maintain

5
Background (cont.)
  • Creation of CSS allowed complete separation of
    data and formatting
  • XML is the result
  • Didn't happen by accident (two-part plan of the W3C
    to fix HTML and create a more usable SGML)
  • XML is a subset of SGML (not an application of
    it)
  • Much more flexible (not limited to web-display)
  • Allows anyone to easily define their own
    application of XML
  • HTML is like an application, XML is like a
    language

6
Overview: What is XML?
  • A markup language for applying structure to data
  • A subset of SGML
  • Much more flexible than HTML
  • Describes data without specifying a particular
    meaning
  • Not limited to predefined tags
  • Human readable
  • Machine readable

7
Example Text
  • Taken (and modified) from XML: A Primer, 3rd Edition,
    by Simon St.Laurent

12729 Maple 1x1x2 4.25
12829 Oak 1x1x2 5.75
13029 Pine 1x1x2 2.00
8
Example HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>Product List</title>
  </head>
  <body>
    The following products are available
    <ul>
      <li>12729 Maple 1x1x2 <b>4.25</b></li>
      <li>12829 Oak 1x1x2 <b>5.75</b></li>
      <li>13029 Pine 1x1x2 <b>2.00</b></li>
    </ul>
  </body>
</html>
9
Example HTML
10
Example XML
<product-list>
  <product>
    <id>12729</id>
    <description>Maple 1x1x2</description>
    <price>4.25</price>
  </product>
  <product>
    <id>12829</id>
    <description>Oak 1x1x2</description>
    <price>5.75</price>
  </product>
  <product>
    <id>13029</id>
    <description>Pine 1x1x2</description>
    <price>2.00</price>
  </product>
</product-list>
11
Example XML
<product-list>
  <product>
    <id>12729</id>
    <description>
      <item>Cut Lumber</item>
      <species>Maple</species>
      <height><inches>1</inches></height>
      <width><inches>1</inches></width>
      <length><feet>2</feet></length>
    </description>
    <price>4.25</price>
  </product>
  <product>
  </product>
</product-list>
12
Example XML with Attributes
<product-list>
  <product>
    <id>12729</id>
    <description>
      <item>Cut Lumber</item>
      <species>Maple</species>
      <height unit="inches">1</height>
      <width unit="inches">1</width>
      <length unit="feet">2</length>
    </description>
    <price>4.25</price>
  </product>
  <product>
  </product>
</product-list>
13
Uses of XML
  • Data Exchange
      • Machine to Machine
      • Human to Machine
      • Machine to Human
      • Human to Human
  • Data Storage
  • Multiple Use of Same Document/Data

14
Specific Uses of XML
  • EDI Replacement
  • Web Services (e.g. RPC format over protocols like
    HTTP)
  • Data Display
      • Web (XHTML, XML + XSLT)
      • Multiple devices, same data (formatting not
        embedded)
  • Data Storage Format (e.g. XML databases)
  • Configuration Files
  • Many other uses

15
XML Details: Elements
  • XML Tags
  • Main Building Block of XML Documents
  • Opening and Closing Tag Required
      • <my-tag>Some Data</my-tag>
  • Can Combine Opening and Closing Tags (called
    empty tags)
      • <br/>

16
XML Details: Attributes
  • Used to Specify Additional Details about Element
    Data
  • Can use in place of or in addition to element
    data
      • <my-tag attribute="value">Text</my-tag>
      • <my-tag data="Text"/>

17
XML Details - Comments
<!-- This is a comment. Comments are ignored by XML parsers -->
<product-list>
  <product>
    <id>12729</id>
    <description>
      <item>Cut Lumber</item>
      <species>Maple</species>
      <!-- This is another comment -->
      <height unit="inches">1</height>
      <width unit="inches">1</width>
      <length unit="feet">2</length>
    </description>
    <price>4.25</price>
  </product>
  <product>
  </product>
</product-list>

18
Well-Formed XML Documents
  • Every start-tag must have a matching end-tag
  • Tags cannot overlap (strict hierarchical tree
    structure, one parent per tag)
  • Each document has exactly one root element
  • Other rules such as allowed characters in tag
    names
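
The rules above can be checked mechanically, because a conforming parser rejects any document that breaks them. The following is a minimal sketch in Java using the standard JAXP DocumentBuilder; the class name WellFormedCheck and the sample strings are illustrative assumptions, not part of the original slides.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only.
final class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML (no validation). */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow problems instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness rule was violated
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b>text</b></a>")); // matching, nested tags
        System.out.println(isWellFormed("<a><b>text</a></b>")); // overlapping tags
        System.out.println(isWellFormed("<a/><b/>"));           // two root elements
        System.out.println(isWellFormed("<a><b>text</a>"));     // missing end-tag
    }
}
```

Only the first sample follows all three rules; the other three each break one of them, so the parser reports a fatal error and the method returns false.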

19
Valid XML Documents: DTDs
  • Document Type Definition (DTD)
  • Specifies a document type (the expected structure)
    for the document
  • Inherited from SGML
  • Allows XML author to specify validation rules for
    a document
  • Allows tools or programs to verify conformance to
    expected structure, format, naming, etc.
  • Simple
  • Widespread Use
  • Declared in the document with a <!DOCTYPE ...>
    declaration
  • Lacks support for data typing of element or
    attribute content
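
As a rough illustration of DTD validation, the sketch below turns validation on in a JAXP DocumentBuilderFactory and collects the validity errors the parser reports. The DTD is a hypothetical one written for this example and embedded in the document as an internal subset; the class name DtdValidation is also an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only.
final class DtdValidation {

    // Assumed DTD: a product must contain an id followed by a price.
    static final String DTD =
          "<!DOCTYPE product [\n"
        + "  <!ELEMENT product (id, price)>\n"
        + "  <!ELEMENT id (#PCDATA)>\n"
        + "  <!ELEMENT price (#PCDATA)>\n"
        + "]>\n";

    /** Parses with DTD validation on and returns the validity errors reported. */
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the DTD in the DOCTYPE
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Matches the DTD: no errors expected.
        System.out.println(validate(DTD + "<product><id>12729</id><price>4.25</price></product>"));
        // Missing price: the parser should report a validity error.
        System.out.println(validate(DTD + "<product><id>12729</id></product>"));
        // "cheap" is not a number, but a DTD cannot express that.
        System.out.println(validate(DTD + "<product><id>12729</id><price>cheap</price></product>"));
    }
}
```

The third call illustrates the last bullet: the structure is correct, so the document is valid even though the price is not numeric.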

20
Valid XML Documents: XML Schemas
  • XML Schema
  • Similar to DTDs
  • More Complex
  • More Powerful
  • Newer than DTDs
  • Not as Widely Used
  • Heavily used within certain domains (such as
    web-services)
  • Quickly Increasing in Popularity

21
Example Document Prolog
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE config SYSTEM "rules-config.dtd">
<config>
</config>
22
DOCTYPE Declaration: System ID
  • <!DOCTYPE config SYSTEM "rules-config.dtd">
  • The DTD's root element is config
  • SYSTEM denotes a private DTD
  • Can be found locally using the name
    rules-config.dtd
  • Location of DTD defaults to document-relative
    file location
  • Can be an absolute address or URI
  • Can write code to change that (e.g. make it
    CLASSPATH-relative)

23
DOCTYPE Declaration: Public ID
  • <!DOCTYPE struts-config PUBLIC
      "-//Apache Software Foundation//DTD Struts Configuration 1.1//EN"
      "http://jakarta.apache.org/struts/dtds/struts-config_1_1.dtd">
  • First string following PUBLIC is the public
    identifier (human- and machine-understandable
    description of the DTD)
  • Last string is a URI specifying the DTD location
24
XML Parsers
  • Prewritten code for parsing text out of XML
    documents (elements and attributes)
  • Handle validation against DTD or Schema
  • Drop into XML enabled applications as Jar files
  • Provide API for accessing document content
  • Eliminate the need to write low-level text
    parsing code
  • Apache Xerces: popular open-source parser

25
XML Parsing
  • Simple API for XML (SAX)
      • Event-based API
      • Parser invokes callback (listener) methods for
        start-tags, end-tags, text data, etc.
      • Memory efficient (useful for large XML documents)
  • Document Object Model (DOM)
      • Tree-based API
      • Parser parses (and optionally validates) entire
        document
      • Method provided to get the root element
      • All nodes of tree accessible from root element
      • Easier to use than SAX
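
To make the event-based model concrete, here is a minimal SAX sketch that records each callback the parser makes. It uses the standard SAXParserFactory and DefaultHandler classes; the class name SaxEvents and the log format are assumptions chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
final class SaxEvents {

    /** Returns the events the parser reported, in document order. */
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            // Text may arrive in several chunks, so buffer it until the next tag.
            private void flushText() {
                String t = text.toString().trim();
                if (!t.isEmpty()) {
                    log.add("text " + t);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                flushText();
                log.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                log.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events("<product><id>12729</id><price>4.25</price></product>")) {
            System.out.println(event);
        }
    }
}
```

The handler never holds the whole document, only the current run of text, which is why SAX suits large inputs. The cost is that the application must track its own position in the document.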

26
Java-Based DOM Parsing
  • javax.xml and javax.xml.parsers packages
      • Contain Java wrapper classes for commercial and
        open-source XML parsers
      • Main classes: DocumentBuilderFactory and
        DocumentBuilder
  • org.w3c.dom package
      • Part of J2SE since 1.4
      • Contains classes and interfaces used to implement
        the DOM API
      • Main interfaces: Document, Element
  • org.xml.sax package
      • Some interfaces used even when using DOM
      • Main types: InputSource, EntityResolver,
        ErrorHandler

27
DOM Example: Parsing
protected final Element getRootElement()
        throws ParserConfigurationException, SAXException, IOException {
    if (this.rootElement == null) {
        // Create an InputSource for the configuration file with the
        // SystemId set to match the file name so the entity resolver
        // callback method can find the file.
        InputSource inputSource = resolveEntity(null, configFileName);

        // Get the document builder factory
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);

        // Get the document builder and parse the document
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(this);
        builder.setErrorHandler(this);
        Document doc = builder.parse(inputSource);

        // Get the root element from the DOM tree
        this.rootElement = doc.getDocumentElement();
    }
    return this.rootElement;
}
28
DOM Example: Parsing
  • From root element
      • Use methods of org.w3c.dom.Element (including
        methods inherited from Node) to drill down into
        the document.
  • Useful methods
      • NodeList getElementsByTagName(String)
      • NodeList getChildNodes()
      • String getAttribute(String)
      • Several others
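
The methods listed above can be combined to drill down from the root element. The following sketch assumes a small product list modeled on the earlier attribute example; the class name DomNavigation and the output format are illustrative choices. It also uses Node.getTextContent(), one of the "several others".

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class DomNavigation {

    // Assumed sample document, modeled on the slides' product list.
    static final String SAMPLE =
          "<product-list>"
        + "<product><id>12729</id><length unit=\"feet\">2</length></product>"
        + "<product><id>12829</id><length unit=\"inches\">30</length></product>"
        + "</product-list>";

    /** Parses the text and returns the root element of the DOM tree. */
    static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns one "id: length unit" line per product element. */
    static List<String> describe(String xml) {
        List<String> lines = new ArrayList<>();
        NodeList products = rootOf(xml).getElementsByTagName("product");
        for (int i = 0; i < products.getLength(); i++) {
            Element product = (Element) products.item(i);
            // getElementsByTagName searches all descendants of this element.
            String id = product.getElementsByTagName("id").item(0).getTextContent();
            Element length = (Element) product.getElementsByTagName("length").item(0);
            lines.add(id + ": " + length.getTextContent() + " " + length.getAttribute("unit"));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

One thing to watch for with getChildNodes(): in a document formatted with indentation, the whitespace between tags shows up as text nodes, so the list usually contains more than just child elements.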

29
DOM Example: Resolving Entities
  • Allows programmer to specify how external
    entities (DTDs, Schemas, XML Documents) are
    accessed.
  • Can override the default behavior of resolving
    paths relative to the XML document's location.
  • Implement EntityResolver interface and create a
    resolveEntity(String, String) method.
  • Useful for reading documents and DTDs off the
    CLASSPATH.
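
A CLASSPATH-based resolver could look roughly like the sketch below. It assumes the DTD is packaged at the root of the classpath under the same file name that appears in the system identifier; that lookup rule and the class name ClasspathEntityResolver are assumptions for illustration, not the presenter's implementation.

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver, for illustration only.
final class ClasspathEntityResolver implements EntityResolver {

    /** Strips any directory or URL prefix, leaving just the file name. */
    static String resourceName(String systemId) {
        return systemId.substring(systemId.lastIndexOf('/') + 1);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null) {
            return null;
        }
        InputStream in = ClasspathEntityResolver.class.getClassLoader()
                .getResourceAsStream(resourceName(systemId));
        if (in == null) {
            // Not on the classpath: returning null tells the parser to
            // open the system identifier itself, as it would by default.
            return null;
        }
        InputSource source = new InputSource(in);
        source.setSystemId(systemId);
        return source;
    }
}
```

It would be registered with builder.setEntityResolver(new ClasspathEntityResolver()) before calling parse.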

30
DOM Example: Handling Errors
// Implementation of the ErrorHandler interface
public void warning(SAXParseException ex) {
    System.err.println(ex);
}

public void error(SAXParseException ex) throws SAXException {
    throw ex;
}

public void fatalError(SAXParseException ex) throws SAXException {
    throw ex;
}
31
Topics Not Covered
  • DTD Syntax
  • XML Schemas
  • Namespaces
  • Entities
  • XSL/XSLT
  • XPath/XQuery
  • XLink/XPointer
  • Web Services
  • Languages Support Covered Briefly
  • Other