XML, DTD, XML Schema - PowerPoint PPT Presentation

Provided by: jiang79

1
XML, DTD, XML Schema
  • Jianguo Lu
  • University of Windsor

2
XML Basics
  • XML
  • DTD
  • XML Schema
  • XSLT

3
What is XML (eXtremely Marketed Language)
  • Markup

From XML Handbook
4
XML (eXtensible Markup Language)
  • Markup in XML
  • A sequence of characters inserted into a text
    file, to indicate how the file should be
    displayed, or to describe the logical structure.
  • Markup is everything in a document that is not
    content.
  • Initially used in typesetting a document
  • Markup indicators are called tags, e.g.
  • <font color="blue">
  • A pair of tags together with the content they enclose is
    called an element, e.g.
  • <font color="blue"> formatted as blue </font>

5
What is XML (eXtensible Markup Language) (cont.)
  • Extensible
  • In general: something designed so that users
    or later designers can extend its capabilities.
  • In XML: you can define your own tags to
    describe data
  • You can represent any information (define new
    tags)
  • You can represent in the way you want (define new
    structure)
  • XML is a meta-language
  • A language to define other languages
  • Use DTD to define the syntax of a language

6
Markup (and extensible) languages are not new
  • SGML (Standard Generalized Markup Language)
  • Markup, extensible
  • 1980 first publication, 1986 ISO standard
  • HTML (HyperText Markup Language)
  • Markup, hypertext, an application of SGML
  • Started 1990, CERN (Conseil Européen pour la Recherche
    Nucléaire, the European particle physics
    laboratory)
  • Invented by Tim Berners-Lee
  • XML (eXtensible Markup Language)
  • Subset of SGML
  • Started 1996, adopted by W3C 1998
  • Eliminate the complexity of SGML
  • Separate the data from the formatting information
    in HTML

7
HTML
  • Table: <TABLE>
  • Table head: <TH>
  • Table row: <TR>
  • Table data: <TD>

stock.html
  • <html><body>  Stock table
  • <TABLE border="1">
  • <TR><TH> Exchange </TH> <TH> Name </TH>
    <TH> Price </TH> </TR>
  • <TR><TD> nasdaq </TD> <TD> amazon corp </TD>
    <TD> 16.875 </TD> </TR>
  • <TR><TD> nyse </TD> <TD> IBM inc </TD>
    <TD> 102.250 </TD> </TR>
  • </TABLE> </body></html>

Callouts on the slide: opening tag and element name, attribute, attribute value, data, closing tag
Displayed in browser
8
XML and HTML
  • Similarities
  • They are both markup languages
  • They are both simple.
  • Differences

9
XML Example
stock.xml
  • <?xml version="1.0" ?>
  • <stocks>
  • <stock exchange="nasdaq">
  • <name>amazon corp</name>
  • <symbol>amzn</symbol>
  • <price>16</price>
  • </stock>
  • <stock exchange="nyse">
  • <name>IBM inc</name>
  • <price>102</price>
  • </stock>
  • </stocks>

Callouts on the slide: element, attribute
  • An XML document has a group of elements
  • Each element has an opening tag and a closing
    tag
  • An element can have attributes.
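
As a minimal sketch of how a program reads such a document, the following uses Python's standard xml.etree.ElementTree module on the stock data from this slide. The helper name read_stocks and the inlined STOCK_XML string are assumptions made for illustration; they are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The stock document from the slide, inlined so the sketch is self-contained.
STOCK_XML = """<?xml version="1.0" ?>
<stocks>
  <stock exchange="nasdaq">
    <name>amazon corp</name>
    <symbol>amzn</symbol>
    <price>16</price>
  </stock>
  <stock exchange="nyse">
    <name>IBM inc</name>
    <price>102</price>
  </stock>
</stocks>"""


def read_stocks(xml_text):
    """Return one (exchange, name, price) tuple per <stock> element."""
    root = ET.fromstring(xml_text)            # root is the <stocks> element
    rows = []
    for stock in root.findall("stock"):       # direct children named "stock"
        exchange = stock.get("exchange")      # attribute value
        name = stock.findtext("name")         # text content of <name>
        price = int(stock.findtext("price"))  # text content converted to int
        rows.append((exchange, name, price))
    return rows


print(read_stocks(STOCK_XML))
```

The element names and the attribute name carry the meaning here, so the code never has to guess which table column holds the price.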

10
Benefits of using XML
stock.html
<html><body>  Stock table
<TABLE border="1">
<TR><TH> Exchange </TH> <TH> Name </TH> <TH> Price </TH> </TR>
<TR><TD> nasdaq </TD> <TD> amazon corp </TD> <TD> 16.875 </TD> </TR>
<TR><TD> nyse </TD> <TD> IBM inc </TD> <TD> 102.250 </TD> </TR>
</TABLE>
</body></html>

stock.xml
<?xml version="1.0" ?>
<stocks>
  <stock exchange="nasdaq">
    <name>amazon corp</name>
    <symbol>amzn</symbol>
    <price>16</price>
  </stock>
  <stock exchange="nyse">
    <name>IBM inc</name>
    <price>102</price>
  </stock>
</stocks>
11
Tree structure of XML
<stocks>
  <stock exchange="nyse">
    <name> IBM
    <price> 105
  <stock exchange="nasdaq">
    <name> Amazon inc
    <price> 15.45
    <symbol> amzn
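
The tree view can be reproduced in code. The sketch below walks a parsed document recursively and returns one indented line per element; the function name outline and the small sample document are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return the element tree as indented lines, one line per element."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:                 # iterating an element yields its children
        lines.extend(outline(child, depth + 1))
    return lines


doc = ET.fromstring(
    '<stocks><stock exchange="nyse"><name>IBM</name><price>105</price></stock></stocks>'
)
print("\n".join(outline(doc)))
```

Each level of nesting in the document becomes one level of indentation, which is exactly the parent/child structure the tree diagram shows.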
12
XML Element
  • An element consists of
  • an opening tag
  • the content
  • a closing tag
  • Example
  • <lecturer>David Billington</lecturer>
  • Tag names can be chosen almost freely.
  • The first character must be a letter, an
    underscore, or a colon
  • No name may begin with the string xml in any
    combination of cases
  • E.g. Xml, xML

13
Contents of XML Elements
  • Content may be text, or other elements, or
    nothing
  • <lecturer>
  • <name>David Billington</name>
  • <phone> +61 - 7 - 3875 507 </phone>
  • </lecturer>
  • <lecturer></lecturer>
  • If there is no content, the element is
    called empty; it can be abbreviated as follows
  • <lecturer/>

14
Attributes
  • An attribute is a name-value pair inside the
    opening tag of an element
  • <lecturer name="David Billington" phone="+61 - 7
    - 3875 507"/>
  • Example
  • <order orderNo="23456" customer="John Smith"
  • date="October 15, 2002">
  • <item itemNo="a528" quantity="1"/>
  • <item itemNo="c817" quantity="3"/>
  • </order>

15
Element and Attribute
<order orderNo="23456" customer="John Smith"
       date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
  • <order>
  • <orderNo>23456</orderNo>
  • <customer>John Smith</customer>
  • <date>October 15, 2002</date>
  • <item>
  • <itemNo>a528</itemNo>
  • <quantity>1</quantity>
  • </item>
  • <item>
  • <itemNo>c817</itemNo>
  • <quantity>3</quantity>
  • </item>
  • </order>
  • Attributes can be replaced by elements
  • When to use elements and when attributes is a
    matter of taste
  • But attributes cannot be nested.
  • Attributes can only have simple types.

16
Further Components of XML Docs
  • Comments
  • A piece of text that is to be ignored by the parser
  • <!-- This is a comment -->
  • Processing Instructions (PIs)
  • Define procedural attachments
  • <?xml-stylesheet type="text/xsl"
    href="stock.xsl"?>
  • This instruction tells the program, say, the
    browser, to use stock.xsl to process the XML
    document.
  • We will see this processing instruction later in
    XSLT.
  • <?stylesheet type="text/css" href="mystyle.css"?>

17
Well formed XML Document
  • An XML document is well formed if it conforms to
    XML syntax rules.
  • Additional rules
  • XML document must have a root element
  • Attribute values must be quoted
  • XML is case sensitive
  • Try to find bugs in the following XML document

<?xml version="1.0" ?>
<stock exchange="nasdaq">
  <name>amazon corp </name>
  <symbol>amzn</symbol>
  <price>16</price>
</stock>
<stock exchange=nyse >
  <name>IBM inc</name>
  <price> 102 </PRICE>
</stock>
<stocks>
</stocks>
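
Any XML parser can answer the well-formedness question. The sketch below relies on Python's standard xml.etree.ElementTree, which raises ParseError on documents that break the syntax rules; the helper is_well_formed and the four sample strings are assumptions for illustration, each isolating one of the rules listed on this slide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


GOOD = '<stocks><stock exchange="nyse"><price>102</price></stock></stocks>'
# Attribute value is not quoted.
UNQUOTED = '<stocks><stock exchange=nyse><price>102</price></stock></stocks>'
# Closing tag differs in case from the opening tag.
CASE_MISMATCH = '<stocks><stock exchange="nyse"><price>102</PRICE></stock></stocks>'
# More than one top-level element, so there is no single root.
TWO_ROOTS = '<stock exchange="nasdaq"/><stock exchange="nyse"/>'

for sample in (GOOD, UNQUOTED, CASE_MISMATCH, TWO_ROOTS):
    print(is_well_formed(sample))
```

A parser stops at the first violation it meets, so a document with several bugs has to be fixed and re-checked one error at a time.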
18
Valid XML document
<?xml version="1.0" ?>
<stocks>
  <name> <stock> 102</stock> </name>
  <price>IBM inc</price>
  <symbol>amzn </symbol>
  <price>16</price>
  <stock exchange="nyse">
    <price> amazon </price>
  </stock>
</stocks>
  • Problem
  • Not every well formed document makes sense
  • Solution
  • Associate XML with its type.
  • Valid XML document conforms to its XML schema.

19
XML DTD (Document Type Definition)
  • What: A DTD is a set of rules that defines the syntax
    of a language. It is similar to a context-free
    grammar.
  • Why: Helps XML generation and processing.
  • How: Write a sequence of element declarations and
    attribute declarations.
  • Element declaration
  • <!ELEMENT tagName tagContent>
  • Attribute declaration
  • <!ATTLIST tagName attName attContent>

stock.dtd
<!ELEMENT stocks (stock*)>
<!ELEMENT stock (name, symbol?, price)>
<!ATTLIST stock exchange CDATA >
<!ELEMENT name (#PCDATA)>
<!ELEMENT symbol (#PCDATA)>
<!ELEMENT price (#PCDATA)>

Callouts on the slide: * repeats 0 or more times; ? occurs 0 times or once.
20
Another DTD example
<order orderNo="23456" customer="John Smith"
       date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
  • ID attribute values must be unique
  • IDREF attribute values must match some ID
  • <!ELEMENT order (item+)>
  • <!ATTLIST order
  • orderNo ID #REQUIRED
  • customer CDATA #REQUIRED
  • date CDATA #REQUIRED>
  • <!ELEMENT item EMPTY>
  • <!ATTLIST item itemNo ID #REQUIRED
  • quantity CDATA #REQUIRED
  • comments CDATA #IMPLIED>

21
Element Declaration
  • General form
  • <!ELEMENT tagName tagContent>
  • Example
  • <!ELEMENT stock (name, symbol?, price)>
  • Content Model
  • Sequence, Choice, Cardinality
  • We express that a lecturer element contains
    either a name element or a phone element as
    follows
  • <!ELEMENT lecturer (name | phone)>
  • A lecturer element contains a name element and a
    phone element in any order.
  • <!ELEMENT lecturer ((name,phone)|(phone,name))>
  • Cardinality operators
  • ? appears zero times or once
  • * appears zero or more times
  • + appears one or more times
  • No cardinality operator means exactly once
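
A content model is in effect a regular expression over child element names. The sketch below makes that concrete for the model (name, symbol?, price) by rewriting it by hand as a Python regex and matching it against the sequence of child tags. The names STOCK_MODEL and children_match are assumptions for illustration; a real validating parser builds this check from the DTD itself, and this sketch ignores text content and attributes.

```python
import re
import xml.etree.ElementTree as ET

# Content model (name, symbol?, price) rewritten as a regular expression
# over the child tag names, each followed by a comma.
STOCK_MODEL = re.compile(r"name,(symbol,)?price,")


def children_match(element, model):
    """Check the sequence of child tag names against a content-model regex."""
    child_tags = "".join(child.tag + "," for child in element)
    return model.fullmatch(child_tags) is not None


with_symbol = ET.fromstring(
    "<stock><name>amazon corp</name><symbol>amzn</symbol><price>16</price></stock>"
)
without_symbol = ET.fromstring("<stock><name>IBM inc</name><price>102</price></stock>")
wrong_order = ET.fromstring("<stock><price>102</price><name>IBM inc</name></stock>")

print(children_match(with_symbol, STOCK_MODEL))
print(children_match(without_symbol, STOCK_MODEL))
print(children_match(wrong_order, STOCK_MODEL))
```

The DTD operators map directly onto regex operators: the comma is concatenation, the vertical bar is alternation, and ?, * and + keep their meaning.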

22
Attribute declaration
  • General form
  • <!ATTLIST tagName attName attContent>
  • Example
  • <!ATTLIST stock exchange CDATA >
  • <!ATTLIST item itemNo ID #REQUIRED
  • quantity CDATA
    #REQUIRED
  • comments CDATA
    #IMPLIED>
  • AttContent contains attribute types and default
    values.

23
Attribute types
  • Similar to predefined data types, but limited
    selection
  • The most important types are
  • CDATA, a string (sequence of characters)
  • Example: <!ATTLIST stock exchange CDATA >
  • ID, a name that is unique across the entire XML
    document
  • IDREF, a reference to another element with an ID
    attribute carrying the same value as the IDREF
    attribute
  • IDREFS, a series of IDREFs
  • (v1 | . . . | vn), an enumeration of all possible
    values
  • Limitations: no data types for dates, integers,
    number ranges, etc.
  • XML Schema will solve this problem

24
Reference with IDREF and IDREFS
  • DTD
  • <!ELEMENT family (person*)>
  • <!ELEMENT person (name)>
  • <!ELEMENT name (#PCDATA)>
  • <!ATTLIST person
  • id ID #REQUIRED
  • mother IDREF #IMPLIED
  • father IDREF #IMPLIED
  • children IDREFS #IMPLIED>
  • XML Example
  • <family>
  • <person id="bob"
  • mother="mary" father="peter">
  • <name>Bob Marley</name>
  • </person>
  • <person id="bridget" mother="mary">
  • <name>Bridget Jones</name>
  • </person>
  • <person id="mary" children="bob bridget">
  • <name>Mary Poppins</name>
  • </person>
  • <person id="peter" children="bob">
  • <name>Peter Marley</name>
  • </person>
  • </family>

What are the corresponding concepts in databases?
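
The two ID rules (values are unique, every IDREF matches some ID) can be checked by hand, much like primary-key and foreign-key constraints. The sketch below does so for the family example with Python's standard library; the function names duplicate_ids and dangling_references are assumptions for illustration, and the attribute names are hard-coded rather than read from the DTD.

```python
from collections import Counter
import xml.etree.ElementTree as ET

FAMILY_XML = """<family>
  <person id="bob" mother="mary" father="peter"><name>Bob Marley</name></person>
  <person id="bridget" mother="mary"><name>Bridget Jones</name></person>
  <person id="mary" children="bob bridget"><name>Mary Poppins</name></person>
  <person id="peter" children="bob"><name>Peter Marley</name></person>
</family>"""


def duplicate_ids(root):
    """Return the id values that occur on more than one <person>."""
    counts = Counter(p.get("id") for p in root.iter("person"))
    return sorted(i for i, n in counts.items() if i is not None and n > 1)


def dangling_references(root):
    """Return (person id, referenced id) pairs whose target id does not exist."""
    ids = {p.get("id") for p in root.iter("person")}
    dangling = []
    for person in root.iter("person"):
        refs = [person.get("mother"), person.get("father")]   # IDREF attributes
        refs += (person.get("children") or "").split()        # IDREFS: space separated
        for ref in refs:
            if ref is not None and ref not in ids:
                dangling.append((person.get("id"), ref))
    return dangling


family = ET.fromstring(FAMILY_XML)
broken = ET.fromstring(FAMILY_XML.replace('father="peter"', 'father="paul"'))
print(duplicate_ids(family), dangling_references(family))
print(dangling_references(broken))
```

A validating parser performs these checks itself once the attributes are declared as ID, IDREF and IDREFS in the DTD.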
25
Enumerated attribute values
  • Syntax
  • <!ATTLIST element-name
  • attribute-name (en1|en2|..)
    default-value>
  • DTD example
  • <!ATTLIST payment method (check|cash)
    "cash">
  • Valid XML example
  • <payment method="check" />
  • or
  • <payment method="cash" />

26
Attribute defaults
  • #REQUIRED
  • Attribute must appear in every occurrence of the
    element type in the XML document
  • #IMPLIED
  • The appearance of the attribute is optional
  • #FIXED "value"
  • Every element must have this attribute value
  • "value"
  • This specifies the default value for the attribute

27
FIXED value
  • Syntax
  • <!ATTLIST element-name
  • attribute-name attribute-type #FIXED
    "value">
  • DTD example
  • <!ATTLIST sender company CDATA #FIXED
    "Microsoft">
  • A valid XML
  • <sender company="Microsoft" />
  • An invalid XML
  • <sender company="IBM" />

28
Default value
  • Example DTD
  • <!ELEMENT square EMPTY>
  • <!ATTLIST square width CDATA "0">
  • Valid XML
  • <square width="100" />
  • <square/>
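
The four kinds of attribute default can be summarized as one small function. This is a sketch only: Python's xml.etree.ElementTree does not read DTD attribute declarations, so the rule is passed in by hand, and the function name effective_value and the rule labels "REQUIRED", "IMPLIED", "FIXED" and "DEFAULT" are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def effective_value(element, name, kind, value=None):
    """Apply one DTD attribute default rule and return the resulting value.

    Raises ValueError when the document violates the rule.
    """
    present = element.get(name)
    if kind == "REQUIRED":
        if present is None:
            raise ValueError(f"missing required attribute {name!r}")
        return present
    if kind == "IMPLIED":
        return present                       # may be None: the attribute is optional
    if kind == "FIXED":
        if present is not None and present != value:
            raise ValueError(f"attribute {name!r} must be {value!r}")
        return value                         # an omitted attribute takes the fixed value
    if kind == "DEFAULT":
        return value if present is None else present
    raise ValueError(f"unknown rule {kind!r}")


print(effective_value(ET.fromstring('<square width="100"/>'), "width", "DEFAULT", "0"))
print(effective_value(ET.fromstring("<square/>"), "width", "DEFAULT", "0"))
print(effective_value(ET.fromstring('<sender company="Microsoft"/>'), "company", "FIXED", "Microsoft"))
```

Calling it on the sender element with company="IBM" and the FIXED rule raises ValueError, matching the invalid example on the FIXED slide.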

29
DTD example for email
  • A head element contains (in that order)
  • a from element
  • at least one to element
  • zero or more cc elements
  • a subject element
  • In from, to, and cc elements
  • the name attribute is not required
  • the address attribute is always required
  • A body element contains
  • a text element
  • possibly followed by a number of attachment
    elements
  • The encoding attribute of an attachment element
    must have either the value mime or binhex
  • mime is the default value

30
Email DTD
  • <!ELEMENT email (head, body)>
  • <!ELEMENT head (from, to+, cc*, subject)>
  • <!ELEMENT from EMPTY>
  • <!ATTLIST from name CDATA #IMPLIED
  • address CDATA #REQUIRED>
  • <!ELEMENT to EMPTY>
  • <!ATTLIST to name CDATA #IMPLIED
  • address CDATA #REQUIRED>
  • <!ELEMENT cc EMPTY>
  • <!ATTLIST cc name CDATA #IMPLIED
  • address CDATA #REQUIRED>
  • <!ELEMENT subject (#PCDATA)>
  • <!ELEMENT body (text, attachment*)>
  • <!ELEMENT text (#PCDATA)>
  • <!ELEMENT attachment EMPTY>
  • <!ATTLIST attachment encoding (mime|binhex)
    "mime"
  • file CDATA
    #REQUIRED>

31
Remarks on DTD
  • A DTD can be interpreted as an Extended
    Backus-Naur Form (EBNF)
  • <!ELEMENT email (head, body)>
  • is equivalent to email ::= head body
  • Recursive definitions are possible in DTDs
  • <!ELEMENT bintree ((bintree root bintree) |
    emptytree)>

32
Where we are
  • XML
  • DTD
  • XML Schema

33
XML Schema compared with DTD
  • Significantly richer language for defining the
    structure of XML documents
  • Its syntax is based on XML itself
  • not necessary to write separate tools
  • Reuse and refinement of schemas
  • Expand or delete already existing schemas
  • Sophisticated set of data types, compared to DTDs
  • Define new data types.
  • Built-in data types: DTD supports 10; XML Schema
    supports 44.

34
From DTD to XML Schema
  • <!ELEMENT BookStore (Book+)>
  • <!ELEMENT Book (Title, Author, Date, ISBN,
    Publisher)>
  • <!ELEMENT Title (#PCDATA)>
  • <!ELEMENT Author (#PCDATA)>
  • <!ELEMENT Date (#PCDATA)>
  • <!ELEMENT ISBN (#PCDATA)>
  • <!ELEMENT Publisher (#PCDATA)>

35
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
  <xsd:element name="Date" type="xsd:string"/>
  <xsd:element name="ISBN" type="xsd:string"/>
  <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>

The corresponding DTD:
<!ELEMENT BookStore (Book+)>
<!ELEMENT Book (Title, Author, Date, ISBN, Publisher)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT ISBN (#PCDATA)>
<!ELEMENT Publisher (#PCDATA)>
36
XML Schema syntax
  • An XML schema is an element with an opening tag
    like
  • <schema xmlns="http://www.w3.org/2000/10/XMLSchema"
    version="1.0">
  • Structure of schema elements
  • Element and attribute types using data types

37
Element Types
  • Example
  • <element name="email"/>
  • <element name="head" minOccurs="1"
    maxOccurs="1"/>
  • <element name="to" minOccurs="1"/>
  • Cardinality constraints
  • minOccurs="x" (default value 1)
  • maxOccurs="x" (default value 1)
  • Generalizations of *, ?, + offered by DTDs

<xsd:element name="email" minOccurs="1"
maxOccurs="1"/>
Equivalent!
<xsd:element name="email"/>
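
To see why minOccurs and maxOccurs generalize the DTD operators, the sketch below lists the pair each operator corresponds to and checks an occurrence count against a pair. The table name DTD_OPERATOR_TO_OCCURS, the function count_allowed, and the use of None for "unbounded" are assumptions for illustration.

```python
# (minOccurs, maxOccurs) for each DTD cardinality operator.
# None stands for maxOccurs="unbounded"; "" means no operator (exactly once).
DTD_OPERATOR_TO_OCCURS = {
    "": (1, 1),
    "?": (0, 1),
    "*": (0, None),
    "+": (1, None),
}


def count_allowed(count, min_occurs=1, max_occurs=1):
    """Return True if an element occurring `count` times satisfies the bounds."""
    if count < min_occurs:
        return False
    return max_occurs is None or count <= max_occurs


for operator, (low, high) in DTD_OPERATOR_TO_OCCURS.items():
    print(repr(operator), low, high, [count_allowed(n, low, high) for n in range(3)])

# Bounds such as minOccurs="1" maxOccurs="2" have no single DTD operator.
print(count_allowed(2, 1, 2), count_allowed(3, 1, 2))
```

The defaults of 1 and 1 in count_allowed mirror the schema defaults, which is why an element declared without either attribute must occur exactly once.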
38
Attribute declaration
  • A simple attribute declaration example
  • Attribute definition
  • <xs:attribute name="lang" type="xs:string"/>
  • XML example with the attribute
  • <lastname lang="EN">Smith</lastname>
  • Data types
  • Declare default/optional/required/fixed values
  • <attribute name="lang" type="xs:string"
    use="optional"/>
  • <attribute name="lang" type="xs:string"
    use="required"/>
  • Note that we don't need to specify the element
    name as in DTD.

39
Data types
  • Built-in data types
  • Numerical data types: integer, short, etc.
  • String types: string, ID, IDREF, CDATA, etc.
  • Date and time data types: time, Month, etc.
  • User-defined data types
  • simple data types, which cannot use elements or
    attributes
  • complex data types, which can use these
  • Complex data types
  • sequence, a sequence of existing data type
    elements (order is important)
  • all, a collection of elements that must appear
    (order is not important)
  • choice, a collection of elements, of which one
    will be chosen

40
Complex data type
  • <complexType name="lecturerType">
  • <sequence>
  • <element name="firstname" type="string"
  • minOccurs="0" maxOccurs="unbounded"/>
  • <element name="lastname" type="string"/>
  • </sequence>
  • <attribute name="title" type="string"
    use="optional"/>
  • </complexType>

41
Data Type Extension
  • Data types can be extended by new elements or
    attributes. Example
  • <complexType name="extendedLecturerType">
  • <extension base="lecturerType">
  • <sequence>
  • <element name="email" type="string"
  • minOccurs="0"
    maxOccurs="1"/>
  • </sequence>
  • <attribute name="rank" type="string"
    use="required"/>
  • </extension>
  • </complexType>

42
Resulting XML Schema
  • <complexType name="extendedLecturerType">
  • <sequence>
  • <element name="firstname" type="string"
  • minOccurs="0" maxOccurs="unbounded"/>
  • <element name="lastname" type="string"/>
  • <element name="email" type="string"
  • minOccurs="0" maxOccurs="1"/>
  • </sequence>
  • <attribute name="title" type="string"
    use="optional"/>
  • <attribute name="rank" type="string"
    use="required"/>
  • </complexType>

43
Data type restriction
  • An existing data type may be restricted by adding
    constraints on certain values
  • Restriction is not the opposite of extension
  • Restriction is not achieved by deleting elements
    or attributes
  • The following hierarchical relationship holds
  • Instances of the restricted type are also
    instances of the original type
  • They satisfy at least the constraints of the
    original type.

44
Example of Data Type Restriction
  • <complexType name="restrictedLecturerType">
  • <restriction base="lecturerType">
  • <sequence>
  • <element name="firstname" type="string"
  • minOccurs="1" maxOccurs="2"/>
  • </sequence>
  • <attribute name="title" type="string"
    use="required"/>
  • </restriction>
  • </complexType>

The base type, lecturerType:
<complexType name="lecturerType">
  <sequence>
    <element name="firstname" type="string"
             minOccurs="0" maxOccurs="unbounded"/>
    <element name="lastname" type="string"/>
  </sequence>
  <attribute name="title" type="string" use="optional"/>
</complexType>
45
Restriction of simple data types
  • <simpleType name="dayOfMonth">
  • <restriction base="integer">
  • <minInclusive value="1"/>
  • <maxInclusive value="31"/>
  • </restriction>
  • </simpleType>
  • <simpleType name="dayOfWeek">
  • <restriction base="string">
  • <enumeration value="Mon"/>
  • <enumeration value="Tue"/>
  • <enumeration value="Wed"/>
  • <enumeration value="Thu"/>
  • <enumeration value="Fri"/>
  • <enumeration value="Sat"/>
  • <enumeration value="Sun"/>
  • </restriction>
  • </simpleType>
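
The checks these two restrictions describe are the kind a program would otherwise write by hand. The sketch below spells them out in Python; the function names is_day_of_month and is_day_of_week are assumptions for illustration, and the integer check covers only plain decimal digits with an optional sign.

```python
import re

# Decimal digits with an optional sign, as a simplified stand-in for an integer literal.
INTEGER = re.compile(r"[+-]?[0-9]+")

# The seven values listed by the enumeration facets of dayOfWeek.
DAYS_OF_WEEK = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}


def is_day_of_month(text):
    """Integer with minInclusive 1 and maxInclusive 31."""
    if INTEGER.fullmatch(text) is None:
        return False
    return 1 <= int(text) <= 31


def is_day_of_week(text):
    """String restricted to the enumerated values."""
    return text in DAYS_OF_WEEK


print(is_day_of_month("31"), is_day_of_month("32"), is_day_of_month("first"))
print(is_day_of_week("Mon"), is_day_of_week("Monday"))
```

With the restrictions declared in the schema, a validator performs these checks and the application code no longer needs them.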

46
Regular expression can be used
  • <xsd:simpleType name="TelephoneNumber">
  • <xsd:restriction base="xsd:string">
  • <xsd:length value="8"/>
  • <xsd:pattern value="\d{3}-\d{4}"/>
  • </xsd:restriction>
  • </xsd:simpleType>
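
Reading the pattern as \d{3}-\d{4} (three digits, a hyphen, four digits, eight characters in total), the same constraint can be tried out with Python's re module. The function name is_telephone_number is an assumption for illustration; fullmatch is used because a schema pattern must match the whole value.

```python
import re

# Three digits, a hyphen, four digits.
TELEPHONE = re.compile(r"\d{3}-\d{4}")


def is_telephone_number(text):
    """Apply both facets: length 8 and the pattern over the whole value."""
    return len(text) == 8 and TELEPHONE.fullmatch(text) is not None


print(is_telephone_number("555-1234"))
print(is_telephone_number("5551234"))
print(is_telephone_number("555-12345"))
```

Here the length facet is redundant, since every string matching the pattern already has eight characters; both are kept to mirror the schema.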

47
The email example revisited
  • <element name="email" type="emailType"/>
  • <complexType name="emailType">
  • <sequence>
  • <element name="head" type="headType"/>
  • <element name="body" type="bodyType"/>
  • </sequence>
  • </complexType>
  • <complexType name="headType">
  • <sequence>
  • <element name="from" type="nameAddress"/>
  • <element name="to" type="nameAddress"
  • minOccurs="1" maxOccurs="unbounded"/>
  • <element name="cc" type="nameAddress"
  • minOccurs="0" maxOccurs="unbounded"/>
  • <element name="subject" type="string"/>
  • </sequence>
  • </complexType>

48
Different ways to declare elements
1
<xsd:element name="name" type="type"
             minOccurs="int" maxOccurs="int"/>
2
<xsd:element name="name" minOccurs="int" maxOccurs="int">
  <xsd:complexType>
  </xsd:complexType>
</xsd:element>
3
<xsd:element name="name" minOccurs="int" maxOccurs="int">
  <xsd:simpleType>
    <xsd:restriction base="type">
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
49
Another way to define email schema
<element name="email">
  <complexType>
    <sequence>
      <element name="head" type="headType"/>
      <element name="body" type="bodyType"/>
    </sequence>
  </complexType>
</element>
<complexType name="headType">
  <sequence>
    <element name="from" type="nameAddress"/>
    <element name="to" type="nameAddress"
             minOccurs="1" maxOccurs="unbounded"/>
    <element name="cc" type="nameAddress"
             minOccurs="0" maxOccurs="unbounded"/>
    <element name="subject" type="string"/>
  </sequence>
</complexType>
50
Using XML Schema/DTD
  • Data model
  • With XML Schemas you specify how your XML data
    will be organized, and the datatypes of your
    data. That is, with XML Schemas you model how
    your data is to be represented in an instance
    document.
  • A contract
  • Organizations agree to structure their XML
    documents in conformance with an XML Schema.
    Thus, the XML Schema acts as a contract between
    the organizations.
  • A rich source of metadata
  • An XML Schema document contains lots of data
    about the data in the XML instance documents,
    such as the datatype of the data, the data's
    range of values, how the data is related to
    another piece of data (parent/child, sibling
    relationship), i.e., XML Schemas contain metadata

51
Save coding
  • "In a typical program, up to 60% of the code is
    spent checking the data!" - source unknown

Code to actually do the work
Code to check the structure and
content (datatype) of the data
If your data is structured as XML, and there is a
schema, then you can hand the data-checking task
off to a schema validator. Thus, your code is
reduced by up to 60%!!! Big savings!
52
XML-Schema to GUI
Supplier Web Server
GUI Builder
P.O. Schema
P.O. HTML
From Costello
53
Schema to GUI
54
XML Schema to API
Person Schema
API Builder
Person API
55
XML Schema to Object
  • <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  • <xs:complexType name="orders">
  • <xs:sequence>
  • <xs:element name="order" type="order"
    maxOccurs="unbounded"/>
  • </xs:sequence>
  • </xs:complexType>
  • <xs:complexType name="order">
  • <xs:sequence>
  • <xs:element name="item" type="item"
    maxOccurs="unbounded"/>
  • </xs:sequence>
  • <xs:attribute name="id" type="xs:string"
    use="required"/>
  • <xs:attribute name="zip" type="xs:int"
    use="required"/>
  • </xs:complexType>
  • <xs:complexType name="item">
  • <xs:sequence>
  • <xs:element name="price" type="xs:double" />
  • <xs:element name="quantity" type="xs:int" />
  • </xs:sequence>
  • <xs:attribute name="id" type="xs:string"
    use="required"/>
  • </xs:complexType>
  • </xs:schema>

56
The corresponding classes
  • public class orders {
  •   public order[] order;
  • }
  • public class order {
  •   public item[] item;
  •   public string id;
  •   public int zip;
  • }
  • public class item {
  •   public double price;
  •   public int quantity;
  •   public string id;
  • }
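
The same schema-to-class idea can be sketched in Python with dataclasses that mirror the order and item types, plus a small loader that fills them from a parsed document. The class names, the field name zip_code, the loader load_order and the sample document are assumptions for illustration; a binding tool would generate equivalent code from the schema rather than have it written by hand.

```python
from dataclasses import dataclass
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class Item:
    id: str          # xs:string attribute
    price: float     # xs:double element
    quantity: int    # xs:int element


@dataclass
class Order:
    id: str          # xs:string attribute
    zip_code: int    # xs:int attribute named "zip" in the schema
    items: List[Item]


def load_order(element):
    """Build an Order from an <order> element, converting text to typed fields."""
    items = [
        Item(
            id=item.get("id"),
            price=float(item.findtext("price")),
            quantity=int(item.findtext("quantity")),
        )
        for item in element.findall("item")
    ]
    return Order(id=element.get("id"), zip_code=int(element.get("zip")), items=items)


sample = ET.fromstring(
    '<order id="o1" zip="12345">'
    '<item id="i1"><price>9.5</price><quantity>2</quantity></item>'
    '<item id="i2"><price>1.25</price><quantity>4</quantity></item>'
    "</order>"
)
order = load_order(sample)
print(order)
print(sum(item.price * item.quantity for item in order.items))
```

Once the data is in typed objects, the rest of the program works with floats and ints instead of strings pulled out of the XML.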

57
JAXB architecture
  • Java Architecture for XML Binding (JAXB) provides
    a fast and convenient way to bind XML schemas to
    Java representations,
  • making it easy for Java developers to incorporate
    XML data and processing functions in Java
    applications.

58
JAXB Mapping of XML Schema Built-in Data Types
  • XML Schema Type      Java Data Type
  • xsd:string           java.lang.String
  • xsd:integer          java.math.BigInteger
  • xsd:int              int
  • xsd:long             long
  • xsd:short            short
  • xsd:decimal          java.math.BigDecimal
  • xsd:float            float
  • xsd:double           double
  • xsd:boolean          boolean
  • xsd:byte             byte
  • xsd:QName            javax.xml.namespace.QName
  • xsd:base64Binary     byte[]
  • xsd:hexBinary        byte[]
  • xsd:unsignedInt      long
  • xsd:unsignedShort    int
  • xsd:unsignedByte     short
  • xsd:time             java.util.Calendar
  • xsd:date             java.util.Calendar

59
JAXB binding
  • Bind the following to Java package
  • XML Namespace URI
  • Bind the following XML Schema components to Java
    content interface
  • Named complex type
  • Anonymous inlined type definition of an element
    declaration
  • Bind to typesafe enum class
  • A named simple type definition with a base type
    that derives from xsd:NCName and has
    enumeration facets.
  • Bind the following XML Schema components to a
    Java Element interface
  • A global element declaration to an Element
    interface.
  • Local element declaration that can be inserted
    into a general content list.
  • Bind to Java property
  • Attribute use
  • Particle with a term that is an element reference
    or local element declaration.

60
XML Schema to smart editor
Helps you build your instance documents. For
example, it pops up a menu showing you what is
valid next. It knows this by looking at the XML
Schema!
Smart Editor (e.g., XML Spy)
P.O. Schema
61
XML Schema directed editor
62
Multiple levels of checking
BookStore.xml
BookStore.xsd
XMLSchema.xsd (schema-for-schemas)
Validate that the xml document conforms to the
rules described in BookStore.xsd
Validate that BookStore.xsd is a valid schema
document, i.e., it conforms to the rules
described in the schema-for-schemas
From Costello
63
XML Schema validators
  • Command Line Only
  • XSV by Henry Thompson
  • ftp://ftp.cogsci.ed.ac.uk/pub/XSV/XSV12.EXE
  • Has a Programmatic API
  • xerces by Apache
  • http://www.apache.org/xerces-j/index.html
  • IBM Schema Quality Checker (Note: this tool is
    only used to check your schema. It cannot be
    used to validate an instance document against a
    schema.)
  • http://www.alphaworks.ibm.com/tech/xmlsqc
  • MSXML4.0
  • http://www.microsoft.com
  • GUI Oriented
  • XML Spy
  • www.altova.com/ (previously http://www.xmlspy.com)
  • Turbo XML
  • http://www.extensibility.com

64
XML Schema editor