Title: XML Validation
1 XML Validation
In making up the slides for this lecture, I borrowed material from several very nice sources:
- Data on the Web (Abiteboul, Buneman and Suciu)
- XML in a Nutshell (Harold and Means)
- The XML Companion (Bradley)
The validation examples were originally tested with an older parser, so the specific outputs may differ from those shown.
2 XML Validation
A batch validating process involves comparing the
DTD against a complete document instance and
producing a report containing any errors or
warnings. Software developers should consider
batch validation to be analogous to program
compilation, with similar errors
detected. Interactive validation involves
constant comparison of the DTD against a document
as it is being created.
3 XML Validation
The benefits of validating documents against a DTD include:
- Programmers can write extraction and manipulation filters without fear of their software ever processing unexpected input.
- Using an XML-aware word processor, authors and editors can be guided and constrained to produce conforming documents.
4 XML Validation Examples
XML elements may contain further, embedded elements, and the entire document must be enclosed by a single document element.
The degree to which an element's content is organized into child elements is often termed its granularity.
Some hierarchical structures are recursive.
The Document Type Definition (DTD) contains rules for each element allowed within a specific class of documents.
5 Things the DTD does not do
- Tell us the document root.
- Specify the number of instances of each kind of element.
- Describe the character data inside an element (precise syntax and semantics).
The XML Schema language is new and may replace the use of DTDs.
6 We'll run this program against several XML files with DTDs. We'll study the code soon.
// Validate.java using Xerces
import java.io.*;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
This slide shows the imported classes.
7
public class Validate {
    public static boolean valid = true;
    public static void main(String argv[]) {
        if (argv.length != 1) {
            System.err.println("Usage: java Validate filename.xml");
            System.exit(1);
        }
Here we check if the command line is correct.
8
        try {
            // get a parser
            XMLReader reader = XMLReaderFactory.createXMLReader(
                "org.apache.xerces.parsers.SAXParser");
            // request validation
            reader.setFeature("http://xml.org/sax/features/validation", true);
            // associate an InputSource object with the file name
            InputSource inputSource = new InputSource(argv[0]);
            // go ahead and parse
            reader.parse(inputSource);
        }
9
        // Catch any errors or fatal errors here.
        // The parser will handle simple warnings.
        catch (org.xml.sax.SAXException e) {
            System.out.println("Error in parsing " + e);
            valid = false;
        }
        catch (java.io.IOException e) {
            System.out.println("Error in I/O " + e);
            System.exit(0);
        }
        System.out.println("Valid Document is " + valid);
    }
}
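The program above registers no ErrorHandler, and with current JAXP parsers validity errors may be reported to the handler instead of being thrown. The following is a minimal sketch of the same idea using only the parser bundled with the JDK. The class name ValidateSketch, the Counter handler and the in-memory sample documents are assumptions made for illustration and are not part of the original Validate.java.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ValidateSketch {
    // Internal DTD subset so the example needs no files (assumed sample data).
    static final String DTD =
        "<!DOCTYPE FixedFloatSwap [\n"
      + "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>\n"
      + "<!ELEMENT Notional (#PCDATA)>\n"
      + "<!ELEMENT Fixed_Rate (#PCDATA)>\n"
      + "<!ELEMENT NumYears (#PCDATA)>\n"
      + "<!ELEMENT NumPayments (#PCDATA)>\n"
      + "]>\n";
    static final String GOOD = DTD
      + "<FixedFloatSwap><Notional>100</Notional><Fixed_Rate>5</Fixed_Rate>"
      + "<NumYears>3</NumYears><NumPayments>6</NumPayments></FixedFloatSwap>";
    // NumYears is missing, so the content model is violated.
    static final String BAD = DTD
      + "<FixedFloatSwap><Notional>100</Notional><Fixed_Rate>5</Fixed_Rate>"
      + "<NumPayments>6</NumPayments></FixedFloatSwap>";

    // DefaultHandler ignores validity errors unless error() is overridden.
    static class Counter extends DefaultHandler {
        int errors = 0;
        @Override
        public void error(SAXParseException e) {
            errors++;
            System.out.println("Validity error: " + e.getMessage());
        }
    }

    static boolean isValid(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);              // request DTD validation
            XMLReader reader = factory.newSAXParser().getXMLReader();
            Counter handler = new Counter();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.errors == 0;
        } catch (Exception e) {                       // not well-formed, or I/O trouble
            System.out.println("Error in parsing " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Valid document is " + isValid(GOOD));
        System.out.println("Valid document is " + isValid(BAD));
    }
}
```

The exact wording of the error messages depends on the parser version, so only the true/false result should be relied on.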
10 XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
Valid document is true
11 XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "http://localhost:8001/dtd/FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
DTD on the Web? VERY NICE
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
Valid document is true
12 XML Document with an internal subset
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
    <!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
    <!ELEMENT Notional (#PCDATA) >
    <!ELEMENT Fixed_Rate (#PCDATA) >
    <!ELEMENT NumYears (#PCDATA) >
    <!ELEMENT NumPayments (#PCDATA) >
]>
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
Valid document is true
13 XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
Valid document is false
14 XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Swaps SYSTEM "FixedFloatSwap.dtd">
<Swaps>
    <FixedFloatSwap>
        <Notional>100</Notional>
        <Fixed_Rate>5</Fixed_Rate>
        <NumYears>3</NumYears>
        <NumPayments>6</NumPayments>
    </FixedFloatSwap>
    <FixedFloatSwap>
        <Notional>100</Notional>
        <Fixed_Rate>5</Fixed_Rate>
        <NumYears>3</NumYears>
        <NumPayments>6</NumPayments>
    </FixedFloatSwap>
</Swaps>
15 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Swaps (FixedFloatSwap+) >
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
C:\McCarthy\www\examples\sax>java Validate FixedFloatSwap.xml
Valid document is true
Quantity Indicators
?   0 or 1 time
+   1 or more times
*   0 or more times
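As a rough illustration of the three quantity indicators, the sketch below validates small in-memory documents against a content model that uses ?, + and * together. The Header and Note elements and the class name QuantitySketch are hypothetical and were added only to exercise each indicator; the JDK's built-in parser is assumed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class QuantitySketch {
    // Header? = 0 or 1, FixedFloatSwap+ = 1 or more, Note* = 0 or more.
    static final String DTD =
        "<!DOCTYPE Swaps [\n"
      + "<!ELEMENT Swaps (Header?, FixedFloatSwap+, Note*)>\n"
      + "<!ELEMENT Header (#PCDATA)>\n"
      + "<!ELEMENT FixedFloatSwap (#PCDATA)>\n"
      + "<!ELEMENT Note (#PCDATA)>\n"
      + "]>\n";

    static boolean isValid(String body) {
        final int[] errors = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(DTD + body)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) { errors[0]++; }
                });
        } catch (Exception e) {
            return false;   // not well-formed
        }
        return errors[0] == 0;
    }

    public static void main(String[] args) {
        // Two swaps, no Header, no Note: allowed.
        System.out.println(isValid(
            "<Swaps><FixedFloatSwap>a</FixedFloatSwap><FixedFloatSwap>b</FixedFloatSwap></Swaps>"));
        // No swap at all: + requires at least one.
        System.out.println(isValid("<Swaps><Header>h</Header></Swaps>"));
        // Two Headers: ? allows at most one.
        System.out.println(isValid(
            "<Swaps><Header>h</Header><Header>h</Header><FixedFloatSwap>a</FixedFloatSwap></Swaps>"));
    }
}
```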
16 Is this a valid document?
<?xml version="1.0"?>
<!DOCTYPE person [
    <!ELEMENT person (name, profession*)>
    <!ELEMENT profession (#PCDATA)>
    <!ELEMENT name (#PCDATA)>
]>
<person>
    <name>Alan Turing</name>
    <profession>computer scientist</profession>
    <profession>cryptographer</profession>
</person>
Sure!
17 The locations where document text data is allowed are indicated by the keyword #PCDATA (Parsed Character Data).
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>
        <StartYear>2000</StartYear>
        <EndYear>2002</EndYear>
    </NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
XML Document
18 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
Output
C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Element "NumYears" does not allow "StartYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "StartYear" is not declared.
org.xml.sax.SAXParseException: Element "NumYears" does not allow "EndYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "EndYear" is not declared.
Valid document is false
19 Mixed Content
There are strict rules which must be applied when an element is allowed to contain both text and child elements.
The #PCDATA keyword must be the first token in the group, and the group must be a choice group (using | not ,).
The group must be optional and repeatable, that is, followed by *.
This is known as a mixed content model.
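The mixed content rules can be checked with a small experiment. The sketch below, which assumes the JDK's built-in parser and uses a hypothetical class name MixedSketch, validates one document against a correct mixed content model and then tries a model that does not start with #PCDATA, which the parser is expected to reject as a malformed DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class MixedSketch {
    // Correct form: #PCDATA first, choice group, followed by *.
    static final String GOOD_MODEL = "(#PCDATA | sub | super)*";
    // Incorrect form: #PCDATA is not the first token.
    static final String BAD_MODEL = "(sub | #PCDATA | super)*";

    static String doc(String model, String body) {
        return "<!DOCTYPE emph [\n"
             + "<!ELEMENT emph " + model + ">\n"
             + "<!ELEMENT sub (#PCDATA)>\n"
             + "<!ELEMENT super (#PCDATA)>\n"
             + "]>\n" + body;
    }

    static boolean isValid(String xml) {
        final int[] errors = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) { errors[0]++; }
                });
        } catch (Exception e) {
            return false;   // fatal error, e.g. a malformed declaration
        }
        return errors[0] == 0;
    }

    public static void main(String[] args) {
        String water = "<emph>H<sub>2</sub>O is water.</emph>";
        System.out.println(isValid(doc(GOOD_MODEL, water)));
        System.out.println(isValid(doc(BAD_MODEL, water)));
        // An undeclared child inside mixed content is also rejected.
        System.out.println(isValid(doc(GOOD_MODEL, "<emph>H<bold>2</bold>O</emph>")));
    }
}
```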
20 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Mixed (emph) >
<!ELEMENT emph (#PCDATA | sub | super)* >
<!ELEMENT sub (#PCDATA)>
<!ELEMENT super (#PCDATA)>
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Mixed SYSTEM "Mixed.dtd">
<Mixed>
    <emph>H<sub>2</sub>O is water.</emph>
</Mixed>
Valid document is true
21 Is this a valid document?
<?xml version="1.0"?>
<!DOCTYPE page [
    <!ELEMENT page (paragraph)>
    <!ELEMENT paragraph (#PCDATA | profession | bold)*>
    <!ELEMENT profession (#PCDATA)>
    <!ELEMENT bold (#PCDATA)>
]>
<page>
    <paragraph>
        Alan Turing broke codes during <bold>World War II</bold>.
        He very precisely defined the notion of "algorithm".
        And so he had several professions:
        <profession>computer scientist</profession>
        <profession>cryptographer</profession>
        And <profession>mathematician</profession>
    </paragraph>
</page>
Sure!
22 How about this one?
<?xml version="1.0"?>
<!DOCTYPE page [
    <!ELEMENT page (paragraph)>
    <!ELEMENT paragraph (#PCDATA | profession | bold)*>
    <!ELEMENT profession (#PCDATA)>
    <!ELEMENT bold (#PCDATA)>
]>
<page>
    The following is a paragraph marked up in XML.
    <paragraph>
        Alan Turing broke codes during <bold>World War II</bold>.
        He very precisely defined the notion of "algorithm".
        And so he had several professions:
        <profession>computer scientist</profession>
        <profession>cryptographer</profession>
        And <profession>mathematician</profession>
    </paragraph>
</page>
java Validate mixed.xml
org.xml.sax.SAXParseException: The content of element type "page" must match "(paragraph)".
Valid document is false
23 CDATA Section
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
    <Note>
        <![CDATA[This is text that <b>will not be parsed for markup]]>
    </Note>
</FixedFloatSwap>
DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments, Note) >
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ELEMENT Note (#PCDATA) >
24 Recursion
<?xml version="1.0"?>
<!DOCTYPE tree [
    <!ELEMENT tree (node)>
    <!ELEMENT node (leaf | (node,node))>
    <!ELEMENT leaf (#PCDATA)>
]>
<tree>
    <node>
        <leaf>A DTD is a context-free grammar</leaf>
    </node>
</tree>
java Validate recursive1.xml
Valid document is true
25 How about this one?
<?xml version="1.0"?>
<!DOCTYPE tree [
    <!ELEMENT tree (node)>
    <!ELEMENT node (leaf | (node,node))>
    <!ELEMENT leaf (#PCDATA)>
]>
<tree>
    <node>
        <leaf>Alan Turing would like this</leaf>
    </node>
    <node>
        <leaf>Alan Turing would like this</leaf>
    </node>
</tree>
java Validate recursive1.xml
org.xml.sax.SAXParseException: The content of element type "tree" must match "(node)".
Valid document is false
26 Relational Databases and XML
Consider the relational database r1(a,b,c), r2(c,d):

r1:  a   b   c        r2:  c   d
     a1  b1  c1            c2  d2
     a2  b2  c2            c3  d3
                           c4  d4

How can we represent this database with an XML DTD?
27 Relations
<?xml version="1.0"?>
<!DOCTYPE db [
    <!ELEMENT db (r1*, r2*)>
    <!ELEMENT r1 (a,b,c)>
    <!ELEMENT r2 (c,d)>
    <!ELEMENT a (#PCDATA)>
    <!ELEMENT b (#PCDATA)>
    <!ELEMENT c (#PCDATA)>
    <!ELEMENT d (#PCDATA)>
]>
<db>
    <r1><a> a1 </a> <b> b1 </b> <c> c1 </c></r1>
    <r1><a> a1 </a> <b> b1 </b> <c> c1 </c></r1>
    <r2><c> c2 </c> <d> d2 </d></r2>
    <r2><c> c3 </c> <d> d3 </d></r2>
    <r2><c> c4 </c> <d> d4 </d></r2>
</db>
java Validate Db.xml
Valid document is true
There is a small problem.
28 Relations
<?xml version="1.0"?>
<!DOCTYPE db [
    <!ELEMENT db (r1|r2)* >
    <!ELEMENT r1 ((a,b,c) | (a,c,b) | (b,a,c) | (b,c,a) | (c,a,b) | (c,b,a))>
    <!ELEMENT r2 ((c,d) | (d,c))>
    <!ELEMENT a (#PCDATA)>
    <!ELEMENT b (#PCDATA)>
    <!ELEMENT c (#PCDATA)>
    <!ELEMENT d (#PCDATA)>
]>
<db>
    <r1><a> a1 </a> <b> b1 </b> <c> c1 </c></r1>
    <r1><a> a1 </a> <b> b1 </b> <c> c1 </c></r1>
    <r2><c> c2 </c> <d> d2 </d></r2>
    <r2><c> c3 </c> <d> d3 </d></r2>
    <r2><c> c4 </c> <d> d4 </d></r2>
</db>
The order of the relations should not count, and neither should the order of columns within rows.
29 Attributes
An attribute is associated with a particular element by the DTD and is assigned an attribute type.
The attribute type can restrict the range of values it can hold.
Example attribute types include:
- CDATA indicates a simple string of characters
- NMTOKEN indicates a word or token
- A named token group such as (left | center | right)
- ID is an element id that holds a unique value (among other element IDs in the document)
- IDREF attributes refer to an ID
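A named token group can be tried out in isolation. The sketch below assumes the JDK's built-in parser and declares a required currency attribute restricted to two tokens; the class name AttributeSketch and the "Euros" test value are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class AttributeSketch {
    static final String DTD =
        "<!DOCTYPE Notional [\n"
      + "<!ELEMENT Notional (#PCDATA)>\n"
      + "<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>\n"
      + "]>\n";

    static boolean isValid(String body) {
        final int[] errors = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(DTD + body)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) { errors[0]++; }
                });
        } catch (Exception e) {
            return false;
        }
        return errors[0] == 0;
    }

    public static void main(String[] args) {
        // Value taken from the token group: accepted.
        System.out.println(isValid("<Notional currency='Pounds'>100</Notional>"));
        // Attribute missing although it is #REQUIRED: rejected.
        System.out.println(isValid("<Notional>100</Notional>"));
        // Value outside the token group: rejected.
        System.out.println(isValid("<Notional currency='Euros'>100</Notional>"));
    }
}
```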
30 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Attribute value for "currency" is #REQUIRED.
Valid document is false
31 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional currency="Pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
Valid document is true
32 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional currency="Pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
Valid document is true
#IMPLIED means optional
33 DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap note="For your eyes only">
    <Notional currency="Pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
Valid document is true
36 ID and IDREF Attributes
We can represent complex relationships within an
XML document using ID and IDREF attributes.
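To see what a validating parser enforces for these attribute types, the sketch below checks three small documents: one with consistent references, one with a duplicated ID, and one with an IDREFS value that points at no ID. It assumes the JDK's built-in parser. The flattened Course element, which carries both id and pre, and the class name IdRefSketch are simplifications introduced here for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class IdRefSketch {
    static final String DTD =
        "<!DOCTYPE Courses [\n"
      + "<!ELEMENT Courses (Course+)>\n"
      + "<!ELEMENT Course EMPTY>\n"
      + "<!ATTLIST Course id ID #REQUIRED\n"
      + "                 pre IDREFS #IMPLIED>\n"
      + "]>\n";

    static boolean isValid(String body) {
        final int[] errors = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(DTD + body)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) { errors[0]++; }
                });
        } catch (Exception e) {
            return false;
        }
        return errors[0] == 0;
    }

    public static void main(String[] args) {
        // Every name in pre matches some id: accepted.
        System.out.println(isValid("<Courses><Course id='Math100'/>"
            + "<Course id='Geom100'/><Course id='Calc100' pre='Math100 Geom100'/></Courses>"));
        // The same id used twice: rejected.
        System.out.println(isValid("<Courses><Course id='Math100'/><Course id='Math100'/></Courses>"));
        // pre names an id that is never defined: rejected.
        System.out.println(isValid("<Courses><Course id='Calc100' pre='Geom100'/></Courses>"));
    }
}
```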
37 An Undirected Graph
[Figure: an undirected graph on the vertices u, v, w, x, y, z, with one edge and one vertex labeled.]
38 A Directed Graph
[Figure: a directed graph on the vertices u, v, w, x, y.]
39
[Figure: a prerequisite graph over the courses Math100, Geom100, Calc100, Calc200, Calc300, CS1, CS2 and Philo45.]
This is called a DAG (Directed Acyclic Graph)
40
<?xml version="1.0"?>
<!DOCTYPE Course_Descriptions SYSTEM "course_descriptions.dtd">
<Course_Descriptions>
    <Course>
        <Course-ID id="Math100" />
        <Title>Algebra I</Title>
        <Description>
            Students in this course study introductory algebra.
        </Description>
        <Prerequisites/>
    </Course>
This course has an ID but no prerequisites.
41
    <Course>
        <Course-ID id="Geom100" />
        <Title>Geometry I</Title>
        <Description>
            Students in this course study how to prove several theorems in geometry.
        </Description>
        <Prerequisites/>
    </Course>
The DTD will force this id to be unique.
42
    <Course>
        <Course-ID id="Calc100" />
        <Title>Calculus I</Title>
        <Description>
            Students in this course study the derivative.
        </Description>
        <Prerequisites pre="Math100 Geom100" />
    </Course>
    <Course>
These are references to IDs. (IDREFS)
43
        <Course-ID id="Calc200" />
        <Title>Calculus II</Title>
        <Description>
            Students in this course study the integral.
        </Description>
        <Prerequisites pre="Calc100" />
    </Course>
The DTD requires that this name be a unique id defined within this document. Otherwise, the document is invalid.
44
    <Course>
        <Course-ID id="Calc300" />
        <Title>Calculus II</Title>
        <Description>
            Students in this course study the derivative and the integral (in 3-space).
        </Description>
        <Prerequisites pre="Calc200" />
    </Course>
Prerequisites is an EMPTY element. It's used only for its attributes.
45
    <Course>
        <Course-ID id="CS1" />
        <Title>Introduction to Computer Science I</Title>
        <Description>
            In this course we study Turing machines.
        </Description>
        <Prerequisites pre="Calc100" />
    </Course>
    <Course>
IDREF to ID: a one-to-one link
46
        <Course-ID id="CS2" />
        <Title>Introduction to Computer Science II</Title>
        <Description>
            In this course we study basic data structures.
        </Description>
        <Prerequisites pre="Calc200 CS1"/>
    </Course>
    <Course>
IDREFS to IDs: one-to-many links
47
        <Course-ID id="Philo45" />
        <Title>Ethical Implications of Information Technology</Title>
        <Description>
            TBA
        </Description>
        <Prerequisites/>
    </Course>
</Course_Descriptions>
48 The Course_Descriptions.dtd
<?xml version="1.0"?>
<!-- Course Description DTD -->
<!ELEMENT Course_Descriptions (Course)+>
<!ELEMENT Course (Course-ID,Title,Description,Prerequisites)>
<!ELEMENT Course-ID EMPTY>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Description (#PCDATA)>
<!ELEMENT Prerequisites EMPTY>
<!ATTLIST Course-ID id ID #REQUIRED>
<!ATTLIST Prerequisites pre IDREFS #IMPLIED>
49 General Entities
General entities are used to place text into the
XML document. They may be declared in the DTD
and referenced in the document. They may also be
declared in the DTD as residing in a file.
They may then be referenced in the document.
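The replacement of a general entity can be observed directly by parsing a document and reading back the text of the element that references it. The sketch below assumes the JDK's built-in DOM parser; the class name EntitySketch and the single-element sample document are illustrative only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntitySketch {
    // The entity is declared in the internal subset and referenced as &bankname;.
    static final String XML =
        "<!DOCTYPE Bank [\n"
      + "<!ENTITY bankname \"Mellon National Bank and Trust\">\n"
      + "]>\n"
      + "<Bank>&bankname;</Bank>";

    /** Parses the document and returns the text content of its root element. */
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        // The application sees the replacement text, not the reference.
        System.out.println(rootText(XML));   // prints "Mellon National Bank and Trust"
    }
}
```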
50 Document using a General Entity
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
    <!ENTITY bankname "Mellon National Bank and Trust" >
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Bank (#PCDATA) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
Validate is true
51 The general entity is replaced before XSLT sees it.
XSLT Program
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="Bank">
        <WML>
            <CARD>
                <xsl:apply-templates/>
            </CARD>
        </WML>
    </xsl:template>
    <xsl:template match="Notional | Fixed_Rate | NumYears | NumPayments">
    </xsl:template>
</xsl:stylesheet>
52 XSLT OUTPUT
C:\McCarthy\www\46-928\examples\sax>java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver com.jclark.xsl.sax.Driver FixedFloatSwap.xml FixedFloatSwap.xsl FixedFloatSwap.wml
C:\McCarthy\www\46-928\examples\sax>type FixedFloatSwap.wml
<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon National Bank and Trust</CARD></WML>
53 An external text entity
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
    <!ENTITY bankname SYSTEM "JustAFile.dat" >
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
54 JustAFile.dat
Mellon Bank And Trust Corporation Pittsburgh PA
XSLT Output
<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon Bank And Trust Corporation Pittsburgh PA</CARD></WML>
55 Parameter Entities
While general entities are used to place text into the XML document, parameter entities are used to modify the DTD.
We want to build modular DTDs so that we can create new DTDs using existing ones.
We'll look at a slide from www.fpml.org and then see some examples.
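As a small self-contained illustration of a parameter entity contributing declarations to a DTD, the sketch below keeps two element declarations in a parameter entity and pulls them in with a reference. It assumes the JDK's built-in parser; the entity name leafDecls and the class name ParamEntitySketch are hypothetical. Because everything here lives in the internal subset, the reference appears between declarations, which is the only place XML 1.0 allows a parameter entity reference inside an internal subset.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ParamEntitySketch {
    static final String BODY =
        "<FixedFloatSwap><Notional>100</Notional><Fixed_Rate>5</Fixed_Rate></FixedFloatSwap>";

    /** Builds the sample document, with or without the %leafDecls; reference. */
    static String doc(boolean includeReference) {
        return "<!DOCTYPE FixedFloatSwap [\n"
             + "<!ENTITY % leafDecls \"<!ELEMENT Notional (#PCDATA)>"
             + " <!ELEMENT Fixed_Rate (#PCDATA)>\">\n"
             + "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate)>\n"
             + (includeReference ? "%leafDecls;\n" : "")
             + "]>\n" + BODY;
    }

    static boolean isValid(String xml) {
        final int[] errors = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) { errors[0]++; }
                });
        } catch (Exception e) {
            return false;
        }
        return errors[0] == 0;
    }

    public static void main(String[] args) {
        // With the reference, Notional and Fixed_Rate are declared.
        System.out.println(isValid(doc(true)));
        // Without it, the entity is declared but never used, so they are not.
        System.out.println(isValid(doc(false)));
    }
}
```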
56 FpML is a Complete Description of the Trade
Vanilla Swap, Vanilla Fixed Float Swap, Cancellable Swaption, FX Spot, FX Outright, FX Swap, Forward Rate Agreement, ...
Pool of modular components grouped into separate namespaces:
- Rate
- Party
- Date
- Money
- Notional
- Product
- Adjustable Period
- Date Schedule
57 Internal Parameter Entities
DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ENTITY % parsedCharacterData "(#PCDATA)">
<!ELEMENT Notional %parsedCharacterData; >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>
58 External Parameter Entities and DTD Components
Order.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ORDER SYSTEM "order.dtd">
<!-- example order form from XML: A Manager's Guide -->
<ORDER SOURCE="web" CUSTOMERTYPE="consumer" CURRENCY="USD">
    <addresses>
        <address ADDTYPE="billship">
            <firstname>Kevin</firstname>
            <lastname>Dick</lastname>
            <street ORDER="1">123 Anywhere Lane</street>
            <street ORDER="2">Apt 1b</street>
            <city>Palo Alto</city>
            <state>CA</state>
            <postal>94303</postal>
            <country>USA</country>
        </address>
59 An order may have more than one address.
        <address ADDTYPE="bill">
            <firstname>Kevin</firstname>
            <lastname>Dick</lastname>
            <street ORDER="1">123 Not The Same Lane</street>
            <street ORDER="2">Work Place</street>
            <city>Palo Alto</city>
            <state>CA</state>
            <postal>94300</postal>
            <country>USA</country>
        </address>
    </addresses>
60 Several products may be purchased.
    <lineitems>
        <lineitem ID="line1">
            <product CAT="MBoard">440BX Motherboard</product>
            <quantity>1</quantity>
            <unitprice>200</unitprice>
        </lineitem>
        <lineitem ID="line2">
            <product CAT="RAM">128 MB PC-100 DIMM</product>
            <quantity>2</quantity>
            <unitprice>175</unitprice>
        </lineitem>
        <lineitem ID="line3">
            <product CAT="CDROM">40x CD-ROM</product>
            <quantity>1</quantity>
            <unitprice>50</unitprice>
        </lineitem>
    </lineitems>
61 The payment is with a Visa card.
    <payment>
        <card CARDTYPE="VISA">
            <cardholder>Kevin S. Dick</cardholder>
            <cardnumber>11111-22222-33333</cardnumber>
            <expiration>01/01</expiration>
        </card>
    </payment>
</ORDER>
We want this document to be validated.
62 order.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!-- Example Order form DTD adapted from XML: A Manager's Guide -->
<!-- Define an ORDER element -->
<!ELEMENT ORDER (addresses, lineitems, payment)>
<!ATTLIST ORDER
    SOURCE       (web | phone | retail)  #REQUIRED
    CUSTOMERTYPE (consumer | business)   "consumer"
    CURRENCY     CDATA                   "USD"
>
Define an order based on other elements.
63
<!ENTITY % anAddress SYSTEM "address.dtd" >
%anAddress;
<!-- Collection of Addresses -->
<!ELEMENT addresses (address+)>
<!ENTITY % aLineItem SYSTEM "lineitem.dtd" >
%aLineItem;
<!-- Collection of LineItems -->
<!ELEMENT lineitems (lineitem+)>
<!ENTITY % aPayment SYSTEM "payment.dtd" >
%aPayment;
Each <!ENTITY % ... SYSTEM ...> line is an external parameter entity declaration.
Each %name; line is an external parameter entity reference.
64 address.dtd
<!-- Address Structure -->
<!ELEMENT address (firstname, middlename?, lastname, street+,
                   city, state, postal, country)>
<!ELEMENT firstname  (#PCDATA)>
<!ELEMENT middlename (#PCDATA)>
<!ELEMENT lastname   (#PCDATA)>
<!ELEMENT street     (#PCDATA)>
<!ELEMENT city       (#PCDATA)>
<!ELEMENT state      (#PCDATA)>
<!ELEMENT postal     (#PCDATA)>
<!ELEMENT country    (#PCDATA)>
<!ATTLIST address ADDTYPE (bill | ship | billship) "billship">
<!ATTLIST street ORDER CDATA #IMPLIED>
65 lineitem.dtd
<!ELEMENT lineitem (product,quantity,unitprice)>
<!ATTLIST lineitem ID ID #REQUIRED>
<!ELEMENT product (#PCDATA)>
<!ATTLIST product CAT (CDROM|MBoard|RAM) #REQUIRED>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT unitprice (#PCDATA)>
66 payment.dtd
<!ELEMENT payment (card | PO)>
<!ELEMENT card (cardholder, cardnumber, expiration)>
<!ELEMENT cardholder (#PCDATA)>
<!ELEMENT cardnumber (#PCDATA)>
<!ELEMENT expiration (#PCDATA)>
<!ELEMENT PO (number,authorization)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT authorization (#PCDATA)>
<!ATTLIST card CARDTYPE (VISA|MasterCard|Amex) #REQUIRED>
67 XML Schemas are Coming
- XML Schema is the official name
- XSDL (XML Schema Definition Language) is the language used to create schema definitions
- Can be used to more tightly constrain a document instance
- Supports namespaces
- Permits type derivation
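One way to see the tighter constraints is to type an element as xs:short, something a DTD cannot express, and validate a few values against it. The sketch below uses the javax.xml.validation API shipped with the JDK; the class name SchemaSketch and the one-element schema without a target namespace are simplifying assumptions.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

class SchemaSketch {
    // A one-element schema: postalCode must be a valid xs:short.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='postalCode' type='xs:short'/>"
      + "</xs:schema>";

    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, validate() throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<postalCode>15216</postalCode>"));
        // Not a number, so it fails the type check.
        System.out.println(isValid("<postalCode>fifteen</postalCode>"));
        // A number, but outside the xs:short range of -32768 to 32767.
        System.out.println(isValid("<postalCode>94303</postalCode>"));
    }
}
```

Note that xs:short rejects many real postal codes, so a string type with a pattern may be a better fit in practice.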
68 A Simple Purchase Order
<?xml version="1.0" encoding="UTF-8"?>
<!-- po.xml -->
<purchaseOrder orderDate="07.23.2001"
    xmlns="http://www.cds-r-us.com"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.cds-r-us.com po.xsd">
69
    <recipient country="USA">
        <name>Dennis Scannel</name>
        <street>175 Perry Lea Side Road</street>
        <city>Waterbury</city>
        <state>VT</state>
        <postalCode>15216</postalCode>
    </recipient>
    <order>
        <cd artist="Brooks Williams" title="Little Lion" />
        <cd artist="David Wilcox" title="What you whispered" />
    </order>
</purchaseOrder>
70 Purchase Order XSDL
<?xml version="1.0" encoding="utf-8"?>
<!-- po.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.cds-r-us.com"
    targetNamespace="http://www.cds-r-us.com">
71
    <xs:element name="purchaseOrder">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="recipient" />
                <xs:element ref="order" />
            </xs:sequence>
            <xs:attribute name="orderDate" type="xs:string" />
        </xs:complexType>
    </xs:element>
72
    <xs:element name="recipient">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="name" />
                <xs:element ref="street" />
                <xs:element ref="city" />
                <xs:element ref="state" />
                <xs:element ref="postalCode" />
            </xs:sequence>
            <xs:attribute name="country" type="xs:string" />
        </xs:complexType>
    </xs:element>
73
    <xs:element name="name" type="xs:string" />
    <xs:element name="street" type="xs:string" />
    <xs:element name="city" type="xs:string" />
    <xs:element name="state" type="xs:string" />
    <xs:element name="postalCode" type="xs:short" />
    <xs:element name="order">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="cd" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
74
    <xs:element name="cd">
        <xs:complexType>
            <xs:attribute name="artist" type="xs:string" />
            <xs:attribute name="title" type="xs:string" />
        </xs:complexType>
    </xs:element>
</xs:schema>