Title: An Introduction to XML
1. An Introduction to XML
Paul Donohue
May 8th 2002 Hotel Senator Zürich
2. Overview
Topics we will cover
- What Is XML?
- Why Use XML?
- Defining Rules with XML
- Related Technologies
- Demonstration
- Summary
- Questions
3. What is XML?
XML In A Nutshell
- XML = Extensible Markup Language
- Not a programming language
- An open standard for representing structured data
- Describes data structure and content
- Separates data from its presentation
4. What is XML?
Markup Clarifies Data
- A woman without her man is nothing
- A woman without her man, is nothing
- A woman: without her, man is nothing
5. What is XML?
Some History
1969 Generalized Markup Language (GML)
1980 Standard Generalized Markup Language (SGML)
1986 SGML becomes ISO standard
1991 Hypertext Markup Language (HTML)
1996 W3C begins work on a language to combine SGML and HTML
1998 XML standard is published
6. What is XML?
An XML Document
<?xml version="1.0"?>
<talk code="ID352">
  <!--Example XML file-->
  <title>A PB / XML Messaging System</title>
  <presenter>Mr Paul Donohue</presenter>
  <audience>PowerBuilder Developers</audience>
  <time>1330</time>
  <date>2001-08-13</date>
</talk>
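The example document can be read by any XML parser. As a minimal sketch, assuming Python's standard xml.etree.ElementTree module (the slides do not name a parser at this point), the attribute and child elements of the talk element could be read like this:

```python
import xml.etree.ElementTree as ET

# The example document from the slide, held in a string.
TALK_XML = """<?xml version="1.0"?>
<talk code="ID352">
  <!--Example XML file-->
  <title>A PB / XML Messaging System</title>
  <presenter>Mr Paul Donohue</presenter>
  <audience>PowerBuilder Developers</audience>
  <time>1330</time>
  <date>2001-08-13</date>
</talk>"""

root = ET.fromstring(TALK_XML)

# Attributes are read from the element that owns them.
talk_code = root.get("code")

# Child element content is read by element name.
title = root.findtext("title")

# The default parser skips comments, so only elements are listed here.
child_tags = [child.tag for child in root]
```

The variable names (talk_code, title, child_tags) are illustrative only.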
7. What is XML?
Parts of an XML document
<?xml version="1.0"?>
<!DOCTYPE talk SYSTEM "DEMO.DTD">
<talk code="ID352">
  <!--Example XML file-->
  <title>A PB / XML Messaging System</title>
  <presenter>Mr Paul Donohue</presenter>
  <audience>PowerBuilder Developers</audience>
  <time>1330</time>
  <date>2001-08-13</date>
</talk>
8. What is XML?
Elements
- Elements are the basic building blocks of XML
- XML's nouns
- Elements consist of a start tag, contents and an end tag: <animal>cat</animal>
- Empty elements can be shown as <animal/>
- Contents can be data or other elements
- Elements can own attributes
9. What is XML?
Attributes
- Attributes give further information about an element
- XML's adjectives
- Attributes consist of name, equals and value: cat_type="persian"
- Attribute values are quote-delimited strings
- Attributes are placed inside an element's start tag
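To make the element and attribute rules concrete, here is a small sketch that builds the animal element with a cat_type attribute and serializes it. It assumes Python's xml.etree.ElementTree; putting the attribute on the animal element is an assumption made for illustration, since the slides show the two examples separately.

```python
import xml.etree.ElementTree as ET

# An element with one attribute and text content.
animal = ET.Element("animal", {"cat_type": "persian"})
animal.text = "cat"

# Serialize to a string: start tag (with attribute), contents, end tag.
serialized = ET.tostring(animal, encoding="unicode")

# An element with no content is written in the short "empty element" form.
empty = ET.Element("animal")
empty_serialized = ET.tostring(empty, encoding="unicode")

# Reading the serialized form back gives the same attribute value.
round_trip = ET.fromstring(serialized).get("cat_type")
```

The serializer always quotes the attribute value, which matches the rule that attribute values are quote-delimited strings.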
10. What is XML?
Characters
- XML messages are text files
- XML is case sensitive
- XML uses the Unicode 2.1 character set
- Special codes for markup characters such as < (&lt;) and & (&amp;)
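Text that contains markup characters has to be escaped before it is placed inside an element. A minimal sketch, assuming Python's xml.sax.saxutils helpers (the sample string is made up for illustration):

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical text containing two markup characters.
raw = "Fish & Chips < 5 GBP"

# escape() replaces &, < and > with their special codes.
escaped = escape(raw)

# unescape() reverses the substitution.
restored = unescape(escaped)
```

Most XML libraries apply this escaping automatically when they write a document, so calling it by hand is mainly needed when XML is assembled as plain strings.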
11. What is XML?
Names
- All elements and attributes are named
- Must begin with a letter, underscore or colon
- Must continue with valid name characters: letter, underscore, colon, digit, hyphen or full stop
- May not begin with XML (in any mix of upper and lower case)
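The naming rules can be approximated with a regular expression. This is a simplified sketch that only accepts ASCII letters, whereas real XML names may also use many other Unicode letters; the function name is_valid_name is hypothetical.

```python
import re

# Simplified, ASCII-only approximation of the naming rules:
# first character: letter, underscore or colon;
# later characters: letter, digit, underscore, colon, full stop or hyphen.
NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_valid_name(name: str) -> bool:
    """Return True if name follows the simplified naming rules."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    # Names beginning with "xml" in any case are reserved.
    return not name.lower().startswith("xml")
```

For example, is_valid_name("cat_type") is accepted, while "1st" (starts with a digit) and "xmlData" (reserved prefix) are rejected.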
12. What is XML?
XML vs HTML
- XML uses tags like in HTML
- XML complements HTML
- XML is for smart data
- XML is both data and document
13. What is XML?
Data encoded as HTML
<table width="500" border="0" cellspacing="0" cellpadding="0">
  <tr bordercolor="#FFFFFF" bgcolor="#6666FF">
    <td width="214">Sybase</td>
    <td width="150"><b>PowerBuilder 7.1</b></td>
    <td width="136">&pound;145.00</td>
  </tr>
</table>
14. What is XML?
Data encoded as XML
<Software>
  <Publisher>Sybase</Publisher>
  <Title>PowerBuilder
    <MajorVersion>7</MajorVersion>
    <MinorVersion>1</MinorVersion>
  </Title>
  <Price currency="GBP">145.00</Price>
</Software>
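Because the XML version names each piece of data, a program can pick out the values it needs without knowing anything about layout. A sketch, assuming Python's xml.etree.ElementTree and decimal modules:

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The software record from the slide.
SOFTWARE_XML = """<Software>
  <Publisher>Sybase</Publisher>
  <Title>PowerBuilder
    <MajorVersion>7</MajorVersion>
    <MinorVersion>1</MinorVersion>
  </Title>
  <Price currency="GBP">145.00</Price>
</Software>"""

software = ET.fromstring(SOFTWARE_XML)

# The product name is the text that precedes the nested version elements.
title = software.find("Title")
product = title.text.strip()
version = f"{title.findtext('MajorVersion')}.{title.findtext('MinorVersion')}"

# The price is element content; its currency is an attribute.
price_element = software.find("Price")
price = Decimal(price_element.text)
currency = price_element.get("currency")
```

Extracting the same values from the HTML table would mean relying on column positions and stripping presentation tags.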
15. Why Use XML?
Six Good Reasons
- Royalty free
- Industry standard
- Platform and vendor independent
- Self describing
- Flexible
- Caters for nested and repeating data
16. Why Use XML?
Royalty Free
- Nobody owns XML
- No software to purchase
- No licensing fees
17. Why Use XML?
Industry Standard
- XML version 1.0 became a W3C standard in 1998
- Good support from vendors
- Low risk technology
- Large community of developers
18. Why Use XML?
Platform and Vendor Independent
- XML documents are text based
- Perfect for messaging
- There are no vendor-specific extensions
- PDA to Mainframe
19. Why Use XML?
Self Describing
- Descriptive element and attribute names
- The name / data combination is easy to understand: <price>11.50</price> <currency>GBP</currency>
- XML can be viewed with a text editor
20. Why Use XML?
Flexible
- XML can handle any structured data
- XML can be easily transformed
- Direct access to the required data
21. Why Use XML?
Nested and Repeating Data
- Nested data: an employee has an address, and that address has a street and a post code
- Repeating data: an invoice has one or more items on it
- Hard to do with traditional file formats
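Nested and repeating data map naturally onto nested and repeated elements. The sketch below uses a made-up invoice document (all element names and values are hypothetical) and Python's xml.etree.ElementTree to read one nested value and a list of repeated items:

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice: the address is nested, the items repeat.
INVOICE_XML = """<invoice number="1001">
  <customer>
    <address>
      <street>1 High Street</street>
      <postcode>AB1 2CD</postcode>
    </address>
  </customer>
  <item><description>Widget</description><price>11.50</price></item>
  <item><description>Gadget</description><price>3.25</price></item>
</invoice>"""

invoice = ET.fromstring(INVOICE_XML)

# Nested data: follow the path customer -> address -> postcode.
postcode = invoice.findtext("customer/address/postcode")

# Repeating data: findall() returns every matching child element.
prices = [float(item.findtext("price")) for item in invoice.findall("item")]
total = sum(prices)
```

A flat, fixed-width or comma-separated file would need extra conventions to express either the nesting or the variable number of items.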
22. Defining Rules
Well-Formed XML Documents
- Conform to the grammar of XML
- One root element
- Non-empty elements have start and end tags
- Elements are nested correctly
- Attributes are not repeated within elements
- Attribute values are quoted
- Can be parsed by any parser
- The data may be nonsense
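The well-formedness rules can be checked simply by asking a parser to read the document. A minimal sketch, assuming Python's xml.etree.ElementTree; the helper name is_well_formed is hypothetical:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a well-formed document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One check per rule from the list above.
nested_ok = is_well_formed("<a><b></b></a>")
bad_nesting = is_well_formed("<a><b></a></b>")
two_roots = is_well_formed("<a/><b/>")
repeated_attribute = is_well_formed('<a x="1" x="2"/>')
unquoted_value = is_well_formed("<a x=1/>")

# Well-formed does not mean sensible: the parser accepts nonsense data.
nonsense = is_well_formed("<mood>purple monkey</mood>")
```

Only nested_ok and nonsense come back True; the other four documents each break one of the grammar rules.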
23. Defining Rules
Valid XML Documents
- Must contain a valid document type declaration
- Must obey the constraints of that declaration
- Element sequence is valid
- Required attributes are provided
- Attribute values are valid
- Ensures data is valid for the application domain
- Rules are in a DTD or schema
24. Defining Rules
Document Type Definitions (DTD)
- DTDs define validation rules for XML documents
- Elements: contents, order and occurrence
- Attributes: valid and default values
- DTDs are optional
- DTDs can be internal or external
- DTDs are written in XML Declaration Syntax
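The difference between well-formed and valid can be seen with an internal DTD. The sketch below uses a hypothetical DTD for the talk document and Python's xml.etree.ElementTree, which is a non-validating parser: it reads the DTD but does not enforce it, so a document that breaks the rules still parses. Enforcing the rules would need a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD: a talk must contain a title and a presenter,
# and must carry a code attribute. The document below breaks both rules.
DOC_WITH_INTERNAL_DTD = """<?xml version="1.0"?>
<!DOCTYPE talk [
  <!ELEMENT talk (title, presenter)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT presenter (#PCDATA)>
  <!ATTLIST talk code CDATA #REQUIRED>
]>
<talk>
  <title>A PB / XML Messaging System</title>
</talk>"""

# The document is well-formed, so a non-validating parser accepts it.
root = ET.fromstring(DOC_WITH_INTERNAL_DTD)
root_tag = root.tag

# The DTD violations go undetected: no code attribute, no presenter.
has_code = "code" in root.attrib
has_presenter = root.find("presenter") is not None
```

Both has_code and has_presenter are False, yet no error is raised, which is why validation has to be requested explicitly when it matters.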
25. Defining Rules
Schemas
- Schemas are more powerful than DTDs
- Data types
- Improved occurrence constraints
- Schemas are written in XML
- Schemas can refer to other schemas
- Get your DTD or schema correct before you code
26. Defining Rules
Semantics
- Standard terms facilitate data exchange
- Industry-wide standards have emerged
- MathML: Mathematical Markup Language
- CML: Chemical Markup Language
- FpML: Financial products Markup Language
- CDF: Channel Definition Format
- Check if your industry or organisation has a standard
27. Related Technologies
Overview
- Parsers: SAX and DOM
- Searching: XPath
- Formatting: CSS, XSL and XSLT
- Linking: XLink and XPointer
- Resource Description Framework (RDF)
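As a taste of searching with XPath, the sketch below uses Python's xml.etree.ElementTree, which supports only a limited subset of XPath. The catalog document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog used only to illustrate path expressions.
CATALOG_XML = """<catalog>
  <software vendor="Sybase"><title>PowerBuilder</title></software>
  <software vendor="Other"><title>Editor</title></software>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# Select titles of software whose vendor attribute equals 'Sybase'.
sybase_titles = [
    t.text for t in catalog.findall("./software[@vendor='Sybase']/title")
]

# Select every title element anywhere below the root.
all_titles = [t.text for t in catalog.findall(".//title")]
```

A full XPath implementation adds functions, axes and richer predicates beyond what is used here.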
28. Related Technologies
Simple API for XML (SAX)
- Event driven
- Can handle large files
- No random access
- Read only
- Primarily for Java
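The event-driven style looks like this in practice. The sketch assumes Python's xml.sax binding rather than the Java API the slide refers to; the TagCounter class and the sample document are hypothetical.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Counts how often each element name starts."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every start tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


handler = TagCounter()

# The parser streams through the input and fires events on the handler;
# it never builds a tree, so memory use does not grow with document size.
xml.sax.parseString(b"<invoice><item/><item/><item/></invoice>", handler)
counts = handler.counts
```

Because the handler only sees one event at a time, it cannot go back to an earlier element or change the document, which is the "no random access, read only" trade-off.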
29. Related Technologies
Document Object Model (DOM)
- Standard set of function calls
- XML loaded into memory
- Best for smaller files
- Data is parsed into a tree of nodes
- Language and platform neutral
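For comparison with SAX, here is a sketch of the DOM style, assuming Python's xml.dom.minidom implementation. The whole document is loaded into a tree of nodes that can be navigated and modified; the room attribute is made up for illustration.

```python
from xml.dom.minidom import parseString

# Parse the whole document into an in-memory tree of nodes.
document = parseString(
    "<talk code='ID352'><title>A PB / XML Messaging System</title></talk>"
)
root = document.documentElement

# Random access: go straight to the node that is needed.
code_value = root.getAttribute("code")
title_node = root.getElementsByTagName("title")[0]
title_text = title_node.firstChild.data

# Unlike SAX, the tree can be changed and written back out.
root.setAttribute("room", "A1")  # hypothetical attribute
updated_xml = document.toxml()
```

The method names (getAttribute, getElementsByTagName, setAttribute) come from the DOM standard, so equivalent calls exist in other languages.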
30. Demonstration
31. Summary
32. Summary
What have we learnt?
- What XML is
- Why we should use XML
- How to define rules in XML
- XML's related technologies
33. Summary
Recommended reading
- Title: Professional XML
- Author: Mark Birbeck et al
- Publisher: Wrox Press Inc
- ISBN: 1861003110
34. Summary
Recommended reading
- Title: Fast Track To XML
- Author: Eric Zenor
- Publisher: Sybase
- Article: 1003388
35. Summary
Useful Web Sites
- XML Org: www.xml.org
- W3C: www.w3.org/xml
- XML FAQ: www.ucc.ie/xml
- XML Cover Pages: www.oasis-open.org/cover/sgml-xml
- XML Journal: www.sys-con.com/xml
36. Summary
Useful XML Tools
- MS XML Notepad: http://msdn.microsoft.com
- XML Spy: http://www.xmlspy.com
37. Questions
- If you have any questions about this presentation please email me or visit my web site.
- Email: info_at_pauldonohue.com
- Web: www.pauldonohue.com