An XML Tutorial - PowerPoint PPT Presentation

1
An XML Tutorial
Hannes Marais
Systems Research Center, Palo Alto, California
marais@pa.dec.com
2
What is XML?
Extensible Markup Language (XML) is a W3C
proposed recommendation for a file format
for distributing electronic documents
easily and cheaply on the World Wide Web.
3
Example Documents
  • Books
  • User manuals
  • Product catalogs
  • Order forms
  • Medical documents
  • Tax forms
  • Mathematical formulas
  • Chemical formulas
  • Drug descriptions
  • Dictionaries
  • Newspapers
  • Style sheets
  • Musical scores
  • Library indices
  • Protein sequences
  • Bibliographies
  • Database schemas

4
Example E-Mail Document in XML
<?XML VERSION="1.0"?>
<!-- This is a sample email data file -->
<!DOCTYPE mail SYSTEM "email.dtd" [
  <!ENTITY ingo "Ingo.Macherius@tu-clausthal.de">
  <!ENTITY henning "hb@ix.heise.de">
]>
<mail>
  <Recipient>&henning;</Recipient>
  <Sender>&ingo;</Sender>
  <Date>Mon, 21 Apr 1997 09:27:55 +0200</Date>
  <Subject>XML literature</Subject>
  <Textbody>
    <p>Hello Mr <Name>Behme</Name>,</p>
    <p>Please read <Name>Jon Bosak</Name>'s introductory text</p>
    <p>"SGML, Java and the Future of the Web"</p>
    <p>Best wishes,</p>
    <p><Name>Ingo Macherius</Name></p>
  </Textbody>
</mail>
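As a rough illustration of how a consuming application might read the e-mail document above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It rests on several assumptions: the lowercase declaration of final XML 1.0 (<?xml version="1.0"?>) is used instead of the draft-era form on the slide, the Recipient and Sender elements hold the entity references &henning; and &ingo;, and the text body is shortened. The external email.dtd is not fetched; only the entities declared inside the DOCTYPE are expanded. The function name read_mail is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the slide's document (lowercase XML declaration).
EMAIL_XML = """<?xml version="1.0"?>
<!-- This is a sample email data file -->
<!DOCTYPE mail SYSTEM "email.dtd" [
  <!ENTITY ingo "Ingo.Macherius@tu-clausthal.de">
  <!ENTITY henning "hb@ix.heise.de">
]>
<mail>
  <Recipient>&henning;</Recipient>
  <Sender>&ingo;</Sender>
  <Date>Mon, 21 Apr 1997 09:27:55 +0200</Date>
  <Subject>XML literature</Subject>
  <Textbody>
    <p>Hello Mr <Name>Behme</Name>,</p>
    <p>Best wishes,</p>
    <p><Name>Ingo Macherius</Name></p>
  </Textbody>
</mail>
"""


def read_mail(xml_text):
    """Parse the mail document and pull out a few fields (hypothetical helper)."""
    root = ET.fromstring(xml_text)  # internal entities are expanded while parsing
    return {
        "recipient": root.findtext("Recipient"),
        "sender": root.findtext("Sender"),
        "subject": root.findtext("Subject"),
        # Every <Name> element anywhere below the root, in document order.
        "names": [name.text for name in root.iter("Name")],
    }


mail = read_mail(EMAIL_XML)
print(mail["sender"], "->", mail["recipient"])
```

Because the element names carry the meaning (Sender, Subject, Name), the consumer can pick out fields by name instead of guessing from layout.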
5
XML Features in a Nutshell
  • Extensibility
      • HTML with user-defined tags
      • Can be used in any domain
  • Structure
      • Can represent trees and graph structures
        (database schemas, OO hierarchies, ...)
  • Validation
      • Consuming applications can check for structural
        validity on import

6
XML Development Timeline
  • 1986: Standard Generalized Markup Language (SGML),
    ISO 8879:1986
  • Nov 1995: HTML 2.0
  • Nov 1996: Simplified / stripped-down SGML draft
    (dubbed XML)
  • Jan 1997: HTML 3.2
  • Aug 1997: XML Working Draft
  • Dec 1997: XML 1.0 Proposed Recommendation; HTML 4.0
    Recommendation

7
Architectural Dependencies
[Diagram: layers, listed top to bottom]
  • Instances / Domains: RDF, CDF, CML, ...; HTML, ...
  • XML
  • SGML
8
Overview of the Tutorial
  • W3C Design Goals of XML
  • The XML Format
  • Example Applications
  • Conclusions

9
XML shall ...
  • Be usable over the Internet
  • Support a wide variety of applications
  • Be SGML compatible
  • Be easy to write
  • Be easy to process by program
  • Have no optional features
  • Be human-legible and clear
  • Be designed quickly
  • Have a formal and concise design

10
XML Markup Overview
  • XML documents contain
      • Character data
      • Comments & escaped content
      • Processing instructions
      • Elements
      • Document Type Definition markup

11
Character Data
  • Unicode (ISO 10646) characters without markup
  • Example:

<p>Hello Mr <Name>Behme</Name>,</p>
<p>Please read <Name>Jon Bosak</Name>'s introductory text</p>
<p>"SGML, Java and the Future of the Web"</p>
<p>Best wishes,</p>
<p><Name>Ingo Macherius</Name></p>
12
Comments & Escaped Content
  • Comments are ignored during processing; CDATA sections
    pass their content through as character data, not markup
  • Example:

<!-- This is a sample email data file -->
<![CDATA[ any markup here ]]>
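A small Python sketch of how a parser treats these two constructs, using the standard library's xml.etree.ElementTree with its default settings. The wrapper elements note and code are made up for the illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper elements <note> and <code>, used only for this sketch.
SAMPLE = """<note>
  <!-- This is a sample email data file -->
  <code><![CDATA[<p>any markup here</p>]]></code>
</note>"""

root = ET.fromstring(SAMPLE)

# With the default parser the comment does not show up in the tree at all.
child_tags = [child.tag for child in root]

# The CDATA section arrives as plain text; the <p> inside it is not an element.
code_text = root.findtext("code")
nested_elements = list(root.find("code"))

print(child_tags)
print(code_text)
```

The comment is dropped, while the CDATA content survives verbatim as text, which is why CDATA is useful for embedding markup-like text in a document.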
13
Processing Instructions
  • Special instructions to the XML consumer
    application
  • Example:

<?XML VERSION="1.0" ?>
14
Elements
  • Consist of a start tag, body, and end tag
  • Body can be any other markup (even nested)
  • Examples:

<p>Best wishes,</p>
<p><Name>Ingo Macherius</Name></p>

  • Empty element shorthand:

<p></p>
<p/>
15
Attributes
  • Optional (attribute, value) pairs associated with
    elements
  • Example:

<person firstname="Jon" surname="Bosak">
  <address>Sun Microsystems</address>
  <e-mail>bosak@sun.com</e-mail>
</person>
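To make the difference between attributes and element content concrete, here is a minimal Python sketch with the standard library's xml.etree.ElementTree. It assumes the person element reads firstname="Jon" surname="Bosak"; the nickname attribute is hypothetical and is only there to show that attributes are optional.

```python
import xml.etree.ElementTree as ET

# Assumed form of the slide's example.
PERSON_XML = """<person firstname="Jon" surname="Bosak">
  <address>Sun Microsystems</address>
  <e-mail>bosak@sun.com</e-mail>
</person>"""

person = ET.fromstring(PERSON_XML)

# Attributes are looked up by name on the element itself.
full_name = f'{person.get("firstname")} {person.get("surname")}'

# A missing attribute is not an error; get() returns the supplied default.
nickname = person.get("nickname", "(none)")  # "nickname" is a made-up attribute

# Nested data lives in child elements, not in attributes.
email = person.findtext("e-mail")

print(full_name, nickname, email)
```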
16
Doc Type Declarations
  • Identifies the XML instance being used with
    a reference to the Document Type Definition (DTD)
  • Example:

<!DOCTYPE mail SYSTEM "email.dtd" [
  <!ENTITY ingo "Ingo.Macherius@tu-clausthal.de">
  <!ENTITY henning "hb@ix.heise.de">
]>
17
Document Type Definition (DTD)
  • Identifies the syntax of the XML flavor being
    used, e.g. CDF, RDF, CML, ...
  • Meta-information about the document contents
      • Valid element names
      • Valid attribute names and values
      • How elements can nest in each other
  • Typically the DTD is stored in a separate
    document
  • The DTD does not say anything about document
    semantics

18
Well-formed vs. Valid Documents
  • Well-formed
      • Conforms to the basic XML syntax
      • Can be parsed without regard to the DTD
  • Valid
      • Well-formed
      • Conforms to its DTD
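The well-formedness half of this distinction can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree, which is non-validating: it reports syntax errors but never checks a document against its DTD, so the "valid" half would need a validating parser instead. The helper name is_well_formed and the two sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, regardless of any DTD."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good = "<mail><Subject>XML literature</Subject></mail>"
bad = "<mail><Subject>XML literature</mail>"  # <Subject> is never closed

print(is_well_formed(good), is_well_formed(bad))
```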

19
DTD Element Declarations
  • Specifies a valid element and its valid
    contents (what can be nested inside the element?)
  • Uses regular expressions to define valid contents
  • Examples:

<!ELEMENT br EMPTY>   // empty element
<!ELEMENT p ANY>      // allows everything
<!ELEMENT mail (subject | from | to | textbody)*>
20
DTD Attribute List Declarations
  • Defines allowed attribute names and values of an
    element
  • Example:

<!ATTLIST list
          listtype  (bullets|ordered|glossary)  "glossary"
          name      CDATA                       #REQUIRED >
          (Name)    (Type)                      (Default value)
21
DTD: Some Attribute Types
  • CDATA - Any value
  • ID - Unique identifier for the XML Element
  • IDREF - Reference to an element with a specific
    ID
  • IDREFS - Sequence of IDREFs

22
Quick review
  • Comments & escaped content
      • <!-- -->   <![CDATA[ ]]>
  • Processing instructions
      • <?XML ?>
  • Elements
      • <A href="http://www.w3.org">W3C</A>
  • Document Type Definition markup
      • DOCTYPE, ELEMENT, ATTLIST

23
Convention: Data Structures
  • Trees

<node name="x">
  <node name="a"> This is a. </node>
  <node name="b"/>
</node>

  • Graphs

<node id="node101"> This is 101.</node>
<start ref="node101"/>
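The graph convention relies on one element pointing at another by identifier. A minimal Python sketch of following such a reference is shown below. It assumes the attribute names id and ref, and it wraps the nodes in a hypothetical graph root element so that the sample is a single well-formed document. A non-validating parser does not know which attributes are declared as ID or IDREF, so the lookup table is built by hand.

```python
import xml.etree.ElementTree as ET

# <graph> and the second node are additions made for this sketch.
GRAPH_XML = """<graph>
  <node id="node101">This is 101.</node>
  <node id="node102">This is 102.</node>
  <start ref="node101"/>
</graph>"""

root = ET.fromstring(GRAPH_XML)

# Index every element that carries an "id" attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def resolve(ref_element):
    """Follow the "ref" attribute of an element to the element it names."""
    return by_id[ref_element.get("ref")]


target = resolve(root.find("start"))
print(target.text)
```

Nesting alone can only express trees; references of this kind let several elements point at the same node, which gives a graph.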
24
Convention: Link Identification
  • Convention for identifying and displaying links
    (URLs)
  • Tells how to display links
      • Embed, replace page, new window
  • Examples:

<A XML-LINK="SIMPLE" HREF="http://www.w3.org" SHOW="EMBED">W3C</A>

<LINKSET XML-LINK="EXTENDED">
  <LINK XML-LINK="LOCATOR" HREF="..."/>
  <LINK XML-LINK="LOCATOR" HREF="..."/>
</LINKSET>
25
Convention: Extended Pointers (XPointers)
  • String that identifies a specific element in a
    document
  • Can be used wherever URLs are
  • XPointer traces a relative path through the XML
    parse tree
  • Expressible relationships
      • Child, parent, descendant, ancestor, preceding,
        following, sibling, ...
  • Example:

CHILD(3, DIV1)CHILD(4, DIV2)CHILD(29, P)

means the 29th paragraph (P) of the 4th subdivision (DIV2)
of the 3rd division (DIV1)
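The path-tracing idea can be sketched in a few lines of Python. The code below handles only the CHILD(n, TYPE) step from the example, counted from 1 among the children of that element type; it is not an implementation of the XPointer draft, and the function names and the small test document are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET

# One step of the form CHILD(n, TYPE); other relationships are not handled.
STEP = re.compile(r"CHILD\(\s*(\d+)\s*,\s*([\w.-]+)\s*\)")


def parse_pointer(pointer):
    """Split a pointer string into (index, element type) steps."""
    if STEP.sub("", pointer).strip():
        raise ValueError(f"unsupported pointer syntax: {pointer!r}")
    return [(int(n), tag) for n, tag in STEP.findall(pointer)]


def resolve_pointer(root, pointer):
    """Walk down from root one CHILD step at a time; None if a step fails."""
    current = root
    for n, tag in parse_pointer(pointer):
        matches = current.findall(tag)  # direct children of that element type
        if n < 1 or n > len(matches):
            return None
        current = matches[n - 1]
    return current


# Made-up document: two divisions, the second with two subdivisions.
DOC = ET.fromstring(
    "<doc>"
    "<DIV1><P>a</P></DIV1>"
    "<DIV1><DIV2><P>b1</P></DIV2><DIV2><P>c1</P><P>c2</P></DIV2></DIV1>"
    "</doc>"
)

hit = resolve_pointer(DOC, "CHILD(2, DIV1)CHILD(2, DIV2)CHILD(2, P)")
print(hit.text)
```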
26
Applications of XML
  • Channel Definition Format (CDF)
  • XML/EDI
  • Extensible Style Language (XSL)

27
Example: Channel Definition Format
  • Domain: Internet push technology
  • Defined by Microsoft
  • PointCast is a major user
  • A CDF file on a web site refers to a set of newspaper
    articles (in HTML) called a channel
  • The client periodically fetches the CDF file from the
    server, then fetches the newspaper articles described
    in the CDF file
  • Note: Open Software Description (OSD) from
    Microsoft does the same for application
    distribution

28
CDF Example Document
<CHANNEL
    Title = "AUFORA"
    LongName = "AUFORA News"
    Abstract = "AUFORA offers the latest UFO and astronomy news. We do
      not subscribe to the sensationalism which plagues UFOlogy today,
      rather we investigate in an unbiased, objective, and scientific
      manner."
    InfoURI = "http://www.aufora.org/"
    SELF = "http://www.aufora.org/misc/pc.cdf"
    ContentID = "11037"
    Frequency = "24"
    Authenticate = "No">
  <ITEM
      Title = "Mysterious happenings in Australian Outback"
      HREF = "http://www.aufora.org/news/16.html"
      Type = "HTML"
      Show = "Channel"
      Precache = "Yes"
      Authenticate = "No">
  </ITEM>
  ...
</CHANNEL>
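The client side of this scheme (read the CDF file, then work out which articles to fetch) might look roughly like the Python sketch below. It only parses a shortened copy of the channel file and performs no network access. Treating Precache="Yes" as "fetch this item ahead of time" is an assumption of the sketch, and the function name items_to_fetch is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the channel file; most CHANNEL attributes are omitted.
CDF_XML = """<CHANNEL Title="AUFORA"
         SELF="http://www.aufora.org/misc/pc.cdf"
         Frequency="24">
  <ITEM Title="Mysterious happenings in Australian Outback"
        HREF="http://www.aufora.org/news/16.html"
        Type="HTML" Precache="Yes"/>
</CHANNEL>"""


def items_to_fetch(cdf_text):
    """List (title, URL) pairs for items marked for precaching (assumed meaning)."""
    channel = ET.fromstring(cdf_text)
    return [
        (item.get("Title"), item.get("HREF"))
        for item in channel.iter("ITEM")
        if item.get("Precache") == "Yes"
    ]


for title, url in items_to_fetch(CDF_XML):
    print(f"would fetch {url} ({title})")
```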
29
Example: XML/EDI
  • Domain: Inter-business electronic commerce,
    interoperation of XML and EDI applications
  • Maps XML <-> EDIFACT messages
  • XML message size = EDIFACT message size × 1.35
  • Mapping established purely with DTDs
  • An example: EDItEUR book ordering messages
    (book trade distribution, book supply to
    libraries, new serial subscriptions, subscription
    renewals, despatch, claims), by the European Group
    for Electronic Commerce in the Books and Serial
    Sectors

30
EDItEUR Message in XML
<!DOCTYPE Book-Order PUBLIC "-//EDItEUR//DTD Book Order Message//EN">
<Book-Order Supplier="4012345000094" Send-to="http://www.bic.org/order.in">
  <title>EDItEUR Lite-EDI Book Ordering</title>
  <Order-No>967634</Order-No>
  <Message-Date>19961002</Message-Date>
  <Buyer-EAN>5412345000176</Buyer-EAN>
  <Order-Line Reference-No="0528837">
    <ISBN>0316907235</ISBN>
    <Author-Title>Labaln, Brian/Chrome</Author-Title>
  </Order-Line>
</Book-Order>
31
EDItEUR DTD
<!ELEMENT Book-Order (title?, Order-No, Message-Date, Buyer-EAN, Order-Line) >
<!ATTLIST Book-Order
    EDI-Prefix CDATA #FIXED "UNHME00579ORDERSD93AUNEAN007"
    EDI-Suffix CDATA #FIXED "UNSS'CNT22'UNT18ME00579"
    Send-to    CDATA #REQUIRED
    Supplier   CDATA #REQUIRED >
<!ELEMENT Order-No (#PCDATA) >
<!ATTLIST Order-No
    EDI-Prefix CDATA  #FIXED "BGM220"
    Datatype   NAME   #FIXED "C8"
    Size       NUMBER #FIXED "8"
    Title      CDATA  "Book Order No" >
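A DTD content model is essentially a regular expression over child element names. As a rough sketch of that idea, the Python code below hand-translates the Book-Order content model as it appears above (optional title, then Order-No, Message-Date, Buyer-EAN, Order-Line) into a regular expression and matches it against the children of a shortened order message. This is not how a validating parser is implemented; the translation, the helper name children_match, and the shortened message are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of: (title?, Order-No, Message-Date, Buyer-EAN, Order-Line)
# Each child name is followed by one space so that names cannot run together.
CONTENT_MODEL = re.compile(r"(title )?Order-No Message-Date Buyer-EAN Order-Line ")


def children_match(element, model):
    """Check the sequence of child element names against a content model."""
    child_names = "".join(child.tag + " " for child in element)
    return model.fullmatch(child_names) is not None


# Shortened copy of the book order message.
ORDER = ET.fromstring(
    '<Book-Order Supplier="4012345000094" Send-to="http://www.bic.org/order.in">'
    "<title>EDItEUR Lite-EDI Book Ordering</title>"
    "<Order-No>967634</Order-No>"
    "<Message-Date>19961002</Message-Date>"
    "<Buyer-EAN>5412345000176</Buyer-EAN>"
    '<Order-Line Reference-No="0528837"><ISBN>0316907235</ISBN></Order-Line>'
    "</Book-Order>"
)

print(children_match(ORDER, CONTENT_MODEL))
```

The same approach would reject a message whose children are missing or out of order, which is the structural check a consuming application performs on import.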
32
Extensible Style Language (XSL)
  • Domain: Publishing
  • Proposed by Microsoft & Co.
  • Specifies how XML is to be presented
      • Maps XML files to HTML files
  • Mapping specification written in XML itself (!)
  • Rule-based mappings
      • Example: map <title>s occurring in <div1>s to <H2>
  • Future browsers will have XSL support built in
      • XML + XSL + HTML4 = industrial-strength
        publication
  • Plugin for XSL available since January 1998
    (MXSL)
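The rule-based mapping idea can be imitated in Python to show what such a rule does. The sketch below is not XSL and uses no XSL syntax: it keeps a small table keyed by (parent element, element) and rewrites a tree accordingly, using the one rule from the slide (a title inside a div1 becomes H2). Falling back to DIV for unmatched elements, the para element, and the function name to_html are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

# (parent element type, element type) -> HTML element to emit.
RULES = {("div1", "title"): "H2"}
FALLBACK = "DIV"  # assumed default for elements no rule matches


def to_html(element, parent_tag=None):
    """Recursively rebuild the tree, renaming elements according to RULES."""
    out = ET.Element(RULES.get((parent_tag, element.tag), FALLBACK))
    out.text = element.text
    out.tail = element.tail
    for child in element:
        out.append(to_html(child, element.tag))
    return out


SOURCE = ET.fromstring(
    "<div1><title>Overview</title><para>Some text.</para></div1>"
)

html = ET.tostring(to_html(SOURCE), encoding="unicode")
print(html)
```

Keying the rule on the parent as well as the element is what lets the same title element be rendered differently depending on where it occurs.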

33
The book order displayed using XSL
34
Example Web Bookstore
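[Diagram: web bookstore architecture]
  • Components: Bookstore Database, Web Server, Bookstore
    Client, Publisher (two), AltaVista
  • Requests shown: "Order this book", "Browse", "Let me index
    all your books", "Where can I find the cheapest book?",
    "Reorder books", "I publish ..."
  • Formats on the connections: XML; HTML or XML + XSL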
35
And some more applications ...
  • Web Collections
  • Meta Content Framework
  • XML-Data
  • Name Spaces in XML
  • Chemical Markup Language
  • Bioinformatic Sequence Markup Language (BSML)
  • Open Financial Exchange
  • Open Trading Protocol (OTP)
  • Encoded Archival Description (EAD)
  • Translation Memory Exchange (TMX)
  • Scripting News in XML
  • Tutorial Markup Language (TML)
  • Mathematical Markup Language
  • OpenTag Markup
  • Metadata PICS
  • Synchronized Multimedia Integration Language
    (SMIL)
  • Web Interface Definition Language (WIDL)
  • Information and Content Exchange (ICE)
  • Ontology and Conceptual Knowledge Markup
    Languages
  • Cold Fusion Markup Language (CFML)
  • Java Speech Markup Language (JSML)
  • Resource Description Framework (RDF)

Source: www.sil.org
36
Conclusions
  • With XML the Internet will move towards
    distributed document processing in a big way
  • Biggest opportunity for inter-business e-commerce
      • Newcomers to Internet-based e-commerce will
        probably skip EDI completely and go for the
        cheaper XML-based approach
      • Fewer than 80K of 6.2M U.S. businesses use an EDI
        system (2)
      • Only an estimated 125K organizations worldwide use
        EDI
      • EDI's cost and complexity make it an
        insurmountable obstacle for small to medium-sized
        businesses
  • Protocols for next-generation Internet
    businesses
      • HTTP
      • SSL / Authentication
      • XML
      • HTML

37
Software Opportunities
  • XML export and import to and from databases
  • Mapping XML to legacy formats and back
  • Mapping between different XML DTDs
  • DTDs and tools for specific application
    areas (for example, healthcare, product catalogs,
    enterprise information repositories, supply chain
    integration)
  • Tools and languages for processing XML
      • WIDL, WebL, ...
38
Further reading
World Wide Web Consortium (W3C), http://www.w3.org/XML