Title: XML, HTML and All That
1XML, HTML and All That
- What do they Mean, and Why Should you Care?
- Ian GRAHAM
- Centre for Academic Technology
- Tel: 978-4548
- Email: <ian.graham_at_utoronto.ca>
- Talk: http://www.utoronto.ca/ian/talks/
2Overview
- A little Web history and the birth of HTML
- HTML is not enough -- why?
- XML for universal data
- For communicating information of all types
- Examples of XML in action
- Profound conclusions ...
3The Birth of the Web
- The HyperText Markup Language
- A simple language for distributing text
- All the other parts --
- URLs, HTTP, CGI ...
4Four Main Components
- URLs: For addressing things
- HTTP: For transporting data
- CGI: For adding functionality
- HTML: For encoding text information
5(Diagram labels: NNTP, Shoutcast, FTP, HTTP; Databases and other software; CGI; URLs)
6HTML
- A simple, general-purpose language
- Simple hypermedia
- Original idea --
- Collaborative authoring
- Merging of the concepts of authoring and viewing
7HTML Evolution
- Started with very few tags
- Simple requirements (only need to know a little bit about the tags, and then just muddle through)
- Language evolved, as more tags were added
- forms, images, tables, frames, fonts, ...
8HTML Problems (1)
- Everyone wanted personalized tags
- Want to put other data into HTML
- Mathematics, database entries, literary text, poems, purchase orders ...
- HTML just isn't designed for that!
9HTML Problems (2)
- Software processing
- Server management of data
- But -- HTML is so ill-formed, this is hard!
(Diagram labels: Web server engine; several HTML documents)
10HTML Problems (3)
- Software processing
- Client data processing (machine-to-machine communication)
- But -- HTML is so ill-formed, this is hard!
(Diagram labels: Client software; HTML (from somewhere on the Web ...); Database, viewer, whatever ...)
11Idea: Back to Basics
- HTML was defined using SGML
- Standard Generalized Markup Language
- A meta-language for defining languages
- Complex, sophisticated, powerful
- Idea: Use SGML
12Languages based on SGML
(Diagram: HTML, TEI, DocBook, ... shown as languages built on SGML)
13Problems with SGML
- SGML: Too complicated
- Rules too strict
- Can't distribute loosely formatted text (like HTML)
- Not good in a distributed environment
- Can't mix different data together
- Can't add arbitrary tags
14Idea (2): Webified SGML
- New: eXtensible Markup Language (XML)
- Can use XML to define new languages
- Distributes easily on the Web
- Can mix different types of data together
- Can easily add new tags, and tell a browser what to do with them (more or less ...)
15Basic XML Rules
- Tags written as with HTML, but ...
- Technical details
- Tag names are case-sensitive
- Always need end tags
- Special empty-element tags
- Always quote attribute values
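The rules above can be tried out with any XML parser. Here is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the sample snippets and the helper name `is_well_formed` are made up for illustration and are not from the talk.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, roughly one pair per rule on the slide.
samples = {
    "end tag present": "<p>some text</p>",
    "missing end tag": "<p>some text",
    "case mismatch": "<P>some text</p>",
    "empty-element tag": "<p>line<br />break</p>",
    "HTML-style empty tag": "<p>line<br>break</p>",
    "quoted attribute": '<div class="myDiv"></div>',
    "unquoted attribute": "<div class=myDiv></div>",
}

results = {name: is_well_formed(markup) for name, markup in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

An HTML browser would accept every one of these snippets; an XML parser rejects the ones that break a rule, which is what makes XML easier for software to process.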
16Like this example ..
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1">
  <head>
    <title> Title of text XHTML Document </title>
  </head>
  <body>
    <div class="myDiv">
      <h1> Heading of Page </h1>
      ..
      <p>And here is another paragraph, this one containing an
         <img src="image.gif" alt="waste of time" /> inline image,
         and a <br /> line break. </p>
    </div>
  </body>
</html>
(Slide callout: "XML stuff")
17Special XML Things
- <?xml version="1.0" encoding="iso-8859-1" ?>
- Says that this is an XML document
- <html xmlns="http://www.w3.org/TR/xhtml1">
- Says that the tags inside (and including) the html element all belong to the same space of names
- xmlns = XML namespace
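To see what the namespace declaration does in practice, here is a small sketch using Python's standard `xml.etree.ElementTree`. The document is a cut-down version of the slide's example (the XML declaration is left out for brevity), and the prefix `x` used in the lookup is an arbitrary choice for this illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/TR/xhtml1"  # namespace URI as shown on the slide

doc = f"""<html xmlns="{XHTML_NS}">
  <head><title>Title of text XHTML Document</title></head>
  <body><h1>Heading of Page</h1></body>
</html>"""

root = ET.fromstring(doc)

# ElementTree reports every tag as "{namespace-URI}local-name", so the html
# element and everything inside it carry the same namespace.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)

# Lookups must name the namespace too; "x" is just a local prefix for it.
title = root.find("x:head/x:title", {"x": XHTML_NS})
print(title.text)
```

The single `xmlns` attribute on the html element is inherited by all of its descendants, which is why none of the inner tags need to repeat it.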
18Evolution of XML
- Many XML languages, optimised for different Web roles
- MathML -- for mathematics
- SMIL -- for synchronised multimedia
- RDF -- for describing things
- XUL -- for describing the Navigator 5 user interface
- SpeechML -- for synthesised voices
19MathML
- Designed to express layout of maths
- Also can express semantics
- Cut and paste into Maple, Mathematica
- x^2 + 4x + 4 = 0

<mrow>
  <mrow>
    <msup> <mi>x</mi> <mn>2</mn> </msup>
    <mo>+</mo>
    <mrow>
      <mn>4</mn>
      <mo>&InvisibleTimes;</mo>
      <mi>x</mi>
    </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
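Because the markup is regular XML, a program can walk it and recover the formula. The sketch below assumes the slide's formula is x^2 + 4x + 4 = 0, with `+` and `=` as the operators, and writes InvisibleTimes as the character reference `&#x2062;` because the named entity needs the MathML DTD, which is not loaded here. The `linearize` helper is hypothetical and handles only the four element types used in this example.

```python
import xml.etree.ElementTree as ET

mathml = """
<mrow>
  <mrow>
    <msup> <mi>x</mi> <mn>2</mn> </msup>
    <mo>+</mo>
    <mrow> <mn>4</mn> <mo>&#x2062;</mo> <mi>x</mi> </mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
  <mo>=</mo>
  <mn>0</mn>
</mrow>
"""


def linearize(node):
    """Render a small subset of presentation MathML as plain text."""
    if node.tag in ("mi", "mn"):          # identifiers and numbers
        return node.text.strip()
    if node.tag == "mo":                  # operators
        operator = node.text.strip()
        # U+2062 is the invisible multiplication sign; show it as "*".
        return "*" if operator == "\u2062" else f" {operator} "
    if node.tag == "msup":                # base and superscript
        base, exponent = node
        return f"{linearize(base)}^{linearize(exponent)}"
    if node.tag == "mrow":                # horizontal group of children
        return "".join(linearize(child) for child in node)
    raise ValueError(f"unsupported element: {node.tag}")


expression = linearize(ET.fromstring(mathml))
print(expression)
```

A conversion of this kind is what would let a formula be pasted from a Web page into a tool such as Maple or Mathematica.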
20SMIL
- Synchronised Multimedia Integration Language
- Integration of multimedia with text, audio, video
- Support in RealPlayer G2
21SMIL Example
<smil>
  <head>
    <meta name="title" content="Online Teaching Services promo" />
    <meta name="author" content="Jay Moonah, CAT" />
    <layout type="text/smil-basic-layout">
      <root-layout width="280" height="316" background-color="white"/>
      <region id="AnimChannel1" title="AnimChannel1"
              left="0" top="0" height="265" width="280" fit="hidden"/>
    </layout>
  </head>
  <body>
    <par title="Online Teaching Services promo" author="Jay Moonah, CAT" >
      <audio src="final.rm" id="Soundtrack" title="Soundtrack"/>
      <animation src="otscompfin.swf" id="Animation"
                 region="AnimChannel1" title="Animation" fill="freeze"/>
      <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/>
    </par>
  </body>
</smil>
22XHTML: NextGen HTML
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1" >
  <head>
    <title> Title of text XHTML Document </title>
  </head>
  <body>
    <div class="myDiv">
      <h1> Heading of Page </h1>
      <p> here is a paragraph of text. I will include inside this paragraph
          a bunch of wonky text so that it looks fancy. </p>
      <p>Here is another paragraph with <em>inline emphasized</em>
         text, and <b> absolutely no</b> sense of humor. </p>
      <p>And another paragraph, this one with an <img src="image.gif"
         alt="waste of time" /> image, and a <br /> line break. </p>
    </div>
  </body>
</html>
23XHTML
- Just like HTML, but based on XML rules
- Will support integration of different data into a single document
- (Doesn't work that way now, unfortunately)
24XHTML and other Data
<?xml version="1.0" encoding="iso-8859-1"?>
<html xmlns="http://www.w3.org/TR/xhtml1" >
  <head>
    <title> Title of XHTML Document </title>
  </head>
  <body>
    <div class="myDiv">
      <h1> Heading of Page </h1>
      <mathml xmlns="http://www.w3.org/TR/mathml">
        MathML markup
      </mathml>
      <p> more html stuff goes here </p>
      <smil xmlns="http://www.w3.org/TR/smil1">
        SMIL markup
      </smil>
    </div>
  </body>
</html>
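Namespaces are what let software tell the mixed pieces apart. The following sketch, using Python's standard `xml.etree.ElementTree`, counts the elements of the slide's document by namespace. The XML declaration is omitted for brevity, and the helper name `namespace_of` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

doc = """<html xmlns="http://www.w3.org/TR/xhtml1">
<head><title>Title of XHTML Document</title></head>
<body>
<div class="myDiv">
  <h1>Heading of Page</h1>
  <mathml xmlns="http://www.w3.org/TR/mathml">MathML markup</mathml>
  <p>more html stuff goes here</p>
  <smil xmlns="http://www.w3.org/TR/smil1">SMIL markup</smil>
</div>
</body>
</html>"""


def namespace_of(element):
    """Return the URI part of ElementTree's '{uri}name' tag form, or ''."""
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return ""


# Count how many elements belong to each namespace in the mixed document.
counts = Counter(namespace_of(element) for element in ET.fromstring(doc).iter())
for uri, count in counts.items():
    print(uri, count)
```

A browser or other client could use the same test to hand each subtree to the component that understands it: the MathML part to a formula renderer, the SMIL part to a media player, and so on.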
25Displaying XML
- More complicated than HTML
- XML represents data only, not how it looks
- Need extra instructions (a style sheet
document) to define how things should look
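The split between data and appearance can be sketched in a few lines. In the example below a Python dictionary stands in for the style sheet; a real one would be written in a style sheet language such as CSS or XSL. The element names, the `style_sheet` mapping and the `render` helper are all hypothetical, chosen only to illustrate the idea.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data document: it names things but says nothing about appearance.
data = "<order><desc>Gold grommets</desc><quantity>12</quantity></order>"

# Hypothetical stand-in for a style sheet: element name -> HTML wrapper.
style_sheet = {
    "order": "<div class='order'>{}</div>",
    "desc": "<b>{}</b>",
    "quantity": "<i>{}</i>",
}


def render(element):
    """Apply the style sheet to an element and, recursively, to its children."""
    inner = escape(element.text or "")
    for child in element:
        inner += render(child) + escape(child.tail or "")
    # Elements with no rule are passed through without any wrapper.
    return style_sheet.get(element.tag, "{}").format(inner)


html_output = render(ET.fromstring(data))
print(html_output)
```

Swapping in a different mapping changes how the same data looks without touching the data itself.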
26What Do Browsers Do Now?
- Navigator 4, Internet Explorer 4
- Uggh (can't handle XML at all)
- Internet Explorer 5 -- shows a tree of elements
- Netscape 5 -- ignores the tags ... or so it seems ...
27Other Use: Data Abstraction
- XML as a universal format for data interchange
- Machines exchange data as XML-format messages
- Eliminates proprietary data formats
- Lots of XML processing software available
28XML Messaging: Business
29XML Messaging: Database
(Diagram labels: Database; Other DB (three of them); Request/send data)
30Example Message
<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-123423h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-1200h" />
  </order>
  <order ref="x23-2112-2342" date="25aug1999-123423h">
    ... Order something else ...
  </order>
</partorders>
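On the receiving side, a machine can pull the fields out of such a message with an ordinary XML parser. This is a minimal sketch using Python's standard `xml.etree.ElementTree`. It uses only the first order from the slide, writes delivery-date as an empty-element tag so that the message is well-formed, and treats the dates as opaque strings. The function name `read_orders`, the prefix `po` and the dictionary keys are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI from the slide; "po" is an arbitrary local prefix.
NS = {"po": "http://myco.org/Spec/partorders.desc"}

message = """<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-123423h">
    <desc>Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-1200h" />
  </order>
</partorders>"""


def read_orders(xml_text):
    """Turn a partorders message into a list of plain dictionaries."""
    orders = []
    for order in ET.fromstring(xml_text).findall("po:order", NS):
        quantity = order.find("po:quantity", NS)
        orders.append({
            "ref": order.get("ref"),
            "part": order.find("po:part", NS).get("number"),
            "quantity": int(quantity.text),      # int() ignores the padding spaces
            "units": quantity.get("units"),
            "deliver_by": order.find("po:delivery-date", NS).get("date"),
        })
    return orders


orders = read_orders(message)
for order in orders:
    print(order)
```

Neither side needs to know how the other stores its data; they only have to agree on the element and attribute names in the message.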
31Other Examples
- XUL: XML User Interface Language
- How Navigator 5 configures its interface
- Defines structure and software integration (www.mozilla.org)
- RDF: Resource Description Framework
- For describing things
- Used by Netscape Open Catalog project to define Web accessible resources (www.dmoz.org)
32The XML Family Tree
(Diagram: family tree with labels HTML, TEI, ..., XML, SGML)
33XML Summary
- An integration tool for mixing different types of data
- A universal format for exchanging data between machines
- A framework for distributing information on the Web