XML - PowerPoint PPT Presentation

About This Presentation
Title:

XML

Slides: 54
Provided by: Suka8

Transcript and Presenter's Notes

1
XML
6
  • 6.1 XML

2
Outline
  • XML
  • XSL / XSLT
  • DTD
  • DOM
  • XSD
  • XPath
  • XForms

3
What is XML?
  • XML stands for EXtensible Markup Language
  • A meta-language for descriptive markup: you
    invent your own tags
  • XML uses a Document Type Definition (DTD) or an
    XML Schema to describe the data
  • XML with a DTD or XML Schema is designed to be
    self-descriptive
  • Built-in internationalization via Unicode
  • Built-in error-handling
  • Optimized for network operations
  • Tons of support from the big IT companies

4
Some History
  • SGML (Standard Generalized Markup Language)
  • ISO Standard, 1986, for data storage and exchange
  • Metalanguage for defining languages (through
    DTDs)
  • A famous SGML language: HTML
  • Separation of content and display
  • Used by U.S. government contractors, large
    manufacturing companies, technical information
    publishers, ...
  • SGML reference is 600 pages long
  • XML
  • W3C recommendation in 1998
  • Simple subset (80/20 rule) of SGML: "ASCII of
    the Web", "Semantic Web"
  • XML specification is 26 pages long

5
Timeline
  • 1986: SGML becomes a standard
  • 1989: Tim Berners-Lee creates the WWW
  • 1994: W3C established
  • 1998: XML 1.0 W3C Recommendation
  • Jan 2000: XHTML becomes W3C Recommendation
    ("A Reformulation of HTML 4 in XML 1.0")
  • Feb 2004: W3C XML 1.0 (Third Edition) Recommendation
    http://www.w3.org/TR/2004/REC-xml-20040204/
  • Feb 2004: XML 1.1 Recommendation
    http://www.w3.org/TR/2004/REC-xml11-20040204/

6
XML and HTML
  • XML is not a replacement for HTML
  • XML was designed to carry data
  • XML and HTML were designed with different goals
  • XML was designed to describe data and to focus on
    what data is
  • HTML was designed to display data and to focus on
    how data looks.
  • HTML is about displaying information, while XML
    is about describing information

7
HTML and XML, I
  • HTML is used to mark up text so it can be
    displayed to users
  • XML is used to mark up data so it can be
    processed by computers
  • HTML describes both structure (e.g. <p>, <h2>,
    <em>) and appearance (e.g. <br>, <font>, <i>)
  • XML describes only content, or meaning
  • In XML, you make up your own tags
  • HTML uses a fixed, unchangeable set of tags

8
HTML and XML, II
  • HTML and XML look similar, because they are both
    SGML languages
  • Both HTML and XML use elements enclosed in tags
  • Both use tag attributes
  • More precisely,
  • HTML is defined in SGML
  • XML is a (very small) subset of SGML

9
HTML and XML, III
  • HTML is for humans
  • HTML describes web pages
  • You don't want to see error messages about the
    web pages you visit
  • Browsers ignore and/or correct as many HTML
    errors as they can, so HTML is often sloppy
  • XML is for computers
  • XML describes data
  • The rules are strict and errors are not allowed
  • In this way, XML is like a programming language
  • Current versions of most browsers can display XML

10
XML does not DO anything
  • XML was not designed to DO anything
  • XML is created to structure, store and to send
    information
  • The following example is book information stored
    as XML

<?xml version='1.0'?>
<bookstore>
  <book genre='autobiography' publicationdate='1981' ISBN='1-861003-11-0'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
</bookstore>

11
XML is Free and Extensible
  • XML tags are not predefined
  • You must "invent" your own tags
  • The tags used to mark up HTML documents and the
    structure of HTML documents are predefined
  • The author of HTML documents can only use tags
    that are defined in the HTML standard
  • XML allows the author to define his own tags and
    his own document structure

12
XML Future
  • XML is going to be everywhere
  • XML is a cross-platform, software and hardware
    independent tool for transmitting information.

13
Benefits of XML
  • Open W3C standard
  • Representation of data across heterogeneous
    environments
  • Cross platform
  • Allows for high degree of interoperability
  • Strict rules
  • Syntax
  • Structure
  • Case sensitive

14
How can XML be Used?
  • XML can Separate Data from HTML
  • With XML, your data is stored outside your HTML
  • XML is used to Exchange Data
  • With XML, data can be exchanged between
    incompatible systems
  • With XML, financial information can be exchanged
    over the Internet
  • XML can be used to Share Data
  • XML can be used to Store Data
  • XML can make your Data more Useful
  • XML can be used to Create new Languages

15
Components of an XML Document
  • Elements
  • Each element has a start tag and an end tag
  • <TAG_NAME>...</TAG_NAME>
  • Elements can be empty (<TAG_NAME />)
  • Attributes
  • Describe an element, e.g. data type, data range,
    etc.
  • Can only appear in the start tag
  • Processing instructions
  • Encoding specification (Unicode by default)
  • Namespace declaration
  • Schema declaration

16
Components of an XML Document
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="template.xsl"?>
<ROOT>
  <ELEMENT1><SUBELEMENT1 /><SUBELEMENT2 /></ELEMENT1>
  <ELEMENT2> </ELEMENT2>
  <ELEMENT3 type='string'> </ELEMENT3>
  <ELEMENT4 type='integer' value='9.3'> </ELEMENT4>
</ROOT>

Callouts on the slide: Processing Instructions (the two <?...?> lines),
Elements (ELEMENT1, ELEMENT2), Elements with Attributes (ELEMENT3, ELEMENT4)

17
XML declaration
  • The XML declaration looks like this:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  • The XML declaration is optional in XML 1.0, but
    including it is good practice (so include it!)
  • If present, the XML declaration must be
    first--not even whitespace should precede it
  • Note that the brackets are <? and ?>
  • version="1.0" is required whenever the declaration
    is present (XML 1.1 also exists, as the timeline
    shows, but 1.0 is the version normally used)
  • encoding can be "UTF-8" (a Unicode encoding that is
    backward compatible with ASCII), "UTF-16" (also
    Unicode), or something else such as "ISO-8859-1";
    if it is omitted, UTF-8 is assumed (or UTF-16 when
    a byte-order mark is present)
  • standalone="yes" states that the document does not
    depend on declarations in a separate (external) DTD

18
Processing Instructions
  • PIs (Processing Instructions) may occur anywhere
    in the XML document (but usually first)
  • A PI is a command to the program processing the
    XML document to handle it in a certain way
  • XML documents are typically processed by more
    than one program
  • Programs that do not recognize a given PI should
    just ignore it
  • General format of a PI: <?target instructions?>
  • Example: <?xml-stylesheet type="text/css"
    href="mySheet.css"?>
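
As an illustration of how a program might pick out the PIs it recognizes, here is a minimal Python sketch using the standard-library `xml.dom.minidom` module. The `<note>` document and the helper name `list_processing_instructions` are assumptions made for this example, not part of the slides.

```python
from xml.dom import minidom, Node

# Hypothetical document: an XML declaration, one PI, and a root element.
DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="mySheet.css"?>'
    '<note>hello</note>'
)


def list_processing_instructions(xml_text):
    """Return (target, instructions) for each PI at the top level of the document."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]


pis = list_processing_instructions(DOC)
```

Note that the parser does not report the XML declaration as a processing instruction, even though it uses the same `<? ... ?>` brackets.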

19
XML Elements
  • An XML element is everything from the element's
    start tag to the element's end tag
  • XML Elements are extensible and they have
    relationships
  • XML Elements have simple naming rules
  • Names can contain letters, numbers, and other
    characters
  • Names must not start with a number or punctuation
    character
  • Names must not start with the letters xml (or XML
    or Xml ..)
  • Names cannot contain spaces

20
XML Attributes
  • XML elements can have attributes
  • Data can be stored in child elements or in
    attributes
  • Should you avoid using attributes?
  • Here are some of the problems using attributes
  • attributes cannot contain multiple values (child
    elements can)
  • attributes are not easily expandable (for future
    changes)
  • attributes cannot describe structures (child
    elements can)
  • attributes are more difficult to manipulate by
    program code
  • attribute values are not easy to test against a
    Document Type Definition (DTD) - which is used to
    define the legal elements of an XML document

21
An XML Document
<?xml version='1.0'?>
<bookstore>
  <book genre='autobiography' publicationdate='1981' ISBN='1-861003-11-0'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
  <book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
    <title>The Confidence Man</title>
    <author>
      <first-name>Herman</first-name>
      <last-name>Melville</last-name>
    </author>
    <price>11.99</price>
  </book>
</bookstore>
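
To make the difference between attributes and child elements concrete, the following sketch reads the two-book bookstore document with Python's standard-library `xml.etree.ElementTree`. The function name `summarize_books` is an assumption for illustration; any XML parser could be used in the same way.

```python
import xml.etree.ElementTree as ET

# The two-book bookstore document from the slide, inlined as a string.
BOOKSTORE_XML = """<?xml version='1.0'?>
<bookstore>
  <book genre='autobiography' publicationdate='1981' ISBN='1-861003-11-0'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
  <book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
    <title>The Confidence Man</title>
    <author>
      <first-name>Herman</first-name>
      <last-name>Melville</last-name>
    </author>
    <price>11.99</price>
  </book>
</bookstore>"""


def summarize_books(xml_text):
    """Return one (title, genre, price) tuple per <book> element."""
    root = ET.fromstring(xml_text)
    summary = []
    for book in root.findall("book"):
        title = book.findtext("title")          # text of a child element
        genre = book.get("genre")               # value of an attribute
        price = float(book.findtext("price"))   # element text is always a string
        summary.append((title, genre, price))
    return summary


books = summarize_books(BOOKSTORE_XML)
```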
22
Another XML Document
<?xml version="1.0"?>
<weatherReport>
  <date>7/14/97</date>
  <city>North Place</city>, <state>NX</state>
  <country>USA</country>
  High Temp: <high scale="F">103</high>
  Low Temp: <low scale="F">70</low>
  Morning: <morning>Partly cloudy, Hazy</morning>
  Afternoon: <afternoon>Sunny &amp; hot</afternoon>
  Evening: <evening>Clear and Cooler</evening>
</weatherReport>

23
XML Validation
  • XML with correct syntax is Well Formed XML
  • XML validated against a DTD is Valid XML

24
Rules For Well-Formed XML
  • There must be one, and only one, root element
  • All XML elements must have a closing tag
  • Sub-elements must be properly nested
  • A tag must end within the tag in which it was
    started
  • Attributes are optional
  • Defined by an optional schema
  • Attribute values must be enclosed in " or '
  • Processing instructions are optional
  • XML is case-sensitive
  • <tag> and <TAG> are not the same type of element
  • White space is preserved
  • CR / LF is converted to LF
  • Comments in XML use the same syntax as in HTML:
    <!-- comment -->
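
Several of these rules can be checked mechanically, because a conforming parser rejects any document that breaks them. The sketch below uses Python's standard-library `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are assumptions for illustration. It tests well-formedness only, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


checks = {
    "properly nested": is_well_formed("<a><b>text</b></a>"),
    "missing closing tag": is_well_formed("<a><b>text</a>"),
    "two root elements": is_well_formed("<a/><b/>"),
    "case mismatch": is_well_formed("<tag>text</TAG>"),
    "unquoted attribute value": is_well_formed("<a x=1/>"),
}
```

Only the first entry should come out as `True`; each of the others violates one of the rules listed on this slide.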

25
XML DTD
  • A DTD defines the legal elements of an XML
    document
  • defines the document structure with a list of
    legal elements
  • XML Schema 
  • XML Schema is an XML based alternative to DTD
  • Errors in XML documents will stop the XML program
  • XML Validators

26
Browsers Support for XML
  • Netscape 6 supports XML
  • Internet Explorer 5.0 supports the XML 1.0
    standard
  • Internet Explorer 5.0 has the following XML
    support
  • Viewing of XML documents
  • Full support for W3C DTD standards
  • XML embedded in HTML as Data Islands
  • Binding XML data to HTML elements
  • Transforming and displaying XML with XSL
  • Displaying XML with CSS
  • Access to the XML DOM

27
Viewing XML Files
  • Raw XML files can be viewed in IE 5.0 (and
    higher) and in Netscape 6
  • but to make it display like a web page, you have
    to add some display information
  • XML documents do not carry information about how
    to display the data
  • Different solutions to the display problem, using
    CSS, XSL, JavaScript, and XML Data Islands
  • Will you be writing your future Homepages in XML?

28
Displaying XML with CSS
  • With CSS (Cascading Style Sheets) you can add
    display information to an XML document
  • Formatting XML with CSS is NOT the future of the
    Web
  • Formatting with XSL will be the new standard

29
Example: the XML file
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  . . . .
</CATALOG>
30
Example: the CSS file
CATALOG { background-color: #ffffff; width: 100%; }
CD { display: block; margin-bottom: 30pt; margin-left: 0; }
TITLE { color: #FF0000; font-size: 20pt; }
ARTIST { color: #0000FF; font-size: 20pt; }
COUNTRY,PRICE,YEAR,COMPANY { display: block; color: #000000; margin-left: 20pt; }
31
Displaying XML with XSL
  • With XSL you can add display information to your
    XML document
  • XSL is the preferred style sheet language of XML
  • XSL (the eXtensible Stylesheet Language) is far
    more sophisticated than CSS
  • One way to use XSL is to transform XML into HTML
    before it is displayed by the browser

32
Example: the XML file
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>5.95</price>
    <description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>Strawberry Belgian Waffles</name>
    <price>7.95</price>
    <description>light Belgian waffles covered with strawberries and whipped cream</description>
    <calories>900</calories>
  </food>
</breakfast_menu>
33
Example: the XSL file
<?xml version="1.0" encoding="ISO-8859-1"?>
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns="http://www.w3.org/TR/xhtml1/strict">
  <body style="font-family:Arial,helvetica,sans-serif;font-size:12pt;background-color:#EEEEEE">
    <xsl:for-each select="breakfast_menu/food">
      <div style="background-color:teal;color:white;padding:4px">
        <span style="font-weight:bold;color:white">
          <xsl:value-of select="name"/></span>
        - <xsl:value-of select="price"/>
      </div>
      <div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
        <xsl:value-of select="description"/>
        <span style="font-style:italic">
          (<xsl:value-of select="calories"/> calories per serving)
        </span>
      </div>
    </xsl:for-each>
  </body>
</html>
34
View the result in IE 6
35
XML Data Islands
  • XML can be embedded within HTML pages in Data
    Islands
  • Manipulated via client side script or data
    binding
  • The unofficial <xml> tag is used to embed XML
    data within HTML
  • Data Islands can be bound to HTML elements (like
    HTML tables)

<html>
<body>
  <xml id="cdcat" src="cd_catalog.xml"></xml>
  <table border="1" datasrc="#cdcat">
    <tr>
      <td><span datafld="ARTIST"></span></td>
      <td><span datafld="TITLE"></span></td>
    </tr>
  </table>
</body>
</html>
36
The Microsoft XML Parser
  • To read and update an XML document, you need an
    XML parser
  • The Microsoft XML parser comes with Microsoft
    Internet Explorer 5.0
  • Once you have installed IE 5.0, the parser is
    available to scripts inside HTML documents
  • The parser features a language-neutral
    programming model that supports
  • JavaScript, VBScript, Perl, VB, Java, C++ and
    more
  • W3C XML 1.0 and XML DOM
  • DTD and validation
  • You can create an XML document object with the
    following code:
  • var xmlDoc = new ActiveXObject("Microsoft.XMLDOM")

37
Loading an XML file into the parser
  • XML files can be loaded into the parser using
    script code.
  • The following code loads an XML document
    (note.xml) into the XML parser
  • The code:

    <script type="text/javascript">
    var xmlDoc = new ActiveXObject("Microsoft.XMLDOM")
    xmlDoc.async = false
    xmlDoc.load("note.xml")
    // ....... processing the document goes here
    </script>

  • The second line in the code above creates an
    instance of the Microsoft XML parser
  • The third line turns off asynchronous loading,
    to make sure that the parser will not continue
    execution before the document is fully loaded
  • The fourth line tells the parser to load the XML
    document called note.xml
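
For readers without Internet Explorer, roughly the same load-then-process pattern can be sketched in Python with the standard-library `xml.dom.minidom`, which also exposes a DOM-style interface. This is an analogue, not the Microsoft parser's API. The contents of note.xml are not shown on the slides, so the `NOTE_XML` text and the `load_note` helper below are hypothetical.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical contents for note.xml (the slides do not show the file).
NOTE_XML = "<note><to>Tove</to><from>Jani</from><body>Reminder</body></note>"


def load_note(path):
    """Parse the file at `path` and return {tag name: text} for the root's children."""
    doc = minidom.parse(path)      # synchronous: returns only once fully parsed
    root = doc.documentElement
    return {
        child.tagName: child.firstChild.data
        for child in root.childNodes
        if child.nodeType == child.ELEMENT_NODE
    }


# Write the sample to a temporary note.xml so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    note_path = os.path.join(tmp, "note.xml")
    with open(note_path, "w", encoding="utf-8") as handle:
        handle.write(NOTE_XML)
    note = load_note(note_path)
```

Because `minidom.parse` blocks until parsing is complete, there is no counterpart to the `async` switch in this sketch.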

38
Namespaces Overview
  • Part of XML's extensibility
  • Allow authors to differentiate between tags of
    the same name (using a prefix)
  • Frees author to focus on the data and decide how
    to best describe it
  • Allows multiple XML documents from multiple
    authors to be merged
  • Identified by a URI (Uniform Resource Identifier)
  • When a URL is used, it does NOT have to represent
    a live server

39
Namespaces Declaration
Namespace declaration examples:
xmlns:bk = "http://www.example.com/bookinfo/"
xmlns:bk = "urn:mybookstuff.org:bookinfo"

Anatomy of xmlns:bk = "http://www.example.com/bookinfo/":
  • xmlns is the namespace declaration
  • bk is the prefix
  • "http://www.example.com/bookinfo/" is the URI (URL)
40
Namespaces Examples
<BOOK xmlns:bk="http://www.bookstuff.org/bookinfo">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:AUTHOR>Joe Developer</bk:AUTHOR>
  <bk:PRICE currency='US Dollar'>19.99</bk:PRICE>
</BOOK>

<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo"
         xmlns:money="urn:finance:money">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:AUTHOR>Joe Developer</bk:AUTHOR>
  <bk:PRICE money:currency='US Dollar'>19.99</bk:PRICE>
</bk:BOOK>
41
Namespaces Default Namespace
  • An XML namespace declared without a prefix
    becomes the default namespace for all
    sub-elements
  • All elements without a prefix will belong to the
    default namespace

<BOOK xmlns="http://www.bookstuff.org/bookinfo">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
</BOOK>
42
Namespaces Scope
  • Unqualified elements belong to the inner-most
    default namespace.
  • BOOK, TITLE, and AUTHOR belong to the default
    book namespace
  • PUBLISHER and NAME belong to the default
    publisher namespace

<BOOK xmlns="www.bookstuff.org/bookinfo">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
  <PUBLISHER xmlns="urn:publishers:publinfo">
    <NAME>Microsoft Press</NAME>
  </PUBLISHER>
</BOOK>
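
The scoping rule can be observed directly with a namespace-aware parser. In the sketch below, Python's standard-library `xml.etree.ElementTree` reports each element name in `{namespace-URI}local-name` form, so the innermost default namespace in effect is visible for every element. The helper name `qualified_tags` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# The nested-default-namespace document from this slide.
SCOPED = """<BOOK xmlns="www.bookstuff.org/bookinfo">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
  <PUBLISHER xmlns="urn:publishers:publinfo">
    <NAME>Microsoft Press</NAME>
  </PUBLISHER>
</BOOK>"""


def qualified_tags(xml_text):
    """Return every element name in document order, qualified by its namespace."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


tags = qualified_tags(SCOPED)
```

BOOK, TITLE, and AUTHOR come back in the book namespace, while PUBLISHER and NAME come back in the publisher namespace.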
43
Namespaces Attributes
  • Unqualified attributes do NOT belong to any
    namespace
  • Even if there is a default namespace
  • This differs from elements, which belong to the
    default namespace
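
A short sketch of this difference, again with Python's standard-library `xml.etree.ElementTree`: the element picks up the default namespace, the unprefixed attribute does not, and only a prefixed attribute is namespace-qualified. The `money:unit` attribute is a hypothetical addition made for this example.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://www.bookstuff.org/bookinfo"

# `currency` is unqualified; `money:unit` (hypothetical) carries a prefix.
DOC = """<BOOK xmlns="http://www.bookstuff.org/bookinfo"
      xmlns:money="urn:finance:money">
  <PRICE currency="US Dollar" money:unit="USD">19.99</PRICE>
</BOOK>"""

root = ET.fromstring(DOC)
price = root.find("{%s}PRICE" % BOOK_NS)   # the element is in the default namespace
attribute_names = sorted(price.attrib)     # attribute names as the parser reports them
```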

44
Entities
  • Entities provide a mechanism for textual
    substitution, e.g. &lt; for < and &amp; for &
  • You can define your own entities
  • Parsed entities can contain text and markup
  • Unparsed entities can contain any data
  • JPEG photos, GIF files, movies, etc.
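
As a small illustration of the predefined entities, the sketch below escapes text before placing it in a document and shows that the parser substitutes the original characters back. It uses `escape` and `unescape` from Python's standard-library `xml.sax.saxutils`; the sample sentence is an assumption for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "Sunny & hot, high < 110"

# escape() replaces &, < and > with the entity references &amp;, &lt; and &gt;.
escaped = escape(raw)

# The parser performs the reverse substitution when it reads the document.
element = ET.fromstring("<afternoon>" + escaped + "</afternoon>")
round_trip = unescape(escaped)
```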

45
CDATA
  • By default, all text inside an XML document is
    parsed
  • You can force text to be treated as unparsed
    character data by enclosing it in <![CDATA[ ...
    ]]>
  • Any characters, even & and <, can occur inside a
    CDATA
  • Whitespace inside a CDATA is (usually) preserved
  • The only real restriction is that the character
    sequence ]]> cannot occur inside a CDATA
  • CDATA is useful when your text has a lot of
    illegal characters (for example, if your XML
    document contains some HTML text)
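
The effect of a CDATA section can be sketched as follows with Python's standard-library `xml.etree.ElementTree`. The `<snippet>` element and its HTML-like content are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

HTML_FRAGMENT = "<b>bold</b> & <i>italic</i>"

# Inside CDATA the fragment is plain character data, not markup.
with_cdata = "<snippet><![CDATA[" + HTML_FRAGMENT + "]]></snippet>"
text = ET.fromstring(with_cdata).text

# Without CDATA the bare "&" makes the document not well-formed.
without_cdata = "<snippet>" + HTML_FRAGMENT + "</snippet>"
try:
    ET.fromstring(without_cdata)
    parsed_without_cdata = True
except ET.ParseError:
    parsed_without_cdata = False
```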

46
Pure XML -- Instance Model
  • XML 1.0 Standard
  • no explicit data model
  • only syntax of well-formed and valid (wrt. a DTD)
    documents
  • implicit data model
  • nested containers ("boxes within boxes")
  • labeled ordered trees (a semistructured data
    model)
  • relational, object-oriented, other data easy to
    encode

<A> <B>foo</B> <C>bar</C> <C>lab</C> </A>

Tree view (children are ordered):
A
  B  "foo"
  C  "bar"
  C  "lab"
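
A minimal sketch of the "labeled ordered tree" view, using Python's standard-library `xml.etree.ElementTree`: iterating over an element yields its children in document order, each with a label (the tag) and, here, a text value.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<A> <B>foo</B> <C>bar</C> <C>lab</C> </A>")

# Children come back in document order, so the two C elements stay distinct.
children = [(child.tag, child.text) for child in root]
```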
47
Example Relational Data to XML
Relation R (columns A, B, C; rows (a1, b1, c1) and (a2, b2, c2)) encoded as XML:
<R>
  <tuple> <A>a1</A> <B>b1</B> <C>c1</C> </tuple>
  <tuple> <A>a2</A> <B>b2</B> <C>c2</C> </tuple>
</R>
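
The encoding above can be produced mechanically from any table. The sketch below is one possible way to do it with Python's standard-library `xml.etree.ElementTree`; the function name `relation_to_xml` is an assumption, and the output is serialized without indentation.

```python
import xml.etree.ElementTree as ET


def relation_to_xml(name, columns, rows):
    """Encode a relation as <name><tuple><col>value</col>...</tuple>...</name>."""
    root = ET.Element(name)
    for row in rows:
        tuple_element = ET.SubElement(root, "tuple")
        for column, value in zip(columns, row):
            ET.SubElement(tuple_element, column).text = str(value)
    return root


relation = relation_to_xml(
    "R",
    ["A", "B", "C"],
    [("a1", "b1", "c1"), ("a2", "b2", "c2")],
)
xml_text = ET.tostring(relation, encoding="unicode")
```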
48
Adding Structure and Semantics
  • XML Document Type Definitions (DTDs)
  • define the structure of "allowed" documents
    (i.e., valid wrt. a DTD)
  • ≈ database schema
  • => improve query formulation, execution, ...
  • XML Schema
  • defines structure and data types
  • allows developers to build their own libraries of
    interchanged data types
  • XML Namespaces
  • identify your vocabulary

49
XML Related Technologies I
  • XHTML - Extensible HTML
  • CSS - Cascading Style Sheets
  • XSL - Extensible Style Sheet Language
  • XSL consists of three parts: XML Document
    Transformation (renamed XSLT, see below), a
    pattern matching syntax (renamed XPath, see
    below), and a formatting object interpretation.
  • XSLT - XML Transformation
  • XSLT is far more powerful than CSS. It can be
    used to transform XML files into many different
    output formats.
  • XPath - XML Pattern Matching
  • XPath is a language for addressing parts of an
    XML document. XPath was designed to be used by
    both XSLT and XPointer.
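
To give a feel for XPath-style addressing, the sketch below uses the limited XPath subset supported by Python's standard-library `xml.etree.ElementTree` (it is not a full XPath implementation). The abbreviated bookstore document is an assumption based on the earlier example in this deck.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the bookstore document used earlier in the deck.
BOOKSTORE = """<bookstore>
  <book genre='autobiography'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author><last-name>Franklin</last-name></author>
  </book>
  <book genre='novel'>
    <title>The Confidence Man</title>
    <author><last-name>Melville</last-name></author>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# Titles of books whose genre attribute equals 'novel'.
novel_titles = [t.text for t in root.findall("book[@genre='novel']/title")]

# Every last-name element at any depth below the root.
last_names = [n.text for n in root.findall(".//last-name")]
```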

50
XML Related Technologies II
  • XLink - XML Linking Language
  • The XML Linking Language (XLink), allows elements
    to be inserted into XML documents in order to
    create links between XML resources.
  • XPointer - XML Pointer Language
  • The XML Pointer Language (XPointer), supports
    addressing into the internal structures of XML
    documents, such as elements, attributes, and
    content.
  • DTD - Document Type Definition
  • A DTD can be used to define the legal building
    blocks of an XML document.
  • Namespaces
  • XML Namespaces define a method for qualifying
    element and attribute names used in XML by
    associating them with URI references.

51
XML Related Technologies III
  • DOM - Document Object Model
  • The DOM defines interfaces, properties and
    methods to manipulate XML documents.
  • XSD - XML Schema
  • Schemas are powerful alternatives to DTDs.
    Schemas are written in XML, and support
    namespaces and data types.
  • XQL - XML Query Language
  • The XML Query Language supports query facilities
    to extract data from XML documents.
  • SAX - Simple API for XML
  • SAX is another interface to read and manipulate
    XML documents
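
Where DOM builds the whole document tree in memory, SAX reports events (start of element, end of element, character data) as the parser reads the input. The sketch below uses Python's standard-library `xml.sax`; the `TagCounter` handler, the `count_tags` helper, and the shortened catalog are assumptions for illustration.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Count how many times each element name starts."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every start tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(xml_text):
    handler = TagCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


counts = count_tags(
    "<CATALOG>"
    "<CD><TITLE>Empire Burlesque</TITLE></CD>"
    "<CD><TITLE>Hide your heart</TITLE></CD>"
    "</CATALOG>"
)
```

Because nothing is kept in memory beyond the handler's own state, this style suits very large documents.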

52
References
  • W3 Schools XML Tutorial
  • http://www.w3schools.com/xml/default.asp
  • W3C XML page
  • http://www.w3.org/XML/
  • XML Tutorials
  • http://www.programmingtutorials.com/xml.aspx
  • Online resource for markup language technologies
  • http://xml.coverpages.org/
  • Several Online Presentations
