1
XML Technology
  • Nizar Mabroukeh
  • nizar_at_ccse.kfupm.edu.sa
  • COE 445
  • KFUPM April, 2001

2
We will cover
  • What is XML?
  • Components of an XML Document
  • Document Type Definition (DTD)
  • XML Data Islands
  • Parsing XML and DOM
  • XML presentation with CSS, XSL and XSLT
  • XPath, XML and Database Integration
  • Why use XML?
  • Creating your own XML vocabulary
  • Review of XML Applications and Tools
  • XML Resources

3
What is XML?
4
Markup Languages
  • SGML
  • Standard Generalized Markup Language
  • Mother of Markup Languages
  • HTML
  • Most popular presentation language for web
  • XML
  • Draws heavily on the merits and shortcomings of
    HTML and SGML

5
Issues with HTML
  • Merits
  • Very easy to use and learn
  • Presentation technology
  • It is the most popular
  • Shortcomings
  • NOT a data technology
  • Poor Searching
  • There is no Intelligence of content/data
  • We lose the association of meaning with content
  • Data cannot be represented hierarchically
  • Limited set of tags

6
How does XML look?
  • Simple XML data would look like
  • <book>
  • <title> XML Tech </title>
  • <author> YAG </author>
  • <level> Freshman </level>
  • </book>
  • <book> is called the root node
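The book document above can be read by any XML parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is used in the slides), that loads the document and reports the root node and its children.

```python
import xml.etree.ElementTree as ET

# The book document from the slide, held in a string.
BOOK_XML = """
<book>
  <title> XML Tech </title>
  <author> YAG </author>
  <level> Freshman </level>
</book>
"""

# Parsing returns the root element of the document.
root = ET.fromstring(BOOK_XML)
print("root node:", root.tag)

# Each child tag names the data it carries, so a tag-to-value mapping falls out directly.
book = {child.tag: child.text.strip() for child in root}
print(book)
```

Because the tags describe the data, the program can ask for the author by name instead of by position on a page.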

7
XML and HTML
  • Similar in appearance
  • Both are based on SGML
  • BUT
  • XML describes data
  • HTML displays data

8
XML (eXtensible Markup Language)
9
What Is XML?
  • XML is a
  • platform-independent,
  • self-describing,
  • expandable,
  • standard data exchange format
  • that can be used either independently or
  • embedded and used within other solutions.

10
Platform Independent
  • Windows
  • Unix
  • Macintosh
  • Mainframe

11
Self-Describing
  • Example
  • <DATE>July 26, 1998</DATE>
  • Describes the information, not the presentation.
    The format is flexible.

12
Expandable / Extensible
  • HTML has a fixed set of tags
  • <H1>, <B>, <PRE>
  • XML lets you have your own tags
  • <dangerous-substance>, <Shakespearean-character>,
    <cash-equivalent>

13
Standard
  • W3C (World Wide Web Consortium) www.w3c.org
  • The XML 1.0 specification was issued in February
    1998 as a standards-based text format for the
    interchange of data.
  • W3C XML Working Group designed XML as a
    simplified subset of SGML

14
Standard
  • XML specification does not define any particular
    tag names (like HTML), instead it defines general
    syntactic rules enabling developers to create
    their own domain-specific vocabularies of tags.

15
Context
  • Greater context to the information
  • Tree structure is natural in XML

<balance-sheet>
  <total-assets>
    <asset-type name="current">
      <amount-period year="1998">
        <amount> 41,000,000 </amount>
      </amount-period>
    </asset-type>
  </total-assets>
</balance-sheet>
16
Freedom
  • Extensible markup language
  • Customized tags
  • Tags give meaning to the content
  • Separates data from style

17
Why XML?
  • Derived as a subset of SGML
  • Allows you to define your own tags
  • XML: <author> YAG </author>
  • HTML: <B> YAG </B>
  • Provides meaningful, readable data
  • Meaningful searches can be performed
  • Much simpler than SGML
  • SGML spec: 300 pages; XML spec: 33 pages
  • Purely a Data Technology
  • Supports compound documents.

18
XML Advantages
  • Web based
  • Extensible
  • License-free
  • Platform independent
  • Single end-to-end IT solution for electronic
    information exchanges

19
XML Documents
  • Authoring XML

20
XML Elements
  • An XML element is made up of a start tag, an end
    tag, and data in between. The start and end tags
    describe the data within the tags, which is
    considered the value of the element.
  • For example, the following XML element is a
    <chairman> element with the value "Sadiq Sait"
  • <chairman>Sadiq Sait</chairman>
  • Elements can be empty; an empty element is
    written <chairman/>

22
XML Attributes
  • An element can optionally contain one or more
    attributes. An attribute is a name-value pair
    separated by an equal sign (=).
  • Example
  • <CITY ZIP="31261">Dhahran</CITY>
  • Here, ZIP="31261" is an attribute of the <CITY>
    element. Attributes are used as meta information.
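To show how a program sees the parts described on the last two slides, here is a small sketch that reads an element's name, value and attributes. It assumes Python's standard xml.etree.ElementTree module, which the slides do not use.

```python
import xml.etree.ElementTree as ET

# The element from the slide: a start tag with one attribute, a value, an end tag.
city = ET.fromstring('<CITY ZIP="31261">Dhahran</CITY>')

print(city.tag)         # the element name
print(city.text)        # the element value (the data between the tags)
print(city.attrib)      # all attributes as name-value pairs
print(city.get("ZIP"))  # a single attribute looked up by name

# An empty element has no value and no children.
empty = ET.fromstring("<chairman/>")
print(empty.text is None, len(empty))
```

Attribute values always come back as strings; converting "31261" to a number is left to the application.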

23
XML is
  • Case-sensitive

24
Parts of an XML document

Version Declaration
Document Type Definition (DTD)
<root> BODY </root>
25
Version Declaration
  • <?xml version="1.0" encoding="UTF-8"
    standalone="yes"?>
  • Encoding
  • Supports UTF-8, UTF-16 and others
  • In short: provides for multilingual data
  • Standalone
  • Indicates whether the document has any markup
    declarations that are external to the document

26
XML data
  • This is what XML data looks like (the body of
    the document)
  • <books>
  • <book>
  • <name> Codebook 6.0 </name>
  • <author> YAG </author>
  • <level> Intermediate &amp; Advanced </level>
  • </book>
  • <book>
  • <name> Java for Beginners </name>
  • <author> Dale </author>
  • <level> Beginner </level>
  • </book>
  • </books>

27
XML Document Rules
  • ONE root element: <books>
  • ALL tags start AND end: <book></book>
  • Tags cannot overlap and are case sensitive
  • Attribute values enclosed in quotes:
    <book ISBN="21-458-65-0">
  • Attributes not repeated in an element:
    <para id="1" sid="4">
  • FIRST item must be the XML declaration:
    <?xml version="1.0"?>

28
Two types of XML documents
  • Well-formed XML documents
  • Valid XML documents

29
Well-formed document
  • Must contain one or more elements
  • Must contain a uniquely named root element
  • All other elements within the root element must
    be nested correctly
  • An XML parser will reject malformed documents
    (the method of rejection varies by parser)
  • Documents that contain XML and HTML tags are
    common
  • HTML within an XML document must be well-formed
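The rejection behaviour can be observed directly. The sketch below assumes Python's standard xml.etree.ElementTree parser, which reports a malformed document by raising ParseError; other parsers signal the problem differently, as the slide notes. The sample documents are illustrative.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

samples = {
    "nested correctly": "<books><book><name>Codebook 6.0</name></book></books>",
    "overlapping tags": "<b><i>text</b></i>",
    "two root elements": "<book/><book/>",
    "case mismatch": "<Book></book>",
    "unquoted attribute": "<book ISBN=21-458-65-0></book>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample passes. Note that this checks well-formedness only; it does not validate against a DTD.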

30
Valid XML document
  • The XML document must be well formed
  • Must contain or reference a Document Type
    Definition (DTD)
  • DTD is a schema which contains the constraints
    for the XML document
  • It contains Element definitions and their
    Attributes
  • Attribute values should comply with the following
    rule
  • Cannot contain <, or the quote character (" or ')
    that delimits the value
  • Elements must be nested correctly

31
Document Type Definition
  • DTD is a text document that defines the lexicon
    of legal names for tags in a particular XML
    vocabulary
  • It also defines how tags should be nested
  • It can be written as code inside the XML file or
    specified externally as a separate text file with
    extension .dtd

32
Sample DTD
  • <!-- Uses EBNF (Extended Backus Naur Form) -->
  • <!DOCTYPE book [
  • <!ELEMENT book (name,author,level)>
  • <!ELEMENT name (#PCDATA)>
  • <!ELEMENT author (#PCDATA)>
  • <!ELEMENT level (#PCDATA)>
  • <!ATTLIST author email CDATA #IMPLIED>
  • ]>
  • A DTD may be specified externally with a .dtd
    extension: <!DOCTYPE book SYSTEM "book.dtd">

33
More on DTD
  • Special software can help you build your DTD
    visually instead of writing all this code by
    hand; one example is XML Authority from
    Extensibility
  • An XML document is associated with its
    corresponding DTD for validation using
    the <!DOCTYPE> declaration

34
Why use a DTD?
  • Application independent way of sharing data
  • Industries or trading parties can agree on a
    standard for interchanging data
  • Verification that data received from trading
    parties is valid.

35
Complete XML document
  • <?xml version="1.0" ?>
  • <!DOCTYPE book [
  • <!ELEMENT book (name,author,level?)>
  • <!ELEMENT name (#PCDATA)>
  • <!ELEMENT author (#PCDATA)>
  • <!ELEMENT level (#PCDATA)>
  • ]>
  • <book>
  • <name> Codebook 6.0 </name>
  • <author> YAG </author>
  • <level> Intermediate &amp; Advanced </level>
  • </book>

36
OR
  • <?xml version="1.0" ?>
  • <!DOCTYPE book SYSTEM "book.dtd">
  • <book>
  • <name> Codebook 6.0 </name>
  • <author> YAG </author>
  • <level> Intermediate &amp; Advanced </level>
  • </book>

37
XML Document Pluses
  • Tightly Structured
  • Extensible
  • Easily models data
  • Useful for applications and transfer between
    applications
  • Interchangeable

38
XML Data Islands
  • XML inside HTML

39
XML Data Islands
  • A data island is an XML document that exists
    within an HTML page.
  • It allows you to script against the XML document
    without having to load it through script or
    through the <OBJECT> tag.
  • Almost anything that can be in a well-formed XML
    document can be inside a data island

40
How to create XML data island
  • The XML for a data island in HTML can be either
  • Inline using ID,
  • or called from an outside xml file using SRC,
  • or created using a <script> tag

41
Inline data island
  • The <XML> element marks the beginning of the data
    island, and its ID attribute provides a name that
    you can use to reference the data island.
  • <XML ID="XMLID">
  • <customer>
  • <name>Mark Hanson</name> <custID>81422</custID>
  • </customer>
  • </XML>

42
XML referenced from outside file
  • Referenced through a SRC attribute on the <XML>
    tag
  • <XML ID="XMLID" SRC="customer.xml"></XML>

43
Created using <script> tag
  • <SCRIPT LANGUAGE="xml" ID="XMLID">
  • <customer>
  • <name>Mark Hanson</name> <custID>81422</custID>
  • </customer>
  • </SCRIPT>

44
Parsing XML
45
What is XML Parsing
  • For a computer program to access the structured
    information in the document in a meaningful way,
    parsing is required
  • The parser first reads the stream of characters
    and recognizes the syntactic details of elements,
    attributes and text in the document
  • Then, the parser exposes the hierarchical set of
    information in the document as a tree of related
    elements, attributes and text items

46
  • The logical tree of information items created
    after parsing the XML document, is called the
    Information Set or Infoset
  • This can then be manipulated in different ways
    and data extracted for usage in applications,
    databases, etc.
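The tree of elements, attributes and text items that a parser exposes can be printed as an outline. This is a rough sketch assuming Python's standard xml.etree.ElementTree module; the outline helper is a hypothetical name introduced here, and the sample document reuses the class/student data from the DOM example later in the deck.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return one indented line per element, attribute and text item."""
    pad = "  " * depth
    lines = [f"{pad}element: {element.tag}"]
    for key, value in element.attrib.items():
        lines.append(f"{pad}  attribute: {key}={value}")
    text = (element.text or "").strip()
    if text:
        lines.append(f"{pad}  text: {text}")
    for child in element:
        # Children are one level deeper in the hierarchy.
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(
    '<class><student studentID="13429">'
    "<name>Jane Smith</name><GPA>3.8</GPA>"
    "</student></class>"
)
lines = outline(root)
print("\n".join(lines))
```

The flat stream of characters goes in, and a hierarchy that a program can walk comes out.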

47
XML Parsers
  • Always check for well-formedness
  • Can be validating or non-validating
  • Validation requires association with a DTD
  • Included in Microsoft Internet Explorer 5.0
  • Language-neutral programming model
  • By using W3C XML 1.0 and the XML DOM it supports
    JavaScript, VBScript, Java, C++, Perl

48
Manipulating XML using the DOM
  • W3C provides a standard API called the Document
    Object Model (DOM) to access an XML document's
    infoset
  • The DOM API provides a complete set of operations
    to programmatically manipulate the node tree
    including navigating the nodes in the hierarchy,
    creating and appending new nodes, removing nodes,
    etc.

49
  • Once you are done with making changes to the node
    tree you can save it and serialize the infoset
    back into an XML document
  • XML document -> (parsing) -> infoset ->
    (serialization) -> XML document
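The parse, manipulate, serialize cycle described on the last two slides can be sketched with a DOM implementation. The example assumes Python's standard xml.dom.minidom module rather than the Internet Explorer parser the slides refer to; the method names used (createElement, appendChild, removeChild) are part of the W3C DOM.

```python
from xml.dom.minidom import parseString

# Parsing: XML text becomes a node tree.
doc = parseString("<books><book><name>Codebook 6.0</name></book></books>")
root = doc.documentElement

# Create a new <book> node with a <name> child and append it to the root.
book = doc.createElement("book")
name = doc.createElement("name")
name.appendChild(doc.createTextNode("Java for Beginners"))
book.appendChild(name)
root.appendChild(book)

# Remove the first <book> node from the tree.
first_book = root.getElementsByTagName("book")[0]
root.removeChild(first_book)

# Serialization: the modified tree is written back out as XML text.
serialized = root.toxml()
print(serialized)
```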
50
DOM Properties and Methods
  • An XML document object is created when an XML
    data island is loaded and parsed, and it has
    properties and methods
  • XMLDocument: returns a reference to the
    XML DOM exposed by the object
  • documentElement: returns the root element
  • childNodes: returns a node list containing
    children (if any)
  • item(id): accesses individual nodes through an
    index (zero based)
  • text: returns the text content of the node
  • Let's look at an example

51
DOM Example
  • <XML ID="xmlDocument">
  • <class>
  • <student studentID="13429">
  • <name>Jane Smith</name>
  • <GPA>3.8</GPA>
  • </student>
  • </class>
  • </XML>
  • All of the below begin with
    xmlDocument.documentElement.childNodes.item(0)
  • .childNodes.item(0).text returns "Jane Smith"
  • .childNodes.item(1).text returns "3.8"
  • .text returns "Jane Smith 3.8", i.e.
    name and GPA
  • Note: everything is case sensitive here
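The same navigation can be tried outside a browser. This sketch assumes Python's standard xml.dom.minidom module in place of the Internet Explorer data island. Two differences are assumptions of the sketch: minidom has no .text property, so a hypothetical helper named text_of stands in for it, and the XML is written without whitespace between tags because minidom would otherwise count that whitespace as extra child nodes.

```python
from xml.dom.minidom import parseString

# No whitespace between tags, so childNodes contains only the elements.
CLASS_XML = (
    '<class><student studentID="13429">'
    "<name>Jane Smith</name><GPA>3.8</GPA>"
    "</student></class>"
)

def text_of(node):
    """Collect the text below a node, roughly like the .text property on the slide."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return " ".join(text_of(child) for child in node.childNodes)

doc = parseString(CLASS_XML)

# Equivalent of xmlDocument.documentElement.childNodes.item(0)
student = doc.documentElement.childNodes.item(0)

print(text_of(student.childNodes.item(0)))   # the <name> element
print(text_of(student.childNodes.item(1)))   # the <GPA> element
print(text_of(student))                      # name and GPA together
print(student.getAttribute("studentID"))
```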

52
XML Presentation
  • Viewing XML

53
Viewing XML
  • Unlike HTML, XML does not predefine display
    properties for specific elements.
  • C++ Data Source Object (DSO)
  • Binds XML to HTML and gives better performance,
    built in IE 5.0
  • Cascading Style Sheets (CSS)
  • Extensible Stylesheet Language (XSL)

54
CSS
  • Separate file with a .css extension
  • object { property-name: property-value }
  • DIV { color: red; font-size: 16pt }
  • /* Comments are entered the C way here */

55
Displaying XML with CSS
  • paragraph {
  • COLOR: red;
  • FONT-FAMILY: 'Book Antiqua';
  • FONT-VARIANT: small-caps;
  • FONT-WEIGHT: bolder }
  • preamble {
  • COLOR: blue;
  • FONT-FAMILY: 'Book Antiqua';
  • FONT-VARIANT: small-caps;
  • FONT-WEIGHT: bolder }
  • <?xml-stylesheet type="text/css"
    href="const.css"?>
  • Modify the presentation of XML Documents
  • Follow normal CSS syntax
  • Referenced in the XML source document

56
XSL
  • eXtensible Stylesheet Language
  • Syntax is similar to XML
  • In fact, XSL is written in XML

57
  • The aim of XSL is to provide a simple but
    powerful stylesheet syntax, which can be used to
    define how XML documents should appear on the
    screen.
  • An XSL stylesheet transforms an XML document into
    a suitable form for presentation
  • It allows us to control how and which parts of
    the documents should be shown to the user.

58
  • The XSL processor takes an XML document as input,
    and translates it into a different XML document
    suitable for output. This resultant XML document
    can be passed through a separate tool to add the
    finishing touches ready for presentation.

59
Displaying XML with XSL
  • <xsl:for-each select="books/book"
    order-by="author">
  • <TR VALIGN="top"> <TD>
  • <xsl:value-of select="name"/>
  • </TD>
  • <TD>
  • <xsl:value-of select="author"/>
  • </TD>
  • </TR>
  • </xsl:for-each>
  • More powerful than CSS
  • XSL documents are XML documents

60
XSL Styling Tools
61
Integrated use of XML
62
XSLT
  • eXtensible Stylesheet Language Transformations
  • Transforms XML from one tree-based structure to
    another

63
Advantages of XSLT
  • Convert between XML vocabularies used by
    different applications
  • Present data from an XML document by transforming
    it into HTML, or another format that's
    appropriate to the user or special device
    requesting the data

64
Database Integration
  • Query and Search XML documents

65
XPath
  • XML Path Language is a W3C standard
  • It is a declarative language
  • Used to interrogate an XML document to select
    subsets of information

66
  • XPath provides a method for addressing parts of
    an XML document.
  • Allows string, number and boolean manipulation
  • Treats XML document as set of nodes, allows
    matching
  • It is called a path language because it is
    designed like the path notation used for URLs and
    for files in directories

67
XPath Example
  • /MovieList/Movie/Cast/Actor/* returns all info
    about all actors
  • /MovieList/Movie/Cast/* returns all info about
    Cast, including that of Director, Actor and
    Actress
  • To find attributes instead of elements we use @
  • /MovieList/Movie/@Title returns the titles of all
    movies
  • //Actress/@Role returns the Role of any Actress
    anywhere in the document

68
  • Find the Cast of any movies directed by
    Minghella with a running time greater than
    130 minutes
  • /MovieList/Movie[Director/Last='Minghella' and
    @RunningTime > 130]/Cast
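These queries can be approximated in code. The sketch below assumes Python's standard xml.etree.ElementTree module, which supports only a subset of XPath: it cannot select attributes with @ or evaluate numeric comparisons inside a path, so those parts are done in plain Python. The MovieList document is invented for illustration (placeholder titles and names), with Director placed under Movie as in the last query.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; titles, names and roles are placeholders.
MOVIES_XML = """
<MovieList>
  <Movie Title="Film A" RunningTime="162">
    <Director><Last>Minghella</Last></Director>
    <Cast>
      <Actor Role="Pilot">Actor One</Actor>
      <Actress Role="Nurse">Actress One</Actress>
    </Cast>
  </Movie>
  <Movie Title="Film B" RunningTime="106">
    <Director><Last>Minghella</Last></Director>
    <Cast><Actor Role="Ghost">Actor Two</Actor></Cast>
  </Movie>
  <Movie Title="Film C" RunningTime="140">
    <Director><Last>Other</Last></Director>
    <Cast><Actress Role="Detective">Actress Two</Actress></Cast>
  </Movie>
</MovieList>
"""

root = ET.fromstring(MOVIES_XML)  # root is the <MovieList> element

# /MovieList/Movie/Cast/Actor : paths are written relative to the root.
actors = [actor.text for actor in root.findall("Movie/Cast/Actor")]

# /MovieList/Movie/@Title : select the elements, then read the attribute.
titles = [movie.get("Title") for movie in root.findall("Movie")]

# //Actress/@Role : iter() visits matching elements anywhere in the document.
roles = [actress.get("Role") for actress in root.iter("Actress")]

# Cast of movies directed by Minghella and longer than 130 minutes.
casts = [
    movie.find("Cast")
    for movie in root.findall("Movie")
    if movie.findtext("Director/Last") == "Minghella"
    and int(movie.get("RunningTime")) > 130
]
cast_members = [person.text for cast in casts for person in cast]

print(actors, titles, roles, cast_members)
```

A full XPath engine would accept the path expressions from the slides unchanged; the Python filtering here only stands in for the predicate.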

69
Where to Use XPath
70
Using XPath
  • Create an XPath for each unit of information
  • Carry the XPaths with the information as it is
    transformed into other formats
  • Use XPaths to link language strings and labels
    with the information they describe
  • Generate an XSL stylesheet that uses the XPaths
    to generate the outgoing message

71
You should be using XML
  • More reasons for you to use XML

72
Why should I use XML
  • XML enables a data web of information services:
    it is a vendor-neutral, platform-neutral,
    language-neutral technology
  • Simplifies application integration; consider the
    following example

73
Simplifies Integration
  • If a company has
  • Machines running OS from Sun, HP, IBM
  • Databases from Oracle, IBM, Microsoft
  • Packaged applications from Oracle, SAP
  • An XML-based representation of data and the HTTP
    protocol might be the only things these various
    systems can ever hope to have in common!
  • Especially if you want to integrate them over the
    Internet

74
Why should I use XML
  • XML also simplifies Information Publishing and
    Reuse
  • With XML and XSLT you can easily
  • Separate data from presentation, allowing you to
    change the look of information without affecting
    application code
  • Publish the same data using output styles
    specific to each kind of requesting device:
    browser, cell phone, PDA, another computer,
    etc.

75
  • WML: Wireless Markup Language for cell phones and
    PDAs
  • SVG: Scalable Vector Graphics language for
    rendering rich, data-driven images
  • XSL Formatting Objects: for high-quality printed
    output

76
Creating your own XML Vocabulary
77
XML Vocabulary
  • The set of XML elements and attributes for a
    certain application is called an XML Vocabulary
  • Can be used in more than one XML document for the
    same application

78
Create your own
  • Begin each document with an XML declaration
  • <?xml version="1.0"?> (case-sensitive)
  • Use only one top-level document element
  • cannot be repeated
  • Match opening and closing tags properly (notice
    case sensitivity)

79
  • Add comments between the <!-- and -->
    delimiters
  • Start element and attribute names with a letter.
    Cannot contain spaces
  • Put attributes in the opening tag
  • Enclose attribute values in matching quotes

80
  • Use only simple text as attribute values
  • Use &lt; and &amp; instead of < and & for literal
    less-than and ampersand characters
  • Write empty elements as <ElementName/>
  • If you follow these rules then your document will
    be a well-formed XML document
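The escaping rule is easy to get wrong by hand, so libraries usually provide helpers. A minimal sketch, assuming Python's standard xml.sax.saxutils module (escape for element text, quoteattr for attribute values); the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Element text containing literal ampersand and less-than characters.
raw = "Intermediate & Advanced (pages < 300)"
escaped = escape(raw)
print(escaped)

# Once escaped, the text can sit inside an element and still be well-formed.
level = ET.fromstring("<level>" + escaped + "</level>")
print(level.text)

# quoteattr escapes an attribute value and wraps it in matching quotes.
series = 'the "Codebook" series'
book = ET.fromstring("<book series=" + quoteattr(series) + "/>")
print(book.get("series"))
```

The parser turns the entities back into the original characters, so the application sees the same text it started with.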

81
Review of Applications and Tools
82
XML Tools
  • Help author the grammars (Schema, Filter,
    Updates)
  • View, edit and manage XML
  • Define mappings between XML logical views and
    databases.

83
How is XML used?
  • Publishing
  • Create once, use many
  • Database Management
  • Database integration
  • e-Business
  • Key driver

84
XML Creation Tools
  • Two schools
  • Data centric
  • Document centric

85
Microsoft XML Notepad
msdn.microsoft.com/xml/notepad
  • Free
  • Tree-based
  • Great search

86
CueSoft eXML
www.cuesoft.com
  • Tree or source view
  • Attribute handling
  • No search
  • No nav bar in source view

87
Techno2000 Clip!
  • Powerful
  • Flexible
  • Wizards
  • DTD creation

http://www.t2000-usa.com/download/download_index.html
88
Vervet Logic XML Pro
  • More expensive

www.vervetlogic.com
89
Document Oriented
  • Examples
  • SoftQuad XMetaL
  • Corel WordPerfect 9
  • Arbortext Adept Editor
  • Not as good for data development
  • Require DTD or equivalent

90
XSL Tester
www.vbxml.com
  • Free
  • Open source
  • VB

91
XML Authority from Extensibility
92
XML Authority from Extensibility
  • DTDs
  • Schema

93
Other Choices
  • Web-based
  • DTD Generator
  • www.pault.com/Xmltube/dtdgen.html
  • IBM Suite
  • Visual DTD
  • Visual XML Transformation

94
Utilities
  • File manipulation
  • XML Junction
  • www.xmljunction.com
  • ODBC2XML

95
Security
  • IBM's XML Security Suite
  • www.alphaworks.ibm.com/tech/xmlsecuritysuite

96
Programming and XML
  • Programming tools and environments
  • Java including JAXP
  • Visual Basic (VB) and Visual Basic for
    Applications (VBA)
  • Databases and other tools
  • Document vs. event driven processing (DOM vs. SAX)
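To contrast with the DOM approach, where the whole document is loaded into a tree, an event-driven (SAX) parser calls back into your code as it meets each tag and never builds the tree. The following is a sketch only, assuming Python's standard xml.sax module; the NameCollector class is a hypothetical handler written for this example.

```python
import xml.sax

class NameCollector(xml.sax.ContentHandler):
    """Collect the text of every <name> element as parse events arrive."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, tag, attrs):
        if tag == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the end tag.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, tag):
        if tag == "name":
            self.names.append("".join(self._buffer).strip())
            self._in_name = False

BOOKS_XML = (
    b"<books>"
    b"<book><name> Codebook 6.0 </name><author> YAG </author></book>"
    b"<book><name> Java for Beginners </name><author> Dale </author></book>"
    b"</books>"
)

handler = NameCollector()
xml.sax.parseString(BOOKS_XML, handler)
print(handler.names)
```

SAX suits large documents read once from start to finish; DOM suits documents that need to be navigated or modified.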

97
Resources for XML
98
Software Sources
  • Interesting sites
  • www.xmlsoftware.com
  • alphaworks.ibm.com
  • www.garshol.priv.no/download/xmltools/std_ix.html

99
WebSites
  • The Mother Ship
  • www.w3c.org/xml
  • Heavy Stuff
  • www.ibm.com/developer/xml
  • msdn.microsoft.com/xml
  • www.oracle.com/xml/
  • java.sun.com/xml/

100
More Resources
  • XML Information sites
  • www.xmlinfo.com
  • www.xml.com
  • www.xml.org
  • www.xmlelephant.com
  • metalab.unc.edu/xml/
  • www.ucc.ie/xml/
  • XML Tutorials
  • www.xml101.com

101
More Resources
  • XSL Information
  • http://www.mulberrytech.com/xsl/xsl-list
  • VB XML Information
  • mailto:vbxml-subscribe@onelist.com

102
And finally
  • We're only at the very start of the Web
    revolution.
  • XML is the fundamental building block
  • XML technology is moving forward, and Standards
    are rapidly evolving

103
Thank You