Title: XML stands for EXtensible Markup Language
1. XML stands for EXtensible Markup Language
- XML is designed to describe data and to focus on what the data is.
- HTML was designed to display data and to focus on how the data looks.
2. What is XML?
- XML was designed to describe data.
- XML is a markup language much like HTML.
- XML tags are not predefined. You must define your own tags.
- XML uses a Document Type Definition (DTD) or an XML Schema to describe the data.
- XML with a DTD or XML Schema is designed to be self-descriptive.
- XML is a W3C Recommendation.
3. The Main Difference Between XML and HTML
- XML was designed to carry data.
- XML is not a replacement for HTML. XML and HTML were designed with different goals.
- XML was designed to describe data and to focus on what the data is. HTML was designed to display data and to focus on how the data looks.
- HTML is about displaying information, while XML is about describing information.
4. XML Does not DO Anything
XML was not designed to DO anything.
- XML was created to structure, store and to send information.
- The following example is a note to Tove from Jani, stored as XML:
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
- The note has a header and a message body. It also has sender and receiver information.
- But still, this XML document does not DO anything.
- It is just pure information wrapped in XML tags. Someone must write a piece of software to send, receive or display it.
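As a rough illustration of the "piece of software" mentioned above, here is a minimal Python sketch that reads the note and displays it as plain text. It uses the standard library's xml.etree.ElementTree; the function name render_note and the output layout are assumptions made for this example, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The note from the slide, stored as an XML string.
NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def render_note(xml_text: str) -> str:
    """Turn the note document into a plain-text message (hypothetical layout)."""
    note = ET.fromstring(xml_text)
    # findtext returns the text of the first child element with that name.
    return (
        f"{note.findtext('heading')}\n"
        f"To: {note.findtext('to')}\n"
        f"From: {note.findtext('from')}\n"
        f"{note.findtext('body')}"
    )


print(render_note(NOTE_XML))
```

The XML itself stays passive; all of the behavior lives in the program that reads it.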
5. XML is Free and Extensible
- XML tags are not predefined. You must "invent" your own tags.
- The tags used to mark up HTML documents are predefined.
- In HTML you can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.).
- XML allows you to define your own tags and your own document structure.
- The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.
6. XML is a Complement to HTML
- XML is a cross-platform, software and hardware independent tool for transmitting information.
- XML in Future Web Development
- XML is going to be everywhere.
- XML will be the most common tool for all data manipulation and data transmission.
7. It is important to understand that XML was designed to store, carry, and exchange data. XML was not designed to display data.
- XML can Separate Data from HTML
- With XML, your data is stored outside your HTML.
- When HTML is used to display data, the data is stored inside your HTML.
- With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for data layout and display, and be sure that changes in the underlying data will not require any changes to your HTML.
- XML data can also be stored inside HTML pages as "Data Islands". You can still concentrate on using HTML only for formatting and displaying the data.
8. XML is Used to Exchange Data
- With XML, data can be exchanged between incompatible systems.
- In the real world, computer systems and databases contain data in incompatible formats. One of the most time-consuming challenges for developers has been to exchange data between such systems over the Internet.
- Converting the data to XML can greatly reduce this complexity and create data that can be read by many different types of applications.
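To make "converting the data to XML" concrete, the sketch below turns an in-memory record into an XML string that any XML-aware system could read back. The record, its field names, and the helper record_to_xml are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def record_to_xml(tag: str, record: dict) -> str:
    """Serialize a flat record as one XML element with one child per field."""
    root = ET.Element(tag)
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = str(value)
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


# Hypothetical record, as it might come out of a database row.
customer = {"id": 42, "name": "Anna Smith", "country": "NO"}
xml_text = record_to_xml("customer", customer)
print(xml_text)

# A different application can read the same text back.
name_read_back = ET.fromstring(xml_text).findtext("name")
```

This only handles flat records whose keys are valid element names; nested data would need a recursive version.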
9. XML and B2B
- With XML, financial information can be exchanged over the Internet.
- Expect to see a lot about XML and B2B (Business To Business) in the near future.
- XML is going to be the main language for exchanging financial information between businesses over the Internet. A lot of interesting B2B applications are under development.
10. XML Can be Used to Share Data
- With XML, plain text files can be used to share data.
- Since XML data is stored in plain text format, XML provides a software- and hardware-independent way of sharing data.
- This makes it much easier to create data that different applications can work with.
- It also makes it easier to expand or upgrade a system to new operating systems, servers, applications, and new browsers.
11. XML Can be Used to Store Data
- With XML, plain text files can be used to store data.
- XML can also be used to store data in files or in databases. Applications can be written to store and retrieve information from the store, and generic applications can be used to display the data.
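A small sketch of storing and retrieving information in a plain text XML file might look like this. The helper names save_note and load_note are assumptions for the example, and a temporary directory stands in for a real data store.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_note(path: str, to: str, sender: str, body: str) -> None:
    """Write a note to a plain text XML file."""
    note = ET.Element("note")
    ET.SubElement(note, "to").text = to
    ET.SubElement(note, "from").text = sender
    ET.SubElement(note, "body").text = body
    ET.ElementTree(note).write(path, encoding="utf-8", xml_declaration=True)


def load_note(path: str) -> dict:
    """Read the note back as a dict of child element name to text."""
    root = ET.parse(path).getroot()
    return {child.tag: child.text for child in root}


with tempfile.TemporaryDirectory() as tmp:
    note_path = os.path.join(tmp, "note.xml")
    save_note(note_path, "Tove", "Jani", "Don't forget me this weekend!")
    loaded = load_note(note_path)

print(loaded)
```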
12. XML Can Make your Data More Useful
- With XML, your data is available to more users.
- Since XML is independent of hardware, software and application, you can make your data available to more than just standard HTML browsers.
- Other clients and applications can access your XML files as data sources, just as they access databases.
- Your data can be made available to all kinds of "reading machines" (agents), and it is easier to make your data available for blind people, or people with other disabilities.
13. If Developers Have Sense
- If they DO have sense, all future applications will exchange their data in XML.
- The future might give us word processors, spreadsheet applications and databases that can read each other's data in a pure text format, without any conversion utilities in between.
- We can only pray that Microsoft and all the other software vendors will agree.
14. XML Syntax Rules
- The syntax rules of XML are very simple and very strict. The rules are very easy to learn, and very easy to use.
- Because of this, creating software that can read and manipulate XML is very easy.
15. XML Sample Document
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
- The first line is the XML declaration. It gives the XML version (1.0) and the character encoding: the ISO-8859-1 (Latin-1/West European) character set.
- The next line opens the root element, called note. The next 4 lines describe 4 child elements of the root (to, from, heading, and body), and the last line closes the root element.
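The following sketch parses the sample document as bytes and lists its root and child elements. The second, one-line document with the word "café" is a hypothetical addition, included only to show that the declared ISO-8859-1 encoding tells the parser how to decode non-ASCII bytes.

```python
import xml.etree.ElementTree as ET

# The sample document, encoded the way its declaration says it is.
SAMPLE = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    "<note>\n"
    "  <to>Tove</to>\n"
    "  <from>Jani</from>\n"
    "  <heading>Reminder</heading>\n"
    "  <body>Don't forget me this weekend!</body>\n"
    "</note>\n"
).encode("iso-8859-1")

root = ET.fromstring(SAMPLE)
root_name = root.tag
child_names = [child.tag for child in root]
print(root_name, child_names)

# Hypothetical document: the byte 0xE9 is "é" in ISO-8859-1.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><word>café</word>'
).encode("iso-8859-1")
word = ET.fromstring(latin1_doc).text
print(word)
```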
16. XML Tags are Case Sensitive
- Unlike HTML, XML tags are case sensitive. Opening and closing tags must be written with the same case.
    <Message>This is incorrect</message>
    <message>This is correct</message>
17. XML Elements Must be Properly Nested
- Improper nesting of tags makes no sense to XML.
- In HTML some elements can be improperly nested within each other like this:
    <b><i>This text is bold and italic</b></i>
- In XML all elements must be properly nested within each other like this:
    <b><i>This text is bold and italic</i></b>
18. XML Documents Must Have a Root Element
- All XML documents must contain a single tag pair to define a root element.
- All other elements must be within this root element.
- All elements can have sub elements (child elements). Sub elements must be correctly nested within their parent element:
    <root>
      <child>
        <subchild>.....</subchild>
      </child>
    </root>
19. XML Attribute Values Must be Quoted
- With XML, it is illegal to omit quotation marks around attribute values.
- XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted.
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <note date=12/11/2002>
      <to>Tove</to>
      <from>Jani</from>
    </note>

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <note date="12/11/2002">
      <to>Tove</to>
      <from>Jani</from>
    </note>
- The error in the first document is that the date attribute in the note element is not quoted. This is correct: date="12/11/2002". This is incorrect: date=12/11/2002.
20. With XML, CR / LF is Converted to LF
- With XML, a new line is always stored as LF.
- Do you know what a typewriter is?
- In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF). The character pair bears some resemblance to the typewriter actions of setting a new line.
- In Unix applications, a new line is normally stored as an LF character.
- Older (classic Mac OS) Macintosh applications use only a CR character to store a new line.
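The conversion can be observed with a parser. This sketch, using Python's standard xml.etree.ElementTree (built on the Expat parser), feeds in text with Windows-style and CR-only line endings and checks what the application receives. The sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET

# Windows-style line ending (CR LF) inside element content.
windows_doc = "<body>line one\r\nline two</body>"
# CR-only line ending inside element content.
cr_only_doc = "<body>line one\rline two</body>"

windows_text = ET.fromstring(windows_doc).text
cr_only_text = ET.fromstring(cr_only_doc).text

# Both arrive at the application with a single LF and no CR.
print(repr(windows_text))
print(repr(cr_only_text))
```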
21. Comments in XML
- The syntax for writing comments in XML is similar to that of HTML.
    <!-- This is a comment -->
22. There is Nothing Special About XML
- It is just plain text with the addition of some XML tags enclosed in angle brackets.
- Software that can handle plain text can also handle XML. In a simple text editor, the XML tags will be visible and will not be handled specially.
- In an XML-aware application however, the XML tags can be handled specially. The tags may or may not be visible, or have a functional meaning, depending on the nature of the application.
23. XML Elements are Extensible
XML documents can be extended to carry more information.
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <body>Don't forget me this weekend!</body>
    </note>
- Suppose we created an application that extracted the <to>, <from>, and <body> elements from the XML document to produce this output:
    MESSAGE
    To: Tove
    From: Jani
    Don't forget me this weekend!
- Now suppose the author of the XML document added some extra information to it:
    <note>
      <date>2002-08-01</date>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
- The application can still find the <to>, <from>, and <body> elements and produce the same output.
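A minimal sketch of such an application shows why the extra elements do no harm: it looks elements up by name and ignores anything it does not ask for. The function name render_message and the exact output layout are assumptions for this example.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>"""

EXTENDED = """<note>
  <date>2002-08-01</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def render_message(xml_text: str) -> str:
    """Extract to, from and body by name; other elements are ignored."""
    note = ET.fromstring(xml_text)
    return (
        "MESSAGE\n"
        f"To: {note.findtext('to')}\n"
        f"From: {note.findtext('from')}\n"
        f"{note.findtext('body')}"
    )


# The extended document produces exactly the same output.
print(render_message(EXTENDED))
same_output = render_message(ORIGINAL) == render_message(EXTENDED)
```

An application that relied on element positions (for example "the first child is <to>") would break on the extended document, so lookup by name is the safer design.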
24. XML Elements have Relationships
Elements are related as parents and children.
- Let's now see how relationships between XML elements are named, and how element content is described.
- Example: This is a description of a book:
    My First XML
    Introduction to XML
      What is HTML
      What is XML
    XML Syntax
      Elements must have a closing tag
      Elements must be properly nested
- This XML document describes the book:
    <book>
      <title>My First XML</title>
      <prod id="33-657" media="paper"></prod>
      <chapter>Introduction to XML
        <para>What is HTML</para>
        <para>What is XML</para>
      </chapter>
      <chapter>XML Syntax
        <para>Elements must have a closing tag</para>
        <para>Elements must be properly nested</para>
      </chapter>
    </book>
- In the <prod> element, the attribute named id has the value "33-657" and the attribute named media has the value "paper".
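The parent and child relationships can be explored in code. The sketch below rebuilds the book document from the description on this slide (the second chapter's paragraphs are taken from the book outline) and walks it with Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

book = ET.fromstring(BOOK_XML)

# book is the parent of title, prod and the two chapter elements.
book_children = [child.tag for child in book]

# prod carries its information in attributes rather than in child elements.
prod_attributes = book.find("prod").attrib

# Each chapter is a child of book and a parent of its para elements.
first_chapter = book.find("chapter")
chapter_title = first_chapter.text.strip()
chapter_paras = [para.text for para in first_chapter.findall("para")]

print(book_children, prod_attributes, chapter_title, chapter_paras)
```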
25. Element Naming
XML elements must follow these naming rules:
- Names can contain letters, numbers, and other characters.
- Names must not start with a number or punctuation character.
- Names must not start with the letters xml (or XML, or Xml, etc).
- Names cannot contain spaces.
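One rough way to test a candidate name against these rules is to let a parser try it. This is only a sketch: the helper is_valid_element_name is hypothetical, it checks the "xml" prefix rule by hand because parsers generally do not reject such names, and it is not a full implementation of the name grammar in the XML specification.

```python
import xml.etree.ElementTree as ET


def is_valid_element_name(name: str) -> bool:
    """Approximate check of the naming rules listed on the slide."""
    # Rule: names must not start with the letters xml in any case.
    if name.lower().startswith("xml"):
        return False
    try:
        # Let the parser judge the remaining rules on an empty element.
        root = ET.fromstring(f"<{name}/>")
    except ET.ParseError:
        return False
    # Guard against input that parsed as something other than the bare name.
    return root.tag == name


candidates = ["note", "first_name", "1note", "-note", "my note", "XMLnote"]
results = {name: is_valid_element_name(name) for name in candidates}
print(results)
```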
26. XML Attributes
XML elements can have attributes.
- In HTML you have this: <IMG SRC="computer.gif">. The SRC attribute provides additional information about the IMG element.
- In HTML, attributes provide additional information about elements:
    <img src="computer.gif">
    <a href="demo.asp">
- Attributes often provide information that is not a part of the data.
- In the example below, the file type is irrelevant to the data, but important to the software that wants to manipulate the element:
    <file type="gif">computer.gif</file>
27. Quote Styles, "female" or 'female'?
- Attribute values must always be enclosed in quotes, but either single or double quotes can be used:
    <person sex="female">
- or like this:
    <person sex='female'>
- Note: If the attribute value itself contains double quotes it is necessary to use single quotes, like in this example:
    <gangster name='George "Shotgun" Ziegler'>
- Note: If the attribute value itself contains single quotes it is necessary to use double quotes, like in this example:
    <gangster name="George 'Shotgun' Ziegler">
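A quick check with a parser shows that the quote style is only a matter of syntax and does not change the attribute value. In this sketch the elements are written as empty elements (with />) so that each one is a complete document on its own; that detail is an adjustment for the example.

```python
import xml.etree.ElementTree as ET

double_quoted = ET.fromstring('<person sex="female"/>')
single_quoted = ET.fromstring("<person sex='female'/>")

# Quotes of the other kind can appear inside the value.
inner_double = ET.fromstring("""<gangster name='George "Shotgun" Ziegler'/>""")
inner_single = ET.fromstring("""<gangster name="George 'Shotgun' Ziegler"/>""")

# The surrounding quotes are not part of the value.
print(double_quoted.get("sex"), single_quoted.get("sex"))
print(inner_double.get("name"))
print(inner_single.get("name"))
```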
28. Use of Elements vs. Attributes
Data can be stored in child elements or in attributes.
- Take a look at these examples:
    <person sex="female">
      <firstname>Anna</firstname>
      <lastname>Smith</lastname>
    </person>

    <person>
      <sex>female</sex>
      <firstname>Anna</firstname>
      <lastname>Smith</lastname>
    </person>
- In the first example sex is an attribute. In the last, sex is a child element. Both examples provide the same information.
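The difference shows up in how a program reads the value. The sketch below reads sex from either form; the helper get_sex is a hypothetical name used only for this illustration.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = """<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>"""

AS_ELEMENT = """<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>"""


def get_sex(xml_text: str):
    """Return the sex value whether it is an attribute or a child element."""
    person = ET.fromstring(xml_text)
    value = person.get("sex")           # attribute form
    if value is None:
        value = person.findtext("sex")  # child element form
    return value


print(get_sex(AS_ATTRIBUTE), get_sex(AS_ELEMENT))
```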
29. Avoid using attributes?
- Don't end up like this (this is not how XML should be used):
    <note day="12" month="11" year="2002" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!">
    </note>
30. Well Formed XML Documents
A "Well Formed" XML document has correct XML syntax.
- XML documents must have a root element
- XML elements must have a closing tag
- XML tags are case sensitive
- XML elements must be properly nested
- XML attribute values must always be quoted
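These rules are exactly what a parser enforces, so a well-formedness check can be sketched as "try to parse it". The helper is_well_formed and the small test documents below are assumptions made for this example; each broken document violates one rule from the list.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


documents = {
    "no single root element": "<to>Tove</to><from>Jani</from>",
    "missing closing tag": "<note><to>Tove</note>",
    "case mismatch": "<Message>text</message>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note date=12/11/2002></note>",
    "well formed": '<note date="12/11/2002"><to>Tove</to></note>',
}

results = {label: is_well_formed(text) for label, text in documents.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Well-formedness says nothing about whether the elements are the ones an application expects; that is the job of a DTD or an XML Schema.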
31. XML documents can have a reference to a DTD or to an XML Schema.
- Look at this simple XML document called "note.xml":
    <?xml version="1.0"?>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
- A DTD File
- Here is a DTD file called "note.dtd" that defines the elements of the XML document ("note.xml"):
    <!ELEMENT note (to, from, heading, body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
- The first line defines the note element to have four child elements: "to, from, heading, body". Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".
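Python's standard library does not validate documents against a DTD, so the sketch below is not DTD validation. It is a hand-written check that mirrors what note.dtd demands (a note root with to, from, heading, body in that order, each holding only text), offered just to make the meaning of the declarations concrete. The function name matches_note_dtd is hypothetical.

```python
import xml.etree.ElementTree as ET

# Mirrors <!ELEMENT note (to, from, heading, body)>.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def matches_note_dtd(xml_text: str) -> bool:
    """Hand-written stand-in for the content model declared in note.dtd."""
    note = ET.fromstring(xml_text)
    if note.tag != "note":
        return False
    # The children must appear exactly once each, in the declared order.
    if [child.tag for child in note] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA): each child holds character data only, no nested elements.
    return all(len(child) == 0 for child in note)


valid_note = (
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"
)
missing_heading = "<note><to>Tove</to><from>Jani</from><body>Hi</body></note>"

print(matches_note_dtd(valid_note), matches_note_dtd(missing_heading))
```

A validating parser reads the DTD itself and applies it generically; this sketch hard-codes one content model.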
32. XML Schema
- Here is an XML Schema file called "note.xsd" that defines the elements of the XML document above ("note.xml"):
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.w3schools.com"
               xmlns="http://www.w3schools.com"
               elementFormDefault="qualified">
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
- The note element is a complex type because it contains other elements.
33. A Reference to a DTD
- This XML document has a reference to a DTD (Document Type Definition):
    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
34. Reference to an XML Schema
- This XML document has a reference to an XML Schema:
    <?xml version="1.0"?>
    <note xmlns="http://www.w3schools.com"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.w3schools.com note.xsd">
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
- The namespace (ns) comes from www.w3schools.com.
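The namespace declaration changes how a program sees the element names. This sketch assumes the schemaLocation value is written as a namespace URI followed by a file location, and uses the arbitrary prefix "n" for lookups; Python's standard library reads these attributes but does not perform schema validation.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

XSI = "http://www.w3.org/2001/XMLSchema-instance"
# The prefix "n" is local to this script; only the URI has to match.
NS = {"n": "http://www.w3schools.com"}

note = ET.fromstring(NOTE_XML)

# ElementTree stores the namespace URI as part of each element name.
qualified_root_name = note.tag
recipient = note.findtext("n:to", namespaces=NS)

# schemaLocation holds a namespace URI and the location of its schema.
namespace_uri, schema_file = note.get(f"{{{XSI}}}schemaLocation").split()

print(qualified_root_name, recipient, namespace_uri, schema_file)
```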
35. Syntax-check: View your XML
- Open note.xml with a browser.
- Displaying XML with XSL: XSL is the preferred style sheet language of XML.
36. With Internet Explorer, the unofficial <xml> tag can be used to create an XML data island
- XML Data Embedded in HTML
- An XML data island is XML data embedded into an HTML page.
- The id attribute of the <xml> tag defines an ID for the data island, and the src attribute points to the XML file to embed:
    <html>
      <body>
        <xml id="note" src="note.xml"></xml>
      </body>
    </html>
37. Parsing XML Documents
- To manipulate an XML document, you need an XML
parser. The parser loads the document into your
computer's memory. Once the document is loaded,
its data can be manipulated using the DOM. The
DOM treats the XML document as a tree.
38. Parsing an XML File - A Cross browser Example
- This is a cross browser example (xmlParser.asp) that loads an existing XML document ("note.xml") into the XML parser. It produces this output:
    W3Schools Internal Note
    To: Tove
    From: Jani
    Message: Don't forget me this weekend!
39. Parsing an XML String - A Cross browser Example
    Text of first child element: Tove
    Text of second child element: Jani
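The browser script behind this slide is not shown, so the following is only a comparable sketch in Python: it loads an XML string into a DOM tree with the standard library's xml.dom.minidom and reads the text of the first two child elements. The XML string is an assumption based on the note used throughout these slides.

```python
from xml.dom import minidom

# No whitespace between elements, so every child node is an element.
NOTE_STRING = "<note><to>Tove</to><from>Jani</from><body>Don't forget me this weekend!</body></note>"

# Load the string into memory as a DOM tree.
document = minidom.parseString(NOTE_STRING)
root = document.documentElement

# Each element's text lives in a text node that is its first child.
first_text = root.childNodes[0].firstChild.data
second_text = root.childNodes[1].firstChild.data

print("Text of first child element:", first_text)
print("Text of second child element:", second_text)
```

If the string were pretty-printed with line breaks, the whitespace between elements would appear as extra text nodes in childNodes, and the indexes would have to skip them.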