Title: SAS/IntrNet V8
1The EXtensible Markup Language
Mary C. Parmelee
2What is XML?
- XML is a text-based meta-language that was designed to
- Structure and describe Web data and make it more human readable
- Be completely customizable and self describing
- Be platform independent for seamless interoperability
- Facilitate the translation of a document from one language to another
3Why is XML Important?
- Two main reasons
- Data interchange allows data to be shared between different applications and systems
- Flexibility allows data to be centrally stored and transformed into different media and formats such as HTML
4Evolution of XML
- First there was Standard Generalized Markup Language (SGML)
- Provides a specification for the writing of markup languages
- Grew out of GML, developed by IBM researchers in 1969
- Became an ISO standard in 1986
- Powerful but complex
5Evolution of XML
- Then there was Hypertext Markup Language (HTML)
- 1989 researcher Tim Berners-Lee proposed a simple system to share documents using hyperlinks
- Named the hypertext system the WWW
- Defined a simple SGML Document Type (HTML)
- World Wide Web Consortium (W3C) founded in 1994 to standardize HTML
6Evolution of XML
- Finally eXtensible Markup Language (XML)
- HTML was
- Portable
- Easy to understand and use
- BUT
- Fixed set of tags, not extensible
- Designed to format and display data, not describe its content
- Does not address the platform-independent transfer of data
- 1996 W3C began XML development
- Released XML 1.0 spec in 1998
7The XML Family
- 1998 CSS Cascading Style Sheets
- 2000 XSL Extensible Stylesheet Language
- 2000 XLL (XLink) XML Linking Language
- 2000 XPointer XML Pointing Language
- 1999 XML Namespaces
- 1999 XML Schema
- 2000 XForms
- XML Query
8How Does XML Differ from HTML?
- Two main differences
- Both derived from the SGML standard BUT
- HTML is an application of SGML, defined by a Document Type Definition (DTD) that specifies a fixed set of markup
- XML is a subset of SGML that allows specification of custom DTDs and associated markup
- HTML specifies presentation
- XML specifies data structure (content)
9What Does XML Look Like?
Title                 Author      Price
Introduction to XML   Mark Torr   22.00
Advanced XML          Mark Torr   32.00
10What Does XML Look Like?
Notice that there are NO presentation tags. XML
simply defines a data structure
<?xml version="1.0"?>
<books>
  <book name="Introduction to XML" price="22.00">
    <author>Mark Torr</author>
  </book>
  <book name="Advanced XML" price="32.00">
    <author>Mark Torr</author>
  </book>
</books>
It is pretty self describing. Simply from reading the tags, it is clear that we are
talking about books and which data relates to each book.
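Because the tags describe the data, a program can pull the same Title / Author / Price table back out of the document. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree; the function name list_books and the constant BOOKS_XML are illustrative, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The sample document from the slide, embedded as a string.
BOOKS_XML = """<?xml version="1.0"?>
<books>
  <book name="Introduction to XML" price="22.00">
    <author>Mark Torr</author>
  </book>
  <book name="Advanced XML" price="32.00">
    <author>Mark Torr</author>
  </book>
</books>"""


def list_books(xml_text):
    """Return a (name, author, price) tuple for every <book> element."""
    root = ET.fromstring(xml_text)  # root is the <books> element
    rows = []
    for book in root.findall("book"):
        name = book.get("name")            # attribute value
        author = book.findtext("author")   # text of the child element
        price = float(book.get("price"))   # attribute values arrive as strings
        rows.append((name, author, price))
    return rows


if __name__ == "__main__":
    for name, author, price in list_books(BOOKS_XML):
        print(f"{name} | {author} | {price:.2f}")
```

Nothing in the XML says how the rows should be displayed; the print format here is a choice made by the consuming program.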
11XML Structure Declaration
- XML declarations normally look as shown below
- <?xml version="1.0"?>
- You can tell this is a declaration by the ? that follows immediately after the opening < and the ? immediately prior to the closing >.
- XML declarations, which share the <? ... ?> syntax of processing instructions, are used by parsers.
12XML Structure Document Root
- Every XML document must contain a document root, e.g. <books></books>
- The document root is simply the element that appears first after the XML declaration.
- A document must have exactly one document root; every other element is nested inside it.
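The requirement that every document contain a document root can be observed with any XML parser. A small sketch, assuming Python's xml.etree.ElementTree as the parser; is_parseable and the three sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_parseable(xml_text):
    """Return True if the parser accepts the text as an XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


ONE_ROOT = "<books><book/><book/></books>"   # a single <books> root
TWO_ROOTS = "<book/><book/>"                 # two top-level elements
NO_ROOT = '<?xml version="1.0"?>'            # a declaration and nothing else

if __name__ == "__main__":
    for label, text in [("one root", ONE_ROOT),
                        ("two roots", TWO_ROOTS),
                        ("no root", NO_ROOT)]:
        print(label, "->", "accepted" if is_parseable(text) else "rejected")
```

The three inputs differ only in how many top-level elements they have, so the accept/reject result isolates the document-root rule.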
13XML Structure Element
- An XML element defines the content of an XML document, e.g. <books></books>
- Elements can contain
- Other Elements
- Text
- Known as PCDATA (Parsed Character Data)
- Nothing
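The three kinds of element content listed above can be told apart programmatically. This is a rough sketch with Python's xml.etree.ElementTree; the describe function and the tiny sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def describe(element):
    """Classify an element's content as 'elements', 'text', or 'nothing'.

    Simplification: an element holding both text and child elements
    is reported as 'elements'.
    """
    if len(element) > 0:                       # len() counts child elements
        return "elements"
    if element.text and element.text.strip():  # non-blank character data
        return "text"
    return "nothing"


doc = ET.fromstring(
    "<books><book><author>Mark Torr</author><personal/></book></books>"
)

if __name__ == "__main__":
    print("books    ->", describe(doc))
    print("author   ->", describe(doc.find("book/author")))
    print("personal ->", describe(doc.find("book/personal")))
```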
14XML Structure Attribute
- An XML attribute adds data about the element.
- Attributes can be considered similar to fields in a database.
<book name="Introduction to XML" price="22.00"> </book>
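The database-field analogy can be made concrete: a parser hands back an element's attributes as name/value pairs, much like one record. A minimal sketch using Python's xml.etree.ElementTree; the isbn attribute is hypothetical and only shows what happens when a field is absent.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring('<book name="Introduction to XML" price="22.00"></book>')

# All attributes of the element, as a plain dictionary (one "record").
record = dict(book.attrib)

# Attribute values are always strings; convert them when a number is needed.
price = float(book.get("price"))

# Hypothetical attribute that the element does not carry: fall back to a default.
isbn = book.get("isbn", "n/a")

if __name__ == "__main__":
    print(record)
    print(price, isbn)
```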
15XML Structure Entities
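- Here is a list of predefined entities. There are only five and so it is pretty easy to remember them.
Entity   Character
&lt;     <  (less than)
&gt;     >  (greater than)
&amp;    &  (ampersand)
&apos;   '  (apostrophe)
&quot;   "  (quotation mark)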
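The five predefined entities in XML 1.0 are &lt;, &gt;, &amp;, &apos; and &quot;. The sketch below, in Python, checks how a parser decodes each one and uses xml.sax.saxutils.escape to go the other way; PREDEFINED, decode and the sample string raw are illustrative names, not from the slides.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five predefined entities and the characters they stand for.
PREDEFINED = {
    "&lt;": "<",
    "&gt;": ">",
    "&amp;": "&",
    "&apos;": "'",
    "&quot;": '"',
}


def decode(entity):
    """Let the parser expand one entity reference inside a throwaway element."""
    return ET.fromstring(f"<t>{entity}</t>").text


# escape() handles &, < and > by default; quotes are passed in explicitly.
raw = 'Mark Torr & His "Friend" <editor>'
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})

if __name__ == "__main__":
    for entity, char in PREDEFINED.items():
        print(entity, "->", decode(entity))
    print(escaped)
```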
16XML Structure Example
<?xml version="1.0"?>
<books>
  <book name="Introduction to XML" price="22.00">
    <author>Mark Torr</author>
  </book>
  <book name="Advanced XML" price="32.00">
    <author>Mark Torr &amp; His Friend</author>
  </book>
</books>
- Declaration (uses processing-instruction syntax). Used by parsers or XML editing tools.
- Document root. Here it is <books>.
- Element. Other valid elements in this XML document are <book>, <author>. Notice that all element tags start with a < and end with a >.
- Attribute. Attributes always appear within elements as shown in this example. Here name is an attribute of the element book and price is also an attribute of the element book.
- Entity. &amp; is the only entity in this XML document. Entities always start with an & and end with a ;.
17What is Well Formed XML?
- A well formed XML document must conform to these rules
- All element tags have been closed
- All attribute values are enclosed in quotes
- Nesting of components is correct
- Opening and closing element tags are of the same case (XML is case sensitive).
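Each of the rules above can be tried out by feeding a parser one document that breaks exactly that rule. A minimal sketch, assuming Python's xml.etree.ElementTree as the parser; is_well_formed and the EXAMPLES dictionary are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, '') if the text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
        return True, ""
    except ET.ParseError as err:
        return False, str(err)


# One correct document, then one violation per rule.
EXAMPLES = {
    "well formed": '<book name="Advanced XML"><author>Mark Torr</author></book>',
    "tag not closed": '<book name="Advanced XML"><author>Mark Torr</book>',
    "attribute not quoted": "<book name=Advanced><author>Mark Torr</author></book>",
    "bad nesting": "<book><author><b>Mark Torr</author></b></book>",
    "case mismatch": "<book><Author>Mark Torr</author></book>",
}

if __name__ == "__main__":
    for label, text in EXAMPLES.items():
        ok, message = is_well_formed(text)
        print(f"{label:22} {'OK' if ok else 'ERROR: ' + message}")
```

The exact wording of the error messages depends on the parser, so the sketch only relies on whether parsing succeeds.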
18 Let's Try Coding In XML
<?xml version="1.0"?>
<books>
  <book name="Introduction to XML" price="22.00">
    <author>Mark Torr</author>
    <personal age="22"/>
  </book>
  <book name="Advanced XML" price="32.00">
    <author>Mark Torr &amp; His Friend</author>
    <publisher>Harry Smiths</publisher>
  </book>
</books>
Save it into your C:\temp directory as sample.xml
19Load this into IE5
- IE has a parser that will tell you if your XML is well formed
- Find the C:\temp directory
- Find your file named sample.xml
- It will have an icon like this: [icon image]
- Double click on the file to open in Internet Explorer
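The same kind of check can be run outside the browser. The sketch below uses Python's xml.etree.ElementTree to parse a file and, on failure, report the line the parser complained about. The helper names check_file and write_temp are hypothetical, and a temporary directory stands in for the working directory used on the slides.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def check_file(path):
    """Return None if the file is well formed, else (line, column, message)."""
    try:
        ET.parse(path)
        return None
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return line, column, str(err)


def write_temp(text, name="sample.xml"):
    """Write text to a file in a fresh temporary directory; return its path."""
    folder = tempfile.mkdtemp()
    path = os.path.join(folder, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


GOOD = '<?xml version="1.0"?>\n<books>\n  <book name="Advanced XML"/>\n</books>\n'
# Same document, but the <book> element is never closed.
BAD = '<?xml version="1.0"?>\n<books>\n  <book name="Advanced XML">\n</books>\n'

if __name__ == "__main__":
    for label, text in [("good", GOOD), ("bad", BAD)]:
        result = check_file(write_temp(text))
        print(label, "->", "well formed" if result is None else result)
```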
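20Congratulations! :o)
If your syntax is correct, the screen looks like
this
[screenshot: Internet Explorer displaying sample.xml]
21Uh Oh! :o(
If the screen looks like this, you made an error
[screenshot: Internet Explorer reporting a parsing error]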
22Apply the Rules One At A Time
- A well formed XML document must conform to these rules
- All element tags have been closed
- All attribute values are enclosed in quotes
- Nesting of components is correct
- Opening and closing element tags are of the same case (XML is case sensitive).
23What next?
- Once you have a well formed document you can start to use things such as the DOM (Document Object Model) and XSL (Extensible Stylesheet Language) to transform it into another markup
- HTML
- XML (You can subset and combine XML documents to create new XML documents.)
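As an illustration of a DOM-based transformation into HTML, here is a short sketch using Python's xml.dom.minidom. The function books_to_html and the table layout are assumptions chosen for the example; an XSL stylesheet would be another way to get a similar result.

```python
from xml.dom.minidom import parseString
from xml.sax.saxutils import escape

SAMPLE = (
    '<books>'
    '<book name="Introduction to XML" price="22.00"><author>Mark Torr</author></book>'
    '<book name="Advanced XML" price="32.00">'
    '<author>Mark Torr &amp; His Friend</author></book>'
    '</books>'
)


def books_to_html(xml_text):
    """Walk the DOM tree and emit one HTML table row per <book> element."""
    dom = parseString(xml_text)
    rows = []
    for book in dom.getElementsByTagName("book"):
        name = book.getAttribute("name")
        price = book.getAttribute("price")
        author_node = book.getElementsByTagName("author")[0]
        # Join every text node under <author> into one string.
        author = "".join(
            child.data
            for child in author_node.childNodes
            if child.nodeType == child.TEXT_NODE
        ).strip()
        # escape() re-encodes &, < and > so the HTML output stays valid.
        rows.append(
            f"<tr><td>{escape(name)}</td><td>{escape(author)}</td>"
            f"<td>{escape(price)}</td></tr>"
        )
    header = "<tr><th>Title</th><th>Author</th><th>Price</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"


if __name__ == "__main__":
    print(books_to_html(SAMPLE))
```

The presentation (a table, its column order) lives entirely in the transformation, while the XML source stays purely descriptive.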
24What next?
- Once a document is well formed you can also check if it is valid. A badly formed XML document is always invalid.
- To check if an XML document is valid you need to make use of DTDs.
25What is a DTD?
- DTD stands for Document Type Definition
- Used to validate XML documents
- Defines a set of rules that the content of one or
more XML files must adhere to in order for the
document to be valid
26What is validation?
- Validation is the process of checking that
- Data is in the proper order
- Data is in the proper format
- Data is the correct type
- Any mandatory fields are present
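To make the checks above concrete, here is a hand-written sketch in Python. It is not a DTD validator (the standard library does not include one); it only mimics the kinds of checks listed, under an assumed rule set: each book holds title, author, price and publisher in that order, and price must be numeric. EXPECTED_ORDER and validate_book are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Assumed rule set for this sketch: required children, in this order.
EXPECTED_ORDER = ["title", "author", "price", "publisher"]


def validate_book(book):
    """Return a list of problems found in one <book> element (empty if none)."""
    problems = []
    tags = [child.tag for child in book]

    # Mandatory fields are present.
    for tag in EXPECTED_ORDER:
        if tag not in tags:
            problems.append(f"missing <{tag}>")

    # Data is in the proper order.
    present = [tag for tag in tags if tag in EXPECTED_ORDER]
    if present != sorted(present, key=EXPECTED_ORDER.index):
        problems.append("children out of order")

    # Data is in the proper format and of the correct type.
    price = book.findtext("price")
    if price is not None:
        try:
            float(price)
        except ValueError:
            problems.append(f"price is not a number: {price!r}")

    return problems


good = ET.fromstring(
    "<book><title>T</title><author>A</author>"
    "<price>22.00</price><publisher>P</publisher></book>"
)
bad = ET.fromstring(
    "<book><author>A</author><title>T</title><price>cheap</price></book>"
)

if __name__ == "__main__":
    print(validate_book(good))
    print(validate_book(bad))
```

A DTD expresses rules like these declaratively, so a validating parser can apply them without custom code for each document type.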
27Book Example DTD
<!ELEMENT books (book)>
<!ELEMENT book (title, author, price, publisher)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ATTLIST book Edition CDATA #REQUIRED>
<!ATTLIST publisher Address CDATA #IMPLIED>
<!ATTLIST price Currency (Dollar | Pound | Yen) "Pound">
28Conclusion
- XML is used to define the structure of the data.
- XML does not define any aspect of presentation.
- XML has strict rules that must be adhered to, unlike HTML which is a little more forgiving.
- Making an XML document well formed is the first step towards either doing transformation or ensuring it is a valid XML document.
- IE5 can be used to root out anything that is making an XML document badly formed provided you give the file a .xml extension.
29Further reading
- Extensible Markup Language (XML) 1.0
- http://www.w3.org/TR/WD-xml
- Defines the syntax of XML.