Title: Chapter Goals
Goals of the XML Chapter
- Students understand the significance of XML.
- Students get an overview of the XML language family.
- Students learn to specify simple XML documents and their layout.
XML Chapter Overview
- XML in 7 points
  - Overview of the XML design goals and of XML as a family of technologies
- XML motivation and first examples
  - Distinguishes XML from HTML and SGML and shows simple application examples
- XML Schema and UML
  - Detailed example of the mapping
- XML tutorial
  - Contains special examples for the XML language family
XML
- Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them. XML is an application profile or restricted form of SGML, the Standard Generalized Markup Language (ISO 8879). By construction, XML documents are conforming SGML documents.
- See http://www.w3.org
XML
- Definition: A software module called an XML processor is used to read XML documents and provide access to their content and structure.
- Definition: It is assumed that an XML processor is doing its work on behalf of another module, called the application.
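The split between processor and application can be sketched in Python. This is only an illustration: the standard-library SAX parser plays the role of the XML processor, and the hypothetical `EventCollector` class (a name invented for this example, as is the sample document) plays the role of the application.

```python
import xml.sax


class EventCollector(xml.sax.ContentHandler):
    """The 'application': it only sees what the processor reports to it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Structure: an element starts, together with its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Content: character data between tags (whitespace-only runs skipped).
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


# Hypothetical sample document.
DOCUMENT = b'<note lang="en"><to>Student</to><body>Read chapter 1</body></note>'

handler = EventCollector()
# The 'XML processor' reads the document on behalf of the application.
xml.sax.parseString(DOCUMENT, handler)

for event in handler.events:
    print(event)
```

The application never touches the raw text; it only receives the content and structure that the processor extracts.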
XML
- XML documents are made up of storage units called entities, which contain either parsed or unparsed data.
- Parsed data is made up of characters, some of which form character data, and some of which form markup.
- Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure.
Design Goals for XML (1)
- The design goals for XML are:
- 1. XML shall be straightforwardly usable over the Internet.
- 2. XML shall support a wide variety of applications.
- 3. XML shall be compatible with SGML.
- 4. It shall be easy to write programs which process XML documents.
- 5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
Design Goals for XML (2)
- 6. XML documents should be human-legible and reasonably clear.
- 7. The XML design should be prepared quickly.
- 8. The design of XML shall be formal and concise.
- 9. XML documents shall be easy to create.
- 10. Terseness in XML markup is of minimal importance.
XML in 7 points (see also http://www.w3.org/1999/XML-in-10-points)
- XML, XLink, Namespace, DTD, Schema, CSS, XHTML, ... If you are new to XML, it may be hard to know where to begin.
- XLink, XPointer: generalized link concepts
- XSL: more powerful than CSS, serves formatting purposes for XML documents
1. XML is a method for putting structured data in a text file
- "Structured data": spreadsheets, address books, configuration parameters, financial transactions, technical drawings, etc.
- A text format allows one to look at the data without the program that produced it. XML is a set of rules, guidelines, and conventions for designing text formats for such data, in a way that produces files that are easy to generate and read (by a computer) and that are unambiguous and platform-independent.
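As a rough illustration of "easy to generate and read (by a computer)", the following Python sketch reads an address book and writes a new record back to text. The element names (`addressBook`, `contact`, `name`, `city`) and the data are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book; the vocabulary is invented for this sketch.
ADDRESS_BOOK = """\
<addressBook>
  <contact>
    <name>Ada Lovelace</name>
    <city>London</city>
  </contact>
  <contact>
    <name>Kurt Goedel</name>
    <city>Vienna</city>
  </contact>
</addressBook>
"""

# Read: turn the text file content into plain Python records.
root = ET.fromstring(ADDRESS_BOOK)
contacts = [
    {"name": c.findtext("name"), "city": c.findtext("city")}
    for c in root.findall("contact")
]

# Generate: add a record and serialize the structure back to text.
new_contact = ET.SubElement(root, "contact")
ET.SubElement(new_contact, "name").text = "Grace Hopper"
ET.SubElement(new_contact, "city").text = "New York"
serialized = ET.tostring(root, encoding="unicode")

print(contacts)
print(serialized)
```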
2. XML looks a bit like HTML but isn't HTML
- Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value").
- While HTML specifies what each tag and attribute means (and often how the text between them will look in a browser), XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it.
- E.g., if you see "<p>" in an XML file, don't assume it is a paragraph. Depending on the context, it may be a price, a parameter, a person, etc.
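A small Python sketch of this point: the same tag name `p` is read as a price by one application and as a paragraph by another. Both documents and both function names are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that both use a <p> tag.
PRICE_LIST = '<item name="coffee"><p currency="EUR">2.50</p></item>'
ARTICLE = "<article><p>First paragraph of text.</p></article>"


def read_price(xml_text):
    """An application that interprets <p> as a price with a currency."""
    p = ET.fromstring(xml_text).find("p")
    return float(p.text), p.get("currency")


def read_paragraph(xml_text):
    """A different application that interprets <p> as a paragraph of text."""
    return ET.fromstring(xml_text).find("p").text


print(read_price(PRICE_LIST))
print(read_paragraph(ARTICLE))
```

The parser treats both `p` elements identically; the meaning comes entirely from the code that reads them.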
3. XML is text, but isn't meant to be read
- XML files are text files, but they are not meant to be read by humans. They are text files because that allows experts (such as programmers) to debug applications more easily.
- The rules for XML files are much stricter than for HTML. A forgotten tag or an attribute without quotes makes the file unusable, while in HTML such practice is often explicitly allowed, or at least tolerated.
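The strictness can be observed with any XML parser. The sketch below uses Python's standard-library parser; the helper name `is_well_formed` and the sample strings are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = {
    "ok": '<a href="x.html">link</a>',
    # Tolerated by HTML browsers, rejected by an XML parser:
    "unquoted attribute": "<a href=x.html>link</a>",
    "missing closing tag": "<p>first<p>second",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```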
4. XML is a family of technologies
- We will look at the following technologies:
- XML
- DTD (Document Type Definition)
- XML Schema (XSchema)
- XPath, XPointer
- XInclude
- XSLT, CSS
- XLink
4. XML is a family of technologies
- XML 1.0 is the specification that defines what "tags" and "attributes" are, but around XML 1.0 there is a growing set of optional modules that provide sets of tags and attributes, or guidelines for specific tasks.
4. XML is a family of technologies
- When starting with XML, it's important to realize that XML is not a markup language itself (like HTML), but it provides rules (like SGML) for defining a markup language. The names of the tags are up to the authors.
- Example:
  <myFirstTag>
    <Hello/>
    <World/>
  </myFirstTag>
- So XML specifies which characters tags can consist of and defines a set of rules for well-formedness (for example, every opening tag must have a closing tag).
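To make "the names of the tags are up to the authors" concrete, here is a minimal Python sketch that builds the example document from freely chosen names and lets the library take care of well-formedness when writing it out. Only the tag names come from the slide; everything else is illustrative.

```python
import xml.etree.ElementTree as ET

# The author picks the vocabulary; XML itself does not predefine any tags.
root = ET.Element("myFirstTag")
ET.SubElement(root, "Hello")
ET.SubElement(root, "World")

# Serialization emits matching (or self-closing) tags automatically.
text = ET.tostring(root, encoding="unicode")
print(text)

# Reading the text back yields the same structure.
child_names = [child.tag for child in ET.fromstring(text)]
print(child_names)
```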
DTDs solve the problem of defining the structure of a document. Example:
  <myFirstTag myFirstAttribute="Hello World">
    <Hello/> <World/>
  </myFirstTag>
A correct DTD tells how the tags should be arranged to form a valid document, and which attributes a tag can include. In our case the DTD tells the following:
- The name of the main tag is "myFirstTag".
- myFirstTag has an attribute myFirstAttribute.
- Inside myFirstTag there must exist two tags, "Hello" and "World".
The main problem of the DTD concept is that it tells nothing about the content of a tag (data type, format, pattern, etc.):
  <myFirstTag>12.3</myFirstTag>
  <mySecondTag>Text Content</mySecondTag>
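One possible DTD for this example is sketched below, embedded in the document as an internal subset. The exact declarations are an assumption (the slide does not show them): the attribute is declared `#REQUIRED` and the children are declared in the fixed order `Hello, World`. Python's standard-library parser reads the declarations but does not validate against them, so the function `follows_dtd_rules` is a hand-written stand-in for the three rules, not a real DTD validator.

```python
import xml.etree.ElementTree as ET

# Document with an assumed internal DTD subset.
DOCUMENT = """\
<!DOCTYPE myFirstTag [
  <!ELEMENT myFirstTag (Hello, World)>
  <!ATTLIST myFirstTag myFirstAttribute CDATA #REQUIRED>
  <!ELEMENT Hello EMPTY>
  <!ELEMENT World EMPTY>
]>
<myFirstTag myFirstAttribute="Hello World">
  <Hello/> <World/>
</myFirstTag>
"""


def follows_dtd_rules(root):
    """Hand-written check mirroring the three rules listed on the slide."""
    return (
        root.tag == "myFirstTag"
        and "myFirstAttribute" in root.attrib
        and [child.tag for child in root] == ["Hello", "World"]
    )


root = ET.fromstring(DOCUMENT)
valid = follows_dtd_rules(root)
print(valid)
```

A validating parser would perform the same kind of check directly from the declarations instead of from hand-written code.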
4. XML is a family of technologies
- XML Schemas 1 and 2 help developers to precisely define their own XML-based formats.
- XSchema fills the gaps of the DTD concept. In fact, you can replace your DTDs entirely with an "XML Schema".
- Beyond what a DTD offers, you can write exact rules about the content of attributes and tags. This includes:
  - data types (integer, string),
  - patterns (the content has to be a valid email address),
  - lists of tokens.
4. XML is a family of technologies
- If we talk about XML as a set of rules that define the syntax of a document, then XSchema's purpose is to specify a pattern or semantics for a particular purpose.
- Example:
- myFirstTag has to contain a number between 5 and 10.7 that has exactly 5 decimal places (e.g. 6.30020).
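The kind of content rule a schema expresses can be imitated in plain Python. This sketch is not XML Schema and uses no schema library; it only hand-codes the rule from the example, assuming that both bounds are inclusive and that "5 decimal places" means exactly five digits after the decimal point. The function name `satisfies_rule` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

# Digits, a decimal point, then exactly five digits (assumed reading of the rule).
FIVE_DECIMALS = re.compile(r"\d+\.\d{5}")


def satisfies_rule(xml_text):
    """Check the content rule for <myFirstTag> by hand."""
    root = ET.fromstring(xml_text)
    text = (root.text or "").strip()
    if root.tag != "myFirstTag" or not FIVE_DECIMALS.fullmatch(text):
        return False
    # Range check, bounds assumed inclusive.
    return Decimal("5") <= Decimal(text) <= Decimal("10.7")


for sample in ("6.30020", "12.30000", "6.3", "Text Content"):
    print(sample, "->", satisfies_rule(f"<myFirstTag>{sample}</myFirstTag>"))
```

A DTD can accept or reject none of these four documents on the basis of their content; a schema rule can.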
4. XML is a family of technologies
- "XPath (XML Path Language) is a language for addressing parts of an XML document, designed to be used by both XSLT and XPointer."
- In other words, XPath provides the functionality to jump to a certain part of an XML document. It's like a bookmark that identifies a certain point or part of an XML document.
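A few path expressions, tried out in Python. Note that the standard-library `xml.etree.ElementTree` module supports only a limited subset of XPath, which is enough for this sketch; the sample document and its element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
LIBRARY = """\
<library>
  <book lang="en"><title>XML Basics</title></book>
  <book lang="de"><title>XML Grundlagen</title></book>
</library>
"""

root = ET.fromstring(LIBRARY)

# All titles of all books directly below the root.
titles = [t.text for t in root.findall("./book/title")]

# Addressing by position: the title of the second book.
second_title = root.find("./book[2]/title").text

# Addressing by attribute value: the title of the German book.
german_title = root.find(".//book[@lang='de']/title").text

print(titles, second_title, german_title)
```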
4. XML is a family of technologies
- "XPointer, which is based on XPath, supports addressing into the internal structures of XML documents. It allows for traversals of a document tree and choice of its internal parts based on various properties, such as element types, attribute values, character content, and relative position."
- Basically works like anchors in HTML:
  <link xlink:href="mydocument.xml#xpointer(//AAA/BBB[1])"/>
4. XML is a family of technologies
- XInclude allows documents, or parts of documents, to be included in an XML document (much like programming languages do through include).
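A minimal inclusion sketch using the standard-library `xml.etree.ElementInclude` module. To keep the example self-contained, the included document is served from an in-memory dictionary by a custom loader; the file name `footer.xml`, the loader name, and the document contents are all assumptions for this example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-in for files on disk (hypothetical name and content).
FILES = {"footer.xml": "<footer>Contact: info@example.org</footer>"}


def load_from_memory(href, parse, encoding=None):
    """Loader that resolves an href against the FILES dictionary."""
    if parse != "xml":
        raise ValueError("only XML inclusion is sketched here")
    return ET.fromstring(FILES[href])


PAGE = """\
<page xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>Hello</body>
  <xi:include href="footer.xml"/>
</page>
"""

page = ET.fromstring(PAGE)
# Replace every xi:include element with the document it points to.
ElementInclude.include(page, loader=load_from_memory)

child_tags = [child.tag for child in page]
print(child_tags)
```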
4. XML is a family of technologies
- CSS, the style sheet language, is applicable to XML as it is to HTML.
- XSL (Extensible Stylesheet Language) is the advanced language for expressing style sheets.
- A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates.
- A pattern is matched against elements in the source tree.
- A template is instantiated to create part of the result tree.
- The result tree is separate from the source tree.
- The structure of the result tree can be completely different from the structure of the source tree.
In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary structure can be added.
A transformation expressed in XSLT is called a stylesheet. This is because, in the case when XSLT is transforming into the XSL formatting vocabulary, the transformation functions as a stylesheet.
In other words, XSL provides you with a way to transform any XML input to match a certain purpose.
- Examples:
- Transform an order, saved in XML, into a delivery note.
- Transform UML logic, saved in XML, into class definitions in a given programming language.
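The pattern/template idea can be imitated in a few lines of Python. This is not XSLT and does not use an XSLT processor; it is only a sketch of the mechanism, in which "patterns" are reduced to plain tag names and "templates" are Python functions. The order document, the delivery-note vocabulary, and all function names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document: an order.
ORDER = """\
<order id="4711">
  <customer>Ada Lovelace</customer>
  <item sku="A1" quantity="2" price="9.90">Notebook</item>
  <item sku="B7" quantity="1" price="1.50">Pencil</item>
</order>
"""


def order_template(source, result_parent):
    # Adds new structure to the result tree, then processes the children.
    note = ET.SubElement(result_parent, "deliveryNote", {"orderId": source.get("id")})
    apply_templates(source, note)


def customer_template(source, result_parent):
    ET.SubElement(result_parent, "recipient").text = source.text


def item_template(source, result_parent):
    # Filters the source: sku and price are not copied to the delivery note.
    line = ET.SubElement(result_parent, "line", {"quantity": source.get("quantity")})
    line.text = source.text


# "Patterns" (here simply tag names) associated with templates.
TEMPLATES = {
    "order": order_template,
    "customer": customer_template,
    "item": item_template,
}


def apply_templates(source_parent, result_parent):
    """Instantiate the matching template for each child of the source node."""
    for child in source_parent:
        template = TEMPLATES.get(child.tag)
        if template is not None:
            template(child, result_parent)


def transform(source_root):
    """Build a separate result tree from the source tree."""
    holder = ET.Element("result")
    TEMPLATES[source_root.tag](source_root, holder)
    return holder[0]


delivery_note = transform(ET.fromstring(ORDER))
print(ET.tostring(delivery_note, encoding="unicode"))
```

The source tree is left untouched, and the result tree has a different structure and vocabulary, which is the property the slides emphasize.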
4. XML is a family of technologies
- XLink (still in development) describes a standard way to add hyperlinks to an XML file. XPointer and XPath are syntaxes for pointing to parts of an XML document.
"XLink (XML Linking Language) allows elements to be inserted into XML documents in order to create and describe links between resources. It uses XML syntax to create structures that can describe links similar to the simple unidirectional hyperlinks of today's HTML, as well as more sophisticated links."
With XLinks you can replace today's <a href> links and tie links together like resource chains. XLinks can contain information about the context they are used in and about additional information available.
4. XML is a family of technologies (3)
- The DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language.
- XML Namespaces is a specification that describes how you can associate a URL with every single tag and attribute in an XML document. What that URL is used for is up to the application that reads the URL, though.
- RDF, W3C's standard for metadata, uses it to link every piece of metadata to a file defining the type of that data.
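Both ideas can be tried with Python's `xml.dom.minidom`, a small DOM implementation in the standard library. The namespace URL `http://example.org/books`, the prefix `bk`, and the document content are placeholders invented for this sketch.

```python
from xml.dom import minidom

# Hypothetical namespace URL; what it means is up to the application.
BOOKS_NS = "http://example.org/books"

SOURCE = (
    '<catalog xmlns:bk="http://example.org/books">'
    "<bk:book>XML Basics</bk:book>"
    "</catalog>"
)

dom = minidom.parseString(SOURCE)

# Namespaces: the tag is identified by URL plus local name, not by its prefix.
book = dom.getElementsByTagNameNS(BOOKS_NS, "book")[0]
first_title = book.firstChild.data

# DOM: standard function calls for manipulating the document tree.
new_book = dom.createElementNS(BOOKS_NS, "bk:book")
new_book.appendChild(dom.createTextNode("XPath in Practice"))
dom.documentElement.appendChild(new_book)

books = dom.getElementsByTagNameNS(BOOKS_NS, "book")
print(first_title, len(books))
print(dom.documentElement.toxml())
```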
5. XML is verbose, but that is not a problem
- Since XML is a text format and uses tags to delimit the data, XML files are nearly always larger than comparable binary formats. That was a conscious decision by the XML developers. The advantages of a text format are evident (see point 3 above), and the disadvantages can usually be compensated for at a different level.
- Communication protocols such as modem protocols and HTTP/1.1 (the core protocol of the Web) can compress data on the fly.
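The effect of compression on repetitive markup is easy to measure. The sketch below uses gzip from the Python standard library on a made-up table of 1000 rows; the element names and the data are assumptions, and the exact ratio will vary with the content.

```python
import gzip

# Hypothetical, highly repetitive XML document.
rows = "".join(
    f"<row><id>{i}</id><value>{i * i}</value></row>" for i in range(1000)
)
xml_bytes = f"<table>{rows}</table>".encode("utf-8")

compressed = gzip.compress(xml_bytes)
ratio = len(compressed) / len(xml_bytes)

# Compression is lossless: the original text comes back unchanged.
restored = gzip.decompress(compressed)

print(len(xml_bytes), "bytes uncompressed")
print(len(compressed), "bytes compressed")
print(f"compressed size is {ratio:.0%} of the original")
```

The repeated tag names are exactly what a general-purpose compressor removes most effectively.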
6. XML is new, but not that new
- Development of XML started in 1996, and it has been a W3C standard since February 1998.
- The technology isn't very new. Before XML there was SGML, developed in the early '80s, an ISO standard since 1986, and widely used for large documentation projects.
- For HTML, development started in 1990.
- The designers of XML simply took the best parts of SGML, guided by the experience with HTML, and produced something that is no less powerful than SGML, but vastly more regular and simpler to use.
7. XML is license-free, platform-independent and well-supported
- By choosing XML as the basis for some project, you buy into a large and growing community of tools and engineers experienced in the technology. Opting for XML is a bit like choosing SQL for databases: you still have to build your own database and your own programs/procedures that manipulate it, but there are many tools available and many people that can help you.
- XML isn't always the best solution, but it is always worth considering.