AN INTRODUCTION TO XML (We'll look at XidML later)

1
AN INTRODUCTION TO XML (We'll look at XidML later)
  • SESSION 1

2
THIS SESSION
  • What is XML?
  • History of XML
  • Applications of XML
  • Tools for XML (more in session 9)
  • Pros and Cons of XML
  • Data Interchange with XML (more in session 9)
  • Later, we will discuss
  • Why XidML?
  • The XidML schema in detail

3
WHAT IS XML?
  • eXtensible Markup Language
  • An industry standard language for the creation of
    structured documents
  • By marking up data, we mean inserting tags (i.e.
    elements or attributes) or other information into
    the document so that it is easier to process

<Book>
  <Name type="text">The Outsider</Name>
  <Author type="text">Albert Camus</Author>
  <Category type="text">Fiction</Category>
  <Published type="date"></Published>
  <Price currency="USD" Type="listprice">19.99</Price>
  <InStock type="Integer">4</InStock>
  <Discount type="boolean">No</Discount>
</Book>
  • By the way: this example is often used to
    illustrate XML. If this XML had been developed by
    xidml.org, the type attributes would be in a
    schema (BooksML) and the currency would be an
    element (more later)
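
As a minimal sketch of what "easier to process" means in practice, the Book example above can be read with Python's standard xml.etree.ElementTree module. The variable names are illustrative only, and Python is an assumption here; the presentation does not name a programming language.

```python
import xml.etree.ElementTree as ET

# The Book example from this slide, with tags and attribute quotes written out.
BOOK_XML = """
<Book>
  <Name type="text">The Outsider</Name>
  <Author type="text">Albert Camus</Author>
  <Category type="text">Fiction</Category>
  <Published type="date"></Published>
  <Price currency="USD" Type="listprice">19.99</Price>
  <InStock type="Integer">4</InStock>
  <Discount type="boolean">No</Discount>
</Book>
"""

book = ET.fromstring(BOOK_XML)

# Elements are looked up by name, so the order of the fields does not matter.
title = book.findtext("Name")

# Attributes travel with the element they describe.
price = book.find("Price")
price_value = float(price.text)
price_currency = price.get("currency")

print(f"{title}: {price_value} {price_currency}")
```

Because every value is labelled by its tag, the program does not need to know column positions or guess which token is the price.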

4
XML IS MORE THAN A BETTER ASCII
  • ASCII is normal text
  • Human readable (evolved from typewriter)
  • Many tools to manipulate it
  • Can be easily converted to proprietary (e.g.
    Word, Excel)
  • Plain Text - Low information

Name Author Category Published Price In Stock Discount
The Outsider Albert Camus Fiction 19.99 4 No
The Road to McCarthy P. McCarthy Travel 2002 24.99 3 Yes
  • No context
  • No Structure
  • Needs Human interpretation

5
XML IS MORE THAN A BETTER CSV
  • Comma-Separated Values (CSV) is structured ASCII
  • Human readable
  • Can be easily converted to proprietary (e.g.
    Word, Excel)
  • More information

Name, Author, Category, Published, Price, In Stock, Discount
The Outsider, Albert Camus, Fiction,, 19.99, 4, No
The Road to McCarthy, P. McCarthy, Travel, 2002, 24.99, 3, Yes
  • Missing Information highlighted
  • Words grouped correctly
  • Still no context
  • Needs human intervention to import/export
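
The points above can be illustrated with a short sketch that reads the same CSV data using Python's standard csv module (the choice of Python is an assumption, not part of the presentation). The missing publication date shows up as an empty field, but every value arrives as a plain string with no indication of its type.

```python
import csv
import io

# The CSV data from this slide, one record per line.
CSV_TEXT = """Name, Author, Category, Published, Price, In Stock, Discount
The Outsider, Albert Camus, Fiction,, 19.99, 4, No
The Road to McCarthy, P. McCarthy, Travel, 2002, 24.99, 3, Yes
"""

# skipinitialspace drops the blank that follows each comma.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT), skipinitialspace=True))
first = rows[0]

# The missing value is visible as an empty string ...
missing_published = first["Published"] == ""

# ... but nothing in the file says that Price is a number or Discount a flag.
value_types = {type(value).__name__ for value in first.values()}

print(missing_published, value_types)
```

Deciding that "19.99" is a price in some currency, or that "No" is a boolean, is still left to whoever writes the import code.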

6
XML
  • Adds context and structure

<Book>
  <Name type="text">The Outsider</Name>
  <Author type="text">Albert Camus</Author>
  <Category type="text">Fiction</Category>
  <Published type="date"></Published>
  <Price currency="USD" Type="listprice">19.99</Price>
  <InStock type="Integer">4</InStock>
  <Discount type="boolean">No</Discount>
</Book>
  • Context added
  • Type information added
  • Can be automatically manipulated
  • Information rich
  • Needs rule book to understand
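
A rough sketch of "can be automatically manipulated": the type attributes in the example can drive value conversion without per-field code. The CONVERTERS table below is a hypothetical "rule book" invented for illustration; a real application would take these rules from a schema.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """
<Book>
  <Name type="text">The Outsider</Name>
  <Author type="text"> Albert Camus</Author>
  <Category type="text"> Fiction</Category>
  <Published type="date"></Published>
  <Price currency="USD" Type="listprice"> 19.99</Price>
  <InStock type="Integer">4</InStock>
  <Discount type="boolean">No</Discount>
</Book>
"""

# Hypothetical mapping from the slide's type labels to Python converters.
CONVERTERS = {
    "text": str.strip,
    "integer": int,
    "boolean": lambda s: s.strip().lower() in ("yes", "true", "1"),
    "date": lambda s: s.strip() or None,  # empty date becomes None
}

def convert(element):
    """Convert an element's text according to its 'type' attribute."""
    type_name = (element.get("type") or "text").lower()
    return CONVERTERS[type_name](element.text or "")

book = ET.fromstring(BOOK_XML)
record = {child.tag: convert(child) for child in book}

# Price carries Type="listprice" (a kind of price, not a data type),
# so it falls back to text here.
print(record)
```

The same loop handles any element that follows the convention, which is the sense in which the document needs a rule book to be understood.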

7
IF WE USED BookML
<Book Reference="BK0001">
  <Name>The XidML handbook</Name>
  <Authors>
    <Author Index="0">
      <Author>Sid Emel</Author>
    </Author>
    <Author Index="1">
      <Author>Joe Bloggs</Author>
    </Author>
  </Authors>
  <Category>Reference</Category>
  <PublishedDate>27 March 2005</PublishedDate>
  <Currency>UsDollars</Currency>
  <Amount>19.99</Amount>
  <NumberInStock>20</NumberInStock>
  <Discount>Yes</Discount>
</Book>
8
A WORLDWIDE STANDARD
  • Who Controls it?
  • W3C (World Wide Web Consortium, www.w3.org)
  • Adobe, AmEx, AT&T, Boeing, Computer Associates,
    Ericsson, HP, Intel, Microsoft, Oracle, Siemens,
    Xerox etc.
  • XML 1.0 released as a W3C Recommendation in
    February 1998
  • Intended as a means of distributing context rich
    documents on the internet
  • A lot of information on the internet
    (www.xml.org)

9
HISTORY OF XML
  • XML is not new!!!!
  • GML: Generalized Markup Language
  • Developed by IBM in the 1960s - good for humans,
    easy manipulation
  • Allowed a lot of cheating: many variations
  • SGML: Standard Generalized Markup Language
  • ISO standard in 1986
  • Standardized document validation and interchange
  • HTML: HyperText Markup Language
  • Evolved along with World Wide Web for document
    interchange
  • Loose standard
  • Focus on presentation, not content or structure
  • XML: eXtensible Markup Language
  • Addresses weaknesses of HTML
  • A meta-language for defining structure and
    context
  • Focus on data and data interchange
  • First standard (a sub-set of SGML) released as a
    W3C Recommendation in February 1998

10
TOOLS FOR XML
  • XML has a lot of support tools (some free)
  • Designed so that tools are easy to develop
  • (about 2 weeks for a competent computer science
    graduate)
  • Parsers
  • Off the shelf tools for reading XML files
  • Stylesheets
  • A language (XSL) for transforming XML into
    other documents
  • Controls how browsers display XML files
  • Validators
  • Checks XML files for correct structure and syntax
  • Checks field values for range, format etc.
  • Needs to be told what the XML structure should be
    (Schema)
  • Editing Tools
  • Graphical environments that integrate other tools
  • Allows XML, schema, stylesheets to be developed
  • Validate files with strong error checking

11
ADVANTAGES
  • Allows specialists to extend language for a
    domain
  • (music, mathematics, graphics, flight test
    instrumentation)
  • Self-documenting
  • With a schema
  • Robust, recoverable and future-proof
  • XML formats are text-based, making them more
    readable, easier to document, and easier to
    debug.
  • Off the shelf tools
  • Tools are available on different platforms,
    making it simpler to use XML instead of binary
    formats to exchange complex information streams.
  • Maps very well onto structured data (e.g.
    databases)
  • Allows creation of own-labelled structures for
    storing information.
  • Well understood and supported

12
DISADVANTAGES
  • Different specialists developing different
    standards!
  • Can be made incomprehensible and proprietary
  • Yet another language with a learning curve
  • Files can be large
  • Not (inherently) an indexed medium
  • No native support for libraries and nested files
  • No native support for archiving and revision
    control (of the data as opposed to the schema)

13
APPLICATIONS OF XML
  • XML is good for
  • Data interchange between computers, platforms and
    applications
  • Representing structured data for automated
    processing
  • Future-proofing data
  • XML is not good for
  • Data storage/exchange in a homogeneous environment
  • It is not a Database (despite the hype!)

14
EXAMPLE - PROBLEM
[Diagram boxes: Test Vehicle, Ground, Flight Test DAU, Ground Station, Vendor 1 Database, Vendor 2 Database, FTE, Test Config Database]
  • Same information (or a subset) required in each
    place
  • Different database structures
  • Different vendors
15
EXAMPLE SOLUTION 1
[Diagram boxes: Test Vehicle, Ground, Flight Test DAU, Ground Station, Vendor 1 Database, Vendor 2 Database, two Custom SQL Applications, FTE, Test Config Database]
  • Database structure cannot change
  • Very static
  • No flexibility
  • Hard to maintain
  • Must learn database mapping for each vendor
16
EXAMPLE SOLUTION 2
[Diagram boxes: Test Vehicle, Ground, Flight Test DAU, Ground Station, Vendor 1 Database, Vendor 2 Database, two Proprietary File Formats, FTE, Test Config Database]
  • Must learn to write/read a new format for each
    vendor
  • Must learn database mapping for each vendor
  • Freedom to change databases
17
EXAMPLE SOLUTION 3
[Diagram boxes: Test Vehicle, Ground, Flight Test DAU, Ground Station, Vendor 1 Database, Vendor 2 Database, XML I, XML II, XML III, FTE, Test Config Database]
  • Tools available to assist XML transformation
  • Must learn database mapping for each vendor
  • Freedom to change databases
  • Reading/writing significantly easier
18
EXAMPLE SOLUTION 4
[Diagram boxes: Test Vehicle, Ground, Flight Test DAU, Ground Station, Vendor 1 Database, Vendor 2 Database, XML, FTE, Test Config Database]
  • Must learn database mapping for a single XML
    file
  • Freedom to change databases
  • Reading/writing significantly easier
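
The idea behind this last solution can be sketched as follows: each vendor maps its own field names to one shared XML vocabulary, so adding a vendor means writing one mapping rather than one converter per pair of databases. All field names, tag names and values below are hypothetical; they are not taken from XidML or from any vendor's database.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; real vendor databases will differ.
VENDOR1_TO_COMMON = {"chan_name": "Name", "rate_hz": "SampleRate"}
VENDOR2_TO_COMMON = {"ParameterId": "Name", "SamplesPerSecond": "SampleRate"}

def export_to_xml(record, mapping):
    """Write one vendor record as a common <Parameter> element."""
    parameter = ET.Element("Parameter")
    for vendor_field, common_tag in mapping.items():
        ET.SubElement(parameter, common_tag).text = str(record[vendor_field])
    return ET.tostring(parameter, encoding="unicode")

def import_from_xml(xml_text, mapping):
    """Read a common <Parameter> element back into a vendor's field names."""
    parameter = ET.fromstring(xml_text)
    return {vendor_field: parameter.findtext(common_tag)
            for vendor_field, common_tag in mapping.items()}

# Vendor 1 exports, vendor 2 imports; neither knows the other's database layout.
vendor1_record = {"chan_name": "EngineTemp", "rate_hz": 50}
shared_xml = export_to_xml(vendor1_record, VENDOR1_TO_COMMON)
vendor2_record = import_from_xml(shared_xml, VENDOR2_TO_COMMON)

print(shared_xml)
print(vendor2_record)
```

Note that values come back as text; restoring types is the job of the schema discussed later in the session.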
19
SO FAR
  • XML: a world standard for data interchange
  • A better ASCII
  • Community of XML developers for support and
    development
  • Not a database - but closely tied to databases
  • An optimal solution to the problem of data
    interchange
  • Development effort still required

20
XML SCHEMAS
  • Specifies the grammars for an XML document
  • Provides a description of the document structure
    and meaning (meta-meta-data)
  • A schema is necessary and sufficient to fully
    describe an XML grammar
  • Provides support for development and
    documentation
  • Tools exist to automatically build data model
    diagrams from a schema
  • Tools exist to automatically compare an XML
    document against its schema and test for validity
    (or not)
  • A schema can consist of many sub-schemas that
    provide descriptions of sub-sets of the document.
  • When we talk about a specialist implementation of
    XML we really refer to a publicly released schema
    for that domain
  • MathML for mathematics
  • SensorML for sensors
  • XidML for data acquisition and processing

21
XML STYLESHEETS (XSLT)
  • XML stylesheets can be used to transform XML
    documents to almost any target format
  • Best used for XML or HTML target formats
  • Can avoid coding with a parser
  • Stylesheets are all about presentation
  • Single source document can be used to generate
    different reports for different audiences
  • An XSLT transform can be applied to an XML
    document by including a stylesheet processing
    instruction
  • A single line added before the root element
  • Any program that understands XSLT (such as
    Internet Explorer) can use the stylesheet to
    transform the XML into the target format.
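
For illustration, the stylesheet processing instruction normally takes the standard xml-stylesheet form shown in the sketch below. The file name books.xsl is a placeholder, not a file from this presentation. The sketch uses Python's standard xml.dom.minidom only to show where the instruction sits in the document; it does not run the transform itself.

```python
from xml.dom import minidom

# A document that names its stylesheet on a single line before the root element.
DOCUMENT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<Book><Name>The Outsider</Name></Book>
"""

doc = minidom.parseString(DOCUMENT)

# Processing instructions are siblings of the root element, not children of it.
instructions = [node for node in doc.childNodes
                if node.nodeType == node.PROCESSING_INSTRUCTION_NODE]
pi = instructions[0]

print(pi.target)                # name of the instruction
print(pi.data)                  # its pseudo-attributes, as raw text
print(doc.documentElement.tagName)
```

A browser or other XSLT-aware program reads the href value, loads that stylesheet and applies it to the document.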

22
XPath
  • Used as a general-purpose query notation for
    addressing and filtering the elements and text of
    XML documents
  • Usually used in conjunction with XSLT (or a
    parser) to extract data from a source XML
    document and transform this data to a different
    format
  • Allows tools for searching XML files to be built

23
DATA, META-DATA, AND
24
DOCUMENTING XML
  • Sources of information
  • The schema
  • A structured description of all possible
    elements, acceptable data types and values
  • The documentation
  • Word document or HTML file with graphical
    description of elements
  • Useful when programming

25
THE SCHEMA
  • The schema
  • A structured description of all possible
    elements, acceptable data types and values
  • Actually, many files combine to describe the
    schema
  • Not really for humans to read - more useful when
    used in conjunction with a tool like XMLSpy

26
THE SCHEMA - TREE
27
THE SCHEMA - GRID
28
XML DOCUMENTATION
  • Usually HTML or Word document
  • Every element is described
  • Data model diagrams provide a lot of information
    about the element
  • Can drill down into more and more detail as
    required
  • Provides information on cross-referencing

29
DATA MODEL DIAGRAM - FRAGMENT
  • Dotted line indicates an optional element
  • Expand symbol indicates hidden child elements
    (click to expand)
  • Indicates the number of instances allowed
  • Connector indicates a sequence, a choice or all
30
DATA MODEL DIAGRAM - EXAMPLE
31
THREE VIEWS OF THE SAME DATA
Schema Documentation
32
ATTRIBUTES AND ELEMENTS
<DataLinks>
  <DataLinkSet>
    <X-DataLink-1.0 Name="MyRS232_Link1">
      <DataBitsPerWord>8</DataBitsPerWord>
    </X-DataLink-1.0>
  </DataLinkSet>
</DataLinks>
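
The difference between the two is visible when the fragment above is read programmatically. In this sketch (Python's standard xml.etree.ElementTree, chosen as an assumption), Name is an attribute of the data link and DataBitsPerWord is a child element.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """
<DataLinks>
  <DataLinkSet>
    <X-DataLink-1.0 Name="MyRS232_Link1">
      <DataBitsPerWord>8</DataBitsPerWord>
    </X-DataLink-1.0>
  </DataLinkSet>
</DataLinks>
"""

root = ET.fromstring(FRAGMENT)

# Take the first data link found anywhere in the tree.
link = next(root.iter("X-DataLink-1.0"))

# An attribute is read from the element itself ...
link_name = link.get("Name")

# ... while an element's value is the text of a child node.
data_bits = int(link.findtext("DataBitsPerWord"))

print(link_name, data_bits)
```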
33
THIS SESSION COVERED
  • What is XML?
  • Main elements of XML
  • How XML is documented
  • Next, we will talk about XidML

34
END OF SESSION 1