XML once over lightly

Slides: 39
Provided by: mrt

1
XML - once over lightly
  • Keith Beattie
  • Mary Thompson
  • Monte Goode
  • Karlo Berket

2
Outline
  • Introduction
  • Namespaces
  • XPath, XPointer, XInclude, XLink (ksb)
  • XML Schema
  • A few well known schemas
  • Parsing
  • Java tools
  • Python tools
  • Authoring tools

3
eXtensible Markup Language (XML)
  • XML: structure and data expressed so that it can
    be parsed programmatically, is relatively
    human-readable, and is eXtensible
  • XML Schemas: not just well-formed XML, but
    conforming to some model

4
XML - Intro
  • Subset of SGML
  • Goals
  • Usable over the Internet (like HTML)
  • Easy to write tools for (authoring tools,
    filters, translators, etc.)
  • Compatible with SGML (apply existing tools)
  • Not dependent on a DTD
  • Minimum of optional features, for compatibility
  • Human readable (text, not binary)
  • Terseness is of minimal importance

5
XML - Definition
  • Specification for creating new markup languages
    (for structured documents)
  • XML does not define a set of tags; you do that
  • An XML document must be well-formed
    (syntactically correct)
  • Tags properly nested, opened and closed, correct
    use of entities, etc.
  • An XML document may optionally be validated
    against a DTD and/or a Schema.
  • A DTD or Schema defines the new markup language
    (but is not required).
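
The well-formedness rule above can be checked mechanically. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree parser; the helper name is_well_formed and the sample strings are illustrative and not part of the original deck. It checks syntax only and does not validate against a DTD or Schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML, False otherwise.

    This only checks syntax (nesting, closing tags, entities);
    it does not validate against a DTD or Schema.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Illustrative inputs (hypothetical, not from the slides).
good = "<note><to>Keith</to></note>"
bad = "<note><to>Keith</note></to>"  # tags closed in the wrong order

print(is_well_formed(good))
print(is_well_formed(bad))
```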

6
XML - Example
  • <?xml version='1.0' encoding='utf-8'?>
  • <!DOCTYPE addressbook SYSTEM "addressbook.dtd">
  • <!-- ksb's addressbook -->
  • <addressbook>
  • <person>
  • <name>Keith Jackson</name>
  • <phone type='moble'>555-555-5555</phone>
  • <phone type='land'>999-999-9999</phone>
  • <address>1 Any Lane, Berkeley, 94111</address>
  • <password><![CDATA[d@n'tue><e]]></password>
  • </person>
  • <person>
  • <name>Andrew &amp; Lisa Wright</name>
  • <phone type='land'>888-888-8888</phone>
  • <phone type='moble'>444-444-4444</phone>
  • <address/>
  • </person>
  • </addressbook>
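
To show what "parsed programmatically" looks like for this address book, here is a small sketch using Python's standard-library xml.etree.ElementTree. The XML declaration, DOCTYPE and password elements are left out to keep the sample short, and the function name load_people and the dictionary layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the address book from the slide.
ADDRESSBOOK = """\
<addressbook>
  <person>
    <name>Keith Jackson</name>
    <phone type='moble'>555-555-5555</phone>
    <phone type='land'>999-999-9999</phone>
    <address>1 Any Lane, Berkeley, 94111</address>
  </person>
  <person>
    <name>Andrew &amp; Lisa Wright</name>
    <phone type='land'>888-888-8888</phone>
    <phone type='moble'>444-444-4444</phone>
    <address/>
  </person>
</addressbook>
"""


def load_people(xml_text):
    """Return a list of dicts, one per <person> element."""
    root = ET.fromstring(xml_text)
    people = []
    for person in root.findall("person"):
        # Map each phone's type attribute to its number.
        phones = {p.get("type"): p.text for p in person.findall("phone")}
        people.append({
            "name": person.findtext("name"),
            "phones": phones,
            "address": person.findtext("address") or "",
        })
    return people


people = load_people(ADDRESSBOOK)
for entry in people:
    print(entry["name"], entry["phones"])
```

Note that the parser resolves the &amp; entity, so the second name comes back with a plain ampersand.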

7
DTD - Example
  • <?xml version='1.0' encoding='utf-8'?>
  • <!-- DTD for ksb's addressbook -->
  • <!ELEMENT addressbook (person*)>
  • <!ELEMENT person (name, phone*, address+, password?)>
  • <!ELEMENT name (#PCDATA)>
  • <!ELEMENT phone (#PCDATA)>
  • <!ATTLIST phone type (moble|land) #REQUIRED>
  • <!ELEMENT address ANY>
  • <!ELEMENT password ANY>
  • Limited type definition
  • No hierarchy of elements
  • XML Schema addresses these shortcomings

8
XML Namespaces Intro
  • A means to avoid name conflicts between XML
    elements and attributes
  • Elements and attributes can now have the form
    <prefix>:<local name> where <prefix> is the
    namespace id associated with a URI
  • Because a URI can identify either a physical or
    an abstract resource, the resource does not need
    to actually exist

9
XML Namespace URIs
  • URIs have two general forms
  • URL
  • http://www.lbl.gov/akenti
  • URN
  • urn:www.lbl.gov:akenti
  • urn:uuid:A941CFC3-736E-48E5-A691-C5B2FE036555
  • A namespace URI is just a string with some
    guarantee of uniqueness
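
One way to obtain a string "with some guarantee of uniqueness" is a random UUID rendered as a URN. The sketch below assumes Python's standard-library uuid module; the function name new_namespace_urn is hypothetical.

```python
import uuid


def new_namespace_urn():
    """Return a fresh 'urn:uuid:...' string usable as a namespace URI."""
    return uuid.uuid4().urn


# The UUID from the slide, rendered by the same module (lower-cased).
slide_urn = uuid.UUID("A941CFC3-736E-48E5-A691-C5B2FE036555").urn

print(new_namespace_urn())
print(slide_urn)
```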

10
XML Namespace Example
  • <?xml version='1.0' encoding='utf-8'?>
  • <!-- ksb's addressbook -->
  • <ksb:addressbook
  • xmlns:ksb='http://www.stobo.org/addressbook'
  • xmlns:lbl='http://www.lbl.gov/addressbook'
  • xmlns='http://www.w3.org/addressbook'>
  • <ksb:person>
  • <lbl:name>Keith Jackson</lbl:name>
  • <phone ksb:type='moble'>555-555-5555</phone>
  • <phone ksb:type='land'>999-999-9999</phone>
  • <address>1 Any Lane, Berkeley, 94111</address>
  • <ksb:password><![CDATA[d@n'tueme]]></ksb:password>
  • </ksb:person>
  • </ksb:addressbook>
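
A parser resolves each prefix to its URI, so code matches on the URI and never on the prefix text. The following sketch assumes Python's standard-library xml.etree.ElementTree, which spells a qualified name as {uri}local. The short prefixes k, l and w in the NS mapping are arbitrary choices for this example, and the document is a trimmed copy of the one on the slide.

```python
import xml.etree.ElementTree as ET

DOC = """\
<ksb:addressbook
    xmlns:ksb='http://www.stobo.org/addressbook'
    xmlns:lbl='http://www.lbl.gov/addressbook'
    xmlns='http://www.w3.org/addressbook'>
  <ksb:person>
    <lbl:name>Keith Jackson</lbl:name>
    <phone ksb:type='moble'>555-555-5555</phone>
    <phone ksb:type='land'>999-999-9999</phone>
  </ksb:person>
</ksb:addressbook>
"""

KSB = "http://www.stobo.org/addressbook"
LBL = "http://www.lbl.gov/addressbook"
W3 = "http://www.w3.org/addressbook"

# Prefixes used in queries are local to this mapping; they need not
# match the prefixes used in the document.
NS = {"k": KSB, "l": LBL, "w": W3}

root = ET.fromstring(DOC)
name = root.find("k:person/l:name", NS)

# Unprefixed <phone> elements fall into the default namespace.
phones = root.findall("k:person/w:phone", NS)

# Prefixed attributes are keyed by '{uri}local' as well.
types = [p.get("{%s}type" % KSB) for p in phones]

print(root.tag)
print(name.text, types)
```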

11
Some XML syntaxes
  • XPath (XML Path Language)
  • searches an XML document using a path-like
    string. (more on this later)
  • XPointer
  • an addressing scheme for individual parts of an
    XML document. Think <A NAME="foo"> in HTML.
  • XInclude
  • an include mechanism for XML documents.
  • XLink
  • linking syntax for XML docs. Think <A
    HREF="foo"> in HTML.

12
XPath - Syntax
  • XPath does not use XML syntax but its own query
    strings
  • Simple example
  • ./xp.py "person/phone" < addressbook.xml
  • <Element Node Name='phone' with 1 attributes and
    1 children>
  • <Element Node Name='phone' with 1 attributes and
    1 children>
  • <Element Node Name='phone' with 1 attributes and
    1 children>
  • <Element Node Name='phone' with 1 attributes and
    1 children>
  • A less simple example
  • ./xp.py "/addressbook/person[name='Keith Jackson']/phone[@type='moble']/text()" < addressbook.xml
  • 555-555-5555
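
The xp.py script itself is not shown in the deck. As a rough stand-in, the two queries can be approximated with the limited XPath subset in Python's standard-library xml.etree.ElementTree. Two differences are assumptions of this sketch: paths are evaluated relative to the root element, and because text() is not supported the text is read from the matched elements.

```python
import xml.etree.ElementTree as ET

ADDRESSBOOK = """\
<addressbook>
  <person>
    <name>Keith Jackson</name>
    <phone type='moble'>555-555-5555</phone>
    <phone type='land'>999-999-9999</phone>
  </person>
  <person>
    <name>Andrew &amp; Lisa Wright</name>
    <phone type='land'>888-888-8888</phone>
    <phone type='moble'>444-444-4444</phone>
  </person>
</addressbook>
"""

root = ET.fromstring(ADDRESSBOOK)

# Simple example: every <phone> under every <person>.
all_phones = root.findall("person/phone")

# Less simple example: a child-text predicate and an attribute predicate.
matches = root.findall("person[name='Keith Jackson']/phone[@type='moble']")
numbers = [m.text for m in matches]

print(len(all_phones))
print(numbers)
```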

13
XPath - Intro
  • The grep (or query language) of XML
  • An XPath expression is applied to an XML document
    (or DOM node) and returns one of the following
  • A Collection of nodes
  • A Boolean value
  • A floating-point value
  • A String
  • Used primarily in transforms (XSLT) and XPointer

15
Well Known XML Schemas
  • XML Schema: language to define other schemas
  • XSLT (Extensible Stylesheet Language
    Transformations): XML-to-XML translation
  • SOAP (Simple Object Access Protocol): RPC
    mechanism and serialization format (typically
    over HTTP)
  • WSDL (Web Services Description Language):
    network services as endpoints operating on
    messages

16
XML Schema
  • Language to define other schemas
  • http://www.w3.org/TR/xmlschema-0/
  • Defines
  • element, sequence, choice, attribute,
    complexType, simpleType
  • Data types
  • int, long, string, token, dateTime
  • http://www.w3.org/TR/xmlschema-2/

17
Example Schema
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www-stobo.org/addressbook">
  <xs:element name="addressbook">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="person" type="personType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="phone" minOccurs="0"
          maxOccurs="unbounded" type="phoneType"/>
      <xs:element name="address" maxOccurs="unbounded"
          type="xs:string"/>
      <xs:element name="password" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

18
Example Schema (cont)
  <xs:complexType name="phoneType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="type" use="required">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="moble"/>
              <xs:enumeration value="land"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>

19
XSL & XSLT
  • XSL - Extensible Stylesheet Language
  • a vocabulary for specifying formatting
  • XSLT - XSL Transformations
  • Used to map one XML schema to another
  • Source tree -> result tree

20
SOAP
  • Simple Object Access Protocol
  • originally intended to implement RPC
  • now used for any kind of XML message
  • http://www.w3.org/TR/soap12-{part0,part1,part2}
  • Envelope
  • Header, Body, Fault
  • Body and Headers are defined as sequences of
    anything

21
SOAP (cont)
  • SOAP processing model
  • assumes a distributed model where messages go
    from a sender node through some number of
    intermediary nodes to an ultimate receiver node.
  • The nodes can be addressed by role names and each
    can do some processing of the message
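
The Envelope / Header / Body layout can be sketched with Python's standard-library xml.etree.ElementTree. This is an illustration, not a SOAP implementation. The namespace used is the SOAP 1.2 Recommendation URI, which is an assumption of this sketch and differs from the earlier draft URI that appears elsewhere in the deck. The helper build_envelope and the urn:example:addressbook payload are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 Recommendation namespace (assumption of this sketch).
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"


def build_envelope(body_element, header_elements=()):
    """Wrap a payload element in a SOAP Envelope with optional headers."""
    ET.register_namespace("env", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_elements:
        # Header is optional and, when present, comes before Body.
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_elements)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(body_element)
    return envelope


# Hypothetical application payload in a made-up namespace.
request = ET.Element("{urn:example:addressbook}lookup")
ET.SubElement(request, "{urn:example:addressbook}name").text = "Keith Jackson"

envelope = build_envelope(request)
xml_text = ET.tostring(envelope, encoding="unicode")
print(xml_text)
```

Because Body and Header are defined as sequences of anything, the payload element can come from any namespace the two parties agree on.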

22
WSDL
  • Web Services Description Language
  • http://www.w3.org/TR/wsdl12
  • Describes the messages
  • Operations - exchanges of messages
  • PortTypes - collection of operations
  • There is a binding document that describes SOAP,
    HTTP and MIME bindings

23
Resource Description Framework - RDF
  • W3C Semantic Web Activity
  • provides a model for metadata
  • for use in knowledge representation systems
  • describes the context of a document, so that
    documents can be searched for by content;
    applies to non-text documents

24
XML Security Languages
  • Define vocabularies to express
  • Authentication assertions or identity
    credentials
  • Authorization assertions or licenses
  • Attribute assertions - attribute credentials
  • Enabling secure XML messages
  • Queries and Responses about security questions
  • Security policies - requirements of the resource
  • Security contexts - information about the user

25
XML Security Schemas
  • XML Signature (W3C and IETF)
  • Digital signatures for XML transactions
  • XML Encryption (W3C)
  • Encrypting data and representing the results in
    XML
  • XKMS - key management (W3C)
  • (MS, Verisign) delegate signature processing to
    a trust server on WWW (for thin or mobile clients)

26
XML Digital Signature
  • Specifies XML digital signature processing rules
    and syntax
  • Digital Signatures provide
  • Message integrity
  • Message authentication
  • Signer authentication
  • Signed data can be within the XML that includes
    the signature or elsewhere
  • Can sign all or part of a document

27
XML Encryption
  • Process for encrypting data and representing the
    result in XML
  • Data can be
  • arbitrary data (including XML document)
  • an XML element or element content
  • An XML encryption element
  • contains or references the cipher data
  • specifies the encryption method
  • specifies key info.

28
XML Schema Reference Card
  • Qualifier: namespace URI
  • xs: http://www.w3.org/2001/XMLSchema
  • S: http://www.w3.org/2002/06/soap-envelope
  • xsl: http://www.w3.org/1999/XSL/Format
  • xslt: http://www.w3.org/1999/XSL/Transform
  • XPath: http://www.w3.org/TR/xpath
  • dsig: http://www.w3.org/2000/09/xmldsig#
  • xenc: http://www.w3.org/2001/04/xmlenc#
  • xkms: http://www.xkms.org/schema/xkms-2001-01-20
  • saml: urn:oasis:names:tc:SAML:1.0:assertion
  • samlp: urn:oasis:names:tc:SAML:1.0:protocol
  • xacml: urn:oasis:names:tc:xacml:1.0:policy
  • xacml-context: urn:oasis:names:tc:xacml:1.0:context

29
DOM - Document Object Model (W3C recommendation)
  • DOM Structure
  • Document, element, entity reference, text...
  • Nodes, NodeList, NamedNodeMap
  • Fundamental Interfaces
  • Document
  • createDocument, createAttribute, createElement,
    getElementsByTagName
  • Node: removeChild
  • NamedNodeMap: getNamedItem

30
DOM Parsers
  • Parse the entire document into a DOM tree.
  • Provide functions to examine pieces of the tree
  • Provide a createDocument interface which
    generates an XML document from the DOM tree
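
The DOM calls named on these slides can be tried with Python's standard-library xml.dom.minidom, a lightweight DOM implementation. The sketch below parses a one-person document, adds a second person with the Document factory methods, removes the first, and serializes the result. The sample data reuses names from the address book example, and the variable names are illustrative.

```python
from xml.dom import minidom

# Parse the whole document into a DOM tree.
doc = minidom.parseString(
    "<addressbook><person><name>Keith Jackson</name></person></addressbook>"
)

# Examine pieces of the tree: a NodeList of matching elements.
names = doc.getElementsByTagName("name")
first_name = names.item(0).firstChild.data

# Build a new <person> subtree with the Document factory methods.
person = doc.createElement("person")
name = doc.createElement("name")
name.appendChild(doc.createTextNode("Andrew & Lisa Wright"))
person.appendChild(name)
phone = doc.createElement("phone")
phone.setAttribute("type", "land")
phone.appendChild(doc.createTextNode("888-888-8888"))
person.appendChild(phone)
doc.documentElement.appendChild(person)

# Attributes live in a NamedNodeMap, looked up by name.
phone_type = phone.attributes.getNamedItem("type").value

# Remove the original <person> from its parent node.
first_person = doc.getElementsByTagName("person").item(0)
doc.documentElement.removeChild(first_person)

# Generate XML text back from the tree.
serialized = doc.documentElement.toxml()
print(serialized)
```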

31
Simple API for XML (SAX)
  • Started as a Java event-based parser for XML
  • Reports parsing events through callbacks.
  • startElement(uri, localName, qName, Attributes)
  • Attributes: getQName, getType, getValue
  • characters
  • http://www.saxproject.org/ distributed through
    SourceForge
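
The callback style can be sketched with Python's standard-library xml.sax module. In its default (non-namespace) mode the Python handler receives startElement(name, attrs) instead of the Java signature listed above. The class name PhoneCollector and the sample input are illustrative.

```python
import xml.sax


class PhoneCollector(xml.sax.ContentHandler):
    """Collect (type, number) pairs from <phone> elements."""

    def __init__(self):
        super().__init__()
        self.phones = []
        self._type = None    # type of the <phone> currently open, if any
        self._chunks = []    # text pieces seen inside that element

    def startElement(self, name, attrs):
        if name == "phone":
            self._type = attrs.getValue("type")
            self._chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces.
        if self._type is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "phone":
            self.phones.append((self._type, "".join(self._chunks)))
            self._type = None


handler = PhoneCollector()
xml.sax.parseString(
    b"<person><name>Keith Jackson</name>"
    b"<phone type='moble'>555-555-5555</phone>"
    b"<phone type='land'>999-999-9999</phone></person>",
    handler,
)
print(handler.phones)
```

Unlike a DOM parser, nothing is kept in memory except what the handler chooses to store.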

32
Parsers
  • Xerces - Apache
  • Java, C++; Perl and COM wrappers
  • Does both DOM and SAX parsing
  • Validates
  • Java only
  • IBM XML4J, Microsoft, Oracle, Netscape
  • Sun's JAXP (DOM and SAX)
  • C
  • Expat - C (non-validating)

33
Java Tools
  • J2SE (since 1.4), was JAXP 1.0
  • SAX and DOM interfaces
  • Interface for XSLT (generic, SAX, DOM, streams)
  • Java WSDP
  • JAXP 1.2.2: parsers and XSLT
  • JAXB: schema -> Java classes
  • JAXR: access to XML registries
  • JAX-RPC: SOAP-RPC
  • SAAJ: SOAP with attachments

34
Java Tools
  • Apache
  • Xerces: SAX and DOM parser
  • Xalan: XSLT and XPath
  • Xindice: native XML database
  • IBM
  • XML Processing Plus Plus: stream-based APIs
  • XML Parser for Java
  • X-IT: batch processing and transformation of XML
    files

35
Python Tools
  • Python Standard Library
  • provides both a SAX and DOM interface
  • interface to the Expat parser (Expat is a stream
    and callback based parser similar to SAX).
  • PyXML: a richer set of XML tools
  • fuller-featured DOM implementations
  • adapters to allow various parsers to use the SAX
    interface
  • XPath expression parsing and XSLT transform
    tools
  • link: http://pyxml.sourceforge.net/
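
The standard library's direct Expat interface mentioned above can be sketched as follows, assuming the xml.parsers.expat module. The function name element_counts is hypothetical; it assigns a start-element callback and counts how often each element name occurs.

```python
import xml.parsers.expat


def element_counts(xml_text):
    """Count element occurrences using Expat's start-element callback."""
    counts = {}

    def start(name, attrs):
        counts[name] = counts.get(name, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    # The second argument marks this as the final chunk of input.
    parser.Parse(xml_text, True)
    return counts


counts = element_counts(
    "<addressbook><person><phone/><phone/></person></addressbook>"
)
print(counts)
```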

36
More Python Tools
  • 4Suite
  • even more SAX/DOM implementations
  • plus XPath, XPointer, XInclude, XLink
  • RDF support including an XML/RDF data
    repository and server (data access, query,
    transformation, etc.).
  • link: http://www.4suite.org/

37
Python SOAP Support
  • SOAPy - WSDL/Web Service oriented SOAP lib.
  • Provides a stand-alone http-like server to run
    your SOAP services.
  • link: http://soapy.sourceforge.net/
  • ZSI (Zolera Soap Infrastructure)
  • offers robust support for strict typing
  • conversion between native Python datatypes and
    SOAP syntax.
  • provides a stand-alone http-like server to run
    your SOAP services.
  • link: http://pywebsvcs.sourceforge.net/

38
Editing Tools
  • PSGML: major mode for Emacs
  • Navigation, colorization, validation and other
    editing functions.
  • Supports loading of DTDs, maintaining a DTD
    library, and some editing based on them.
  • Done in a standard Emacs-like text view.
  • http://sourceforge.net/projects/psgml
  • GUI XML editors
  • Offer a tree view of the XML document with text
    box editing fields.
  • Marginal utility.