XML for Ecommerce II

Provided by: helenaah
1
XML for E-commerce II
  • Helena Ahonen-Myka

2
XML processing model
  • an XML processor is used to read XML documents
    and provide access to their content and structure
  • the XML processor works on behalf of an
    application
  • the specification defines which information the
    processor should provide to the application

3
Parsing
  • input: an XML document
  • basic task: is the document well-formed?
  • validating parsers additionally check: is the
    document valid?

4
Parsing
  • parsers produce data structures, which other
    tools and applications can use
  • two kinds of APIs: tree-based and event-based

5
Tree-based API
  • compiles an XML document into an internal tree
    structure
  • allows an application to navigate the tree
  • Document Object Model (DOM) is a tree-based API
    for XML and HTML documents
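
As a minimal sketch of the tree-based style, the following uses the DOM implementation bundled with the JDK (`javax.xml.parsers` and `org.w3c.dom`). The class name `DomTreeDemo` and the sample document are assumptions made for illustration, not part of the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: builds a DOM tree and navigates it.
class DomTreeDemo {
    // Parses the whole document into an in-memory tree and returns the root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    // Walks the children of the root and returns the text of the first child element.
    static String firstChildText(String xml) throws Exception {
        Element root = parseRoot(xml);
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return n.getTextContent();
            }
        }
        return null; // the root has no child elements
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><para>Hello, world!</para></doc>"; // sample input (assumption)
        System.out.println(parseRoot(xml).getTagName());
        System.out.println(firstChildText(xml));
    }
}
```

The whole document is held in memory before the application sees any of it, which is what makes free navigation possible.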

6
Event-based API
  • reports parsing events (such as start and end of
    elements) directly to the application through
    callbacks
  • the application implements handlers to deal with
    the different events
  • Simple API for XML (SAX)

7
Example
<?xml version="1.0"?>
<doc>
<para>Hello, world!</para>
</doc>
  • Events:

start document
start element: doc
start element: para
characters: Hello, world!
end element: para
end element: doc

8
Example (cont.)
  • an application handles these events just as it
    would handle events from a graphical user
    interface (mouse clicks, etc) as the events occur
  • no need to cache the entire document in memory or
    secondary storage
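
The event sequence can be reproduced with a small handler that records each callback. This is only a sketch: it assumes the SAX parser bundled with the JDK (obtained through `SAXParserFactory`), and the class name `SaxEventTrace` is made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX events as strings while the document streams by.
class SaxEventTrace {
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("start document"); }
            @Override public void endDocument() { events.add("end document"); }
            @Override public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("start element: " + qName);
            }
            @Override public void endElement(String uri, String local, String qName) {
                events.add("end element: " + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                events.add("characters: " + new String(ch, start, length));
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<doc><para>Hello, world!</para></doc>")) {
            System.out.println(event);
        }
    }
}
```

Nothing is cached here except the trace itself; an application that only reacts to events would not even need the list.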

9
Tree-based vs. event-based
  • tree-based APIs are useful for a wide range of
    applications, but they may need a lot of
    resources (if the document is large)
  • some applications may need to build their own
    tree structures, and it is very inefficient to
    build a parse tree only to map it to another tree

10
Tree-based vs. event-based
  • an event-based API provides simpler, lower-level
    access to an XML document
  • as the document is processed sequentially, one
    can parse documents much larger than the
    available system memory
  • the application can construct its own data
    structures in its own callback event handlers
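
A sketch of building an application-specific data structure directly from callbacks, without any intermediate parse tree: the handler below keeps only a count per element name. The class name `ElementCounter` and the sample input are assumptions; the parser is the one bundled with the JDK.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: memory use depends on the number of distinct
// element names, not on the size of the document.
class ElementCounter extends DefaultHandler {
    private final Map<String, Integer> counts = new TreeMap<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        counts.merge(qName, 1, Integer::sum); // one more occurrence of this element name
    }

    static Map<String, Integer> count(String xml) throws Exception {
        ElementCounter counter = new ElementCounter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), counter);
        return counter.counts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("<items><item/><item/><note/></items>"));
    }
}
```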

11
We need a parser...
  • Apache Xerces: http://xml.apache.org
  • IBM XML4J: http://alphaworks.ibm.com
  • XP: http://www.jclark.com/xml/xp
  • many others

12
and the SAX classes
  • http://www.megginson.com/SAX/
  • often the SAX classes come bundled with the
    parser distribution
  • some parsers only support SAX 1.0; the latest
    version is 2.0

13
Starting a SAX parser
import org.xml.sax.XMLReader;
import org.apache.xerces.parsers.SAXParser;

XMLReader parser = new SAXParser();
parser.parse(uri);

14
Content handlers
  • in order to let the application do something
    useful with XML data as it is being parsed, we
    must register handlers with the SAX parser
  • a handler is a set of callbacks: application code
    can be run at important events within a
    document's parsing

15
Core handler interfaces in SAX
  • org.xml.sax.ContentHandler
  • org.xml.sax.ErrorHandler
  • org.xml.sax.DTDHandler
  • org.xml.sax.EntityResolver

16
Custom application classes
  • custom application classes that perform specific
    actions within the parsing process can implement
    each of the core interfaces
  • implementation classes can be registered with the
    parser with the methods setContentHandler(), etc.

17
Example content handlers
class MyContentHandler implements ContentHandler {

  public void startDocument() throws SAXException {
    System.out.println("Parsing begins");
  }

  public void endDocument() throws SAXException {
    System.out.println("...Parsing ends.");
  }

  // other ContentHandler methods omitted here
}

18
Element handlers
public void startElement(String namespaceURI,
                         String localName,
                         String rawName,
                         Attributes atts)
    throws SAXException {
  System.out.print("startElement: " + localName);
  if (!namespaceURI.equals("")) {
    System.out.println(" in namespace " + namespaceURI
                       + " (" + rawName + ")");
  } else {
    System.out.println(" has no associated namespace");
  }
  for (int i = 0; i < atts.getLength(); i++) {
    System.out.println(" Attribute: " + atts.getLocalName(i)
                       + "=" + atts.getValue(i));
  }
}

19
endElement
public void endElement(String namespaceURI,
                       String localName,
                       String rawName)
    throws SAXException {
  System.out.println("endElement: " + localName + "\n");
}

20
Character data
// the third argument is a length, not an end index
public void characters(char[] ch, int start, int length)
    throws SAXException {
  String s = new String(ch, start, length);
  System.out.println("characters: " + s);
}
  • the parser may return all contiguous character
    data at once, or split the data up into multiple
    method invocations
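
Because the text of one element may arrive in several `characters()` calls, a common approach is to append to a buffer and read it only at the end of the element. The sketch below assumes leaf elements only and the JDK's bundled parser; the class name `TextCollector` is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers the complete text of each leaf element,
// no matter how the parser chunks the character data.
class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        buffer.setLength(0); // start collecting afresh for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called several times per text run
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        String text = buffer.toString();
        if (!text.trim().isEmpty()) {
            texts.add(text);
        }
        buffer.setLength(0);
    }

    static List<String> collect(String xml) throws Exception {
        TextCollector collector = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), collector);
        return collector.texts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<doc><para>Hello, &amp; world!</para></doc>"));
    }
}
```

The entity reference in the sample is there because it is a typical place for a parser to split the data into more than one call.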

21
Processing instructions
  • XML documents may contain processing instructions
    (PIs)
  • a processing instruction tells an application to
    perform some specific task
  • form: <?target instructions?>

22
Handlers for PIs
public void processingInstruction(String target,
                                  String data)
    throws SAXException {
  System.out.println("PI: Target: " + target
                     + " and Data: " + data);
}
  • Application could receive instructions and set
    variables or execute methods to perform
    application-specific processing
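
As a sketch of an application acting on processing instructions, the handler below stores each target and its data in a map that the application could consult later. The `app-mode` target and the class name `PiCollector` are invented for the example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers processing instructions as settings.
class PiCollector extends DefaultHandler {
    private final Map<String, String> settings = new LinkedHashMap<>();

    @Override
    public void processingInstruction(String target, String data) {
        settings.put(target, data); // the XML declaration itself is not reported here
    }

    static Map<String, String> collect(String xml) throws Exception {
        PiCollector collector = new PiCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), collector);
        return collector.settings;
    }

    public static void main(String[] args) throws Exception {
        // "app-mode" is a made-up target used only for illustration
        System.out.println(collect("<?app-mode debug?><doc/>"));
    }
}
```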

23
Validation
  • some parsers are validating, some non-validating
  • some parsers can do both
  • SAX method to turn validation on:

parser.setFeature("http://xml.org/sax/features/validation", true);
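
A minimal sketch of the validation feature in use, assuming the JDK's bundled parser and a DTD given in the internal subset. Validity errors are recoverable, so they reach the error handler's `error()` method and parsing continues; the class name `ValidationDemo` and the sample DTD are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: counts validity errors reported against a DTD.
class ValidationDemo {
    static final String DTD =
        "<!DOCTYPE doc [<!ELEMENT doc (para)><!ELEMENT para (#PCDATA)>]>";

    static int countValidityErrors(String xml) throws Exception {
        List<String> errors = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature("http://xml.org/sax/features/validation", true);
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countValidityErrors(DTD + "<doc><para>hi</para></doc>"));
        System.out.println(countValidityErrors(DTD + "<doc><note/></doc>"));
    }
}
```
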
24
Ignorable whitespace
  • a validating parser can decide which whitespace
    can be ignored
  • for a non-validating parser, all whitespace is
    just characters
  • content handler:

// the third argument is a length, not an end index
public void ignorableWhitespace(char[] ch, int start,
                                int length)
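
The difference can be observed with a validating parse of a document whose DTD declares element-only content. This sketch assumes the JDK's bundled parser with `setValidating(true)`; the class name `WhitespaceDemo`, the sample DTD, and the summary string format are all made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: separates ignorable whitespace from real character data.
class WhitespaceDemo {
    static String summarize(String xml) throws Exception {
        final int[] ignorable = {0};
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void ignorableWhitespace(char[] ch, int start, int length) {
                ignorable[0] += length; // whitespace in element-only content
            }
            @Override public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // the DTD tells the parser where only elements may occur
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return "ignorable=" + ignorable[0] + ", text=" + text;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!ELEMENT doc (para)><!ELEMENT para (#PCDATA)>]>"
                   + "<doc>\n  <para>hi</para>\n</doc>";
        System.out.println(summarize(xml));
    }
}
```
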
25
XML Schema
  • DTDs have drawbacks
  • They can only define the element structure and
    attributes
  • They cannot define any database-like constraints
    for elements
  • Value (min, max, etc.)
  • Type (integer, string, etc.)
  • DTDs are not written in XML and cannot thus be
    processed with the same tools as XML documents,
    XSL(T), etc.
  • XML Schema
  • Is written in XML
  • Avoids most of the DTD drawbacks

26
XML Schema
  • XML Schema Part 1: Structures
  • Element structure definition as with DTDs:
    elements, attributes, and also enhanced ways to
    control structures
  • XML Schema Part 2: Datatypes
  • Primitive datatypes (string, boolean, float,
    etc.)
  • Derived datatypes from primitive datatypes (time,
    recurringDate)
  • Constraining facets for each datatype (minLength,
    maxLength, pattern, precision, etc.)
  • Information about Schemas:
  • http://www.w3c.org/XML/Schema/

27
Complex and simple types
  • complex types allow elements in their content
    and may have attributes
  • simple types cannot have element content and
    cannot have attributes

28
Reminder: DTD declarations
  • <!ELEMENT name (fname, lname)>
  • <!ELEMENT address (name, street, ((city, state,
    zipcode) | (zipcode, city)))>
  • <!ELEMENT contact
    (address, phone, email?)>
  • <!ELEMENT contact2 (address |
    phone | email)>

29
Example: USAddress type
<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city" type="xsd:string"/>
    <xsd:element name="state" type="xsd:string"/>
    <xsd:element name="zip" type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN"
                 use="fixed" value="US"/>
</xsd:complexType>

30
Example: PurchaseOrderType
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

31
Notes
  • element declarations for shipTo and billTo
    associate different element names with the same
    complex type
  • attribute declarations must reference simple
    types
  • element comment declared elsewhere in the schema
    (here reference only)

32
continues
  • an element is optional if minOccurs="0"
  • the maximum number of times an element may appear
    is given by maxOccurs
  • attributes may appear once or not at all
  • use attribute is used in an attribute declaration
    to indicate whether the attribute is required or
    optional, and if optional, whether the value is
    fixed or whether there is a default

33
More examples
<items>
  <item partNum="872-AA">
    <productName>Lawnmower</productName>
    <quantity>1</quantity>
    <price>148.95</price>
    <comment>Confirm this is electric</comment>
  </item>
  <item partNum="926-AA">
    <productName>Baby Monitor</productName>
    <quantity>1</quantity>
    <price>39.98</price>
    <shipDate>1999-05-21</shipDate>
  </item>
</items>

34
<xsd:complexType name="Items">
  <xsd:element name="item" minOccurs="0"
               maxOccurs="unbounded">
    <xsd:complexType>
      <xsd:element name="quantity">
        <xsd:simpleType base="xsd:positiveInteger">
          <xsd:maxExclusive value="100"/>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="price" type="xsd:decimal"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="shipDate" type="xsd:date"
                   minOccurs="0"/>
      <xsd:attribute name="partNum" type="Sku"/>
    </xsd:complexType>
  </xsd:element>
</xsd:complexType>
<xsd:simpleType name="Sku">
  <xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:simpleType>

35
Patterns
<xsd:simpleType name="Sku">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{3}-[A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>
  • three digits followed by a hyphen followed by
    two upper-case ASCII letters
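
The same pattern can be tried out with `java.util.regex`, as a rough sketch. Schema patterns always have to match the whole value, which corresponds to `Matcher.matches()` in Java; the class name `SkuPattern` is hypothetical, and Java's regex dialect is only an approximation of the schema one (for instance, `\d` in Java matches ASCII digits only by default).

```java
import java.util.regex.Pattern;

// Hypothetical demo class: checks part numbers against the Sku pattern.
class SkuPattern {
    // Three digits, a hyphen, two upper-case ASCII letters.
    private static final Pattern SKU = Pattern.compile("\\d{3}-[A-Z]{2}");

    static boolean isSku(String value) {
        return SKU.matcher(value).matches(); // whole-string match, like a schema pattern
    }

    public static void main(String[] args) {
        System.out.println(isSku("872-AA")); // true
        System.out.println(isSku("87-AA"));  // false: only two digits
        System.out.println(isSku("926-aa")); // false: lower-case letters
    }
}
```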

36
Building content models
  • <xsd:sequence>: fixed order
  • <xsd:choice>: choice of one (1) of the
    alternatives
  • <xsd:group>: grouping (also named)
  • <xsd:all>: no order specified

37
Null values
  • a missing element may mean many things: unknown,
    not applicable
  • an attribute to indicate that the element content
    is null

in schema:
<xsd:element name="shipDate" type="xsd:date"
             nullable="true"/>
in document:
<shipDate xsi:null="true"></shipDate>

38
Specifying uniqueness
  • XML Schema makes it possible to state that any
    attribute or element value must be unique within
    a certain scope
  • unique element: first select a set of elements,
    then identify the attribute or element field
    relative to each selected element that has to be
    unique within the scope of the set of selected
    elements

39
Defining keys and their references
  • keys and key references can also be defined

<key name="pNumKey">
  <selector>parts/part</selector>
  <field>@number</field>
</key>
<keyref name="dummy2" refer="pNumKey">
  <selector>regions/zip/part</selector>
  <field>@number</field>
</keyref>
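
A sketch of the key and key reference being enforced by a validator. It assumes the final XML Schema 1.0 syntax, where `selector` and `field` carry their paths in an `xpath` attribute, whereas the slide shows the draft form with the path as element content. The enclosing `report` element, the document layout, and the class name `KeyRefCheck` are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class: every regions/zip/part must refer to an existing parts/part.
class KeyRefCheck {
    static final String PART =
        "<xsd:element name='part' maxOccurs='unbounded'><xsd:complexType>"
      + "<xsd:attribute name='number' type='xsd:string' use='required'/>"
      + "</xsd:complexType></xsd:element>";

    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='report'>"
      + "<xsd:complexType><xsd:sequence>"
      + "<xsd:element name='regions'><xsd:complexType><xsd:sequence>"
      + "<xsd:element name='zip' maxOccurs='unbounded'><xsd:complexType><xsd:sequence>"
      + PART
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "<xsd:element name='parts'><xsd:complexType><xsd:sequence>"
      + PART
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "</xsd:sequence></xsd:complexType>"
      // key: each parts/part has a unique, mandatory @number
      + "<xsd:key name='pNumKey'><xsd:selector xpath='parts/part'/>"
      + "<xsd:field xpath='@number'/></xsd:key>"
      // keyref: each regions/zip/part/@number must equal some key value
      + "<xsd:keyref name='dummy2' refer='pNumKey'><xsd:selector xpath='regions/zip/part'/>"
      + "<xsd:field xpath='@number'/></xsd:keyref>"
      + "</xsd:element></xsd:schema>";

    static boolean isValid(String xml) throws Exception {
        Validator validator = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD))).newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a violated key or keyref is reported as a validity error
        }
    }

    // Builds a sample report: one referenced part number and one defined part number.
    static String report(String referenced, String defined) {
        return "<report><regions><zip><part number='" + referenced + "'/></zip></regions>"
             + "<parts><part number='" + defined + "'/></parts></report>";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(report("872-AA", "872-AA")));
        System.out.println(isValid(report("926-AA", "872-AA")));
    }
}
```
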
40
XML Query Languages
  • Currently
  • There is no recommendation/standard available,
    only drafts
  • Different suggestions given in 1998, work in
    progress
  • XML Query Requirements
  • Requirements draft 16.8.2000
  • Query language expected by the end of 2000
  • XML Query Data Model
  • Draft 11.5.2000
  • More on XML Query Languages:
  • http://www.w3.org/XML/Query/

41
XML Query Languages
  • Required features of an XML query language
  • Support operations (selection, projection,
    aggregation, sorting, etc.) on all data types
  • Choose a part of the data based on content or
    structure
  • Also operations on hierarchy and sequence of
    document structures
  • Structural preservation and transformation
  • Preserve the relative hierarchy and sequence of
    input document structures in the query results
  • Transform XML structures and create new XML
    structures
  • Combination and joining
  • Combine related information from different parts
    of a given document or from multiple documents

42
XML Query Languages
  • Required features of an XML query language
    (cont'd)
  • Closure property
  • The result of an XML document query is also an
    XML document (usually not valid but well-formed)
  • The results of a query can be used as input to
    another query
  • Notions
  • HTML is layout-oriented, so queries cannot be
    carried out efficiently
  • XML is not layout-oriented but is based on
    representing structure, so DTDs and structure
    information can be used in queries
  • XML query languages are still under construction,
    but prototype languages exist (e.g., XML-QL, XQL,
    Lore)

43
XML Query Languages
  • We want our query to collect elements from
    manufacturer documents (in temp.database.xml)
    listing manufacturer's name, year, models,
    vendors, price, etc. to create new ltcargt elements
  • The results should list their make, model,
    vendor, rank, and price (in this order)
  • Lorel:

select xml(car(select X.vehicle.make,
                      X.vehicle.model,
                      X.vehicle.vendor,
                      X.manufacturer.rank,
                      X.vehicle.price
               from temp.database.xml X))

44
XML Query Languages
  • XML-QL:

WHERE <manufacturer>
        <mn_name>$mn</mn_name>
        <vehiclemodel>
          <model>
            <mo_name>$mon</mo_name>
            <rank>$r</rank>
          </model>
          <vehicle>
            <price>$y</price>
            <vendor>$mn</vendor>
          </vehicle>
        </vehiclemodel>
      </manufacturer> IN "www.nhcs\temp.database.xml"
CONSTRUCT <car>
            <make>$mn</make>
            <mo_name>$mon</mo_name>
            <vendor>$v</vendor>
            <rank>$r</rank>
            <price>$y</price>
          </car>
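
For comparison, the effect of such a query can be approximated in plain Java with the JDK's XPath API. This is not Lorel or XML-QL, only a sketch of the same select-and-construct idea. It assumes the element layout implied by the WHERE pattern, one model and one vehicle per vehiclemodel, and takes the vendor from the vehicle element; the class name `CarQuerySketch` and the sample data are invented, and text values are not escaped in the output.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: selects values from manufacturer data and builds car elements.
class CarQuerySketch {
    static List<String> cars(String xml) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        // Selection: one result per vehiclemodel of every manufacturer.
        NodeList groups = (NodeList) xp.evaluate("//manufacturer/vehiclemodel",
                new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < groups.getLength(); i++) {
            Node g = groups.item(i);
            // Projection and construction: make, model, vendor, rank, price, in this order.
            result.add("<car>"
                + "<make>" + xp.evaluate("../mn_name", g) + "</make>"
                + "<mo_name>" + xp.evaluate("model/mo_name", g) + "</mo_name>"
                + "<vendor>" + xp.evaluate("vehicle/vendor", g) + "</vendor>"
                + "<rank>" + xp.evaluate("model/rank", g) + "</rank>"
                + "<price>" + xp.evaluate("vehicle/price", g) + "</price>"
                + "</car>");
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<database><manufacturer><mn_name>Acme</mn_name><vehiclemodel>"
                   + "<model><mo_name>Roadster</mo_name><rank>1</rank></model>"
                   + "<vehicle><price>20000</price><vendor>CityCars</vendor></vehicle>"
                   + "</vehiclemodel></manufacturer></database>";
        cars(xml).forEach(System.out::println);
    }
}
```

Each result is itself a well-formed XML fragment, so it could serve as input to a further query, in line with the closure property.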