Title: MC407 Implementing XML Solutions with ASA
1MC407 Implementing XML Solutions with ASA
David Carson Principal Consultant iAnywhere
Solutions dcarson_at_sybase.com
2Implementing XML Solutions with ASA
- Presentation Overview
- Overview of XML.
- syntax
- parsers
- Database Data in XML
- schema representation
- result sets
- Storage And Retrieval
3Implementing XML Solutions with ASA
4Implementing XML Solutions with ASA
- A General XML Application
[Diagram: Database - SQL Syntax - XML Translation (Data Source Specific) - XML Stream - XML Document; Other Data Sources]
5Overview Of XML
- What is XML
- A markup language (eXtensible Markup Language)
- descendant of SGML
- provides separation of content from presentation
- structured but not restricted
- tags (markup) relate to the content
- enables portability of the data
- provides ability to define tags and the
structure between them
6Overview Of XML
- Why XML
- Provides well structured documents for the web
- HTML has a limited tag set
- describes presentation of content
- does not describe the content
- SGML is very flexible, but ..
- difficult to implement for a browser
- very powerful, but more than needed over web
- Provides small footprint without loss of
flexibility
7Overview Of XML
- Why XML for Databases
- Expose legacy data to customers (e-Commerce)
- Provides structured secure data access for the web
- Allows searches of database content without direct access to database
- Allows online search engines to query your site for specific types of data (uses DTD)
- Allow import of third party data, export of your data with a known interface
8Overview Of XML
- Issues
- Databases are generally relational
- XML documents tend to be hierarchical
- There are data mapping issues
- Indices
- Foreign Keys
- Binary and Large Non-Text Objects
- Will need some intelligence to handle data in/out
- Maintaining RI may be less important than data access
9Overview Of XML
- XML Document Types
- Data Centric Documents
- regular defined structure
- fine grained data of element or attribute type
- minimal mixed content
- generally intended for machine consumption
- database tables, rows and columns
10Overview Of XML
- XML Document Types
- Document Centric Documents
- irregular structure
- large blocks of data
- mixed content - text, images, boilerplate etc.
- generally intended for human consumption
- catalog page with pictures, descriptions, pricing
11Overview Of XML
How Do We Get There
[Diagram, inbound: XML Document - XML Stream - XML / SQL Translation (Parser) - SQL Syntax - Database]
[Diagram, outbound: Database - SQL ResultSet - SQL / XML Translation (Parser) - XML Stream - XML Document]
12Overview Of XML
13Overview Of XML
- XML - Syntax And Rules
- XML Declaration (prolog)
- Looks like
- <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
- Not required but always recommended
- This is really a form of Processing Instruction (PI)
- Needed to help tools identify document processing needs
- (i.e. Web browser - knows to call xml parser)
14Overview Of XML
- XML - Syntax And Rules
- Processing Instructions (PI)
- <?gettime timeis?>
- gettime = target, timeis = parameter
- passed to application (browser ?) as instructions
- application may ignore them if it cannot process
- any PI starting with xml is reserved
- E.g. <?xml:xsl ?>
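As a rough illustration of how a parser hands processing instructions to the application, the sketch below uses Python's standard xml.sax module. The choice of Python, the sample document and the PICollector class are assumptions made for the example; the slides do not name a parser at this point.

```python
import xml.sax


class PICollector(xml.sax.ContentHandler):
    """Records each processing instruction the parser reports."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def processingInstruction(self, target, data):
        # target is the PI name; data is everything that follows it
        self.instructions.append((target, data))


# Hypothetical document carrying the gettime PI from the slide
doc = b'<?xml version="1.0"?><doc><?gettime timeis?></doc>'
collector = PICollector()
xml.sax.parseString(doc, collector)
print(collector.instructions)
```

The XML declaration at the top of the document is not reported through this callback; only the gettime instruction reaches the application.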
15Overview Of XML
- XML - Syntax And Rules
- Elements
- Named Tag Pairs
- Start with < > and end with </ >
- name must be the same for both parts
- (i.e. <element1> Content Stuff </element1>)
- May have attributes in tag
- (i.e. <textline color="blue"> Some Text </textline>)
16Overview Of XML
- XML - Syntax And Rules
- Entity References (Internal/External/Parameter)
- handles special cases and reserved characters
- always begins with an & character (and ends with ;)
- used for character or string substitution
- (e.g. &lt; = <, so <stuff>&lt; 123</stuff> = < 123)
- provides method to include special characters
- (e.g. &#8192; or &#x2000; represent Unicode U+2000)
- Include external items: &boilerplate; -> /boilerplt.txt
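A minimal sketch of entity substitution, assuming Python's standard xml.etree.ElementTree as the parser (not something the slides prescribe). The document below is invented for the example: it declares one internal entity named statement and also uses the predefined &lt; entity and the two numeric references from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one internal entity plus predefined and numeric references
doc = """<?xml version="1.0"?>
<!DOCTYPE parent [
  <!ENTITY statement "This Is A Parent-Child Markup">
]>
<parent>&statement;<stuff>&lt; 123</stuff><space>&#8192;&#x2000;</space></parent>"""

root = ET.fromstring(doc)

# The application only ever sees the substituted text
print(root.text)                      # internal entity replaced by its string
print(root.find("stuff").text)        # &lt; replaced by <
print(repr(root.find("space").text))  # both references give the same character
```

Decimal 8192 and hexadecimal 2000 name the same code point, so the space element ends up holding the same character twice.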
17Overview Of XML
- XML - Syntax And Rules
- Comments
- written as <!-- some comments here -->
- not generally made part of XML output stream
- parser will ignore content within comments
18Overview Of XML
- XML - Syntax And Rules
- PCDATA (#PCDATA) - Parsed Character Data
- used to supply markup or other characters to application
- parser will parse any content in PCDATA section
- parser will pass these on to application with any entity substitutions applied
19Overview Of XML
- XML - Syntax And Rules
- CDATA Sections <![CDATA[ ]]>
- used to supply markup or other characters to application
- parser will not interpret any content in CDATA section
- parser will pass these on to application unchanged
- cannot contain ]]> which is end of CDATA section
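To make the CDATA behaviour concrete, here is a small sketch, again assuming Python's standard ElementTree parser. The note element and its content are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical element whose content would be illegal outside a CDATA section
doc = "<note><![CDATA[<b>not markup</b> & not an entity]]></note>"

# The parser does not interpret the section; the text arrives unchanged
text = ET.fromstring(doc).text
print(text)
```

Without the CDATA wrapper the same content would have to be written with &lt; and &amp; escapes to stay well formed.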
20Overview Of XML
- Other Rules
- Identifiers
- May use most characters, some may need to be escaped
- < > ! xml --
- &amp; (&), &gt; (>), &lt; (<)
- Cannot start with a number
- Each start/stop tag set must match
- XML declaration keywords are in caps (they are case sensitive)
- ELEMENT, PCDATA, CDATA, etc.
21Overview Of XML
-
- Basic XML Structure
- Well Formed And Valid Documents
22Overview Of XML
- Basic XML Structure - Well Formed
- <?xml version="1.0"?> - version declaration
- <parent_node> - start of a markup set
- <child_node> - start of content markup
- This is some data - the content
- </child_node> - end of content markup
- <child_node/> - empty content
- </parent_node> - end of markup set
23Overview Of XML
- Basic XML Rules for Well Formed Docs
- declare your XML version
- opening tags use < > format
- closing tags use </ > format
- be sure all child nodes are nested under parent node
- you may use < /> for empty content nodes
- This is considered well formed XML.
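The rules above can be checked mechanically by any non-validating parser. The sketch below assumes Python's standard ElementTree; the is_well_formed helper and both sample strings are illustrative names, not part of the original material.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Child nodes properly nested, one empty content node
good = "<parent_node><child_node>This is some data</child_node><child_node/></parent_node>"
# Closing tags in the wrong order, so the child is not nested under the parent
bad = "<parent_node><child_node>This is some data</parent_node></child_node>"

print(is_well_formed(good), is_well_formed(bad))
```

No DTD is involved here; the check only covers tag matching and nesting, not the structure a DTD would enforce.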
24Overview Of XML
- XML Rules for Valid Well Formed Documents
- Document Type Definition (DTD)
- create a DTD to define document structure
- include as part of document (internal)
- include from a separate file (external)
- use a mixture or both if needed
- enforces structure on tag sets and content
- Document will be a valid XML document, if it
follows all of the structural rules in the DTD
25Overview Of XML
- Validated XML Markup
- Document Type Definitions
- (DTD)
26Overview Of XML
- Document Type Definition (DTD)
-
- Defines Structure for XML Document
- Determines Tag Names and Attributes
- Enforces format of content
- Provides a template for the XML parser and constructor
- Not always required, but makes document portable
- Can be included in document, or referenced externally
- Can merge several DTDs into a common Namespace
27Overview Of XML
- Validated XML Markup - The DTD
- <?xml version="1.0"?> - version declaration
- <!DOCTYPE parent [ - start of parent DTD
- <!ELEMENT parent (child+)> - start of parent details
- <!ELEMENT child (info?, name)> - start of child details
- <!ELEMENT info EMPTY> - empty info details
- <!ELEMENT name (lastname,firstname)> - name details
- <!ELEMENT lastname (#PCDATA)> - lastname details
- <!ELEMENT firstname (#PCDATA)> - firstname details
28Overview Of XML
- Validated XML Markup (Contd)
- <!ATTLIST info - optional
- number ID #REQUIRED
- athome CDATA #FIXED "yes"
- sex (boy|girl|unknown) "unknown"
- >
- <!ENTITY statement "This Is A Parent-Child Markup">
- ]>
29Overview Of XML
- Validated Markup
- <?xml version="1.0"?> - start of document
- <!DOCTYPE parent SYSTEM "parent.dtd"> - indicate what dtd is used
- <parent> - start of parent markup
- &statement; - insert our statement here
- <child> - start of a child node
- <info number="1" athome="yes" sex="girl"/> - some info attributes
- <name> - start of a name node
- <lastname>Miller</lastname>
- <firstname>Susan</firstname>
- </name> - close of a name node
- </child> - close of a child node
- </parent> - close of parent markup
30Overview Of XML
31Overview Of XML
- Getting at the XML Content - The Parser
- Non-Validating Parser
- checks for well formed documents only
- does not need the DTD to get content
- small footprint and usually very fast
- Validating Parser
- Requires DTD in document, or accessible instance
- More powerful, and provides DOM structure
- larger footprint, somewhat slower to parse stream
32Overview Of XML
A More Complete Model
Style Sheets Other
XML Parser (SAX/DOM)
XML Stream
XML Document
XML Enabled Application
SAX Events or DOM Nodes
Output WML/HTML XML/Other
Database
Database XML Factory (Builds XML)
Results
SQL Queries
33Overview Of XML
- The SAX API (Simple API for XML)
- Event based API for XML Parsing
- generates events for start/end of elements
- does not store any content or structures
- must grab content when event is fired
- small footprint and usually very fast
- ideal for stream type XML processing
- C and Java variants available
34Overview Of XML
- The SAX API (Simple API for XML)
XML Stream
For Each Node
Do Processing
Begin Node
Do Processing
Begin Attribute
More Sub-Nodes
Do Processing
End Node
Do Processing
Next Node
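A minimal sketch of the event flow, assuming Python's standard xml.sax module in place of the C and Java SAX parsers the slides refer to. The EventLogger class and the sample document are invented for the example. Nothing is stored by the parser, so the handler has to capture whatever it needs at the moment each event fires.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Collects SAX events in the order the parser fires them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attributes are only available during this callback
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Skip whitespace-only chunks between elements
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


# Hypothetical stream with one parent node and one child node
doc = b'<parent><child sex="F">Susan</child></parent>'
handler = EventLogger()
xml.sax.parseString(doc, handler)
for event in handler.events:
    print(event)
```

For large streams the handler would normally act on each event straight away instead of collecting them in a list; the list is only there to make the order visible.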
35Overview Of XML
- The DOM API (Document Object Model)
- DOM Tree Constructor API for XML Parsing
- builds tree like structure of elements
- uses some internal resources to hold tree
- allows walking of tree to access element content
- provides opportunity to manipulate content
- Read/Add/Search/Modify/Delete
- C/C++ and Java variants available
36Overview Of XML
- The DOM API (Document Object Model)
[Diagram: XML Stream / XML Document parsed into a DOM Object (Parent Record). Element Parent: Attribute Name = Bob Smith, Attribute Address = 12 Clair Rd, Attribute City = Toronto. Element Child: Attribute Name = Susan, Attribute Sex = F]
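The read/add/modify/delete operations can be sketched with Python's standard xml.dom.minidom, used here as a stand-in for the C/C++ and Java DOM implementations the slides mention. The document string mirrors the parent record in the diagram; the extra child named Tom is made up for the example.

```python
from xml.dom import minidom

# Hypothetical document mirroring the parent/child record in the diagram
doc = (
    '<parent name="Bob Smith" address="12 Clair Rd" city="Toronto">'
    '<child name="Susan" sex="F"/>'
    "</parent>"
)
dom = minidom.parseString(doc)
parent = dom.documentElement

# Read: walk the tree and pull an attribute value
child = parent.getElementsByTagName("child")[0]
original_sex = child.getAttribute("sex")

# Modify: change an attribute in place
child.setAttribute("sex", "unknown")

# Add: create a new child node and attach it to the parent
extra = dom.createElement("child")
extra.setAttribute("name", "Tom")
parent.appendChild(extra)

# Delete: remove the original child node
parent.removeChild(child)

remaining = [c.getAttribute("name") for c in parent.getElementsByTagName("child")]
print(original_sex, remaining)
```

The whole tree is held in memory for the life of the dom object, which is the resource cost the slide refers to.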
37Overview Of XML
- Displaying the XML Content - Style Sheets
- XSL - Extensible Stylesheet Language
- language to define the current view of content
- can have several styles for different
- applications / users / devices
- CSS - Cascading Style Sheets
- provides similar features, but not XML specific
- intended originally for browser support
38Overview Of XML
- Displaying the XML Content - Translations
- XSLT - Extensible Stylesheet Language Transformations
- translates one style sheet result to another
- select some or all elements of document for inclusion in translation
- allows changing one element tag name to another
- allows moving content between tags
- allows wrapping of XML in other tags (Eg. HTML)
39Describing Data
Describing Database Data
40Describing Data
- Types of Database Data
- Database Schema
- Tables
- Indices
- Data Types
- Database Data
- Record Sets
- Rows
- Columns
41Describing Data
- Database Schema
- Need to be able to describe tables and their attributes
- table name
- column names
- column attributes
- data types
- sizes
- defaults
- nullable
42Describing Data
- Simple Sales Database Schema
Sales Database
Customer: CustID int, Salutation char(3), FirstName char(20), LastName char(20), Address char(60), City char(60), State char(2)
Products: ProductID int, Description char(60)
Orders: CustID int, ProductID int, Quantity char(3)
43Describing Data
- Basic Database XML Structure
- <table> - start of a table in schema
- <row> - rows in table
- <column1> </column1> - columns in row
- <column2> </column2>
-
- </row> - next row
-
- </table> - end of table
44Describing Data
- Document Type Definition (DTD)
- First Pass for customer Table
- <!ELEMENT customer (row+)>
- <!ELEMENT row (custid, salutation, firstname, lastname, address, city, state)>
- <!ELEMENT custid (#PCDATA)>
- <!ELEMENT salutation (#PCDATA)>
- <!ELEMENT firstname (#PCDATA)>
- <!ELEMENT lastname (#PCDATA)>
- <!ELEMENT address (#PCDATA)>
- <!ELEMENT city (#PCDATA)>
- <!ELEMENT state (#PCDATA)>
45Describing Data
- customer Table (DTD) - Null/Not Null
- <!ELEMENT customer (row+)>
- <!-- First Name Is Optional -->
- <!ELEMENT row (custid, salutation, firstname?, lastname, address, city, state)>
- <!ELEMENT custid (#PCDATA)>
- <!ELEMENT salutation (#PCDATA)>
- <!ELEMENT firstname (#PCDATA)>
- <!ELEMENT lastname (#PCDATA)>
- <!ELEMENT address (#PCDATA)>
- <!ELEMENT city (#PCDATA)>
- <!ELEMENT state (#PCDATA)>
46Describing Data
- customer Table (DTD) - Simple Constraints
- <!ELEMENT customer (row+)>
- <!-- First Name Is Optional -->
- <!ELEMENT row (custid, salutation, firstname?, lastname, address, city, state)>
- <!ELEMENT custid (#PCDATA)>
- <!ELEMENT salutation EMPTY>
- <!ATTLIST salutation greeting (Mr | Mrs | Ms | Dr) #IMPLIED>
- <!ELEMENT firstname (#PCDATA)>
- <!ELEMENT lastname (#PCDATA)>
- <!ELEMENT address (#PCDATA)>
- <!ELEMENT city (#PCDATA)>
- <!ELEMENT state (#PCDATA)>
47Describing Data
- customer Table - The XML Document
- <?xml version="1.0" encoding="UTF-8"?>
- <!DOCTYPE customer SYSTEM "file://localhost/D/XML/TechWave/customer.dtd">
- <customer>
- <row>
- <custid>1</custid>
- <salutation greeting="Mr"/> - Salutation Is An Attribute
- <firstname/> - First Name Is Empty (NULL)
- <lastname>Brown</lastname>
- <address> 432 Brown Road</address>
- <city>Detroit</city>
- <state>MI</state>
- </row>
- .
- .
48Describing Data
- customer Table - The XML Document
- <row>
- <customer_id>2</customer_id>
- <formal_grt greeting="Ms"/>
- <first_name>Candice</first_name>
- <last_name>Bergman</last_name>
- <city>New York</city>
- <state>NY</state>
- </row>
- </customer>
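One way to produce a table/row/column document like the one above is to walk a query result and emit one element per column. The sketch below is an assumption-laden illustration: it uses Python with an in-memory sqlite3 database standing in for ASA, and the table_to_xml helper, the trimmed column list and the sample row are all invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Build <table><row><column>...</column></row></table> from a table."""
    # The table name is interpolated directly, so it must come from trusted code
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    root = ET.Element(table)
    for record in cur:
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, record):
            col = ET.SubElement(row, name)
            # A NULL column becomes an empty element, as on the slide
            if value is not None:
                col.text = str(value)
    return root


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer "
    "(custid INTEGER, firstname TEXT, lastname TEXT, city TEXT, state TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, NULL, 'Brown', 'Detroit', 'MI')")

customer_xml = table_to_xml(conn, "customer")
xml_text = ET.tostring(customer_xml, encoding="unicode")
print(xml_text)
```

This sketch maps every column to an element. Mapping the salutation to an attribute, as the DTD with simple constraints does, would need a per-column rule on top of it.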
49Describing Data
- Customer Database Table - DOM View
[Diagram: XML Stream / XML Customer document as a DOM tree. Element Customer contains Element Rows (repeated, more rows . . .); each Rows element contains Element CustID, Element Salutation, Element FirstName, Element LastName, Element Address, . . .]
50Describing Data
51Describing Data
- Schema Representation
- Try To
- Reduce Overall XML Document Size
- Provide Ability to add new items
- Generally represent the existing RI if possible
- Present all data in tables in proper relationship
- Handle some constraints and nullable columns
52Describing Data
- Sales Database - Using A Flexible DTD
- <!ELEMENT SalesRec (CustomerDetails+)>
- <!ELEMENT CustomerDetails (OrderDetails)>
- <!ATTLIST CustomerDetails
- CustID CDATA #REQUIRED
- FirstName CDATA #IMPLIED
- LastName CDATA #REQUIRED
- Address CDATA #IMPLIED
- City CDATA #IMPLIED
- Prov_State CDATA #IMPLIED>
- <!ELEMENT OrderDetails (LineItem)>
- <!ELEMENT LineItem EMPTY>
- <!ATTLIST LineItem
- Quantity CDATA #REQUIRED
- ProductID CDATA #REQUIRED
- ProductDesc CDATA #IMPLIED
- >
54Overview Of XML
- DOM View Of Sales Database
ElementSalesRec
ElementCustomerDetails
More Customers
55Describing Data
56Describing Data
- A ResultSet DTD
- <!ELEMENT ResultSet (ResultSetMetaData,ResultSetData)>
- <!ELEMENT ResultSetMetaData (ColumnMetaData+)>
- <!ATTLIST ResultSetMetaData getColumnCount CDATA #IMPLIED>
- <!ELEMENT ColumnMetaData EMPTY>
- <!ATTLIST ColumnMetaData
- getCatalogName CDATA #IMPLIED
- .
- .
57Describing Data
- <!ATTLIST ColumnMetaData
- getCatalogName CDATA #IMPLIED
- getColumnDisplaySize CDATA #IMPLIED
- getColumnLabel CDATA #IMPLIED
- getColumnName CDATA #IMPLIED
- getColumnType CDATA #REQUIRED
- getColumnTypeName CDATA #IMPLIED
- getPrecision CDATA #IMPLIED
- getScale CDATA #IMPLIED
- getSchemaName CDATA #IMPLIED
- getTablename CDATA #IMPLIED
- .
- .
58Describing Data
- .
- .
- isAutoIncrement (true|false) #IMPLIED
- isCaseSensitive (true|false) #IMPLIED
- isCurrency (true|false) #IMPLIED
- isDefinitelyWritable (true|false) #IMPLIED
- isNullable (true|false) #IMPLIED
- isReadOnly (true|false) #IMPLIED
- isSearchable (true|false) #IMPLIED
- isSigned (true|false) #IMPLIED
- isWritable (true|false) #IMPLIED
- .
- .
59Describing Data
- .
- .
- isWritable (true|false) #IMPLIED
- >
- <!ELEMENT ResultSetData (Row+)>
- <!ELEMENT Row (Column+)>
- <!ELEMENT Column (#PCDATA)>
60Describing Data
- <?xml version="1.0" encoding="UTF-8"?>
- <!DOCTYPE ResultSet SYSTEM "file://localhost/D/XML/TechWave/recordset.dtd">
- <ResultSet>
- <ResultSetMetaData getColumnCount="6">
- <ColumnMetaData getCatalogName="customer_id"
  getColumnDisplaySize="4" getColumnLabel="Customer ID"
  getColumnName="customer_id"
  getColumnType="int" getTablename="customer"
  isAutoIncrement="false" isCaseSensitive="false"
  isCurrency="false" isDefinitelyWritable="true"
  isNullable="false" isReadOnly="false"
  isSearchable="true" isSigned="false"
  isWritable="true"/>
- <ColumnMetaData getCatalogName="formal_grt"
  getColumnDisplaySize="3" getColumnLabel="Formal Grt"
  getColumnName="formal_grt"
  getColumnType="char" getPrecision="3"
  getTablename="customer" isAutoIncrement="false"
  isCaseSensitive="false" isCurrency="false"
  isDefinitelyWritable="true" isNullable="false"
  isReadOnly="false" isSearchable="true"
  isSigned="false" isWritable="true"/>
- <ColumnMetaData getCatalogName="first_name"
  getColumnDisplaySize="20" getColumnLabel="First Name"
  getColumnType="char" getPrecision="20"
  getTablename="customer" isAutoIncrement="false"
  isCaseSensitive="false" isCurrency="false"
  isDefinitelyWritable="true" isNullable="true"
  isReadOnly="false" isSearchable="true"
  isSigned="false" isWritable="true"/>
- <ColumnMetaData getCatalogName="last_name"
  getColumnDisplaySize="20" getColumnLabel="Last Name"
  getColumnType="char" getPrecision="20"
  getTablename="customer" isAutoIncrement="false"
  isCaseSensitive="false" isCurrency="false"
  isDefinitelyWritable="true" isNullable="false"
  isReadOnly="false" isSearchable="true"
  isSigned="false" isWritable="true"/>
61Describing Data
- .
- .
- </ResultSetMetaData>
- <ResultSetData>
- <Row>
- <Column><![CDATA[1]]></Column>
- <Column><![CDATA[Mr]]></Column>
- <Column><![CDATA[James]]></Column>
- <Column><![CDATA[Brown]]></Column>
- <Column><![CDATA[Detroit]]></Column>
- <Column><![CDATA[MI]]></Column>
- </Row>
- .
- .
62Describing Data
- .
- .
- <Row>
- <Column><![CDATA[1]]></Column>
- <Column><![CDATA[Ms]]></Column>
- <Column><![CDATA[Candice]]></Column>
- <Column><![CDATA[Bergman]]></Column>
- <Column><![CDATA[New York]]></Column>
- <Column><![CDATA[NY]]></Column>
- </Row>
- </ResultSetData>
- </ResultSet>
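A generic result set document of this shape can be generated from whatever metadata the database driver exposes. The sketch below is only an approximation under stated assumptions: it uses Python's DB-API with sqlite3 in place of JDBC and ASA, the result_set_to_xml helper and sample table are invented, and because sqlite3 reports only column names the output carries getColumnCount and getColumnName but none of the other attributes (so it would not satisfy the required getColumnType). ElementTree also escapes text rather than writing CDATA sections.

```python
import sqlite3
import xml.etree.ElementTree as ET


def result_set_to_xml(cursor):
    """Wrap an executed DB-API cursor in ResultSet-style XML."""
    root = ET.Element("ResultSet")
    meta = ET.SubElement(
        root, "ResultSetMetaData", getColumnCount=str(len(cursor.description))
    )
    for column in cursor.description:
        # column[0] is the column name; sqlite3 leaves the other fields empty
        ET.SubElement(meta, "ColumnMetaData", getColumnName=column[0])
    data = ET.SubElement(root, "ResultSetData")
    for record in cursor:
        row = ET.SubElement(data, "Row")
        for value in record:
            # NULL becomes an empty Column element
            ET.SubElement(row, "Column").text = None if value is None else str(value)
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, last_name TEXT, state TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Brown', 'MI')")

result = result_set_to_xml(conn.execute("SELECT * FROM customer"))
print(ET.tostring(result, encoding="unicode"))
```

Because the function only looks at the cursor, the same code serves any query, which is the point of describing the result set instead of a specific table.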
63Storage & Retrieval
64Storage & Retrieval
- What to Store
- Content only
- will need to parse data in/out to rebuild XML document
- required if data is changing often
- easier to manipulate data as SQL data on a database
- allows database to manage RI, constraints Etc.
- use database engine/Java support to manage documents
- could store some XML in database (DTD ?)
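Storing content only means parsing the incoming document and writing its values into ordinary columns. A minimal sketch of that inbound step follows, assuming Python with sqlite3 standing in for ASA. The COLUMNS list, the trimmed customer table and the sample document are made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical incoming document in the table/row/column layout
doc = """<customer>
  <row><custid>1</custid><firstname/><lastname>Brown</lastname>
       <city>Detroit</city><state>MI</state></row>
</customer>"""

COLUMNS = ["custid", "firstname", "lastname", "city", "state"]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer "
    "(custid INTEGER, firstname TEXT, lastname TEXT, city TEXT, state TEXT)"
)

for row in ET.fromstring(doc).iter("row"):
    # An empty or missing element is stored as NULL
    values = [row.findtext(name) or None for name in COLUMNS]
    conn.execute("INSERT INTO customer VALUES (?, ?, ?, ?, ?)", values)

stored = conn.execute("SELECT custid, firstname, lastname FROM customer").fetchall()
print(stored)
```

Once the values are in columns the database can apply its own constraints and RI, at the cost of rebuilding the document on the way out.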
65Storage & Retrieval
- What to Store
- Full Document
- fast return of complete XML document
- good if content is reasonably static (catalogs, price lists )
- need a document management system to search easily
- could build indices for relational searches
- use database engine/Java support to manage documents
- may need binary or image data type support
66Storage & Retrieval
- Retrieving and Building Documents
67Storage & Retrieval
- Building the XML Document
- Options
- Java
- Java in the database, or external application
- Use JDBC to access database content
- Use off the shelf SAX/DOM APIs for Java
- JDBC attributes provide data description
- Need to reference the proper DTD to build
68Storage & Retrieval
- Building the XML Document
- Options
- C/C++
- external to database (DLL, external app)
- Use ODBC to access database content
- ODBC attributes provide data description
- Use C/C++ SAX/DOM APIs
69Storage & Retrieval
- Building the XML Document
- Options
- PowerDynamo
- builds DOM object
- allows access to SQL query results through DOM
- merge XML with existing Web sites
- Use C/C++ SAX/DOM APIs (harder to find)
70Storage & Retrieval
- Building the XML Document
- Options
- iAnywhere Wireless Server
- SQL Support
- SAX/DOM Parsers
- Java Servlets
- XSLT - XML/HTML/HDML/WML ...
- iAnywhere framework
71Implementing XML Solutions with ASA