XML and Database - PowerPoint PPT Presentation

Slides: 30
Provided by: ackr

Transcript and Presenter's Notes


1
XML and Database
  • ???
  • ???????

2
Contents
  • Database, Web, and XML
  • XML Database Systems
  • Data Models
  • Query Language and Processing
  • Storage and Index
  • Other issues

3
Database and Web, before XML
  • DB: a back-end server for Web applications
    • CGI
    • JDBC
    • Embedded SQL
  • Web
    • Information retrieval
    • Target to manage (Web DB)

[Diagram: three-tier Web architecture. Thin client (HTML, scripts); middle tier (Web server, template engine with HTML templates and scripts, application server with application code and mapping code); back end.]
4
XML
  • eXtensible Markup Language
  • An emerging standard for data representation
    and exchange on the Internet
    • See the XML catalog, http://www.xml.org
  • Separates content from presentation
    • Easy to provide multiple views of the same data
  • Easily parsed and self-describing

5
XML
  • Extensible: a dynamic data model
  • Simple: human-readable, easy to use
  • Flexible: handles complex data
  • Portable: cross-platform data exchange
  • Standard: easy to integrate, widely adopted

6
HTML? XML ?? ??
7
XML Is All About Data
  • HTML example
  • <heading1> Invoice </heading1>
  • <bold> To: Joe Bloggs <P>
  • From: J. Abrams <P>
  • Date: 2/1/1999 <P>
  • Amount: 100 <P>
  • Tax: 21 <P>
  • Total: 121 </bold>

Data mixed with presentation
8
XML Is All About Data
Human-readable
  • XML example
  • <Invoice>
  • <Customer> Joe Bloggs </Customer>
  • <From> J. Abrams </From>
  • <Date year="1999" month="2" day="1" />
  • <Amount unit="Dollars"> 100 </Amount>
  • <TaxRate> 21 </TaxRate>
  • <Total currency="Dollars"> 121 </Total>
  • </Invoice>

Comes with tags
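
Because the XML version carries only tagged data, a program can read the invoice without scraping presentation markup. A minimal sketch in Python, assuming the attribute values were originally quoted as shown (the function name read_invoice and the dictionary keys are illustrative, not from the slides):

```python
import xml.etree.ElementTree as ET

# The invoice from the slide; the attribute quoting is an assumption.
INVOICE_XML = """
<Invoice>
  <Customer> Joe Bloggs </Customer>
  <From> J. Abrams </From>
  <Date year="1999" month="2" day="1" />
  <Amount unit="Dollars"> 100 </Amount>
  <TaxRate> 21 </TaxRate>
  <Total currency="Dollars"> 121 </Total>
</Invoice>
"""

def read_invoice(xml_text):
    """Return the invoice fields as plain Python values."""
    root = ET.fromstring(xml_text)
    date = root.find("Date")
    return {
        "customer": root.findtext("Customer").strip(),
        "sender": root.findtext("From").strip(),
        "date": (int(date.get("year")), int(date.get("month")), int(date.get("day"))),
        "amount": int(root.findtext("Amount")),
        "tax_rate": int(root.findtext("TaxRate")),
        "total": int(root.findtext("Total")),
    }

invoice = read_invoice(INVOICE_XML)
# The total can be recomputed from the data itself (21 percent tax on 100).
computed_total = invoice["amount"] * (100 + invoice["tax_rate"]) // 100
```

The same data could feed an HTML view, a report, or a database load, which is the point of separating content from presentation.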
9
XML Is All About Data
Extensible
  • XML example
  • <Invoice>
  • <Customer>
  • <Name>Joe Bloggs </Name>
  • <Address> 25 Mall Road </Address>
  • </Customer>
  • <From> J. Abrams </From>
  • <Date year="1999" month="2" day="1" />
  • <Amount unit="Dollars"> 100 </Amount>
  • <TaxRate> 21 </TaxRate>
  • <Total unit="Dollars"> 121 </Total>
  • </Invoice>

<Name>Joe Bloggs </Name> <Address> 25 Mall Road </Address>
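
One way to see what "extensible" buys in practice: a consumer that looks elements up by name keeps working when new elements are added. The sketch below is an illustration under assumptions (shortened documents, hypothetical function names amount_due and customer_name), not part of the original slides:

```python
import xml.etree.ElementTree as ET

SIMPLE = """<Invoice>
  <Customer> Joe Bloggs </Customer>
  <From> J. Abrams </From>
  <Amount unit="Dollars"> 100 </Amount>
</Invoice>"""

EXTENDED = """<Invoice>
  <Customer>
    <Name>Joe Bloggs </Name>
    <Address> 25 Mall Road </Address>
  </Customer>
  <From> J. Abrams </From>
  <Amount unit="Dollars"> 100 </Amount>
</Invoice>"""

def amount_due(xml_text):
    """Older consumer: only knows about <From> and <Amount>."""
    root = ET.fromstring(xml_text)
    return root.findtext("From").strip(), int(root.findtext("Amount"))

def customer_name(xml_text):
    """Newer consumer: prefers <Customer><Name>, falls back to plain text."""
    customer = ET.fromstring(xml_text).find("Customer")
    name = customer.findtext("Name")
    return (name if name is not None else customer.text).strip()
```

amount_due returns the same result for both documents, while customer_name handles the old and the extended form of Customer.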
10
XML Family of Standards
  • XML
  • DOM (Document Object Model)
  • XML Namespaces
  • XSL (style language)
  • XQL (XML Query Language)
  • XML Data / DCD / Schema
  • XUL (updates, future)
  • many more

11
Building Web Applications with XML
  • Quickly react to changes
  • Lower maintenance costs
  • Does not depend on a single vendor

[Diagram: XML-based three-tier Web architecture. Thin client (HTML, scripts); middle tier (Web server / app server, application code, DOM, XSL, XML server; standard API and template language); XML; back end.]
12
Legacy DBs for XML Applications
  • XML as a new data-exchange format for legacy DB
    applications
  • DB2XML
    • Transforms the results of database queries, or
      complete databases, into XML documents, or into
      HTML documents using XSLT stylesheets
    • Can be used
      • as a standalone tool (with GUI or command line)
      • as a servlet to dynamically generate XML
        documents
      • through the DB2XML API
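
The general idea of turning a query result into an XML document can be sketched in a few lines. This is not the DB2XML API; it is a generic illustration using Python's sqlite3 and ElementTree, and the function name query_to_xml, the element names result and row, and the sample table are all assumptions. It also assumes column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET

def query_to_xml(conn, sql, root_tag="result", row_tag="row"):
    """Run a query and wrap each row in an element, one child per column."""
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        row_el = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            ET.SubElement(row_el, column).text = "" if value is None else str(value)
    return root

# Hypothetical legacy table used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (customer TEXT, amount INTEGER)")
conn.execute("INSERT INTO invoice VALUES ('Joe Bloggs', 100)")
doc = query_to_xml(conn, "SELECT customer, amount FROM invoice")
xml_text = ET.tostring(doc, encoding="unicode")
```

An XSLT stylesheet could then be applied to the resulting document to produce HTML, as the slide describes.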

13
XML Database Systems
  • Three approaches
    • Build special-purpose systems
      • Lore, Strudel
      • Best performance for XML data
    • Use object-oriented database systems
      • eXcelon, Monet, Ozone
      • Object-oriented modeling
    • Use relational database systems
      • Oracle, Microsoft
      • Mature technology, large market
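
To make the relational approach concrete, here is one possible mapping, sketched under assumptions: every element becomes a row in a single edge-style table, and a path query becomes a chain of self-joins. The table layout, the names shred and edge, and the sample document are illustrative; real systems use a variety of mappings, and attributes are ignored here for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET
from itertools import count

def shred(xml_text, conn):
    """Store every element as one row: (id, parent id, sibling position, tag, text)."""
    conn.execute(
        "CREATE TABLE edge (id INTEGER PRIMARY KEY, parent INTEGER, "
        "ord INTEGER, tag TEXT, text TEXT)"
    )
    ids = count(1)

    def visit(element, parent_id, ordinal):
        node_id = next(ids)
        text = (element.text or "").strip()
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, ordinal, element.tag, text),
        )
        for position, child in enumerate(element):
            visit(child, node_id, position)

    visit(ET.fromstring(xml_text), None, 0)

conn = sqlite3.connect(":memory:")
shred("<Invoice><Customer><Name>Joe Bloggs</Name></Customer>"
      "<Amount>100</Amount></Invoice>", conn)

# The path Invoice/Customer/Name becomes a chain of self-joins on the edge table.
names = conn.execute(
    "SELECT n.text FROM edge i "
    "JOIN edge c ON c.parent = i.id AND c.tag = 'Customer' "
    "JOIN edge n ON n.parent = c.id AND n.tag = 'Name' "
    "WHERE i.parent IS NULL AND i.tag = 'Invoice'"
).fetchall()
```

The ord column keeps sibling order, which a plain relational table would otherwise lose.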

14
Lore
15
eXcelon
16
Oracle 8i
17
Data Models for XML
  • XML is not a data model
  • Structure of an XML document
    • an ordered list of elements
    • each element
      • may have a set of attributes
      • may have (sub)elements (nested elements)
  • Structured data and full text mixed together
  • DOM defines how to translate an XML document into
    a data structure for processing
  • Need a true data model for XML data
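
As an illustration of the DOM point, the sketch below parses a small mixed-content fragment with Python's xml.dom.minidom and walks the resulting tree. The fragment and the helper name outline are invented for the example.

```python
from xml.dom import minidom

def outline(node, depth=0):
    """Return rows for element and non-blank text nodes, in document order."""
    rows = []
    if node.nodeType == node.ELEMENT_NODE:
        attrs = {name: value for name, value in node.attributes.items()}
        rows.append((depth, node.tagName, attrs))
        for child in node.childNodes:
            rows.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        rows.append((depth, "#text", node.data.strip()))
    return rows

# Mixed content: free text with a structured element in the middle.
dom = minidom.parseString(
    '<Note>Pay <Amount unit="Dollars">100</Amount> by Friday</Note>'
)
rows = outline(dom.documentElement)
```

The tree preserves order, attributes, nesting, and interleaved text, but it is a processing structure for one document rather than a database data model.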

18
OEM: a Semi-structured Data Model
  • Object Exchange Model (Lore)
  • Semi-structured data
    • Self-describing structure, no fixed schema
    • The structure changes rapidly and unpredictably
  • Labeled directed graph
    • Node: object (OID) or atomic value (leaf)
    • Labeled edge: object-subobject relationship

19
OEM, an example
  • <DBGroup>
  • <Member Name="?" Advisor="m1">
  • <Age>28</Age>
  • </Member>
  • <Member ID="m1" Project="p1">
  • <Name>?</Name>
  • <Advisor>?</Advisor>
  • </Member>
  • <Project ID="p1" Member="m1">
  • <Title>XML DB</Title>
  • </Project>
  • </DBGroup>

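[Figure: OEM graph of the example. Root node 1 (DBGroup) has Member, Project, and Member edges to nodes 2, 3, and 4, annotated "Name=Smith, Advisor=m1", "ID=m1, Project=p1", and "ID=p1, Project=m1"; Age, Name, Title, and Advisor edges lead to nodes 5 to 8; Text edges lead to leaf nodes 9 to 12, which hold atomic values such as 28.]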
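
A labeled directed graph of this kind can be derived mechanically from an XML document. The following sketch is one possible encoding, not Lore's implementation: elements become objects with OIDs, subelements become labeled edges, and text becomes an atomic leaf reached through a "Text" edge. The function name to_oem is hypothetical, OIDs are assigned depth-first (so they differ from the numbering in the figure), and ID references are kept as plain attributes rather than resolved into edges.

```python
import xml.etree.ElementTree as ET
from itertools import count

def to_oem(xml_text):
    """Return (root_label, nodes, edges) for a labeled directed graph.

    nodes maps OID -> attribute dict (complex object) or str (atomic leaf);
    edges is a list of (source OID, label, target OID).
    """
    oids = count(1)
    nodes, edges = {}, []

    def visit(element):
        oid = next(oids)
        nodes[oid] = dict(element.attrib)
        for child in element:
            edges.append((oid, child.tag, visit(child)))
        text = (element.text or "").strip()
        if text:
            leaf = next(oids)
            nodes[leaf] = text
            edges.append((oid, "Text", leaf))
        return oid

    root = ET.fromstring(xml_text)
    visit(root)
    return root.tag, nodes, edges

label, nodes, edges = to_oem(
    '<DBGroup><Member ID="m1" Project="p1"><Age>28</Age></Member>'
    '<Project ID="p1" Member="m1"><Title>XML DB</Title></Project></DBGroup>'
)
```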
20
Issues in Data Modeling
  • How to view XML information simultaneously as both
    • a set of documents
    • a single large database
  • No loss of information in XML
    • How to represent the ordering of elements
    • External/internal entities, processing
      instructions

21
  • XML DB design
    • When should attributes be used, and when
      subelements?
    • Is a 1-to-1 relationship best represented using
      element nesting or IDREFs?
    • How to translate the conceptual model (OEM?) into
      an XML encoding?
  • Need to identify the relationship between DTDs
    and traditional DB schemas

22
Query Languages for XML DB
  • Requirements
    • Path expressions
    • Queries over
      • structured and semistructured data
      • full text
      • a mixture of data elements and full text
  • W3C, Query Languages for the Web, 1998
  • QL for semistructured data
  • Lorel, UnQL
  • XQL, XML-QL
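
Path expressions can be tried out with the limited XPath subset in Python's ElementTree. This is only an illustration of the idea, not any of the languages listed above; the sample document, including the member names and ages, is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, loosely modeled on the DBGroup example.
DOC = ET.fromstring("""
<DBGroup>
  <Member ID="m1"><Name>Smith</Name><Age>28</Age></Member>
  <Member ID="m2"><Name>Jones</Name><Age>17</Age></Member>
  <Project ID="p1"><Title>XML DB</Title></Project>
</DBGroup>
""")

# Child steps: every Name directly under a Member under the root.
member_names = [e.text for e in DOC.findall("Member/Name")]
# Descendant step: every Title at any depth.
titles = [e.text for e in DOC.findall(".//Title")]
# Attribute predicate: the Age of the Member whose ID attribute is "m2".
m2_age = DOC.findtext("Member[@ID='m2']/Age")
```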

23
XML-QL
  • Syntax
  • select <variable-list> where <XML-pattern>
  • Example
  • select n, h
  • where <person> <age=a>
  • <name> n </name>
  • <address> ?? ??? ???3? </address>
  • <hobby> h </hobby>
  • </person>, a > 18
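
The pattern-matching style of this query can be imitated in ordinary code: bind a, n, and h for each person element, then keep the bindings that satisfy the conditions. The sketch below is a rough Python equivalent, not XML-QL itself. Treating age as an attribute is an assumption, and because the address constant on the slide is not legible, the sample data and the address value are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical data set for the query.
PEOPLE = ET.fromstring("""
<people>
  <person age="25"><name>Kim</name><address>Seoul</address><hobby>chess</hobby></person>
  <person age="16"><name>Lee</name><address>Seoul</address><hobby>tennis</hobby></person>
  <person age="40"><name>Park</name><address>Busan</address><hobby>hiking</hobby></person>
</people>
""")

def select_name_hobby(root, address, min_age):
    """Bind a, n, h per <person>; keep bindings with the given address and a > min_age."""
    results = []
    for person in root.iter("person"):
        a = int(person.get("age"))
        if person.findtext("address") == address and a > min_age:
            n = person.findtext("name")
            for hobby in person.findall("hobby"):
                results.append((n, hobby.text))
    return results

rows = select_name_hobby(PEOPLE, "Seoul", 18)
```

A person with several hobby elements yields one result row per hobby, mirroring how a pattern variable binds once per match.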

24
Issues in Query Processing
  • The true requirements for an XML query language
    are not yet known
  • Need to review all facets of traditional query
    processing
  • Need to develop a new IR model
    • proximity in XML documents
    • similarity measure between XML elements
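
The slides do not define proximity. One simple candidate, shown here purely as an assumption, is the number of parent/child edges on the tree path between two elements, computed through their lowest common ancestor. The function name tree_distance and the sample document are illustrative.

```python
import xml.etree.ElementTree as ET

def tree_distance(root, first, second):
    """Number of parent/child edges on the path between two elements."""
    parent = {child: node for node in root.iter() for child in node}

    def ancestors(element):
        chain = [element]
        while chain[-1] in parent:
            chain.append(parent[chain[-1]])
        return chain

    up_first, up_second = ancestors(first), ancestors(second)
    depth_of = {element: depth for depth, element in enumerate(up_second)}
    for steps, element in enumerate(up_first):
        if element in depth_of:
            return steps + depth_of[element]
    raise ValueError("elements are not in the same tree")

root = ET.fromstring(
    "<Invoice><Customer><Name/><Address/></Customer><Amount/></Invoice>"
)
name, address, amount = root.find(".//Name"), root.find(".//Address"), root.find("Amount")
```

Under this measure, Name and Address (siblings) are closer to each other than Name and Amount, which matches the intuition that elements in the same subtree are more related.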

25
  • How to integrate
    • the traditional (DB) query processing model and
    • the information retrieval model
  • Optimization schemes
    • for XML data that is not well structured
    • for queries that mix full-text retrieval with
      structured/semistructured search

26
Storage Structure and Indexing
  • Clustering schemes for storing XML data
  • New index types
    • for quickly finding certain elements, attributes,
      and more complex structural patterns
    • element orderings
  • Determine the level of parsing for storing XML
    documents
  • Based on the analysis of encoding patterns
    • merging identical text strings (sub-patterns) by
      using appropriate IDREFs
    • compression based on regular patterns
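
As a small illustration of an index for structural patterns, the sketch below maps each root-to-element label path to the elements found at that path, so a path lookup becomes a dictionary access instead of a tree scan. This is one simple possibility, not a design from the slides; the name build_path_index and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_path_index(root):
    """Map each root-to-element label path to its elements, in document order."""
    index = defaultdict(list)

    def visit(element, prefix):
        path = prefix + "/" + element.tag
        index[path].append(element)
        for child in element:
            visit(child, path)

    visit(root, "")
    return index

root = ET.fromstring(
    "<DBGroup><Member><Name>A</Name></Member>"
    "<Member><Name>B</Name></Member></DBGroup>"
)
index = build_path_index(root)
names = [e.text for e in index["/DBGroup/Member/Name"]]
```

Because each list is filled during a document-order traversal, the index also preserves element ordering within a path.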

27
Issues in Various DB Features
  • Full view support for XML
    • both virtual and materialized views
    • incremental maintenance
    • XSL as a view definition language
  • Data integrity issues
    • What are the constraints on XML data?
      • key, referential, domain
    • How to represent the constraints
    • How to check them when changes occur
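
Key and referential constraints can be checked with a pass over the document. The sketch below is a minimal illustration under assumptions: which attribute is the key (ID) and which attributes are references (Member, Project, Advisor) are hard-coded here, whereas a real system would take them from a DTD or schema. The function name check_constraints is hypothetical.

```python
import xml.etree.ElementTree as ET

def check_constraints(root, id_attr="ID", ref_attrs=("Member", "Project", "Advisor")):
    """Return a list of violation messages; an empty list means the checks passed."""
    problems, seen = [], set()
    for element in root.iter():           # key constraint: IDs are unique
        key = element.get(id_attr)
        if key is None:
            continue
        if key in seen:
            problems.append(f"duplicate {id_attr} {key!r}")
        seen.add(key)
    for element in root.iter():           # referential constraint: references resolve
        for attr in ref_attrs:
            target = element.get(attr)
            if target is not None and target not in seen:
                problems.append(
                    f"<{element.tag}> {attr}={target!r} has no matching {id_attr}"
                )
    return problems

good = ET.fromstring(
    '<DBGroup><Member ID="m1" Project="p1"/><Project ID="p1" Member="m1"/></DBGroup>'
)
bad = ET.fromstring(
    '<DBGroup><Member ID="m1" Project="p9"/><Member ID="m1"/></DBGroup>'
)
```

Rerunning the whole check after every update is the naive answer to "how to check them when changes occur"; an incremental check would only revisit the changed elements and the references to them.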

28
  • Triggers
    • active database capabilities in XML
  • Transaction control over XML databases
  • Performance evaluation
    • need an appropriate benchmark for XML data
      • XML data set
      • query types
      • mix of queries and updates

29
References
  • Research issues
    • Data Management for XML: Research Directions,
      http://www-db.stanford.edu/wisom/xml-whitepaper.html
    • More on Data Management for XML,
      http://www.cs.washington.edu/homes/alon/widom-response.html
  • Storing XML data in RDBMSs
    • A Performance Evaluation of Alternative Mapping
      Schemes for Storing XML Data in a Relational
      Database, ercim.inria. publications/RR-3680
  • XML database systems
    • http://www.xmlsoftware.com/database/