Title: native XML databases applications
1<native XML databases> applications
implications
- INLS 258 DB2
- Alex Vidas, Susan Teague Rector
- October 29, 2003
2What Are Native XML Databases (NXDs)?
- The XML:DB mailing list proposes that a native
XML database:
- Defines a (logical) model for an XML document --
as opposed to the data in that document -- and
stores and retrieves documents according to that
model. At a minimum, the model must include
elements, attributes, PCDATA, and document order.
Examples of such models are the XPath data model,
the XML Infoset, and the models implied by the
DOM and the events in SAX 1.0.
- Has an XML document as its fundamental unit of
(logical) storage, just as a relational database
has a row in a table as its fundamental unit of
(logical) storage.
- Is not required to have any particular underlying
physical storage model. For example, it can be
built on a relational, hierarchical, or
object-oriented database, or use a proprietary
storage format, such as indexed, compressed
files.
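The minimum model named in the definition (elements, attributes, PCDATA, and document order) can be made concrete with a small sketch. The sample document and the `walk` helper below are illustrative assumptions, not part of the original slides; the code uses Python's standard `xml.etree.ElementTree` module only to show what each part of the model looks like for one document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
DOC = (
    '<book id="b1">'
    '<title>Sample Title</title>'
    '<chapter n="1">Some character data.</chapter>'
    '</book>'
)

def walk(xml_text):
    """Return (element name, attributes, leading text) tuples in document order."""
    root = ET.fromstring(xml_text)
    # Element.iter() is a depth-first traversal, which matches document order.
    return [
        (el.tag, dict(el.attrib), (el.text or "").strip())
        for el in root.iter()
    ]

nodes = walk(DOC)
for tag, attrs, text in nodes:
    print(tag, attrs, repr(text))
```

A store that keeps these four things intact can hand back the document as it was written, which is the point of modeling the document rather than only its data.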
3Architecture
[Figure: architecture diagram. Components shown:
XML-RPC Interface, SOAP Interface, XML:DB API,
Database Engine, XPath Engine, DBBroker Interface,
Native Broker, Relational Broker, Native XML Data
Store, RDBMS.]
Example depicting the eXist native XML database,
from Chaudhri, Akmal B., Rashid, Awais, and Zicari,
Roberto. (2003). XML Data Management: Native XML
and XML-Enabled Database Systems.
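One way to read the component names in the figure is that the database engine talks to storage only through a broker interface, with a native and a relational implementation behind it. The sketch below illustrates that idea under stated assumptions: it is not eXist's code or API (eXist is written in Java), and every class and method name here is hypothetical apart from the labels borrowed from the figure.

```python
import sqlite3
from abc import ABC, abstractmethod

class DBBroker(ABC):
    """Hypothetical storage interface; the method names are assumptions."""

    @abstractmethod
    def store(self, name, xml_text): ...

    @abstractmethod
    def get(self, name): ...

class NativeBroker(DBBroker):
    """Stands in for a native store: keeps each document whole, in memory."""

    def __init__(self):
        self._docs = {}

    def store(self, name, xml_text):
        self._docs[name] = xml_text

    def get(self, name):
        return self._docs[name]

class RelationalBroker(DBBroker):
    """Keeps documents in a relational table (in-memory SQLite)."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE docs (name TEXT PRIMARY KEY, body TEXT)"
        )

    def store(self, name, xml_text):
        self._conn.execute(
            "INSERT OR REPLACE INTO docs VALUES (?, ?)", (name, xml_text)
        )

    def get(self, name):
        row = self._conn.execute(
            "SELECT body FROM docs WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(name)
        return row[0]

class DatabaseEngine:
    """Delegates all storage to whichever broker it was given."""

    def __init__(self, broker):
        self._broker = broker

    def put_document(self, name, xml_text):
        self._broker.store(name, xml_text)

    def get_document(self, name):
        return self._broker.get(name)

# The same engine code runs unchanged over either back end.
results = []
for broker in (NativeBroker(), RelationalBroker()):
    engine = DatabaseEngine(broker)
    engine.put_document("a.xml", "<a>hi</a>")
    results.append(engine.get_document("a.xml"))
```

The design choice being illustrated is the one from the definition slide: the logical unit is the document, while the physical storage underneath can vary.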
4History: Why Were NXDs Developed?
- "... the new paradigm is the XML document.
Applications are slowly but steadily moving away
from a data-centric point of view to a
document-centric-oriented framework" - Mable.
- Need ways of processing, storing, and manipulating
semi-structured data, AND of regenerating the
original document
- NXDs offer faster and more efficient alternatives
for some of these tasks.
5NXD Features
- Document Collections
- Query Languages
- Updates and Deletes
- Transactions, Locking, and Concurrency
- Application Programming Interfaces (APIs)
- Round-Tripping
- Remote Data
- Indexes
- External Entity Storage
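Two of the features listed above, document collections and query languages, can be illustrated together. The sketch below is a minimal stand-in, not how any particular NXD works: the collection is a plain dictionary, the sample documents are invented, and the query uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree` rather than a full XPath or XQuery engine. The `query_collection` helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection: document name -> document text (invented samples).
collection = {
    "doc_a.xml": "<book><title>A</title><chapter n='1'>Alpha</chapter></book>",
    "doc_b.xml": (
        "<book><title>B</title>"
        "<chapter n='1'>Beta</chapter><chapter n='2'>Gamma</chapter></book>"
    ),
}

def query_collection(docs, path):
    """Run one path expression against every document in the collection."""
    hits = []
    for name in sorted(docs):
        root = ET.fromstring(docs[name])
        for el in root.findall(path):
            hits.append((name, el.text))
    return hits

# Find the first chapter of every book in the collection.
hits = query_collection(collection, ".//chapter[@n='1']")
print(hits)
```

A real NXD would typically answer such a query from indexes instead of re-parsing every document, which is where the "Indexes" feature in the list comes in.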
6Document Centric Versus Data Centric
- Data-Centric Documents
- use XML as a data transport (Bourret)
- like the data in RDBMSs
- designed for machine consumption; the fact
that XML is used at all is usually superfluous
(Bourret)
7Document Centric Versus Data Centric
- Document-Centric Documents
- intended for a person to read rather than for an
application to process (Mable)
- implicit ordering that allows little or no
structure (Mable)
- semi-structured in nature (Bourret)
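The contrast between the two kinds of document can be shown with two tiny samples. The check below relies on mixed content (text interleaved with child elements) as a rough sign of document-centric XML; that heuristic is an assumption of this sketch rather than a rule stated in the slides, and both sample documents and the `has_mixed_content` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: a record-like document and a prose-like document.
DATA_CENTRIC = "<order><id>17</id><qty>3</qty></order>"
DOCUMENT_CENTRIC = (
    "<p>This sentence has <em>inline</em> markup in the "
    "<em>middle</em> of running text.</p>"
)

def has_mixed_content(xml_text):
    """Return True if any element mixes character data with child elements."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        children = list(el)
        if not children:
            continue  # leaf elements cannot have mixed content
        # Text before the first child counts as mixed content...
        if (el.text or "").strip():
            return True
        # ...and so does text following any child (its "tail").
        if any((child.tail or "").strip() for child in children):
            return True
    return False

print(has_mixed_content(DATA_CENTRIC))      # False
print(has_mixed_content(DOCUMENT_CENTRIC))  # True
```

The data-centric sample maps cleanly onto a table row, while the prose sample loses meaning if its text and inline elements are separated or reordered.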
8Sample of Document-Centric/Semi-Structured XML
- http://www.ibiblio.org/dickens/xml/
- from the Dickens Collection, courtesy of the
Documenting the American South Digitization Project
9Overview of Advantages/Disadvantages to Using NXD
Technology
- Main Advantages
- Superior Document-Centric/Semi-Structured Storage
and Searching of XML
- Efficiency
- Similar features to those of a RDBMS
- Reuse Indexing Added Value
- Maintains the integrity of the document
- Schema-less in most cases
- Reduced workload for database professionals
- Quicker development to market time
- No complex mapping between the XML document and
RDBMS
- Supports full-text searching and manipulation of
recursive content
10Overview of Advantages/Disadvantages to Using NXD
Technology
- Main Disadvantages
- Extremely young, cutting-edge technology
- Has the stigma of becoming the next
object-oriented database
- Organizations already have complex RDBMS models
in place
- Lack of formatting options when retrieving data
from the system
- No formal adoption of one query language
- Difficult to choose a system as there are so many
on the market now
- Supports referential integrity, but only weakly
(as with XML in general)
- Potential problems with redundancy
11Commercial and Open Source NXDs
- Key native XML databases
- eXist
- Xindice
- Tamino XML Database
- Berkeley DB
- db/XML
- GoXML DB Native XML database
- Ipedo XML Database
- X-Hive/DB
- Michigan's Digital Library Extension Service
12The Future of NXD Technology
- "Whether the future of databases is the
traditional, relational and SQL model with XML
technologies incorporated into it or a new
XML-based model is a matter of debate" - Krill
13Future, cont.: Possibilities
- XML becomes dominant means of information
interchange
- Modifications to relational database front-ends
for XML
- Research into how to best index XML
- Defragmenting update abilities
- Improved support for live remote data
- Changes or additions to Query languages
- Self-managing database technology
- Query processing changes
- Development of standards through standards bodies
AND consortia
- Emergence of W3C XML Schema as the schema
language of choice for NXDs?