XML to Relational Mapping

Provided by: roseCs

Transcript and Presenter's Notes

1
XML to Relational Mapping
2
XML-to-Relational Mapping
XML Translation Layer
Tuples
Relational Database System
3
Where/How to Store XML Data?
  • File system
  • OODBMS
  • Semistructured DBMS: Lore, etc.
  • XML DBMS: eXcelon, Tamino, etc.
  • RDBMS/ORDBMS
    • CLOB (Character Large Object) column
    • Decomposing (shredding) into tuples
    • XML Type

4
Approaches to XML Data Storage Using an RDBMS
  • Predefined set of table schemas
  • Automatic generation of table schemas for a given
    DTD
  • Data mining approach

5
Predefined Schema
  • simple ad-hoc schemes
  • requires no input by the user or by the system
    administrator
  • works in the absence of type information (e.g.,
    a DTD)
  • does not involve any analysis of the XML data
  • many variations possible
    • Edge-inlining
    • XRel
    • XParent

6
Predefined Schema [Flor99]
7
XML Data Model
  • Ordered labeled graph
  • Each XML element is represented by a node in the
    graph; the node is labeled with the oid of the
    XML object.
  • Element-subelement relationships are represented
    by edges in the graph and labeled by the name of
    the subelement.
  • In order to represent the order of subelements of
    an XML object, the outgoing edges of a node in
    the graph are also ordered.
  • Values (e.g., strings) of an XML document are
    represented as leaves in the graph.

8
Example XML Data
9
XML Data Model
10
Mapping XML Data into Relational Table
  • Mapping Edges
    • Edge Approach
    • Binary Approach
    • Universal Table Approach
  • Mapping Values
    • Separate Value Tables
    • Inlining
  • 3 x 2 = 6 possible mapping schemes

11
Edge Approach
  • Store all edges of the XML document in a single
    table: the Edge Table

Edge (source, ordinal, name, flag, target)
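
A minimal sketch of how a document might be shredded into Edge tuples. The flag values ("ref" for an edge to another object, "string" for a value), the virtual source 0 for the document root, and the sample document are assumptions made for illustration; they are not taken from the slides.

```python
import xml.etree.ElementTree as ET
from itertools import count


def shred_to_edges(xml_text):
    """Return Edge tuples (source, ordinal, name, flag, target)."""
    root = ET.fromstring(xml_text)
    next_oid = count(1)
    edges = []

    def visit(elem, oid):
        ordinal = 0
        # Assumption: attributes are stored as value edges of the element.
        for attr_name, attr_value in elem.attrib.items():
            ordinal += 1
            edges.append((oid, ordinal, attr_name, "string", attr_value))
        for child in elem:
            ordinal += 1
            if len(child) == 0 and not child.attrib:
                # Leaf element: its text becomes the target value.
                value = (child.text or "").strip()
                edges.append((oid, ordinal, child.tag, "string", value))
            else:
                # Inner element: gets its own oid and is visited recursively.
                # Text mixed between its children is ignored in this sketch.
                child_oid = next(next_oid)
                edges.append((oid, ordinal, child.tag, "ref", child_oid))
                visit(child, child_oid)

    root_oid = next(next_oid)
    # Assumption: a virtual source 0 points at the document root.
    edges.append((0, 1, root.tag, "ref", root_oid))
    visit(root, root_oid)
    return edges


# Hypothetical sample document.
example = "<person id='p1'><name>John</name><address><city>Seoul</city></address></person>"
edges = shred_to_edges(example)
for edge in edges:
    print(edge)
```

The ordinal column preserves the order of the outgoing edges of each node, which is what the ordered graph model requires.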
12
Edge Approach (Contd)
  • The index on the source column is useful for
    forward traversals needed to reconstruct a
    specific object given its oid.
  • The index on (name, target) is useful for
    backward traversals
  • e.g., find all objects that have a child
    named John.
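
The two indexes can be tried out with SQLite. This is a sketch only: the index names, the sample rows, and the choice to store every target as text are assumptions, not part of the original scheme.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Edge (source INTEGER, ordinal INTEGER, "
    "name TEXT, flag TEXT, target TEXT)"
)
# Index for forward traversal: all edges leaving a given object.
conn.execute("CREATE INDEX edge_source ON Edge (source)")
# Index for backward traversal: who has an edge with this name and target?
conn.execute("CREATE INDEX edge_name_target ON Edge (name, target)")

# Hypothetical sample rows.
rows = [
    (1, 1, "name", "string", "John"),
    (1, 2, "child", "ref", "2"),
    (2, 1, "name", "string", "Mary"),
    (3, 1, "name", "string", "John"),
]
conn.executemany("INSERT INTO Edge VALUES (?, ?, ?, ?, ?)", rows)

# Forward traversal: reconstruct object 1 from its outgoing edges, in order.
forward = conn.execute(
    "SELECT name, target FROM Edge WHERE source = ? ORDER BY ordinal", (1,)
).fetchall()

# Backward traversal: find objects with a 'name' edge whose value is 'John'.
backward = conn.execute(
    "SELECT source FROM Edge WHERE name = ? AND target = ? ORDER BY source",
    ("name", "John"),
).fetchall()

print(forward)
print(backward)
```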

13
Binary Approach
  • Group all edges with the same label into one
    table
  • This approach corresponds to a horizontal
    partitioning of the Edge Table, using name as the
    partitioning attribute.
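
As a sketch, the partitioning can be expressed as grouping Edge tuples by their name column. The function name and the sample edges are hypothetical.

```python
from collections import defaultdict


def partition_by_name(edges):
    """Split Edge tuples into one binary table per edge label."""
    binary_tables = defaultdict(list)
    for source, ordinal, name, flag, target in edges:
        # The name column is dropped: it is implied by the table it lands in.
        binary_tables[name].append((source, ordinal, flag, target))
    return dict(binary_tables)


# Hypothetical sample edges.
sample_edges = [
    (1, 1, "name", "string", "John"),
    (1, 2, "hobby", "string", "chess"),
    (2, 1, "name", "string", "Mary"),
]
tables = partition_by_name(sample_edges)
print(tables)
```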

14
Universal Table Approach
  • Generate a single Universal Table to store all
    the edges.
  • The Universal Table corresponds to the result of
    a full outer join of all Binary Tables.
  • The Universal Table has many fields which are set
    to NULL.
  • This Universal Table is denormalized.
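
A small sketch of the full outer join on source, written in plain Python rather than SQL. The binary tables use hypothetical rows of the form (source, ordinal, flag, target); the function name is also an assumption.

```python
from itertools import product


def universal_table(binary_tables):
    """Full outer join of all binary tables on their source column."""
    names = sorted(binary_tables)
    sources = sorted({row[0] for rows in binary_tables.values() for row in rows})
    table = []
    for source in sources:
        per_label = []
        for name in names:
            matches = [row[1:] for row in binary_tables[name] if row[0] == source]
            # No edge with this label: pad the label's columns with NULLs.
            per_label.append(matches or [(None, None, None)])
        # Several edges with the same label multiply the number of rows.
        for combo in product(*per_label):
            row = [source]
            for cols in combo:
                row.extend(cols)
            table.append(tuple(row))
    return names, table


# Hypothetical sample binary tables.
binary = {
    "name": [(1, 1, "string", "John"), (2, 1, "string", "Mary")],
    "hobby": [(1, 2, "string", "chess"), (1, 3, "string", "golf")],
}
names, universal = universal_table(binary)
for row in universal:
    print(row)
```

In this sample, object 2 has no hobby, so its hobby columns are NULL, and John's name is repeated once per hobby, which illustrates both the NULL fields and the denormalization.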

15
Separate Value Tables
  • Store values in the separate Value Tables.
  • A separate Value Table for each conceivable data
    type
  • e.g., integer, date, ref, string, etc.

16
Inlining
  • Store values and attributes in the same table
  • e.g., binary inlining approach

17
Modified Edge-Inlining Approach
  • XML Data Model: ordered, labelled tree
  • Node id (i.e., source) assigned in preorder
    traversal of XML tree
  • 1 Table
    • Node id (source)
    • Tag name (name)
    • Text (PCDATA)
    • Parent Node id (tagging)
    • Path (optional)
    • Type (element or attribute)
    • Level
    • Doc id
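
A sketch of how the single table could be filled by a preorder traversal. The column order, the "/a/b/@attr" path notation, and the choice to give attributes their own node ids are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def shred_preorder(xml_text, doc_id):
    """Return rows (doc_id, node_id, name, text, parent_id, path, type, level)."""
    root = ET.fromstring(xml_text)
    rows = []
    last_id = [0]  # mutable counter shared with the nested function

    def visit(elem, parent_id, parent_path, level):
        # Ids are handed out on first visit, i.e. in preorder.
        last_id[0] += 1
        node_id = last_id[0]
        path = f"{parent_path}/{elem.tag}"
        text = (elem.text or "").strip() or None
        rows.append((doc_id, node_id, elem.tag, text, parent_id, path, "element", level))
        # Assumption: attributes are stored as child rows of their element.
        for attr_name, attr_value in elem.attrib.items():
            last_id[0] += 1
            rows.append((doc_id, last_id[0], attr_name, attr_value, node_id,
                         f"{path}/@{attr_name}", "attribute", level + 1))
        for child in elem:
            visit(child, node_id, path, level + 1)

    visit(root, None, "", 1)
    return rows


# Hypothetical sample document and document id.
sample = "<author id='a1'><name><lastname>Kim</lastname></name></author>"
rows = shred_preorder(sample, doc_id=7)
for row in rows:
    print(row)
```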

18
Automatic Schema Generation out of DTD [Shan99]
  • Assumption
    • XML document conforms to a schema (DTD)
  • Transformation Approach
  • Three Techniques
    • Basic, Shared, Hybrid Inlining
  • DTD → DTD Graph → Element Graph → Table Schema

19
DTD to Relational Schema
  • Naïve Approach
  • Each Element => Relation
  • Each Attribute of Element => Column of Relation
  • Connect elements using foreign keys
  • Fragmentation problem
  • Too many relations
  • Requires many joins in query processing

20
Naïve Approach Example
<!ELEMENT author (name, address)>
<!ATTLIST author id ID #REQUIRED>
<!ELEMENT name (firstname?, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT address ANY>

author (authorID integer, id string)
name (nameID integer, authorID integer)
firstname (firstnameID integer, nameID integer, value string)
lastname (lastnameID integer, nameID integer, value string)
address (addressID integer, authorID integer, value string)
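
The naïve mapping can be sketched as a small generator. The DTD is transcribed by hand into Python dictionaries (no DTD parser is used), and the dictionary layout and function name are assumptions; the sketch also assumes each element has at most one parent, which holds for this example.

```python
def naive_schema(elements, attributes):
    """Map every element to its own relation, linked to its parent by a foreign key."""
    parent_of = {
        child: parent
        for parent, spec in elements.items()
        for child in spec["children"]
    }
    schema = {}
    for name, spec in elements.items():
        columns = [(f"{name}ID", "integer")]  # surrogate key
        if name in parent_of:
            columns.append((f"{parent_of[name]}ID", "integer"))  # foreign key
        columns += [(attr, "string") for attr in attributes.get(name, [])]
        if spec["has_value"]:
            columns.append(("value", "string"))
        schema[name] = columns
    return schema


# Hand transcription of the author DTD; ANY content is stored as a string.
elements = {
    "author": {"children": ["name", "address"], "has_value": False},
    "name": {"children": ["firstname", "lastname"], "has_value": False},
    "firstname": {"children": [], "has_value": True},
    "lastname": {"children": [], "has_value": True},
    "address": {"children": [], "has_value": True},
}
attributes = {"author": ["id"]}

schema = naive_schema(elements, attributes)
for relation, columns in schema.items():
    print(relation, "(" + ", ".join(f"{c} {t}" for c, t in columns) + ")")
```

Five small relations result, so reassembling one complete author takes four joins, which is the fragmentation problem in miniature.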
21
Inlining
  • Inlining
    • solves fragmentation problem
    • Basic: inlining as many descendants of an element
      as possible into a single relation
    • Shared: inlining the nodes with an in-degree of 1
      (DTD graph)
    • Hybrid
      • Basic + Shared
      • inlining the nodes with an in-degree > 1 (DTD
        graph)
  • Issues
    • set-valued attribute
    • recursion
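
A rough sketch of the in-degree test behind shared inlining. It is deliberately simplified: it ignores the two issues listed above (set-valued elements and recursion). The DTD graph extends the author example with hypothetical book and editor elements so that name is shared by two parents.

```python
from collections import Counter


def in_degrees(dtd_graph):
    """Count incoming edges for every node of a DTD graph (parent -> children)."""
    degree = Counter({node: 0 for node in dtd_graph})
    for children in dtd_graph.values():
        for child in children:
            degree[child] += 1
    return dict(degree)


def shared_inlining_split(dtd_graph):
    """Simplified rule: nodes with in-degree 1 are inlined into their parent;
    roots (in-degree 0) and shared nodes (in-degree > 1) get their own relation."""
    degree = in_degrees(dtd_graph)
    relations = sorted(n for n, d in degree.items() if d != 1)
    inlined = sorted(n for n, d in degree.items() if d == 1)
    return relations, inlined


# Hypothetical DTD graph; 'book' and 'editor' are not from the slides' DTD.
dtd_graph = {
    "book": ["author", "editor"],
    "author": ["name", "address"],
    "editor": ["name"],
    "name": ["firstname", "lastname"],
    "firstname": [],
    "lastname": [],
    "address": [],
}
relations, inlined = shared_inlining_split(dtd_graph)
print("own relation:", relations)
print("inlined:", inlined)
```

Here name is reachable from both author and editor, so it keeps its own relation instead of being duplicated in two parents.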

22
DTD
23
DTD Graph
24
Element Graph for editor
25
Basic Inlining Technique
26
Shared Inlining Technique
27
Hybrid Inlining Technique
  • Same as Shared Inlining except
    • inlines some elements that are not inlined in
      Shared
    • inlines elements with in-degree greater than one
      that are not recursive or reached through a
      "*" node

28
Data Mining Approach
  • Data Pattern Analysis
    • Wang & Li's Semistructured Data Mining Algorithm
  • STORED (Semistructured To Relational Data)
    • a declarative language for description of
      XML-to-Relational Mapping
  • Overflow Mapping
    • definition for the objects not mapped to
      relational data
    • overflow graph
    • stored in the object repository of semistructured
      data
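
A toy illustration of the split between a relational mapping and an overflow store. This is not the mining algorithm named above and not the STORED language: it simply counts how often each label occurs, turns frequent labels into columns, and sends everything else to an overflow list. The function name, the support threshold, and the sample objects are all assumptions.

```python
from collections import Counter


def split_relational_overflow(objects, min_support):
    """Frequent labels become columns; rare label/value pairs go to overflow."""
    label_counts = Counter(label for obj in objects for label in obj)
    columns = sorted(
        label for label, n in label_counts.items()
        if n / len(objects) >= min_support
    )
    table, overflow = [], []
    for oid, obj in enumerate(objects, start=1):
        # Missing frequent labels show up as None (NULL) in the table.
        table.append((oid,) + tuple(obj.get(col) for col in columns))
        for label, value in obj.items():
            if label not in columns:
                overflow.append((oid, label, value))
    return columns, table, overflow


# Hypothetical semistructured objects.
objects = [
    {"name": "John", "phone": "555"},
    {"name": "Mary", "fax": "777"},
    {"name": "Ann", "phone": "123"},
]
columns, table, overflow = split_relational_overflow(objects, min_support=0.6)
print(columns)
print(table)
print(overflow)
```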

29
[Diagram: Data Pattern Analysis, with boxes labeled XML Data, Algo, STORED Mapping, and Overflow Mapping]