Title: Wenyue Du, Mong Li Lee, Tok Wang Ling
1XML Structures for Relational Data
- Wenyue Du, Mong Li Lee, Tok Wang Ling
- Department of Computer Science
- School of Computing
- National University of Singapore
- duwenyue, leeml, lingtw_at_comp.nus.edu.sg
2Contents
- Introduction
- Motivation
- Related Works
- Our Approach
- Background
- XML
- XML DTD
- Semantic Enrichment
- Proposed Relational to XML Translation
- Comparison
- Conclusion
31. Introduction
- Outline
- Motivation
- Related Works
- Our Approach
4Motivation
Introduction
- XML is emerging as a standard for information
publishing on the World Wide Web. However, the
underlying data is often stored in traditional
relational databases. Some mechanism is needed to
translate the relational data into XML data.
5Motivation (cont.)
Introduction
-
- Generates XML structures that are able to
describe the semantics and structures in
underlying relational databases. - Obtains properly structured XML data without
unnecessary redundancies and proliferation of
disconnected XML elements.
6Related Works
Introduction
- 1, 5, 6 basically focus on single relation
translation. In order to handle a set of related
relations, the relations are first denormalized
to one single relation. - The flat XML structure does not provide a good
way to show the structure of data. - It causes a lot of redundancies.
lt!ELEMENT Results(Employee)gt lt!ELEMENT Employee
(EMPTY)gt lt!ATTLIST Employee E
CDATA REQUIRED Ename CDATA
IMPLIED JoinDate CDATA IMPLIED
D CDATA REQUIRED DNAME
CDATA IMPLIED gt
Relations Dept(D, Dname) Employee (E,
Ename, JoinDate, D)
Maps to
7Related Works (cont.)
Introduction
- 7 developed a method to generate a hierarchical
DTD for XML data from a relational schema. - It lacks of semantic enrichment. So it cannot
handle more complex situations.
Relations Dept (D, Dname) Employee (E,
Ename, JoinDate, D)
lt!ELEMENT Results(Employee)gt lt!ELEMENT Employee
(Dept)gt lt!ATTLIST Employee E
ID REQUIRED Ename CDATA
IMPLIED JoinDate CDATA
IMPLIEDgt lt!ELEMENT Dept (EMPTY)gt lt!ATTLIST
Dept gt
Maps to
Is it an attribute of object or relationship?
8Our Approach
Introduction
- XML structures for relational data can be
obtained by the following steps
92. Background
- Outline
- XML
- XML Schema
- Semantic Enrichment
10XML
Background / XML
- Basic constructs of XML
- Element
- Attribute
- Reference (link)
- a relationship between resources (e.g.
elements). It is specified by attaching specific
attributes or sub-elements.
11XML DTD
Background / XML DTD
A Document Type Definition (DTD) describes
structure on an XML document.
ltRESULTSgt ltCUSTOMER CIDC980054Z"gt
ltCNAMEgtJ. Tanlt/CNAMEgt ltAGEgt36lt/AGEgt
lt/CUSTOMERgt lt/RESULTSgt
lt!ELEMENT RESULTS (CUSTOMER)gt lt!ELEMENT CUSTOMER
(CNAME, AGE)gt lt!ATTLIST CUSTOMER CID
ID REQUIREDgt lt!ELEMENT
CNAME (PCDATA)gt lt!ELEMENT AGE (PCDATA)gt
XML document
Corresponding DTD
12Semantic Enrichment
Background / Semantic Enrichment
- Semantic enrichment is a process that upgrades
the semantics of databases, in order to
explicitly express semantics that is implicit in
the data.
Such as various relationship types, cardinality
constraints, etc.
13Extra information needed
Background / Semantic Enrichment
- Functional Dependencies (FDs) and keys
- Inclusion dependencies (INDs)
- e.g. STUDENT (S, SNAME)
- HOBBIES(S, HOBBY)
- HOBBIESS ? STUDENTS
- Semantic dependencies (SDs) (T.W. Ling M.L.
Lee, 1995)
14Semantic Dependencies
Background / Semantic Enrichment
- EMPLOYEE(E, ENAME, JOINDATE, D)
-
- JOINDATE is functionally dependent on only E
- Assuming JOINDATE refers to the date on which an
employee assumes duty with the department. We say
that - JOINDATE is semantically dependent on E, D
15Semantic Enrichment using SD together with FD and
IND
Background / Semantic Enrichment
- To obtain
- Object relations and object attributes that
represent regular and weak entity types, and
their properties. - Relationship relations and relationship
attributes that represent various relationship
types such as binary, n-ary, recursive and ISA
(inheritance), and their properties. - Mix-type relations We need to split them into
object relations and relationship relations - Fragments of object relations or relationship
relations that represent multi-valued attributes
of entity types or relationship types. - Cardinality constraints
16An Original Relational Schema
Background / Semantic Enrichment
COURSE (CODE, TITLE) DEPT (D, DNAME) STUDENT
(S, SNAME) TUTORIAL (T, TUTORIALTITLE) HOBBIES(
S, HOBBY) STUDENTDEPT (S, D) C_S (CODE, S,
GRADE) ATTEND (CODE, T, S) COURSEMEETING
(CODE, S,MEETINGHISTORY)
17The Semantically Enriched Schema
Background / Semantic Enrichment
Object Relations COURSE (CODE, TITLE) DEPT (D,
DNAME) STUDENT (S, SNAME) TUTORIAL (T,
TUTORIALTITLE) Fragment of Object
Relations HOBBIES(S, HOBBY)
Relationship Relations STUDENTDEPT (S, D)
C_S (CODE, S, GRADE) ATTEND (CODE, T,
S) Fragment of Relationship Relations
COURSEMEETING (CODE, S,MEETINGHISTORY)
fragment of C_S
183. Proposed Relational to XML Translation
- Outline
- ORA-SS Model
- Relational Schema to ORA-SS Translation
- ORA-SS to XML Schema Translation
19ORA-SS Model
Proposed Relational to XML Translation / ORA-SS
-
- ORA-SS (Object-Relationship-Attribute model for
Semi-Structured data) - G. Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee,
ORA-SS An Object-Relationship-Attribute Model
for Semi-structured Data, TR 21/00, National
Univ. of Singapore, 2001
20Concepts of ORA-SS (cont.)
Proposed Relational to XML Translation / ORA-SS
Object class
Binary relationship
Ternary relationship
Reference
Identifier
Relationship attribute
21Enriched Relational Schema to ORA-SS Schema
Translation
Enriched Relational Schema to ORA-SS Schema
Translation
- Objectives
- Identify object classes and their attributes from
object relations - Identify relationship types and their attributes
from relationship relations - Identify hierarchical structure
- Generate ORA-SS schema
22Overview of Translation Rules
Enriched Relational Schema to ORA-SS Schema
Translation
- Object relation rules to translate object
relations - Relationship relation rules to translate
relationship relations - Combination rule to be applied to the result
obtained from the application of object and
relationship relation rules, and generate the
final ORA-SS schema.
23Rule O1 Mapping object relations
Enriched Relational Schema to ORA-SS Schema
Translation /Object Relation Translation Rules
Maps to
Single-valued attribute
24Rule O2 Mapping fragment of object relations
Enriched Relational Schema to ORA-SS Schema
Translation /Object Relation Translation Rules
- STUDENT(S, SNAME)
- HOBBIES(S, HOBBY)
Maps to
Multivalued attribute
25Rule R1 Mapping 1-m/1-1 relationship relation
Enriched Relational Schema to ORA-SS Schema
Translation /Relationship Relation Translation
Rules
- Objectives
- Reduce disconnected elements
- Use parent-child structure
- Avoid unnecessary redundancies
- Use references
- Example
- ADVISOR(STAFF, POSITION) // object relation
- STUDENT(S, SNAME) // object relation
- STU_ADV(S, STAFF) //1-m relationship relation
26Rule R1 Mapping 1-m/1-1 relationship relation
(cont.)
Enriched Relational Schema to ORA-SS Schema
Translation /Relationship Relation Translation
Rules
- Case 1
- All the objects (instances) of STUDENT
participate in the relationship type
STU_ADV
ADVISOR
STU_ADV 2,0n,11
STU_ADV
Maps to
STUDENT
Use parent-child structure
27Rule R1 Mapping 1-m/1-1 relationship relation
(cont.)
Enriched Relational Schema to ORA-SS Schema
Translation /Relationship Relation Translation
Rules
- Case 2
- Not all the objects of STUDENT participate in
STU_ADV. - STUDENT is already as a child object and all the
objects of ADVISOR participate in STU_ADV .
or
STUDENT
STU_ADV 2,01,1n
STU_ADV
Maps to
ADVISOR
Use parent-child structure
28Rule R1 Mapping 1-m/1-1 relationship relation
(cont.)
Enriched Relational Schema to ORA-SS Schema
Translation /Relationship Relation Translation
Rules
- Case 3
- There exist objects of STUDENT and ADVISOR do
not participate in STU_ADV
STUDENT
ADVISOR
ADVISOR
STUDENT
Maps to
STU_ADV 2,,?
STU_ADV 2,,?
or
STU_ADV
A_Ref
S_Ref
ADVISOR1
STUDENT1
Use reference
29Rule R2 Mapping m-n binary relationship relation
Enriched Relational Schema to ORA-SS Schema
Translation /Relationship Relation Translation
Rules
Three ways to map
COURSE(CODE, TITLE) C_S(S, CODE, GRADE) STUDENT
(S, SNAME)
Preferred Mapping
30Other relationship relation rules
Enriched Relational Schema to ORA-SS Schema
Translation /Relationship Relation Translation
Rules
- Fragment of relationship relation is translated
similarly to the translation of the fragment of
object relation. - N-ary relationship relation is translated using
reference structures. The level of each
referencing object may be determined by the
aggregations. - If B ISA A, then B is mapped to a child object
class (OB) of OA.
31Combination Rule
Enriched Relational Schema to ORA-SS Schema
Translation /Combination Rule
- to be applied to the result obtained from the
application of object and relationship relation
rules, and generate the final ORA-SS schema.
Example PERSON(SSNO, RACE) //object
relation STUDENT(S, SSNO, MAJOR) //object
relation DEPT(D, DNAME) //object relation
STU_DEPT(S, D) //relationship relation
STUDENT ISA PERSON and one DEPT has many
STUDENT. In this case, STUDENT potentially has
multiple parents (i.e., DEPT and PERSON).
32Combination Rule
Enriched Relational Schema to ORA-SS Schema
Translation /Combination Rule
- Current solution
- Use references (K. Williams, et al. January 2001)
- -- It causes too many disconnected elements.
lt!ELEMENT Results (PERSON, STUDENTS
DEPT)gt lt!ELEMENT PERSON (EMPTY)gt lt!ATTLIST
PERSON SSNO ID REQUIRED
RACE CDATA IMPLIED STU_REF1
IDREF REQUIREDgt lt!ELEMENT STUDENT (EMPTY)gt
lt!ATTLIST STUDENT S ID
REQUIRED MAJOR CDATA IMPLIED
gt lt!ELEMENT DEPT (EMPTY)gt lt!ATTLIST
DEPT D ID REQUIRED
DNAME CDATA IMPLIED STU_REF2 IDREFS
REQUIREDgt
33Combination Rule (cont.)
Enriched Relational Schema to ORA-SS Schema
Translation /Combination Rule
- Our approach
- Translations are produced sequentially according
to their priorities. - The translation with the lowest priority will be
carried out last.
- The priorities of translations (in descending
order) - ISA, etc. semantic relationship relations and
their fragments // high semantic
cohesion among these participating object classes - 1-1 and 1-m relationship relation and their
fragments //
potentially represented as hierarchy (p-c)
structure - m-1 relationship relations and their fragments
//
potentially represented as hierarchy structure
preferably view as 1-m - m-n, n-ary relationship relations and their
fragments - This rule is used to avoid or reduce potential
multiple parents.
34Combination Rule (cont.)
Enriched Relational Schema to ORA-SS Schema
Translation /Combination Rule
We map STUDENT to the child object class of
PERSON first. Then map DEPT according to 1-m
relationship relation rule. Thus, we may get the
following result.
S ID REQUIRED
MAJOR CDATA IMPLIED gt lt!ELEMENT DEPT
(EMPTY)gt lt!ATTLIST DEPT D ID
REQUIRED DNAME CDATA
IMPLIED D_S_REF IDREFS REQUIREDgt
lt!ELEMENT OurSolution (PERSON, DEPT)gt lt!ELEMENT
PERSON (STUDENT)gt lt!ATTLIST PERSON SSNO
ID REQUIRED RACE
CDATA IMPLIED gt lt!ELEMENT STUDENT (EMPTY)gt
lt!ATTLIST STUDENT
35A possible ORA-SS Schema diagram derived from
university database
Enriched Relational Schema to ORA-SS Schema
Translation
Object Relations COURSE (CODE, TITLE) DEPT (D,
DNAME) STUDENT (S, SNAME) TUTORIAL (T,
TUTORIALTITLE) Fragment of Object
Relations HOBBIES(S, HOBBY)
Relationship Relations STUDENTDEPT (S, D)
C_S (CODE, S, GRADE) ATTEND (CODE, T,
S) Fragment of Relationship Relations
COURSEMEETING (CODE, S,MEETINGHISTORY)
fragment of C_S
36Input an ORA-SS schema diagram SDOutput an XML
DTDBegin Start from the top of SD and proceed
downward, for each object class O encountered
doStep 1. Sub-object classes of O
lt!ELEMENT O (subelementsList)gtStep 2. For each
attribute A of O Case (1) A is a
single valued simple attribute
lt!ATTLIST O A typegt Case (2) A is a
single valued composite attribute, replace A with
its components and add
to lt!ATTLIST O attributename typegt
Case (3) A is a multivalued simple attribute
lt!ELEMENT A(PCDATA)gt Case
(4) A is a multivalued composite attribute
lt!ELEMENT A(EMPTY)gt
As components lt!ATTLIST A
componentName typegtStep 3. For each relationship
attribute A under O, add A to subelementsList in
lt!ELEMENT O(subelementsList)gt.
Case (1) A is a simple attribute
lt!ELEMENT A(PCDATA)gt. Case (2) A is
a composite attribute lt!ELEMENT
A(EMPTY)gt, As
components lt!ATTLIST A componentName
typegt
Algorithm Mapping ORA-SS Schema Diagram
to XML DTD
37lt!ELEMENT UNIVERSITY (COURSE, STUDENT,
DEPT,
TUTORIAL)gtlt!ELEMENT COURSE (STUDENT1)gt
lt!ATTLIST COURSE CODE ID
REQUIRED TITLE CDATA
IMPLIEDgt lt!ELEMENT STUDENT1
(MEETINGHIS,TUTORIAL1)gt lt!ATTLIST
STUDENT1 C_S_REF IDREF REQUIRED
GRADE CDATA IMPLIEDgt lt!ELEMENT
MEETINGHIS (PCDATA)gt lt!ELEMENT TUTORIAL1
(EMPTY)gt lt!ATTLIST TUTORIAL1
T_REF IDREF REQUIREDgtlt!ELEMENT STUDENT
(HOBBIES)gt
Algorithm Mapping ORA-SS Schema Diagram to XML
DTD
The obtained XML structures (DTD)
lt!ATTLIST STUDENT S ID
REQUIRED SNAME CDATA IMPLIEDgt
lt!ELEMENT HOBBIES (PCDATA)gt lt!ELEMENT DEPT
(STUDENT2)gt lt!ATTLIST DEPT D
ID REQUIRED DNAME CDATA
IMPLIEDgt lt!ELEMENT STUDENT2 (EMPTY)gt
lt!ATTLIST STUDENT2 D_S_REF IDREF
IMPLIEDgtlt!ELEMENT TUTORIAL(EMPTY)gt lt!ATTLIST
TUTORIAL T ID
REQUIRED TUTORIAL_TITLE
CDATA IMPLIEDgt
384. Comparison
Rich structured and represents the real world accurately Yes ( ) 7, This paper
Rich structured and represents the real world accurately Partially 3
Rich structured and represents the real world accurately No 1, 5, 6
The representation of various relationship types and their attributes Yes ( ) This paper
The representation of various relationship types and their attributes Partially 7
The representation of various relationship types and their attributes No 1, 3, 5, 6
Number of disconnected elements Few ( ) 7, This paper
Number of disconnected elements Many Naïve approaches
Unnecessary redundancies Avoidable ( ) This paper
Unnecessary redundancies Partially 3, 7
Unnecessary redundancies Many 1, 5, 6
395 Conclusion
- Method proposed in this paper achieves
- Generation of semantically sound XML structures
for relational data possible - Generation of properly structured XML data
without unnecessary redundancies and
proliferation of disconnected XML elements
possible
40References
1 S. Banerjee, et al Oracle 8i The XML
Enabled Data Management System, Proc. 16th
Intl Conf. on Data Engineering, 2000 2 G.
Dobbie, X.Y. Wu, T.W. Ling, M.L. Lee, ORA-SS An
Object- Relationship- ttribute Model for
Semi-structured Data, TR 21/00, NUS, 2001 3
D.W. Lee, M. Mani, F. Chiu, W.W Chu,
Nesting-based Relational-to-XML Schema
Translation, Proc, 4th Intl Workshop on Web and
Databases, 2001 4 T.W. Ling, M.L. Lee,
Relational to Entity-Relationship Schema
Translation Using Semantic and Inclusion
Dependencies, In Journal of Integrated
Computer-Aided Engineering, pages 125-145,
1995 5 SYBASE, Using XML with the Sybase
Adaptive Server SQL Databases, A Technical
Whitepaper, http//www.sybase.com,2000 6 V.
Turau, Making Legacy Data Accessible for XML
Applications, http//www.informatik.fh-wies
baden.de/turau/veroeff.html1999 7 K. Williams,
et al., XML Structures for Existing Databases,
http//www- 106.ibm.com/developerworks/librar
y/x-struct/ January 2001 8 W.Y. Du, M.L. Lee,
T.W. Ling, XML Structures for Relational Data,
Proc. 2nd Intl Conf. on Web Information
Systems Engineering (WISE) , IEEE Computer
Society, 2001