Title: The ORASS Approach for Designing
1The ORA-SS Approach for Designing Semistructured
Databases
- Xiaoying Wu, Tok Wang Ling, Mong Li Lee
- National University of Singapore
- Gillian Dobbie
- University of Auckland, New Zealand
2Outline
- Motivation
- Introduction to ORA-SS (Object-Relationship-
- Attribute ) Model
- From ORA-SS to XML DTD
- Normal form for ORA-SS schema diagram
- Designing ORA-SS schema diagram into normal
form - Comparison with related proposals
- Summary
31. Motivation
- Example 1.1 Redundancy in XML document
ltdepartmentgt ltnamegtcslt/namegt
ltprofessorgt ltstaffnumbergt12lt/staffnumber
gt ltnamegtSmithlt/namegt
ltcoursegt ltcoursecodegt230lt/coursecod
egt lttitlegtDatabaselt/titlegt
lt/coursegt lt/professorgt ltprofessorgt
ltstaffnumbergt22lt/staffnumbergt
ltnamegtJoneslt/namegt ltcoursegt
ltcoursecodegt230lt/coursecodegt
lttitlegtDatabaselt/titlegt lt/coursegt
lt/professorgt lt/departmentgt
41. Motivation (Cont.)
51. Motivation (Cont.)
- Example 1.1 (Cont.)
- Corresponding ORA-SS instance diagram and schema
diagram
61. Motivation (Cont.)
A better Designed ORA-SS schema diagram
71. Motivation (Cont.)
A better Designed ORA-SS instance schema diagram
81. Motivation (Cont.)
- Example 1.2Ambiguity in OEM database and its
DataGgide
91. Motivation (Cont.)
- Example 1.2(Cont.) Ternary Relationship Type
Representation
101. Motivation (Cont.)
- Example 1.2 (Cont.)Binary Relationship Type
Representation
Note the DataGuide for the schema diagram is the
same as for the previous schema!
112. Introduction to ORA-SS Model
- Four concepts
- object classes
- relationship types
- attributes
- references
- Four Diagrams
- schema diagram
- instance diagram
- functional dependency diagram
- inheritance diagram
122. Introduction to ORA-SS Model(Cont.)
- Object Class
- attributes of object class
- Single valued
- Multi-valued
- ordering on object class
Object class employee with attributes in an
ORA-SS schema diagram
132. Introduction to ORA-SS Model(Cont.)
- Relationship Type
- attributes of relationship type
- Single valued
- Multi-valued
- degree of n-ary relationship type
- participation constraints of objects in
relationship type - disjunctive relationship type
- recursive relationship type
142. Introduction to ORA-SS Model(Cont.)
Representing binary relationship type
152. Introduction to ORA-SS Model(Cont.)
Representing ternary relationship type
162. Introduction to ORA-SS Model(Cont.)
- Attributes
- key attribute and identifier
- composite attribute
- disjunctive attribute
- attribute with unknown structure (ANY)
- ordering on attribute
- Attributes of object class/relationship type
- Single-valued / multi-valued attribute
- fixed and default values of attribute
- derived attribute
172. Introduction to ORA-SS Model(Cont.)
Object classes with relationship type and
attributes in an ORA-SS schema diagram
182. Introduction to ORA-SS Model(Cont.)
Disjunctive attribute and relationship in
an ORA-SS schema diagram
192. Introduction to ORA-SS Model(Cont.)
Referencing an object class in an ORA-SS schema
diagram
202. Introduction to ORA-SS Model(Cont.)
- Recursive relationship type in an ORA-SS schema
diagram
Symmetric relationship sets in an ORA-SS schema
diagram
213. Mapping ORA-SS schema diagram to XML DTD
- Algorithm 1 Mapping ORA-SS Schema Diagram to XML
DTD - input an ORA-SS schema diagram SD
- output an XML DTD
- Begin
- For each object class O in SD do
- Step 1. sub-object classes of O
lt!ELEMENT O (subelementsList)gt. - Step 2. For each attribute A of O
- Case (1)A is a single valued simple attribute
lt!ATTLIST O A typegt - Case (2)A is a single valued composite
attribute, replace A with its - components and add them to
lt!ATTLIST O attributeName typegt - Case (3)A is a multivalued simple attribute
lt!ELEMENTA (PCDATA)gt. - Case (4)A is a multivalued composite attribute
lt!ELEMENTA (EMPTY)gt, - As components lt!ATTLIST
A componentName type gt
223. Mapping ORA-SS schema diagram to XML DTD
(Cont.)
- Algorithm 1 mapping ORA-SS schema diagram to XML
DTD (cont.)
Step 3. For each relationship attribute A under O
Case (1)A is a simple attribute
lt!ELEMENTA (PCDATA)gt add A to O s
subelementsList. Case (2)A is a multi-valued
simple attribute lt!ELEMENTA (PCDATA)gt
and add A to O s
subelementsList . Case (3)A is a single-valued
composite attribute lt!ELEMENTA
(PCDATA)gt. As components
lt!ATTLISTA componentName type gt. Case (4) A is
a multi-valued composite attribute
lt!ELEMENTA (PCDATA)gt. As
components lt!ATTLISTA componentName type
gt. add A to O s subelementsList. Step 4. For
each reference O-Ref Case (1) O is a child
object class of O1, and has no extra attributes
and child object classes
lt!ATTLIST O1 O-Ref IDREF(S)gt Case (2) O
is a root object class or it has nested
attributes or child object classes
lt!ATTLIST O O-Ref IDREF(S)gt
233. Mapping ORA-SS schema diagram to XML DTD
(Cont.)
Referencing an object class in an ORA-SS schema
diagram
243. Mapping ORA-SS schema diagram to XML DTD
(Cont.)
An XML DTD for the ORA-SS schema diagram
254. Normal form for ORA-SS schema diagram
- Observation ORA-SS is similar to nested
relations - tree-like structure
- repeating groups or multiple occurrences of
objects. - e.g. the corresponding nested relation for the
following ORA-SS schema diagram is - Dept (dept-name, course (code, title, student
(number, s-name, grade)))
264. Normal form for ORA-SS schema diagram(Cont.)
- Objectives To ensure the corresponding set of
nested relations of the ORA-SS schema diagram is
in normal form for set of nested relations
(NF-NR) 5,6 - We will define
- Object class normal form (O-NF)
- Relationship type normal form (R-NF)
- ORA-SS normal form schema (ORA-SS NF)
274. Normal form for ORA-SS schema diagram(Cont.)
- Defn object class normal form (O-NF)
- An object class O of an ORA-SS schema diagram
is said to be in object class normal form (O-NF),
if the nested relation constructed by Os single
valued attributes as its atomic attributes, Os
multivalued attributes as its repeating groups,
is in normal form NF-NR.
284. Normal form for ORA-SS schema diagram(Cont.)
- Example 4.1Assume we have following functional
dependencies S ?dept, dept?faculty for the
ORA-SS schema diagram
The corresponding nested relation for the schema
diagram is Staff(s,dept,faculty), it is not
in 3NF, since faculty is transitive dependent on
S , hence the relation is not in NF-NR.
A better Designed ORA-SS schema diagram
Transitive functional dependency is removed.
294. Normal form for ORA-SS schema diagram(Cont.)
- Defn relationship type normal form (R-NF)
- A relationship type R of an ORA-SS schema
diagram D is said to be in relationship type
normal form (R-NF), if the nested relation
constructed by the identifiers of the
participating object classes, and Rs atomic
attributes as its atomic attributes, Rs
multivalued attributes and composite attributes
as its repeating groups, is in normal form NF-NR.
304. Normal form for ORA-SS schema diagram(Cont.)
- Example 4.2The ORA-SS schema attempts to show
that the lecturer can teach all the courses using
all the textbooks as described on the curriculum,
i.e. it should satisfy a MVD constraints
course-code??isbn staff..
The nested relation for the relationship type ctl
is ctl(course-code,isbn,staff) It is not in
4NF, so is not in NF-NR, hence the relationship
type ctl is not in R-NF.
A better design MVD is removed
314. Normal form for ORA-SS schema diagram(Cont.)
- Defn ORA-SS normal form schema
- An ORA-SS schema diagram D is in normal form (NF)
iff it satisfies the - following conditions
- 1.Every object class in D is in O-NF.
- 2.For every relationship type R in D
- (a) R is in R-NF.
- (b) Case(1) R is a binary relationship type
from object class A to object class - B, then all the Bs attributes can stay
with B only if R is a one-to-many or - one-to-one binary relationship type from
A to B. All the attributes of R - (if any) should be attached to B.
- Case (2) R is a n-ary relationship type with n
(ngt2) participating object - classes O1,O2,,On, and the path going downward
from the top of D - linking those object classes is /O1/O2//On, then
for each object class - Oi (2?i?n),
- (i) Oi should have an i-ary relationship
Ri with its ancestors O1,O2,,Oi-1. - (ii) The attributes of Oi can stay with
Oi only if functional dependency - Oi ? O1,O2,,Oi-1 can be derived
from the functional dependency - diagram for D. The attributes of Ri
(if any) should be attached to Oi. - 3.There is no relationship type nested under
another many-to-many or many-to
324. Normal form for ORA-SS schema diagram(Cont.)
- Example 4.4 The ORA-SS schema diagram is not in
NF, if professor is also an employee in the
department the qualification of a professor can
be derived from that of employee, such
information will be repeated in the underlying
databases.
A ORA-SS schema diagram that not in NF
A ORA-SS schema diagram that in NF
335. Converting ORA-SS Schema Diagrams into Normal
Form
- Two Approaches for Designing Semistructured
Databases - Approach 1.
- based on the users requirements, come out an
initial ORA-SS schema diagram - normalize the ORA-SS schema diagram to its normal
form - map it to an XML DTD or XML Schema
- Approach 2.
- Extract schema from the instances using the
schema extracting techniques. - Translate the schema into ORA-SS schema diagram.
Here we need semantic enrichment, since not all
semantics needed are available from the extracted
schema. - Convert the ORA-SS schema diagram into its normal
form. - translate the NF ORA-SS schema diagram back to
XML DTD or XML Schema. - Restructuring the initial data instance to
conform to the generated XML DTD or XML Schema.
345. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Algorithm 2 Converting an ORA-SS schema diagram
into NF ORA-SS - schema diagram.
- Input an ORA-SS schema diagram SD, and its
functional dependency diagram. - Output a NF ORA-SS schema diagram.
-
- step 1. Convert any non O-NF object class to
O-NF. - step 2. Make each relationship type R in R-NF.
- step 3. This step involves two sub-steps.
- (1) Construct diagrams for each
object class with their attributes. - (2) Represent each relationship
type R. We make R satisfy the item (b) - of condition 2 as well as
condition 3 of the NF definition by introducing - referencing object classes,
and requiring each relationship type start - with an object class with
attributes (i.e., non-reference object class). - step 4. Remove those relationship types along
with their associated attributes - that can be derived from other
relationship types in the schema diagram - to satisfy condition 4 of NF
definition. -
355. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.1
- There is a many-to-many binary relationship pc
between professor and course, and a many-to-many
binary relationship ct between course and
textbook. - It is not in NF ORA-SS since it violates the
condition 3 of the NF definition.
.
(a) Initial ORA-SS schema diagram
365. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.1 (Cont.)
- Step 1. The three given object classes are
already in O-NF. - Step 2. The two relationship type pc and ct are
already in R-NF. - Step 3.
- (1) generate three diagrams for the object
classes with attributes.
(b) Fragment diagrams for object classes
375. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.1 (Cont.)
- Step 3.(Cont.)
- (2) represent the binary relationship
pc, by creating a reference - object class course1 referencing
course and nest course1 - under professor
(c) Diagrams after representing relationship pc
385. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.1 (Cont.)
- Step 3.(Cont.)
- (2) represent the binary
relationship ct, by creating a - reference object class
textbook1 referencing textbook - and nest textbook1 under
course.
(d) Final ORA-SS schema diagram that in NF
Step 4.(passed). The schema generated is in NF.
395. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.2.
- There is a binary relationship cs between
course and student and a ternary relationship cst
between course, student and tutor. The grade is
an attribute of the binary relationship cs, and
feedback is an attribute of the ternary
relationship cst. - It is not in NF ORA-SS since it violates the item
(ii) of case 2 in condition 2-(b) of NF
definition.
(a) Initial ORA-SS schema diagram
405. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.2(Cont.)
- Step 1. The three given object classes are
already in O-NF. - Step 2.The two relationship type cs and cst are
already in R-NF. - Step 3.
- (1) generate three diagrams for the object
classes with attributes.
(b) Fragment diagrams for object classes
415. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.2 (Cont.)
- Step 3.(Cont.)
- (2) represent the binary relationship cs.
we create a reference - object class student1 referencing
student and nest student1 - under course. Relationship
attribute grade is attached to - student1.
(c) Diagram representing binary relationship cs
425. Converting ORA-SS Schema Diagrams into Normal
Form(Cont.)
- Example 5.2 (Cont.)
- Step 3.(Cont.)
- (2) represent the relationship cst. we
create a reference object class - tutor1 referencing tutor, and nest
tutor1 under student1. Relationship - attribute feedback is attached to
tutor1.
(d) Final ORA-SS schema diagram that in NF
Step 4.(passed). The schema generated is in NF.
436. Comparison with Related Proposal
- The first attempt to define normal form for
semistructured data4 - Defines a schema called S3-Graph, a labeled graph
in which vertices correspond to objects and edges
represent the object-subobject relationship. Its
data instance is called semistructured data
graph. - S3-Graph cannot show the degree of a n-ary
relationship type, neither can it distinguish
between attributes of object classes and
attributes of relationships types.
446. Comparison with Related Proposal(Cont.)
- The first attempt to define normal form for
semistructured data4 (Cont.) - Defined a dependency constraint SS-dependency.
- Proposes S3-NF. An S3-Graph is in S3-NF if there
is no transitive SS-dependency. Hence, only this
kind of redundancy can be recognized by S3-NF
456. Comparison with Related Proposal(Cont.)
- The first attempt to define normal form for
semistructured data4 (Cont.) - Presents two approaches to design S3-NF databases
- The decomposition method can remove identified
transitive SS-dependency and achieve S3-NF, while
may not able to remove the partial functional
dependency inside an entity type or object
classes, as well as the redundancy result from
over-nesting. - The transformation of a normal form ER diagram
into an S3-Graph. The result may not be unique
but is dependent on the path constructed. Hence
some results may not satisfy the application
requirements and comply with the users
viewpoints.
466. Comparison with Related Proposal(Cont.)
- The most recent proposal XNF (XML Normal
Form)2 - It mainly provides algorithms to translate a
schema, represented in a conceptual model called
CM hypergraph to a scheme-tree forest in XNF. - CM hypergraph has no concept of attribute (so too
many objects) and no hierarchical structure. - The given algorithms are non-deterministic, and
suffers from efficiency. - Adding new required information requires redesign
schema. - The algorithms generate a large no of solutions
rather than verifying whether a SS schema is in
normal form or not. - ISA hierarchies are removed from CM hypergraph
before input to the algorithms.
476. Comparison with Related Proposal(Cont.)
- The advantages of our proposal
- 2-level design incremental and iterative
- First, identify or figure out object classes,and
relationship types from user requirements. - Then add attributes for object classes and
relationship types. - In contrast, XNF requires all the
needed - information to be presented at once.
Even a - small change in information
requirements - requires redesign the whole schema.
486. Comparison with Related Proposal(Cont.)
- The advantages of our proposal (Cont.)
- Preserve the hierarchical structure satisfying
users requirements. - In contrast, since CM graph has no
hierarchy, - XNF needs to generate many solutions.
The - approach fails when user already has a
- hierarchical structure, and wants to
preserve - it and verifies the design is good or
not.
497. Summary
- ORA-SS model helps to detect redundancy in
semistructured data. - We need a normal form for ORA-SS, since ORA-SS
schema diagrams may contain redundancies and
suffers from considerable updating anomalies. - We define a normal form ORA-SS schema diagram. It
ensures - no unnecessary redundancy and
- no updating anomalies for semistructured
databases generated from the schema . - We present an algorithm for mapping ORA-SS schema
diagram into XML DTD/Schema
507. Summary (Cont.)
- We give a design methodology and present a
comprehensive algorithm for normalizing an ORA-SS
schema diagram into its normal form. The steps
presented can also be used as guidelines for
designing semistructured databases using the
ORA-SS model - As ORA-SS distinguished objects Vs. attributes,
the design complexity is reduced. - ORA-SS allows 2 levels of design first object
classes and relationship type then add in
attributes. - We show that ORA-SS design approach outperform
other related proposals.
51References
- G.Dobbie, X.Y.Wu, T.W.Ling and M.L.Lee. ORA-SS
An Object-Relationship-Attribute Model for
Semistructured Data. Technical Report TR21/00,
School of Computing, National University of
Singapore, 2000. - D.W.Embley and W.Y.Mok. Developing XML Documents
with Guaranteed Good Properties. ER 2001. - R. Goldman and J. Widom. DataGuides Enabling
Query Formulation and Optimization in
Semistructured Databases. Proceedings of the
Twenty-Third International Conference on Very
Large Data Bases, pages 436-445, Athens, Greece,
August 1997. - S. Y. Lee, M. L. Lee, T. W. Ling and L. A..
Kalinichenko. Designing Good Semi-structured
Databases. ER 1999 131-145 - T.W. Ling. A Normal Form for Entity-Relationship
Diagrams. Proc. 4th International Conference on
Entity-Relationship Approach (1985) - T. W. Ling. A normal form for sets of
not-necessarily normalized relations. In
Proceedings of the 22nd Hawaii International
Conference on System Sciences, pp. 578-586.
United States IEEE Computer Society Press, 1989. - X.Y.Wu, T.W. Ling, M.L.Lee, G.Dobbie. Designing
Semistructured Databases Using ORA-SS Model, in
Proceedings of the 2nd International Conference
on Web Information Systems Engineering (WISE),
IEEE Computer Society Kyoto, Japan, December 2001.