Title: Resolving Schematic Discrepancy in the Integration of Entity-Relationship Schemas
1Resolving Schematic Discrepancy in the
Integration of Entity-Relationship Schemas
- Qi He Tok Wang Ling
- Dept. of Computer Science
- School of Computing
- National Univ. of Singapore
2Outline
- Schema integration background
- Schematic discrepancy
- Representation of meta information in ER schemas
- Resolution of schematic discrepancy in schema
integration - Related work
- Conclusion
3Schema Integration
- In DB integration, produce an integrated view
which provides a unified access to heterogeneous
data in source schemas. - In DB design, produce a global schema of a
proposed DB by integrating user views in DB
design.
4Challenges in schema integration
- Many types of conflicts among different source
schemas need to be resolved in schema
integration - Naming conflicts
- Domain mismatch
- Structural conflicts
- Cardinality conflicts
- Local constraints vs global constraints
- (e.g. local vs global functional
dependencies) - Schematic discrepancy
-
5Schematic Discrepancy
- Schematic discrepancy occurs when a metadata in
one database corresponds to attribute values in
the other. - An example (next page)
- months and supplier numbers (i.e., S1, , Sn) are
modeled differently as attribute values or schema
labels (in general, metadata which will be
introduced later) in databases DB1, DB2, and DB3.
6Motivation Example
price is an attribute of the ternary relationship
type PMS
PM is a relationship type between product and
month
7Contexts of schema constructs
- Conceptual modeling is always done within a
particular context which is explicitly
represented as a set of meta attributes with
values (called metadata). - Meta attributes with values specify the
conditions satisfied by the instances of a schema
construct (i.e., entity type, relationship type,
or attribute).
8Ontology
- A representational vocabulary for a shared domain
of discourse which includes the definitions of
entity types, relationship types, and attributes. - We use an ontology to describe the meta
information of the ER schemas of the supply
example - Entity types product, supplier, month
- Attributes of entity types p, pname, s, month
- Relationship types
- PMS (a ternary supply relationship type
among product, month - and supplier)
- PM (a binary relationship type between
product and month) PM is a
projection of PMS. - Attributes of relationship types price (an
attribute of PMS)
9Example of Context
- In DB2, the entity type JAN_PROD is represented
as - JAN_PROD PM month jan
- where PM and month are resp. a relationship type
and an entity type from the ontology. - It means that JAN_PROD is derived from the
product-month binary relationship type (i.e. PM)
when the month value is jan. - month is a meta-attribute and jan the metadata of
JAN_PROD.
10Inheritance of Context
- Context could be specified at 4 levels of
- Databases
- Entity types
- Relationship types
- Attributes
- The context of a higher level schema construct
could be inherited by a lower level schema
construct. The inheritance hierarchy of contexts
is - relationship type ?
attribute of relationship type - database ? entity type
- attribute of entity type
11Example of context inheritance
- In DB2, the attribute S1_PRICE of the entity type
JAN_PROD is represented as - S1_PRICE price suppliers1, inherit ALL
- S1_PRICE inherits all, i.e. the context
monthjan, from the entity type JAN_PROD. - The representation means that each value of
S1_PRICE of the entity type JAN_PROD is a price
of a product supplied by supplier s1 in the month
of jan.
12Resolution of schematic discrepancy in the
integration of ER schemas
13- Basic Idea Remove the contexts of schema
constructs by transforming meta-attributes into
entity types. - Only meta-attributes causing schematic
discrepancy need to be transformed. - Schema transformation should keep the semantics
(information and constraints) of source schemas.
14- Resolve schematic discrepancy for entity types,
relationship types, attributes of entity types
and attributes of relationship types in order
(the order conforms to the hierarchical order of
context inheritance). - The context at database level is handled in the
entity types which inherit it.
15An example
- Transforming DB2 into DB1 in 2 steps
- Step 1 Resolve discrepancies for the entity
types JAN_PROD, , DEC_PROD - Step 1.1 Transform meta-attributes into entity
types - Step 1.2 Merge equivalent entity types,
relationship types and attributes - Step 2 resolve discrepancies for the attributes
S1_PRICE, , SN_PRICE
16PM is a relationship type between product and
month
- Step 1.1 Transform the meta-attribute month of
the entity type JAN_PROD (the other entity types
are similar) - Construct an entity type MONTH to model the meta
info - JAN_PROD becomes PROD after removing the context
- Construct a relationship type PM to relate PROD
and MONTH - Attributes S1_PRICE, , SN_PRICE are moved to PM,
as they inherit the context (i.e., the month) of
the entity type JAN_PROD.
17Step 1.2 Merge the equivalent entity types,
relationship types and attributes which refer to
the same ontology names. Note the domains of the
MONTH attributes are united.
18An example (cont.)
- Transforming DB2 into DB1 in 2 steps
- Step 1 Resolve discrepancies for the entity
types JAN_PROD, , DEC_PROD - Step 2 Resolve discrepancies for the attributes
S1_PRICE, , SN_PRICE - Step 2.1 Transform meta-attributes into entity
types. - Step 2.2 Merge equivalent entity types,
relationship types and attributes. - Step 2.3 Remove redundant relationship types.
19price is an attribute of the relationship type PMS
- Step 2.1 Transform the meta-attribute supplier
of the attribute S1_PRICE (the other attributes
are similar) - Construct an entity type SUPPLIER to model the
meta information. - Construct a relationship type PMS to relate PROD,
MONTH and SUPPLIER. - S1_PRICE becomes PRICE after removing the
context, and is moved to PMS.
20Step 2.2 Merge the equivalent entity types,
relationship types and attributes. The domains of
the S attributes are united.
21Step 2.3 Remove the redundant relationship type
PM that is a projection of PMS.
22Semantic preservation
- Our solution to schematic discrepancy preserves
the semantics of source schemas in schema
transformation - Information preservation. The instance of a
schema can be losslessly converted into the
instance of another schema, and conversely. - Constraint preservation. Cardinality constraints
of ER schemas can be preserved in schema
transformation, but in different forms in the
source and transformed schemas (an example is
given in the next page).
23Constraint Preservation (E.g.)
- Functional dependency (FD) is preserved in the
transformation from DB2 to DB1. - Suppose in each entity type JAN_PROD, , DEC_PROD
of DB2, the FD holds - P?? S1_PRICE, , SN_PRICE
- In DB1, the FD is preserved, but in a different
form - P, S, MONTH?? PRICE
- In 3, we gave inference rules to derive FDs in
schema transformation.
3 Qi He and Tok Wang Ling Extending and
inferring functional dependency in schema
transformation. CIKM, 2004.
24Related work
- The definition of context as a set of
meta-attributes with values is originally adopted
in 2, 9. - They defined context at the attribute level only.
- We consider contexts at the levels of database,
entity types and attributes, as well as the
inheritance of context.
2 C. H. Goh, S. Bressan, S. Madnick, and M.
Siegel Context interchange new features and
formalisms for the intelligent integration of
information. TOIS, 1999 9 E. Sciore, M. Siegel,
A. Rosenthal Using semantic values to facilitate
interoperability among heterogeneous information
systems, TODS, 1994
25Related work
- Existing work in schema integration focused on
the resolution of structural conflicts 1, 7 and
constraint conflicts 6, 8. - Our solution to schematic discrepancy complements
those works. - The resolution of schematic discrepancy is
followed by the resolution of other conflicts.
1 C. Batini, M. Lenzerini A methodology for
data schema integration in the Entity-Relationship
model. IEEE Trans. on Software Engineering,
10(6), 1984 6 Mong Li Lee, Tok Wang Ling
Resolving constraint conflicts in the integration
of entity-relationship schemas. ER, 1997 7 Mong
Li Lee, Tok Wang Ling A methodology for
structural conflicts resolution in the
integration of entity-relationship schemas.
Knowledge and Information Sys., 5, 2003 8 M.
P. Reddy, B.E.Prasad, Amar Gupta Formulating
global integrity constraints during derivation of
global schema. Data Knowledge Engineering, 16,
1995
26Related work
- Schematic discrepancy in relational model is
solved in some multidatabase languages 4, 5. - They solved a special problem in schematic
discrepancy they transform relation names or
attribute names to attribute values, or converse. - They did not consider the constraint issue in
schema transformation. - Our work solves a general problem, and preserves
cardinality constraints of ER schemas in the
schema transformation.
4 R. Krishnamurthy, W. Litwin, W. Kent
Language features for interoperability of
databases with schematic discrepancies. SIGMOD,
1991 5 L. V. S. Lakshmanan, F. Sadri, S. N.
Subramanian SchemaSQLan extension to SQL for
multidatabase interoperability. TODS, 2001
27Conclusion
- ER model supports cardinality constraints, which
facilitates the derivation of constraints in
schema transformation and integration. - Context is used to explicitly represent meta
information of entity types, relationship types
and attributes in ER schemas. - Schematic discrepancy is resolved by removing
context. - The solution to schematic discrepancy preserves
information and constraints.
28Reference
- 1 C. Batini, M. Lenzerini A methodology for
data schema integration in the Entity-Relationship
model. IEEE Trans. on Software Engineering,
10(6), 1984 - 2 C. H. Goh, S. Bressan, S. Madnick, and M.
Siegel Context interchange new features and
formalisms for the intelligent integration of
information. ACM Transactions on Information
Systems, 17(3), 1999, pp 270-293 - 3 Qi He and Tok Wang Ling Extending and
inferring functional dependency in schema
transformation. CIKM, 2004. - 4 R. Krishnamurthy, W. Litwin, W. Kent
Language features for interoperability of
databases with schematic discrepancies. SIGMOD,
1991, pp 40-49 - 5 L. V. S. Lakshmanan, F. Sadri, S. N.
Subramanian SchemaSQLan extension to SQL for
multidatabase interoperability. TODS, 2001, pp
476-519 - 6 Mong Li Lee, Tok Wang Ling Resolving
constraint conflicts in the integration of
entity-relationship schemas. ER, 1997, pp 394-407 - 7 Mong Li Lee, Tok Wang Ling A methodology for
structural conflicts resolution in the
integration of entity-relationship schemas.
Knowledge and Information Sys., 5, 2003, pp
225-247 - 8 M. P. Reddy, B.E.Prasad, Amar Gupta
Formulating global integrity constraints during
derivation of global schema. Data Knowledge
Engineering, 16, 1995, pp 241-268 - 9 E. Sciore, M. Siegel, A. Rosenthal Using
semantic values to facilitate interoperability
among heterogeneous information systems, TODS,
19(2), 1994, pp 254-290