Title: A Unified Schema Matching Framework
- Alsayed Algergawy,
- Eike Schallehn, and
- Gunter Saake
- Institut für Technische und Betriebliche Informationssysteme, Otto-von-Guericke-Universität Magdeburg
- Postfach 4120, D-39016 Magdeburg, Germany
- alshaht/ eike / saake_at_iti.cs.uni-magdeburg.de
Outline
- What is schema matching?
- Where is schema matching used?
- Schema Matching Challenges
- The Proposed Framework
- Summary and Future Work
19. Workshop uber Grundlagen von Datenbanken
Schema Matching
Area A
Area B
<Schema name="Schema T">
  <ElementType name="Customer">
    <element type="FName"/>
    <element type="LName"/>
    <element type="CAddress"/>
  </ElementType>
  <ElementType name="CAddress">
    <element type="street"/>
    <element type="city"/>
    <element type="province"/>
    <element type="code"/>
  </ElementType>
</Schema>
<Schema name="Schema S">
  <ElementType name="AccountOwner">
    <element type="Name"/>
    <element type="Address"/>
    <element type="BirthDate"/>
  </ElementType>
  <ElementType name="Address">
    <element type="street"/>
    <element type="city"/>
    <element type="state"/>
    <element type="ZIP"/>
  </ElementType>
</Schema>
Schema Matching Definition
- Schema matching is defined as the task of finding the semantic correspondences between elements of two schemas.
[Figure: the Match operation takes two schemas, S1 and S2, together with auxiliary information (user feedback, dictionaries, previous mappings), and produces the match result.]
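The Match operation can be sketched in a few lines of Python. This is only an illustration, not the implementation behind the slides: the names `Correspondence`, `match`, and `AUX`, the synonym dictionary standing in for auxiliary information, and the scores 1.0 and 0.8 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One entry of the match result (hypothetical structure)."""
    source: str
    target: str
    similarity: float


def match(s1, s2, synonyms=None):
    """Return correspondences between the element names of s1 and s2.

    `synonyms` plays the role of auxiliary information: a dictionary
    mapping a lower-cased name to a set of lower-cased synonyms.
    """
    synonyms = synonyms or {}
    result = []
    for a in s1:
        for b in s2:
            if a.lower() == b.lower():
                # Identical names: assumed score 1.0.
                result.append(Correspondence(a, b, 1.0))
            elif b.lower() in synonyms.get(a.lower(), set()):
                # Dictionary hit: assumed score 0.8.
                result.append(Correspondence(a, b, 0.8))
    return result


S1 = ["street", "city", "province", "code"]
S2 = ["street", "city", "state", "ZIP"]
AUX = {"province": {"state"}, "code": {"zip"}}

result = match(S1, S2, AUX)
for c in result:
    print(c.source, "->", c.target, c.similarity)
```

Without the dictionary only `street` and `city` would be matched, which shows why auxiliary information is an input of the Match operation.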
Where is schema matching used?
- To motivate the importance of schema matching, we summarize its use in several application domains:
- Databases
  - Data integration
  - Data warehouses
  - E-commerce
  - Query processing
  - Peer data management
  - Model management
- Artificial intelligence
  - Knowledge bases, ontology merging
- Web
  - Semantic web services
Data Integration
- Problem: Construct a global view from a set of independently constructed schemas.
  - Different structures and terminologies
- Solution: Schema matching is performed to find relationships between concepts in each schema. The matching elements can then be unified.
Query Processing
- Problem: The terms used in the user's query may differ from those in the database.
- Solution: Matching is used to map the user-specified concepts in the query to schema elements.
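As a rough illustration of mapping query terms to schema elements, the sketch below uses approximate string matching from Python's standard `difflib` module. The function name `map_query_terms`, the element list, and the cutoff of 0.6 are assumptions for the example. A purely string-based matcher like this one cannot find synonyms, so it covers only part of what the slide describes.

```python
import difflib

# Element names taken from Schema S in the running example.
SCHEMA_ELEMENTS = ["Name", "Address", "BirthDate", "street", "city", "state", "ZIP"]


def map_query_terms(terms, elements, cutoff=0.6):
    """Map each query term to the most similar schema element, or None."""
    lowered = {e.lower(): e for e in elements}
    mapping = {}
    for term in terms:
        hits = difflib.get_close_matches(
            term.lower(), list(lowered), n=1, cutoff=cutoff
        )
        mapping[term] = lowered[hits[0]] if hits else None
    return mapping


mapping = map_query_terms(["birth_date", "Street", "salary"], SCHEMA_ELEMENTS)
print(mapping)
```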
Challenges of Schema Matching
- Despite its pervasiveness and importance, schema matching remains an extremely difficult problem:
  - Representation problems: different representation models, different names and structures
  - Semantic problems: clues in schema and data are incomplete and unreliable
  - Computational cost problems
  - Matching is subjective and depends on the application
- So we need a unified schema matching system that
  - is independent of the schema models,
  - is independent of the application domains,
  - accurately identifies mapping elements, and
  - addresses both match effectiveness and match efficiency.
Given the representation problems and semantic problems, matching by hand is:
- tedious,
- time consuming,
- error prone, and
- expensive.
General Schema Matching Procedure (Proposed Framework)
- The schema matching process requires the following main phases:
  1. Importing the schemas to be matched: TransMat phase
  2. Identifying the elements to be matched: Pre-match phase
  3. Applying the matching algorithm: Matching phase
  4. Exporting the match result: MapTrans phase
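The four phases can be read as a pipeline. The skeleton below is a minimal sketch under strong simplifications: every function name and body is a placeholder chosen for illustration (for instance, the matching step is plain case-insensitive name equality), not the algorithms of the proposed framework.

```python
def transmat(schema):
    """Phase 1 (import): turn {parent: [children]} into a set of edges."""
    return {(parent, child) for parent, children in schema.items() for child in children}


def pre_match(graph):
    """Phase 2: identify the elements to be matched (here: all node names)."""
    return sorted({node for edge in graph for node in edge})


def matching(elems_s, elems_t):
    """Phase 3: apply a matching algorithm (here: case-insensitive equality)."""
    return [(s, t) for s in elems_s for t in elems_t if s.lower() == t.lower()]


def maptrans(pairs):
    """Phase 4 (export): render the match result as readable strings."""
    return [f"{s} <-> {t}" for s, t in pairs]


def run(schema_s, schema_t):
    elems_s = pre_match(transmat(schema_s))
    elems_t = pre_match(transmat(schema_t))
    return maptrans(matching(elems_s, elems_t))


SCHEMA_S = {"Address": ["street", "city", "state", "ZIP"]}
SCHEMA_T = {"CAddress": ["street", "city", "province", "code"]}

exported = run(SCHEMA_S, SCHEMA_T)
print(exported)
```

Keeping each phase behind its own function is what allows one phase, such as the matching algorithm, to be exchanged without touching the others.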
TransMat Phase
- Transformation for the matching process.
- Its purpose is to make the matching process generic.
- A common model is chosen to represent the schemas to be matched.
- A graph data structure is used for the internal representation.
  - Graphs are a well-known data structure.
  - This transforms the schema matching problem into a well-known problem: graph matching.
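A minimal sketch of such a transformation, assuming the XML layout of Schema S from the running example and an adjacency-list dictionary as the graph representation. The function name `schema_to_graph` and the choice of a plain dictionary are assumptions; the slides do not prescribe a concrete graph encoding.

```python
import xml.etree.ElementTree as ET

SCHEMA_XML = """
<Schema name="Schema S">
  <ElementType name="AccountOwner">
    <element type="Name"/>
    <element type="Address"/>
    <element type="BirthDate"/>
  </ElementType>
  <ElementType name="Address">
    <element type="street"/>
    <element type="city"/>
    <element type="state"/>
    <element type="ZIP"/>
  </ElementType>
</Schema>
"""


def schema_to_graph(xml_text):
    """Build {node: [child nodes]} from ElementType/element declarations."""
    root = ET.fromstring(xml_text)
    graph = {}
    for element_type in root.findall("ElementType"):
        parent = element_type.get("name")
        graph.setdefault(parent, [])
        for child in element_type.findall("element"):
            name = child.get("type")
            graph[parent].append(name)
            # Make sure leaf elements also appear as nodes.
            graph.setdefault(name, [])
    return graph


graph = schema_to_graph(SCHEMA_XML)
for node, children in graph.items():
    print(node, "->", children)
```

A relational or object-oriented schema would need its own importer, but it would produce the same kind of graph, which is what keeps the later phases independent of the schema model.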
Pre-Match Phase
- It is a critical phase.
  - Its output affects the input of the matching phase.
  - It depends on the type of matching algorithm used.
- In rule-based systems
  - COMA -> graph traversal to identify the elements to be matched -> nodes, paths, fragments (COMA)
- In learner-based systems
  - It is called a training phase.
  - It uses AI techniques:
    - Neural networks -> SemInt
    - Machine learning -> LSD, iMAP
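For the rule-based case, identifying paths by graph traversal can be sketched as a depth-first enumeration of all root-to-node paths. This is an illustrative sketch only and is not COMA's actual implementation. The graph literal, the function name `enumerate_paths`, and the dotted path notation are assumptions, and the code assumes an acyclic schema graph.

```python
# Adjacency-list graph of Schema S from the running example.
GRAPH = {
    "AccountOwner": ["Name", "Address", "BirthDate"],
    "Address": ["street", "city", "state", "ZIP"],
    "Name": [], "BirthDate": [],
    "street": [], "city": [], "state": [], "ZIP": [],
}


def enumerate_paths(graph, root):
    """Return every path from `root` to a reachable node, in depth-first order.

    Assumes the graph has no cycles; a cyclic graph would loop forever.
    """
    paths = []
    stack = [(root,)]
    while stack:
        path = stack.pop()
        paths.append(".".join(path))
        # Push children in reverse so they are visited in declared order.
        for child in reversed(graph.get(path[-1], [])):
            stack.append(path + (child,))
    return paths


paths = enumerate_paths(GRAPH, "AccountOwner")
for p in paths:
    print(p)
```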
Matching Phase
- It is the most important phase -> identification of corresponding elements
Element Matcher
[Figure: element matchers classified along three dimensions]
- Element property: atomic / structure; schema-based / instance-based
- Matcher algorithm: rule-based; learner-based
- Auxiliary information
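Two simple rule-based, schema-based element matchers are sketched below: one works on an atomic property (the element name), the other on a structural property (the set of child names). Both functions, their names, and the chosen similarity measures (`difflib` ratio and Jaccard overlap) are assumptions for illustration, not the matchers of the proposed framework.

```python
from difflib import SequenceMatcher


def name_matcher(a, b):
    """Atomic matcher: string similarity of two element names in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def children_matcher(children_a, children_b):
    """Structure matcher: Jaccard overlap of the two sets of child names."""
    set_a = {c.lower() for c in children_a}
    set_b = {c.lower() for c in children_b}
    union = set_a | set_b
    if not union:
        return 0.0  # Two leaves carry no structural evidence.
    return len(set_a & set_b) / len(union)


name_sim = name_matcher("Address", "CAddress")
struct_sim = children_matcher(
    ["street", "city", "state", "ZIP"],
    ["street", "city", "province", "code"],
)
print(round(name_sim, 3), round(struct_sim, 3))
```

The two matchers can disagree about the same element pair, which is the situation the similarity combiner has to resolve.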
Similarity Combiner
- Semantics of schema elements -> the available information may vary.
- The relationships between them are fuzzy.
- A unified schema matching framework should implement multiple matchers.
- For every element pair, several similarity values are computed.
- A similarity combiner is used to combine these values.
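One way to combine several similarity values for a single element pair is an aggregation function. The sketch below offers maximum, minimum, and (weighted) average. The function name `combine`, the strategy names, and the example values are assumptions, and the slides do not say which strategy the framework uses.

```python
def combine(similarities, weights=None, strategy="average"):
    """Combine the similarity values computed for one element pair."""
    if not similarities:
        raise ValueError("at least one similarity value is required")
    if strategy == "max":
        return max(similarities)
    if strategy == "min":
        return min(similarities)
    if strategy == "average":
        # Equal weights give the plain arithmetic mean.
        weights = weights or [1.0] * len(similarities)
        if len(weights) != len(similarities):
            raise ValueError("one weight per similarity value is required")
        return sum(w * s for w, s in zip(weights, similarities)) / sum(weights)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical values from three matchers for the same element pair.
values = [0.9, 0.5, 0.7]
plain = combine(values)
weighted = combine(values, weights=[2.0, 1.0, 1.0])
optimistic = combine(values, strategy="max")
print(plain, weighted, optimistic)
```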
Similarity Selector
- Not all the identified corresponding elements are correct mappings.
- Some selection criteria should be used to select the most suitable mappings.
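A possible selection criterion, sketched under assumptions: discard candidates below a threshold and keep, for each source element, only the best-scoring target. The function name `select`, the threshold of 0.5, and the similarity values are invented for the example.

```python
def select(similarity, threshold=0.5):
    """Keep the best target per source element among pairs >= threshold."""
    best = {}
    for (src, tgt), score in similarity.items():
        if score < threshold:
            continue
        if src not in best or score > best[src][1]:
            best[src] = (tgt, score)
    return {src: tgt for src, (tgt, _) in best.items()}


# Hypothetical combined similarity values.
SIMILARITY = {
    ("Address", "CAddress"): 0.93,
    ("Address", "Customer"): 0.25,
    ("state", "province"): 0.60,
    ("state", "street"): 0.55,
    ("BirthDate", "code"): 0.20,
}

selected = select(SIMILARITY)
print(selected)
```

`BirthDate` is left unmapped because none of its candidates reaches the threshold. Returning no mapping is often preferable to returning a wrong one.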
MapTrans Phase
- Mapping transformation
- The match result should be exported to the application domain.
- The export inherently depends on
  - the matching cardinality and
  - the mapping representation.
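To show how cardinality and representation shape the export, the sketch below groups selected pairs by source element and serializes them as JSON. The JSON layout, the function name `export_mapping`, and the example pairs are assumptions. Cardinality is derived from the source side only, so n:1 and n:m cases are not detected in this simplified version.

```python
import json
from collections import defaultdict


def export_mapping(pairs):
    """Serialize (source, target) pairs as JSON, one entry per source element."""
    by_source = defaultdict(list)
    for src, tgt in pairs:
        by_source[src].append(tgt)
    entries = []
    for src, targets in by_source.items():
        # Source-side view only: one target is 1:1, several targets are 1:n.
        cardinality = "1:1" if len(targets) == 1 else "1:n"
        entries.append({"source": src, "targets": targets, "cardinality": cardinality})
    return json.dumps(entries, indent=2)


# Illustrative pairs between Schema S and Schema T.
PAIRS = [("Name", "FName"), ("Name", "LName"), ("Address", "CAddress")]

exported = export_mapping(PAIRS)
print(exported)
```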
Summary and Future Work
- Schema matching is a pervasive, important, and extremely difficult problem.
- A unified schema matching framework is proposed to cope with the schema matching problem.
- Many research points remain open:
  - Problem formulation
  - The internal representation
  - Pre-matching, especially for rule-based systems
  - Choice of matcher algorithm with respect to performance
References
- Aumüller, D., H.H. Do, S. Massmann, E. Rahm: Schema and Ontology Matching with COMA++ (Software Demonstration). Proc. 24. ACM SIGMOD Intl. Conf. Management of Data, 2005
- Berlin, J., A. Motro: Database Schema Matching Using Machine Learning with Feature Selection. Proc. 14. Intl. Conf. Advanced Information Systems Engineering (CAiSE), 2002
- Clifton, C., E. Housman, A. Rosenthal: Experience with a Combined Approach to Attribute-Matching Across Heterogeneous Databases. Proc. IFIP 2.6 Working Conf. Database Semantics, 1996
- Do, H.H., E. Rahm: COMA - A System for Flexible Combination of Schema Matching Approaches. Proc. Intl. Conf. Very Large Databases (VLDB), 2002
- Doan, A.H., A. Halevy: Semantic Integration Research in the Database Community: A Brief Survey. AI Magazine, Special Issue on Semantic Integration, 2005
- Rahm, E., P.A. Bernstein: A Survey of Approaches to Automatic Schema Matching. VLDB Journal, 10(4), 2001
Thank you