Title: Doan Dai Duong and Le Thi Thu Thuy
1. A Unified Framework for the Semantic Integration
of XML Databases
First IEEE International Conference on Digital
Information Management (ICDIM)
- Doan Dai Duong and Le Thi Thu Thuy
- Duong_Dai.Doan, Thuy_Thi_Thu.Le_at_unb.ca
- The University of New Brunswick, Fredericton, NB,
Canada
Presented by Virendrakumar C. Bhavsar
December 06-08, 2006
2. Agenda
- Introduction
- XML Declarative Description (XDD)
- Modeling of Data Components
- Modelling of Processing Components
- Conclusion
3. Introduction
- General model of XML database integration
- Step 1: Schema Integration
4. Step 2: Query Processing
[Figure: query processing; label: Users]
5. Proposed Integration Framework
- Powerful
  - XDD supports all tasks of the framework
    - Input XML query, input XML data, output XML data
    - Rules, constraints, mappings
    - Metadata
- Based on the standard XML format, XDD tightly combines all
tasks of the framework and makes it easy to manipulate data
- Reduces the time and effort of programmers and users,
and reduces syntax errors
6. XML Declarative Description
- XML Declarative Description (XDD) is an XML-based
information representation
- Ordinary XML expressions (ground XML
expressions) + variables = non-ground XML
expressions
  → Enhancement of expressive power and
  representation of implicit information
- XML clauses of the form
  H ← B1, …, Bm, C1, …, Cn
  → Able to express conditions and constraints
Wuwongse, V., Anutariya, C., Akama, K., and
Nantajeewarawat, E.: XML Declarative Description
(XDD): A Language for the Semantic Web. IEEE
Intelligent Systems, Vol. 16, No. 3 (2001) 54-65
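The difference between ground and non-ground XML expressions can be illustrated with a minimal Python sketch. It is not an XDD engine: it assumes that the tokens Sid, Sname and Scountry from the student example later in the deck act as variables, and the helper names `is_ground` and `instantiate` as well as the sample values are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Assumption of this sketch: these tokens are variables; any other
# text or attribute value is treated as a constant.
VARIABLES = {"Sid", "Sname", "Scountry"}

def is_ground(elem):
    """Return True if the expression contains no variables."""
    for node in elem.iter():
        if (node.text or "").strip() in VARIABLES:
            return False
        if any(value in VARIABLES for value in node.attrib.values()):
            return False
    return True

def instantiate(elem, bindings):
    """Return a copy of elem with bound variables replaced by values."""
    result = copy.deepcopy(elem)
    for node in result.iter():
        text = (node.text or "").strip()
        if text in bindings:
            node.text = bindings[text]
        for key, value in list(node.attrib.items()):
            if value in bindings:
                node.set(key, bindings[value])
    return result

# A non-ground expression: it still contains variables.
non_ground = ET.fromstring(
    '<student id="Sid"><name>Sname</name><country>Scountry</country></student>'
)

# Illustrative bindings (made-up values) turn it into a ground expression.
ground = instantiate(
    non_ground, {"Sid": "s01", "Sname": "An", "Scountry": "Vietnam"}
)

print(is_ground(non_ground), is_ground(ground))
print(ET.tostring(ground, encoding="unicode"))
```

Binding every variable yields an ordinary XML document, which is why one notation can describe both actual data and schema-level descriptions.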
7. Modeling of Data Components
- XML Databases
  - Extension (actual data values): ground XML
  expressions
  - Intension (schemas, logical specifications,
  relationships, indexes and constraints):
  non-ground XML expressions
- XML Queries
  - Include constructor, patterns, and filters
  - Correspond to the three parts (H, Bi, Cj) of an XDD rule
  - H ← B1, …, Bm, C1, …, Cn
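The three query parts can be sketched in Python as three small functions: a pattern (the B part) that produces variable bindings, a filter (the C part) that tests them, and a constructor (the H part) that builds the answer. This only illustrates the correspondence and is not how an XDD engine evaluates rules; the function names, the sample data and the filter condition are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

# Made-up sample data for illustration only.
DATA = ET.fromstring(
    "<db>"
    '<student id="s01"><name>An</name><country>Vietnam</country></student>'
    '<student id="s02"><name>Bob</name><country>Canada</country></student>'
    "</db>"
)

def pattern(db):
    """B part: match <student> elements and bind variables to their values."""
    for s in db.iter("student"):
        yield {
            "Sid": s.get("id"),
            "Sname": s.findtext("name"),
            "Scountry": s.findtext("country"),
        }

def filter_(bindings):
    """C part: keep only bindings that satisfy the constraint."""
    return bindings["Scountry"] == "Canada"

def constructor(bindings):
    """H part: build one answer element from one set of bindings."""
    answer = ET.Element("answer")
    ET.SubElement(answer, "name").text = bindings["Sname"]
    return answer

def run_query(db):
    """Evaluate H <- B, C: construct an answer for every filtered match."""
    return [constructor(b) for b in pattern(db) if filter_(b)]

results = run_query(DATA)
print([ET.tostring(r, encoding="unicode") for r in results])
```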
8. Modeling of Data Components
[Figure: query modelled by XDD]
9. Query Execution Example
10. Modeling of Data Components
- Mappings
  - Describe the correspondence between an object in the
  integrated schema and its corresponding objects
  in the local schemas
  - Support decomposing XML queries and converting
  data
  - Modeled by non-ground XML expressions
11. Sample of Mappings
[Figure: a sample mapping relating an object in the integrated schema to an object in schema A and an object in schema B]
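One way to picture such a mapping is as a lookup table from each local schema's names to the names of the integrated schema. The sketch below uses the element names of the student example that appears in the Query Decomposition slides and converts a local record into the integrated format. The dictionary layout, the function `to_global` and the sample record are assumptions of this sketch, not the XDD mapping structure used by the authors.

```python
import xml.etree.ElementTree as ET

# Element and attribute names follow the student example in the deck;
# the dictionary layout itself is hypothetical.
MAPPING = {
    "A": {"root": "SOMstudent", "id_attr": "id",
          "name": "name", "country": "nation"},
    "B": {"root": "SATstudent", "id_attr": "key",
          "name": "fullname", "country": "country"},
}

def to_global(local_elem, source):
    """Convert one local record into the integrated <student> format."""
    m = MAPPING[source]
    if local_elem.tag != m["root"]:
        raise ValueError(f"unexpected element {local_elem.tag!r} for source {source}")
    student = ET.Element("student", {"id": local_elem.get(m["id_attr"], "")})
    ET.SubElement(student, "name").text = local_elem.findtext(m["name"])
    ET.SubElement(student, "country").text = local_elem.findtext(m["country"])
    return student

# Made-up record from source B, for illustration only.
record_b = ET.fromstring(
    '<SATstudent key="s07" source="B">'
    "<fullname>Chi</fullname><country>Vietnam</country>"
    "</SATstudent>"
)
converted = to_global(record_b, "B")
print(ET.tostring(converted, encoding="unicode"))
```

Because the same table names both sides of the correspondence, it can in principle be read in either direction, once for decomposing queries and once for converting data.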
12. Modelling of Processing Components
- Schema Integration Component
  - The main task is to resolve conflicts between the
  schemas of participating databases
  - Conflict resolution between the various schemas is
  done at one time (one-shot strategy)
  - Each local schema is one big non-ground XML
  expression (E_variable)
13. Schema Integration Component
- XDD can interactively process all schemas as E
expressions
14. Schema Conflict Classification
Conflicts between schemas can be classified into
four main kinds:
- Naming conflicts
  - Synonyms
  - Acronyms
  - Homonyms
- Structural conflicts
  - Missing items conflicts
  - Internal path discrepancy conflicts
  - Aggregation conflicts
  - Generalization/specialization
- Constraint conflicts
  - Occurring numbers of elements
  - Fixed vs. default values
  - Constraints of attributes
- Data type conflicts
  - Disjoint or incompatible data types
  - Compatible data types
  - IDREF and IDREFS
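A few of these conflict kinds can be detected mechanically once schemas are reduced to simple descriptions. The sketch below assumes each schema is a flat dictionary from element name to data type and that a synonym table is supplied by hand. The table, the type names and the extra elements (age, email) are hypothetical and are not taken from the slides. It covers only synonym, missing item and data type conflicts.

```python
# Hypothetical synonym table: local name -> canonical name.
SYNONYMS = {"nation": "country", "fullname": "name"}

def canonical(name):
    """Map a local element name to its canonical name."""
    return SYNONYMS.get(name, name)

def find_conflicts(schema_a, schema_b):
    """Compare two flat schemas (name -> type); return (kind, detail) pairs."""
    conflicts = []
    canon_a = {canonical(n): (n, t) for n, t in schema_a.items()}
    canon_b = {canonical(n): (n, t) for n, t in schema_b.items()}
    for key in sorted(canon_a.keys() | canon_b.keys()):
        if key not in canon_a or key not in canon_b:
            conflicts.append(("missing item", key))
            continue
        (name_a, type_a), (name_b, type_b) = canon_a[key], canon_b[key]
        if name_a != name_b:
            conflicts.append(("naming (synonym)", f"{name_a} / {name_b}"))
        if type_a != type_b:
            conflicts.append(("data type", f"{key}: {type_a} vs {type_b}"))
    return conflicts

# Made-up schemas for illustration only.
schema_a = {"name": "string", "nation": "string", "age": "integer"}
schema_b = {"fullname": "string", "country": "string",
            "age": "string", "email": "string"}

for kind, detail in find_conflicts(schema_a, schema_b):
    print(kind, "->", detail)
```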
16. Query Decomposition
- The main task: yield n local subqueries from a
global query
Integrated schema:
  <student id=Sid> <name>Sname</name> <country>Scountry</country> </student>
Schema for source B:
  <SATstudent key=Sid source="B"> <fullname>Sname</fullname> <country>Scountry</country> </SATstudent>
Schema for source A:
  <SOMstudent id=Sid source="A"> <name>Sname</name> <nation>Scountry</nation> </SOMstudent>
17. Query Decomposition
A. Brief view
[Figure: the input XML query and the mappings from global to local feed the Query Decomposition component, which produces one subquery per local source]
Input XML query:
  <student id=Sid> <name>Sname</name> <country>Scountry</country> </student>
18. Query Decomposition Example
XDD rule:
  <answer> Eexpression </answer> ←
    <Mapping> <student> <country>Scountry</country> </student>
    <local>Eexpression</local> </Mapping>
Step 1: the rule body matches with the mapping
  <Mapping> <student> <country>Scountry</country> </student>
    <local> <SATstudent source="B"> <country>Scountry</country> </SATstudent>
    <SOMstudent source="A"> <nation>Scountry</nation> </SOMstudent>
    </local> </Mapping>
Step 2: Eexpression binds to the content of <local>
Step 3: the rule infers its head
Step 4: this results in
  <answer> <SATstudent source="B"> <country>Scountry</country> </SATstudent>
    <SOMstudent source="A"> <nation>Scountry</nation> </SOMstudent> </answer>
  (the local query for source B and the local query for source A)
19. Query Decomposition
- Using the special structure of the mapping and applying
XDD rules for query decomposition:
  - Subqueries for the distributed data sources are
  produced simultaneously
  - Similarly, for data conversion, the extracted data are
  simultaneously converted to the global schema format
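The decomposition of a global query into one subquery per source can be sketched as a name translation driven by the mapping. The Python below reproduces the country query of the example slide with an ordinary loop. It does not perform XDD inference, and the dictionary layout and the function `decompose` are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Global-to-local element names, taken from the student example;
# the dictionary layout itself is hypothetical.
MAPPING = {
    "A": {"student": "SOMstudent", "name": "name", "country": "nation"},
    "B": {"student": "SATstudent", "name": "fullname", "country": "country"},
}

def decompose(global_query):
    """Return {source: local subquery} for every source in MAPPING."""
    subqueries = {}
    for source, names in MAPPING.items():
        local = ET.Element(names[global_query.tag], {"source": source})
        for child in global_query:
            # Variables such as Scountry are carried over unchanged.
            ET.SubElement(local, names[child.tag]).text = child.text
        subqueries[source] = local
    return subqueries

query = ET.fromstring("<student><country>Scountry</country></student>")
subs = decompose(query)
for source, sub in subs.items():
    print(source, ET.tostring(sub, encoding="unicode"))
```

A single pass over the mapping yields all n subqueries, which mirrors the point that the subqueries are produced together rather than one source at a time.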
20. Conclusion
- XDD is used to model all data components and
processing components of the XML database integration
framework
- Components of the system modeled by XDD can
communicate and exchange data easily
- A special structure for XDD-based bidirectional
mappings is designed. Information is produced
efficiently for both query decomposition and data
conversion, avoiding data redundancy
- The framework can
  - Integrate n participating schemas
  - Decompose a query into n subqueries at a time
21. Thank you!