Using Schema Matching to Simplify Heterogeneous Data Translation

1
Using Schema Matching to Simplify Heterogeneous
Data Translation
  • Tova Milo, Sagit Zohar
  • Tel Aviv University

2
Introduction
  • There are large amounts of data available on the
    Web but the format of the data is not
    homogeneous.
  • Most applications can handle only one or a small
    number of formats.
  • There is a need to translate data from one format
    to another.

3
Introduction
  • Two approaches to translating data:
  • A specific program to translate from format A to
    format B (e.g. LaTeX to HTML).
  • Data translation languages.

4
Introduction
  • The solution: TranScm
  • A data translation system
  • Automatically translates a portion (often a large
    portion) of the desired data
  • Does not replace data translation languages, but
    reduces the amount of programming needed in them

5
TranScm Architecture
  • Diagram components: Input Schema, Output Schema,
    Import/Export Library, GUI, Rule Base, Matching
    Module, Typing Module

6
Data Model
  • Tree (Forest) Model
  • Similar to OEM
  • Allows an order on children
  • Can handle cyclic structures using ids as
    pointers
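
A minimal sketch of such a data model, assuming a hypothetical Node class (the names Node, node_id, ref and index_by_id are illustrative and not taken from TranScm). Children are kept in a list so their order is preserved, and a cycle is expressed by pointing at an id rather than by nesting:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of an ordered, labeled data tree (hypothetical structure)."""
    label: str
    value: Optional[str] = None                            # atomic content of a leaf
    children: List["Node"] = field(default_factory=list)   # list order = child order
    node_id: Optional[str] = None                          # id other nodes may point to
    ref: Optional[str] = None                              # id of the node pointed to


def index_by_id(root: Node) -> Dict[str, Node]:
    """Collect every node that carries an id, so references can be resolved."""
    index: Dict[str, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.node_id is not None:
            index[node.node_id] = node
        stack.extend(node.children)
    return index


# A cycle expressed with ids: the section points back to the article,
# yet the structure itself stays a finite tree.
article = Node("Article", node_id="a1", children=[
    Node("title", value="An example title"),
    Node("sections", children=[Node("section", ref="a1")]),
])
ids = index_by_id(article)
target = ids[article.children[1].children[0].ref]
```

Here target is the article node itself, reached through the id rather than through a nested copy.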

7
Data Model
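  • Example data tree (figure): an Article node with
    children title, authors, and sections; authors
    has two author children
  • Values shown in the figure: "Conceptual Concepts",
    "Al Gore Ithm", "G WWW Bush"
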
8
Schema Model
  • Labeled graphs
  • Some nodes may be ordered
  • Each vertex is a schema element (type)
  • Labels carry information about the node
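
One way to picture this schema model is the sketch below. It assumes a hypothetical SchemaElement class, and the label keys ("kind", "base") and the edge from sections back to Article are invented for illustration; they are not TranScm's actual labels. Because the schema is a graph, the traversal has to guard against cycles:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SchemaElement:
    """A vertex of the schema graph: one schema element (type)."""
    name: str
    labels: Dict[str, object] = field(default_factory=dict)  # information about the node
    ordered: bool = False                                     # are the children ordered?
    children: List[str] = field(default_factory=list)         # edges, by element name


def reachable(schema: Dict[str, SchemaElement], start: str) -> Set[str]:
    """Names of all elements reachable from `start`; safe on cyclic graphs."""
    seen: Set[str] = set()
    todo = [start]
    while todo:
        name = todo.pop()
        if name in seen:
            continue
        seen.add(name)
        todo.extend(schema[name].children)
    return seen


elements = [
    SchemaElement("Article", {"kind": "record"}, ordered=True,
                  children=["title", "authors", "sections"]),
    SchemaElement("title", {"kind": "atomic", "base": "string"}),
    SchemaElement("authors", {"kind": "list"}, ordered=True, children=["author"]),
    SchemaElement("author", {"kind": "atomic", "base": "string"}),
    # Invented edge: a section may contain another article, making the graph cyclic.
    SchemaElement("sections", {"kind": "list"}, ordered=True, children=["Article"]),
]
schema = {e.name: e for e in elements}
names = reachable(schema, "Article")
```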

9
Schema Model
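  • Example schema graph (figure) with nodes Article,
    title, authors, author, sections, string, and ref
  • Node labels as extracted: Article 3, title 1,
    author 1, sections 2, authors 0,,-gt (this last
    label is garbled in extraction)
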
10
Rules
  • Rules are the basis of the matching and
    translation
  • Rules have an associated priority

11
Rules
  • Each rule has two components:
  • Matching component: a Match function and a
    Descendants function
  • Translation component: a Translation function
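
The rule structure can be sketched roughly as follows. This is an illustration under stated assumptions, not TranScm's rule format: the Rule class, the dict representation of schema elements, and the convention that a larger priority number is tried first are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                              # assumption: larger = tried first
    match: Callable[[dict, dict], bool]        # matching component, part 1
    descendants: Callable[[dict, dict], bool]  # matching component, part 2
    translate: Callable[[object], object]      # translation component


def applicable_rules(rules: List[Rule], src: dict, tgt: dict) -> List[Rule]:
    """Rules whose matching component accepts (src, tgt), highest priority first."""
    ordered = sorted(rules, key=lambda r: r.priority, reverse=True)
    return [r for r in ordered if r.match(src, tgt) and r.descendants(src, tgt)]


same_name = Rule(
    "same-name", 10,
    match=lambda s, t: s["name"] == t["name"],
    descendants=lambda s, t: len(s["children"]) == len(t["children"]),
    translate=lambda v: v,
)
both_strings = Rule(
    "string-to-string", 1,
    match=lambda s, t: s["type"] == t["type"] == "string",
    descendants=lambda s, t: not s["children"] and not t["children"],
    translate=str,
)

src = {"name": "title", "type": "string", "children": []}
tgt = {"name": "title", "type": "string", "children": []}
names = [r.name for r in applicable_rules([both_strings, same_name], src, tgt)]
```

Both rules accept this pair, and the priority decides which one is considered first.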

12
Matching
  • The Match function examines schema labels to
    determine possible matches.
  • The Descendants function checks the number and
    types of the children of the current node.
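
A rough sketch of how the two checks could cooperate during a top-down match. The Elem class, the case-insensitive name comparison, and the exact child-type comparison are assumptions made for illustration; the actual functions are defined per rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Elem:
    name: str
    type: str = "record"
    children: List["Elem"] = field(default_factory=list)


def match_fn(src: Elem, tgt: Elem) -> bool:
    """Examine the labels: here, a case-insensitive name comparison."""
    return src.name.lower() == tgt.name.lower()


def descendants_fn(src: Elem, tgt: Elem) -> bool:
    """Check that both nodes have the same number and types of children."""
    return sorted(c.type for c in src.children) == sorted(c.type for c in tgt.children)


def match_tree(src: Elem, tgt: Elem,
               pairs: Optional[List[Tuple[str, str]]] = None) -> List[Tuple[str, str]]:
    """Pair up nodes top-down; a pair is recorded only if both checks pass."""
    pairs = [] if pairs is None else pairs
    if match_fn(src, tgt) and descendants_fn(src, tgt):
        pairs.append((src.name, tgt.name))
        for s in src.children:
            for t in tgt.children:
                if match_fn(s, t):
                    match_tree(s, t, pairs)
                    break
    return pairs


source = Elem("Article", children=[
    Elem("title", "string"),
    Elem("authors", "list", [Elem("author", "string")]),
])
target = Elem("article", children=[
    Elem("Title", "string"),
    Elem("Authors", "list", [Elem("author", "string"), Elem("author", "string")]),
])
pairs = match_tree(source, target)
```

In this example the two authors nodes agree on their labels but differ in their number of children, so the descendants check rejects them and only the Article and title pairs are recorded.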

13
Matching
  • Example match (figure): a source Article schema
    and a target Article schema, with authors and
    author nodes

14
When Matching Fails
  • Matching can fail for two reasons:
  • Something in the source cannot be matched to
    anything in the target with the current set of
    rules.
  • Something in the source matches several items in
    the target equally well.
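
The two failure cases can be told apart mechanically. The sketch below assumes a hypothetical numeric score function (2 for equal names, 1 when one name is a prefix of the other, 0 otherwise); TranScm's rules need not work with scores at all, so this only illustrates the distinction between "no match" and "several equally good matches".

```python
from typing import Callable, Dict, List, Tuple


def classify_matches(
    sources: List[str],
    targets: List[str],
    score: Callable[[str, str], int],
) -> Tuple[Dict[str, str], List[str], Dict[str, List[str]]]:
    """Split source elements into matched, unmatched, and ambiguous."""
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    ambiguous: Dict[str, List[str]] = {}
    for s in sources:
        scores = {t: score(s, t) for t in targets}
        best = max(scores.values(), default=0)
        winners = [t for t in targets if best > 0 and scores[t] == best]
        if not winners:
            unmatched.append(s)        # failure 1: nothing in the target fits
        elif len(winners) > 1:
            ambiguous[s] = winners     # failure 2: several fit equally well
        else:
            matched[s] = winners[0]
    return matched, unmatched, ambiguous


def name_score(s: str, t: str) -> int:
    """Hypothetical scoring: exact name beats prefix, anything else is 0."""
    if s == t:
        return 2
    if s.startswith(t) or t.startswith(s):
        return 1
    return 0


matched, unmatched, ambiguous = classify_matches(
    ["title", "author", "abstract"],
    ["title", "author_first", "author_last"],
    name_score,
)
```

The unmatched and ambiguous results are the cases that would be handed to the user for a decision.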

15
When Matching Fails
  • Via the GUI, the user can do the following:
  • Add
  • Disable
  • Modify
  • Override

16
Translation
  • Using the mapping generated from the Matching
    step and the appropriate rules, data is
    transformed from the input schema to the output
    schema.
  • The translation process can make use of data
    translation languages.
  • The translation process can perform type checking.
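
As an illustration of the translation step, the sketch below walks a small data tree and applies a mapping of the kind the matching step might produce. The tuple representation of the tree, the mapping layout (target label, conversion function, expected Python type), and the labels themselves are assumptions made for this example only.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Tree = Tuple[str, Any]  # (label, leaf value or list of child trees)
Mapping = Dict[str, Tuple[str, Optional[Callable[[Any], Any]], type]]


def translate(node: Tree, mapping: Mapping) -> Tree:
    """Rewrite a tree label by label, converting leaves and checking types."""
    label, content = node
    if label not in mapping:
        raise KeyError(f"no match recorded for source element {label!r}")
    new_label, convert, expected = mapping[label]
    if isinstance(content, list):
        result = [translate(child, mapping) for child in content]
    else:
        result = convert(content) if convert is not None else content
    if not isinstance(result, expected):
        # A simple stand-in for the type checking done during translation.
        raise TypeError(f"{new_label!r} expects {expected.__name__}")
    return (new_label, result)


source = ("Article", [("title", "schema matching"), ("year", "1998")])
mapping: Mapping = {
    "Article": ("paper", None, list),
    "title": ("name", str.title, str),
    "year": ("published", int, int),
}
output = translate(source, mapping)
```

A conversion function is the natural place to call out to a data translation language when a simple per-element conversion is not enough.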

17
Conclusion
  • TranScm:
  • Provides a general mechanism for data translation
  • Handles the common, relatively simple translations
    automatically
  • Can use data translation languages for more
    difficult translations