Semantic Adaptation of Schema Mappings when Schemas Evolve

1
Semantic Adaptation of Schema Mappings when Schemas Evolve
  • Cong Yu, University of Michigan
  • Lucian Popa, IBM Almaden Research Center

VLDB 2005, Trondheim, Norway, Sep 2, 2005
2
Schema Mappings
[Figure: Schema S and Schema T related by a schema mapping; labels include instances I and J and queries q.]
  • Schema mappings are logical, declarative assertions that describe
    relationships between schemas.
      • They carry enough semantics to guide run-time, instance-level
        transformation
      • e.g., GLAV mappings (or tuple-generating dependencies)
  • They are key elements in two main areas of information integration
      • Data Exchange/Translation
      • Query Answering/Rewriting (or Federation)

3
Schema Evolution and Mapping Adaptation
  • Schemas evolve over time. Mappings may become invalid!
  • A lot of effort goes into establishing mappings.
    How do we reuse them?
  • Mapping Adaptation Problem [VMP03]
      • Given
          • a mapping M from S to T,
          • changes/evolution of S to S', or T to T', or both,
      • derive a best mapping M' that
          • is valid with respect to the new schemas, and
          • reflects the original mapping as much as possible

4
Prior Solution: Incremental Method
[Figure: mapping M between S and T; a chain of atomic changes (move elem, add elem, delete constraint, rename elem) produces T1, T2, T3, and the mapping is re-adapted to M1, M2, M3 after each change.]
  • [VMP03] incrementally adapts the mapping after
    each atomic change in the schemas (source and/or
    target).
  • Efficient and intuitive for one or a few changes.
  • However, for non-incremental evolution, there are
    drawbacks

5
[Figure: different evolution paths lead from mapping M between S and T to mapping Mn between S and Tn.]
  • The new schema may be radically different
  • The list of changes may not be known.
      • The evolution path must be discovered, and it is not
        necessarily unique
  • The method will ultimately be inefficient
      • The algorithm must be applied at each atomic change
  • As we shall see, the resulting mapping may not be
    the expected one.

6
Our Approach: Composition-Based
[Figure: mapping M from S to T, evolution mapping E from T to T', and adapted mapping M' = M ∘ E from S to T'. Schema mapping tools (e.g., Clio) can be used to construct E.]
  • Evolution itself is described as a schema
    mapping.
      • Concise, declarative, and expressive description
        of evolution.
      • Enables efficiency and can deal with arbitrary
        evolution
  • The adapted mapping is then obtained via
    composition.
      • Formal semantics of adaptation.
  • At a high level, this is part of the model
    management vision [Ber03].

7
Main Contributions
  • We study the interplay between schema evolution
    and mapping composition
  • interesting in terms of both semantics and
    implementation
  • We show that the composition-based approach for
    mapping adaptation can be practical

8
Outline of the Rest of the Talk
  • Incremental Approach vs. Composition Approach
      • Example (showing why composition is important)
  • Composition Semantics and Algorithm
      • Transformation semantics: specialized, more suitable for
        schema evolution, also more challenging
  • Optimization and Experiments
      • Compose only when necessary
        (some mapping formulas are unaffected by the change)
  • Conclusion

9
Simplified Example
[Figure: schemas Source', Source and Target, with mapping m from Source to Target]
  Source:  SuppPart(s, p), PartOrder(p, o)
  Source': LineItem(li, s, p, o, qty)
  Target:  PotentialSupp(s, o)
  • m: SuppPart(s, p) ∧ PartOrder(p, o) → PotentialSupp(s, o)
      • (a GLAV mapping [Halevy01], or source-to-target tgd [FKMP03])
  • The mapping m exports orders o and all their
    potential suppliers s.
  • Schema evolution scenario
      • Data arrives in long tuples, each relating an
        order, a part and an available supplier.
      • The mapping m must be adapted to use the new schema
        Source'.
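As a minimal sketch of what the mapping m does at the instance level, the join can be evaluated over a small in-memory instance. The relation contents below are hypothetical and chosen only for illustration; sets of tuples stand in for relations, and `apply_m` is an illustrative name, not part of any tool mentioned in the talk.

```python
# Sketch only: relations are modelled as Python sets of tuples.
# The sample data is hypothetical, not taken from the talk.

def apply_m(supp_part, part_order):
    """m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o).

    Joins the two source relations on the part p and projects
    the supplier s and the order o.
    """
    return {
        (s, o)
        for (s, p) in supp_part
        for (p2, o) in part_order
        if p == p2
    }

supp_part = {("s1", "p1"), ("s2", "p1"), ("s3", "p2")}
part_order = {("p1", "o1"), ("p2", "o2")}

potential_supp = apply_m(supp_part, part_order)
```

Every supplier of a part is paired with every order for that part, which is the behaviour the adapted mapping is expected to preserve.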

10
Incremental Approach [VMP03]
[Figure: Source (SuppPart(s, p), PartOrder(p, o)), Source' (LineItem(li, s, p, o, qty)) and Target (PotentialSupp(s, o)), with mapping m from Source to Target]
m: SuppPart(s, p) ∧ PartOrder(p, o) → PotentialSupp(s, o)
  • Pick a list of changes from Source to Source' and
    rewrite the mapping after each change.
      • (1) Move element SuppPart/s to PartOrder/s
          • SuppPart(p) ∧ PartOrder(s, p, o) → PotentialSupp(s, o)
      • (2) Delete SuppPart/p and (3) delete SuppPart.
      • (4) Rename PartOrder to LineItem, (5) add
        LineItem/li and (6) add LineItem/qty
          • m': LineItem(li, s, p, o, qty) → PotentialSupp(s, o)
11
  • Although small, our example already needs 6
    schema changes.
      • For large schemas, this can become challenging.
  • Furthermore, and somewhat surprisingly, the
    semantics of the adapted mapping may not be the
    expected one!

12
Loss of Semantics
[Figure: Source (SuppPart(s, p), PartOrder(p, o)), Source' (LineItem(li, s, p, o, qty)) and Target (PotentialSupp(s, o)), with mapping m from Source to Target and mapping m' from Source' to Target]
m: SuppPart(s, p) ∧ PartOrder(p, o) → PotentialSupp(s, o)
m': LineItem(li, s, p, o, qty) → PotentialSupp(s, o)
  • The original mapping m joins orders with
    suppliers
  • However, m' loses relevant suppliers
      • It pairs an order with a supplier only if
        they appear in the same LineItem tuple
  • To retain the original semantics, we must look in
    different tuples!
      • m'': LineItem(li, s, p, o, qty) ∧ LineItem(li', s', p, o', qty')
        → PotentialSupp(s, o')
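The loss of semantics can be made concrete with a small sketch that runs both candidate adaptations over the same LineItem instance: the mapping that reads a single LineItem tuple, and the mapping that joins two LineItem tuples on the part p. The instance is hypothetical, and the function names are illustrative only.

```python
# Sketch only: LineItem tuples are (li, s, p, o, qty); data is hypothetical.

def adapt_single_tuple(line_item):
    """Incrementally adapted mapping: one LineItem tuple per target fact."""
    return {(s, o) for (_li, s, _p, o, _qty) in line_item}

def adapt_self_join(line_item):
    """Mapping that joins two LineItem tuples on the part p.

    The supplier comes from the first tuple, the order from the second.
    """
    return {
        (s, o2)
        for (_l1, s, p, _o1, _q1) in line_item
        for (_l2, _s2, p2, o2, _q2) in line_item
        if p == p2
    }

# Two suppliers of the same part p1, each listed with a different order.
line_item = {
    (1, "s1", "p1", "o1", 10),
    (2, "s2", "p1", "o2", 5),
}

single = adapt_single_tuple(line_item)
joined = adapt_self_join(line_item)
lost = joined - single  # supplier/order pairs the single-tuple mapping misses
```

On this instance the single-tuple mapping never pairs s1 with o2 or s2 with o1, although both suppliers can supply the part that both orders request.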

13
  • The incremental approach is a mechanical
    procedure that makes local changes to the
    mapping.
  • A sequence of good local changes may not
    necessarily yield the best global adaptation

14
Mapping Composition Approach
  • We look at the evolution globally
  • Describe evolution through a schema mapping
    Source' → Source.

[Figure: evolution mappings e1 and e2 from LineItem (Source') to SuppPart and PartOrder (Source), followed by mapping m from Source to Target (PotentialSupp)]
e1: LineItem(l, s, p, o, q) → SuppPart(s, p)
e2: LineItem(l, s, p, o, q) → PartOrder(p, o)
  • Define the adapted mapping to be a mapping
    Source' → Target that is equivalent (e.g., same data
    movement) to the sequence of the evolution
    mapping and the original mapping.
  • The previous self-join mapping over LineItem (m'')
    satisfies this condition for e1, e2 and m.
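The "same data movement" condition can be illustrated by chaining the evolution mappings e1 and e2 with the original mapping m on a sample instance. This is a sketch under the assumption that each mapping is run as a plain set-based transformation; the data and the function names are hypothetical.

```python
# Sketch only: LineItem tuples are (l, s, p, o, q); data is hypothetical.

def evolve(line_item):
    """Apply the evolution mappings e1 and e2 to a Source' instance."""
    supp_part = {(s, p) for (_l, s, p, _o, _q) in line_item}    # e1
    part_order = {(p, o) for (_l, _s, p, o, _q) in line_item}   # e2
    return supp_part, part_order

def apply_m(supp_part, part_order):
    """m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o)."""
    return {
        (s, o)
        for (s, p) in supp_part
        for (p2, o) in part_order
        if p == p2
    }

def adapted(line_item):
    """Sequence of the evolution mapping and the original mapping."""
    return apply_m(*evolve(line_item))

line_item = {
    (1, "s1", "p1", "o1", 10),
    (2, "s2", "p1", "o2", 5),
    (3, "s3", "p2", "o3", 1),
}

result = adapted(line_item)
```

Any single mapping from Source' to Target that qualifies as the adapted mapping should produce this same set of PotentialSupp facts on every instance, not only on this one.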

15
  • The composition approach is a more systematic
    approach, with precise semantics, guaranteed to
    behave the right way in all situations.
  • Although it may appear simple in the previous
    example, mapping composition poses challenges

16
Challenges in Composition Approach
  • Mapping language
      • Must handle nesting and complex types (as in XML
        Schema) (details in the paper)
      • Furthermore, the usual mapping languages (GLAV,
        tuple-generating dependencies) are not closed
        under composition!
      • Recent extension that ensures composability:
        second-order tgds [FKPT04].
          • Main idea: add functions to gain the needed
            expressive power
  • Semantics and Algorithm (next)
  • Efficiency/Scalability
17
Composition Semantics and Algorithm
18
Composition Semantics
  • In mapping composition, we want to replace a
    sequence of schema mappings with one that is
    equivalent and avoids the middle schema.
  • What does "equivalent" mean?
  • We considered two semantics
      • Relationship semantics: more general
      • Transformation semantics: more suitable, specialized

19
Relationship Semantics
  • Mappings can be viewed as describing
    relationships between instances over the two
    schemas
      • Rel(M12) = { (I1, I2) | (I1, I2) satisfies M12 }
  • Composition of relationships
      • Rel(M12) ∘ Rel(M23) = { (I1, I3) | there is I2 such that
        (I1, I2) satisfies M12 and (I2, I3) satisfies M23 }
  • [FKPT04, Melnik04] A mapping M13 is equivalent
    to the sequence of M12 and M23, under the
    relationship semantics, if
      • Rel(M13) = Rel(M12) ∘ Rel(M23)
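The composition of relationships is ordinary composition of binary relations. A minimal sketch, assuming the relations are small finite sets of pairs and that instances are represented by opaque labels (in general Rel(M) is not finite, so this only illustrates the definition):

```python
# Sketch only: instances are opaque, hypothetical labels such as "I1a".

def compose(rel12, rel23):
    """Pairs (I1, I3) for which some middle instance I2 links them."""
    return {
        (i1, i3)
        for (i1, i2) in rel12
        for (j2, i3) in rel23
        if i2 == j2
    }

rel12 = {("I1a", "I2a"), ("I1a", "I2b"), ("I1b", "I2c")}
rel23 = {("I2a", "I3x"), ("I2b", "I3y")}

rel13 = compose(rel12, rel23)
```

Here I1b drops out of the composition because its only partner I2c is related to no instance of the third schema.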

20
Example: Semantics and Algorithm
[Figure: schemas S1, S2 and S3, with mapping M from S1 to S2 and mapping E from S2 to S3]
  S1: Takes(name, course)
  S2: Student(sid, name), Enrolls(sid, course)
  S3: Takes(sid, name, course)
M: Takes(n, c) → Student(F(n), n) ∧ Enrolls(F(n), c)
   (a second-order tgd [FKPT04]; the Skolem term F(n) stands for the unknown student id)
E: Student(s, n) ∧ Enrolls(s, c) → Takes(s, n, c)

1. Substitution
M13: Takes(n, c') ∧ Takes(n', c) ∧ F(n) = F(n') → Takes(F(n), n, c)
  • M13 correctly captures the equivalent
    relationship between instances of S1 and S3.
  • Instances (and function F) can exist a priori.
      • A student n must be paired with a course c
      • even when c is listed under a different student
        name n',
      • provided the student id is the same: F(n) = F(n')
21
  • However, if we assume that the function F is
    one-to-one, an important simplification can be
    made

After substitution (equivalent relationship):
M13: Takes(n, c') ∧ Takes(n', c) ∧ F(n) = F(n') → Takes(F(n), n, c)
2. Reduction: F(n) = F(n') ⇒ n = n'
M13: Takes(n, c') ∧ Takes(n, c) → Takes(F(n), n, c)
3. Minimization (equivalent transformation):
M13: Takes(n, c) → Takes(F(n), n, c)
  • We can always make this assumption if mappings
    are meant to describe transformations (i.e.,
    generation of a target instance).
      • F is a Skolem function assigning unique student
        ids: n ↦ F(n)

22
Transformation Semantics
  • A mapping is a process (in the spirit of data
    exchange [FKMP03])
      • I2 = M12(I1)
      • Each mapping formula is a generator of target
        facts
      • Functions are one-to-one value generators
  • Theorem. Our composition algorithm produces the
    schema mapping with the equivalent transformation
    semantics
      • M13(I1) = M23(M12(I1)) (up to
        the renaming of nulls)
  • Advantage of transformation semantics in
    adaptation
      • → simpler and more intuitive formulas!
23
Composition Algorithm: Further Details
  • The substitution step is more complex than shown
      • Must handle nesting
      • Generate parameterized rules for set types in the
        middle schema
      • Reuse some of the mapping-based query rewriting
        techniques [YP04]
  • Minimization
      • Good: it simplifies formulas and generates an
        intuitive mapping
        (all this is enabled by the transformation
        semantics)
      • Bad: it can be expensive (same as tableau
        minimization)

24
Optimization and Experimental Results
25
Full Adaptation
  • Full adaptation: compose whole schema mappings
      • (Compose all the formulas in the original
        mapping with all the formulas in the evolution
        mapping)
  • Inevitable when the schema evolution is drastic
    and affects most of the original mapping
    (non-incremental evolution)
  • Inefficient when the changes are small and
    localized (incremental evolution)

26
Compose Only When Necessary
  • Mapping Pruning
      • 1. Detect those parts (formulas) Mo' of the
        original mapping Mo that are affected by
        evolution.
          • Only Mo' needs to be adapted.
      • 2. Only a subset Me' of the formulas in the
        evolution mapping Me plays a role in the
        composition with Mo'
          • The rest are redundant (PTIME containment-like
            analysis, see paper)
      • 3. Compose Mo' with Me'
  • → Big performance gain for incremental evolution
    and overall.
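The first two pruning steps can be sketched very roughly as follows. This is not the containment-like analysis from the paper: it assumes the target schema evolves, it compares relation names only, and it assumes that evolution formulas which merely copy a relation unchanged are flagged as `identity`. The `Formula` class, the `prune` function and the sample formulas are all hypothetical.

```python
from dataclasses import dataclass

# Sketch only: a formula is reduced to the relation names it reads and writes.
@dataclass(frozen=True)
class Formula:
    name: str
    reads: frozenset
    writes: frozenset
    identity: bool = False  # assumed flag: formula copies a relation unchanged

def prune(original, evolution):
    """Return (affected original formulas, relevant evolution formulas)."""
    # Relations of the old schema that some non-identity formula rewrites.
    changed = set()
    for f in evolution:
        if not f.identity:
            changed |= f.reads
    # Step 1: original formulas that populate a changed relation.
    affected = [f for f in original if f.writes & changed]
    # Step 2: evolution formulas that read what the affected formulas write.
    touched = set().union(*(f.writes for f in affected))
    relevant = [f for f in evolution if f.reads & touched]
    return affected, relevant

original = [
    Formula("m1", frozenset({"A"}), frozenset({"T1"})),
    Formula("m2", frozenset({"B"}), frozenset({"T2"})),
]
evolution = [
    Formula("e1", frozenset({"T1"}), frozenset({"T1new"})),
    Formula("e2", frozenset({"T2"}), frozenset({"T2"}), identity=True),
]

affected, relevant = prune(original, evolution)
```

Only the affected and relevant formulas would then be handed to the composition step (step 3); the remaining original formulas are kept as they are.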

27
Analysis of Evolution Scenarios
Results based on Clio
Benefits 1 adapted mappings / (blank-sheet
mappings missed mappings)
We also have synthetic scenarios that show
scalability of Mapping Pruning with increasing
schema and mapping complexity
28
Conclusion
  • We studied
      • Mapping composition techniques for mapping
        adaptation
      • Transformation semantics in the context of schema
        evolution
  • We designed and implemented a practical adaptation
    system
      • Mapping pruning (schema evolution specific)
  • To Do
      • Optimization of composition in general
      • Improve performance of minimization