Query Relaxation Using Malleable Schemas

Slides: 23
Provided by: Aaron197

1
Query Relaxation Using Malleable Schemas
  • Xuan Zhou, Julien Gaugaz, Wolf-Tilo Balke,
    Wolfgang Nejdl
  • L3S Research Center
  • Leibniz University
  • Hanover, Germany

Presented by Aaron Stewart BYU CS 652 Spring 2009
2
Problem

?
3
Problem
  • Multiple data sources
  • Unmatched schemas

4
Approach
  1. Malleable schemas
  2. Discover correlations
  3. Relax user queries

5
Malleable Schemas
  • Allow duplicate fields
  • Allow related fields

6
Malleable Schemas
7
Malleable Schemas
first_name, sur_name
name
8
Malleable Schemas
body
contents
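
The two slides above can be illustrated with a minimal sketch. It assumes a malleable schema is simply the union of all attribute names seen across sources, with duplicate and related attributes kept side by side; the records and the `malleable_schema` helper are hypothetical and not taken from the presentation.

```python
# Hypothetical records from two sources whose schemas were never matched.
records = [
    {"first_name": "Ada", "sur_name": "Lovelace", "body": "Notes on the engine"},
    {"name": "Ada Lovelace", "contents": "Notes on the engine"},
]


def malleable_schema(records):
    """Return the union of attribute names across all records.

    Overlapping attributes (name vs. first_name/sur_name, body vs. contents)
    are all kept rather than merged into one canonical attribute.
    """
    attributes = set()
    for record in records:
        attributes.update(record)  # adds the record's keys
    return sorted(attributes)


schema = malleable_schema(records)
```

Nothing is reconciled at this stage; deciding that name and first_name are related is left to the correlation step.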
9
In Practice: Tables
  • A malleable schema contains imprecise and
    overlapping definitions of attributes or
    relationships.
  • In this way, a malleable schema can capture such
    heterogeneous data structures as in Figure 1.

10
In Practice: Tables
11
In Practice: Tables
Attributes (database fields, columns)
Entities (database records, rows)
Equivalently: Distinct tables
12
Query Relaxation Planning
  • Multiple queries
  • Different columns or tables
  • As few queries as possible
  • Exponential number of relaxed queries
  • Evaluate in order of precision
  • Stop at k results
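
The last two bullets can be sketched as a loop over a priority queue. This is only an illustration under stated assumptions: `top_k`, `relaxed_queries`, and `run_query` are hypothetical names, the precision estimates are taken as given, and the candidate queries are listed up front, which would not scale to the exponential number of relaxed queries mentioned above.

```python
import heapq


def top_k(relaxed_queries, run_query, k):
    """Evaluate queries from most to least precise; stop once k results are found.

    relaxed_queries: list of (estimated_precision, query) pairs (hypothetical).
    run_query: callable returning a list of hashable results for a query
               (hypothetical stand-in for the data source).
    """
    # heapq is a min-heap, so negate precision to pop the most precise first.
    # The index breaks ties without comparing the query objects themselves.
    heap = [(-precision, index, query)
            for index, (precision, query) in enumerate(relaxed_queries)]
    heapq.heapify(heap)

    results, seen = [], set()
    while heap and len(results) < k:
        _, _, query = heapq.heappop(heap)
        for row in run_query(query):
            if row not in seen:  # different relaxed queries may overlap
                seen.add(row)
                results.append(row)
                if len(results) == k:
                    break
    return results
```

Less precise queries are never sent to the data source once k results have been collected, which is the point of ordering by precision.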

13
Query Relaxation Planning
relaxed attribute
child attributes
A1
A2
14
Query Relaxation Planning
  • A relaxed query always yields better precision
    than its child queries, so that it should always
    be evaluated prior to its child queries

15
Parent/Child Relationship
  • We would think A is the parent, and A1 and A2 are
    the children, but
  • Put them in order of correlation probability
  • If P(AA1) > P(AA2)
  • Then A > A1 > A2
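
The ordering rule can be sketched as a sort, assuming each related attribute comes with a correlation probability to the original attribute A. The function name and the probability values are made up for illustration.

```python
def relaxation_order(attribute, correlations):
    """Order an attribute and its related attributes, strongest correlation first.

    correlations: dict mapping a related attribute name to its (assumed)
    correlation probability with `attribute`.
    """
    related = sorted(correlations, key=correlations.get, reverse=True)
    # The original attribute comes first, followed by the related ones.
    return [attribute] + related


# A1 correlates more strongly with A than A2 does (made-up values).
order = relaxation_order("A", {"A2": 0.4, "A1": 0.7})
```

With these values the result places A first, then A1, then A2, so each attribute acts as the parent of the one after it.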

16
Query Relaxation Planning
17
Query Relaxation
18
Experiments
  • Data sets
      • IMDB Movies
      • Amazon.com DVDs and VHS videos

19
Results
20
Results
21
Results
22
Analysis
  • Strengths
      • Handles mixed schemas
      • Well-designed algorithms (IMO)
  • Future work
      • Speed