QED: An Efficient Framework for Temporal Region Query Processing


1
QED: An Efficient Framework for Temporal Region
Query Processing
  • Yi-Hong Chu
  • Network Database Laboratory
  • Dept. of Electrical Engineering
  • National Taiwan University

2
Introduction
  • Dense Region Query
  • Data records are viewed as data points in the
    d-dimensional data space constructed by the
    d-attributes.
  • Locate the regions with higher density than their
    surroundings.

[Figure: a dense region in the Age vs. Salary (1000) data space]
3
Grid-based Approach
  • The data space is divided into non-overlapping
    rectangular grids (cells).
  • Density of a cell
  • the percentage of data points contained in this
    cell
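
The cell density defined above can be illustrated with a minimal sketch. The function name `cell_densities`, the parameter `cell_width`, and the sample points are assumptions made for illustration; the slides do not specify an implementation.

```python
from collections import Counter

def cell_densities(points, cell_width):
    """Map each non-empty grid cell to the fraction of all points it contains."""
    # A cell is identified by the integer grid index of the point in each dimension.
    counts = Counter(
        tuple(int(coord // cell_width) for coord in point)
        for point in points
    )
    total = len(points)
    return {cell: n / total for cell, n in counts.items()}

# Made-up (age, salary in thousands) records, for illustration only.
points = [(23, 31), (25, 35), (27, 38), (41, 72), (58, 15)]
densities = cell_densities(points, cell_width=10)
```

Here three of the five points fall into the cell with index (2, 3), so that cell has density 0.6, while the other two occupied cells have density 0.2 each.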

[Figure: grid over Age (0 to 100) vs. Salary (1000), showing a dense cell and a dense region formed by maximal connected dense cells]
4
Motivation
  • Previous research tends to ignore the time
    feature of the data.
  • They execute queries over the entire database.
  • However, different dense regions may be
    discovered if different time periods are taken
    into consideration.
  • (the density of a cell depends on the time period considered)
  • Discovering dense regions over different time
    intervals is crucial for users to get the
    interesting patterns hidden in data.

5
Example
  • Some dense regions may exist in certain time
    intervals but will not be discovered if taking
    all data records into account.
  • Middle-aged people

[Figure: <A> the number of customers in different time slots; <B> the number of middle-aged people in different time slots]
6
Temporal Dense Region Query
  • Dense Region Discovery in the constrained time
    intervals.
  • E.g., each Sunday in May
  • Time slots
  • Derived by segmenting the data points with a time
    granularity, e.g. hour, week, month, etc.
  • For users to specify a variety of time periods of
    interest
  • Problem Definition
  • Given a set of time slots and a density
    threshold, find the dense regions in the
    queried time slots.
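
The problem definition can be made concrete with a small sketch that works directly on per-slot cell counts (no RF-tree yet). The names `build_slot_counts` and `dense_cells`, the slot labels, the sample records, and the use of `>=` against the threshold are all assumptions for illustration. The sketch returns dense cells only; grouping them into regions is a separate step.

```python
from collections import Counter, defaultdict

def build_slot_counts(records, cell_width):
    """Per time slot, count how many points fall into each grid cell."""
    slots = defaultdict(Counter)
    for slot, point in records:
        cell = tuple(int(coord // cell_width) for coord in point)
        slots[slot][cell] += 1
    return slots

def dense_cells(slots, queried_slots, threshold):
    """Return the cells whose density over the queried slots reaches the threshold."""
    merged = Counter()
    for slot in queried_slots:
        merged.update(slots.get(slot, Counter()))  # Counter.update adds counts
    total = sum(merged.values())
    if total == 0:
        return set()
    return {cell for cell, n in merged.items() if n / total >= threshold}

# Made-up (time slot, (age, salary in thousands)) records.
records = [
    ("week1", (45, 60)), ("week1", (47, 62)), ("week1", (22, 30)),
    ("week2", (21, 33)), ("week2", (24, 35)), ("week2", (26, 31)),
    ("week2", (65, 20)), ("week2", (70, 80)),
]
slots = build_slot_counts(records, cell_width=10)
only_week1 = dense_cells(slots, ["week1"], threshold=0.5)
both_weeks = dense_cells(slots, ["week1", "week2"], threshold=0.5)
```

With this data, the middle-aged cell (4, 6) is dense when only `week1` is queried but not when both weeks are combined, which mirrors the motivation given earlier.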

7
QED Framework
  • Challenge
  • The queried time intervals are unknown in
    advance.
  • QED (Querying tEmporal Dense region)
  • Offline Maintaining Phase
  • Construct a summarized data structure, RF-tree,
    for each time slot
  • Online Clustering Phase
  • Answer various user queries based on the RF-trees

8
QED Framework

9
Offline Maintaining Phase: Construct the RF-trees
  • Basic Idea
  • A number of cells having nearly the same density
    value can be summarized by their average density
    value.
  • Uniform Region
  • A region in which the cells have nearly the same
    density value

10
Uniform Region
  • Entropy-based approach
  • Entropy of a region
  • Maximum entropy of a region
  • Uniform region
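
The formulas on this slide were not preserved in the transcript. As a hedged sketch of what an entropy-based uniformity test could look like, the code below assumes Shannon entropy over the share of points in each cell, a maximum entropy of log2(number of cells), and a hypothetical `ratio_threshold` parameter; none of these specifics are confirmed by the slides.

```python
import math

def region_entropy(cell_counts):
    """Shannon entropy (in bits) of how a region's points spread over its cells."""
    total = sum(cell_counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in cell_counts if n > 0)

def max_entropy(num_cells):
    """Entropy reached when every cell holds the same number of points."""
    return math.log2(num_cells)

def is_uniform(cell_counts, ratio_threshold=0.95):
    """Assumed test: entropy is close enough to the maximum for this many cells."""
    if len(cell_counts) <= 1 or sum(cell_counts) == 0:
        return True  # a single cell or an empty region has one density value
    return region_entropy(cell_counts) >= ratio_threshold * max_entropy(len(cell_counts))

evenly_spread = is_uniform([10, 10, 10, 10])
concentrated = is_uniform([37, 1, 1, 1])
```

Under these assumptions, four cells with equal counts reach the maximum entropy of 2 bits and pass the test, while a region with almost all points in one cell does not.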

11
Example (Uniform Region)
  • Case 1
  • Case 2

[Figures: Region A in Case 1 and in Case 2]
12
Construct the RF-tree
  • Recursively partition the data space to find the
    uniform region
  • The leaf nodes will be of two cases
  • A cell
  • A uniform region
  • RF (Region Feature)
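
The recursive partitioning can be sketched as follows. The transcript does not preserve the definition of RF or the partitioning rule, so this sketch assumes that a node stores the number of cells and the number of points in its region, that regions are halved along their longer side, and that uniformity is judged by the assumed entropy test `is_uniform`. The names `Node`, `build`, and `leaves` are hypothetical.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    bounds: tuple      # (row_start, row_end, col_start, col_end), ends exclusive
    num_cells: int     # assumed region feature: number of cells covered
    num_points: int    # assumed region feature: number of points covered
    children: list = field(default_factory=list)

def is_uniform(counts, ratio=0.95):
    """Assumed test: entropy of the cell counts is near its maximum."""
    total = sum(counts)
    if len(counts) <= 1 or total == 0:
        return True
    entropy = -sum((n / total) * math.log2(n / total) for n in counts if n > 0)
    return entropy >= ratio * math.log2(len(counts))

def build(grid, bounds=None):
    """Recursively halve the region until each leaf is a cell or a uniform region."""
    if bounds is None:
        bounds = (0, len(grid), 0, len(grid[0]))
    r0, r1, c0, c1 = bounds
    counts = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    node = Node(bounds, len(counts), sum(counts))
    if is_uniform(counts):
        return node  # leaf: a single cell or a uniform region
    if r1 - r0 >= c1 - c0:  # split the longer side at its midpoint
        mid = (r0 + r1) // 2
        halves = [(r0, mid, c0, c1), (mid, r1, c0, c1)]
    else:
        mid = (c0 + c1) // 2
        halves = [(r0, r1, c0, mid), (r0, r1, mid, c1)]
    node.children = [build(grid, half) for half in halves]
    return node

def leaves(node):
    """Collect the leaf nodes from left to right."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

# Made-up per-cell point counts for one time slot.
grid = [
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [5, 5, 0, 40],
    [5, 5, 0, 0],
]
tree = build(grid)
leaf_nodes = leaves(tree)
```

For this grid the 16 cells are summarized by 6 leaves: four uniform regions and two single cells around the spike of 40 points.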

13
Online Query Processing Phase
  • Step 1
  • Combine the RF-trees of the queried time slots.
  • Step 2
  • Execute the query on the combined RF-tree.

14
Step 1: Combine the RF-trees
  • Three cases for combining the corresponding
    regions in two RF-trees.
  • Case 1: Both are uniform regions
  • Case 2: Both are non-uniform regions
  • Case 3: Only one is a uniform region
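
The slide names the three cases but the transcript does not say how each is handled. The sketch below is one plausible reading, under loudly stated assumptions: both trees partition the space at the same positions, a node stores a cell count and a point count, two uniform regions are merged by adding their counts, and a uniform region is spread over the other tree's children in proportion to their cell counts. `Node`, `combine`, and `spread` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    num_cells: int
    num_points: float
    children: list = field(default_factory=list)  # empty for a leaf

def spread(leaf, shape):
    """Give a uniform leaf the child layout of `shape`, sharing points by cell count."""
    children = [
        Node(c.num_cells, leaf.num_points * c.num_cells / leaf.num_cells)
        for c in shape.children
    ]
    return Node(leaf.num_cells, leaf.num_points, children)

def combine(a, b):
    """Merge the corresponding regions of two trees (assumed aligned partitions)."""
    # Case 1: both are leaves (uniform regions or cells), so the counts add up.
    if not a.children and not b.children:
        return Node(a.num_cells, a.num_points + b.num_points)
    # Case 3: only one is a leaf, so expand it to match the other side.
    if not a.children:
        a = spread(a, b)
    elif not b.children:
        b = spread(b, a)
    # Case 2 (and case 3 after expansion): merge child by child.
    merged_children = [combine(x, y) for x, y in zip(a.children, b.children)]
    return Node(a.num_cells, a.num_points + b.num_points, merged_children)

# Slot 1: one uniform region of 4 cells holding 8 points.
slot1 = Node(4, 8)
# Slot 2: a non-uniform region split into two halves, the first split again.
slot2 = Node(4, 20, [Node(2, 18, [Node(1, 17), Node(1, 1)]), Node(2, 2)])
combined = combine(slot1, slot2)
```

In this example the uniform region of slot 1 contributes 2 points per cell, so the busiest cell of the combined tree holds 17 + 2 = 19 points.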

15
Step 2: Execute the query
  • All leaf nodes in the combined RF-tree are
    examined to discover the dense cells in the data
    space.
  • The leaf nodes will be of two cases
  • A cell
  • A uniform region: compare the average density
    with the density threshold
  • The leaf nodes containing dense cells will be put
    into a queue for further dense region discovery.
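
The query step can be sketched as follows, assuming each leaf is given as a pair of bounds and point count, that a leaf is dense when its average per-cell density reaches the threshold, and that dense regions are groups of edge-adjacent dense cells found with a breadth-first search. The function names and the sample leaves are hypothetical, and the queue here holds cells rather than leaf nodes.

```python
from collections import deque

def dense_cells_from_leaves(leaves, total_points, threshold):
    """leaves: list of ((r0, r1, c0, c1), num_points), bounds with exclusive ends."""
    dense = set()
    for (r0, r1, c0, c1), num_points in leaves:
        num_cells = (r1 - r0) * (c1 - c0)
        # Average share of all points per cell; for a single cell, its own density.
        avg_density = num_points / num_cells / total_points
        if avg_density >= threshold:
            dense.update((r, c) for r in range(r0, r1) for c in range(c0, c1))
    return dense

def dense_regions(dense):
    """Group dense cells into maximal sets of edge-adjacent cells."""
    regions, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, region = deque([start]), set()
        while queue:
            r, c = queue.popleft()
            region.add((r, c))
            for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if neighbor in dense and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        regions.append(region)
    return regions

# Made-up leaves of a combined tree over a 4 x 4 grid holding 80 points.
leaves = [
    ((0, 2, 0, 2), 40), ((0, 2, 2, 4), 0), ((2, 4, 0, 2), 8),
    ((2, 3, 2, 3), 0), ((2, 3, 3, 4), 30), ((3, 4, 2, 4), 2),
]
dense = dense_cells_from_leaves(leaves, total_points=80, threshold=0.1)
regions = dense_regions(dense)
```

With a threshold of 0.1, the first leaf (average density 0.125) and the single cell with 30 points (density 0.375) are dense, giving two separate dense regions.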

16
Conclusion
  • The problem of temporal dense region query is
    explored to discover dense regions in the queried
    time slots.
  • We also propose the QED framework to execute
    temporal dense region queries.
  • QED is advantageous in that various queries with
    different density thresholds and time slots can
    be supported efficiently by using the concept of
    time slots and the proposed RF-tree.

17
References
  • Yi-Hong Chu, Kun-Ta Chuang, and Ming-Syan Chen,
    QED: An Efficient Framework for Temporal Dense
    Region Processing, in Proc. of PAKDD, 2005.
  • W. Wang, J. Yang, and R. Muntz, STING: A
    Statistical Information Grid Approach to Spatial
    Data Mining, in Proc. of VLDB, 1997.
  • D.-S. Cho, B.-H. Hong, and J. Max, Efficient
    Region Query Processing by Optimal Page Ordering,
    in Proc. of ADBIS-DASFAA, 2000.

18
Thank You
  • Q & A