Concise Description of Structured Sets - PowerPoint PPT Presentation

Provided by: ken49
1
Concise Description of Structured Sets
  • Alberto O. Mendelzon
  • Ken Q. Pu
  • University of Toronto

2
Motivation: A true story
  • An OLAP report-writing software package

GUI Range Selection → SQL Translation → RDBMS → Report Writer / Plotting
3
The GUI
The user selects the region of interest via the
GUI: CTRL-A selects ALL, click-and-drag selects
LOTS.
4
An Engineering Problem and Solution
  • The Problem
  • SQL statement has a length limit.
  • User selection is LARGE, especially when the
    visualization is plotting.
  • The Solution
  • Break a large selection into a collection of SQL
    queries.
  • The Complaint
  • Kind of slow!
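
The workaround described on this slide can be sketched as follows. This is only an illustration, not the product's actual code: the table name `sales`, the column name `city`, and the function `chunk_in_queries` are hypothetical, and a real system would more likely use parameterized queries than string building.

```python
def chunk_in_queries(values, max_len, table="sales", column="city"):
    """Split one oversized IN-list selection into several queries,
    each at most max_len characters long (hypothetical schema)."""
    prefix = f"SELECT * FROM {table} WHERE {column} IN ("
    suffix = ")"
    queries, current = [], []
    for value in values:
        literal = "'" + value.replace("'", "''") + "'"  # escape single quotes
        candidate = current + [literal]
        too_long = len(prefix + ", ".join(candidate) + suffix) > max_len
        if current and too_long:
            # Flush the batch collected so far and start a new one.
            queries.append(prefix + ", ".join(current) + suffix)
            current = [literal]
        else:
            current = candidate
    if current:
        queries.append(prefix + ", ".join(current) + suffix)
    return queries


cities = ["Ottawa", "Toronto", "Windsor", "Kingston"]
queries = chunk_in_queries(cities, max_len=60)
```

Every extra query is another round trip to the RDBMS, which is where the "kind of slow" complaint comes from; a shorter description of the same selection avoids the splitting altogether.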

5
A formalization
  • A tree-structured categorical set (dimensions)
  • A selected subset (user selection)
  • Find the shortest possible description of the
    selected subset (translated SQL query)

6
Outline
  • Formal definition of structured sets, expressions
    and the Minimal Descriptive Length (MDL) problem
  • Partition structures
  • Hierarchical structures
  • Multidimensional product structures

7
The structured set
  • A finite universe U
  • A finite alphabet Σ
  • An interpretation function [·]: Σ → P(U)
  • E.g. [a] = {1, 2, 3}, [1] = {1}
  • U/Σ is the set cover naturally induced by Σ on U.
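
A minimal sketch of a structured set, using the city example that appears later in the deck. The names `INTERP`, `induced_cover`, `is_cover` and `is_partition` are illustrative, and the sketch assumes that every element of U is also a symbol denoting its own singleton.

```python
# Toy structured set taken from the city example in this deck.
ONTARIO = frozenset({"Ottawa", "Toronto", "Windsor", "Kingston"})
CALIFORNIA = frozenset({"San Jose", "San Francisco", "San Diego"})
U = ONTARIO | CALIFORNIA

# Interpretation function: each symbol of the alphabet maps to a subset of U.
# Assumption: each city is also a symbol denoting its own singleton.
INTERP = {"Ontario": ONTARIO, "California": CALIFORNIA}
INTERP.update({city: frozenset({city}) for city in U})


def induced_cover(interp):
    """U/Sigma: the family of subsets of U named by the alphabet."""
    return set(interp.values())


def is_cover(universe, family):
    """True if the union of the family is the whole universe."""
    return frozenset().union(*family) == universe


def is_partition(universe, family):
    """True if the family covers the universe with pairwise disjoint blocks."""
    family = list(family)
    total = sum(len(block) for block in family)
    return is_cover(universe, family) and total == len(universe)
```

Here the full alphabet induces a cover of U, while the two region symbols alone induce a partition.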

8
Expressions
  • A language L
  • An evaluation function [·]: L → P(U)
  • The propositional language L(U, Σ):
    L ::= Σ | L + L | L − L | L · L
  • The union-only sublanguage of L(U, Σ):
    L ::= Σ | L + L
  • + is union, − is difference, · is
    intersection.
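
The evaluation function can be sketched as a small recursive interpreter. The tuple encoding of expressions, the ASCII operator names `"+"`, `"-"`, `"*"`, and the convention that the length of an expression is its number of alphabet symbols are assumptions made for this illustration.

```python
ONTARIO = frozenset({"Ottawa", "Toronto", "Windsor", "Kingston"})
CALIFORNIA = frozenset({"San Jose", "San Francisco", "San Diego"})
INTERP = {"Ontario": ONTARIO, "California": CALIFORNIA}
INTERP.update({city: frozenset({city}) for city in ONTARIO | CALIFORNIA})


def evaluate(expr, interp):
    """Evaluate an expression: a symbol (str) or a triple (op, left, right)."""
    if isinstance(expr, str):
        return interp[expr]
    op, left, right = expr
    lhs, rhs = evaluate(left, interp), evaluate(right, interp)
    if op == "+":
        return lhs | rhs   # union
    if op == "-":
        return lhs - rhs   # difference
    if op == "*":
        return lhs & rhs   # intersection
    raise ValueError(f"unknown operator: {op!r}")


def length(expr):
    """Number of alphabet symbols occurring in the expression."""
    if isinstance(expr, str):
        return 1
    _, left, right = expr
    return length(left) + length(right)


# (Ontario + San Francisco) - Windsor
expr = ("-", ("+", "Ontario", "San Francisco"), "Windsor")
selected = evaluate(expr, INTERP)
```

With this encoding, `selected` is the set {Ottawa, Toronto, Kingston, San Francisco} and `length(expr)` is 3.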

9
The L-MDL Problem
  • Let V ⊆ U; its descriptive length is
    ‖V‖_L = min{ |s| : [s] = V and s ∈ L }
  • The L-MDL decision problem: is ‖V‖_L ≤ k?
  • The L-MDL problem: find a (non-unique) compact
    (shortest) expression for V in L.
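
For very small universes the descriptive length can be found by exhaustive search. The sketch below is a brute-force illustration only, not an algorithm from the talk; it assumes the three operators union, difference and intersection, and that the length of an expression is its number of symbols. `descriptive_length` and `max_len` are illustrative names.

```python
ONTARIO = frozenset({"Ottawa", "Toronto", "Windsor", "Kingston"})
CALIFORNIA = frozenset({"San Jose", "San Francisco", "San Diego"})
INTERP = {"Ontario": ONTARIO, "California": CALIFORNIA}
INTERP.update({city: frozenset({city}) for city in ONTARIO | CALIFORNIA})


def descriptive_length(target, interp, max_len=6):
    """Smallest number of symbols in an expression evaluating to target,
    or None if no expression with at most max_len symbols exists."""
    target = frozenset(target)
    # by_len[k] holds every subset described by some expression
    # that uses exactly k symbols.
    by_len = {1: {frozenset(s) for s in interp.values()}}
    for k in range(1, max_len + 1):
        if k > 1:
            found = set()
            for i in range(1, k):
                for a in by_len[i]:
                    for b in by_len[k - i]:
                        found.update((a | b, a - b, a & b))
            by_len[k] = found
        if target in by_len[k]:
            return k
    return None


V = {"Ottawa", "Toronto", "Kingston", "San Francisco"}
shortest = descriptive_length(V, INTERP)
```

On this toy input the search returns 3, matching the expression Ontario + San Francisco − Windsor. The table of reachable subsets can grow exponentially with the size of U, so this is only useful as a reference check.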

10
Complexity Issues
  • The L(U, Σ)-MDL problem is NP-complete.
  • The L-MDL problem is NP-complete (reduction
    from set cover).

Back to Engineering
  • Is there anything good we can do?

11
Tractable MDL-Problems
  • Strategy: reduce the complexity of the MDL
    problem by restricting to a class of structures.
  • Two tractable cases:
  • Partition: Σ = Σ_1 ∪ U, and U/Σ_1 is a partition
  • Hierarchy: Σ = Σ_N ∪ Σ_{N-1} ∪ … ∪ Σ_1 ∪ U, and
    U/Σ_i refines U/Σ_{i+1} for all i > 0.

12
Partition
[Figure: the seven cities of U grouped into two blocks, Ontario and California]
U = {Ottawa, Toronto, Windsor, Kingston,
San Jose, San Francisco, San Diego}
Σ_1 = {Ontario, California}
13
Efficient Symbols
  • Efficient symbols for V:
    E(V) = { σ ∈ Σ_1 : |[σ] ∩ V| > |[σ] − V| + 1 }
  • V = {Ottawa, Toronto, Kingston, San Francisco},
    E(V) = {Ontario}
  • California is not efficient.
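
The efficiency test can be checked mechanically. The sketch below assumes the condition reads "a symbol is efficient when it covers more elements of V than the number of elements it over-covers plus one"; the name `efficient_symbols` is illustrative.

```python
SIGMA_1 = {
    "Ontario": frozenset({"Ottawa", "Toronto", "Windsor", "Kingston"}),
    "California": frozenset({"San Jose", "San Francisco", "San Diego"}),
}


def efficient_symbols(V, sigma_1):
    """Symbols whose use (1 symbol plus one per over-covered element)
    is cheaper than listing the covered elements of V one by one."""
    V = frozenset(V)
    return {
        symbol
        for symbol, members in sigma_1.items()
        if len(members & V) > len(members - V) + 1
    }


V = {"Ottawa", "Toronto", "Kingston", "San Francisco"}
efficient = efficient_symbols(V, SIGMA_1)
```

Ontario covers three elements of V and over-covers one (Windsor), so 3 > 1 + 1 holds. California covers one element and over-covers two, so 1 > 2 + 1 fails.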

14
Normal Form for Partitions
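  • N is a sub-language of L:
    t = (σ_1 + σ_2 + σ_3 + …) + (a_1 + a_2 + a_3 + …)
        − (b_1 + b_2 + b_3 + …)
    with S = {σ_i} ⊆ Σ_1 and A = {a_i}, B = {b_i} ⊆ U
  • t is N-compact for V if
  • S = E(V), the efficient symbols for V
  • A = V − [S]
  • B = [S] − V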

15
L-MDL for Partitions
  • Every expression in L can be reduced to an
    equivalent expression in N!
  • To compute a compact expression for V:
  • S = E(V), the efficient symbols for V
  • D(V) = (A, B) = (V − [S], [S] − V)

16
An Example
[Figure: the seven cities of U, grouped into Ontario and California]
  • V = {Ottawa, Toronto, Kingston, San Francisco}
  • E(V) = {Ontario}, (A, B) = ({San Francisco},
    {Windsor})
  • t = Ontario + San Francisco − Windsor
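
The example can be reproduced with a short sketch of the partition procedure: pick the efficient symbols, then add the elements of V left uncovered and subtract the over-covered ones. The function names and the ASCII rendering with `+` and `-` are illustrative assumptions.

```python
SIGMA_1 = {
    "Ontario": frozenset({"Ottawa", "Toronto", "Windsor", "Kingston"}),
    "California": frozenset({"San Jose", "San Francisco", "San Diego"}),
}


def compact_partition_expression(V, sigma_1):
    """Return (S, A, B): efficient symbols, elements to add, elements to remove."""
    V = frozenset(V)
    S = sorted(
        symbol
        for symbol, members in sigma_1.items()
        if len(members & V) > len(members - V) + 1
    )
    covered = frozenset().union(*(sigma_1[symbol] for symbol in S))
    A = sorted(V - covered)   # members of V not reached by any chosen symbol
    B = sorted(covered - V)   # elements the chosen symbols over-cover
    return S, A, B


def render(S, A, B):
    """Write the normal form as plain text, e.g. 'X + y - z'."""
    text = " + ".join(S + A)
    if B:
        text += " - " + " - ".join(B)
    return text


V = {"Ottawa", "Toronto", "Kingston", "San Francisco"}
S, A, B = compact_partition_expression(V, SIGMA_1)
expression = render(S, A, B)
```

For the V on this slide the sketch yields S = [Ontario], A = [San Francisco], B = [Windsor], rendered as "Ontario + San Francisco - Windsor".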

17
A Normal Form for Hierarchy
  • Levels, from finest to coarsest:
    (Σ_0 = U), Σ_1, Σ_2, Σ_3, …, Σ_N
  • A recursive definition for N: the expression t
    is normal w.r.t. level i if t = t′ + A − B, with
    A, B ⊆ Σ_i and t′ normal w.r.t. level (i+1).
  • Example: t = ((Canada + California − Ontario)
    + Toronto − San Jose)
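
To make the nesting concrete, the example expression can be evaluated level by level, coarsest level first. The encoding as a list of (added, removed) pairs is an assumption of this sketch, and Montreal is a hypothetical extra city, added only so that Canada is larger than Ontario.

```python
INTERP = {
    # Hypothetical: Montreal is not in the deck's universe.
    "Canada": frozenset({"Ottawa", "Toronto", "Windsor", "Kingston", "Montreal"}),
    "Ontario": frozenset({"Ottawa", "Toronto", "Windsor", "Kingston"}),
    "California": frozenset({"San Jose", "San Francisco", "San Diego"}),
    "Toronto": frozenset({"Toronto"}),
    "San Jose": frozenset({"San Jose"}),
}


def evaluate_normal_form(levels, interp):
    """Evaluate t = (...((A_N - B_N) + A_{N-1} - B_{N-1}) ...) + A_0 - B_0.

    levels lists (added, removed) symbol pairs from coarsest to finest."""
    result = frozenset()
    for added, removed in levels:
        for symbol in added:
            result |= interp[symbol]
        for symbol in removed:
            result -= interp[symbol]
    return result


# ((Canada + California - Ontario) + Toronto - San Jose)
t = [
    (["Canada"], []),                # country level
    (["California"], ["Ontario"]),   # province/state level
    (["Toronto"], ["San Jose"]),     # city level
]
selected = evaluate_normal_form(t, INTERP)
```

Under these assumptions the expression selects Montreal, Toronto, San Francisco and San Diego: all of Canada outside Ontario, all of California except San Jose, plus Toronto added back at the city level.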

18
L-MDL for Hierarchies
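  • Every expression in L can be reduced to an
    equivalent expression in N!
  • t = t′ + A − B is an N-compact expression for V
    if
  • [t′]_1 = E_1(V), the efficient level-1 symbols
    for V, and t′ is N-compact
  • (A, B) = D(V), the uncovered and over-covered
    elements, as for partitions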

19
An iterative algorithm
V
( V_N + A_N − B_N )
…
( s_3 + A_3 − B_3 )
…
( s_1 + A_1 − B_1 ) is compact
20
Back to Engineering
21
Multiple dimensions
  • What about multi-dimensional sets?
  • V ⊆ U_1 × U_2
  • U_1 and U_2 are hierarchically structured sets
  • U_1 has Σ = Σ_0 ∪ Σ_1 ∪ Σ_2 ∪ …
  • U_2 has Γ = Γ_0 ∪ Γ_1 ∪ Γ_2 ∪ …
  • Form a new (product) structured set
  • U = U_1 × U_2
  • Alphabet: Σ × Γ
  • Interpretation: [σ × γ] = [σ] × [γ]
22
Example
U_1 -- Time: {Jan, Feb, Mar}
U_2 -- Geography: {Ontario, California}
[Jan × Ontario] = {(Jan, Ontario)}
L-MDL for product structure is NP-complete.
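
A small sketch of the product interpretation, assuming a pair of symbols denotes the Cartesian product of their interpretations. The higher-level symbols `Q1` and `North America` are hypothetical and added only to show a non-trivial rectangle.

```python
from itertools import product

# Leaf symbols come from the slide; "Q1" and "North America" are hypothetical.
TIME = {
    "Jan": {"Jan"}, "Feb": {"Feb"}, "Mar": {"Mar"},
    "Q1": {"Jan", "Feb", "Mar"},
}
GEO = {
    "Ontario": {"Ontario"}, "California": {"California"},
    "North America": {"Ontario", "California"},
}


def interpret_product(sigma, gamma):
    """Interpretation of the product symbol sigma x gamma as a set of cells."""
    return set(product(TIME[sigma], GEO[gamma]))


cell = interpret_product("Jan", "Ontario")
column = interpret_product("Q1", "Ontario")
everything = interpret_product("Q1", "North America")
```

Each product symbol denotes an axis-parallel rectangle of cells, so describing an arbitrary V ⊆ U_1 × U_2 amounts to combining rectangles.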
23
Reduction by example
  • Reduction from 3-set cover

24
Related Work
  • The Generalized MDL Approach for Summarization,
    Lakshmanan, Ng, Wang, Zhou, Johnson, VLDB, 2002
  • Limited expressiveness for multidimensional data
    sets; the optimal expression is computed in
    polynomial time
  • The multidimensional MDL problem resembles the
    rectilinear polygon packing and covering problems
  • Covering Rectilinear Polygons with Axis-Parallel
    Rectangles, Kumar, Ramesh, ACM STOC, 1999

25
What's next?
  • More complex structures -- ordering
  • More complex operators -- range operator
  • Conjectured and known NP-hardness of the MDL
    problems
  • MDL for L with an ordering ≤ and a range operator
    is NP-complete, polynomial if ≤ is total.

26
The End.