Spatial tree logics to reason about Semistructured Data

1
Spatial tree logics to reason about
Semistructured Data
SEBD 2003
  • Speaker: Giovanni Conforti
  • Joint work with Giorgio Ghelli

Dipartimento di Informatica, Università di Pisa
2
What I'm going to talk about
  • A gentle introduction to Spatial Tree Logics
    (STL)
  • STL and Semistructured Data (SSD)
  • Properties of SSD (constraints, types, queries) →
    Spatial Tree Logic (STL) formulas
  • Decision problems for SSD → validity/satisfiability
    of STL formulas
  • Presentation of a decidable fragment of the TQL
    logic

3
Background: Spatial Logics
  • Modal logics to describe properties of structured
    worlds
  • Many applications: Ambient Calculus, π-calculus,
    tree-structured data, shared data structures, ...
  • Spatial (and temporal) modal operators to
    describe structure (and behavior)
  • The equivalence, model checking and validity problems
    have already been studied for many spatial logics
  • Many works involving Cardelli, Gordon, Caires,
    Ghelli, Gardner, ...

4
A Simple Ground Spatial Tree Logic
  • Worlds: information trees, i.e. unordered (multisets
    of) labeled trees
  • F, F' ::= 0          (empty root)
            | n[F]       (an edge labelled n leading to the i.t. F)
            | F | F'     (the i.t. F next to the i.t. F')
  • Logic: propositional logic connectives + modal
    operators describing the structure
  • A, B ::= True | Not A | A and B
           | 0 | n[A] | A | B
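
The information-tree model above can be sketched in a few lines of Python. This is only an illustrative encoding, not something taken from the talk: the names `EMPTY`, `edge` and `par` are hypothetical, and a tree is assumed to be a sorted tuple of (label, subtree) pairs so that two trees are equal exactly when they are equal as multisets of edges.

```python
# Hypothetical encoding (not from the slides): an information tree is a
# sorted tuple of (label, subtree) edges, so equal multisets compare equal.

EMPTY = ()  # the tree 0 (empty root)


def edge(label, subtree=EMPTY):
    """n[F]: one edge labelled `label` leading to `subtree`."""
    return ((label, subtree),)


def par(left, right):
    """F | F': multiset union of the two edge collections."""
    return tuple(sorted(left + right))


# Composition is commutative, has 0 as its unit, and keeps repeated
# edges (the model is a multiset, not a set).
a = edge("a")
b = edge("b", edge("c"))
print(par(a, b) == par(b, a))
print(par(a, EMPTY) == a)
print(len(par(a, a)))
```

Keeping the edges sorted is one simple way to make "unordered" concrete: the order in which subtrees are composed never shows up in the resulting value.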

5
Examples
  • F = book[ title[Databases[0]]
             | author[Ghelli[0]]
             | author[Albano[0]] ]

A = book[ author[Ghelli[0]] ]
B = book[ author[Ghelli[0]] | True ]
C = book[ Not (editor[True] | True) ]
D = book[ title[True] And author[True] ]
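
To make the examples concrete, here is a minimal model-checking sketch for the ground logic. It is an assumption-laden illustration rather than the TQL implementation: formulas are encoded as tagged tuples, trees as sorted tuples of (label, subtree) edges, and the bracketing of F and of A to D is assumed to be book[...] around the whole content. The composition case simply tries every way of splitting the multiset of edges in two.

```python
from itertools import combinations

EMPTY = ()  # the tree 0


def edge(label, subtree=EMPTY):
    """n[F]: a single edge labelled `label` leading to `subtree`."""
    return ((label, subtree),)


def par(*trees):
    """F | F' | ...: multiset union, kept sorted."""
    return tuple(sorted(e for t in trees for e in t))


def splits(tree):
    """Yield every decomposition of `tree` into two sub-multisets."""
    idx = range(len(tree))
    for k in range(len(tree) + 1):
        for chosen in combinations(idx, k):
            left = tuple(tree[i] for i in chosen)
            right = tuple(tree[i] for i in idx if i not in chosen)
            yield left, right


def holds(tree, f):
    """Decide whether `tree` satisfies the ground formula `f`."""
    tag = f[0]
    if tag == "true":
        return True
    if tag == "not":
        return not holds(tree, f[1])
    if tag == "and":
        return holds(tree, f[1]) and holds(tree, f[2])
    if tag == "zero":
        return tree == EMPTY
    if tag == "loc":   # n[A]: exactly one edge, labelled n, whose subtree satisfies A
        return len(tree) == 1 and tree[0][0] == f[1] and holds(tree[0][1], f[2])
    if tag == "par":   # A | B: some split satisfies A on one side, B on the other
        return any(holds(l, f[1]) and holds(r, f[2]) for l, r in splits(tree))
    raise ValueError(f"unknown formula tag: {tag}")


TRUE, ZERO = ("true",), ("zero",)
F = edge("book", par(edge("title", edge("Databases")),
                     edge("author", edge("Ghelli")),
                     edge("author", edge("Albano"))))
ghelli = ("loc", "author", ("loc", "Ghelli", ZERO))
A = ("loc", "book", ghelli)
B = ("loc", "book", ("par", ghelli, TRUE))
C = ("loc", "book", ("not", ("par", ("loc", "editor", TRUE), TRUE)))
D = ("loc", "book", ("and", ("loc", "title", TRUE), ("loc", "author", TRUE)))

for name, formula in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, holds(F, formula))
```

Under these assumptions F satisfies B and C but not A (the book has more than one child) and not D (no single edge is labelled both title and author). The split enumeration is exponential in the number of sibling edges, which is fine for a sketch but not for real data.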
6
First order and modal recursion
  • The full TQL logic extends the ground fragment
    with:
  • X: tree variables
  • x[A]: locations with label variables
  • Exists x. A: quantification over labels (and
    trees)
  • µξ. A: fixpoint (ξ positive in A)

7
Decision Problems
  • Given a formula A and a model F:
  • Model checking: F ⊨ A ?
  • Query answering: find the values of x such that
    F ⊨ A(x)
  • Satisfiability, sat(A): does there exist an F such
    that F ⊨ A ?
  • Validity, vld(A): is it true that, for each model F,
    F ⊨ A ?
  • With negation in the logic: sat(A) ⇔ Not vld(Not A)
  • Implication: (∀F. F ⊨ A implies F ⊨ B) ⇔ vld(Not A Or
    B)
  • With the simple ground STL all these problems are
    decidable, but this no longer holds for
    satisfiability/validity once we introduce variables
    and quantification (or fixpoints)
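
The two reductions on this slide follow directly from the definitions. Written out, with ⊨ read as satisfaction and using only the classical meaning of negation and disjunction in the logic:

```latex
\begin{align*}
\mathrm{sat}(A) &\iff \exists F.\; F \models A \\
\mathrm{vld}(A) &\iff \forall F.\; F \models A \\[4pt]
\mathrm{sat}(A) &\iff \neg\,\forall F.\; F \not\models A
                 \iff \neg\,\forall F.\; F \models \neg A
                 \iff \neg\,\mathrm{vld}(\neg A) \\[4pt]
\bigl(\forall F.\; F \models A \Rightarrow F \models B\bigr)
                &\iff \forall F.\; F \models \neg A \lor B
                 \iff \mathrm{vld}(\neg A \lor B)
\end{align*}
```

So a decision procedure for validity also decides satisfiability and implication, which is why the rest of the talk concentrates on validity.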

8
An SSD data model: labeled trees
articles[
  article[ author[Cardelli] | author[Gordon]
         | title[Anywhere] | date[Apr, 2000] ]
| article[ author[Ghelli] | title[TQL] | conf[ETAPS]
         | date[ month[Feb] | year[2001] ] ]
]
[Figure: the same data drawn as a tree with labeled edges, rooted at articles]
9
SSD Schema and Types
  • Schemas and types constrain the structure of
    SSD:
  • DTDs
  • XML Schema
  • Regular Expression Types
  • A schema:
  • Article = article[ title[String], author[String]*,
    date[True]? ]
  • A recursive type:
  • Section = section[ init[String], Section,
    conc[String] ]

10
Types in STL
  • Regular type expressions and DTDs can be expressed
    (up to document order) in STL extended with modal
    recursion
  • A schema:
  • article[ title[String], author[String]*, date[True]? ]
  • In STL:
  • article[ title[True]
           | (µξ. 0 Or (author[True] | ξ))
           | (date[True] Or 0) ]
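
Read up to document order, the STL formula says: exactly one title, any number of authors, at most one date, and nothing else. A minimal sketch of that check follows; it assumes the author component is starred (zero or more authors, as the fixpoint suggests), and the function name `is_article_content` is hypothetical. Like the STL version, it ignores what is below each edge.

```python
from collections import Counter


def is_article_content(edges):
    """Check the children of an article edge against
    title[True] | author[True]* | date[True]?  (order is irrelevant)."""
    counts = Counter(label for label, _subtree in edges)
    only_known_labels = set(counts) <= {"title", "author", "date"}
    return only_known_labels and counts["title"] == 1 and counts["date"] <= 1


ok = [("author", []), ("title", []), ("author", [])]
no_title = [("author", []), ("date", [])]
extra = [("title", []), ("conf", [])]
print(is_article_content(ok), is_article_content(no_title), is_article_content(extra))
```

Counting labels is enough here because composition is commutative: the formula cannot distinguish between different orderings of the same children.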

11
SSD Constraints
  • Integrity constraints on the values of SSD:
  • Inclusion constraints
  • Inverse relationship constraints
  • Key constraints
  • Path expressions to navigate SSD:
  • articles.article.title(x)
  • root.section.init(x)
  • Integrity constraints as inclusion of paths:
  • student.takes => course.cno
  • student.takes ⇌ course.taken_by
  • Key constraints (first-order logic with paths):
  • ∀x,y. article.title(x) And article.title(y)
    And ¬(x = y) => ¬(x ≈ y)

12
Constraints in STL
  • Integrity constraints over SSD are easily
    expressed using STL with variables and
    quantification.
  • Examples using the path abbreviation (.a[A] = a[A] |
    True):
  • An inclusion constraint:
  • ∀X. .student.taking[X] => .course.cno[X]
  • A key constraint for SSD:
  • ∀X. Not (.article.title[X] |
    .article.title[X])
  • Combining quantification with recursion we can
    express complex types and constraints (e.g.
    binary trees)
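
The two constraints can be checked directly on a tree. The sketch below is one possible reading, not the TQL evaluator: trees are assumed to be lists of (label, subtree) edges, the helper names are hypothetical, and the key constraint is read as "no content X appears under title in two different article edges", which is what the composition of two .article.title[X] patterns requires.

```python
from collections import Counter


def canon(tree):
    """Order-insensitive, hashable form of a list of (label, subtree) edges."""
    return tuple(sorted((label, canon(sub)) for label, sub in tree))


def children(tree, label):
    """Subtrees under every top-level edge carrying `label`."""
    return [sub for lab, sub in tree if lab == label]


def key_holds(tree, outer, inner):
    """For all X: Not(.outer.inner[X] | .outer.inner[X])."""
    seen = Counter()
    for sub in children(tree, outer):
        # A set, so a value repeated inside ONE outer edge is counted once.
        for x in {canon(c) for c in children(sub, inner)}:
            seen[x] += 1
    return all(count == 1 for count in seen.values())


def inclusion_holds(tree, left, right):
    """For all X: .l1.l2[X] => .r1.r2[X], for two-step paths."""
    def reach(path):
        return {canon(c) for sub in children(tree, path[0])
                for c in children(sub, path[1])}
    return reach(left) <= reach(right)


library = [("article", [("title", [("TQL", [])])]),
           ("article", [("title", [("Anywhere", [])])])]
clash = library + [("article", [("title", [("TQL", [])])])]
school = [("student", [("taking", [("cs101", [])])]),
          ("course", [("cno", [("cs101", [])])]),
          ("course", [("cno", [("cs102", [])])])]

print(key_holds(library, "article", "title"), key_holds(clash, "article", "title"))
print(inclusion_holds(school, ("student", "taking"), ("course", "cno")))
```

Comparing canonical forms of whole subtrees mirrors the role of the tree variable X: two occurrences match only when the entire content below the edge is the same.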

13
SSD Queries
  • Many query languages (XQuery, Lorel, YATL, ...);
    essentially, queries are expressions selecting
    data reachable through paths and constructing new
    results
  • TQL: a peculiar query language based on spatial
    tree logic; selection is done by pattern
    matching over STL formulas
  • The TQL logic expresses all regular path expressions
  • Query answering is implemented for the full TQL
    logic

14
SSD Decision Problems with STL
  • Given a data source F, a formula A
    representing a schema, and formulas B, B' representing
    sets of integrity constraints:
  • Validation:
  • F ⊨ A, F ⊨ B, F ⊨ A And B
  • Schema/constraint consistency:
  • sat(A), sat(B), sat(A And B)
  • Constraint implication (inference):
  • vld(B => B')
  • Constraint implication in presence of a schema:
  • vld(A And B => B')

15
A decidable TQL sublogic
  • STL are good at expressing types, constraints and
    queries over SSD, but:
  • Validity in the full TQL logic is undecidable
  • The ground logic is decidable, but it is not
    enough to express all interesting types and
    constraints
  • We are looking for a decidable fragment of TQL
    expressive enough to reason about SSD
  • A first step in this direction is the following
    logic

16
A decidable TQL sublogic
  • A, B ::= True | A And B | Not A
           | 0 | _[A] | n[A] | A | B
  • We can define useful operators to describe types
    and constraints in this decidable logic
  • String =def _[0]        Tree =def _[True]
  • A Or B =def Not (Not A And Not B)
    A => B =def Not A Or B
  • A^exists =def A | True
    A^foreach =def Not((Not A) | True)
  • A^foreachTree =def (Tree => A)^foreach
  • Note: if A => Tree, we can use A^foreachTree to
    express A*
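
Unfolding the two derived operators shows what they quantify over. The following reading assumes that the connective lost between A and True in the transcript is composition (|), so that A^exists is A | True and A^foreach is Not((Not A) | True):

```latex
\begin{align*}
F \models A^{\mathit{exists}}
  &\iff \exists F', F''.\; F = F' \mid F'' \ \text{and}\ F' \models A \\
F \models A^{\mathit{foreach}}
  &\iff \neg\,\exists F', F''.\; F = F' \mid F'' \ \text{and}\ F' \not\models A \\
  &\iff \forall F', F''.\; F = F' \mid F'' \ \text{implies}\ F' \models A
\end{align*}
```

Under that reading, A^exists asks for some part of the tree satisfying A, while A^foreach requires every part to satisfy A. Restricting the parts with a guard, as in (Tree => A)^foreach, is what lets the operator range over selected components only.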

17
Conclusions and Future Directions
  • STL provide a powerful unified framework for
    types, constraints, and queries over SSD and XML
  • This framework is worth studying; it may lead
    to:
  • A good formalization of SSD reasoning in terms
    of model checking and validity
  • Generalization of results on reasoning about
    types and constraints
  • Query optimization strategies guided by
    types/constraints
  • (Some) future steps:
  • Extend the decidable logic to express integrity
    constraints
  • Modeling ordered trees

18
Spatial tree logics to reason about Constraints
and Types
Università di Pisa, Ph.D. Proposal
  • Speaker: Giovanni Conforti
  • Supervisor: Giorgio Ghelli

19
SSD Query Optimization
  • The TQL pattern clause uses STL formulas
  • We can use validated constraints C and types T as
    information to optimize queries (e.g. static
    declaration of an empty result)
  • A query "from Q ⊨ A select Q'" can be rewritten
    as "from Q ⊨ B select Q'" for each B such that
  • (C And T) => (A <=> B)

20
Research Plan: planning
  • The challenge is ambitious; it should be understood
    as a long-term direction of our work
  • We address some initial tasks that we expect to
    accomplish:
  • Comparison of STL with other formalisms for types
    and constraints
  • Find a satisfactory decidable logic fragment to
    express types (and constraints)
  • Write a preliminary formal system for constraint
    (and type) implication
  • We plan two stages:
  • (2nd year) in-depth study of the basic theories (tree
    automata, modal logics, description logics) and
    investigation of the initial tasks
  • (3rd year) completion of the initial tasks and
    integration of the results into a unified formal
    framework

21
Research Plan: directions
  • Main directions; investigate:
  • Expressivity of Spatial Tree Logics (in
    particular for standard Types and Constraints
    specifications)
  • Decidability and complexity of model checking and
    validity for fragments (or extensions) of TQL
    logic
  • Reformulation (or generalization) of known
    results about reasoning and optimization over SSD
  • Other interesting directions
  • Implementation of a query rewriter guided by
    constraints and types
  • Extensions to the logic to model order, data
    updates, private names

22
Background: Semi-structured data (SSD)
  • Semi-structured data (SSD) are used to:
  • model and query web data (HTML, XML, ...)
  • store experimental data
  • integrate heterogeneous databases
  • SSD are:
  • Self-describing (structure is implicit)
  • Irregular
  • Always evolving