Title: Design and Implementation of CALO Query Manager
1Design and Implementation of CALO Query Manager
- Jose Luis Ambite
- Vinay K. Chaudhri
- Richard Fikes
- Jessica Jenkins
- Sunil Mishra
- Maria Muslea
- Tomas Uribe
- Guizhen Yang
2Outline
- Problem
- Technology
- Application
3Problem
- Cognitive Assistant that Learns and Organizes (CALO)
- Learn from experience
- Be told what to do
- Explain what it is doing
- Reflect on experience
- Respond robustly to surprises
- Situated in an office environment
4CALO Functions
- Schedule and Organize in Time
- Monitor and Manage Tasks
- Organize and Manage Information
- Prepare Information Products
- Observe and Mediate Interactions
- Acquire and Allocate Resources
5Knowledge Sources
- Data Sources
- Email
- Calendar
- Contacts
- Projects
- WWW
- Sensor output
- Reasoners
- Taxonomic Reasoning
- Constraint reasoning for scheduling meetings
- Truth maintenance over learned knowledge
- Max-Sat based reasoning
6Example Questions
- Which meetings will have a conflict if the current meeting runs overtime by one hour?
- Access calendar and compute time addition
- Who was present in the meeting in conference room EJ228 at 10:00 AM this morning?
- Conclude presence from the data
- Access sensor data from the meetings
- List all the people who are mentioned in an article in which Joe is also mentioned.
- Access articles to retrieve co-authors, and then retrieve the articles of the co-authors
- The query involves a join over the instances of the class articles
7Outline
- Problem
- Technology
- Application
8Technology
- Hybrid Reasoning
- Use more than one reasoning method to answer a question
- Emphasis on first order reasoning
- Information Integration
- Use more than one information source in answering a query
- Emphasis on data retrieval
9Hybrid Reasoning
- Combine more than one reasoning method in answering a question
- Procedural attachment
- Myers, AIJ 94
- Theory resolution
- Stickel, JAR, 85
- Constraint resolution
- Burckert, LNAI, 91
10Information Integration
- Use more than one information source in answering a query
- Mediation
- Wiederhold, 1992
- Data Warehousing
- Widom, 1995
11Outline
- Problem
- Technology
- Application
12Application
- Innovative Claims
- How to best combine the data and reasoning sources?
- No longer limited to conjunctive queries (first-order)
- Some sources have limited query capability
- How should query planning be embedded in the theorem proving process?
- Novel Query Optimization Algorithm
- Grouping and access constraints
- Optimization over non-recursive datalog programs
13Solution Approach
- Use a combination of approaches
- Provide a mediation layer to connect all the sources
- Use multiple warehouses at the lowest level to aggregate data from data sources
- Provide a theory resolution framework to tie together the data and reasoning resources
- Control the overall process using a modular architecture grounded in model elimination reasoning
- Provide customized procedural attachments for specific reasoners
14Providing Mediation
- Enforce a schema that is shared across all the data and reasoning sources
- Provided by the CALO ontology (Chaudhri et al. 2006, SRI Technical Report)
- Provide a query planning algorithm to decompose queries and combine results
- Ordering over non-recursive datalog rules (Yang et al., PODS 2006)
- Use off-the-shelf standards
- OWL, SPARQL, KIF
15Integrating Data
- Harvest data from personal information sources into a data warehouse
- All emails, calendars, and contacts are harvested and stored in this warehouse
- Motivated by the fact that the data needs to be processed in batch by multiple learning algorithms
- Store the data collected during a meeting in a warehouse
- Information extracted from video, audio, and sketch
- Motivated by the fact that the data store will not be on a user's workstation
- We do not harvest the online information sources
- Motivated by a desire to get the most up-to-date information
16Integrating Reasoning
- We integrate the constraint reasoner, the KR system, and the data warehouses using a theory resolution approach
- Fikes et al. 2003
- Some modules invoke dedicated reasoners through procedural attachment
- Execution engine invokes a temporal reasoner
17Controlling Query Evaluation
- The overall query evaluation process in the system relies on model elimination
- Access to all the data and knowledge is routed through a dispatcher
- The dispatcher is responsible for invoking the query planner and acting on the plan
19Example
- Which meetings will have a conflict if the current meeting runs overtime by an hour?
20Example Step 1 Formulate Query
- Which meetings will have a conflict if the current meeting runs overtime by an hour?
(and (CurrentCaloUser ?user)
     (is-calendar-attendee ?user ?meeting)
     (Event-Entry ?meeting)
     (calendar-summary ?meeting "CALO Test")
     (has-end-date ?meeting ?end-date)
     (time-add ?end-date PT60M ?new-end-date)
     (Event-Entry ?affected-meeting)
     (time-inside-event ?new-end-date ?affected-meeting)
     (is-calendar-attendee ?user ?affected-meeting))
21Example Step 2 Dispatch Query
- Which meetings will have a conflict if the current meeting runs overtime by an hour?
(and (CurrentCaloUser ?user)
     (is-calendar-attendee ?user ?meeting)
     (Event-Entry ?meeting)
     (calendar-summary ?meeting "CALO Test")
     (has-end-date ?meeting ?end-date)
     (time-add ?end-date PT60M ?new-end-date)
     (Event-Entry ?affected-meeting)
     (time-inside-event ?new-end-date ?affected-meeting)
     (is-calendar-attendee ?user ?affected-meeting))
22Step 3 Expand query
- Theory resolution on
- Computing upfront inferences
- New end time for the meeting
- Generating a query plan
23Step 3a Perform upfront inferences
- Compute the definition of predicate time-inside-event in terms of primitive relations that the individual sources can compute
(=> (and (has-start-date ?e ?e-start)
         (has-end-date ?e ?e-end)
         (time-lt ?e-start ?time)
         (time-gt ?e-end ?time))
    (time-inside-event ?time ?e))
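The unfolding of time-inside-event into primitive relations can be sketched in Python. This is only an illustration under stated assumptions: the tuple encoding of literals, the DEFINITIONS table, and the unfold function are hypothetical names, not part of the CALO implementation, and the sketch does not rename body-only variables such as ?e-start, which a real reasoner would need to do to avoid name clashes.

```python
# Literals are tuples (predicate, arg1, ...); variables are strings starting with "?".
# DEFINITIONS maps a defined predicate to (parameters, body literals).
# Hypothetical encoding of the rule shown on this slide.
DEFINITIONS = {
    "time-inside-event": (
        ("?time", "?e"),
        [
            ("has-start-date", "?e", "?e-start"),
            ("has-end-date", "?e", "?e-end"),
            ("time-lt", "?e-start", "?time"),
            ("time-gt", "?e-end", "?time"),
        ],
    ),
}


def unfold(query, definitions=DEFINITIONS):
    """Replace each defined literal by its body, substituting the actual arguments."""
    expanded = []
    for literal in query:
        pred, *args = literal
        if pred not in definitions:
            expanded.append(literal)  # primitive literal: keep unchanged
            continue
        params, body = definitions[pred]
        subst = dict(zip(params, args))
        for body_pred, *body_args in body:
            # Variables that are not parameters (e.g. ?e-start) are kept as-is.
            expanded.append((body_pred, *(subst.get(a, a) for a in body_args)))
    return expanded


query = [
    ("time-add", "?end-date", "PT60M", "?new-end-date"),
    ("time-inside-event", "?new-end-date", "?affected-meeting"),
]
expanded_query = unfold(query)
```

Applied to the example query, the time-inside-event literal is replaced by four primitive literals over ?affected-meeting and ?new-end-date.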
24Step 3a Perform upfront inferences
(and (CurrentCaloUser ?user)
     (is-calendar-attendee ?user ?meeting)
     (Event-Entry ?meeting)
     (calendar-summary ?meeting "CALO Test")
     (has-end-date ?meeting ?end-date)
     (time-add ?end-date PT60M ?new-end-date)
     (Event-Entry ?affected-meeting)
     (has-start-date ?affected-meeting ?e-start)
     (has-end-date ?affected-meeting ?e-end)
     (time-lt ?e-start ?new-end-date)
     (time-gt ?e-end ?new-end-date)
     (is-calendar-attendee ?user ?affected-meeting))
26Step 3b Generate a Query Plan
- Interface with the query planner
27Step 3b Generate a Query Plan
- G1@IRIS: What are the user's scheduled meetings?
(and (CurrentCaloUser ?user)
     (is-calendar-attendee ?user ?meeting))
- G2@IRIS: When does the CALO Test meeting end?
(and (Event-Entry ?meeting)
     (calendar-summary ?meeting "CALO Test")
     (has-end-date ?meeting ?end-date))
- G3@Time Reasoner: What is the new end time?
(time-add ?end-date PT60M ?new-end-date)
28Step 3b Generate a Query Plan
- G4@IRIS: Which other meetings overlap the new end time?
(and (Event-Entry ?affected-meeting)
     (has-start-date ?affected-meeting ?e-start)
     (has-end-date ?affected-meeting ?e-end)
     (time-lt ?e-start ?new-end-date)
     (time-gt ?e-end ?new-end-date))
- G5@IRIS: Is this meeting on the user's calendar?
(is-calendar-attendee ?user ?affected-meeting)
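One simple way to picture how literals end up grouped by source is sketched below in Python. It is a deliberately naive heuristic, not the ordering algorithm of the cited PODS 2006 paper: the SOURCE_OF capability table and the plan function are assumptions made for illustration, and the sketch merges every run of consecutive literals that go to the same source, so it yields three groups where the slides show five (G1 to G5).

```python
from itertools import groupby

# Hypothetical capability table: which source can evaluate which predicate.
SOURCE_OF = {
    "CurrentCaloUser": "IRIS",
    "is-calendar-attendee": "IRIS",
    "Event-Entry": "IRIS",
    "calendar-summary": "IRIS",
    "has-start-date": "IRIS",
    "has-end-date": "IRIS",
    "time-lt": "IRIS",
    "time-gt": "IRIS",
    "time-add": "Time Reasoner",
}

# The expanded query from Step 3a, one tuple per literal.
EXPANDED_QUERY = [
    ("CurrentCaloUser", "?user"),
    ("is-calendar-attendee", "?user", "?meeting"),
    ("Event-Entry", "?meeting"),
    ("calendar-summary", "?meeting", "CALO Test"),
    ("has-end-date", "?meeting", "?end-date"),
    ("time-add", "?end-date", "PT60M", "?new-end-date"),
    ("Event-Entry", "?affected-meeting"),
    ("has-start-date", "?affected-meeting", "?e-start"),
    ("has-end-date", "?affected-meeting", "?e-end"),
    ("time-lt", "?e-start", "?new-end-date"),
    ("time-gt", "?e-end", "?new-end-date"),
    ("is-calendar-attendee", "?user", "?affected-meeting"),
]


def plan(literals):
    """Group consecutive literals that are answered by the same source."""
    return [
        (source, list(run))
        for source, run in groupby(literals, key=lambda lit: SOURCE_OF[lit[0]])
    ]


query_plan = plan(EXPANDED_QUERY)
```

Keeping the time-add call between the two IRIS groups preserves the data dependency: ?new-end-date must be bound before the overlap test can run.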
29Step 3b Generate a Query Plan
- Once the query plan is complete the dispatcher is
ready to execute it
31Step 4 Evaluate the query
- As the literals are proven, the answers are
returned to the top level reasoning loop
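A rough Python sketch of this control flow is given below. It assumes a plan is an ordered list of (source name, literals) groups and that each source is a callable taking the literals and one set of bindings and returning zero or more extended bindings. The execute function and the two toy sources are hypothetical and return invented values purely to make the sketch runnable.

```python
def execute(plan, sources, bindings=None):
    """Run each plan group in order, threading variable bindings through."""
    current = [bindings or {}]
    for source_name, literals in plan:
        evaluate = sources[source_name]
        extended = []
        for b in current:
            # A source may return no answer (failure) or several answers.
            extended.extend(evaluate(literals, b))
        current = extended
    return current


# Toy stand-ins for real sources (hypothetical; times are plain integers).
def toy_calendar(literals, b):
    return [{**b, "?end-date": 10}]


def toy_time_reasoner(literals, b):
    return [{**b, "?new-end-date": b["?end-date"] + 1}]


toy_plan = [
    ("IRIS", [("has-end-date", "?meeting", "?end-date")]),
    ("Time Reasoner", [("time-add", "?end-date", "PT60M", "?new-end-date")]),
]
answers = execute(toy_plan, {"IRIS": toy_calendar, "Time Reasoner": toy_time_reasoner})
```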
34Step 4 Evaluate the query
- Un-evaluated literals follow a similar control
flow
38Step 4 Evaluate the query
- As the literals are satisfied, the bindings are
returned to the top level reasoning loop
39Step 5 Combine Results
- The bindings propagate and they get combined to
produce the answer
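Combining bindings from separately evaluated sub-queries amounts to a join on their shared variables. The Python sketch below shows a minimal nested-loop version; the join function and the sample bindings are hypothetical illustrations, not data or code from the system.

```python
def join(left, right):
    """Natural join of two lists of binding dicts on their shared variables."""
    result = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[v] == r[v] for v in shared):
                result.append({**l, **r})
    return result


# Invented sample bindings for illustration only.
overlapping = [
    {"?affected-meeting": "evaluation-meeting"},
    {"?affected-meeting": "staff-meeting"},
]
on_calendar = [
    {"?user": "user-1", "?affected-meeting": "evaluation-meeting"},
]
combined = join(overlapping, on_calendar)
```

Only the binding that satisfies both sub-queries survives, which is the behavior needed for the final conjunction of the example query.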
40Step 6 Produce the result
- Produce the answer
- There will be a conflict with the evaluation meeting
41Implementation
- We have implemented the approach described here
- We have tested the approach by implementing over 100 classes of test queries
- The paper includes detailed performance data
42Is this complexity critical?
- Driven by system goals
- Flexibility
- Reasoners can return partial results
- Modularity
- Can combine different reasoners
- Reflection
- Ability to learn from the query evaluation
process
43Open Research Questions
- User-interface
- How to expose this service in the context of an end-user interface?
- Adapting sources
- Is there a generic way to adapt the existing reasoning sources to return partial proofs to support the theory resolution framework?
- Dynamic query planning
- If the reasoners return new subgoals, a new plan must be computed
44Summary and Conclusions
- Engineering a centralized query service for a large and complex office assistant required us to use multiple reasoning and information integration methods
- Used a shared ontology as a mediated schema
- Aggregated data into warehouses
- Used theory resolution and procedural attachment to integrate reasoning
- Interesting and novel application of existing technology but also poses important new research questions
45Backup
46Procedural Attachment
- Consider a FOL reasoner, and a formula
- a ∧ b
- Suppose, evaluating a requires us to
- Go to an external information source
- Perform a specialized computation
- We can attach an external function to a
- Whenever the reasoner evaluates a, the external function is invoked
- If a is a function, its value is computed
- If a is a relation, its truth value is computed, possibly by binding the variables
- We say that a has a procedural attachment
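As an illustration of procedural attachment, the Python sketch below attaches an external function to the time-add relation from the example query. The ATTACHMENTS registry, the attach decorator, and evaluate are hypothetical names, and the duration parser handles only a small subset of ISO 8601 (hours, minutes, seconds); it assumes PT60M denotes a 60-minute duration and that the last argument of the relation is the output variable.

```python
import re
from datetime import datetime, timedelta

ATTACHMENTS = {}  # predicate name -> external Python function


def attach(predicate):
    """Register a function as the procedural attachment of a predicate."""
    def register(fn):
        ATTACHMENTS[predicate] = fn
        return fn
    return register


def parse_duration(text):
    """Parse a small subset of ISO 8601 durations: PT[nH][nM][nS]."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


@attach("time-add")
def time_add(start, duration):
    return start + parse_duration(duration)


def evaluate(literal, bindings):
    """Evaluate an attached literal; bind its last argument to the result."""
    pred, *args = literal
    *inputs, output_var = args
    values = [bindings.get(a, a) for a in inputs]  # look up bound variables
    return {**bindings, output_var: ATTACHMENTS[pred](*values)}


bound = evaluate(
    ("time-add", "?end-date", "PT60M", "?new-end-date"),
    {"?end-date": datetime(2006, 7, 1, 10, 0)},
)
```

From the reasoner's point of view the literal is simply proven with a binding for ?new-end-date; the computation itself happens outside the logic.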
47Theory Resolution
- Traditional resolution
- Given
- a ∨ b
- ¬a
- We can derive b
- In theory resolution
- Given
- (a < b) ∨ P, (b < c) ∨ Q, (c < a) ∨ R
- ¬(x < x), ((x < y) ∧ (y < z)) → (x < z)
- We can derive
- P ∨ Q ∨ R
- We can perform external computations on a group of literals
- Procedural attachment becomes a special case of theory resolution
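The ordering example can be mimicked in a few lines of Python. In this sketch the theory of strict orderings is built in as a cycle check: a set of literals x < y is unsatisfiable exactly when the literals form a cycle, by transitivity and irreflexivity. The functions has_cycle and theory_resolve and the pair encoding of clauses are assumptions made for illustration. For simplicity the sketch resolves on all given ordering literals at once rather than searching for a minimal unsatisfiable subset.

```python
def has_cycle(edges):
    """True if the directed graph given by (x, y) pairs contains a cycle."""
    graph = {}
    for x, y in edges:
        graph.setdefault(x, set()).add(y)

    def reachable(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reachable(nxt, target, seen):
                    return True
        return False

    return any(reachable(x, x, set()) for x in graph)


def theory_resolve(clauses):
    """Each clause is ((x, y), residue), read as (x < y) OR residue.

    If the ordering literals are jointly unsatisfiable in the theory of
    strict orderings, return the residues (read as their disjunction);
    otherwise return None.
    """
    literals = [literal for literal, _ in clauses]
    if has_cycle(literals):
        return [residue for _, residue in clauses]
    return None


resolvent = theory_resolve([
    (("a", "b"), "P"),
    (("b", "c"), "Q"),
    (("c", "a"), "R"),
])
```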
48Constraint Resolution
- Also a special case of Theory Resolution
- Interpreted predicates are kept separate as constraints, handled by specialized solvers
- Example of Constraint Resolution
- p(X) => q(X) / X < 5
- q(X) => r(X) / X > 1
- --------------------
- p(X) => r(X) / X < 5 and X > 1 (satisfiable)
- Goal r(3): Subgoal p(3)
- Goal r(10): Fail.
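The two goals in this example can be traced with a small Python sketch. The RULES table and reduce_goal are hypothetical, the constraints are ordinary Python predicates standing in for a specialized solver, and the sketch assumes at most one rule per head predicate and ground (numeric) goals.

```python
# Each rule is (head, body, constraint): body(X) => head(X), provided constraint(X).
RULES = [
    ("q", "p", lambda x: x < 5),
    ("r", "q", lambda x: x > 1),
]


def reduce_goal(pred, x, rules=RULES):
    """Backward-chain from pred(x).

    Returns the remaining subgoal (pred, x) once no rule applies,
    or None if a constraint along the way is violated.
    """
    for head, body, constraint in rules:
        if head == pred:
            if not constraint(x):
                return None  # constraint solver rejects this branch
            return reduce_goal(body, x, rules)
    return (pred, x)


subgoal_for_3 = reduce_goal("r", 3)
subgoal_for_10 = reduce_goal("r", 10)
```

The goal r(3) reduces to the subgoal p(3) because 3 satisfies both constraints, while r(10) fails because 10 violates X < 5.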
49Mediation
- The information continues to reside in individual sources
- At the time of querying, we decompose the query into sub-queries, each of which is evaluated by an individual source, and we combine the results
- Query evaluation requires computing mappings between the information residing in individual sources
- Answers are always current
50Data Warehousing
- We harvest the information into one location, and answer queries with respect to a central data repository
- We still need to compute mappings across different sources if they do not use the same schema
- The answers may not always be current, and the warehouse needs to be maintained as the base data changes
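The difference in freshness between mediation and warehousing can be made concrete with a small Python sketch. The Mediator and Warehouse classes are hypothetical, sources are modeled as zero-argument callables returning their current rows, and schema mapping is omitted entirely.

```python
class Mediator:
    """Queries the live sources at query time, so answers are always current."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, predicate):
        return [row for source in self.sources for row in source() if predicate(row)]


class Warehouse:
    """Answers from a harvested copy, which must be refreshed as base data changes."""

    def __init__(self, sources):
        self.sources = sources
        self.rows = []
        self.refresh()

    def refresh(self):
        self.rows = [row for source in self.sources for row in source()]

    def query(self, predicate):
        return [row for row in self.rows if predicate(row)]


# Invented sample data for illustration only.
calendar = [{"title": "CALO Test"}]
sources = [lambda: list(calendar)]
mediator = Mediator(sources)
warehouse = Warehouse(sources)

calendar.append({"title": "Evaluation"})  # base data changes after harvesting
live_count = len(mediator.query(lambda row: True))
stale_count = len(warehouse.query(lambda row: True))
warehouse.refresh()
refreshed_count = len(warehouse.query(lambda row: True))
```

The mediator sees the new entry immediately, while the warehouse only sees it after the next refresh.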
51Query Manager
- Provides a single point of access from which the knowledge distributed in the system can be queried
- Engineering the query manager required us to use multiple reasoning and information integration methods
- Used a shared ontology as a mediated schema
- Aggregated data into warehouses
- Used theory resolution and procedural attachment to integrate reasoning