Title: Inference Web in Action: Lightweight Use of the Proof Markup Language
1 Inference Web in Action: Lightweight Use of the Proof Markup Language
- Paulo Pinheiro da Silva (1), Deborah McGuinness (2), Nicholas Del Rio (1), Li Ding (2)
- (1) University of Texas at El Paso
- (2) Rensselaer Polytechnic Institute
- inference-web.org
2 Motivation: Understanding and Trust through Transparency
- If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them.
- System transparency supports understanding and trust.
- Even simple lookup systems benefit from providing information about their sources.
- Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations.
Goal: Provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for understanding and trust.
3 Inference Web
- Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, comparing, and rendering justifications from question answerers
- IW's Proof Markup Language (PML) is an interlingua for justification interchange. Represented in OWL from day 1
- IWBase is a distributed repository of meta-information
- IW Registration and PSW services provide support for PML generation
- IW Validator service provides support for PML validation and checking
- IW Browser provides display capabilities for PML documents
- ProbeIt! provides complex visualization capabilities for PML documents encoding complex scientific datasets
- IW Abstractor provides rewriting capabilities enabling more understandable presentations
- IW Explainer provides multi-modal dialogue options including alternative strategies for presenting explanations and summaries
- IW Search (enhanced SWOOGLE for PML documents)
4 Inference Web in Action
- Information extraction: IBM (UIMA), Stanford (TAP)
- Information integration: USC ISI (Prometheus/Mediator), Rutgers University (Prolog/Datalog)
- Task processing: SRI International (SPARK/CALO)
- Theorem proving
  - Portable proofs across reasoners: JTP with temporal and context reasoners (Stanford), CWM (W3C), SNARK (SRI), KM (University of Texas, Austin), JEOPS (Univ. of Fortaleza)
  - SATisfiability solvers: University of Trento (J-SAT)
  - More than 30 theorem provers through TPTP (University of Miami, FL)
- Service composition: Stanford, University of Toronto, UCSF (SNRC)
- Semantic matching: University of Trento (S-Match)
- Scientific provenance
  - University of Texas at El Paso (GEON, CEON, EarthScope)
  - Rensselaer Polytechnic Institute, National Center for Atmospheric Research (VSTO, SPCDIS, SESDI)
- Intelligence analysts tools (NIMD/KANI)
- Border Security (UTEP/DHS-Scientific Leadership)
- Learning systems
  - Procedure learning (TAILOR, LAPDOG / CALO)
  - Integrated learning systems (GILA)
- Privacy policy law validation (TAMI)
- Trust in social collaborative networks (Wikipedia TrustTab)
A single explanation/provenance approach that has been used in multiple diversified areas
5 Centralized Provenance vs. Distributed Provenance
- Logging provenance is a big challenge
- Centralized Provenance
  - Requires central authority to enforce the encoding of provenance information
  - Database solution
  - Workflow-centered solution
- Distributed Provenance
  - Data/metadata bundle
  - Inference Web Approach (including PML)
6 Research Problem
Gravity maps show us a low-resolution image of the internal structure of the Earth.
Anomalies in gravity maps may indicate the presence of mineral or oil reserves.
I am looking for oil reserves to explore.
I don't know if this gravity anomaly is an important result or just a mistake!
7 The Need for (Distributed) Provenance
I do trust the sources used to derive the map.
Let me inspect the provenance of this map.
I think it is reasonable to use 2D-Nearest Neighbor in this case (e.g., better than minimum curvature).
The parameters appear to be correct.
I thus believe that the map is correct.
[Diagram labels: Sources; Inference engine: Gridding service; Gridding parameters; Inference rule: 2D-Nearest Neighbor]
8 Proof Markup Language (PML)
- PML provides a way of encoding distributed provenance
- It can be used to represent justifications of information manipulation steps done by theorem provers, extractors, web services, scripts, applications, etc.
- The main components concern inference representation (e.g., logical rules, algorithms, standard procedures) and provenance issues such as author, source, etc.
PML document (or a provenance unit):
- A conclusion
- Pointers to other documents, including other PML encodings
9 Lightweight Use of PML
What is a node set?
<iw:NodeSet>
  Has conclusion (S ∧ V)
  <iw:isConsequentOf>
    <iw:InferenceStep>
      source A asserted (S ∧ V)
    </iw:InferenceStep>
    <iw:InferenceStep>
      AND introduction was used on S asserted by source B and on V asserted by source C
    </iw:InferenceStep>
    <iw:InferenceStep>
      source D asserted (S ∧ V)
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>
<iw:NodeSet>
  Has conclusion (S ∧ V)
  <iw:isConsequentOf>
    <iw:InferenceStep>
      source A asserted (S ∧ V)
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>
What is an inference step?
One can view a node set with a single inference
step as a single node in a justification
How are inference steps related to node sets?
10 Lightweight Use of PML
- Simplification strategy 1
  - no use of alternate justifications
- The encoding of a justification can be represented as a DAG of connected nodes
- The notion of a justification as a collection of nodes is more natural than the notion of a justification as a collection of node sets and inference steps
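With no alternate justifications, each node set carries exactly one inference step, so a justification can be handled as a plain DAG of nodes. The following is a minimal Python sketch of that view, not PML's own data model: the names `Node`, `antecedents`, `is_dag`, and `collect_sources`, and the `ex:` identifiers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    """One justification node: a conclusion plus its single inference step."""
    uri: str
    conclusion: str
    rule: Optional[str] = None      # may be left out
    source: Optional[str] = None    # set when a source asserted the conclusion
    antecedents: List[str] = field(default_factory=list)  # URIs of other nodes


def is_dag(nodes: Dict[str, Node]) -> bool:
    """Return True if antecedent links never lead back to a node on the current path."""
    unvisited, in_progress, done = 0, 1, 2
    state = {uri: unvisited for uri in nodes}

    def visit(uri: str) -> bool:
        if state[uri] == in_progress:   # back edge: a cycle
            return False
        if state[uri] == done:
            return True
        state[uri] = in_progress
        for antecedent in nodes[uri].antecedents:
            if antecedent in nodes and not visit(antecedent):
                return False
        state[uri] = done
        return True

    return all(visit(uri) for uri in nodes)


def collect_sources(nodes: Dict[str, Node], root: str) -> Set[str]:
    """Collect every source reachable from the root conclusion."""
    seen, found, stack = set(), set(), [root]
    while stack:
        uri = stack.pop()
        if uri in seen or uri not in nodes:
            continue
        seen.add(uri)
        if nodes[uri].source:
            found.add(nodes[uri].source)
        stack.extend(nodes[uri].antecedents)
    return found


# The AND-introduction case: S from source B, V from source C.
justification = {
    "ex:S": Node("ex:S", "S", source="source B"),
    "ex:V": Node("ex:V", "V", source="source C"),
    "ex:SV": Node("ex:SV", "(S ∧ V)", rule="AND introduction",
                  antecedents=["ex:S", "ex:V"]),
}
```

Calling `collect_sources(justification, "ex:SV")` walks the DAG from the final conclusion back to the asserting sources, which is the kind of question a user inspecting provenance would ask.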
11 Lightweight Use of PML
<iw:NodeSet rdf:about="http://foo.com/Example.owl#SmokeFire">
  <iw:hasConclusion>(SF)</iw:hasConclusion>
  <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/N3.owl#N3"/>
  <iw:isConsequentOf>
    <iw:InferenceStep>
      <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
      <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/CWM.owl#CWM"/>
      <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/>
      <iw:hasSourceUsage>
        <iw:SourceUsage>
          <iw:spanFromByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">824</iw:spanFromByte>
          <iw:spanToByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1058</iw:spanToByte>
          <iw:hasSource rdf:resource="http://inferenceweb.stanford.edu/registry/PUB/RC.owl#RC"/>
        </iw:SourceUsage>
      </iw:hasSourceUsage>
    </iw:InferenceStep>
  </iw:isConsequentOf>
</iw:NodeSet>
How can I generate PML documents about my
inference engine?
How can I generate PML documents about the
inference rules supported by my inference engine?
By the way, which rules are supported by my inference engine?
12 Lightweight Use of PML
- Simplification strategy 2
  - no encoded knowledge about inference engines and their inference rules
- This strategy allows an inference engine (or functionality) that is not registered in the Inference Web to generate and use PML encodings
- This strategy also allows an inference engine to state a conclusion without naming the mechanism, e.g., inference rule, used to derive the conclusion
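As a rough illustration of this strategy, the sketch below uses Python's standard `xml.etree.ElementTree` to emit a node set whose single inference step names neither an inference engine nor a rule. Several parts are assumptions: `IW_NS` is a placeholder and must be replaced with the actual PML namespace, `hasAntecedent` is a hypothetical property name used only to show linking, and `lightweight_node_set` is an invented helper rather than an Inference Web API.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
IW_NS = "http://example.org/iw#"  # placeholder, NOT the real PML namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("iw", IW_NS)


def lightweight_node_set(uri, conclusion, antecedent_uris=()):
    """Serialize a node set with one inference step and no engine or rule."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    node_set = ET.SubElement(
        root, f"{{{IW_NS}}}NodeSet", {f"{{{RDF_NS}}}about": uri}
    )
    ET.SubElement(node_set, f"{{{IW_NS}}}hasConclusion").text = conclusion
    consequent = ET.SubElement(node_set, f"{{{IW_NS}}}isConsequentOf")
    step = ET.SubElement(consequent, f"{{{IW_NS}}}InferenceStep")
    # Hypothetical linking property: point at the node sets this step used.
    for antecedent in antecedent_uris:
        ET.SubElement(
            step, f"{{{IW_NS}}}hasAntecedent",
            {f"{{{RDF_NS}}}resource": antecedent},
        )
    return ET.tostring(root, encoding="unicode")


document = lightweight_node_set("http://foo.com/Example.owl#SmokeFire", "(SF)")
```

The resulting document still states a conclusion and where it sits in the justification, but the unregistered functionality that produced it does not need to describe itself to the Inference Web first.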
13 Logging Distributed Provenance
- Provenance is typically logged (or captured) by modules
  - attached to a workflow engine (i.e., client side)
  - integrated into the core functionality of services (i.e., server side, but restricted to a few functionalities such as database queries)
- Distributed provenance needs to be captured at all levels of functionality, whether it is a web service or a script call to a local application
14 Logging Distributed Provenance: PML Service Wrapper (PSW)
Functionality without provenance support: Software functionality
Functionality with provenance support: A single level of indirection is introduced between the process and the target services
[Diagram labels: PSW Wrapper; antecedent information; functionality provenance; Software functionality]
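The single level of indirection can be pictured as a wrapper that calls the original functionality unchanged and records what went in and what came out. The sketch below is a Python analogy under stated assumptions, not the actual PSW implementation: `provenance_wrapper`, `PROVENANCE_LOG`, and the toy `gridding` function are hypothetical, and a real wrapper would emit PML documents rather than append dictionaries to a list.

```python
import functools
from datetime import datetime, timezone

PROVENANCE_LOG = []  # stand-in for wherever PML documents would be stored


def provenance_wrapper(func):
    """Wrap a functionality so each call also records a provenance entry."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)  # the wrapped functionality is untouched
        PROVENANCE_LOG.append({
            "functionality": func.__name__,
            "antecedents": {"args": args, "kwargs": kwargs},
            "conclusion": result,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        return result
    return wrapper


@provenance_wrapper
def gridding(points, method="2D-Nearest Neighbor"):
    """Hypothetical stand-in for a gridding service; it only counts its inputs."""
    return {"method": method, "n_points": len(points)}


grid = gridding([(0, 0, 9.8), (1, 0, 9.7)])
```

The caller gets the same result as before, and the wrapper alone is responsible for logging, so the same pattern applies whether the target is a web service or a script call to a local application.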
15 Advanced Provenance-Supported Search for Scientific Data
- Use Case: Find CHIP images at 3pm on Sept 10, 2008
Traditional IR search can be used to get this
[Diagram label: WWW]
16 Provenance Experiment
Hypothesis: Scientists with access to provenance can identify and explain the quality of maps more accurately than scientists without access to provenance.
The following list presents our experimental procedure:
- Provide an introduction to the main concepts (e.g., provenance, map quality)
- Ask subjects to interact with the portal-like application to initiate evaluation case map and mapp in succession
- Record subjects' actions and comments (speak-aloud method)
- Provide an open discussion opportunity to collect information about noted difficulties
17 Demographics
- The requirement for participation in the user study is that subjects are active researchers in some scientific field
- We were able to get the participation of over twenty scientists from various fields, including geophysics, geology, biology, environmental sciences, and physics
- Additionally, these scientists are affiliated with various organizations located in Alaska, Arizona, California, Oklahoma, Texas, and Brazil
Education: PhD holders 60%, graduate students 40%
18 Evaluation Results
The results indicate that our hypothesis is correct: there was a significant difference between the mean accuracy of results provided by scientists when accessing provenance and when not accessing provenance. Significance was verified using a two-sample t-test at 95% confidence.
[Charts: Identification Task; Explanation Task]
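For readers unfamiliar with the test named above, here is a minimal Python sketch of a two-sample t statistic. It assumes the pooled-variance (Student) form, which the talk does not specify, and the scores are placeholders for illustration only, not the study's data. The function name `two_sample_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Placeholder scores for illustration only; these are NOT the study's data.
with_provenance = [2, 3, 4, 5, 6]
without_provenance = [1, 2, 3, 4, 5]
t, df = two_sample_t(with_provenance, without_provenance)
```

At 95% confidence, the difference between the two means is judged significant when the absolute value of `t` exceeds the two-tailed critical value for `df` degrees of freedom, taken from a t table or a statistics library.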
19 Conclusions
- PML is a powerful language to encode distributed provenance, tested in multiple disciplines and research projects
- The generation of PML encodings can be challenging
- A comprehensive process for generating PML may be more complex than initially needed
- With the simplifications provided in this talk, a lightweight use of PML is achieved
- Even a simplified use of PML can be very useful to understand results from complex applications
- Simplified PML documents can be browsed, searched, and support the search of information and data
- More than a 7-fold increase in answer understanding is achieved with the use of provenance!