Title: QuASI: Question Answering using Statistics, Semantics, and Inference
1 QuASI: Question Answering using Statistics, Semantics, and Inference
- Marti Hearst, Jerry Feldman, Chris Manning, Srini Narayanan
- Univ. of California-Berkeley / ICSI / Stanford University
2 Outline
- Project Overview
- Three topics
- Assigning semantic relations via lexical hierarchies
- From sentences to meanings via syntax
- From text analysis to inference using conceptual schemas
3 Main Goals
- Support Question-Answering and NLP in general by
- Deepening our understanding of concepts that underlie all languages
- Creating empirical approaches to identifying semantic relations from free text
- Developing probabilistic inference algorithms
4 Two Main Thrusts
- Text-based
- Use empirical corpus-based techniques to extract simple semantic relations
- Combine these relations to perform simple inferences
- Statistical semantic grammar
- Concept-based
- Determine language-universal conceptual principles
- Determine how inferences are made among these
5 Relation Recognition (UCB)
- Abbreviation Definition Recognition
- TREC Genomics Track
- Semantic Relation Identification
6 Abbreviation Detection (UCB)
- Abbreviation Definition Recognition
- Developed and evaluated new algorithm
- Better results than existing approaches
- Simpler and faster as well
- Semantic Relation Identification
- Developed syntactic chunker
- Analyzed sample relations
- Began development of a new computational model
- Incorporates syntax and semantic labels
- Test example: identify treatment for disease
7 Abbreviation Examples
- Heat-shock protein 40 (Hsp40) enables Hsp70 to play critical roles in a number of cellular processes, such as protein folding, assembly, degradation and translocation in vivo.
- Glutathione S-transferase pull-down experiments showed the direct interaction of in vitro translated p110, p64, and p58 of the essential CBF3 kinetochore protein complex with Cbf1p, a basic region helix-loop-helix zipper protein (bHLHzip) that specifically binds to the CDEI region on the centromere DNA.
- Hpa2 is a member of the Gcn5-related N-acetyltransferase (GNAT) superfamily, a family of enzymes with diverse substrates including histones, other proteins, arylalkylamines and aminoglycosides.
8 The Algorithm
- Much simpler than other approaches.
- Extracts abbreviation-definition candidates adjacent to parentheses.
- Finds correct definitions by matching characters in the abbreviation to characters in the definition, starting from the right.
- The first character in the abbreviation must match a character at the beginning of a word in the definition.
- To increase precision, a few simple heuristics are applied to eliminate incorrect pairs.
- Example: Heat shock transcription factor (HSF).
- The algorithm finds the correct definition, but not the correct character alignment within "Heat shock transcription factor".
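The right-to-left matching step described above can be sketched in a few lines. This is an illustrative reimplementation, not the exact published code; the precision heuristics mentioned on the slide are omitted.

```python
def find_definition(long_form: str, short_form: str):
    """Match abbreviation characters against the candidate definition,
    scanning right to left. The first abbreviation character must match
    at the start of a word. Returns the matched definition span, or None."""
    l = len(long_form) - 1   # index into the candidate definition
    s = len(short_form) - 1  # index into the abbreviation
    while s >= 0:
        c = short_form[s].lower()
        if not c.isalnum():  # skip non-alphanumeric abbreviation chars
            s -= 1
            continue
        # move left until this character matches; for the first
        # abbreviation character, also require a word boundary
        while l >= 0 and (long_form[l].lower() != c or
                          (s == 0 and l > 0 and long_form[l - 1].isalnum())):
            l -= 1
        if l < 0:
            return None
        l -= 1
        s -= 1
    return long_form[l + 1:]
```

On the HSF example this returns the full correct definition even though the internal character alignment is not the intended one, mirroring the behavior noted above.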
9 Results
- On the gold standard the algorithm achieved 83% recall at 96% precision.
- On a larger test collection the results were 82% recall at 95% precision.
- These results show that a very simple algorithm produces results comparable to those of the existing, more complex algorithms.
- Counting partial matches and abbreviations missing from the gold standard, our algorithm achieved 83% recall at 99% precision.
10 TREC Task 1 Overview
- Search 525,938 MEDLINE records
- Titles, abstracts, MeSH category terms, citation information
- Topics
- Taken from the GeneRIF portion of the LocusLink database
- We are supplied with a set of gene names
- Definition of a GeneRIF
- For gene X, find all MEDLINE references that focus on the basic biology of the gene or its protein products from the designated organism. Basic biology includes isolation, structure, genetics and function of genes/proteins in normal and disease states.
11 TREC Task 1 Sample Query
- 3 2120 Homo sapiens OFFICIAL_GENE_NAME ets variant gene 6 (TEL oncogene)
- 3 2120 Homo sapiens OFFICIAL_SYMBOL ETV6
- 3 2120 Homo sapiens ALIAS_SYMBOL TEL
- 3 2120 Homo sapiens PREFERRED_PRODUCT ets variant gene 6
- 3 2120 Homo sapiens PRODUCT ets variant gene 6
- 3 2120 Homo sapiens ALIAS_PROT TEL1 oncogene
- The first column is the official topic number (1-50).
- The second column contains the LocusLink ID for the gene.
- The third column contains the name of the organism.
- The fourth column contains the gene name type.
- The fifth column contains the gene name.
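The five fields above can be pulled apart programmatically. This is a sketch: it assumes whitespace-separated fields and anchors on the all-caps gene-name-type token (e.g. OFFICIAL_SYMBOL), since the organism name itself may contain spaces; the real topic-file format may differ.

```python
import re

def parse_topic_line(line: str):
    """Split a sample-query line into (topic, locus_id, organism,
    name_type, gene_name), using the ALL_CAPS gene-name-type token
    as the anchor between organism and gene name."""
    tokens = line.split()
    topic, locus_id = tokens[0], tokens[1]
    type_idx = next(i for i, t in enumerate(tokens[2:], start=2)
                    if re.fullmatch(r"[A-Z]+(_[A-Z]+)*", t))
    organism = " ".join(tokens[2:type_idx])
    name_type = tokens[type_idx]
    gene_name = " ".join(tokens[type_idx + 1:])
    return topic, locus_id, organism, name_type, gene_name
```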
12 TREC Task 1 Approach
- Two main components:
- Retrieve relevant docs
- May miss many because of variation in how gene names are expressed
- Rank order them
13 TREC Task 1 Approach
- Retrieval
- Normalization of query terms
- Special characters are replaced with spaces in both queries and documents.
- Term expansion
- A set of pattern-based rules is applied to the original list of query terms to expand the original set and increase recall.
- Some rules with lower confidence get a lower weight in the ranking step.
- Stop word removal
- Organism identification
- Gene names are often shared across different organisms
- Developed a method to automatically determine which MeSH terms correspond to LocusLink organism terms
- Retrieved MEDLINE docs indicated by LocusLink links corresponding to a given organism
- Organism terms were the most frequent MeSH categories among the selected docs
- Used these terms to identify the organism term in MEDLINE
- An example of playing two databases off each other.
- MeSH concepts
- When an exact match is found between one of the query terms and a MeSH term assigned to a document, the document is retrieved.
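The normalization step might look like the following sketch. The exact character set counted as "special", and the lower-casing, are assumptions not stated on the slide.

```python
import re

def normalize(term: str) -> str:
    """Replace special (non-alphanumeric) characters with spaces and
    collapse whitespace, applied identically to query terms and
    documents. Lower-casing is an added assumption."""
    spaced = re.sub(r"[^A-Za-z0-9]", " ", term)
    return re.sub(r"\s+", " ", spaced).strip().lower()
```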
14 TREC Task 1 Approach
- Relevance ranking
- IBM's DB2 Net Search Extender was used as the text search engine.
- Scoring
- Each query is a union of 5 different sub-queries:
- titles,
- abstracts,
- titles using low-confidence expansion rules,
- abstracts using low-confidence expansion rules, and
- MeSH concepts.
- Each sub-query returns a set of documents with a relevance score from the text search engine (or a fixed value for MeSH matches).
- The aggregated score is the weighted SUM of the individual scores, with optional weights applied to each sub-query score.
- SUM performs better than MAX, since it gives higher confidence to documents found in multiple sub-queries.
- Scores are normalized to be in the (0,1) range by dividing the score by the highest aggregated score achieved for the query.
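The weighted-SUM aggregation and normalization above can be sketched as follows; sub-query names and the weight table are illustrative, not from the slides.

```python
def aggregate(subquery_hits, weights):
    """Combine per-sub-query relevance scores by a weighted SUM, then
    divide by the highest aggregated score for the query.
    subquery_hits: {subquery_name: {doc_id: score}}."""
    totals = {}
    for name, hits in subquery_hits.items():
        w = weights.get(name, 1.0)  # optional per-sub-query weight
        for doc, score in hits.items():
            totals[doc] = totals.get(doc, 0.0) + w * score
    top = max(totals.values())
    return {doc: s / top for doc, s in totals.items()}
```

A document found by several sub-queries accumulates score from each, which is exactly why SUM beats MAX here.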
15 TREC Task 1 Approach
- GeneRIF classification
- A Naïve Bayes model is used to assign to each document the probability that it is a GeneRIF.
- MeSH terms are used as features.
- Combination of text retrieval score and GeneRIF classification score
- We tried both an additive and a multiplicative approach. Both behave similarly, with slightly better performance achieved with the additive one.
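A minimal sketch of a Naïve Bayes classifier over MeSH-term features and the additive score combination. The add-one smoothing, the equal mixing weight, and all training details are assumptions, not from the slides.

```python
import math
from collections import Counter

class GeneRIFClassifier:
    """Naive Bayes with MeSH terms as features (illustrative sketch)."""

    def fit(self, docs):
        # docs: list of (mesh_terms, is_generif)
        self.counts = {True: Counter(), False: Counter()}
        self.n = {True: 0, False: 0}
        self.vocab = set()
        for terms, label in docs:
            self.n[label] += 1
            self.counts[label].update(terms)
            self.vocab.update(terms)
        return self

    def prob_generif(self, terms):
        """P(GeneRIF | MeSH terms) with add-one smoothing."""
        total = self.n[True] + self.n[False]
        logp = {}
        for label in (True, False):
            lp = math.log(self.n[label] / total)
            denom = sum(self.counts[label].values()) + len(self.vocab)
            for t in terms:
                lp += math.log((self.counts[label][t] + 1) / denom)
            logp[label] = lp
        m = max(logp.values())
        e = {lab: math.exp(v - m) for lab, v in logp.items()}
        return e[True] / (e[True] + e[False])

def combined_score(retrieval, generif_prob, alpha=0.5):
    # additive combination, reported best on the slide; alpha is assumed
    return alpha * retrieval + (1 - alpha) * generif_prob
```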
16 TREC Task 1 Results
- Performance is measured using the standard trec_eval program.
- On training data
- Best published result: 0.4125
- With GeneRIF classifier: 0.5101
- Without GeneRIF classifier: 0.5028
- On testing data (turned in 8/4/03)
- With GeneRIF classifier: 0.3933
- Without GeneRIF classifier: 0.3768
17-26 (No Transcript)
27 The Stanford Lexicalized Parser: An open source Java parser
- Dan Klein, Roger Levy, and Chris Manning, Computer Science and Linguistics
- Stanford University
- http://nlp.stanford.edu/
28 Probabilistic parsing
- Standard solutions (Collins 96, 99; Charniak 97, 00):
- Capture word-specific trends by lexicalizing symbols.
- Capture environment-specific trends by marking ancestors.
- Benefits
- Model context-freedom matches data context-freedom better.
- Maximum posterior parses are correct more often.
- Costs
- State space becomes huge.
- Joint estimates become extremely sparse.
- Exact inference becomes infeasible.
- Parsers become difficult to engineer.
- NP becomes NP[rates] (lexicalization); NP[rates] becomes NP^VP^S[rates] (ancestor marking).
- We want to address these issues.
29 Factoring Syntax and Semantics
Lexicalized tree T = (C, D), with P(T) = P(C) P(D)
Syntax C: P(C) is a standard PCFG; captures structural patterns
Semantics D: P(D) is a dependency grammar; captures word-word patterns
30 Efficient exact inference: The Factored A* Estimate
- A* parsing will be efficient if we can find a tight upper bound on the true best score β_T(E) of an edge E.
- Finding the score of the best coherent pair (C, D) is as hard as parsing, but P(C) and P(D) alone are very simple, and so we can quickly find β_C(E) and β_D(E).
- These maximizations, considered jointly, effectively range over all pairs (C, D) instead of only coherent ones, so we know that β_T(E) ≤ β_C(E) β_D(E). We can therefore use a(E) = β_C(E) β_D(E) as a good admissible estimate.
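The admissibility argument can be checked on a toy example: maximizing the syntactic and semantic scores independently always upper-bounds the score of the best coherent pair. All numbers and the coherence relation below are made up for illustration.

```python
# Candidate syntactic structures C and dependency structures D with
# illustrative scores; only some (C, D) pairs are mutually coherent.
p_c = {"c1": 0.6, "c2": 0.4}
p_d = {"d1": 0.7, "d2": 0.3}
coherent = {("c1", "d2"), ("c2", "d1")}  # hypothetical coherence relation

# Best coherent pair (expensive in general: this is parsing)
best_coherent = max(p_c[c] * p_d[d] for (c, d) in coherent)

# Factored estimate: maximize each component independently (cheap)
estimate = max(p_c.values()) * max(p_d.values())

# The estimate never underestimates, so it is admissible for A*.
assert best_coherent <= estimate
```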
31 Results: Accuracy
Details: Syntactic: "basic" is the unsmoothed parent-annotated treebank covering grammar; "best" includes other annotation. Semantic: "basic" is a word-word model smoothed by tags; "best" includes a simple distance and valence model. Results on Penn Treebank WSJ Section 23. Labeled bracketing is average sentence F1. Gold dependencies induced heuristically from gold parse trees. (Klein and Manning, IJCAI 2003)
(Charts: Labeled Bracketing Accuracy (F1); Dependency Accuracy)
32 Results: Efficiency
- The factored A* estimate reduces work by a factor of between 100 and 10,000 compared to exhaustive parsing.
(Chart: search work)
- Details
- Parser uses the Eisner & Satta 99 O(n^4) schema (though the exponential observed growth suggests that so little work is being done that the dominant effect is the small-constant exponential function of the A* gap, not the large-constant polynomial function of the sentence length).
- The total time is dominated by the plain-PCFG parse phase, which can be reduced.
33 Recent Focus: Accurate unlexicalized parsing
- Most of the emphasis in the last decade has been on exploiting lexical dependencies
- We show that accurate structural (syntactic) modeling has been highly underexploited
- Strategy: deterministically refine the category set of a treebank so it better reflects important linguistic distinctions (and hence better models probabilistic dependencies)
- Our best unlexicalized parsers outperform early lexicalized parsers (Klein and Manning, ACL 2003; cf. Magerman 1995: 84.7, Collins 1996: 86.0)
34 Recent Focus: Accurate unlexicalized parsing
- E.g. representing subordinating complementizers in the category set fixes the PP parse on the left
35 Recent Focus: Accurate unlexicalized parsing
- Note: development set performance; final test set (40 words) F1 = 86.32
- Illustrates the strength of the Factored Parser architecture: we can quickly and easily improve one component
- Unlexicalized grammar is more domain-independent
36 Unlexicalized Sec. 23 Results
- Beats first-generation lexicalized parsers.
- Much of the power of lexicalization comes from closed-class monolexicalization.
37 Multilingual Parsing: Chinese - Syntactic sources of ambiguity
- English: PP attachment (well understood); coordination scoping (less well understood)
- Chinese: modifier attachment less of a problem, as verbal modifiers and direct objects aren't adjacent, and NP modifiers are overtly marked.
38 Chinese Performance
- Close to state-of-the-art for Chinese parsing
- Considerable difference in precision/recall split from other work suggests complementary strengths
- Levy and Manning, ACL 2003
39 Recent Chinese results: learning curve
- New release of Chinese Treebank provides more data (300,000 words)
40 Multilingual Parsing: German
- Linguistic characteristics, relative to English:
- Ample derivational and inflectional morphology
- Freer word order
- Verb position differs in matrix/embedded clauses
- Target corpus: Negra
- 400,000 words newswire text
- Flatter phrase structure annotations (few PPs!)
- Explicitly marked phrasal discontinuities
41 Current results (preliminary)
- Area needing investigation: the word dependency model currently gives relatively little improvement.
- Consistent with Dubey and Keller's findings that basic head-complement lexical dependencies harm performance for Negra German
42 Upcoming
- Incorporation of morphological information into the parsing model
- Recently released TIGER corpus (similar to Negra, 800,000 words)
- Additional languages (Czech, Arabic)
- Reconstruction of dislocated argument positions (common in German, Czech, many other languages)
43 Semantic Role Identification: Problem Statement
- Given a sentence and a word of interest (the predicator) in that sentence
- Find:
- The constituents related to that word and the nature of those relationships
- The overarching relationship (the frame) for the word and its roles
- Example: Tim drove his car to the store.
- [Tim]Driver drove [his car]Vehicle [to the store]Goal
- Relationship: Transportation
44 Annotated Examples
- [Judge We] praised [Evaluee the syrup tart] extravagantly.
- Her verse circulated to [Manner warm] [Judge critical] praise.
- [Agent His brothers] avenged [Injured_party him].
- [Selector The president] appoints [Leader a Prime Minister] [Conditions each year].
- She bought [Count three] [Unit kilos] [Stuff of apples].
- [Beh It] was [Degree really] mean [Evaluee of me].
45 Benefits of Solving the Problem
- Identify that two syntactically different phrases play the same role
- The board changed their ruling yesterday.
- The ruling changed because of protests.
- NLP: question answering, WSD, translation, summarization, speech recognition
- Computational Biology: operon prediction
- Security
- Intrusion Detection
- Credit Card Fraud
46 A Generative Model
47 Results: FrameNet I
48 Confusion Table: Roles Contributing Most to Error
(Rows: correct; columns: guesses)
49 Results: FrameNet II
Test Set Accuracy
Comparable numbers for FrameNet I
50 Concept-based Analysis
- Uniform formalism for encoding conceptual relations and grammatical constructions
- Initial version of construction parser
- Coordinated Relational Probabilistic Models for inference
51 Inference and Conceptual Schemas: Background
- Hypothesis
- Linguistic input is converted into a mental simulation based on bodily-grounded structures.
- Components
- Semantic schemas
- Image schemas and executing schemas are abstractions over neurally grounded perceptual and motor representations
- Linguistic units
- Lexical and phrasal construction representations invoke schemas, in part through metaphor
- Inference links these structures and provides parameters for a simulation engine
52 Conceptual Schemas
- We have developed a formalism for encoding conceptual schemas.
- Structured feature structure representation (ECG).
- Uniform representation for conceptual relations and for grammatical constructions.
- Supports structured probabilistic inference.
- Initial DAML+OIL implementation.
- Produced by a construction parser.
53 Construction Parser
- The parser maps from language input to a deep semantic specification
- The semantic specification is a network of linked conceptual ECG schemas
- Language- and domain-independent
- Supports structured probabilistic inference
- First system running since November 2002
- Uses novel parsing techniques combining chunking, unification, and semantic fit
54 State of Resource Development
- MetaNet
- Pilot system implemented
- SQL-based backend (Michael Meisel, CS undergrad)
- Data-entry GUI
- Database is being populated with image schemas (Ellen Dodge, Ling grad)
- FrameNet
- DAML+OIL version of FrameNet-1
- Combining FrameNet and WordNet for semantic extraction (Behrang Mohit, SIMS and ICSI, recently UTD)
- Good use of FrameNet for QA (UTD, Stanford, CU)
- Linking to external ontologies
- ECG OpenCyc link (Preslav Nakov, Marco Barreno)
55 Dynamic Probabilistic Inference for Event Structure
- Srini Narayanan, Jerry Feldman
- ICSI and UC Berkeley
56 Scenario Question (CNS data)
- How has Al-Qaida conducted its efforts to acquire WMD capability, and what are the results of this endeavor?
- Even with perfect parsing, to answer this question we have to go beyond words in the input in at least the following ways:
- Multiple sources (reports, evidence, news)
- Fusing information from unreliable sources (P(information true | source))
- Non-monotonicity: previous assertions or predictions may have to be retracted in the light of new evidence.
- Modeling complex events
- Evolving events with complex dynamics including sequence, concurrency, coordination, interruptions and resources.
57 Reasoning about Events for QA
- Reasoning about dynamics
- Complex event structure
- Multiple stages, interruptions, resources
- Evolving events
- Conditional events, presuppositions
- Nested temporal and aspectual references
- Past and future event references
- Metaphoric references
- Use of the motion domain to describe complex events
- Reasoning with uncertainty
- Combining evidence from multiple, unreliable sources
- Non-monotonic inference
- Retracting previous assertions
- Conditioning on partial evidence
58 Cognitive Semantics
- Much of language and thought is directly embodied and relies on recurrent patterns of familiar experience
- Image schemas
- Containment, Force Dynamics, Spatial Relations
- Motor schemas
- Homeostasis, Source-Path-Goal, Monitoring, Aspect
- Social cognition
- Authority, Care-giving, Play
- Abstract language and thought derive a significant amount of their meaning from mappings to embodied schemas
- Event Structure Metaphor, projection invariants and Cogs (aspect, topological relations), frames, mental spaces
59 Previous work
- Models of event structure that are able to deal with the temporal and aspectual structure of events
- Based on an active semantics of events and a factorized graphical model of complex states
- Models event stages, embedding, multi-level perspectives and coordination
- Event model based on a Stochastic Petri Net representation with extensions allowing hierarchical decomposition
- State is represented as a Temporal Bayes Net (T(D)BN)
60-62 (No Transcript)
63 Factorized Inference
64 Quantifying the Model
65 Pilot System Results
- Captures fine-grained distinctions needed for interpretation
- Frame-based inferences (COLING 02)
- Aspectual inferences (CogSci 98, IJCAI 99, COLING 02)
- Metaphoric inferences (AAAI 99)
- Sufficient inductive bias for verb learning (Bailey 97, CogSci 99) and construction learning (Chang 02, to appear)
- Model for DAML-S (WWW 02, Computer Networks 03)
66 Extensions to Pilot System
- Scalable data resources
- Language resources/ontology
- Lexicon (open source, WordNet, FrameNet)
- Conceptual relations
- Schemas, maps, frames, mental spaces
- General principle: use Semantic Web resources (DAML, DAML-S, OpenCyc, IEEE SUMO)
- Language analyzer
- Construction parser (ICSI/EML)
- Statistical techniques (UCB/Stanford, CU, UTD)
- Scalable domain representation
- Coordinated Probabilistic Relational Models
67 Problems with DBNs
- Scaling up to relational structures
- Supports linear (sequence) but not branching (concurrency, coordination) dynamics
68 Structured Probabilistic Inference
69 Probabilistic inference for QA
- Filtering
- P(X_t | o_1:t, X_1:t)
- Update the state based on the observation sequence and state set
- MAP estimation
- argmax_{h_1..h_n} P(X_t | o_1:t, X_1:t)
- Return the best assignment of values to the hypothesis variables given the observations and states
- Smoothing
- P(X_{t-k} | o_1:t, X_1:t)
- Modify assumptions about previous states, given the observation sequence and state set
- Projection/prediction/reachability
- P(X_{t+k} | o_1:t, X_1:t)
- Predict future states based on the observation sequence and state set
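Two of the queries above, filtering and prediction, can be illustrated on a minimal discrete-state Markov model. This is a deliberately simple two-state sketch with made-up transition and emission numbers, far simpler than the CPRM setting, but the update equations are the same in form.

```python
# Transition model T[prev][next] and emission model E[state][obs]
# (all numbers are illustrative).
T = {"on": {"on": 0.9, "off": 0.1}, "off": {"on": 0.2, "off": 0.8}}
E = {"on": {"hot": 0.8, "cold": 0.2}, "off": {"hot": 0.3, "cold": 0.7}}

def renorm(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def filter_step(belief, obs):
    """Filtering: one step of P(X_t | o_1:t) -- predict, then condition."""
    predicted = {s: sum(belief[p] * T[p][s] for p in T) for s in T}
    return renorm({s: predicted[s] * E[s][obs] for s in T})

def predict(belief, k):
    """Projection/prediction: P(X_{t+k} | o_1:t), no further observations."""
    for _ in range(k):
        belief = {s: sum(belief[p] * T[p][s] for p in T) for s in T}
    return belief

belief = {"on": 0.5, "off": 0.5}
for obs in ["hot", "hot", "cold"]:
    belief = filter_step(belief, obs)
future = predict(belief, k=3)
```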
70 The CPRM Algorithm
- Combines insights from:
- the SVE algorithm for PRMs (Pfeffer 2000),
- the frontier algorithm for temporal models (Murphy 2002), and
- inference algorithms for complex, coordinated events (Narayanan 1999)
- An expressive probabilistic modeling paradigm with relations and branching dynamics
- Offers principled methods to bound inferential complexity
71 Summary
- QA with complex scenarios (such as the CNS scenario/data) needs complex inference that deals with:
- Relational structure
- Uncertain source and domain knowledge
- Complex dynamics and evolving events
- We have developed a representation and inference algorithm that is capable of tractable inference for a variety of domains.
- We are collaborating with UTD (Sanda Harabagiu) to apply these techniques to QA systems.
72 Putting It All Together
- We explored two related levels of semantics
- Universal conceptual schemas
- Extracting semantic relations from text
- In Phase I they remained separate
- However, we came up with CPRMs as a common representational format
- In Phase II we propose to combine them in a semantically based integrated QA system.