Title: Indirect Anaphora Resolution as Semantic Path Search
1. Indirect Anaphora Resolution as Semantic Path Search
- James Fan, Ken Barker and Bruce Porter
- University of Texas at Austin
2. Indirect Anaphora
- Indirect anaphora is a type of anaphora in which the referring expression and the object being referred to are related by unstated background knowledge.
- May account for 15% of noun phrase anaphora [Poesio and Vieira 98].
3. Indirect Anaphora and Knowledge Capture
- In order to automatically capture knowledge from text, indirect anaphora must be resolved.
- For example: "When the detective got back to the garage, the door was unlocked."
- The referring expression, "the door", relates to the antecedent, "the garage", through a part-whole (metonymy) link.
4. Challenges in Indirect Anaphora Resolution
- Requires semantic knowledge of the relationship between the referring expression and the antecedent.
- Problematic for shallow processing systems.
5. Our Approach
- Resolve indirect anaphora by using a general-purpose search program that finds short semantic paths in a knowledge base.
- The search program has been used for a variety of tasks, including:
  - noun compound interpretation [Fan, et al. 2003]
  - query interpretation [Fan and Porter 2004]
6. Previous Work: Theoretical
- Theoretical work has identified a variety of types of indirect anaphora [Clark 1975; Gardent, et al. 2003].
7. Some Frequent Types of Indirect Anaphora
8. Previous Work: WordNet Based
- Use WordNet as the knowledge base [Vieira and Poesio 2000].
- Choose one noun as the most likely antecedent from a list of nouns that appear earlier in the text.
- The antecedent must relate to the referring expression as a synonym, hypernym/hyponym, coordinate sibling, or meronym/holonym.
- If multiple antecedents are found, they are ranked based on their contextual distances from the referring expression.
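The WordNet-based scheme above can be sketched as a filter-then-rank step. This is a minimal illustration, not the published system: the relation table is a toy stand-in for WordNet lookups, and all nouns and relation labels are invented for the example.

```python
# Toy stand-in for WordNet relation lookups; a real system would
# query WordNet for synonym/hypernym/meronym links between senses.
RELATIONS = {
    ("garage", "door"): "holonym",   # a garage has a door as a part
    ("house", "door"): "holonym",
}

# Only these relation types license an antecedent in this scheme.
ALLOWED = {"synonym", "hypernym", "hyponym", "coordinate", "meronym", "holonym"}

def pick_antecedent(referring, prior_nouns):
    """Keep candidates linked by an allowed relation, then rank the
    survivors by contextual distance (the most recent noun wins)."""
    found = [(dist, noun)
             for dist, noun in enumerate(reversed(prior_nouns))
             if RELATIONS.get((noun, referring)) in ALLOWED]
    return min(found)[1] if found else None
```

With the toy table, `pick_antecedent("door", ["detective", "house", "garage"])` prefers "garage" over "house" purely because it occurs later in the text.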
9. Previous Work: WordNet Based
- Strength
  - Reveals the type of association between each referring expression and its antecedent.
- Weakness
  - Low recall, commonly attributed to the fact that many frequently used types of links, such as event/role or cause/consequence, are not available in WordNet.
10. Previous Work: Machine-learning Systems
- Use the web as the corpus [Markert, et al. 2003; Bunescu 2003].
- Issue a series of web search queries made of the referring expression and each candidate antecedent.
- Use the number of web pages returned as a measure of the strength of association.
- If the strength exceeds a threshold, then consider the candidate the true antecedent.
- Machine learning techniques are used to determine the best threshold.
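The thresholding step described above reduces to a single comparison once the hit counts are in hand. In this sketch the web queries are replaced by a lookup table of invented counts, and the threshold value is arbitrary; in the published systems both the queries and the learned threshold are real.

```python
def associated(counts, referring, candidate, threshold=1000):
    """Decide antecedence from co-occurrence strength.

    `counts` stands in for the number of web pages returned by a
    query pairing the two nouns; the threshold would be tuned by a
    machine-learning procedure rather than fixed as it is here.
    """
    return counts.get((referring, candidate), 0) >= threshold

# Invented hit counts for two candidate antecedents of "the door".
hits = {("door", "garage"): 54000, ("door", "detective"): 120}
```

Here `associated(hits, "door", "garage")` succeeds while `associated(hits, "door", "detective")` fails, mirroring how the count-based test separates candidates. Note that the decision says nothing about *what* relation links the two nouns, which is exactly the weakness noted on the next slide.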
11. Previous Work: Machine-learning Systems
- Strengths
  - Broad coverage of all types of links.
  - Achieved results comparable with WordNet-based approaches.
- Weakness
  - Does not determine the semantic nature of the relationship between the referring expression and the antecedent.
12. Our Interpreter
- Task
  - Given: a knowledge base encoded as a semantic network.
  - Input: a pair of nouns corresponding to two nodes in the network.
  - Output: a path of semantic relations between the two nodes.
- Stops when any subclass or superclass of the goal node is found.
- Sorting prefers paths of short length.
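The interpreter's search can be sketched as breadth-first search over the semantic network, which yields shortest paths first and implements the relaxed stopping criterion by accepting any sub- or superclass of the goal. The network, taxonomy, and relation names below are toy examples, not the actual knowledge base.

```python
from collections import deque

# Toy semantic network: node -> list of (relation, neighbor) edges.
KB = {
    "garage": [("has-part", "door"), ("instance-of", "building")],
    "building": [("has-part", "door")],
    "door": [("instance-of", "barrier")],
    "barrier": [],
}

# Toy taxonomy backing the relaxed stopping criterion.
SUPERCLASSES = {"door": ["barrier"], "garage": ["building"]}

def related_to_goal(node, goal):
    """True if node is the goal itself or a sub/superclass of it."""
    return (node == goal
            or node in SUPERCLASSES.get(goal, [])
            or goal in SUPERCLASSES.get(node, []))

def find_path(start, goal):
    """Breadth-first search, so shorter paths are found first."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if related_to_goal(node, goal):
            return path
        for relation, neighbor in KB.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None
```

On the toy network, `find_path("garage", "door")` returns the single-edge path through `has-part`; searching for the superclass "barrier" stops early at "door", illustrating the relaxed criterion.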
13. Our Interpreter
- (Figure: an example semantic path involving "Door".)
14. Comparison With Previous Approaches
- Similar to WordNet-based systems.
- Differences:
  - More relaxed stopping criterion.
  - Sorting based on lexical distance (path length) rather than contextual distance.
  - Searches inherited properties (not just local ones).
  - Deeper search.
15. Applying Our Interpreter to Indirect Anaphora Resolution
- Word sense: form the cross product of all possible word senses of each referring expression and each candidate antecedent. This forms candidate pairs <referring expression, antecedent>.
- Search: find semantic paths for each candidate pair.
- Select: rank the semantic paths to choose the best candidate path.
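The three steps above can be sketched as a small pipeline. The sense inventory and the path table are invented stubs standing in for the knowledge base and the interpreter; ranking is by path length, matching the sorting criterion described earlier.

```python
from itertools import product

# Hypothetical word-sense inventory; in the real system, senses
# come from the knowledge base, not a hand-written table.
SENSES = {
    "door": ["door#entrance"],
    "garage": ["garage#building", "garage#rock-band"],
}

# Stub for the interpreter: maps a (referent-sense, antecedent-sense)
# pair to a semantic path, or nothing when no path is found.
PATHS = {
    ("door#entrance", "garage#building"):
        [("garage#building", "has-part", "door#entrance")],
}

def resolve(referring_expr, candidates):
    """Try every sense pair (cross product), search for a path,
    and keep the candidate with the shortest path found."""
    best = None
    for antecedent in candidates:
        for r_sense, a_sense in product(SENSES[referring_expr],
                                        SENSES[antecedent]):
            path = PATHS.get((r_sense, a_sense))
            if path and (best is None or len(path) < len(best[1])):
                best = (antecedent, path)
    return best
```

For the running example, `resolve("door", ["garage"])` discards the rock-band sense of "garage" (no path exists) and returns the building sense with its part-whole path, so the output names both the antecedent and the relation linking it to the referring expression.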
16. Experiment 1: Evaluate the Interpreter's Performance
- Two data sets:
  - 32 articles from the Brown corpus.
  - 32 articles from the Wall Street Journal.
- Compared with an implementation of a state-of-the-art WordNet-based system [Vieira and Poesio 2000].
- WordNet 2.0 as knowledge base.
17. Experiment 1 (Results)
18. Experiment 1: Analysis
- Precision remains the same.
- Recall increases significantly.
19. Ablation Study
- Why is the recall significantly better?
- The systems differ in only four ways:
  - More relaxed stopping criterion.
  - Sorting based on lexical distance (path length) rather than contextual distance.
  - Searching inherited properties (not just local ones).
  - Deeper search.
- We measured the contribution of each difference through a series of ablations.
20. Ablation Study (Results)
21. Ablation Study: Analysis
- Little impact:
  - Sorting.
  - Search depth.
  - Inherited properties.
- Big impact:
  - Stopping criterion.
22. Experiment 2
- Is the effect of the stopping criterion restricted to these data sets and this task?
- Evaluated the impact of four different stopping criteria on semantic path search: equality, superclass, subclass, super_or_subclass.
- Task: noun compound interpretation.
- Four sets of data (742 pairs of nouns in total):
  - Biology text.
  - Small engine repair manual.
  - SPARCstation owner's manual.
  - Online airplane descriptions.
23. Experiment 2 (Results)
24. Experiment 2: Result Analysis
- The interpreter that used the most restricted stopping criterion had the worst recall.
- The interpreter that used the least restricted stopping criterion had the best recall.
- A more relaxed stopping criterion could in principle induce many false positives, but this was rarely the case in practice.
25. Conclusion and Discussion
- We applied a general tool for finding semantic paths between concepts to indirect anaphora resolution.
- Our system achieved much higher recall with no drop in precision.
- A relaxed stopping criterion, not search depth, is responsible for the increase in recall. This suggests that the interpreter can be used on very large knowledge bases.
- In the future, we plan to:
  - Assess the interpreter's effectiveness on additional natural language processing tasks.
  - Evaluate the impact of taxonomy design on performance.