Title: Collective Word Sense Disambiguation
1. Collective Word Sense Disambiguation
- David Vickrey
- Ben Taskar
- Daphne Koller
2. Word Sense Disambiguation
"The electricity plant supplies 500 homes with power."  (clues: "electricity", "power")
vs.
"A plant requires water and sunlight to survive."  (clues: "water", "sunlight")
Tricky: "That plant produces bottled water."
3. WSD as Classification
- Senses s1, s2, ..., sk correspond to classes c1, c2, ..., ck
- Features: properties of the context of the word occurrence
  - The subject or verb of the sentence
  - Any word occurring within 4 words of the occurrence (see the sketch below)
- Document: the set of features corresponding to one occurrence
The electricity plant supplies 500 homes with
power.
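As a concrete illustration, here is a minimal sketch of the ±4-word window features for the sentence above (the function and variable names are ours, not from the slides; the subject/verb features would additionally require a parser):

    def context_features(tokens, i, window=4):
        # Collect the words within `window` positions of the occurrence at index i.
        feats = set()
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                feats.add(tokens[j].lower())
        return feats

    tokens = "The electricity plant supplies 500 homes with power".split()
    print(context_features(tokens, tokens.index("plant")))
    # {'the', 'electricity', 'supplies', '500', 'homes', 'with'}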
4. Simple Approaches
- Only features are which words appear in the context
- Naïve Bayes (sketch below)
- Discriminative models, e.g., SVM
- Problems:
  - Feature set not rich enough
  - Data extremely sparse
  - "space" occurs only 38 times in a corpus of 200,000 words
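A minimal sketch of the Naïve Bayes baseline over bag-of-context-words features; scikit-learn and the toy sentences are our choices, not part of the original slides:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Toy labeled contexts of "plant" (hypothetical training data).
    contexts = [
        "the electricity plant supplies homes with power",
        "the plant requires water and sunlight to survive",
    ]
    senses = ["plant/factory", "plant/flora"]

    vec = CountVectorizer()
    clf = MultinomialNB().fit(vec.fit_transform(contexts), senses)
    print(clf.predict(vec.transform(["the power plant was shut down"])))
    # likely ['plant/factory'] on this toy data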
5. Available Data
- WordNet: an electronic thesaurus
  - Words grouped by meaning into synsets
  - Slightly over 100,000 synsets
  - For nouns and verbs, a hierarchy over the synsets (example tree and sketch below)
    Animal
    ├── Mammal
    │     └── Dog, Hound, Canine
    │           ├── Retriever
    │           └── Terrier
    └── Bird
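The synsets and hypernym hierarchy can be browsed with NLTK's WordNet interface (our tooling choice; requires nltk.download('wordnet') first):

    from nltk.corpus import wordnet as wn

    # All synsets containing the word "dog".
    for syn in wn.synsets("dog"):
        print(syn.name(), "-", syn.definition())

    # Step up the hierarchy from the canine sense of "dog".
    dog = wn.synset("dog.n.01")
    print([h.name() for h in dog.hypernyms()])  # e.g., ['canine.n.02', 'domestic_animal.n.01']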
6. Available Data
- Around 400,000-word corpus labeled with synsets from WordNet
- Sample sentences from WordNet
- Very sparse for most words
7. What Hasn't Worked
- Intuition: the context of "dog" is similar to the context of "retriever"
- Use the hierarchy to determine possibly useful data
- Using cross-validation, learn which data is actually useful
- This hasn't worked out very well
8. Why?
- Lots of parameters (not even counting parameters estimated using MLE)
  - > 100K for one model, 20K for another
- Not much data (400K words)
  - "a", "the", "and", "of", "to" occur 65K times (combined)
- Hierarchy may not be very useful
  - Hand-built; not designed for this task
- Features not very expressive
- Luke is looking at this more closely using an SVM
9. Collective WSD
- Ideas:
  - Determine the senses of all words in a document simultaneously; this allows for richer features
  - Train on unlabeled data as well as labeled; lots and lots of unlabeled text is available
10. Model
- Variables:
  - S1, S2, ..., Sn: synsets
  - W1, W2, ..., Wn: words, always observed
[Figure: chain-structured model over synsets S1 ... S5, each synset Si emitting an observed word Wi]
11. Model
- Each synset is generated from the preceding context; the size of the context is a parameter (here 4). A code sketch follows the equations.
P(S, W) = \prod_{i=1}^{n} P(W_i \mid S_i) \, P(S_i \mid S_{i-3}, S_{i-2}, S_{i-1})

P(S_i = s \mid s_{i-3}, s_{i-2}, s_{i-1}) = \frac{1}{Z(s_{i-3}, s_{i-2}, s_{i-1})} \exp\big(\lambda_s(s_{i-3}) + \lambda_s(s_{i-2}) + \lambda_s(s_{i-1}) + \lambda_s\big)

P(W) = \sum_{S} P(S, W)
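A sketch of the log-linear transition distribution above; the nested-dict weight layout and all names are our assumptions:

    import numpy as np
    from collections import defaultdict

    def transition_probs(lam_ctx, lam_bias, context, domain):
        # P(S_i = s | s_{i-3}, s_{i-2}, s_{i-1}) for each candidate s in domain.
        # lam_ctx[s][c] is lambda_s(c); lam_bias[s] is the bias term lambda_s.
        scores = np.array([sum(lam_ctx[s][c] for c in context) + lam_bias[s]
                           for s in domain])
        scores -= scores.max()          # numerical stability
        p = np.exp(scores)
        return p / p.sum()              # dividing by p.sum() plays the role of Z

    lam_ctx = {"plant/factory": defaultdict(float, {"power/energy": 2.0}),
               "plant/flora": defaultdict(float)}
    lam_bias = {"plant/factory": 0.0, "plant/flora": 0.5}
    context = ["power/energy", "home/dwelling", "supply/provide"]
    print(transition_probs(lam_ctx, lam_bias, context,
                           ["plant/factory", "plant/flora"]))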
12. Learning
- Two sets of parameters:
  - P(Wi | Si): given current estimates of the marginals P(Si), update with expected counts
  - λ_s(s'): for s' ∈ Domain(S_{i-1}), s ∈ Domain(S_i), gradient ascent on the log-likelihood gives (sketch below)

\frac{\partial \log P(W)}{\partial \lambda_s(s')} = \sum_{s_{i-3},\, s_{i-2}} \Big[ P(w, s_{i-3}, s_{i-2}, s', s) - P(w, s_{i-3}, s_{i-2}, s') \, P(s \mid s_{i-3}, s_{i-2}, s') \Big]
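In code, this is the usual "empirical minus expected counts" gradient; a sketch assuming inference has already produced the joint marginals (the dictionary layout is our assumption):

    def grad_lambda(joint_marg, trans_prob, s_prev, s):
        # d log P(w) / d lambda_s(s_prev), summed over s_{i-3} and s_{i-2}.
        # joint_marg[(a, b, c, d)] = P(w, s_{i-3}=a, s_{i-2}=b, s_{i-1}=c, s_i=d)
        # trans_prob[(a, b, c, d)] = P(s_i=d | a, b, c)
        g = 0.0
        for (a, b, c, d), p in joint_marg.items():
            if c != s_prev:
                continue
            if d == s:
                g += p                          # empirical term
            g -= p * trans_prob[(a, b, c, s)]   # expected term (sums to the marginal)
        return g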
13. Efficiency
- Only need to calculate marginals over contexts
  - Forward-backward (sketch below)
- Issue: some words have many possible synsets (40-50), and we want very fast inference
  - Possibly prune values?
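For intuition, a forward-backward sketch on a first-order chain; the actual model conditions on three predecessors, which can be folded into the state at the cost of a larger state space:

    import numpy as np

    def forward_backward(init, trans, emit):
        # init: (K,) initial distribution; trans[a, b] = P(S_i=b | S_{i-1}=a)
        # emit[i, s] = P(W_i | S_i=s); returns posterior marginals P(S_i | W).
        n, K = emit.shape
        alpha, beta = np.zeros((n, K)), np.ones((n, K))
        alpha[0] = init * emit[0]
        for i in range(1, n):
            alpha[i] = (alpha[i - 1] @ trans) * emit[i]
        for i in range(n - 2, -1, -1):
            beta[i] = trans @ (beta[i + 1] * emit[i + 1])
        post = alpha * beta
        return post / post.sum(axis=1, keepdims=True)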
14. WordNet and Synsets
- The model uses WordNet to determine the domain of Si
  - Synset information should be more reliable
- This allows us to learn without any labeled data
  - Consider the synsets {eagle, hawk} (the birds), {eagle} (the golf shot), and {hawk} (to sell)
  - Since the parameters depend only on the synset, the correct clustering can be found even without labeled data
15. Richer Features
- Heuristic: one sense per discourse; usually, within a document, any given word takes only one of its possible senses
- Can capture this using long-range links (a crude approximation is sketched below)
  - Could assume each word is independent of all occurrences besides the ones immediately before and after
  - Or could use approximate inference (Kikuchi)
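A crude stand-in for the long-range links: run independent predictions, then force each word to its majority sense within the document (purely illustrative, not the model itself):

    from collections import Counter, defaultdict

    def one_sense_per_discourse(words, senses):
        # Reassign every occurrence of a word to its most frequent predicted sense.
        by_word = defaultdict(list)
        for w, s in zip(words, senses):
            by_word[w].append(s)
        majority = {w: Counter(ss).most_common(1)[0][0]
                    for w, ss in by_word.items()}
        return [majority[w] for w in words]

    print(one_sense_per_discourse(["plant", "plant", "plant"],
                                  ["factory", "flora", "factory"]))
    # ['factory', 'factory', 'factory']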
16. Richer Features
- Can reduce feature sparsity using the hierarchy, e.g., replacing all occurrences of "dog" and "cat" with "animal" (sketch below)
  - Need collective classification to do this
- Could add global hidden variables to try to capture the document subject
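A sketch of the hierarchy-based generalization using NLTK's WordNet interface; it assumes sense-disambiguated input, since the right hypernym depends on the sense, which is exactly why collective classification is needed:

    from nltk.corpus import wordnet as wn

    def generalize(synset_name, levels=1):
        # Replace a synset by an ancestor `levels` steps up the hierarchy.
        syn = wn.synset(synset_name)
        for _ in range(levels):
            parents = syn.hypernyms()
            if not parents:
                break
            syn = parents[0]   # WordNet allows multiple parents; take the first
        return syn.name()

    print(generalize("dog.n.01", levels=2))   # e.g., 'carnivore.n.01'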
17. Advanced Parameters
- Lots of parameters
  - Regularization likely helpful (sketch below)
- Could tie parameters together based on similarity in the WordNet hierarchy
  - Ties in with what I was working on before
  - More data in this (unlabeled) setting
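A minimal sketch of both ideas: an L2 penalty added to the gradient, and one simple tying scheme that replaces each WordNet-similarity group with its mean (the grouping itself is our placeholder; the slides leave the scheme open):

    import numpy as np

    def regularized_grad(grad, weights, l2=0.1):
        # Gradient of [log-likelihood - (l2/2) * ||weights||^2].
        return grad - l2 * weights

    def tie_parameters(weights, groups):
        # Force parameters in the same similarity group to share their mean value.
        tied = weights.copy()
        for idx in groups:
            tied[idx] = weights[idx].mean()
        return tied

    w = np.array([0.5, 1.5, -2.0])
    print(tie_parameters(w, [[0, 1]]))   # [ 1.   1.  -2. ]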
18. Experiments