Title: Learning Analogies and Semantic Relations


1
Learning Analogies and Semantic Relations
  • April 12 2007
  • William Cohen

2
Announcements
  • No critiques next week.
  • No class at all Thursday (carnival)
  • Project presentations start April 24th
  • First up: Jana Terrill, Yimeng
  • 30min each
  • Final class is May 10th, 12-2pm.

3
Machine Learning, 2005
4
Motivation
  • Information extraction is about understanding
    entity names in text
  • and also relations between entities.
  • How do you determine if you understand an
    arbitrary relation?
  • For fixed relations R: labeled data (ACE)
  • For arbitrary relations: ?

5
Evaluation
6
How do you measure the similarity of relation
instances?
  • Create a feature vector r_{x:y} for each instance
    x:y
  • mason:stone → <0.123, 0.001, 5.47, ...>
  • soldier:gun → <6.54, 0.013, 13.201, ...>
  • Use cosine distance.
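
A minimal sketch of the comparison step, assuming the feature vectors are plain lists of numbers. The function name `cosine` and the zero-vector convention are choices made for this illustration, not something the slides specify. Only the first three components shown on the slide are used, so the printed value is illustrative only. A larger cosine means the two relation instances look more alike.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# First three components from the slide; the real vectors are longer.
r_mason_stone = [0.123, 0.001, 5.47]
r_soldier_gun = [6.54, 0.013, 13.201]
print(cosine(r_mason_stone, r_soldier_gun))
```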

7
Creating an instance vector for x:y
  • Generate a bunch of queries:
  • X of the Y (stone of the mason)
  • X with the Y (soldier with the gun)
  • For each query q_j(X,Y), record the number of hits
    in a search engine as r_{x:y,j}
  • Actually record log(hits + 1)
  • Actually sometimes replace X with stem(X)
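
The steps on this slide can be sketched roughly as follows. This is an illustration under stated assumptions: `hit_count` is a hypothetical callable standing in for a search engine, only the two query phrases named on the slide are included, each phrase is tried in both word orders (suggested by the "stone of the mason" example, but an assumption), and stemming is left out.

```python
import math

# Two joining phrases taken from the slide; the full list is not reproduced here.
TEMPLATES = ["{X} of the {Y}", "{X} with the {Y}"]

def instance_vector(x, y, hit_count, templates=TEMPLATES):
    """Build the vector for x:y: one log(hits + 1) entry per query."""
    vec = []
    for t in templates:
        # Assumption: each template is queried in both word orders.
        for a, b in ((x, y), (y, x)):
            query = '"' + t.format(X=a, Y=b) + '"'
            vec.append(math.log(hit_count(query) + 1))
    return vec

# Toy stand-in for a search engine (made-up counts, for illustration only).
fake_hits = {'"stone of the mason"': 9, '"soldier with the gun"': 99}
lookup = lambda q: fake_hits.get(q, 0)
print(instance_vector("mason", "stone", lookup))
```

Adding 1 before taking the log keeps queries with zero hits at a finite value (0.0).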

8
The queries used
Similar to Hearst 92 and followups
9
Some results
Ranking 369 possible x:y pairs as possible answers
10
How do you measure the similarity of relation
instances?
  • Create a feature vector r_{x:y} for each instance
    x:y
  • Use cosine distance to rank (a)...(d)
  • Test-taking strategy
  • Define margin = (bestScore - secondBest)
  • If margin < θ and θ > 0 then skip
  • If margin < |θ| and θ < 0 then guess the top 2.
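
A small sketch of the test-taking strategy, assuming each answer choice already has a cosine score. The threshold is called `theta` here, and the negative case is read as comparing the margin to the absolute value of `theta`; both the name and that reading are assumptions of this sketch, as is the function name `answer`.

```python
def answer(scores, theta):
    """Return the indices of the choices to mark.

    theta > 0: skip the question (return []) when the margin is below theta.
    theta < 0: guess the top two when the margin is below |theta|.
    Otherwise: return only the best-scoring choice.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, second = ranked[0], ranked[1]
    margin = scores[best] - scores[second]
    if theta > 0 and margin < theta:
        return []
    if theta < 0 and margin < abs(theta):
        return [best, second]
    return [best]

scores = [0.61, 0.60, 0.20, 0.10]   # made-up cosines for four choices
print(answer(scores, 0.05))    # close call, positive threshold
print(answer(scores, -0.05))   # close call, negative threshold
print(answer(scores, 0.0))     # no threshold
```

A positive threshold trades recall for precision by skipping close calls, while a negative one trades precision for recall by answering twice.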

11
Results
12
Results
13
Results
14
Followup work
Turney, CL 2006
  • Given x:y pairs, replace vectors with rows in M
  • Look up synonyms x', y' of x and y and construct
    near analogies x':y, x:y'. Drop any that don't
    occur frequently.
  • e.g. mason:stone → mason:rock
  • Search for phrase "x Q y" or "y Q x", using near
    analogies as well as original pair x:y, and any
    sequence of up to three words Q.
  • For each phrase create patterns by introducing
    wildcards.
  • Build a pair-pattern frequency matrix M.
  • Apply SVD to M and keep the best 300 dimensions.
  • Define sim1(x:y, u:v) = cosine distance in the
    reduced M.
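
The pattern and matrix steps above might look roughly like the sketch below. The assumptions: every phrase starts with x and ends with y, the end words are abstracted to `X` and `Y`, and any subset of the words in between may be replaced by the wildcard `*`. The function names are hypothetical, the phrases are made up, and the SVD step is not shown.

```python
from collections import Counter
from itertools import product

def wildcard_patterns(phrase):
    """All patterns from replacing any subset of the inner words with '*'."""
    words = phrase.split()
    inner = words[1:-1]  # the word sequence Q between x and y
    patterns = set()
    for mask in product([False, True], repeat=len(inner)):
        middle = ["*" if m else w for w, m in zip(inner, mask)]
        patterns.add(" ".join(["X"] + middle + ["Y"]))
    return patterns

def pair_pattern_counts(phrases_by_pair):
    """Map each pair to a Counter of pattern frequencies (one row of M)."""
    rows = {}
    for pair, phrases in phrases_by_pair.items():
        row = Counter()
        for phrase in phrases:
            row.update(wildcard_patterns(phrase))
        rows[pair] = row
    return rows

# Made-up phrases, standing in for corpus search results.
rows = pair_pattern_counts({
    ("mason", "stone"): ["mason cuts the stone", "mason carved stone"],
})
print(rows[("mason", "stone")])
```

A phrase with n inner words yields 2^n patterns under this scheme, which is one reason the pattern space grows so quickly.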

Compute similarity of x:y and u:v as average of
sim1(p1,p2) for all pairs p1,p2 where (a) p1 is
x:y or an alternate, (b) p2 is u:v or an
alternate, and (c) sim1(p1,p2) > sim1(x:y,u:v)
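
The averaging rule can be sketched as below. In this illustration `sim1` is any callable that scores two pairs, and `alternates` is a hypothetical mapping from a pair to its near analogies. One deliberate deviation: the filter uses `>=` rather than the strict `>` written above, so the original pair always survives and the average is never taken over an empty set; that choice is an assumption of this sketch.

```python
def lra_similarity(xy, uv, alternates, sim1):
    """Average sim1 over (p1, p2) scoring at least as well as the originals."""
    base = sim1(xy, uv)
    candidates_1 = [xy] + list(alternates.get(xy, []))
    candidates_2 = [uv] + list(alternates.get(uv, []))
    all_scores = [sim1(p1, p2) for p1 in candidates_1 for p2 in candidates_2]
    kept = [s for s in all_scores if s >= base]  # assumption: >= instead of >
    return sum(kept) / len(kept)

# Made-up sim1 values for one original pairing and its alternates.
toy = {
    ("mason:stone", "carpenter:wood"): 0.5,
    ("mason:rock", "carpenter:wood"): 0.7,
    ("mason:stone", "carpenter:timber"): 0.3,
    ("mason:rock", "carpenter:timber"): 0.9,
}
alts = {"mason:stone": ["mason:rock"], "carpenter:wood": ["carpenter:timber"]}
print(lra_similarity("mason:stone", "carpenter:wood", alts,
                     lambda p, q: toy[(p, q)]))
```

Pairings that score below the original pair are dropped, so alternates can only raise the final similarity.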
15
Results for LRA
56.5
40.3 VSM-WMTS
On 50B word WMTS corpus
16
Additional application: relation classification
17
Relation classification
18
Ablation experiments - 1
19
Ablation experiments - 2
What is the effect of using many
automatically-generated patterns vs only 64
manually-generated ones? (Most of the manual
patterns are found automatically.)
Feature selection in pattern space instead of SVD
20
Lessons and questions
  • How are relations and surface patterns
    correlated?
  • One-many? (several class-subclass patterns)
  • Many-one? (some patterns are ambiguous)
  • Many-many? (and is it 10-10, 100-100, 1000-1000?)
  • Is it surprising that information about relation
    similarity is spread out across
  • So much text?
  • So many surface patterns?