Experiments on Using Semantic Distances Between Words in Image Caption Retrieval

About This Presentation

Slides: 10

Provided by: ADy5

Learn more at: http://www1.cs.columbia.edu

Transcript and Presenter's Notes

1
Experiments on Using Semantic Distances Between Words in Image Caption Retrieval
Alan F. Smeaton and Ian Quigley
School of Computer Applications, Dublin City University

Presenter: Cosmin Adrian Bejan

2
IR implementation - traditional approach

Represent
a user query as a bag of query terms
a document as a bag of index terms
Compute
a degree of similarity between a document and a query, based on the overlap, i.e. the number of query terms they have in common.
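
The overlap computation described on this slide can be sketched in a few lines. This is a minimal illustration, assuming whitespace tokenization and lowercasing; the function names are hypothetical and not taken from the paper.

```python
from collections import Counter

def bag_of_terms(text):
    """Build a bag (multiset) of terms with a deliberately naive tokenizer."""
    return Counter(text.lower().split())

def overlap_similarity(query, document):
    """Count the query terms (with multiplicity) that also occur in the document."""
    q, d = bag_of_terms(query), bag_of_terms(document)
    # Counter intersection keeps the minimum count of each shared term.
    return sum((q & d).values())

print(overlap_similarity("stomach pain", "patient with stomach pain"))
print(overlap_similarity("stomach pain", "belly ache"))
```

Because the score depends only on shared strings, two texts with no words in common score zero regardless of what they mean.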

3
Problems in IR implementation

Caused by:
the same words describing different things (bar, bank)
different words describing the same thing (stomach pain, belly ache)
natural language being fraught with ambiguities at all levels, leading to multiple interpretations of words, phrases, etc.
Common way to address these problems: query expansion.
The approach in this paper: when computing the degree of similarity between query and document, instead of basing similarity on the terms in common between the two, incorporate a quantitative measure of the semantic similarity between index terms into the measure.
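
One way to picture folding word-word semantic similarity into a query-document score is sketched below. It is an illustrative assumption, not the formula used in the paper: each query term is matched to its most similar document term and the results are averaged. The similarity table and the names toy_sim and semantic_similarity are hypothetical.

```python
# Hypothetical word-word similarities in [0, 1]; the values are made up.
TOY_SIM = {
    frozenset(("stomach", "belly")): 0.9,
    frozenset(("pain", "ache")): 0.8,
}

def toy_sim(a, b):
    """Return 1.0 for identical terms, a table value if known, else 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)

def semantic_similarity(query_terms, doc_terms, sim):
    """Average, over query terms, of the best similarity to any document term."""
    if not query_terms or not doc_terms:
        return 0.0
    best = [max(sim(q, d) for d in doc_terms) for q in query_terms]
    return sum(best) / len(query_terms)

score = semantic_similarity(["stomach", "pain"], ["belly", "ache"], toy_sim)
print(round(score, 2))
```

With this kind of measure, "stomach pain" and "belly ache" receive a non-zero score even though they share no strings.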

4
Measuring semantic distance between words

Knowledge base: hierarchical concept graphs (HCGs), automatically constructed from WordNet.
The similarity of two classes or synsets is defined in terms of
the information content of the class ci
P(ci), the class probability of class ci
Computing the similarity between two word senses (nouns) can only be done if both are in the same HCG; otherwise they are regarded as being dissimilar.

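
The sketch below shows one common information-content formulation of class similarity, in the style of Resnik: the similarity of two classes is the largest information content, -log P(c), among the classes that subsume both. Treating this as the slide's measure is an assumption, since the slide does not spell the formula out. The toy hierarchy, the probabilities and all names are hypothetical.

```python
import math

# Hypothetical toy hierarchies: child -> parent (None marks the root of an HCG).
PARENT = {
    "entity": None, "animal": "entity", "dog": "animal", "cat": "animal",
    "shape": None, "circle": "shape",
}
# Hypothetical class probabilities P(c); toy values, not estimated from a corpus.
P = {"entity": 1.0, "animal": 0.5, "dog": 0.2, "cat": 0.1,
     "shape": 1.0, "circle": 0.3}

def information_content(c):
    """Information content of class c, taken here as -log P(c)."""
    return -math.log(P[c])

def ancestors(c):
    """Return c together with every class that subsumes it, up to the root."""
    chain = []
    while c is not None:
        chain.append(c)
        c = PARENT[c]
    return chain

def class_similarity(c1, c2):
    """Largest information content among shared subsumers; 0.0 across HCGs."""
    shared = set(ancestors(c1)) & set(ancestors(c2))
    if not shared:  # the classes live in different HCGs
        return 0.0
    return max(information_content(c) for c in shared)

print(class_similarity("dog", "cat"))
print(class_similarity("dog", "circle"))
```

The second call illustrates the rule on the slide: classes from different HCGs have no common subsumer, so they are treated as dissimilar.
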
5
Experimental Set-up

Hand-caption 2714 images
Manually disambiguate polysemous words in the captions
Manually build a collection of 60 queries
Compute various query-caption similarity measures using word-word semantic distances.

6
Retrieval Strategies 1-2

Notation
query Q = {q1, q2, ..., qm}
caption C = {c1, c2, ..., cn}, where a qi or a cj is the original term, used only as a representation for its synset.
Sim(ti, tj) is the similarity between the sense-disambiguated forms of two terms ti and tj.
Run1: a straightforward, statistically based tf*IDF match between the word forms or strings, i.e. not using word-sense-disambiguated captions or queries.
Run2: a match where terms in the caption and in the query are both expanded to include other word strings from their sense-disambiguated synsets (query expansion).

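
The difference between a plain string match and a synset-expanded match can be sketched as follows. This is only an illustration under stated assumptions: the IDF variant log(1 + N/df) is chosen for the sketch, the SYNSETS table is a hypothetical stand-in for sense-disambiguated WordNet synsets, and the weighting actually used in the runs is not given here.

```python
import math
from collections import Counter

def tfidf_scores(query_terms, captions, expand=None):
    """Score each caption (a list of terms) against the query with a tf*IDF sum.

    expand is an optional function mapping a term to extra word strings;
    when given, it is applied to both the captions and the query.
    """
    def prep(terms):
        if expand is None:
            return list(terms)
        out = []
        for t in terms:
            out.append(t)
            out.extend(expand(t))
        return out

    docs = [Counter(prep(c)) for c in captions]
    query = set(prep(query_terms))
    n = len(docs)
    scores = []
    for doc in docs:
        score = 0.0
        for t in query:
            df = sum(1 for d in docs if t in d)  # document frequency of t
            if df and t in doc:
                score += doc[t] * math.log(1 + n / df)
        scores.append(score)
    return scores

# Hypothetical synset table: term -> other word strings in the same synset.
SYNSETS = {"belly": ["stomach", "abdomen"], "stomach": ["belly", "abdomen"]}

def expand_with_synset(term):
    return SYNSETS.get(term, [])

captions = [["belly", "ache"], ["river", "bank"]]
print(tfidf_scores(["stomach"], captions))
print(tfidf_scores(["stomach"], captions, expand=expand_with_synset))
```

Without expansion the query "stomach" matches neither caption; with expansion on both sides, the first caption receives a positive score.
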
7
Retrieval Strategies 3-5

Run3
Run4
Run5

when considering different threshold values for each HCG, given that there is a concentration of usage of concepts from some HCGs (like entity) and hardly any use of others (like shape).

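
A per-HCG threshold could look something like the sketch below. How the runs actually apply their thresholds is not stated on this slide, so the rule used here (discard a word-word similarity that falls below the threshold of its HCG) is an assumption, and the threshold values and names are hypothetical.

```python
# Hypothetical thresholds per HCG; the values are illustrative only.
HCG_THRESHOLDS = {"entity": 2.0, "shape": 0.5}
DEFAULT_THRESHOLD = 1.0  # assumed fallback for HCGs without their own value

def thresholded_similarity(raw_sim, hcg):
    """Keep a similarity only if it reaches the threshold set for its HCG."""
    threshold = HCG_THRESHOLDS.get(hcg, DEFAULT_THRESHOLD)
    return raw_sim if raw_sim >= threshold else 0.0

print(thresholded_similarity(1.2, "entity"))
print(thresholded_similarity(1.2, "shape"))
```

Keeping the thresholds in a table makes it easy to try a different cut-off for a heavily used HCG such as entity than for a rarely used one such as shape.
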
8
Retrieval Strategies 6-8