Title: Experiments on Using Semantic Distances Between Words in Image Caption Retrieval
1. Experiments on Using Semantic Distances Between Words in Image Caption Retrieval
- Alan F. Smeaton and Ian Quigley, School of Computer Applications, Dublin City University
- Presenter: Cosmin Adrian Bejan
2. IR implementation - traditional approach
- Represent
  - a user query as a bag of query terms
  - a document as a bag of index terms
- Compute
  - a degree of similarity between a document and a query, based on the overlap, i.e. the number of query terms they have in common.
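The overlap-based matching described on this slide can be sketched as follows. This is a minimal illustration only: the function name and the example terms are hypothetical, and practical systems usually weight the terms rather than simply counting them.

```python
def overlap_similarity(query_terms, index_terms):
    """Count the distinct query terms that also occur among the document's index terms."""
    return len(set(query_terms) & set(index_terms))


# Hypothetical query and caption, each treated as a bag of terms.
query = ["dog", "beach"]
caption = ["dog", "running", "on", "beach"]
score = overlap_similarity(query, caption)
```

Under this scheme a caption that mentions "puppy" instead of "dog" contributes nothing to the score, which is the weakness the following slides address.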
3. Problems in IR implementation
- Caused by
  - same words describing different things (bar, bank)
  - different words describing the same thing (stomach pain, belly ache)
  - natural language is fraught with ambiguities at all levels, leading to multiple interpretations of words, phrases, etc.
- Common way to address these problems: query expansion
- The approach in this paper: when computing the degree of similarity between query and document, instead of basing similarity on the terms in common between the two, incorporate a quantitative measure of the semantic similarity between index terms into the measure.
4. Measuring semantic distance between words
- Knowledge base: hierarchical concept graphs (HCGs) automatically constructed from WordNet
- The similarity of two classes or synsets is based on
  - the information content of the class ci
  - P(ci), the class probability of class ci
- Computing the similarity between two word senses (nouns) can only be done if both are in the same HCG; otherwise they are regarded as being dissimilar.
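The formulas on this slide did not survive extraction, so the sketch below is an assumption rather than the authors' exact definition. It uses the common information-content formulation, IC(c) = -log P(c), and takes the similarity of two classes to be the highest information content among the classes that subsume both. The toy hierarchy, the counts, and the choice to normalise probabilities within each HCG are all hypothetical.

```python
import math

# Hypothetical toy hierarchy: child -> parent (None marks the root of an HCG).
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "shape": None,
    "circle": "shape",
}

# Hypothetical cumulative counts: each class count includes its descendants.
COUNT = {
    "entity": 100, "animal": 40, "dog": 25, "cat": 15, "vehicle": 60,
    "shape": 10, "circle": 10,
}


def ancestors(c):
    """Return c followed by all of its ancestors up to the root of its HCG."""
    chain = []
    while c is not None:
        chain.append(c)
        c = PARENT[c]
    return chain


def class_probability(c):
    """P(c), estimated relative to the root of c's own HCG (an assumption)."""
    root = ancestors(c)[-1]
    return COUNT[c] / COUNT[root]


def information_content(c):
    """IC(c) = -log P(c): rarer, more specific classes carry more information."""
    return -math.log(class_probability(c))


def similarity(c1, c2):
    """Highest information content among common subsumers; 0.0 across different HCGs."""
    common = set(ancestors(c1)) & set(ancestors(c2))
    if not common:
        return 0.0
    return max(information_content(c) for c in common)
```

With these counts, "dog" and "cat" meet at "animal" and get a positive score, "dog" and "vehicle" meet only at the root and score zero, and "dog" and "circle" share no HCG and are treated as dissimilar.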
5. Experimental Set-up
- Hand-captioned 2714 images
- Manually disambiguated polysemous words in the captions
- Manually built a collection of 60 queries
- Computed various query-caption similarity measures using word-word semantic distances.
6. Retrieval Strategies 1-2
- Notation
  - query Q = {q1, q2, ..., qm}
  - caption C = {c1, c2, ..., cn}, where a qi or a cj is the original term, used only as a representation for its synset.
  - Sim(ti, tj) is the similarity between the sense-disambiguated forms of two terms ti and tj.
- Run1: straightforward statistically-based tf*IDF match between the word forms or strings, i.e. not using word-sense-disambiguated captions or queries.
- Run2: a match where terms in the caption and in the query are both expanded to include other word strings from their sense-disambiguated synsets (query expansion).
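The two baseline runs can be sketched roughly as below. The slides do not give the exact weighting formula, so this assumes a common tf*IDF variant with idf = log(N / df). The captions, the `SYNSETS` table, and all function names are hypothetical stand-ins; in the experiments the synsets come from WordNet.

```python
import math
from collections import Counter

# Hypothetical caption collection: image id -> caption terms.
CAPTIONS = {
    "img1": ["dog", "running", "beach"],
    "img2": ["puppy", "sleeping", "sofa"],
    "img3": ["boat", "beach", "sunset"],
}

# Hypothetical synset membership: term -> word strings in its synset.
SYNSETS = {"dog": ["dog", "puppy"], "puppy": ["dog", "puppy"]}


def expand(terms):
    """Replace each term by all word strings of its synset, without duplicates."""
    expanded = []
    for term in terms:
        for word in SYNSETS.get(term, [term]):
            if word not in expanded:
                expanded.append(word)
    return expanded


def idf(term, captions):
    """Inverse document frequency, log(N / df); 0.0 for unseen terms."""
    df = sum(1 for words in captions.values() if term in words)
    return math.log(len(captions) / df) if df else 0.0


def tfidf_score(query, caption, captions):
    """Sum tf * idf over the distinct query terms."""
    tf = Counter(caption)
    return sum(tf[t] * idf(t, captions) for t in set(query))


def rank(query, captions, use_expansion=False):
    """Run1-style ranking by default; Run2-style when both sides are expanded."""
    if use_expansion:
        query = expand(query)
        captions = {k: expand(v) for k, v in captions.items()}
    scores = {k: tfidf_score(query, v, captions) for k, v in captions.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


run1 = rank(["dog"], CAPTIONS)
run2 = rank(["dog"], CAPTIONS, use_expansion=True)
```

In this toy collection the query "dog" misses the "puppy" caption in the plain string match, while the expanded match retrieves it.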
7. Retrieval Strategies 3-5
- Different threshold values are considered for each HCG, given that there is a concentration of usage of concepts from some HCGs (like entity) and hardly any use of others (like shape).
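The scoring formulas for these runs are not recoverable from the slide text, so the following is only one plausible reading of per-HCG thresholding, not the authors' definition. Each query term is matched to its most similar caption term, and that similarity counts only if it reaches the threshold assigned to the term's HCG. The `SIM` table, the HCG assignments, and the threshold values are all hypothetical.

```python
# Hypothetical word-word similarities between sense-disambiguated terms.
SIM = {("dog", "puppy"): 0.9, ("dog", "cat"): 0.5, ("circle", "square"): 0.4}

# Hypothetical mapping from each term to the HCG that contains it.
HCG_OF = {
    "dog": "entity", "puppy": "entity", "cat": "entity",
    "circle": "shape", "square": "shape",
}

# Hypothetical thresholds: stricter for the heavily used HCG, looser for the sparse one.
THRESHOLD = {"entity": 0.6, "shape": 0.3}


def sim(t1, t2):
    """Sim(ti, tj): 1.0 for identical terms, 0.0 across different HCGs."""
    if t1 == t2:
        return 1.0
    if HCG_OF.get(t1) != HCG_OF.get(t2):
        return 0.0
    return SIM.get((t1, t2), SIM.get((t2, t1), 0.0))


def thresholded_score(query, caption):
    """Sum each query term's best match, ignoring matches below its HCG threshold."""
    total = 0.0
    for q in query:
        best = max((sim(q, c) for c in caption), default=0.0)
        if best >= THRESHOLD.get(HCG_OF.get(q), 0.0):
            total += best
    return total
```

Here a "dog"/"cat" match of 0.5 is discarded because the entity HCG has a high threshold, while a weaker "circle"/"square" match of 0.4 still counts under the lower shape threshold.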
8. Retrieval Strategies 6-8
9. Experimental Results