Transcript: The CLEF 2005 Cross-Language Image Retrieval Track

1
The CLEF 2005 Cross-Language Image Retrieval Track
  • Organised by
  • Paul Clough, Henning Müller, Thomas Deselaers,
    Michael Grubinger, Thomas Lehmann, Jeffery Jensen
    and William Hersh

2
Overview
  • Image Retrieval and CLEF
  • Motivations
  • Tasks in 2005
  • Ad-hoc retrieval of historic photographs and
    medical images
  • Automatic annotation of medical images
  • Interactive task
  • Summary and future work

3
Image Retrieval and CLEF
  • Cross-language image retrieval
  • Images often accompanied by text (used for
    retrieval)
  • Began in 2003 as pilot experiment
  • Aims of ImageCLEF
  • Investigate retrieval combining visual features
    and associated text
  • Promote the exchange of ideas
  • Provide resources for IR evaluation

4
Motivations
  • Image retrieval a good application for CLIR
  • Assume images are language-independent
  • Many images have associated text (e.g. captions,
    metadata, Web page links)
  • CLIR has potential benefits for image vendors and
    users
  • Image retrieval can be performed using
  • Low-level visual features (e.g. texture, colour
    and shape)
  • Abstracted features expressed using text
  • Combining both visual and textual approaches (see
    the fusion sketch below)
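
As an illustration of the combined approach, here is a minimal late-fusion sketch that merges normalised text and visual retrieval scores with a weighted sum. The weighting scheme and both score sources are assumptions for illustration, not a system used in the track.

```python
# Sketch: late fusion of text-based and visual (content-based) retrieval
# scores by a weighted sum of min-max normalised scores.

def normalise(scores):
    """scores: dict image_id -> raw score; returns scores scaled to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {img: (s - lo) / span for img, s in scores.items()}

def fuse(text_scores, visual_scores, alpha=0.7):
    """Weighted-sum fusion; alpha is the weight given to the text score."""
    text_n, visual_n = normalise(text_scores), normalise(visual_scores)
    images = set(text_n) | set(visual_n)
    fused = {img: alpha * text_n.get(img, 0.0)
                  + (1 - alpha) * visual_n.get(img, 0.0)
             for img in images}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```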

5
ImageCLEF 2005
  • 24 participants from 11 countries
  • Specific domains and tasks
  • Retrieval of historic photographs (St Andrews)
  • Retrieval and annotation of medical images
    (medImageCLEF and IRMA)
  • Additional co-ordinators
  • William Hersh and Jeffrey Jensen (OHSU)
  • Thomas Lehmann and Thomas Deselaers (Aachen)
  • Michael Grubinger (Melbourne)
  • Links with MUSCLE NoE including pre-CLEF workshop
  • http://muscle.prip.tuwien.ac.at/workshops.php

6
Ad-hoc retrieval from historic photographs
  • Paul Clough (University of Sheffield)
  • Michael Grubinger (Victoria University)

7
St Andrews image collection
8
Topics
  • 28 search tasks (topics)
  • Consist of title, narrative and example images
  • Topics more general and more visual than in 2004
  • e.g. waves breaking on beach, dog in sitting
    position
  • Topics translated by native speakers
  • 8 languages for title and narrative (e.g. German,
    Spanish, Chinese, Japanese)
  • 25 languages for title only (e.g. Russian, Bulgarian,
    Norwegian, Hebrew, Croatian)
  • 2004 topics and qrels used as training data

9
Relevance judgements
  • Staff from Sheffield University were assessors
  • Assessors judged topic pools
  • Top 50 images from all 349 runs
  • Average of 1,376 images per pool
  • 3 assessments per image (inc. topic creator)
  • Ternary relevance judgements
  • Qrels: images judged relevant or partially relevant
    by the topic creator and at least one other assessor
    (see the sketch below)
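
A minimal sketch of this qrels rule follows; the data layout (one judgement record per image and assessor, with a flag for the topic creator) is an assumption, not the actual assessment tool.

```python
# Sketch: build binary qrels from ternary assessments, keeping an image
# only if the topic creator AND at least one other assessor judged it
# relevant or partially relevant.

def build_qrels(assessments):
    """assessments: list of dicts with keys image, assessor,
    judgement, is_topic_creator."""
    positive = {"relevant", "partially_relevant"}
    by_image = {}
    for a in assessments:
        by_image.setdefault(a["image"], []).append(a)
    qrels = set()
    for image, judgements in by_image.items():
        creator_ok = any(j["is_topic_creator"] and j["judgement"] in positive
                         for j in judgements)
        others_ok = sum(1 for j in judgements
                        if not j["is_topic_creator"]
                        and j["judgement"] in positive)
        if creator_ok and others_ok >= 1:
            qrels.add(image)
    return qrels
```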

10
Submissions and Results (1)
  • 11 groups (5 new)
  • CEA
  • NII
  • Alicante
  • CUHK
  • DCU
  • Geneva
  • Indonesia
  • Miracle
  • NTU
  • Jaen
  • UNED

11
Submissions and Results (2)
12
Submissions and Results (3)
13
Summary
  • Most groups focused on text retrieval
  • Fewer combined (text and visual) runs than in 2004
  • But these still give the highest average MAP (see
    the MAP sketch below)
  • Translation was the main focus for many groups
  • 13 languages were used by at least 2 groups
  • More use of title and narrative than in 2004
  • Relevance feedback (query expansion) improves
    results
  • Topics still dominated by semantics
  • But this is typical of searches in this domain
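
Mean average precision (MAP) is the measure behind these comparisons. For reference, a minimal sketch of how MAP is computed over a set of topics; the data layout is an assumption, not the official evaluation script.

```python
# Sketch: mean average precision (MAP) over a set of topics,
# assuming binary relevance judgements (qrels) and ranked result lists.

def average_precision(ranked_images, relevant):
    """Average precision for one topic: mean of the precision values
    at the rank of each relevant image retrieved."""
    hits, precisions = 0, []
    for rank, image in enumerate(ranked_images, start=1):
        if image in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(run, qrels):
    """run, qrels: dicts keyed by topic id (run must cover all topics)."""
    return sum(average_precision(run[t], qrels[t]) for t in qrels) / len(qrels)
```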

14
Ad-hoc medical retrieval task
  • Henning Müller (University Hospitals Geneva)
  • William Hersh, Jeffrey Jensen (OHSU)

15
Collection
  • 50,000 medical images
  • 4 sub-collections with heterogeneous annotation
  • Radiographs, photographs, PowerPoint slides and
    illustrations
  • Mixed languages for annotations (French, German
    and English)
  • In 2004, only 9,000 images were available

16
Search topics
  • Topics based on 4 axes
  • Modality (e.g. x-ray, CT, MRI)
  • Anatomic region shown in image (e.g. head, arm)
  • Pathology (disease) shown in image
  • Abnormal visual observation (e.g. enlarged heart)
  • Different types of topic identified from a survey
  • Visual (11): visual approaches alone expected to
    perform well
  • Mixed (11): both text and visual approaches
    expected to perform well
  • Semantic (3): visual approaches not expected to
    perform well
  • Topics consist of annotations in 3 languages and
    1-3 query images

17
An example (topic 20 - mixed)
English: Show me microscopic pathologies of cases with
chronic myelogenous leukemia.
German: Zeige mir mikroskopische Pathologiebilder von
chronischer Leukämie.
French: Montre-moi des images de la leucémie chronique
myélogène.
18
Relevance assessments
  • Medical doctors made relevance judgements
  • Only one judge per topic due to time and budget
    constraints
  • Some additional judgements made to verify
    consistency
  • Relevant / partially relevant / non-relevant
  • For ranking, only relevant vs. non-relevant was used
  • Image pools created from submissions (see the
    pooling sketch below)
  • Top 40 images from each of the 134 runs
  • Average of 892 images per topic to assess
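
A minimal sketch of this pooling step follows; the run format (one ranked list of image ids per topic and run) is an assumption.

```python
# Sketch: build a judgement pool per topic as the union of the top-k
# images from every submitted run (k = 40 for the 2005 medical task).

def build_pool(runs, k=40):
    """runs: list of dicts mapping topic id -> ranked list of image ids."""
    pool = {}
    for run in runs:
        for topic, ranking in run.items():
            pool.setdefault(topic, set()).update(ranking[:k])
    return pool
```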

19
Submissions
  • 13 groups submitted runs (24 registered)
  • Resources found very interesting, but groups lacked
    the manpower to submit
  • 134 runs submitted
  • Several categories for submissions
  • Manual vs. Automatic
  • Data source used
  • Visual/textual/mixed
  • Either all languages or a single one could be used

20
Results (1)
  • Mainly automatic and mixed submissions
  • Some still to be classified as manual
  • Large variety of text/visual retrieval approaches
  • Ontology-based
  • Simple tf-idf weighting (see the sketch below)
  • Manual classification before visual retrieval
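
As a point of reference for the simplest of these text approaches, here is a minimal tf-idf weighting and scoring sketch; the tokenisation and dot-product scoring are assumptions, not a description of any participant's system.

```python
# Sketch: tf-idf weighting over image annotations, with a simple
# dot-product scoring of a query against each annotation.
import math
from collections import Counter

def tfidf_index(docs):
    """docs: dict image_id -> list of annotation tokens."""
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))                 # document frequency
    n = len(docs)
    idf = {t: math.log(n / df[t]) for t in df}
    index = {doc_id: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
             for doc_id, tokens in docs.items()}
    return index, idf

def score(query_tokens, index, idf):
    """Rank annotated images by dot product with the weighted query."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_tokens).items()}
    return sorted(((sum(w * weights.get(t, 0.0) for t, w in q.items()), doc_id)
                   for doc_id, weights in index.items()), reverse=True)
```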

21
Results (2): highest MAP
22
Average results per topic type
23
Summary
  • Text-only approaches perform better than
    image-only
  • But some visual systems have high early precision
  • This depends on the topics formulated
  • Visual systems perform very poorly on semantic
    queries
  • Best overall systems use combined approaches
  • GIFT, used as a baseline system by many
    participants, was still the best completely
    automatic visual system
  • Few manual runs

24
Automatic Annotation Task
  • Thomas Deselaers, Thomas Lehmann
  • (RWTH Aachen University)

25
Automatic annotation
  • Goal
  • Compare state-of-the-art classifiers for medical
    image annotation task
  • Purely visual task
  • Task
  • 9,000 training and 1,000 test medical images from
    Aachen University Hospital
  • 57 classes identifying modality, body
    orientation, body region and biological system
    (IRMA code)
  • e.g. 01: plain radiography, coronal, cranium,
    musculoskeletal system
  • Classes given in English and German, and unevenly
    distributed

26
Example of IRMA code
  • Example: 1121-127-720-500
  • radiography, plain, analog, overview
  • coronal, AP, supine
  • abdomen, middle
  • uropoetic system
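
A minimal sketch of splitting such an IRMA code into its four axes follows; the axis names are taken from the slides above, and the hierarchical sub-structure within each axis is not decoded here.

```python
# Sketch: split a hierarchical IRMA code such as "1121-127-720-500"
# into its four axes (modality, body orientation, body region,
# biological system).

IRMA_AXES = ("modality", "orientation", "body_region", "biological_system")

def parse_irma_code(code):
    parts = code.split("-")
    if len(parts) != len(IRMA_AXES):
        raise ValueError(f"expected 4 axes, got {len(parts)}: {code!r}")
    return dict(zip(IRMA_AXES, parts))

# Example from the slide:
# parse_irma_code("1121-127-720-500")
# -> {"modality": "1121", "orientation": "127",
#     "body_region": "720", "biological_system": "500"}
```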

27
Example Images
http://irma-project.org
28
Participants
  • Groups
  • 26 registered
  • 12 submitted runs
  • Runs
  • In total 41 submitted
  • CEA (France)
  • CINDI (Montreal, CA)
  • medGift (Geneva, CH)
  • Infocomm (Singapore, SG)
  • Miracle (Madrid, ES)
  • Umontreal (Montreal, CA)
  • Mt. Holyoke College (Mt. Hol., US)
  • NCTU-DBLAB (TW)
  • NTU (TW)
  • RWTH Aachen CS (Aachen, DE)
  • IRMA Group (Aachen, DE)
  • U Liège (Liège, BE)

29
Results
  • Baseline error rate: 36.8%
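
For reference, the error rate here is the percentage of the 1,000 test images assigned a wrong class; a minimal sketch follows (the data layout is an assumption, not the official evaluation script).

```python
# Sketch: classification error rate over the test set, given predicted
# and true class labels per image.

def error_rate(predictions, ground_truth):
    """predictions, ground_truth: dicts image_id -> class label."""
    wrong = sum(1 for img, true_label in ground_truth.items()
                if predictions.get(img) != true_label)
    return 100.0 * wrong / len(ground_truth)   # percent, e.g. 36.8
```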

30
Conclusions
  • Continued global participation from variety of
    research communities
  • Improvements in ad-hoc medical task
  • Realistic topics
  • Larger medical image collection
  • Introduction of medical annotation task
  • Overall, combining text and visual approaches
    works well for the ad-hoc task

31
ImageCLEF 2006 and beyond
32
ImageCLEF 2006
  • New ad-hoc
  • IAPR collection of 25,000 personal photographs
  • Annotations in English, German and Spanish
  • Medical ad-hoc
  • Same data, new topics
  • Medical annotation
  • Larger collection and more fine-grained
    classification
  • New interactive task
  • Using Flickr.com (more in the iCLEF talk)

33
and beyond
  • Image annotation task
  • Annotate general images with simple concepts
  • Using the LTU collection of 80,000 Web images
    (350 categories)
  • MUSCLE collaboration
  • Create visual queries for ad-hoc task (IAPR)
  • Funding a workshop in 2006
  • All tasks involve cross-language in some way