Title: Content-Based Image Retrieval
1. Content-Based Image Retrieval
2. Content-based Image Retrieval
- Retrieval by text
- Label database images by text tags
- Image retrieval as text retrieval
- Find images for textual queries using standard text search engines
3. Example: Flickr.com
- Con: requires manual labeling
4. Image Labeling by Human Computing
- ESP game: http://www.gwap.com/gwap/gamesPreview/espgame
- Collects annotations for web images via a game
5. Content-based Image Retrieval
- Retrieval based on visual content
- Represent images by their visual content
- Each query is an image
- Search for images that have visual content similar to the query image
6. Content-based Image Retrieval
- Given a query image, try to find visually similar images from an image database
[Figure: a query image and its retrieved answer images]
7. Example: www.like.com
8. CBIR Challenges
- How to represent the visual content of images
- What are visual contents? Colors, shapes, textures, objects, or meta-data (e.g., tags) derived from images
- Which type of visual content should be used to represent an image?
- It is difficult to understand the information needs of a user from a query image
- How to retrieve images efficiently
- Should avoid a linear scan of the entire database
9. Image Representation
[Figure: representation techniques ordered by degree of difficulty, from similar color distribution (histogram matching), through texture analysis, up to image segmentation and pattern recognition (a life-time goal :-) )]
10. Vector-based Image Representation
- Represent an image by a vector with a fixed number of elements
- Color histogram: discretize the color space and count the pixels that fall into each discretized color bin (see the sketch after this list)
- Texture: Gabor filters → texture features
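The color-histogram bullet above can be made concrete with a small sketch. This is not part of the original slides; it assumes 8-bit RGB pixels stored as an interleaved array and a uniform quantization into 4 bins per channel (64 bins in total):

/* Minimal sketch: quantize each channel into BINS levels, count pixels per
 * (r,g,b) bin, and normalize so the histogram sums to 1. */
#define BINS 4   /* bins per channel -> BINS*BINS*BINS histogram entries */

void color_histogram(const unsigned char *rgb, int num_pixels, float *hist)
{
    int i;
    for (i = 0; i < BINS * BINS * BINS; i++)
        hist[i] = 0.0f;
    for (i = 0; i < num_pixels; i++) {
        int r = rgb[3 * i]     * BINS / 256;   /* discretize each channel */
        int g = rgb[3 * i + 1] * BINS / 256;
        int b = rgb[3 * i + 2] * BINS / 256;
        hist[(r * BINS + g) * BINS + b] += 1.0f;   /* count the pixel */
    }
    for (i = 0; i < BINS * BINS * BINS; i++)       /* normalize */
        hist[i] /= (float)num_pixels;
}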
11. Vector-based Image Representation
- Example: 3-bin RGB histograms for a query Vq and two database images V1, V2

        R     G     B
  V1   0.4   0.5   0.1
  Vq   0.3   0.5   0.2
  V2   0.5   0.1   0.4

  ||V1 - Vq|| < ||V2 - Vq||, so V1 is the better match for the query
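As a worked check of the comparison above (a sketch, not from the slides), the Euclidean distance between the 3-bin histograms gives d(V1, Vq) ≈ 0.14 and d(V2, Vq) ≈ 0.49, so V1 is retrieved as the closer match:

#include <math.h>
#include <stdio.h>

/* Euclidean (L2) distance between two histograms of length dim */
static float l2_distance(const float *a, const float *b, int dim)
{
    float sum = 0.0f;
    for (int i = 0; i < dim; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sqrtf(sum);
}

int main(void)
{
    float vq[3] = {0.3f, 0.5f, 0.2f};   /* query histogram from the slide */
    float v1[3] = {0.4f, 0.5f, 0.1f};
    float v2[3] = {0.5f, 0.1f, 0.4f};
    /* prints d(V1,Vq)=0.141  d(V2,Vq)=0.490: V1 is closer to the query */
    printf("d(V1,Vq)=%.3f  d(V2,Vq)=%.3f\n",
           l2_distance(v1, vq, 3), l2_distance(v2, vq, 3));
    return 0;
}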
12. Images with Similar Colors
13. Images with Similar Shapes
14. Images with Similar Content
15. Challenges in CBIR
- You get drunk,
- REALLY drunk
- Hit over the head
- Kidnapped to another city
- in a country on the other side of the world
- When you wake up,
- You try to figure out what city you are in, and what is going on
- That's what it's like to be a CBIR system!
16. Near-Duplicate Image Retrieval
- Given a query image, identify gallery images with high visual similarity
17. Appearance-based Image Matching
- Parts-based image representation
- Parts (appearance) + shape (spatial relations)
- Parts: local features found by an interest point operator
- Shape: graphical models or neighborhood relationships
18. Interest Point Detection
- Local features have been shown to be effective for representing images
- They are image patterns that differ from their immediate neighborhood
- They can be points, edges, or small patches
- We call such local features the key points or interest points of an image
19. Interest Point Detection
- An example image with key points detected by a corner detector
20. Interest Point Detection
- The detection of interest points needs to be robust to various geometric transformations
[Figure: an original image and its scaled, rotated, translated, and projected versions]
21. Interest Point Detection
- The detection of interest points also needs to be robust to imaging conditions, e.g. lighting and blurring
22. Descriptor
- Represents each detected key point
- Take measurements from a region centered on an interest point, e.g. texture, shape, ...
- Each descriptor is a vector of fixed length
- E.g., a SIFT descriptor is a 128-dimensional vector
23. Descriptor
- The descriptor should also be robust under different image transformations
[Figure: corresponding regions in two transformed images; they should have similar descriptors]
24. Image Representation
- Bag-of-features representation: an example (each descriptor is 5-dimensional here)
[Figure: the original image, its detected key points, and the descriptors of the key points:]

   22    0   19   23    1
   66  103   45    6   38
  232   44    0   11   48
   29   55  129    0    1
   11   78  110    1   32
  220   30   11   34   21
25. Retrieval
[Figure: the query image's descriptors next to those of a database image]
How do we measure similarity between two sets of descriptors?
26. Retrieval
[Figure: the same descriptor sets, with matching descriptors linked]
Count the number of matches!
27. Retrieval
If the distance between two descriptor vectors is smaller than a threshold, we count them as one match
28. Retrieval
[Figure: two image pairs; the first pair has 1 matched point, the second has 5 matched points, so the second pair is more visually similar (a counting sketch follows below)]
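A minimal sketch of this counting step (not from the slides), assuming each image is stored as a flat array of 128-dimensional float descriptors and the distance threshold is chosen by hand:

#define DESC_DIM 128

/* Squared Euclidean distance between two descriptors */
static float sq_dist(const float *a, const float *b)
{
    float sum = 0.0f;
    for (int i = 0; i < DESC_DIM; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Count how many query descriptors have at least one database-image
 * descriptor within the given distance threshold. */
int count_matches(const float *query, int nq,
                  const float *db, int ndb, float threshold)
{
    int matches = 0;
    float t2 = threshold * threshold;   /* compare squared distances */
    for (int i = 0; i < nq; i++) {
        for (int j = 0; j < ndb; j++) {
            if (sq_dist(query + i * DESC_DIM, db + j * DESC_DIM) < t2) {
                matches++;
                break;   /* count each query key point at most once */
            }
        }
    }
    return matches;
}

This brute-force double loop is exactly what makes the approach expensive, which motivates the next slide.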
29. Problems
- Computationally expensive
- Requires a linear scan of the entire database
- Example: match a query image against a database of 1 million images
- At 0.1 second per image-to-image match, it takes more than one day (about 28 hours) to answer a single query
30. Bag-of-words Model
- Compare to the bag-of-words representation in text retrieval
[Figure: an image as a collection of its key points, next to a document as a collection of its words. What is the difference?]
31. Bag-of-words
[Figure: the same image/document comparison as on the previous slide]
- The same word appears in many documents
- No two images share an identical key point, but similar key points appear in many images with similar visual content
- So: group similar key points from different images into visual words
32. Bag-of-words Model
- Group key points into visual words
- Represent images by histograms of visual words
[Figure: key points from several images quantized into visual words, and each image's histogram over the visual vocabulary]
33. Bag-of-words
- The grouping is usually done by clustering
- Cluster the key points of all images into a number of cluster centers (e.g., 100,000 clusters)
- Each cluster center is called a visual word
- The collection of all cluster centers is called the visual vocabulary
34. Retrieval by the Bag-of-words Model
- Generate the visual vocabulary
- Represent each key point by its nearest visual word
- Represent an image by a bag of visual words (see the sketch below)
- Text retrieval techniques can then be applied directly
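Once each key point has been mapped to its nearest visual word, building the bag-of-words representation is a simple counting step. A minimal sketch (not part of the slides), assuming the word IDs have already been computed:

/* word_ids[i] is the visual word (cluster index) assigned to the image's
 * i-th key point; hist must have room for vocab_size counts. */
void bag_of_words(const int *word_ids, int num_keypoints,
                  int vocab_size, int *hist)
{
    for (int w = 0; w < vocab_size; w++)
        hist[w] = 0;
    for (int i = 0; i < num_keypoints; i++)
        hist[word_ids[i]]++;   /* count occurrences of each visual word */
}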
35. Project
- Build a system for near-duplicate image retrieval
- A database of 10,000 images
- Construct a bag-of-words model for each database image (offline)
- Construct a bag-of-words model for a query image
- Retrieve the 10 visually most similar images from the database for the given query
36. Step 1: Dataset
- 10,000 color images under the folder ./img
- The key points of each image have already been extracted
- The key points of all images are saved in a single file: ./feature/esp.feature
- Each line corresponds to one key point with 128 attributes
- Attributes in each line are separated by tabs
37. Step 1: Dataset
- To locate the key points of individual images, two other files are needed
- ./imglist.txt: the order in which the images' key points were saved
- ./feature/esp.size: the number of key points each image has
38. Step 1: Dataset
- Example: three images imgA, imgB, imgC
- imgA has 2 key points, imgB has 3 key points, and imgC has 2 key points
- If imglist.txt lists them in that order, imgA's key points are rows 1-2 of esp.feature, imgB's are rows 3-5, and imgC's are rows 6-7 (see the sketch below)
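A sketch of how the per-image offsets can be computed (an illustration, not provided code). It assumes esp.size contains one integer per line, in the same order as imglist.txt:

#include <stdio.h>

#define MAX_IMAGES 10000

int main(void)
{
    int counts[MAX_IMAGES];     /* key points per image */
    int offsets[MAX_IMAGES];    /* row of each image's first key point */
    int n = 0, total = 0;

    FILE *fp = fopen("./feature/esp.size", "r");
    if (!fp) return 1;
    while (n < MAX_IMAGES && fscanf(fp, "%d", &counts[n]) == 1) {
        offsets[n] = total;     /* running sum of the previous counts */
        total += counts[n];
        n++;
    }
    fclose(fp);
    /* The key points of image i are rows offsets[i] .. offsets[i]+counts[i]-1
     * of esp.feature, with image order given by imglist.txt. */
    printf("%d images, %d key points in total\n", n, total);
    return 0;
}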
39. Step 2: Key Point Quantization
- Represent each image by a bag of visual words
- Construct the visual vocabulary
- Cluster all the key points into 10,000 clusters
- Each cluster center is a visual word
- Map each key point to a visual word
- Find the nearest cluster center for each key point (nearest neighbor search)
40. Step 2: Key Point Quantization
- Example: cluster 7 key points into 3 clusters
- The cluster centers are cnt1, cnt2, cnt3
- Each center is a visual word: w1, w2, w3
- Find the nearest center for each key point
41. Step 2: Key Point Quantization
- imgA.jpg
- 1st key point → w2
- 2nd key point → w1
- imgB.jpg
- 1st key point → w3
- 2nd key point → w3
- 3rd key point → w2
- imgC.jpg
- 1st key point → w3
- 2nd key point → w2
Bag-of-words representation:
  imgA.jpg: w2 w1
  imgB.jpg: w3 w3 w2
  imgC.jpg: w3 w2
42. Step 2: Key Point Quantization
- We provide the FLANN library for clustering and nearest neighbor search
- For clustering, use (see the usage sketch below):
  flann_compute_cluster_centers(
    float* dataset,        // your key points
    int rows,              // number of key points
    int cols,              // 128, dimension of a key point
    int clusters,          // number of clusters
    float* result,         // output: the cluster centers
    struct IndexParameters* index_params,
    struct FLANNParameters* flann_params)
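A usage sketch for this call (an illustration, not provided code). It assumes the key points have already been loaded into a row-major float array and that the IndexParameters / FLANNParameters structs have been initialized to defaults suitable for the FLANN version shipped with the project; per the FLANN documentation, the return value is the number of clusters actually computed (negative on error):

#include "flann.h"

#define DIM      128
#define CLUSTERS 10000

/* centers must have room for CLUSTERS * DIM floats */
int cluster_keypoints(float *keypoints, int num_keypoints, float *centers,
                      struct IndexParameters *index_params,
                      struct FLANNParameters *flann_params)
{
    int actual = flann_compute_cluster_centers(keypoints, num_keypoints, DIM,
                                               CLUSTERS, centers,
                                               index_params, flann_params);
    return actual;   /* number of cluster centers written to centers */
}

Remember to save the returned centers to a file, since Step 5 reuses them for query images.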
43. Step 2: Key Point Quantization
- For nearest neighbor search (see the usage sketch below)
- Build an index for the cluster centers:
  flann_build_index(
    float* dataset,        // your cluster centers
    int rows, int cols, float* speedup,
    struct IndexParameters* index_params,
    struct FLANNParameters* flann_params)
- For each key point, search for the nearest cluster center:
  flann_find_nearest_neighbors_index(
    FLANN_INDEX index_id,  // your index above
    float* testset,        // your key points
    int trows, int* result, int nn, int checks,
    struct FLANNParameters* flann_params)
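A usage sketch combining the two calls above (an illustration, not provided code; same assumptions about the parameter structs, and it assumes flann_build_index() returns a FLANN_INDEX handle as in the FLANN C API):

#include "flann.h"

#define DIM 128

/* Map every key point to its nearest cluster center (visual word). */
void quantize_keypoints(float *centers, int num_centers,
                        float *keypoints, int num_keypoints,
                        int *word_ids,   /* output: one word ID per key point */
                        struct IndexParameters *index_params,
                        struct FLANNParameters *flann_params)
{
    float speedup;
    /* Build the search index once over the cluster centers */
    FLANN_INDEX index = flann_build_index(centers, num_centers, DIM,
                                          &speedup, index_params, flann_params);

    /* nn = 1: keep only the single nearest center per key point;
     * checks trades accuracy for speed (128 is an arbitrary example). */
    flann_find_nearest_neighbors_index(index, keypoints, num_keypoints,
                                       word_ids, 1, 128, flann_params);
}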
44. Step 2: Key Point Quantization
- In this step, you need to save
- the cluster centers to a file (you will use them later for quantizing the key points of query images)
- the bag-of-words representation of each image, in TREC format (see the sketch below)

Bag-of-words representation:
  imgA.jpg: w2 w1
  imgB.jpg: w3 w3 w2
  imgC.jpg: w3 w2

TREC format:
  <DOC> <DOCNO>imgA</DOCNO> <TEXT> w2 w1 </TEXT> </DOC>
  <DOC> <DOCNO>imgB</DOCNO> <TEXT> w3 w3 w2 </TEXT> </DOC>
  <DOC> <DOCNO>imgC</DOCNO> <TEXT> w3 w2 </TEXT> </DOC>
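A sketch of writing one image's bag of visual words in the TREC format above (an illustration, not provided code; it writes each cluster ID as a token "wN", matching the slide's example):

#include <stdio.h>

void write_trec_doc(FILE *fp, const char *image_name,
                    const int *word_ids, int num_keypoints)
{
    fprintf(fp, "<DOC>\n<DOCNO>%s</DOCNO>\n<TEXT>\n", image_name);
    for (int i = 0; i < num_keypoints; i++)
        fprintf(fp, "w%d ", word_ids[i]);   /* one token per key point */
    fprintf(fp, "\n</TEXT>\n</DOC>\n");
}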
45. Step 3: Build the Index Using Lemur
- The same as what we did in the previous homework
- Use a KeyfileIncIndex index
- No stemming
- No stop words
46. Step 4: Extract Key Points for a Query
- Three sample query images under ./sample query/
- The query images are in .pgm format
- The extraction tool is under ./sift tool/
- For Windows, use siftW32.exe
- For Linux, use sift
- Example command:
  sift < input.pgm > output.keypoints
47. Step 5: Generate a Bag-of-Words Model for a Query
- Map each key point of the given query to a visual word
- Use the cluster center file generated in Step 2
- Build an index for the cluster centers using flann_build_index()
- For each key point, search for the nearest cluster center using flann_find_nearest_neighbors_index()
48. Step 5: Generate a Bag-of-Words Model for a Query
- Write the bag-of-words model for the query image in the Lemur query format (see the sketch below):
  <DOC 1>
  the mapped cluster ID for the 1st key point
  the mapped cluster ID for the 2nd key point
  ...
  the mapped cluster ID for the last key point
  </DOC>
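A sketch of writing the query file in the format above (an illustration, not provided code). Whatever token format was used when indexing the database in Step 2 must be reused here, so this sketch writes the same "wN" tokens as the TREC sketch in Step 2:

#include <stdio.h>

void write_query(FILE *fp, int query_id, const int *word_ids, int num_keypoints)
{
    fprintf(fp, "<DOC %d>\n", query_id);
    for (int i = 0; i < num_keypoints; i++)
        fprintf(fp, "w%d\n", word_ids[i]);   /* one visual word per line */
    fprintf(fp, "</DOC>\n");
}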
49. Step 6: Image Retrieval by Lemur
- Use the Lemur command RetEval as:
  RetEval <parameter_file>
- An example parameter file:
  <parameters>
  <index>/home/user1/myindex/myindex.key</index>
  <retModel>tfidf</retModel>
  <textQuery>/home/user1/query/q1.query</textQuery>
  <resultFile>/home/user1/result/ret.result</resultFile>
  <TRECResultFormat>1</TRECResultFormat>
  <resultCount>10</resultCount>
  </parameters>
50. Step 7: Graphical User Interface
- Build a GUI for the image retrieval system
- Browse the image database
- Select an image from the database to query the database and display the top 10 retrieved results
- Extract the bag-of-words representation of the query
- Write it to a file in the format specified in Step 5
- Run the RetEval command for retrieval
- Load an external query image, search the images in the database, and display the top 10 retrieved results
51. Step 8: Evaluation
- Demo your system in class during the last week
- We will provide a number of test query images
- Run your GUI, load each test query image, and display the ten most similar images from the database