Minimal Loss Hashing for Compact Binary Codes

1
Minimal Loss Hashing for Compact Binary Codes
  • Mohammad Norouzi
  • David Fleet
  • University of Toronto

2
Near Neighbor Search
3
Near Neighbor Search
4
Near Neighbor Search
5
Similarity-Preserving Binary Hashing
  • Why binary codes?
    • Sub-linear search using hash indexing (even
      exhaustive linear search is fast)
    • Binary codes are storage-efficient

6
Similarity-Preserving Binary Hashing
Hash function
kth row of W
Random projections are used by locality-sensitive
hashing (LSH) and related techniques (Indyk &
Motwani 98; Charikar 02; Raginsky & Lazebnik 09)
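The formula on this slide did not survive in the transcript. A minimal sketch, assuming the hash function thresholds a linear projection at zero, so that bit k of the code is 1 when the kth row of W has a positive inner product with x. The names hash_codes and hamming, and the sizes q and d, are illustrative and not taken from the slides. Here W is drawn at random, as in LSH; a learned W would be used the same way.

```python
import numpy as np

def hash_codes(W, X):
    """Return one q-bit code per row of X: bit k is 1 when (row k of W) . x > 0."""
    return (X @ W.T > 0).astype(np.uint8)

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
q, d = 16, 8                       # code length and input dimension (illustrative)
W = rng.standard_normal((q, d))    # random projections, as in LSH
x = rng.standard_normal(d)
near = x + 0.01 * rng.standard_normal(d)   # small perturbation of x
far = -x                                   # opposite direction

codes = hash_codes(W, np.stack([x, near, far]))
print(hamming(codes[0], codes[1]), hamming(codes[0], codes[2]))
```

Because far is the negation of x, every projection changes sign and the two codes differ in all q bits, while the perturbed point typically keeps most bits unchanged.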
7
Learning Binary Hash Functions
  • Reasons to learn hash functions
    • to find more compact binary codes
    • to preserve general similarity measures
  • Previous work
    • boosting (Shakhnarovich et al. 03)
    • neural nets (Salakhutdinov & Hinton 07; Torralba
      et al. 07)
    • spectral methods (Weiss et al. 08)
    • loss-based methods (Kulis & Darrell 09)

8
Formulation
9
Loss Function
Similar items should map to nearby hash codes.
Dissimilar items should map to very different codes.
10
Hinge Loss
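The slide's formula is missing from the transcript. One plausible form of a hinge-like loss on Hamming distance is sketched here: similar pairs are penalised once their codes differ in rho or more bits, and dissimilar pairs are penalised when their codes differ in rho or fewer bits. The threshold rho and the weight lam are assumed symbols, not taken from the slides.

```python
def hinge_loss(m, similar, rho, lam=0.5):
    """Hinge-like loss on the Hamming distance m between two codes.

    similar=True : zero loss while m <= rho - 1, then grows linearly with m.
    similar=False: zero loss while m >= rho + 1, otherwise grows as m shrinks.
    lam (assumed) trades off the two kinds of error.
    """
    if similar:
        return max(m - rho + 1, 0)
    return lam * max(rho - m + 1, 0)

# A similar pair 5 bits apart and a dissimilar pair 0 bits apart, with rho = 3.
print(hinge_loss(5, True, rho=3), hinge_loss(0, False, rho=3))
```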
11
Empirical Loss
  • Good
    • incorporates quantization and Hamming distance
  • Not so good
    • discontinuous, non-convex objective function

12
We minimize an upper bound on empirical loss,
inspired by structural SVM formulations
(Taskar et al. 03; Tsochantaridis et al. 04;
Yu & Joachims 09)
13
Bound on loss
LHS RHS
14
Bound on loss
  • Remarks
    • piecewise linear in W
    • convex-concave in W
    • relates to structural SVM with latent variables
      (Yu & Joachims 09)
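The bound itself is not preserved in the transcript. The sketch below assumes the usual latent structural SVM construction: the loss of the actual codes is bounded by a loss-adjusted maximum over all code pairs (g, g') minus the score of the actual codes (h, h'). It checks the inequality numerically by brute force for a tiny code length. The loss form, rho, lam and the function names are assumptions for illustration.

```python
import itertools
import numpy as np

def loss(m, s, rho=3, lam=0.5):
    """Assumed hinge-like loss on Hamming distance m (s=1 similar, s=0 dissimilar)."""
    return max(m - rho + 1, 0) if s else lam * max(rho - m + 1, 0)

def bound_sides(W, x, xp, s):
    """Return (LHS, RHS): the actual loss and its assumed upper bound."""
    a, b = W @ x, W @ xp
    h, hp = (a > 0).astype(int), (b > 0).astype(int)   # actual codes of x and x'
    lhs = loss(int(np.sum(h != hp)), s)
    codes = [np.array(c) for c in itertools.product((0, 1), repeat=W.shape[0])]
    # Loss-adjusted maximum over all pairs of codes (exhaustive, tiny q only).
    adjusted = max(loss(int(np.sum(g != gp)), s) + g @ a + gp @ b
                   for g in codes for gp in codes)
    # h and hp maximise h.a and h'.b, so subtracting their scores gives the bound.
    rhs = adjusted - h @ a - hp @ b
    return lhs, rhs

rng = np.random.default_rng(0)
checks = []
for trial in range(20):
    W = rng.standard_normal((4, 6))            # q = 4 bits, d = 6 dimensions
    x, xp = rng.standard_normal(6), rng.standard_normal(6)
    checks.append(bound_sides(W, x, xp, s=trial % 2))
print(all(lhs <= rhs + 1e-9 for lhs, rhs in checks))
```

The inequality holds because the pair (h, h') is itself one of the candidates in the maximum. Both maxima are piecewise linear in W, which is where the convex-concave structure comes from.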

15
Bound on Empirical Loss
  • Loss-adjusted inference
    • Exact
    • Efficient
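One way loss-adjusted inference can be both exact and efficient is sketched below, assuming the objective is loss(Hamming distance of g and g') + g.a + g'.b with a = Wx and b = Wx'. Since the loss depends only on how many bits differ, each bit's best "agree" and "differ" scores can be computed separately, and for each distance m the m bits with the largest gain are chosen to differ. The loss form and all names are illustrative assumptions; brute_force is included only as a reference for small q.

```python
import itertools
import numpy as np

def loss(m, s, rho=3, lam=0.5):
    """Assumed hinge-like loss on Hamming distance m (s=1 similar, s=0 dissimilar)."""
    return max(m - rho + 1, 0) if s else lam * max(rho - m + 1, 0)

def loss_adjusted_inference(a, b, s):
    """Maximise loss(||g - g'||_H, s) + g.a + g'.b over binary code pairs."""
    q = len(a)
    agree = np.maximum(a + b, 0.0)    # best score of bit k when g_k == g'_k
    differ = np.maximum(a, b)         # best score of bit k when g_k != g'_k
    gain = differ - agree             # score change from making bit k differ
    order = np.argsort(-gain)         # most profitable bits to differ come first
    cum = np.concatenate(([0.0], np.cumsum(gain[order])))
    # For each Hamming distance m, the best pair lets the top-m bits differ.
    totals = [loss(m, s) + agree.sum() + cum[m] for m in range(q + 1)]
    m_best = int(np.argmax(totals))
    differs = np.zeros(q, dtype=bool)
    differs[order[:m_best]] = True
    both_on = (a + b) > 0
    g = np.where(differs, a > b, both_on).astype(int)
    gp = np.where(differs, a <= b, both_on).astype(int)
    return float(totals[m_best]), g, gp

def brute_force(a, b, s):
    """Reference value from enumerating every pair of codes (tiny q only)."""
    codes = [np.array(c) for c in itertools.product((0, 1), repeat=len(a))]
    return max(loss(int(np.sum(g != gp)), s) + g @ a + gp @ b
               for g in codes for gp in codes)

rng = np.random.default_rng(0)
a, b = rng.standard_normal(4), rng.standard_normal(4)
value, g, gp = loss_adjusted_inference(a, b, s=1)
print(abs(value - brute_force(a, b, 1)) < 1e-9)
```

The sort dominates the cost, so this sketch runs in O(q log q) per pair instead of enumerating all pairs of codes.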

16
Perceptron-like Learning
(McAllester et al., 2010)
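The update rule is not preserved in the transcript. A hedged sketch of one perceptron-like step, assuming it is a stochastic (sub)gradient step on the upper bound for a single pair: W moves toward the scores of the actual codes (h, h') and away from the loss-adjusted codes (g, g'). The step size eta, the loss form, and the brute-force inference (used here only to keep the example short) are assumptions; any regularisation or normalisation of W is omitted.

```python
import itertools
import numpy as np

def loss(m, s, rho=3, lam=0.5):
    """Assumed hinge-like loss on Hamming distance m (s=1 similar, s=0 dissimilar)."""
    return max(m - rho + 1, 0) if s else lam * max(rho - m + 1, 0)

def adjusted_codes(a, b, s):
    """Loss-adjusted inference by exhaustive search (fine for tiny q)."""
    codes = [np.array(c) for c in itertools.product((0, 1), repeat=len(a))]
    pairs = ((g, gp) for g in codes for gp in codes)
    return max(pairs, key=lambda p: loss(int(np.sum(p[0] != p[1])), s)
               + p[0] @ a + p[1] @ b)

def perceptron_step(W, x, xp, s, eta=0.01):
    """One assumed update of W for the pair (x, x') with similarity label s."""
    a, b = W @ x, W @ xp
    h, hp = (a > 0).astype(int), (b > 0).astype(int)   # current codes
    g, gp = adjusted_codes(a, b, s)                    # loss-adjusted codes
    # Raise the score of (h, h') and lower the score of (g, g').
    return W + eta * (np.outer(h - g, x) + np.outer(hp - gp, xp))

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 6))
x, xp = rng.standard_normal(6), rng.standard_normal(6)
W_new = perceptron_step(W, x, xp, s=1)
print(W_new.shape)
```

When the loss-adjusted codes coincide with the current codes, the two outer products vanish and W is left unchanged, which mirrors the classic perceptron making no update on correctly handled examples.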
17
Experiment: Euclidean ANN
Similarity based on Euclidean distance
  • Datasets
    • LabelMe (GIST)
    • MNIST (pixels)
    • PhotoTourism (SIFT)
    • Peekaboom (GIST)
    • Nursery (8D attributes)
    • 10D Uniform

18
Experiment: Euclidean ANN
  • 22K LabelMe
    • 512D GIST
    • 20K training
    • 2K testing
    • 1% of pairs are similar
  • Evaluation
    • Precision = hits / number of items retrieved
    • Recall = hits / number of similar items
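The two measures can be sketched as follows, assuming that "retrieved" means every database item whose code lies within a given Hamming radius of the query code. The radius-based retrieval rule and the names used here are assumptions; the slide only gives the two ratios.

```python
import numpy as np

def precision_recall(query_code, db_codes, is_similar, radius):
    """Precision and recall for one query, retrieving codes within a Hamming radius."""
    dist = np.count_nonzero(db_codes != query_code, axis=1)
    retrieved = dist <= radius
    hits = np.count_nonzero(retrieved & is_similar)
    n_retrieved = np.count_nonzero(retrieved)
    n_similar = np.count_nonzero(is_similar)
    precision = hits / n_retrieved if n_retrieved else 0.0
    recall = hits / n_similar if n_similar else 0.0
    return precision, recall

query = np.array([0, 0, 0, 0])
db = np.array([[0, 0, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 1],
               [1, 1, 1, 1]])
similar = np.array([True, True, True, False])   # ground-truth neighbours of the query

p1, r1 = precision_recall(query, db, similar, radius=1)
p4, r4 = precision_recall(query, db, similar, radius=4)
print(p1, r1, p4, r4)
```

Growing the radius retrieves more items, so recall rises while precision tends to fall; sweeping the radius traces out a precision-recall curve.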

19
Techniques of interest
  • MLH: minimal loss hashing (this work)
  • LSH: locality-sensitive hashing (Charikar 02)
  • SH: spectral hashing (Weiss, Torralba & Fergus
    09)
  • SIKH: shift-invariant kernel hashing (Raginsky &
    Lazebnik 09)
  • BRE: binary reconstructive embedding (Kulis &
    Darrell 09)

20
Euclidean LabelMe: 32 bits
21
Euclidean LabelMe: 32 bits
22
Euclidean LabelMe: 32 bits
23
Euclidean LabelMe: 64 bits
24
Euclidean LabelMe: 64 bits
25
Euclidean LabelMe: 128 bits
26
Euclidean LabelMe: 256 bits
27
Experiment: Semantic ANN
  • Semantic similarity measure based on
    annotations (object labels) from LabelMe
    database
  • 512D GIST, 20K training, 2K testing
  • Techniques of interest
    • MLH: minimal loss hashing
    • NN: nearest neighbor in GIST space
    • NNCA: multilayer network with RBM pre-training
      and nonlinear NCA fine-tuning (Torralba et al.
      09; Salakhutdinov & Hinton 07)

28
Semantic LabelMe
29
Semantic LabelMe
33
Summary
  • A formulation for learning binary hash
    functions based on
    • structured prediction with latent variables
    • hinge-like loss function for similarity search
  • Experiments show that with minimal loss hashing
    • binary codes can be made more compact
    • semantic similarity based on human labels can be
      preserved

34
  • Thank you!
  • Questions?