Nearest Neighbour Matching Using Multidimensional Binary Search Trees - PowerPoint PPT Presentation

Provided by: torch

1
Nearest Neighbour Matching Using Multidimensional
Binary Search Trees
  • Henry Stern
  • Supervisor: Pat Keast
  • Reader: Mike Shepherd

2
Nearest Neighbour Matching
  • A fundamental computational geometry problem.
  • Given a set of data points and target points in a
    metric space, find the nearest data point to each
    target point.
  • Cost increases with both the number of points and
    the dimensionality of the space.

4
Process of Nearest Neighbour Matching
  • A k-dimensional Euclidean ball with radius r,
    centered around the target point contains all
    neighbours where distance(target <-> point) < r.
  • During a nearest neighbour match, the radius of
    this ball is the distance from the target point
    to the best known match.

6
Simple Nearest Neighbour Matching Algorithm
  • Every point is examined in turn.
  • If the point being examined is closer to the
    target than all of the previously known matches,
    it is saved as the candidate for nearest
    neighbour.
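
The scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function name `nearest_neighbour_scan` and the sample points are assumptions made for the example. The running `best_dist` plays the role of the ball radius from the earlier slide.

```python
import math

def nearest_neighbour_scan(points, target):
    """Return (best_point, best_distance) by examining every point in turn."""
    best_point = None
    best_dist = math.inf  # radius of the ball around the target so far
    for p in points:
        d = math.dist(p, target)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            # Closer than every previously known match: new candidate.
            best_point, best_dist = p, d
    return best_point, best_dist

# Hypothetical sample data, chosen only for illustration.
data = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
match, radius = nearest_neighbour_scan(data, (9.0, 2.0))
print(match)  # (8.0, 1.0)
```

Each of the n points costs one k-dimensional distance evaluation, and nothing learned in one query is reused by the next.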

10
Simple but Expensive
  • Simple nearest neighbour matching algorithm runs
    in O(nk) time with space complexity O(nk).
  • Cheapest solution for a single query.
  • For multiple queries, it does not take advantage
    of any previous queries.

11
What Other Options Are There?
  • Quad Trees
  • Multidimensional Binary Search Trees

12
Quad Trees
  • Sacrifices space complexity for time complexity.
  • O(n) time complexity, expected linear.
  • O(n 2^k) space complexity.
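
The exponential factor in the space bound comes from the branching: a quad tree generalised to k dimensions gives every node one child per orthant, so 2^k children. The sketch below shows how a point could be routed to a child. The helper names `orthant_index` and `children_per_node` are hypothetical and not taken from the presentation.

```python
def orthant_index(point, centre):
    """Index (0 .. 2**k - 1) of the child region that contains `point`.

    Bit i is set when the point lies on the upper side of the node's
    centre along dimension i.
    """
    index = 0
    for i, (x, c) in enumerate(zip(point, centre)):
        if x >= c:
            index |= 1 << i
    return index

def children_per_node(k):
    """Number of child slots a node needs in k dimensions."""
    return 2 ** k

print(orthant_index((0.7, 0.2), (0.5, 0.5)))  # 1
print(children_per_node(2), children_per_node(10))  # 4 1024
```

With ten dimensions every node already reserves 1024 child slots, which is why the structure trades space for time.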

17
Multidimensional Binary Search Tree
  • kd-trees.
  • Sacrifices time complexity for space complexity.
  • O(n) time complexity, expected linear.
  • O(n) space complexity.
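
A minimal kd-tree sketch follows, assuming the simplest construction (split at the median, cycling through the dimensions). The names `Node`, `build` and `nearest` are illustrative, not the presenter's. One node is stored per data point, which matches the O(n) space bound above. During a query, the far side of a split is visited only when the ball around the target crosses the splitting plane.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    point: Point                    # data point stored at this node
    dim: int                        # dimension this node splits on
    left: Optional["Node"] = None   # points with smaller coordinate in `dim`
    right: Optional["Node"] = None  # points with larger coordinate in `dim`

def build(points, depth=0):
    """Build a kd-tree, splitting at the median of the next dimension."""
    if not points:
        return None
    dim = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    return Node(pts[mid], dim,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))

def nearest(node, target, best=None, best_dist=math.inf):
    """Return (best_point, best_distance) for `target`."""
    if node is None:
        return best, best_dist
    d = math.dist(node.point, target)
    if d < best_dist:
        best, best_dist = node.point, d
    diff = target[node.dim] - node.point[node.dim]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best, best_dist = nearest(near, target, best, best_dist)
    # Search the far side only if the ball crosses the splitting plane.
    if abs(diff) < best_dist:
        best, best_dist = nearest(far, target, best, best_dist)
    return best, best_dist

# Hypothetical sample data, chosen only for illustration.
data = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build(data)
print(nearest(tree, (9.0, 2.0))[0])  # (8.0, 1.0)
```

Re-sorting at every level keeps the sketch short. A production build would more likely use a linear-time median selection.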

21
Query Optimization for kd-trees
  • Region Compression
  • Splitting Strategies

22
Region Compression
  • Tight bounds for each node are calculated.
  • O(n) method while tree is being constructed.
  • Significantly reduces query costs.
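
One way to picture region compression, assuming the "tight bounds" are axis-aligned bounding boxes of the points in each subtree (the slides do not spell out the representation): a subtree can be skipped whenever its box lies entirely outside the current ball. The function names below are hypothetical.

```python
import math

def bounding_box(points):
    """Tight axis-aligned bounds (lo, hi) of a non-empty list of points."""
    k = len(points[0])
    lo = tuple(min(p[i] for p in points) for i in range(k))
    hi = tuple(max(p[i] for p in points) for i in range(k))
    return lo, hi

def distance_to_box(target, lo, hi):
    """Euclidean distance from target to the closest point of the box
    (0.0 when the target lies inside it)."""
    return math.sqrt(sum(max(l - t, 0.0, t - h) ** 2
                         for t, l, h in zip(target, lo, hi)))

def can_skip(target, lo, hi, best_dist):
    """True when no point inside the box can beat the best known match."""
    return distance_to_box(target, lo, hi) >= best_dist

# Hypothetical subtree contents, chosen only for illustration.
lo, hi = bounding_box([(2.0, 3.0), (5.0, 4.0), (4.0, 7.0)])
print(lo, hi)                                # (2.0, 3.0) (5.0, 7.0)
print(can_skip((9.0, 2.0), lo, hi, 1.5))     # True
```

This sketch recomputes the bounds from the raw points for clarity. Merging the children's boxes bottom-up during construction is one way to keep the extra work to a constant number of box merges per node.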

26
Splitting Strategies
  • The choice of splitting dimension can have a
    significant impact on query performance.
  • Success of splitting strategy is highly
    data-dependent.
  • Three strategies examined:
      • Median of the next dimension (Simple).
      • Median of the widest dimension.
      • Median of the most deviant dimension.
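
The three strategies differ only in how the splitting dimension is picked; the split itself is always at the median. In the sketch below, "widest" is read as the largest coordinate range and "most deviant" as the largest standard deviation. Both readings are assumptions, since the slides do not define the terms, and all function names are illustrative.

```python
import statistics

def next_dimension(points, depth):
    """Simple strategy: cycle through the dimensions by tree depth."""
    return depth % len(points[0])

def widest_dimension(points):
    """Dimension with the largest spread between minimum and maximum."""
    k = len(points[0])
    return max(range(k),
               key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))

def most_deviant_dimension(points):
    """Dimension with the largest (population) standard deviation."""
    k = len(points[0])
    return max(range(k),
               key=lambda i: statistics.pstdev([p[i] for p in points]))

def split_at_median(points, dim):
    """Return (left, median_point, right) after sorting along `dim`."""
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    return pts[:mid], pts[mid], pts[mid + 1:]

# Hypothetical data where the strategies disagree: dimension 0 has the
# wider range, dimension 1 has the larger standard deviation.
pts = [(0, 0), (5, 0), (5, 0), (5, 8), (5, 8), (10, 8)]
print(widest_dimension(pts))        # 0
print(most_deviant_dimension(pts))  # 1
```

On this data a single outlier pair stretches the range of dimension 0, while dimension 1 separates the points into two equal groups, which illustrates why the best choice depends on the data.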

27
Splitting Strategy vs. Query Cost
28
Query Cost
  • Proportional to dimensionality and number of
    points.
  • Critical point when k approaches log n.
  • After the critical point, query cost slowly
    approaches n.
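
One way to read the critical point, assuming base-2 logarithms (the slides do not state the base): a balanced binary tree over n points has roughly log2(n) levels, so once k exceeds that depth, no root-to-leaf path can split on every dimension. The helper names below are hypothetical.

```python
def tree_depth(n):
    """Levels in a balanced binary tree with one point per node (n >= 1).

    For n >= 1, n.bit_length() equals ceil(log2(n + 1)).
    """
    return n.bit_length()

def dimensions_unsplit_per_path(n, k):
    """Lower bound on the dimensions left unsplit along any single
    root-to-leaf path, given n points in k dimensions."""
    return max(0, k - tree_depth(n))

print(tree_depth(1_000_000))                       # 20
print(dimensions_unsplit_per_path(1_000_000, 8))   # 0
print(dimensions_unsplit_per_path(1_000_000, 32))  # 12
```

With a million points the tree is about 20 levels deep, so in 32 dimensions at least 12 coordinates are never constrained on any path, and pruning by splitting planes becomes correspondingly weaker.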

29
Conclusion
  • Nearest neighbour matching in sub-linear time is
    a difficult problem.
  • Time complexity vs. space complexity.
  • Difficult to minimize trade-off.
  • The kd-tree is ineffective for high-dimensional
    nearest neighbour matching.