1
Bkd-tree: A Dynamic Scalable kd-tree
  • Octavian Procopiuc, Pankaj K. Agarwal, Lars Arge, and Jeffrey Scott Vitter

Presenter: Lei XIA
2
Agenda
  • Efficient index
  • kd-tree
  • kdB-tree
  • Bkd-tree
  • Experiment Platform
  • Experimental Results
  • Conclusion

3
Efficient index
  • High space utilization
  • Fast queries
  • Fast updates

5
kd-tree
  • Recursive partitioning of the point set into two
    equal-sized subsets using a vertical or horizontal line
  • Horizontal line on even levels, vertical line on
    odd levels
  • One point in each leaf
  • Query I/Os
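
The construction described on this slide can be sketched in memory as follows. This is a minimal illustration, not the presenter's code: the names `KdNode`, `build_kdtree`, and `window_query` are hypothetical, and the sketch ignores disk blocks entirely.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class KdNode:
    point: Optional[Point] = None   # set only in leaves (one point per leaf)
    axis: int = 0                   # 0 = vertical line (split on x), 1 = horizontal line (split on y)
    split: float = 0.0              # coordinate of the splitting line
    left: Optional["KdNode"] = None
    right: Optional["KdNode"] = None


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[KdNode]:
    """Recursively split the points into two halves of (almost) equal size."""
    if not points:
        return None
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = 1 if depth % 2 == 0 else 0        # horizontal line on even levels
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return KdNode(
        axis=axis,
        split=pts[mid - 1][axis],            # left half has coordinates <= split
        left=build_kdtree(pts[:mid], depth + 1),
        right=build_kdtree(pts[mid:], depth + 1),
    )


def window_query(node: Optional[KdNode], lo: Point, hi: Point) -> List[Point]:
    """Report all points p with lo <= p <= hi in both coordinates."""
    if node is None:
        return []
    if node.point is not None:
        p = node.point
        inside = lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]
        return [p] if inside else []
    found: List[Point] = []
    if lo[node.axis] <= node.split:          # window reaches into the left half
        found += window_query(node.left, lo, hi)
    if hi[node.axis] >= node.split:          # window reaches into the right half
        found += window_query(node.right, lo, hi)
    return found


points = [(1, 1), (2, 5), (4, 3), (6, 7), (8, 2), (9, 9)]
tree = build_kdtree(points)
hits = sorted(window_query(tree, (2, 2), (8, 7)))
```

Sorting at every level keeps the sketch short; a real bulk loader would sort once and reuse the sorted order.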

6
kd-tree
  • Insertion
  • Search down the tree to the leaf where the new point belongs
  • Split that leaf
  • Problem: after insertions, the resulting tree is no longer a
    kd-tree
  • Reason: the lines in the internal nodes no longer
    partition the points into equal-sized sets
  • Result: query performance deteriorates
  • Conclusion: the kd-tree is a static data structure
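
The imbalance problem can be seen with a small experiment. The sketch below is an assumption-laden illustration: `Node`, `insert`, and `height` are hypothetical names, and the points are assumed to be distinct. It inserts points lying on a diagonal in sorted order using the naive search-and-split procedure.

```python
import math


class Node:
    def __init__(self, point=None):
        self.point = point   # set only in leaves
        self.axis = None     # 0 = split on x, 1 = split on y
        self.split = None
        self.left = None
        self.right = None


def insert(root, p):
    """Naive insertion: search down to a leaf, then split that leaf in two."""
    if root is None:
        return Node(p)
    node, depth = root, 0
    while node.point is None:                     # walk down to a leaf
        node = node.left if p[node.axis] <= node.split else node.right
        depth += 1
    q = node.point
    axis = 1 if depth % 2 == 0 else 0             # horizontal line on even levels
    if p[axis] == q[axis]:
        axis = 1 - axis                           # assumes p and q are distinct points
    lo, hi = (p, q) if p[axis] < q[axis] else (q, p)
    node.point, node.axis, node.split = None, axis, lo[axis]
    node.left, node.right = Node(lo), Node(hi)
    return root


def height(node):
    if node is None or node.point is not None:
        return 0
    return 1 + max(height(node.left), height(node.right))


n = 64
root = None
for i in range(n):                                # sorted points on a diagonal
    root = insert(root, (i, i))

skewed_height = height(root)
balanced_height = math.ceil(math.log2(n))         # height of a bulk-loaded kd-tree
```

Every new point falls to the right of all earlier splitting lines, so the tree degenerates into a path whose height grows linearly with the number of points, whereas a tree built from scratch would have logarithmic height.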

8
kdB-tree
  • kdB-tree
  • Points stored in leaves
  • Leaves and nodes stored in disk blocks
  • Query

9
kdB-tree
  • Problem: on insertion, splitting an internal node may
    force splits in several of its subtrees
  • Result: updates are inefficient and space
    utilization is poor

11
Bkd-tree
  • Bulk Loading kd-trees

12
Bkd-tree
  • Classical (binary) bulk loading: $O((N/B)\log_2(N/B))$ I/Os
    (including the cost of sorting, $O((N/B)\log_{M/B}(N/B))$
    I/Os, which does not change the overall bound)

13
Bkd-tree
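  • Improved (grid) bulk loading: $O((N/B)\log_{M/B}(N/B))$ I/Os
  • Sort the points
  • Build the top $\log_2 t$ levels of the kd-tree
  • Lay out a $t \times t$ grid in which each strip contains $N/t$ points
  • Count the points in each grid cell and store the counts in a $t \times t$ matrix $A$
  • Create the upper subtree using $A$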
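
The grid-counting step can be sketched in memory as follows. This is only an illustration under stated assumptions: `grid_counts` is a hypothetical helper, the points have distinct coordinates, and there are at least `t` of them. The external-memory version would obtain the same counts by scanning the input on disk.

```python
from bisect import bisect_left


def grid_counts(points, t):
    """Return a t x t matrix A where A[r][c] counts the points in grid cell (r, c).

    Grid lines are chosen so that every horizontal and every vertical
    strip holds about len(points) / t points.
    """
    n = len(points)
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    # t - 1 interior grid lines per axis; strip k ends at the (k * n / t)-th coordinate
    x_lines = [xs[(k * n) // t - 1] for k in range(1, t)]
    y_lines = [ys[(k * n) // t - 1] for k in range(1, t)]

    A = [[0] * t for _ in range(t)]
    for x, y in points:
        row = bisect_left(y_lines, y)   # index of the horizontal strip containing y
        col = bisect_left(x_lines, x)   # index of the vertical strip containing x
        A[row][col] += 1
    return A


points = [(i, (7 * i) % 16) for i in range(16)]
A = grid_counts(points, 4)
```

Because the matrix holds only counts, the top levels of the tree can be chosen from it without touching the points themselves again.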

14
Bkd-tree (cont.)
  • Improved (grid) bulk loading: $O((N/B)\log_{M/B}(N/B))$ I/Os
  • Build the top $\log_2 t$ levels of the kd-tree (continued)
  • Scan the input points and distribute them into the
    rectangles of the upper subtree
  • Build the bottom levels either in main memory or
    by recursing on the grid step
  • Total: $O((N/B)\log_{M/B}(N/B))$ I/Os, better than the
    $O((N/B)\log_2(N/B))$ I/Os of the classical method
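
To get a feel for the gap between the two bounds, the expressions can be evaluated for sample parameters. The sketch assumes the usual I/O-model meaning of the symbols (N points, M points fit in memory, B points per block) and ignores constant factors, so only the ratio is meaningful; the function names and parameter values are hypothetical.

```python
import math


def binary_bulk_load_ios(N, B):
    """(N/B) * log_2(N/B), constants ignored."""
    blocks = N / B
    return blocks * math.log2(blocks)


def grid_bulk_load_ios(N, M, B):
    """(N/B) * log_{M/B}(N/B), constants ignored."""
    blocks = N / B
    return blocks * math.log(blocks, M / B)


N, M, B = 2**30, 2**24, 2**10          # illustrative values only
ratio = binary_bulk_load_ios(N, B) / grid_bulk_load_ios(N, M, B)
print(f"binary / grid = {ratio:.1f}")
```

The ratio equals $\log_2(M/B)$, so the advantage of the grid method grows with the amount of memory available relative to the block size.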

15
Dynamic Updates
  • A Bkd-tree consists of $\log_2(N/M)$ kd-trees; the
    $i$th kd-tree is either empty or contains exactly
    $2^i M$ points
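
The size invariant above behaves like a binary counter. The sketch below shows one way insertions could maintain it, assuming an in-memory buffer of up to M points that is flushed when full; the class name `BkdSketch` is hypothetical, and plain Python lists stand in for bulk-loaded kd-trees.

```python
class BkdSketch:
    """Keeps trees[i] either None or a collection of exactly 2**i * M points."""

    def __init__(self, M):
        self.M = M
        self.buffer = []   # assumed in-memory buffer holding fewer than M points
        self.trees = []    # trees[i] is None (empty) or a list of 2**i * M points

    def insert(self, p):
        self.buffer.append(p)
        if len(self.buffer) < self.M:
            return
        # Buffer is full: merge it with trees 0..k-1, where k is the first empty slot.
        merged, self.buffer = self.buffer, []
        k = 0
        while k < len(self.trees) and self.trees[k] is not None:
            merged = merged + self.trees[k]
            self.trees[k] = None
            k += 1
        if k == len(self.trees):
            self.trees.append(None)
        # M + (2**k - 1) * M = 2**k * M points; a real Bkd-tree would bulk load a kd-tree here.
        self.trees[k] = merged

    def sizes(self):
        return [None if t is None else len(t) for t in self.trees]


bkd = BkdSketch(M=4)
for i in range(46):
    bkd.insert((i, i))
```

After 46 insertions with M = 4 there have been 11 flushes (binary 1011), so trees 0, 1, and 3 are occupied and tree 2 is empty.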

16
Dynamic Updates
  • Deletion: $O(\log_B(N/B)\log_2(N/M))$ I/Os
  • Insertion: $O\left(\frac{1}{B}\log_{M/B}(N/B)\log_2(N/M)\right)$ I/Os
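
A rough derivation of these two bounds, assuming (the slide does not say so) that the insertion bound is amortized and that each of the $\log_2(N/M)$ kd-trees is rebuilt with the grid bulk-loading method:

```latex
% Deletion: look for the point in each of the log_2(N/M) kd-trees,
% at O(log_B(N/B)) I/Os per tree.
\[
  \log_2\frac{N}{M} \cdot O\!\left(\log_B\frac{N}{B}\right)
  = O\!\left(\log_B\frac{N}{B}\,\log_2\frac{N}{M}\right).
\]

% Insertion: bulk loading the i-th kd-tree (2^i M points) costs
\[
  O\!\left(\frac{2^i M}{B}\,\log_{M/B}\frac{2^i M}{B}\right)
  \subseteq
  O\!\left(\frac{2^i M}{B}\,\log_{M/B}\frac{N}{B}\right) \text{ I/Os},
\]
% that is, O((1/B) log_{M/B}(N/B)) I/Os per point in that tree.
% A point only ever moves to a higher-indexed tree, so it takes part
% in at most log_2(N/M) rebuilds:
\[
  \log_2\frac{N}{M} \cdot O\!\left(\frac{1}{B}\,\log_{M/B}\frac{N}{B}\right)
  = O\!\left(\frac{1}{B}\,\log_{M/B}\frac{N}{B}\,\log_2\frac{N}{M}\right).
\]
```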

17
Query
  • To answer a window query, we simply query all
    $\log_2(N/M)$ kd-trees
  • I/O
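
A window query over the whole structure can be sketched as a loop over the non-empty kd-trees. In this illustration the names are hypothetical, lists stand in for kd-trees, and an in-memory buffer of recent insertions is assumed to be scanned as well; `scan_tree` is a placeholder for a real kd-tree window query.

```python
def in_window(p, lo, hi):
    return lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]


def scan_tree(tree, lo, hi):
    # Placeholder: a real implementation would run a kd-tree window query here.
    return [p for p in tree if in_window(p, lo, hi)]


def bkd_window_query(buffer, trees, lo, hi, query_tree=scan_tree):
    """Query the buffer and every non-empty kd-tree, then combine the answers."""
    hits = [p for p in buffer if in_window(p, lo, hi)]
    for tree in trees:
        if tree is not None:               # empty slots are skipped
            hits.extend(query_tree(tree, lo, hi))
    return hits


# Example with M = 2: tree 0 holds 2 points, tree 1 is empty, tree 2 holds 8 points.
buffer = [(1, 1)]
trees = [
    [(2, 2), (9, 9)],
    None,
    [(3, 8), (4, 4), (5, 1), (7, 3), (0, 6), (6, 6), (8, 0), (3, 3)],
]
hits = sorted(bkd_window_query(buffer, trees, (2, 2), (7, 4)))
```

Each point lives in exactly one tree or in the buffer, so the partial answers can be concatenated without removing duplicates.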

19
Experiment Platform
  • Three different types of point sets
  • Real points from the TIGER data set
  • Uniform data
  • Points along a diagonal

21
Experimental Results
  • Space Utilization

22
Experimental Results
  • Bulk Loading Performance

23
Experimental Results
  • Bulk Loading Performance

24
Experimental Results
  • Insertion Performance

25
Experimental Results
  • Insertion Performance

26
Experimental Results
  • Query Performance

27
Experimental Results
  • Query Performance

29
Conclusion
  • Space utilization: the kdB-tree has poor
    utilization, while the Bkd-tree is above 99%.
  • Insertion: the Bkd-tree is 100 times faster than
    the kdB-tree.
  • Query: the Bkd-tree performs excellently.

30
Q&A
  • Thank you!