Title: Methods and Technologies for Managing Large Volumes of Data
1 Methods and Technologies for Managing Large Volumes of Data
Instructors: Makris Christos, e-mail makri_at_ceid.upatras.gr
Lectures: Thursday 17:00-19:00, ?200
2 Course Description
- The course is aimed at students who want to acquire basic knowledge of the data structures and algorithmic techniques used to manage and process large volumes of data.
- Related courses: Data Structures; Advanced Data Structures and Graphics; Theory of Basic Data Structures; Algorithms; Algorithm Implementation Technologies; Digital Image Processing and Analysis; Information Retrieval; Databases I, II.
3 Topics Covered
- The topics covered in the course are the following:
- 1. Secondary memory models (two-level model, hierarchical models, cache-oblivious models).
- 2. Algorithms and data structures for two-level models (B-Trees, Weight Balanced B-Trees, Buffer Trees).
- 3. Computational geometry algorithms in secondary memory.
- 4. Algorithms for range searching problems in secondary memory.
- 5. String management algorithms in secondary memory.
- 6. Algorithms and data structures for cache-oblivious models (Cache Oblivious B-Tree, cache-oblivious priority queues, geometric algorithms in the cache-oblivious model of computation).
- 7. Algorithms in the data stream model.
- 8. Applications (temporal databases, [illegible] databases, multimedia databases).
4 Course Logistics
- Exam (oral)
- Assignment
- Presentation
- Written report
- Final grade
- [illegible]
5 List of Papers
- External Memory Data Structures
- Vitter, J. S. and Shriver, E. A. 1994a. Algorithms for parallel memory I: Two-level memories. Algorithmica 12(2-3), 110-147.
- Vitter, J. S. and Shriver, E. A. 1994b. Algorithms for parallel memory II: Hierarchical multilevel memories. Algorithmica 12(2-3), 148-169.
- S. Lanka, E. Mays. Fully Persistent B+-trees. ACM International Conference on Management of Data, 426-435, 1992.
- Varman, P., Verma, R. An Efficient Multiversion Access Structure. IEEE Transactions on Knowledge and Data Engineering, 391-409, 1997.
- B. Becker, S. Gschwind, T. Ohler, B. Seeger, P. Widmayer. An asymptotically optimal multiversion B-tree. The VLDB Journal, 264-275.
- P. Ferragina and R. Grossi. The string B-tree: a new data structure for string search in external memory and its applications. J. ACM 46(2), 236-280, 1999.
6 External Memory Geometric Data Structures
- Arge, L. 1995a. The buffer tree: A new technique for optimal I/O-algorithms. In Proceedings of the Workshop on Algorithms and Data Structures, Vol. 955 of Lecture Notes in Computer Science, Springer-Verlag, 334-345, 1995. A complete version appears as BRICS Technical Report RS-96-28, University of Aarhus.
- Arge, L. 1997. External-memory algorithms with applications in geographic information systems. In M. van Kreveld, J. Nievergelt, T. Roos, and P. Widmayer, eds, Algorithmic Foundations of GIS, Vol. 1340 of Lecture Notes in Computer Science, Springer-Verlag, 213-254.
- Arge, L. and Vahrenhold, J. I/O-efficient dynamic planar point location. In Proceedings of the ACM Symposium on Computational Geometry (June), Vol. 9, 191-200, 2000.
- Kanellakis, P. C., Ramaswamy, S., Vengroff, D. E., and Vitter, J. S. 1996. Indexing for data models with constraints and classes. Journal of Computer and System Sciences 52(3), 589-612.
- Arge, L. and Vitter, J. S. Optimal dynamic interval management in external memory. In Proceedings of the IEEE Symposium on Foundations of Computer Science, 1996.
- Arge, L., Samoladas, V., and Vitter, J. S. 1999b. Two-dimensional indexability and optimal range search indexing. In Proceedings of the ACM Conference on Principles of Database Systems (Philadelphia, May-June), Vol. 18, 346-357, 1999.
7 Cache Oblivious Data Structures
- M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In Proc. 40th IEEE Symp. on Foundations of Computer Science (FOCS 99), pages 285-297, 1999.
- M. A. Bender, E. Demaine, and M. Farach-Colton. Cache-Oblivious B-Trees. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pages 399-409, 2000.
- Gerth Stølting Brodal and Rolf Fagerberg. Funnel Heap: A Cache Oblivious Priority Queue. In Proc. 13th Annual International Symposium on Algorithms and Computation, volume 2518 of Lecture Notes in Computer Science, pages 219-228. Springer Verlag, Berlin, 2002.
- Gianni Franceschini, Roberto Grossi, J. Ian Munro, and Linda Pagli. Implicit B-trees: A New Data Structure for the Dictionary Problem. Journal of Computer and System Sciences, special issue of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2004.
- Gianni Franceschini and Roberto Grossi. Implicit dictionaries supporting searches and amortized updates in O(log n loglog n). In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 670-678. SIAM, 2003.
8 External Memory Algorithms
All entries below are chapters of the Handbook of Massive Data Sets, ed. Abello, Pardalos, Resende.
- A. Broder and M. Henzinger. Algorithmic Aspects of Information Retrieval on the Web.
- M. Najork, A. Heydon. High Performance Web Crawling.
- W. Aiello, F. Chung, L. Lu. Random Evolution in Massive Graphs.
- O. Goldreich. Property Testing in Massive Graphs.
- R. Baeza-Yates, A. Moffat, G. Navarro. Searching Large Text Collections.
- J. Dula, F. Lopez. Data Envelopment Analysis (DEA) in Massive Data Sets.
- P. Bradley, O. Mangasarian, D. Musicant. Optimization Methods in Massive Data Sets.
- F. Murtagh. Clustering in Massive Data Sets.
- M. Riedewald, D. Agrawal, A. Abbadi. Managing and Analyzing Massive Data Sets with Data Cubes.
- M. Goodchild, K. Clarke. Data Quality in Massive Data Sets.
- T. Johnson. Data Warehousing.
- Q. Ma, M. Wang, J. Gatliker. Mining Biomolecular Data Using Background Knowledge and Artificial Neural Networks.
9 RAM Model of Computation
- Basic theoretical model of computation
- Infinite memory
- Uniform access cost
- Simple model, widely used in computer science applications
10 Memory Hierarchy of a Typical Machine
11 Practical Issues
- Caching
- Virtual Memory
- Secondary Storage
  - Disk: Floppy (hard, soft)
  - Winchester
  - RAM Disks
  - Optical, CD-ROM
  - Arrays
  - Tape: Reel, Cartridge
  - Robots
12 Characteristics of Magnetic Disks
- Disk access is 10^6 times slower than main memory access.
- Disk systems try to amortize the large access time by transferring large contiguous blocks of data (8-16 KB).
- It is important that data storage and access exploit the blocking the disk provides.
13 Typical Time Costs
- Seek time: 10-40 ms
- Rotational delay: 4.16 ms
- Block transfer time
  - Transfer rate t: 10-50 MB/s
  - With block size 32 KB and t = 32 MB/s, transferring one block takes about 1 ms
- Random I/O differs from sequential I/O
- Write cost is similar to read cost
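The figures on this slide can be combined into a rough cost estimate. The sketch below is only an illustration under simple assumptions: a random read pays seek, rotational delay, and transfer for every block, while a sequential read pays seek and rotation once. The function names and the choice of a 10 ms seek are assumptions made for the example.

```python
def block_transfer_ms(block_kb, rate_mb_per_s):
    """Time to transfer one block, in milliseconds."""
    return block_kb / 1024.0 / rate_mb_per_s * 1000.0


def random_read_ms(num_blocks, seek_ms, rotation_ms, block_kb, rate_mb_per_s):
    """Every block pays seek + rotational delay + transfer (assumed model)."""
    per_block = seek_ms + rotation_ms + block_transfer_ms(block_kb, rate_mb_per_s)
    return num_blocks * per_block


def sequential_read_ms(num_blocks, seek_ms, rotation_ms, block_kb, rate_mb_per_s):
    """One seek and one rotational delay, then contiguous transfers (assumed model)."""
    return seek_ms + rotation_ms + num_blocks * block_transfer_ms(block_kb, rate_mb_per_s)


if __name__ == "__main__":
    # Slide figures: 32 KB blocks, 32 MB/s, 4.16 ms rotation; 10 ms seek is the low end.
    print(block_transfer_ms(32, 32))
    print(random_read_ms(1000, 10, 4.16, 32, 32))
    print(sequential_read_ms(1000, 10, 4.16, 32, 32))
```

Under these assumptions, reading 1000 blocks at random costs roughly fifteen times more than reading them sequentially, which is the gap the slide points at.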
14 Scalability Problems
- Many programs developed for the RAM model still run on large data sets, because the operating system moves blocks in and out as needed.
- Operating systems use sophisticated paging and prefetching strategies.
- However, if the program makes scattered accesses, even a good operating system cannot take advantage of block access.
15 Hierarchical Memory Models
16 Two-Level Memory Hierarchies
- The problem input resides on disk
- The problem output is written to disk
- The cost of an algorithm is the number of input/output operations
- Elements are grouped into blocks of size B
17 Parallel Disk Subsystems
- Unrestricted parallel model (Aggarwal and Vitter 1987)
  - There are multiple disk levels
  - Items are grouped on disk, with B items per block
  - Any D blocks can be written or read simultaneously in one I/O
- Parallel disk model (Aggarwal and Vitter 1987)
  - There is one processing unit, one memory unit, and D disk units
  - D blocks can be read or written simultaneously, but only if they reside on different disks
  - More realistic than the unrestricted parallel model
  - The CPU can become the bottleneck if D is large enough
- Parallel memory hierarchies (Aggarwal and Vitter 1987)
  - H hierarchies of the same type (with H CPUs) are connected by a network
  - The number of disks D can be greater than, equal to, or smaller than the number of processors
18 Multilevel Memory Hierarchies
- Hierarchical Memory Model (Aggarwal, Alpern, Chandra, Snir 1987)
  - There are multiple disk levels
  - Accessing memory location x takes time f(x)
  - f is an increasing function such that there is a constant c with f(2x) <= c f(x) for every x
- Block Transfer Model (Aggarwal, Chandra, Snir 1987)
  - There are multiple disk levels
  - Accessing memory location x takes time f(x)
  - Once an access has been made, additional items can be accessed at a cost of one per item
- Uniform Memory Hierarchies (Alpern, Carter, Feig 1990)
  - There is a memory hierarchy with exponentially growing sizes
  - Each bus has an associated bandwidth
  - All buses can be active simultaneously
19 Cache Oblivious Model (1)
- Consider a hierarchy with three levels (e.g. cache, RAM, disk).
- Suppose block transfers at the lower levels move B1 words and at the higher levels B2 words.
- How could an algorithm be designed that takes both B1 and B2 into account?
20 Cache Oblivious Model (2)
- Each level of the hierarchy plays the role of a cache for the next level.
- Caching is controlled in hardware and in software.
- Common caching strategies: LRU, FIFO.
- In practice these are usually restricted variants of LRU and FIFO. More specifically, associativity is limited: the number of possible positions for a given block is a small number k.
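To make the LRU strategy concrete, here is a minimal sketch of a fully associative LRU block cache that counts misses for a sequence of element addresses. It is an illustration only: the function name, the parameters, and the two access patterns in the usage note are assumptions, and real caches have the limited associativity mentioned above.

```python
from collections import OrderedDict


def count_misses(addresses, num_blocks, block_size):
    """Count block misses of a fully associative LRU cache holding num_blocks blocks."""
    cache = OrderedDict()  # block id -> True, ordered from least to most recently used
    misses = 0
    for address in addresses:
        block = address // block_size
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


if __name__ == "__main__":
    sequential = list(range(1024))
    scattered = [j * 16 + i for i in range(16) for j in range(64)]
    print(count_misses(sequential, num_blocks=4, block_size=16))
    print(count_misses(scattered, num_blocks=4, block_size=16))
```

Both patterns touch the same 1024 elements, but the sequential scan misses once per block while the strided pattern misses on every access, because each block is evicted before it is reused.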
21 Cache Oblivious Model (3)
- Two-level hierarchy
- Cache of size M
- Cache line of length B
- Fully associative
- Optimal replacement
22 Cache Oblivious Model (4)
- Basic assumptions
  - Two-level hierarchy
  - Tall cache assumption: M = Ω(B^2)
  - Optimal cache replacement policy
- Cache-oblivious algorithm
  - An algorithm that does not need to know M and B.
  - In the analysis of the algorithm, the complexities are expressed as functions of M and B.
  - We expect complexity similar to that of the classical model.
  - A cache-oblivious algorithm is automatically optimal for all levels of a memory hierarchy.
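As an illustration of an algorithm that never refers to M or B, the following sketch transposes a matrix by recursively halving its larger dimension, in the spirit of the divide-and-conquer algorithms of Frigo et al. listed in the reading list. It is a sketch under assumptions, not the authors' code: the function names and the 2x2 base case are choices made for this example.

```python
def _transpose_block(a, out, r0, r1, c0, c1):
    """Write the transpose of a[r0:r1][c0:c1] into out, splitting the larger side."""
    rows, cols = r1 - r0, c1 - c0
    if rows <= 2 and cols <= 2:
        # Small base case: copy directly. No cache parameter appears anywhere.
        for i in range(r0, r1):
            for j in range(c0, c1):
                out[j][i] = a[i][j]
    elif rows >= cols:
        mid = (r0 + r1) // 2
        _transpose_block(a, out, r0, mid, c0, c1)
        _transpose_block(a, out, mid, r1, c0, c1)
    else:
        mid = (c0 + c1) // 2
        _transpose_block(a, out, r0, r1, c0, mid)
        _transpose_block(a, out, r0, r1, mid, c1)


def transpose(a):
    """Return the transpose of a rectangular list-of-lists matrix."""
    rows = len(a)
    cols = len(a[0]) if rows else 0
    out = [[None] * rows for _ in range(cols)]
    _transpose_block(a, out, 0, rows, 0, cols)
    return out


if __name__ == "__main__":
    print(transpose([[1, 2, 3], [4, 5, 6]]))
```

The recursion keeps shrinking the working set, so at some depth a sub-block fits in whatever cache level is present, which is the intuition behind the "optimal for all levels" claim.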
23 Basic Model
- N: number of elements in the problem
- B: number of elements in a disk block
- M: number of elements that fit in main memory
- T: number of elements in the output
- I/O: movement of one block between memory and disk
- We assume that M > B^2
- Ways of measuring performance
  - Number of I/Os
  - Disk space used
  - Computation time required
- What optimal complexity means
- Distinction between on-line and off-line computation
24 Basic Techniques (off-line)
- Merge-sort-like techniques
- Distribution sorting techniques
- Distribution sweeping
- Persistent B-Trees (off-line technique, improves on the on-line technique by a factor of B)
- Batched filtering (a technique for simultaneous searches in structures that can be modeled as digraphs)
- External fractional cascading
- External marriage-before-conquest
- Batched incremental construction
25 Off-line Computational Geometry
- 1. Compute the intersections of N plane segments and their trapezoidal decomposition.
- 2. Find all intersections between N nonintersecting red line segments and N nonintersecting blue line segments in the plane.
- 3. Answer Q orthogonal 2-D range queries on N points in the plane.
- 4. Construct the 2-D and 3-D convex hull of N points.
- 5. Voronoi diagram and triangulation of N points in the plane.
- 6. Perform Q point location queries in a planar subdivision of size N.
- 7. Find all nearest neighbors for a set of N points in the plane.
- 8. Find the pairwise intersections of N orthogonal rectangles in the plane.
- 9. Compute the measure of the union of N orthogonal rectangles in the plane.
- 10. Compute the visibility of N segments in the plane from a viewpoint.
- 11. Perform Q ray-shooting queries in 2-D Constructive Solid Geometry (CSG).
27 Basic Complexities
               Internal     External
- Scanning     N            N/B
- Sorting      N log N      (N/B) log_{M/B}(N/B)
- Permuting    N            min(N, (N/B) log_{M/B}(N/B))
- Searching    log N        log_B N
- Observations
  - Linear I/O is O(N/B)
  - Permuting is not linear
  - Permuting and sorting times are practically equal
  - The factor B is very important
  - Cannot sort optimally
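To get a feel for how much the factor B matters, the sketch below evaluates the standard external-memory bounds of the Aggarwal and Vitter model (cited in the bibliography) for concrete parameters. Constant factors are ignored, and the function names and sample values are assumptions chosen for the illustration.

```python
import math


def scan_ios(n, b):
    """Scanning bound: N/B block transfers."""
    return math.ceil(n / b)


def sort_ios(n, m, b):
    """Sorting bound: (N/B) * log_{M/B}(N/B), with at least one pass."""
    blocks = n / b
    return blocks * max(1.0, math.log(blocks, m / b))


def permute_ios(n, m, b):
    """Permuting bound: min(N, sorting bound)."""
    return min(n, sort_ios(n, m, b))


def search_ios(n, b):
    """Searching bound: log_B N."""
    return math.log(n, b)


if __name__ == "__main__":
    n, m, b = 10**9, 10**6, 10**3
    print(scan_ios(n, b), sort_ios(n, m, b), permute_ios(n, m, b), search_ios(n, b))
```

With these sample values, sorting takes about two passes over the data, and permuting costs the same as sorting, since moving elements one at a time would take N I/Os.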
28 Sorting
- Fewer than M/B sorted lists (queues) can be merged in O(N/B) I/Os.
- An unsorted list can be distributed using fewer than M/B split elements in O(N/B) I/Os.
29 Sorting
- Merge sort
  - Create N/M memory-sized sorted lists
  - Repeatedly merge lists, Θ(M/B) at a time
  - O(log_{M/B}(N/M)) phases using O(N/B) I/Os each, for O((N/B) log_{M/B}(N/B)) I/Os in total
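The two merge sort steps can be sketched in memory as follows. This is a simulation under assumptions, not an I/O-level implementation: Python lists stand in for disk-resident runs, `M` and `B` are counted in elements, and the merge fan-in is taken as M/B - 1 so that one block of memory is left for output.

```python
import heapq


def external_merge_sort(data, M, B):
    """Return (sorted list, number of merge passes) for memory size M and block size B."""
    fan_in = max(2, M // B - 1)  # assumed: one memory block reserved for output
    # Run formation: sort memory-sized chunks, giving about N/M sorted runs.
    runs = [sorted(data[i:i + M]) for i in range(0, len(data), M)]
    passes = 0
    # Merge phases: combine fan_in runs at a time until one run remains.
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)
        ]
        passes += 1
    return (runs[0] if runs else []), passes


if __name__ == "__main__":
    result, passes = external_merge_sort(list(range(100, 0, -1)), M=8, B=2)
    print(result[:5], passes)
```

Each pass reads and writes every element once, which corresponds to O(N/B) I/Os per phase. The number of passes shrinks as the fan-in M/B grows.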
30 Dynamic Hashing
- Two categories
  - Directory
  - Directoryless
- Directory
  - A table of 2^d cells
  - Items are stored in cells according to the last d bits of their hash address
  - Each cell holds a pointer to the block where its items are stored
  - 2 I/Os per lookup
  - Cells are split and merged
  - Each cell has a local depth b and may share its block with other cells
  - Blocks are about 67% full
- Directoryless
  - Overflow lists are used, and blocks are split in a predetermined order
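The directory scheme can be sketched as below. This is a simplified in-memory illustration, not a disk-based implementation: the class and method names are assumptions, Python's built-in `hash` stands in for the hash function, and merging of cells is omitted. It also assumes that no more than `capacity` keys share the same hash value, as is the case for small integer keys.

```python
class Bucket:
    """One disk block: holds up to `capacity` items and a local depth."""

    def __init__(self, depth, capacity):
        self.depth = depth
        self.capacity = capacity
        self.items = {}


class ExtendibleHash:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.global_depth = 0
        self.directory = [Bucket(0, capacity)]  # 2^d cells pointing to buckets

    def _cell(self, key):
        # Index by the last `global_depth` bits of the hash address.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        # One access for the directory cell, one for the bucket (2 I/Os on disk).
        return self.directory[self._cell(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._cell(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the directory: cell i and cell i + 2^d share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, self.capacity)
        new_bit = 1 << (bucket.depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = sibling
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            (sibling if hash(k) & new_bit else bucket).items[k] = v


if __name__ == "__main__":
    table = ExtendibleHash(capacity=4)
    for k in range(200):
        table.put(k, k * k)
    print(table.get(12), table.global_depth, len(table.directory))
```

Several directory cells may point to the same bucket as long as that bucket's local depth is smaller than the global depth, which is the sharing described on the slide.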
31 Spatial Data Structures
- Storing and querying spatial data
- Two kinds of spatial structures
  - Data-driven
    - Based on partitioning the objects themselves
    - E.g. R-trees, kd-trees
  - Space-driven
    - Partitioning of the space containing the objects
    - E.g. quadtrees, grid files
- Hybrid structures
  - Cross tree
    - D-dimensional version of the B-tree
    - Data-driven partitioning at the upper levels, space-driven below
    - Queries in [...]
    - Insertions and deletions in [...]
32 Bibliography
- Aggarwal and Vitter, 1988. The input/output complexity of sorting and related problems. Communications of the ACM 31(9), 1116-1127.
- Aggarwal, Alpern, Chandra, Snir. A model for hierarchical memory. In Proceedings of the ACM Symposium on Theory of Computing, 19, 305-314.
- Aggarwal, Chandra, Snir. Hierarchical memory with block transfer. In Proceedings of the IEEE Symposium on Foundations of Computer Science, 28, 204-216.
- Vitter and Shriver. Algorithms for parallel memory I: Two-level memories. Algorithmica 12(2-3), 110-147.
- Vitter and Shriver. Algorithms for parallel memory II: Hierarchical multilevel memories. Algorithmica 12(2-3), 148-169.
- J. Abello, P. Pardalos, M. Resende. Handbook of Massive Data Sets.
33 Data Representation: Hierarchical Indexing Structures
34 Block Addresses in Main and Secondary Memory
- Block address for blocks in main memory: the block has a virtual memory address when it is loaded into a buffer in main memory.
- Block address for blocks in secondary memory: the block has no virtual memory address, so the physical address space has to be used. The physical address describes the physical location of the block.
35 Physical and Logical Addresses
- Physical address: describes the physical location, i.e. disk, cylinder, and track numbers. Typical size: 8-16 bytes.
- Logical address: a fixed-length arbitrary string for each record. A table is used to map logical addresses to physical addresses.
36 B-Tree
- B-tree with branching parameter b and leaf parameter k (b, k >= 8)
  - All leaves are on the same level and contain between k/4 and k elements
  - Except for the root, all nodes have out-degree between b/4 and b
  - The root has out-degree between 2 and b
- B-tree with leaf parameter [...]
  - O(N/B) space
  - Height [...]
  - Amortized [...] leaf rebalance operations
  - Amortized [...] internal node rebalance operations
- B-tree with branching parameter B^c, 0 < c <= 1, and leaf parameter B
  - Space O(N/B), updates O(log_B N), queries O(log_B N + k/B)
- Variations: B+ (leaf links), B* (variation of B+, with no splitting, sharing if possible)
39 Secondary Structures
- When secondary storage structures are used, a rebalancing operation at v can cost O(w(v)) I/Os (w(v) is the weight of v)
- If Ω(w(v)) insertions must pass through v between rebalancing operations
  - => O(1) amortized split cost
  - => O(log_B N) amortized insertion cost
- Nodes in the classical B-tree do not have this property
40 BB[α]-Tree
- BB[α]-trees have this property
  - Defined through weight conditions
  - The ratio between the weight of the left child and the weight of the right child is between α and 1-α
  - => height O(log N)
- If [...], rebalancing can be done using rotations
- It seems difficult to implement BB[α]-trees I/O-efficiently
41 Weight-balanced B-tree
- Combination of the B-tree and the BB[α]-tree
  - Weight constraint instead of degree constraint
  - Rebalancing by split/merge as in the B-tree
- Weight-balanced B-tree with parameters b and k (b > 8, k >= 8)
  - All leaves are at the same depth and contain between k/4 and k elements
  - An internal node v at level l has w(v) < b^l k
  - Except for the root, an internal node v at level l has w(v) > (1/4) b^l k
  - The root has more than one child
42 Weight-balanced B-tree (continued)
- Every internal node has degree between b/4 and 4b
  - => height O(log_b(N/k))
- External memory
  - Choose 4b = B (or even B^c for 0 < c <= 1)
  - k = B
  - => O(N/B) space, [...] query time
43 Weight-balanced B-tree: Insertion
- Search for the relevant leaf u and insert the new element
- Traverse the path from leaf u to the root
  - If a level-l node v now has weight w(v) = b^l k + 1,
  - split it into nodes v' and v'' with [...] and [...]
- The algorithm is correct since [...], so that [...] and [...]
  - [...] nodes are accessed
- Weight-balance property
  - Ω(b^l k) updates below v' and v'' before the next operation on them
44 Weight-balanced B-tree: Deletion
- Search for the relevant leaf u and delete the element
- Traverse the path from u to the root
  - If a level-l node v now has [...],
  - then merge it with a sibling into node v', where [...]
  - If [...], then split into nodes with weight [...] and [...]
- The algorithm is correct and touches [...] nodes
- Weight-balance property
  - [...] operations below v' and v'' before the next operation on them
45 Summary: Weight-balanced B-tree
- Weight-balanced B-tree with branching parameter b and leaf parameter k = Ω(B)
  - O(N/B) space
  - Height [...]
  - [...] rebalancing operations after an update
  - Ω(w(v)) updates below v between consecutive operations on v
- Weight-balanced B-tree with branching parameter B^c and leaf parameter B
  - Updates and queries in [...] I/Os
  - Bottom-up construction in [...] I/Os
46 Persistence
Temporal Databases
- Transaction-time databases: they keep the history of their activity. Every transaction with the database is tagged with a timestamp, and different temporal snapshots of the database can be accessed.
- Valid-time databases: they keep objects in which one component denotes time instants. The set of these values stores the current knowledge about the present, the past, or even the future of the stored objects.
- Bitemporal databases: a combination of the two categories above. They represent reality while allowing both retroactive and after-the-fact changes.
47 Transaction-Time Databases
[Figure]
48 Valid-Time Databases
[Figure]
49 Bitemporal Databases
[Figure]
50 Persistence: Concepts
Data structures
- Ephemeral: only one version
- Persistent
  - Partially: linear sequence of versions
  - Fully: tree structure of versions
  - Confluently: DAG of versions
[Figure: version diagrams (versions i, i1, i2, ...) for the linear, tree, and DAG cases]
51 Converting Ephemeral Structures into Partially Persistent Ones (Driscoll et al. 89)
Two general methods for linked structures
- a) fat-node
  - Finite set of nodes with a constant number of fields (data and pointers)
  - Constant number of entry nodes
  - Space: O(1) per update
  - Time: O(log m) per update and access
- b) node-copying
  - Amortized space: O(1) per update
  - Amortized time: O(1) per update
52 Fat Node Method
[Figure: an ephemeral node next to the corresponding persistent node; each entry of the persistent node records a field name, a field value, and a version]
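A minimal sketch of a fat node for partial persistence follows. It keeps, for every field, the list of versions at which the field was written and finds the value valid at a given version by binary search, which is where the O(log m) access time comes from. The class and method names are assumptions made for this illustration; it is not the code of Driscoll et al.

```python
import bisect


class FatNode:
    """A node that remembers every value ever written to each of its fields."""

    def __init__(self):
        self._stamps = {}  # field name -> increasing list of version numbers
        self._values = {}  # field name -> values, parallel to the stamps

    def write(self, field, version, value):
        stamps = self._stamps.setdefault(field, [])
        values = self._values.setdefault(field, [])
        if stamps and version < stamps[-1]:
            # Partial persistence: only the newest version may be modified.
            raise ValueError("cannot write to an old version")
        if stamps and stamps[-1] == version:
            values[-1] = value
        else:
            stamps.append(version)
            values.append(value)

    def read(self, field, version):
        stamps = self._stamps.get(field, [])
        i = bisect.bisect_right(stamps, version)  # binary search over versions
        if i == 0:
            raise KeyError("field has no value at this version")
        return self._values[field][i - 1]


if __name__ == "__main__":
    node = FatNode()
    node.write("next", 1, "a")
    node.write("next", 3, "b")
    print(node.read("next", 2), node.read("next", 3))
```

Each write adds one (version, value) pair, so the space per update is constant, matching the bound listed for the fat-node method.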
53 Node-Copying Method
- Nodes have bounded capacity. When a node fills up, we create a new copy of the node.
- [Figure label: field update]
- Because the new node must be reachable, we must also copy its immediate predecessors.
- The method gives a constant amortized cost when the in-degree of every node is bounded.
54 Persistent B-Tree
- Partial persistence
  - Update the current version (producing a new one)
  - Query all versions
- We would like a partially persistent B-tree with
  - O(N/B) space, where N is the number of update operations
  - [...] update cost and query time
55 Naive Approach
- An easy way to turn the B-tree into a partially persistent structure
  - Copy the structure at every operation
  - Maintain an auxiliary structure for accessing versions (a B-tree)
- Satisfactory query time per version, but
  - O(N/B) I/O cost per update
  - O(N^2/B) space
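The quadratic space of the copying approach is easy to see in a toy version. The sketch below is an illustration only: a sorted Python list stands in for the B-tree, and the function name is an assumption. It stores a full copy after every insertion.

```python
def naive_persistent_inserts(keys):
    """Return the list of versions obtained by copying the structure on every insert."""
    versions = [[]]  # version 0 is the empty structure
    for key in keys:
        versions.append(sorted(versions[-1] + [key]))  # full copy per update
    return versions


if __name__ == "__main__":
    versions = naive_persistent_inserts([5, 1, 3])
    print(versions)
    print(sum(len(v) for v in versions))
```

After N insertions the versions hold 1 + 2 + ... + N = N(N+1)/2 elements in total, which is the O(N^2/B) blocks quoted above once elements are packed B per block.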
56 Persistent B-tree
- Elements are augmented with an existence interval and stored in a single structure
- Persistent B-tree with parameter b (> 16)
  - Directed graph
  - Nodes contain elements augmented with an existence interval
  - At any time t, the nodes with elements alive at time t form a B-tree with leaf parameter and branching parameter b
  - B-tree with leaf capacity b and branching parameter b on the nodes of in-degree 0
- => If b = B:
  - [...] I/Os query time
57 Persistent B-tree: Updates
- Updates as in the B-tree
- To maintain linear space we keep the following condition:
  - A new node contains between 3b/8 and [...] alive elements and no dead ones.
58 Persistent B-tree: Insertion
- Search for the relevant leaf u and insert the new element
- If u contains B+1 elements: block overflow
  - Version split: mark u dead and create a new node u'
  - If [...]: strong overflow
  - If [...]: strong underflow
  - If [...], then update parent(u):
    - Delete the reference to u and insert a reference to u'
59 Persistent B-tree: Insertion
- Strong overflow ([...])
  - Split v into v' and v'' with [...] elements
  - Update parent(v):
    - Delete the reference to v and insert references to v' and v''
- Strong underflow ([...])
  - Merge the x elements with the y alive elements obtained by a version split of the sibling ([...])
  - If [...], then (strong overflow) split into nodes with (x+y)/2 elements ([...])
  - Recursively update parent(u): delete two references, insert one or two references
60 Persistent B-tree: Deletion
- Search for leaf u and mark the element dead
- If u contains [...] alive elements: block underflow
  - Version split: mark u dead and create a node u' with x alive elements
  - Strong underflow ([...]):
    - Merge (version split) and split (strong overflow)
  - Recursively update parent(u):
    - Delete two references, insert one or two
61 Persistent B-tree
[Figure]
62 Persistent B-tree: Analysis
- Update
  - Search and rebalancing along one path
- Space O(N/B)
  - At least [...] updates in a leaf during its existence interval
  - When u becomes dead:
    - At most two new nodes are created
    - At most one block overflow/underflow one level up (in parent(l))
  - => In N updates we create
    - [...] leaves
    - [...] nodes i levels up
  - => [...] blocks
63 Persistent B-tree
- Persistent B-tree
  - Update the current version
  - Query all versions
- Efficient implementation using existence intervals
  - Standard technique
- => During N operations:
  - O(N/B) space
  - [...] update time
  - [...] query time
64 Other B-tree Variants
- Level-balanced B-trees
  - Global instead of local balancing strategy
  - Whole subtrees are rebuilt when there are too many nodes on a level
  - Used when parent pointers and divide/merge operations are needed
- String B-trees
  - Used to maintain and search (variable-length) strings
65 B-tree Construction
- In main memory we can sort N elements in O(N log N) time using a balanced tree:
  - Insert all elements one by one (construct the tree)
  - Output them in sorted order using an in-order traversal
- The same algorithm with a B-tree requires O(N log_B N) I/Os
  - Non-optimal by a factor of [...]
- We can build a B-tree bottom-up in [...] I/Os
- We would like structures with [...] I/O cost per operation => solution: the Buffer Tree
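Bottom-up construction from already sorted input can be sketched as follows. This is a simplified in-memory illustration under assumptions: nodes are plain dictionaries, `leaf_size` and `fanout` are hypothetical parameter names, and the last node on each level may be underfull, which a real B-tree would have to repair.

```python
import bisect


def bulk_load(sorted_keys, leaf_size, fanout):
    """Build a search tree level by level from sorted keys; return (root, height)."""
    if not sorted_keys:
        raise ValueError("need at least one key")
    if fanout < 2 or leaf_size < 1:
        raise ValueError("fanout must be >= 2 and leaf_size >= 1")
    # Leaf level: consecutive groups of leaf_size keys.
    level = [
        {"keys": sorted_keys[i:i + leaf_size], "children": None, "min": sorted_keys[i]}
        for i in range(0, len(sorted_keys), leaf_size)
    ]
    height = 1
    # Each pass groups `fanout` nodes under a new parent, scanning the level once.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            parents.append({
                "keys": [child["min"] for child in group[1:]],  # routing keys
                "children": group,
                "min": group[0]["min"],
            })
        level = parents
        height += 1
    return level[0], height


def contains(node, key):
    """Descend from the root using the routing keys, then look in the leaf."""
    while node["children"] is not None:
        node = node["children"][bisect.bisect_right(node["keys"], key)]
    return key in node["keys"]


if __name__ == "__main__":
    root, height = bulk_load(list(range(100)), leaf_size=4, fanout=4)
    print(height, contains(root, 42), contains(root, 100))
```

Every level is produced by one sequential scan of the level below, so once the input is sorted the construction itself is a sequence of scans rather than N separate insertions.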
66 Buffer-tree Technique
67 Basic Ideas (1)
- Definition
  - B-tree with branching parameter [...] and leaf size B
  - A buffer of size M at every internal node
- Updates
  - Attach a time-stamp to each element insertion/deletion
  - Collect B elements in memory before inserting them into the root buffer
  - Perform buffer-emptying when a buffer fills up
68 Basic Ideas (2)
- Observations
  - A buffer can become larger than m during recursive buffer-emptying
  - Elements are distributed in sorted order
  - => at most m elements in a buffer are unsorted
  - Rebalancing is needed when a leaf-node buffer is emptied
  - Leaf-node buffer-emptying is performed only after all full internal node buffers have been emptied