Data Structures - PowerPoint PPT Presentation

About This Presentation
Title:

Data Structures

Description:

Data Structures & Algorithms Radix Search Richard Newman based on s by S. Sahni and book by R. Sedgewick – PowerPoint PPT presentation

Number of Views:80
Avg rating:3.0/5.0
Slides: 31
Provided by: nemo68
Learn more at: https://www.cise.ufl.edu
Category:

less

Transcript and Presenter's Notes

Title: Data Structures


1
Data Structures Algorithms Radix Search
Richard Newman based on slides by S. Sahni and
book by R. Sedgewick
2
Radix-based Keys
  • Key has multiple parts
  • Each part is an element of some set
  • Character
  • Numeral
  • Key parts can be accessed (e.g., string si)
  • Size of set is radix

3
Advantages of Radix-based Search
  • Good worst-case performance
  • Simpler than balanced trees, etc.
  • Fast access to data
  • Easy way to handle variable-length keys
  • Save space (part of key in structure)

4
Disadvantages of Radix-based Search
  • May be space-inefficient
  • Performance depends on access to bytes of keys
  • Must have distinct keys, or other way to handle
    duplicate keys

5
Digital Search Trees
  • Similar to binary search trees
  • Difference is that we use bits of the key to
    determine subtree to search
  • Path in tree prefix of key

6
Digital Search Trees
  • Insert A-S-E-R-C-H-I-N-G

A
Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
1
0
S
E
1
0
1
0
R
C
H
1
1
0
1
0
0
N
G
I
1
0
1
0
1
0
Note that binary tree is not sorted in BST sense
7
Digital Search Trees
Prop 15.1 A search or insertion into a DST takes
about lg N comparisons on average, and about 2 lg
N comparisons in the worst case, in a tree built
from N keys. The number of comparisons is never
more than the number of bits in the search key.
8
Tries
  • Use bits of key to guide search like DST
  • But keep keys in order like BST
  • Allow recursive sort, etc.
  • Pronounced try-ee or try
  • Keys kept at leaves of a binary tree

9
Tries
  • Defn. 15.1 A trie is a binary tree that has
    keys associated with each leaf, defined as
    follows
  • a trie for an empty set is a null link
  • a trie for a single key is a leaf w/key
  • a trie for gt 1 key is an internal node with
    left link referring to trie for keys that start
    with 0, right for keys 1xxx

10
Tries
  • Insert A-S-E-R-C-H-I-N-G

Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
A
1
0
S
A
Construct tree to point where prefixes match
11
Tries
  • Insert A-S-E-R-C-H-I-N-G

Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
A
1
0
S
A
1
0
1
0
1
0
1
0
A
E
1
0
1
0
R
S
Construct tree to point where prefixes match
12
Tries
  • Insert A-S-E-R-C-H-I-N-G

Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
1
0
1
0
1
0
H
1
0
1
0
A
E
1
0
1
0
1
0
C
A
R
S
13
Tries
  • Insert A-S-E-R-C-H-I-N-G

Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
1
0
1
0
1
0
H
1
0
1
0
1
0
E
1
0
1
0
1
0
1
0
1
0
C
A
R
S
H
I
14
Tries
  • Prop. 15.2 The structure of a trie is
    independent of key insertion order there is one
    unique trie for any given set of distinct keys.
  • Prop. 15.3 Insertion or search for a random key
    in a trie built from N random keys takes about lg
    N bit comparisons on average, in the worst case,
    bounded by bits in key

15
Tries
  • Annoying feature of tries
  • One-way branching when keys have common prefix
  • Prop. 15.4 A trie built from N random w-bit
    keys has about N/lg 2 nodes on the average (about
    1.44 N)

16
Patricia Tries
  • Annoying feature of tries
  • One-way branching when keys have common prefix
  • Two different types of nodes in trie
  • Patricia tries fix both of these
  • Practical Algorithm To Retrieve Information
    Coded In Alphanumeric

17
Patricia Tries
  • Avoid one-way branching
  • Keep at each node the index of the next bit to
    test
  • Skip over common prefix!
  • Avoid two types of nodes
  • Store data in internal nodes
  • Replace external links with back links

18
Patricia Tries
Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
0
S
4
1
R
H
2
E
3
C
4
A
19
Patricia Tries
Key Repr A 00001 S 10011 E 00101 R 10010 C 00011 H
01000 I 01001 N 01110 G 00111
0
S
4
1
R
H
2
E
3
C
4
A
20
Patricia Tries
  • Prop 15.5 Insertion or search in a patricia
    trie built from N random bitstrings takes about
    lg N bit comparisons on average, and about 2 lg N
    in the worst case, but never more than the length
    of the key.

21
Map
  • Radix search
  • Digital Search Trees
  • Tries
  • Patricia Tries
  • Multiway tries and TSTs
  • Text string algorithms

22
Multiway Tries
  • Like radix sort, can get benefit from comparing
    more than one bit at a time
  • Compare r bits, speed up search by a factor of r
  • What could possibly be bad?
  • Number of links is now R2r
  • Can waste a lot of space!

23
Multiway Tries
  • Structure is (almost) the same as binary tries
  • Except there are R branches
  • Search start at root, leftmost digit
  • Follow ith link if next R-ary digit is i
  • If null link, then miss
  • If reach leaf, it contains only key with prefix
    matching path to it - compare

24
Existence Tries
  • Only keys, no records
  • Insert/search
  • Defn. 15.2 The existence trie for a set of keys
    is
  • Empty set null link
  • Non-empty set internal node with links for each
    possible digit to tries built with the leading
    digit omitted

25
Existence Tries
  • Convenient to return null on miss, dummy record
    on hit
  • Convenient to have no duplicate keys and no key a
    prefix of another key
  • Keys of fixed length, or
  • Use termination character with value NULLdigit,
    only used as sentinel

26
Existence Tries
  • No need to store any data
  • All keys captured in trie structure
  • If reach NULLdigit at the same time we run out of
    key digits, search hit
  • Otherwise, search miss
  • Insert search until find null link, then add
    nodes for each of the remaining digits in the key

27
Existence Tries
the time is now for
t
n
i
f
h
i
s
o
o
m
e
w
r
e
28
Multi-way Tries
  • R-ary branching
  • Keys stored at leaves
  • Path to leaf defines prefix of key stored at leaf
  • Only build tree downward until prefixes become
    distinct

29
Multi-way Tries
  • Defn. 15.3 The multiway trie for a set of keys
    associated with leaves is
  • Set empty null link
  • Singleton set leaf with key
  • Larger set internal node with links for each
    possible digit to tries built with the leading
    digit omitted

30
Multi-way Tries
  • Def

31
Summary
  • Radix search
  • Digital Search Trees
  • Tries
  • Patricia Tries
  • Multiway tries and TSTs
  • Text string algorithms
Write a Comment
User Comments (0)
About PowerShow.com