Title: Tree Edit Distance
1Tree Edit Distance
2TED
- TED: the minimum number of edit operations needed to transform one tree into another
3The edit operations
- Delete a node
- Relabel a node
[Figure: example trees with nodes v and w illustrating the two operations]
4The edit operations
- Insert a node
[Figure: example tree illustrating the insertion of node v]
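The three operations can be sketched on a small ordered tree. This is a minimal illustration, assuming the usual convention that deleting a node hands its children to its parent (in order) and that inserting is the inverse. The `Node` class and the helper names are hypothetical, not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def relabel(node, new_label):
    """Relabel: change the label, leave the shape untouched."""
    node.label = new_label


def delete_child(parent, index):
    """Delete parent.children[index]; its children take its place, in order."""
    removed = parent.children[index]
    parent.children[index:index + 1] = removed.children


def insert_child(parent, label, start, end):
    """Insert a new node under parent that adopts parent.children[start:end]."""
    new_node = Node(label, parent.children[start:end])
    parent.children[start:end] = [new_node]
    return new_node


def to_tuple(node):
    """Nested-tuple view of a tree, convenient for comparisons."""
    return (node.label, tuple(to_tuple(c) for c in node.children))


# a(v(b, c), d): delete v, insert it again, then relabel it to w.
root = Node("a", [Node("v", [Node("b"), Node("c")]), Node("d")])
delete_child(root, 0)              # a(b, c, d)
after_delete = to_tuple(root)
insert_child(root, "v", 0, 2)      # a(v(b, c), d)
relabel(root.children[0], "w")     # a(w(b, c), d)
after_relabel = to_tuple(root)
```

Insert and delete undo each other here, which is why an edit script from one tree to another can be read backwards.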
5Existing Algorithms
6Recursive Algorithm SZ89
- Recurses on the rightmost roots: v in F and w in G.
- d(F,G) = min of three cases:
  - Delete v
  - Delete w
  - Match v and w
[Figure: forests F and G with rightmost roots v and w]
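The rightmost-root recursion can be sketched as a memoized function on forests. This is a minimal sketch under stated assumptions: unit costs for every operation, trees encoded as nested tuples `(label, children)`, and forests as tuples of trees. The encoding and the function names are illustrative choices, not the slides' notation.

```python
from functools import lru_cache

# A tree is (label, children); a forest is a tuple of trees.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(F, G):
    # One forest is empty: delete (or insert) every remaining node.
    if not F or not G:
        return size(F) + size(G)
    v_label, v_children = F[-1]   # rightmost root of F
    w_label, w_children = G[-1]   # rightmost root of G
    # Delete v: its children take its place at the right end of F.
    delete_v = 1 + forest_distance(F[:-1] + v_children, G)
    # Delete w: symmetric, on G.
    delete_w = 1 + forest_distance(F, G[:-1] + w_children)
    # Match v and w: solve below the two roots and to their left separately.
    match = (forest_distance(v_children, w_children)
             + forest_distance(F[:-1], G[:-1])
             + (0 if v_label == w_label else 1))
    return min(delete_v, delete_w, match)


def tree_distance(t1, t2):
    return forest_distance((t1,), (t2,))


def leaf(label):
    return (label, ())


t1 = ("a", (leaf("b"), leaf("c")))
t2 = ("a", (leaf("b"), leaf("d")))
t3 = ("a", (("v", (leaf("b"), leaf("c"))),))
```

With these inputs, `tree_distance(t1, t2)` needs one relabel and `tree_distance(t3, t1)` needs one deletion.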
12Time Complexity SZ89
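- A subproblem is relevant if it shows up while computing d(F,G).
- Number of relevant subproblems = time complexity: $O(n^2 m^2) = O(n^4)$.
- More precisely: $O(nm \cdot \min\{\mathrm{Depth}(F), \mathrm{Leaves}(F)\} \cdot \min\{\mathrm{Depth}(G), \mathrm{Leaves}(G)\})$.
[Figure: relevant subforests of F and G, with rightmost roots v and w]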
13Klein98
- Same as the previous algorithm, but recurses on a light child in F.
- Number of relevant subproblems = (number of relevant subforests of F) $\cdot\ m^2$.
- By heavy path decomposition (HT84): $O(n \log n \cdot m^2) = O(n^3 \log n)$.
[Figure: trees F and G]
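The property of heavy path decomposition used here is that any root-to-leaf path crosses at most $\log_2 n$ light edges, because a light child holds at most half of its parent's nodes. The sketch below is an illustration only, assuming trees encoded as nested tuples `(label, children)` and ties between equally large children broken toward the leftmost one. It recomputes subtree sizes for brevity rather than caching them.

```python
def subtree_size(tree):
    _, children = tree
    return 1 + sum(subtree_size(c) for c in children)


def heavy_index(children):
    """Index of the largest child (leftmost on ties)."""
    return max(range(len(children)), key=lambda i: subtree_size(children[i]))


def heavy_path(tree):
    """Labels met when descending from the root into the heavy child each time."""
    path = []
    while True:
        label, children = tree
        path.append(label)
        if not children:
            return path
        tree = children[heavy_index(children)]


def max_light_depth(tree):
    """Largest number of light edges on any root-to-leaf path."""
    _, children = tree
    if not children:
        return 0
    h = heavy_index(children)
    return max(max_light_depth(c) + (0 if i == h else 1)
               for i, c in enumerate(children))


def leaf(label):
    return (label, ())


# Complete binary tree on 7 nodes.
tree = ("r", (("a", (leaf("c"), leaf("d"))),
              ("b", (leaf("e"), leaf("f")))))
path = heavy_path(tree)
light = max_light_depth(tree)
```

On the 7-node tree the heavy path is `r, a, c`, and no root-to-leaf path uses more than two light edges.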
14Decomposition strategy DT03
- For every two subforests (F,G), a strategy says "right" or "left".
- Zhang and Shasha's strategy: always right.
- Klein's strategy: right iff the rightmost tree in F is smaller than the leftmost tree in F.
- Lower bound for strategy algorithms: $\Omega(nm \cdot \log n \cdot \log m)$.
- Any strategy algorithm computes the edit distance between any two subtrees of F and G (without their roots).
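A strategy can be sketched as a function that is consulted at every recursive step. The sketch below assumes unit costs and trees encoded as nested tuples `(label, children)`. The `klein` function implements only the rule as worded on this slide, not the full algorithm of Klein98, and all names are illustrative.

```python
from functools import lru_cache


def size(forest):
    return sum(1 + size(children) for _, children in forest)


def always_right(F, G):
    """Zhang and Shasha's strategy."""
    return "right"


def klein(F, G):
    """Right iff the rightmost tree in F is smaller than the leftmost tree in F."""
    return "right" if size(F[-1:]) < size(F[:1]) else "left"


def make_distance(strategy):
    """Build a memoized forest distance that follows the given strategy."""
    @lru_cache(maxsize=None)
    def d(F, G):
        if not F or not G:
            return size(F) + size(G)
        if strategy(F, G) == "right":
            (v_label, v_kids), (w_label, w_kids) = F[-1], G[-1]
            F_del, G_del = F[:-1] + v_kids, G[:-1] + w_kids
            F_rest, G_rest = F[:-1], G[:-1]
        else:
            (v_label, v_kids), (w_label, w_kids) = F[0], G[0]
            F_del, G_del = v_kids + F[1:], w_kids + G[1:]
            F_rest, G_rest = F[1:], G[1:]
        return min(1 + d(F_del, G),                     # delete v
                   1 + d(F, G_del),                     # delete w
                   d(v_kids, w_kids) + d(F_rest, G_rest)
                   + (0 if v_label == w_label else 1))  # match v and w
    return d


def leaf(label):
    return (label, ())


t1 = ("a", (("b", (leaf("c"), leaf("d"))), leaf("e")))
t2 = ("a", (("b", (leaf("c"),)), ("e", (leaf("f"),))))

d_right = make_distance(always_right)
d_klein = make_distance(klein)
dist_right = d_right((t1,), (t2,))
dist_klein = d_klein((t1,), (t2,))
# Number of distinct (F, G) pairs each strategy actually computed.
relevant_right = d_right.cache_info().currsize
relevant_klein = d_klein.cache_info().currsize
```

Both strategies return the same distance. They differ only in which subforest pairs they visit, which is what `relevant_right` and `relevant_klein` count.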
15Our Results
- An $O(m^2 n (1 + \log\frac{n}{m})) = O(n^3)$ time, $O(nm)$ space algorithm (DMRW ICALP07). (Today: $O((nm)^{3/2}) = O(n^3)$ time and space.)
- A strategy algorithm symmetrically dependent on the two input trees.
- A matching lower bound for all strategy algorithms. (Today: a lower bound of $\Omega(nm^2)$.)
- Local edit distance and affine gap penalties at the cost of one execution (BHLW CPM06). (Today: local RNA edit distance.)
[Figure: two trees of sizes n and m]
16Our Algorithm
- Our algorithm to compute d(F,G):
  - If $|F| < |G|$, compute d(G,F).
  - Recursively run $d(K_i, G)$ for every $K_i$.
  - Run Klein's strategy where the master is F (no need to recurse).
[Figure: trees F and G, with subtrees K1 to K5]
17Analysis
- How large is R(F,G), the number of relevant subproblems encountered when computing d(F,G)?
18An $O((nm)^{3/2}) = O(n^3)$ Upper Bound
- We show that $R(F,G) = O((nm)^{3/2})$. Proof by induction.
- [Equation: chain of inequalities bounding R(F,G), justified step by step]
  - By the inductive assumption
  - By () and ()
  - We know $|G| < |F|$
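24An $O(nm^2(1 + \log\frac{n}{m}))$ Bound
- Proof idea:
  - At most $\log(n/m)$ nested recursive calls where F is master before all trees have size at most m.
  - For all trees of size at most m, use the previous $O(m^3)$ bound. There are at most $n/m$ such trees, so in total $\frac{n}{m} \cdot O(m^3) = O(nm^2)$.
[Figure: tree F of size n with subtrees K1 to K5, and tree G of size m]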
25A Matching Lower Bound for all decomposition
strategy algorithms
- An $\Omega(nm^2)$ lower bound.
- Consider this computational path: if the strategy says left, delete from F; otherwise delete from G.
- For every two internal nodes v in F and w in G we get $\min\{|F_v|, |G_w|\}$ new subproblems ($F_v$ is the tree rooted at v).
- Summing over all such v, w gives the bound.
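The quantity being summed, $\sum_{v,w} \min\{|F_v|, |G_w|\}$ over internal nodes v of F and w of G, can be evaluated directly on small inputs. The sketch below only computes that sum. It assumes trees encoded as nested tuples `(label, children)`, and the function names are illustrative.

```python
def internal_subtree_sizes(tree):
    """Sizes |T_v| for every internal (non-leaf) node v of the tree."""
    sizes = []

    def visit(node):
        _, children = node
        total = 1 + sum(visit(c) for c in children)
        if children:
            sizes.append(total)
        return total

    visit(tree)
    return sizes


def new_subproblem_sum(F, G):
    """Sum of min(|F_v|, |G_w|) over internal nodes v in F and w in G."""
    f_sizes = internal_subtree_sizes(F)
    g_sizes = internal_subtree_sizes(G)
    return sum(min(fv, gw) for fv in f_sizes for gw in g_sizes)


def leaf(label):
    return (label, ())


F = ("a", (("b", (leaf("c"), leaf("d"))), leaf("e")))   # internal nodes: b (3), a (5)
G = ("x", (leaf("y"),))                                 # internal node: x (2)
total = new_subproblem_sum(F, G)
```

Here the sum is `min(3, 2) + min(5, 2)`.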
28A Matching Lower Bound for all decomposition
strategy algorithms
- The matching lower bound follows from a careful counting argument on the trees F and G shown in the figure.
[Figure: trees F and G]
29Thank you!