Title: F96943167 ???
1Special Topics on Graph Algorithms
Finding the Diameter in Real-World Graphs: Experimentally Turning a Lower Bound into an Upper Bound
- F96943167
- F97943070
- R98943086
- R98943090
- R98943088
- R98921072
- R99921040
- R99942061
2Outline
- Introduction (R98943086, R98943090, R98943088)
- Previous Work (R99921040, R99942061, R98943086)
- Finding the Diameter in Real-World Graphs (F96943167, F97943070)
- Other Related Topics
- Conclusion and Future Work (R98921072)
3Diameter
- The length of the "longest shortest path" between any two vertices in a graph or a tree
- Given a connected graph $G = (V, E)$ with $n = |V|$ vertices and $m = |E|$ edges
- The diameter is $D = \max_{u, v \in V} d(u, v)$, where $d(u, v)$ denotes the distance between vertices u and v
[Figure: two edge-weighted examples. A Tree, D = 13. A Graph, D = 9.]
4Diameter of a Tree
- The diameter of a tree can be computed by applying the double-sweep algorithm
  - 1. Choose a random vertex r, run a BFS from r, and find a vertex a farthest from r
  - 2. Run a BFS from a and find a vertex b farthest from a
  - 3. Return $D = d(a, b)$
[Figure: double sweep on the example tree, with vertices r, a and b and the distances computed by the two searches]
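The three steps above can be sketched as follows. This is a minimal illustration, assuming an unweighted graph stored as an adjacency dictionary; the names `bfs_distances`, `double_sweep` and `example_tree` are hypothetical and do not come from the slides. The starting vertex r is passed in explicitly instead of being drawn at random, so the run is reproducible.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def double_sweep(adj, r):
    """Return (a, b, d(a, b)) following the three steps of the slide."""
    dist_r = bfs_distances(adj, r)
    a = max(dist_r, key=dist_r.get)   # step 1: a vertex farthest from r
    dist_a = bfs_distances(adj, a)
    b = max(dist_a, key=dist_a.get)   # step 2: a vertex farthest from a
    return a, b, dist_a[b]            # step 3: d(a, b)


# Hypothetical example tree: the path 0-1-2-3-4 with an extra leaf 5 attached to 2.
example_tree = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
a, b, d = double_sweep(example_tree, r=2)
```

On a tree the returned value is the exact diameter; on a general graph the same call only yields a lower bound.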
5Diameter of a Graph
- The double-sweep algorithm might not correctly compute the diameter of a graph
- It provides a lower bound instead
[Figure: double sweep with vertices r, a and b on the example graph with D = 9]
7Naïve Algorithm
- Perform n breadth-first searches (BFS), one from each vertex, to obtain the distance matrix of the graph
  - $\Theta(n(n + m))$ time and $\Theta(m)$ space
- By using matrix multiplication, the distance matrix can be computed in $O(M(n) \log n)$ time and $\Theta(n^2)$ space [Seidel, ACM STC92]
  - $M(n)$: the complexity of matrix multiplication involving small integers only ($O(n^{2.376})$)
- ⇒ Too slow for massive graphs, with a prohibitive space cost
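The first bullet can be sketched as below: one BFS per vertex, keeping only the largest distance seen rather than the whole matrix. This is an illustrative sketch, assuming a connected, unweighted graph given as an adjacency dictionary; `eccentricity`, `naive_diameter` and `cycle6` are hypothetical names.

```python
from collections import deque


def eccentricity(adj, source):
    """Largest BFS (hop) distance from source; assumes a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def naive_diameter(adj):
    """n BFS runs: Theta(n(n + m)) time, one distance map alive at a time."""
    return max(eccentricity(adj, v) for v in adj)


# Hypothetical example: a cycle on 6 vertices, whose diameter is 3.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

The quadratic running time is what makes this approach impractical for graphs with millions of vertices.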
8All Pairs Shortest Path
- Compute the distances between all pairs of vertices without resorting to matrix products
  - [Feder, ACM STC91]: $\Theta(n^3 / \log n)$ time and $O(n^2)$ space
  - [Chan, ACM-SIAM06]: $O(n^2 (\log\log n)^2 / \log n)$ time and $O(n^2)$ space
- ⇒ Still too slow and space consuming for massive graphs
9All Pairs Almost Shortest Path (1/2)
- Compute almost shortest paths between all pairs of vertices [Dor, ECCC97]
  - Additive error 2
- Treat high-degree vertices and low-degree vertices separately
10All Pairs Almost Shortest Path (2/2)
- Additive error 2: $\mathrm{apasp}_2$
  - $O(\min(n^{3/2} m^{1/2}, n^{7/3}) \log n)$ time and $\Theta(n^2)$ space
- ⇒ Still too expensive
11Self-checking Heuristics
- It is too expensive to obtain the exact value, or an accurate estimate, of the diameter of massive graphs
- ⇒ Empirically establish lower and upper bounds by executing a suitably small number of BFSes
  - $L \le D \le U$
  - The actual value of D for G is obtained when $L = U$
- ⇒ Self-checking heuristics
12Self-checking Heuristics
- No guarantee of success for every feasible input, BUT
  - 1) It requires few BFSes in practice, and thus its complexity is linear [Magnien, JEA09]
  - 2) An empirical upper bound is possible
  - 3) Large graphs can be analyzed, since BFS has a good external-memory implementation [Mayer, AESA02] and works on graphs stored in compressed format [Vigna, IWWWC04]
13A Comparing Work
- Fast Computation of Empirically Tight Bounds for the Diameter of Massive Graphs [Magnien, JEA09]
- Various bounds to confine the solution range
  - Trivial bounds
  - Double sweep lower bound
  - Tree upper bound
- Iterative algorithm to obtain the actual diameter
14Trivial Bounds
- The eccentricity of any vertex v gives trivial bounds on the diameter: $\mathrm{ecc}(v) \le D \le 2\,\mathrm{ecc}(v)$
- Trivial bounds can be computed in $\Theta(m)$ space and time, where m is the number of edges in the graph
- Proof of $D \le 2\,\mathrm{ecc}(v)$
  - Suppose $D > 2\,\mathrm{ecc}(v)$, and let a and b be the endpoints of a diameter path
  - Going from a to b through v gives $d(a, b) \le d(a, v) + d(v, b) \le 2\,\mathrm{ecc}(v)$, which contradicts the assumption
  - Therefore, $D \le 2\,\mathrm{ecc}(v)$
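A single BFS is enough to obtain both trivial bounds. The sketch below assumes a connected, unweighted graph as an adjacency dictionary; `eccentricity`, `trivial_bounds` and `path` are hypothetical names used only for illustration.

```python
from collections import deque


def eccentricity(adj, v):
    """Largest BFS (hop) distance from v; assumes a connected, unweighted graph."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def trivial_bounds(adj, v):
    """Return (lower, upper) = (ecc(v), 2 * ecc(v)) from a single BFS."""
    ecc = eccentricity(adj, v)
    return ecc, 2 * ecc


# Hypothetical example: the path 0-1-2-3-4, whose diameter is 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

Starting from the middle vertex 2 the bounds are (2, 4), so the upper bound is tight; starting from the endpoint 0 they are (4, 8), so the lower bound is tight. Which side is useful depends heavily on the chosen vertex.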
15Double Sweep Lower Bound
- On chordal graphs, AT-free graphs, and trees, if a vertex v is chosen such that $d(u, v) = \mathrm{ecc}(u)$ for a vertex u (i.e. v is among the vertices at maximal distance from u), then $D = \mathrm{ecc}(v)$ [Corneil01, Handler73]
- The diameter may therefore be computed by a BFS from any node u followed by a BFS from a node at maximal distance from u, thus in $\Theta(m)$ space and time, where m is the number of edges
- In general, the value obtained in this way may differ from the diameter, but it is never worse than the trivial lower bound
16Double Sweep Lower Bound An Example
[Figure: example graph with double sweep bound D = 2 and actual diameter D = 4]
17Tree Upper Bound
- The diameter of any connected spanning subgraph of G is larger than or equal to the diameter of G
- The diameter of a tree can be obtained in $\Theta(m)$ time and space [Handler73], where m is the number of edges in G
- Spanning trees of G are therefore good candidates for obtaining an upper bound
- A tree upper bound is the diameter of a BFS tree from a vertex
  - It is never worse than the corresponding trivial upper bound
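A tree upper bound can be sketched by recording BFS parents, rebuilding the BFS tree, and measuring its diameter with a double sweep (which is exact on trees). The graph is assumed to be connected and unweighted; `bfs`, `tree_upper_bound` and `cycle6` are hypothetical names.

```python
from collections import deque


def bfs(adj, source):
    """Return (dist, parent) maps of a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent


def tree_upper_bound(adj, v):
    """Diameter of the BFS tree rooted at v, an upper bound on diam(G)."""
    _, parent = bfs(adj, v)
    tree = {u: [] for u in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    dist_v, _ = bfs(tree, v)
    a = max(dist_v, key=dist_v.get)   # first sweep inside the tree
    dist_a, _ = bfs(tree, a)
    return max(dist_a.values())       # second sweep gives the tree diameter


# Hypothetical example: a cycle on 6 vertices (diameter 3).
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

On `cycle6` the BFS tree from any vertex is a path on 6 vertices, so the bound is 5 although the diameter is 3; on a tree the bound equals the diameter.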
18Tree Upper Bound An Example
[Figure: example graph with tree upper bound D = 5 and actual diameter D = 4]
19Tighten the Bounds
- Iteratively choose different initial vertices to obtain tighter tree upper bounds
- Random tree upper bound (rtub)
  - Iterate the tree upper bound from random vertices
- Highest degree tree upper bound (hdtub)
  - Consider vertices in decreasing order of degree when iterating the algorithm
20The Iterative Algorithm
- Iterate the double sweep lower bound and the highest degree tree upper bound until the difference between the best bounds obtained is lower than or equal to a given threshold value
- Multiple choices for this threshold value
  - It may depend on the graph considered or on the desired quality of the bounds, or it may be set to a given relative precision (e.g. stop when $(\overline{D} - \underline{D}) / \underline{D} < p$, where $\overline{D}$ and $\underline{D}$ are the best upper and lower bounds)
- All heuristics have $\Theta(m)$ time complexity and $\Theta(m + n)$ space complexity
- Does the tree upper bound eventually converge to the exact diameter?
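The iteration can be sketched as follows, under several assumptions that are not in the slides: the lower bound is taken from double sweeps started at pseudo-random vertices (seeded for reproducibility), the stopping rule is the absolute gap `upper - lower <= threshold`, and a hypothetical `max_rounds` cap guarantees termination. The graph is assumed connected and unweighted.

```python
import random
from collections import deque


def bfs(adj, source):
    """Return (dist, parent) maps of a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent


def double_sweep(adj, r):
    """Eccentricity of a vertex farthest from r (exact diameter on trees)."""
    dist_r, _ = bfs(adj, r)
    a = max(dist_r, key=dist_r.get)
    dist_a, _ = bfs(adj, a)
    return max(dist_a.values())


def tree_upper_bound(adj, v):
    """Diameter of the BFS tree rooted at v."""
    _, parent = bfs(adj, v)
    tree = {u: [] for u in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    return double_sweep(tree, v)


def iterate_bounds(adj, threshold=0, max_rounds=10, seed=0):
    """Tighten (lower, upper) until the gap is at most threshold."""
    rng = random.Random(seed)
    vertices = list(adj)
    by_degree = sorted(vertices, key=lambda u: len(adj[u]), reverse=True)
    lower, upper = 0, float("inf")
    for v in by_degree[:max_rounds]:
        lower = max(lower, double_sweep(adj, rng.choice(vertices)))
        upper = min(upper, tree_upper_bound(adj, v))   # hdtub step
        if upper - lower <= threshold:
            break
    return lower, upper
```

On a tree the two bounds meet in the first round. On a cycle they never meet, which illustrates why the question in the last bullet has a negative answer in general.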
21Possibly Unmatching Upper Bound
- No guarantee of obtaining the exact diameter, as all the tree upper bounds may be strictly larger than D
  - E.g. if G is a cycle of n vertices, its diameter is $\lfloor n/2 \rfloor$ and the tree upper bound is $n - 1$ whichever vertex one starts from
- Is there an algorithm that provides better-matching upper bounds?
[Figure: example with tree upper bound D = 5 and actual diameter D = 3]
23The Fringe Algorithm
- The fringe method is used to improve the upper bound U and possibly match the lower bound L obtained by the double sweep method
24The Fringe Algorithm
- An unweighted, undirected and connected graph $G = (V, E)$
- For any vertex $u \in V$
  - $T_u$ denotes an unordered BFS tree rooted at u
  - The eccentricity $\mathrm{ecc}(u)$ is the height of $T_u$
  - ⇒ $2\,\mathrm{ecc}(u) \ge \mathrm{diam}(G)$
25The Fringe Algorithm
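- Proof of $2\,\mathrm{ecc}(u) \ge \mathrm{diam}(G)$, i.e. $\mathrm{ecc}(u) \ge \mathrm{diam}(G)/2$
  - Suppose $\mathrm{ecc}(u) < \mathrm{diam}(G)/2$, and let $\mathrm{diam}(G) = d(a, b)$
  - Then $d(u, v) < \mathrm{diam}(G)/2$ for all $v \in V$
  - In particular $d(u, a) < \mathrm{diam}(G)/2$ and $d(u, b) < \mathrm{diam}(G)/2$
  - ⇒ $d(u, a) + d(u, b) < d(a, b)$
  - Contradiction (triangle inequality)
  - ⇒ $2\,\mathrm{ecc}(u) \ge \mathrm{diam}(G)$
[Figure: a vertex u and a diameter path between a and b]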
26The Fringe Algorithm
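- $T_u$ denotes an unordered BFS tree
- $T_u$ is a subgraph of G
  - Distances in $T_u$ are at least the distances in G, so $\mathrm{diam}(G) \le \mathrm{diam}(T_u)$
[Figure: BFS tree $T_u$ rooted at u, with $\mathrm{diam}(T_u)$ marked]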
27The Fringe Algorithm
- The fringe of u, denoted $F(u)$, is the set of vertices $v \in V$ such that $d(u, v) = \mathrm{ecc}(u)$
[Figure: BFS tree rooted at u with $|F(u)| = 3$]
28The Fringe Algorithm
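- $B(u) = \max\{\mathrm{ecc}(A), \mathrm{ecc}(B), \mathrm{ecc}(C)\}$ for the fringe vertices A, B and C
  - BFS(A) ⇒ ecc(A)
  - BFS(B) ⇒ ecc(B)
  - BFS(C) ⇒ ecc(C)
[Figure: BFS tree rooted at u with fringe vertices A, B and C]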
29The Fringe Algorithm
- The fringe of u, denoted $F(u)$, is the set of vertices $v \in V$ such that $d(u, v) = \mathrm{ecc}(u)$
- $B(u) = \max_{z \in F(u)} \mathrm{ecc}(z)$
30The Fringe Algorithm
- Lemma. $U(u) \ge D$, where D is the diameter of G
31The Fringe Algorithm
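- Case 1: $|F(u)| = 1$ ⇒ $U(u) = \mathrm{diam}(T_u)$
- Case 2: $|F(u)| > 1$, $B(u) = 2\,\mathrm{ecc}(u)$ ⇒ $U(u) = \mathrm{diam}(T_u)$
- Case 3: $|F(u)| > 1$, $B(u) = 2\,\mathrm{ecc}(u) - 1$ ⇒ $U(u) = 2\,\mathrm{ecc}(u) - 1$
- Case 4: $|F(u)| > 1$, $B(u) < 2\,\mathrm{ecc}(u) - 1$ ⇒ $U(u) = 2\,\mathrm{ecc}(u) - 2$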
32The Fringe Algorithm
[Figure: BFS tree rooted at u]
33The Fringe Algorithm
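- Case 2: $|F(u)| > 1$, $B(u) = 2\,\mathrm{ecc}(u)$
  - Example: $\mathrm{ecc}(u) = 3$, $\mathrm{diam}(T_u) = 6$
  - Diameter upper bound = 6
  - $B(u)$ provides a lower bound
  - ⇒ if $B(u) = 2\,\mathrm{ecc}(u)$, then diameter = $\mathrm{diam}(T_u)$
[Figure: BFS tree rooted at u]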
34The Fringe Algorithm
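- Case 3: $|F(u)| > 1$, $B(u) = 2\,\mathrm{ecc}(u) - 1$
  - Non-leaf nodes: upper bound $2\,\mathrm{ecc}(u) - 2$
  - Leaf nodes: upper bound $2\,\mathrm{ecc}(u)$
  - If $B(u) = 2\,\mathrm{ecc}(u) - 1$ ⇒ diameter = $2\,\mathrm{ecc}(u) - 1$
[Figure: BFS tree rooted at u with non-leaf nodes a and b, where $d(a, u) \le \mathrm{ecc}(u) - 1$ and $d(b, u) \le \mathrm{ecc}(u) - 1$]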
35The Fringe Algorithm
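- Case 4: $|F(u)| > 1$, $B(u) < 2\,\mathrm{ecc}(u) - 1$
  - Non-leaf nodes: upper bound $2\,\mathrm{ecc}(u) - 2$
  - Leaf nodes: upper bound $2\,\mathrm{ecc}(u)$
  - If $B(u) < 2\,\mathrm{ecc}(u) - 1$ ⇒ diameter $\le 2\,\mathrm{ecc}(u) - 2$
[Figure: BFS tree rooted at u with non-leaf nodes a and b, where $d(a, u) \le \mathrm{ecc}(u) - 1$ and $d(b, u) \le \mathrm{ecc}(u) - 1$]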
36The Fringe Algorithm
- The fringe algorithm correctly computes an upper bound for the diameter of the input graph G, using at most $|F(u)| + 3$ BFSes
37The Fringe Algorithm
- Let r, a, and b be the vertices identified by double sweep (using two BFSes)
- Find the vertex u that is halfway along the path connecting a and b inside the BFS tree $T_a$
- Compute the BFS tree $T_u$ and its eccentricity $\mathrm{ecc}(u)$
- If $|F(u)| > 1$, find the BFS tree $T_z$ for each $z \in F(u)$ and compute $B(u)$
  - If $B(u) = 2\,\mathrm{ecc}(u) - 1$, return $2\,\mathrm{ecc}(u) - 1$
  - If $B(u) < 2\,\mathrm{ecc}(u) - 1$, return $2\,\mathrm{ecc}(u) - 2$
- Return $\mathrm{diam}(T_u)$
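The steps above can be sketched as follows. This is a reading of the slide, not the authors' C implementation: it assumes a connected, unweighted graph as an adjacency dictionary, takes the start vertex r as a parameter instead of choosing it at random, and returns the pair (lower, upper), where the lower bound is the better of the double sweep value and B(u). All function names are hypothetical.

```python
from collections import deque


def bfs(adj, source):
    """Return (dist, parent) maps of a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                parent[y] = x
                queue.append(y)
    return dist, parent


def tree_diameter(parent):
    """Diameter of the tree described by a BFS parent map (double sweep)."""
    tree = {v: [] for v in parent}
    for child, p in parent.items():
        if p is not None:
            tree[child].append(p)
            tree[p].append(child)
    dist0, _ = bfs(tree, next(iter(tree)))
    far = max(dist0, key=dist0.get)
    dist1, _ = bfs(tree, far)
    return max(dist1.values())


def fringe(adj, r):
    """Return (lower, upper) bounds on diam(G) following the fringe steps."""
    dist_r, _ = bfs(adj, r)
    a = max(dist_r, key=dist_r.get)
    dist_a, parent_a = bfs(adj, a)
    b = max(dist_a, key=dist_a.get)
    lower = dist_a[b]                          # double sweep lower bound
    u = b
    for _ in range(lower // 2):                # walk halfway from b towards a in T_a
        u = parent_a[u]
    dist_u, parent_u = bfs(adj, u)
    ecc_u = max(dist_u.values())
    fringe_set = [v for v, d in dist_u.items() if d == ecc_u]
    if len(fringe_set) > 1:
        big_b = max(max(bfs(adj, z)[0].values()) for z in fringe_set)
        lower = max(lower, big_b)              # B(u) is an eccentricity, hence a lower bound
        if big_b == 2 * ecc_u - 1:
            return lower, 2 * ecc_u - 1
        if big_b < 2 * ecc_u - 1:
            return lower, 2 * ecc_u - 2
    return lower, tree_diameter(parent_u)
```

On a cycle with 5 vertices the two bounds meet at 2. On a cycle with 6 vertices the fringe of u has a single vertex, so the method falls back to the BFS tree diameter and returns (3, 5).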
38Example(1/2)
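- When the number p of vertices x1, ..., xp is large, we choose x1 as r
- Double sweep (DS) from x1: x1→A = 3, x1→B = 3, x1→y1 = 4 ⇒ choose y1 as a
- From y1: y1→A = 4, y1→B = 4, y1→x1 = 4 ⇒ choose A, B, or x1 as b
- DS reports diameter = 4, but the actual diameter = 6 ⇒ wrong!
[Figure: example graph with 3 rows and 6 columns, containing the vertices x1, ..., xp, y1, A and B]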
39Example(2/2)
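- Fringe
  - I. Use DS to find a and b: x1 as a, y1 as b
  - II. Find a vertex u that is halfway along the path connecting a and b
- Case 1
  - III. $\mathrm{ecc}(u) = 4$, $|F(u)| > 1$, $B(u) = 6$
  - IV. $B(u) < 2\,\mathrm{ecc}(u) - 1$, since $6 < (2 \cdot 4) - 1$
  - Return $2\,\mathrm{ecc}(u) - 2$: diameter = 6
- Case 2
  - III. $\mathrm{ecc}(u) = 3$, $|F(u)| > 1$, $B(u) = 6$
  - IV. $B(u) = 2\,\mathrm{ecc}(u)$, since $6 = 2 \cdot 3$
  - Return $2\,\mathrm{ecc}(u)$: diameter = 6
[Figure: the same example graph with 3 rows and 6 columns, containing the vertices x1, ..., xp and y1]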
40A Bad Case for Fringe
[Figure: example graph with the double sweep vertices r and a]
41A Bad Case for Fringe
[Figure: the same graph with the vertices a, b and the halfway vertex u]
42A Bad Case for Fringe
- $\mathrm{ecc}(u) = 3$
- $B(u) = 3$
- $B(u) < 2\,\mathrm{ecc}(u) - 1$ (= 5)
- Return $2\,\mathrm{ecc}(u) - 2$ (= 4)
- Real diameter = 3
- ⇒ Fringe fails!
[Figure: the vertex u and its fringe F(u)]
43Experimental Results (1/2)
- Implemented in C on a 2.93 GHz Linux workstation with 24 GB memory
- 44 real-world graphs are tested
  - each with 4000 to 50 million nodes and 20000 to 3000 million edges
- The real diameter is found by exhaustive search to check the obtained upper bounds

Results (44 graphs in total)
Approach   Matches   Failures
fub        37        7
mtub       13        31
hdtub      10        34
rtub       7         37
44Experimental Results (2/2)
- On the 7 mismatches, the proposed method (fub) still generates an upper bound at least as tight as those of the approaches in previous work

Benchmark  D      fub    mtub   hdtub  rtub
CAH2       18     20     20     20     20
CITP       26     28     30     29     31
DBLP       22     24     24     24     25
P2PG       11     14     15     14     15
ROA1       865    987    987    1047   988
ROA2       794    803    803    873    832
ROA3       1064   1079   1079   1166   1128
46Finding the Diameter on Weighted Graphs
- Consider a large complete graph in which every edge has weight 1 except for one edge
- The eccentricities of most vertices are 1
- However, the diameter of the graph is larger than 1
- The fringe algorithm may not efficiently find tight diameter bounds for weighted graphs
[Figure: complete graph with edge weights 1 and a single edge of weight 1.5]
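The example can be checked numerically with a short sketch. It assumes the exceptional edge has weight 1.5, as in the figure, and uses Dijkstra's algorithm in place of BFS because the graph is weighted; `dijkstra` and `complete_graph_one_heavy_edge` are hypothetical names.

```python
import heapq
from itertools import combinations


def dijkstra(adj, source):
    """Shortest weighted distances from source (non-negative weights)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def complete_graph_one_heavy_edge(n, heavy=1.5):
    """Complete graph on n vertices; edge (0, 1) has weight heavy, all others 1."""
    adj = {v: [] for v in range(n)}
    for u, v in combinations(range(n), 2):
        w = heavy if (u, v) == (0, 1) else 1.0
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


graph = complete_graph_one_heavy_edge(6)
ecc = {v: max(dijkstra(graph, v).values()) for v in graph}
diameter = max(ecc.values())
```

Only the two endpoints of the heavy edge have eccentricity 1.5; every other vertex has eccentricity 1, so a search started from a typical vertex gives no hint that the diameter exceeds 1.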
47Minimum Diameter Spanning Trees
- Minimum diameter spanning tree (MDST) problem
  - Given a graph $G = (V, E)$ with edge weights
  - Find a spanning tree T of G such that the diameter of T is minimized
[Figure: weighted example graph with two spanning trees, Diameter = 3 and Diameter = 5; MDST]
48Outline
- Introduction (R98943086, R98943090, R98943088)
- Previous Work (R99921040, R99942061, R98943086)
- Finding the Diameter in Real-World Graphs
- Other Related Topics
  - Geometric MDST (F97943070)
  - MDST (F96943167)
- Conclusion and Future Work (R98921072)
49Geometric MDST
- Geometric MDST (GMDST)
  - Given a set of n points in the Euclidean space, find a spanning tree connecting these points so that the length of its diameter is minimum
  - GMDST corresponds to finding an MDST on a complete graph whose edge weights are the Euclidean distances between the points
50Monopolar and Dipolar
- A spanning tree is said to be monopolar if there exists a point (called the monopole) s.t. all remaining points are connected to it
- A spanning tree is said to be dipolar if there exist two points (called the dipole) s.t. all remaining points are connected to one of the two points in the dipole
[Figure: a monopolar spanning tree with its monopole, and a dipolar spanning tree with its dipole]
51GMDST with a Simple Topology
- Theorem
  - There exists a GMDST of a set S of n points which is either monopolar ($n \ge 3$) or dipolar ($n \ge 4$)
[Figure: all monopolar spanning trees of the 4 points, and 4 dipolar spanning trees of the 4 points]
52Center Edge
- An edge (ai-1,ai) is a center edge of a path
P(a0,a1,,ak) if - is minimized
[Figure: path $a_0, a_1, \ldots, a_{i-1}, a_i, \ldots, a_{k-1}, a_k$ with $\mathrm{dist}(a_0, a_{i-1})$ and $\mathrm{dist}(a_i, a_k)$ marked]
53Lemma
- Lemma
- Let (ai-1,ai) be a center edge of a path
P(a0,a1,,ak), then - (1) and
- (2)
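[Figure: path $a_0, a_1, \ldots, a_{i-2}, a_{i-1}, a_i, \ldots, a_{k-1}, a_k$ with edges $e_{i-2}$ and $e_{i-1}$, the distances $\mathrm{dist}(a_0, a_{i-1})$ and $\mathrm{dist}(a_{i-1}, a_k)$, and the lengths A and B]
Otherwise, the center edge is not $(a_{i-1}, a_i)$
If $A > B$: $\max\{A, B - e_{i-1}\} > \max\{A - e_{i-2}, B\}$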
54Proof of the Theorem
- Theorem: There exists a GMDST of a set S of n points which is either monopolar or dipolar
- Proof
  - Case 1: Given any optimal GMDST T whose diameter is composed of only two edges, i.e., $D(T) = (a_0, a_1, a_2)$ of size $D_T$, a monopolar spanning tree T' can be constructed with the same diameter
[Figure: the optimal tree T and the monopolar tree T' on the points $a_0$, $a_1$, $a_2$, u and v]
55Proof of the Theorem (contd)
- Case 2: Given any optimal GMDST T with diameter $D(T) = (a_0, a_1, \ldots, a_k)$ of size $D_T$, $k \ge 3$, a dipolar spanning tree T' can be constructed with the same diameter
  - Let $(a_{i-1}, a_i)$ be the center edge of $D(T)$
  - Connect all points in the subtree $T_{i-1}$ to $a_{i-1}$, and connect all points in the subtree $T_i$ to $a_i$
[Figure: the optimal tree T with subtrees $T_{i-1}$ and $T_i$ on either side of the center edge $(a_{i-1}, a_i)$, and the dipolar tree T' on the same points $a_0, a_1, \ldots, a_k$]
56Proof of the Theorem (contd)
- For any point pair u and v, if the two points are in different subtrees, their distance in T' is at most $D_T$
- If u and v are in the same subtree
[Figure: the optimal tree T with subtrees $T_{i-1}$ and $T_i$, and the dipolar tree T' on the points $a_0, a_1, \ldots, a_k$ and v]
57Finding a Geometric MDST
- Theorem: There exists a GMDST of a set S of n points which is either monopolar or dipolar
- By enumerating all monopolar and dipolar spanning trees of a set of given points, an optimal GMDST can be found
- The enumeration process can be done in $\Theta(n^3)$
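As a partial illustration, the monopolar half of the enumeration can be sketched as below. The dipolar candidates, which dominate the cubic cost, are deliberately left out, so this is not the full algorithm. The sketch assumes distinct points given as coordinate tuples; `best_monopolar_tree` is a hypothetical name.

```python
import math


def best_monopolar_tree(points):
    """Return (diameter, monopole) of the best monopolar spanning tree."""
    best = (math.inf, None)
    for p in points:
        # Distances from the candidate monopole p to all other points, largest first.
        radii = sorted((math.dist(p, q) for q in points if q != p), reverse=True)
        # The longest path in a star runs leaf -> p -> leaf through the two farthest leaves.
        diameter = radii[0] + (radii[1] if len(radii) > 1 else 0.0)
        best = min(best, (diameter, p))
    return best


# Hypothetical example: a center point with three points at unit distance.
points = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0)]
```

For these four points the best monopole is the origin, giving a diameter of 2. A complete search would compare this value with the best dipolar tree over all candidate dipoles.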
59Introduction
- [Ho et al., SIAMJ91] solved the geometric MDST in $O(n^3)$
- Actually, the general MDST problem is identical to the absolute 1-center problem (A1CP)
- The absolute 1-center problem
  - Let $F(x) = \max_{v \in V} d(x, v)$ for a point x of G (a vertex or an interior point of an edge)
  - Find x such that $F(x)$ is minimized
60Equivalence of A1CP and MDST
- SPT(x), the shortest path tree rooted at x, is a minimum diameter spanning tree
- x is the absolute 1-center (over the continuum set)
61The Proof of Equivalence
- Proof idea
  - Consider the metric-space solution SPT(y) over the continuum set
  - The diameter of SPT(x) equals that of SPT(y)
  - As SPT(y) is minimum, the MDST is solved
- Continuum set
  - Let the graph be rectifiable
  - Refer to interior points on an edge by their distances from the two end nodes
[Figure: weighted example graph]
62Property of Continuum Set
- For a given tree T, the diameter $D(T)$ equals $2 \cdot F_T(y)$
  - y is the absolute 1-center of T
63Proof
- Assume that z is the absolute 1-center of G
- By following the property of continuum set,
- Since the tree is the shortest path tree rooted at z,
- Since z is the absolute 1-center of G,
- For any tree Ti rooted at u,
- It implies that, for any spanning tree Tj ,
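One way to read these steps as a single chain of inequalities is sketched below. The notation is assumed, not taken from the slide: $F_H(x) = \max_v d_H(x, v)$ for a graph or tree H, $T^* = \mathrm{SPT}(z)$, $y^*$ is the absolute 1-center of $T^*$, and $y_j$ is the absolute 1-center of an arbitrary spanning tree $T_j$.

```latex
% Assumed notation: F_H(x) = max_v d_H(x, v); T^* = SPT(z);
% y^* and y_j are the absolute 1-centers of T^* and T_j.
\begin{align*}
D(T^*) &= 2\,F_{T^*}(y^*) && \text{(property of the continuum set)} \\
       &\le 2\,F_{T^*}(z) && \text{($y^*$ minimizes $F_{T^*}$)} \\
       &= 2\,F_G(z)       && \text{($T^*$ is a shortest path tree rooted at $z$)} \\
       &\le 2\,F_G(y_j)   && \text{($z$ is the absolute 1-center of $G$)} \\
       &\le 2\,F_{T_j}(y_j) && \text{(distances in $T_j$ are at least those in $G$)} \\
       &= D(T_j).
\end{align*}
```

Under these assumptions, no spanning tree has a smaller diameter than the shortest path tree rooted at the absolute 1-center.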
64Conclusion
- The concepts of monopolar and dipolar trees [Ho et al., SIAMJ91] agree exactly with the proved result
- By using all-pairs shortest distances [Fredman and Tarjan, JACM87], the A1CP can be solved in $O(mn + n^2 \log n)$
[Figure: a monopole, a dipole, and the absolute 1-center]
66Conclusion
- In today's presentation, we have
  - Introduced the difference between finding the diameter on a tree and finding the diameter on a general graph
  - Given some naïve algorithms for finding the diameter on a graph
  - Presented the double sweep algorithm introduced in the previous work
  - Presented the fringe algorithm, which extends the double sweep algorithm
  - Compared the double sweep algorithm and the fringe algorithm
67Conclusion (contd)
- Besides, we have further
  - Identified the difference between finding the diameter on an unweighted graph and finding the diameter on a weighted graph
  - Presented two algorithms that find minimum diameter spanning trees on weighted graphs
68Future Work
- Another interesting related topic is the design methodology for directed graphs with minimum diameter
  - "Design to minimize diameter on building-block network", Makoto Imase and Masaki Itoh
  - "A design for directed graphs with minimum diameter", Makoto Imase and Masaki Itoh
- Given the nodes and the upper bounds on the in- and out-degree, design a directed graph s.t. the diameter is minimized
[Figure: example with n = 9, d = 2]
69Future Work (contd)
- How to find the diameter (or tight upper and lower bounds on it) of a weighted graph is still an open problem
[Figure: two weighted example graphs, with Diameter = 1.5 and Diameter = 9]