Title: A survey of external graph algorithms
1. A survey of external graph algorithms
- Introduction
- Locality
- External graph algorithm
- Current best results
- Sparsification
- Simulating PRAM
- BFS, Minimum Spanning Forest
- Point-to-point shortest path
3. Does time complexity measure the running time?
- Problem: loading a graph into adjacency lists
- Input: a list of edges sorted by Link_From
- In the left version, we add each edge to the linked list of FROM
- In the right version, we add each edge to the linked list of TO
- Left: while (input not exhausted) { read an edge and allocate mem.; ptr->node = edge->to; ptr->next = list[edge->from]; list[edge->from] = ptr; }
- Right: while (input not exhausted) { read an edge and allocate mem.; ptr->node = edge->from; ptr->next = list[edge->to]; list[edge->to] = ptr; }
The difference in running time is 20:50
4. Another test
- Data access in a large array
- Random vs Sequential
- 100M integers (400M bytes), no virtual memory
- Performance ratio: 400:4000
- 300M integers (1.2G bytes), virtual memory
- Performance ratio: very large
- Program running time does not depend only on computational time complexity
5. What's the problem?
- To load a graph into memory
- edges stored in a file
- Sorted adjacency list
- How long does it take for a graph with 30M edges?
- Sorted: 64 secs
- Unsorted: > 10 hrs!!
- What's the time complexity?
- Linear time (bounded degree)?
- For massive data, CPU time is not the major cost
The time to load a graph (#nodes = #edges/15)
6. Memory hierarchy
[Figure: memory hierarchy and access times (e.g., 10 ns), from Vitter 2007]
- Locality, working set
- Caching, prefetching
- System level (general) solutions
- External memory (or EM) algorithms/data structures
- The ones that explicitly manage data placement and movement.
- A.k.a. I/O algorithms or out-of-core algorithms.
- The I/O complexity
- The number of communications between the internal memory and the external memory.
- Bottleneck of the performance of EM algorithms
8. The reason for our tests
- The computer system supports virtual memory
- Programmers can use very large memory as if it were internal memory
- Severe performance problems will be encountered if data is not arranged well
- For both cache memory and external memory
9. A nice survey paper
- External Memory Algorithms and Data Structures: Dealing with Massive Data,
- ACM Computing Surveys, Vol. 33, No. 2, June 2001,
- February 2007 revision. Available online at
- Scott Vitter,
- Dean of Science
- Purdue University
10. Parallel disk model
11. Problem parameters
- Usually, we focus on the case P = 1 and regard DB as the block size.
- For graph problems, N = V + E.
12. Fundamental operations
Note that the constant factor is somewhat more important
when discussing I/O complexity.
13.
- In practice, sorting can be done in an almost linear number of I/Os
- log_m n is almost constant: for N = 1T, M = 1G, B = 10K => n = 100M, m = 100K => log_m n < 2
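The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the usual external-memory shorthand n = N/B and m = M/B (sizes measured in blocks) and decimal units (1T = 10**12, 1G = 10**9, 10K = 10**4); the function name `sort_passes` is made up for illustration.

```python
import math

def sort_passes(N, M, B):
    """Return n = N/B, m = M/B and log_m(n) for the given sizes."""
    n = N / B          # problem size in blocks
    m = M / B          # internal memory size in blocks
    return n, m, math.log(n) / math.log(m)

# Assumed decimal units: 1T = 10**12, 1G = 10**9, 10K = 10**4 items.
n, m, passes = sort_passes(N=10**12, M=10**9, B=10**4)
print(n, m, round(passes, 2))
```

With these values log_m n = 8/5, so the logarithmic factor in Sort(N) stays below 2.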
14. Current best results
15. Current best results (cont.)
16. Current best results (cont.)
Sparsification--A technique for speeding up dynamic graph algorithms. D. Eppstein, Z. Galil, G. F. Italiano, and A. Nissenzweig. FOCS, 1992, pp. 60-69. J. ACM 44(5):669-696, 1997.
Giuseppe F. Italiano, U. Roma "Tor Vergata"
Zvi Galil, Columbia U.; President, Tel-Aviv U.
David Eppstein, UC Irvine
Research is what I'm doing when I don't know
what I'm doing.
18.
- (often used to) convert I/O bounds
- from O(Sort(E)) to O((E/V) Sort(V))
- speedup: log E / log V
19. An illustration of sparsification: MSF
- L. Arge's algorithm: O((log log(V/e)) Sort(E))
- Improve by sparsification
- Partition the graph into E/V sparse subgraphs, each with V edges on the V vertices
- Apply L. Arge's algorithm to each subgraph
- Merge the E/V forests, two at a time, in a balanced binary merging procedure by repeatedly applying L. Arge's algorithm.
20. For all levels, total: O((log log B)(E/V) Sort(V))
- Each subgraph: (log log(V/e)) Sort(E) = (log log B) Sort(V), since each has only V edges
- 1st level: total (log log B)(E/V) Sort(V)
- 2nd level: only E/2 edges left in total
21. Why sparsification works
- After each merge, only O(V) data is needed
- The same approach can be applied to connectivity, biconnectivity, and maximal matching.
- For example, for biconnected components,
- the merging process replaces each biconnected component by a cycle.
- The resulting graph has O(V) size and contains the necessary information.
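The partition-and-merge idea for MSF can be sketched in memory as follows. This is only an illustration under stated assumptions: an in-memory Kruskal routine (`msf`, a stand-in chosen here) replaces the external-memory MSF algorithm, edges are `(weight, u, v)` tuples, and the helper names are hypothetical. It shows why each merge works on O(V) data: the forest of a merged pair is computed from two forests of fewer than V edges each.

```python
def msf(num_vertices, edges):
    """Kruskal with union-find; stands in for the external MSF algorithm."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest

def msf_by_sparsification(num_vertices, edges):
    size = max(1, num_vertices)
    # Partition into about E/V subgraphs with at most V edges each.
    forests = [msf(num_vertices, edges[i:i + size])
               for i in range(0, len(edges), size)]
    # Merge the forests two at a time, level by level.
    while len(forests) > 1:
        merged = []
        for i in range(0, len(forests), 2):
            pair = forests[i] + (forests[i + 1] if i + 1 < len(forests) else [])
            merged.append(msf(num_vertices, pair))
        forests = merged
    return forests[0] if forests else []
```

For example, `msf_by_sparsification(4, edges)` on a 4-vertex graph returns the same forest as `msf(4, edges)` when edge weights are distinct.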
22. Simulation of parallel algorithms
Y.-J. Chiang, M. T. Goodrich, E. F. Grove, R.
Tamassia, D. E. Vengroff, and J. S. Vitter.
External-memory graph algorithms. SODA 1995,
Yi-Jen Chiang, Polytechnic U.; BS, NTU, 1986; PhD,
Brown University, 1995
T.H. Cormen, Virtual memory for data parallel
computing, PhD Thesis, MIT 1992
23. An illustration: list ranking
After finding the ranks, we can rearrange the
nodes by sorting
24. A parallel algorithm
Find an independent set
Merge with successors
25. Simulating the PRAM algorithm
- In each phase
- Find the independent set: O(Sort(N))
- Several methods
- Take O(Sort(N)) I/Os to combine the nodes
- If the independent set has size at least cN, the data size is decreased by a constant factor.
- The total cost is also O(Sort(N)) I/Os
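The phases above can be illustrated with an in-memory sketch of list ranking by independent-set contraction. Several assumptions are made here: the independent set is found by random coin flips (one of the "several methods"; the slide does not say which is used), Python dictionaries stand in for the sort-based steps of the external-memory version, the input is a single linked list, and the rank of a node is its distance to the tail. The name `list_rank` is hypothetical.

```python
import random

def list_rank(succ, seed=0):
    """succ maps node -> successor (None at the tail).
    Returns node -> distance to the tail."""
    rng = random.Random(seed)
    succ = dict(succ)
    weight = {v: (0 if s is None else 1) for v, s in succ.items()}
    removed = []                      # (node, successor, weight) in removal order
    while len(succ) > 1:
        pred = {s: v for v, s in succ.items() if s is not None}
        coin = {v: rng.random() < 0.5 for v in succ}
        # Independent set: heads whose predecessor (if any) is tails.
        # The tail itself is never removed.
        chosen = [v for v in succ
                  if coin[v] and succ[v] is not None
                  and (v not in pred or not coin[pred[v]])]
        for v in chosen:
            removed.append((v, succ[v], weight[v]))
            if v in pred:             # bridge over v and add up the distances
                p = pred[v]
                succ[p] = succ[v]
                weight[p] += weight[v]
            del succ[v], weight[v]
    rank = {v: 0 for v in succ}       # the only node left is the tail
    for v, s, w in reversed(removed): # reinsert in reverse removal order
        rank[v] = w + rank[s]
    return rank
```

With coin flips, a constant fraction of the nodes is expected to be removed in each phase, which is what makes the total cost a geometric sum.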
26. Duplicate Elimination in a Multiset, BFS
- K. Munagala and A. Ranade,
- I/O-complexity of graph algorithms.
- SODA 1999
27. Duplicate Elimination
- Input: N integers in [1, P].
- Output: an array C[1..P], C[i] = 1 if i exists.
- LB: Ω((N/P) Sort(P))
- UB (algorithm):
- Divide the N input records into N/P groups of P records: O(Scan(N))
- Sort the records within each group and construct a vector of size P: O((N/P) Sort(P))
- Merge the vectors (OR-operation): O(Scan(N))
- Scan(N) = N/B; (N/P) Sort(P) = (N/P)(P/B) log_m(P/B)
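The three steps of the upper-bound algorithm can be sketched in memory as follows. This is a minimal illustration, not an external-memory implementation: list slices stand in for groups read from disk, and the function name `duplicate_elimination` is made up for this example.

```python
def duplicate_elimination(records, P):
    """records: integers in [1, P].
    Returns C with C[i] = 1 iff i occurs (index 0 is unused)."""
    C = [0] * (P + 1)
    for start in range(0, len(records), P):       # N/P groups of P records
        group = sorted(records[start:start + P])  # sort within the group
        vector = [0] * (P + 1)                    # presence vector of size P
        for x in group:
            vector[x] = 1
        C = [a | b for a, b in zip(C, vector)]    # merge the vectors by OR
    return C

print(duplicate_elimination([3, 1, 3, 5, 1, 2, 5, 5], P=5))
```

Here the input is split into two groups, `[3, 1, 3, 5, 1]` and `[2, 5, 5]`, and only the value 4 is reported as absent.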
28. Connected components via BFS
- Input: unordered edge list
- Output: a list L[1..n], where L[i] is the smallest vertex that vertex i is connected to.
- Sort the edges into an ordered adjacency list and get the degree and the pointer to the first edge of each vertex: O((E/V) Sort(V))
- Construct Front(t) for t = 1, 2, ...
- Construct Nbr(Front(t-1))
- O(Scan(E) + V) in total; the term V accounts for possible round-off
- Remove duplicates: O((E/V) Sort(V) + V) in total
- Eliminate from it those in Front(t-1) or Front(t-2): O(Scan(E) + V) in total
- The algorithm is optimal for dense graphs (E/V > B).
- If the graph is sparse, we can use a preprocessing algorithm to group vertices into super nodes.
- The total I/O complexity is at most an additional factor log log(VB/E) over the optimal.
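The construction of Front(t) described above can be sketched in memory like this. It assumes an undirected graph given as a dictionary of adjacency lists; Python sets and `sorted` stand in for the sort-and-scan steps, and the names `mr_bfs` and `components` are hypothetical. The key point is that a new level only has to be checked against the two previous levels, not against a global "visited" table.

```python
def mr_bfs(adj, source):
    """adj: dict vertex -> list of neighbours (undirected graph).
    Returns the BFS levels Front(0), Front(1), ... as sorted lists."""
    levels = [[source]]
    prev2, prev1 = set(), {source}
    while True:
        # Nbr(Front(t-1)): a multiset of neighbours of the previous level.
        nbr = [w for v in levels[-1] for w in adj[v]]
        nbr = sorted(set(nbr))                      # remove duplicates
        # Eliminate vertices already in Front(t-1) or Front(t-2).
        front = [w for w in nbr if w not in prev1 and w not in prev2]
        if not front:
            return levels
        levels.append(front)
        prev2, prev1 = prev1, set(front)

def components(adj):
    """Label each vertex with the smallest vertex it is connected to."""
    label = {}
    for s in sorted(adj):
        if s not in label:
            for level in mr_bfs(adj, s):
                for v in level:
                    label[v] = s
    return label
```

Because sources are tried in increasing order, the source of each BFS is the smallest vertex of its component, which gives the output format on the slide.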
30. An external algorithm for MST
- L. Arge, G. S. Brodal, and L. Toma.
- On external-memory MST, SSSP, and multi-way planar graph separation.
- Journal of Algorithms, 2004.
31. Prim's algorithm
A priority queue for vertices not in T. Priority of x: d(x, T),
the minimum distance from x to any vertex in T.
32. The steps in each iteration
- Extract-min u from the priority queue
- Insert u into T
- Relax: d(T,v) = min{d(T,v), d(T,u) + w(u,v)} for all v not in T, where d(T,u) = 0 once u is in T
- Problem in EM: how to avoid one I/O per relaxation
33. L. Arge's solution
- The priority queue Q is not for vertices but for all edges with one or both endpoints in T
- When a vertex v is selected, insert all edges incident to v.
- How to check whether both u and v are already in T when extracting an edge (u,v) from Q?
- If so, (u,v) must appear in Q twice.
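The duplicate-edge test can be sketched with an in-memory heap. The sketch assumes distinct edge weights, so that the two copies of an edge with both endpoints in T come out of the queue one right after the other, and it assumes that the edge just used to reach a vertex is not reinserted. `heapq` stands in for an external priority queue and the function name is hypothetical. Note that no "is v in T?" lookup is ever performed.

```python
import heapq

def prim_edge_queue(adj, source):
    """adj: dict u -> list of (weight, v); undirected, distinct weights.
    Returns the MST edges of the component of source as (weight, u, v)."""
    mst = []
    queue = [(w, source, v) for w, v in adj[source]]
    heapq.heapify(queue)
    while queue:
        w, u, v = heapq.heappop(queue)
        if queue and queue[0][0] == w:
            # Same weight on top again: the edge was inserted from both
            # endpoints, so both are already in T. Discard both copies.
            heapq.heappop(queue)
            continue
        mst.append((w, u, v))            # v is the new vertex of T
        for w2, x in adj[v]:
            if w2 != w:                  # skip the tree edge just used
                heapq.heappush(queue, (w2, v, x))
    return mst
```

On a triangle with a pendant vertex, the heaviest triangle edge is inserted twice and dropped by the duplicate test, while the other edges are accepted.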
34. I/O complexity
- V + (E/B) I/Os to read the adjacency lists
- O(E) inserts and deletes on the queue
- amortized O((1/B) log_{M/B}(N/B)) I/Os per operation
- O(V + (E/B) + (E/B) log_{M/B}(N/B)) = O(V + Sort(E))
- Sort(N) = O((N/B) log_{M/B}(N/B))
- Comments:
- Laziness: more computation and less I/O
35. MST vertex reduction
- The complexity O(V + Sort(E)) can be further improved for sparse graphs
- Vertex reduction in each phase
- Choose a shortest edge for each vertex and make it an MST edge
- Also contract the two endpoints into one super-vertex
- After O(log(VB/E)) phases, the above method gives an O(Sort(E) log(VB/E)) algorithm.
- Further improved to O(Sort(E) log log(VB/E))
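One vertex-reduction phase can be sketched in memory as follows. This assumes distinct edge weights (so the chosen edges cannot form a cycle), uses a small union-find in place of the external-memory contraction step, and the name `reduction_phase` is made up for illustration.

```python
def reduction_phase(vertices, edges):
    """One phase: pick the lightest edge at every vertex, then contract.
    edges: list of (weight, u, v) with distinct weights.
    Returns (chosen MST edges, super-vertices, contracted edge list)."""
    best = {}
    for e in edges:
        _, u, v = e
        for x in (u, v):
            if x not in best or e < best[x]:
                best[x] = e                  # lightest edge seen at x
    chosen = set(best.values())

    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in chosen:                   # contract the chosen edges
        parent[find(u)] = find(v)
    contracted = [(w, find(u), find(v))
                  for w, u, v in edges if find(u) != find(v)]
    supers = {find(v) for v in vertices}
    return sorted(chosen), supers, contracted
```

Every non-isolated vertex is merged with at least one neighbour, so each phase at least halves the number of such vertices.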
36. Computing Point-to-Point Shortest Paths from External Memory
- A. V. Goldberg and R. Werneck, ALENEX '05, 2005
Andrew Goldberg, Microsoft Research; Ph.D. in
Computer Science, MIT, 1987
37. Shortest path problems
- Single source and point-to-point (P2P)
- Worst case O(m + n log n) computational time
- not our only concern for the P2P problem
- Dijkstra's algorithm
- Relaxation: for each arc (v,w), if d(w) > d(v) + l(v,w), set d(w) = d(v) + l(v,w).
- Start from the source and stop when the sink is reached.
38. Bidirectional algorithm
- Goldberg's data
- 1.6M vertices, 3.8M arcs, travel time metric.
39. A* algorithm
- Similar to Dijkstra's algorithm, but
- Domain-specific estimates p_t(v) on dist(v, t).
- At each step, pick a labeled vertex with the minimum k(v) = d(v) + p_t(v).
- Best estimate of the path length through v.
- In general, optimality is not guaranteed.
- Finds an optimal path if p_t(v) is an under-estimate.
- To use A*, we need a lower-bound function p_t(v) on dist(v, t).
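The selection rule k(v) = d(v) + p_t(v) can be sketched as follows. This is a minimal in-memory illustration, assuming that the estimate `p` is a nonnegative lower bound on dist(v, t) and that vertices may be re-expanded when their label improves; the graph encoding and the function name `a_star` are hypothetical. With `p` identically zero the search reduces to Dijkstra's algorithm.

```python
import heapq

def a_star(adj, s, t, p):
    """adj: dict v -> list of (w, length); p(v): lower bound on dist(v, t).
    Returns the s-t distance, or None if t is unreachable."""
    d = {s: 0}
    heap = [(p(s), s)]                 # keyed by k(v) = d(v) + p(v)
    while heap:
        k, v = heapq.heappop(heap)
        if k > d[v] + p(v):
            continue                   # stale entry: d(v) improved later
        if v == t:
            return d[v]
        for w, length in adj[v]:
            if w not in d or d[w] > d[v] + length:   # relaxation
                d[w] = d[v] + length
                heapq.heappush(heap, (d[w] + p(w), w))
    return None
```

A call such as `a_star(adj, "s", "t", lambda v: 0)` runs the plain Dijkstra variant; passing a tighter lower bound only changes the order in which vertices are scanned.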
40. ALT algorithms
- Use A* search and landmark-based lower bounds.
- Landmarks a and b in the graph:
- dist(v,w) ≥ dist(v,b) - dist(w,b)
- dist(v,w) ≥ dist(a,w) - dist(a,v).
- We choose several landmarks
- At preprocessing, compute dist(a,v)
- Select the largest LB
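The landmark bounds can be sketched as follows, assuming an undirected graph so that dist(v, L) = dist(L, v) and both triangle inequalities collapse into |dist(L, v) - dist(L, w)| for a landmark L. The preprocessing runs one Dijkstra search per landmark; the function names and the graph encoding are made up for illustration.

```python
import heapq

def dijkstra(adj, source):
    """adj: dict v -> list of (w, length). Returns dist from source."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        dv, v = heapq.heappop(heap)
        if dv > dist[v]:
            continue                   # stale entry
        for w, length in adj[v]:
            if w not in dist or dist[w] > dv + length:
                dist[w] = dv + length
                heapq.heappush(heap, (dist[w], w))
    return dist

def alt_lower_bound(landmark_dist, v, w):
    """Largest landmark bound on dist(v, w); undirected graph assumed."""
    return max(abs(dist[v] - dist[w]) for dist in landmark_dist.values())

# A path 0 - 1 - 2 - 3 with lengths 1, 2, 3 and landmarks 0 and 2.
adj = {0: [(1, 1)], 1: [(0, 1), (2, 2)], 2: [(1, 2), (3, 3)], 3: [(2, 3)]}
landmark_dist = {L: dijkstra(adj, L) for L in (0, 2)}
print(alt_lower_bound(landmark_dist, 1, 3))
```

In this example landmark 0 gives the bound 5 and landmark 2 gives only 1, so the largest bound is the one that gets used.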
41. How to choose the landmarks
- dist(v,w) ≥ dist(a,w) - dist(a,v).
- Equality holds when v is on a shortest path from the landmark a to w
Bidirectional ALT Example
42. Reaches [Gutman 04]
- Consider a vertex v that splits a path P into P1 and P2.
- r_P(v) = min(l(P1), l(P2)).
- r(v) = max_P(r_P(v)) over all shortest paths P through v.
- Using reaches to prune Dijkstra:
- If r(w) < min(d(v) + l(v,w), LB(w, t)), then prune w.
- We don't like many nodes with large reach
- Add shortcuts; break ties by # of hops.
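The definition of reach can be checked on small graphs with a brute-force sketch. This is not how reaches are computed in practice: it assumes a small undirected graph with integer lengths, uses all-pairs shortest paths (Floyd-Warshall), and treats v as lying on a shortest s-t path exactly when d(s,v) + d(v,t) = d(s,t). The names `reaches` and `can_prune` are hypothetical.

```python
def reaches(n, edges):
    """n vertices 0..n-1; edges: list of (u, v, length), undirected.
    Returns (r, d): reach of every vertex and all-pairs distances."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, length in edges:
        d[u][v] = d[v][u] = min(d[u][v], length)
    for k in range(n):                       # Floyd-Warshall
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    r = [0] * n
    for s in range(n):
        for t in range(n):
            if d[s][t] == INF:
                continue
            for v in range(n):
                if d[s][v] + d[v][t] == d[s][t]:   # v on a shortest s-t path
                    r[v] = max(r[v], min(d[s][v], d[v][t]))
    return r, d

def can_prune(r_w, d_v, l_vw, lb_wt):
    """Pruning test from the slide: r(w) < min(d(v) + l(v,w), LB(w,t))."""
    return r_w < min(d_v + l_vw, lb_wt)

r, _ = reaches(5, [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1)])
print(r)
```

On a path of five vertices with unit lengths the middle vertex has the largest reach and the endpoints have reach 0.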