IOComplexity of Graph Algorithms - PowerPoint PPT Presentation

About This Presentation
Title:

IOComplexity of Graph Algorithms

Description:

Global disk address space is striped across the disks: ... Connected components in bipartite graphs. Biconnected components labeling. Minimum spanning trees ... – PowerPoint PPT presentation

Slides: 21
Provided by: lir49

Transcript and Presenter's Notes



1
I/O-Complexity of Graph Algorithms
Kameshwar Munagala, Abhiram Ranade. Presenter: Li Rui
2
Outline
  • Parallel disk model
  • Lower bound of graph problem (edge list version)
  • Connected components, etc.
  • Computational Model for lower bound
  • P-way Indexed I/O tree
  • Decision tree
  • Duplicate elimination (DE)
  • Lower bound
  • Upper bound
  • Connected component labeling (CC)
  • Lower bound: reduce DE to CC
  • Upper bounds
  • BFS, Preprocessing step for sparse graph

3
Parallel Disk Model
[Figure: parallel disk model with D disks; labels M and P]
4
Parallel Disk Model
  • Global disk address space is striped across the
    disks
  • Record i in track j of disk k is the
    $i + B(k-1) + BD(j-1)$-th record in the sequence
  • Tracks on different disks can be accessed in
    parallel
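
The striping rule can be sketched in a few lines of Python. This is only an illustration: it assumes 1-based indices for records, disks, and tracks, reads the address formula as i + B(k-1) + BD(j-1), and the function names `global_index` and `locate` are hypothetical.

```python
def global_index(i: int, k: int, j: int, B: int, D: int) -> int:
    """Position (1-based) in the global sequence of record i of track j on disk k.

    Assumes 1 <= i <= B, 1 <= k <= D, j >= 1, with B records per block
    and D disks, as in the striping rule above.
    """
    if not (1 <= i <= B and 1 <= k <= D and j >= 1):
        raise ValueError("index out of range")
    return i + B * (k - 1) + B * D * (j - 1)


def locate(n: int, B: int, D: int) -> tuple[int, int, int]:
    """Inverse mapping: global position n -> (record i, disk k, track j)."""
    if n < 1:
        raise ValueError("positions are 1-based")
    block, offset = divmod(n - 1, B)   # which block overall, offset inside it
    track, disk = divmod(block, D)     # blocks are dealt round-robin to disks
    return offset + 1, disk + 1, track + 1
```

With B = 4 and D = 3, one parallel step reads positions 1 to 12 (track 1 on all three disks), and position 13 is the first record of track 2 on disk 1.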

[Figure: tracks 1, 2, 3 striped across disk 1, disk 2, ..., disk D]
5
Basic algorithms
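  • M > BD
  • Scan(N) = O(N/BD)
  • Number of disk accesses needed to read N records
  • N/BD tracks, or parallel steps
  • $\mathrm{Sort}(N) = O\left(\frac{N}{BD} \cdot \frac{\log(N/B)}{\log(M/B)}\right)$
  • Optimal number of I/Os to sort N records
  • $\frac{\log(N/B)}{\log(M/B)}$ phases using
    N/BD I/Os each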
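
To get a feel for the magnitudes, the two primitives can be evaluated numerically. The sketch below is an assumption-laden estimate, not part of the slides: it drops constant factors, takes Sort(N) to be log(N/B)/log(M/B) phases of Scan(N) I/Os each (the standard parallel-disk sorting bound), and the names `scan_ios` and `sort_ios` are hypothetical.

```python
import math


def scan_ios(N: int, B: int, D: int) -> int:
    """Parallel I/O steps to read N records: ceil(N / (B * D))."""
    return -(-N // (B * D))


def sort_ios(N: int, M: int, B: int, D: int) -> int:
    """Estimated I/O steps to sort N records (constants ignored).

    Assumes M > B*D and N > B; uses ceil(log(N/B) / log(M/B)) phases,
    each costing one scan of the data.
    """
    phases = max(1, math.ceil(math.log(N / B) / math.log(M / B)))
    return phases * scan_ios(N, B, D)
```

For example, with N = 10^6, M = 1000, B = 10, and D = 5, a scan takes 20,000 steps and the sort estimate is 3 phases, or 60,000 steps.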

6
Connected Component Labeling
  • Input: an undirected graph G(V, E)
  • Given as an edge list: (1, 3), (1, 4), (2, 1), (2, 5), ...
  • Vertices are numbered in the range [1, V]
  • Output: an array L[1..V] such that L[i] = L[j] iff
    vertices i and j are in the same component
  • The problem is denoted CC(E,V)
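
As a reference for what the output array looks like, here is a minimal in-memory sketch. It is not the external-memory algorithm discussed in this talk; it uses a plain union-find, and the choice of the smallest vertex number as the component label is an assumption made for illustration (any consistent labeling satisfies the specification). The name `cc_labels` is hypothetical.

```python
def cc_labels(V: int, edges: list[tuple[int, int]]) -> list:
    """Return L with L[i] == L[j] iff i and j are connected (index 0 unused)."""
    parent = list(range(V + 1))

    def find(x: int) -> int:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            # Attach the larger root under the smaller one, so the root of
            # every component is its smallest vertex.
            parent[max(ru, rv)] = min(ru, rv)

    return [None] + [find(v) for v in range(1, V + 1)]
```

On the edge list from the slide, (1, 3), (1, 4), (2, 1), (2, 5), with V = 6, vertices 1 to 5 all receive label 1 and the isolated vertex 6 receives label 6.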

[Figure: example graph with each vertex annotated by its component label]
7
Duplicate elimination
  • Remove duplicates in a multiset with elements in
    a finite domain
  • Used for lower bounds for many graph problems
  • Input: a sequence of N integer records, each in the
    range [1..P]
  • Output: an array C[1..P] stored in P records on disk,
    with C[i] = 1 if i occurs in the input, and 0 otherwise
  • DE(N, P)
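
The input/output contract of DE(N, P) can be pinned down with a tiny in-memory sketch. It ignores I/O cost entirely and only shows what a correct answer is; the function name `duplicate_elimination` is hypothetical.

```python
def duplicate_elimination(records: list[int], P: int) -> list[int]:
    """Return C as a list of length P, where C[i-1] == 1 iff i occurs in records."""
    C = [0] * P
    for x in records:
        if not 1 <= x <= P:
            raise ValueError(f"record {x} outside range 1..{P}")
        C[x - 1] = 1
    return C
```

For the input 3, 1, 3, 5, 1 with P = 5, the result is the bit vector 1, 0, 1, 0, 1.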

8
Main Results
  • Assumption
  • V and P are larger than M; log V and log P are
    smaller than B log(M/B)
  • Duplicate elimination
  • $\mathrm{DE}(N,P) = \Omega\left(\frac{N}{P}\,\mathrm{Sort}(P)\right)$
  • A matching upper bound
  • Connected Components Labeling
  • $\mathrm{CC}(E,V) = \Omega\left(\frac{E}{V}\,\mathrm{Sort}(V)\right)$
  • Upper bound
  • Breadth First Search
  • Optimal for E/V > BD
  • Preprocessing for sparse graphs
  • Transforms V vertices to E/BD vertices

9
Decision tree recall
  • Comparison based
  • Binary tree
  • Tree height
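  • Lower bound for sorting B elements: B! possible
    leaves, so the height is at least $\log_2(B!)$
  • Merging two sorted lists with B and M-B elements
    respectively: $\binom{M}{B}$ possible leaves, so the
    height is at least $\log_2 \binom{M}{B}$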
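
The counting argument behind these bounds is easy to check numerically: a binary tree with at least K leaves has height at least log2(K). The sketch below assumes the usual leaf counts, B! orderings for sorting and C(M, B) interleavings for merging lists of B and M-B elements; the function names are hypothetical.

```python
import math


def sort_depth_lower_bound(B: int) -> int:
    """Minimum height of a binary decision tree with B! leaves."""
    return math.ceil(math.log2(math.factorial(B)))


def merge_depth_lower_bound(M: int, B: int) -> int:
    """Minimum height of a binary decision tree with C(M, B) leaves,
    one per way of interleaving sorted lists of B and M-B elements."""
    return math.ceil(math.log2(math.comb(M, B)))
```

For instance, sorting 4 elements needs depth at least 5 (since 4! = 24 > 16), and merging lists of 3 and 5 elements needs depth at least 6 (since C(8, 3) = 56 > 32).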

10
P-way indexed I/O tree
  • Extend I/O Tree
  • Allows P-way indexed accesses to disk
  • Restricted model: D = 1
  • Three types of nodes
  • Comparison node (compares main-memory records i and j)
  • Indexed Input node (I/O node)
  • Specifies which track to read: <i, m1, m2, ..., mB, t1, t2, ..., tP>
  • Indexed Output node (I/O node)
  • Specifies which track to write: <i, m1, m2, ..., mB, t1, t2, ..., tP>
  • Given such an I/O tree T
  • Let I/O_T(x) be the number of I/O nodes on the path
    followed for an input instance x
  • Let I/O_T be the maximum of I/O_T(x) over all x. This
    is the quantity to lower-bound
  • Can T be transformed into a binary decision tree?

11
P-Way indexed I/O tree and decision tree
  • Lemma 2.1 (D = 1)
  • Let N be the number of records in the problem solved
    by I/O tree T. There is a decision tree Tc which can
    be constructed from T and
  • Theorem 2.1 (general D)
  • The transformation maintains a total ordering of the
    records in main memory
  • This means the decision tree is subject to a
    constraint

12
Duplicate elimination lower bound
  • Lemma 3.1
  • The depth of any decision tree solving DE(N,P) is
    at least $N \log(P/2)$
  • There are $(P/2)^N$ odd instances. Take the logarithm.
  • having
  • So
  • With some rearranging

13
Duplicate elimination Upper bound
  • Upper bound algorithm
  • Divide the N input records into N/P groups of P
    records each
  • Cost: Scan(N)
  • Sort each group and construct its solution, giving
    N/P bit vectors C[1..P], one per group
  • Per group: sorting costs Sort(P), construction costs
    Scan(P)
  • Total: (N/P) Sort(P) + Scan(N)
  • Compute the OR of all bit vectors
  • Cost: Scan(N)
  • Total time: O((N/P) Sort(P) + Scan(N)) = O((N/P)
    Sort(P))
  • Matches the lower bound
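
The structure of this algorithm can be mirrored in memory as follows. This is a sketch of the control flow only: Python's built-in sort stands in for the external Sort(P), list slicing stands in for the grouping scan, and the name `de_by_groups` is hypothetical.

```python
def de_by_groups(records: list[int], P: int) -> list[int]:
    """Duplicate elimination by groups of P records.

    Returns C as a list of length P with C[i-1] == 1 iff i occurs in records.
    """
    result = [0] * P
    for start in range(0, len(records), P):
        # One group of at most P records; sorting it corresponds to Sort(P).
        group = sorted(records[start:start + P])
        # One pass over the sorted group builds its bit vector: Scan(P).
        bits = [0] * P
        for x in group:
            bits[x - 1] = 1
        # OR the group's vector into the running answer.
        result = [a | b for a, b in zip(result, bits)]
    return result
```

For the input 3, 1, 3, 5, 1, 2, 2 with P = 5, the two groups yield the vectors 1, 0, 1, 0, 1 and 0, 1, 0, 0, 0, whose OR is 1, 1, 1, 0, 1.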

14
Connected Components Lower Bound
  • Idea
  • Reduce a similar DE(N,P) problem to a CC(E,V)
    problem.
  • A lower bound for this DE(N,P) problem then gives a
    lower bound for CC(E,V)
  • Another duplicate elimination problem DE(N,P)
  • Input: a sequence b of N records in [P+1, 2P], with
    $P < N < P^2$
  • b can be divided into P contiguous subsequences,
    each of length N/P, and each subsequence has distinct
    elements.
  • The depth of any decision tree solving DE(N,P)
    is
  • Similarly we have

15
DE(N,P) to CC(E,V)
  • Reduce DE(E-V/2,V/2) to CC(E,V)
  • Let b_j be the j-th subsequence, of length
    (E-V/2)/(V/2)
  • If V/2 + i appears in b_j, add edge (j, V/2 + i)
  • Add edge (i, i+1) for every i in 1..V/2-1
  • This gives exactly E-1 edges and V nodes
  • Suppose we solve CC(E,V) and get L[1..V]
  • V/2 + i appears in b if and only if L[1] =
    L[V/2 + i]
  • So from lower bound of DE(E-V/2,V/2) we have
    lower bound of CC(E,V)
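
The reduction can be exercised end to end on small inputs. The sketch below assumes the input b holds values in P+1..2P with P = V/2 and splits evenly into P subsequences; it builds the graph as described (a chain on vertices 1..P plus one edge per record), labels components with a simple in-memory union-find standing in for a CC(E,V) solver, and reads the DE answer off the labels. All function names are hypothetical.

```python
def _labels(V: int, edges: list[tuple[int, int]]) -> list:
    """In-memory stand-in for a CC solver: L[i] == L[j] iff i, j connected."""
    parent = list(range(V + 1))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return [None] + [find(v) for v in range(1, V + 1)]


def de_via_cc(b: list[int], P: int) -> list[int]:
    """Answer DE on b (values in P+1..2P) using one connected-components call."""
    if len(b) % P != 0:
        raise ValueError("b must split into P equal subsequences")
    if any(not P + 1 <= x <= 2 * P for x in b):
        raise ValueError("records must lie in P+1..2P")
    length = len(b) // P
    # Chain 1 - 2 - ... - P, so vertices 1..P form a single component.
    edges = [(i, i + 1) for i in range(1, P)]
    # Record x in subsequence b_j becomes edge (j, x).
    for j in range(1, P + 1):
        for x in b[(j - 1) * length: j * length]:
            edges.append((j, x))
    L = _labels(2 * P, edges)
    # P + i occurs in b iff it landed in the component of vertex 1.
    return [1 if L[P + i] == L[1] else 0 for i in range(1, P + 1)]
```

With P = 3 and b = 4, 6, 6, 4, 4, 6, the graph has 8 edges on 6 vertices, vertex 5 stays isolated, and the recovered answer is 1, 0, 1.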

[Figure: vertices 1, 2, 3, ..., V/2 joined in a chain; vertex j connected to V/2 + i among vertices V/2 + 1, V/2 + 2, ..., V]
16
Connected Components Upper Bound
  • Algorithm (BFS)
  • First convert the edge list to adjacency list form
  • In sorting time
  • Using the same idea taught in the lecture (L7)
  • L(0) = {r}
  • d = 0
  • While L(d) is not empty do
  • Find the neighbors of L(d), remove duplicates
  • Remove all vertices in L(d) ∪ L(d-1)
  • The resulting set is L(d+1)
  • d = d + 1
  • Plug in the expression for parallel disk model,
    similar result
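
The level-by-level loop can be sketched in memory as below. Python sets stand in for the sort-and-scan duplicate removal done on disk, the graph is assumed undirected (which is why subtracting only the two most recent levels is enough), and the name `bfs_levels` is hypothetical.

```python
def bfs_levels(adj: dict[int, list[int]], r: int) -> list[list[int]]:
    """Return the BFS levels L(0), L(1), ... from root r in an undirected graph."""
    prev: set[int] = set()   # L(d-1)
    cur: set[int] = {r}      # L(d)
    levels = []
    while cur:
        levels.append(sorted(cur))
        # Gather the neighbors of L(d); the set removes duplicates.
        neighbors: set[int] = set()
        for u in cur:
            neighbors.update(adj.get(u, ()))
        # Dropping L(d) and L(d-1) leaves exactly the unvisited vertices.
        nxt = neighbors - cur - prev
        prev, cur = cur, nxt
    return levels
```

On the graph with edges 1-2, 1-3, 2-4, 3-4, 4-5 and root 1, the levels are {1}, {2, 3}, {4}, {5}.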

17
Connected Components Upper Bound
  • Upper bound does not match lower bound
  • Optimization for sparse graphs? By preprocessing
  • PRAM algorithm, Cole and Vishkin
  • Chiang et al.
  • a) Identify a group of vertices in the same component
  • b) Identify a leader in the group
  • c) Move the edges incident on every vertex to its
    leader
  • Construct the labeling from the labeling of the
    induced graph
  • Transform V vertices to E/BD vertices
  • Iterative steps: the number of leaders halves each time
  • Nearly optimal algorithm for sparse graphs

18
Other graph problems
  • Lower bound also applies for
  • Connected components in bipartite graphs
  • Biconnected components labeling
  • Minimum spanning trees
  • Ear decomposition
  • Their decision versions
  • All related to DE(N,P) problem

19
Open problems
  • The lower bound assumes input in edge-list format
  • Other formats
  • Gap between the upper and lower bounds for CC(E,V)

20
Summary
  • Lower bounds for graph problems
  • Edge-list input, parallel disk model
  • P-way Indexed I/O tree, Decision Tree
  • Duplicate elimination
  • Connected Component