Fan Chung - PowerPoint PPT Presentation

Title: Fan Chung


1
The PageRank of a graph
Fan Chung, University of California,
San Diego
3
What is PageRank?
  • Ranking the vertices of a graph
  • Partially ordered sets graph theory

A new graph invariant for dealing with
  • Applications: web search, IT data sets, partitioning algorithms
  • Theory: correlations among vertices
4
fan
5
Outline of the talk
  • Motivation
  • Define PageRank
  • Local cuts and the Cheeger constant
  • Four versions of the Cheeger inequality using
      • eigenvectors
      • random walks
      • PageRank
      • heat kernel
  • Four partitioning algorithms
  • Green's functions and hitting time

6
What is PageRank?
What is Rank?
9
What is PageRank?
PageRank is defined on any graph.
10
An induced subgraph of the collaboration graph
with authors of Erdős number 2.
11
A subgraph of the Hollywood graph.
12
A subgraph of a BGP graph
13
The Octopus graph
Yahoo IM graph Reid Andersen 2005
14
Graph Theory has 250 years of history.
Leonhard Euler 1707-1783
The Bridges of Königsberg
Is it possible to walk over every bridge once and
only once?
15
Geometric graphs
Topological graphs
Algebraic graphs
General graphs
16
Massive data
Massive graphs
  • WWW-graphs
  • Call graphs
  • Acquaintance graphs
  • Graphs from any database

Protein interaction network (Hawoong Jeong)
17
Big and bigger graphs
New directions in graph theory
18
Many basic questions
  • Correlation among vertices?
  • The geometry of a network? (distance, flow, cut, ...)
  • Quantitative analysis? (eigenvalues, rapid mixing, ...)
  • Local versus global?

19
Google's answer
The definition for PageRank?
20
A measure for the importance of a website
The importance of a website is proportional to
the sum of the importance of all the sites that
link to it.
21
A solution for the importance of a website
Solve for x
22
A solution for the importance of a website
Solve $x = \rho A x$ for $x$
$A$: adjacency matrix
23
A solution for the importance of a website
Solve $x = \rho A x$ for $x$
Eigenvalue problems!
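A minimal sketch of how such an eigenvalue problem can be solved numerically by power iteration. The 4-vertex graph and the helper name `power_iteration` are assumptions made for illustration; they are not from the talk.

```python
def power_iteration(adj, steps=200):
    """Approximate the dominant eigenvector of a nonnegative matrix `adj`."""
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(steps):
        # Each vertex collects the importance of the vertices linked to it.
        y = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]  # renormalize so the entries sum to 1
    return x


# Hypothetical undirected graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
importance = power_iteration(A)
```

In this toy graph vertex 2 ends up with the largest score and the pendant vertex 3 with the smallest. The iteration converges here because the graph is connected and contains a triangle.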
24
Graph models
(undirected) graphs
  • directed graphs
  • weighted graphs

26
Graph models
(undirected) graphs
  • directed graphs
  • weighted graphs

[Figure: an example graph with numeric edge weights]
27
In a directed graph,
there are two types of importance
authority
hub
Jon Kleinberg 1998
28
Two types of the importance of a website
$x$: importance as authorities
$y$: importance as hubs
Solve $x = r A y$ and $y = s A^T x$, which gives
$x = rs\, A A^T x$
$y = rs\, A^T A y$
Singular eigenvalue problems!
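The coupled equations for authorities and hubs can be iterated directly. The sketch below is illustrative only: the three-link example and the function name `hits` are assumptions, and the matrix convention (`A[i][j] = 1` when page j links to page i) is chosen here so that x = r A y makes x the authority score.

```python
def hits(A, steps=100):
    """Alternate x <- A y and y <- A^T x, rescaling after each half-step."""
    n = len(A)
    x = [1.0] * n  # authority scores
    y = [1.0] * n  # hub scores
    for _ in range(steps):
        x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]  # x <- A y
        total = sum(x)
        x = [v / total for v in x]
        y = [sum(A[i][j] * x[i] for i in range(n)) for j in range(n)]  # y <- A^T x
        total = sum(y)
        y = [v / total for v in y]
    return x, y


# Hypothetical links (source, target): pages 0 and 1 point at pages 2 and 3.
links = [(0, 2), (0, 3), (1, 3)]
n = 4
A = [[0] * n for _ in range(n)]
for src, dst in links:
    A[dst][src] = 1  # assumed convention: A[i][j] = 1 when j links to i

authority, hub = hits(A)
```

Page 3, which receives two links, gets the highest authority score, and page 0, which links to both targets, gets the highest hub score.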
29
Eigenvalue problem for an n x n matrix.
n ≈ 30 billion websites
Hard to compute eigenvalues
Even harder to compute eigenvectors
30
In the old days, compute for a given
(whole) graph.
In reality, can only afford to compute
locally.
31
A traditional algorithm
Input: a given graph on n vertices.
Efficient algorithm means polynomial algorithms:
n^3, n^2, n log n, n
New algorithmic paradigm
Input: access to a (huge) graph
(e.g., for a vertex v, find its neighbors)
Bounded number of accesses.
32
A traditional algorithm
Input: a given graph on n vertices.
Efficient algorithm means polynomial algorithms:
Exponential → polynomial
n^3, n^2, n log n, n
New algorithmic paradigm
Input: access to a (huge) graph
(e.g., for a vertex v, find its neighbors)
Infinity → finite
Bounded number of accesses.
33
The definition of PageRank given by Brin and Page
is based on
random walks.
34
Random walks in a graph.
G: a graph
P: transition probability matrix, $P(u,v) = 1/d_u$ if $u \sim v$ and $0$ otherwise,
where $d_u$ is the degree of u.
A lazy walk: $W = (I + P)/2$
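A small sketch of the two walks, assuming the standard definitions P(u,v) = 1/d_u for adjacent u, v and W = (I + P)/2 for the lazy walk. The 3-vertex path and the helper names are illustrative assumptions.

```python
def transition_matrix(adj):
    """P(u, v) = adj[u][v] / deg(u)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[adj[u][v] / deg[u] for v in range(n)] for u in range(n)]


def lazy(P):
    """W = (I + P) / 2: stay put with probability 1/2, otherwise move by P."""
    n = len(P)
    return [[0.5 * P[u][v] + (0.5 if u == v else 0.0) for v in range(n)]
            for u in range(n)]


def step(dist, M):
    """One step of the walk: row vector times matrix."""
    n = len(M)
    return [sum(dist[u] * M[u][v] for u in range(n)) for v in range(n)]


# Hypothetical example: the path 0 - 1 - 2, starting at vertex 0.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
P = transition_matrix(adj)
W = lazy(P)

plain = [1.0, 0.0, 0.0]
lazy_dist = [1.0, 0.0, 0.0]
for _ in range(100):
    plain = step(plain, P)
    lazy_dist = step(lazy_dist, W)
```

On this bipartite path the plain walk keeps oscillating between the two sides, while the lazy walk settles on the stationary distribution, which is proportional to the degrees (1/4, 1/2, 1/4).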
35
Original definition of PageRank
A (bored) surfer
  • either surfs a random webpage with probability α
  • or surfs a linked webpage with probability 1 - α

α: the jumping constant
36
Definition of personalized PageRank
Two equivalent ways to define PageRank pr(α,s)
(1) $\mathrm{pr}(\alpha, s) = \alpha s + (1-\alpha)\, \mathrm{pr}(\alpha, s)\, W$
s: the seed as a row vector
α: the jumping constant
37
Definition of PageRank
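Two equivalent ways to define PageRank ppr(α,s)
(1) $\mathrm{ppr}(\alpha, s) = \alpha s + (1-\alpha)\, \mathrm{ppr}(\alpha, s)\, W$
(2) $\mathrm{ppr}(\alpha, s) = \alpha \sum_{t=0}^{\infty} (1-\alpha)^t\, s W^t$
$s = (1/n, \dots, 1/n)$: the (original) PageRank
$s$ = some seed, e.g., $(1, 0, \dots, 0)$: personalized PageRank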
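To see the two definitions agree, one can compute the series form and check that it satisfies the recurrence. This is a sketch under assumptions: it takes the recurrence p = αs + (1-α)pW and the series p = α Σ (1-α)^t s W^t with W the lazy walk, and the 4-vertex graph and function names are made up for illustration.

```python
def lazy_walk_matrix(adj):
    """W = (I + D^{-1} A) / 2 for an adjacency matrix `adj`."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[0.5 * adj[u][v] / deg[u] + (0.5 if u == v else 0.0)
             for v in range(n)] for u in range(n)]


def vec_mat(x, M):
    """Row vector times matrix."""
    n = len(M)
    return [sum(x[u] * M[u][v] for u in range(n)) for v in range(n)]


def ppr_series(alpha, s, W, terms=400):
    """Truncated series alpha * sum_t (1 - alpha)^t * s W^t."""
    total = [0.0] * len(s)
    walk = list(s)
    for t in range(terms):
        coeff = alpha * (1 - alpha) ** t
        total = [a + coeff * b for a, b in zip(total, walk)]
        walk = vec_mat(walk, W)
    return total


# Hypothetical graph: a 4-cycle 0-1-2-3 with the chord 0-2.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]
alpha = 0.15
W = lazy_walk_matrix(A)
seed = [1.0, 0.0, 0.0, 0.0]

p = ppr_series(alpha, seed, W)
pW = vec_mat(p, W)
# How far p is from satisfying p = alpha * s + (1 - alpha) * p W.
residual = max(abs(p[v] - (alpha * seed[v] + (1 - alpha) * pW[v]))
               for v in range(len(seed)))
```

The residual is at the level of floating-point rounding, and the entries of `p` sum to 1, as expected for a probability distribution.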
38
How good is PageRank as a measure of
correlation?
Depends on the applications?
How good is the cut?
Isoperimetric properties
39
Isoperimetric properties
What is the shortest curve enclosing a unit
area?
In a graph G and an integer m, what is the
minimum cut disconnecting a subgraph of m
vertices?
In a graph G, what is the minimum cut e(S,V-S)
so that $\frac{e(S, V-S)}{\mathrm{vol}(S)}$ is the smallest?
40
How good is the cut?
Two types of cuts
  • Vertex cut
  • edge cut

S
E(S,V-S)
41
$\frac{e(S, V-S)}{\mathrm{vol}(S)}$ and $\frac{e(S, V-S)}{|S|}$
$\mathrm{vol}(S) = \sum_{v \in S} \deg(v)$
$|S| = \sum_{v \in S} 1$
[Figure: a cut separating S from V-S]
42
The Cheeger constant for graphs
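The Cheeger constant: $h_G = \min_{S} \frac{e(S, V-S)}{\min(\mathrm{vol}(S),\, \mathrm{vol}(V-S))}$
The volume of S is $\mathrm{vol}(S) = \sum_{v \in S} \deg(v)$
$h_G$ and its variations are sometimes called
conductance, isoperimetric number, ...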
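A brute-force sketch of these quantities, assuming the usual definition of the Cheeger ratio as e(S, V-S) divided by the smaller of the two volumes. The two-triangle example graph and all function names are assumptions for illustration, and the exhaustive search is only feasible for tiny connected graphs.

```python
from itertools import combinations


def volume(adj, S):
    """Sum of degrees of the vertices in S."""
    return sum(len(adj[v]) for v in S)


def edge_boundary(adj, S):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for u in S for w in adj[u] if w not in S)


def cheeger_ratio(adj, S):
    S = set(S)
    vol_S = volume(adj, S)
    vol_rest = volume(adj, adj.keys() - S)
    return edge_boundary(adj, S) / min(vol_S, vol_rest)


def cheeger_constant(adj):
    """Minimize the Cheeger ratio over all nonempty proper subsets."""
    vertices = sorted(adj)
    best_ratio, best_set = float("inf"), None
    for k in range(1, len(vertices)):
        for S in combinations(vertices, k):
            ratio = cheeger_ratio(adj, S)
            if ratio < best_ratio:
                best_ratio, best_set = ratio, set(S)
    return best_ratio, best_set


# Hypothetical graph: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
h, best = cheeger_constant(adj)
```

Here the best cut separates the two triangles: one edge is cut and each side has volume 7, so the ratio is 1/7. The number of subsets grows exponentially with the number of vertices, which is why the talk turns to sweeps.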
43
The Cheeger inequality
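The Cheeger constant: $h_G = \min_{S} \frac{e(S, V-S)}{\min(\mathrm{vol}(S),\, \mathrm{vol}(V-S))}$
The Cheeger inequality: $2 h_G \ge \lambda \ge \frac{h_G^2}{2}$
λ: the first nontrivial eigenvalue of the
(normalized) Laplacian of a connected graph.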
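As a quick sanity check of the inequality in its standard form $2h_G \ge \lambda \ge h_G^2/2$, consider the cycle on 8 vertices. This example is chosen here for illustration and is not from the talk. Every vertex has degree 2, the best cut splits the cycle into two arcs of 4 vertices, and the first nontrivial eigenvalue of the normalized Laplacian of the cycle $C_n$ is $1 - \cos(2\pi/n)$.

```latex
\begin{align*}
h_{C_8} &= \frac{e(S, V-S)}{\min(\mathrm{vol}(S), \mathrm{vol}(V-S))}
         = \frac{2}{8} = \frac{1}{4}, \\
\lambda &= 1 - \cos\frac{2\pi}{8} = 1 - \frac{\sqrt{2}}{2} \approx 0.293, \\
2 h_{C_8} &= 0.5 \;\ge\; \lambda \approx 0.293 \;\ge\; \frac{h_{C_8}^2}{2} = \frac{1}{32} \approx 0.031 .
\end{align*}
```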
44
The spectrum of a graph
  • Adjacency matrix


Many ways to define the spectrum of a graph.
How are the eigenvalues related to
properties of graphs?
45
The spectrum of a graph
  • Adjacency matrix
  • Combinatorial Laplacian: $L = D - A$
    ($A$: adjacency matrix, $D$: diagonal degree matrix)

Gustav Robert Kirchhoff 1824-1887
46
The spectrum of a graph
  • Adjacency matrix
  • Combinatorial Laplacian: $L = D - A$
    ($A$: adjacency matrix, $D$: diagonal degree matrix)

Matrix tree theorem: counting spanning trees
Gustav Robert Kirchhoff 1824-1887
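A short sketch of the matrix tree theorem in its standard form: the number of spanning trees equals the determinant of the combinatorial Laplacian L = D - A with one row and the matching column deleted. The function name and the two test graphs are assumptions for illustration. Exact rational arithmetic is used so that the determinant is an exact integer.

```python
from fractions import Fraction


def count_spanning_trees(adj):
    """Determinant of L = D - A with the last row and column removed."""
    n = len(adj)
    L = [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
         for i in range(n)]
    M = [[Fraction(L[i][j]) for j in range(n - 1)] for i in range(n - 1)]
    det = Fraction(1)
    for c in range(n - 1):
        # Find a nonzero pivot in column c; none means the determinant is 0.
        pivot = next((r for r in range(c, n - 1) if M[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det  # a row swap flips the sign
        det *= M[c][c]
        for r in range(c + 1, n - 1):
            factor = M[r][c] / M[c][c]
            M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return int(det)


# Complete graph K4 and the 4-cycle C4 as adjacency matrices.
K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
C4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
trees_K4 = count_spanning_trees(K4)
trees_C4 = count_spanning_trees(C4)
```

For K4 this agrees with Cayley's formula (4^2 = 16 labeled trees), and for the 4-cycle each of the 4 edges can be dropped to leave a spanning tree.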
47
The spectrum of a graph
  • Adjacency matrix
  • Combinatorial Laplacian: $L = D - A$
    ($A$: adjacency matrix, $D$: diagonal degree matrix)
  • Normalized Laplacian: random walks, rate of convergence

Gustav Robert Kirchhoff 1824-1887
48
The spectrum of a graph
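loopless, simple
Discrete Laplace operator: $\Delta = I - D^{-1} A$
not symmetric in general
  • Normalized Laplacian: $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$

symmetric normalized
with eigenvalues $0 = \lambda_0 \le \lambda_1 \le \dots \le \lambda_{n-1} \le 2$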
49
The spectrum of a graph
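Discrete Laplace operator: $\Delta = I - D^{-1} A$
not symmetric in general
  • Normalized Laplacian: $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$

symmetric normalized
with eigenvalues $0 = \lambda_0 \le \lambda_1 \le \dots \le \lambda_{n-1} \le 2$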
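A dependency-free sketch for estimating the first nontrivial eigenvalue, assuming the normalized Laplacian is I - D^{-1/2} A D^{-1/2}. It runs power iteration on 2I minus the Laplacian after projecting out the known trivial eigenvector D^{1/2}1. The function name `lambda_1` and the two test graphs are assumptions for illustration; a numerical library would normally be used instead.

```python
import math


def lambda_1(adj, steps=500):
    """Estimate the smallest nonzero normalized-Laplacian eigenvalue."""
    n = len(adj)
    deg = [sum(row) for row in adj]

    def apply_M(x):
        # M = 2I - L = I + D^{-1/2} A D^{-1/2}; its eigenvalues are 2 - lambda_i.
        return [x[u] + sum(adj[u][v] * x[v] / math.sqrt(deg[u] * deg[v])
                           for v in range(n)) for u in range(n)]

    # Trivial eigenvector (lambda_0 = 0) is D^{1/2} 1, normalized to unit length.
    top = [math.sqrt(d) for d in deg]
    norm = math.sqrt(sum(t * t for t in top))
    top = [t / norm for t in top]

    x = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(steps):
        dot = sum(a * b for a, b in zip(x, top))
        x = [a - dot * b for a, b in zip(x, top)]  # project out the trivial part
        x = apply_M(x)
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]

    Mx = apply_M(x)
    rayleigh = sum(a * b for a, b in zip(x, Mx))  # x has unit length
    return 2.0 - rayleigh


# Path on 4 vertices and complete graph on 4 vertices.
P4 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
lam_path = lambda_1(P4)
lam_complete = lambda_1(K4)
```

The results match the known closed forms: 1 - cos(π/3) = 1/2 for the path on 4 vertices and n/(n-1) = 4/3 for the complete graph on 4 vertices.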

50
dictates many properties of a graph.
  • expander
  • diameter
  • discrepancy
  • subgraph containment
  • .

Spectral implications for finding good cuts?
51
Finding a cut by a sweep
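For a function $f$ on the vertices, order the vertices so that $f(v_1) \ge f(v_2) \ge \dots \ge f(v_n)$.
Consider the sets $S_i = \{v_1, \dots, v_i\}$ for $i = 1, \dots, n$
and the Cheeger ratio $h(S_i)$ of each $S_i$.
Define $h_f = \min_i h(S_i)$.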
52
Finding a cut by a sweep
Using a sweep by the eigenvector, can reduce the
exponential number of choices of subsets to a
linear number.
53
Finding a cut by a sweep
Using a sweep by the eigenvector, can reduce the
exponential number of choices of subsets to a
linear number.
Still, there is a lower bound guarantee by using
the Cheeger inequality.
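A sketch of a sweep: sort the vertices by a score vector, then examine only the n-1 prefixes of that ordering, updating the cut size and volume incrementally. The score values below are hypothetical stand-ins for an eigenvector or PageRank vector, and the function name and example graph are assumptions.

```python
def sweep_cut(adj, f):
    """Return the prefix set with the smallest Cheeger ratio, and that ratio."""
    order = sorted(adj, key=lambda v: f[v], reverse=True)
    total_vol = sum(len(adj[v]) for v in adj)
    in_set = set()
    vol = 0
    cut = 0
    best_ratio, best_size = float("inf"), 0
    # Skip the full vertex set: its complement is empty.
    for i, v in enumerate(order[:-1], start=1):
        vol += len(adj[v])
        # Edges from v into the current set stop crossing the cut;
        # the remaining edges of v start crossing it.
        inside = sum(1 for w in adj[v] if w in in_set)
        cut += len(adj[v]) - 2 * inside
        in_set.add(v)
        ratio = cut / min(vol, total_vol - vol)
        if ratio < best_ratio:
            best_ratio, best_size = ratio, i
    return set(order[:best_size]), best_ratio


# Hypothetical graph: two triangles joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
# Hypothetical scores, larger on the first triangle.
f = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.2, 4: 0.1, 5: 0.05}
best_set, best_ratio = sweep_cut(adj, f)
```

With these scores the sweep examines five prefixes and picks the first triangle, with ratio 1/7, instead of searching all 62 nonempty proper subsets.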
54
Four types of Cheeger inequalities.
Four proofs using
  • eigenvectors
  • random walks
  • PageRank
  • heat kernel

Leading to four different one-sweep
partitioning algorithms.
55
Four proofs of Cheeger inequalities
  • graph spectral method
  • random walks
  • PageRank
  • heat kernel

spectral partition algorithm
local partition algorithms
56
Graph partitioning
Local graph partitioning
57
What is a local graph partitioning algorithm?
A local graph partitioning algorithm finds a
small cut near the given seed(s) with running
time depending only on the size of the output.
58
Examples of local partitioning
64
Four proofs of Cheeger inequalities
  • graph spectral method
  • random walks
  • PageRank
  • heat kernel

spectral partition algorithm
local partition algorithms
65
Four proofs of Cheeger inequalities
  • graph spectral method: Cheeger 60s, Fiedler 73, Alon 86, Jerrum and Sinclair 89
  • random walks: Lovász and Simonovits 90, 93; Spielman and Teng 04
  • PageRank: Andersen, Chung, Lang 06
  • heat kernel: Chung, PNAS, 08
66
The Cheeger inequality
Partition algorithm
Using a sweep by the eigenvector $f$,
the Cheeger inequality can be stated as
$2 h_G \ge \lambda \ge \frac{h_f^2}{2} \ge \frac{h_G^2}{2}$
where λ is the first non-trivial eigenvalue of
the Laplacian and $h_f$ is the minimum Cheeger
ratio in a sweep using the eigenvector $f$.
67
Proof of the Cheeger inequality
from definition
by Cauchy-Schwarz ineq.
from the definition.
summation by parts.
68
A Cheeger inequality using random walks
Lovász, Simonovits, 90, 93
Leads to a Cheeger inequality
where is the minimum Cheeger ratio over
sweeps by using a lazy walk of k steps from every
vertex for an appropriate range of k .
69
A Cheeger inequality using PageRank
Using the PageRank vector.
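Recall the definition of PageRank ppr(α,s)
(1) $\mathrm{ppr}(\alpha, s) = \alpha s + (1-\alpha)\, \mathrm{ppr}(\alpha, s)\, W$
(2) $\mathrm{ppr}(\alpha, s) = \alpha \sum_{t=0}^{\infty} (1-\alpha)^t\, s W^t$
Organize the random walks by a scalar α.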
70
Random walks versus PageRank
How fast is the convergence to the stationary
distribution?
Choose α to satisfy the required property.
For what k, can one have ?
71
A Cheeger inequality using PageRank
with seed as a subset S
Using the PageRank vector
and a Cheeger
inequality can be obtained
where $\lambda_S$ is the Dirichlet eigenvalue of the
Laplacian, and is the minimum Cheeger ratio
over sweeps by using the appropriate personalized
PageRank with seeds S.
72
Dirichlet eigenvalues for a subset
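$\lambda_S = \inf_{f \ne 0} \frac{\sum_{u \sim v} (f(u) - f(v))^2}{\sum_{u} f(u)^2\, d_u}$
over all f satisfying the Dirichlet
boundary condition $f(v) = 0$
for all $v \notin S$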
73
Local Cheeger constant for a subset
74
A Cheeger inequality using PageRank
with seed as a subset S
Using the PageRank vector
and a Cheeger
inequality can be obtained
where $\lambda_S$ is the Dirichlet eigenvalue of the
Laplacian, and is the minimum Cheeger ratio
over sweeps by using personalized PageRank with
seed S.
75
Algorithmic aspects of PageRank
  • Fast approximation algorithm for personalized PageRank:
    greedy type algorithm, almost linear complexity
  • Can use the jumping constant to approximate
    PageRank with a support of the desired size.
  • Errors can be effectively bounded.
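A sketch of a greedy local approximation in the spirit of the push procedure of Andersen, Chung and Lang. It is written from the recurrence p = αs + (1-α)pW with a lazy walk W and should be read as an illustration under those assumptions, not as the talk's exact algorithm. The function name, the tolerance `eps`, and the example graph are made up here.

```python
from collections import deque


def approximate_ppr(adj, seed, alpha=0.15, eps=1e-4):
    """Push residual mass until every residual r[u] is below eps * deg(u)."""
    p = {}            # approximate PageRank, nonzero only near the seed
    r = {seed: 1.0}   # residual mass still to be distributed
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        du = len(adj[u])
        ru = r.get(u, 0.0)
        if ru < eps * du:
            continue  # stale queue entry, nothing to push
        # Push: keep an alpha fraction, leave half of the rest at u (lazy step),
        # and spread the other half evenly over the neighbors.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1 - alpha) * ru / 2
        share = (1 - alpha) * ru / (2 * du)
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * len(adj[v]):
                queue.append(v)
        if r[u] >= eps * du:
            queue.append(u)
    return p, r


# Hypothetical graph: two triangles joined by the edge 2-3, seeded at vertex 0.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
p, r = approximate_ppr(adj, seed=0, alpha=0.15, eps=1e-4)
```

Each push moves at least α · eps · deg(u) of mass into `p`, so the number of pushes is bounded independently of the size of the graph, and only vertices reached from the seed are ever touched. The total mass in `p` and `r` stays equal to 1, which is one simple way to see that the error is controlled by the leftover residual.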

76
A graph partition algorithm using PageRank
Given a set S with
randomly choose a vertex v in S.
With probability at least
the one-sweep algorithm using
has an initial segment with the Cheeger
constant at most
77
Graph partitioning using PageRank vector.
198,430 nodes and 1,133,512 edges
80
Kevin Lang 2007
81
Four proofs of Cheeger inequalities
  • graph spectral method: Fiedler 73, Cheeger 60s, Alon 86
  • random walks: Lovász and Simonovits 90, 93; Spielman and Teng 04
  • PageRank: Andersen, Chung, Lang 06
  • heat kernel: Chung, PNAS, 08
82
PageRank versus heat kernel
Geometric sum
Exponential sum
83
PageRank versus heat kernel
Geometric sum
Exponential sum
recurrence
Heat equation
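A sketch contrasting the two kinds of sums. It assumes PageRank is the geometric sum α Σ (1-α)^k s W^k and heat kernel PageRank is the exponential sum e^{-t} Σ (t^k / k!) s W^k, both over the same walk matrix W. Those formulas, the choice of a lazy walk for W, the parameter values, and the example graph are assumptions made for illustration.

```python
import math


def walk_sum(s, W, weights):
    """Return sum_k weights[k] * s W^k for a row vector s."""
    n = len(s)
    total = [0.0] * n
    current = list(s)
    for w in weights:
        total = [a + w * b for a, b in zip(total, current)]
        current = [sum(current[u] * W[u][v] for u in range(n)) for v in range(n)]
    return total


# Hypothetical graph: the path 0 - 1 - 2 - 3 with a lazy walk.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
deg = [sum(row) for row in A]
W = [[0.5 * A[u][v] / deg[u] + (0.5 if u == v else 0.0) for v in range(4)]
     for u in range(4)]
seed = [1.0, 0.0, 0.0, 0.0]

alpha, t = 0.15, 3.0
geometric = [alpha * (1 - alpha) ** k for k in range(400)]
exponential = [math.exp(-t) * t ** k / math.factorial(k) for k in range(60)]

pagerank = walk_sum(seed, W, geometric)
heat_kernel = walk_sum(seed, W, exponential)
```

Both weight sequences sum to 1 (up to truncation), so both results are probability distributions. The exponential weights decay much faster once k exceeds t, which concentrates the heat kernel vector on walks of length about t.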
84
A Cheeger inequality using the heat kernel
Theorem
where is the minimum Cheeger ratio over
sweeps by using heat kernel pagerank over all u
in S.
Theorem For
85
Definition of heat kernel
86
A Cheeger inequality using the heat kernel
Using the upper and lower bounds,
a Cheeger inequality can be obtained
where $\lambda_S$ is the Dirichlet eigenvalue of the
Laplacian, and is the minimum Cheeger ratio
over sweeps by using heat kernel with seeds S
for appropriate t.
90
Many applications of PageRank for problems in
Graph Theory
  • Graph drawing using PageRank
  • Graph embedding using PageRank
  • Pebbling and routing using PageRank
  • Covering and packing using PageRank
  • Relating graph invariants of subgraphs to the
    host graph using PageRank
  • Your favorite old problem using PageRank?

91
New Directions in Graph Theory for information
networks
Topics
  • Random graphs with general degrees
  • PageRank
  • Algorithmic game theory, graphical games

Using
  • Spectral methods
  • Probabilistic methods
  • Quasirandom