SI 614 Directed

About This Presentation

Title:

SI 614 Directed

Description:

Title: Slide 1 Author: ladamic Last modified by: ladamic Created Date: 11/14/2005 1:35:40 PM Document presentation format: On-screen Show Company: University of Michigan – PowerPoint PPT presentation

Slides: 64

Provided by: LAD103

Learn more at: https://public.websites.umich.edu

Transcript and Presenter's Notes

Title: SI 614 Directed

1
SI 614: Directed & weighted networks, minimum spanning trees, flow
Lecture 12. Instructor: Lada Adamic
2
Outline

directed networks
prestige
weighted networks
minimum spanning trees
flow

3
Review of centrality in undirected networks: Comparison

Comparing across these 3 centrality values:
Generally, the 3 centrality types will be positively correlated.
When they are not (or only weakly) correlated, it probably tells you something interesting about the network.

High Degree, Low Closeness: Embedded in a cluster that is far from the rest of the network
High Degree, Low Betweenness: Ego's connections are redundant; communication bypasses him/her
High Closeness, Low Degree: Key player tied to important/active alters
High Closeness, Low Betweenness: Probably multiple paths in the network; ego is near many people, but so are many others
High Betweenness, Low Degree: Ego's few ties are crucial for network flow
High Betweenness, Low Closeness: Very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others.

slide Jim Moody
4
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality: An actor's centrality (prestige) is equal to a function of the prestige of those they are connected to. Thus, actors who are tied to very central actors should have higher prestige/centrality than those who are not.

a is a scaling vector, which is set to normalize
the score.
b reflects the extent to which you weight the
centrality of people ego is tied to.
R is the adjacency matrix (can be valued)
I is the identity matrix (1s down the diagonal)
1 is a column vector of all ones.

slide Jim Moody
5
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
The magnitude of b reflects the radius of power.
Small values of b weight local structure, larger
values weight global structure. If b is
positive, then ego has higher centrality when
tied to people who are central. If b is
negative, then ego has higher centrality when
tied to people who are not central. As b
approaches zero, you get degree centrality.
slide Jim Moody
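
The formula image for this slide did not survive in the transcript. A minimal sketch, assuming the commonly stated form c(a, b) = a (I - bR)^(-1) R 1, is given below. It avoids a matrix inverse by iterating c <- R1 + bRc, which converges to the same value when |b| is smaller than the reciprocal of the largest eigenvalue of R. The function name `bonacich_power` and the choice of scaling (squared scores sum to the number of vertices) are assumptions for illustration, not taken from the slides.

```python
import math

def bonacich_power(adj, b, tol=1e-12, max_iter=10000):
    """Bonacich power centrality for an adjacency matrix given as a list of lists.

    Iterates c <- R*1 + b*R*c, whose fixed point is (I - bR)^-1 R 1.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]  # R * 1
    c = degree[:]
    for _ in range(max_iter):
        new = [degree[i] + b * sum(adj[i][j] * c[j] for j in range(n))
               for i in range(n)]
        converged = max(abs(x - y) for x, y in zip(new, c)) < tol
        c = new
        if converged:
            break
    else:
        raise ValueError("no convergence: |b| is probably too large")
    # Scaling constant a (assumed convention): squared scores sum to n.
    norm = math.sqrt(sum(x * x for x in c) / n)
    return [x / norm for x in c] if norm else c

# Hypothetical three-vertex path: 0 - 1 - 2
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
for b in (0.0, 0.5, -0.5):
    print(b, [round(x, 3) for x in bonacich_power(path, b)])
```

With b = 0 the scores are proportional to degree. A positive b rewards the end vertices for being tied to the central one, and a negative b penalizes them for it.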
6
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
b = 0.23
slide Jim Moody
7
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
b = -0.35
b = 0.35
slide Jim Moody
8
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
b = 0.23
b = -0.23
slide Jim Moody
9
Examples of directed networks?

WWW
food webs
population dynamics
influence
hereditary
citation
transcription regulation networks
neural networks

10
Prestige in directed social networks

when prestige may be the right word
admiration
influence
gift-giving
trust
directionality especially important in instances
where ties may not be reciprocated (e.g. dining
partners choice network)
when prestige may not be the right word
gives advice to (can reverse direction)
gives orders to (same)
lends money to (same)
dislikes
distrusts

11
Extensions of undirected degree centrality -
prestige

degree centrality
indegree centrality
a paper that is cited by many others has high
prestige
a person nominated by many others for a reward has high prestige
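
As a small illustration, indegree prestige is just a count of incoming arcs. The sketch below assumes arcs are given as (source, target) pairs, for example (citing paper, cited paper). The function name and the sample data are hypothetical.

```python
from collections import Counter

def indegree_prestige(arcs):
    """Count incoming arcs per vertex; arcs are (source, target) pairs."""
    indeg = Counter(target for _, target in arcs)
    vertices = {v for arc in arcs for v in arc}
    return {v: indeg.get(v, 0) for v in vertices}

# Hypothetical citation data: p2 and p3 cite p1, and p3 also cites p2.
citations = [("p2", "p1"), ("p3", "p1"), ("p3", "p2")]
prestige = indegree_prestige(citations)
print(sorted(prestige.items()))
```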

12
Extensions of undirected closeness centrality

closeness centrality usually implies
all paths should lead to you
and usually not
paths should lead from you to everywhere else
usually consider only vertices from which the
node i in question can be reached

13
Influence range

The influence range of i is the set of vertices from which node i can be reached (its input domain)

14
Extending betweenness centrality to directed
networks

We now consider the fraction of all directed shortest paths (geodesics) between any two vertices that pass through a node:

$$C_B(i) = \sum_{j \neq i \neq k} \frac{g_{jk}(i)}{g_{jk}}$$

where C_B(i) is the betweenness of vertex i, g_jk(i) is the number of paths between j and k that pass through i, and g_jk is the number of all paths between j and k.

Only modification: when normalizing, we have (N-1)(N-2) instead of (N-1)(N-2)/2, because we have twice as many ordered pairs as unordered pairs.
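
One way to compute this in practice is Brandes's accumulation scheme, which is not named on the slide and is used here only as a convenient technique. The sketch assumes an unweighted directed graph given as a dict mapping every vertex to a list of its out-neighbours. The function name is hypothetical.

```python
from collections import deque

def directed_betweenness(succ, normalized=True):
    """Betweenness on an unweighted directed graph (Brandes-style accumulation).

    succ maps every vertex to a list of its out-neighbours.
    """
    bc = {v: 0.0 for v in succ}
    for s in succ:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        sigma = {v: 0 for v in succ}
        dist = {v: -1 for v in succ}
        preds = {v: [] for v in succ}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in succ[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies, farthest vertices first.
        delta = {v: 0.0 for v in succ}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    if normalized:
        n = len(succ)
        scale = (n - 1) * (n - 2)  # ordered pairs, as noted above
        if scale > 0:
            bc = {v: x / scale for v, x in bc.items()}
    return bc

# Hypothetical chain j -> i -> k: i lies on the only geodesic from j to k,
# and there is no path at all from k back to j.
print(directed_betweenness({"j": ["i"], "i": ["k"], "k": []}))
```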

15
Directed geodesics

A node does not necessarily lie on a geodesic
from j to k if it lies on a geodesic from k to j

j
k
16
Prestige in Pajek

Calculating the indegree prestige
Net > Partition > Degree > Input
to view, select File > Partition > Edit
if you need to reverse the direction of each tie first (e.g. lends money to -> borrows from): Net > Transform > Transpose
Influence range (a.k.a. input domain)
Net > k-Neighbours > Input
enter the number of the vertex, and 0 to consider all vertices that eventually lead to your chosen vertex
to find out the size of the input domain, select Info > Partition
Calculate the size of the input domains for all vertices
Net > Partitions > Domain > Input
Can also limit to only neighbors within some
distance

17
Proximity prestige in Pajek

Direct nominations (choices) should count more
than indirect ones
Nominations from second degree neighbors should
count more than third degree ones
So consider proximity prestige:

$$C_p(n_i) = \frac{\text{fraction of all vertices that are in } i\text{'s input domain}}{\text{average distance from } i \text{ to the vertices in its input domain}}$$
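
Outside Pajek, the same quantity can be sketched in a few lines. The code below assumes arcs are (source, target) pairs, finds the input domain by a breadth-first search along reversed arcs, and takes the fraction over the other n - 1 vertices (excluding i itself is an assumption; the slide only says "fraction of all vertices"). Function names are hypothetical.

```python
from collections import deque

def input_domain(arcs, target):
    """Return {vertex: distance to target} for every vertex that can reach target."""
    preds = {}
    for u, v in arcs:
        preds.setdefault(v, []).append(u)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in preds.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    del dist[target]  # the vertex itself is not part of its own input domain
    return dist

def proximity_prestige(arcs, target):
    """Fraction of other vertices in the input domain, divided by mean distance."""
    vertices = {v for arc in arcs for v in arc}
    domain = input_domain(arcs, target)
    if not domain:
        return 0.0
    fraction = len(domain) / (len(vertices) - 1)
    mean_distance = sum(domain.values()) / len(domain)
    return fraction / mean_distance

# Hypothetical nominations: a and b choose c, and d chooses a.
arcs = [("a", "c"), ("b", "c"), ("d", "a")]
print(proximity_prestige(arcs, "c"))
```

A vertex chosen directly by everyone scores 1, and the score falls as nominations become more indirect.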
18
Weighted networks

Examples
email communication
sports matches
packet transfer
population movement
co-authorship
food webs
Weighted treatment of data/algorithms usually
left for future work

19
But what are weights good for?

Defining thresholds
Shortest paths that don't take long
Flow/capacity of a network

20
Food webs

Food webs
usually considered as binary networks
problems in defining threshold fluxes
do killer whales who eat bears count?
weights
interaction frequency
acts of predation per hectare per day
carbon flow (prey to predator)
grams of Carbon per meter squared per year
interaction strength (predator on prey)
(carbon flow of prey to predator)/ (biomass of
predator)

Lake carbon flow
21
Co-authorship networks

The weight assigned to each edge is a sum over the papers that two people co-authored, with each paper's contribution divided by the total number of people on that paper:

$$w_{ij} = \sum_{k} \frac{1}{n_k}$$

where the sum runs over all papers k where i and j were coauthors, and n_k is the number of authors of paper k.
A large-scale high energy physics collaboration producing a paper with 100 authors is less evidence of direct collaboration than an article in Social Networks with only two co-authors.
Should we normalize?
all weights from i to other nodes should sum to 1? (probably not)
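
A minimal sketch of this weighting, assuming each paper is given as a list of author names and that the divisor is the total number of authors on the paper, as the prose on this slide states. The function name and sample data are hypothetical; a different per-paper weight such as 1/(n_k - 1) could be swapped in on the marked line.

```python
from collections import defaultdict
from itertools import combinations

def coauthor_weights(papers):
    """papers: list of author lists. Each paper adds 1/n_k to every co-author pair."""
    weights = defaultdict(float)
    for authors in papers:
        authors = sorted(set(authors))
        n_k = len(authors)
        for i, j in combinations(authors, 2):
            weights[(i, j)] += 1.0 / n_k  # per-paper contribution (assumed 1/n_k)
    return dict(weights)

# Hypothetical data: one two-author paper and one three-author paper.
papers = [["a", "b"], ["a", "b", "c"]]
for pair, w in sorted(coauthor_weights(papers).items()):
    print(pair, round(w, 3))
```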
22
Symmetry in normalization

If normalizing by the sum of values for each node, the weights are no longer symmetric
assume simple weighting: number of papers co-authored
[Figure: example co-authorship network around nodes i and j, with edge weights]
w_ij = 3/3 = 1, w_ji = 3/15 = 1/5

Cosine similarity: symmetric values
assume the weight for each paper is w_k = 1/(n_k - 1)
i and j each have vectors of 0s and w's, depending on whether they authored paper k
normalize by the length of both vectors
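
The cosine recipe above can be sketched as follows, assuming papers are given as lists of author names. Skipping single-author papers (where 1/(n_k - 1) is undefined) is an assumption made for this illustration, and the function name is hypothetical.

```python
import math

def cosine_coauthor_similarity(papers, i, j):
    """Cosine of two authors' paper vectors, with entries w_k = 1/(n_k - 1)."""
    vec_i, vec_j = [], []
    for authors in papers:
        authors = set(authors)
        if len(authors) < 2:
            continue  # assumed: single-author papers are ignored
        w_k = 1.0 / (len(authors) - 1)
        vec_i.append(w_k if i in authors else 0.0)
        vec_j.append(w_k if j in authors else 0.0)
    dot = sum(x * y for x, y in zip(vec_i, vec_j))
    norm = (math.sqrt(sum(x * x for x in vec_i))
            * math.sqrt(sum(y * y for y in vec_j)))
    return dot / norm if norm else 0.0

# Hypothetical data: the value is the same in both directions.
papers = [["a", "b"], ["a", "b", "c"]]
print(cosine_coauthor_similarity(papers, "a", "c"))
print(cosine_coauthor_similarity(papers, "c", "a"))
```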

23
Other similarity Measures
Simple matching
Dice's Coefficient
Jaccard's Coefficient
Cosine Coefficient
Overlap Coefficient
[Figure: bipartite graph linking authors a1, a2, a3 to papers p1-p11]
Q: set of papers authored by a1
D: set of papers authored by a2
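
The formulas on this slide were images and are lost. The sketch below uses the standard set-based definitions of these five measures, which is an assumption about what the slide showed. The paper sets are hypothetical, since the figure's author-paper links are not recoverable.

```python
import math

def similarity_measures(q, d):
    """Standard set-based similarity coefficients for two non-empty sets."""
    q, d = set(q), set(d)
    common = len(q & d)
    return {
        "simple_matching": common,
        "dice": 2 * common / (len(q) + len(d)),
        "jaccard": common / len(q | d),
        "cosine": common / math.sqrt(len(q) * len(d)),
        "overlap": common / min(len(q), len(d)),
    }

# Hypothetical paper sets for two authors.
Q = {"p1", "p2", "p3"}
D = {"p2", "p3", "p4", "p5"}
for name, value in similarity_measures(Q, D).items():
    print(name, round(value, 3))
```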
24
Weighted shortest paths

Routes
shortest route from Chicago to Boston
vertex: intersection
edge weights: road distances
alternative weights: expected time traveled, gas consumed
usually sum the weights from each segment

[Figure: alternative routes from start to finish]
surface road: 50 miles at 25 mph = 2 hours
freeway: 30 miles at 70 mph ≈ 26 minutes
freeway: 40 miles at 65 mph ≈ 37 minutes
25
Reliable paths through social networks

The probability of transmitting a message or
infectious agent could be related to the strength
of the tie
e.g. rather than summing the weights, we might
multiply the probabilities of getting through

[Figure: organizational network with tie probabilities p = 1, p = 0.001, p = 0.05, p = 0.5, p = 0.5]
Probability of getting an idea through to the head of labs: via CEO (0.001 × 1 = 0.001), via direct manager (0.5 × 0.5 = 0.25)
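
Because products of probabilities become sums under a logarithm, the most reliable path can be found with an ordinary shortest-path search on costs of -log(p). The sketch below is one way to do that, not something taken from the slides. The network only contains the two routes spelled out above; the rest of the figure's topology is not recoverable, and the vertex labels are hypothetical.

```python
import heapq
import math

def most_reliable_path(prob, source, target):
    """prob[u][v] is the probability that a message passes along tie u -> v."""
    best = {source: 0.0}  # cost = -log(probability), so costs add along a path
    parent = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, p in prob.get(u, {}).items():
            if p <= 0:
                continue  # a tie that never transmits is unusable
            new_cost = cost - math.log(p)
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                parent[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return 0.0, []
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return math.exp(-best[target]), path[::-1]

ties = {
    "you": {"CEO": 0.001, "manager": 0.5},
    "CEO": {"head of labs": 1.0},
    "manager": {"head of labs": 0.5},
}
print(most_reliable_path(ties, "you", "head of labs"))
```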
26
Shortest Path Problem

Given a weighted graph and two vertices u and v,
we want to find a path of minimum total weight
between u and v.
Length of a path is the sum of the weights of its
edges.
Example
Shortest path between Providence and Honolulu
Applications
Internet packet routing
Flight reservations
Driving directions

[Figure: flight network connecting PVD, ORD, SFO, LGA, HNL, LAX, DFW and MIA, with numeric edge weights]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
27
Negative weights

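Shortest paths are usually undefined in graphs with negative edge weights if negative cycles are present: going around such a cycle once more always lowers the total weight

[Figure: small graph with edge weights 2, 4, 3 and -3]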
28
Shortest Path Properties

Property 1
A subpath of a shortest path is itself a
shortest path
Property 2
There is a tree of shortest paths from a start
vertex to all the other vertices
Example
Tree of shortest paths from Providence

[Figure: tree of shortest paths from PVD in the flight network connecting PVD, ORD, SFO, LGA, HNL, LAX, DFW and MIA]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
29
Dijkstra's Algorithm

The distance of a vertex v from a vertex s is the length of a shortest path between s and v
Dijkstra's algorithm computes the distances of all the vertices from a given start vertex s
Assumptions
the graph is connected
the edges are undirected
the edge weights are nonnegative

We grow a cloud of vertices, beginning with s
and eventually covering all the vertices
We store with each vertex v a label d(v)
representing the distance of v from s in the
subgraph consisting of the cloud and its adjacent
vertices
At each step
We add to the cloud the vertex u outside the
cloud with the smallest distance label, d(u)
We update the labels of the vertices adjacent to
u
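
The cloud-growing steps above can be sketched with a priority queue. This is a minimal illustration under the slide's assumptions (undirected graph, nonnegative weights). The dict-of-dicts graph format and the three-vertex example are hypothetical.

```python
import heapq

def dijkstra(graph, s):
    """graph[u] is {neighbour: nonnegative weight}; returns d(v) for reachable v."""
    d = {s: 0}
    cloud = set()
    heap = [(0, s)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in cloud:
            continue  # stale heap entry: u was already added with a smaller label
        cloud.add(u)  # u has the smallest label among vertices outside the cloud
        for z, weight in graph[u].items():
            if z in cloud:
                continue
            # Update the label of z if going through u is shorter.
            if dist_u + weight < d.get(z, float("inf")):
                d[z] = dist_u + weight
                heapq.heappush(heap, (d[z], z))
    return d

# Hypothetical graph: the direct edge s-z costs 75, but s-u-z costs 50 + 10.
graph = {
    "s": {"u": 50, "z": 75},
    "u": {"s": 50, "z": 10},
    "z": {"s": 75, "u": 10},
}
print(dijkstra(graph, "s"))
```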

slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
30
Edge Relaxation

Consider an edge e = (u,z) such that
u is the vertex most recently added to the cloud
z is not in the cloud
The relaxation of edge e updates distance d(z) as follows:

$$d(z) \leftarrow \min\{d(z),\ d(u) + \mathrm{weight}(e)\}$$

[Figure: before relaxation, d(u) = 50, d(z) = 75 and weight(e) = 10; after relaxation, d(z) = min(75, 50 + 10) = 60]

slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
31
Example
[Figure: Dijkstra's algorithm on a weighted graph with vertices A-F, starting from A with d(A) = 0; distance labels are updated as vertices join the cloud]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
32
Example (cont.)
[Figure: remaining steps of Dijkstra's algorithm on the graph with vertices A-F]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
33
Minimum spanning trees

Connect all vertices with a single tree

Consider a communications company, such as AT&T or GTE, that needs to build a communication network that connects n different users. The cost of making a link joining i and j is c_ij. What is the minimum cost of connecting all of the users?

Common assumption: the only links possible are the ones directly joining two nodes.
web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
34
Electronic Circuitry

Consider a system with a number of electronic components. In order to make two pins i and j of different components electrically equivalent, one can connect i and j by a wire. How can we connect n different pins in this way to make them electrically equivalent to each other so as to minimize the total wire length?

web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
35
Minimum Cost Spanning Tree Problem

Undirected network G = (N, A).
(i, j) is the same arc as (j, i).
We associate with each arc (i, j) ∈ A a cost c_ij.
A spanning tree T of G is a connected acyclic subgraph that spans all the nodes. A connected graph with n nodes and n - 1 arcs is a spanning tree.
The minimum cost spanning tree problem is to find a spanning tree of minimum cost.

web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
36
A Minimum Cost Spanning Tree Problem
[Figure: example network with numbered nodes and arc costs]
web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
37
A Minimum Cost Spanning Tree
[Figure: the example network with numbered nodes and arc costs, showing a minimum cost spanning tree]
web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
38
Prim-Jarnik Algorithm

Vertex based algorithm
Grows one tree T, one vertex at a time
A cloud covering the portion of T already computed
Label the vertices v outside the cloud with key[v], the minimum weight of an edge connecting v to a vertex in the cloud; key[v] = ∞ if no such edge exists
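
A minimal sketch of the idea, assuming a connected undirected graph stored as a dict of dicts. Instead of maintaining key[v] explicitly, this "lazy" variant keeps candidate edges in a heap and discards those whose endpoint is already in the cloud, which selects the same edges. The function name and example graph are hypothetical.

```python
import heapq

def prim_mst(graph, start):
    """graph[u] is {neighbour: weight}; returns tree edges as (u, v, weight)."""
    cloud = {start}
    tree = []
    # Candidate edges leaving the cloud, cheapest first.
    heap = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(heap)
    while heap and len(cloud) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in cloud:
            continue  # both endpoints are already in the cloud
        cloud.add(v)
        tree.append((u, v, w))
        for x, wx in graph[v].items():
            if x not in cloud:
                heapq.heappush(heap, (wx, v, x))
    return tree

# Hypothetical triangle: the edge of weight 3 is left out of the tree.
graph = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2},
    "c": {"a": 3, "b": 2},
}
print(prim_mst(graph, "a"))
```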

www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
39
Prim Example
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
40
Prim Example (2)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
41
Prim Example (3)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
42
Kruskal's Algorithm

The algorithm adds the cheapest edge that
connects two trees of the forest

MST-Kruskal(G, w)
  A ← ∅
  for each vertex v ∈ V[G] do
    Make-Set(v)
  sort the edges of E by non-decreasing weight w
  for each edge (u,v) ∈ E, in order by non-decreasing weight do
    if Find-Set(u) ≠ Find-Set(v) then
      A ← A ∪ {(u,v)}
      Union(u,v)
  return A
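
The pseudocode translates almost line for line into Python. The sketch below assumes edges are given as (weight, u, v) tuples and uses a simple parent-pointer forest for Make-Set, Find-Set and Union. The path-halving shortcut is an added convenience, not part of the slide.

```python
def kruskal_mst(vertices, edges):
    """edges: iterable of (weight, u, v). Returns the list A of tree edges."""
    parent = {v: v for v in vertices}  # Make-Set for every vertex

    def find_set(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    a = []
    for w, u, v in sorted(edges):  # non-decreasing weight
        root_u, root_v = find_set(u), find_set(v)
        if root_u != root_v:  # u and v are in different trees of the forest
            a.append((u, v, w))
            parent[root_u] = root_v  # Union
    return a

# Hypothetical graph: the edge (a, c) would close a cycle and is skipped.
edges = [(1, "a", "b"), (2, "b", "c"), (3, "a", "c"), (4, "c", "d")]
print(kruskal_mst("abcd", edges))
```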
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
43
Kruskal Example
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
44
Kruskal Example (2)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
45
Kruskal Example (3)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
46
Kruskal Example (4)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
47
Network flow

Applications
traffic & transportation
maximum number of cars that can commute from Berkeley to San Francisco during rush hour
fluid networks: pipes that carry liquids
computer networks: packets traveling along fiber
extended applications (from Kleinberg & Tardos, Algorithm Design)
bipartite matching problem
number of disjoint paths between two vertices
survey design
airline scheduling
image segmentation
baseball elimination

48
Max flow problem: how much stuff can we get from source to sink per unit time?
[Figure: flow network from Source to Sink; each edge is labeled with its capacity (e.g. 7)]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
49
Equivalent tasks

Find a cut with minimum capacity
Find maximum flow from source to sink

www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
50
A Flow
[Figure: a flow on the network and the corresponding residual graph]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
51
Augmenting Paths

A path from source to sink in the residual graph
of a given flow
If there is an augmenting path in the residual
graph, we can push more flow

www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
52
Ford-Fulkerson Method

initialize total flow to 0
residual graph G' = G
while an augmenting path exists in G'
  pick an augmenting path P in G'
  m = bottleneck capacity of P
  add m to total flow
  push flow of m along P
  update G'
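
The method leaves open how the augmenting path is picked. The sketch below picks it by breadth-first search (the Edmonds-Karp variant), which is a choice made for this illustration rather than something the slide prescribes. Capacities are assumed to be given as a dict of dicts, and the example network is hypothetical.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """capacity[u][v] is the capacity of arc u -> v. Returns the max flow value."""
    # Residual graph starts as a copy of G, plus zero-capacity reverse arcs.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path P in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left
        # m = bottleneck capacity of P
        m = float("inf")
        v = sink
        while parent[v] is not None:
            m = min(m, residual[parent[v]][v])
            v = parent[v]
        # Push m units along P and update the residual graph.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= m
            residual[v][u] += m
            v = u
        total += m

# Hypothetical network.
capacity = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
print(max_flow(capacity, "s", "t"))
```

In this network the two arcs leaving s have total capacity 5 and form a cut, so no flow can exceed 5.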

www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
53
Example
[Figure: example flow network with edge capacities]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
54
Example
[Figure: Ford-Fulkerson example, next step]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
55
Example
[Figure: Ford-Fulkerson example, next step]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
56
Example
[Figure: Ford-Fulkerson example, next step]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
57
Example
[Figure: Ford-Fulkerson example, next step]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
58
Example
[Figure: Ford-Fulkerson example, next step]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
59
Example
[Figure: Ford-Fulkerson example, final step]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
60
Answer: Max Flow = 4
[Figure: the final flow on the example network]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
61
Answer: Minimum Cut = 4
[Figure: the example network with a minimum cut marked]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
62
project status report

worth 5% of your grade, meant to keep you on track
2-3 weeks later: in-class presentation
1 month later: final project report due
what it should do
include part of your project proposal as intro
include result summaries (including figures & tables).
be 4-6 pages
include references to and briefly (paragraph or
2) discuss some related work.
include a plan of remaining work.
It is graded on a 0-5 scale
5 - same as 4, but very complete and already
shows interesting new insights
4 - data, more than basic analysis (e.g. looked
at robustness, community structure, centrality,
etc. if applicable)
3 - some data, preliminary analysis (imported
data into Pajek or GUESS, counted things up,
visualized, if possible)
2 - some data, no results
1 - attempts made to get project started, but
nothing worked out (no data, no results)
0 - no work done

63
GUESS installation

Windows
unzip the files into a folder
edit the guess.bat (a batch executable file) so that
@rem set GUESS_HOME=c:\program files\GUESS
becomes
@set GUESS_HOME=C:\PROGRA~1\GUESS
if you installed into c:\Program Files\GUESS
else you can try installing into a directory with no spaces in the name and have (e.g.)
@set GUESS_HOME=C:\apps\GUESS