1
Social networks from the perspective of Physics
János Kertész (1,2), Jukka-Pekka Onnela (2), Jari Saramäki (2),
Jörkki Hyvönen (2), Kimmo Kaski (2), Jussi Kumpula (2),
David Lazer (3), Gábor Szabó (3,4), Albert-László Barabási (3,4)
(1) Budapest University of Technology and Economics, Hungary
(2) Helsinki University of Technology, Finland
(3) Harvard University
(4) University of Notre Dame, USA
2
Outline
  • 0. Introduction
  • Constructing the social network
  • Basic statistics
  • Granovetter's hypothesis
  • Thresholding (percolation)
  • Spreading
  • Modeling
  • Conclusions

3
Introduction
Complex systems: many interacting units such that
the resulting behavior is more than a mere sum
(brain, internet, society). Much is known about
the interactions, but the complex behavior is often
still puzzling. $N = 3$ can be many! See the three-body
problem of mechanics.
Statistical physics: $N \sim 10^{23}$
Social sciences: $N = 3 \ldots 10^{9}$
4
Introduction
Complex systems: more input needed than mere
interactions → forget about interactions.
Networks: scaffold of complexity. Useful to concentrate
on the carrying NW structure (nodes and links).
Holistic approach with very general statements.
Spectacular recent development: abundance of data
due to IT, new concepts.
5
Introduction
Phenomenon                 Nodes        Links
Cell metabolism            Molecules    Chemical reactions
Scientific collaboration   Scientists   Joint papers
WWW                        Pages        URL links
Air traffic                Airports     Airline connections
Economy                    Firms        Trading
Language                   Words        Synonymous meaning
Society                    People       Acquaintances

6
Introduction
  • Characterization of many empirical NW-s
  • BROAD DEGREE DISTRIBUTION in many natural and
    human-made NW-s
  • - SMALL WORLD property: the average distance between
    two nodes is usually very small ($\sim \log N$); "six
    degrees of separation"
  • - HIGH CLUSTERING: the number of triangles is
    significantly high
  • Studied in many networks: WWW, Internet, actor,
    citation, metabolic etc.

7
The world is not only small and scale-free but
clustered! Friends of friends are often friends.
Clustering coefficient C: the fraction of pairs of a
node's neighbors that are themselves connected.
ER (Erdős–Rényi) graph: clustering too small!
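
As a minimal sketch of the clustering coefficient described on this slide, the snippet below computes the local value for one node of an undirected graph. The adjacency-set representation, the function name `local_clustering` and the toy graph are illustrative assumptions, not part of the talk.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbours that are linked to each other.

    `adj` maps each node to the set of its neighbours (undirected graph).
    Nodes with fewer than two neighbours get 0.0 by convention here.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked_pairs = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return linked_pairs / (k * (k - 1) / 2)


# Hypothetical toy graph: a triangle a-b-c plus a pendant node d attached to a.
toy_adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

for node in sorted(toy_adj):
    print(node, local_clustering(toy_adj, node))
```

Averaging this quantity over all nodes gives one common definition of the average clustering coefficient.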
8
Introduction
WEIGHTED NW-S: a step toward reductionism.
Interactions have different strengths → weights on
links. Weights: fluxes (traffic or chemical
reactions), correlation-based networks,
etc. (Often no negative weights, $w_{ij} \ge 0$.) How to
characterize weighted NW-s? E.g. STRENGTH of
node $i$: $s_i = \sum_j w_{ij}$. Intensity, coherence of
subgraphs: clustering, motifs etc. (see
Onnela et al., PRE 71, 065103(R) (2005))
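
The node strength $s_i = \sum_j w_{ij}$ can be computed in a few lines. This is a small sketch that assumes an undirected network stored as a dictionary from node pairs to weights; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def node_strengths(weighted_links):
    """Return s_i = sum_j w_ij for every node of an undirected weighted network.

    `weighted_links` maps a pair (i, j) to the weight w_ij; each link is listed once.
    """
    strength = defaultdict(float)
    for (i, j), w in weighted_links.items():
        strength[i] += w  # the link contributes to both of its end nodes
        strength[j] += w
    return dict(strength)


# Hypothetical toy network with three links.
toy_links = {("a", "b"): 3.0, ("a", "c"): 1.0, ("b", "c"): 2.0}
print(node_strengths(toy_links))
```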
9
Introduction
SOCIAL NW-S: much has been taken from sociology:
betweenness, clustering, assortativity. Main
method: questionnaires (10 - 10 000). Weighted
social NW-s: the strength of social relationships
varies over a wide range:
  • I know him/her
  • We are on first name basis
  • We are friends
  • We are good friends
  • We are very good friends
How to measure?
Scale? Subjectivity?
10
Introduction
Advantage of questionnaires: ask whatever you
are interested in. This enables complex studies and
multi-factor analyses. Disadvantage: difficulty
in quantification, and subjectivity. E.g.,
AddHealth: quantification of tie strength by the
number of joint activities. The mutuality test fails
very often (M. Gonzales et al., Physica A 379,
307-316 (2007)). Alternative approach: use
communication databases (email, phone etc.)
12
Constructing the Network
  • Use a network constructed from mobile phone calls
    as a proxy for a social network
  • In the network:
  • Nodes: individuals
  • Links: voice calls
  • Link weights:
  • Number of calls
  • Total call duration (time, money)

13
Constructing the Network
  • Over 7 million private mobile phone subscriptions
  • Focus: voice calls within the home operator
  • Data aggregated over a period of 18 weeks
  • Require reciprocity (X→Y AND Y→X) for a link
  • Customers are anonymous (hash codes)
  • Data from a European mobile operator
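
The construction above can be sketched as follows. The record layout `(caller, callee, duration_seconds)`, the function name, and the choice to add up both directions into one undirected weight are assumptions made for illustration; the slides only state that a link requires calls in both directions and that weights are call counts or total durations.

```python
from collections import defaultdict


def build_reciprocal_network(call_records):
    """Aggregate directed call records into an undirected, weighted network.

    `call_records` is an iterable of (caller, callee, duration_seconds).
    A link x-y is kept only if x called y AND y called x (reciprocity).
    Each kept link carries two candidate weights: number of calls and total duration.
    """
    directed = defaultdict(lambda: [0, 0.0])  # (caller, callee) -> [n_calls, duration]
    for caller, callee, duration in call_records:
        if caller == callee:
            continue  # ignore self-calls
        entry = directed[(caller, callee)]
        entry[0] += 1
        entry[1] += duration

    links = {}
    for (x, y), (n_xy, d_xy) in directed.items():
        # Visit each unordered pair once and require the reverse direction.
        if x < y and (y, x) in directed:
            n_yx, d_yx = directed[(y, x)]
            links[(x, y)] = {"calls": n_xy + n_yx, "duration": d_xy + d_yx}
    return links


# Hypothetical anonymised records (hash codes shortened to h1, h2, h3).
toy_records = [
    ("h1", "h2", 60),
    ("h2", "h1", 30),
    ("h1", "h2", 10),
    ("h1", "h3", 100),  # never returned, so no h1-h3 link is created
]
print(build_reciprocal_network(toy_records))
```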

15
Basic Statistics: Visualisation
Largest connected component dominates: 3.9M / 4.6M
nodes, 6.5M / 7.0M links. Use it for analysis!
16
Basic Statistics: Distributions
Vertex degree distribution
Link weight distribution
Fat tail
Dunbar number (monkeysphere): max 150 connections
18
Granovetter's Weak Ties Hypothesis
  • Granovetter suggests analysis of social
    networks as a tool for linking micro and
    macro levels of sociological theory
  • Considers the macro-level implications of tie
    (micro-level) strengths
  • "The strength of a tie is a (probably linear)
    combination of the amount of time, the emotional
    intensity, the intimacy (mutual confiding), and
    the reciprocal services which characterize the
    tie."
  • Formulates a hypothesis:
  • The relative overlap of two individuals'
    friendship networks varies directly with the
    strength of their tie to one another
  • Explores the impact of the hypothesis on, e.g.,
    diffusion of information, stressing the cohesive
    power of weak ties
  • M. Granovetter, The Strength of Weak Ties,
    The American Journal of Sociology 78,
    1360-1380, 1973.

19
Granovetter's Weak Ties Hypothesis
  • Hypothesis based on theoretical work and some
    direct evidence
  • Present network is suitable for testing the
    hypothesis:
  • (i) Call durations → time commitment → tie
    strength
  • (ii) Call durations → monetary commitment → tie
    strength
  • (iii) Largest weighted social network so far
  • (Problem: other factors, such as emotional
    intensity or reciprocal services?)
  • What is the coupling between network topology
    and link weights?
  • Consider two connected nodes. We would like to
    characterize their relative neighborhood overlap,
    i.e. the proportion of common friends
  • This leads naturally to link neighborhood
    overlap

20
Overlap
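  • Definition: relative neighborhood overlap
    (topological), as in Onnela et al., PNAS 104, 7332 (2007):
    $O_{ij} = \frac{n_{ij}}{(k_i - 1) + (k_j - 1) - n_{ij}}$
  • where $n_{ij}$ is the number of triangles around edge
    $(v_i, v_j)$, i.e. the number of common neighbors, and
    $k_i$, $k_j$ are the degrees of the two nodes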
  • Illustration of the concept
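
A short sketch of the overlap computation for a single link. The formula on the slide did not survive transcription, so the code assumes the definition from the authors' PNAS 2007 paper, $O_{ij} = n_{ij} / ((k_i - 1) + (k_j - 1) - n_{ij})$, with $n_{ij}$ the number of common neighbours and $k_i$, $k_j$ the degrees. The function name and toy graph are hypothetical.

```python
def neighbourhood_overlap(adj, i, j):
    """Relative neighbourhood overlap O_ij of the link between i and j.

    `adj` maps each node to the set of its neighbours (undirected graph).
    Returns 0.0 when neither node has any neighbour other than its partner.
    """
    n_ij = len(adj[i] & adj[j])  # common neighbours = triangles around the link
    denominator = (len(adj[i]) - 1) + (len(adj[j]) - 1) - n_ij
    return n_ij / denominator if denominator > 0 else 0.0


# Hypothetical toy graph: triangle a-b-c plus a pendant node d attached to a.
toy_adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

print(neighbourhood_overlap(toy_adj, "a", "b"))
print(neighbourhood_overlap(toy_adj, "b", "c"))
print(neighbourhood_overlap(toy_adj, "a", "d"))
```

With this definition, $O_{ij} = 1$ means the two nodes share all of their other neighbours, and $O_{ij} = 0$ means they share none.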

21
Empirical Verification
  • Let $\langle O \rangle_w$ denote $O_{ij}$ averaged over a bin of
    $w$-values
  • Use the cumulative link weight distribution
  • (the fraction of links with weights less than
    $w$)
  • Relative neighbourhood overlap increases as a
    function of link weight
  • → Verifies Granovetter's hypothesis (95%)
  • (Exception: top 5% of weights)
  • Blue curve: empirical network
  • Red curve: weight-randomised network
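
One way to produce such a curve is to sort the links by weight, cut them into equally populated bins, and average the overlap in each bin. The sketch below assumes this equal-count binning (the slides do not say how the bins were chosen) and uses hypothetical names and data.

```python
def mean_overlap_by_weight_rank(links, n_bins):
    """Average overlap per bin of links ordered by increasing weight.

    `links` is a list of (weight, overlap) pairs. Returns a list of
    (cumulative_fraction, mean_overlap), where cumulative_fraction is the
    fraction of links at or below the upper edge of the bin.
    """
    ordered = sorted(links)  # ascending weight
    m = len(ordered)
    curve = []
    for b in range(n_bins):
        lo = b * m // n_bins
        hi = (b + 1) * m // n_bins
        chunk = ordered[lo:hi]
        if not chunk:
            continue  # more bins than links: skip empty bins
        mean_o = sum(o for _, o in chunk) / len(chunk)
        curve.append((hi / m, mean_o))
    return curve


# Hypothetical (weight, overlap) pairs.
toy_pairs = [(4, 0.3), (1, 0.0), (3, 0.2), (2, 0.1)]
print(mean_overlap_by_weight_rank(toy_pairs, n_bins=2))
```

A weight-randomised reference, as in the red curve, could be obtained by shuffling the weights among the links before calling the same function.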

22
Local Implications
  • Implication for strong links?

Neighbourhood overlap is high → people
form strongly connected communities
  • Implication for weak links?

Neighbourhood overlap is low →
communities are connected by weak links
23
A Piece of the Network
[Figure labels: weak links, strong links, community]
24
Overlap
Global optimization for transport would put high
weights on links with high betweenness
centrality $b$ (number of shortest paths passing through the link).
In contrast, $\langle O \rangle_b$ decreases with $b$.
25
High Weight Links?
  • (a) Average $O_{ij}$ as a function of weight $w$
  • $w < 10^4$: stronger tie → larger overlap
  • $w > 10^4$: stronger tie → smaller overlap
  • Contradicts the weak ties hypothesis!
  • Links in the decreasing part correspond to over
    3 h of communication over the period
  • (b) Putting it into perspective
  • - For only 5% of links, $w > 10^4$
  • - This corresponds to 325 000 links, so it cannot be
    due to insufficient statistics

26
High Weight Links?
  • Weak links: strength of both adjacent nodes (min and
    max) is considerably higher than the link weight
  • Strong links: strength of both adjacent nodes
    (min and max) is about as high as the link weight
  • Indication: high-weight relationships clearly
    dominate the on-air time of both; others are negligible
  • The ratio of time spent communicating with one other
    person converges to 1 at roughly $w \approx 10^4$
  • Consequence: less time to interact with others
  • This explains the onset of the decreasing trend for $\langle O \rangle_w$

28
Thresholding Analysis: Introduction
  • Children's approach: break to learn!
  • We do this systematically using thresholding
    analysis:
  • Order the links by weight
  • Delete the links, one by one, based on their
    order
  • Control parameter $f$ is the fraction of removed
    links
  • We can continuously interpolate, in either
    direction, between the initial connected network
    ($f = 0$) and the set of isolated nodes ($f = 1$)
  • We use two different thresholding schemes:
  • (i) Increasing thresholding (remove low
    $w_{ij}$ / $O_{ij}$ links first)
  • (ii) Descending thresholding (remove high
    $w_{ij}$ / $O_{ij}$ links first)
  • Question: how does the network respond to link
    removal?
  • How similar is the response to $w_{ij}$- and $O_{ij}$-driven
    thresholding?
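
The thresholding procedure can be sketched as below: links are ranked by a score (weight or overlap), a fraction f of them is removed from one end of the ranking, and the relative size of the largest connected component is measured. The union-find helper, the function names and the toy network are illustrative assumptions, not the authors' implementation.

```python
def component_sizes(nodes, links):
    """Sizes of connected components (largest first), computed with union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in links:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j

    sizes = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)


def threshold_curve(nodes, link_scores, fractions, remove_low_first=True):
    """Relative size of the largest component after removing a fraction f of links.

    `link_scores` maps (i, j) -> score (e.g. weight w_ij or overlap O_ij).
    Returns a list of (f, R_LCC) pairs.
    """
    ordered = sorted(link_scores, key=link_scores.get, reverse=not remove_low_first)
    curve = []
    for f in fractions:
        n_removed = round(f * len(ordered))
        remaining = ordered[n_removed:]
        sizes = component_sizes(nodes, remaining)
        curve.append((f, sizes[0] / len(nodes)))
    return curve


# Hypothetical toy network: two strong triangles joined by one weak bridge c-d.
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_weights = {
    ("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 10,
    ("d", "e"): 10, ("e", "f"): 10, ("d", "f"): 10,
    ("c", "d"): 1,
}
print(threshold_curve(toy_nodes, toy_weights, [0.0, 1 / 7], remove_low_first=True))
print(threshold_curve(toy_nodes, toy_weights, [0.0, 1 / 7], remove_low_first=False))
```

In the toy example, removing the single weak bridge splits the network in half, while removing one strong link leaves it connected.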

29
Thresholding
  • Initial connected network ($f = 0$)
  • → All links are intact, i.e. the network is in
    its initial stage

30
Thresholding
  • Increasing weight thresholded network ($f = 0.8$)
  • → 80% of the weakest links removed, strongest
    20% remain

31
Thresholding
  • Initial connected network ($f = 0$)
  • → All links are intact, i.e. the network is in
    its initial stage

32
Thresholding
  • Decreasing weight thresholded network ($f = 0.8$)
  • → 80% of the strongest links removed, weakest
    20% remain

33
Thresholding
  • We will study, as a function of the control
    parameter $f$, the following:
  • Order parameter (size of the largest component)
  • Susceptibility (average size of other
    components)
  • Average path lengths (in LCC)
  • Average clustering coefficient in the LCC

34
Thresholding: Size of Largest Component
  • $R_{LCC}$ is the fraction of nodes in the largest
    connected component
  • LCC is able to sustain its integrity for moderate
    values of $f$
  • Least affected by removal of high-$O_{ij}$ links (in
    tight communities)
  • Most affected by removal of low-$O_{ij}$ links
    (between communities)
  • Difference between removal of low- and high-$w_{ij}$
    links is small, but the LCC breaks earlier if weak
    links are removed (Granovetter)
  • Very few links are required for global
    connectivity

[Figure legend: remove low first; remove high first]
35
Thresholding: Size of Other Components
  • Collapse for different values of $f$, but what is
    its nature?
  • Susceptibility $S$ (average cluster size excl. LCC)
  • $n_s$ is the number of clusters with $s$ nodes
  • Percolation theory: $S \to \infty$ as $f \to f_c$
  • Finite signature of divergence: $f_c = 0.60$ (increasing
    $O$), $f_c = 0.82$ (increasing $w$)
  • Demarcation between weak and strong links given
    by $f_c = 0.82$
  • Qualitatively different role for weak and strong
    links

[Figure legend: remove low first; remove high first]
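
The formula for the susceptibility did not survive in the transcript. The sketch below assumes a standard percolation-style definition, $S = \sum_s n_s s^2 / N$, where the sum runs over all components except the largest and $N$ is the total number of nodes; the normalisation and the function name are assumptions.

```python
from collections import Counter


def susceptibility(component_sizes):
    """S = sum_s n_s * s^2 / N, excluding the largest component from the sum.

    `component_sizes` is a non-empty list of component sizes; N is their total.
    """
    sizes = sorted(component_sizes, reverse=True)
    n_total = sum(sizes)
    n_s = Counter(sizes[1:])  # n_s: number of clusters with s nodes, LCC dropped
    return sum(count * s * s for s, count in n_s.items()) / n_total


# Hypothetical component sizes after some thresholding step.
print(susceptibility([10, 3, 3, 1]))
```

Evaluating this at each value of $f$ and looking for a peak is one way to locate the finite-size signature of $f_c$ mentioned on the slide.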
36
Thresholding: Path Lengths in LCC
  • Granovetter refers to interpersonal flow
    (information, rumour) from one person to another
  • In order for a flow to exist, the two people
    (nodes) need to be connected at least through one
    path
  • The size of the LCC says nothing about how
    tightly connected the component is, only that it
    is connected
  • Granovetter's corollary:
  • Weak ties create a large number of short paths
    between nodes in different communities, and
    thus removing them should increase average path
    lengths and make it more difficult for the flow
    to happen

37
Thresholding: Path Lengths in LCC
  • Connectedness is a necessary but not sufficient
    condition for flow
  • But how is the LCC connected?
  • Use the average path length $\langle l \rangle$ to study the
    role of different links for global paths
  • Removing weak links leads to longer paths:
    at $f = 0.75$, $\langle l \rangle = 45$ vs. $\langle l \rangle = 30$
  • Supports the weak ties conjecture on path lengths
  • (communities are locally connected by weak ties)

[Figure legend: remove low first; remove high first]
38
Thresholding: Clustering in LCC
  • Effect of different links on the structure of
    communities?
  • Quantify this with $\langle C \rangle$, the average clustering
    coefficient
  • Strong links are mostly within communities
    (triangles abundant), and thus removing them
    lowers clustering
  • Weak links are mostly between communities (rarely
    participate in triangles), and thus removing them
    has little effect
  • Removing high-$O_{ij}$ links shatters communities
    quickly
  • Removing low-$O_{ij}$ links brings out communities

[Figure legend: remove low first; remove high first]
40
Diffusion of information
  • Knowledge of information diffusion is based on
    unweighted networks
  • Use the present network to study diffusion on a
    weighted network: does the local
    relationship between topology and tie strength
    have an effect?
  • Spreading simulation: infect one node with new
    information
  • (1) Empirical: $p_{ij} \propto w_{ij}$
  • (2) Reference: $p_{ij} \propto \langle w \rangle$
  • Spreading is significantly faster on the reference
    (average weight) network
  • Information gets trapped in communities in the
    real network

[Figure legend: Reference; Empirical]
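
A minimal sketch of such a spreading experiment, assuming a simple susceptible-infected process in which an infected node passes the information to each uninformed neighbour with probability $p_{ij} = \min(1, x\,w_{ij})$ per time step. The rate parameter `x`, the cap at 1, the function names and the toy network are assumptions for illustration.

```python
import random


def spread(adj_weights, seed_node, x, max_steps, rng):
    """Simulate SI spreading; return the number of informed nodes after each step.

    `adj_weights` maps node -> {neighbour: weight}. Transmission probability
    over a link per step is min(1, x * weight).
    """
    infected = {seed_node}
    history = [1]
    for _ in range(max_steps):
        newly = set()
        for i in infected:
            for j, w in adj_weights[i].items():
                if j not in infected and rng.random() < min(1.0, x * w):
                    newly.add(j)
        infected |= newly
        history.append(len(infected))
        if len(infected) == len(adj_weights):
            break  # everyone is informed
    return history


def reference_network(adj_weights):
    """Same topology, with every link weight replaced by the mean weight <w>."""
    total = sum(w for nbrs in adj_weights.values() for w in nbrs.values())
    count = sum(len(nbrs) for nbrs in adj_weights.values())
    mean_w = total / count
    return {i: {j: mean_w for j in nbrs} for i, nbrs in adj_weights.items()}


# Hypothetical toy chain a - b - c with one weak and one strong link.
toy_chain = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 3.0}, "c": {"b": 3.0}}
rng = random.Random(0)
print(spread(toy_chain, "a", x=0.2, max_steps=50, rng=rng))
print(spread(reference_network(toy_chain), "a", x=0.2, max_steps=50, rng=rng))
```

Comparing the two histories over many seeds and runs mirrors the empirical versus reference comparison on the slide.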
41
Diffusion of information
  • Where do individuals get their information?
    Majority of infections through:
  • (1) Empirical: ties of intermediate strength
  • (2) Reference: (would-be) weak ties
  • Both weak and strong ties have a diminishing role
    as information sources: "the weakness of weak
    and strong ties"

42
Diffusion of information
  • - Start spreading 100 times (large red node)
  • - Information flows differently due to the local
    organizational principle
  • (1) Empirical: information flows along a strong
    tie backbone
  • (2) Reference: information mainly flows along
    the shortest paths

Best search results: reach out of your own
community
[Figure panels: Empirical; Reference]
43
Spreading
  • In simplified terms, we can think of each link as
    transmitting information locally between the two
    individuals it connects
  • Strong links involve larger time commitments, so it is
    natural to assume that information flow through a
    link is proportional to its weight $w_{ij}$
  • Flow through weak (low $w_{ij}$) links:
  • (i) Low per se (by definition)
  • (ii) Low overlap $O_{ij}$ → few alternative paths of
    length 2, so information can easily get trapped
  • Flow through strong (high $w_{ij}$) links:
  • (i) High per se (by definition)
  • (ii) High overlap $O_{ij}$ → many alternative paths
    enhance flow further, so particularly well suited
    to efficient local transfer

44
Searching
  • Fix a set of search strategies
  • Study which strategies are successful in finding
    information
  • Best search results: reach out of your own
    community!

46
Modeling
  • What is all this good for?
  • Understanding structure and mechanisms of the
    society
  • Improving spreading of news and opinions
  • (Developing marketing strategies and other tools
    of mass manipulation)
  • MODELING needed

47
Modeling
Needed: a weighted network model which reflects
the observations with possibly limited input.
Links created by random encounters on an
acquaintance basis. Weights generated by
one-to-one activities (phone calls). Take into
account the different time scales:
  • Encounter (call) frequency
  • Lifetime of relationships
  • Lifetime of nodes
→ treated together
48
Microscopic mechanisms in sociology
  • Network sociology:
  • Cyclic closure: exponential decay for growing geodesic distance
  • Focal closure: distance independent
  • Sample window
  • Network model:
  • Local attachment (LA): special case of cyclic closure, triadic closure
  • Global attachment (GA)
  • Node deletion (ND)

G. Kossinets and D. J. Watts, Empirical Analysis of an
Evolving Social Network, Science 311, 88 (2006)
49
Modeling
$i$ meets $j$ with prob. $\propto w_{ij}$, who meets $k$ with
prob. $\propto w_{jk}$. If $k$ is a common friend, $w_{ij}$, $w_{jk}$ and
$w_{ki}$ are increased by $\delta$ (a). If $k$ is not connected
to $i$, $w_{ik} = w_0$ ($= 1$) is created with probability
$p_\Delta$ (b). With prob. $p_r$ new links with weight $w_0$
are created (c). With prob. $p_d$ a node with all its
links is deleted and a new one is born with no
links.
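
The rules on this slide can be sketched as one sweep over all nodes. This is an illustrative reading, not the authors' code. It assumes that the traversed links $w_{ij}$ and $w_{jk}$ are reinforced whenever the search reaches a node $k$ (the slide states this only for case (a)), that isolated nodes always make a global connection, and that deletion is applied after the sweep. The parameter name `p_tri` stands in for the triangle-closing probability of case (b); all names are hypothetical.

```python
import random

W0 = 1.0  # weight of a newly created link (w_0 on the slide)


def weighted_choice(candidates, rng):
    """Pick a key of `candidates` (node -> weight) with probability proportional to weight."""
    nodes = list(candidates)
    return rng.choices(nodes, weights=[candidates[n] for n in nodes], k=1)[0]


def add_weight(w, i, j, amount):
    """Increase (or create) the undirected link i-j by `amount`."""
    w[i][j] = w[i].get(j, 0.0) + amount
    w[j][i] = w[j].get(i, 0.0) + amount


def model_step(w, delta, p_tri, p_r, p_d, rng):
    """One sweep of local attachment, global attachment and node deletion.

    `w` maps node -> {neighbour: weight} and is modified in place.
    """
    nodes = list(w)
    for i in nodes:
        # Local attachment: weighted search i -> j -> k.
        if w[i]:
            j = weighted_choice(w[i], rng)
            others = {k: wt for k, wt in w[j].items() if k != i}
            if others:
                k = weighted_choice(others, rng)
                add_weight(w, i, j, delta)
                add_weight(w, j, k, delta)
                if k in w[i]:
                    add_weight(w, i, k, delta)  # (a) k is already a common friend
                elif rng.random() < p_tri:
                    add_weight(w, i, k, W0)     # (b) close the triangle
        # Global attachment: (c) link to a random node with weight w_0.
        if not w[i] or rng.random() < p_r:
            candidates = [n for n in nodes if n != i and n not in w[i]]
            if candidates:
                add_weight(w, i, rng.choice(candidates), W0)
    # Node deletion: the node loses all links and is "reborn" with none.
    for i in nodes:
        if rng.random() < p_d:
            for j in list(w[i]):
                del w[j][i]
            w[i] = {}


# Hypothetical usage: start from 50 isolated nodes and iterate.
rng = random.Random(42)
network = {n: {} for n in range(50)}
for _ in range(200):
    model_step(network, delta=0.5, p_tri=0.05, p_r=5e-4, p_d=1e-3, rng=rng)
print(sum(len(nbrs) for nbrs in network.values()) // 2, "links")
```

The parameter values in the usage example reuse the orders of magnitude quoted in the talk, except `p_tri`, which is an arbitrary placeholder.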
50
Microscopic rules in the model
  • Summary of the model:
  • Weighted local search for new acquaintances
  • Reinforcement of existing (popular) links
  • Unweighted global search for new acquaintances
  • Node removal, exponential link weight lifetimes:
    $\langle \tau \rangle = 2 \langle \tau_w \rangle = (p_d)^{-1}$
  • Model parameters:
  • $\delta$: free weight reinforcement parameter
  • $p_d = 10^{-3}$: sets the time scale of the model,
    $\langle \tau_N \rangle = 1/p_d$
  • (average node lifetime of 1000 time steps)
  • $p_r = 5 \times 10^{-4}$: global connections; results not
    sensitive to it
  • (one random link per node during 1000 time
    steps)
  • $p_\Delta$: adjusted in relation to $\delta$ to keep $\langle k \rangle$
    constant
  • (structure changes are due only to link
    re-organisations)

51
Modeling
$p_d = 0.001$, $p_r = 0.0005$, $N = 30\,000$
Changing $\delta$ while keeping $\langle k \rangle$ fixed by adjusting $p_\Delta$.
Communities emerge, with strong internal
links. Communities are interconnected by weak
links.
[Figure panels: $\delta = 0.001$, $0.1$, $0.5$]
52
Social network model
Samples of an $N = 10^5$ network for variable
weight-increase $\delta$
Tie strength: weak → intermediate → strong tie
53
Communities by inspection
  • Average number of links constant:
    $\langle L \rangle = N \langle k \rangle / 2$ ($\langle k \rangle \approx 10$)
  • ⇒ All changes in structure are due to
    re-organisation of links
  • Increasing $\delta$ traps search in communities,
    further enhancing the trapping effect
  • ⇒ Clear communities form
  • Triangles accumulate weight and act as
    nuclei for communities to emerge

54
Communities by k-clique method
  • k-clique algorithm as definition for communities
  • Focus on 4-cliques (smallest non-trivial cliques)
  • Relative largest community size $R_{k=4} \in [0, 1]$
  • Average community size $\langle n_s \rangle$ (excl. largest)
  • Observe clique percolation through the system for
    small $\delta$
  • Increasing $\delta$ leads to condensation of communities

G. Palla et al., Uncovering the overlapping
community structure..., Nature 435, 814 (2005)
55
Global consequences
[Figure panels: remove weak links first; remove strong links first]
56
Global consequences
[Figure: model network vs. phone network under ascending and descending link removal; x-axis: fraction of links, f, from 0 to 1]
Phase transition for ascending tie removal
(weaker first)
57
Modeling
  • The model fulfills essential criteria of social
    NW-s:
  • Broad (but not scale-free) degree distribution
  • Assortative mixing (popular people attract each
    other)
  • High clustering: many triangles (by
    construction)
  • Community structure with strong links inside and
    weak ones between them

59
Discussion and Conclusion
  • Weak ties maintain the network's structural
    integrity. Strong ties maintain local
    communities. Intermediate ties are mostly responsible
    for first-time infections.
  • How can one efficiently search for information in
    a social network? Go out of your community!
  • Social networks seem better suited to local
    processing than to global transmission of
    information
  • Are there simple rules or mechanisms that lead to
    the observed properties?
  • Efficient modeling is possible

Publications:
J.-P. Onnela et al., PNAS 104, 7332-7336 (2007)
J.-P. Onnela et al., New J. Phys. 9, 179 (2007)
J. M. Kumpula et al., PRL (to be published)
www.phy.bme.hu/kertesz/
60
Mark Granovetter, Connections, 1990
»In the history of public speaking, there have
been many famous denials. One sunny day in 1880,
Karl Marx declared "I am not a Marxist". On a
less auspicious occasion in 1973, Richard Nixon
insisted "I am not a crook". Neither Marx nor
Nixon's audience gave much credence to their
denials, and you too may respond with disbelief
when I tell you that "I am not a networker".«
»Instead, the slogan of the day will be "We are
all networkers now".«