Title: Traffic-driven model of the World-Wide-Web Graph
1Traffic-driven model of the World-Wide-Web Graph
- A. Barrat, LPT, Orsay, France
- M. Barthélemy, CEA, France
- A. Vespignani, LPT, Orsay, France
2Outline
- The WebGraph
- Some empirical characteristics
- Various models
- Weights and strengths
- Our model
- Definition
- Analysis: analytics + numerics
- Conclusions
3The Web as a directed graph
Nodes i: web pages; directed links: hyperlinks
[Figure: nodes i, j and l connected by directed links]
In- and out-degrees
4Empirical facts
- Small world: captured by Erdős-Rényi graphs
With probability p, an edge is established between each pair of vertices
$\langle k \rangle = pN$
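The relation between the mean degree and p can be checked numerically. The sketch below samples an undirected random graph in which every pair of vertices is linked independently with probability p; the function name `er_mean_degree` and the parameter values are illustrative assumptions, not part of the talk.

```python
import random

def er_mean_degree(n, p, seed=0):
    """Sample an undirected random graph on n vertices and return its mean degree."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair of vertices is linked independently with probability p.
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return sum(degree) / n

n, p = 1000, 0.01
# Sampled mean degree next to the expectation p(N-1), which is close to pN for large N.
print(er_mean_degree(n, p), p * (n - 1))
```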
5Empirical facts
- Small world
- Large clustering: different neighbours of a node will likely know each other
=> graph models with large clustering, e.g. Watts-Strogatz 1998
6Empirical facts
- Small world
- Large clustering
- Dynamical network
- Broad connectivity distributions
- also observed in many other contexts (from biological to social networks)
- huge modeling activity
(Barabási-Albert 1999; Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)
7Various growing networks models
- Barabási-Albert (1999): preferential attachment
- Many variations on the BA model: rewiring (Tadic 2001, Krapivsky et al. 2001), addition of edges, directed model (Dorogovtsev-Mendes 2000, Cooper-Frieze 2001), fitness (Bianconi-Barabási 2001), ...
- Kumar et al. (2000): copying mechanism
- Pandurangan et al. (2002): PageRank + pref. attachment
- Laura et al. (2002): multi-layer model
- Menczer (2002): textual content of web pages
8The Web as a directed graph
Nodes i: web pages; directed links: hyperlinks
[Figure: nodes i, j and l connected by directed links]
Broad $P(k^{in})$; cut-off for $P(k^{out})$
(Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)
9Additional level of complexity: Weights and Strengths
[Figure: nodes i, j and l connected by directed links]
Links carry weights/traffic $w_{ij}$
In- and out-strengths
Adamic-Huberman 2001: broad distribution of $s^{in}$
10Model directed network
(i) Growth
(ii) Strength-driven preferential attachment (new node n: $k^{out} = m$ outlinks)
[Figure: nodes i and j]
Busy gets busier
AND...
11Weights reinforcement mechanism
[Figure: nodes n, i and j]
The new traffic n → i increases the traffic i → j
Busy gets busier
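The two ingredients (growth with strength-driven attachment, then reinforcement of outgoing traffic) can be put together in a small simulation. This is a minimal sketch, not the authors' implementation, and several details are assumptions that the slides do not give: each new link starts with weight `w0`; each new link n → i adds a total traffic `delta` to the out-links of i, shared in proportion to their current weights; the seed graph is m+1 fully connected nodes; and a hypothetical offset `a0` is added to the in-strength when choosing targets, so that pages with no in-links yet can still be selected.

```python
import random
from collections import defaultdict

def grow_traffic_network(n_nodes, m=2, delta=0.5, w0=1.0, a0=1.0, seed=0):
    """Return {(i, j): weight} for a grown weighted directed graph (links i -> j)."""
    rng = random.Random(seed)
    w = {}                      # weight of each directed link
    s_in = defaultdict(float)   # in-strength of each node
    out_nb = defaultdict(list)  # out-neighbours of each node

    def add_link(i, j):
        w[(i, j)] = w0
        s_in[j] += w0
        out_nb[i].append(j)

    # Assumed seed graph: m + 1 nodes linked in both directions.
    for i in range(m + 1):
        for j in range(m + 1):
            if i != j:
                add_link(i, j)

    for n in range(m + 1, n_nodes):
        # Choose m distinct targets with probability proportional to
        # in-strength plus the assumed offset a0.
        nodes = list(range(n))
        attract = [s_in[i] + a0 for i in nodes]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(nodes, weights=attract)[0])
        for i in targets:
            # Reinforcement (assumed rule): share delta among the out-links
            # of i in proportion to their current weights.
            s_out_i = sum(w[(i, j)] for j in out_nb[i])
            for j in out_nb[i]:
                dw = delta * w[(i, j)] / s_out_i
                w[(i, j)] += dw
                s_in[j] += dw
            add_link(n, i)      # growth: the new link n -> i
    return w

w = grow_traffic_network(1000)
# Under these assumptions the average weight should be close to w0 + delta.
print(sum(w.values()) / len(w))
```

Under these assumptions every new node adds m links and a total weight of m(w0 + delta), which gives a quick sanity check on any run.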
12Evolution equations
(Continuous approximation)
Coupling term
13Resolution
Ansatz
supported by numerics
14Results
15Approximation
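Total in-weight $\sum_i s^{in}_i$ approximately proportional to the total number of in-links $\sum_i k^{in}_i$, times the average weight $\langle w \rangle = 1 + \delta$
Then $A \approx 1 + \delta$
$\gamma_{s^{in}} \approx 2 \in [2, 2 + 1/m]$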
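The step from the average weight to A can be written out explicitly. This is a sketch under assumptions the slide does not state: that the ansatz has the form $s^{in}_i = A\,k^{in}_i$, that each new link is created with weight 1, and that each new link adds a total extra traffic $\delta$ to existing links. At time t the network then holds about m t links.

```latex
\sum_i k^{in}_i(t) \simeq m\,t, \qquad
\sum_i s^{in}_i(t) \simeq m\,(1+\delta)\,t
\;\Longrightarrow\;
\langle w \rangle = \frac{\sum_i s^{in}_i}{\sum_i k^{in}_i} \simeq 1+\delta,
\qquad
s^{in}_i = A\,k^{in}_i \;\Longrightarrow\; A \simeq \langle w \rangle \simeq 1+\delta .
```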
16Numerical simulations
Measure of A prediction of ?
Approximation of $\gamma$
17Numerical simulations
NB: broad $P(s^{out})$ even if $k^{out} = m$
18Clustering spectrum
i.e. fraction of connected couples of neighbours
of node i
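The definition above translates directly into code. The sketch below computes the clustering of a node as the fraction of linked pairs among its neighbours, then averages it over nodes of equal degree to obtain a spectrum. Treating links as undirected, and the names `local_clustering` and `clustering_spectrum`, are assumptions made for illustration.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Fraction of pairs of neighbours of i that are themselves linked.

    adj maps each node to the set of its neighbours (undirected view).
    """
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # no pair of neighbours to compare
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

def clustering_spectrum(adj):
    """Average local clustering over all nodes of the same degree k."""
    by_degree = {}
    for i in adj:
        by_degree.setdefault(len(adj[i]), []).append(local_clustering(adj, i))
    return {k: sum(v) / len(v) for k, v in by_degree.items()}

# Toy graph: a triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_spectrum(adj))
```

In the toy graph only one of the three neighbour pairs of node 0 is linked, so its clustering is 1/3, while nodes 1 and 2 each have clustering 1.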
19Clustering spectrum
- $\delta$ increases => clustering increases
- New pages point to various well-known pages, often connected together => large clustering for small nodes
- Old, popular pages with large k: many in-links from many less popular pages which are not connected together => smaller clustering for large nodes
20Clustering and weighted clustering
takes into account the relevance of triangles in
the global traffic
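One way to weight triangles by traffic is sketched below. The formula from the slide is not available here, so the code assumes a commonly used definition for undirected weighted graphs: each closed pair of neighbours (j, h) of node i contributes the sum of weights w_ij + w_ih, normalised by s_i (k_i - 1), where s_i is the strength and k_i the degree of i. Whether the talk used exactly this form is an assumption, as is the function name.

```python
from itertools import combinations

def weighted_clustering(w, i):
    """Weighted clustering of node i; w[a][b] is the weight of the undirected link a-b."""
    nbrs = w[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    s = sum(nbrs.values())  # strength of node i
    # Sum the weights of the two links of i that take part in each triangle.
    total = sum(nbrs[j] + nbrs[h] for j, h in combinations(nbrs, 2) if h in w[j])
    return total / (s * (k - 1))

# Node 0 has heavy links to 1 and 2 (which are linked) and a light link to 3.
w = {
    0: {1: 5.0, 2: 5.0, 3: 1.0},
    1: {0: 5.0, 2: 1.0},
    2: {0: 5.0, 1: 1.0},
    3: {0: 1.0},
}
print(weighted_clustering(w, 0))  # 10/22, larger than the topological value 1/3
```

With equal weights on all links this expression reduces to the topological clustering, so any difference between the two reflects how much of the traffic flows along triangles.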
21Clustering and weighted clustering
Weighted clustering larger than topological clustering: triangles carry a large part of the traffic
22Assortativity
Average connectivity of nearest neighbours of i
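This quantity can be computed as in the sketch below, which averages the degrees of the neighbours of each node and then groups nodes by their own degree. The undirected view of the links and the names `knn` and `knn_spectrum` are assumptions for illustration.

```python
def knn(adj, i):
    """Average degree of the nearest neighbours of node i."""
    nbrs = adj[i]
    if not nbrs:
        return 0.0
    return sum(len(adj[j]) for j in nbrs) / len(nbrs)

def knn_spectrum(adj):
    """Average knn over all nodes of the same degree k."""
    by_degree = {}
    for i in adj:
        by_degree.setdefault(len(adj[i]), []).append(knn(adj, i))
    return {k: sum(v) / len(v) for k, v in by_degree.items()}

# Star graph: a hub (node 0) linked to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(knn_spectrum(star))  # {4: 1.0, 1: 4.0}
```

In the star graph the high-degree hub has low-degree neighbours and the reverse, so the spectrum decreases with k, which is the signature of disassortative behaviour.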
23Assortativity
- $k_{nn}$: disassortative behaviour, as usual in growing network models, and typical in technological networks
- lack of correlations in popularity as measured by the in-degree
24Summary
- Web: heterogeneous topology and traffic
- Mechanism taking into account interplay between topology and traffic
- Simple mechanism => complex behaviour, scale-free distributions for connectivity and traffic
- Analytical study possible
- Study of correlations: non-trivial hierarchical behaviour
- Possibility to add features (fitnesses, rewiring, addition of edges, etc.), to modify the redistribution rule...
- Empirical studies of traffic and correlations?