1
The Complex Network of Wikipedia
F. Colaiori, V. Servedio, G. Caldarelli, AC
    Physics Dept., La Sapienza, Rome (Italy)
D. Donato, S. Leonardi
    Computer Science Dept., La Sapienza, Rome (Italy)
L. Salete Buriol
    Computer Science Dept., University of Porto Alegre,
    Rio Grande do Sul (Brazil)
2
The Complex Network of Wikipedia
  • Network description
  • Statistical analysis of Wikipedia
  • Model and interpretation
5
The Complex Network of Wikipedia
7
How does Wikipedia work?
  • Thanks to the Wiki technology, a user can
  • add new entries to the encyclopedia
  • modify the content of existing entries
  • modify their connections
  • NB: on the World Wide Web, each user is
    responsible only for the out-degree of their
    own web page.

8
Nodes and edges in Wikipedia
  • Nodes are encyclopedia entries
  • Edges are citations (links) between entries
9
Statistical Properties
The number of entries grows exponentially in time
10
Statistical Properties
Preliminary results found by Voss and Zlatic show
that Wikipedia is indeed a complex network, with
power-law degree distributions.
  • J. Voss, Proceedings of the 10th International
    Conference of the International Society for
    Scientometrics and Informetrics (Stockholm,
    Sweden), 2005.
  • V. Zlatic, M. Bozicevic, H. Stefancic, and M.
    Domazet, Phys. Rev. E 74, 016115 (2006).
11
Degree distribution
12
Preferential attachment
To detect preferential attachment, we adopted the
method introduced by Newman (2001): one builds the
histogram Π(k) of the degrees of the vertices
acquiring new edges at each time step t,
normalizing each contribution by the factor
n(k,t)/N(t), where
  • N(t) is the number of nodes at time t
  • n(k,t) is the number of nodes with degree k at
    time t
If Π(k) grows approximately linearly with k, we can
conclude that there is preferential attachment.
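
The measurement described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `attachment_histogram`, the input format (a chronological list of (source, target) links) and the choice to track in-degree only are assumptions made for the example. It also assumes that the factor n(k,t)/N(t) enters as a normalization, that is, each event is divided by the fraction of nodes that currently have degree k.

```python
from collections import Counter, defaultdict


def attachment_histogram(edges):
    """Weighted histogram of the in-degree of nodes that receive a new link.

    edges: iterable of (source, target) pairs in chronological order
           (hypothetical input format chosen for this sketch).
    Returns a dict k -> weighted count, an estimate of Pi(k) up to a
    constant factor.
    """
    in_degree = {}                   # node -> current in-degree
    nodes_with_degree = Counter()    # k -> n(k, t)
    histogram = defaultdict(float)   # k -> accumulated weight

    for source, target in edges:
        # Register nodes the first time they appear (in-degree 0).
        for node in (source, target):
            if node not in in_degree:
                in_degree[node] = 0
                nodes_with_degree[0] += 1

        k = in_degree[target]
        n_total = len(in_degree)        # N(t)
        n_k = nodes_with_degree[k]      # n(k, t), at least 1 here

        # Divide the event by the fraction n(k,t)/N(t) of nodes with degree k.
        histogram[k] += n_total / n_k

        # The target moves from degree k to degree k + 1.
        nodes_with_degree[k] -= 1
        nodes_with_degree[k + 1] += 1
        in_degree[target] = k + 1

    return dict(histogram)
```

Plotting the returned values against k on a log-log scale and checking for a slope close to 1 would be one way to look for the approximately linear behaviour mentioned above.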
13
Preferential attachment
Figure legend: circles = English, triangles =
Portuguese; filled symbols = in-degree, white
symbols = out-degree
14
Lack of correlations (in-in)
Figure panels: English, Portuguese
15
A model for Wikipedia
  • At each time step one adds a node and M edges.
    The direction of each edge is a random variable:
  • 1. with probability R1 the edge leaves the new
    node and points to an existing node chosen with
    probability proportional to its in-degree.

16
A model for Wikipedia
  • At each time step one adds a node and M edges.
    The direction of each edge is a random variable:
  • 2. with probability R2 the edge points to the new
    node and leaves an existing node chosen with
    probability proportional to its out-degree.

17
A model for Wikipedia
  • At each time step one adds a node and M edges.
    The direction of each edge is a random variable:
  • 3. with probability R3 = 1 - R1 - R2 the edge
    joins two existing nodes: it points to a node
    chosen with probability proportional to its
    in-degree and leaves a node chosen with
    probability proportional to its out-degree.
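
The three rules of the model can be sketched as a short simulation. This is an illustrative reading of the slides, not the authors' implementation. The function name `simulate_wikipedia_model`, the two-node seed graph, and the choice to apply all M edges of a time step against the network as it was before that step are assumptions of this sketch; self-links are possible under rule 3 and are left in for simplicity.

```python
import random


def simulate_wikipedia_model(n_nodes, m_edges, r1, r2, seed=None):
    """Grow a directed graph following the three rules of the model.

    Returns the list of directed edges (source, target).
    Assumed seed graph: nodes 0 and 1 linked in both directions.
    """
    rng = random.Random(seed)
    edges = [(0, 1), (1, 0)]

    # One entry per edge endpoint: drawing uniformly from `sources` picks a
    # node with probability proportional to its out-degree, and drawing from
    # `targets` picks one with probability proportional to its in-degree.
    sources = [0, 1]
    targets = [1, 0]

    for new_node in range(2, n_nodes):
        step_edges = []
        for _ in range(m_edges):
            u = rng.random()
            if u < r1:
                # Rule 1: new node -> existing node, chosen by in-degree.
                edge = (new_node, rng.choice(targets))
            elif u < r1 + r2:
                # Rule 2: existing node, chosen by out-degree -> new node.
                edge = (rng.choice(sources), new_node)
            else:
                # Rule 3 (probability R3 = 1 - R1 - R2): existing -> existing.
                edge = (rng.choice(sources), rng.choice(targets))
            step_edges.append(edge)

        # Update the network only after the whole time step (assumption).
        for src, dst in step_edges:
            edges.append((src, dst))
            sources.append(src)
            targets.append(dst)

    return edges
```

A call such as `simulate_wikipedia_model(10_000, 10, 0.026, 0.091, seed=0)` would use the parameter values quoted for the English Wikipedia; the in- and out-degree of each node can then be counted from the returned edge list.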

18
Parameters in real data
  • The parameters have a physical meaning and can
    be measured on real data. In the English case,
    for instance, this yields
  • R1 = 0.026, R2 = 0.091
  • in the data we have, M = 10

19
Rate equations for in- and out-degree
  • By approximating discrete time variations by
    derivatives with respect to the continuous
    variable t, one can write and solve the following
    rate equations for the in- and out-degree
  • dk_in/dt = (R1 + R3) k_in / t
  • dk_out/dt = (R2 + R3) k_out / t

20
Distribution of in- and out-degree
  • By solving the rate equations, one obtains the
    time evolution of the degrees and, with a little
    algebra, the distributions of the in- and
    out-degree
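
The formulas shown on this slide were not transcribed. As a hedged reconstruction, the standard continuum argument is sketched below. It assumes the rate equations read dk/dt = (R1 + R3) k / t for the in-degree and dk/dt = (R2 + R3) k / t for the out-degree, that nodes are born at times t_0 spread uniformly over [0, t], and that every node starts with the same positive degree k(t_0). The authors' own algebra may differ in its details.

```latex
% In-degree: solve the rate equation for a node born at time t_0
\frac{dk_{in}}{dt} = (R_1 + R_3)\,\frac{k_{in}}{t}
\quad\Longrightarrow\quad
k_{in}(t) = k_{in}(t_0)\left(\frac{t}{t_0}\right)^{R_1 + R_3}

% A node exceeds degree k only if it was born early enough:
% t_0 < t\,\bigl(k_{in}(t_0)/k\bigr)^{1/(R_1 + R_3)}.
% With uniform birth times this gives a power-law distribution:
P(k_{in}) \propto k_{in}^{-1 - \frac{1}{R_1 + R_3}}
          = k_{in}^{-1 - \frac{1}{1 - R_2}}

% Out-degree: same steps with R_2 + R_3 = 1 - R_1
P(k_{out}) \propto k_{out}^{-1 - \frac{1}{R_2 + R_3}}
           = k_{out}^{-1 - \frac{1}{1 - R_1}}
```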

21
Distribution of in- and out-degree
  • Such distributions can be checked against real
    data by plugging the measured coefficients
    R1, R2 and R3 into the theoretical equations.
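
As a small worked example of that substitution, the snippet below computes power-law exponents from R1 and R2. It assumes the continuum result P(k) ∝ k^-(1 + 1/β), with β = R1 + R3 for the in-degree and β = R2 + R3 for the out-degree; this form is a standard consequence of rate equations of the type dk/dt = β k / t and is not copied from the slides. The function name `degree_exponents` is hypothetical.

```python
def degree_exponents(r1, r2):
    """Return (gamma_in, gamma_out) for P(k) ~ k**(-gamma).

    Assumes gamma = 1 + 1/beta, with beta_in = R1 + R3 and
    beta_out = R2 + R3, where R3 = 1 - R1 - R2.
    """
    r3 = 1.0 - r1 - r2
    beta_in = r1 + r3    # equals 1 - R2
    beta_out = r2 + r3   # equals 1 - R1
    return 1.0 + 1.0 / beta_in, 1.0 + 1.0 / beta_out


# Values quoted in the slides for the English Wikipedia.
gamma_in, gamma_out = degree_exponents(0.026, 0.091)
print(f"gamma_in  = {gamma_in:.3f}")   # about 2.100
print(f"gamma_out = {gamma_out:.3f}")  # about 2.027
```

Under these assumptions both exponents come out slightly above 2, which can then be compared with the slopes of the measured degree distributions.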

22
Distribution of in- and out-degree
23
Correlations
  • The rate equations also allow one to compute the
    in-degree/in-degree correlations

24
Lack of correlations
Figure legend: Model, model 0.5
25
Naive interpretation
  • Hypothesis
  • in-degree = popularity
  • out-degree = quality
  • If the probability of increasing the in-degree
    depends on the in-degree itself, it means that in
    Wikipedia popularity prevails over quality. As in
    the World Wide Web?

26
Community structure
  • Wikipedia displays a strong community structure

27
Conclusions
  • Wikipedia entries form a complex network with
    preferential attachment, power-law distributions
    for both in- and out-degree, and a lack of
    correlations
  • Preferential attachment explains the main
    statistical properties
  • A naive interpretation would imply that the Wiki
    technology is not enough to provide a better
    dissemination of information than the World
    Wide Web.
  • More understanding is needed of the community
    structure.

28
Thank You
Reference: "Preferential attachment in the growth
of social networks: The internet encyclopedia
Wikipedia", A.C., V. D. P. Servedio, F. Colaiori,
L. S. Buriol, D. Donato, S. Leonardi, and G.
Caldarelli, Phys. Rev. E 74, 036116 (2006).