Lotkaian Informetrics and applications to social networks

1
Lotkaian Informetrics and applications to social
networks
  • L. Egghe
  • Chief Librarian, Hasselt University
  • Professor, Antwerp University
  • Editor-in-Chief, Journal of Informetrics
  • leo.egghe_at_uhasselt.be

2
1-dimensional informetrics
  • authors in a field
  • journals in a field
  • articles in a field
  • references (or citations) in a field
  • borrowings in a library
  • websites, hosts,
  • web citations to a paper
  • in- (or out-) links to/from a website
  • downloads of an article

3
Growth
  • Exponential growth
  • All new fields grow exponentially
  • Otherwise there is S-shaped growth.

4
web servers versus time
7
2-dimensional informetrics
  • authors in a field (sources)
  • articles in a field (items)
  • indicating which author has written which
    papers
  • S = set of sources
  • I = set of items
  • IPP = Information Production Process

8
Examples of IPPs
S F I
Authors                 Articles
Journals                Articles
Articles                Citations (to/from)
Books                   Borrowings
Words (= types)         Use of words in a text (= tokens)
Web sites               Hyperlinks (in-/out-)
Web sites               Web pages
Cities/villages         Inhabitants
Employees               Their production
Employees               Their salaries

9
  • size-frequency function f:
  • for n = 1, 2, 3, ...
  • f(n) = number of sources with n items
  • rank-frequency function g:
  • for r = 1, 2, 3, ...
  • g(r) = number of items in the source on rank r
  • (sources are ranked in decreasing order of the
    number of items they have)

10
Continuous model
  • Source densities
  • Item densities

11
Lotkaian Informetrics
  • The law of Lotka and the law of Zipf
  • Lotka (1926): f(n) = C / n^α

The value α = 2 is a
turning point in informetrics (see further).
12
  • Lotka's law is equivalent to Zipf's law

Zipf's law comes from linguistics; in econometrics
it is called Pareto's law
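
The equivalence can be sketched in a continuous setting. The symbols C, α, B and β, and the assumption that source sizes are unbounded, are illustrative choices and do not come from the slides. Ranking a source with n items by the number of sources having at least n items, and then inverting, gives a Zipf-type law:

```latex
\begin{align*}
f(n) &= \frac{C}{n^{\alpha}}, \qquad \alpha > 1 \\
r(n) &= \int_{n}^{\infty} \frac{C}{x^{\alpha}}\,dx = \frac{C}{\alpha-1}\, n^{1-\alpha} \\
g(r) &= n = \left(\frac{C}{(\alpha-1)\,r}\right)^{1/(\alpha-1)} = \frac{B}{r^{\beta}},
\qquad \beta = \frac{1}{\alpha-1}
\end{align*}
```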
13
Dependence of G on . Existence of a Groos
droop if .
14
log-log scale
  • decreasing straight line with slope
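
A minimal sketch of how the log-log picture can be used in practice, assuming Lotka's law in the form f(n) = C / n^α: on a log-log scale the points lie on a straight line, and a least-squares fit recovers its slope. The data below are synthetic and the function name `loglog_slope` is hypothetical; real data would need more careful estimation than a plain regression.

```python
import math

def loglog_slope(ns, counts):
    """Least-squares slope of log(count) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic (made-up) size-frequency data that follow f(n) = 1000 / n**2 exactly.
ns = list(range(1, 21))
counts = [1000 / n ** 2 for n in ns]

slope = loglog_slope(ns, counts)
print(slope)  # close to -2 for this synthetic example
```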

15
Rank-frequency distributions for websites
16
The scale-free property
  • f scale-free
  • such that

17
Theorem: (i) ⇔ (ii)
  • (i) f is continuous, decreasing and scale-free
  • (ii) f is a decreasing power function:
  • there exist C > 0 and α > 0 such that f(x) = C / x^α,
  • i.e. Lotka's law
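
A small numerical illustration of the scale-free property. The defining formula is missing from the transcript, so the reading used here is an assumption: f is scale-free if for every constant c > 0 there is a constant d > 0 with f(c·x) = d·f(x) for all x. The helper names are hypothetical.

```python
import math

def lotka(x, C=1.0, alpha=2.0):
    """Decreasing power function C / x**alpha."""
    return C / x ** alpha

def rescaling_ratio(f, c, x):
    """f(c*x) / f(x); for a scale-free f this does not depend on x."""
    return f(c * x) / f(x)

c = 3.0
# Power law: the ratio equals c**(-alpha) at every x.
power_ratios = [rescaling_ratio(lotka, c, x) for x in (1.0, 2.5, 10.0, 400.0)]

# Counter-example (an exponential): the ratio changes with x.
exp_ratios = [rescaling_ratio(lambda x: math.exp(-x), c, x) for x in (1.0, 2.5)]

print(power_ratios)
print(exp_ratios)
```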

18
  • Explanation of Lotka's law based on exponential
    growth of sources and items (Naranan (1970)) and
    an interpretation of Lotkaian IPPs as
    self-similar fractals
  • (Egghe (2005))
  • Fractals and fractal dimension

19
  • Divide a line piece into 3 equal parts
  • ⇒ we need 3 = 3^1 line pieces of this length to
    cover the original line piece
  • 3 ⇒ need 3 = 3^1 ⇒ dim = 1


20
  • Divide the sides of a square into 3 equal parts ⇒
    we need 9 = 3^2 squares with this side length to
    cover the original square
  • 3 ⇒ need 9 = 3^2 ⇒ dim = 2
  • The same for a cube
  • 3 ⇒ need 27 = 3^3 ⇒ dim = 3

21
Construction of the triadic Koch curve
22
  • For the triadic Koch curve:
  • 3 ⇒ need 4 = 3^D ⇒ dim = D
  • with D = ln 4 / ln 3 ≈ 1.262

The Koch curve is a proper fractal with fractal
dimension D ≈ 1.262. Complexity theory, fractal theory:
Mandelbrot
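
The counting argument of the last slides can be sketched as a one-line computation: if scaling down by a factor s requires N pieces to cover the original object, the dimension D solves N = s^D. The function name `similarity_dimension` is a hypothetical label for this illustration.

```python
import math

def similarity_dimension(pieces, scale):
    """Solve pieces == scale ** D for D."""
    return math.log(pieces) / math.log(scale)

line = similarity_dimension(3, 3)     # line piece divided in 3
square = similarity_dimension(9, 3)   # square with sides divided in 3
cube = similarity_dimension(27, 3)    # cube with sides divided in 3
koch = similarity_dimension(4, 3)     # triadic Koch curve

print(line, square, cube, koch)
```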
23
Naranan (Nature, 1970)
  • Theorem
  • (i) The number of sources grows exponentially in
    time t.
  • (ii) The number of items in each source grows
    exponentially in time.
  • (iii) The growth rate in (ii) is the same for
    every source.
  • (ii) and (iii) together imply a fixed exponential
    function for the number of items in each source
    at time t.

24
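  • Then this IPP is Lotkaian, i.e. the law of Lotka
    applies: if f(p) denotes the number of sources
    with p items, we have f(p) = C / p^α,
  • where α = 1 + ln(a1) / ln(a2), with a1 the growth
    rate of the sources in (i) and a2 the growth rate
    of the items in (ii).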
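
A sketch of why exponential growth leads to a power law. The constants c_1, a_1, c_2, a_2 and the continuous-time treatment are assumptions made for this illustration, since the formulas on the slides did not survive in the transcript. A source born at time s has, at time t, grown for a period t − s, so asking for at least p items amounts to asking for a birth time that is early enough:

```latex
\begin{align*}
N(t) &= c_1 a_1^{t}, \quad a_1 > 1
  && \text{(number of sources at time } t\text{)} \\
p(t,s) &= c_2 a_2^{\,t-s}, \quad a_2 > 1
  && \text{(items at time } t \text{ in a source born at time } s\text{)} \\
p(t,s) \ge p &\iff s \le t - \frac{\ln(p/c_2)}{\ln a_2} \\
\#\{\text{sources with at least } p \text{ items}\}
  &= N\!\left(t - \frac{\ln(p/c_2)}{\ln a_2}\right)
   = c_1 a_1^{t} \left(\frac{p}{c_2}\right)^{-\ln a_1/\ln a_2} \\
f(p) &= -\frac{d}{dp}\,\#\{\text{sources with at least } p \text{ items}\}
   = \frac{C}{p^{\alpha}},
  \qquad \alpha = 1 + \frac{\ln a_1}{\ln a_2}
\end{align*}
```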

25
Egghe (2005) (Book and JASIST)
  • (i) The number of line pieces grows
    exponentially in time t, here proportional with
    4^t
  • (ii),(iii) 1/length of each line piece grows
    exponentially in time t and with the same
    growth rate 3. Hence we have growth proportional
    with 3^t.

26
  • Rephrased in terms of informetrics:
  • a (Lotkaian) IPP is a self-similar fractal and
    its fractal dimension D is given by the logarithm
    of the growth rate of the sources, divided by the
    logarithm of the growth rate of the items
  • (which can be > or < 1). Hence, the exponent α in
    Lotka's law satisfies the important relation α = 1 + D.
  • This result was earlier seen by Mandelbrot but
    only in the context of (artificial) random texts
    (hence in linguistics).
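
A minimal numerical sketch of this relation, assuming it reads α = 1 + D with D the ratio of the two logarithms (the formula itself is missing from the transcript). The function names and the second pair of growth rates are hypothetical.

```python
import math

def fractal_dimension(source_growth, item_growth):
    """ln(growth rate of sources) / ln(growth rate of items)."""
    return math.log(source_growth) / math.log(item_growth)

def lotka_exponent(source_growth, item_growth):
    """Lotka exponent under the assumed relation alpha = 1 + D."""
    return 1.0 + fractal_dimension(source_growth, item_growth)

# Koch curve reading: line pieces grow like 4**t, 1/length like 3**t, so D > 1.
koch_alpha = lotka_exponent(4, 3)

# Made-up rates where sources grow more slowly than items, so D < 1.
slow_alpha = lotka_exponent(2, 4)

print(koch_alpha, slow_alpha)
```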

27
Further applications of Lotkaian Informetrics
  • Concentration theory (inequality theory): Lorenz
    curves (cf. econometrics).
  • Egghe (2005) (Book, Chapter IV).
  • Fractional modelling of authorship (case of
    multi-authored articles): determine the number of
    authors with a given fractional number of articles
  • (fractional counting: an author in an
    m-authored paper receives a score 1/m).
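
Fractional counting can be illustrated with a short sketch. The paper list and author labels are made up, and `fractional_scores` is a hypothetical helper; exact fractions are used so that the scores add up to the number of papers without rounding error.

```python
from collections import defaultdict
from fractions import Fraction

def fractional_scores(papers):
    """Give each author 1/m for every m-authored paper they appear on."""
    scores = defaultdict(Fraction)  # Fraction() is 0
    for authors in papers:
        share = Fraction(1, len(authors))
        for author in authors:
            scores[author] += share
    return dict(scores)

# Made-up example: each inner list is the author list of one paper.
papers = [["A", "B"], ["A"], ["A", "B", "C"], ["C", "D", "E", "F"]]
scores = fractional_scores(papers)

for author in sorted(scores):
    print(author, scores[author])
```

Because every paper distributes a total score of 1, the scores of all authors sum to the number of papers.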

28
Theoretical and experimental fractional frequency
distributions (case of i = 4).
29
  • Dynamics of Lotkaian IPPs, described via
    transformations on the sources and on the items;
    this includes the description of the dynamics of
    networks.
  • Relations with 3-dimensional informetrics. See the
    new journal:
  • L. Egghe. General evolutionary theory of IPPs
    and applications to the evolution of networks.
    Journal of Informetrics 1(2), 115-122, 2007.

30
  • Item transformation
  • Source transformation
  • New rank-frequency function

31
  • Theorem: New size-frequency function
  • where

32
  • Case is example of linear 3
    dimensional informetrics
  • Sources1 → Items1 = Sources2 → Items2
  • Examples:
  • Webpages → hyperlinks → use of hyperlinks
  • Library subject categories → books → borrowings
  • See further.
  • Back to the general case.

33
  • Power law transformations in Lotkaian IPPs

34
  • Theorem
  • is only dependent on b/c due to the
  • scale-free nature of Lotkaian systems.

35
  • Corollary
  • With this, one can study the evolution of an IPP,
    e.g. a part of the WWW. V. Cothey (2007) confirms
    the theory except in one case where non-Lotkaian
    evolution is found, probably due to automatic
    creation of web pages (deviation from a social
    network).

36
  • Further application
  • IPPs without low productive sources
  • (Egghe and Rousseau (2006))
  • Take sources remain but they grow
    in number of items
  • Now

37
  • and (since )
  • Evolution: decreasing Lotka exponent and no low
    productive sources

38
Examples
  • Country sizes (data from www.gazetteer.de, July
    10, 2005): 237 countries, 1.69 (best fit)
  • Municipalities in Malta (1997 data): 67
    municipalities, 1.12 (best fit)
  • Database sizes on the topic "fuzzy set theory"
    (20 largest databases on this topic; Hood and
    Wilson (2003)): 1.09 (best fit)
  • Unique documents in databases (the 20 databases
    above): 1.33 (best fit).

39
  • Application of Lotka's law to the modelling of
    the cumulative first-citation distribution
  • i.e.
  • the distribution over time at which an article
    receives its first citation.

40
  • The time t1 at which an article receives its
    first citation is an important indicator of the
    visibility of research.
  • At t1 the article switches its status from
    unused to used.
  • t1 is a measure of immediacy but, of course,
    different from the immediacy index (Thomson
    Scientific).

41
  • The distribution of t1 over a group of articles
    is the topic of the present study. We will study
    the cumulative first-citation distribution:
  • the cumulative fraction of all papers
    that have, at t1, at least 1 citation.

42
  • Rousseau (1994) uses two different differential
    equations to model two types of graphs: a concave
    one and an S-shaped one. These equations are not
    explained and are not linked to any informetric
    distribution.

43
  • In Egghe (2000), I use only 2 elementary
    informetric tools:
  • the density function of citations to an
    article, t time after its publication
    (exponential),
  • the density function of the number of
    papers with A citations in total (Lotka);
    only ever cited papers are used here.

44
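  • Normalizing to distributions:
  • the citation density becomes a distribution over
    time for an article with A citations in total,
  • the Lotka density becomes a distribution over the
    ever cited articles, but we will use it multiplied
    by the fraction of ever cited articles, in
    order to include also the never cited articles.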

45
  • Theorem: with α the exponent in Lotka's law, the
    cumulative first-citation distribution is
  • concave if 1 < α ≤ 2
  • S-shaped if α > 2
  • hence explaining both shapes in one model.
  • Note the turning point α = 2.
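
A numerical sketch of the two shapes. The closed form used here, Φ(t1) = γ·(1 − a^t1)^(α − 1) with aging rate 0 < a < 1, Lotka exponent α > 1 and γ the fraction of ever cited articles, is a reconstruction from the two tools named on the earlier slides and should be treated as an assumption; the function names are hypothetical.

```python
import math

def first_citation_cdf(t1, gamma, a, alpha):
    """Assumed form: gamma * (1 - a**t1) ** (alpha - 1)."""
    return gamma * (1.0 - a ** t1) ** (alpha - 1.0)

def inflection_time(a, alpha):
    """Time where the curve turns from convex to concave, or None.

    For the assumed form, the second derivative changes sign where
    a**t1 == 1 / (alpha - 1), which has a solution t1 > 0 only if alpha > 2.
    """
    if alpha <= 2.0:
        return None
    return math.log(1.0 / (alpha - 1.0)) / math.log(a)

def second_difference(t1, gamma, a, alpha, h=0.01):
    """Discrete curvature: positive means convex, negative means concave."""
    return (first_citation_cdf(t1 + h, gamma, a, alpha)
            - 2.0 * first_citation_cdf(t1, gamma, a, alpha)
            + first_citation_cdf(t1 - h, gamma, a, alpha))

print(inflection_time(0.5, 1.5))  # no inflection: concave throughout
print(inflection_time(0.5, 3.0))  # S-shaped: an inflection point exists
```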

46
  • Proof: A first citation is received if
  • ()
  • ⇒ Cumulative fraction of all articles that are
    already cited at time t1
  • ()
  • ⇒ () into () yields
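
The steps of the proof can be sketched as follows. The formulas did not survive in the transcript, so the symbols (aging rate a with 0 < a < 1, Lotka exponent α > 1, fraction γ of ever cited articles) and the exact normalizations are assumptions made for this reconstruction:

```latex
\begin{align*}
\int_{0}^{t_1} (-\ln a)\, a^{t}\, dt &= 1 - a^{t_1}
  && \text{(fraction of its citations an article has received by } t_1\text{)} \\
A\,(1 - a^{t_1}) \ge 1 &\iff A \ge \frac{1}{1 - a^{t_1}}
  && \text{(first citation received, } A \text{ citations in total)} \\
\int_{A_0}^{\infty} \frac{\alpha - 1}{A^{\alpha}}\, dA &= A_0^{\,1-\alpha}
  && \text{(fraction of ever cited articles with at least } A_0 \text{ citations)} \\
\Phi(t_1) &= \gamma \left(\frac{1}{1 - a^{t_1}}\right)^{1-\alpha}
           = \gamma\,(1 - a^{t_1})^{\alpha - 1}
\end{align*}
```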

47
Motylev (1981)
48
  • fit

49
Rousseau (1994)
JACS to JACS data of Rousseau. Time unit: 2
weeks, 4-year period
  • fit