1
Evolving Networks
  • Jean-Loup Guillaume
  • NPA team / LIP6 / UPMC - France
  • Joint work with E. Fleury, C. Robardet, A. Scherrer,
    M. Latapy and S. Le Blond.

2
Outline
  • Typical evolving networks/applications
  • peer to peer networks
  • web graphs
  • internet networks
  • phone calls networks
  • Measurement issues
  • Summary of classical approaches
  • A case study: proximity sensor network
  • degrees and their evolution
  • evolution of components and social groups.

3
Peer-to-peer networks
  • Systems used to share files or computer
    resources,
  • or to connect people (Skype?)...
  • A typical P2P system/protocol allows:
  • addition of new clients/files in a simple way
  • effective search of contents in the network
  • resilience: clients can leave unexpectedly
  • users should not be overloaded.
  • Two main approaches:
  • centralised: a central server records all files
    and answers all queries
  • distributed: peers are organised in an overlay
    network and are cooperatively in charge of all
    operations.

4
Peer-to-peer networks (cont.)
  • Typical figures for an eDonkey server over 48 h:
  • around 50 000 clients connected (small server)
  • 1.5 million connections/disconnections
  • 210 million source searches (who shares file
    X?).
  • P2P networks viewed as graphs:
  • links between peers in an overlay network
  • efficient search of files, resilience to
    departure...
  • in general not related to geography or peers'
    interests.
  • file exchanges
  • communities of users with similar interests
  • links between files
  • communities of files, recommendation systems...

5
Peer-to-peer networks (cont.)
  • Degree-degree correlations
  • File exchanges (directed network)
  • Out-degree: how many files you are looking for.
  • In-degree: how many files you are sharing.
  • Global time aggregation

6
Peer-to-peer networks (cont.)
  • Degree evolution
  • Most popular clients:
  • number of queries received: linear in time
    (queries are repeated automatically by software)
  • number of distinct files queried converges very
    fast: few popular files.

7
Web graphs
  • Hyperlinks between web pages.
  • Size and evolution
  • billions of web pages
  • pages/links can be created/modified/removed
  • millions of modifications every day
  • a lot of dynamic content.
  • Search engines (Google)
  • use text-mining and link analysis to rank pages
  • a page is good if linked to by many good pages.
  • need to know the structure of the network
  • fast evolving pages might be more relevant?
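
The idea that "a page is good if linked to by many good pages" can be illustrated with a minimal power-iteration sketch. The function name, the damping value 0.85 and the toy graph are assumptions made for this example; this is not how any search engine actually implements its ranking.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration ranking on a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then shares from its in-links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Page without out-links: spread its rank over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages point to "c", and "c" points back to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph "c" gets the highest score because three pages link to it, and "a" comes second only because the well-ranked "c" links to it.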

8
Web graphs (cont.)
  • Ranking pages (PageRank)
  • need a good/up-to-date knowledge of the network
  • avoid "file not found" errors.
  • visit web pages often enough, but not too often
  • need to know if some pages/portions of the web
    are evolving faster or slower
  • web page classification?
  • Web spam detection: text-mining
  • detect specific subnetworks (cliques...) of fake
    pages used to artificially increase the rank
  • can use the dynamics to detect substructures
    that appear at an unnatural rate
  • a 10-clique can exist, but if it appears within
    one day it is suspicious!

9
Internet network
  • Set of machines/routers with physical links
  • Size and evolution
  • millions of machines (billions as soon as GSM
    phones, cars, fridges, etc. get equipped with
    wireless)
  • evolution is slow, but routing changes very
    often (failures, congestion, load balancing).
  • Security issues
  • given a set of routing tables/networks, can you
    detect outliers?
  • on-line detection of attacks (DDoS for instance)
    through the observation of routing evolution?

10
Phone calls networks
  • Who calls whom
  • typically millions of customers for a company A
  • a few calls/sms/mms per day for each user (from
    company A to A, A to B, B to A; calls from B to C
    are unknown)
  • use of non-network services (internet, logo or
    music downloading, etc.) with mobiles
  • information is kept for billing purposes
  • customer information can also be used for
    marketing purposes: segmentation...
  • Main objectives
  • keep customers and get new ones
  • create new services and sell them to
    clients.

11
Typical evolution
  • Aggregation per day
  • number of calls and sms per day
  • sociological effects: weekends plus specific days
    (Christmas, New Year, Valentine's Day, etc.)
  • calls and sms are complementary.

12
Phone calls networks (cont.)
  • Churn prediction (3 types of churn)
  • more than 20% churn every year
  • Sociological network approaches
  • strong correlations for operator, geographical
    distance, age and even handset brand
  • evolution of call patterns
  • use of data-mining/feature selection approaches.
  • Acquire new customers
  • every time a client x from company A calls or is
    called by a client y from company B, company A
    gets some knowledge about y
  • A can offer specific prices to y if A thinks that
    y is willing to churn.

13
Phone calls networks (cont.)
  • Diffusion of innovation/viral marketing
  • Use word of mouth
  • give specific offer to one person and encourage
    him to talk about it to his friends (which might
    get the same offer)
  • some services are more likely to be diffused
    (person-to-person services, e.g. sms, mail,
    etc.)
  • Can be observed on phone calls networks
  • Many sms/mms sent are only responses to previous
    sms/mms.
  • Diffusion effect clearly visible.
  • Live experiments in many countries based on
    graph/data mining approaches
  • good preliminary results (response rate much
    greater than expected).

14
Outline
  • Typical evolving networks/applications
  • peer to peer networks
  • web graphs
  • internet networks
  • phone calls networks
  • Measurement issues
  • Summary of classical approaches
  • A case study: proximity sensor network
  • degrees and their evolution
  • evolution of components and social groups.

15
Measurement issues
  • Some networks are simply log files,
  • many are obtained through measurements.
  • Measuring evolution is hard in general
  • Example of Web graphs:
  • one machine with a good Internet connection can
    capture a few million web pages every day
  • a number of pages of the same order is
    modified or created during that day.
  • Two solutions:
  • study long-term evolution or small subnetworks
  • (ask or be Google).

16
Measurement issues (cont.)
  • Data quality is always an issue.
  • Reliability
  • Who made the measurement?
  • What proportion has been measured, how long did
    it take and what is the evolution rate?
  • Are there constraints which might bias the
    result?
  • Technological, biological,
  • e.g. if a web page is not linked to, it cannot be
    found following links.
  • Can it be reproduced?

17
Measurement issues (cont.)
  • Approximation of the quality of Internet maps
  • number of sources/destinations
  • for many parameters.
  • In general, studied networks are incomplete and
    biased by the measurement process
  • work on bias removal and,
  • work on the biased data.

18
Outline
  • Typical evolving networks/applications
  • peer to peer networks
  • web graphs
  • internet networks
  • phone calls networks
  • Measurement issues
  • Summary of classical approaches
  • A case study: proximity sensor network
  • degrees and their evolution
  • evolution of components and social groups.

19
Aggregation
  • Consider the aggregated graph rather than its
    evolution
  • often used when the measurement process is too
    long or too hard to get many snapshots.
  • many parameters are available to describe a
    static network
  • centrality of nodes (degree, betweenness, etc.)
  • communities or typical subnetworks
  • correlations between properties
  • etc.
  • Aggregation can be done on a smaller scale
  • minute, day, month
  • depends on the phenomenon under study.
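
A minimal sketch of aggregation at a chosen scale, assuming the data is a list of timestamped contacts `(t, u, v)` with `t` in seconds. The function name and the sample contacts are hypothetical.

```python
from collections import defaultdict


def aggregate(contacts, window):
    """Group timestamped links (t, u, v) into one undirected edge set per window."""
    snapshots = defaultdict(set)
    for t, u, v in contacts:
        # frozenset makes (u, v) and (v, u) the same undirected link.
        snapshots[t // window].add(frozenset((u, v)))
    return dict(snapshots)


contacts = [(0, "a", "b"), (30, "b", "a"), (70, "b", "c"), (130, "a", "c")]
per_minute = aggregate(contacts, 60)                  # one snapshot per minute
fully_aggregated = set().union(*per_minute.values())  # whole-trace graph
```

Changing `window` moves between the fully aggregated graph (one very large window) and fine-grained snapshots, which is the trade-off described on this slide.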

20
Evolution of static properties
  • Consider a static property and plot its evolution
    through time
  • number of nodes, links, etc.
  • tools from signal processing can be used
  • long range dependence (process with memory)
  • detection of anomalies
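
As an illustration, the simplest static properties (number of nodes and links) can be turned into time series as sketched below. The snapshot format, a dict mapping each time step to a set of links, is an assumption of this example.

```python
def size_series(snapshots):
    """Return [(t, n_nodes, n_links)] for a dict {t: set of (u, v) links}."""
    series = []
    for t in sorted(snapshots):
        links = snapshots[t]
        # Only nodes with at least one link are counted here.
        nodes = {x for link in links for x in link}
        series.append((t, len(nodes), len(links)))
    return series


snapshots = {
    0: {("a", "b")},
    1: {("a", "b"), ("b", "c")},
    2: set(),
}
series = size_series(snapshots)
```

The resulting series is the kind of signal to which signal-processing tools or anomaly detection could then be applied.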

21
Study specific users/phenomena
  • Outliers, central individuals, bridges...
  • churn prediction
  • viral marketing or diffusion over the network

22
Define new properties
  • Properties which capture the evolution (but
    cannot be defined on a static network)
  • time-connected components: sets of users who can
    reach each other using the evolution of the
    network.
  • ?
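
A small sketch of reachability over time, which has no equivalent on a static graph. It assumes a list of link sets (one per time step) and instantaneous relaying inside a time step; both are assumptions of this example rather than definitions taken from the talk.

```python
def reachable(snapshots, source):
    """Nodes that information starting at `source` can reach over time.

    `snapshots` is a list of link sets, one per time step; within a step,
    information may cross several links (assumed instantaneous relaying).
    """
    informed = {source}
    for links in snapshots:
        changed = True
        while changed:
            changed = False
            for u, v in links:
                # Exactly one endpoint informed: the other one becomes informed.
                if (u in informed) != (v in informed):
                    informed.update((u, v))
                    changed = True
    return informed


snapshots = [{("a", "b")}, {("b", "c")}]
from_a = reachable(snapshots, "a")
from_c = reachable(snapshots, "c")
```

Here "a" reaches "c" through "b", but "c" never reaches "a" because the a-b link has already disappeared when "b" gets the information. Reachability over time is therefore not symmetric.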

23
Drawing as a tool
  • A good drawing algorithm might:
  • give a good view of the overall structure
  • highlight specific parts of the network
    (anomalies).
  • Specific parts can be studied later on.
  • Some simple approaches
  • draw the fully-aggregated network and for each
    time (or aggregation) step only draw the
    corresponding subnetwork
  • display through time a matrix representing the
    network (adjacency/weight/etc.)
  • display a (time x space) matrix.

26
image courtesy of C. de Kerchove
27
Outline
  • Typical evolving networks/applications
  • peer to peer networks
  • web graphs
  • internet networks
  • phone calls networks
  • Measurement issues
  • Summary of classical approaches
  • A case study: proximity sensor network
  • degrees and their evolution
  • evolution of components and social groups.

28
Context
  • Many devices with wireless capabilities
  • computers, mobile phones, PDAs
  • ambient wireless network
  • pairwise contacts, intermittent connectivity.
  • Nodes spread around, mobile
  • data is transmitted in a multihop fashion

29
Current technology
30
Goals
  • Goal of a network: transmit information
  • proximity is important (radio medium)
  • Performance / reliability / connectivity
  • rely on the underlying network and the mobility
  • need to better understand the evolution and
    prepare for scalability issues.

31
Mobility
  • Only consider geographical proximity
  • need to know movements
  • proximity can be deduced from movement and
    initial positions.
  • How are people moving?
  • randomly, group movement, something else?
  • How to measure it?
  • geolocalisation (GPS)
  • expensive: every person/machine must be equipped.
  • using GSM: approximate position.
  • Exact position is not important, proximity is:
  • use proximity sensors.

32
Constraints
  • Geographical proximity imposes constraints
  • many contacts => some contacts must be near
  • anything else?
  • This is not going to be considered here.

33
Available (not so massive) data
A. Chaintreau et al., WDTN 2005
  • Infocom 2005 conference
  • 54 sensors (11 out of order, 2 lost) => 41
    (small)
  • 3 days (short)
  • very specific situation.
  • Bluetooth sensors
  • Seek contacts (5 s)
  • Wait for answers (108-132 s).
  • Data
  • For each time step (0-250000), a set of links
  • All links have been symmetrised.
  • Few other similar datasets are available.

35
Evolution of the network
  • Sociological effects
  • Day/night/breaks
  • Lots of small variations. 50% of isolated nodes
    (day)
  • Maximum of 34 nodes connected (in 1 CC)

36
One day typical evolution
37
Evolution of the network (cont.)
  • Positive correlation between nodes and links
  • For a given number of nodes, there exists a large
    number of possible configurations, from sparse to
    dense

38
Random process - contacts
Contact duration: α = 1.4
Inter-contact duration: α = 0.41
  • Straight line (log-log scale)
  • wide range: [300, 20k], [100, 30k]
  • power law distribution
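
Contact and inter-contact durations can be read off the on/off sequence of a single link, as in the sketch below. Dropping the first and last runs, which are cut by the observation window, is a design choice of this example and not something stated in the talk.

```python
from itertools import groupby


def durations(presence):
    """Split a boolean link sequence into contact and inter-contact run lengths."""
    runs = [(state, sum(1 for _ in group)) for state, group in groupby(presence)]
    # Drop the first and last runs: their true length is cut by the window.
    inner = runs[1:-1]
    contacts = [n for state, n in inner if state]
    gaps = [n for state, n in inner if not state]
    return contacts, gaps


# Hypothetical presence sequence of one link (1 = in contact).
presence = [0, 1, 1, 0, 0, 0, 1, 0]
contacts, gaps = durations(presence)
```

Pooling these lengths over all links gives the two empirical distributions whose log-log plots are discussed on this slide.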

39
Random process - degrees
  • Analyze the differential sequence
  • $D_k = S_{k+1} - S_k$, where $S_k$ is the original
    data sequence
  • Covariance / wavelet based tool
  • No long-range dependence / similar to a random
    process.
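
A minimal sketch of the differential sequence, followed by a plain sample autocovariance. The autocovariance here is only a stand-in for illustration: it is not the wavelet-based estimator used in the talk, and the degree values are made up.

```python
def differential(sequence):
    """D_k = S_{k+1} - S_k for a list S."""
    return [b - a for a, b in zip(sequence, sequence[1:])]


def autocovariance(values, lag):
    """Biased sample autocovariance of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    total = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return total / n


degrees = [2, 3, 3, 5, 4]  # hypothetical degree of one node over time
steps = differential(degrees)
variance = autocovariance(steps, 0)
```

For a sequence without memory, the autocovariance at non-zero lags should stay close to zero compared with the lag-0 value.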

40
Random process - degrees
  • Covariance / wavelet based tool (*) to obtain a
    spectral log-log representation of the covariance
    in the wavelet domain
  • j is the scale
  • $S_j$ is roughly the average of the wavelet
    coefficients at scale j
  • Power law => long-range dependence rather than
    high variability
  • Estimated exponent => the Hurst exponent is close
    to the special value 0.5
  • => no long-range dependence => Independent
    Identically Distributed (IID)

Figure: covariance of the differential sequence in
the wavelet domain
(*) P. Abry and D. Veitch, Wavelet analysis of
long-range dependent traffic, TIT, 1998

41
Connected components
  • At each time step: network = set of CCs
    (connected components)
  • groups of people who can communicate.
  • CC stability/structure
  • stable => possibility of long communications.
  • too stable => cannot communicate outside the CC.
  • structure: information on the number of hops, the
    number of radio conflicts you might face...

42
Connected components (cont.)
  • For each time step, compute every CC
  • two CCs with the same set of nodes but different
    sets of links are different CCs (routing has to
    be modified)
  • very different CCs are observed.
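
A sketch of the per-time-step computation, where a component is identified by both its node set and its link set. The representation (a pair of frozensets) is an assumption of this example.

```python
def components(links):
    """Connected components of one snapshot, each as (nodes, links) frozensets."""
    adjacency = {}
    for u, v in links:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first search from an unvisited node.
        nodes, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adjacency[x] - nodes:
                nodes.add(y)
                stack.append(y)
        seen |= nodes
        comp_links = frozenset(frozenset((u, v)) for u, v in links if u in nodes)
        result.append((frozenset(nodes), comp_links))
    return result


path = components({("a", "b"), ("b", "c")})
triangle = components({("a", "b"), ("b", "c"), ("a", "c")})
```

`path[0]` and `triangle[0]` cover the same three nodes but compare as different components, because their link sets differ.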

43
CCs - Density
  • Set of connected nodes and links
  • Small components: strong variation of density.
  • Big components: low density
  • max(nb_links) = 4.5 × nb_nodes
  • Day: one "giant" component and a few very small
    ones.
  • Night: many small components (mainly isolated
    links)

44
CCs - Stability
  • Strong heterogeneity, most components:
  • appear only a few times
  • have a very short cumulated lifetime.
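
The two stability measures can be sketched as follows, assuming each component has a hashable identifier and that an "appearance" means showing up after being absent at the previous step. That reading of "appear" is an assumption of this example.

```python
from collections import Counter


def stability(history):
    """history[t] is the set of component identifiers present at step t.

    Returns (appearances, lifetime): how many separate times each component
    shows up, and the total number of steps during which it exists.
    """
    appearances, lifetime = Counter(), Counter()
    previous = set()
    for present in history:
        for comp in present:
            lifetime[comp] += 1
            if comp not in previous:
                appearances[comp] += 1
        previous = present
    return appearances, lifetime


# Hypothetical history with two components, "X" and "Y".
history = [{"X"}, {"X", "Y"}, {"Y"}, {"X"}]
appearances, lifetime = stability(history)
```

In this toy history "X" appears twice with a cumulated lifetime of 3 steps, and "Y" appears once with a lifetime of 2 steps.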

45
CCs - Stability (cont.)
  • Large components (given set of nodes/links):
  • have a very short cumulated lifetime (12 nodes /
    100 s max)
  • rarely appear more than once.
  • Global dynamic effects impact large CCs
  • However:
  • large components mainly encounter small
    modifications.

46
CCs - Isomorphisms
  • Consider nodes and links or nodes only?

47
CCs - Isomorphisms (cont.)
  • Isomorphic components over-represented?
  • 1 1 1 3 (star): 17.9% (10.5%)
  • 1 1 2 2 (cycle-1): 36.1% (31.6%)
  • 1 2 2 3 (star+1): 27.2% (31.6%)
  • 2 2 2 2 (cycle): 2.7% (7.9%)
  • 2 2 3 3 (cycle+1): 12.4% (15.8%)
  • 3 3 3 3 (clique): 3.8% (2.6%)
  • In general
  • Very high and low density over-represented.
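
As a point of comparison, the sketch below enumerates every connected labelled graph on 4 nodes and groups them by degree sequence. The resulting proportions (10.5, 31.6, 31.6, 7.9, 15.8 and 2.6 percent) coincide with the values given in parentheses on this slide. Reading those values as a uniform random reference is an assumption drawn from the numbers; the slides do not say so.

```python
from collections import Counter
from itertools import combinations


def connected(nodes, edges):
    """Depth-first search connectivity test on an undirected edge list."""
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for y in adjacency[stack.pop()] - seen:
            seen.add(y)
            stack.append(y)
    return len(seen) == len(nodes)


nodes = [0, 1, 2, 3]
pairs = list(combinations(nodes, 2))  # the 6 possible links
counts = Counter()
for k in range(len(pairs) + 1):
    for edges in combinations(pairs, k):
        if connected(nodes, edges):
            degrees = Counter(x for e in edges for x in e)
            counts[tuple(sorted(degrees[n] for n in nodes))] += 1

total = sum(counts.values())
shares = {seq: round(100 * c / total, 1) for seq, c in counts.items()}
```

On 4 nodes the degree sequence is enough to tell the six connected shapes apart, which is why it can serve as the key here.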

48
Data mining techniques
  • Computation of set patterns using complete
    solvers
  • D-miner.
  • Evolution: (edges x aggregated time) boolean
    matrix
  • looking for maximal rectangles of true values
    (formal concepts)
  • maximal frequent subgraphs using a time
    threshold.
  • maximal significant subgraphs using an edge
    threshold.
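
To make "maximal rectangles of true values" concrete, here is a brute-force sketch on a tiny (edge x time) relation. It is not D-miner: it enumerates edge subsets exhaustively and only works for very small inputs. The edge names and thresholds are hypothetical.

```python
from itertools import combinations


def formal_concepts(presence):
    """Enumerate maximal all-true rectangles of an (edge x time) relation.

    `presence` maps each edge to the set of time steps where it exists.
    Brute force over edge subsets: exponential, for tiny examples only.
    """
    edges = sorted(presence)
    all_times = set().union(*presence.values())
    concepts = set()
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            times = set(all_times)
            for e in subset:
                times &= presence[e]
            if not times:
                continue
            # Closure: every edge present at all of these time steps.
            closed = frozenset(e for e in edges if times <= presence[e])
            concepts.add((closed, frozenset(times)))
    return concepts


presence = {"ab": {1, 2, 3}, "bc": {2, 3}, "cd": {3, 4}}
concepts = formal_concepts(presence)
# Keep rectangles with at least 2 edges and at least 2 time steps.
kept = [c for c in concepts if len(c[0]) >= 2 and len(c[1]) >= 2]
```

With both thresholds set to 2, the only rectangle kept is edges {ab, bc} over time steps {2, 3}.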

49
Identifying social groups
  • We obtain 23 316 frequent connected subgraphs
  • 10 time steps (10x240) and at least 5 edges
  • Most of the subgraphs only cover a few individuals
    with a low edge density

50
Identifying social groups (cont.)
  • Social group:
  • frequent and significant connected component
  • Only 281 with a density greater than 0.8
  • the same sets of vertices are covered many times,
    or differ by very few individuals or time steps.
  • Merging very similar social groups
  • A subgraph is defined by a set of edges E and
    characterized by a set of time steps T
  • Two subgraphs (E1,T1) and (E2,T2) are such that
    if E1 is included in E2 then T2 is included in T1
  • As subgraphs are dense, we consider the vertices
    they cover. We merge two subgraphs if V1 ⊆ V2 and
    T1\T2 contains only time steps that differ by at
    most one time step from a time step of T2.
  • 15 groups of vertices

51
Trajectories among social groups
  • Individual 19 enters group 13 (time step 1215),
  • goes to group 9,
  • before going to group 10.

52
Random evolving model
  • Very simple model matching only the power laws
  • On-off sequence for each link based on the real
    distribution
  • Connected components
  • Similar results for the number of CCs, lifetime,
    number of appearances
  • Almost tree-like components (linear in the number
    of nodes)
  • Limit size effect for large components.
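
A minimal sketch of such an on-off link model, assuming the observed contact and inter-contact durations are available as lists of positive integers. The sample durations, the random initial state and the function name are assumptions of this example.

```python
import random


def on_off_sequence(on_durations, off_durations, length, rng):
    """Boolean presence sequence for one link.

    Alternates 'on' and 'off' periods whose lengths are drawn with
    replacement from the observed contact / inter-contact durations
    (all durations must be positive integers).
    """
    sequence = []
    state = rng.random() < 0.5  # arbitrary choice of the initial state
    while len(sequence) < length:
        pool = on_durations if state else off_durations
        sequence.extend([state] * rng.choice(pool))
        state = not state
    return sequence[:length]


rng = random.Random(0)  # fixed seed so the run can be repeated
link = on_off_sequence([1, 2, 5], [3, 10, 40], 100, rng)
```

Running one such sequence per pair of nodes gives a synthetic evolving network on which connected components can be computed and compared with the measured ones.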

53
Evolving connectivity
  • Capture the evolution
  • Simulation of a dynamic broadcast.
  • For a given node u at time t:
  • T(u,t): flooding tree from u at time t
  • u receives the information at time t
  • u sends the information as soon as possible to
    all its neighbours.
  • Parameters
  • broadcast source
  • broadcast starting time
  • immediate transmission or not.
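
The broadcast simulation can be sketched as below, with the three parameters listed on this slide (source, starting time, immediate transmission or not). The data layout, a list of link sets indexed by time step, and the reading of "not immediate" as "relay from the next step on" are assumptions of this example.

```python
def flooding_tree(snapshots, source, start, immediate=True):
    """Return {node: (parent, arrival_step)} for a flood from `source`.

    `snapshots[t]` is the set of undirected links (u, v) present at step t.
    With immediate=True a node relays within the step it was informed;
    otherwise it can only relay from the next step on.
    """
    tree = {source: (None, start)}
    for t in range(start, len(snapshots)):
        senders = set(tree)  # nodes informed before this step
        progress = True
        while progress:
            progress = False
            for u, v in snapshots[t]:
                for a, b in ((u, v), (v, u)):
                    if a in senders and b not in tree:
                        tree[b] = (a, t)
                        progress = True
                        if immediate:
                            senders.add(b)
            if not immediate:
                break  # one hop per time step
    return tree


snapshots = [{("a", "b"), ("b", "c")}, {("c", "d")}]
fast = flooding_tree(snapshots, "a", 0, immediate=True)
slow = flooding_tree(snapshots, "a", 0, immediate=False)
```

With immediate relaying all four nodes are reached. With one hop per step the flood stops at "b", because the b-c link is gone by the time "b" is allowed to relay.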

54
Fast diffusion
  • Diffusion initiated at time t = 0
  • Large component.

55
Slow diffusion
  • Diffusion initiated during the night (t = 115k)
  • End of night: t = 138k
  • Maximal number of links at 139k.

56
Utility of nodes and links
  • Tree structure
  • Depth: number of transmitters
  • Width, ...
  • Utility of a node or link
  • Number of nodes in the sub-tree
  • Defined from a given source at a given time
  • Global utility.
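
Given a flooding tree stored as a `{node: parent}` map, the utility of every node can be sketched as the size of its sub-tree. Counting the node itself in its own sub-tree is an assumption of this example.

```python
from collections import defaultdict


def utilities(parent):
    """Sub-tree size of every node, given a {node: parent} map (root -> None)."""
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)
    root = next(n for n, p in parent.items() if p is None)
    size = {}
    # Iterative post-order traversal to avoid recursion limits on deep trees.
    stack = [(root, False)]
    while stack:
        node, done = stack.pop()
        if done:
            size[node] = 1 + sum(size[c] for c in children[node])
        else:
            stack.append((node, True))
            stack.extend((c, False) for c in children[node])
    return size


# Hypothetical flooding tree rooted at "a".
parent = {"a": None, "b": "a", "c": "b", "d": "b", "e": "a"}
size = utilities(parent)
```

The utility of a link (parent, child) can then be taken as the size of the child's sub-tree. Averaging over sources and starting times would be one possible way to get a global utility.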

57
Conclusions
  • Characterization of the evolution is crucial
  • Components of individuals play a key role
  • Heterogeneous nature of CCs.
  • How to describe the evolution of CCs?
  • Can we work with other groups/communities?
  • Real test bed with 200 nodes / 1 month period
  • Data available soon at http://www.worldsens.net

58
Future work
  • Evolving networks
  • Analyze the dynamics
  • Introduce new parameters to describe it.
  • Dynamical models
  • Topology and/or mobility
  • Intra-inter duration not sufficient.
  • Community structure.
  • To be used for protocols simulation.

59
Future work (cont.)
  • Local/Distributed evolving communities
  • Mobile networks
  • Detection of intra/inter community links
  • Opportunistic routing outside communities
  • Other strategies inside.
  • Other parameters and related strategies.
  • P2P networks
  • Detect creation and evolution of communities
  • Follow users inside communities.

60
Future work (cont.)
  • Network security
  • Evolution of topology/routing/use
  • Describe typical evolution.
  • Measure/detect specific events
  • attacks, failures, new use
  • small perturbations => strong impact.
  • Robustness to attacks
  • heterogeneity => sensitivity to targeted attacks
  • dynamical context: relation with utility?