SI 614 Network subgraphs (motifs) - PowerPoint PPT Presentation

1
SI 614: Network subgraphs (motifs), Biological networks
Lecture 11. Instructor: Lada Adamic
2
Outline
  • motifs
  • motif detection (software & Pajek)
  • review of network characteristics
      • used to compare model with real-world network
      • one more: degree assortativity
  • biological networks
      • types
      • characteristics
      • hierarchical modularity model

3
Schematic view of network motif detection
4
Motifs can overlap in the network
[Figure panels: motif to be found; graph; motif matches in the target graph]
Source: http://mavisto.ipk-gatersleben.de/frequency_concepts.html
5
Examples of network motifs (3 nodes)
  • Feed forward loop
      • Found in neural networks
      • Seems to be used to neutralize biological noise
  • Single-Input Module
      • e.g. gene control networks

[Figure: node labels X, Y, Z for the feed forward loop; X and a, b, c, d for the single-input module]
6
All 3 node motifs
7
Examples of network motifs (4 nodes)
  • Parallel paths
      • Found in neural networks
      • Food webs

[Figure: parallel paths motif with node labels W, X, Y, Z]
8
4 node subgraphs (computational expense increases
with the size of the graph!)
9
Network motif detection
  • Some motifs will occur more often in real world
    networks than in random networks
  • Technique
      • construct many random graphs with the same number
        of nodes and edges (same node degree
        distribution?)
      • count the number of motifs in those graphs
      • calculate the Z score: how far the motif count in
        the real world network lies from its mean in the
        random graphs, which indicates how likely that
        count is to have occurred by chance
  • Software available
      • http://www.weizmann.ac.il/mcb/UriAlon/
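
The technique above can be sketched in a few lines of Python. This is only an illustration, not the Weizmann software: the function names are hypothetical, the motif is fixed to the feed forward loop, and the random graphs keep only the number of nodes and edges (they do not preserve the degree distribution).

```python
import random
import statistics


def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    successors = {}
    for u, v in edge_set:
        successors.setdefault(u, set()).add(v)
    count = 0
    for x, y in edge_set:
        if x == y:
            continue
        for z in successors.get(y, ()):
            if z != x and z != y and (x, z) in edge_set:
                count += 1
    return count


def random_digraph(n, m, rng):
    """Random directed graph with n nodes and m distinct edges, no self-loops.

    Assumption: only node and edge counts are matched, not node degrees.
    """
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((u, v))
    return edges


def motif_z_score(edges, n, trials=200, seed=0):
    """Compare the real motif count with counts in random graphs."""
    rng = random.Random(seed)
    m = len(set(edges))
    observed = count_feed_forward_loops(edges)
    counts = [count_feed_forward_loops(random_digraph(n, m, rng))
              for _ in range(trials)]
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    return observed, mean, sd, z


# Toy network made of three separate feed forward loops.
example = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (6, 7), (7, 8), (6, 8)]
observed, mean, sd, z = motif_z_score(example, n=9)
print("observed:", observed, "random mean:", mean, "Z:", z)
```

A degree-preserving null model would replace `random_digraph` with a rewired copy of the real network.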

10
What the Z score means
$z_x = \frac{x - \mu_x}{\sigma_x}$
  • $x$: number of times the motif appeared in the real network
  • $\mu_x$: mean number of times the motif appeared in
    the random graphs
  • $\sigma_x$: standard deviation of the number of times
    the motif appeared in the random graphs
The probability of observing a Z score of 2 or more is
0.02275. In the context of motifs:
  • Z > 0: motif occurs more often than in random graphs
  • Z < 0: motif occurs less often than in random graphs
  • Z > 1.65: only a 5% chance of random occurrence
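
As a quick check of the numbers on this slide, the sketch below converts a Z score into an upper-tail probability. It assumes the motif counts in the random graphs are approximately normally distributed, which the slide implies but does not state; the function names are illustrative.

```python
import math


def z_score(x, mu, sigma):
    """Z score of an observed count x given the random-graph mean and std."""
    return (x - mu) / sigma


def upper_tail_probability(z):
    """P(Z >= z) for a standard normal variable (normality is an assumption)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# A motif seen 12 times, against a random-graph mean of 6 and std of 3.
z = z_score(12, 6, 3)
print(z, upper_tail_probability(z))
print(upper_tail_probability(1.65))
```

The first probability comes out close to the 0.02275 quoted above, and the second close to 5%.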
11
Finding classes of graphs based on their motif
profiles
12
Finding motifs (cliques and subgraphs) in Pajek
  • Create a second network that is the subgraph you
    are looking for
  • e.g. an undirected triad
        *Vertices 3
        1 "v1"
        2 "v2"
        3 "v3"
        *Arcs
        *Edges
        2 3 1
        1 2 1
        1 3 1

13
finding motifs with Pajek
  • Use the two drop down menus in the networks
    list to specify two networks
  • Then run Nets > Fragment (1 in 2) > Find
  • under Nets > Fragment (1 in 2) > Options
      • can select induced subnetwork containing only
        overlapping fragments

14
finding motifs with Pajek (contd)
  • Now we have just the triads
  • Creates a hierarchy object with the membership of
    each triad listed
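
For readers without Pajek, the same search for undirected triads can be sketched in plain Python. This is not Pajek's fragment-matching algorithm, only a minimal equivalent for this one fragment; `find_triads` is a hypothetical helper name.

```python
from itertools import combinations


def find_triads(edges):
    """Return every set of three mutually connected nodes, each listed once."""
    adjacency = {}
    for u, v in edges:
        if u != v:
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)
    triads = set()
    for u, neighbours in adjacency.items():
        # Two neighbours of u that are also linked to each other close a triad.
        for v, w in combinations(sorted(neighbours), 2):
            if w in adjacency[v]:
                triads.add(tuple(sorted((u, v, w))))
    return sorted(triads)


network = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
for triad in find_triads(network):
    print(triad)
```

The returned list plays the role of the hierarchy object: one entry per triad, with its member vertices.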

15
Comparing network models with the real thing
  • check for structural similarity between the
    artificial network (the model) and the real world
    network
  • degree distribution
  • assortativity
      • do high degree nodes connect to other high degree
        nodes?
  • average shortest path
      • dependence on size of network
  • clustering coefficient
      • compare to a randomized version conserving node
        degree
      • dependence on node degree
      • dependence on size of network
  • motif profile

16
How can we randomize a network while preserving
the degree distribution?
  • Stub reconnection algorithm (M. E. Newman, et al,
    2001, also known in mathematical literature since
    1960s)
      • Break every edge into two edge stubs: A-B becomes
        A- and -B
      • Randomly reconnect stubs
  • Problems
      • Leads to multiple edges
      • Cannot be modified to preserve additional
        topological properties
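
A minimal sketch of stub reconnection for an undirected edge list is given below. The helper names are illustrative, and the defect counter is only there to make the "multiple edges" problem visible (self-loops can appear as well).

```python
import random
from collections import Counter


def stub_reconnect(edges, seed=None):
    """Cut every edge into two stubs, shuffle the stubs, and pair them up."""
    rng = random.Random(seed)
    stubs = [node for edge in edges for node in edge]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))


def degrees(edges):
    """Degree of each node; a self-loop contributes 2 to its node."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def count_defects(edges):
    """Return (number of self-loops, number of repeated edges)."""
    self_loops = sum(1 for u, v in edges if u == v)
    distinct = {tuple(sorted(edge)) for edge in edges}
    return self_loops, len(edges) - len(distinct)


original = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
shuffled = stub_reconnect(original, seed=42)
print(shuffled, count_defects(shuffled))
```

Every node keeps its degree because each stub is reused exactly once; nothing in the procedure prevents two stubs of the same node, or the same pair of nodes, from being joined.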

17
Local rewiring algorithm
  • Randomly select and rewire two edges (Maslov,
    Sneppen, 2002, also known in mathematical
    literature since 1960s)
  • Repeat many times
  • Preserves both the number of upstream and
    downstream neighbors of each node
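
The rewiring step can be sketched as follows for a directed edge list. The rule used here (swap the targets of two randomly chosen edges, and skip any swap that would create a self-loop or a duplicate edge) is one common reading of the algorithm and is an assumption, as are the function name and the attempt limit.

```python
import random
from collections import Counter


def local_rewire(edges, n_swaps, seed=None):
    """Swap A->B, C->D into A->D, C->B, keeping in- and out-degrees fixed.

    Assumes the input contains no duplicate edges.
    """
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # swap would create a self-loop
        if (a, d) in present or (c, b) in present:
            continue  # swap would create a multiple edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 3), (2, 4)]
rewired = local_rewire(original, n_swaps=20, seed=1)
print(rewired)
```

Because each node keeps its outgoing and incoming edge counts, the number of downstream and upstream neighbors is unchanged, and rejected swaps avoid the multiple-edge problem of stub reconnection.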

18
Conserving additional low-level topological
properties
  • In addition to $k_i$ one may also conserve
      • The exact numbers of loops or other motifs
      • The size and number of components (e.g. Internet:
        all nodes have to be connected to each other)
  • Metropolis algorithm: two edges are rewired based
    on $E = (N_{actual} - N_{desired})^2 / N_{desired}$
      • If $\Delta E \le 0$, the rewiring step is always accepted
      • If $\Delta E > 0$, the rewiring step is accepted with
        probability $p = \exp(-\Delta E / T)$
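
The acceptance rule can be written down directly. The sketch below covers only the energy and the accept/reject decision; counting the motif and performing the rewiring are left out, and the function names are illustrative.

```python
import math
import random


def energy(n_actual, n_desired):
    """E = (N_actual - N_desired)^2 / N_desired."""
    return (n_actual - n_desired) ** 2 / n_desired


def accept_rewiring(e_old, e_new, temperature, rng=random):
    """Metropolis rule: always accept if E does not increase,
    otherwise accept with probability exp(-dE / T). Requires T > 0."""
    delta = e_new - e_old
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


# A rewiring that moves the loop count from 14 to 12 when 10 are desired.
e_before = energy(14, 10)
e_after = energy(12, 10)
print(e_before, e_after, accept_rewiring(e_before, e_after, temperature=0.5))
```

Lowering `temperature` makes uphill moves rarer, so the network is pushed harder toward the desired count.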

19
Assortativity
  • Social networks are assortative
  • the gregarious people associate with other
    gregarious people
  • the loners associate with other loners
  • The Internet is disassortative

Assortative: hubs connect to hubs
Random
Disassortative: hubs are in the periphery
20
Correlation profile of a network
  • Detects preferences in linking of nodes to each
    other based on their connectivity
  • Measure $N(k_0,k_1)$, the number of edges between
    nodes with connectivities $k_0$ and $k_1$
  • Compare it to $N_r(k_0,k_1)$, the same property in a
    properly randomized network
  • Very noise-tolerant with respect to both false
    positives and negatives
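
Counting $N(k_0,k_1)$ is straightforward once node degrees are known. The sketch below does this for an undirected edge list and takes a simple ratio against a second profile; using a single randomized network rather than an average over many is a simplification, and both function names are hypothetical.

```python
from collections import Counter


def correlation_profile(edges):
    """N(k0, k1): number of edges joining nodes of degree k0 and k1 (k0 <= k1)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    profile = Counter()
    for u, v in edges:
        k0, k1 = sorted((degree[u], degree[v]))
        profile[(k0, k1)] += 1
    return profile


def profile_ratio(real, randomized):
    """N(k0, k1) / N_r(k0, k1) wherever the randomized count is non-zero."""
    return {pair: real[pair] / randomized[pair]
            for pair in real if randomized.get(pair, 0) > 0}


path = [(0, 1), (1, 2), (2, 3)]
print(correlation_profile(path))
```

In practice the randomized profile would come from a degree-preserving rewiring of the same network, so that ratios above or below 1 reflect linking preferences rather than the degree sequence itself.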

21
Correlation profiles give complex networks unique
identities
2D picture
Protein interactions
Internet
slide by Sergei Maslov
22
Correlation profiles give complex networks unique
identities
Sergei Maslov 2D histogram
Protein interactions
Internet
23
Correlation profiles (contd)
  • Pastor-Satorras and Vespignani: 2D plot

[Plot axes: average degree of the node's neighbors vs. degree of node]
24
Correlation profiles (contd)
  • Newman: single number

Internet degree correlation coefficient: -0.189
The degree correlation coefficient is the Pearson
correlation coefficient of the degrees of the nodes on
each side of an edge
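
The single-number summary can be computed as below for an undirected edge list. Each edge is counted in both directions so the measure is symmetric; the function name is illustrative.

```python
from collections import Counter


def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    xs, ys = [], []
    for u, v in edges:
        # Use each undirected edge in both orientations.
        xs.extend((degree[u], degree[v]))
        ys.extend((degree[v], degree[u]))
    mean = sum(xs) / len(xs)
    covariance = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    variance = sum((x - mean) ** 2 for x in xs)
    return covariance / variance if variance > 0 else float("nan")


star = [(0, 1), (0, 2), (0, 3)]
triangle_plus_pair = [(0, 1), (1, 2), (0, 2), (3, 4)]
print(degree_assortativity(star), degree_assortativity(triangle_plus_pair))
```

A star is the extreme disassortative case (the hub only touches degree-1 nodes), while the second toy network links only nodes of equal degree.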
25
Other examples of assortative mixing
  • Assortativity is not limited to degree-degree
    correlations; other attributes:
      • social networks: race, income, gender, age
      • food webs: herbivores, carnivores
      • internet: high level connectivity providers,
        ISPs, consumers
  • Tendency of like individuals to associate:
    homophily
  • Scott Feld paper

26
Biological networks
  • In biological systems nodes and edges can
    represent different things
  • nodes: protein, gene, chemical
  • edges: mass transfer, regulation
  • Can construct bipartite or tripartite networks
  • e.g. genes and proteins

27
GENOME
protein-gene interactions
PROTEOME
protein-protein interactions
METABOLISM
bio-chemical reactions
slide after Reka Albert
28
Cellular processes form networks on many levels
  • metabolic reaction networks (tri-partite)
  • Node types
  • metabolites (substrates or products), open
    rectangles
  • metabolite-enzyme complexes (black rectangles)
  • enzymes (open ovals)
  • Edges
  • substrate to complex or complex to product
  • symmetrical edges

slide after Reka Albert
29
regulatory networks
nodes: genes, proteins
edges: translation; regulation (activating, inhibiting)
slide after Reka Albert
30
the yeast two-hybrid method
  • Activation and binding domains are separated and
    each attached to a different protein
  • If the proteins interact, the two domains will be
    brought together and activate the transcription
    of a reporter gene
  • Can do simultaneous genome-wide experiments

slide after Reka Albert
31
Resulting interaction network
slide after Reka Albert
32
Properties and problems of resulting networks
  • Properties
  • giant component exists
  • power law distribution with an exponential cutoff
  • longer path length than randomized
  • higher incidence of short loops than randomized
  • Problems
  • false positives
  • false negatives
  • only 20% overlap between different studies

33
Implications
  • Robustness
  • resilient to random breakdowns
  • mutations in hubs can be deadly
  • Evolution
  • most connected hubs conserved across organisms
    (important)
  • gene duplication hypothesis
  • new gene still has same output protein, but no
    selection pressure because the original gene is
    still present. So some interactions can be added
    or dropped
  • leads to scale free topology

34
Metabolic networks: how to represent them
  • Can consider the one-mode projection of substrate
    interactions (undirected)
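
A one-mode projection can be sketched as follows: two substrates are linked whenever they take part in the same reaction. The reaction names and substrate labels are made up for illustration.

```python
from itertools import combinations


def substrate_projection(reactions):
    """Undirected substrate-substrate edges from a reaction -> substrates map."""
    edges = set()
    for substrates in reactions.values():
        for a, b in combinations(sorted(set(substrates)), 2):
            edges.add((a, b))
    return edges


# Hypothetical toy data: reaction id -> substrates involved in it.
reactions = {"R1": ["A", "B", "C"], "R2": ["C", "D"]}
print(sorted(substrate_projection(reactions)))
```

The projection discards direction and the identity of the reaction, which is why it is described as undirected.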

slide after Reka Albert
35
Metabolic networks are scale-free
  • In the bi-partite graph
      • the probability that a given substrate
        participates in k reactions scales as $k^{-\alpha}$
      • indegree: $\alpha \approx 2.2$
      • outdegree: $\alpha \approx 2.2$

(a) A. fulgidus (Archaea) (b) E. coli (Bacterium)
(c) C. elegans (Eukaryote) (d) averaged over 43
organisms
36
Modularity
  • No modularity
  • Modularity
  • Hierarchical modularity

(Pajek!)
E. Ravasz et al., Science 297, 1551-1555 (2002)
37
How do we know that metabolic networks are
modular?
  • clustering decreases with degree as
    $C(k) \sim k^{-1}$
  • randomized networks (which preserve the power law
    degree distribution) have a clustering
    coefficient independent of degree

38
How do we know that metabolic networks are
modular?
  • clustering coefficient is the same across
    metabolic networks in different species with the
    same substrate
  • corresponding randomized scale-free network:
    $C(N) \sim N^{-0.75}$ (simulation, no analytical result)

Legend: bacteria; archaea (extreme-environment single cell
organisms); eukaryotes (plants, animals, fungi,
protists); scale-free network of the same size
39
Review: what would the clustering coefficient of
a random network be?
  • assume average degree of node is k
  • probability of one neighbor linking to another is
    k/N
  • scales as $N^{-1}$
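
The k/N estimate can be checked numerically. The sketch below builds random graphs in which every pair of nodes is linked independently (an assumed model, since the slide does not name one) and compares the measured average clustering coefficient with k/N.

```python
import random
from itertools import combinations


def average_clustering(adjacency):
    """Mean over nodes of (links among neighbours) / (possible such links)."""
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adjacency)


def random_graph(n, avg_degree, seed=0):
    """Each of the n(n-1)/2 node pairs is linked with probability k/(n-1)."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adjacency = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


for n in (100, 200, 400):
    c = average_clustering(random_graph(n, avg_degree=10))
    print(n, round(c, 4), 10 / n)
```

Doubling N at fixed k should roughly halve the measured value, which is the $N^{-1}$ scaling.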

40
Constructing a hierarchically modular network
  • RSMOB model
  • Start from a fully connected cluster of nodes
  • Create 4 identical replicas of the cluster,
    linking the outside nodes of the replicas to the
    center node of the original (N = 25 nodes)
  • This process can be repeated indefinitely
  • (initial number of nodes can be different from 5)
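
The construction can be sketched as an iterative copy-and-link procedure. Two points are assumptions rather than statements from the slide: node 0 is taken as the center, and at later iterations the "outside nodes" are taken to be the copies of the previous level's outside nodes inside each replica. The function name is hypothetical.

```python
from itertools import combinations


def hierarchical_network(levels, cluster_size=5):
    """Return (number of nodes, set of undirected edges) after `levels` iterations."""
    # Level 0: fully connected cluster; node 0 is treated as the center.
    edges = set(combinations(range(cluster_size), 2))
    n = cluster_size
    outside = list(range(1, cluster_size))
    replicas = cluster_size - 1  # 4 replicas for a 5-node cluster
    for _ in range(levels):
        new_edges = set(edges)
        new_outside = []
        for r in range(1, replicas + 1):
            offset = r * n
            # Copy the whole current module.
            new_edges.update((u + offset, v + offset) for u, v in edges)
            # Link the replica's outside nodes to the original center.
            copies = [node + offset for node in outside]
            new_edges.update((0, node) for node in copies)
            new_outside.extend(copies)
        edges, outside = new_edges, new_outside
        n *= replicas + 1
    return n, edges


for level in range(3):
    n, edges = hierarchical_network(level)
    print(level, n, len(edges))
```

With the default 5-node cluster, one iteration gives the 25-node network described above and a second gives 125 nodes.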

41
Properties of the hierarchically modular model
  • RSMOB model
  • Power law exponent $\gamma = 2.26$ (in agreement with
    real world metabolic networks)
  • $C \approx 0.6$, independent of network size (also
    comparable with observed real-world values)
  • $C(k) \sim k^{-1}$, as in real world network
  • How to test for hierarchically arranged modules
    in real world networks
  • perform hierarchical clustering on the
    topological overlap map (we'll cover hierarchical
    clustering in a few weeks)
  • can be done with Pajek

42
Topological overlap
  • A: Network consisting of nested modules
  • B: Topological overlap matrix

hierarchical clustering
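
The slide does not define topological overlap. The sketch below assumes one commonly used definition: the number of neighbours two nodes share, plus 1 if they are directly linked, divided by the smaller of their two degrees. The function name and the toy network are illustrative.

```python
def topological_overlap(adjacency, i, j):
    """Assumed definition: (common neighbours + direct link) / min(k_i, k_j)."""
    smaller_degree = min(len(adjacency[i]), len(adjacency[j]))
    if smaller_degree == 0:
        return 0.0
    common = len(adjacency[i] & adjacency[j])
    direct = 1 if j in adjacency[i] else 0
    return (common + direct) / smaller_degree


# Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
adjacency = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
nodes = sorted(adjacency)
matrix = [[topological_overlap(adjacency, i, j) if i != j else 1.0
           for j in nodes] for i in nodes]
for row in matrix:
    print([round(value, 2) for value in row])
```

Pairs inside the same triangle score high and pairs in different triangles score low, which is the block structure that hierarchical clustering then picks up from the matrix.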
43
Hubs may act within a module, or connect modules
  • Party hub
  • simultaneous interactions
  • tends to be within the same module
  • Date hub
  • sequential interactions
  • connect different modules

Han et al, Nature 430, 88 (2004)
slide after Reka Albert
44
  • some matching motifs frequently overlap (e.g.
    feed forward loop)

Zhang et al, J. Biol 4, 6 (2005)