Clustering of Interaction Network - PowerPoint PPT Presentation

About This Presentation

Title:

Clustering of Interaction Network

Description:

Title: Data Mining Approaches to Genomic Data Analysis. Last modified by: azhang. Created date: 3/22/2001 12:46:03 PM. Document presentation format: PowerPoint PPT presentation.

Slides: 27

Provided by: cseBuffal9

Learn more at: https://cse.buffalo.edu

Transcript and Presenter's Notes

1
Clustering of Interaction Network

Definition
Process to detect densely connected sub-graphs
Determines protein complexes or functional
modules
Difficulties
Noisy data (too many false positives or false
negatives)
Cannot be solved by traditional clustering
techniques
Difficult to define the pair-wise distance
between proteins in the network.
Protein complexes may overlap.
Disparate sources of data
Different reliabilities
1750
Small overlaps
<17

2
Protein Interaction Network

Undirected, unweighted graph
Node represents protein, edge represents
interaction
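
The graph representation above can be sketched as an adjacency-set dictionary. This is a minimal illustration; the protein identifiers and the `build_network` helper are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def build_network(interactions):
    """Build an undirected, unweighted graph as a dict of neighbor sets."""
    adjacency = defaultdict(set)
    for protein_a, protein_b in interactions:
        if protein_a == protein_b:
            continue  # self-interactions are skipped in this sketch
        # Undirected: store the edge in both directions.
        adjacency[protein_a].add(protein_b)
        adjacency[protein_b].add(protein_a)
    return dict(adjacency)

# Hypothetical interaction pairs, for illustration only.
interactions = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
network = build_network(interactions)
```

Storing neighbors as sets keeps duplicate interaction records from inflating a node's degree.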

Example of Yeast protein interaction network
Importance
Provide a global view of cellular organizations
and biological functions
Applicable to systematic approaches for
functional knowledge discovery
Problem
Large scale
Complex connectivity

3
Structural Property

Small-world phenomenon (Watts & Strogatz)
Networks that lie between regular and random
networks
Higher average clustering coefficient than
expected by random chance
Significantly small average shortest path length
Scale-free distribution (Barabási & Albert)
Network growth by preferential attachment
Power-law degree distribution: a few high-degree
nodes, many low-degree nodes
Clustering coefficient distribution independent
of degree

Protein interaction database      DIP      MIPS
Density                           0.0015   0.0015
Average clustering coefficient    0.2283   0.2878
Average shortest path length      4.14     4.43
Degree distribution (γ)           1.77     1.64
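
The first three quantities in the table can be computed directly from an adjacency-set graph. The sketch below uses the standard textbook definitions and a small hypothetical toy graph; it does not reproduce the DIP or MIPS numbers, and the function names are illustrative.

```python
from collections import deque

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def average_clustering(adj):
    """Mean local clustering coefficient (taken as 0 for degree < 2)."""
    total = 0.0
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            continue
        # Count edges among the neighbors of this node.
        links = sum(1 for u in neighbors for v in neighbors
                    if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def average_shortest_path_length(adj):
    """Mean hop distance over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical toy graph: a triangle A-B-C with a pendant node D.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
```

A high clustering coefficient combined with a short average path length is the small-world signature described on this slide.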
4
Conventional Graph Clustering Approaches

Density-based Clustering
Finding densely connected sub-graphs (e.g.,
maximal clique algorithm)
Hierarchical Clustering
Top-down approach: iteratively partitioning a
graph
(e.g., minimum cut algorithm)
Bottom-up approach: iteratively merging nodes
(e.g., node merging by common neighbors)
Problems
Computationally inefficient
Unable to detect overlapping clusters
Discard sparsely connected nodes
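
As one concrete instance of the density-based family, the sketch below enumerates maximal cliques with the basic Bron-Kerbosch recursion (no pivoting). It is an illustration rather than the specific algorithm used in the talk, and the toy graph is hypothetical. Its worst-case running time is exponential, which matches the inefficiency noted above.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # A clique is maximal when nothing can extend it and nothing
        # already processed could have extended it.
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adj[node],
                   excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adj), set())
    return cliques

# Hypothetical toy graph: a triangle A-B-C with a pendant node D.
toy_graph = {"A": {"B", "C"}, "B": {"A", "C"},
             "C": {"A", "B", "D"}, "D": {"C"}}
found = sorted(maximal_cliques(toy_graph))
```

In this toy graph node C belongs to both cliques found, so clique enumeration itself can report shared nodes.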

5
Functional Influence Model

Functional Flow
Treat each protein with a known functional
annotation as a source of functional flow for
that function.
Simulate the spread of this functional flow
through the neighborhoods surrounding the sources
with a random walk.
Functional score: the amount of flow that the
protein has received for that function.
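
A minimal sketch of the idea, under a simplifying assumption that is not stated in the slides: at every step each node passes all the flow it currently holds to its neighbors in equal shares, as a plain random walk would. The function name, the `steps` parameter, and the toy path graph are hypothetical.

```python
def functional_flow(adj, sources, steps=3):
    """Spread one unit of flow from each annotated source protein.

    Assumption of this sketch: a node forwards everything it holds,
    split equally among its neighbors, at every step.
    Returns the total flow each node has received (its functional score).
    """
    current = {node: (1.0 if node in sources else 0.0) for node in adj}
    received = {node: 0.0 for node in adj}
    for _ in range(steps):
        upcoming = {node: 0.0 for node in adj}
        for node, amount in current.items():
            if amount == 0.0 or not adj[node]:
                continue
            share = amount / len(adj[node])
            for neighbor in adj[node]:
                upcoming[neighbor] += share
                received[neighbor] += share
        current = upcoming
    return received

# Hypothetical path graph A - B - C with A annotated for the function.
path_graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
scores = functional_flow(path_graph, sources={"A"}, steps=2)
```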

[Figure labels: u, v, Func(a)]
6
Functional Influence

Functional Influence based on Distance.

Weibull Distribution

Curve Fitting

d is the distance between two nodes
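
The fitted formula from this slide is not preserved in the transcript. As a hedged illustration, the sketch below evaluates the standard two-parameter Weibull density at a distance d. The `shape` and `scale` values are placeholders, not the fitted parameters from the talk, and no curve fitting is performed here.

```python
import math

def weibull_pdf(d, shape, scale):
    """Standard two-parameter Weibull density at distance d > 0."""
    if d <= 0:
        raise ValueError("distance must be positive")
    ratio = d / scale
    return (shape / scale) * ratio ** (shape - 1) * math.exp(-(ratio ** shape))

# Placeholder parameters, chosen only to show how influence could
# fall off with distance between two nodes.
influence_by_distance = {d: weibull_pdf(d, shape=1.5, scale=2.0)
                         for d in range(1, 6)}
```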
7
Functional Influence Model

Information Flow Simulation
Computation of functional influence inf_s(x) of s
on x ∈ V based on shortest paths
Input: a weighted interaction network and a
source node s
Output: functional influence pattern of s
Measurements

PathRatio
PathRatio models the natural decay or loss of
information as it propagates through the network.
SPath(s,y) is the set of all shortest paths
between node s and node y.
PR(s,y) is the PathRatio between node s and node
y.
PathStrength
PS(P) measures the strength of path P
using weights on the edges along the path P.
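
The exact formulas for PR(s,y) and PS(P) are not preserved in the transcript. The sketch below covers only the parts that can be stated safely: it enumerates SPath(s,y) by breadth-first search over hop counts, and it computes a path strength as the product of edge weights along the path. The product rule, the hop-count notion of "shortest", and the toy network are assumptions of this sketch.

```python
from collections import deque

def all_shortest_paths(adj, source, target):
    """Return every fewest-hop path from source to target."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                parents[v].append(u)  # another equally short route
    if target not in dist:
        return []
    paths = []

    def walk_back(node, suffix):
        if node == source:
            paths.append([source] + suffix)
            return
        for parent in parents[node]:
            walk_back(parent, [node] + suffix)

    walk_back(target, [])
    return paths

def path_strength(adj, path):
    """Assumed form of PS(P): product of edge weights along the path."""
    strength = 1.0
    for u, v in zip(path, path[1:]):
        strength *= adj[u][v]
    return strength

# Hypothetical weighted network: node -> {neighbor: weight}.
weighted = {
    "s": {"a": 0.9, "b": 0.5},
    "a": {"s": 0.9, "y": 0.8},
    "b": {"s": 0.5, "y": 0.6},
    "y": {"a": 0.8, "b": 0.6},
}
spaths = sorted(all_shortest_paths(weighted, "s", "y"))
```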

8
Framework of functional influence simulation

Algorithm
1. Initialize inf(s).
2. Compute the initial flow I(s → y) by [equation].
3. Update inf(y) by [equation].
4. Repeat step 3 for every node in the network.
5. Finally, the functional profile is generated
for every node in the network.

F(d) is the functional distribution model, where d
is the distance between node s and node y. PR(s,y)
is the PathRatio between node s and node y.
inf(s) is the initial functional influence from
node s. inf_s(y) is the functional influence
received by node y from node s.
9
Functional Module Detection (FMD)
10
FlowChart for functional module detection
11
Functional Modularity Detection

Experimental Data
DIP (4935 proteins, 14162 interactions)
Evaluation
Functional categories and annotations from MIPS
Hypergeometric p-value
Result
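
The hypergeometric p-value used for evaluation can be sketched as follows. It gives the probability that a module of a given size contains at least the observed number of proteins from one functional category by chance. The function and argument names are illustrative, and the example counts other than the DIP network size are hypothetical.

```python
from math import comb

def hypergeometric_p_value(population, annotated, cluster_size, hits):
    """P(X >= hits) when cluster_size proteins are drawn without
    replacement from a population containing `annotated` proteins
    of the functional category."""
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

# Network size from the slide; the remaining counts are hypothetical.
p_value = hypergeometric_p_value(population=4935, annotated=100,
                                 cluster_size=20, hits=8)
```

A smaller p-value means the module is more strongly enriched for the category than random sampling would explain.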

12
Computational Epidemiology

Computational epidemiology
is a multidisciplinary field that uses
computational techniques to develop tools and
models that aid epidemiologists in studying the
spread of diseases.

1. Developing a virus spread and containment
response model
2. Understanding virus spread and identifying
critical properties
3. Applying these findings to real infectious
virus spread
4. Analyzing results of the containment strategy
(death toll vs. strategies)
13
Virus Spread Network Model

What do nodes and edges represent in the virus
spread network model?
Node
Person (community network)
Town or place (road network)
Edge
Interaction (community network)
Pathway (road network)
Weight of nodes and edges
Changes over time t according to the virus spread
dynamics model
Node weight: status of health (0 to 1)
Edge weight: status of strength (0 to 1)

14
Model Scheme

Spread Model
Spreading phase: edges that lie within the
spreading region are damaged
Defense Model
Signaling and propagation phase: nodes that have
a certain number of damaged edges send
signals to their neighbor nodes
Defense action phase: nodes that receive a certain
level of signals from neighbor nodes remove
all of their edges
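
The two defense phases can be sketched as one function. The thresholds (`damage_threshold`, `signal_threshold`), the function name, and the toy network are hypothetical; the slides only say "a certain number" and "a certain level".

```python
def defense_step(adj, damaged_edges, damage_threshold=2, signal_threshold=1):
    """Run one signaling phase followed by one defense action phase.

    adj: dict node -> set of neighbor nodes (undirected, modified in place).
    damaged_edges: set of frozenset({u, v}) edges hit by the spreading phase.
    Returns the set of nodes that culled their edges.
    """
    # Signaling phase: nodes with enough damaged edges alert neighbors.
    signals = {node: 0 for node in adj}
    for node, neighbors in adj.items():
        damaged = sum(1 for nb in neighbors
                      if frozenset((node, nb)) in damaged_edges)
        if damaged >= damage_threshold:
            for nb in neighbors:
                signals[nb] += 1
    # Defense action phase: alerted nodes remove all of their edges.
    culled = {node for node, count in signals.items()
              if count >= signal_threshold}
    for node in culled:
        for nb in list(adj[node]):
            adj[nb].discard(node)
        adj[node].clear()
    return culled

# Hypothetical network: X has two damaged edges and three neighbors.
community = {"X": {"Y", "Z", "W"}, "Y": {"X", "W"},
             "Z": {"X"}, "W": {"X", "Y"}}
damaged = {frozenset(("X", "Y")), frozenset(("X", "Z"))}
culled_nodes = defense_step(community, damaged)
```

In this toy run the neighbors of X cut their edges, which leaves X isolated.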

Virus progression to neighbor nodes
Signaling alarms from an infected node to its
neighbor nodes
Culling nodes to prevent virus progression
15
Spread Model

Spreading Model
Simulating disease spreading
Damaging nodes and edges that lie within the
virus spread radius of the center
Virus Spread by r(t)
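
A minimal sketch of the spreading phase, under two assumptions that the slides do not state: the radius grows linearly, r(t) = speed * t, and an edge counts as damaged when both of its endpoints lie inside the radius. Node coordinates and names are hypothetical.

```python
import math

def spread_radius(t, speed=1.0):
    """Assumed linear growth of the spread radius; r(t) is not given."""
    return speed * t

def spreading_phase(positions, edges, center, t, speed=1.0):
    """Return the nodes and edges damaged at time t.

    positions: dict node -> (x, y) coordinates.
    edges: iterable of (u, v) pairs.
    """
    radius = spread_radius(t, speed)
    damaged_nodes = {node for node, point in positions.items()
                     if math.dist(point, center) <= radius}
    damaged_edges = {(u, v) for u, v in edges
                     if u in damaged_nodes and v in damaged_nodes}
    return damaged_nodes, damaged_edges

# Hypothetical road network laid out on a line.
positions = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (3.0, 0.0)}
roads = [("A", "B"), ("B", "C")]
nodes_t1, edges_t1 = spreading_phase(positions, roads, center=(0.0, 0.0), t=1)
nodes_t3, edges_t3 = spreading_phase(positions, roads, center=(0.0, 0.0), t=3)
```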

16
Defense Model

Defense Model
Simulating defense system of disease spreading
and message spreading
Culling interactions from damaged nodes in order
to stop spreading (Edge Culling in Green Circles)

17
Problem / Solution Approach

Which element of the virus spread system has the
greatest impact on a containment campaign?
Identifying critical elements of the system by
computational modeling and stochastic simulation.
How can an effective containment campaign be
planned to minimize the damage from virus spread?
Mining the best combination of critical parameters
under certain conditions.

Parameters
Critical parameter
Simulation Analysis
18
Application

Virus spread simulation on the road network of
the city of Oldenburg, Germany
Green edges: healthy edges
Red edges: edges damaged by the spread process
Blue edges: edges damaged by the defense process

Uncontrolled ? 0.02
Intermediate ? 0.12
Controlled ? 0.22

19
Osteoporosis

Osteoporosis
Definition: a systemic skeletal disease
characterized by low bone mass and
micro-architectural deterioration of bone tissue,
leading to enhanced bone fragility and a
consequent increase in fracture risk
25 million people in the United States are
affected.
10 billion dollars are spent on medical
costs, including rehabilitation and treatment
facilities.
Research funding is projected to reach 200 billion
by 2040.

Normal
Osteoporosis
20
Challenges

Diagnosis of Osteoporosis?
Traditional method of evaluating bone strength is
by assessing bone mineral density (BMD).
Limitations on BMD
A major limitation of BMD is that it incompletely
reflects variation in bone strength.
Other factors like bone microarchitecture
contribute substantially to bone strength
By evaluating bone microstructure we can improve
determination of bone quality and strength

Computational Model on Bone Microstructure
21
Computational Model on Bone Microstructure

Questions
What is the better way to evaluate bone strength?
How can we identify fragile locations of the bone
structure?
Why not approach this problem from a new
direction?
Let us consider this problem from a structural
point of view.
Graph-based approach to bone microstructure
Bone microstructure contributes to bone strength.
We represent rod-like mineral fibers as
edges in a graph.
The approach allows quantitative
assessment of bone mineral
density and bone micro-architecture.

22
Model Approach

Bone is not a uniformly solid material, but
rather has some spaces between its hard elements.
Designing a network approach model for the bone
microstructure.
Quantitative assessment of bone mineral density
could be successfully done with this approach.

23
Bone Network Model

Creating Bone Network
A femur image from a patient with
osteoporosis, obtained by DXA scan.
By image profiling of the DXA scan image, we
create a bone network based on the bone density.
What do nodes and edges represent in the bone
network model?
Node: fiber binding point for bone cell movements
and biochemical interactions
Edge: a group of mineralized fibers
Weight of nodes and edges
Node weight: average weight of directly connected
edges
Edge weight: strength status of mineralized fibers
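
The node-weight rule above (the average weight of directly connected edges) can be sketched as follows. The edge weights and node names are hypothetical; in the talk they would be derived from the DXA image.

```python
def node_weights(edge_weights):
    """Compute each node's weight as the mean weight of its incident edges.

    edge_weights: dict mapping (u, v) tuples to a fiber strength value.
    """
    totals, counts = {}, {}
    for (u, v), weight in edge_weights.items():
        for node in (u, v):
            totals[node] = totals.get(node, 0.0) + weight
            counts[node] = counts.get(node, 0) + 1
    return {node: totals[node] / counts[node] for node in totals}

# Hypothetical fiber strengths between three binding points.
fibers = {("n1", "n2"): 0.8, ("n2", "n3"): 0.4, ("n1", "n3"): 0.6}
weights = node_weights(fibers)
```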

24
Problem / Solution Approach

What alternatives to bone mineral density (BMD)
are there for determining the strength of
bone?
→ Designing a computational model of bone
microstructure.
How can we identify fragile locations of the bone
structure?
→ Creating algorithms for mining weak locations
from a computational model of bone microstructure.

Human Bone
Bone Model
25
Identifying Critical Locations

Information Propagation Model
An algorithm to find critical edges in bone
network
Measuring the quantity of stress energy in each
edge
Cutting the most critical edge identified by the
Information Propagation Model
Running iteratively to find the next critical edges
Stopping when the first isolated sub-network appears
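
The iterative procedure can be sketched as a loop that re-measures stress, cuts the most stressed edge, and stops once the network first splits. The slides do not define the stress measure, so it is passed in as a function; the `inverse_strength` stand-in (weaker fibers carry more stress) and the toy network are assumptions of this sketch.

```python
from collections import deque

def is_connected(nodes, edges):
    """Check by BFS whether all nodes are reachable over the given edges."""
    adj = {node: set() for node in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)

def find_critical_edges(nodes, edge_weights, measure_stress):
    """Cut the most stressed edge repeatedly until the network splits."""
    remaining = dict(edge_weights)
    cut_order = []
    while remaining and is_connected(nodes, remaining):
        stress = measure_stress(remaining)  # re-measured after every cut
        critical = max(stress, key=stress.get)
        cut_order.append(critical)
        del remaining[critical]
    return cut_order

def inverse_strength(edges):
    """Hypothetical stress measure: weaker fibers carry more stress."""
    return {edge: 1.0 / weight for edge, weight in edges.items()}

# Hypothetical bone network with four binding points.
bone_nodes = ["a", "b", "c", "d"]
bone_edges = {("a", "b"): 0.9, ("b", "c"): 0.2,
              ("a", "c"): 0.5, ("c", "d"): 0.7}
critical_edges = find_critical_edges(bone_nodes, bone_edges, inverse_strength)
```

The returned list gives the edges in the order they were cut, so the earliest entries mark the most fragile locations under the chosen stress measure.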

26
Conclusions

Various applications are generating data very
rapidly and in great volume, demanding data
mining approaches.
Network-based approaches look promising to solve
complex problems.
This research requires close collaboration among
multidisciplinary groups.
Semi-supervised approaches to integrate domain
knowledge into data mining tools are important to
the success of the research.