Network Inference

1
Network Inference
  • Chris Holmes
  • Oxford Centre for Gene Function &
  • Department of Statistics
  • University of Oxford

2
Overview
  • Statistical Inference
  • Challenges of inferring network topology and the
    structure of local dependencies
  • Use of Integrative Genomics to aid inference
  • Conclusions

3
Inference
  • Inference is the process of learning from data
  • We have two objects to infer
  • Network structure (topology)
  • Functional form of the dependencies within a
    given network structure

4
Probabilistic (Bayesian) Networks
  • Graphical structure used to define interactions
    which encode a set of conditional independencies
  • Way of simplifying a joint distribution
  • Have become extremely popular in genomics
  • - Cowell et al, Probabilistic Networks and Expert Systems, Springer (1999)
  • - Friedman, http://www.cs.huji.ac.il/~nir/
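
As a hedged illustration of how a network simplifies a joint distribution: writing pa(x_i) for the parents of node x_i in the graph (notation assumed here, not taken from the slides), the joint factorises into local conditionals. For a hypothetical three-gene chain A → B → C, only three small terms are needed.

```latex
p(x_1, \dots, x_n) = \prod_{i=1}^{n} p\bigl(x_i \mid \mathrm{pa}(x_i)\bigr),
\qquad
p(a, b, c) = p(a)\, p(b \mid a)\, p(c \mid b) \quad \text{for } A \to B \to C .
```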

5
Probabilistic Networks
  • Advantages
  • Coherent axiomatic framework
  • Provides a calculus for integrating information
    from multiple sources that guards against logical
    inconsistencies
  • Allows precise statements of uncertainty
  • - on global network structure (topologies), and
    marginals
  • Sequential Experimental design
  • - Calculate optimal follow up experiments to
    learn most about the network structure given
    current state of knowledge

6
Probabilistic Networks
  • Disadvantages
  • Causal relationships not explicitly handled
  • Dawid AP. Causal inference without
    counterfactuals (with Discussion). J Am Statist
    Assoc (2000)
  • Restrictions on valid structures
  • Hammersley-Clifford theorem: Rue & Held, Gaussian
    Markov Random Fields, Chapman & Hall (2005)

7
Network Inference
  • Prior on network space leads to posterior
  • Computational framework to learn
  • Markov Chain Monte Carlo: Gilks et al, MCMC in
    Practice, Chapman & Hall (1996)
  • Stochastic search
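
A sketch of the step from prior to posterior, using F for the network and y for the data as in the posterior Pr(F | y) on the Data Driven Networks slide. The second expression is the standard Metropolis-Hastings acceptance probability for a move from F to F'; the proposal distribution q is an assumed symbol that does not appear in the slides.

```latex
\Pr(F \mid y) = \frac{\Pr(y \mid F)\,\Pr(F)}{\sum_{F'} \Pr(y \mid F')\,\Pr(F')},
\qquad
\alpha(F \to F') = \min\left\{1,\;
  \frac{\Pr(y \mid F')\,\Pr(F')\,q(F \mid F')}{\Pr(y \mid F)\,\Pr(F)\,q(F' \mid F)}\right\}
```

The sum in the denominator runs over all network structures, which is why sampling or stochastic search is used instead of direct evaluation.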

8
Hypothesis-Driven Networks
  • Originally networks were hypothesis driven
  • Well defined small networks
  • Experiments set up to test specific hypotheses
  • Then arrival of high-throughput genomic
    (disruptive) technologies
  • Treats network structure as unknown
  • Data mining (data dredging?)

9
(No Transcript)
10
Bayesian Network Approach
Aim is to find graph topology that maximises
likelihood given the data
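
A minimal sketch of likelihood-based scoring, assuming discrete data and maximum-likelihood conditional probability tables. The function name `log_likelihood`, the toy data, and the two candidate graphs are all hypothetical and only illustrate the idea of comparing topologies by how well they explain the data.

```python
import math
from collections import Counter

def log_likelihood(data, parents):
    """Maximised log-likelihood of discrete data under a given DAG.

    data: list of dicts mapping variable name -> observed value.
    parents: dict mapping each variable -> tuple of its parent names.
    """
    total = 0.0
    for node, pa in parents.items():
        # Counts of (parent configuration, node value) and of parent configuration.
        joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        for (cfg, _value), n in joint.items():
            total += n * math.log(n / marg[cfg])
    return total

# Hypothetical toy data: gene B matches gene A in 7 of 8 samples.
data = [{"A": a, "B": b} for a, b in
        [(0, 0), (0, 0), (0, 0), (1, 1), (1, 1), (1, 1), (1, 1), (0, 1)]]

candidates = {
    "no edge": {"A": (), "B": ()},
    "A -> B": {"A": (), "B": ("A",)},
}
scores = {name: log_likelihood(data, g) for name, g in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Adding edges can never lower the maximised likelihood, so in practice the score is combined with a complexity penalty or a prior on structures.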
11
Finding Optimal Network: Hard Problem
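
One way to see why the search is hard is to count the candidate structures. The sketch below uses Robinson's recurrence for the number of labelled directed acyclic graphs on n nodes; the function name `count_dags` is an assumption for illustration. It shows the size of the search space only and is not a statement about any particular search algorithm.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in range(1, 6):
    print(n, count_dags(n))
```

The count grows super-exponentially in the number of nodes, so exhaustive enumeration is feasible only for very small networks.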
12
Data Driven Networks
  • Data is extremely sparse, compared with the
    dimensionality of the network space
  • Great uncertainty in any conclusions
  • High numbers of false positives (false
    connections) and false negatives (missing
    connections)
  • This uncertainty is encompassed in a fully
    Bayesian model, via the posterior distribution on
    network space, Pr(F | y)
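
A hedged sketch of how posterior uncertainty about individual connections could be summarised: given networks sampled from the posterior (for example by MCMC), the posterior probability of an edge is estimated by the fraction of samples that contain it. The function `edge_posterior` and the four sampled graphs are hypothetical.

```python
from collections import Counter

def edge_posterior(samples):
    """Estimate Pr(edge | y) as the fraction of sampled graphs containing it.

    samples: list of graphs, each a set of (parent, child) edges.
    """
    counts = Counter(edge for graph in samples for edge in set(graph))
    return {edge: c / len(samples) for edge, c in counts.items()}

# Hypothetical posterior samples over three genes.
samples = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("A", "C")},
    {("B", "C")},
]
probs = edge_posterior(samples)
print(probs)
```

Edges with probability near 0.5 are the ones the data say least about, which is a more honest summary than reporting a single best network.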

13
The Learned Network Structure
14
Data Driven Networks
  • A problem with data mining approaches
  • "Often the data goes in one end and the answer
    comes out the other end untouched by human
    thought" (adapted from Doug Altman)

15
Further complicating issues
  • Dynamic networks
  • Imoto (2002); Beal et al, Bioinformatics (2005)
  • Network Dynamics
  • Luscombe et al, Nature, (2004)
  • Interventional analysis
  • Ideker et al, Science, (2002)

16
Way Forward
  • More refined Prior structures
  • Multiple information sources
  • Literature mining
  • Rajagopalan, Bioinformatics (2005)
  • Comparative genomics
  • Amoutzias, EMBO (2004)
  • Combining other genomic measurement platforms
  • Schadt et al, Nat. Genet. (2005); Zhu et al,
    Cytogenet Genome Res. (2004); Beer and Tavazoie,
    Cell (2004)

17
Improving Network Inference
[Figure: sources that improve network inference: perturbations, genetics, biological context, expression observations, regulatory signals, comparative genomics]
18
Integrative Genomics
  • Combine information from multiple sources to
    improve precision
  • Information is preserved across sources while
    noise (random variation) is independent across
    information sources
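
A minimal sketch of why independent noise helps, assuming each source gives an unbiased estimate of the same quantity with known variance. Inverse-variance weighting is one standard way to combine such estimates; the function name `combine` and the numbers are hypothetical.

```python
def combine(estimates, variances):
    """Inverse-variance weighted mean of independent estimates.

    Returns the combined estimate and its variance, 1 / sum(1 / var_i).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, 1.0 / total

# Hypothetical effect estimates from three platforms measuring the same signal.
estimates = [1.2, 0.8, 1.0]
variances = [4.0, 4.0, 2.0]
mean, var = combine(estimates, variances)
print(mean, var)
```

The combined variance is smaller than that of the best single source, which is the sense in which combining sources improves precision. The gain shrinks if the noise is correlated across sources.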

19
[Figure: levels of biological information (germline DNA, somatic DNA, RNA, protein, physiology, environment) and measurement platforms (sequencing, SNPs, epigenetics, CGH, microarrays, proteomics, metabonomics)]
20
Schadt et al., Nat. Genet., July 2005.
21
Transcription: cis and trans motifs
[Figure panels: AND logic, OR logic; AND logic; OR logic, NOT logic]
Combinatorial patterns help identify groups of
transcripts predicted to show similar abundance
profiles
  • Beer and Tavazoie, Cell. 2004

Solid: actual expression. Dashed: predicted.
22
Conclusions
  • Current move back towards more hypothesis driven
    analysis on smaller networks
  • Conditioning on well-characterised network
    structures and using multiple data sources to
    infer and explore local topographic regions

23
References
  • Bayes nets: Friedman, http://www.cs.huji.ac.il/~nir/