Title: Finding regulatory modules: A statistical approach

1
Finding regulatory modules: A statistical approach

Mikhail Velikanov
Linnaeus Centre for Bioinformatics

2
Introduction

Regulatory modules (RMs): sets of regulatory
sites that work cooperatively
TF binding sites and promoter elements
Splicing enhancers and suppressors
Site clusters:
site A AND (site B OR site C) AND (NOT site D)
Beads on a spring:
site B is 20 ± 3 bp downstream of site A
Distance distributions have a short range and a
well-defined peak

3
Searching for RMs: Setup of the Problem
Motifs
Annotations

Sequence length: constant and small (0.5 kb)
Number of sites: 20
No overlapping sites
Sites characterized by:
Identity
P-values (p_t), shown by width

Look for annotation patterns that occur
consistently in all or some of the sequences.
4
RMs as Annotation Alignments

Align sites by identity
Find sequences of 2 or more sites shared across a
number of annotations (common annotations)
Conditions:
1. Distances between sites are similar
2. P-values of aligned sites are similar
3. P-values of aligned sites are small

Need a function that measures how well conditions
(1-3) are satisfied (strength of common
annotation).
5
Strength of common annotation: site p-values

Assume a common annotation of S sites supported
by N sequences
For the i-th site, let $p_i^{\min}$, $p_i^{\max}$ be the
smallest and the largest of the N p-values
$p_i^{\max}$: measure of how small the p-values are
$R_i = p_i^{\max} / p_i^{\min}$: measure of similarity
Probability $p_i$ of observing p-values as similar
and as small in N random annotations
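
The slide does not state a formula for this probability. As a minimal sketch, assume the N p-values seen in random annotations are independent and uniform on (0, 1). Under that assumption, the chance that the largest value is at most $p_i^{\max}$ and the max/min ratio is at most $R_i$ works out to $(p_i^{\max})^N (1 - 1/R_i)^{N-1}$. The function name `site_pvalue_stats` is hypothetical, and the uniform null model is an assumption that may differ from the author's.

```python
def site_pvalue_stats(pvalues):
    """Return (p_max, ratio, prob) for one aligned site.

    pvalues: the N p-values of the site, one per supporting sequence.
    prob is computed under an ASSUMED null model (i.i.d. uniform p-values),
    which is not taken from the slides.
    """
    n = len(pvalues)
    if n < 2:
        raise ValueError("need p-values from at least two sequences")
    p_min, p_max = min(pvalues), max(pvalues)
    if not 0.0 < p_min <= p_max <= 1.0:
        raise ValueError("p-values must lie in (0, 1]")
    ratio = p_max / p_min  # R_i: similarity of the p-values (1 = identical)
    # P(max <= p_max and max/min <= ratio) for n i.i.d. uniform(0, 1) values
    prob = p_max ** n * (1.0 - 1.0 / ratio) ** (n - 1)
    return p_max, ratio, prob


if __name__ == "__main__":
    print(site_pvalue_stats([0.01, 0.02, 0.04]))
```

Smaller and more similar p-values both push the probability down, which matches conditions 2 and 3 of the alignment slide.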

6
Strength of common annotation: distances between
sites

Account for no overlaps between sites:
renormalization of $p_i$ for each site
$p_0 = 1 - \sum_{i=1}^{S} p_i$: positions between sites
Compute approximately the probability of the common
annotation, $P_{CA}$, as a function of $p_0, p_1, \ldots, p_S$
Strength of common annotation:
$Z = -\ln P_{CA}$

7
Searching for the strongest common annotations

Given an input set of annotations, define groups
of annotations such that
each group has at least one common annotation
the strongest common annotation of each group is
distinct
NB: Groups may fully or partially overlap!

Cannot use standard clustering algorithms.
8
Classification Algorithm

Find pairs of annotations with at least one
common annotation
Each pair is a nucleus of a potential group
Each group grows by adding annotations one at a
time:
the group retains its strongest common annotation
at each step
each addition maximizes the group strength
an annotation added to one group remains available
for addition to other groups
Where does the growth stop?

(strength = group strength, $Z_g$)
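
The growth procedure above can be sketched as a greedy loop. This is an illustration under stated assumptions, not the author's implementation: `grow_groups` and `shared_sites` are hypothetical names, the strength function is passed in as a callable, and the toy strength used here (number of site identities shared by all members) stands in for the real $Z_g$. The check that a group keeps its strongest common annotation is not modeled.

```python
from itertools import combinations


def grow_groups(annotations, strength, z_min):
    """Greedy group growth from every pair of annotations.

    annotations: dict mapping a sequence name to its annotation.
    strength: callable taking a list of annotations, returning a score.
    z_min: growth stops once the best addition scores below this value.
    Returns a dict mapping frozenset(names) -> group strength.
    """
    names = sorted(annotations)
    found = {}
    for pair in combinations(names, 2):
        group = list(pair)
        z = strength([annotations[n] for n in group])
        if z < z_min:
            continue  # this pair has no usable common annotation
        # Every other annotation stays available, so groups may overlap.
        candidates = [n for n in names if n not in group]
        while candidates:
            scored = [
                (strength([annotations[m] for m in group + [n]]), n)
                for n in candidates
            ]
            z_best, best = max(scored)  # addition that maximizes strength
            if z_best < z_min:
                break
            group.append(best)
            candidates.remove(best)
            z = z_best
        found[frozenset(group)] = z
    return found


def shared_sites(annos):
    """Toy strength: count of site identities present in every annotation."""
    return len(set.intersection(*annos))


if __name__ == "__main__":
    toy = {
        "s1": {"A", "B"},
        "s2": {"A", "B", "C"},
        "s3": {"C", "D"},
        "s4": {"C", "D", "E"},
    }
    print(grow_groups(toy, shared_sites, z_min=2))
```

Keying the result by the set of member names merges groups that different nuclei grow into the same membership.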
9
Stopping criterion

No more annotations can be added:
group contains all annotations in the input set
Change in the strongest common annotation:
formed during growth of another group
ignore current group (pruning)
Group strength is too small:
adding an unrelated annotation
group strength $Z_g$ is a score ($Z_g > 0$)
its distribution can be computed for groups of random
annotations by the extremal types theorem:
$\lim_{n \to \infty} \mathrm{Prob}(Z_g^{\mathrm{rand}} > Z_g) = 1 - \exp\left[-(Z_g/b)^{-a}\right]$
threshold on $Z_g$
numerical calibration of a, b for all possible N,
S
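
The limiting form on this slide, 1 - exp[-(Zg/b)^(-a)], can be inverted to turn a chosen significance level into a threshold on the group strength. The sketch below assumes a and b have already been calibrated for the relevant N and S; the function names and the values of a, b and alpha in the usage lines are made up for illustration.

```python
import math


def tail_probability(z, a, b):
    """Prob(Z_rand > z) = 1 - exp(-(z / b) ** (-a)), valid for z > 0."""
    if z <= 0:
        raise ValueError("group strength must be positive")
    return -math.expm1(-((z / b) ** (-a)))


def strength_threshold(alpha, a, b):
    """Smallest z whose tail probability is at most alpha (0 < alpha < 1).

    Solves 1 - exp(-(z / b) ** (-a)) = alpha for z.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    return b * (-math.log1p(-alpha)) ** (-1.0 / a)


if __name__ == "__main__":
    a, b = 3.0, 2.0  # hypothetical calibration constants
    z_min = strength_threshold(0.05, a, b)
    print(z_min, tail_probability(z_min, a, b))
```

Groups whose strength falls below the returned value would be treated as indistinguishable from groups of random annotations at that level.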
10
From annotation groups to RMs

Need a way to
account for optional sites
search for homologous RMs

11
RMs as generalized HMMs

Generalized (duration) HMMs (gHMMs or dHMMs)
consist of 2 types of states:
motif states (PSSMs): annotation sites
spacer states (distance distributions): gaps between sites
States are connected according to a certain
topology
Transition probabilities depend only on the
present state
Common annotations of groups are simple gHMMs
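
As a rough illustration of the two state types, the sketch below represents a single common annotation as a linear chain of motif and spacer states and samples site positions from it. The class and function names are hypothetical, the uniform gap distribution is an assumption, and motif states carry only a name and width here rather than a full PSSM. A chain has trivial transitions, so branching topologies are not covered.

```python
import random
from dataclasses import dataclass


@dataclass
class MotifState:
    name: str   # site identity
    width: int  # number of bases the site occupies


@dataclass
class SpacerState:
    min_gap: int  # shortest allowed gap (bp)
    max_gap: int  # longest allowed gap (bp); uniform in between (assumed)


def sample_positions(chain, start, rng):
    """Walk a linear chain of states and return [(site name, start position)].

    Gaps are measured from the end of one site to the start of the next.
    """
    pos = start
    sites = []
    for state in chain:
        if isinstance(state, MotifState):
            sites.append((state.name, pos))
            pos += state.width
        else:
            pos += rng.randint(state.min_gap, state.max_gap)
    return sites


if __name__ == "__main__":
    chain = [MotifState("A", 8), SpacerState(20, 25), MotifState("B", 10)]
    print(sample_positions(chain, start=100, rng=random.Random(0)))
```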

12
RMs as generalized HMMs
Can make a single model because of the overlap!

Common annotations define gHMM states
Overlaps define topology and provide estimates of
transition probabilities
Multiple matches to the model

13
From annotations to RMs
RMs
14
Testing the Method: Test 1

25 random DNA sequences, 20 are seeded with an
RM:
2 sites with low p-value (< $10^{-3}$) separated by
20-25 bp
Scan sequences with an unrelated motif subject to
a p-value threshold:
3rd site (random noise in annotations)

Figure labels: m0, m1
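
A test set of this kind could be generated along the following lines. This is a sketch under assumptions: the two site strings are arbitrary placeholders, planting exact copies stands in for "sites with low p-value", the 0.5 kb length is taken from the problem setup slide, and the gap is read as a 20-25 bp range measured between the end of the first site and the start of the second. The function name `make_test_set` is hypothetical.

```python
import random


def make_test_set(n_seqs=25, n_seeded=20, length=500,
                  site_a="TTGACA", site_b="TATAAT", gap=(20, 25), seed=0):
    """Return n_seqs random DNA strings; the first n_seeded carry both sites."""
    rng = random.Random(seed)
    seqs = []
    for k in range(n_seqs):
        seq = [rng.choice("ACGT") for _ in range(length)]
        if k < n_seeded:
            d = rng.randint(*gap)  # gap between the two planted sites
            span = len(site_a) + d + len(site_b)
            start = rng.randint(0, length - span)
            seq[start:start + len(site_a)] = site_a
            b_start = start + len(site_a) + d
            seq[b_start:b_start + len(site_b)] = site_b
        seqs.append("".join(seq))
    return seqs


if __name__ == "__main__":
    data = make_test_set()
    print(len(data), len(data[0]))
```

Scanning the result with an unrelated motif would then add the third, noisy site described on the slide.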
15
Testing the Method: Test 2

25 random DNA sequences, 2 non-overlapping groups
of 10 and 11 sequences
each group is seeded with a distinct RM (2
sites)
distance between sites is 20-25 bp or 52-55
bp
Extra site added as before

16
Testing the Method: Test 3

25 random DNA sequences, 2 overlapping groups of
12 and 14 sequences
same RMs as in previous test
groups overlap by 5 sequences
Extra site added as before

17
Summary

A method for discovery of regulatory modules
given a set of annotated sequences
Builds RMs from recurrent annotation patterns
Treats site p-values and distances in a consistent
statistical framework
Can use prior information on RMs (Bayesian
approach)
RMs are output as gHMMs:
flexibility of RM structure (topology)
searching for homologous RMs

18
Future developments

Testing the method on real data
upstream regions of bacterial operons
bacterial Fe-regulons
other benchmark sets?
Algorithm improvements
better stopping criterion (use properties of
distance distributions)
more precise computation of common annotation
strength
better similarity measure for site p-values
(reduce compensation)

19
Acknowledgements

Thanks to David Ardell (LCB, Uppsala) and
Georgiy Sofronov (Univ. of Queensland, Brisbane)
for many fruitful discussions
