Title: Phylogenetic footprinting

1
Phylogenetic footprinting
Topics in Computational Biology
Ilkka Vaahtoranta, 4.3.2004
2
Phylogenetic footprinting

Introduction
Methods used
Substring Parsimony Problem (Torsten)
Results

3
Introduction
4
The Problem

A major challenge of current genomics is to
understand how gene expression is regulated.
An important step towards this understanding is
the ability to identify regulatory elements.

5
In a Nutshell

Phylogenetic footprinting is a method for the
discovery of regulatory elements in a set of
orthologous regulatory regions from multiple
species. It does so by identifying the best
conserved motifs in those orthologous regions.

The idea of phylogenetic footprinting was first
proposed as early as 1988 (Tagle, Koop,
Goodman...)
It was a little ahead of its time
Only a few sequences from related species were
available

6
Orthologous vs. Analogous

Orthologous sequences have the same function in
different species and are related
Analogous sequences have the same kind of function
but are not related
Phylogenetic footprinting uses orthologous
sequences

7
Regulatory Elements
[Figure: gene diagram with labels REs, promoter
sequence, 5' end, gene, exon, intron]
Promoter

Usually lies before (upstream of) the actual gene
Rarely lies after the gene
Approximately 600-1000 bp long
Holds the regulatory elements

Regulatory elements

Relatively short sequences, 5 to 25 bp long
May contain gaps
Appear in otherwise non-functional sequence

8
Multiple genes in single species

Single species
Related genes
This technique is used to find common regulatory
factors
Only within the given organism
REs of a single gene are not found
This is not phylogenetic footprinting

9
Multiple species with orthologous regulatory
regions
What do we need to identify regulatory elements?

A set of orthologous non-functional DNA sequences
from related species
For example, one might use the non-coding
sequence of insulin in ten different vertebrates
If a region is well conserved, it is a possible RE
This is phylogenetic footprinting
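
The idea of looking for well conserved stretches across orthologous regions can be sketched in a few lines of Python. This is only a minimal illustration: it requires exact matches in every species, whereas real footprinting tolerates mismatches and uses the phylogeny. The function name `shared_kmers` and the toy sequences are hypothetical and are not real insulin data.

```python
def shared_kmers(seqs, k):
    """Return the set of length-k substrings present in every sequence."""
    common = None
    for s in seqs:
        kmers = {s[i:i + k] for i in range(len(s) - k + 1)}
        common = kmers if common is None else common & kmers
    return common or set()

# Hypothetical toy upstream regions from three species (not real data).
toy_regions = [
    "ACGTTGACGTCATTGC",
    "GGTGACGTCAACCTAA",
    "TTTATGACGTCAGGCA",
]

# Any 8-mer that survives in all three regions is a candidate RE.
print(sorted(shared_kmers(toy_regions, 8)))
```

With these toy regions the only 8-mer found in all three is TGACGTCA, which is then a candidate RE.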

10
Why examine non-functional sequences?

Functional sequences evolve at a slower rate than
non-functional sequences because of selective
pressure
A mutation in a functional sequence (gene) may
change the whole function of the encoded protein
A mutation in a non-functional sequence (RE)
may only change the expression frequency of a gene

11
Phylogenetic footprinting exploits the difference
in mutation rate between functional and
non-functional sequences
12
Methods used
13
Global Multiple Alignment

CLUSTALW, a global multiple alignment (GMA) tool
Drawbacks of global multiple alignment
It is an NP-hard problem.
Even if an optimal multiple alignment could
identify all REs, we could not compute it in
practice.
Because REs are quite short (about 10 in 1000
nucleotides), the noise of diverged non-functional
sequences will overwhelm the short conserved
signal.

14
Classical motif finding

Standard motif finding (MEME, AlignACE,
ANN-Spec....)
Segment-based motif finding (DIALIGN...)
Both outperform global multiple alignment

All share an important shortcoming

They do not take phylogenetic relationships into
account
Closely related sequences get too much weight
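
The weighting problem can be made concrete with a toy score. The sketch below uses mean pairwise identity as a stand-in for a tree-unaware conservation score. It is not the scoring function of MEME or any other tool named above, and the windows are hypothetical.

```python
from itertools import combinations

def identity(a, b):
    """Fraction of positions at which two equal-length windows agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def mean_pairwise_identity(windows):
    """Unweighted conservation score: average identity over all pairs."""
    pairs = list(combinations(windows, 2))
    return sum(identity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical aligned windows: four identical windows standing in for
# closely related species, plus one window from a distant species.
close_relatives = ["ACGTACGT"] * 4
distant = "TGCAACGT"

score_all = mean_pairwise_identity(close_relatives + [distant])
score_one_per_clade = mean_pairwise_identity([close_relatives[0], distant])
print(score_all, score_one_per_clade)
```

Adding three more close relatives raises the score from 0.5 to 0.8 although no new evolutionary evidence was added. A method that knows the tree would not count those near-duplicates as independent support.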

15
Substring Parsimony Problem, a motif finding
algorithm

A formalization of the phylogenetic footprinting
(PF) idea
Also an NP-hard problem, but easy to tune to
eliminate the exponential behavior
Substring parsimony searches for the best
alignments in a given sequence set.
The difference between substring parsimony and
multiple alignment lies in the given phylogenetic
tree.
Multiple alignment does not take the
relationships of the given species into account.
This leads to a situation where closely related
sequences in the given set get a relatively high
weight in the solution.
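
The slides do not spell out the formal problem, so the sketch below assumes the usual statement of substring parsimony. Given a tree with one sequence per leaf and a motif length k, choose one k-mer per node, where each leaf k-mer must occur in that leaf's sequence, so that the total Hamming distance along the tree edges is minimal. The code is a brute-force dynamic program over all 4^k k-mers, so it is only practical for small k. It also assumes every sequence has at least k nucleotides. The names `substring_parsimony`, `toy_tree` and `toy_seqs` are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def substring_parsimony(tree, seqs, k):
    """Return the minimum parsimony score over all k-mer labelings.

    tree: nested tuples of leaf names, e.g. (("human", "mouse"), "fish")
    seqs: dict mapping each leaf name to its DNA string (length >= k)
    """
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    inf = float("inf")

    def cost_table(node):
        # table[s] = best score of the subtree if this node is labeled s
        if isinstance(node, str):  # leaf: label must occur in its sequence
            seq = seqs[node]
            present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
            return {s: (0 if s in present else inf) for s in all_kmers}
        table = dict.fromkeys(all_kmers, 0)
        for child in node:
            child_table = cost_table(child)
            finite = [(t, c) for t, c in child_table.items() if c != inf]
            for s in all_kmers:
                # cheapest child label t, paying the edge cost d(s, t)
                table[s] += min(c + hamming(s, t) for t, c in finite)
        return table

    return min(cost_table(tree).values())

# Hypothetical toy input: two close species and one distant species.
toy_tree = (("human", "mouse"), "fish")
toy_seqs = {
    "human": "AAACGTAA",
    "mouse": "TTACGTTT",
    "fish": "GGACCTGG",
}
print(substring_parsimony(toy_tree, toy_seqs, 4))
```

In the toy input, human and mouse share ACGT and fish carries ACCT, so the best labeling costs a single substitution on the fish branch. Because the score is summed along tree edges, extra close relatives add little cost and little weight, which is the difference from tree-unaware multiple alignment noted above.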