Title: Exon prediction by Genomic Sequence alignment Burkhard Mo


1

Burkhard Morgenstern, Institut für Mikrobiologie und Genetik
Foundations of Bioinformatics: Multiple Sequence Alignment
June 2007


2
Progressive Alignment
  • Most popular approach to (global) multiple
    sequence alignment
  • Since the mid-1980s: Feng/Doolittle,
    Higgins/Sharp, Taylor, ...

3
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN
  • WWRLNDKEGYVPRNLLGLYP
  • AVVIQDNSDIKVVPKAKIIRD
  • YAVESEAHPGSFQPVAALERIN
  • WLNYNETTGERGDFPGTYVEYIGRKKISP

4
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN
  • WWRLNDKEGYVPRNLLGLYP
  • AVVIQDNSDIKVVPKAKIIRD
  • YAVESEAHPGSFQPVAALERIN
  • WLNYNETTGERGDFPGTYVEYIGRKKISP
  • Guide tree

5
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN
  • WW--RLNDKEGYVPRNLLGLYP-
  • AVVIQDNSDIKVVP--KAKIIRD
  • YAVESEASFQPVAALERIN
  • WLNYNEERGDFPGTYVEYIGRKKISP
  • Profile alignment, once a gap - always a gap

6
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN
  • WW--RLNDKEGYVPRNLLGLYP-
  • AVVIQDNSDIKVVP--KAKIIRD
  • YAVESEASVQ--PVAALERIN------
  • WLN-YNEERGDFPGTYVEYIGRKKISP
  • Profile alignment, once a gap - always a gap

7
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN-
  • WW--RLNDKEGYVPRNLLGLYP-
  • AVVIQDNSDIKVVP--KAKIIRD
  • YAVESEASVQ--PVAALERIN------
  • WLN-YNEERGDFPGTYVEYIGRKKISP
  • Profile alignment, once a gap - always a gap

8
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN--------
  • WW--RLNDKEGYVPRNLLGLYP--------
  • AVVIQDNSDIKVVP--KAKIIRD-------
  • YAVESEA---SVQ--PVAALERIN------
  • WLN-YNE---ERGDFPGTYVEYIGRKKISP
  • Profile alignment, once a gap - always a gap

9
Progressive Alignment
  • WCEAQTKNGQGWVPSNYITPVN--------
  • WW--RLNDKEGYVPRNLLGLYP--------
  • AVVIQDNSDIKVVP--KAKIIRD-------
  • YAVESEA---SVQ--PVAALERIN------
  • WLN-YNE---ERGDFPGTYVEYIGRKKISP
  • Most important implementation: CLUSTAL W

10
Progressive Alignment
  • CLUSTAL W: Thompson et al., 1994 (17,000
    citations)
  • Pairwise distances as 1 - percentage of identity
  • Calculate un-rooted tree with Neighbor Joining
  • Define root as central position in tree
  • Define sequence weights based on tree
  • Gap penalties calculated based on various
    parameters
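
The first step in the list above (pairwise distance as 1 minus identity) can be sketched as follows. This is a minimal illustration, not CLUSTAL W's actual code: the function name `identity_distance` is hypothetical, identity is expressed as a fraction rather than a percentage, and the choice to count only columns where both sequences have a residue is an assumption (conventions for the denominator vary).

```python
GAP = "-"

def identity_distance(row_a, row_b):
    """Distance = 1 - fraction of identical residues between two aligned rows.

    Both rows must come from the same pairwise alignment (equal length).
    Columns containing a gap in either row are ignored (an assumption).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(row_a, row_b):
        if x == GAP or y == GAP:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 1.0  # nothing comparable: treat as maximally distant
    return 1.0 - identical / compared

def distance_matrix(aligned_pairs):
    """aligned_pairs maps (name_a, name_b) -> (row_a, row_b)."""
    return {pair: identity_distance(*rows) for pair, rows in aligned_pairs.items()}
```

A matrix of such distances is what a tree-building method such as Neighbor Joining would take as input to produce the guide tree.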

11
Tools for multiple sequence alignment
  • Problems with traditional approach
  • Results depend on gap penalty
  • Heuristic guide tree determines the alignment;
    the alignment is then used for phylogeny reconstruction
  • Algorithm produces global alignments.



12
Tools for multiple sequence alignment
  • Problems with traditional approach
  • But:
  • Many sequence families share only local
    similarity
  • E.g. sequences share one conserved motif



13
Local sequence alignment
EYENS

ERYENS
ERYAS
Find common motif in sequences; ignore the rest
14
Local sequence alignment
E-YENS

ERYENS
ERYA-S
Find common motif in sequences; ignore the rest
15
Local sequence alignment

E-YENS
ERYENS
ERYA-S
Find common motif in sequences; ignore the rest
Local alignment
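
For two sequences, "find the common motif, ignore the rest" is what the Smith-Waterman algorithm does. The slides do not name a specific method, so the following is only a sketch under assumed parameters: the scores (match 2, mismatch -1, linear gap -2) and the function name `smith_waterman` are illustrative choices, not taken from the presentation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned segment of a, aligned segment of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                    # start a new local alignment
                          H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

The zero in the `max` is what makes the alignment local: unrelated flanking regions never drag the score below zero, so they are simply left out of the result.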
16
Local sequence alignment
Traditional alignment approaches: either global
or local methods!
17
New question: sequence families with multiple
local similarities


Neither local nor global methods are applicable
18
New question: sequence families with multiple
local similarities


Alignment is possible if the order is conserved
19
The DIALIGN approach
30
The DIALIGN approach
Consistency!
49
T-COFFEE

C. Notredame, D. Higgins, J. Heringa (2000),
T-Coffee: A novel method for fast and accurate
multiple sequence alignment, J. Mol. Biol.

Problem: progressive alignment can go wrong if
mistakes are made at an early stage.
Example:
50
T-COFFEE


SeqA GARFIELD THE LAST FAT CAT
SeqB GARFIELD THE FAST CAT
SeqC GARFIELD THE VERY FAST CAT
SeqD THE FAT CAT
53
T-COFFEE
  • Idea: consider different pairwise alignments
    (local and global)
  • Check how these alignments support each other
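
One way to make "support each other" concrete is the library extension described in the T-Coffee paper: the weight of a residue pair is increased whenever a residue from a third sequence is aligned to both. The sketch below is a simplified illustration, not T-Coffee's code; the data layout (residues as `(sequence name, position)` tuples), the helper names, and the weights in the usage note are all made up for the example.

```python
from collections import defaultdict

def build_lookup(primary):
    """Turn {(res_x, res_y): weight} into a symmetric nested lookup.

    A residue is a (sequence_name, position) tuple. The primary weights
    would come from the pairwise (local and global) alignments.
    """
    lookup = defaultdict(dict)
    for (x, y), w in primary.items():
        lookup[x][y] = w
        lookup[y][x] = w
    return lookup

def extended_weight(lookup, x, y):
    """Direct weight of (x, y) plus support through third sequences.

    Each residue z from another sequence that is paired with both x and y
    contributes min(W(x, z), W(z, y)).
    """
    direct = lookup.get(x, {}).get(y, 0)
    support = 0
    for z, w_xz in lookup.get(x, {}).items():
        if z == y or z[0] in (x[0], y[0]):
            continue  # z must belong to a third sequence
        w_zy = lookup.get(y, {}).get(z)
        if w_zy is not None:
            support += min(w_xz, w_zy)
    return direct + support
```

For example, with direct weights A1-B1 = 88, A1-C1 = 77 and C1-B1 = 100, the extended weight of A1-B1 is 88 + min(77, 100) = 165. A pair supported by many intermediate sequences thus outweighs a spurious pair seen in only one pairwise alignment.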


56
T-COFFEE

  • T-COFFEE
  • Less sensitive to spurious pairwise similarities
  • Can handle local homologies better than CLUSTAL

57
Evaluation of multi-alignment methods

  • Alignment evaluation by comparison to trusted
    benchmark alignments.
  • True alignment known by information about
    structure or evolution.
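
Comparison to a benchmark is commonly done with a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that the program's alignment also aligns. The sketch below assumes this measure; the slides do not state which score was used, and the function names and the input layout (a dict of equal-length gapped rows) are illustrative.

```python
from itertools import combinations

GAPS = "-."  # characters treated as gaps

def aligned_pairs(alignment):
    """Set of residue pairs that share a column.

    alignment maps sequence name -> gapped row; all rows have equal length.
    A residue is identified as (name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    seen = {name: 0 for name in names}  # residues consumed so far
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] not in GAPS:
                present.append((name, seen[name]))
                seen[name] += 1
        # names are sorted, so each pair has one canonical orientation
        pairs.update(combinations(present, 2))
    return pairs

def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        raise ValueError("reference alignment contains no aligned pairs")
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)
```

A score of 1.0 means every residue pair of the reference was recovered; restricting the reference to trusted regions (such as core blocks) would be a natural refinement.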

58
Evaluation of multi-alignment methods
  • For protein alignment:
  • M. McClure et al. (1994)
  • 4 protein families, known functional sites
  • J. Thompson et al. (1999)
  • Benchmark database, 130 known 3D structures
    (BAliBASE)
  • T. Lassmann and E. Sonnhammer (2002)
    BAliBASE and simulated evolution (ROSE)


62

Evaluation of multi-alignment methods

1aboA  1 .NLFVALYDfvasgdntlsitkGEKLRVLgynhn..............gE
1ycsB  1 kGVIYALWDyepqnddelpmkeGDCMTIIhrede............deiE
1pht   1 gYQYRALYDykkereedidlhlGDILTVNkgslvalgfsdgqearpeeiG
1ihvA  1 .NFRVYYRDsrd......pvwkGPAKLLWkg.................eG
1vie   1 .drvrkksga.........awqGQIVGWYctnlt.............peG

1aboA 36 WCEAQt..kngqGWVPSNYITPVN......
1ycsB 39 WWWARl..ndkeGYVPRNLLGLYP......
1pht  51 WLNGYnettgerGDFPGTYVEYIGrkkisp
1ihvA 27 AVVIQd..nsdiKVVPRRKAKIIRd.....
1vie  28 YAVESeahpgsvQIYPVAALERIN......

Key: alpha helix RED, beta strand GREEN, core blocks UNDERSCORE
BAliBASE Reference alignments
63
Evaluation of multi-alignment methods
  • 5 categories of benchmark sequences (globally
    related, internal gaps, end gaps)
  • CLUSTAL W and PRRP perform well on globally related
    sequences; DIALIGN is superior for local
    similarities
  • Conclusion: no single best multi-alignment
    program!


64
Evaluation of multi-alignment methods
  • T. Lassmann and E. Sonnhammer (2002)
    BAliBASE and simulated evolution (ROSE)


66
Result: DIALIGN best for distantly related
sequences, T-COFFEE best for closely related
sequences
