Legend - PowerPoint PPT Presentation

About This Presentation
Title:

Legend

Slides: 27
Provided by: penc2
Learn more at: http://genome.cshlp.org

Transcript and Presenter's Notes

1
Legend
  • this site is under selection, Pr(ω>1) > 0.95
  • At the left corner: the aligners for whose
    alignment of the gene at least 1 site was
    inferred to be under selection, Pr(ω>1) > 0.95
    (12 species alignments only)
  • Sites inferred to be under selection with
    Pr(ω>1) > 0.5 based on PAML's BEB analysis (Clark
    et al. 2007). When more than 1 alignment is shown
    for the gene, the star indicates which alignment
    the shown sites belong to (for example, TC =
    T-Coffee).
  • The reported probabilities are based on BEB or
    NEB analysis, as labeled (melanogaster group
    alignments only; all 12 species alignments
    are based on BEB analysis)

[Figure labels: Amap, BEB, NEB]
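
The two thresholds in the legend (sites shown above 0.5, sites called as under selection above 0.95) can be sketched as simple filters over per-site posterior probabilities. This is only an illustrative sketch: the `Site` record, the function names, and the example values are hypothetical and are not taken from the presentation or from PAML output.

```python
from typing import NamedTuple


class Site(NamedTuple):
    position: int  # codon position in the alignment (hypothetical field)
    prob: float    # posterior Pr(omega > 1) from a BEB or NEB analysis


def shown_sites(sites, threshold=0.5):
    """Sites that would be displayed: Pr(omega > 1) above the display threshold."""
    return [s for s in sites if s.prob > threshold]


def selected_sites(sites, threshold=0.95):
    """Sites called as under selection: Pr(omega > 1) above the stricter threshold."""
    return [s for s in sites if s.prob > threshold]


# Hypothetical example values, for illustration only.
example = [Site(12, 0.97), Site(40, 0.89), Site(303, 0.53), Site(310, 0.41)]
print([s.position for s in shown_sites(example)])
print([s.position for s in selected_sites(example)])
```

Both comparisons are strict, matching the "> 0.5" and "> 0.95" wording of the legend.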
2
  • 12 species alignments
  • examples

3
Working definitions
  • Correct: the codons of at least 1 of the inferred
    sites under selection (Pr(ω>1) > 0.95) are most
    likely correctly aligned
  • Misaligned: there is no inferred positively
    selected site where the codons are most likely
    correctly aligned
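
The working definitions above amount to a small decision rule, sketched here under stated assumptions. The judgement "most likely correctly aligned" is made by eye in the presentation; in this sketch it is simply a boolean supplied by the caller, and `classify_gene` is a hypothetical helper name.

```python
def classify_gene(sites, threshold=0.95):
    """Classify one gene alignment using the working definitions.

    sites: iterable of (prob, well_aligned) pairs, where prob is the
    posterior Pr(omega > 1) and well_aligned is a manual judgement
    (True if the codons at that site look correctly aligned).
    Returns "Correct", "Misaligned", or None when no site passes the
    threshold, in which case the definitions do not apply.
    """
    judgements = [well_aligned for prob, well_aligned in sites if prob > threshold]
    if not judgements:
        return None
    # One well-aligned selected site is enough for "Correct".
    return "Correct" if any(judgements) else "Misaligned"
```

For example, `classify_gene([(0.97, True), (0.99, False)])` gives "Correct", because at least one site above the threshold is well aligned.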

4
FBgn0036058: Correct
[Figure labels: Amap, Clustal, Muscle, Probcons, T-Coffee, T-Coffee]
All 5 aligners produced the same alignment in
the shown region, and in all cases this site was
inferred to be under selection
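
One way to check that several aligners "produced the same alignment" in a region is to compare which residues share a column, rather than comparing raw column numbers, since gap placement elsewhere shifts the coordinates. The sketch below is an assumption about how such a check could be done, not the method used in the presentation; the function names and the toy sequences are hypothetical.

```python
def column_signatures(alignment, species):
    """Describe each column by the ungapped residue index of every species.

    alignment: dict mapping species name -> aligned sequence (same length).
    Returns a list of tuples, one per column, with None where a species has a gap.
    """
    counters = {sp: 0 for sp in species}
    columns = []
    for col in range(len(alignment[species[0]])):
        signature = []
        for sp in species:
            if alignment[sp][col] == "-":
                signature.append(None)
            else:
                signature.append(counters[sp])
                counters[sp] += 1
        columns.append(tuple(signature))
    return columns


def same_alignment_in_region(alignments, species, ref, start, end):
    """True if all aligners pair the same residues for ref residues start..end-1.

    alignments: dict mapping aligner name -> alignment (dict species -> sequence).
    Columns where the reference species has a gap are ignored.
    """
    ref_idx = species.index(ref)
    regions = []
    for aln in alignments.values():
        cols = [c for c in column_signatures(aln, species)
                if c[ref_idx] is not None and start <= c[ref_idx] < end]
        regions.append(cols)
    return all(region == regions[0] for region in regions[1:])


# Toy sequences, for illustration only.
species = ["dmel", "dana"]
aln_a = {"dmel": "MKT-A", "dana": "MKTQA"}
aln_c = {"dmel": "MK-TA", "dana": "MKTQA"}
print(same_alignment_in_region({"Amap": aln_a, "Clustal": aln_c}, species, "dmel", 0, 4))
```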
5
FBgn0040696: Misaligned (end of CDS problems)
[Figure labels: Amap, Clustal, Muscle, Probcons, T-Coffee, T-Coffee, Muscle, TC]
This gene has at least 1 site with Pr(ω>1) = 1 for
all 5 aligners, all in this same region
6
FBgn0022960: Misaligned (start of CDS problems, gross misalignment)
[Figure labels: T-Coffee, Amap, A]
This gene has at least 1 site with Pr(ω>1) > 0.99
for all 5 aligners, all in the shown region. The
underlined sequences are almost 100% identical;
however, T-Coffee did not align them correctly.
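
A claim such as "almost 100% identical" can be checked with a simple percent-identity calculation over two sequences placed position by position. This is a minimal sketch; `percent_identity` is a hypothetical helper, and the example strings are made up rather than taken from FBgn0022960.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues, ignoring gapped positions.

    seq_a, seq_b: equal-length strings, with '-' marking gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Made-up sequences: 7 of 8 positions match.
print(percent_identity("MKTAYIAK", "MKTAYIAR"))  # 87.5
```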
7
FBgn0031478: Misaligned (fast evolving region)
[Figure labels: Muscle, Probcons, T-Coffee, Clustal, Probcons, P]
For Clustal, in this region all Pr(ω>1) < 0.5
except for the site corresponding to Probcons
303 K (with Pr = 0.53)
8
FBgn0034434: Misaligned (repeats: H, Q)
[Figure labels: Muscle, Probcons, Muscle, Clustal, M]
For Clustal, on this site Pr(ω>1) < 0.5.
9
FBgn0002932: Misaligned (2 different transcripts)
[Figure labels: Clustal, Muscle, T-Coffee, Clustal, Probcons, C]
10
FBgn0004380: Misaligned
[Figure labels: T-Coffee, T-Coffee, Muscle, T]
Only the alignment with T-Coffee has Pr(ω>1) >
0.95.
11
FBgn0039025: Misaligned (indels and repeats)
[Figure labels: Muscle, Probcons, T-Coffee, Muscle, Clustal, M]
The shown region is followed by a very conserved
200 aa sequence.
12
FBgn0037580: Misaligned
[Figure labels: Probcons, Probcons]
There is no reason why the R at position 40 (D.
pseudoobscura) should be before and not after
the gap. Exactly the same column, but without the
R, in the Amap alignment resulted in Pr(ω>1) = 0.89.
13
  • melanogaster group
  • examples

14
Working definitions
  • Correct: the codons causing positive selection
    are most likely correctly aligned
  • Misaligned: the codons causing positive
    selection are likely incorrectly aligned
  • Significant?: partial misalignments, which are
    likely to significantly affect the statistical
    significance of the PAML LRT/FDR results
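
To make concrete why a few misaligned codons can change "the PAML LRT/FDR results", the sketch below shows one common way such numbers are computed. It rests on assumptions the slides do not state: a likelihood ratio test with 2 degrees of freedom (as in the usual M7 versus M8 comparison) and Benjamini-Hochberg adjustment for the FDR. The function names and the example inputs are hypothetical.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test against chi-square with 2 df.

    For 2 degrees of freedom the chi-square survival function is exactly
    exp(-x / 2), so no statistics library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.exp(-stat / 2.0)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log-likelihoods for one gene (null model, alternative model).
print(lrt_pvalue_df2(-1005.0, -1000.0))
```

Because the test statistic depends on the log-likelihood gained by allowing a class of sites with ω > 1, a handful of misaligned columns that look like rapid change can raise the statistic and move a gene across the significance cutoff.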

15
FBgn0033942: Correct
[Figure labels: T-Coffee, 1, NEB, 1, BEB, 1]
All 4 BEB analysis sites have well aligned
codons. 1: example with 1 of the 4 sites
16
FBgn0031155: (likely) Correct
[Figure labels: 1, T-Coffee, BEB, BEB, 2, 1, 2]
  • 1, 2: 2 well aligned sites
  • BEB analysis
  • the total number of sites in the ω>1 category
    is 122 (28% of all sites)
  • At least 6 of the ones with Pr > 0.9 are well
    aligned; at least 4 are not
  • no sites with Pr > 0.95

17
FBgn0032627, part 1: Misaligned, not due to lack
of information
[Figure labels: 2, T-Coffee, 1, Amap, 1, BEB, T, 2]
  • 1: region that is misaligned with T-Coffee, but
    not Amap
  • 2: the start codon, aligned with a non-start
    codon, is inferred to be under selection (Pr = 0.943)
  • start/end problems seem common; the mel sequence
    is often, but not always, missing
18
FBgn0032627, part 2: Misaligned, with an attempt
to mask
[Figure labels: T-Coffee, 1, Amap, 1, T, BEB, T, BEB]
1: a region that is not well masked. X: masked sites
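
The idea of masking unreliable sites with X can be sketched as replacing residues in chosen alignment columns. How the columns were chosen for this gene is not described in the slides; in this sketch they are simply passed in, and `mask_columns` and the toy sequences are hypothetical.

```python
def mask_columns(alignment, columns, mask_char="X"):
    """Replace residues in the given 0-based columns with mask_char.

    alignment: dict mapping species name -> aligned protein sequence.
    Gap characters ('-') are left untouched so the alignment shape is kept.
    """
    to_mask = set(columns)
    masked = {}
    for name, seq in alignment.items():
        masked[name] = "".join(
            mask_char if i in to_mask and ch != "-" else ch
            for i, ch in enumerate(seq)
        )
    return masked


# Toy alignment, for illustration only.
print(mask_columns({"dmel": "MK-TA", "dana": "MRQTA"}, [1, 2]))
```

A region counts as "not well masked" when misaligned columns adjacent to the masked ones are left in place and still contribute apparent substitutions.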
19
FBgn0025815: Misaligned (fast evolving region)
[Figure labels: T-Coffee, T-Coffee, BEB]
Unreliable at the codon level, though clearly
the region is evolving faster than the rest of
the gene
20
FBgn0036686: Misaligned (repeats)
[Figure labels: Probcons, NEB, T, P, BEB]
  • T-Coffee compared to Probcons: the highest BEB
    Pr(ω>1) with Probcons is only 0.6
21
FBgn0036195: Misaligned (alternative splicing
and/or annotation and/or non amino acid level
polymorphism)
[Figure labels: T-Coffee, 1, BEB, 1, 1]
  • 1: These 2 sites (RR) are the only fast evolving
    sites in a very well conserved gene
  • NCBI search on the left
  • the sequence observed in dmel, ending with RR,
    can also be found in dana, also ending with RR
  • dsec, dsim and dyak get similar hits
22
FBgn0050166, part 1 of 2: Misaligned (end of
CDS issues)
[Figure labels: T-Coffee, 1, BEB, BEB]
1: This region accounts for 35 sites with
Pr(ω>1) > 0.99
23
FBgn0050166, part 2 of 2: Misaligned (different
sequence in dana, indel)
[Figure labels: T-Coffee, T-Coffee, BEB]
24
FBgn0030998: Significant?
[Figure labels: 1, T-Coffee, T-Coffee, 2, 2, 2, NEB, BEB, 2]
  • 1: Well aligned (site 614)
  • 2: Likely misaligned (sites 25, 17, 19)
  • The remaining sites, not shown, have dubious
    alignments (and lower Pr(ω>1)). There are a total
    of only 10 sites in the ω>1 class
25
FBgn0034295: Significant?
[Figure labels: 1, 1, 1, 1, T-Coffee, T-Coffee, 2, NEB, BEB, 1, 2]
  • 1: Sites with good alignments (2, 48, 49, 81,
    82, all with BEB Pr(ω>1) > 0.9)
  • 2: Simple repeats region; in dana the repeat is
    different
26
FBgn0033607: Misaligned (alternative splicing)
[Figure labels: T-Coffee, BEB]
This is the end of the CDS. This is the only site
in the ω>1 PAML M8 class in the gene, as well as
the only site with Pr(ω>1) after NEB analysis
(with Pr(ω>1) = 1.000).