Title: Patterns of Substitution and Replacement
6. Pattern of Substitution in Pseudogenes
Based on a sample of 105 mammalian
retropseudogenes.
7. The sum of the relative frequencies of
transitions is 68%. If all mutations occurred with
equal frequency, the expectation would be 33%.
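The 33% expectation can be checked by counting: of the 12 possible changes between four nucleotides, 4 are transitions (purine to purine, or pyrimidine to pyrimidine). The following is a minimal sketch of that count; the helper name `is_transition` is introduced here for illustration and is not from the slides.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(original, mutant):
    """True if both bases are purines or both are pyrimidines."""
    pair = {original, mutant}
    return pair <= PURINES or pair <= PYRIMIDINES

# All 12 ordered changes between two different nucleotides.
substitutions = list(permutations("ACGT", 2))
transitions = [s for s in substitutions if is_transition(*s)]

# Expected share of transitions if all 12 changes are equally frequent.
expected_transition_fraction = len(transitions) / len(substitutions)
print(len(transitions), len(substitutions), round(expected_transition_fraction, 2))
```

Under this equal-frequency assumption the expected share is 4/12, which is the 33% baseline against which the observed 68% is compared.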
8. Compared with the 50% expectation, 59.2% of
all substitutions are from G and C, and 56.4% of
all substitutions are to A and T.
In the absence of selection, DNA will tend to
become AT-rich.
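The 50% baseline follows from the same kind of count: if all 12 possible changes were equally frequent, half would start from G or C and half would end at A or T. A short sketch, with variable names chosen here for illustration:

```python
from itertools import permutations

# All 12 ordered changes between two different nucleotides.
all_changes = list(permutations("ACGT", 2))

# Changes that start from G or C, and changes that end at A or T.
from_gc = [(a, b) for a, b in all_changes if a in "GC"]
to_at = [(a, b) for a, b in all_changes if b in "AT"]

expected_from_gc = len(from_gc) / len(all_changes)
expected_to_at = len(to_at) / len(all_changes)
print(expected_from_gc, expected_to_at)
```

The observed values of 59.2% and 56.4% both exceed this baseline, which is the basis for the AT-enrichment statement.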
9. (CG dinucleotides excluded)
13. Pattern of Substitution in mtDNA
Based on 95 sequences from human and chimpanzee.
14. The sum of the relative frequencies of
transitions is 94%. If all mutations occurred with
equal frequency, the expectation would be 33%.
15. Mutations: Strand (Leading and Lagging) Effects
16. Possible inequalities between strands
A change from G to A actually means that a GC pair is
replaced by an AT pair. This can occur as a
result of either a G mutating to A in one
strand or a C mutating to T in the complementary
strand. Similarly, a change from C to T can
occur as a result of either a C mutating to T in
one strand or a G mutating to A in the other.
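The equivalence described above can be sketched as a lookup on the complementary strand. The function names below are introduced for illustration only.

```python
# Watson-Crick complements.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_substitution(original, mutant):
    """Return the change seen on the opposite strand for a given change."""
    return COMPLEMENT[original], COMPLEMENT[mutant]

def base_pair_change(original, mutant):
    """Describe the change as a replacement of one base pair by another."""
    before = original + COMPLEMENT[original]
    after = mutant + COMPLEMENT[mutant]
    return before, after

print(complementary_substitution("G", "A"))  # the same event read on the other strand
print(base_pair_change("G", "A"))            # the GC pair is replaced by an AT pair
```

Because each observed change has two possible origins, counts of G to A and C to T on a single strand can only be compared, not attributed to one strand directly.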
17. Detection of Strand Inequalities in Mutation Rates
- If G → A on leading strand, then C → T on lagging strand
- If G → A on lagging strand, then C → T on leading strand
- If G → A on leading = G → A on lagging, then G → A = C → T
18. If there are no differences in the mutation
pattern between the two strands, then the rates of
complementary changes should be equal
(for example, G → A = C → T).
19. Is G → A = C → T?
The transitional rate between pyrimidines (C, T)
is much higher than that between purines (G, A),
suggesting different patterns and rates of
mutation between the two strands.
20. Pattern of amino-acid replacement
21. Physicochemical distance measures for
quantifying the dissimilarity between two amino
acids.
24. Grantham's physicochemical distances between
pairs of amino acids
25. The most similar amino acid pairs are leucine and
isoleucine (Grantham's distance = 5) and leucine
and methionine (Grantham's distance = 15).
26. The most dissimilar amino acid pairs have
Grantham's distances of 215, 205, and 202.
27. A replacement of an amino acid by a similar one
(e.g., leucine to isoleucine) is called a
conservative replacement. A replacement of an
amino acid by a dissimilar one (e.g., glycine to
tryptophan) is called a radical replacement.
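The conservative/radical distinction can be sketched as a threshold on Grantham's distance. The slides give no cutoff, so the value of 100 below is a hypothetical choice for illustration, and the lookup table holds only the two distances quoted in the slides.

```python
# Grantham's distances quoted in the slides; a full table would cover all pairs.
GRANTHAM = {
    frozenset(("Leu", "Ile")): 5,
    frozenset(("Leu", "Met")): 15,
}

# Hypothetical cutoff, not taken from the slides.
RADICAL_THRESHOLD = 100

def grantham_distance(first, second):
    """Look up the distance for a pair, regardless of order."""
    return GRANTHAM[frozenset((first, second))]

def classify_replacement(distance, threshold=RADICAL_THRESHOLD):
    """Label a replacement by comparing its distance with the cutoff."""
    return "conservative" if distance < threshold else "radical"

print(classify_replacement(grantham_distance("Ile", "Leu")))
print(classify_replacement(215))  # largest distance listed in the slides
```

Any real analysis would need the complete distance table and a justified cutoff; the point here is only that the classification reduces to a comparison of distances.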
28. Empirical findings
During evolution, amino acids are mostly replaced
by similar ones.
29. [Figure: similar amino acids vs. dissimilar amino acids]
30. [Figure: similar vs. dissimilar]
31. Kimura 1985
32. Exchanges between similar structures occur
frequently. Exchanges between dissimilar
structures occur rarely.
"Nothing happens, but if it does, it doesn't matter."
33. Amino-acid exchangeability
Argyle's exchangeability ring. Numbers in
parentheses denote the codon family for amino acids
encoded by two codon families. 60-90% of the
amino-acid replacements involve the nearest or
second-nearest neighbors in the ring.
34. What protein properties are conserved in
evolution?
Protein-specific constraints: The
evolution of each protein-coding gene is
constrained by the specific functional
requirements of the protein it produces.
General constraints: Are there general
properties that are constrained during evolution
in all proteins?
35. [Figure: bulkiness (volume); scale: degree of conservation, low to high]
36. [Figure: hydrophobicity; scale: degree of conservation, low to high]
37. [Figure: polarity; scale: degree of conservation, low to high]
38. [Figure: optical rotation; scale: degree of conservation, low to high]
39. [Figure: "surprise!"; charge and optical rotation; scale: degree of conservation, low to high]
48. Amino-acid composition may be an important factor
in determining rates of nucleotide substitution.
49. Most conserved amino acids
- Glycine is irreplaceable because of its small size.
- Lysine is irreplaceable because of its involvement in
amidine bonds that crosslink polypeptide chains.
- Cysteine is irreplaceable because of its
involvement in cystine bonds that crosslink
polypeptide chains.
- Proline is irreplaceable because of its
contribution to the contortion of proteins.
50. Does the frequency of amino acids in proteins
reflect functional need or availability?
51. The frequencies of nucleotides in vertebrate mRNA
are 22.0% uracil, 30.3% adenine, 21.7% cytosine,
and 26.1% guanine.
52. The expected frequency of a particular codon can
be calculated by multiplying the frequencies of
each of the nucleotides comprising the codon.
53. The expected frequency of an amino acid can be
calculated by adding the frequencies of each
codon that codes for that amino acid.
54. For example, the codons for tyrosine are UAU and
UAC, so the random expectation for its frequency
is
1.057 × [(0.220)(0.303)(0.220) + (0.220)(0.303)(0.217)] = 0.0309
Since 3 of the 64 codons are stop codons, the
frequency for each amino acid is multiplied by a
correction factor of 1.057.
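The calculation described above can be sketched in a few lines. The nucleotide frequencies and the 1.057 factor are taken from the slides; the function names are introduced here for illustration. The slides do not show how 1.057 was derived, so the last part of the sketch rests on an assumption: that the correction rescales the frequencies after removing the expected share of the three stop codons (UAA, UAG, UGA).

```python
# Nucleotide frequencies in vertebrate mRNA, as quoted in the slides.
NUCLEOTIDE_FREQ = {"U": 0.220, "A": 0.303, "C": 0.217, "G": 0.261}

# Stop-codon correction factor, as quoted in the slides.
STOP_CORRECTION = 1.057

def codon_frequency(codon, freq=NUCLEOTIDE_FREQ):
    """Expected codon frequency: product of its nucleotide frequencies."""
    product = 1.0
    for base in codon:
        product *= freq[base]
    return product

def expected_amino_acid_frequency(codons, correction=STOP_CORRECTION):
    """Sum the expected frequencies of all codons for one amino acid."""
    return correction * sum(codon_frequency(c) for c in codons)

tyrosine = expected_amino_acid_frequency(["UAU", "UAC"])
print(round(tyrosine, 4))

# Assumption: the correction is 1 / (1 - expected stop-codon share).
stop_share = sum(codon_frequency(c) for c in ["UAA", "UAG", "UGA"])
implied_correction = 1 / (1 - stop_share)
print(round(implied_correction, 3))
```

With the rounded frequencies above, the tyrosine value comes out near 0.0308, slightly below the 0.0309 on the slide, and the implied correction comes out near 1.058. Both small differences are consistent with rounding of the input frequencies.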
55. By plotting the expected frequency against the
observed frequency, we can see if some amino
acids occur more or less often than
expected by chance. If the observed and expected
frequencies are close to equal, we would expect a
regression line with a slope of 1.
56. Excluding arginine, the correlation between
observed and expected frequencies was highly
significant (r = 0.9). Arginine frequency seems
to be affected by selection acting on one or more
of its codons.
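The comparison of observed and expected frequencies comes down to a least-squares slope and a correlation coefficient. The sketch below computes both in plain Python. The slides do not list the observed frequencies, so the input values are made-up toy numbers used only to show the mechanics.

```python
from math import sqrt

def slope_and_correlation(expected, observed):
    """Least-squares slope of observed on expected, and Pearson's r."""
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in observed)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(expected, observed))
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Toy values (not real data): observed equals expected exactly.
toy_expected = [0.02, 0.04, 0.06, 0.08]
toy_observed = [0.02, 0.04, 0.06, 0.08]
slope, r = slope_and_correlation(toy_expected, toy_observed)
print(slope, r)
```

When observed and expected values agree, the slope and r are both 1. An amino acid that lies far from the line, as arginine does in the slides, would appear as an outlier and lower r if included.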
57. Conclusions
Rather than being determined by functional
requirements, the frequency of an amino acid
seems to be determined by nucleotide
composition and the number of codons for the
amino acid.