Molecular basis of evolution. - PowerPoint PPT Presentation

Provided by: Pan92
Transcript and Presenter's Notes


1
Molecular basis of evolution.
  • Goal: to reconstruct the evolutionary history of
    all organisms in the form of phylogenetic trees.
  • Classical approach: phylogenetic trees were
    constructed based on comparative morphology
    and physiology.
  • Molecular phylogenetics: phylogenetic trees are
    constructed by comparing DNA/protein sequences
    between organisms.

2
Evolution of mankind.
  • Analysis of mitochondrial DNA suggests that Homo
    sapiens evolved from one group of Homo erectus in
    Africa (African Eve) 100,000-200,000 years ago.

[Figure: human populations with dates, in years]
Africans 100,000
Asians 55,000-75,000
Europeans 40,000-50,000
American Indians I 25,000-35,000
American Indians II 7,000-9,000
Adam appeared 250,000 years ago, much earlier!
3
Mechanisms of evolution.
  • By mutations of genes. Mutations spread through
    the population via genetic drift and/or natural
    selection.
  • By gene duplication and recombination.

4
Mutational changes of DNA sequences.
  • 1. Substitution.
    Thr Tyr Leu Leu
    ACC TAT TTG CTG
    ACC TCT TTG CTG
    Thr Ser Leu Leu
  • 2. Deletion.
    Thr Tyr Leu Leu
    ACC TAT TTG CTG
    ACC TAT TGC TG-
    Thr Tyr Cys
  • 3. Insertion.
    Thr Tyr Leu Leu
    ACC TAT TTG CTG
    ACC TAC TTT GCT G
    Thr Tyr Phe Ala
  • 4. Inversion.
    Thr Tyr Leu Leu
    ACC TAT TTG CTG
    ACC TTT ATG CTG
    Thr Phe Met Leu
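
The effect of each mutation type on the protein can be checked by translating the sequences codon by codon. The sketch below is only an illustration: the codon table is deliberately partial (it contains just the codons used on this slide), and the function name `translate` is made up for this example.

```python
# Partial codon table: only the codons that appear on this slide.
CODON_TO_AA = {
    "ACC": "Thr", "TAT": "Tyr", "TAC": "Tyr", "TTG": "Leu", "CTG": "Leu",
    "TCT": "Ser", "TTT": "Phe", "GCT": "Ala", "TGC": "Cys", "ATG": "Met",
}


def translate(dna):
    """Translate DNA codon by codon; an incomplete trailing codon is dropped."""
    dna = dna.replace(" ", "").replace("-", "")
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return " ".join(CODON_TO_AA[codon] for codon in codons)


MUTANTS = {
    "original":     "ACC TAT TTG CTG",
    "substitution": "ACC TCT TTG CTG",
    "deletion":     "ACC TAT TGC TG-",
    "insertion":    "ACC TAC TTT GCT G",
    "inversion":    "ACC TTT ATG CTG",
}

for name, sequence in MUTANTS.items():
    print(f"{name:13s} {sequence:18s} {translate(sequence)}")
```

Note that the deletion and the insertion shift the reading frame, so every codon downstream of the change is affected, while the substitution and the inversion only alter the codons they touch.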

5
Synonymous and nonsynonymous nucleotide
substitutions.
  • Synonymous substitutions in codons do not change
    the encoded amino acid; non-synonymous
    substitutions do.
  • ds/dn < 1 indicates positive natural selection.
  • ds: number of synonymous substitutions per
    synonymous site; dn: number of non-synonymous
    substitutions per non-synonymous site.
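
As a small illustration of the distinction, the sketch below classifies a single codon change as synonymous or non-synonymous using the standard genetic code. It does not estimate ds or dn (which also requires counting synonymous and non-synonymous sites); the function name `classify` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(codon_before, codon_after):
    """Label a codon change by whether the encoded amino acid changes."""
    if CODON_TABLE[codon_before] == CODON_TABLE[codon_after]:
        return "synonymous"
    return "non-synonymous"


print(classify("TAT", "TAC"))  # Tyr -> Tyr
print(classify("TAT", "TCT"))  # Tyr -> Ser
```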

6
Gene duplication and recombination.
  • New genes/proteins occur through gene duplication
    and recombination.

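[Figure: gene duplication and recombination. Left: an ancestral globin gene is duplicated into two globin copies, giving hemoglobin and myoglobin. Right: Gene 1 and Gene 2 give a new gene through duplication and recombination.]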
7
Measures of evolutionary distance between amino
acid sequences.
  • 1. P-distance. Evolutionary distance is usually
    measured by the number of amino acid
    substitutions.

p = n_d / n
n_d: number of amino acid differences between two
sequences; n: number of aligned amino acids.
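
A minimal sketch of the p-distance for two aligned sequences follows. Skipping positions where either sequence has a gap ("-") is an assumption made for this example; the slide does not say how gaps are treated.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues among aligned, gap-free positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no gap-free aligned positions")
    n_d = sum(1 for x, y in pairs if x != y)
    return n_d / len(pairs)


print(p_distance("APTHAS", "ALTKAS"))  # 2 differences over 6 sites
```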
8
Poisson correction for evolutionary distance.
  • 2. PC-distance. Takes into account multiple
    substitutions and therefore is proportional to
    divergence time.
  • PC-distance can be expressed through the
    p-distance: d = -ln(1 - p)
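
The Poisson correction is a one-line transformation of the p-distance, d = -ln(1 - p). The sketch below applies it to the hemoglobin p-distances quoted later in the deck (0.121, 0.186, 0.486); the function name is made up for this example.

```python
import math


def poisson_distance(p):
    """Poisson-corrected distance d = -ln(1 - p) for a p-distance in [0, 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


for p in (0.121, 0.186, 0.486):
    print(f"p = {p:.3f}  PC-distance = {poisson_distance(p):.3f}")
```

The correction is small for closely related sequences and grows quickly as p increases, which is why the uncorrected p-distance underestimates the divergence of distant pairs.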

9
Another method to estimate evolutionary
distances: amino acid substitution matrices.
  • 3. Distance from amino acid substitution
    matrices. Substitutions occur more often between
    amino acids of similar properties.
  • - Dayhoff (1978) derived the first matrices from
    multiple alignments of close homologs.
  • - The number of aa substitutions is measured
    in units of accepted point mutations (PAM): 1 PAM
    is one aa substitution per 100 sites.
  • - The Dayhoff distance can be approximated by
    the gamma-distance with a = 2.25.
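
The slide does not give the gamma-distance formula. The sketch below assumes the commonly used form d = a[(1 - p)^(-1/a) - 1], where a is the gamma shape parameter; treat both the formula and the function name as assumptions of this example rather than as the presenter's exact method.

```python
def gamma_distance(p, a=2.25):
    """Gamma distance d = a * ((1 - p) ** (-1 / a) - 1); a is the shape parameter."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if a <= 0:
        raise ValueError("shape parameter a must be positive")
    return a * ((1.0 - p) ** (-1.0 / a) - 1.0)


for p in (0.121, 0.186, 0.486):
    print(f"p = {p:.3f}  gamma-distance (a = 2.25) = {gamma_distance(p):.3f}")
```

Smaller values of a correspond to stronger rate variation among sites and give larger corrections for the same p.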

10
Fixation of mutations.
  • Not all mutations spread through the population.
    Fixation: a mutation becomes incorporated into the
    genome of a species.
  • The majority of mutations are neutral (Kimura): they do
    not affect the fitness of the organism.
  • Fixation rate depends on the size of the population
    (N), fitness (s) and mutation rate (µ).

11
Phylogenetic analysis.
  • Phylogenetic trees are derived from multiple
    sequence alignments. Each column describes the
    evolution of one site.
  • Each position/site in proteins/nucleic acids
    is assumed to change in evolution independently
    of the others.
  • Insertions/deletions are usually ignored and
    trees are constructed only from the aligned
    regions.

12
Evolutionary tree constructed from rRNA analysis.
13
The concept of evolutionary trees.
  • Trees consist of nodes and branches; the topology
    is the branching pattern.
  • The length of each branch represents the number
    of substitutions occurring between two nodes. If the
    rate of evolution is constant, branches will have
    the same length (molecular clock hypothesis).
  • The distance along the tree is calculated by
    summing up all intervening branch lengths.
  • Trees can be binary (bifurcating) or multifurcating.
  • Trees can be rooted or unrooted. The root is
    placed by including a taxon which is known to
    branch off earlier than the others.
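
To make "summing up all intervening branch lengths" concrete, here is a minimal sketch on a small rooted tree. The tree, its node names (`X`, `R`) and its branch lengths are invented for this example.

```python
# Hypothetical rooted tree ((A:1, B:3)X:2, C:5)R stored as child -> (parent, length).
TREE = {
    "A": ("X", 1.0),
    "B": ("X", 3.0),
    "X": ("R", 2.0),
    "C": ("R", 5.0),
}


def distances_to_ancestors(node, tree):
    """Map the node and each of its ancestors to the summed branch length up to it."""
    distances = {node: 0.0}
    total = 0.0
    while node in tree:
        parent, length = tree[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def tree_distance(u, v, tree):
    """Distance along the tree: sum of branch lengths on the path from u to v."""
    up_u = distances_to_ancestors(u, tree)
    up_v = distances_to_ancestors(v, tree)
    # With non-negative branch lengths the closest shared ancestor gives the minimum.
    return min(up_u[n] + up_v[n] for n in up_u if n in up_v)


print(tree_distance("A", "B", TREE))  # 1 + 3
print(tree_distance("A", "C", TREE))  # 1 + 2 + 5
```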

14
Accuracies of phylogenetic trees.
  • Two types of errors:
  • - Topological error
  • - Branch length error
  • Bootstrap test:
  • Resample the alignment columns with
    replacement, recalculate the tree, and count how
    many times a given topology occurs: this is the
    bootstrap confidence value. If it is close to 100%,
    the topology/interior branch is reliable.
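
The resampling step of the bootstrap test can be sketched as follows. The tree-building step is left as a caller-supplied function, `build_topology`, which is a hypothetical placeholder: any method that returns a comparable description of the topology would do.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original width."""
    n_columns = len(alignment[0])
    picks = [rng.randrange(n_columns) for _ in range(n_columns)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def bootstrap_support(alignment, build_topology, n_replicates=100, seed=0):
    """Percentage of replicates whose topology equals that of the full alignment."""
    rng = random.Random(seed)
    reference = build_topology(alignment)
    hits = sum(
        build_topology(bootstrap_alignment(alignment, rng)) == reference
        for _ in range(n_replicates)
    )
    return 100.0 * hits / n_replicates


alignment = ["AAGACTG", "AGCCCTG", "AGATTTC", "AGAGTTC"]
replicate = bootstrap_alignment(alignment, random.Random(1))
print(replicate)
```

In practice support is usually reported per interior branch rather than for the whole topology; comparing whole topologies keeps this sketch short.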

15
Estimation of species divergence time.
  • Assumption: rate constancy (molecular clock).
  • Find T1, if T2 is known.

[Figure: tree of species A, B and C with divergence times T1 and T2.]
16
Estimation of evolutionary rates in hemoglobin
alpha-chains.
                 P-distance  PC-distance  Gamma-distance
Human/cow          0.121       0.129        0.134
Human/kangaroo     0.186       0.205        0.216
Human/carp         0.486       0.665        0.789
Estimate the evolutionary rate for the divergence
between human and cow (the time of divergence between
these groups is 90 million years).
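
One way to approach this exercise, assuming the usual molecular-clock relation d = 2rT (both lineages accumulate substitutions since the split), is r = d / (2T). The relation and the function name are assumptions of this sketch; the distances and the 90-million-year divergence time come from the slide.

```python
def substitution_rate(distance, divergence_time_years):
    """Rate per site per year per lineage, assuming d = 2 * r * T."""
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_years)


T_HUMAN_COW = 90e6  # years, from the slide
HUMAN_COW = {"p-distance": 0.121, "PC-distance": 0.129, "gamma-distance": 0.134}

for name, d in HUMAN_COW.items():
    rate = substitution_rate(d, T_HUMAN_COW)
    print(f"{name:15s} r = {rate:.2e} substitutions per site per year")
```

All three distance measures give rates of roughly 0.7e-9 substitutions per site per year, because the human/cow pair is close enough that the corrections change the distance only slightly.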
17
Methods for phylogenetic trees construction.
Set of related sequences
  -> Multiple sequence alignment
  -> Strong sequence similarity?
       Yes -> Maximum parsimony methods
       No  -> Recognizable sequence similarity?
                Yes -> Distance methods
                No  -> Maximum likelihood methods
  -> Analyze reliability of prediction
18
1. Distance methods. Calculating branch lengths
from distances.
      A      B      C
A   -----    20     30
B   -----  -----    44
C   -----  -----  -----

[Figure: three-taxon tree with branches a, b and c leading to A, B and C.]
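
For three sequences the branch lengths follow directly from the pairwise distances, since d(A,B) = a + b, d(A,C) = a + c and d(B,C) = b + c. The sketch below solves these three equations for the matrix on this slide; it assumes a, b and c are the branches leading to A, B and C respectively.

```python
def three_taxon_branches(d_ab, d_ac, d_bc):
    """Solve a + b = d_ab, a + c = d_ac, b + c = d_bc for (a, b, c)."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


a, b, c = three_taxon_branches(d_ab=20, d_ac=30, d_bc=44)
print(a, b, c)  # 3.0 17.0 27.0
```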
19
Neighbor-joining method.
  • NJ is based on the minimum evolution principle (the sum
    of branch lengths should be minimized).
  • Given the distance matrix between all sequences,
    NJ joins sequences in a tree so as to give
    estimates of branch lengths.
  • 1. Start with the star tree and calculate the sum of
    branch lengths.

[Figure: star tree with sequences A-E and branches a-e.]
20
Neighbor-joining method.
  • 2. Combine two sequences in a pair, modify the
    tree.
  • 3. Treat cluster CDE as one sequence X,
    calculate average distances between A and X,
    B and X, calculate a and b.

[Figure: modified tree with sequences A-E and branches a-e.]
  • 4. Treat AB as a single sequence, calculate c, d
    and e.
  • 5. Calculate the sum of branch lengths, S.
  • 6. Repeat the cycle and calculate S for the other
    pairs; choose the pair with the lowest S.
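
The pair-selection step can be sketched as below. The slides do not give a formula for S; this example assumes the expression from Saitou and Nei's original description of neighbor joining for the sum of branch lengths when sequences i and j are paired. The four-sequence distance matrix is invented, built from a tree ((A,B),(C,D)) with branch lengths 1, 2, 3, 4 and an interior branch of 5.

```python
from itertools import combinations


def pair_sum(d, names, i, j):
    """Sum of branch lengths S_ij when i and j are joined (assumed Saitou-Nei form)."""
    n = len(names)
    others = [k for k in names if k not in (i, j)]
    to_pair = sum(d[frozenset((i, k))] + d[frozenset((j, k))] for k in others)
    among_others = sum(d[frozenset(p)] for p in combinations(others, 2))
    return (to_pair / (2 * (n - 2))
            + d[frozenset((i, j))] / 2
            + among_others / (n - 2))


def best_pair(d, names):
    """Pair with the lowest S; ties go to the first pair in enumeration order."""
    return min(combinations(names, 2), key=lambda p: pair_sum(d, names, *p))


NAMES = ["A", "B", "C", "D"]
RAW = {("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
       ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7}
DIST = {frozenset(pair): value for pair, value in RAW.items()}

for pair in combinations(NAMES, 2):
    print(pair, pair_sum(DIST, NAMES, *pair))
print("join first:", best_pair(DIST, NAMES))
```

For this matrix the pairs (A,B) and (C,D) both give S = 15, which equals the total length of the tree the distances were generated from, while every other pair gives a larger S.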
21
Classwork I
  • Given a multiple sequence alignment, construct a distance
    matrix (p-distance) and calculate the branch
    lengths.
  • APTHASTRLKHHDDHH
  • ALTKKSTRIRHIPD-H
  • DLTPSSTIIR-YPDLH

22
Classwork II: NJ tree using MEGA.
  • 1. Go to the CDD webpage and retrieve the alignment of
    cd00157 in FASTA format.
  • 2. Import this alignment into MEGA and convert it to
    MEGA format (http://www.megasoftware.net/mega3/mega.html).
  • 3. Construct NJ tree using different distance
    measures with bootstrap.
  • 4. Analyze obtained trees.

23
2.1 Maximum parsimony: definition of informative
sites.
  • Maximum parsimony tree: the tree that requires the
    smallest number of evolutionary changes to
    explain the differences between external nodes.
  • Informative site: a site which favors some trees
    over the others.
  • Site        1 2 3 4 5 6 7
    Sequence 1  A A G A C T G
    Sequence 2  A G C C C T G
    Sequence 3  A G A T T T C
    Sequence 4  A G A G T T C
  • A site is informative (for nucleotide sequences) if
    there are at least two different kinds of letters
    at the site, each of which is represented in at
    least two of the sequences.
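
The definition translates directly into a column test. The sketch below applies it to the four sequences on this slide; the function names are made up for this example.

```python
from collections import Counter

ALIGNMENT = ["AAGACTG", "AGCCCTG", "AGATTTC", "AGAGTTC"]


def is_informative(column):
    """True if at least two different letters each occur in two or more sequences."""
    counts = Counter(column)
    return sum(1 for count in counts.values() if count >= 2) >= 2


def informative_sites(alignment):
    """1-based positions of the parsimony-informative columns."""
    return [i + 1 for i, column in enumerate(zip(*alignment))
            if is_informative(column)]


print(informative_sites(ALIGNMENT))  # [5, 7]
```

Site 3 (G, C, A, A) fails the test because only A occurs twice, and site 4 (A, C, T, G) fails because every letter occurs once.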

24
2. Maximum parsimony.
[Figure: the three possible trees (Tree 1, Tree 2, Tree 3) for sequences 1-4 at site 3, with states 1: G, 2: C, 3: A, 4: A.]
  • Site 3 is not informative: all trees are realized
    by the same number of substitutions.
  • Advantage: deals with characters; there is no need
    to compute distance matrices.
  • Disadvantages:
  • - multiple substitutions are not considered
  • - branch lengths are difficult to calculate
  • - slow
25
2.3 Maximum parsimony method.
  • 1. Identify all informative sites in the alignment.
  • 2. Calculate the minimum number of
    substitutions at each informative site.
  • 3. Sum number of changes over all informative
    sites for each tree.
  • 4. Choose tree with the smallest number of
    changes.
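
The four steps can be sketched for the four-sequence alignment from the informative-sites slide. The slides do not say how the minimum number of substitutions is obtained; this example uses Fitch's algorithm for that step, which is a choice made here. The three topologies are the three possible groupings of four sequences, written as nested tuples.

```python
ALIGNMENT = {"1": "AAGACTG", "2": "AGCCCTG", "3": "AGATTTC", "4": "AGAGTTC"}
TOPOLOGIES = [
    (("1", "2"), ("3", "4")),
    (("1", "3"), ("2", "4")),
    (("1", "4"), ("2", "3")),
]
# 0-based indices of sites 5 and 7, the informative sites of this alignment.
INFORMATIVE = [4, 6]


def fitch(node, site):
    """Return (possible states, substitutions) for a nested-tuple tree at one site."""
    if isinstance(node, str):
        return {ALIGNMENT[node][site]}, 0
    left_states, left_cost = fitch(node[0], site)
    right_states, right_cost = fitch(node[1], site)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    # No shared state: one substitution is needed at this node.
    return left_states | right_states, left_cost + right_cost + 1


def tree_length(tree, sites):
    """Total number of substitutions a tree requires over the given sites."""
    return sum(fitch(tree, site)[1] for site in sites)


lengths = [tree_length(tree, INFORMATIVE) for tree in TOPOLOGIES]
best = TOPOLOGIES[lengths.index(min(lengths))]
print(lengths, best)
```

The grouping ((1,2),(3,4)) needs 2 substitutions at the informative sites and the other two groupings need 4 each. Including the uninformative sites adds the same amount to every tree, so it does not change the ranking.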

26
Maximum likelihood methods.
  • Similarity with maximum parsimony:
  • - for each column of the alignment all
    possible trees are calculated
  • - trees with the least number of
    substitutions are more likely
  • Advantages of maximum likelihood over maximum
    parsimony:
  • - takes into account different rates of
    substitution between different amino acids and/or
    different sites
  • - applicable to more diverse sequences

27
Classwork: maximum parsimony.
  1. Search the NCBI Conserved Domain Database for
    pfam00127.
  2. Construct maximum parsimony tree using MEGA3.
  3. Analyze this tree and compare it with the
    phylogenetic tree from the research paper.