Provided by: sch17
Transcript and Presenter's Notes

Title: Phylogenetic Analysis


1
Phylogenetic Analysis
  • Greek phylon: race, tribe
  • genetic: relating to birth or origin

2
Phylogenetic Analysis
  • The evolutionary relationship among a set of
    species is called a Phylogeny, represented by a
    phylogenetic tree.
  • Infer the phylogenetic tree (reconstruction) from
    observations of existing organisms
  • Then: morphological characters
  • Now: molecular sequences
  • Zuckerkandl and Pauling, 1962

3
Relationship to MSA
  • Multiple alignment of sequences should take
    their evolutionary relationship into account. (Some
    multiple alignment algorithms use a guide
    tree.)
  • Alignment and tree-building can proceed
    simultaneously

4
Phylogeny of
  • Orthologues: diverged from a common ancestor
    through speciation (found in different species)
  • Paralogues: diverged through gene duplication
    (found within the same species/organism)

[Figure: speciation]
5
Elements of a Tree
  • Leaves/Nodes: sequences
  • Taxa (singular: taxon): the outer leaves
  • Edges: edge lengths correspond to evolutionary
    time periods
  • Root

6
Molecular Clock Theory
7
Molecular Clock Theory I (Zuckerkandl and Pauling,
early 1960s)
  • For any given protein, accepted mutations in the
    amino acid sequence of the protein occur at a
    constant rate
  • Implications:
  • The number of accepted mutations is proportional
    to the length of the time interval
  • All proteins/species observed today have the same
    molecular age
  • Works well for closely related species

8
Molecular Clock Theory II
  • The rate of accepted mutations may be different for
    different proteins (depending on their tolerance
    for mutations)
  • Different parts of a protein may evolve at
    different rates

[Figure: Counting mutations]
9
Distance-based Methods
  • We assume that the distance between each pair
    of sequences is proportional to the evolutionary
    time between them.

10
How to Collect Distance Data
  • Lab methods: mix single strands of DNA from
    different species and measure how tightly they
    associate.
  • Sequence analysis methods: estimate the number of
    mutations from sequence comparisons.
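
As a rough illustration of the sequence-analysis route, the sketch below fills a distance matrix with the proportion of differing sites (the p-distance) between already aligned sequences. The slides do not say which estimator is used, so the p-distance, the function names `p_distance` and `distance_matrix`, and the toy sequences are all assumptions made for this example.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(seqs):
    """Build a symmetric matrix (dict of dicts) of pairwise p-distances."""
    names = list(seqs)
    d = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d[a][b] = d[b][a] = p_distance(seqs[a], seqs[b])
    return d


# Hypothetical aligned sequences, invented for this example.
example = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA"}
matrix = distance_matrix(example)
print(matrix["A"]["B"], matrix["A"]["C"], matrix["B"]["C"])
```

The p-distance underestimates the true number of mutations for distant sequences, because repeated changes at one site are counted once; a corrected estimator would normally replace it in practice.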

11
Fill Out A Distance Matrix
12
Ultrametric Distance Matrices
  • D is an ultrametric distance matrix if and only
    if, for every three indices i, j and k, there is a tie
    for the maximum of D(i,j), D(i,k) and D(j,k).
    That is, the maximum is not unique.
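
The three-point condition above translates directly into a check over all triples of indices. The following is a minimal sketch, assuming the matrix is given as a symmetric list of lists; the function name `is_ultrametric`, the tolerance `tol`, and both example matrices are made up for illustration.

```python
from itertools import combinations


def is_ultrametric(d, tol=1e-9):
    """Return True if, for every triple, the two largest distances tie."""
    n = len(d)
    for i, j, k in combinations(range(n), 3):
        smallest, middle, largest = sorted((d[i][j], d[i][k], d[j][k]))
        # The maximum must not be unique: largest and middle must be equal.
        if largest - middle > tol:
            return False
    return True


# Hypothetical matrices: the second differs from the first in one entry pair.
clock_like = [
    [0, 2, 6, 6],
    [2, 0, 6, 6],
    [6, 6, 0, 4],
    [6, 6, 4, 0],
]
not_clock_like = [
    [0, 2, 6, 6],
    [2, 0, 7, 6],
    [6, 7, 0, 4],
    [6, 6, 4, 0],
]
print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```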

13
Test whether the data are ultrametric
If so, Molecular Clock Theory I is valid for this group of
sequences
14

[Figure: ultrametric and non-ultrametric examples; caption: Constant mutation rate]
22
When Molecular Clock Theory I fails
  • The distance matrix is no longer ultrametric

23
When distance is additive
Inferring an inner node
[Figure: tree with nodes i, j, k and m]
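
The formula on this slide did not survive in the transcript. A common way to infer an inner node under additivity, assumed here, is the following: if leaves i and j meet at inner node k and m is any other leaf, the path lengths satisfy d(i,m) = d(i,k) + d(k,m), d(j,m) = d(j,k) + d(k,m) and d(i,j) = d(i,k) + d(j,k), which can be solved for the three unknown distances. The function name and the toy edge lengths below are hypothetical.

```python
def inner_node_distances(d_ij, d_im, d_jm):
    """Distances from inner node k to i, j and m, assuming additive distances.

    k is the node where the paths between leaves i, j and m meet.
    """
    d_km = (d_im + d_jm - d_ij) / 2
    d_ik = (d_ij + d_im - d_jm) / 2
    d_jk = (d_ij + d_jm - d_im) / 2
    return d_ik, d_jk, d_km


# Toy tree with edge lengths i-k = 2, j-k = 3, k-m = 4, so that
# d(i,j) = 5, d(i,m) = 6 and d(j,m) = 7.
print(inner_node_distances(5, 6, 7))
```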
24
Neighbor Joining
  • Can we use this fact to construct trees?
  • Infer inner nodes
  • Gradually strip off leaves (outer nodes)

25
Finding Neighboring Leaves
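  • Let $D(i,j) = d(i,j) - (r_i + r_j)$
  • where $d(i,j)$ is the distance between leaves i and j, and $r_i = \frac{1}{|L|-2} \sum_{k \in L} d(i,k)$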
  • Theorem if D(i,j) is minimal (among all pairs of
    leaves), then i and j are neighbors in the tree
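
To see why a corrected quantity is minimised rather than the raw distance, the sketch below compares both choices on a small additive tree. It assumes the standard neighbor-joining correction D(i,j) = d(i,j) - (r_i + r_j), with r_i the sum of distances from i to the other leaves divided by (number of leaves - 2); the transcript lost the slide's own formula, so this definition, the function names and the example tree are assumptions.

```python
from itertools import combinations

# Toy additive tree ((A,B),(C,D)): leaf edges A=1, B=4, C=1, D=4, inner edge=1.
d = {
    "A": {"B": 5, "C": 3, "D": 6},
    "B": {"A": 5, "C": 6, "D": 9},
    "C": {"A": 3, "B": 6, "D": 5},
    "D": {"A": 6, "B": 9, "C": 5},
}
leaves = list(d)


def closest_pair(d, leaves):
    """Pair with the smallest raw distance (not guaranteed to be neighbors)."""
    return min(combinations(leaves, 2), key=lambda p: d[p[0]][p[1]])


def nj_pair(d, leaves):
    """Pair minimising the corrected distance D(i,j) = d(i,j) - (r_i + r_j)."""
    n = len(leaves)
    r = {a: sum(d[a][b] for b in leaves if b != a) / (n - 2) for a in leaves}
    return min(combinations(leaves, 2),
               key=lambda p: d[p[0]][p[1]] - r[p[0]] - r[p[1]])


print(closest_pair(d, leaves))  # A and C are closest, but not neighbors
print(nj_pair(d, leaves))       # A and B are true neighbors in the toy tree
```

In this example (A, B) and (C, D) tie for the minimum corrected distance; `min` returns the first pair it encounters, and both pairs are genuine neighbors.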

[Figure: tree with nodes g, h, i and j]
26
Neighbor Joining
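  • Set L to contain all leaves
  • Iteration:
    • Choose i, j such that D(i,j) is minimal
      (neighbors)
    • Create a new node k, and set
      $d(k,m) = \frac{1}{2}\,(d(i,m) + d(j,m) - d(i,j))$ for every other m in L,
      where d denotes the original (uncorrected) distances
    • Remove i and j from L, and add k
  • Termination: when |L| = 2, connect the two remaining
    nodes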
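
The steps above can be sketched as a short routine. This is an illustrative version only, not necessarily the presenter's formulation: it assumes the standard corrected distance D(i,j) = d(i,j) - (r_i + r_j), the update d(k,m) = (d(i,m) + d(j,m) - d(i,j)) / 2, and the usual branch-length rule d(i,k) = (d(i,j) + r_i - r_j) / 2, none of which are legible in the transcript. The names `neighbor_joining`, `inner0`, `inner1` and the input matrix are hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return tree edges as (node_a, node_b, length) tuples.

    names: leaf labels; dist: dict of dicts with dist[a][b] for all a != b.
    """
    d = {a: dict(dist[a]) for a in names}  # copy so the input is untouched
    active = list(names)                   # the set L from the slide
    edges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        r = {a: sum(d[a][b] for b in active if b != a) / (n - 2)
             for a in active}
        # Choose the pair minimising D(i,j) = d(i,j) - (r_i + r_j).
        i, j = min(combinations(active, 2),
                   key=lambda p: d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        k = f"inner{next_id}"
        next_id += 1
        # Branch lengths from the new inner node to the two joined nodes.
        d_ik = (d[i][j] + r[i] - r[j]) / 2
        edges.append((i, k, d_ik))
        edges.append((j, k, d[i][j] - d_ik))
        # Distances from the new node to every remaining node.
        d[k] = {}
        for m in active:
            if m not in (i, j):
                d[k][m] = d[m][k] = (d[i][m] + d[j][m] - d[i][j]) / 2
        active = [m for m in active if m not in (i, j)] + [k]
    a, b = active
    edges.append((a, b, d[a][b]))  # connect the two remaining nodes
    return edges


# Toy additive tree ((A,B),(C,D)): leaf edges A=1, B=4, C=1, D=4, inner edge=1.
dist = {
    "A": {"B": 5, "C": 3, "D": 6},
    "B": {"A": 5, "C": 6, "D": 9},
    "C": {"A": 3, "B": 6, "D": 5},
    "D": {"A": 6, "B": 9, "C": 5},
}
for edge in neighbor_joining(["A", "B", "C", "D"], dist):
    print(edge)
```

The result is an unrooted tree given as an edge list; recomputing all corrected distances in every round keeps the sketch short at the cost of cubic running time.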

29
NJ will construct the correct tree (if additive)
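
Whether a given matrix is additive can be tested with the four-point condition: for every four leaves, the two largest of the three pairwise sums d(i,j) + d(k,l), d(i,k) + d(j,l) and d(i,l) + d(j,k) must be equal. The slides do not spell this test out, so the sketch below, with its hypothetical function name and example matrices, is offered as one possible check.

```python
from itertools import combinations


def is_additive(d, tol=1e-9):
    """Four-point condition on a symmetric distance matrix (list of lists)."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted((d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]))
        # The two largest sums must tie.
        if sums[2] - sums[1] > tol:
            return False
    return True


# Distances from a toy tree ((A,B),(C,D)); additive but not ultrametric.
tree_like = [
    [0, 5, 3, 6],
    [5, 0, 6, 9],
    [3, 6, 0, 5],
    [6, 9, 5, 0],
]
# Same matrix with d(A,D) changed from 6 to 8, which breaks additivity.
perturbed = [
    [0, 5, 3, 8],
    [5, 0, 6, 9],
    [3, 6, 0, 5],
    [8, 9, 5, 0],
]
print(is_additive(tree_like), is_additive(perturbed))
```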