The Genome Access Course Phylogenetic Analysis

Provided by: jamesg61
1
The Genome Access Course Phylogenetic Analysis
2
The Great Chain of Being
From Didacus Valades, Rhetorica Christiana (1579)
3
Phylogenetics
  • Developed by Willi Hennig (Grundzüge einer
    Theorie der Phylogenetischen Systematik, 1950;
    Phylogenetic Systematics, 1966)

4
What is the ancestral sequence?
  • pfeffer
  • pepper
  • (pf/p)e(ff/pp)er

5
What is the ancestral sequence?
  • five
  • quinque
  • pento
  • fünf
  • panj
  • viisi
  • pompe
  • queig
  • pindzé

penkwe
6
What is the ancestral sequence?
  • HGLCSAIP
  • HGICSGIP
  • HG(L/I)CS(A/G)IP

7
Evolutionary Trees
  • A tree is a connected, acyclic 2D graph
  • Leaf = Taxon
  • Node = Vertex
  • Branch = Edge
  • Tree length = sum of all branch lengths
  • Phylogenetic trees are binary trees

8
A Generic Tree
9
Evolutionary Trees
  • Rooted
      • common ancestor
      • unique path to any leaf
      • directed
  • Unrooted
      • root could be placed anywhere
      • fewer possible trees than rooted, for the same number of taxa

10
Rooted Tree
generated by DRAWGRAM (PHYLIP)
11
Unrooted Tree
generated by DRAWTREE (PHYLIP)
12
Possible Evolutionary Trees
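
The table from this slide did not survive in the transcript. As a rough illustration of how fast the number of candidate trees grows, the sketch below uses the standard counting formulas for strictly binary trees with n labeled leaves: (2n-5)!! unrooted and (2n-3)!! rooted topologies. The function names are made up for this example.

```python
def double_factorial(k: int) -> int:
    """Return k * (k-2) * (k-4) * ... down to 1 (1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted binary topologies for n_taxa labeled leaves."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Number of rooted binary topologies for n_taxa labeled leaves."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted binary tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (3, 4, 5, 10):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

For any given n the rooted count is larger, because each unrooted tree can be rooted on any of its 2n-3 branches.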
13
Paralogs and Orthologs
[Figure: tree with leaves labeled 1A, 2A, 3A, 1B, 2B, 3B]
14
Genes vs. Species
  • Sequences show gene relationships, but
    phylogenetic histories may be different for genes
    and species
  • Genes evolve at different speeds
  • Horizontal gene transfer

17
Methods for Phylogenetic Analysis
  • Character-State
      • Maximum Parsimony
      • Maximum Likelihood
  • Genetic Distance
      • Fitch-Margoliash
      • Neighbor-Joining
      • Unweighted Pair Group (UPGMA)

18
Phylogenetic Software
  • PHYLIP
  • PAUP (Available in GCG)
  • TREE-PUZZLE
  • PhyloBLAST
  • Felsenstein maintains an extensive list of
    programs on the PHYLIP site

19
PHYLIP Programs
  • dnapars/protpars
  • dnadist/protdist
  • dnaml (use fastDNAml instead)
  • neighbor
  • fitch/kitsch
  • drawtree/drawgram

20
Maximum Parsimony
  • Most common method
  • Allows use of all evolutionary information
  • Build and score all possible trees
  • Each node is a transformation in a character
    state
  • Minimize tree length
  • Best tree requires the fewest changes to derive
    all sequences
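
Scoring one candidate tree can be sketched with Fitch's small-parsimony algorithm, which counts the minimum number of state changes a fixed tree needs at each site. This is an illustration only, not the dnapars implementation. It assumes rooted binary trees written as nested tuples of taxon names, an ungapped alignment, and equal cost for every change. The toy alignment is invented.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree   -- a taxon name (leaf) or a 2-tuple of subtrees
    states -- dict mapping taxon name to its character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: one change is required on this node's branches.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the per-site minimum change counts over the whole alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


if __name__ == "__main__":
    toy = {"A": "AAG", "B": "AAG", "C": "CGG", "D": "CGA"}  # invented data
    for candidate in ((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))):
        print(candidate, parsimony_length(candidate, toy))
```

On the toy data the first topology needs 3 changes and the second needs 5, so the first is the more parsimonious of the two.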

21
Which is the more parsimonious tree?
9 Node Crossings
8 Node Crossings
22
Maximum Likelihood
  • Reconstruction using an explicit evolutionary
    model
  • Likelihood is calculated separately for each
    nucleotide site. The product of the likelihoods
    for each site provides the overall likelihood of
    the observed data.
  • Demanding computationally
  • Slowest method
  • Use to test (or improve) an existing tree
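
The per-site product can be illustrated on the smallest possible case: two aligned sequences separated by a single branch. The sketch assumes the Jukes-Cantor (JC69) model, which is a choice made for this example and is not named on the slide, and it assumes ungapped sequences. Logs are summed instead of multiplying raw probabilities to avoid underflow. A real program such as dnaml evaluates whole trees, which this does not attempt.

```python
import math


def jc69_log_likelihood(seq1, seq2, d):
    """Log-likelihood of two aligned sequences at JC69 distance d.

    d is the expected number of substitutions per site and must be > 0.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    if d <= 0:
        raise ValueError("d must be positive")
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay  # P(base unchanged after distance d)
    p_diff = 0.25 - 0.25 * decay  # P(one specific different base)
    log_l = 0.0
    for x, y in zip(seq1, seq2):
        # Site likelihood = equilibrium frequency 1/4 * transition probability.
        site_l = 0.25 * (p_same if x == y else p_diff)
        log_l += math.log(site_l)  # sum of logs == log of the product
    return log_l


def best_distance(seq1, seq2, candidates):
    """Pick the candidate distance with the highest log-likelihood."""
    return max(candidates, key=lambda d: jc69_log_likelihood(seq1, seq2, d))


if __name__ == "__main__":
    grid = [i / 100 for i in range(1, 201)]
    print(best_distance("ACGTACGTAC", "ACGTACGTCA", grid))
```

The grid search stands in for the numerical optimization a real program would use. Repeating this kind of evaluation over every branch of every candidate tree is what makes the method slow.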

23
Clustering Algorithms
  • Use distances to calculate phylogenetic trees
  • Trees are based on the relative numbers of
    similarities and differences between sequences
  • A distance matrix is constructed by computing
    pairwise distances for all sequences
  • Clustering links successively more distant taxa
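
A minimal sketch of the distance-matrix step, assuming an ungapped alignment held in a dict and using the proportion of differing sites as the distance. The taxon names and sequences are invented.

```python
def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(alignment):
    """Return (names, matrix) with matrix[i][j] = distance(names[i], names[j])."""
    names = list(alignment)
    matrix = [[p_distance(alignment[i], alignment[j]) for j in names]
              for i in names]
    return names, matrix


if __name__ == "__main__":
    toy = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}  # invented data
    names, matrix = distance_matrix(toy)
    for name, row in zip(names, matrix):
        print(name, row)
```

The matrix is symmetric with zeros on the diagonal. A clustering method then works from this matrix alone and never looks at the sequences again.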

24
DNA Distances
  • Distances between pairs of DNA sequences are
    relatively simple to compute as the sum of all
    base pair differences between the two sequences
  • Can only work for pairs of sequences that are
    similar enough to be aligned
  • All base changes are considered equal
  • Insertion/deletions are generally given a larger
    weight than replacements (gap penalties).
  • Possible to correct for multiple substitutions at
    a single site, which is common in distant
    relationships and for rapidly evolving sites.
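
One widely used correction for multiple substitutions is the Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The slide does not say which correction it has in mind, so treat the choice as an assumption. The sketch also assumes ungapped sequences and, like the slide, treats all base changes as equal.

```python
import math


def jukes_cantor_distance(seq1, seq2):
    """Estimated substitutions per site under the Jukes-Cantor model."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(seq1, seq2))
    p = differences / len(seq1)
    if p >= 0.75:
        # The formula is undefined here: the sequences look saturated.
        raise ValueError("sequences too divergent for the JC correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    print(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTCA"))
```

The corrected distance is always at least as large as p, and the gap widens as sequences diverge, which is the situation the last bullet describes.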

25
Amino Acid Distances
  • More difficult to compute
  • Substitutions have differing effects on structure
  • Some substitutions require more than one DNA
    mutation
  • Use replacement frequencies (PAM, BLOSUM)

26
Fitch-Margoliash
  • 3 sequences are combined at a time to define
    branches and calculate their length
  • Additive branch lengths
  • Accurate for short branches
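
The three-sequence step can be sketched as follows. If taxa A, B and C join at one internal node and distances are additive, the three pairwise distances determine the three branch lengths exactly. This covers only that step. The fitch program fits branch lengths by least squares over whole trees, which is not reproduced here. The numbers in the example are invented.

```python
def three_point_branch_lengths(d_ab, d_ac, d_bc):
    """Branch lengths (a, b, c) from the internal node to taxa A, B, C.

    Solves a + b = d_ab, a + c = d_ac, b + c = d_bc.
    """
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


if __name__ == "__main__":
    print(three_point_branch_lengths(22, 39, 41))
```

With more than three taxa, C is usually replaced by a composite of all remaining sequences, and the step is repeated as clusters are merged.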

27
Neighbor Joining
  • Most common method of tree construction
  • Distance matrix adjusted for each taxon depending
    on its rate of evolution
  • Performs well in simulation studies
  • Most efficient computationally
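
The rate adjustment can be sketched with the usual neighbor-joining criterion Q(i,j) = (n-2) d(i,j) - r_i - r_j, where r_i is the sum of distances from taxon i to all others. The pair with the smallest Q is joined first. The code below performs only this first selection and the two branch lengths to the new node. It is not the neighbor program, and the 5-taxon matrix is a made-up textbook-style example.

```python
def nj_pick_pair(dist):
    """Choose the first pair to join and the branch lengths to their new node.

    dist -- symmetric matrix (list of lists) for at least 3 taxa
    Returns (i, j, length_i, length_j).
    """
    n = len(dist)
    if n < 3:
        raise ValueError("neighbor joining needs at least 3 taxa")
    totals = [sum(row) for row in dist]  # r_i: total distance from taxon i
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # A taxon with a larger total distance gets the longer branch.
    length_i = dist[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    length_j = dist[i][j] - length_i
    return i, j, length_i, length_j


if __name__ == "__main__":
    example = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    print(nj_pick_pair(example))
```

In this matrix taxa 3 and 4 have the smallest raw distance (3), yet taxa 0 and 1 are joined first because the criterion accounts for how far each taxon is from everything else.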

28
UPGMA (Unweighted Pair Group Method Using Arithmetic Averages)
  • Simplest method
  • Calculates branch lengths between most closely
    related sequences
  • Averages distance to next sequence or cluster
  • Predicts a position for the root
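
A compact sketch of UPGMA under simple assumptions: the input is a symmetric distance matrix, trees are returned as nested tuples, and ties are broken by whichever pair is encountered first. The example matrix is invented.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA; return (nested-tuple tree, root height)."""
    # cluster id -> (subtree, number of leaves)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # pairwise distances keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    height = 0.0
    while len(clusters) > 1:
        (i, j), d_ij = min(d.items(), key=lambda item: item[1])
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Size-weighted average = mean over all leaf pairs between clusters.
        merged = {}
        for k in clusters:
            d_ik = d[min(i, k), max(i, k)]
            d_jk = d[min(j, k), max(j, k)]
            merged[k] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        d = {pair: v for pair, v in d.items()
             if i not in pair and j not in pair}
        for k, value in merged.items():
            d[(k, next_id)] = value  # next_id is larger than every old id
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        height = d_ij / 2  # the new node sits halfway between its clusters
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, height


if __name__ == "__main__":
    matrix = [
        [0, 2, 6, 10],
        [2, 0, 6, 10],
        [6, 6, 0, 10],
        [10, 10, 10, 0],
    ]
    print(upgma(["A", "B", "C", "D"], matrix))
```

The last merge defines the root, which is how the method predicts a root position. That placement is only meaningful if all lineages evolve at roughly the same rate.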

29
Phylogenetic Complications
  • Errors
  • Loss of function
  • Convergent evolution
  • Lateral gene transfer

30
Validation
  • Use several different algorithms and data sets
  • NJ methods generate one tree, possibly supporting
    a tree built by parsimony or maximum likelihood
  • Bootstrapping
      • Perturb data and note effect on tree
      • Repeat many times
      • Unchanged in > 90% of replicates: tree's
        correctness is supported
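
The bootstrap loop can be sketched as below. Alignment columns are resampled with replacement, the tree is rebuilt from each pseudo-replicate, and the fraction of replicates that reproduce the original tree is reported. `build_tree` is a hypothetical caller-supplied function (any method that returns a comparable tree object). Comparing whole trees follows the slide, whereas most programs report support for each branch separately.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


def bootstrap_support(alignment, build_tree, n_replicates=100, seed=0):
    """Fraction of replicates whose tree equals the tree from the real data.

    build_tree -- hypothetical user-supplied callable: alignment -> tree,
                  where trees can be compared with ==
    """
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    reference = build_tree(alignment)
    unchanged = 0
    for _ in range(n_replicates):
        replicate = bootstrap_alignment(alignment, rng)
        if build_tree(replicate) == reference:
            unchanged += 1
    return unchanged / n_replicates
```

A returned value above 0.9 corresponds to the threshold on the slide.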

31
Are there bugs in our genome?
N-acetylneuraminate lyase
32
The End