1
MOLECULAR PHYLOGENETIC ANALYSIS
Sirawut Klinbunga
National Center for Genetic Engineering and
Biotechnology, National Science and Technology
Development Agency (NSTDA) and
Center of Excellence for Marine Biotechnology,
Faculty of Science, Chulalongkorn University
2
UPGMA dendrogram of bee mites having different
host species (RAPD)
Phylogeny of primate lentiviruses (DNA
sequences)
Unrooted NJ tree of Apis cerana in Thailand
(RFLP)
Bootstrapped NJ tree of Apis cerana in Thailand
(microsatellites)
3
Molecular tools
Statistical interpretation of variation and
genetic distance
Phylogenetic reconstruction
4
Fig. 4 A phylogeny and the three basic kinds of tree used
to depict that phylogeny
5
Fig. 5 Horizontal and vertical axes of a tree
6
Fig. 1 Definitions of technical terms
7
Fig. 2 Trees are like mobiles
8
Fig. 3 Trees showing various degrees of resolution
9
Phylogenetic trees can be classified into rooted
and unrooted trees.
Rooted tree
Has a particular node, the root (R), that gives
direction to the evolutionary path. The root is the common
ancestor of all investigated taxa.
Unrooted tree
Is a network showing the relationships between OTUs;
it does not define the location of the common
ancestor.
10
[Figure: the rooted phylogenetic tree (root R, taxa A to E, with a time axis) and the unrooted phylogenetic tree (taxa A to E)]
11
Fig. 7 Rooted trees derived from an unrooted tree
12
Table 1 The possible numbers of rooted and
unrooted trees for up to 10 OTUs.
Number of OTUs   Number of rooted trees   Number of unrooted trees
 2                            1                          1
 3                            3                          1
 4                           15                          3
 5                          105                         15
 6                          945                        105
 7                       10,395                        945
 8                      135,135                     10,395
 9                    2,027,025                    135,135
10                   34,459,425                  2,027,025
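The counts in Table 1 follow the standard double-factorial formulas for bifurcating trees: (2n - 3)!! rooted trees and (2n - 5)!! unrooted trees for n OTUs. The sketch below reproduces the table; the function names are illustrative and not part of the original slides.

```python
def odd_double_factorial(k: int) -> int:
    """Return 1 * 3 * 5 * ... * k for odd k; return 1 when k < 3."""
    product = 1
    for factor in range(3, k + 1, 2):
        product *= factor
    return product


def rooted_tree_count(n_otus: int) -> int:
    """Number of rooted bifurcating trees for n_otus >= 2: (2n - 3)!!."""
    return odd_double_factorial(2 * n_otus - 3)


def unrooted_tree_count(n_otus: int) -> int:
    """Number of unrooted bifurcating trees: (2n - 5)!! (taken as 1 for n = 2)."""
    return odd_double_factorial(2 * n_otus - 5)


# Print one row per number of OTUs, as in Table 1.
for n in range(2, 11):
    print(n, rooted_tree_count(n), unrooted_tree_count(n))
```

The unrooted count for n OTUs equals the rooted count for n - 1 OTUs, which is why the two columns of the table are offset by one row.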
13
Fig. 8 Evolutionary changes
14
Fig. 9 Different patterns of ancestral and
derived states
15
Fig. 10 Cladograms and evolutionary trees
16
MATRIX DISTANCES
Let d(a,b) be the distance between sequences a
and b. A distance d is a metric if
1. $d(a,b) \ge 0$
(non-negativity)
2. $d(a,b) = d(b,a)$
(symmetry)
3. $d(a,c) \le d(a,b) + d(b,c)$
(triangle inequality)
4. $d(a,b) = 0$ if and only if $a = b$
(distinctness)
17
5. $d(a,b) \le \max[d(a,c),\ d(b,c)]$; that is, among any three
sequences the two largest distances are equal (the ultrametric condition).
Fig. 11 An ultrametric distance matrix between
four sequences and the corresponding ultrametric
tree
18
6. $d(a,b) + d(c,d) \le \max[d(a,c) + d(b,d),\ d(a,d) + d(b,c)]$,
the four-point condition.
Fig. 12 An additive distance matrix between four
sequences and the corresponding additive tree
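The ultrametric (three-point) and additive (four-point) conditions can be checked directly on a distance matrix. The following is a minimal sketch; the two example matrices are made up for illustration and are not the matrices shown in Fig. 11 and Fig. 12.

```python
from itertools import combinations


def make_matrix(pairs):
    """Turn {"AB": 4, ...} into a lookup keyed by unordered taxon pairs.

    Assumes single-character taxon names.
    """
    return {frozenset(key): value for key, value in pairs.items()}


def dist(matrix, a, b):
    return matrix[frozenset((a, b))]


def is_ultrametric(matrix, taxa, tol=1e-9):
    """Three-point condition: the two largest distances in every triple are equal."""
    for a, b, c in combinations(taxa, 3):
        d = sorted([dist(matrix, a, b), dist(matrix, a, c), dist(matrix, b, c)])
        if abs(d[2] - d[1]) > tol:
            return False
    return True


def is_additive(matrix, taxa, tol=1e-9):
    """Four-point condition: the two largest pairwise sums in every quartet are equal."""
    for a, b, c, e in combinations(taxa, 4):
        sums = sorted([
            dist(matrix, a, b) + dist(matrix, c, e),
            dist(matrix, a, c) + dist(matrix, b, e),
            dist(matrix, a, e) + dist(matrix, b, c),
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


TAXA = ["A", "B", "C", "D"]
# Hypothetical clock-like distances.
ULTRA = make_matrix({"AB": 2, "AC": 6, "BC": 6, "AD": 10, "BD": 10, "CD": 10})
# Hypothetical distances from the tree ((A:1,B:3):2,(C:2,D:4)) with unequal rates.
ADDITIVE = make_matrix({"AB": 4, "AC": 5, "AD": 7, "BC": 7, "BD": 9, "CD": 6})

print(is_ultrametric(ULTRA, TAXA), is_additive(ULTRA, TAXA))
print(is_ultrametric(ADDITIVE, TAXA), is_additive(ADDITIVE, TAXA))
```

Every ultrametric matrix is also additive, but not the reverse, which the second example illustrates.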
19
Organismal phylogeny
Fig. 13 The difference between monophyly and
non-monophyly
20
Fig. 14 Variation in evolutionary rate among
genes from the same organisms
21
Fig. 15 Phylogeny of 3 species and six genes
from a gene duplication
22
Progestin membrane receptor Component 1 (PGMRC1)
23
Fig. 2 Length heteroplasmy observed from the
amplified product of A. cerana
24
Fig. 2 Heteroplasmy observed from the amplified
product of ITS-1 from different individuals of
the same species
25
Inferring molecular phylogeny
Fig. 17 A parsimony tree and a distance tree
from the same sequences
26
Fig. 18 Tree construction using a clustering
method
27
Fig. 19 An optimality method assigns a score to every
possible tree
28
Classification of tree construction methods
Methods for the construction of phylogenetic trees can
be classified into two types depending on the
type of data used.
Table 2 Classification of phylogenetic
reconstruction methods
Type of data      Stepwise clustering                Exhaustive search
Distance matrix   UPGMA (assuming molecular clock)   Fitch-Margoliash
                  Neighbor-joining                   KITCH (F-M assuming molecular clock)
Character state                                      Maximum parsimony
                                                     Maximum likelihood
29
Desirable properties of tree construction methods
1. Efficiency (how fast is the method?)
2. Power (how much data does the method need to
produce a reasonable result?)
3. Consistency (will it converge on the right
answer given enough data?)
4. Robustness (will minor violations of the
method's assumptions result in a poor estimate of
phylogeny?)
5. Falsifiability (will the method tell us when
its assumptions are violated and that we should not
be using the method at all?)
30
Fig. 20 Additive and ultrametric trees for the
same sequences
31
PARSIMONY ANALYSIS
1. ATATT
2. ATCGT
3. GCAGT
4. GCCGT
Fig. 21 The most parsimonious tree for site 1.
32
Fig. 22 The most parsimonious trees for sites 2-5.
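For four sequences there are only three unrooted topologies, so the parsimony length of each can be counted site by site. The sketch below uses Fitch's set-intersection rule on the four sequences from the parsimony slide; the function names and the way topologies are encoded are illustrative choices, not taken from the figures.

```python
SEQS = {"1": "ATATT", "2": "ATCGT", "3": "GCAGT", "4": "GCCGT"}

# The three unrooted topologies for four taxa, written as two sister pairs.
TOPOLOGIES = [
    (("1", "2"), ("3", "4")),
    (("1", "3"), ("2", "4")),
    (("1", "4"), ("2", "3")),
]


def fitch_merge(left, right):
    """Fitch rule: intersect if possible (no change), otherwise union (one change)."""
    common = left & right
    if common:
        return common, 0
    return left | right, 1


def site_cost(topology, column):
    """Minimum number of changes at one site on a four-taxon topology."""
    (a, b), (c, d) = topology
    set_ab, cost_ab = fitch_merge({column[a]}, {column[b]})
    set_cd, cost_cd = fitch_merge({column[c]}, {column[d]})
    _, cost_mid = fitch_merge(set_ab, set_cd)
    return cost_ab + cost_cd + cost_mid


def tree_length(topology, seqs):
    """Sum the per-site costs over all alignment columns."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        site_cost(topology, {name: seq[i] for name, seq in seqs.items()})
        for i in range(n_sites)
    )


for topology in TOPOLOGIES:
    print(topology, tree_length(topology, SEQS))
```

Sites 1 and 2 favour grouping sequences 1 and 2, site 3 favours grouping 1 and 3, and sites 4 and 5 are uninformative, so the topology ((1,2),(3,4)) has the shortest total length.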
33
Fig. 23 The condition required by UPGMA
34
Fig. 24 An example where UPGMA will construct
the wrong tree.
35
Table 3 Comparisons of trees constructed from
mtDNA gene segments and the whole mitochondrial
genome
36
Fig. 25 Estimating the sample error
37
Fig. 26 Bootstrapping of data
38
Fig. 26 (cont.) Bootstrapping of data
39
Fig. 26 (cont.) Bootstrapping of data
-Bootstrap values superimposed on the original tree.
-Bootstrapped consensus tree.
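Bootstrapping resamples the alignment columns with replacement to build pseudo-replicate data sets of the same length, and a tree is then rebuilt from each replicate. The sketch below shows only the resampling step, reusing the four short sequences from the parsimony slide; the function name and the fixed random seed are assumptions made for the example.

```python
import random

ALIGNMENT = {"1": "ATATT", "2": "ATCGT", "3": "GCAGT", "4": "GCCGT"}


def bootstrap_alignment(alignment, rng):
    """Draw as many columns as the alignment has, with replacement."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in picks)
        for name, seq in alignment.items()
    }


rng = random.Random(1)  # fixed seed so the example is repeatable
for replicate_number in range(3):
    replicate = bootstrap_alignment(ALIGNMENT, rng)
    print(replicate_number, replicate)
```

The bootstrap value of a group is the percentage of replicate trees in which that group appears.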
40
Fig. 16 Types of consensus trees
41
A bootstrapped neighbor-joining tree illustrating
relationships between Tra-2 of various taxa.
42
Fig. 1 Multiple alignments of 16S rDNA sequences
from different individuals of the same species
43
Analysis of DNA polymorphisms using DNA sequencing
Homologous genes can be reliably identified
for comparison.
Recombinant sequences can arise (when the individual is
heterozygous at a particular locus or the primers anneal to
multiple sites). Therefore, the individual should be
homozygous or homogeneous for the sequences of
interest.
If the sequences are part of a gene family,
it is critical that orthologous genes are
being compared.
44
Sequence data
Align sequences
Choose between distance-based and
character-based approaches
Characters: choose the optimality criterion (parsimony or
maximum likelihood)
Distances: choose a distance measure, then an optimality
criterion or a single-tree algorithm (UPGMA, NJ,
Fitch-Margoliash, KITCH, Distance Wagner)
Test the reliability of the tree by analytical and/or
resampling procedures
45
TYPES OF DATA
Haplotypic data (dominant markers)
-Nucleotide sequences
-Restriction sites of animal mtDNA
-RFLP of animal mtDNA
-Multilocus DNA fingerprinting
-RAPD
-AFLP
-SSCP
46
Genotypic data (codominant markers)
-Allozymes
-Restriction site analysis and RFLP of coding and non-coding nuclear DNA
-Single copy nuclear DNA (sequencing and RFLP)
-Exon primed intron crossing-PCR (EPIC-PCR)
-Minisatellites and microsatellites
47
Sequence Alignments
-Multiple alignments (ClustalW or ClustalX)
Alignments of nucleotide sequences
-Pairwise alignments
-Multiple alignments of COI and 16S rDNA
Alignments of protein sequences
-CBX from various species: Apis mellifera (AMZGC,
XM_393875); Homo sapiens (HSHP1?, U26312; HSCBX3,
NM_007276; HSHP1, AF13660; HSCBX5-HP1a, CR457418;
HSCBX5, NM_012117; and HSCBX1, NM_006807); Mus
musculus (MMCBX3, NM_007624 and MMCBX5,
NM_007626); Gallus gallus (GGCBX3, NM_204643 and
GGCBX1, NM_204332); Xenopus laevis (XLHP1?,
AY168926; XLCBX3, BC046570; and XLHP1a, AF009820);
Cricetulus griseus (CGHP1a, AY548740 and CGHP1ß,
AY548739); Danio rerio (DRCBX1, NM_199746)
48
Sequence Alignments
-Load sequences (FASTA format).
-Colors indicate common residues.
-Set the alignment parameters.
-Each sequence is individually aligned to every other sequence in a series of
pairwise alignments.
-The set of pairwise alignments is used to create the GUIDE TREE.
-The guide tree is used to build the MULTIPLE ALIGNMENT.
Alignment parameters
-Pairwise alignment
-Slow accurate or Fast approximate
-Gap Opening Penalty (GOP, 10) and Gap Extension Penalty (GEP, 0.1)
Decreasing the GOP allows more gaps and produces fewer mismatches, but may
result in matches between residues that are not homologous.
49
Sequence Alignments
-Increasing the gap penalty will have the
opposite effect.
-Alignment of DNA sequences (OK with the defaults)
-Alignment of protein sequences: GOP 35 and GEP 0.75 as the starting
point.
Weight matrix parameters
-Match 1.9, Mismatch 0; any X or N is regarded as a MATCH.
The Protein Weight Matrix
-The highest score is given for a MATCH.
-Some mismatches get higher scores than
others, based on biochemical and functional
similarities.
50
Sequence Alignments
-BLOSUM matrix: the best for searching databases.
-PAM
-Gonnet (updated PAM)
Multiple alignment parameters
-Choose the same matrix.
-For DNA sequences, the default setting is OK.
-For protein sequences, GOP 15.0 and GEP 0.3.
-Delay divergent sequences (determines how
different two sequences must be for
their incorporation into the multiple alignment to be delayed; 25,
or the default of 30)
-OUTPUT FILE FORMAT.
51
Sequence Alignments
CREATING THE ALIGNMENTS
-Do complete alignment now.
Conservation symbols shown with the alignment:
* positions that are fully conserved
: one of the strong groups is fully conserved (all
of the amino acids at that position belong to the
same group)
. one of the weak groups of amino
acids is similarly fully conserved
Increase the pairwise (PW) gap penalties to 100 and 7.5 and the multiple
alignment gap penalties to 100 and 3.0.
-Choose Reset All Gaps Before Alignment.
-Do complete alignment.
52
(No Transcript)
53
Protein Sequences
Manual adjustment for protein alignments.
-D, R, E, N, K (polar, on the surface of the folded
protein)
-F, A, M, I, L, Y, V and W
(hydrophobic, internal and slowly evolving)
-C, H, Q, S and T may be replaced with any other residue.
-D, R, E, N, K, C, H, Q, S and T, when conserved, are
likely to be involved in the active site of the
enzyme.
54
Sequence Alignments
Aligning New Sequences to an Existing Alignment
Aligning Two Existing Alignments
-Profile alignments
-Load file 1
-Load file 2
-Align Sequences to Profile 1, or
-Align Profile 2 to Profile 1 (two aligned files
only).
-Do complete alignment now.
55
PARSIMONY ANALYSIS
1. ATATT
2. ATCGT
3. GCAGT
4. GCCGT
Fig. 21 The most parsimonious tree for site 1.
56
Fig. 22 The most parsimonious trees for sites 2-5.
57
Cross Species Transmission and Epidemiological
Information of HIV
58
(No Transcript)
59
Fig. 2 A strict consensus tree constructed from
10,000 of the most parsimonious trees based on
combined psbA-trnH and petA-psbJ sequences (tree
length = 321 steps; consistency index, CI = 0.82;
retention index, RI = 0.91).
60
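Genetic distance (sequence divergence) between
two sequences
Kimura's two-parameter method assumes the rate of
transitional substitution per site per year (a)
to be different from that of transversional
substitution (b).
[Diagram: the four bases A, C, T and G joined by substitution arrows, with transitions (A↔G, C↔T) labelled a and transversions labelled b]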
61
Calculation of genetic distance between a pair
of sequences
$d = \frac{1}{2} \ln(a) + \frac{1}{4} \ln(b)$
where $a = 1/(1 - 2P - Q)$ and $b = 1/(1 - 2Q)$;
P and Q = the proportions of
transitional and
transversional differences between a pair
of sequences.
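The two-parameter distance can be computed by counting transitional and transversional differences between two aligned sequences. This is a minimal sketch under the assumption that positions containing anything other than A, C, G or T are skipped; the function name and the example sequences are illustrative.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: 1/2 ln(a) + 1/4 ln(b)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x in "ACGT" and y in "ACGT"]
    n = len(pairs)
    differences = [(x, y) for x, y in pairs if x != y]
    # A transition stays within purines or within pyrimidines.
    transitions = sum(
        1 for x, y in differences
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    )
    transversions = len(differences) - transitions
    p = transitions / n      # P: proportion of transitional differences
    q = transversions / n    # Q: proportion of transversional differences
    a = 1.0 / (1.0 - 2.0 * p - q)
    b = 1.0 / (1.0 - 2.0 * q)
    return 0.5 * math.log(a) + 0.25 * math.log(b)


# Ten sites with one transition (A/G) and one transversion (A/C).
print(round(k2p_distance("AAAAAAAAAA", "GCAAAAAAAA"), 4))
```

The logarithms are undefined when 2P + Q or 2Q reaches 1, so very divergent pairs cannot be corrected with this formula.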
62
Subtype A is commonly found in Central and West
Africa.
Subtype B predominates in the developed world.
Fig. 29 Unrooted NJ tree of 72 gag gene sequences (1254
bp).
63
Fig. 30 Unrooted NJ tree of 64 isolates of HCV E1
gene (576 bp).
64
Fig. 31 Unrooted NJ tree of 76 isolates of HCV
NA-5 gene (222 bp).
65
Fig. 44 Unrooted NJ tree of abalone genotypes
(16S rDNA sequences)
66
Analysis of DNA polymorphisms using RFLP
(fragments) and mapping (restriction sites)
Heritability (fidelity of transmission and mode
of inheritance)
Repeatability
Independence of characters (e.g. Mbo I (GATC)
and Bam HI (GGATCC) sites overlap, and a single
C-T transition that eliminates a Bam HI site
(GGATCC) produces a Hin fI site (GANTC), so such
characters are not independent)
67
Restriction fragment length polymorphisms (RFLPs)
If two individuals possess electrophoretically
identical fragments, these fragments are assumed
to represent homologous DNA segments.
The compared fragments may be paralogous rather
than orthologous (a serious problem for
restriction analysis of PCR products, where
multiple copies or pseudogenes may be amplified
by a pair of primers).
Fragments may be convergent (the same size but not homologous).
Fragments of different sizes may actually be homologous.
68
Restriction site polymorphisms
Co-migrating homologous fragments in different
individuals represent products that are
identical by descent.
Eliminates problems from convergence.
Individuals showing heteroplasmy can be compared.
69
Fig. 2 An example of restriction analysis of the
16S rDNA (A) of P. monodon from Chumphon (lanes
2-6), Phangnga (lanes 7-10), and Satun (lanes 11,
13, and 14), and the COI-COII (B) of P. monodon
from Chumphon (lanes 2-14) digested with Mbo I
and Taq I, respectively. Lanes 1 are undigested
PCR products. A 100-bp ladder was used as the DNA
marker (lanes M).
70
Restriction patterns
Restriction maps (cutting site sharing)
Restriction fragment length polymorphisms (fragment
site sharing)
Phylogenetic trees: UPGMA, Neighbor-joining,
Fitch-Margoliash, KITCH
Genetic distance between genotypes
Loss and gain of restriction sites
Genetic distance between populations and/or
species
Maximum parsimony
Bootstrapping
71
Calculation of restriction cleavage site data
Similarity between genotype patterns is calculated using
the equation
$S_{xy} = 2 n_{xy} / (n_x + n_y)$
where $n_x$ and $n_y$ = the numbers of restriction
sites observed in the xth and yth
genotypes, and $n_{xy}$ = the number of shared
restriction sites.
Genetic distance can then be estimated using
$d = (-\ln S_{xy}) / r$
where r = the number of base pairs
recognised by the restriction enzyme.
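As a worked illustration of the two formulas for restriction site data, the sketch below computes the similarity and the distance for a pair of genotypes. The site counts are invented for the example, and the function names are illustrative.

```python
import math


def site_similarity(n_x, n_y, n_xy):
    """S_xy = 2 n_xy / (n_x + n_y): the proportion of shared restriction sites."""
    return 2.0 * n_xy / (n_x + n_y)


def site_distance(n_x, n_y, n_xy, r):
    """d = -ln(S_xy) / r, with r the length of the recognition sequence in bp."""
    s_xy = site_similarity(n_x, n_y, n_xy)
    if s_xy <= 0:
        raise ValueError("no shared sites: the distance is undefined")
    return -math.log(s_xy) / r


# Hypothetical genotypes with 10 and 12 sites, 9 of them shared,
# mapped with a six-base cutter (r = 6).
print(site_similarity(10, 12, 9))
print(site_distance(10, 12, 9, 6))
```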
72
Calculation of restriction fragment data
When restriction sites are not known, d can be
estimated by
$d = -(2/r) \ln(G)$
where r = the number of base pairs in the
recognition sequence of the
restriction enzyme. G is determined by the
following iterative formula
$G = [F(3 - 2G_1)]^{1/4}$
73
This iterative computation is repeated until
$G = G_1$. An initial trial value of $G_1 = F^{1/4}$ is
recommended.
F is the similarity between genotype patterns,
estimated by
$F = 2 m_{xy} / (m_x + m_y)$
where $m_x$ and $m_y$ = the numbers of restriction
fragments in
the xth and yth genotypes, and $m_{xy}$ =
the number of fragments shared
between the two genotypes.
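The iterative estimate of G from fragment sharing is easy to get wrong by hand, so a short sketch may help. It starts from the recommended trial value of F to the power 1/4 and repeats the update until G stops changing; the tolerance, the iteration cap and the fragment counts are assumptions chosen for the example.

```python
import math


def fragment_distance(m_x, m_y, m_xy, r, tol=1e-12, max_iter=1000):
    """d = -(2/r) ln(G), with G solved from G = [F(3 - 2G)]^(1/4)."""
    f = 2.0 * m_xy / (m_x + m_y)        # F: proportion of shared fragments
    if f <= 0:
        raise ValueError("no shared fragments: the distance is undefined")
    g_previous = f ** 0.25              # recommended starting value G1
    g = g_previous
    for _ in range(max_iter):
        g = (f * (3.0 - 2.0 * g_previous)) ** 0.25
        if abs(g - g_previous) < tol:   # converged: G equals G1
            break
        g_previous = g
    return -(2.0 / r) * math.log(g)


# Hypothetical patterns with 10 fragments each, 5 shared, six-base cutter.
print(fragment_distance(10, 10, 5, 6))
```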
74
Nucleotide differences between populations
Nucleotide diversity within a population is
calculated by
$d_x = \frac{n_x}{n_x - 1} \sum_{i,j} x_i x_j d_{ij}$
where $n_x$ = the number of sequences sampled,
$d_{ij}$ = the number of nucleotide
substitutions per site between the ith
and jth genotypes, and $x_i$ and $x_j$ = the sample
frequencies of the ith and jth
genotypes in population X (Nei, 1987).
75
Nucleotide diversity between two populations is
estimated by
$d_{xy} = \sum_{i,j} x_i y_j d_{ij}$
where $d_{ij}$ = the number of nucleotide substitutions
between the ith genotype from population X
and the jth genotype from population Y.
Nucleotide divergence between two populations
is then calculated by
$d_A = d_{xy} - (d_x + d_y)/2$
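The three quantities (diversity within a population, diversity between two populations, and net divergence) can be chained together as follows. The haplotype frequencies, sample sizes and pairwise distances below are hypothetical and serve only to show the arithmetic.

```python
def within_diversity(freqs, d, n):
    """d_x = n/(n-1) * sum over i, j of x_i x_j d_ij."""
    total = sum(freqs[i] * freqs[j] * d[i][j] for i in freqs for j in freqs)
    return n / (n - 1) * total


def between_diversity(freqs_x, freqs_y, d):
    """d_xy = sum over i, j of x_i y_j d_ij."""
    return sum(freqs_x[i] * freqs_y[j] * d[i][j] for i in freqs_x for j in freqs_y)


def net_divergence(d_xy, d_x, d_y):
    """d_A = d_xy - (d_x + d_y) / 2."""
    return d_xy - (d_x + d_y) / 2.0


# Hypothetical substitutions per site between three haplotypes.
D = {
    "h1": {"h1": 0.0, "h2": 0.01, "h3": 0.03},
    "h2": {"h1": 0.01, "h2": 0.0, "h3": 0.03},
    "h3": {"h1": 0.03, "h2": 0.03, "h3": 0.0},
}
POP_X = {"h1": 0.5, "h2": 0.5}   # sample of 10 sequences
POP_Y = {"h3": 1.0}              # sample of 10 sequences

d_x = within_diversity(POP_X, D, 10)
d_y = within_diversity(POP_Y, D, 10)
d_xy = between_diversity(POP_X, POP_Y, D)
print(d_x, d_y, d_xy, net_divergence(d_xy, d_x, d_y))
```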
76
Fig. 41 Unrooted NJ tree of oysters in Thailand
(RFLP)
77
Fig. 43. A neighbor-joining tree summarizing the
genetic relationships of Pomacea canaliculata, P.
ampullacea, P. polita, P. pesmei, and P.
angelica constructed using the average genetic
distance between pairs of composite haplotypes.
78
Fig. 42 UPGMA dendrogram of abalone in Thailand
(RFLP)
79
Fig. 32 Unrooted NJ tree of Apis cerana in
Thailand (RFLP)
80
Randomly Amplified Polymorphic DNA (RAPD)
Fig. 5 RAPD patterns for P. monodon using UBC268.
The P. monodon specimens were from Satun (lanes 1
and 2), Trang (lanes 3 and 4), Trat (lanes 5-8),
Phangnga (lanes 9 and 10) and Chumphon (lanes 11
and 12). An arrowhead indicates a 260-bp RAPD
marker observed in almost all of the specimens
from Trat, but not in the Andaman Sea P. monodon.
81
RAPD MARKERS
Fig. 15 Species-specific RAPD markers resulted
from amplification of total DNA of H. asinina
from HARAYE (lanes 1 - 4), HATRAW (lanes 5 - 8)
and HACAMHE (lanes 9 - 13) with OPB11 (A), UBC195
(B) and UBC271 (C), respectively.
82
AFLP
Fig. 5 A 4.5% denaturing polyacrylamide gel
electrophoresis showing AFLP products of 10
bulked DNA samples: PMF1 - 5 (lanes 1 - 5) and PMM1 - 5
(lanes 6 - 10) using primers E314/M3-15; PMF1 -
BF5 (lanes 11 - 15) and PMM1 - PMM5 (lanes 16 - 20)
using primers E314/M3-16. An arrowhead
indicates a candidate sex-specific AFLP marker.
Lanes M1 and M2 are 100 bp and 50 bp DNA markers,
respectively.
83
Nuclear DNA Polymorphism
84
Statistical analysis of genetic
relatedness determined by DNA fingerprinting and
RAPD
Estimation of genetic distance
The index of similarity between individuals is
calculated first.
For eukaryotes
$S_{xy} = 2 n_{xy} / (n_x + n_y)$
where $n_x$ and $n_y$ = the numbers of fragments
observed in
the xth and yth individuals, and
$n_{xy}$ = the number of fragments shared
between the two individuals.
85
For prokaryotes
$J_{xy} = C_{xy} / (n_x + n_y - C_{xy})$
where $C_{xy}$ = the number of positive matches
between two individuals, and
$n_x$ and $n_y$ = the numbers of scored bands in
individuals x and y, respectively.
Genetic distance is then calculated by
$d = 1 - S_{xy}$ or $d = 1 - J_{xy}$
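Both similarity indices can be computed from the sets of bands scored as present in each individual. In this sketch the band identifiers are arbitrary labels (for instance fragment sizes), and the two example individuals are hypothetical.

```python
def band_sharing(bands_x, bands_y):
    """S_xy = 2 n_xy / (n_x + n_y), from two sets of scored bands."""
    shared = len(bands_x & bands_y)
    return 2.0 * shared / (len(bands_x) + len(bands_y))


def jaccard(bands_x, bands_y):
    """J_xy = C_xy / (n_x + n_y - C_xy)."""
    shared = len(bands_x & bands_y)
    return shared / (len(bands_x) + len(bands_y) - shared)


# Hypothetical individuals: bands labelled by fragment size in bp.
x = {260, 400, 550, 700}
y = {550, 700, 900}

print(band_sharing(x, y), 1 - band_sharing(x, y))   # similarity and distance
print(jaccard(x, y), 1 - jaccard(x, y))
```

The two indices rank pairs in the same order, but the band-sharing index gives shared bands double weight, so its values are never lower than the Jaccard values.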
86
Calculation of genetic distance between pairs of
populations (or species) for eukaryotic
organisms.
The similarity coefficient between a pair of
individuals is calculated using the equation
$S_{xy} = 2 n_{xy} / (n_x + n_y)$
The similarity index within a population (S) is
calculated as the average of $S_{xy}$ across all
pairwise comparisons between individuals within
that population.
87
Genetic similarity between populations with a
correction for within-population similarity is
$S^{a}_{ij} = 1 + S'_{ij} - 0.5\,(S_i + S_j)$
where $S_i$ and $S_j$ = the S estimates for populations
i and j, and $S'_{ij}$ = the average similarity between
random pairs of individuals across
populations i and j.
$S^{a}_{ij}$ is then converted to the genetic distance
($D_{ij}$) using the equation
$D_{ij} = 1 - S^{a}_{ij}$
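The population-level correction can be sketched by averaging the band-sharing index within each population and between populations. The band sets below are invented, and the sketch compares every between-population pair rather than a random subset, which is an assumption made for simplicity.

```python
from itertools import combinations, product


def band_sharing(bands_x, bands_y):
    """S_xy = 2 n_xy / (n_x + n_y)."""
    return 2.0 * len(bands_x & bands_y) / (len(bands_x) + len(bands_y))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def within_similarity(population):
    """Average S_xy over all pairs of individuals in one population."""
    return mean(band_sharing(a, b) for a, b in combinations(population, 2))


def corrected_distance(pop_i, pop_j):
    """D_ij = 1 - [1 + S'_ij - 0.5 (S_i + S_j)]."""
    s_between = mean(band_sharing(a, b) for a, b in product(pop_i, pop_j))
    s_i = within_similarity(pop_i)
    s_j = within_similarity(pop_j)
    corrected_similarity = 1.0 + s_between - 0.5 * (s_i + s_j)
    return 1.0 - corrected_similarity


# Hypothetical populations, two individuals each, bands labelled 1 to 7.
POP_1 = [{1, 2, 3}, {1, 2, 4}]
POP_2 = [{1, 5, 6}, {1, 5, 7}]
print(corrected_distance(POP_1, POP_2))
```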
88
Fig. 34 UPGMA dendrogram of bee mites having
different host species (RAPD)
Tropilaelaps koenigerum Tropilaelaps clareae
89
Fig. 35 Unrooted NJ tree of mud crabs in
Thailand (RAPD)
90
Analysis of DNA polymorphisms using
microsatellite loci
Determine genetic diversity in species that
show low levels of variation.
Determine genetic variation at the intraspecific
level when populations of a species are
inbred or have experienced severe bottleneck effects.
Carry out pedigree analysis in selective
breeding programmes.
91
Perfect and compound microsatellites.
Heterologous microsatellite primers.
Stutter bands.
92
MICROSATELLITES
93
Statistical analysis of microsatellite data
Allele frequencies
For diploid taxa, the frequency of a particular
allele in a population can be calculated as
$p = (2 N_{AA} + N_{Aa}) / 2N$
where p = the frequency of the A allele,
N = the total number of individuals in
the investigated sample, and $N_{AA}$ and $N_{Aa}$ =
the numbers of homozygotes and
heterozygotes for that allele at the locus.
94
Variation
Heterozygosity (H) is the proportion of
heterozygous individuals and can be calculated by
$H = \frac{1}{n} \sum (N_{AB} / N)$
where $N_{AB}$ is the number of heterozygous individuals at a locus
and n is the number of loci. When the
population is in Hardy-Weinberg equilibrium,
heterozygosity can be calculated from allele
frequencies at a given locus by
$h = 1 - \sum x_i^2$
where $x_i$ is the frequency of the ith allele. H
is the mean of h over all loci.
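The allele frequency and heterozygosity formulas can be applied to a list of diploid genotypes at one locus. The sample below is hypothetical (four AA, four AB and two BB individuals), and the function names are illustrative.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Count both alleles of every diploid genotype and divide by 2N."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * len(genotypes)
    return {allele: count / total for allele, count in counts.items()}


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """h = 1 - sum of squared allele frequencies (Hardy-Weinberg expectation)."""
    return 1.0 - sum(x * x for x in freqs.values())


GENOTYPES = [("A", "A")] * 4 + [("A", "B")] * 4 + [("B", "B")] * 2

freqs = allele_frequencies(GENOTYPES)
print(freqs)                               # p(A) = (2*4 + 4) / 20
print(observed_heterozygosity(GENOTYPES))
print(expected_heterozygosity(freqs))
```

For several loci, H is obtained by averaging the per-locus values.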
95
Genetic distance
For populations X and Y, the probability of
identity of two randomly chosen genes at a single
locus (Jk) is
$j_x = \sum x_i^2$ and $j_y = \sum y_i^2$
where $x_i$ and $y_i$ are the frequencies of the ith
allele at a given locus in populations X and Y,
respectively.
The probability of identity of a gene at the same
locus in populations X and Y is
$j_{xy} = \sum x_i y_i$
96
The normalised identity between populations X and
Y with respect to all loci is
$I = J_{xy} / (J_x J_y)^{1/2}$
where $J_x$, $J_y$ and $J_{xy}$ represent the arithmetic means
of $j_x$, $j_y$ and $j_{xy}$, respectively,
taken over all loci.
The genetic identity between populations X and Y is
then converted to Nei's standard genetic
distance using the formula
$D = -\ln(I)$
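Nei's standard genetic distance can be computed from per-locus allele frequency tables for the two populations. In this sketch each population is a list of dictionaries, one per locus; the two-locus data set is hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """D = -ln(I), with I = J_xy / sqrt(J_x * J_y) averaged over loci."""
    j_x, j_y, j_xy = [], [], []
    for locus_x, locus_y in zip(pop_x, pop_y):
        alleles = set(locus_x) | set(locus_y)
        j_x.append(sum(f * f for f in locus_x.values()))
        j_y.append(sum(f * f for f in locus_y.values()))
        j_xy.append(sum(locus_x.get(a, 0.0) * locus_y.get(a, 0.0) for a in alleles))
    n_loci = len(j_x)
    big_jx = sum(j_x) / n_loci      # J_x: mean of j_x over loci
    big_jy = sum(j_y) / n_loci
    big_jxy = sum(j_xy) / n_loci
    identity = big_jxy / math.sqrt(big_jx * big_jy)
    return -math.log(identity)


# Hypothetical allele frequencies at two loci.
POP_X = [{"A": 0.5, "B": 0.5}, {"C": 1.0}]
POP_Y = [{"A": 1.0}, {"C": 1.0}]
print(nei_standard_distance(POP_X, POP_Y))
```

Populations that share no alleles at any locus give I = 0, for which the distance is undefined.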
97
Fig. 33 Bootstrapped NJ tree of Apis cerana in
Thailand (microsatellites)
98
Phylogenetic Analysis Using Protein Sequences
PAM values (numbers of accepted point
mutations per 100 amino acids).
Two homologous proteins that have a common
ancestor and a PAM distance of 250-300 (80-85%
different) or more cannot be distinguished from
two randomly chosen and aligned proteins of
similar length.
Manual adjustment for protein alignments.
-D, R, E, N, K (polar, on the surface of the folded
protein)
-F, A, M, I, L, Y, V and W
(hydrophobic, internal and slowly evolving)
-C, H, Q, S and T may be replaced with any other residue.
-D, R, E, N, K, C, H, Q, S and T, when conserved, are
likely to be involved in the active site of the
enzyme.
99
Why Protein Sequences?
-Phylogenetic noise reduction -Multigene
families and posttranscriptional
editing -Expressed sequence tag (EST)
Fig. 19 A phylogenetic tree from phosphoglycerate
kinase
100
Protein Distances
$d = -\ln(1 - p - 0.2 p^2)$
where p is the proportion of observed differences
between any two homologous sequences (Kimura,
1983); the estimate is usually reliable for p up to about 0.7.
Protdist in PHYLIP
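Kimura's (1983) empirical correction for protein sequences is d = -ln(1 - p - 0.2 p^2), where p is the observed proportion of differing residues. The sketch below applies it to two aligned sequences, skipping any column with a gap; the gap handling, the function name and the example peptides are assumptions made for illustration.

```python
import math


def kimura_protein_distance(seq1, seq2):
    """Kimura (1983) protein distance: -ln(1 - p - 0.2 p^2)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    p = sum(1 for a, b in pairs if a != b) / len(pairs)
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inner)


# Hypothetical peptides differing at 5 of 10 compared positions (p = 0.5).
print(kimura_protein_distance("AAAAAAAAAA", "AAAAACCCCC"))
```

The correction grows quickly as p increases and becomes undefined once 1 - p - 0.2 p^2 reaches zero, which is one reason it is only trusted for moderately divergent sequences.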
101
Small Ubiquitin Modifier-1 (SUMO-1)
  • SUMO-1 plays an important role in diverse
    reproductive functions such as spermatogenesis
    and modulation of steroid receptor activity.
  • SUMO-binding motifs have been identified in the
    androgen receptor (AR), progesterone receptor
    (PR) and glucocorticoid receptor (GR), suggesting
    distinct roles for sumoylation, a
    posttranslational modification system that
    covalently attaches SUMO-1 to target proteins, in
    steroid receptor activity for growth and
    reproduction.

Discovery of SUMO-1 in P. monodon
102
A bootstrapped neighbor-joining tree illustrating
relationships between Dmc1 of various taxa.
103
A bootstrapped neighbor-joining tree illustrating
relationships between a homologue of cyclophilin
A from P. monodon (PMCYA), and that of various
taxa.
104
Progestin membrane receptor Component 1 (PGMRC1)
105
Population Genetics
ARLEQUIN - Powerful genetic analysis package performing a wide variety of tests, including hierarchical analysis of variance. Has an import feature for GENEPOP files. May give spurious results if the input contains a lot of missing data.
BOTTLENECK - Detection of historical population bottlenecks from allele frequency data.
DNASP - Analysis of nucleotide polymorphism from aligned DNA sequence data. Very useful for population genetic analyses of sequence data, including tests for selection.
FSTAT - Calculates FST, RST and tests the estimates, among other standard population genetics statistics. Also produces and tests pairwise FST values. A new and useful feature is the estimation of allelic richness corrected for sample size, and tests for differences in genetic diversity between groups of samples.
106
GDA - Program for the analysis of discrete genetic data, based on Weir (1996) Genetic Data Analysis.
GENAlEx - Excel Add-In for the analysis of genetic data. Particularly useful for dominant data such as RAPD and AFLP data.
GENEPOP - Performs exact tests for deviation from Hardy-Weinberg, linkage disequilibrium, population differentiation and isolation by distance (DOS). There is also a web-based version of the program.
GENETIX - Powerful analysis package for population genetics, but you have to understand French. Has nice features such as a PCA on individual genotypes and permutation tests of FST.
LAMARC - Package of programs for computing population parameters, such as population size, population growth rate and migration rates, by using likelihoods over all possible gene genealogies for samples of data (sequences, microsatellites, and electrophoretic polymorphisms) from populations. Includes Coalesce, Fluctuate, Migrate and Recombine.
107
MICROSAT - Program for microsatellite analysis (distances, Fst, Rst), which has been compiled for both DOS and Mac.
MLNE - Maximum likelihood estimation of Ne with simultaneous estimation of immigration from another population.
NE ESTIMATOR - Uses three different methods of Ne estimation. Accepts GENEPOP and ARLEQUIN input files.
PCAGEN - Principal component analysis on allele frequency data with significance testing.
POPGENE - Windows (3.1, 95, NT) program for the analysis of genetic variation among and within populations using co-dominant and dominant markers, and quantitative data.
REAP - DOS package for the analysis of (mtDNA) RFLP data. Old and tested. Provides haplotype and nucleotide diversity and nucleotide divergence, and carries out Monte Carlo tests for differences in haplotype frequencies.
108
RSTCalc - DOS program for the analysis of microsatellite data. Calculates a variety of statistics assuming stepwise mutation models.
STRUCTURE - Uses a clustering method to identify population structure and assigns individuals to those populations. No a priori division into samples required.
TFPGA - Tools for Population Genetic Analysis. Can perform hierarchical analyses and use dominant data.
Assignment tests and Mixed stock analysis
AFLPOP - Allocation of individuals to populations from AFLP data.
GENECLASS - Windows 95/NT program for assignment and exclusion of individuals to/from populations. Similar, but more extended stats to DOH (see above) and IMMANC (see below).
IMMANC - A program that calculates the probability that an individual is an immigrant, or has recent immigrant ancestry, using the multilocus genotype
109
WHICHRUN - Population assignment of individuals based on multilocus genotype data. Several different methods, including Maximum Likelihood and Jackknifing. Accepts GENEPOP files and prepares input files for SPAM.
WHICHLOCI - Uses genotype data to identify the loci most useful for population assignment. Takes GENEPOP files.
SPAM - Statistics Package for Analyzing Mixtures. Advanced mixed stock analysis using Likelihood and Bayesian statistics. Input files can be prepared using WhichRun.
Phylogenetics
Link collections
Other Phylogeny Programs by Joe Felsenstein here at U of Washington. Very extensive and regularly updated.
Software links at the University of Glasgow, including programs for tree building and sequence alignment.
110
Sequence alignment and editing
CLUSTAL - One of the most widely used and powerful multiple alignment programs. Documentation, download, read the manuscript and online help (Strasbourg) from the Finnish EMBnet node.
BioEdit - Sequence alignment editor featuring automated links to ClustalW and BLAST, as well as to relevant websites. Very useful.
Specific Programs
MEGA - Molecular Evolutionary Genetics Analysis. Reconstructs phylogenies using distance matrices and maximum parsimony methods, and includes neighbour joining, branch-and-bound parsimony methods and bootstrapping. The manual is available in HTML format.
http://www.zi.biologie.uni-muenchen.de/strimmer/puzzle.html
PAUP 4.0 - The long awaited phylogeny program. Still a beta version, but it's on sale from Sinauer Assoc. MMBL has PAUP, if you want to use it.
PHYLIP - PHYlogeny Inference Package. Extensive package of programs for inferring phylogenies for Windows. Sometimes clumsy in data handling, but fast, versatile and powerful.
DAMBE - Software package for data analysis in molecular biology and evolution.
111
Fig. 1 The PHYLIP software package
112
Applications of phylogenetic analysis for
molecular genetic studies of organisms
113
Macroevolution and Species Phylogenies
"... the principle of divergence, plays, I believe,
an important part in the origin of
species."
Charles Darwin, 5 September 1857
114
Macroevolution and Species Phylogenies
115
The Journey of Man Out of Africa?
Microevolution
Mitochondrial DNA polymorphism
116
The Journey of HIV Out of Africa Again
117
Fig. 1. (A) An unrooted phylogenetic tree of
samples from genus Zingiber based on trnL and
rps16 sequences (B) An unrooted phylogenetic
tree of samples from genus Zingiber based on
chemical profiles identified using GC/MS. L8 from
genus Alpinia is used as an outgroup. Numbers
above the lines are bootstrap values in the
phylogenetic tree.
118
Molecular epidemiology of Escherichia coli
causing neonatal meningitis
119
Fig. 1. Phylogram based on a 237 nt fragment of
the 5' NCR from 112 Swedish BVDV isolates. Ovals
enclose pairs of sequences from herds where
suspected epidemiological relationships are
supported by the phylogenetic analysis.
120
Veronica: chemical characters for the support of
phylogenetic relationships based on nuclear
ribosomal and plastid DNA sequence data
-Carnoside (subgenus Chamaedrys)
-8-hydroxyflavones (subgenera Pocilla and Pentasepalae)
121
(No Transcript)
122
ALFPm1          ---------MRV--LVSFLMALSLIALMP-RCQGQGVQDLLPALVEKIAGLWHSDEVEFL
ALFPm2          ---------MRV--LVSFLMALSLIALMP-RCQGQGVQDLLPALVEKIAGLWHSDEVEFL
ALFPm3          ---------MRVSVLVSLVLVVSLVALFAPQCQAQGWEAVAAAVASKIVGLWRNEKTELL
ALFPm5          ---------MRVSVLVSLVLVVSLVAVFAPQCQAQGWEAVAAAVASKIVGLWRNEKTELL
ALFPm4          MYLSSYLISLTVTVLVKYHSSFSPSLFLCHFFLIPRLHFSNLFVRSPPTRLWRNEKTELL
L. polyphemus   --------------------------------EGGIWTQLALALVKNLATLWQSGDFQFL
T. tridentatus  --------------------------------EGGIWTQLALALVKNLATLWQSGDFQFL
L. polyphemus2  ---------------------------------DGIWTQLIFTLVNNLATLWQSGDFQFL

ALFPm1          GHSCRYSQRPSFYRWELYFNGRMWCPGWAPFTGRCE------------------------
ALFPm2          GHSCRYSQRPSFYRWELYFNGRMWCPGWAPFTGRSRTRSPSGAIEHATRDFVQKALQS--
ALFPm3          GHECKFTVKPYLKRFQVYYKGRMWCPGWTAIRGEASTRSQSGVAGKTAKDFVRKAFQK--
ALFPm5          GHECKFTVKPYLKRFQVYYKGRMWCPGWDGHQRRSQHTQSVRGSWKDSQRLRSESFPERS
ALFPm4          GHECKFTVKPYLKRFQVYYKGRMWCPGWTAIRGEASTRSQSGVAGKTAKDFVRKAFQK--
L. polyphemus   GHECHYRVNPTVKRLKWKYKGKFWCPSWTSITGRATKSSRSGAVEHSVRDFVSQAKSS--
T. tridentatus  GHECHYRVNPTVKRLKWKYKGKFWCPSWTSITGRATKSSRSGAVEHSVRDFVSQAKSS--
L. polyphemus2  DHECHYRIKPTFRRLKWKYKGKFWCPSWTSITGRATKSSRSGAVEHSVRNFVGQAKSS--

ALFPm1          --------------------------------------------
ALFPm2          -------------NLITEEDARIWLEH-----------------
ALFPm3          -------------GLISQQEANQWLSS-----------------
ALFPm5          HLSTGGQPVAQLIGLLLYEELSVFSCSWQWKLYHFDFLCFSFQY
ALFPm4          -------------GLISQQEANQWLSS-----------------
L. polyphemus   -------------GLITEKEAQTFISQYQ---------------
T. tridentatus  -------------GLITEKEAQTFISQYE---------------
L. polyphemus2  -------------GLITQRQAEQFISQYN---------------


Fig. 45 Multiple alignment of anti-LPS homologues
in P. monodon
123
Fig. 46 Phylogenetic relationships of anti-LPS in
the black tiger shrimp (Penaeus monodon)
124
A.
ALFPm2  MRVLVSFLMALSLIALMPRCQGQGVQDLLPALVEKIAGLWHSDEVEFLGHSCRYSQRPSF
ALFPm1  MRVLVSFLMALSLIALMPRCQGQGVQDLLPALVEKIAGLWHSDEVEFLGHSCRYSQRPSF

ALFPm2  YRWELYFNGRMWCPGWAPFTGRSRTRSPSGAIEHATRDFVQKALQSNLITEEDARIWLEH
ALFPm1  YRWELYFNGRMWCPGWAPFTGRCE------------------------------------
125
B.
ALFPm4  MYLSSYLISLTVTVLVKYHSSFSPSLFLCHFFLIPRLHFSNLFVRSPPTRLWRNEKTELL
ALFPm3  ---------MRVSVLVSLVLVVSLVALFAPQCQAQGWEAVAAAVASKIVGLWRNEKTELL

ALFPm4  GHECKFTVKPYLKRFQVYYKGRMWCPGWTAIRGEASTRSQSGVAGKTAKDFVRKAFQKGL
ALFPm3  GHECKFTVKPYLKRFQVYYKGRMWCPGWTAIRGEASTRSQSGVAGKTAKDFVRKAFQKGL

ALFPm4  ISQQEANQWLSS
ALFPm3  ISQQEANQWLSS

C.
ALFPm5  MRVSVLVSLVLVVSLVAVFAPQCQAQGWEAVAAAVASKIVGLWRNEKTELLGHECKFTVK
ALFPm3  MRVSVLVSLVLVVSLVALFAPQCQAQGWEAVAAAVASKIVGLWRNEKTELLGHECKFTVK

ALFPm5  PYLKRFQVYYKGRMWCPGWDGHQRRSQHTQSVRGSWKDSQRLRSESFPERSHLSTGGQPV
ALFPm3  PYLKRFQVYYKGRMWCPGWTAIRGEASTRSQSGVAGKTAKDFVRKAFQKGLISQQEANQW

ALFPm5  AQLIGLLLYEELSVFSCSWQWKLYHFDFLCFSFQY
ALFPm3  LSS--------------------------------

D.
ALFPm5  ---------MRVSVLVSLVLVVSLVAVFAPQCQAQGWEAVAAAVASKIVGLWRNEKTELL
ALFPm4  MYLSSYLISLTVTVLVKYHSSFSPSLFLCHFFLIPRLHFSNLFVRSPPTRLWRNEKTELL

ALFPm5  GHECKFTVKPYLKRFQVYYKGRMWCPGWDGHQRRSQHTQSVRGSWKDSQRLRSESFPERS
ALFPm4  GHECKFTVKPYLKRFQVYYKGRMWCPGWTAIR---------------------GEASTRS

ALFPm5  HLSTGGQPVAQLIGLLLYEELSVFSCSWQWKLYHFDFLCFSFQY
ALFPm4  QSGVAGKTAKDFVRKAFQKGLISQQEANQWLSS-----------

126
Fig. 47 Phylogenetic relationships of Crustin in
various shrimp species and newly isolated
Gly-AMPPm1 in Penaeus monodon.
127
CrusPm2   ----------------------------------------------------RFRHEASQ
CrusPm1   ------------------------------------------------------------
CrusPm3   GGGAYGGGLGGGLGGGGVNGGGLGGGLGGGVNGGGLGGGLGGGVNGGGLGGGLGGGVHGG
CrusPm4   -----------------------------------------------------------W
CrusLs    ------------------------------------------------------------
CrusLs2   -------------------------------------GKYRGFGQPLGGLGVPGGGVGVG
CrusLv1   -------------------------------------GKFRGFGRPFGGLGGPGGGVGVG

CrusPm2   FTGFRSHSFRKHLAVVSAHGGRPGARPGGFPAGVPGGFPGGVPGEFPAPHLGGFLSVTAP
CrusPm1   -------------------QSWHGGRPGGFPGGGRP--GGFPGGGRPGGRPGGFPSVTAP
CrusPm3   GLGGGLGGGVNGGGLGGGVHGGGLGGGLGGGLSGGLGGGLGRPGGGLGRPGGGLRPGSRG
CrusPm4   RFRGGVNGGGLGGGLGGGVHGGGLGGGLGGGLSGGLGGGLGRPGGGLGRPGGGLRPGSRG
CrusLs    -------------------------GPGGFSGGVPGGFPGGRPGGFPGGVPGGFPSATAP
CrusLs2   VGGGLGGGLGGGLGGGLGGGLGGGLGGLGGGLGGLGGGLGGGLGGGLGGGLGGGLGGSHG
CrusLv1   GG-------------------------------------FPGGGLGVGGGLGVGGGLGTG

CrusPm2   PATCRRWCRTPEDAVYCCESKYEPEAPVGTKPLDCPRVRDTCPPVRFGGLAP-VTCSSDL
CrusPm1   PASCRRWCETPENAFYCCESRYEPEAPVGTKILDCPKVRDTCPPVRFLAVEQPVPCSSDY
CrusPm3   PSTCRYWCTTPEGKQYCCEDKNEPEIPVGTKPLDCPQVR-TCPRFQGPP----VTCSHDF
CrusPm4   PSTCRYWCTTPEGKQYCCEDKNEPEIPVGTKPLDCPQVRPTCPRFQGPP----VTCSHDF
CrusLs    PATCRRWCKTPENQAYCCETIFEPEAPVGTKPLDCPQVRPTCPPTRFG--GRPVTCSSDY
CrusLs2   TSDCRYWCKTPEGQAYCCESAHEPETPVGTKPLDCPQVRPTCPRFHGPP----TTCSNDY
CrusLv1   TSDCRYWCKTPEGQAYCCESAHEPETPVGTKPLDCPQVRPTCPRFHGPP----TTCSNDY

CrusPm2   KCGGLDKCCFDRCLKEHVCKPPSFYSHFA--
CrusPm1   KCGGLDKCCFDRCLGQHVCKPPSFYEFFA--
CrusPm3   KCAGLDKCCFDRCLGEHVCKPPSFYGRNVKG
CrusPm4   KCAGLDKCCFDRCLGEHVCKPPSFYGRNVKG
CrusLs    KCGGLDKCCFDRCLGEHVCKPPSFYSQFR--
CrusLs2   KCAGLDKCCFDRCLGEHVCKPPSFFGQQIFG
CrusLv1   KCAGLDKCCFDRCLGEHVCKPPSFFGSQVFG
128
DEVELOPMENT OF PCR ARRAYS
A neighbor-joining tree constructed from Pearson
correlation coefficient.
129
Thank you