Title: DNA sequence evolution in Sunflower and Lettuce
1. DNA sequence evolution in Sunflower and Lettuce
Thesis capstone report
Advisors: Dr. Loren Rieseberg, Dr. Sun Kim
Major: Bioinformatics
Date: 07/16/2004
2. Background
- Sunflower and lettuce represent two major subfamilies of the Compositae family, which is one of the largest and most diverse families of flowering plants.
- Sunflower is an important oil seed crop, and domestication and breeding have focused on seed traits.
- Lettuce is an important leaf crop, and domestication and breeding have focused on leaf traits.
- An extensive lettuce and sunflower EST database (CGPDB) is available.
3. Background
- Examination of DNA differences between closely related species of Compositae will provide insight into the nature of mutational rates and processes in this family.
- Hypothesis: genes associated with primary domestication traits (seeds in sunflower and leaves in lettuce) will evolve faster than genes expressed in other tissues.
- Hypothesis: upstream enzymes in metabolic pathways will evolve less rapidly than downstream enzymes.
4. Goals
- Compare the distribution of indels and base substitutions among closely related lettuce and sunflower EST sequences.
- Compare rates of EST sequence divergence for genes from different tissue types.
- Compare rates of EST sequence divergence among different pathways, and rates of protein evolution among specific genes along major metabolic pathways.
5. Data
http://cgpdb.ucdavis.edu
6. CGPDB
- Contains about 112,000 individual ESTs sequenced from both sunflower and lettuce.
- Sunflower: about 44,000 individual ESTs, previously assembled into 4430 unique contigs, were sequenced from two Helianthus annuus cultivars, RHA801 (exotic) and RHA280 (oil).
- Lettuce: around 68,000 ESTs, previously assembled into 8179 unique contigs (genes), were sequenced from two species, Lactuca serriola (wild) and L. sativa (cultivated).
7. Goals
- Compare the distribution of indels and base substitutions among closely related lettuce and sunflower EST sequences.
- Compare rates of EST sequence divergence for genes from different tissue types and metabolic pathways.
- Compare rates of EST sequence divergence among different pathways, and rates of protein evolution among specific genes along major metabolic pathways.
8. Data Analysis: Example from sunflower
(Figure labels: Genotype 1, Genotype 2)
9. Data Analysis: Comparison of complete EST sequence
10. Data Analysis: Comparison of coding region only
(Figure labels: sun_vs_ath_TIGR_unique, lettuce_vs_ath_TIGR_unique)
11. Result: Sequence information for assembling contigs and consensus sequences
12. Result: Comparison of complete EST sequences between two sunflower and two lettuce genotypes
13. Result: Comparison of coding region only
14. Conclusion 1
- Substitutions are 3-6 times more frequent than indels in both sunflower and lettuce, regardless of whether coding regions or complete EST sequences are analyzed.
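The substitution and indel counts behind this conclusion can be illustrated with a minimal sketch. It assumes the two genotype sequences have already been aligned (gaps written as "-") and that a run of consecutive gap columns in one sequence counts as a single indel event; the slides do not state how indels were scored, so both choices are assumptions, and the example sequences are hypothetical.

```python
def count_differences(seq_a, seq_b, gap="-"):
    """Count base substitutions and indel events in a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    substitutions = 0
    indels = 0
    open_gap = None  # which sequence ("a" or "b") carries the current gap run
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # column that is a gap in both sequences; ignore it
        if a == gap or b == gap:
            side = "a" if a == gap else "b"
            if open_gap != side:
                indels += 1  # a new gap run starts here (assumed: one event)
            open_gap = side
            continue
        open_gap = None
        # Ambiguous bases (N) are skipped rather than counted as substitutions.
        if a != b and a != "N" and b != "N":
            substitutions += 1
    return substitutions, indels


# Hypothetical aligned fragments from two genotypes, for illustration only.
example_counts = count_differences("ACGTTGCA--TTAGC", "ACCTTG-AGGTTCGC")
print(example_counts)
```

Dividing each count by the aligned length would give per-site rates that can be compared between complete EST sequences and coding regions.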
16. Data Analysis: EST divergence for different tissue types
- Lettuce
- TAG0 - callus - "cls"
- TAG1 - roots - "rot"
- TAG2 - none (leaf) - "not"
- TAG3 - flowers pre-fert - "flr"
- TAG4 - flowers post-fert - "flo"
- TAG5 - chemical induction - "chi"
- TAG6 - none - "nos"
- TAG7 - roots env stress - "rts"
- TAG8 - shoots env stress - "shs"
- TAG9 - germinating seeds - "gsd"
- TAG10 - flowers env stress - "fls"
- TAG11 - leaves dark grow - "lvd"
- Tag_1_7 - all root-related contigs
- Tag_3_4_10 - all flower-related contigs
- Tag_7_8_10 - all contigs related to environmental stress
- Sunflower
- TAG0 - callus - "cls"
- TAG1 - roots - "rot"
- TAG2 - disk ray flowers - "drf"
- TAG3 - flowers pre-fert - "flr"
- TAG4 - developing kernel - "dkn"
- TAG5 - chemical induction - "chi"
- TAG6 - none - "nos"
- TAG7 - roots env stress - "rts"
- TAG8 - shoots env stress - "shs"
- TAG9 - germinating seeds - "gsd"
- TAG10 - flowers env stress - "fls"
- TAG11 - hulls - "hls"
- Tag_1_7 - all root-related contigs
- Tag_3_10 - all flower-related contigs
- Tag_7_8_10 - all contigs related to environmental stress
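The tag groupings above can be expressed as sets. The sketch below labels a contig with a tag or tag group when every EST in the contig comes from libraries in that set. The slides do not state the exact rule used to call a contig tissue-specific, so this rule is an assumption, and `classify_contig` is a hypothetical helper name.

```python
# Tag groupings copied from the lettuce list above.
LETTUCE_GROUPS = {
    "Tag_1_7": {1, 7},         # root-related
    "Tag_3_4_10": {3, 4, 10},  # flower-related
    "Tag_7_8_10": {7, 8, 10},  # environmental stress
}


def classify_contig(est_tags, groups):
    """Return the labels whose tag set contains every EST tag of the contig."""
    tags = set(est_tags)
    if not tags:
        return []
    labels = []
    if len(tags) == 1:
        # All ESTs come from a single library type, e.g. TAG7.
        labels.append(f"TAG{next(iter(tags))}")
    for name, members in groups.items():
        if tags <= members:  # assumed rule: all ESTs fall inside the group
            labels.append(name)
    return labels


# Hypothetical contigs, described by the TAG numbers of their member ESTs.
root_contig = classify_contig([1, 7, 7], LETTUCE_GROUPS)
stress_root_contig = classify_contig([7], LETTUCE_GROUPS)
mixed_contig = classify_contig([1, 3], LETTUCE_GROUPS)
print(root_contig, stress_root_contig, mixed_contig)
```

For sunflower, the same function would be called with a groups dictionary that uses `Tag_3_10` in place of `Tag_3_4_10`.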
17. Result: Number of tissue-specific contigs in sunflower
18. Result: Number of tissue-specific contigs in lettuce
(Bar chart: Lettuce TAG-specific contig information. Vertical axis: contigs with coding region found in both genotypes, 0 to 800. Horizontal axis: Tissue and Treatment, with categories TAG0(cls), TAG1(rot), TAG2(no), TAG3(flr), TAG4(flo), TAG5(chi), TAG6(nos), TAG7(rts), TAG8(shs), TAG9(gsd), TAG10(fls), TAG11(lvd), TAG_1_7(root), TAG_3_4_10(flower), TAG_7_8_10(stress). Bar values as extracted, unmatched to categories: 738, 336, 278, 140, 85, 78, 69, 60, 30, 29, 18, 14, 11, 0, 0.)
19. Result: Rates of sequence divergence among tissue-specific contigs in sunflower and lettuce
20. Result: Comparison of rates of sequence divergence for genes expressed in seeds versus other tissues
(Bar chart: indel rate and substitution rate for Seeds (DknHls) versus Other (Non-DknHls). Vertical axis: Content, 0.00 to 16.00.)
- T-test, SubRateKH vs. SubRateNonKH: P-value = 0.0009414
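The seed-versus-other comparison rests on a two-sample t-test of per-contig substitution rates. The slide does not say which variant of the test was used, so the sketch below assumes Welch's unequal-variance form and computes only the t statistic and approximate degrees of freedom; converting these to a P-value needs a t-distribution table or a statistics package. The rate values are hypothetical and are not the study's data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical per-contig substitution rates, for illustration only.
seed_rates = [9.0, 11.0, 10.0, 12.0, 8.0]
other_rates = [5.0, 6.0, 4.0, 7.0, 3.0]
t_stat, dof = welch_t(seed_rates, other_rates)
print(t_stat, dof)
```

A large positive t statistic here would correspond to seed-expressed contigs having the higher mean substitution rate.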
21. Result: Rates of sequence divergence among treatment-specific contigs in sunflower and lettuce
22. Conclusion 2
- As predicted, sunflower genes expressed in seeds evolve significantly faster than genes expressed in other tissues. Artificial selection for large seeds and high seed oil content may contribute to these higher rates.
- For lettuce, there are no significant differences in rates of sequence evolution among different tissues.
- No differences were found in sunflower or lettuce among biotic and abiotic stress treatments.
24. Data Analysis: EST divergence among metabolic pathways
- To identify contigs for specific pathways, the metabolic pathway information from the TAIR (The Arabidopsis Information Resource, http://www.arabidopsis.org/) database was utilized.
- Each contig in the CGPDB was assigned to an Arabidopsis gene locus (or remained unassigned) based on the BLAST results.
- Genes (contigs) for different metabolic pathways were clustered and protein divergence was estimated.
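The contig-to-locus assignment step can be sketched as a best-hit filter over BLAST tabular output (12 tab-separated columns, with the E-value in column 11 and the bit score in column 12). The E-value cutoff, the function name `best_hits`, and the contig and locus identifiers in the example are all assumptions for illustration; the slides do not give the criteria actually used.

```python
def best_hits(blast_lines, max_evalue=1e-10):
    """Map each query contig to its highest-scoring subject below the cutoff.

    Contigs with no hit passing the cutoff are left out (unassigned).
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak under the assumed cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical BLAST tabular rows (identifiers are made up).
example_rows = [
    "contig_1\tAT1G00001\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t250.0",
    "contig_1\tAT2G00002\t70.0\t150\t45\t0\t1\t150\t1\t150\t1e-20\t120.0",
    "contig_2\tAT3G00003\t60.0\t80\t32\t0\t1\t80\t1\t80\t0.5\t30.0",
]
assignments = best_hits(example_rows)
print(assignments)
```

Once each contig carries a locus ID, grouping contigs by the pathways annotated for that locus in TAIR is a dictionary lookup.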
25. Data Analysis: Protein evolution along major metabolic pathways
- Metabolic pathways:
  - Lipid metabolic pathways
  - Phenylpropanoid biosynthetic pathways
  - Cellulose, lignin, sucrose, etc. metabolic pathways
- The nonsynonymous substitution rate (Ka) was calculated for enzymes in different positions along pathways.
- DnaSP 4.0 software was utilized for this calculation.
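To make the synonymous versus nonsynonymous distinction concrete, the sketch below classifies codon differences between two aligned, in-frame coding sequences using the standard genetic code. It is deliberately cruder than a Ka estimate: it counts differing codons rather than substitutions per nonsynonymous site, applies no multiple-hit correction, and is not the calculation DnaSP performs. The function name and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_codon_changes(cds_a, cds_b):
    """Count differing codons that keep vs. change the encoded amino acid."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # skip codons with gaps or ambiguous bases
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical coding fragments: GCT->GCC keeps Ala, AAA->AGA changes Lys to Arg.
codon_counts = classify_codon_changes("ATGGCTAAA", "ATGGCCAGA")
print(codon_counts)
```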
26. Result: Comparison of metabolic pathway genes between CGPDB and TAIR
- Based on the BLAST results, the contigs in CGPDB were compared with genes in TAIR and assigned to appropriate pathways.
- Currently there are 186 pathways with more than 800 unique reactions in the TAIR database. For these reactions, 1144 unique locus IDs were assigned to the enzymes involved.
- Among the TAIR loci, 72.1% match contigs in the sunflower database and 83.15% match contigs in the lettuce database.
27. Result: Rates of sequence evolution for sunflower metabolic pathway-specific contigs
28. Result: Rates of sequence evolution for lettuce metabolic pathway-specific contigs
29. Result: Nonsynonymous substitution rate (Ka) for genes along four metabolic pathways in sunflower and lettuce
30. Conclusion 3
- Rates of sequence divergence did not differ among metabolic pathways.
- Rates of protein evolution (Ka) did not vary along metabolic pathways (i.e., upstream genes evolved at the same rate as downstream genes).
31. Summary
- Substitutions are much more frequent than indels in both sunflower and lettuce.
- Sunflower genes expressed in seeds evolve significantly faster than genes expressed in other tissues.
- There are no significant differences in rates of sequence evolution among different tissues in lettuce.
- Rates of sequence divergence did not vary significantly either among or along metabolic pathways in either sunflower or lettuce.
32. Acknowledgments
- Thanks to:
- Dr. Loren Rieseberg
- Dr. Sun Kim
- Dr. Sheri Church
- Dr. Zhao Lai