Title: Two Component Systems Sequence Characteristics Identification in Bacterial Genome
1. Two-Component Systems Sequence Characteristics Identification in Bacterial Genome
- Yaw-Ling Lin
- Dept Computer Sci. Info. Management,
- Providence University, Taichung, Taiwan.
2. Two-Component System
- Two-component systems (2CS)
  - Sensor histidine kinase
  - Response regulator
- The major control machinery that lets bacteria cope with a diverse and often hostile environment
3. 2CS in Pseudomonas aeruginosa PAO1
- Genome: 6.3 Mbp
- Predicted genes: 5,570
- 123 genes were classified as 2CSs.
Stover CK, Pham XQ, Erwin AL, et al. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature. 2000 Aug 31;406(6799):947-8.
http://www.pseudomonas.com/
4-6. 2CS in PAO1
7. 2CS in PAO1
- There are 123 annotated 2CS genes in PAO1.
- Systematically analyze the evolutionary relationships between the sensor kinase and the response regulator of a 2CS.
- Construct phylogenetic trees using Clustal-W for 54 sensor kinases and 59 response regulators.
8. 2CS in PAO1: Sensor Tree
9. 2CS Regulator Tree
10. Subtree Analysis of 2CS
11. Co-evolution Subtree Analysis
Sensor Tree versus Regulator Tree
12. Distance Function
13. Algorithm AGR
14. k-agreement Problem
15. k-agreement Algorithm
- Store at each internal node the number of leaves in its subtree, computed bottom-up: O(n) time
- Bucket sort the internal nodes by leaf count: O(n) time
- Identify the subtrees of T1 with k leaves, say A1 to Ap: O(n) time
- Identify the subtrees of T2 with k leaves, say B1 to Bq: O(n) time
- Cross-check A1 to Ap against B1 to Bq: O(n) time
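The steps above can be sketched in Python. This is only an illustration under stated assumptions, not the algorithm from the slides: the problem statement on slide 14 is not preserved in this text, so the sketch assumes that two k-leaf subtrees "agree" when the sensors under one map exactly onto the regulators under the other. The nested-tuple tree encoding, the `partner` mapping, and all function names are hypothetical. Filtering by leaf count during the bottom-up pass stands in for the bucket sort step.

```python
def leaves(tree):
    """Return the leaf labels below `tree` (a label string or a tuple of subtrees)."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def k_leaf_subtrees(tree, k):
    """Leaf sets of every internal-node subtree that has exactly k leaves."""
    found = []

    def count(node):
        # Bottom-up leaf count; a leaf counts as 1.
        if isinstance(node, str):
            return 1
        n = sum(count(child) for child in node)
        if n == k:
            # Subtrees with exactly k leaves are disjoint, so gathering
            # their leaves costs O(n) in total.
            found.append(frozenset(leaves(node)))
        return n

    count(tree)
    return found


def k_agreement(t1, t2, k, partner):
    """Pairs (A_i, B_j) of k-leaf subtrees whose leaf sets match under `partner`.

    `partner` is an assumed mapping from each leaf of t1 to its linked leaf of t2.
    """
    b_sets = set(k_leaf_subtrees(t2, k))
    result = []
    for a in k_leaf_subtrees(t1, k):
        mapped = frozenset(partner[x] for x in a)
        if mapped in b_sets:  # hashed lookup instead of pairwise comparison
            result.append((sorted(a), sorted(mapped)))
    return result


# Hypothetical toy trees: sensors s1..s4 and their linked regulators r1..r4.
sensor_tree = ((("s1", "s2"), "s3"), "s4")
regulator_tree = (("r2", "r1"), ("r3", "r4"))
partner = {"s1": "r1", "s2": "r2", "s3": "r3", "s4": "r4"}

print(k_agreement(sensor_tree, regulator_tree, 2, partner))
# [(['s1', 's2'], ['r1', 'r2'])]
```

In the toy input only the (s1, s2) and (r1, r2) clades agree for k = 2; the (r3, r4) clade has no counterpart in the sensor tree.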
16. Co-evolution Sensor/Regulator Trees
- 46 paired, clustered sensors and regulators
- Six groups of 2CS genes were identified
- Group A: the 2CS sensor genes PA1335, PA5166 and PA5511 and their linked regulator genes PA1336, PA5165 and PA5512 apparently co-evolved from a common ancestor. The PA5165/PA5166 pair appears to have split off earlier than the PA1335/PA1336 and PA5511/PA5512 pairs.
- Groups B, E and F, each containing two 2CS operons, probably also evolved by gene cluster duplication.
- Group C contains the Bordetella parapertussis bvgAS-like gene clusters PA3044/PA3045 and PA3946/PA3947/PA3948.
- Group D: the gene clusters PA2571/PA2572 and PA2882/PA2881 appear to be perfectly co-evolved.
17-18. Co-evolution Tree Analysis
19. Correlation Analysis
- Does gene duplication tend to occur within a relatively short distance on a bacterial genome?
- Idea: create a dot-matrix plot in which the X-axis is the physical distance, and the Y-axis the evolutionary distance, between the two 2CS being compared.
- Some subset of 2CS, presumably functionally related, could show a correlation between their physical and evolutionary distances.
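A minimal Python sketch of the plot data described above, under stated assumptions: the slides do not define either distance, so physical distance is taken here as the shorter way around a circular chromosome, evolutionary distances are supplied as a lookup table, and Pearson's r is used as one possible correlation measure. The gene names, coordinates, distances, and function names are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def circular_distance(pos_a, pos_b, genome_length):
    """Shorter of the two arcs between two positions on a circular genome."""
    d = abs(pos_a - pos_b)
    return min(d, genome_length - d)


def scatter_points(positions, evo_dist, genome_length):
    """One (physical, evolutionary) point per unordered pair of genes."""
    points = []
    for a, b in combinations(sorted(positions), 2):
        x = circular_distance(positions[a], positions[b], genome_length)
        y = evo_dist[frozenset((a, b))]
        points.append((x, y))
    return points


def pearson(points):
    """Pearson correlation coefficient of the x and y coordinates.

    Assumes neither coordinate is constant (non-zero variance).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: three genes on a 1000 bp circular genome.
positions = {"g1": 100, "g2": 200, "g3": 900}
evo_dist = {
    frozenset(("g1", "g2")): 0.1,
    frozenset(("g1", "g3")): 0.2,
    frozenset(("g2", "g3")): 0.3,
}

points = scatter_points(positions, evo_dist, genome_length=1000)
print(points)            # [(100, 0.1), (200, 0.2), (300, 0.3)]
print(pearson(points))   # close to 1.0 for this perfectly linear toy data
```

The points list is what would be drawn on the dot-matrix plot; the correlation value summarizes it for one chosen subset of genes.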
20. k-correlation Problem
21. k-correlation is NP-complete
- Let M1 be the adjacency matrix of a graph G, and let M2 be a zero matrix.
- If we could solve the k-correlation problem in polynomial time, then the maximum independent set problem would also be solvable in polynomial time.
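The observation behind this reduction can be illustrated in Python. The exact k-correlation objective is not preserved in this text, so the sketch only shows the underlying fact: a set of k vertices is independent in G exactly when the rows and columns of M1 restricted to that set are all zero, that is, when they match the same entries of the zero matrix M2. The brute-force search is exponential and is for illustration only; the function names and the example graph are hypothetical.

```python
from itertools import combinations


def adjacency_matrix(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on vertices 0..n-1."""
    m = [[0] * n for _ in range(n)]
    for u, v in edges:
        m[u][v] = 1
        m[v][u] = 1
    return m


def submatrices_agree(m1, m2, subset):
    """True if m1 and m2 coincide on the rows and columns indexed by `subset`."""
    return all(m1[i][j] == m2[i][j] for i in subset for j in subset)


def independent_sets_of_size(n, edges, k):
    """All k-subsets of vertices on which M1 matches the zero matrix M2."""
    m1 = adjacency_matrix(n, edges)
    m2 = [[0] * n for _ in range(n)]  # the zero matrix from the reduction
    return [s for s in combinations(range(n), k) if submatrices_agree(m1, m2, s)]


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path_edges = [(0, 1), (1, 2), (2, 3)]
print(independent_sets_of_size(4, path_edges, 2))  # [(0, 2), (0, 3), (1, 3)]
print(independent_sets_of_size(4, path_edges, 3))  # []
```

The largest k for which the list is non-empty is the size of a maximum independent set, which is why a polynomial-time procedure for the matrix problem would carry over to independent sets.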
22-23. Conclusion and Future Research
- Identifying novel 2CS in other bacterial genomes as well as in eukaryotic genomes
- Clustering analysis of 2CS for functional prediction of uncharacterized genes
- Co-evolutionary analysis of 2CS