Title: Evolution of bacterial regulatory systems
1Evolution of bacterial regulatory systems
- Mikhail Gelfand
- Research and Training Center Bioinformatics
- Institute for Information Transmission Problems
- Moscow, Russia
CASB-20, UCSD, La Jolla, 13-14.III.2009
2Plan
- Co-evolution of transcription factors and their binding motifs
- Evolution of regulatory systems and regulons
3Regulators and their motifs
- Cases of motif conservation at surprisingly large distances
- Subtle changes at close evolutionary distances
- Correlation between contacting nucleotides and amino acid residues
4NrdR (regulator of ribonucleotide reductases and
some other replication-related genes):
conservation at large distances
5DNA motifs and protein-DNA interactions
Entropy at aligned sites and the number of
contacts (heavy atoms in a base pair at a
distance < cutoff from a protein atom)
CRP
PurR
IHF
TrpR
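The entropy side of this comparison can be sketched as follows. This is a minimal illustration, assuming an ungapped alignment of equal-length binding sites and plain Shannon entropy in bits; the function names and the toy sites are hypothetical and are not taken from the talk. The contact counts would come separately from a solved protein-DNA structure.

```python
import math
from collections import Counter


def column_entropy(sites, column):
    """Shannon entropy (in bits) of one column of an ungapped site alignment."""
    counts = Counter(site[column] for site in sites)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(sites):
    """Entropy of every column; low entropy means a conserved position."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    return [column_entropy(sites, i) for i in range(length)]


# Toy aligned sites, invented for illustration only.
toy_sites = ["TGTGA", "TGTGA", "TGCGA", "TGAGT"]
profile = entropy_profile(toy_sites)
```

Positions with many protein contacts would then be expected to show low values in `profile`; comparing the two series is left to whatever correlation measure fits the data.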
6The CRP/FNR family of regulators
7Correlation between contacting nucleotides and
amino acid residues
- CooA in Desulfovibrio spp.
- CRP in Gamma-proteobacteria
- HcpR in Desulfovibrio spp.
- FNR in Gamma-proteobacteria
Contacting residues: REnnnR
TG: 1st arginine
GA: glutamate and 2nd arginine
DD COOA  ALTTEQLSLHMGATRQTVSTLLNNLVR
DV COOA  ELTMEQLAGLVGTTRQTASTLLNDMIR
EC CRP   KITRQEIGQIVGCSRETVGRILKMLED
YP CRP   KXTRQEIGQIVGCSRETVGRILKMLED
VC CRP   KITRQEIGQIVGCSRETVGRILKMLEE
DD HCPR  DVSKSLLAGVLGTARETLSRALAKLVE
DV HCPR  DVTKGLLAGLLGTARETLSRCLSRMVE
EC FNR   TMTRGDIGNYLGLTVETISRLLGRFQK
YP FNR   TMTRGDIGNYLGLTVETISRLLGRFQK
VC FNR   TMTRGDIGNYLGLTVETISRLLGRFQK
TGTCGGCnnGCCGACA
TTGTGAnnnnnnTCACAA
TTGTgAnnnnnnTcACAA
TTGATnnnnATCAA
8The correlation holds for other factors in the
family
9The LacI family: subtle changes in motifs at
close distances
10The LacI family: systematic analysis
- 1369 DNA-binding domains in 200 orthologous rows: <Id> = 35%, <L> = 71 aa
- 4484 binding sites: L = 20 nt, <Id> = 45%
- Calculate mutual information between columns of TF and site alignments
- Set threshold on mutual information of correlated pairs
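The mutual information step can be sketched as below. This is a minimal illustration, assuming the data are arranged as one row per (TF, site) pair so that `proteins[k]` is the aligned DNA-binding domain of the factor that binds `sites[k]`; the function names, the toy alignments and the threshold value are hypothetical. No small-sample correction or significance estimate is included, and the way the threshold was actually chosen is not stated on the slide.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (bits) between two equally long columns of symbols."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    mi = 0.0
    for (x, y), c in pair_counts.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (x_counts[x] * y_counts[y]))
    return mi


def correlated_pairs(proteins, sites, threshold):
    """Return (protein column, site column, MI) for pairs with MI >= threshold."""
    pairs = []
    for i in range(len(proteins[0])):
        prot_col = [p[i] for p in proteins]
        for j in range(len(sites[0])):
            site_col = [s[j] for s in sites]
            mi = mutual_information(prot_col, site_col)
            if mi >= threshold:
                pairs.append((i, j, mi))
    return pairs


# Toy paired alignments, invented for illustration only.
toy_proteins = ["SL", "SL", "TQ", "TQ"]
toy_sites = ["AG", "AG", "CG", "CG"]
toy_pairs = correlated_pairs(toy_proteins, toy_sites, threshold=0.5)
```

In the toy data both protein columns co-vary perfectly with the first site column, while the invariant second site column carries no information, so only pairs involving site column 0 pass the threshold.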
11Definitions
Protein alignment
12Correlated pairs
13Higher order correlations
-ATIKDVAKRANVSTTTV-
AATTGTGAGCGCTCACT
SL
SQ
TL
TQ
14Not a phylogenetic trace
15NrtR (regulator of NAD metabolism)
16Comparison with the recently solved structure:
correlated positions indeed bind the DNA (more
exactly, form a hydrophobic cluster)
17Catalog of events
- Expansion and contraction of regulons
- New regulators (where from?)
- Duplications of regulators with or without regulated loci
- Loss of regulators with or without regulated loci
- Re-assortment of regulators and structural genes
- especially in complex systems
- Horizontal transfer
18Regulon expansion, or how FruR has become CRA
- CRA (a.k.a. FruR) in Escherichia coli
- global regulator
- well-studied in experiment (many regulated genes known)
- Going back in time: looking for candidate CRA/FruR sites upstream of
(orthologs of) genes known to be regulated in E. coli
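One common way to look for candidate sites upstream of orthologous genes is to scan with a position weight matrix built from known sites. The sketch below assumes that approach; the talk does not say which scoring scheme was used. All names, the pseudocount, the uniform background, the threshold and the toy sequences are hypothetical, and the toy sites are not real CRA/FruR sites.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length known sites."""
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2((column.count(b) + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(pwm, upstream, threshold):
    """Score every window on both strands; keep windows with score >= threshold.

    Positions on the '-' strand are given in reverse-complement coordinates.
    """
    width = len(pwm)
    hits = []
    for strand, seq in (("+", upstream), ("-", reverse_complement(upstream))):
        for pos in range(len(seq) - width + 1):
            word = seq[pos:pos + width]
            if any(b not in BASES for b in word):
                continue  # skip windows with ambiguous characters such as N
            score = sum(col[b] for col, b in zip(pwm, word))
            if score >= threshold:
                hits.append((pos, strand, word, score))
    return hits


# Toy training sites and upstream region, invented for illustration only.
toy_known_sites = ["GCTGAA", "GCTGAA", "GCTGAT"]
toy_pwm = build_pwm(toy_known_sites)
toy_hits = scan(toy_pwm, "TTTGCTGAACCC", threshold=5.0)
```

Running the same scan over the upstream regions of each ortholog in progressively more distant genomes would give the per-lineage presence or absence of candidate sites that the following slides summarize.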
19Common ancestor of gamma-proteobacteria
[Figure (labels only): metabolic map. Substrates: mannose, glucose, mannitol, fructose. Genes: ptsHI-crr, manXYZ, edd, epd, eda, adhE, aceEF, icdA, ppsA, pykF, mtlD, mtlA, pckA, gpmA, pgk, gapA, fbp, pfkA, aceA, tpiA, fruK, fruBA, aceB. Legend: Gamma-proteobacteria.]
20Common ancestor of the Enterobacteriales
[Figure (labels only): metabolic map. Substrates: mannose, glucose, mannitol, fructose. Genes: ptsHI-crr, manXYZ, edd, epd, eda, adhE, aceEF, icdA, ppsA, pykF, mtlD, mtlA, pckA, gpmA, pgk, gapA, fbp, pfkA, aceA, tpiA, fruK, fruBA, aceB. Legend: Gamma-proteobacteria; Enterobacteriales.]
21Common ancestor of Escherichia and Salmonella
[Figure (labels only): metabolic map. Substrates: mannose, glucose, mannitol, fructose. Genes: ptsHI-crr, manXYZ, edd, epd, eda, adhE, aceEF, icdA, ppsA, pykF, mtlD, mtlA, pckA, gpmA, pgk, gapA, fbp, pfkA, aceA, tpiA, fruK, fruBA, aceB. Legend: Gamma-proteobacteria; Enterobacteriales; E. coli and Salmonella spp.]
22Regulation of amino acid biosynthesis in the
Firmicutes
- Interplay between regulatory RNA elements and transcription factors
- Expansion of T-box systems (normally RNA structures regulating
aminoacyl-tRNA-synthetases)
23Recent duplications and bursts: ARG-T-box in
Clostridium difficile
24... caused by loss of transcription factor AhrC
25Duplications and changes in specificity:
ASN/ASP/HIS T-boxes
26Blow-up 1
27Blow-up 2. Prediction
- Regulators lost in lineages with expanded
HIS-T-box regulon??
28... and validation
- conserved motifs upstream of HIS biosynthesis genes
- candidate transcription factor yerC co-localized with the his genes
- present only in genomes with the motifs upstream of the his genes
- genomes with neither YerC motif nor HIS-T-boxes: attenuators
Bacillales (his operon)
Clostridiales, Thermoanaerobacteriales, Halanaerobiales, Bacillales
29The evolutionary history of the his genes
regulation in the Firmicutes
30T-boxes Summary / History
31Life without Fur
32Regulation of iron homeostasis (the Escherichia
coli paradigm)
- Iron
- essential cofactor (limiting in many environments)
- dangerous at large concentrations
- FUR (responds to iron)
- synthesis of siderophores
- transport (siderophores, heme, Fe2+, Fe3+)
- storage
- iron-dependent enzymes
- synthesis of heme
- synthesis of Fe-S clusters
- Similar in Bacillus subtilis
33Regulation of iron homeostasis in α-proteobacteria
- Experimental studies
- FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium
- RirA (Rrf2 family): Rhizobium and Sinorhizobium
- Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella
34Distribution of transcription factors in genomes
Search for candidate motifs and binding sites
using standard comparative genomic techniques
35Regulation of genes in functional subsystems
Rhizobiales
Bradyrhizobiaceae
Rhodobacteriales
The Zoo (likely ancestral state)
36Reconstruction of history
Frequent co-regulation with Irr
Strict division of function with Irr
Appearance of the iron-Rhodo motif
37All logos and Some Very Tempting Hypotheses
- Cross-recognition of FUR and IscR motifs in the ancestor.
- When FUR had become MUR, and IscR had been lost in Rhizobiales, emerging
RirA (from the Rrf2 family, with a rather different general consensus)
took over their sites.
- Iron-Rhodo boxes are recognized by IscR: directly testable.
38Summary and open problems
- Regulatory systems are very flexible
- easily lost
- easily expanded (in particular, by duplication)
- may change specificity
- rapid turnover of regulatory sites
- With more stories like these, we can start thinking about a general theory
- catalog of elementary events; how frequent?
- mechanisms (duplication, birth e.g. from enzymes, horizontal transfer)
- conserved (regulon cores) and non-conserved (marginal regulon members)
genes in relation to metabolic and functional subsystems/roles
- (TF family-specific) protein-DNA recognition code
- distribution of TF families in genomes, distribution of regulon sizes, etc.
39People
- Andrei A. Mironov: software, algorithms
- Alexandra Rakhmaninova: SDP, protein-DNA correlations
- Anna Gerasimova (now at LBNL): NadR
- Olga Kalinina (on loan to EMBL): SDP
- Yuri Korostelev: protein-DNA correlations
- Olga Laikova: LacI
- Dmitry Ravcheev: CRA/FruR
- Dmitry Rodionov (on loan to Burnham Institute): iron etc.
- Alexei Vitreschak: T-boxes and riboswitches
- Andy Johnston (U. of East Anglia): experimental validation (iron)
- Leonid Mirny (MIT): protein-DNA, SDP
- Andrei Osterman (Burnham Institute): experimental validation
- Howard Hughes Medical Institute
- Russian Foundation of Basic Research
- Russian Academy of Sciences, program Molecular
and Cellular Biology