Title: Intersubunit contacts are often facilitated by specificity-determining positions
1Intersubunit contacts are often facilitated by
specificity-determining positions
- Computational identification of protein positions
that possibly account for precise recognition of
the interaction partner
2- Abundance of sequence data
- Little experimental information on protein function -> annotation by homology
- Even less information on protein specificity -> prediction of specificity-determining positions (SDPs)
3SDP (Specificity-Determining Position)
- Alignment position that is conserved within
groups of proteins having the same specificity
(specificity groups) but differs between them
SDP is not equivalent to a functionally important
position!
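The definition above can be illustrated with a minimal sketch. The function name `is_ideal_sdp` and the toy columns are assumptions made for illustration only; the group labels AQP and GLP are borrowed from the alignment example later in the talk. The sketch checks the idealized case, where each group has exactly one residue type and the groups do not all share it. A column conserved across the whole family may well be functionally important, yet it is not an SDP.

```python
from collections import defaultdict


def is_ideal_sdp(column, groups):
    """Check the idealized SDP definition for one alignment column.

    column: residues of one alignment position, one per sequence.
    groups: specificity-group label of each sequence, in the same order.
    """
    residues_by_group = defaultdict(set)
    for residue, group in zip(column, groups):
        residues_by_group[group].add(residue)
    # Conserved within every specificity group.
    conserved_within = all(len(r) == 1 for r in residues_by_group.values())
    # The conserved residue is not the same in all groups.
    consensus = {next(iter(r)) for r in residues_by_group.values()}
    return conserved_within and len(consensus) > 1


# Toy data (hypothetical): three sequences per specificity group.
groups = ["AQP", "AQP", "AQP", "GLP", "GLP", "GLP"]
print(is_ideal_sdp("FFFGGG", groups))  # True: conserved within, differs between
print(is_ideal_sdp("GGGGGG", groups))  # False: conserved in the whole family
print(is_ideal_sdp("FLFGAG", groups))  # False: variable within groups
```

Real alignments rarely contain such clean columns, which is why a graded score is needed rather than a yes/no test.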
4What can we infer from SDPs?
- Targets for protein functional redesign
- Specificity signature
- Sites of protein-protein interaction
5Talk overview
- SDPpred, an algorithm for identification of SDPs
- A studied example: isocitrate/isopropylmalate dehydrogenases
- Link to PPI
6SDPpred
- Multiple protein alignment divided into
specificity groups
[Figure: example alignment with two specificity groups, AQP (sp|Q9L772|AQPZ_BRUME) and GLP (sp|P11244|GLPF_ECOLI), passed to SDPpred]
SDPs: positions best discriminating between specificity groups
7What is in the black box: the algorithm
- Mutual information Ip reflects the extent to which an alignment position tends to be an SDP.
- Statistical significance of Ip.
- Expected mutual information Ipexp of an alignment column.
- Z-score.
- (Mirny & Gelfand, 2002, J Mol Biol, 321(1))
- Are 5 SDPs with Z-score >10.5 better than 10 SDPs with Z-score >9.0? Bernoulli estimator for selection of the proper number of SDPs.
- Smoothed amino acid frequencies: a leucine is more a methionine than a valine, and any arginine has a dash of lysine.
- f_p(a, i): ratio of occurrences of amino acid a in group i in position p to the height of the alignment column
- f_p(a): frequency of amino acid a in position p
- f(i): fraction of proteins in group i
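The quantities on this slide can be sketched as follows. This is a simplified illustration, not the SDPpred implementation: it uses the standard mutual information between residue and group, I_p = sum over a, i of f_p(a,i) * log(f_p(a,i) / (f_p(a) * f(i))), with raw rather than smoothed frequencies, and it assumes that the expected value Ipexp and its spread are estimated by shuffling the residues of the column. The function names and the toy columns are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(column, groups):
    """Mutual information between residue identity and group label."""
    n = len(column)
    pair_counts = Counter(zip(column, groups))  # counts for f_p(a, i)
    residue_counts = Counter(column)            # counts for f_p(a)
    group_counts = Counter(groups)              # counts for f(i)
    mi = 0.0
    for (a, i), count in pair_counts.items():
        f_ai = count / n
        f_a = residue_counts[a] / n
        f_i = group_counts[i] / n
        mi += f_ai * math.log(f_ai / (f_a * f_i))
    return mi


def z_score(column, groups, n_shuffles=1000, seed=0):
    """Z-score of the observed mutual information against shuffled columns.

    Assumption: the expectation is estimated empirically by permuting the
    residues of the column while keeping the group labels fixed.
    """
    rng = random.Random(seed)
    observed = mutual_information(column, groups)
    residues = list(column)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        shuffled.append(mutual_information(residues, groups))
    mean = sum(shuffled) / n_shuffles
    variance = sum((x - mean) ** 2 for x in shuffled) / n_shuffles
    if variance == 0:
        return 0.0  # e.g. a fully conserved column: shuffling changes nothing
    return (observed - mean) / math.sqrt(variance)


# Toy data (hypothetical): two specificity groups of three sequences each.
groups = ["AQP", "AQP", "AQP", "GLP", "GLP", "GLP"]
print(round(mutual_information("FFFGGG", groups), 3))  # 0.693, i.e. log 2
print(mutual_information("GGGGGG", groups))            # 0.0
print(z_score("FFFGGG", groups) > 0)                   # True
```

A perfectly discriminating column reaches the maximum mutual information for two equal groups (log 2), while a fully conserved column scores zero regardless of its functional importance.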
8Other similar techniques
- Evolutionary trace (Lichtarge et al. 1996, 1997)
- Evolutionary rate shifts (Gaucher et al. 2002)
- Surface patches of slowly evolving residues (Rate4Site, Pupko et al. 2002)
- PCA in sequence space (Casari et al. 1999, del Sol Mesa et al. 2003)
- Correlated mutations (Pazos and Valencia, 2002)
- Prediction of functional sub-types (Hannenhalli and Russell, 2000) and identification of PSDR (Mirny and Gelfand, 2002)
9Special features of SDPpred
- Smoothed amino acid frequencies make it possible to account for functional (structural, chemical, evolutionary, ...) similarities among amino acids
- Automatic cutoff setting -> no prior knowledge about the protein family is needed
- Does not require a 3D structure -> structural data are used solely for interpretation and verification of results
- Kalinina OV, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) Protein Sci 13(2): 443-56
- Kalinina OV, Novichkov PS, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) Nucl Acids Res 32(Web Server issue): W424-8
- http://math.belozersky.msu.ru/psn/
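The automatic cutoff can be sketched as below. This is one plausible reading of a Bernoulli estimator, not a statement of how SDPpred implements it: the Z-scores are sorted in descending order, and for each k the sketch computes the probability that at least k of the n columns would reach the k-th Z-score by chance, assuming independent, standard-normal Z-scores under the null model. The k with the smallest probability is taken as the number of SDPs. All function names and the toy Z-scores are hypothetical.

```python
import math


def normal_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def prob_at_least(k, n, q):
    """Binomial probability of at least k successes in n trials."""
    return sum(
        math.comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(k, n + 1)
    )


def choose_cutoff(z_scores):
    """Return (k, top-k Z-scores) for the least probable chance outcome."""
    ranked = sorted(z_scores, reverse=True)
    n = len(ranked)
    best_k, best_p = 0, 1.0
    for k in range(1, n + 1):
        q = normal_tail(ranked[k - 1])  # chance that one column reaches Z(k)
        p = prob_at_least(k, n, q)      # chance that k or more columns do
        if p < best_p:
            best_k, best_p = k, p
    return best_k, ranked[:best_k]


# Toy Z-scores (hypothetical): three clear outliers among ten columns.
z_scores = [10.8, 10.6, 10.5, 3.0, 1.2, 0.8, 0.3, -0.1, -0.5, -1.0]
k, selected = choose_cutoff(z_scores)
print(k, selected)  # 3 [10.8, 10.6, 10.5]
```

The design point is that the threshold is derived from the score distribution of the family itself, so no family-specific Z-score cutoff has to be supplied by the user.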
10Example: isocitrate/isopropylmalate dehydrogenases (IDH/IMDH)
- IDH catalyzes the oxidation of isocitrate to α-ketoglutarate and CO2 (TCA cycle), using either NAD or NADP as a cofactor in different organisms, from bacteria to higher eukaryotes
- IMDH catalyzes the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate (leucine biosynthesis) in bacteria and fungi
11IDH/IMDH: combinations of specificities towards substrate and cofactor
- NAD-dependent IDHs
- NADP-dependent IDHs from bacteria and archaea (type I)
- NADP-dependent IDHs from eukaryota (type II)
- NAD-dependent IMDH
[Figure labels (taxonomic distribution): Eukaryota; Archaea, Bacteria, Eukaryota; Mitochondria; Archaea, Bacteria]
12IDH/IMDH: selecting specificity groups
- All NAD-dependent vs. all NADP-dependent
- All IDHs vs. all IMDHs
- Four groups
[Figure labels, repeated for each of the three groupings: IDH (NADP) type II; IDH (NAD); IMDH (NAD); IDH (NADP) type I]
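The three groupings above amount to three different labelings of the same alignment. A minimal sketch, using the four subfamilies named on the slide; the data structure and the helper `assign_groups` are assumptions made for illustration:

```python
# The four subfamilies from the slide, annotated by substrate and cofactor.
families = [
    {"name": "IDH (NADP) type I", "substrate": "IDH", "cofactor": "NADP"},
    {"name": "IDH (NADP) type II", "substrate": "IDH", "cofactor": "NADP"},
    {"name": "IDH (NAD)", "substrate": "IDH", "cofactor": "NAD"},
    {"name": "IMDH (NAD)", "substrate": "IMDH", "cofactor": "NAD"},
]


def assign_groups(families, key):
    """Map each subfamily to a specificity-group label chosen by `key`."""
    return {family["name"]: key(family) for family in families}


# All NAD-dependent vs. all NADP-dependent.
by_cofactor = assign_groups(families, lambda f: f["cofactor"])
# All IDHs vs. all IMDHs.
by_substrate = assign_groups(families, lambda f: f["substrate"])
# Four groups: every subfamily is its own specificity group.
four_groups = assign_groups(families, lambda f: f["name"])

for label, grouping in [
    ("cofactor", by_cofactor),
    ("substrate", by_substrate),
    ("four groups", four_groups),
]:
    print(label, len(set(grouping.values())))
```

Each labeling would then be supplied, together with the same alignment, as a separate input to the SDP prediction, which is why different groupings yield different sets of SDPs.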
13IDH/IMDH: predicted SDPs (cofactor-specific)
[Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, SDPs, subunit I, subunit II]
14IDH/IMDH: predicted SDPs (substrate-specific)
[Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, SDPs, subunit I, subunit II]
15IDH/IMDH: predicted SDPs (four groups)
[Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, SDPs, subunit I, subunit II]
16IDH/IMDH: predicted SDPs (overview)
17IDH/IMDH: SDPs predicted for different groupings
All NAD-dependent vs. all NADP-dependent -> cofactor-specific SDPs
All IDHs vs. all IMDHs -> substrate-specific SDPs
208Arg
337Ala
100Lys
300Ala
105Thr
341Thr
229His
154Glu
103Leu
233Ile
97Val
158Asp
115Asn
305Asn
308Tyr
98Ala
155Asn
231Gly
327Asn
287Gln
344Lys
164Glu
345Tyr
351Val
241Phe
38Gly
40Asp
104Thr
Color code: contacts substrate; contacts cofactor; contacts the other subunit; contacts substrate AND cofactor; contacts substrate AND the other subunit
107Val
152Phe
161Ala
232Asn
245Gly
323Ala
31Tyr
36Gly
162Gly
Four groups
45Met
18IDH/IMDH: SDPs in contact with cofactor
Substrate (isocitrate)
100Lys, 104Thr, 105Thr, 107Val, 337Ala, 341Thr: substrate-specific and four-group SDPs, functionally not characterized
Cofactor (NADP)
Nicotinamide nucleotide
Adenine nucleotide
344Lys, 345Tyr, 351Val: cofactor-specific SDPs, known determinants of cofactor specificity
NADP-dependent IDH from E. coli (1ai2)
19Clusters of SDPs on the intersubunit contact
surface
Cluster II
Cluster I
20... and in other protein families
- The LacI family of bacterial transcription factors
- Bind specific operator sequences upon interaction with effector molecules, mainly various sugars
Cluster I
Effector
Cluster II
DNA operator
LacI (lactose repressor) from E. coli (1jwl)
21- Bacterial membrane transporters from the MIP family
- Water and glycerol/water channels
Cluster II
Cluster I
Substrate (glycerol)
Subunit I
GlpF (glycerol facilitator) from E. coli (1fx8)
22Conclusions
- SDPpred, a method for identification of amino acids that account for differences in protein specificity
- Results obtained for several protein families of different functional types agree with structural and experimental data
- A substantial fraction of SDPs are located on the intersubunit contact interface, where they form distinct spatial clusters
23- Olga V. Kalinina
- Pavel S. Novichkov
- Andrey A. Mironov
- Mikhail S. Gelfand
- Aleksandra B. Rakhmaninova
- Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia
- Institute for Information Transmission Problems RAS, Moscow, Russia
- State Scientific Center GosNIIGenetika, Moscow, Russia
- Acknowledgements
- Leonid A. Mirny
- Olga Laikova
- Vsevolod Makeev
- Roman Sutormin
- Shamil Sunyaev
- Aleksey Finkelstein
Thank you!