Title: About the use of Protein Models
1About the use of Protein Models
- Jan 28 2003
- University of Basel
2Classification of protein models
- Two criteria are important
- Model correctness is dictated by the quality of
the sequence alignment. If the alignment
contains errors, then the model will have major
errors.
- Model accuracy is dictated by the deviation of
the modelling templates used relative to the
experimental control structure.
3Sequence alignment error
Dihydrolipoamide Dehydrogenase
4Comparison of experimental structures as a
function of sequence identity
Figure: RMSD of the common core plotted against sequence identity.
Chothia C, Lesk AM (1986) The relation between
the divergence of sequence and structure in
proteins. EMBO J. 5, 823-826.
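The relation shown on this slide can be made concrete with a short sketch. It assumes the exponential fit usually quoted from Chothia and Lesk (1986), RMSD = 0.40 * exp(1.87 * H), where H is the fraction of non-identical residues in the common core. The function name and the identity values in the loop are illustrative choices, not taken from the slide.

```python
import math


def expected_core_rmsd(identity_percent: float) -> float:
    """Approximate main-chain RMSD (in Angstrom) of the common core.

    Assumes the fit commonly quoted from Chothia & Lesk (1986):
    RMSD = 0.40 * exp(1.87 * H), with H the fraction of mutated residues.
    """
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity_percent must lie between 0 and 100")
    mutated_fraction = 1.0 - identity_percent / 100.0
    return 0.40 * math.exp(1.87 * mutated_fraction)


# Illustrative identity levels (hypothetical, chosen only to show the trend).
for identity in (100, 70, 50, 30, 20):
    print(f"{identity:3d}% identity: about {expected_core_rmsd(identity):.2f} A")
```

The curve rises steeply at low identity, which is why a template of low sequence identity limits the accuracy a model can reach, even when the alignment is correct.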
5Model quality will determine its possible
applications
- Incorrect models can only be of use if the area
of interest is right. This can happen in cases
where large proteins have both well-defined
regions and areas with little known structural
similarity.
- Correct but low-accuracy models (< 70% sequence
identity) cannot generally be used in detailed
interaction studies (drug design), but have many
applications.
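As a rough illustration of the sequence identity cutoff mentioned above (read here as 70 percent), the sketch below computes percent identity from a pairwise alignment and flags a model as low accuracy. The two aligned sequences are made-up toy strings. The choice to count only gap-free columns is an assumption, since several definitions of sequence identity are in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (one possible convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


def is_low_accuracy(identity_percent: float, cutoff: float = 70.0) -> bool:
    """True when the identity falls below the cutoff (assumed to be 70%)."""
    return identity_percent < cutoff


# Toy alignment, hypothetical sequences for illustration only.
target = "MKT-AYIAKQR"
template = "MKSLAYLAK-R"
identity = percent_identity(target, template)
print(f"identity = {identity:.1f}%, low accuracy: {is_low_accuracy(identity)}")
```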
6Interpreting the impact of mutations on protein
function. The link to disease - CD40 Ligand
- X-linked hyper-IgM syndrome (XHM) is an
immunodeficiency caused by mutations in the gene
encoding the CD40 ligand (CD40L). The disease is
characterized by severely reduced serum levels of
IgG, IgA, and IgE, with normal to elevated IgM.
CD40L is a type II
transmembrane protein that belongs to the tumor
necrosis factor (TNF) superfamily and is mainly
expressed by activated CD4+ T cells. Interaction
between CD40L and its counter-receptor CD40
(expressed by B lymphocytes) is a key signal in
memory B cell generation and germinal center
formation. Defective expression of CD40L leads to
failure to mount secondary antibody responses to
T-dependent antigens, accounting for the
increased susceptibility to bacterial infections
observed in XHM patients.
7Case study of the Trp 140 to Gly mutation
- CD40L G140 is present and expressed but
non-functional.
- W140 points into the core of the protein.
- The subunit conformation is thus not correct and
the resulting CD40L form is not functional.
8Case of the W140 to R mutation
9Prioritisation of residues to mutate to determine
protein function
- The discovery of gene function in the
post-genomic era will require a sustained
experimental effort, which includes the creation
of molecular mutants. The prioritisation of
residues to mutate will be greatly optimised by
considering the 3-D structure of the target
protein. In many cases, one is then able to
predict the nature of the change. These
predictions can then be interpreted in the
light of the model.
10The Fas - Fas Ligand interaction
- Fas ligand (FasL, also called CD95 ligand) is a
40 kDa type II membrane protein belonging to the
tumor necrosis factor (TNF) family of proteins.
This family consists of trimeric ligands that
induce defined cellular responses upon binding to
their respective receptors. Receptors of the TNF
receptor family are type I membrane proteins.
They are characterized by the presence of
cysteine-rich motifs conferring an elongated
structure to their extracellular domains.
- FasL is one of the major effectors of CD8+
cytotoxic T lymphocytes and natural killer
cells. It is also involved in the establishment
of peripheral tolerance, in the
activation-induced cell death of lymphocytes and
in the delimitation of immunoprivileged regions
such as the eye and testis.
11Ligand receptor interaction I
- P206 and Y218 of FasL are predicted to be very
close to the Fas receptor.
- Several artificial mutants of both residues
result in 100- to 500-fold less active FasL
despite normal expression and glycosylation.
- One mutant (Y218D) was surprisingly active, in
sharp contrast with Y218R, which is 10,000-fold
less active than wild type.
12Ligand receptor interaction II
- The environment of Y218 is very polar as well as
basic (R86, R89, K78 and K217).
- The additional mutant Y218F was created to
confirm this (loss of hydroxyl group on Y218).
- Y218F presented only intermediate activity,
compatible with losing the hydroxyl group.
13Providing hints for protein function: C. elegans
insulin-like proteins
- Insulin and related peptides are key hormones for
the regulation of growth and metabolism.
Originally discovered in mammals, insulin-related
peptides have been identified in chordates,
mollusks and insects. Then, an insulin
receptor-like gene, daf-2, was reported in the
nematode Caenorhabditis elegans. The authors
showed that, as in mammals, this gene is involved
in the regulation of metabolism. Interestingly,
they also showed that this gene affects the
worm's longevity. Thus, their findings not only
demonstrate that the genetic circuitry that
regulates glucose metabolism was already present
in the last common ancestor of mammals and
nematodes, more than 600 million years ago, but
also suggests a possible link between aging and
glucose metabolism. Further understanding of
the insulin-like signaling pathway in C. elegans
requires the identification of the ligand(s) of
this receptor. Hence we analyzed protein
sequences from the C. elegans genome project
to search for insulin-related peptides.
14Profile searches: 10 new sequences
15Comparison between M04D8.3 (Q21506) and hIGF-1
16Towards in silico drug design
- Model or x-ray structures of protein targets
- Finding potential binding sites
- Defining the physicochemical and structural
features of such binding sites
- Fitting known small molecules into identified
binding sites (in silico screening)
- Designing new compounds
17GPCRs
- G protein-coupled receptors (GPCRs) mediate our
sense of vision, taste, smell and pain. They are
also involved in cell communication and
recognition processes. They are a major class of
drug targets.
- GPCRs can now be modelled with medium accuracy,
and the resulting models can be used to assist
drug discovery projects.
18Validation of modelling and docking methods using
bovine Rhodopsin
- A) Comparison of the predicted (green) and x-ray
(blue) structures of bovine Rhodopsin (RMSD over
the TM regions: 3.1 Å)
- B) Comparison of the docked and x-ray structure
of cis-retinal.
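A figure such as the 3.1 Å quoted for the TM regions is a root-mean-square deviation over paired atoms. The sketch below shows the calculation under two assumptions that the slide does not state: the two coordinate sets are already superimposed, and atoms are paired one to one in the same order. The coordinates are toy values, not rhodopsin data.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(model: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD between two equally long, already superimposed coordinate lists."""
    if len(model) != len(reference) or len(model) == 0:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, reference):
        squared_sum += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared_sum / len(model))


# Toy coordinates (hypothetical): three atoms of a predicted and a reference structure.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
experimental = [(0.0, 0.2, 0.0), (1.4, 0.1, 0.1), (3.2, 0.4, 0.0)]
print(f"RMSD = {rmsd(predicted, experimental):.2f} A")
```

In practice the structures would first be optimally superimposed, for example with a least-squares fit, and the sum restricted to the residues of interest, here the transmembrane helices.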
19DNA gyrase
- DNA gyrase is involved in the vital processes of
DNA replication, transcription and recombination.
It is a prokaryotic topoisomerase II with no
direct mammalian counterpart.
- DNA gyrase is a well established target for
antibiotics (quinolones, coumarins,
cyclothialidines).
- Resistance to these compounds is a serious
issue. Thus new classes of compounds are needed.
20The Target
- The ATP-binding B subunit of the DNA gyrase was
used as a target for a rational approach.
- A combination of the following methods was used:
  - in silico screening to reduce the compound set
  to screen,
  - a biased HTS of DNA gyrase for further reduction,
  - validation of hits based on biophysical
  properties,
  - a 3D guided optimisation process.
21Results
- The in silico screening allowed the reduction of
the initial data set containing 350 000 compounds
to 3000 molecules.
- Testing these 3000 selected compounds in the DNA
gyrase assay provided 150 hits clustered in 14
classes. Seven classes could be validated as
true, novel DNA gyrase inhibitors that act by
binding to the ATP binding site located on
subunit B.
- The 3D guided optimization provided highly potent
DNA gyrase inhibitors, e.g., the
3,4-disubstituted indazole being a 10 times more
potent DNA gyrase inhibitor than Novobiocin.
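The numbers on this slide can be turned into simple funnel ratios. The sketch below only does arithmetic on the figures quoted above (350 000 compounds, 3000 selected, 150 hits, 14 classes, 7 validated). The function and key names are illustrative, and no hit rate for unbiased screening is assumed.

```python
def funnel_summary(library_size: int, selected: int, hits: int,
                   hit_classes: int, validated_classes: int) -> dict:
    """Ratios describing each stage of the screening funnel."""
    return {
        # how many times smaller the tested set is than the full library
        "fold_reduction": library_size / selected,
        # fraction of the library that was actually tested
        "fraction_tested": selected / library_size,
        # fraction of tested compounds that were hits in the assay
        "hit_rate": hits / selected,
        # fraction of hit classes confirmed as true inhibitors
        "validated_class_fraction": validated_classes / hit_classes,
    }


# Figures quoted on the slide.
summary = funnel_summary(350_000, 3_000, 150, 14, 7)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

With these figures, fewer than 1 percent of the compounds were tested, 5 percent of those were hits, and half of the hit classes were validated.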