Title: Genomics
1. Lecture 19
(A) Protein-protein interaction and (B) Nucleic Acid Structure
Introduction to Bioinformatics
2. Lecture 19A: Protein-protein interactions
- Complexity
  - Multibody interaction
- Diversity
  - Various interaction types
- Specificity
  - Complementarity in shape and binding properties
3. PPI Characteristics
- Universal
  - Cell functionality is based on protein-protein interactions
    - Cytoskeleton
    - Ribosome
    - RNA polymerase
- Numerous
  - Yeast
    - 6,000 proteins
    - at least 3 interactions each
    - 18,000 interactions
  - Human
    - estimated 100,000 interactions
- Network
  - simplest: homodimer (two)
  - common: hetero-oligomer (more)
  - holistic: protein network (all)
4. Interface Area
- Contact area
  - usually > 1100 Å²
  - each partner > 550 Å²
  - each partner loses 800 Å² of solvent-accessible surface area
    - 20 amino acids lose 40 Å² each
    - 100-200 J per Å²
- Average buried accessible surface area
  - 12% for dimers
  - 17% for trimers
  - 21% for tetramers
- 83-84% of all interfaces are flat
- Secondary structure
  - 50% α-helix
  - 20% β-sheet
  - 20% coil
  - 10% mixed
- Less hydrophobic than core, more hydrophobic than exterior
5. Complexation Reaction
- $A + B \rightleftharpoons AB$
- $K_a = [AB] / ([A][B])$ (association)
- $K_d = [A][B] / [AB]$ (dissociation)
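The mass-action relations on this slide can be explored numerically. The following is a minimal sketch, assuming a simple 1:1 binding model with total concentrations in molar units; the function names and the example concentrations are illustrative assumptions, not part of the lecture.

```python
import math


def complex_concentration(a_total, b_total, kd):
    """Equilibrium [AB] for A + B <-> AB, given total concentrations and Kd (same units)."""
    # Mass balance gives Kd = (a_total - x)(b_total - x) / x, a quadratic in x = [AB].
    s = a_total + b_total + kd
    # The smaller root is the physical one (x cannot exceed either total concentration).
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def association_constant(kd):
    """Ka is the reciprocal of Kd."""
    return 1.0 / kd


a_total = b_total = 1e-6   # assumed: 1 micromolar of each partner
kd = 1e-6                  # assumed: 1 micromolar dissociation constant
ab = complex_concentration(a_total, b_total, kd)
print(f"[AB] = {ab:.3e} M, fraction of A bound = {ab / a_total:.2f}")
```

Lowering `kd` in this sketch (stronger binding) pushes the fraction bound towards 1, which is the sense in which a small Kd corresponds to a tight complex.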
6. Experimental Methods for determining PPI
- 2D (polyacrylamide) gel electrophoresis → mass spectrometry
- Liquid chromatography
  - e.g. gel permeation chromatography
- Binding study with one immobilized partner
  - e.g. surface plasmon resonance
- In vivo by two-hybrid systems or FRET
- Binding constants by ultracentrifugation, microcalorimetry or competition
- Experiments with labelled ligand
  - e.g. fluorescence, radioactivity
- Role of individual amino acids by site-directed mutagenesis
- Structural studies
  - e.g. NMR or X-ray
7. PPI Network
http://www.phy.auckland.ac.nz/staff/prw/biocomplexity/protein_network.htm
8. Binding vs. Localization
[Figure: interaction types arranged by binding strength (weak to strong) and by localization (co-expressed and at same place, or different places)]
- Obligate oligomers
- Non-obligate permanent, e.g. antibody-antigen
- Non-obligate triggered transient, e.g. GTPPO4-
- Non-obligate co-localised, e.g. in membrane
- Non-obligate weak transient
9. Some terminology
- Transient interactions
  - Associate and dissociate in vivo
  - Weak transient
    - dynamic oligomeric equilibrium
  - Strong transient
    - require a molecular trigger to shift the equilibrium
- Obligate PPI
  - protomers form no stable structures on their own (i.e. they need to interact in complexes)
  - (functionally obligate)
10. Analysis of 122 Homodimers
- 70 interfaces are single-patched
- 35 have two patches
- 17 have three or more
11. Interfaces
12. Interface
[Figure labels: rim, interface]
13. Interface composition
- Composition of interface essentially the same as core
- But surface area can be quite different!
[Figure: different surface/interface areas]
14. Some preferences
[Figure labels: prefer, avoid]
15. Ribosome structure
- In the nucleolus, ribosomal RNA is transcribed, processed, and assembled with ribosomal proteins to produce ribosomal subunits
- At least 40 ribosomes must be made every second in a yeast cell with a 90-min generation time (Tollervey et al. 1991). On average, this represents the nuclear import of 3100 ribosomal proteins every second and the export of 80 ribosomal subunits out of the nucleus every second. Thus, a significant fraction of nuclear trafficking is used in the production of ribosomes.
- Ribosomes are made of a small and a large subunit
[Figure: Large (1) and small (2) subunit fit together (note: this figure mislabels angstroms as nanometers)]
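The trafficking figures quoted on this slide can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the numbers given on the slide; the variable names are illustrative.

```python
RIBOSOMES_PER_SECOND = 40            # from the slide (Tollervey et al. 1991)
PROTEIN_IMPORTS_PER_SECOND = 3100    # from the slide
GENERATION_TIME_MIN = 90             # from the slide

# Each ribosome has one small and one large subunit, so two exports per ribosome.
subunits_per_second = RIBOSOMES_PER_SECOND * 2

# Implied number of ribosomal proteins imported per ribosome made.
proteins_per_ribosome = PROTEIN_IMPORTS_PER_SECOND / RIBOSOMES_PER_SECOND

# Ribosomes produced during one generation at this rate.
ribosomes_per_generation = RIBOSOMES_PER_SECOND * GENERATION_TIME_MIN * 60

print(subunits_per_second)        # 80
print(proteins_per_ribosome)      # 77.5
print(ribosomes_per_generation)   # 216000
```

The 80 subunit exports per second follow directly from 40 ribosomes per second, and the import rate implies roughly 78 ribosomal proteins per ribosome.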
16. Ribosome structure
- The ribosomal subunits of prokaryotes and eukaryotes are quite similar but display some important differences.
- Prokaryotes have 70S ribosomes, each consisting of a (small) 30S and a (large) 50S subunit, whereas eukaryotes have 80S ribosomes, each consisting of a (small) 40S and a bound (large) 60S subunit.
- However, the ribosomes found in chloroplasts and mitochondria of eukaryotes are 70S, which is one of the observations supporting the endosymbiotic theory.
- "S" means Svedberg units, a measure of the rate of sedimentation of a particle in a centrifuge, where the sedimentation rate is associated with the size of the particle. Note that Svedberg units are not additive (30S + 50S gives 70S, not 80S).
- Each subunit consists of one or two very large RNA molecules (known as ribosomal RNA or rRNA) and multiple smaller protein molecules. Crystallographic work has shown that there are no ribosomal proteins close to the reaction site for polypeptide synthesis. This suggests that the protein components of ribosomes act as a scaffold that may enhance the ability of rRNA to synthesise protein rather than directly participating in catalysis.
- The differences between the prokaryotic and eukaryotic ribosomes are exploited by humans, since the 70S ribosomes are vulnerable to some antibiotics that the 80S ribosomes are not. This helps pharmaceutical companies create drugs that can destroy a bacterial infection without harming the animal/human host's cells!
17. 70S structure at 5.5 Å
(Noller et al. Science 2001)
18. 70S structure
19. 30S-50S interface
- Overall buried surface area 8500 Å²
- < 37.5 Å²
- 37.5 Å² to 75 Å²
- > 75 Å²
20. Protein-nucleic acid Interactions
21. Interactions in the Ribosome
22. Docking - ZDOCK
- Protein-protein docking
  - 3-dimensional (3D) structure of protein complex
  - starting from 3D structures of receptor and ligand
- Rigid-body docking algorithm (ZDOCK)
  - pairwise shape complementarity function
  - all possible binding modes
  - using Fast Fourier Transform algorithm
- Refinement algorithm (RDOCK)
  - Take top 2000 predicted structures from ZDOCK (RDOCK is too computationally intensive to refine very many possible dockings)
  - three-stage energy minimization
  - electrostatic and desolvation energies
    - molecular mechanical software (CHARMM)
    - statistical energy method (Atomic Contact Energy)
- 49 non-redundant unbound test cases
  - near-native structure (< 2.5 Å) on top for 37% of test cases
  - for 49% within top 4
23. Protein-protein docking
- Finding correct surface match
- Systematic search
  - 2 times 3D space!
- Define functions
  - 1 on surface
  - ρ or δ inside
  - 0 outside
[Figure labels: δ, ρ]
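The three-valued grid functions on this slide can be illustrated with a toy discretization. The sketch below is a hedged example, not the lecture's implementation: it works in 2D rather than 3D for brevity, uses discs as stand-ins for molecules, and the interior values (-15 for the receptor, 1 for the ligand) and the name `discretize` are assumptions chosen for illustration.

```python
def discretize(n, center, radius, interior_value, shell=1.0):
    """Map a disc-shaped toy 'molecule' onto an n x n grid.

    Grid points get 1 on the surface layer, interior_value inside, 0 outside.
    """
    cx, cy = center
    grid = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist = ((i - cx) ** 2 + (j - cy) ** 2) ** 0.5
            if dist <= radius - shell:
                grid[i][j] = interior_value   # inside the molecule
            elif dist <= radius:
                grid[i][j] = 1.0              # surface layer
    return grid


RECEPTOR_INTERIOR = -15.0   # assumed: large negative value so overlap is penalised
LIGAND_INTERIOR = 1.0       # assumed: small positive value

receptor = discretize(16, (8, 8), 5.0, RECEPTOR_INTERIOR)
ligand = discretize(16, (8, 8), 3.0, LIGAND_INTERIOR)
```

With this choice of values, surface-surface contact scores positively while ligand points that penetrate the receptor interior contribute a strongly negative product, which is what lets a single correlation score separate good contacts from clashes.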
24. Protein-protein docking
- Correlation function
- $C_{\alpha,\beta,\gamma} = \frac{1}{N^3} \sum_{o} \sum_{p} \sum_{q} \exp[2\pi i (o\alpha + p\beta + q\gamma)/N] \cdot C_{o,p,q}$
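The correlation on this slide is an inverse discrete Fourier transform of the per-frequency products, which is why docking programs can score all translations at once. The following is a minimal 1D sketch under stated assumptions: it uses a naive DFT from the standard library instead of a true FFT, a toy 8-point "receptor" and "ligand", and illustrative function names. It only demonstrates that the Fourier route reproduces the direct sum.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(N^2)); real programs use an FFT."""
    n = len(x)
    return [sum(x[l] * cmath.exp(-2j * cmath.pi * k * l / n) for l in range(n))
            for k in range(n)]


def inverse_dft(big_c):
    """Inverse transform: c[alpha] = (1/N) * sum_k exp(2*pi*i*k*alpha/N) * C[k]."""
    n = len(big_c)
    return [sum(big_c[k] * cmath.exp(2j * cmath.pi * k * alpha / n) for k in range(n)) / n
            for alpha in range(n)]


def correlate_direct(a, b):
    """Cyclic correlation by explicit summation: c[alpha] = sum_l a[l] * b[l + alpha]."""
    n = len(a)
    return [sum(a[l] * b[(l + alpha) % n] for l in range(n)) for alpha in range(n)]


def correlate_fourier(a, b):
    """Same correlation via C[k] = conj(A[k]) * B[k], followed by the inverse transform."""
    big_c = [ak.conjugate() * bk for ak, bk in zip(dft(a), dft(b))]
    return [value.real for value in inverse_dft(big_c)]


receptor = [0, 1, -15, -15, 1, 0, 0, 0]   # toy grid: surface = 1, interior = -15
ligand = [1, 1, 0, 0, 0, 0, 0, 0]         # toy grid: two surface points
direct = correlate_direct(receptor, ligand)
fourier = correlate_fourier(receptor, ligand)
print(direct)
print([round(v, 6) for v in fourier])
```

Shifts where the ligand overlaps the receptor interior receive strongly negative scores, and shifts where it touches only the surface score positively. In 3D the same identity holds with the $1/N^3$ normalization shown above.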
25. Docking Programs
- ZDOCK, RDOCK
- AutoDock
- Bielefeld Protein Docking
- DOCK
- DOT
- FTDock, RPScore and MultiDock
- GRAMM
- Hex 3.0
- ICM Protein-Protein docking (Abagyan group, currently the best)
- KORDO
- MolFit
- MPI Protein Docking
- Nussinov-Wolfson Structural Bioinformatics Group
26. Docking Programs
- Issues
  - Rigid structures or made flexible?
    - Side-chains
    - Main-chains
  - Full atomic detail or simplified models?
  - Docking energy functions (purpose-built force fields)
27. Docking example: antibody HyHEL-63 (cyan) complexed with Hen Egg White Lysozyme
The X-ray structure of the antibody HyHEL-63 (cyan), uncomplexed and complexed with Hen Egg White Lysozyme (yellow), has shown that there are small but significant local conformational changes in the antibody paratope on binding. The structure also reveals that most of the charged epitope residues face the antibody. Details are in Li YL, Li HM, Smith-Gill SJ and Mariuzza RA (2000) Three-dimensional structures of the free and antigen-bound Fab from monoclonal antilysozyme antibody HyHEL-63. Biochemistry 39: 6296-6309.
[Figure caption: The conformations of the X-ray structure]
Salt links and electrostatic interactions provide much of the free energy of binding. Most of the charged residues face the interface in the X-ray structure. The importance of the salt link between Lys97 of HEL and Asp27 of the antibody heavy chain is revealed by molecular dynamics simulations. After 1 ns of MD simulation at 100 °C the overall conformation of the complex has changed, but the salt link persists. Details are described in Sinha N and Smith-Gill SJ (2002) Electrostatics in protein binding and function. Current Protein and Peptide Science 3: 601-614.
28. Introduction to Bioinformatics
- Lecture 19B
- Nucleic acid structure
29. Nucleic Acid Basics
- Nucleic Acids Are Polymers
- Each Monomer Consists of Three Moieties
  - Nucleotide: A Base, A Ribose Sugar, A Phosphate
  - Nucleoside: A Base and A Ribose Sugar (no phosphate)
- A Base Can be One of the Five Rings
30. Pyrimidines and Purines can Base-Pair (Watson-Crick Pairs)
32.
- Unlike the three-dimensional structures of proteins, DNA molecules assume simple double helical structures independent of their sequences. Three kinds of double helices have been observed in DNA: type A, type B, and type Z, which differ in their geometries. The double helical structure is essential to the coding function of DNA. Watson (biologist) and Crick (physicist) first proposed the double helix structure in 1953, based on X-ray diffraction data.
- RNA, on the other hand, can have structures as diverse as those of proteins, as well as a simple double helix of type A. The ability to be both informational and diverse in structure suggests that RNA was the prebiotic molecule that could function in both replication and catalysis (the RNA World Hypothesis). In fact, some viruses encode their genetic material in RNA (e.g. retroviruses).
33. Forces That Stabilize Nucleic Acid Double Helix
- There are two major forces that contribute to stability of helix formation
  - Hydrogen bonding in base-pairing
  - Hydrophobic interactions in base stacking
[Figure labels: 5′ and 3′ strand ends; same-strand stacking; cross-strand stacking]
34. Types of DNA Double Helix
- Type A: major conformation of RNA, minor conformation of DNA
- Type B: major conformation of DNA
- Type Z: minor conformation of DNA
[Figure: A, B and Z helices with 5′ and 3′ strand ends labelled. Captions: narrow, tight; wide, less tight; left-handed, least tight]
35. Three Dimensional Structures of Double Helices
[Figure: A-DNA, with minor groove and major groove labelled]
36. Secondary Structures of Nucleic Acids
- DNA is primarily in duplex form.
- RNA is normally single-stranded and can adopt diverse secondary structures other than the duplex.
37. More Secondary Structures of Nucleic Acids
[Figure: Pseudoknots]
Source: Cornelis W. A. Pleij in Gesteland, R. F. and Atkins, J. F. (1993) The RNA World. Cold Spring Harbor Laboratory Press.
38. 3D Structures of RNA: Transfer RNA Structures
[Figure: Secondary structure of tRNA; tertiary structure of tRNA. Labels: TΨC loop, anticodon stem, variable loop, D loop, anticodon loop]
Gm, Cm, etc., are modified bases
39. 3D Structures of RNA: Ribosomal RNA Structures
[Figure: Secondary structure of large ribosomal RNA; tertiary structure of large ribosome subunit; rRNA secondary structure based on phylogenetic data]
40. Central Dogma of Molecular Biology
[Figure: DNA (Replication), Transcription to mRNA, Translation to Protein]
- Transcription is carried out by RNA polymerase (II)
- Translation is performed on ribosomes
- Replication is carried out by DNA polymerase
- Reverse transcriptase copies RNA into DNA
- Transcription + Translation = Expression
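The two expression steps on this slide can be mimicked on sequence strings. This is a simplified sketch, assuming the input is the coding (sense) DNA strand, the standard genetic code, and translation from the first base with no start-codon search; the function names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (first, second, third base); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_strand):
    """DNA coding strand -> mRNA: same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein (one-letter codes), reading codons until the first stop."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].replace("U", "T")
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


mrna = transcribe("ATGGCCAAGTAA")
print(mrna)             # AUGGCCAAGUAA
print(translate(mrna))  # MAK
```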
41. But DNA can also be transcribed into non-coding RNA
- tRNA (transfer): transfer of amino acids to the ribosome during protein synthesis.
- rRNA (ribosomal): essential component of the ribosomes (complex with rProteins).
- snRNA (small nuclear): mainly involved in RNA splicing (removal of introns). snRNPs.
- snoRNA (small nucleolar): involved in chemical modifications of ribosomal RNAs and other RNA genes. snoRNPs.
- SRP RNA (signal recognition particle): forms RNA-protein complex involved in protein secretion.
- Further: microRNA, eRNA, gRNA, tmRNA etc.
42. Eukaryotes have spliced genes
- Promoter: involved in transcription initiation (TF/RNApol-binding sites)
- TSS: transcription start site
- UTRs: un-translated regions (important for translational control)
- Exons will be spliced together by removal of the introns
- Poly-adenylation site: important for transcription termination (but also mRNA stability, export of mRNA from nucleus etc.)
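Splicing as described above amounts to joining the exon segments of the primary transcript and discarding what lies between them. The sketch below is a toy illustration; the sequence, the exon coordinates (0-based, half-open) and the function names are invented for the example.

```python
def splice(pre_mrna, exons):
    """Join exon segments given as (start, end) pairs, 0-based and end-exclusive."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def intron_coordinates(exons):
    """Introns are the gaps between consecutive exons."""
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


# Toy transcript: exon 1 + intron + exon 2 (the intron starts with GU and ends with AG).
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "AAGUAA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_mrna, exons)
introns = intron_coordinates(exons)
print(mature)    # AUGGCCAAGUAA
print(introns)   # [(6, 18)]
```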
43. DNA makes mRNA makes Protein
44. Some facts about human genes
- There are about 20,000-25,000 genes in the human genome (3% of the genome)
- Average gene length is 8,000 bp
- Average of 5-6 exons per gene
- Average exon length is 200 bp
- Average intron length is 2000 bp
- 8% of the genes have a single exon
- Some exons can be as small as 1 or 3 bp
45. DMD: the largest known human gene
- The largest known human gene is DMD, the gene that encodes dystrophin: 2.4 million bp over 79 exons
- X-linked recessive disease (affects boys)
- Two variants: Duchenne-type (DMD) and Becker-type (BMD)
- Duchenne-type: more severe, frameshift mutations
- Becker-type: milder phenotype, in-frame mutations
[Figure: Posture changes during progression of Duchenne muscular dystrophy]
46. Nucleic acid basics
- Nucleic acids are polymers
- Each monomer consists of 3 moieties
[Figure labels: nucleotide, nucleoside]
47. Nucleic acid basics (2)
- Purines and Pyrimidines can base-pair (Watson-Crick pairs)
[Figure: Watson and Crick, 1953]
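Watson-Crick pairing means that one strand determines its partner. A minimal sketch, assuming DNA sequences written 5′ to 3′ and using illustrative function names:

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base Watson-Crick complement (same left-to-right order)."""
    return "".join(DNA_PAIRS[base] for base in strand)


def reverse_complement(strand):
    """Partner strand read 5' to 3' (the two strands are antiparallel)."""
    return complement(strand)[::-1]


def is_watson_crick_duplex(strand, other):
    """True if the two 5'->3' sequences pair perfectly along their full length."""
    return other == reverse_complement(strand)


print(reverse_complement("ATGC"))              # GCAT
print(is_watson_crick_duplex("ATGC", "GCAT"))  # True
```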
48. Nucleic acid as hetero-polymers
[Figure labels: ribose sugar (RNA precursor); 2-deoxyribose sugar (DNA precursor); 2-deoxythymidine triphosphate (nucleotide)]
- REMEMBER
  - DNA: deoxyribonucleotides; RNA: ribonucleotides (OH-groups at the 2′ position)
  - Note the directionality of DNA (5′-3′ and 3′-5′) or RNA (5′-3′)
  - DNA: A, G, C, T; RNA: A, G, C, U
49. So
[Figure label: RNA]
50. Stability of base-pairing
- C-G base pairing is more stable than A-T (A-U) base pairing (why?)
- 3rd codon position has freedom to evolve (synonymous mutations)
- Species can therefore optimise their G-C content (e.g. thermophiles are GC rich) (consequences for codon use?)
[Figure: Thermocrinis ruber, heat-loving bacteria]
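One part of the answer to the "why?" on this slide is that a G-C pair forms three hydrogen bonds, whereas an A-T (or A-U) pair forms two. The sketch below simply counts hydrogen bonds for a perfectly paired duplex; it ignores base stacking and other contributions, and the function name is illustrative.

```python
# Hydrogen bonds contributed by the Watson-Crick pair that each base takes part in.
HYDROGEN_BONDS = {"A": 2, "T": 2, "U": 2, "G": 3, "C": 3}


def duplex_hydrogen_bonds(strand):
    """Total Watson-Crick hydrogen bonds if every base of the strand is paired."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


print(duplex_hydrogen_bonds("ATATATAT"))  # 16
print(duplex_hydrogen_bonds("GCGCGCGC"))  # 24
```

For two duplexes of equal length, the GC-rich one has more hydrogen bonds holding the strands together, which is in line with the GC-rich genomes of thermophiles mentioned above.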
51. DNA compositional biases
- Base compositions of genomes: GC (and therefore also AT) content varies between different genomes
- The GC content is sometimes used to classify organisms in taxonomy
- High GC content bacteria: Actinobacteria, e.g. in Streptomyces coelicolor it is 72%
- Low GC content: Plasmodium falciparum (20%)
- Other examples:
  - Saccharomyces cerevisiae (yeast): 38%
  - Arabidopsis thaliana (plant): 36%
  - Escherichia coli (bacteria): 50%
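GC content as used on this slide is the percentage of G and C among all bases. A minimal sketch, assuming plain sequence strings and skipping ambiguity codes such as N; the function name is illustrative.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous nucleotides of a sequence."""
    counted = [base for base in sequence.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("no nucleotides found")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


print(gc_content("GGCCAATT"))  # 50.0
print(gc_content("ATATATAT"))  # 0.0
```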
52. Let's return to DNA and RNA structure
- Unlike the three-dimensional structures of proteins, DNA molecules assume simple double helical structures independent of their sequences.
- Three kinds of double helices have been observed in DNA: type A, type B, and type Z, which differ in their geometries.
- RNA, on the other hand, can have structures as diverse as those of proteins, as well as a simple double helix of type A.
- The ability to be both informational and diverse in structure suggests that RNA was the prebiotic molecule that could function in both replication and catalysis (the RNA World Hypothesis).
- In fact, some viruses encode their genetic material in RNA (e.g. retroviruses)
53. Three dimensional structures of double helices
[Figure: Side view of A-DNA, B-DNA, Z-DNA; space-filling models of A-, B- and Z-DNA; top view of A-DNA, B-DNA, Z-DNA]
54. Major and minor grooves
55. Forces that stabilize nucleic acid double helix
- There are two major forces that contribute to stability of helix formation
  - Hydrogen bonding in base-pairing
  - Hydrophobic interactions in base stacking
[Figure labels: 5′ and 3′ strand ends; same-strand stacking; cross-strand stacking]
56. Types of DNA double helix
- Type A
  - major conformation RNA
  - minor conformation DNA
  - Right-handed helix
- Type B
  - major conformation DNA
  - Right-handed helix
- Type Z
  - minor conformation DNA
  - Left-handed helix
57. Secondary structures of Nucleic acids
- DNA is primarily in duplex form
- RNA is normally single-stranded and can adopt diverse secondary structures other than the duplex.
58. Non B-DNA Secondary structures
[Figure: Hoogsteen base pairs]
Source: Van Dongen et al. (1999), Nature Structural Biology 6, 854-859
59. More Secondary structures
- Cloverleaf rRNA structure
[Figure: 16S rRNA secondary structure based on phylogenetic data]
Source: Cornelis W. A. Pleij in Gesteland, R. F. and Atkins, J. F. (1993) The RNA World. Cold Spring Harbor Laboratory Press.
60. 3D structures of RNA: transfer-RNA structures
- Secondary structure of tRNA (cloverleaf)
- Tertiary structure of tRNA
61. 3D structures of RNA: ribosomal-RNA structures
- Secondary structure of large rRNA (16S)
- Tertiary structure of large rRNA subunit
62. 3D structures of RNA: Catalytic RNA
- Secondary structure of self-splicing RNA
- Tertiary structure of self-splicing RNA
63. Some structural rules
- Base-pairing is stabilizing
- Un-paired sections (loops) destabilize
- 3D conformation with interactions makes up for
this
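The first two rules can be turned into a very crude secondary-structure score: count the maximum number of base pairs a sequence can form while forbidding hairpin loops that are too short. The sketch below uses a Nussinov-style dynamic programme, a classical textbook technique that is not part of the lecture. It assumes Watson-Crick and G-U wobble pairs, a minimum loop of three unpaired bases, no pseudoknots, and no energy model or 3D interactions.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop bases in each hairpin loop."""
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within rna[i..j]; entries with i > j stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                      # case 1: base j is unpaired
            for k in range(i, j - min_loop):            # case 2: base j pairs with base k
                if (rna[k], rna[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))  # 3 (a three-pair stem closing an AAA loop)
print(max_base_pairs("AAAAAA"))     # 0
```

Maximizing pairs rewards stabilizing base-pairing and the loop-size limit is a stand-in for the cost of unpaired sections; real prediction tools replace the pair count with measured free energies.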
64. Final notes
- Sense/anti-sense RNA: antisense RNA blocks translation through hybridization with the coding strand
  - Example: Tomatoes synthesize ethylene in order to ripen. Transgenic tomatoes have been constructed that carry in their genome an artificial gene (DNA) that is transcribed into an antisense RNA complementary to the mRNA for an enzyme involved in ethylene production → tomatoes make only 10% of the normal enzyme amount.
- Sense/anti-sense peptides: have been used therapeutically, especially in cancer and anti-viral therapy
- Sense/anti-sense proteins: does it make (anti)sense? Codons for hydrophilic and hydrophobic amino acids on the sense strand may sometimes be complemented, in frame, by codons for hydrophobic and hydrophilic amino acids on the antisense strand. Furthermore, antisense proteins may sometimes interact with high specificity with the corresponding sense proteins, BUT VERY RARE: HIGHLY CONSERVED CODON BIAS
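The sense/antisense codon complementarity mentioned above can be checked on a toy sequence. This is a hedged sketch: the sequence is invented, the standard genetic code is assumed, and the set of "hydrophobic" residues is a coarse assumption chosen only for illustration.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (first, second, third base); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROPHOBIC = set("AVLIMFC")   # assumed, coarse classification


def translate(dna):
    """Translate a DNA coding sequence codon by codon (stops shown as '*')."""
    return "".join(CODON_TABLE[dna[pos:pos + 3]] for pos in range(0, len(dna) - 2, 3))


def antisense(dna):
    """Reverse complement: the antisense strand read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


sense_dna = "ATCGTCTTC"                  # toy sequence: Ile-Val-Phe
antisense_dna = antisense(sense_dna)
sense_protein = translate(sense_dna)
antisense_protein = translate(antisense_dna)
print(sense_protein, [aa in HYDROPHOBIC for aa in sense_protein])
print(antisense_protein, [aa in HYDROPHOBIC for aa in antisense_protein])
```

In this toy case the hydrophobic sense peptide (codons with a central T) is complemented in frame by a hydrophilic antisense peptide (codons with a central A). The example shows the pattern only and says nothing about how often it matters in real proteins.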