Title: Introduction to the RNA Folding Problems
1Introduction to the RNA Folding Problems
C.-M. Chen thanks Shi-Jie Chen at the University of
Missouri-Columbia for providing the materials.
2What is RNA?
3RNA Primary Structure
[Figure: structure of the RNA backbone, running 5' to 3', with a charge of -e marked on each nucleotide]
- RNA chain directionality: 5'→3'
- Backbone carries charge (-e) on each nucleotide
- Formation of an RNA structure requires cations
4Four Types of Bases
Adenine (A)
Uracil (U)
Guanine (G)
Cytosine (C)
Purines
Pyrimidines
5Watson-Crick canonical base pair
Base Pairs
A-U
C-G
6RNA: Ribonucleic acid, a polynucleotide
[Figure: nucleotide structure with the backbone atoms P-O-C-C-C-O-P; each bond is about 1.5 Å long]
We need 7 torsional angles per nucleotide to
specify the 3D structure of an RNA
7Torsion angles are like the rotamers of protein side
chains
8RNA Structure
9RNA secondary structure base pairing
Base stacking provides stability
10RNA Helix
A-form RNA helix Grooves Binding sites
11RNA Secondary Structure Motif
12The Definition of RNA Secondary and Tertiary
Structure
A graphic representation of base pairing
13Secondary Structure Contact (Base Pair)
Tertiary Structure Contact (Base Pair)
14An RNA Secondary Structure
15RNA Pseudoknot
16RNA Tertiary Structure
tRNA 2° Structure
tRNA 3° Structure
17Tertiary Interactions that Fold tRNA
18Tertiary Interactions are Critical to Functions
[Figure: TAR in the free form and in the bound form]
- A base triplet is the key to TAR function (it opens up the major groove for protein binding)
19(No Transcript)
20The Central Dogma
DNA → (transcription) → pre-mRNA → (splicing) → mRNA → (translation: ribosome, tRNA) → protein
21RNAs are Critical to Cellular Functions
- Messenger RNA (mRNA) codes for protein
- Small nuclear RNAs (snRNA) splice pre-mRNA in the nucleus
- Transfer RNA (tRNA) carries amino acids to the ribosome
- Ribosomal RNA (rRNA) is an integral part of the ribosome
22The RNA Folding Problem
23Goal: to predict the
- structure
- stability
- folding kinetics
- function
of an RNA from its sequence
Ultimate goal: to predict RNA function from its sequence
24Why Study RNA Folding Stability?
[Figure: mRNA with the ribosome binding site marked]
- mRNA has sufficient time to equilibrate before translation is initiated, so equilibrium stability matters
- Stability is tied to function
25Why Study RNA Folding Kinetics?
- The B → A conversion is slow compared with the translational process
- Conformation B is kinetically trapped
Kinetics is tied to function
26RNA Folding Energetics
27Folding Free Energy of Secondary Structure
Folding free energy:
$\Delta G = G(\text{secondary structure}) - G(\text{unfolded chain})$
$\Delta G = \Delta H - T\,\Delta S$
28Stabilizing Forces for RNA Secondary Structure
- Restriction of rotors: ΔS < 0 (strong)
- Base stacking: ΔH < 0 (strong)
- Hydrophobic effect: ΔS > 0 (weak)
- Hydrogen bonding: ΔH < 0 (weak)
Stability = stacking − restriction of rotors
29ΔG for a Secondary Structure
Nearest-Neighbor Model
- Stability comes from stacking, a local interaction between adjacent base pairs
- Example (1 M Na+, 37 °C):
[Figure: duplex of G-C base pairs with A bases, drawn 5' to 3', with the contribution of each stack listed alongside]
ΔH (kcal/mol) | ΔS (eu) | ΔG = ΔH − TΔS (kcal/mol)
-8.0 | -19.4 | -2.0
-14.2 | -34.9 | -3.4
0.8
-14.2 | -34.9 | -3.4
-8.0 | -19.4 | -2.0
ΔGtot = -6.6 kcal/mol
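The per-stack ΔG values in the example can be checked from the listed ΔH and ΔS. The following is a minimal sketch, assuming T = 310.15 K for 37 °C and taking 1 eu as 1 cal/(mol·K); the function and variable names are illustrative and do not come from the slides. It only reproduces the individual stack terms and does not attempt to rebuild the quoted total.

```python
# Stack parameters copied from the example (1 M Na+, 37 °C).
# Each entry is (dH in kcal/mol, dS in eu, i.e. cal/(mol*K)).
STACKS = [(-8.0, -19.4), (-14.2, -34.9), (-14.2, -34.9), (-8.0, -19.4)]

T_KELVIN = 310.15  # 37 °C (assumed conversion)


def stack_free_energy(dh_kcal, ds_eu, temperature=T_KELVIN):
    """Return dG = dH - T*dS in kcal/mol (dS is converted from cal to kcal)."""
    return dh_kcal - temperature * ds_eu / 1000.0


per_stack = [round(stack_free_energy(dh, ds), 1) for dh, ds in STACKS]
print(per_stack)
```

The rounded values match the ΔG column for the four stacking terms, which shows that the table is internally consistent with ΔG = ΔH − TΔS at 37 °C.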
30Experimental Thermodynamic Parameters
ΔH for base stacks
31ΔS for base stacks
32ΔS for loops
33RNA Secondary Structure Prediction
34Phylogenetic Method
- Structure is more conserved than sequence
- Compare sequences of RNAs with the same function from different species
- Find covarying bases that conserve base pairs (W-C pairs: G-C, A-U)
[Figure: alignment of the sequence pairs UGGUG/CACCA, UAGUC/GACUA, and UGGUG/GACCA, with the covarying bases marked]
Known structure
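The covariation test described on this slide can be sketched in a few lines. This is only an illustration, assuming the sequences are already aligned and of equal length; the helper `covarying_pair` and the toy alignment are hypothetical and are not the sequences from the figure.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def covarying_pair(alignment, i, j):
    """True if columns i and j form a Watson-Crick pair in every sequence
    while the identity of the pair differs between at least two sequences."""
    pairs = [(seq[i], seq[j]) for seq in alignment]
    all_paired = all(p in WATSON_CRICK for p in pairs)
    return all_paired and len(set(pairs)) > 1


# Hypothetical toy alignment of three hairpins (illustration only).
alignment = ["GGACUUCGGUCC", "GCACUUCGGUGC", "GAACUUCGGUUC"]

print(covarying_pair(alignment, 1, 10))  # pair changes G-C, C-G, A-U
print(covarying_pair(alignment, 0, 11))  # always G-C: conserved, no covariation
print(covarying_pair(alignment, 4, 7))   # U and G: not a Watson-Crick pair
```

Columns that stay paired while both bases change are the evidence for a conserved base pair; a column pair that never varies is compatible with pairing but carries no covariation signal.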
35Free Energy minimization
- Particularly useful if only one sequence is available
- For all the possible secondary structures of a given sequence, find the structure with the minimum ΔG
a. Algorithm: find the lowest ΔG for all 5-nt segments, then all 6-nt segments, and so on up to the full sequence
b. There are usually multiple optimal structures
http://www.bioinfo.rpi.edu/zukerm/rna/mfold-3.1.html
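The idea of solving short segments first and building up to the full sequence can be illustrated with a much simpler stand-in. The sketch below is Nussinov-style base-pair maximization, not the mfold energy model: it assumes every Watson-Crick pair contributes the same favorable energy, ignores stacking and loop terms, and uses an assumed minimum hairpin loop of 3 unpaired bases. All names are illustrative.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def max_base_pairs(seq):
    """Maximum number of nested Watson-Crick pairs, computed by dynamic
    programming from short subsequences up to the full sequence."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):        # span = j - i
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]             # case 1: base j is unpaired
            for k in range(i, j - MIN_LOOP):   # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAACCC"))
```

Replacing the "+ 1" with nearest-neighbor stacking and loop free energies, and the maximum with a minimum, gives the general shape of a free energy minimization; the table-filling order stays the same.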
36Ion-Dependence of RNA folding
37H2O and metal ions are integral parts of nucleic
acid structure
38Na+ stabilizes secondary structure
From Tinoco & Bustamante, JMB (1999) 273, 271
- Increasing [Na+] 10-fold raises Tm by 3.8 °C
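To get a feel for the size of this effect, the figure of 3.8 °C per 10-fold change can be turned into a rough estimate for other concentration changes. The sketch below assumes that Tm is linear in log10 of the Na+ concentration over the range of interest; that linearity is an assumption made here for illustration, not a statement from the slide, and the function name is hypothetical.

```python
import math

SLOPE_C_PER_DECADE = 3.8  # from the slide: 10-fold more Na+ shifts Tm by 3.8 °C


def tm_shift(na_from_molar, na_to_molar):
    """Estimated change in Tm (°C), assuming Tm is linear in log10[Na+]."""
    return SLOPE_C_PER_DECADE * math.log10(na_to_molar / na_from_molar)


print(round(tm_shift(0.1, 1.0), 2))   # one decade up
print(round(tm_shift(1.0, 0.01), 2))  # two decades down
```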
39Multivalent Ions Stabilize Tertiary Fold
Pseudoknot
40Mg2+ Stabilization
[Figure: Mg2+ stabilization at 200 mM Na+]
From Tinoco & Bustamante, JMB (1999) 273, 271
41RNA conformational changes are ion-dependent
tRNA
42RNA folding kinetics strongly depends on ions
Na+ → secondary structure
Mg2+ → tertiary structure
Metal ion binding sites can be formed before,
during, or after the formation of the tertiary
structure
43RNA Folding vs Protein Folding
44RNA vs. Protein
Property | RNA | Protein
Types of side chains | 4 | 20
Backbone torsion angles | 7 | 2
Secondary structure | helices | α, β
Number of folded states | often > 1 | usually 1
Folding driving force | base stacking (specific) | hydrophobic effect (nonspecific)
Secondary structure stability | stable without tertiary structure (7 bp ~ 10 kcal/mol) | unstable without tertiary structure (ΔGtot ~ 10 kcal/mol)
Folding pathway | multistate, hierarchical; usually kinetically controlled | usually 2-state; usually thermodynamically controlled
Electrostatics | highly charged | variable
45Part II. Basic Thermodynamics
Thermodynamics is for systems in thermal
equilibrium
46- The population (concentration) of
molecules in (macro)state A is determined by the
free energy
Low free energy → high population
47- Relative population between U and N
- F_U − F_N: work required to convert N to U
[Plot: unfolded fraction [U]/([U]+[N]), with the N and U regions labeled]
48[Figure: free energy levels of U and N, separated by ΔG]
More stable N → larger ΔG
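The link between the free energy gap and the relative population can be made concrete for a two-state system. The sketch below assumes a strictly two-state U/N equilibrium with Boltzmann populations and defines ΔG as G_U − G_N, so that a larger value means a more stable N; the function name, the default temperature of 37 °C, and the example values are assumptions for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def native_fraction(delta_g_kcal, temperature_k=310.15):
    """Fraction of molecules in N for a two-state U <-> N equilibrium.

    delta_g_kcal = G_U - G_N, so a larger value means a more stable N.
    """
    return 1.0 / (1.0 + math.exp(-delta_g_kcal / (R_KCAL * temperature_k)))


for dg in (0.0, 1.0, 2.0):  # hypothetical stabilities in kcal/mol
    print(dg, round(native_fraction(dg), 3))
```

At ΔG = 0 the two states are equally populated, and each additional kcal/mol of stability shifts the population further toward N.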
49- Conformational fluctuation
[Figure: time traces (t) showing fluctuations among the conformations A, A1, and A2; G_A labels the free energy]
50- Cooperativity (Two-State-ness)
Two-State (U ↔ N) Transition
Corbett & Roche, Biochemistry 23, 1888 (1984)
51- Cooperativity reveals the free energy profile
[Figure: free energy profiles involving the states A, A', A1, and A2]
52Part III. Why Study Thermodynamics?
53- Thermodynamics can reveal the energetics of
molecular interactions
Inter-molecular potential
Vaporization heat of the liquid gives the
microscopic energy
Binding Energy
54- Stability and conformational fluctuations are
tied to function
De Smit & van Duin 1990
[Figure: mRNA with the ribosome binding site marked]
- Ribosomal RNA binds here to initiate translation
- t_fold << t_initiation: the system reaches equilibrium
55Part IV. Statistical Thermodynamics of RNA Folding
56Free Energy landscape
Free energy landscape
Population distribution
Multi-dimensional landscape → more structural information
Valley: stable state
Hill: unstable state
57Free energy is determined by the sum over all
possible conformations
Free energy: $F = -kT \ln Q$
Partition function: $Q = \sum_i e^{-E_i/kT}$
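58Nucleotide sequence → all the possible conformations
[Figure: conformations 1, 2, 3, ... of a sequence of A, U, C, and G, with energies E1, E2, E3, ... and Boltzmann weights $e^{-E_1/kT}$, $e^{-E_2/kT}$, $e^{-E_3/kT}$, ..., which sum to the partition function Q]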
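The sum over conformations can be written out directly when the energies are known. The sketch below assumes a short, explicit list of conformational energies in kcal/mol, which is a hypothetical input (a real RNA has far too many conformations to list), and computes the partition function Q, the free energy −kT ln Q, and the Boltzmann population of each conformation.

```python
import math

K_B = 1.987e-3  # Boltzmann constant per mole, kcal/(mol*K)


def boltzmann_summary(energies_kcal, temperature_k=310.15):
    """Return (Q, free energy, populations) for a list of energies."""
    kt = K_B * temperature_k
    weights = [math.exp(-e / kt) for e in energies_kcal]  # e^(-E_i/kT)
    q = sum(weights)                                      # partition function
    free_energy = -kt * math.log(q)                       # F = -kT ln Q
    populations = [w / q for w in weights]
    return q, free_energy, populations


# Hypothetical energies E1, E2, E3 in kcal/mol (illustration only).
q, f, p = boltzmann_summary([0.0, -1.0, -2.0])
print(round(f, 2), [round(x, 3) for x in p])
```

The free energy comes out lower than the lowest single energy because every accessible conformation adds to Q; this is the entropic contribution that a sum over all conformations captures.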
59R17 virus, macrostate A: component free energies
$\Delta G = \Delta H - T\,\Delta S$ (37 °C), in kcal/mol
Additive sum of the component free energies:
$\Delta F_A = -21.8$ kcal/mol
60Enthalpy parameter ΔH (1 M Na+)
61Entropy parameter ΔS (1 M Na+)
62Loop entropy parameter ΔS (1 M Na+)
63Part V. The need to develop a (new) statistical
mechanical theory for RNA folding
64Accuracy of predictions compared with experiment is poor.
[Plot: heat capacity C(T) vs. temperature T, comparing Gluick & Draper 1995 with McCaskill 1990]
65- Entropy Problem
- Conformation ensemble is incomplete.
- Non-physical entropy is used for multi-branched loops:
ΔS ~ loop length
Jacobson-Stockmayer Theory:
ΔS ~ ln(loop length)
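The difference between a loop entropy that grows linearly with loop length and one that grows with its logarithm can be seen numerically. The sketch below is illustrative only: the linear coefficient `a` is hypothetical, and the logarithmic form uses c = 1.5, the exponent for an ideal (Gaussian) chain, as an assumption rather than a fitted RNA parameter. Entropies are in units of k.

```python
import math


def linear_loop_entropy(length, a=-1.0):
    """Linear model: dS/k = a * length (a is a hypothetical coefficient)."""
    return a * length


def js_loop_entropy(length, c=1.5):
    """Jacobson-Stockmayer-type model: dS/k = -c * ln(length).

    c = 1.5 is the ideal-chain exponent, used here as an assumption.
    """
    return -c * math.log(length)


for n in (10, 20, 40):
    print(n, linear_loop_entropy(n), round(js_loop_entropy(n), 2))
```

Each doubling of the loop length costs a fixed amount of entropy in the logarithmic model, whereas the cost itself doubles in the linear model, so the two diverge quickly for large loops.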
66- Interferences between different structural units are ignored.
- Additivity in free energy
67Inter-subunit interferences are important
3D lattice enumeration
Ω = 712701 (exact) << 3.2 × 10^10
Relative error for ΔG: 79%
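The quoted relative error is consistent with comparing the two conformational counts on a logarithmic scale. The sketch below assumes that the error refers to the entropic term, S/k = ln Ω, and that 3.2 × 10^10 is the count obtained when inter-subunit interferences are ignored; both readings are assumptions about the slide, and the variable names are illustrative.

```python
import math

omega_exact = 712701    # exact count from the 3D lattice enumeration
omega_approx = 3.2e10   # count assumed to come from ignoring interferences

s_exact = math.log(omega_exact)    # S/k = ln(Omega)
s_approx = math.log(omega_approx)

relative_error = (s_approx - s_exact) / s_exact
print(f"{relative_error:.0%}")
```

Under these assumptions the overcount of roughly four to five orders of magnitude in Ω translates into an error of about 79% in the entropy.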
68Part VI. A Graph-Theoretic Approach
69The Entropy problem is a bottleneck
- RNA is a polymer
- too many conformations (3^100 ~ 10^50)
- chain connectivity
- excluded volume
- a many-body problem
- RNA is a biopolymer
- sequence-structure relationship
- compact 3D structure
- long-range contacts
[Figure: chain of nucleotides numbered 1 to 100]
70A polymer graph approach
1 graph (macrostate) ↔ many
conformations (microstates)
71Secondary structure: graphs without crossing
links
Tertiary structure
RNA secondary structure
72(No Transcript)
73Free energy is determined by the sum over graphs
The key is to compute the number of chain conformations for a graph
74Conformational count for a graph
- Ising model-type approach
75Conformational count for a Subunit
6×6 matrix (S) for the conformational count
76Inter-Subunit Interferences - 1
Y12 = Y21 = 3
6×6 matrix (Y) for the inter-subunit viability
77Inter-Subunit Interferences - 2
Y24 = Y42 = 0
78(No Transcript)
79Part VII. Application to a Model System
80$E_{AU} = E_{UA} = -\varepsilon/2$
$E_{GC} = E_{CG} = -\varepsilon$
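The model energy function can be applied to a candidate structure by summing over its base pairs. The sketch below assumes that the two values are per-base-pair energies (A-U worth half of G-C) and works in units of the energy scale, so that G-C contributes −1 and A-U contributes −0.5; the function name and the example hairpin are hypothetical.

```python
# Pair energies in units of the model energy scale (assumed per base pair).
PAIR_ENERGY = {frozenset("AU"): -0.5, frozenset("GC"): -1.0}


def structure_energy(seq, pairs):
    """Sum the model pair energies over a list of (i, j) index pairs."""
    total = 0.0
    for i, j in pairs:
        key = frozenset((seq[i], seq[j]))
        if key not in PAIR_ENERGY:
            raise ValueError(f"{seq[i]}-{seq[j]} is not an allowed pair")
        total += PAIR_ENERGY[key]
    return total


# Hypothetical 4-nt example: G0 pairs with C3, A1 pairs with U2.
print(structure_energy("GAUC", [(0, 3), (1, 2)]))
```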
81Heat Capacity Melting Curve
2D and 3D lattice models for chain conformations
82Both the 2D and 3D models give a two-state transition (for
the given sequence and the simplified model)
Free energy landscape
[Figure: free energy landscapes for the 2D and 3D models]
N/NN = number of native/non-native contacts
83Density of States
Slope = 1/Tm; Tm(3D) < Tm(2D)
for the two-state U ↔ N transition
- Larger entropy change in the 3D model → sharper transition
- Accurate entropy calculation is important for
conformational change
84Part VIII. Applications to Realistic RNA folding
85Beyond the Lattice Model
- One scaling factor does it all
[Figure: loop formation, comparing a lattice conformation (1 angle per monomer) with a realistic conformation (7 rotatable angles per nt)]
86Theory Meets Experiments -1
melting curves
Less cooperative
Experimental ?H parameters for base stacks are
used
87Theory Meets Experiments - 1
melting curves
Tm
88Conformational change of an RNA hairpin
- Single pathway (efficiency = 100%)
Energy-entropy competition → conformational
change
89 Free energy landscape for an RNA hairpin
90Conformational change of a multi-branched RNA
- Multiple pathways (efficiency < 100%)
91Free energy landscape for a multi-branched RNA
92RNA Has a Rugged Free Energy Landscape
Strong stacking interaction
Stable intermediates can easily be formed
Rugged free energy landscape
RNA folding is not a simple two-state process
93Part IX. RNA Tertiary Structure Folding
Thermodynamics
94More Complex Tertiary Structural Motifs
95Part X. RNA-RNA Complex
96- Complex formation ≠ rigid docking
- Interplay between inter- and intra-chain
stability
97Free Energy of Complex Formation
Free energy of binding: $\Delta F = F_2 - (F_1 + F_1')$
$\Delta F > 0$: free form is more favorable
$\Delta F < 0$: bound form is more favorable
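The sign convention for the binding free energy can be captured in a short helper. The sketch below assumes that the free energy of the complex and of the two separate chains are already known, in the same units; the function and argument names, and the example values, are hypothetical.

```python
def binding_free_energy(f_complex, f_chain_a, f_chain_b):
    """Free energy of binding: complex minus the two free chains."""
    return f_complex - (f_chain_a + f_chain_b)


def favored_form(delta_f):
    """Map the sign of the binding free energy to the favored form."""
    if delta_f < 0:
        return "bound"
    if delta_f > 0:
        return "free"
    return "neither"


# Hypothetical free energies in kcal/mol (illustration only).
delta_f = binding_free_energy(-12.0, -4.0, -5.0)
print(delta_f, favored_form(delta_f))
```

Because the free chains can each fold on their own, a stable complex requires the inter-chain gain to outweigh the intra-chain stability that is given up, which is the interplay noted on the earlier slide.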
98Binding free energy
99Dependence on the nucleation parameter
Competition between combinatorial entropy and
(translational + rotational) entropies
100The conformational entropy for RNA-RNA complex
101Conformational change in a model complex system
102Energy Function
$E_{AU} = E_{UA} = -\varepsilon/2$
$E_{GC} = E_{CG} = -\varepsilon$
103Conformational changes in a model complex system
2000 > 410