Title: BCB 444/544
1 BCB 444/544
- Lecture 25
- More RNA Structure
- BCB 544 Projects
- 25_Oct19
2 Required Reading (before lecture)
- Mon Oct 15 - Lecture 23
- Protein Tertiary Structure Prediction
- Chp 15 - pp 214 - 230
- Wed Oct 17 & Thurs Oct 18 - Lecture 24 & Lab 8 (Terribilini)
- RNA Structure/Function & RNA Structure Prediction
- Chp 16 - pp 231 - 242
- Fri Oct 19 - Lecture 25 (-> Mon Oct 22)
- Gene Prediction
- Chp 8 - pp 97 - 112
3 Homework Assignment
- ALL: HomeWork 4 (emailed & posted online Sat AM)
- Due Mon Oct 22 by 5 PM (not Fri Oct 19)
- Read:
- Ginalski et al. (2005) Practical Lessons from Protein Structure Prediction, Nucleic Acids Res. 33:1874-91. http://nar.oxfordjournals.org/cgi/content/full/33/6/1874
- (PDF posted on website)
- Although somewhat dated, this paper provides a nice overview of protein structure prediction methods and evaluation of predicted structures.
- Your assignment is to write a summary of this paper - for details see HW4, posted online & sent by email on Sat Oct 13
4 BCB 544 Only: New Homework Assignment
- 544 Extra2 (posted online Thurs?)
- Due Fri Nov 2 by 5 PM
- HW2 is next step in Team Projects
- Will end lecture a few minutes early today to allow time to meet & discuss 544 Team Projects
5 Seminars this Week
- BCB List of URLs for Seminars related to Bioinformatics - http://www.bcb.iastate.edu/seminars/index.html
- Oct 18 Thur - BBMB Seminar 4:10 in 1414 MBB
- Sachdev Sidhu (Genentech): Phage peptide and antibody libraries in protein engineering and ligand selection - Was great talk!
- Oct 19 Fri - BCB Faculty Seminar 2:10 in 102 ScI
- Lyric Bartholomay (Ent, ISU): Computational Biology and vector-borne disease: from the field to the bench
6 Another local example: Combining Structure Prediction, Machine Learning & "Real" (wet-lab) Experiments to Investigate the Lentiviral Rev Protein: A Step Toward New HIV Therapies
- Susan Carpenter (Washington State Univ), Wendy Sparks, Yvonne Wannemuehler
- Drena Dobbs, GDCB, Jae-Hyung Lee, Michael Terribilini
- Kai-Ming Ho, Physics, Yungok Ihm, Haibo Cao, Cai-zhuang Wang
- Gloria Culver, BBMB, Laura Dutca
7 Chp 16 - RNA Structure Prediction
- SECTION V STRUCTURAL BIOINFORMATICS
- Xiong: Chp 16 RNA Structure Prediction (Terribilini)
- RNA Function
- Types of RNA Structures
- RNA Secondary Structure Prediction Methods
- Ab Initio Approach
- Comparative Approach
- Performance Evaluation
8 RNA Function
This slide has been changed
- Storage/transfer of genetic information
- Newly discovered regulatory functions
- miRNA & siRNA pathways, especially
- Catalytic
9 RNA types & functions
Types of RNAs | Primary Function(s)
mRNA - messenger | translation (protein synthesis), regulatory
rRNA - ribosomal | translation (protein synthesis) <catalytic>
tRNA - transfer | translation (protein synthesis)
hnRNA - heterogeneous nuclear | precursors & intermediates of mature mRNAs & other RNAs
scRNA - small cytoplasmic | signal recognition particle (SRP), tRNA processing <catalytic>
snRNA - small nuclear | mRNA processing, polyA addition <catalytic>
snoRNA - small nucleolar | rRNA processing/maturation/methylation
regulatory RNAs (siRNA, miRNA, etc.) | regulation of transcription and translation, other??
10 RNA Structures
- RNA forms complex 3D structures
- Mainly "single-stranded" - but
- Single RNA strands can self-hybridize to form
- Base-paired regions
11 Levels of RNA Structure
This slide has been changed
- Like proteins, RNA has primary, secondary, and tertiary structure (& quaternary structure, too)
- Primary structure: Ribonucleotide sequence
- Secondary structure: Helix vs turn (base-paired vs single-stranded). Note: in RNA, helices often involve long-range interactions
- Tertiary structure: 3D structure (also due to long-range interactions)
- Quaternary structure: complex of 2 or more RNA strands
Rob Knight, Univ Colorado
12 Common structural motifs in RNA
- Helices
- Loops
- Hairpin
- Interior
- Bulge
- Multibranch
- Pseudoknots
- Tetraloops
Fig 6.2 Baxevanis & Ouellette 2005
13 Covalent & non-covalent bonds in RNA
This is a new slide
- Primary
- Covalent bonds
- Secondary/Tertiary
- Non-covalent bonds
- H-bonds
- (base-pairing)
- Base stacking
Fig 6.2 Baxevanis & Ouellette 2005
14 RNA Structure Prediction
This slide has been changed
- RNA tertiary structure is very difficult to predict
- Focus on predicting RNA secondary structure
- Given an RNA sequence, predict its secondary structure
- Almost all methods ignore higher order secondary structures such as pseudoknots & tetraloops
- Specialized software is available for predicting these
15 RNA Pseudoknots & Tetraloops
This is a new slide
- Often have important regulatory or catalytic functions
Pseudoknot
Tetraloop
http://academic.brooklyn.cuny.edu/chem/zhuang/QD/mckay_hr.gif
http://www.lbl.gov/Science-Articles/Research-Review/Annual-Reports/1995/images/rna.gif
16 Base Pairing in RNA
This slide has been changed
- G-C, A-U, G-U ("wobble") & many variants
See: IMB Image Library of Biological Molecules
http://www.fli-leibniz.de/ImgLibDoc/nana/IMAGE_NANA.html#basepairs
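The pairing rules on this slide can be written as a small lookup. This is only a sketch: the function names `can_pair` and `pair_type` are made up for illustration, and the "many variants" of non-canonical pairs are deliberately left out.

```python
# Watson-Crick pairs plus the G-U "wobble" pair named on the slide.
# Other (non-canonical) variants are intentionally not modeled here.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_type(base_a: str, base_b: str) -> str:
    """Return 'watson-crick', 'wobble', or 'none' for two RNA bases."""
    pair = (base_a.upper(), base_b.upper())
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "none"


def can_pair(base_a: str, base_b: str) -> bool:
    """True if the two bases form one of the pairs listed above."""
    return pair_type(base_a, base_b) != "none"


if __name__ == "__main__":
    for a, b in [("G", "C"), ("A", "U"), ("G", "U"), ("A", "G")]:
        print(a, b, pair_type(a, b))
```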
17 Experimental RNA structure determination?
- X-ray crystallography
- NMR spectroscopy
- Enzymatic/chemical mapping
18 RNA Secondary Structure Prediction Methods
This slide has been changed
- Two (three, recently) main types of methods
- Ab initio - based on calculating most energetically favorable secondary structure(s)
- Energy minimization (thermodynamics)
- Comparative approach - based on comparisons of multiple evolutionarily-related RNA sequences
- Sequence comparison (co-variation)
- Combined computational & experimental
- Use experimental constraints when available
19 RNA Secondary structure prediction - 1
This is a new slide
1) Energy minimization (thermodynamics)
- Algorithms
- Dynamic programming to find high probability pairs
- (also, some Genetic algorithms)
- Software
- Mfold - Zuker
- RNAfold (Vienna Package) - Hofacker
- RNAstructure - Mathews
- Sfold - Ding & Lawrence
R Knight 2005
20 RNA Secondary structure prediction - 2
This is a new slide
2) Comparative sequence analysis (co-variation)
- Algorithms
- Mutual information
- Context-free grammars
- Software
- RNAalifold
- Foldalign
- Dynalign
21 RNA Secondary structure prediction - 3
This is a new slide
3) Combined experimental & computational
- Experiments
- Map single-stranded vs double-stranded regions in folded RNA
- How?
- Enzymes: S1 nuclease, T1 RNase
- Chemicals: kethoxal, DMS, OH radical
- Software
- Mfold
- Sfold
- RNAstructure
- RNAfold
- RNAalifold
22 1 - Ab Initio Prediction
This slide has been changed
- Requires only a single RNA sequence
- Calculates minimum free energy structure
- Base-paired regions have lower free energy, so methods "attempt to find secondary structure with maximal base pairing" (Careful!)
- IMPORTANT: Largest contribution to energy is due to nearest-neighbor (base-stacking) interactions, not base-pairing!
23 Ab Initio Prediction: Clarifications
This slide has been changed
- Free energy is calculated based on parameters determined in the wet lab
- Correction: Use known energy associated with each type of nearest-neighbor pair (base-stacking) (not base-pair)
- Base-pair formation is not independent: multiple base-pairs adjacent to each other are more favorable than individual base-pairs - cooperative - because of base-stacking interactions
- Bulges and loops adjacent to base-pairs have a free energy penalty
24 Ab Initio Prediction: What are the assumptions?
This is a new slide
- Native tertiary structure or "fold" of an RNA molecule is (one of) its "lowest" free energy configuration(s)
- Gibbs free energy ΔG in kcal/mol at 37°C
- equilibrium stability of structure
- lower values (negative) are more favorable
- Is this assumption valid?
- in vivo? - this may not hold, but we don't really know
25 Energy minimization: What are the rules?
This is a new slide
What gives here?
Why 1.2 vs 1.6?
C Staben 2005
26 Energy minimization calculations: Base-stacking is critical
This is a new slide
- Tinoco et al.
C Staben 2005
27 Ab initio RNA Structure Prediction Uses Nearest-neighbor parameters
This is a new slide
- Most methods for ab initio prediction (free energy minimization) use nearest-neighbor energy parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of ΔG at 37°C)
- most available software packages use same set of parameters
- Mathews, Sabina, Zuker
28 Ab Initio Energy Calculation
This slide has been changed
- Search for all possible base-pairing patterns
- Calculate total energy of each structure based on all stabilizing and destabilizing forces
- Total free energy for a specific RNA conformation = Sum of incremental energy terms for:
- helical stacking (sequence dependent)
- loop initiation
- unpaired stacking
(favorable "increments" are < 0)
Fig 6.3 Baxevanis & Ouellette 2005
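The "sum of increments" idea can be sketched for the simplest case, a single hairpin. Everything numeric here is a PLACEHOLDER chosen for illustration, not a published nearest-neighbor parameter; real programs read these values from experimentally derived tables. The names `stack_increment`, `LOOP_INITIATION` and `hairpin_free_energy` are hypothetical, and unpaired stacking is omitted to keep the sketch short.

```python
# PLACEHOLDER energies in kcal/mol. These are NOT measured values; they only
# mimic the sign convention on the slide (favorable increments are < 0).
LOOP_INITIATION = 5.0  # hypothetical penalty for closing a hairpin loop


def stack_increment(pair_a: str, pair_b: str) -> float:
    """Hypothetical stacking increment for two ADJACENT base pairs.

    Pairs are written as two letters, e.g. "GC" or "AU". The placeholder
    rule makes stacks containing G-C pairs more favorable.
    """
    strong = {"GC", "CG"}
    n_strong = (pair_a in strong) + (pair_b in strong)
    return {0: -1.0, 1: -2.0, 2: -3.0}[n_strong]


def hairpin_free_energy(stem_pairs: list[str]) -> float:
    """Total energy = sum of stacking increments + loop initiation penalty.

    Note that a base pair contributes nothing by itself; only adjacent
    pairs (stacks) lower the energy, which is the cooperativity the
    lecture emphasizes.
    """
    stacking = sum(
        stack_increment(a, b) for a, b in zip(stem_pairs, stem_pairs[1:])
    )
    return stacking + LOOP_INITIATION


if __name__ == "__main__":
    print(hairpin_free_energy(["GC"]))              # lone pair: penalty only
    print(hairpin_free_energy(["GC", "CG", "AU"]))  # two stacks offset penalty
```

With these placeholder numbers a lone base pair leaves the hairpin unfavorable, while each additional adjacent pair adds a stabilizing stacking term.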
29 Dot Matrices
- Can be used to find all possible base pair patterns
- Compare input sequence to itself and put a dot where there is a complementary base
R Knight 2005
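A self-comparison dot matrix is easy to build directly from that description. In this sketch, `dot_matrix` and `render` are illustrative names, and counting G-U wobble pairs as "complementary" is an optional assumption that is off by default.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def dot_matrix(seq: str, allow_wobble: bool = False) -> list[list[bool]]:
    """matrix[i][j] is True where base i could pair with base j."""
    seq = seq.upper()
    allowed = (WATSON_CRICK | WOBBLE) if allow_wobble else WATSON_CRICK
    return [[(a, b) in allowed for b in seq] for a in seq]


def render(seq: str, matrix: list[list[bool]]) -> str:
    """Draw the matrix as text: a dot marks a possible base pair."""
    lines = ["  " + seq]
    for base, row in zip(seq, matrix):
        lines.append(base + " " + "".join("." if hit else " " for hit in row))
    return "\n".join(lines)


if __name__ == "__main__":
    sequence = "GGGAAAUCC"
    print(render(sequence, dot_matrix(sequence, allow_wobble=True)))
```

Potential helices show up as runs of dots along anti-diagonals, because a stem pairs position i with j, i+1 with j-1, and so on.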
30 Dynamic Programming
This slide has been changed
- Finding optimal secondary structure is difficult - lots of possibilities
- Compare RNA sequence with itself
- Apply scoring scheme based on energy parameters for base stacking, cooperativity, and penalties for destabilizing forces
- Find path that represents most energetically favorable secondary structure
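To make the dynamic programming idea concrete, here is a deliberately simplified sketch. It is NOT the energy-based scoring described above: it swaps in the classic Nussinov-style recursion, which just maximizes the number of base pairs, because that fits in a few lines. The names `nussinov` and `to_dot_bracket`, the dot-bracket output, and the minimum loop size of 3 are all assumptions made for this example.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq: str, min_loop: int = 3):
    """Return (max number of nested base pairs, sorted list of (i, j) pairs)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, []
    score = [[0] * n for _ in range(n)]
    # Fill by increasing distance between i and j.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(score[i + 1][j], score[i][j - 1])  # i or j unpaired
            if (seq[i], seq[j]) in PAIRS:                 # i pairs with j
                best = max(best, score[i + 1][j - 1] + 1)
            for k in range(i + 1, j):                     # two sub-structures
                best = max(best, score[i][k] + score[k + 1][j])
            score[i][j] = best
    # Traceback: recover one optimal set of pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if score[i][j] == score[i + 1][j]:
            stack.append((i + 1, j))
        elif score[i][j] == score[i][j - 1]:
            stack.append((i, j - 1))
        elif (seq[i], seq[j]) in PAIRS and score[i][j] == score[i + 1][j - 1] + 1:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if score[i][j] == score[i][k] + score[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return score[0][n - 1], sorted(pairs)


def to_dot_bracket(length: int, pairs) -> str:
    """Write pairs as '(' and ')' and unpaired bases as '.'."""
    chars = ["."] * length
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)


if __name__ == "__main__":
    rna = "GGGAAAUCC"
    count, found = nussinov(rna)
    print(count, to_dot_bracket(len(rna), found))
```

Energy-based programs keep the same table-filling structure but replace the "+1 per pair" score with stacking and loop energies and minimize instead of maximize. Like most such methods, this recursion cannot produce pseudoknots.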
31 Problem with DP Approach
- DP returns SINGLE lowest energy structure
- There may be many structures with similar energies
- Also, predicted secondary structure is only as good as energy parameters used
- Solution: return multiple structures with near optimal energies
32 Popular Ab Initio Prediction Programs
- Mfold
- Combines DP with thermodynamic calculations
- Fairly accurate for short sequences, less accurate as sequence length increases
- RNAfold
- Returns multiple structures near predicted optimal structure
- Computes larger number of potential secondary structures than Mfold, so uses a simplified energy function
33 2 - Comparative Prediction Approaches
- Use multiple sequence alignment
- Assume related sequences fold into same secondary structure
34 Co-variation patterns in MSAs are critical
- RNA functional motifs are conserved
- To maintain RNA structure during evolution, a mutation in a base-paired residue must be compensated for by a mutation in residue with which it pairs
- Comparative methods search for co-variation patterns in MSAs
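One standard way to score co-variation between two alignment columns is mutual information, listed earlier among the comparative algorithms. The sketch below assumes a gap-free toy alignment invented for illustration; the helper names are hypothetical, and real tools add corrections (gaps, sequence weighting, small-sample bias) that are not shown.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j) -> float:
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def column_pair_scores(alignment):
    """Score every pair of columns in a list of equal-length sequences."""
    columns = list(zip(*alignment))
    return {
        (i, j): mutual_information(columns[i], columns[j])
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    }


if __name__ == "__main__":
    # Toy alignment: columns 0 and 4 change together (compensating
    # mutations), columns 1-3 are fully conserved.
    toy_msa = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
    scores = column_pair_scores(toy_msa)
    print(scores[(0, 4)], scores[(0, 1)])
```

Columns that vary together score high, while a perfectly conserved column scores zero against everything: conservation alone carries no co-variation signal.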
35 Consensus Structures
- Predict secondary structure of each individual sequence in a MSA
- Compare all structures and try to identify a consensus structure
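A very simple way to compare structures is to keep only the base pairs that recur across sequences. The sketch assumes each structure is already given as a set of (i, j) pairs in shared alignment-column coordinates; the function name `consensus_pairs` and the 60% cutoff are arbitrary choices for illustration, not how any particular program does it.

```python
from collections import Counter


def consensus_pairs(structures, min_fraction: float = 0.6):
    """Keep base pairs present in at least min_fraction of the structures.

    structures: list of sets of (i, j) tuples, with i and j given as
    alignment column indices so that pairs are comparable across sequences.
    """
    counts = Counter(pair for pairs in structures for pair in set(pairs))
    total = len(structures)
    return sorted(p for p, c in counts.items() if c / total >= min_fraction)


if __name__ == "__main__":
    predicted = [
        {(0, 8), (1, 7), (2, 6)},
        {(0, 8), (1, 7)},
        {(0, 8), (2, 5)},
    ]
    print(consensus_pairs(predicted))
```

This voting scheme does not check that the surviving pairs are mutually compatible (one column could end up paired twice), so a real implementation would need an extra conflict-resolution step.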
36 Popular Comparative Prediction Programs
- Two main types
- Require user to provide MSA
- RNAalifold
- No MSA required
- Foldalign
- Dynalign
37 RNAalifold
- Requires user to provide MSA
- Creates a scoring matrix combining minimum free energy and co-variation information
- DP used to identify minimum free energy structure
38 Foldalign
- User provides pair of unaligned RNA sequences
- Constructs alignment & computes conserved structure
- Suitable only for relatively short sequences
39 Dynalign
- User provides two unaligned input sequences
- Calculates possible secondary structures using algorithm similar to Mfold
- Compares multiple structures from both sequences to find a common structure
40 3 - Popular Programs that use Combined Computational & Experimental Approaches
- Mfold
- Sfold
- RNAstructure
- RNAfold
- RNAalifold
41 Comparison of Predictions for Single RNA using Different Methods
JH Lee 2007
42 Comparison of Mfold Predictions +/- Constraints
Mfold plus constraints: -54.84 kcal/mol
Mfold: -126.05 kcal/mol
JH Lee 2007
43 Performance Evaluation
This slide has been changed
- Ab initio methods: correlation coefficient 20-60%
- Comparative approaches: correlation coefficient 20-80%
- Programs that require user to supply MSA are more accurate
- Comparative programs are consistently more accurate than ab initio
- Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures! - Gutell, Pace
- BEST APPROACH? Methods that combine computational prediction (ab initio & comparative) with experimental constraints (from chemical/enzymatic modification studies)
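The slides do not say how the correlation coefficient is computed. One common choice for base-pair predictions, assumed here, is to measure sensitivity and positive predictive value (PPV) against a known structure and take their geometric mean as an approximation to the Matthews correlation coefficient. The function name `evaluate_pairs` and the example pairs are made up for illustration.

```python
from math import sqrt


def evaluate_pairs(predicted, known):
    """Compare predicted base pairs with a reference set of (i, j) pairs.

    Returns (sensitivity, ppv, approx_cc), where approx_cc is the geometric
    mean of sensitivity and PPV (an ASSUMED stand-in for the correlation
    coefficient quoted on the slide).
    """
    predicted, known = set(predicted), set(known)
    true_pos = len(predicted & known)
    false_pos = len(predicted - known)
    false_neg = len(known - predicted)
    sensitivity = true_pos / (true_pos + false_neg) if known else 0.0
    ppv = true_pos / (true_pos + false_pos) if predicted else 0.0
    return sensitivity, ppv, sqrt(sensitivity * ppv)


if __name__ == "__main__":
    reference = {(0, 8), (1, 7), (2, 6), (10, 20)}
    prediction = {(0, 8), (1, 7), (3, 5), (11, 19)}
    print(evaluate_pairs(prediction, reference))
```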
44 BCB 544 "Team" Projects
- 544 Extra HW2 is next step in Team Projects
- Write 1 page outline
- Schedule meeting with Michael & Drena to discuss topic
- Read a few papers
- Write a more detailed plan
- You may work alone if you prefer
- Last week of classes will be devoted to Projects
- Written reports due Mon Dec 3 (no class that day)
- Oral presentations (15-20') will be Wed-Fri Dec 5,6,7
- 1 or 2 teams will present during each class period
- See Guidelines for Projects posted online