RNA Secondary Structure - PowerPoint PPT Presentation

Slides: 41
Provided by: hein

Transcript and Presenter's Notes


1
RNA Secondary Structure
What is RNA?
Definition of RNA secondary structure
RNA molecule evolution
Algorithms for base pair maximisation
Chomsky's Linguistic Hierarchy
Stochastic Context Free Grammars
Evolution
Miscellaneous topics
2
Base Pairing (from Przytycka)
3
An Example t-RNA
From Paul Higgs
4
Known RNAs
t-RNA (transfer-)
m-RNA (messenger-)
mi-RNA (micro-)
Sn-RNA (small nuclear)
RNAi (interfering)
Srp-RNA (Signal Recognition Particle)
5S RNA, 16S RNA, 23S RNA
RNA viruses: Retroviruses (HIV), Coronavirus (SARS), ...
5
Functions of RNAs
Information transfer: mRNA
Codon -> amino acid adapter: tRNA
Other base pairing functions ???
Enzymatic reactions
Structural
Metabolic ???
Regulatory: RNAi
6
Known RNA Structures
http://www.rnabase.org/metaanalysis/
http://www.sanger.ac.uk/Software/rfam
http://www.scor.lbl.gov
Rfam: database of RNA alignments and secondary structure models
Scor: database of experimentally solved RNA structures
Figure 1: The cumulative number of publicly available RNA-containing structures determined by X-ray crystallography (red), NMR spectroscopy (purple), or all techniques combined (blue) has been steadily increasing since the first RNA-containing structure was released in 1978. There has been a substantial acceleration in RNA structure determinations since the mid-1990s.
Figure 2: In a positive new trend, the average number of conformational map outliers per residue solved has shown a consistent downtrend recently. Interestingly, most of the improvement can be attributed to structures determined by X-ray crystallography. There has been no consistent trend for structures determined by NMR spectroscopy.
7
RNA SS: recursive definition, Nussinov (1978), remade from Durbin et al., 1997
Secondary structure: a set of paired positions on the interval [i,j]. A-U and C-G can base pair. Some other pairings can occur, and triple interactions exist.
Pseudoknot: non-nested pairing, i < j < k < l with pairs i-k and j-l.
[Figure: the four cases of the recursion on the interval i..j: i,j pair (remaining interval i+1..j-1); i unpaired (i+1..j); j unpaired (i..j-1); bifurcation (i..k and k+1..j).]
8
RNA Secondary Structure
The number of secondary structures (Waterman, 1978)
[Figure: recursive decomposition of the structures on N1 ... NL in parenthesis notation, with labels N1, Nk, Nk+1, NL.]
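The counting idea can be sketched in code: the last base is either unpaired, or it pairs with some earlier base k, which splits the sequence into an outer and an inner part. This is a minimal sketch under stated assumptions, not necessarily the exact recursion on the slide: any two bases may pair, and a hairpin loop must contain at least `min_loop` unpaired bases. The function name and the `min_loop` parameter are illustrative.

```python
def count_structures(n, min_loop=1):
    """Count secondary structures on n positions (assumption: any two
    positions may pair, with at least min_loop unpaired positions
    enclosed by every hairpin)."""
    s = [1] * (n + 1)  # s[m] = number of structures on m positions
    for m in range(1, n + 1):
        total = s[m - 1]  # case 1: position m is unpaired
        # case 2: position m pairs with position k (1-based);
        # k-1 positions lie outside the pair and m-k-1 inside it.
        for k in range(1, m - min_loop):
            total += s[k - 1] * s[m - k - 1]
        s[m] = total
    return s[n]


print([count_structures(n) for n in range(7)])
```

With `min_loop=0` the same loop counts all nested pairings without any loop-size restriction.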
9
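RNA Matching Maximisation, remade from Durbin et al., 1997
Example: GGGAAAUCC (allowed pairs A-U, G-C)
Matrix of the maximal number of base pairs on each interval (row = start i, column = end j; each row starts with the boundary entry for j = i-1):
      G  G  G  A  A  A  U  C  C
  G   0  0  0  0  0  0  1  2  3
  G   0  0  0  0  0  0  1  2  3
  G      0  0  0  0  0  1  2  2
  A         0  0  0  0  1  1  1
  A            0  0  0  1  1  1
  A               0  0  1  1  1
  U                  0  0  0  0
  C                     0  0  0
  C                        0  0
[Figure: drawing of the folded structure for GGGAAAUCC.]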
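The matrix for GGGAAAUCC can be reproduced with a short dynamic programme over the four cases of the recursion (i unpaired, j unpaired, i,j pair, bifurcation). This is a minimal sketch, assuming only A-U and G-C pairs and no minimum loop length, as in the example; the function names are illustrative and indices are 0-based.

```python
def can_pair(x, y):
    """Allowed pairs in the example: A-U and G-C."""
    return {x, y} in ({"A", "U"}, {"G", "C"})


def nussinov(seq):
    """Return g where g[i][j] is the maximal number of pairs on seq[i..j]."""
    n = len(seq)
    g = [[0] * n for _ in range(n)]  # lower triangle stays 0 (empty intervals)
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(g[i + 1][j], g[i][j - 1])  # i unpaired, j unpaired
            if can_pair(seq[i], seq[j]):
                best = max(best, g[i + 1][j - 1] + 1)  # i,j pair
            for k in range(i + 1, j):  # bifurcation into i..k and k+1..j
                best = max(best, g[i][k] + g[k + 1][j])
            g[i][j] = best
    return g


def traceback(seq, g):
    """Recover one optimal set of pairs from the filled matrix."""
    pairs, stack = [], [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if g[i + 1][j] == g[i][j]:
            stack.append((i + 1, j))
        elif g[i][j - 1] == g[i][j]:
            stack.append((i, j - 1))
        elif can_pair(seq[i], seq[j]) and g[i + 1][j - 1] + 1 == g[i][j]:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if g[i][k] + g[k + 1][j] == g[i][j]:
                    stack.extend([(k + 1, j), (i, k)])
                    break
    return sorted(pairs)


seq = "GGGAAAUCC"
g = nussinov(seq)
print(g[0][len(seq) - 1])  # maximal number of pairs for the whole sequence
print(traceback(seq, g))   # one optimal structure as 0-based index pairs
```

Several structures can reach the same maximum, so the traceback returns one of them, depending on the order in which the cases are tested.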
10
RNA Secondary Structure Evolution
From Durbin et al. (1998), Biological Sequence Analysis
11
Inference about hidden structure
Observable
Unobservable
Goldman, Thorne & Jones, 96
Knudsen & Hein, 99
Pedersen & Hein, 03
Observable
Unobservable
12
Goldman, Thorne & Jones: Structure Evolution
1  A S D F G H J K L P
2  A S D F G H J K L P
3  D S D F G K J K L C
4  D S D F G K J K L C
HMM ?? x x x x x ?????????????? x x L x x x
13
Three Questions
What is the probability of the data?
What is the most probable hidden configuration?
What is the probability of a specific hidden state?
Training: given a set of instances, find parameters that make them probable, treating the instances as independent.
[Figure: HMM with observations O1 ... O10 and hidden states H1, H2, H3.]
14
The Basic Calculations
What is the most probable hidden configuration?
What is the probability of specific hidden
state?
The time required for these calculations is proportional to K^2 L, where K is the number of hidden states and L the length of the sequence.
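The three questions can be answered with the forward, Viterbi and forward-backward recursions. Below is a minimal sketch in plain Python; the two-state model (`STATES`, `START`, `TRANS`, `EMIT`) and the observation string are hypothetical values chosen for illustration, not parameters from the slides. Each column takes K^2 operations, which gives the K^2 L running time.

```python
def forward(obs, states, start, trans, emit):
    """Columns f[t][s] = P(obs[0..t], state at t is s). Needs non-empty obs."""
    cols = [{s: start[s] * emit[s][obs[0]] for s in states}]
    for o in obs[1:]:
        prev = cols[-1]
        # K states, each summing over K predecessors: K^2 work per position.
        cols.append({s: emit[s][o] * sum(prev[r] * trans[r][s] for r in states)
                     for s in states})
    return cols


def backward(obs, states, trans, emit):
    """Columns b[t][s] = P(obs[t+1..] | state at t is s)."""
    cols = [{s: 1.0 for s in states}]
    for o in reversed(obs[1:]):
        nxt = cols[0]
        cols.insert(0, {s: sum(trans[s][r] * emit[r][o] * nxt[r] for r in states)
                        for s in states})
    return cols


def data_probability(obs, states, start, trans, emit):
    """Question 1: probability of the data, summed over all hidden paths."""
    return sum(forward(obs, states, start, trans, emit)[-1].values())


def viterbi(obs, states, start, trans, emit):
    """Question 2: most probable hidden path and its joint probability."""
    col = {s: start[s] * emit[s][obs[0]] for s in states}
    pointers = []
    for o in obs[1:]:
        new_col, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda r: col[r] * trans[r][s])
            new_col[s] = col[best] * trans[best][s] * emit[s][o]
            ptr[s] = best
        col = new_col
        pointers.append(ptr)
    last = max(states, key=lambda s: col[s])
    path = [last]
    for ptr in reversed(pointers):
        path.append(ptr[path[-1]])
    return path[::-1], col[last]


def posterior(obs, states, start, trans, emit):
    """Question 3: P(state at t is s | data) for every position t."""
    f = forward(obs, states, start, trans, emit)
    b = backward(obs, states, trans, emit)
    total = sum(f[-1].values())
    return [{s: f[t][s] * b[t][s] / total for s in states}
            for t in range(len(obs))]


# Hypothetical two-state model, for illustration only.
STATES = ("A", "B")
START = {"A": 0.5, "B": 0.5}
TRANS = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.2, "B": 0.8}}
EMIT = {"A": {"x": 0.7, "y": 0.3}, "B": {"x": 0.1, "y": 0.9}}

print(data_probability("xy", STATES, START, TRANS, EMIT))
print(viterbi("xy", STATES, START, TRANS, EMIT))
print(posterior("xy", STATES, START, TRANS, EMIT))
```

For long sequences the products underflow, so a practical version would work with logarithms or rescale each column.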
15
Empirical Doublet Models
Alignment of N related molecules (slowly evolving), each L long.
[Figure: alignment rows, each reading AUUGCAUUCCA.]
r_{N1,N2} = #(N1 -> N2, N2 -> N1) / [N_{P/U} (N_{P/U} - 1) / 2],  N1 not N2
where N_{P/U} is the number of paired/unpaired positions in the alignment.
rN1,N2 N1rN1,N2/N2

Partial Doublet Model
        AU      UA      GC      CG      UG      GU
AU    -1.16    .18     .5      .12     .02     .27
UA     .18   -1.16     .12     .5      .27     .02
GC     .33     .08    -.82     .13     .02     .23
CG     .08     .33     .13    -.82     .23     .02
UG     .08    1.00     .1     1.26   -2.56     .04
GU    1.00     .08    1.26     .1      .04   -2.56

Singlet/Marginalized Doublet Model
        A             C             G            U
A   -.75/-1.15     .16/.13       .32/.79      .26/.23
C    .4/.09      -1.57/-.84      .24/.16      .93/.59
G    .55/.45       .17/.13      -.96/-.7      .24/.11
U    .35/.18       .51/.70       .19/.16    -1.05/-1.03
16
Doublet Evolution (from Bjarne Knudsen)
17
Structure Dependent Evolution RNA
U A C A C C G U
U A C A C C G U
U A C A C C G U
U A C A C C G U
18
Structure Dependent Evolution RNA
19
Grammars: Finite Set of Rules for Generating Strings
Regular
Context Free
Context Sensitive
General (also erasing)
finished: no variables
20
Chomsky Linguistic Hierarchy
Source: Biological Sequence Analysis
W: nonterminal sign; a: any sign; α, β, γ are strings, but β is not the null string; ε: empty string.
Regular Grammars: W -> aW, W -> a
Context-Free Grammars: W -> β
Context-Sensitive Grammars: α1 W α2 -> α1 β α2
Unrestricted Grammars: α1 W α2 -> γ
The listing above is in order of increasing power of string generation. For instance, context-free grammars can generate everything that regular grammars can, and some more.
21
Simple String Generators
Non-terminals (capital) --- terminals (small)
i. Start with S: S -> aT | bS, T -> aS | bT | ε
One sentence (odd number of a's): S -> aT -> aaS -> aabS -> aabaT -> aaba
ii. S -> aSa | bSb | aa | bb
One sentence (even-length palindromes): S -> aSa -> abSba -> abaaba
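The two derivations can be replayed mechanically by rewriting the leftmost occurrence of a non-terminal at each step. This is a minimal sketch; the helper names (`derive`, `in_language_i`, `in_language_ii`) are illustrative, and the empty string stands for ε.

```python
def derive(start, steps):
    """Apply (non-terminal, replacement) steps in order and return every
    intermediate string of the derivation."""
    forms = [start]
    for nonterminal, replacement in steps:
        current = forms[-1]
        if nonterminal not in current:
            raise ValueError(f"{nonterminal!r} does not occur in {current!r}")
        forms.append(current.replace(nonterminal, replacement, 1))
    return forms


def in_language_i(s):
    """Grammar i generates the strings over {a, b} with an odd number of a's."""
    return set(s) <= {"a", "b"} and s.count("a") % 2 == 1


def in_language_ii(s):
    """Grammar ii generates the non-empty even-length palindromes over {a, b}."""
    return (len(s) > 0 and len(s) % 2 == 0
            and set(s) <= {"a", "b"} and s == s[::-1])


steps_i = [("S", "aT"), ("T", "aS"), ("S", "bS"), ("S", "aT"), ("T", "")]
steps_ii = [("S", "aSa"), ("S", "bSb"), ("S", "aa")]

print(" -> ".join(derive("S", steps_i)))
print(" -> ".join(derive("S", steps_ii)))
```

Grammar i only needs to remember whether the number of a's so far is even (S) or odd (T), which is why a regular grammar suffices; the palindrome language needs the nested rules of grammar ii.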
22
Stochastic Grammars
The grammars above classify every string as belonging to the language or not.
Every variable has a finite set of substitution rules. Assigning probabilities to the use of each rule assigns probabilities to the strings in the language.
If each string has a unique derivation (creation), the probability of the string is the product of the probabilities of the applied rules.
i. Start with S. S -> (0.3)aT | (0.7)bS, T -> (0.2)aS | (0.4)bT | (0.2)ε
S -> aT -> aaS -> aabS -> aabaT -> aaba, with rule probabilities 0.3, 0.2, 0.7, 0.3, 0.2
ii. S -> (0.3)aSa | (0.5)bSb | (0.1)aa | (0.1)bb
S -> aSa -> abSba -> abaaba, with rule probabilities 0.3, 0.5, 0.1
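The probability of each example derivation is the product of the probabilities of the rules it uses. This is a minimal sketch with the rule probabilities copied as transcribed (the three T rules sum to 0.8 in the transcript, and the sketch does not renormalise them); the dictionary layout and function name are illustrative, and the empty string stands for ε.

```python
from math import prod  # available from Python 3.8

# Rule probabilities keyed by (non-terminal, replacement).
RULES_I = {("S", "aT"): 0.3, ("S", "bS"): 0.7,
           ("T", "aS"): 0.2, ("T", "bT"): 0.4, ("T", ""): 0.2}
RULES_II = {("S", "aSa"): 0.3, ("S", "bSb"): 0.5,
            ("S", "aa"): 0.1, ("S", "bb"): 0.1}


def derivation_probability(rules, steps):
    """Product of the probabilities of the applied rules."""
    return prod(rules[step] for step in steps)


# S -> aT -> aaS -> aabS -> aabaT -> aaba
p_i = derivation_probability(
    RULES_I, [("S", "aT"), ("T", "aS"), ("S", "bS"), ("S", "aT"), ("T", "")])
# S -> aSa -> abSba -> abaaba
p_ii = derivation_probability(
    RULES_II, [("S", "aSa"), ("S", "bSb"), ("S", "aa")])

print(p_i)   # 0.3 * 0.2 * 0.7 * 0.3 * 0.2, about 0.00252
print(p_ii)  # 0.3 * 0.5 * 0.1, about 0.015
```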
23
Secondary Structure Generators
S -> LS | L      (.869, .131)
F -> dFd | LS    (.788, .212)
L -> s | dFd     (.895, .105)
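A structure can be sampled from this grammar by expanding non-terminals with the listed probabilities. This is a minimal sketch, assuming the terminal s is written as an unpaired position "." and the paired terminals d ... d as "(" and ")"; the function name and the explicit stack (used instead of recursion to avoid deep call chains) are illustrative choices.

```python
import random


def sample_structure(rng):
    """Sample one structure in parenthesis notation from the grammar
    S -> LS | L, F -> dFd | LS, L -> s | dFd."""
    out = []
    stack = ["S"]  # symbols still to expand; the top of the stack is leftmost
    while stack:
        sym = stack.pop()
        if sym in "().":  # terminal symbol: emit it
            out.append(sym)
            continue
        if sym == "S":
            rhs = ["L", "S"] if rng.random() < 0.869 else ["L"]
        elif sym == "F":
            rhs = ["(", "F", ")"] if rng.random() < 0.788 else ["L", "S"]
        else:  # sym == "L"
            rhs = ["."] if rng.random() < 0.895 else ["(", "F", ")"]
        stack.extend(reversed(rhs))  # push so the leftmost symbol pops first
    return "".join(out)


print(sample_structure(random.Random(1)))
```

Because F can only end in LS, every pair of brackets encloses at least two positions, so the grammar never produces an empty hairpin.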
24
SCFG Analogue to HMM calculations (Durbin et al., 1998)
What is the probability of the data?
What is the most probable hidden configuration?
What is the probability of a specific hidden state?
[Figure: parse tree from the start symbol S over sequence positions 1 ... L, with non-terminal W covering positions i ... j and splitting into WL and WR.]
The time required for these calculations is proportional to K^2 L^3, where K is the number of hidden states and L the length of the sequence.
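The first question, the probability of the data, is answered for an SCFG by the inside algorithm, the analogue of the HMM forward recursion. Below is a minimal sketch for a grammar in Chomsky normal form (rules W -> X Y and W -> a). The toy grammar S -> SS (0.3) | a (0.7) is hypothetical and is not the structure grammar from the slides; the function and parameter names are illustrative.

```python
def inside_probability(seq, binary, terminal, start="S"):
    """Inside algorithm for a grammar in Chomsky normal form.

    binary:   {(W, X, Y): p} for rules W -> X Y
    terminal: {(W, a): p}    for rules W -> a
    Returns P(start generates seq), summed over all parse trees.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    nonterminals = {w for w, _, _ in binary} | {w for w, _ in terminal}
    # alpha[i][j][W] = probability that W generates seq[i..j]
    alpha = [[{} for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(seq):
        alpha[i][i] = {w: terminal.get((w, a), 0.0) for w in nonterminals}
    # The loops over span, start position and split point give the L^3 factor.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            cell = dict.fromkeys(nonterminals, 0.0)
            for (w, x, y), p in binary.items():
                for k in range(i, j):  # split into seq[i..k] and seq[k+1..j]
                    cell[w] += p * alpha[i][k][x] * alpha[k + 1][j][y]
            alpha[i][j] = cell
    return alpha[0][n - 1].get(start, 0.0)


# Hypothetical toy grammar: S -> S S (0.3) | a (0.7)
BINARY = {("S", "S", "S"): 0.3}
TERMINAL = {("S", "a"): 0.7}

print(inside_probability("aa", BINARY, TERMINAL))
print(inside_probability("aaa", BINARY, TERMINAL))  # sums over two parse trees
```

Replacing the sums with maxima and keeping back-pointers gives the CYK algorithm for the most probable parse, the analogue of Viterbi.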
26
RNA Secondary Structure (Knudsen & Hein, 03)
27
1. Accuracy as certainty threshold is increased.
2. Accuracy as function of sequence number
From Knudsen & Hein (1999)
28
RNA Secondary Structure (Knudsen & Hein, 03)
29
Observing Evolution has 2 parts
P(x)
x
x
P(Further history of x)
30
RNA Structure Prediction and Alignment
Can only align molecules of same type.
Sankoff, 1985: Combined RNA secondary structure & alignment
Gorodkin, 1997: Foldalign (only hairpins)
2002: Dynalign
Perriquet, 2002: Carnac
31
RNA Structure Representations
Circle with chords
Full Description
Mountains
Ordered Tree
Balanced Nested Parenthesis
From Fontana, 2003 & Moulton et al., 2002
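Several of these representations can be derived from one another. The sketch below converts balanced nested parentheses into a list of base pairs (the chords of the circle representation) and into a mountain profile. It assumes the common convention of "(" and ")" for paired positions and "." for unpaired ones, and one of several possible height conventions; the function names and the example structure are illustrative.

```python
def parse_pairs(structure):
    """Return the base pairs (0-based positions) of a parenthesis string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return sorted(pairs)


def mountain(structure):
    """Mountain profile: after each position, the number of pairs that have
    been opened but not yet closed."""
    heights, h = [], 0
    for ch in structure:
        if ch == "(":
            h += 1
        elif ch == ")":
            h -= 1
        heights.append(h)
    return heights


example = "((..))"  # hypothetical structure, for illustration
print(parse_pairs(example))
print(mountain(example))
```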
32
RNA Structure Evolution
Insertion-deletion process of Doublets & Singlets
There are methods of tree alignment that could probably be extended to statistical tree alignment.
33
Metrics on RNA Structures (Moulton, 2000)
Base Pair Metrics
Tree Metrics
Mountain Metrics
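Two of these metric families are easy to sketch for structures in parenthesis notation. The code below assumes a base pair metric defined as the number of pairs present in exactly one of the two structures, and a mountain metric defined as the summed absolute difference of the mountain heights; whether these match the exact definitions in Moulton et al. is an assumption, and the function names are illustrative.

```python
def pair_set(structure):
    """Set of base pairs (0-based positions) of a balanced parenthesis string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def heights(structure):
    """Mountain heights: open pairs remaining after each position."""
    result, h = [], 0
    for ch in structure:
        h += (ch == "(") - (ch == ")")
        result.append(h)
    return result


def base_pair_distance(s1, s2):
    """Number of base pairs found in exactly one of the two structures."""
    return len(pair_set(s1) ^ pair_set(s2))


def mountain_distance(s1, s2):
    """Sum of absolute height differences (structures of equal length)."""
    if len(s1) != len(s2):
        raise ValueError("structures must have the same length")
    return sum(abs(a - b) for a, b in zip(heights(s1), heights(s2)))


print(base_pair_distance("((..))", "(....)"))
print(mountain_distance("((..))", "(....)"))
```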
34
Population Genetics of Coupled Mutations (W. Stephan, 96; P. Higgs, 98)
Possible separation of long term and short term evolution
Creation of Linkage Disequilibrium of paired sites.
35
Singlet-Doublet Models (Kirby et al., 95; Tillier et al., 98; Savill et al., 01)
Jukes-Cantor with bias toward base pairing:
R_{i,j} = (1/4) μ λ    1 difference, pairing gained
R_{i,j} = (1/4) μ      1 difference, pairing unchanged
R_{i,j} = (1/4) μ / λ  1 difference, pairing lost
R_{i,j} = 0            2 differences
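The four cases of the rate R between two doublets can be written as a small function. This is a minimal sketch, assuming that only the Watson-Crick doublets AU, UA, GC and CG count as paired (the slide does not say whether G-U counts) and using illustrative values for the rate mu and the pairing bias lam.

```python
PAIRED = {"AU", "UA", "GC", "CG"}  # assumption: Watson-Crick pairs only


def doublet_rate(x, y, mu=1.0, lam=2.0):
    """Rate from doublet x to doublet y (two-letter strings such as 'AC')."""
    differences = sum(a != b for a, b in zip(x, y))
    if differences == 0:
        raise ValueError("the diagonal entry is not defined by these cases")
    if differences == 2:
        return 0.0  # two simultaneous changes are not allowed
    gained = x not in PAIRED and y in PAIRED
    lost = x in PAIRED and y not in PAIRED
    if gained:
        return mu * lam / 4
    if lost:
        return mu / (4 * lam)
    return mu / 4  # pairing status unchanged


print(doublet_rate("AC", "AU"))  # pairing gained
print(doublet_rate("AU", "AC"))  # pairing lost
print(doublet_rate("AA", "AC"))  # pairing unchanged
print(doublet_rate("AU", "GC"))  # two differences
```

With lam greater than 1, changes that create a pair are favoured and changes that destroy one are disfavoured by the same factor.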
36
Contagious Dependencies: Overlapping Reading Frames & CG frequencies (Pedersen & Jensen, 01)
n n n n n n n n n n n
37
Doublet-Tetraplet Models (Nerman & Durbin at B. Knudsen's exam, 02)
Stacking
In principle a 4^4 times 4^4 matrix (256 x 256 = 65,536 entries!) is needed, but proper parametrisation and symmetries could reduce this substantially.
38
RNA Protein Structure Dependent Molecular
Evolution
Singlet
Straightforward, no interference from RNA level.
Doublets
What seems to be needed is a parametrisation of how base pairing creates departure from an independent singlet, singlet model.
39
Miscellaneous Topics
RNA Folding
Molecular Dynamics of RNA Structures
RNA Structure Sequence Landscapes
RNA Homology Modelling & Threading
RNA Gene Finding
Close to Optimal Structures
Constraint Satisfaction Modelling
40
Literature & www-sites
Eddy, S. (2001) Non-coding RNA genes and the modern RNA world. Nat Rev Genet. 2(12):919-29. Review.
Eddy, S. (2002) Computational genomics of noncoding RNA genes. Cell 109(2):137-40. Review.
Fontana (2002) Modelling evo-devo with RNA. BioEssays 24(12):1164-77.
Knudsen, B. and J.J. Hein (2003) Practical RNA Folding (In Press, RNA).
Knudsen, B. and J.J. Hein (1999) Using stochastic context free grammars and molecular evolution to predict RNA secondary structure. Bioinformatics 15(6):446-454.
Moore (1999) Structural Motifs in RNA. Ann. Rev. Biochem. 68:287-300.
Moulton et al. (2000) Metrics on RNA Secondary Structures. J. Comput. Biol. 7(1/2):277-
Perriquet et al. (2003) Finding the common homologous structure shared by two homologous RNAs. Bioinformatics 19(1):108-116.
http://www.imb-jena.de/RNA.html
http://scor.lbl.gov/index.html
http://www.rnabase.org/metaanalysis/