Title: Bioinformatics: Applications
1Bioinformatics Applications
- ZOO 4903
- Fall 2006, MW 10:30-11:45
- Sutton Hall, Room 312
- RNA regulation: Alternative splicing and miRNAs
2Lecture overview
- What we've talked about so far
  - Methods of finding genes in prokaryotes and eukaryotes
- Overview
  - Genes can have multiple isoforms
  - Alternative splicing: tinkering with Nature's parts list
  - Isoforms can be mapped with ESTs
  - miRNAs provide another means of regulating genes
3mRNA splicing
4mRNA Splicing
6Question
- Q: Making genes with introns consumes a lot of cellular energy, which seems like a waste when 96% of the gene is thrown away in humans. Why bother making RNA that is just going to be discarded?
- A: Being able to recombine genes in different ways is apparently worth the cost.
7Alternative Splicing
[Figure: pre-mRNA drawn 5' to 3', with GT marking the 5' end of the intron and AG marking its 3' end]
8Alternative splicing
- First discovered in immunoglobulin genes around 1980
- Is regulated by cellular machinery, but some splicing may be leaky
- The Drosophila Dscam gene can have up to 38,000 possible splice variants
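The figure of roughly 38,000 comes from multiplying the number of choices at each cluster of mutually exclusive exons. Here is a minimal sketch of that arithmetic, assuming the commonly cited cluster sizes for Dscam (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17). Those counts are not given on the slide and are an assumption here.

```python
from math import prod

# Assumed cluster sizes (not from the slides): number of mutually
# exclusive alternatives for each variable exon of Drosophila Dscam.
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(clusters):
    """Exactly one alternative is chosen per cluster, so the counts multiply."""
    return prod(clusters.values())


print(count_isoforms(DSCAM_CLUSTERS))  # 38016
```

The same function works for any gene whose variable exons are chosen independently; it overestimates if the choices are coupled.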
11Introns and expression
- Highly expressed genes tend to have short introns
Nature Genetics 31, 415 - 418 (2002)
12Alternative splicing by species
Brett et al. 2002. Nature Genetics 30:29-30.
13Possible effects of alternative splicing
- Estimated 80% of alternative splicing events affect protein
- Inclusion/exclusion of functional protein domains (e.g. localization)
- Change in protein structure
- Change in promoter, affecting translation
- Change in polyadenylation, affecting mRNA stability
14Bioinformatics challenges
- Can we predict what genes will (or might) be alternatively spliced?
- Can we predict what the effects of AS would be?
15Types of alternative splicing
- Exon skipping
- Exon swapping (mutually exclusive exons)
- Exon joining (5', 3' or both)
- Intron retained (no splicing event)
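These event types can be told apart by comparing exon coordinates between two isoforms of the same gene. The sketch below is illustrative only: the function name, the half-open (start, end) coordinate convention and the event labels are assumptions, and mutually exclusive exons are not handled.

```python
def classify_events(ref, alt):
    """Label splicing differences between a reference and an alternative isoform.

    Both isoforms are sorted lists of (start, end) exon coordinates on the
    same strand, half-open. Mutually exclusive exons are not handled.
    """
    events = set()
    ref_set = set(ref)

    # Exon skipping: an internal reference exon that no alt exon overlaps.
    for s, e in ref[1:-1]:
        if not any(a_s < e and s < a_e for a_s, a_e in alt):
            events.add("exon skipping")

    for a_s, a_e in alt:
        if (a_s, a_e) in ref_set:
            continue  # exon unchanged
        # Intron retention: one alt exon runs from the start of a reference
        # exon to the end of the next one, so the intron between is kept.
        retained = any(a_s == s1 and a_e == e2
                       for (s1, _), (_, e2) in zip(ref, ref[1:]))
        if retained:
            events.add("intron retention")
        # Alternative 5' or 3' splice site: exactly one boundary is shared.
        elif any((a_s == s) != (a_e == e) and a_s < e and s < a_e
                 for s, e in ref):
            events.add("alternative splice site")
    return events


ref = [(0, 100), (200, 300), (400, 500)]
print(classify_events(ref, [(0, 100), (400, 500)]))  # {'exon skipping'}
```

Without strand information the sketch cannot say whether a shifted boundary is a 5' or a 3' site, so it reports both under one label.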
16Changing one protein form can have cascading effects: sex determination in Drosophila
17Alternative splicing is regulated
18But apparently there are also aberrant splicing
events
Trends in Genetics 20(2) Feb 2004
19Predicting AS by conserved non-coding regions
Alternatively spliced exons often show a higher
level of conservation and conservation that
extends into the flanking introns.
Philips et al. 2004. RNA
20Graphic representation of the differences between
alternative and constitutive exons
[Figure: length of the conserved region in the nearest 100 nt of the flanking upstream intron and of the flanking downstream intron, alternative vs. constitutive exons]
Sorek et al. 2004. Genome Res. 14:1617-1623.
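One simple way to quantify conservation in the intron sequence next to an exon is the fraction of identical columns in a pairwise alignment, restricted to the 100 nt nearest the exon. The sketch below is a stand-in for illustration, not the measure used in the cited papers; the function name and the gapped-string input format are assumptions.

```python
def flank_identity(aligned_a, aligned_b, window=100, side="upstream"):
    """Fraction of identical, ungapped columns in the intron window nearest the exon.

    aligned_a, aligned_b: equal-length aligned intron sequences from two
    species, with '-' for gaps. For an upstream intron the exon sits at the
    right-hand end, so the last `window` columns are used; for a downstream
    intron the first `window` columns are used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    cols = list(zip(aligned_a.upper(), aligned_b.upper()))
    cols = cols[-window:] if side == "upstream" else cols[:window]
    if not cols:
        return 0.0
    matches = sum(1 for a, b in cols if a == b and a != "-")
    return matches / len(cols)


# Toy sequences: the last four columns are ACGT vs ACGA.
print(flank_identity("ACGTACGT", "ACGTACGA", window=4))  # 0.75
```

Comparing this value between candidate exons and known constitutive exons is one way the observation above could be turned into a predictor.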
21Finding potential splice variants using ESTs
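An EST aligned to the genome shows up as a series of blocks. If a known exon lies inside the span of the EST but none of the blocks overlaps it, that EST is evidence for a transcript that skips the exon. A minimal sketch, assuming exons and alignment blocks are already available as half-open (start, end) genomic coordinates; the function name is hypothetical.

```python
def skipped_exons(exons, est_blocks):
    """Return reference exons that fall within an EST's span but are not covered.

    exons: list of (start, end) for the reference gene model.
    est_blocks: aligned blocks of one EST in genomic coordinates.
    """
    if not est_blocks:
        return []
    est_start = min(s for s, _ in est_blocks)
    est_end = max(e for _, e in est_blocks)
    skipped = []
    for s, e in exons:
        inside = est_start < s and e < est_end  # exon lies within the EST span
        covered = any(b_s < e and s < b_e for b_s, b_e in est_blocks)
        if inside and not covered:
            skipped.append((s, e))
    return skipped


exons = [(0, 100), (200, 300), (400, 500)]
print(skipped_exons(exons, [(50, 100), (400, 450)]))  # [(200, 300)]
```

In practice one would require several independent ESTs and canonical splice sites at the new junction before calling a variant, since single ESTs can reflect aberrant or incompletely processed transcripts.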
22Alternative splicing databases
- EBI Alternative splicing database (http://www.ebi.ac.uk/asd/)
- Alt splicing database (ASDB, http://hazelton.lbl.gov/teplitski/alt/)
- ExInt (Sakharkar et al., 2000), data from GenBank
- YIDB (Lopez et al., 2000), data from EMBL
- HASDB (Modrek et al., 2001), data from UniGene and Human Genome
- Alternative-Exon Database (Stamm et al., 2000), data from literature
- MGAlign (Tan et al., 2003), mRNA/genome alignments
- And others
23Biological issues in Alternative Splicing
- How is alternative splicing regulated, and who are the players?
- How widespread are patterns of alternative splicing for individual gene products relative to theoretical complexity?
- What is the origin of introns? Do they represent boundaries between fusion of primitive mini-proteins (introns early) or later insertion of mobile elements (introns late) or both?
- Do introns serve roles other than acting as splice boundaries of protein-coding cassettes?
- Why aren't introns found very often in prokaryotic genomes?
- What are the structure/function consequences of alternative splicing? What are the best studied systems?
24 1. How is alternative splicing regulated, and who are the players?
- There are a number of proteins involved in alternative splicing
- Spliceosome component concentrations can affect patterns of splicing
- Number of splicing combinations can potentially be very rich
Park et al. PNAS 2004 Nov 9;101(45):15974-9
25 2. How widespread are patterns of alternative splicing for individual gene products relative to theoretical complexity?
Exon arrays will soon help answer this question
26 3. What is the origin of introns? Do they represent boundaries between fusion of primitive mini-proteins (introns early) or later insertion of mobile elements (introns late), or both?
Intron/exon structure of the chicken pyruvate kinase gene. Nils Lonberg and Walter Gilbert, Harvard University and Biogen Corp. Cell 40:81-90, January 1985.
The chicken pyruvate kinase gene is interrupted by at least ten introns, including nine within the coding region. The introns are not randomly placed; they divide the coding sequence into uniformly sized pieces encoding discrete elements of secondary structure. The introns tend to fall at interruptions between stretches of α-helix or β-sheet residues. This structure suggests that introns were not inserted into a previously uninterrupted coding sequence, but instead are products of the evolution of the first pyruvate kinase gene.
27 3. What is the origin of introns? Do they represent boundaries between fusion of primitive mini-proteins (introns early) or later insertion of mobile elements (introns late), or both?
- Introns conserved within homologues of ancient proteins tend to have insertion points between codons (i.e. phase 0) and are purported to correlate with boundaries between structural modules (introns-early)
- Introns found in recent genes (eukaryotic proteins without prokaryotic homologues) are not correlated with codon phase or structural boundaries (introns-late)
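Intron phase is simply the number of coding nucleotides that precede the intron, modulo 3: phase 0 falls between codons, phase 1 after the first base of a codon, and phase 2 after the second. A minimal sketch, assuming the input is the list of coding lengths of each exon in transcript order (the function name and the example lengths are made up):

```python
def intron_phases(coding_exon_lengths):
    """Phase of the intron that follows each exon except the last.

    Phase = (coding nucleotides upstream of the intron) mod 3.
    """
    phases = []
    total = 0
    for length in coding_exon_lengths[:-1]:
        total += length
        phases.append(total % 3)
    return phases


# Hypothetical gene with four coding exons of 90, 100, 50 and 30 nt.
print(intron_phases([90, 100, 50, 30]))  # [0, 1, 0]
```

Tallying how often each phase occurs across many genes is the kind of count the introns-early and introns-late arguments rest on.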
28 4. Do introns serve roles other than acting as splice boundaries of protein-coding cassettes?
- Spliced intronic sequences can be a source of guide RNAs for RNA editing (>200 identified in vertebrates, mostly from intronic sequences)
- Can contain cis-acting transcriptional regulatory sequences
- Hotspots for recombinatory trans-splicing of separate mRNAs to form novel messages
- Enhancers of meiotic (germ-line) cross-over events in coding regions
- Site of signals for mRNA export and for nonsense-mediated decay
29 5. Why aren't introns found very often in prokaryotic genomes?
- Bacterial doubling is faster than polymerase-catalyzed chromosomal replication. This leads to pressure to streamline the prokaryotic genome and eliminate non-coding sequence.
- In bacteria, unlike eukaryotes, translation begins right away, before creation of the mRNA is even complete. Therefore, movement of the ribosome over the 5' splice site of the intron RNA before the RNA polymerase makes it to the 3' end would interfere with folding and splicing of the intron.
30 6. Structural and functional consequences of alternative splicing
- 1) Ion channel signaling behaviors altered
  - Glutamate receptor: different receptors display different activities
- 2) Protein-protein binding surface altered
  - Neurexins: AS changes ligand binding
- 3) Enzyme active sites altered
  - Phosphotyrosine phosphatase: different isoforms have different substrate specificities
- 4) Enzyme allosteric site altered
  - Pyruvate kinase: AS alters allosteric regulation
- 5) DNA binding modules shuffled
  - Lola: different isoforms have different zinc finger combinations leading to different target specificities
31Glutamate receptor: Different isoforms display unique activities and responses
- I. 15 genes --> 3 functional families
  - Differ in ligand recognition specificities
- II. Alternative splicing --> 34 isoforms
  - Mutually exclusive domain shuffling (flip vs. flop) and alternative 5' translational start (long/short)
  - Differ in kinetics of repolarization
- III. RNA editing --> 70 total isoforms
  - Adenosine deamination: Q/R, R/G, I/Y, V/Y, V/C, I/C alternative codons
  - Differ in ion specificity and flux rate
[Figure: flop AMPAR vs. flip AMPAR]
32RNA Editing
- RNA editing is the co- or post-transcriptional alteration of RNA sequence from that encoded by the genome
- RNA editing can occur through
  - 1. Nucleotide insertion
  - 2. Nucleotide deletion
  - 3. Nucleotide modification
- The most common example of RNA editing in mammals is the formation of inosine through adenosine deamination
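Inosine base-pairs like guanosine, so the ribosome reads an edited A as a G and the codon can change meaning. The sketch below illustrates this for two codons, assuming CAG and AGA as the unedited codons at a Q/R and an R/G site respectively. The function name and the four-entry codon table are illustrative and not taken from the slides.

```python
# Only the four codons needed for this illustration (standard genetic code).
AMINO_ACID = {"CAG": "Gln", "CGG": "Arg", "AGA": "Arg", "GGA": "Gly"}


def a_to_i_edit(codon, position):
    """Deaminate the adenosine at `position` (0-based) and return the codon
    as the ribosome reads it, with inosine read as guanosine."""
    codon = codon.upper()
    if codon[position] != "A":
        raise ValueError("A-to-I editing requires an adenosine at that position")
    return codon[:position] + "G" + codon[position + 1:]


for codon, pos in (("CAG", 1), ("AGA", 0)):
    edited = a_to_i_edit(codon, pos)
    print(codon, AMINO_ACID[codon], "->", edited, AMINO_ACID[edited])
```

A single deamination is therefore enough to swap glutamine for arginine, or arginine for glycine, without any change in the genome.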
33LNS domains: Neurexin, laminins, agrin, slit.
Alternative exon sites accommodate insertions of
up to 30 residues.
38Modulating target recognition
Alternative splicing of lola generates 19
transcription factors controlling axon guidance
in Drosophila
39RNAs have many functions
- rRNA (ribosomal RNA)
- tRNA (transfer RNA)
- mRNA (Messenger RNA)
- snRNA, including snoRNA (small nuclear RNA: splicing)
- Telomerase RNA (telomere maintenance)
- shRNA (short hairpin RNA: silencing)
- gRNA (guide RNA: RNA editing)
- Genomic RNA (viruses)
- siRNA (small interfering RNA: RNA silencing)
- microRNAs: regulation
40microRNAs: what are they?
- Family of non-coding RNAs
- Encoded individually and sometimes in clusters
- 21-25 nucleotides long
- Cleaved out of a larger precursor RNA
- First identified in C. elegans (the lin-4 locus was described genetically in 1981; its small RNA product was reported in 1993)
- Act to inhibit translation
41microRNAs: how do they work?
- Imperfect complementarity to the target mRNA
- miRNAs typically interact with the 3' UTR
- Can alter mRNA stability and translational initiation
- Can have multiple target mRNAs
42microRNAs: what is their role?
- Mammalian genomes encode between 200 and 500 miRNAs that affect up to a third of all genes
- miRNA sequences are relatively conserved between species
- Host-virus interactions: can silence retrotransposons and endogenous retroviruses
- Important for development and differentiation
  - Apoptosis and fat metabolism in flies
  - T- and B-cell development in mice
- Gene regulation in cancer
  - Diagnostic markers and therapeutic targets
43miRNA structures
- Originate as a precursor RNA (primary miRNA, pri-miRNA) of several hundred base pairs
- Pri-miRNAs contain caps and poly(A) tails
- Some contain introns while others do not
- Can be encoded by introns of other genes
- Transcribed by RNA Pol II
- Contain an 80 bp imperfect stem-loop
44miRNA processing
45Drosha and Dicer
- Members of the RNase III family
- Cleave dsRNA leaving 2 nt 3' overhangs
46Regulation of lin-14 by lin-4 miRNA
47Bioinformatics challenges
- Given miRNA sequences, predict target genes (doable)
- Given mRNA sequences, predict possible miRNAs (hard, not done)
- Current strategy: identify miRNAs experimentally, then predict what genes they affect
48miRNA target prediction programs
The 3' and 5' regions of the miRNA are most important to miRNA function. Therefore, one can search for conserved 3' UTR regions.
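A common first step in target prediction is to look for sites in a 3' UTR that are perfectly complementary to the 5' "seed" of the miRNA (nucleotides 2-8). The sketch below shows only that step; the function names and both sequences are made up for illustration, and real programs add conservation across species and other filters on top.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def seed_sites(mirna, utr, seed_start=1, seed_end=8):
    """0-based positions in `utr` that perfectly match the miRNA seed.

    The seed is taken as miRNA nucleotides 2-8 (slice [1:8]); a target site
    is its reverse complement. DNA-style input (T) is converted to U.
    """
    seed = mirna.upper().replace("T", "U")[seed_start:seed_end]
    site = reverse_complement(seed)
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


# Toy sequences, invented for this example.
toy_mirna = "UACGGAUCCAGUACGUUAGCAA"
toy_utr = "AAAGAUCCGUAAACCCGAUCCGUGG"
print(seed_sites(toy_mirna, toy_utr))  # [3, 16]
```

Because a 7-nt match occurs often by chance, seed matching alone produces many false positives, which is why conservation of the 3' UTR site is used as an additional criterion.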
49miRNAs as biological dials
1) Switch or fine-tune
2) Single miRNA or combinatorial
3) Mediated by sequence-specific interactions
50Summary
- Genes can have multiple isoforms
- Alternative splicing is a way of modifying structure without reinventing it
- Alternative splicing is (currently) hard to predict based upon sequence alone
- microRNAs are a relatively recently discovered class of regulatory elements
- microRNAs may regulate up to 1/3 of our genes
- Bioinformatics has lots of challenges in being able to identify and predict these variants and additional regulators
51For next time
- Read Mount chapter 13, pages 612-28