Alternative Splicing

Slides: 32
Provided by: MarkCl1
Learn more at: http://dna.cs.byu.edu

1
Alternative Splicing
  • As an introduction to microarrays

2
(No Transcript)
3
(No Transcript)
4
(No Transcript)
5
(No Transcript)
6
Human Genome
  • Humans make about 90,000 proteins, so the gene count was
    initially assumed to be near that number (initial
    estimates: 153,000)
  • The 1,000-cell roundworm Caenorhabditis elegans
    has 19,500 genes; corn has 40,000 genes
  • Current estimates are 25,000 or fewer human genes
  • Alternative splicing allows different tissue
    types to perform different functions with the same
    gene assortment

7
Implications
  • 75% of human genes are subject to alternative
    splicing
  • Faulty gene splicing can lead to cancer and
    congenital diseases.
  • Gene therapy can make use of splicing

8
Application
  • We talked before about apoptosis, which is triggered when
    the cell determines it can't be repaired
  • Bcl-x, a regulator of apoptosis, is
    alternatively spliced to produce either Bcl-x(L),
    which suppresses apoptosis, or Bcl-x(S), which
    promotes it.

9
(No Transcript)
10
Spliceosome
  • Five snRNA molecules (U1, U2, U4, U5, U6)
    combine with as many as 150 proteins to form the
    spliceosome
  • It recognizes the sites where introns begin and end
  • It cuts the introns out of the pre-mRNA
  • It joins the exons
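
As a rough illustration of the cut-and-join step described above, here is a minimal Python sketch. The toy sequence, the exon coordinates, and the `splice` helper are all made up for illustration; real splice-site recognition is far more involved than slicing a string.

```python
def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive), dropping the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy pre-mRNA: three exons (upper case) separated by two introns (lower case).
# Both toy introns begin with "gu" and end with "ag", the canonical intron boundaries.
pre_mrna = "AUGGCC" + "guaaguuuuag" + "GAUUAC" + "guccccag" + "UGA"
exons = [(0, 6), (17, 23), (31, 34)]

# Constitutive splicing keeps every exon.
constitutive = splice(pre_mrna, exons)

# Alternative splicing here skips the middle exon.
skipped = splice(pre_mrna, [exons[0], exons[2]])

print(constitutive)
print(skipped)
```

The same pre-mRNA yields two different mRNAs depending on which exon intervals are joined, which is the core idea behind alternative splicing.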

11
(No Transcript)
12
Spliceosome
  • The 5' splice site is at the beginning of the
    intron; the 3' splice site is at the end
  • The average human protein-coding gene is 28,000
    nucleotides long, with 8.8 exons separated by 7.8
    introns
  • Exons are about 120 nucleotides long, while introns are
    100-100,000 nucleotides long
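
The averages on this slide imply how little of a typical gene is exon. The short Python calculation below uses only the slide's numbers; treating every exon as exactly 120 nucleotides is a simplifying assumption.

```python
# Averages quoted on the slide.
gene_length = 28_000      # nucleotides per average protein-coding gene
exons_per_gene = 8.8
introns_per_gene = 7.8
exon_length = 120         # nucleotides (assumed identical for every exon)

exonic = exons_per_gene * exon_length        # about 1,056 nucleotides
intronic = gene_length - exonic              # about 26,944 nucleotides
exon_fraction = exonic / gene_length         # about 0.038, i.e. under 4% of the gene
mean_intron = intronic / introns_per_gene    # about 3,454 nucleotides per intron

print(f"exonic nucleotides:  {exonic:.0f}")
print(f"exon fraction:       {exon_fraction:.1%}")
print(f"mean intron length:  {mean_intron:.0f}")
```

Under these assumptions, well over 95% of an average gene is intron sequence that the spliceosome must remove.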

13
Splicing errors
  • Familial dysautonomia results from a
    single-nucleotide mutation that causes the IKBKAP gene to
    be alternatively spliced in nervous system tissue
  • The resulting decrease in the IKBKAP protein leads to
    abnormal nervous system development (half of patients die
    before age 30)
  • More than 15% of the gene mutations that cause genetic
    diseases and cancers do so through splicing
    errors.

14
Why splicing
  • Each gene generates about 3 alternatively spliced mRNAs
  • Why so much intron (only 1-2% of the genome is exons)?
  • Differences between mouse and human are almost all
    in splicing
  • Half of the human genome is made up of
    transposable elements, Alus being the most
    abundant (1.4 million copies)
  • They continue to multiply and insert themselves
    into the genome at the rate of one insertion per
    100 human births
  • Mutations in an Alu can create a 5' or 3' splice site
    in an intron, causing part of the intron to become an exon
  • This mutation doesn't impact existing exons
  • It only has an effect when the new exon is alternatively
    spliced in

15
(No Transcript)
16
Microarrays For Alt. Splicing
  • Use short oligonucleotides
  • Get an estimate of the expression level of each oligo's target

[Figure: gene diagram with Exon 1 through Exon 5]
17
Affymetrix Microarrays For Alt. Splicing
[Figure: gene diagram with Exon 1 through Exon 5 and its two isoforms]
Isoform 1: Exon 1, Exon 2, Exon 4, Exon 5
Isoform 2: Exon 1, Exon 3, Exon 5
18
Ideal Microarray Readings
[Figure: expression level plotted per probe, for probes a through e]
Isoform 1: Exon 1, Exon 2, Exon 4, Exon 5 (probes a, b, c)
Isoform 2: Exon 1, Exon 3, Exon 5 (probes a, d, e)
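
One way to read "ideal" readings: each probe reports the summed abundance of every isoform that contains its target. The Python sketch below assumes one hypothetical probe per exon (the slide's probes a through e may be placed differently, for example on exon junctions) and uses made-up abundances.

```python
# Exon content of the two isoforms shown on the slides.
ISOFORMS = {
    "isoform1": [1, 2, 4, 5],
    "isoform2": [1, 3, 5],
}


def ideal_readings(abundance):
    """Return exon -> ideal signal, assuming one noise-free probe per exon.

    The signal of a probe is the total abundance of all isoforms
    that include the exon it targets.
    """
    readings = {}
    for name, exons in ISOFORMS.items():
        for exon in exons:
            readings[exon] = readings.get(exon, 0.0) + abundance[name]
    return dict(sorted(readings.items()))


# Hypothetical abundances: isoform 1 is three times as abundant as isoform 2.
readings = ideal_readings({"isoform1": 3.0, "isoform2": 1.0})
for exon, signal in readings.items():
    print(f"Exon {exon}: {signal}")
```

Exons shared by both isoforms read highest, while exons unique to one isoform reveal that isoform's individual abundance. Real readings are noisy, so this only shows what the ideal pattern would look like.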
19
Motivation
  • Why alternatively splice?
  • How does it affect the resulting proteins?
  • Look at domains
  • High level summary of protein
  • 80% of eukaryotic proteins are multi-domain
  • Domains are big relative to an exon

20
Some Previous Work
  • Signatures of domain shuffling in the human
    genome. Kaessmann, 2002.
  • Intron phase symmetry around domain boundaries
  • The Effects of Alternative Splicing On
    Transmembrane Proteins in the Mouse Genome.
    Cline, 2004.
  • Half of TM proteins studied affected by
    alt-splicing.

21
Method
  • Predict Alternative Splicing
  • Predict Protein Domains
  • Look for effects of Alt-Splicing on predicted
    domains
  • Swapping
  • Knockout
  • Clipping
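
The slide does not define the three effect types, so the following Python sketch rests on assumed definitions chosen only for illustration: "knockout" means the spliced-out segment removes a predicted domain entirely, and "clipping" means it removes only part of it. "Swapping" is left out because it would require comparing alternative sequences. The function name and the coordinates are hypothetical.

```python
def effect_on_domain(domain, removed):
    """Classify how a removed protein segment affects a predicted domain.

    Both arguments are (start, end) residue intervals, end-exclusive.
    The category definitions are assumptions, not taken from the slides.
    """
    d_start, d_end = domain
    r_start, r_end = removed
    overlap = min(d_end, r_end) - max(d_start, r_start)
    if overlap <= 0:
        return "unaffected"
    if overlap == d_end - d_start:
        return "knockout"   # the whole domain lies inside the removed segment
    return "clipping"       # only part of the domain is removed


domain = (10, 50)  # hypothetical predicted domain
print(effect_on_domain(domain, (60, 80)))
print(effect_on_domain(domain, (0, 100)))
print(effect_on_domain(domain, (40, 70)))
```

With predicted domains and splice events both expressed as intervals in protein coordinates, the comparison reduces to simple interval overlap.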

22
Microarray Design
  • Genes based on mRNA and EST data in mouse
  • Mapped to Feb. 2002 mouse genome freeze
  • 500,000 probes (66,000 sets)
  • 100,000 transcripts
  • 13,000 gene models

23
Technical work
[Figure: Genome Space diagram. Labels: gene models, transcripts, probes, Provided data, Generated Data, Overlap, Probe to transcript mapping]

Probe to transcript mapping (examples):
Probe                     Transcript
E_at_NM_021320            cc-chr10-000017.82.0
G6836022_at_J911445       cc-chr10-000017.91.1
G6807921_at_J911524_RC    cc-chr10-000018.4.0
24
Predicting Alternative Splicing
  • Using mouse alt-splicing microarrays
  • Data from Manny Ares
  • 8 tissues
  • 3 replicates of each tissue

25
Predicting Alternative Splicing
  • General approach: clustering, then anti-clustering

[Figure: 107 clusters, with a detail view]
26
Gene Expression Measurement
  • mRNA expression represents dynamic aspects of
    the cell
  • mRNA expression can be measured with the latest
    technology
  • mRNA is isolated and labeled with a fluorescent
    label
  • The labeled mRNA is hybridized to the target; the level of
    hybridization corresponds to light emission, which
    is measured with a laser

27
Gene Expression Microarrays
  • The main types of gene expression microarrays
  • Short oligonucleotide arrays (Affymetrix)
  • cDNA or spotted arrays (Brown/Botstein).
  • Long oligonucleotide arrays (Agilent Inkjet)
  • Fiber-optic arrays
  • ...

28
Affymetrix Microarrays
[Figure: raw image of a 1.28 cm chip]
  • 10^7 oligonucleotides: half perfectly match the mRNA
    (PM), half have one mismatch (MM)
  • Raw gene expression is the intensity difference PM - MM
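
The PM - MM difference can be sketched in a few lines of Python. The slide only states the difference itself; averaging it over the probe pairs of one gene, the function name, and the intensity values below are assumptions made for illustration.

```python
from statistics import mean


def raw_expression(pm, mm):
    """Average PM - MM intensity difference over the probe pairs of one gene."""
    if len(pm) != len(mm):
        raise ValueError("each PM probe needs a matching MM probe")
    return mean(p - m for p, m in zip(pm, mm))


# Made-up intensities for four probe pairs of a single gene.
pm = [1200.0, 950.0, 1100.0, 1010.0]
mm = [300.0, 280.0, 1150.0, 260.0]

value = raw_expression(pm, mm)
print(value)
```

Because a mismatch probe can occasionally be brighter than its perfect-match partner (the third pair above), individual differences, and even a gene's overall value, can come out negative.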
29
Microarray Potential Applications
  • Biological discovery
  • new and better molecular diagnostics
  • new molecular targets for therapy
  • finding and refining biological pathways
  • Recent examples
  • molecular diagnosis of leukemia, breast cancer,
    ...
  • appropriate treatment for genetic signature
  • potential new drug targets

30
Microarray Data Analysis Types
  • Gene Selection
  • find genes for therapeutic targets
  • avoid false positives (FDA approval ?)
  • Classification (Supervised)
  • identify disease
  • predict outcome / select best treatment
  • Clustering (Unsupervised)
  • find new biological classes / refine existing
    ones
  • exploration

31
Microarray Data Mining Challenges
  • too few records (samples), usually < 100
  • too many columns (genes), usually > 1,000
  • too many columns are likely to lead to false
    positives
  • for exploration, a large set of all relevant
    genes is desired
  • for diagnostics or identification of therapeutic
    targets, the smallest set of genes is needed
  • model needs to be explainable to biologists
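
To see why many columns invite false positives, consider testing every gene separately at a fixed significance level. The Python sketch below assumes a per-gene threshold of 0.05 and shows the Bonferroni correction as one standard remedy. Neither the threshold nor the correction comes from the slides.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of genes flagged by chance when no gene truly differs."""
    return n_genes * alpha


def bonferroni_threshold(n_genes, alpha):
    """Per-gene threshold that keeps the chance of any false positive at or below alpha."""
    return alpha / n_genes


n_genes = 1_000   # the slide's lower bound on the number of columns
alpha = 0.05      # assumed per-test significance level

print(expected_false_positives(n_genes, alpha))
print(bonferroni_threshold(n_genes, alpha))
```

With 1,000 genes and a 0.05 threshold, about 50 genes would be expected to look significant by chance alone, which is why gene selection has to control for multiple testing.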

32
Microarray Data Classification
[Figure: microarray chips → images scanned by laser → datasets → data mining model; a new sample is given to the model, which predicts ALL or AML (acute lymphoblastic or acute myeloid leukemia)]

Gene              Value
D26528_at           193
D26561_cds1_at      -70
D26561_cds2_at      144
D26561_cds3_at       33
D26579_at           318
D26598_at          1764
D26599_at          1537
D26600_at          1204
D28114_at           707
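
The slide does not say which data mining model is used. As a stand-in, the Python sketch below uses a nearest-centroid classifier, one of the simplest supervised methods, on made-up three-gene expression vectors. The function names and all values are hypothetical.

```python
from math import dist
from statistics import mean


def centroids(samples):
    """Build label -> per-gene mean vector from (label, expression_vector) pairs."""
    by_label = {}
    for label, vec in samples:
        by_label.setdefault(label, []).append(vec)
    return {
        label: [mean(column) for column in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(model, vec):
    """Return the label whose centroid is closest (Euclidean distance) to vec."""
    return min(model, key=lambda label: dist(model[label], vec))


# Made-up training data: two samples per class, three genes each.
training = [
    ("ALL", [100.0, 20.0, 900.0]),
    ("ALL", [120.0, 10.0, 950.0]),
    ("AML", [600.0, 300.0, 150.0]),
    ("AML", [640.0, 280.0, 100.0]),
]
model = centroids(training)

print(predict(model, [110.0, 30.0, 880.0]))
```

A model this simple is also easy to explain to biologists, since each class is summarized by one average expression profile.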