BIONF/BENG 203: Functional Genomics

1
BIONF/BENG 203: Functional Genomics
Sources of Functional Data: Lectures 1 and 2
  • Lecture TI 1
  • Trey Ideker
  • UCSD Department of Bioengineering

2
Grading
  • 40% Problem Sets (best 4 of 5)
  • 30% Midterm
  • 30% Final Project

3
Outline of the course
Biological data sources (2)
Data pre-processing (6)
Total of 17 lectures
Project Presentations (2)
4
Functional Genomics Data
  • Expression: mRNA, protein
  • Molecular interactions: protein, mRNA, small
    molecules
  • Knockout phenotypes: 1st, 2nd, higher orders
  • SNP sequence (polymorphism) data
  • Imaging data: sub-cellular localization, cell
    morphology
  • Gene ontology

5
Dividing the data into two classes of
information: Biological Networks and Network
States
1) Directly observe the network wires themselves
  • Protein-protein interactions: two-hybrid system,
    coIP, protein antibody arrays (BIND, DIP)
  • Protein-DNA interactions: chromatin IP (BIND,
    Transfac, SCPD)
  • Other types not yet possible, e.g.,
    protein-small molecule
2) Observe molecular states that result from the
interaction wiring
  • DNA/RNA gene expression: DNA microarrays, SAGE
  • Protein levels, locations, and modifications:
    mass spectrometry, fluorescence microscopy,
    protein arrays
  • Gross phenotypes, e.g., growth rates of single
    and double deletion strains

6
High-throughput methods for measuring cellular
states
  • Gene expression levels: RT-PCR, arrays
  • Protein levels, modifications: mass spec
  • Protein locations: fluorescent tagging
  • Metabolite levels: NMR and mass spec
  • Systematic phenotyping

7
The transcriptome and proteome
  • The transcriptome is the full complement of RNA
    molecules produced by a genome
  • The proteome is the full complement of proteins
    enabled by the transcriptome
  • DNA → RNA → protein
  • Genome → transcriptome → proteome
  • 30,000 genes → ??? RNAs → ??? proteins?
  • For example, the Drosophila gene Dscam can
    generate 40,000 distinct transcripts through
    alternative splicing.
  • What is the minimum number of exons that would be
    required?
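One rough way to approach the exon question, as a sketch only: assume every alternative exon is an independent cassette exon that is either included or skipped, so n such exons allow at most 2^n distinct transcripts. That model and the helper name `min_cassette_exons` are assumptions made for illustration, not something stated on the slide.

```python
def min_cassette_exons(n_transcripts: int) -> int:
    """Smallest n with 2**n >= n_transcripts.

    Assumes each exon is independently included or skipped
    (a simplified cassette-exon model).
    """
    n = 0
    while 2 ** n < n_transcripts:
        n += 1
    return n


needed = min_cassette_exons(40_000)
print(needed)  # 2**15 = 32,768 is too few; 2**16 = 65,536 is enough
```

Under this simplified model 16 exons would suffice. Dscam itself does not work this way: it picks one exon from each of four clusters of mutually exclusive exons (12 × 48 × 33 × 2 = 38,016 combinations), which takes 95 alternative exons rather than 16.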

8
Expression: High-throughput approaches
  • RNA: DNA microarrays, cDNA / EST sequencing,
    RT-PCR, differential display, SAGE, massively
    parallel signature sequencing (MPSS)
  • Proteins: 2D PAGE, mass spectrometry

9
Gene expression arrays
  • They are really, really, really, really, really,
    really, really, really, really, really, really,
    really, really important

10
Microarrays
  • Monitors the level of each gene
  • Is it turned on or off in a particular
    biological condition?
  • Is this on/off state different between two
    biological conditions?
  • Microarray is a rectangular grid of spots printed
    on a glass microscope slide, where each spot
    contains DNA for a different gene
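The comparison between two conditions is commonly expressed as a log2 ratio of spot intensities. The sketch below assumes already background-corrected intensities, a two-fold cutoff, and made-up gene names and values; none of these come from the slides.

```python
import math


def log2_ratio(intensity_a: float, intensity_b: float, floor: float = 1.0) -> float:
    """log2(A/B) for one spot; intensities are floored to avoid log of zero."""
    a = max(intensity_a, floor)
    b = max(intensity_b, floor)
    return math.log2(a / b)


def call_change(ratio: float, threshold: float = 1.0) -> str:
    """Label a spot using an assumed two-fold (|log2| >= 1) cutoff."""
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical spots: gene -> (intensity in condition A, intensity in condition B)
spots = {
    "geneA": (4000.0, 1000.0),
    "geneB": (500.0, 520.0),
    "geneC": (150.0, 1200.0),
}
calls = {gene: call_change(log2_ratio(a, b)) for gene, (a, b) in spots.items()}
print(calls)
```

Working in log space makes a two-fold increase and a two-fold decrease symmetric (+1 and -1), which is why the cutoff is applied to the log ratio rather than the raw ratio.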

11
Two-color DNA microarray design
Reverse Transcription
12
cDNA-chip of brain glioblastoma
13
Types of microarrays
  • Spotted (cDNA)
  • Robotic transfer of cDNA clones or PCR products
  • Spotting on nylon membranes or glass slides
    coated with poly-lysine
  • Synthetic (oligo)
  • Direct oligo synthesis on solid microarray
    substrate
  • Uses photolithography (Affymetrix) or ink-jet
    printing (Agilent)
  • All configurations assume the DNA on the array is
    in excess of the hybridized sample; thus the
    kinetics are linear and the spot intensity
    reflects the amount of hybridized sample.
  • Labeling can be radioactive, fluorescent
    (one-color), or two-color

14
Microarray Spotter
15
Affymetrix High Density Arrays
16
Microarrays (continued)
  • Imaging
  • Radioactive (32P) labeling: autoradiography or
    phosphorimager
  • Fluorescent labeling: confocal microscope
    (invented by Marvin Minsky!!)
  • Feature density
  • Nylon membrane macroarrays: 100-1000 features
  • Glass slide spotted array: 5,000 features / cm²
  • Synthesized arrays: 50,000 features / cm²

17
Microarray confocal scanner
  • Collects sharply defined optical sections from
    which 3D renderings can be created
  • The key is spatial filtering to eliminate
    out-of-focus light or glare in specimens whose
    thickness exceeds the immediate plane of focus.
  • Two lasers for excitation
  • Two color scan in less than 10 minutes
  • High resolution, 10 micron pixel size

18
cDNA / EST sequencing projects
  • cDNA: complementary or copy DNA
  • EST: Expressed Sequence Tag
  • The microarray could be described as a closed
    system because information about RNAs is limited
    by the targets available for hybridization. RNAs
    not represented on the array are not
    interrogated.
  • Direct sequencing of cDNAs (yielding ESTs)
    overcomes this problem by large-scale random
    sampling of sequences from a whole-cell RNA
    extract
  • Statistical counting of distinct sequences
    provides an estimate of expression level
  • Alternatively, the cDNA library can be normalized
    to capture rare messages
  • Requires large scale sequencing to get
    statistical significance
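The counting idea and the sequencing-depth requirement can be sketched as follows. This assumes each read is an independent random draw from the transcript pool; the function names and the example transcript fraction are illustrative assumptions.

```python
import math
from collections import Counter


def relative_abundance(est_gene_ids):
    """Estimate each gene's share of the mRNA pool from EST counts."""
    counts = Counter(est_gene_ids)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def prob_seen_at_least_once(fraction: float, n_reads: int) -> float:
    """Chance that a transcript making up `fraction` of the pool is sampled."""
    return 1.0 - (1.0 - fraction) ** n_reads


def reads_needed(fraction: float, confidence: float = 0.95) -> int:
    """Reads required to see the transcript at least once with given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


print(relative_abundance(["actin", "actin", "actin", "rare"]))
# A transcript at 1 in 100,000 needs on the order of 300,000 reads just to be seen once
print(reads_needed(1e-5))
```

Seeing a transcript once is a much weaker requirement than estimating its level accurately, so the depth needed for reliable quantification of rare messages is larger still.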

19
cDNA / EST Sequencing: Preparation of a cDNA
library in phage λ vector
20
Serial Analysis of Gene Expression
SAGE Technology
  • Takes the idea of sequence sampling to the
    extreme
  • Generates short ESTs (9-14 nt), which are joined
    into long concatemers and then sequenced
  • 4^9 is 262,144, 5-fold the number of human genes
  • The count of each type of tag estimates RNA copy
    number
  • >50X more efficient than cDNA sequencing because
    many RNAs are represented in a single sequencing
    run
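The tag-space arithmetic is easy to check. The sketch below simply counts the possible tags of a given length and compares that with a gene count; the gene counts passed in are assumptions supplied by the caller.

```python
def n_possible_tags(tag_len: int) -> int:
    """Number of distinct DNA sequences of length tag_len (4 bases per position)."""
    return 4 ** tag_len


def fold_excess(tag_len: int, n_genes: int) -> float:
    """How many possible tags exist per gene for an assumed gene count."""
    return n_possible_tags(tag_len) / n_genes


for k in (9, 10, 14):
    print(k, n_possible_tags(k))

print(round(fold_excess(9, 50_000), 1))  # assumed 50,000 genes
print(round(fold_excess(9, 30_000), 1))  # 30,000 genes, as quoted earlier in the lecture
```

The 5-fold figure corresponds to a gene count of roughly 50,000; with 30,000 genes the excess for 9-nt tags is closer to 9-fold. Either way the tag space comfortably exceeds the number of genes, although distinct genes can still share a tag by chance.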
21
Steps to SAGE
  • Copy mRNA → ds cDNA using biotinylated (dT)
  • Cleave with anchoring enzyme (AE), which cleaves
    within 250 bp of the poly-A tail at the 3' end.
  • Capture this segment on streptavidin beads
  • Ligate to linkers containing a type IIs
    restriction site; the type IIs enzyme cleaves DNA
    14 bp away from this site.
  • Ligate sequences to each other and PCR amplify
  • Cleave with AE to remove linkers
  • Concatenate, clone, and sequence

22
Velculescu et al. Science (1995)
WHY DI-TAGS? Ditags are used to detect bias in
the PCR amplification step. The probability of
any two tags being coupled in the same ditag is
small, so biased amplification shows up as many
ditags that always contain the same 2 tags.
[Figure: ditag schematic; labels: A, B, Primer A, Primer B]
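The ditag check can be sketched as a duplicate filter: because a given pair of tags should rarely be joined twice by chance, a ditag seen more than once is treated as a likely PCR artifact and counted only once. The representation of a ditag as a pair of tag strings and the function name are assumptions for illustration.

```python
from collections import Counter


def collapse_duplicate_ditags(ditags):
    """Return (unique ditags, ditags seen more than once).

    Each ditag is a pair of tags; the pair is sorted so that (A, B) and
    (B, A) count as the same ditag read in opposite orientations.
    """
    counts = Counter(tuple(sorted(pair)) for pair in ditags)
    unique = list(counts)
    suspicious = {pair: n for pair, n in counts.items() if n > 1}
    return unique, suspicious


# Hypothetical ditags; real tags would be about 10 nt each
observed = [("AAA", "CCC"), ("CCC", "AAA"), ("AAA", "GGG")]
unique, suspicious = collapse_duplicate_ditags(observed)
print(unique)
print(suspicious)
```

Tag counts are then taken from the unique ditags only, so an over-amplified ditag does not inflate the apparent abundance of its two tags.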
23
SAGE (continued)
Example of a concatemer
CATGACCCACGAGCAGGGTACGATGATACATGGAAACCTATGCACCTTGGGTAGCACATG
[Figure labels: TAG1, TAG2, TAG3, TAG4]
Counting the tags
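A minimal sketch of tag extraction from the example concatemer, assuming the anchoring-enzyme site is CATG (as it appears in the sequence) and that each tag is the 10 nt adjacent to a CATG site. The second tag of every ditag is read from the opposite strand, so it is reverse-complemented. The tag length and function names are assumptions.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extract_tags(concatemer: str, anchor: str = "CATG", tag_len: int = 10):
    """Split a concatemer into ditags at the anchor site and pull two tags from each."""
    tags = []
    for ditag in concatemer.split(anchor):
        if len(ditag) < 2 * tag_len:
            continue  # skip empty pieces and fragments too short to hold two tags
        tags.append(ditag[:tag_len])                        # tag next to the left site
        tags.append(reverse_complement(ditag[-tag_len:]))   # tag next to the right site
    return tags


# The example concatemer from the slide, joined across its two lines
concatemer = (
    "CATGACCCACGAGCAGGGTACGATGATACATGGAAACCTATGCACCTTGG"
    "GTAGCACATG"
)
tag_counts = Counter(extract_tags(concatemer))
for tag, n in tag_counts.items():
    print(tag, n)
```

Here the concatemer yields two 24-nt ditags and therefore four tags, each seen once; over many sequenced concatemers the `Counter` totals become the expression estimates.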
24
Proteomics
  • SDS PAGE
  • 2D PAGE
  • MS/MS

25
An example SDS-PAGE
How many proteins are in a band?
Protein stains: Silver, Copper, Coomassie Blue
26
2D-PAGE
Dimension 2: size
Dimension 1: Isoelectric focusing gel
27
2D gel from macrophage phagosomes
28
Mass spectrometry
  • Mass spectrometers consist of three essential
    parts
  • Ionization source: converts peptides into
    gas-phase ions (MALDI, ESI)
  • Mass analyzer: separates ions by mass to charge
    (m/z) ratio (ion trap, time of flight,
    quadrupole)
  • Ion detector: current over time indicates amount
    of signal at each m/z value

29
MS/MS Overview
30
MS/MS Overview
33
A raw fragmentation spectrum
By calculating the molecular weight difference
between ions of the same type, the sequence can be
determined. SEQUEST uses the fragmentation
pattern to search through a complete protein
database to identify the sequence that best fits
the pattern.
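Reading a sequence from mass differences can be illustrated with an idealized b-ion ladder. The sketch assumes singly charged b ions, standard monoisotopic residue masses, unmodified residues, and a complete, noise-free ladder; real spectra are far messier, and this is not how SEQUEST itself scores candidates. The peptide is the EACDPLR example used later in the lecture.

```python
# Monoisotopic residue masses in Da (L and I have identical mass)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728


def b_ions(peptide: str):
    """m/z of singly charged b ions b1..bn for an unmodified peptide."""
    total = PROTON
    ladder = []
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        ladder.append(total)
    return ladder


def read_ladder(mz_values, tol: float = 0.01):
    """Assign a residue to each step between consecutive ions of the same type."""
    sequence = []
    previous = PROTON
    for mz in mz_values:
        diff = mz - previous
        matches = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - diff) <= tol]
        sequence.append("/".join(matches) if matches else "?")
        previous = mz
    return sequence


print(read_ladder(b_ions("EACDPLR")))
```

Leucine and isoleucine cannot be distinguished by mass, so that position comes back as `L/I`. Database search avoids the need for a complete ladder by comparing the observed peaks with predicted spectra of candidate peptides.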
34
Tandem Mass Spec (MS/MS)
35
Typical nanoelectrospray source
36
Isotope Coded Affinity Tags (ICAT)
Mass spec based method for measuring relative
protein abundances between two samples
Heavy reagent: d8-ICAT (X = deuterium)
Normal reagent: d0-ICAT (X = hydrogen)
ICAT Reagents
[Figure: ICAT reagent structure with three parts: biotin tag, linker (d0 or d8), thiol-specific reactive group]
37
Protein Quantification & Identification via ICAT
Strategy
[Figure: ICAT workflow. Mixture 1 and Mixture 2 are labeled at cysteines (ICAT-labeled cysteines; light and heavy), combined and proteolyzed (trypsin), then affinity separated (avidin). Quantitation panel: light and heavy peaks, intensity 0-100 vs. m/z 550-580. Protein identification panel: NH2-EACDPLR-COOH, intensity 0-100 vs. m/z 200-800.]
ICAT Flash animation: http://occawlonline.pearsoned.com/bookbind/pubbooks/bc_mcampbell_genomics_1/medialib/method/ICAT/ICAT.html
38
ICAT continued
  • The heavy (blue) and light (gray) peptides are
    separated and quantified to produce a ratio for
    each peptide; here, a single peptide ratio is
    shown
  • Each peptide is subjected to CID fragmentation in
    the second MS stage in order to identify it
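The ratio step can be sketched as pairing peaks separated by the d8 versus d0 mass difference. The sketch assumes the heavy and light labels differ by eight deuterium-for-hydrogen substitutions (about 8.05 Da per labeled cysteine), a known charge state, and a made-up peak list; a real pipeline would also integrate peak areas over the elution profile.

```python
# Mass added by replacing 8 hydrogens with deuterium, in Da (about 8.05)
DELTA_D8 = 8 * (2.014102 - 1.007825)


def icat_pairs(peaks, charge: int = 1, n_cys: int = 1, tol: float = 0.02):
    """Find (light m/z, heavy m/z, heavy/light ratio) for peaks spaced by the label shift.

    peaks: list of (m/z, intensity) tuples from one MS scan.
    """
    shift = n_cys * DELTA_D8 / charge
    pairs = []
    for mz_light, int_light in peaks:
        for mz_heavy, int_heavy in peaks:
            if int_light > 0 and abs((mz_heavy - mz_light) - shift) <= tol:
                pairs.append((mz_light, mz_heavy, int_heavy / int_light))
    return pairs


# Hypothetical doubly charged peptide with one labeled cysteine, plus an unrelated peak
peaks = [(560.30, 100.0), (564.325, 50.0), (570.0, 30.0)]
pairs = icat_pairs(peaks, charge=2)
print(pairs)
```

In this made-up scan the heavy form is half as intense as the light form, so the protein would be reported as roughly two-fold less abundant in the heavy-labeled sample.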

39
Metabolomic measurements
  • 2D NMR or mass spectrometry
  • Currently not global and in less widespread use
    than microarrays, but have tremendous potential

40
Gene knockout and RNAi libraries for model
species: Example from yeast
  • Replacement of yeast ORFs with kanMX gene flanked
    by unique oligo barcodes (Yeast Deletion Project
    Consortium)

41
YFP tagging for protein localization
YFP is green, transmitted light is red
NIC96: Nuclear pore
TUB1: Tubulin cytoskeleton
HHF2: Histone, nucleus
BNI4: Bud neck
Images courtesy T. Davis lab. See also recent work
by Weissman and O'Shea labs at UCSF
42
Systematic phenotyping
Barcode (UPTAG)
CTAACTC
TCGCGCA
TCATAAT

Deletion Strain
Growth 6hrs in minimal media (how many doublings?)
Rich media
Harvest and label genomic DNA
43
Systematic phenotyping with a barcode array
Ron Davis and friends
  • These oligo barcodes are also spotted on a DNA
    microarray
  • Growth time in minimal media
  • Red: 0 hours
  • Green: 6 hours
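The "how many doublings?" question and the expected barcode signal can be sketched with simple exponential growth. The doubling times used below are assumed values for illustration, not measurements from the lecture, and the pool is approximated as growing like a wild-type strain.

```python
def doublings(hours: float, doubling_time_hours: float) -> float:
    """Number of population doublings during exponential growth."""
    return hours / doubling_time_hours


def expected_log2_ratio(hours: float, td_strain: float, td_pool: float) -> float:
    """Expected log2(6 h / 0 h) barcode signal of a strain relative to the pool.

    A strain that doubles more slowly than the pool is progressively diluted.
    """
    return doublings(hours, td_strain) - doublings(hours, td_pool)


# Assumed doubling times: 2 h for the pool, 3 h for a slow-growing deletion strain
print(doublings(6, 2))               # doublings of the pool in 6 hours
print(expected_log2_ratio(6, 3, 2))  # negative: barcode is depleted at 6 hours
```

With these assumed numbers the pool doubles three times in 6 hours while the slow strain doubles twice, so its barcode would appear at about half its starting relative abundance (log2 ratio of -1) on the array.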

44
Molecular Interactions
  • Among proteins, mRNA, small molecules, and so on

46
Like sequence data, protein interaction data are
growing exponentially
[Figure: DIP database growth (total interactions) and EMBL database growth (total nucleotides, gigabases); axis ticks 0, 5, 10; years 1980-2000]
(As are the false positives!!!)
47
High-throughput methods for measuring interaction
networks
  • 2-hybrid
  • co-immunoprecipitation w/ mass spec
  • chIP-on-chip
  • systematic genetic analysis

48
Yeast two-hybrid method
Fields and Song
49
Detection of protein interactions with antibody
arrays
MacBeath and Schreiber
50
Kinase-target interactions
Mike Snyder and colleagues
51
High-throughput methods for measuring networks
  • 2-hybrid
  • co-immunoprecipitation w/ mass spec
  • chIP-on-chip
  • systematic genetic analysis

52
Protein interactions by protein
immunoprecipitation followed by mass spectrometry
TEV: Tobacco Etch Virus proteolytic site
CBP: Calmodulin binding peptide
Protein A: IgG binding, from Staphylococcus
Gavin / Cellzome
53
TAP purification
Image courtesy of Bertrand Seraphin
54
High-throughput methods for measuring networks
  • 2-hybrid
  • co-immunoprecipitation w/ mass spec
  • chIP-on-chip
  • systematic genetic analysis

55
ChIP-chip measurement of protein-DNA interactions
From Figure 1 of Simon et al. Cell 2001
56
High-throughput methods for measuring networks
  • 2-hybrid
  • co-immunoprecipitation w/ mass spec
  • chIP-on-chip
  • systematic genetic analysis

57
Genetic interactions: synthetic lethals and
suppressors
  • Genetic Interactions
  • Widespread method used by geneticists to discover
    pathways in yeast, fly, and worm
  • Implications for drug targeting and drug
    development for human disease
  • Thousands are now reported in literature and
    systematic studies
  • As with other types, the number of known genetic
    interactions is exponentially increasing

Adapted from Tong et al., Science 2001
58
Most recorded genetic interactions are synthetic
lethal relationships
[Figure: the four genotypes A B, A ΔB, ΔA B, and ΔA ΔB]
Adapted from Hartman, Garvik, and Hartwell,
Science 2001
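A genetic interaction is usually called by comparing the double mutant with what the two single mutants predict. The sketch below assumes fitness is measured relative to wild type (1.0) and uses a multiplicative expectation as the no-interaction model; the cutoffs and the function name are assumptions, and other null models are also in use.

```python
def classify_interaction(f_a: float, f_b: float, f_ab: float,
                         lethal_cutoff: float = 0.05, tol: float = 0.1) -> str:
    """Classify a gene pair from single (f_a, f_b) and double (f_ab) mutant fitness."""
    expected = f_a * f_b  # assumed multiplicative model for independent genes
    if f_ab <= lethal_cutoff and min(f_a, f_b) > lethal_cutoff:
        return "synthetic lethal"
    if f_ab < expected - tol:
        return "synthetic sick"
    if f_ab > expected + tol:
        return "suppression"
    return "no interaction"


# Hypothetical fitness values
print(classify_interaction(0.90, 0.95, 0.00))  # both singles viable, double dead
print(classify_interaction(0.50, 1.00, 0.90))  # second mutation rescues the first
print(classify_interaction(0.80, 0.90, 0.72))  # double matches the expectation
```

Synthetic lethality is the extreme case: each single deletion is viable, but the double deletion is not, which is what the four-genotype comparison on this slide depicts.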
59
Synthetic-lethal protein interaction
[Figure labels: A, B, X]
Suppressor protein interaction
[Figure labels: A, B, ΔB, X]
60
Interpretation of genetic interactions (Guarente
T.I.G. 1990)
Parallel Effects (Redundant or Additive): Single A
or B mutations typically abolish their biochemical
activities
Sequential Effects (Additive): Single A or B
mutations typically reduce their biochemical
activities
GOAL: Identify downstream physical pathways