Darlene Goldstein - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
A Comparison of Microarray Platforms
NUS IMS Workshop, 7 January 2004
Darlene Goldstein
2
Talk Outline
  • Bioinformatics Core Facility at ISREC
  • Purpose of study
  • Platform technologies and study design
  • Comparisons between platforms
  • Conclusions and study completion

3
BCF: What is it?
  • ISREC-based, supported by the NCCR for molecular
    oncology, member group of the SIB
  • Created by the NCCR molecular oncology to assist
    its DAF (which is now absorbed into the DAFL) and
    its microarray users in their biomedical research
  • A group devoted to the bioinformatics and
    statistical aspects of gene expression research,
    in particular to the analysis of data generated
    with microarray technologies

4
BCF: Main Components
  • Technical Support
    • advice in experimental design and data analysis
    • production, control, development of spotted
      arrays
    • processing of microarray data, quality assessment
  • Education
    • practical training through classes / workshops
  • Collaboration
    • statistical data analysis of research projects
  • Research & Development
    • development / testing of tools and methods

5
Platform Comparison Study
  • Purpose
    • to assess accuracy and reproducibility of
      different gene expression platforms
    • to compare features of different measurement
      types
    • to understand the system (important for
      normalization and downstream analysis)
  • Impact
    • practical advice to DAF(L) and to NCCR microarray
      users
    • benefit to wider scientific community, especially
      if it is possible to somehow combine results across
      array types

6
Platforms and Study Design
  • Platforms
    • Affymetrix GeneChips, high-density short oligo
      arrays
    • Agilent long oligo arrays
    • in-house spotted cDNA arrays
    • MPSS (massively parallel signature sequencing, a
      digital gene expression technology patented by
      Lynx), in collaboration with the Ludwig Institute
      for Cancer Research; originally intended as gold
      standard
  • Basic Design
    • 3 replicate measurements for two mRNAs (human
      placenta and testis)
    • dye swap for two-color systems (Agilent, cDNA)
    • 2 to 3 million tags sequenced for MPSS
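
The dye-swap step for the two-color systems can be illustrated with a minimal sketch. It assumes that dye bias is additive on the log-ratio scale and that each hybridization reports M as log2(red / green); the function name `dye_swap_m` is hypothetical, and the slides do not say how the study actually combined the swapped pairs.

```python
def dye_swap_m(m_forward, m_swapped):
    """Combine one forward and one dye-swapped log ratio for the same spot.

    Assumed layout (not stated in the slides):
      forward: sample 1 labeled red,   sample 2 labeled green
      swapped: sample 2 labeled red,   sample 1 labeled green
    Each M is log2(red / green). If the dye adds the same bias b to both,
    m_forward = m_true + b and m_swapped = -m_true + b, so the
    half-difference recovers m_true and the bias cancels.
    """
    return (m_forward - m_swapped) / 2.0


# True log ratio 2.0 with a dye bias of +1.0 in both hybridizations.
print(dye_swap_m(3.0, -1.0))
```

A bias that depends on intensity would not cancel exactly this way, which is one reason normalization is usually applied before the pairs are combined.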

7
Methods
  • Experimental Method (as recommended by
    specialists)
    • Affymetrix: Biozentrum Basel
    • Agilent: Institut Gustave Roussy, Paris
    • Spotted cDNA arrays: Otto Hagenbuechle's group
      (DAF, now DAFL)
    • MPSS: Lynx (California), Victor Jongeneel's group
      (LICR)
    • qRT-PCR follow-up (250 genes): Robert Lyle,
      Patrick Descombes (UniGE)
  • Expression Quantification
    • as recommended by specialists (above),
      but RMA for Affymetrix

8
Spotted cDNA arrays
Human 10k Array 8x4 subarrays
9
Affymetrix GeneChips
Image of hybridized array
10
MPSS
  • Uses microbeads with 100k identical DNA
    molecules attached
  • Captures and identifies transcript sequences of
    expressed genes by counting the number of
    individual mRNA molecules representing each gene
  • Individual mRNAs are identified through a generated
    17- to 20-base signature sequence
  • Can be used without organism sequence information
  • MPSS can accurately quantify transcripts as low
    as 5 transcripts per million (tpm) to above
    50,000 tpm

(information from Lynx web site)
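
As a rough illustration of the tpm unit, the sketch below converts raw signature counts into transcripts per million. The signature labels and counts are made up, and `tags_per_million` is a hypothetical helper, not part of any Lynx software.

```python
def tags_per_million(counts):
    """Scale raw signature counts to transcripts per million (tpm).

    counts: dict mapping a signature (or gene) label to its tag count.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tags were counted")
    return {label: 1e6 * c / total for label, c in counts.items()}


# Made-up library of exactly one million sequenced tags.
library = {"sigA": 5, "sigB": 999_995}
print(tags_per_million(library)["sigA"])
```

With 2 to 3 million tags sequenced, a transcript at 5 tpm is expected to contribute only 10 to 15 tags, so counts at the low end carry substantial sampling error.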
11
Other comparison studies (I)
  • Yuen et al. 2002, Nuc. Acids Res. 30(10):e48
    • Affy MGU-74A, cDNA; cell lines; qRT-PCR 47 genes
    • both arrays sensitive (TP) and specific (TN) at
      identifying regulated transcripts
    • found comparable rank-order of gene regulation,
      but only modest correlation in fold-change
    • both array types biased downwards (FC
      under-estimated compared to qRT-PCR)
  • Evans et al. 2002, Eur. J. Neuroscience
    16:409-413
    • Affy RG-U34A, SAGE to detect brain transcripts;
      43 rat hippocampi; evaluation based on 1000
      transcripts
    • 55 low, 90 high abundance transcripts detected

12
Other comparison studies (II)
  • Li et al. 2002, Toxicological Sciences 69:383-390
    • Affy HuGene FL, HGU-95Av2, IncyteGenomics UniGemV
      2.0 (long cDNA); drug-treated cell lines at 8h
      and 24h; qRT-PCR 9 genes
    • cross-hyb contributed to platform discrepancies
    • found Affy more reliable (sensitive)
  • Kuo et al. 2002, Bioinformatics 18:405-412
    • Affy HU6800, cDNA; publicly available data on NCI
      60; 2895 genes
    • found low correlation between measurements (but
      no control over lab procedures; different groups
      had performed the original studies)

13
Other comparison studies (III)
  • Barczak et al. 2003, Genome Res. 13:1775-1785
    • 2 versions of spotted long oligo (Operon), Affy
      HGU-95Av2; cell lines; 7344 genes
    • this large-scale analysis found strong
      correlations between relative expression
      measurements
    • similar results for amplified and unamplified
      targets
  • Tan et al. 2003, Nuc. Acids Res. 31:5676-5684
    • Agilent Human 1, Affy HGU-95Av2, Amersham
      Codelink UniSet Human I (30-mers); cell lines in
      serum-rich medium and 24h after serum removal;
      2009 genes
    • modest correlations
    • little overlap in genes called DE
    • best agreement on DE calls (varying criteria)
      only 21
  • comparison studies by other groups world-wide are
    also in progress

14
Comparison Principle
  • Cross-platform gene matching is done through the
    trome database of transcripts (constructed with
    the Transcriptome Analyzer program tromer)
  • Use only those genes we classify as reliably
    mapped between platforms (2500 genes); we have
    not (yet) looked at probe(set)s that could not be
    well-mapped to known transcripts
  • Peak technical performance: this is a case
    study, not a systematic study; it does not take into
    account normal user variation, other mRNAs, etc.
  • Comparison based on M (log ratio) and A (average
    log intensity)
  • Unfortunately, accuracy cannot be properly
    assessed, as true M values are not known
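
For reference, M and A can be computed from a pair of intensities as in the minimal sketch below. It uses the standard definitions (log ratio and average log intensity, both base 2); the function name `ma_values` and the sample labels are illustrative assumptions, and background correction and normalization are left out.

```python
import math


def ma_values(intensity_1, intensity_2):
    """Return (M, A) for one gene measured in two samples.

    M = log2(intensity_1 / intensity_2)          (log ratio)
    A = (log2(intensity_1) + log2(intensity_2)) / 2   (average log intensity)
    Intensities must be positive.
    """
    log_1 = math.log2(intensity_1)
    log_2 = math.log2(intensity_2)
    return log_1 - log_2, (log_1 + log_2) / 2.0


# A gene four times brighter in sample 1 than in sample 2.
print(ma_values(8.0, 2.0))
```

For single-channel platforms such as Affymetrix, the two intensities come from two separate arrays; for two-color arrays they are the two channels of one spot.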

15
cDNA array Performance
16
MA plots (examples)
[Figure: MA plots for Affy U133A, NCCR h10kd, and Agilent; annotation: range background]
17
M (putative effect) densities
18
(Difference in M) vs. A: reproducibility
[Figure: panels for Affy U133A, Agilent h1A, and NCCR h10k; y = difference in M, x = average A]
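
The quantities plotted on this slide can be sketched as follows, assuming two replicate measurements of the same genes on one platform, each given as (M, A) pairs in the same gene order. The helper name `replicate_differences` is hypothetical.

```python
def replicate_differences(rep1, rep2):
    """Pair up two replicates and return (average A, difference in M) per gene.

    rep1, rep2: lists of (M, A) tuples for the same genes, in the same order.
    The difference in M between replicates is a measure of error, since the
    true M is identical for both replicates.
    """
    if len(rep1) != len(rep2):
        raise ValueError("replicates must cover the same genes")
    points = []
    for (m1, a1), (m2, a2) in zip(rep1, rep2):
        points.append(((a1 + a2) / 2.0, m1 - m2))
    return points


rep_a = [(2.0, 10.0), (0.5, 6.0)]
rep_b = [(1.5, 11.0), (1.0, 6.0)]
print(replicate_differences(rep_a, rep_b))
```

Plotting the second element against the first shows how reproducibility changes with signal intensity.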
19
D (error) densities
20
Gene Matching
Probe(set)s / genes
  • Agilent h1A: 18325 / 15688
  • Affy U133A: 24808 / 14876
  • NCCR h10k: 7812 / 6853
21
Gene matching also with MPSS
  • 2494 Tromer clusters
  • 4060 Affy probesets
  • 2869 Agilent probes
  • 2685 NCCR clones
22
Concordance in M density plots (I)
Agilent Affy NCCR
23
Concordance in M density plots (II)
Agilent Affy NCCR
24
Difficulty in comparing to MPSS ratios
25
MPSS difficulties, another illustration
26
Correlations
first quartile (25% least frequent RNAs)
fourth quartile (25% most frequent RNAs)
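
One way to compute correlations per abundance quartile is sketched below. It assumes an abundance value per gene (for example an RNA frequency) and M values from two platforms for the same genes; the slides do not state which abundance measure or correlation coefficient was used, so Pearson correlation and the function names are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def quartile_correlations(abundance, m_1, m_2):
    """Correlate m_1 with m_2 within each quartile of abundance.

    Returns four values, from the least to the most abundant quartile.
    """
    order = sorted(range(len(abundance)), key=lambda i: abundance[i])
    n = len(order)
    result = []
    for q in range(4):
        idx = order[q * n // 4:(q + 1) * n // 4]
        result.append(pearson([m_1[i] for i in idx], [m_2[i] for i in idx]))
    return result


# Toy data: 8 genes, so each quartile holds 2 genes.
abund = [1, 2, 3, 4, 5, 6, 7, 8]
m_x = [0, 1, 0, 1, 0, 1, 0, 1]
m_y = [0, 1, 0, 1, 0, 1, 1, 0]
print(quartile_correlations(abund, m_x, m_y))
```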
27
Agreement top up 200 (placenta)
M range: Affy 1.66 to 7.94; Agilent 1.48 to 6.17; NCCR 1.83 to 7.12
28
Agreement top down 200 (testis)
M range: Affy -8.27 to -1.65; Agilent -6.07 to -1.47; NCCR -6.18 to -1.79
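
The top-200 agreement can be sketched as a set intersection of ranked gene lists. The snippet assumes one M value per matched gene and platform; the gene labels and values are made up, and `top_n` and `overlap` are hypothetical helper names.

```python
def top_n(m_by_gene, n, direction="up"):
    """Return the n genes with the largest ("up") or smallest ("down") M."""
    ranked = sorted(m_by_gene, key=m_by_gene.get, reverse=(direction == "up"))
    return set(ranked[:n])


def overlap(*gene_sets):
    """Genes present in every one of the given sets."""
    return set.intersection(*gene_sets)


# Made-up M values for four matched genes on two platforms.
affy = {"g1": 3.0, "g2": 2.0, "g3": -1.0, "g4": 0.5}
agilent = {"g1": 2.5, "g2": 0.1, "g3": -2.0, "g4": 1.0}

print(overlap(top_n(affy, 2), top_n(agilent, 2)))
print(overlap(top_n(affy, 1, "down"), top_n(agilent, 1, "down")))
```

In the study the lists would hold 200 genes each, and the same call accepts a third set for the NCCR arrays.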
29
Comparison with MPSS, 99% CI (up)
30
Comparison with MPSS, 99% CI (down)
31
MPSS CI Overlap
Overlap with the 99% CI for MPSS
Overlap with the 99.9% CI for MPSS
32
Overlap with MPSS
[Venn diagram: MPSS vs. NCCR; region counts 38, 74, 88, 112]
  • similar numbers also for Affy and Agilent
  • 56 of the 88 are in common to all 4
  • diagram note: missing or classified as unreliably
    mapped (tag to gene not unique)
33
Conclusions (I)
  • The three microarray platforms compared performed
    very similarly in terms of which genes are
    detected as differentially expressed,
    distributions of M values, variability between
    replicate measurements ...
  • ... so similarly that it seems hard to find real
    differences
  • Most disagreement for low-expressed genes
  • RMA M values (Affy) are better variance-stabilized,
    but reproducibility is good for all platforms
    except for weak signals in Agilent (likely due to
    background treatment)
  • RMA M values are more strongly compressed towards
    zero at low intensity; this reduces false positive
    calls but might make DE at low intensity
    undetectable (but is it detectable at all?)

34
Conclusions (II)
  • Microarrays vs MPSS
    • M values, quantitative comparison
      • the disagreement is large ...
      • ... so large that it is hard to reconcile
        the values, making it impossible to use MPSS as
        the gold standard
    • M values, qualitative comparison
      • there is a good degree of agreement,
        approximately the same for all three
        microarray platforms

35
Conclusions (III)
  • MPSS predicts many more low-abundance genes to be
    (strongly) differentially expressed
  • The hybridization methods lose signal of
    low-abundance genes (due to the background
    fluorescence estimation?)
  • microarrays miss detection of most of the
    differential expression of low abundance
    transcripts, but it is also possible that MPSS is
    biased for many genes or less precise than this
    approach suggests
  • approach with confidence intervals for MPSS
    (currently an approximate CI that takes into
    consideration the sampling error on the counts;
    we have no replicated measurements for MPSS)
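
The slides do not give the formula behind the approximate CI. One common approximation that accounts only for sampling error on the counts treats each tag count as Poisson and applies the delta method to the log ratio. The sketch below uses that swapped-in technique, so it should be read as an assumption rather than the study's method; `log2_ratio_ci` and its parameters are hypothetical names.

```python
import math
from statistics import NormalDist


def log2_ratio_ci(count_1, total_1, count_2, total_2, level=0.99):
    """Approximate CI for M = log2 of the ratio of two tag frequencies.

    Assumes Poisson counts, so var(ln count) is roughly 1 / count
    (delta method). Only sampling error is covered; biological and
    technical replicate variation is not.
    """
    if count_1 <= 0 or count_2 <= 0:
        raise ValueError("both counts must be positive")
    m = math.log2((count_1 / total_1) / (count_2 / total_2))
    se = math.sqrt(1.0 / count_1 + 1.0 / count_2) / math.log(2)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return m - z * se, m + z * se


# Same frequency in both libraries: the interval is centered on M = 0.
low, high = log2_ratio_ci(100, 2_000_000, 100, 2_000_000)
print(low, high)
```

A microarray M value can then be checked against the interval with `low <= m_array <= high`. The interval widens quickly as counts fall, so low-abundance tags give very wide intervals.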

36
Completion of Study
  • Choose genes for qRT-PCR for which the platforms
    and MPSS disagree and (attempt to) address the
    questions
    • which platform is more accurate?
    • how does accuracy depend on the signal intensity?
    • do the microarrays miss DE frequently ...
    • ... and especially at weak signal intensity?
    • which platform best detects low abundance RNAs?
    • does MPSS agree with qRT-PCR?
  • Suggestions are welcome!

37
Acknowledgements
  • Ludwig Institute for Cancer Research
    • Victor Jongeneel, Christian Iseli, Brian
      Stephenson
  • DAF/DAFL
    • Otto Hagenbuechle, Josiane Wyniger
  • UniGE
    • Robert Lyle, Patrick Descombes
  • BCF
    • Mauro Delorenzi, Eugenia Migliavacca
  • and everyone I inadvertently left out!