Techniques for Analysing Microarrays

1
Techniques for Analysing Microarrays
  • Which genes are involved in ovarian and
    prostate cancer?

2
Common Questions
  • (1) Which genes are up or down in different
    conditions?
  • - Cancer patient versus normal
  • - Non-invasive cancer versus invasive cancer
  • (2) Which genes can differentiate between cancer
    sub-types?
  • (3) Which genes relate to the survival of the
    patient?
  • (4) Which genes may be in the same pathway as a
    gene of interest?

3
EOS chips
  • Use Affymetrix GeneChip technology
  • 25mers
  • 8 probes in a probe set
  • 59,000 probe sets, 46,000 gene clusters (all
    human expressed sequences known at the time)
  • Normalised distributions of all chips to each
    other (gamma distribution)
  • Single measure of intensity for each probe set
    (Tukey's trimean)
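
Tukey's trimean is (Q1 + 2 x median + Q3) / 4, which makes a probe-set summary less sensitive to a single aberrant probe than the plain mean. The lines below are a minimal sketch, not the EOS pipeline: the quartile convention (Python's "inclusive" method) and the eight probe intensities are assumptions made for illustration.

```python
from statistics import quantiles

def trimean(values):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    # The "inclusive" interpolation method is an assumption of this sketch.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4

# Hypothetical intensities for the 8 probes of one probe set; one is an outlier.
probe_intensities = [10, 11, 12, 13, 14, 15, 16, 200]
summary = trimean(probe_intensities)
```

With these made-up values the outlying probe barely moves the trimean, whereas it would dominate an arithmetic mean.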

4
Variance increases with mean
[Figure: data after normalisation; variance (log scale) plotted against mean]
[Figure: after the fix (add a constant, then take log2); variance (linear scale) plotted against mean]
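
The fix named on this slide (add a constant, then take log2) can be sketched as follows. The constant used for the EOS data is not given in the slides, so OFFSET below is a hypothetical value, and the two groups of replicate intensities are invented so that the raw spread grows with the mean.

```python
from math import log2
from statistics import variance

def started_log2(value, constant):
    """Add a constant to the intensity, then take log base 2."""
    return log2(value + constant)

OFFSET = 16  # hypothetical constant; the slides do not state the value used

# Invented replicate intensities: the spread is proportional to the mean.
low = [90, 100, 110]
high = [900, 1000, 1100]

# Ratio of variances between the high and low groups, before and after.
raw_ratio = variance(high) / variance(low)
log_ratio = (variance([started_log2(v, OFFSET) for v in high])
             / variance([started_log2(v, OFFSET) for v in low]))
```

In this toy case the variance ratio between the high and low groups is much closer to 1 after the transform than before it, which is the behaviour the two plots contrast.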
5
Which genes are differentially expressed between
ovarian cancer and normal ovaries?
  • 6 normal ovaries
  • 38 ovarian cancers
  • - 3 mucinous
  • - 5 endometrioid
  • - 30 serous

6
Statistical techniques
  • ranked t-statistics (unequal variance)
  • quantile-quantile plots against normal
    distribution
  • Westfall and Young permutation test
  • http://stat-www.berkeley.edu/users/terry/zarray/Html/
  • S. Dudoit, Y.H. Yang, M.J. Callow and
    T.P. Speed. Statistical methods for identifying
    differentially expressed genes in replicated cDNA
    microarray experiments. August 2000
  • Ratios of Cancer/Normal

7
t statistic
  • The t-statistic gets more extreme as
  • - the difference in means increases
  • - the standard deviation of each of the two samples decreases
  • - the size of the samples increases
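
The three effects listed above are visible in the unequal-variance (Welch) form of the t-statistic: the difference in means is in the numerator, and the standard deviations and sample sizes enter through the standard error in the denominator. The sketch below computes it per gene and ranks genes by it. The gene names and expression values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Unequal-variance t-statistic for two lists of values."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

def rank_by_t(expression):
    """expression: gene -> (cancer values, normal values).
    Returns gene names ordered from most negative to most positive t."""
    t_stats = {gene: welch_t(cancer, normal)
               for gene, (cancer, normal) in expression.items()}
    return sorted(t_stats, key=t_stats.get)

# Invented log-scale expression values for three hypothetical genes.
expression = {
    "geneA": ([8.1, 8.4, 7.9], [5.0, 5.3, 4.8]),
    "geneB": ([5.0, 5.2, 4.9], [5.1, 5.0, 5.2]),
    "geneC": ([3.0, 3.2, 2.9], [6.1, 6.0, 6.4]),
}
ranked = rank_by_t(expression)
```

Genes at the two ends of the ranking are the candidates for being low or high in cancer.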

[Figure: t-statistics ranked from negative (-ve) through 0 to positive (+ve)]
8
Quantile-Quantile Plot
R: library(sma) or library(base)
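
The slide points to R for the plot itself. As a language-neutral illustration of what a normal quantile-quantile plot puts on its axes, here is a small Python sketch that pairs each ordered t-statistic with the matching quantile of a standard normal distribution. The plotting position (i + 0.5) / n is one common convention and is an assumption here, not necessarily what the R functions use.

```python
from statistics import NormalDist

def normal_qq_points(t_statistics):
    """Return (theoretical normal quantile, observed value) pairs, in order."""
    ordered = sorted(t_statistics)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Invented t-statistics; the last value sits far from the rest.
points = normal_qq_points([0.3, -1.2, 5.0])
```

Points that leave the straight line at either end of the plot correspond to genes whose t-statistics are more extreme than a normal distribution would predict.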
9
Westfall and Young Permutation
tpWY program: http://www.cbil.upenn.edu/tpWY/
  • 6 normal ovaries, 38 ovarian cancers
  • Randomise labels (OvCa, N)
  • Compute t-stats
  • 100,000 iterations
  • Unadjusted p value = proportion of iterations
    where the permuted t-stat is at least as extreme
    as the observed t-stat
  • p value adjusted for multiple testing
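
The steps above can be sketched in a few lines. This is not the tpWY program: it is a simplified single-step "max |t|" variant of the Westfall and Young idea, written under the assumption that labels are shuffled at random rather than enumerated. The function names, the toy data and the default of 1,000 iterations (the slide uses 100,000) are all choices made for this illustration.

```python
import random
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Unequal-variance t-statistic for two lists of values."""
    se = sqrt(variance(group_a) / len(group_a)
              + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

def abs_t(values, labels):
    """|t| for one gene: samples labelled 1 versus samples labelled 0."""
    cancer = [v for v, lab in zip(values, labels) if lab == 1]
    normal = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(welch_t(cancer, normal))

def permutation_p_values(expression, labels, n_perm=1000, seed=0):
    """expression: gene -> list of values, one per sample.
    Returns (unadjusted p values, single-step max-|t| adjusted p values)."""
    rng = random.Random(seed)
    observed = {g: abs_t(v, labels) for g, v in expression.items()}
    raw_hits = dict.fromkeys(expression, 0)
    adj_hits = dict.fromkeys(expression, 0)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomise the cancer/normal labels
        permuted = {g: abs_t(v, shuffled) for g, v in expression.items()}
        largest = max(permuted.values())
        for g in expression:
            # Unadjusted: compare with the same gene under shuffled labels.
            raw_hits[g] += permuted[g] >= observed[g]
            # Adjusted: compare with the largest |t| over all genes.
            adj_hits[g] += largest >= observed[g]
    raw = {g: hits / n_perm for g, hits in raw_hits.items()}
    adjusted = {g: hits / n_perm for g, hits in adj_hits.items()}
    return raw, adjusted

# Toy data: 4 normal (0) and 4 cancer (1) samples, two hypothetical genes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expression = {
    "up":   [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0],
    "flat": [5.1, 1.2, 3.3, 4.4, 2.5, 4.6, 1.7, 3.8],
}
raw_p, adjusted_p = permutation_p_values(expression, labels, n_perm=500, seed=1)
```

Because the largest permuted |t| is never smaller than any single gene's permuted |t|, the adjusted p value in this sketch is always at least as large as the unadjusted one.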

10
How many genes were statistically significant?
  • Ovarian Cancer > Normal (candidates for
    antibody therapy?)
  • - 110 candidates (adjusted p < 0.01)
  • - 181 candidates (adjusted p < 0.05)
  • Ovarian Cancer < Normal (candidates for
    tumor suppressor genes?)
  • - 7 candidates (adjusted p < 0.01)
  • - 15 candidates (adjusted p < 0.05)
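
Counts like those above come from thresholding the adjusted p values and splitting by the sign of the t-statistic. A minimal sketch, with an invented results table (gene names, t-statistics and adjusted p values are all hypothetical):

```python
def count_candidates(results, alpha):
    """results: gene -> (t_statistic, adjusted_p).
    Returns (number high in cancer, number low in cancer) below alpha."""
    high = sum(1 for t, p in results.values() if p < alpha and t > 0)
    low = sum(1 for t, p in results.values() if p < alpha and t < 0)
    return high, low

# Invented values for illustration only.
results = {
    "gene1": (6.2, 0.004),
    "gene2": (4.1, 0.03),
    "gene3": (-5.0, 0.008),
    "gene4": (0.5, 0.9),
}
strict = count_candidates(results, 0.01)
relaxed = count_candidates(results, 0.05)
```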

11
High in cancer
[Table shown in Excel]
12
Low in cancer
How can we deal with (a) biological variation? (b) more than one cause for cancer?
[Table shown in Excel]
13
Which genes are differentially expressed between
non-invasive and invasive ovarian cancer?
No. samples     Non-invasive   Invasive
Mucinous        5              4
Endometrioid    1              7
Serous          2              33
Future: model all variables together
Now: ranked t-stats, qq plots
14
Assume equal variance for t-stats?
e.g. mucinous cancer
Ratio of variances = S² invasive (n = 4) / S² non-invasive (n = 5)
[Figure: ratio of variances plotted against theoretical quantiles (F distribution)]
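
If the two groups have equal variance (and the data are roughly normal), the per-gene ratio of sample variances follows an F distribution, here with 3 and 4 degrees of freedom for n = 4 and n = 5. The sketch below only computes and orders the observed ratios. The theoretical quantiles to plot them against are left out because they need a statistics library (for example `scipy.stats.f.ppf`, if SciPy is available). Gene names and values are invented.

```python
from statistics import variance

def variance_ratios(genes):
    """genes: gene -> (invasive values, non-invasive values).
    Returns the per-gene ratios S2 invasive / S2 non-invasive, sorted."""
    return sorted(variance(invasive) / variance(non_invasive)
                  for invasive, non_invasive in genes.values())

# Invented expression values: 4 invasive and 5 non-invasive samples per gene.
genes = {
    "geneA": ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0, 10.0]),
    "geneB": ([5.0, 5.5, 6.5, 7.0], [6.0, 6.2, 5.9, 6.1, 6.3]),
}
ratios = variance_ratios(genes)
```

Ratios that sit well away from the F quantiles would argue against assuming equal variance in the t-statistic.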
15
What to do when n = 2?
Assume equal variance? Error model?
16
Limitations of Westfall Young permutation method
No. samples     Non-invasive   Invasive   No. permutations
Mucinous        5              4          126
Endometrioid    1              7          ---
Serous          2              33         595
Not enough power when sample sizes are small?
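
The permutation counts on this slide are the number of distinct ways to split the samples into two groups of the given sizes: 9 choose 5 = 126 for mucinous and 35 choose 2 = 595 for serous. Since the observed labelling is itself one of these, an unadjusted permutation p value cannot fall below 1 divided by that count. A short sketch of the arithmetic (the function name is chosen for this illustration):

```python
from math import comb

def n_relabellings(n_non_invasive, n_invasive):
    """Number of distinct ways to assign the two labels to the samples."""
    return comb(n_non_invasive + n_invasive, n_non_invasive)

# Sample sizes taken from the slide: (non-invasive, invasive).
sample_sizes = {"Mucinous": (5, 4), "Endometrioid": (1, 7), "Serous": (2, 33)}

for subtype, (n_non, n_inv) in sample_sizes.items():
    total = n_relabellings(n_non, n_inv)
    smallest_p = 1 / total  # floor on the unadjusted permutation p value
    print(f"{subtype}: {total} relabellings, smallest p = {smallest_p:.4f}")
```

The same formula gives only 8 relabellings for the endometrioid comparison, for which the slide reports no count.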
17
Mucinous non-invasive versus invasive
R: library(base)
18
Which genes relate to prognosis of patients with
prostate cancer?
  • 72 patients with prostate cancer
  • Treatment: radical prostatectomy
  • 17 relapsed (PSA rise > 0.4 ng/ml)

Methods: R survival package, SAS
19
Cox Proportional Hazards Model
Exponential (involves gene and PSA; independent of time)
Baseline hazard (independent of gene expression or PSA)
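
In standard notation, the two pieces labelled on this slide combine into the usual Cox model with two covariates. The symbol names below are assumptions chosen for this illustration.

```latex
h(t \mid x_{\mathrm{gene}}, x_{\mathrm{PSA}})
  = h_0(t)\,\exp\!\left(\beta_{\mathrm{gene}}\, x_{\mathrm{gene}}
                      + \beta_{\mathrm{PSA}}\, x_{\mathrm{PSA}}\right)
```

Here h_0(t) is the baseline hazard, which depends on time only, and the exponential factor depends on gene expression and PSA only.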
20
[Figure: panels A and B; legend: relapsed]
21
Survival Curves: Gene PSA model
[Figure: survival curves S(t) against time (disease-free months) for High (> 25th percentile) and Low (< 25th percentile) groups; two panels, one labelled B]
22
Hazard Ratio 75th/25th percentile
Probe set   Hazard ratio (95% CI)    Unadjusted p value
A           0.26 (0.12 to 0.54)      0.000351
B           0.32 (0.16 to 0.67)      0.002151
False discovery rate for top 50 candidates is 20% (SAM)
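
In a Cox model, a hazard ratio between the 75th and 25th percentiles of expression is the exponential of the gene coefficient times the difference between the two percentile values. The sketch below shows that arithmetic together with a normal-approximation 95% interval. The coefficient, its standard error and the percentile values are hypothetical and are not the fitted values behind the table.

```python
from math import exp

def percentile_hazard_ratio(beta, x25, x75):
    """Hazard ratio for expression at the 75th versus the 25th percentile."""
    return exp(beta * (x75 - x25))

def hazard_ratio_ci(beta, standard_error, x25, x75, z=1.96):
    """Approximate 95% confidence interval for the same hazard ratio."""
    delta = x75 - x25
    return (exp((beta - z * standard_error) * delta),
            exp((beta + z * standard_error) * delta))

# Hypothetical fitted values for one probe set.
beta, standard_error = -0.8, 0.3
x25, x75 = 7.0, 8.5

ratio = percentile_hazard_ratio(beta, x25, x75)
lower, upper = hazard_ratio_ci(beta, standard_error, x25, x75)
```

A ratio below 1, as for both probe sets in the table, means that higher expression goes with a lower hazard of relapse.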
23
Summary
  • (1) Which genes are up or down in different
    conditions?
  • - ranked t-statistics
  • - qq plots (normal distribution)
  • - Westfall and Young permutations (multiple
    testing)
  • (2) Which genes relate to the survival of the
    patient?
  • - Cox proportional hazards
  • - SAM (multiple testing)

24
Acknowledgements
  • Garvan
  • - Sue Henshall, Rob Sutherland, Patricia Vanden Bergh
  • EOS
  • - Jordan Hiller, Daniel Afar, Kurt Gish, David Mack
  • Royal Hospital for Women
  • - Nigel Hacker
  • ANU/John Curtin
  • - John Maindonald, Yvonne Pittelkow
  • Walter and Eliza Hall Institute
  • - Terry Speed, Natalie Thorne
  • University of Queensland
  • - Jessica Marr