Title: From Probe .Cel to Expression Level


1
From Probe .Cel to Expression Level
Christine Steinhoff
Max Planck Institut für Molekulare Genetik
Computational Molecular Biology, Berlin
2
Outline
  • Choice of Technology
  • Affymetrix Technology
  • Low Level Analysis Problems
    • Background
    • PM/MM
    • Summary Statistic
  • Comparison of different Low Level Analysis Procedures
    • MAS
    • Li/Wong
    • RMA
  • Comparison of different Normalization Strategies

3
Data processing
4
Choice of Technology
Red-green experiments
Affymetrix experiments
5
Choice of Technology
Patient Control
6
Levels of Replication
Hybridization
7
From probe .cel to expression
... TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAG
GCC
TTACCCAGTCTTCCTGAGGATACACCCAC
TTACCCAGTCTTGCTGAGGATACACCCAC
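Assuming the two probe sequences above form a perfect match (PM) / mismatch (MM) pair (the slide does not label them), a short Python sketch can locate where they differ. The helper name `mismatch_positions` is hypothetical and used only for illustration.

```python
def mismatch_positions(probe_a, probe_b):
    """Return (1-based position, base_a, base_b) for every differing base."""
    if len(probe_a) != len(probe_b):
        raise ValueError("probes must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(probe_a, probe_b), start=1)
        if a != b
    ]


# The two probe sequences as printed on the slide.
probe_1 = "TTACCCAGTCTTCCTGAGGATACACCCAC"
probe_2 = "TTACCCAGTCTTGCTGAGGATACACCCAC"
differences = mismatch_positions(probe_1, probe_2)
```

The two sequences differ in a single base, which is what one would expect from a PM/MM pair.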
8
Probe .cel: Problems
  • Background subtraction
  • PM / MM
  • Wanted: one value per probe set. How to summarize?
  • Variances: across arrays and within a probe set
1.5 2.4 10.4 0.1 ... 1.3 3.4
9
what has been done?
MAS 5.0: Background
  • The array is split up into K rectangular zones (default K = 16).
  • Control cells and masked cells are not used.
  • Ranking cells: $Z_{bg}$ = lowest 2% of cells for that zone (average background of that zone).
  • Smoothing: $d_k(x,y)$ = distance from the center of zone $k$ to some coordinate $(x,y)$

$$w_k(x,y) = \frac{1}{d_k^2(x,y) + s} \quad (\text{default } s = 100)$$

$$\text{background}(x,y) = \frac{\sum_k w_k(x,y)\, Z_{bg,k}}{\sum_k w_k(x,y)}$$
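The distance-weighted smoothing of zone backgrounds can be sketched in Python as follows. This is a minimal illustration, not the Affymetrix implementation: the function name `smoothed_background` and the inputs `zone_centers` and `zone_backgrounds` (one precomputed background value per zone) are assumptions made for the example, and the numbers are made up.

```python
def smoothed_background(x, y, zone_centers, zone_backgrounds, s=100.0):
    """Weighted average of the zone backgrounds at cell (x, y).

    Each zone k gets weight w_k = 1 / (d_k^2 + s), where d_k is the
    distance from (x, y) to the zone center and s is the smoothing
    constant (default 100, as on the slide).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (cx, cy), z_bg in zip(zone_centers, zone_backgrounds):
        d_squared = (x - cx) ** 2 + (y - cy) ** 2
        w = 1.0 / (d_squared + s)
        weighted_sum += w * z_bg
        weight_total += w
    return weighted_sum / weight_total


# Toy example with two zones (hypothetical values).
centers = [(0.0, 0.0), (20.0, 0.0)]
backgrounds = [50.0, 70.0]
bg_midpoint = smoothed_background(10.0, 0.0, centers, backgrounds)
bg_at_first_center = smoothed_background(0.0, 0.0, centers, backgrounds)
```

Halfway between the two centers both zones contribute equally; at the first center the estimate is pulled toward that zone's background without ignoring the neighbour.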
10
what has been done?
MAS 5.0: PM - MM
Signal calculation:
  1. Cell intensities are preprocessed for global background.
  2. Ideal Mismatch (IM) is calculated and subtracted to adjust PM.
  3. Biweight estimator as robust mean of the resulting values.
  4. Signal is scaled using a trimmed mean.

$$V_{i,j} = \max(PM_{i,j} - IM_{i,j},\ d), \quad \text{default } d = 2^{-20}$$

IM: Ideal Mismatch, computed depending on whether MM > PM or MM < PM

$$PV_{i,j} = \log(V_{i,j}), \quad j = 1, \dots, n_i$$
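The floor-and-log step for the probe values can be sketched in Python. The sketch assumes the Ideal Mismatch values are already available (their computation is not shown) and takes the logarithm as base 2, which is an assumption since the slide only writes "log". The function name `adjusted_log_values` is hypothetical.

```python
import math


def adjusted_log_values(pm, im, d=2.0 ** -20):
    """Return PV_j = log2(max(PM_j - IM_j, d)) for each probe pair j.

    The floor d keeps the argument of the logarithm strictly positive
    when the ideal mismatch is not smaller than the perfect match.
    """
    return [math.log2(max(p - m, d)) for p, m in zip(pm, im)]


# Toy example: the second pair has IM > PM, so the floor d applies.
pv = adjusted_log_values([100.0, 50.0], [36.0, 80.0])
```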
11
what has been done?
MAS 5.0: Summary

$$\text{SignalLogValue} = T_{bi}(PV_{i,1}, \dots, PV_{i,n_i}) \quad \text{(one-step Tukey's biweight)}$$

$$u = \frac{x - \operatorname{Median}(PV_{i,1}, \dots, PV_{i,n_i})}{\text{const} \cdot \text{MAD} + \text{eps}}$$

$$w(u) = \begin{cases} (1 - u^2)^2 & \text{for } |u| < 1 \\ 0 & \text{else} \end{cases}$$
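A one-step Tukey biweight of this form can be sketched in Python with the standard library. The tuning values `c=5.0` and `eps=1e-4` stand in for "const" and "eps" and are assumptions for the example; the function name `tukey_biweight` is hypothetical.

```python
from statistics import median


def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight: a weighted mean that downweights outliers.

    u = (x - median) / (c * MAD + eps); w(u) = (1 - u^2)^2 for |u| < 1, else 0.
    """
    m = median(values)
    mad = median([abs(v - m) for v in values])
    weighted_sum = 0.0
    weight_total = 0.0
    for v in values:
        u = (v - m) / (c * mad + eps)
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted_sum += w * v
        weight_total += w
    return weighted_sum / weight_total


# Toy probe values with one clear outlier (14.0).
probe_values = [8.0, 8.1, 7.9, 8.05, 7.95, 14.0]
signal_log_value = tukey_biweight(probe_values)
plain_mean = sum(probe_values) / len(probe_values)
```

In this toy probe set the outlier receives weight zero, so the biweight stays close to 8 while the plain mean is dragged up to 9.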
12
what has been done?
Li/Wong (PNAS 2001, vol. 98 (1), pp. 31-36)
Model:

$$MM_{ij} = \nu_j + \theta_i \alpha_j + \epsilon$$

$$PM_{ij} = \nu_j + \theta_i \alpha_j + \theta_i \phi_j + \epsilon$$

  • $\nu_j$: baseline
  • $\theta_i$: expression for the gene in the $i$-th sample
  • $\alpha_j$: rate of increase of the MM response of the $j$-th probe pair
  • $\phi_j$: additional rate of increase in the corresponding PM response
  • $\epsilon$: random error
13
what has been done?
Li/Wong: Summary Statistic
Least squares fitting to

$$PM_{ij} - MM_{ij} = \theta_i \phi_j + \epsilon_{ij}, \quad \epsilon_{ij} \sim N(0, \sigma^2)$$

gives the least squares estimate for $\theta$.
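One common way to obtain such a least squares fit is to alternate between the probe parameters and the expression indices. The Python sketch below is an illustration under stated assumptions, not the authors' code: it fixes the scale by requiring the squared probe sensitivities to sum to the number of probes, omits outlier handling, and uses the hypothetical name `fit_li_wong`.

```python
def fit_li_wong(diff, n_iter=50):
    """Alternating least squares for diff[i][j] ~ theta[i] * phi[j].

    diff[i][j] is PM - MM for sample i and probe pair j.
    Scale constraint (an assumption here): sum_j phi_j^2 = number of probes.
    """
    n_samples, n_probes = len(diff), len(diff[0])
    # Start from the per-sample mean difference.
    theta = [sum(row) / n_probes for row in diff]
    phi = [1.0] * n_probes
    for _ in range(n_iter):
        # Given theta, the least squares estimate of each phi_j.
        tt = sum(t * t for t in theta)
        phi = [
            sum(theta[i] * diff[i][j] for i in range(n_samples)) / tt
            for j in range(n_probes)
        ]
        # Rescale phi to satisfy the constraint.
        scale = (n_probes / sum(p * p for p in phi)) ** 0.5
        phi = [p * scale for p in phi]
        # Given phi, the least squares estimate of each theta_i.
        pp = sum(p * p for p in phi)
        theta = [
            sum(phi[j] * diff[i][j] for j in range(n_probes)) / pp
            for i in range(n_samples)
        ]
    return theta, phi


# Noise-free toy data built from known parameters (made-up values).
phi_true = [p * (3.0 / 3.5) ** 0.5 for p in (0.5, 1.0, 1.5)]
theta_true = [100.0, 200.0, 400.0]
toy_diff = [[t * p for p in phi_true] for t in theta_true]
theta_hat, phi_hat = fit_li_wong(toy_diff)
```

On noise-free data the procedure recovers the parameters used to build the matrix; with real data the residuals would additionally be inspected for outlying probes and arrays.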
14
what has been done?
RMA: Irizarry/Bolstad/Speed (NAR, 2003, 31(4), e15)
Background correction on the raw intensity scale (subtraction)
Signal model: PM = background + signal = bg + s

$$B(PM) = E(s \mid PM)$$

  • s: exponential
  • bg: normal (optical noise, non-specific binding)
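For an exponential signal plus normal background, the conditional expectation E(s | PM) has a closed form in terms of the normal density and distribution function. The Python sketch below evaluates it under the assumption that the background mean `mu`, background standard deviation `sigma`, and exponential rate `alpha` are already known; how they are estimated from the data is not shown, and the function name is hypothetical.

```python
import math


def _normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rma_background_correct(pm, mu, sigma, alpha):
    """E(s | PM = pm) for s ~ Exponential(alpha), bg ~ Normal(mu, sigma^2).

    With a = pm - mu - sigma^2 * alpha and b = sigma:
    E = a + b * (pdf(a/b) - pdf((pm - a)/b)) / (cdf(a/b) + cdf((pm - a)/b) - 1)
    """
    a = pm - mu - sigma ** 2 * alpha
    b = sigma
    numerator = _normal_pdf(a / b) - _normal_pdf((pm - a) / b)
    denominator = _normal_cdf(a / b) + _normal_cdf((pm - a) / b) - 1.0
    return a + b * numerator / denominator


# Hypothetical parameters: background around 100 with sd 20, rate 0.01.
bright = rma_background_correct(200.0, mu=100.0, sigma=20.0, alpha=0.01)
dim = rma_background_correct(50.0, mu=100.0, sigma=20.0, alpha=0.01)
```

Unlike a plain subtraction of the background mean, the corrected value stays positive even for a probe whose intensity lies below the estimated background.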
15
what has been done?
RMA: PM, MM
  • Forget about MM.
  • Reason: mathematical subtraction does not translate into biological meaning.
  • Future: improve BG correction by using MMs.
16
what has been done?
RMA: Summary Statistic

$$Y_{ijn} = \mu_{in} + \alpha_{jn} + \epsilon_{ijn}, \quad i = 1, \dots, I \text{ (chips)},\ j = 1, \dots, J \text{ (probes)},\ n = 1, \dots, N \text{ (probe sets)}$$

  • $\alpha_{jn}$: probe affinity effect
  • $\mu_{in}$: log scale expression level
  • $\epsilon_{ijn}$: error, iid $N(0, \sigma^2)$
  • $\sum_j \alpha_{jn} = 0 \ \forall n$ -> median polish

Note: Irizarry et al. (2003) recommend normalization first, then parameter estimation.
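Median polish for a single probe set can be sketched in Python as below. It is a minimal version of Tukey's procedure, not the Bioconductor implementation; the function name and the toy matrix are assumptions. Note that median polish centers the probe effects at median zero rather than enforcing a zero sum.

```python
from statistics import median


def median_polish(y, max_iter=10, tol=1e-8):
    """Fit y[i][j] ~ overall + chip[i] + probe[j] by alternating median sweeps.

    y[i][j] is the log-scale intensity of probe j on chip i (one probe set).
    Returns (overall, chip_effects, probe_effects, residuals).
    """
    n_chips, n_probes = len(y), len(y[0])
    resid = [[float(v) for v in row] for row in y]
    overall = 0.0
    chip = [0.0] * n_chips
    probe = [0.0] * n_probes
    for _ in range(max_iter):
        # Sweep out row (chip) medians.
        row_med = [median(row) for row in resid]
        for i in range(n_chips):
            resid[i] = [v - row_med[i] for v in resid[i]]
            chip[i] += row_med[i]
        shift = median(probe)
        probe = [p - shift for p in probe]
        overall += shift
        # Sweep out column (probe) medians.
        col_med = [median([resid[i][j] for i in range(n_chips)])
                   for j in range(n_probes)]
        for i in range(n_chips):
            resid[i] = [v - col_med[j] for j, v in enumerate(resid[i])]
        probe = [p + c for p, c in zip(probe, col_med)]
        shift = median(chip)
        chip = [c - shift for c in chip]
        overall += shift
        if all(abs(m) < tol for m in row_med + col_med):
            break
    return overall, chip, probe, resid


# Toy probe set: 3 chips, 5 probes, exactly additive (made-up values).
mu_true = [7.0, 8.0, 9.5]
alpha_true = [-1.0, 0.0, 1.0, 0.5, -0.5]
toy = [[m + a for a in alpha_true] for m in mu_true]
overall, chip_eff, probe_eff, residuals = median_polish(toy)
expression = [overall + c for c in chip_eff]
```

The per-chip expression estimate is the overall level plus the chip effect; on the additive toy matrix it reproduces the values used to build it.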
17
does it matter at all?
[Figure: all spots; panels: MAS 5.0, Li/Wong (PM only), Li/Wong (PM - MM), Av Diff (PM only), Av Diff (PM - MM), bg MAS Av Diff (PM only), RMA]
18
does it matter at all?
Reference distribution is normal for the log fold change (from Terry Speed, "Summarizing and comparing GeneChip® data").
19
definitions
For the rest of the talk:
  (1) take background (RMA like)
  (2) only use PM (we don't know a better solution)
  (3) summarize using the RMA model

Bioconductor:

    library(affy)
    # Read all .cel files from the spike-in directory
    x <- ReadAffy(celfile.path = "/project/gene_expression/spikein/")
    # RMA-like background, PM only, median polish summary, no normalization
    data.rma <- express(x,
                        subset = NULL,
                        bg.correct = bg.correct.rma,
                        pmcorrect.method = "pmonly",
                        summary.stat = medianpolish,
                        normalize = FALSE,
                        verbose = TRUE)
20
Normalization
Problem: Normalization ---> Summary Statistic, or Summary Statistic ---> Normalization?
[Figure: panels "first normalization" and "first summary"]
21
Problem Normalization
  • User Defined Sets: Housekeeping (?!), Controls, etc.; useful for "Most Genes Changed" settings
  • Entire Dataset: useful for "Most Genes Unchanged" settings
22
Problem Normalization
Local Regression: determine regression lines locally
23
Problem Normalization
24
Problem Normalization
Goal: Detection of Differentially Expressed Genes
Methods compared: Var Stab, ANOVA, Lin Regr, Least Med, Local Regr, Mean, Median, Shorth, Zscore, Raw

$$d(i,j) = 1 - \frac{6}{N(N^2 - 1)} \sum_{k,l = 1}^{N} d(i,j)_{k,l}$$

with genes ordered by abs(logratio) and

$$d(i,j)_{k,l} = \begin{cases} \operatorname{rank}(gene_k) - \operatorname{rank}(gene_l) & \text{if exists} \\ N + 1 & \text{else} \end{cases}$$
25
Problem Normalization
Biological Evaluation:
  (a) Northern Blotting
  (b) quant. RT PCR
  (c) SAGE library
  (d) quantifiable controls
26
Dataset
Spike-in dataset: Design
[Figure: spike-in design; some entries marked "flagged"]
27
Dataset
[Figure: spike-ins by experiment (Exp) and concentration (conc) for expset 1, expset 2, expset 3]
28
Comparison of Normalization Strategies
[Figure: log-Int vs. log-conc; panels: quantile-norm., q-spline-norm., loess-norm., vsn-norm.]
29
Comparison of Normalization Strategies
[Figure: log-Int Chip2 vs. log-Int Chip1]
30
Comparison of Normalization Strategies
31
Comparison of Normalization Strategies
Biological Evaluation:
  (a) Northern Blotting
  (b) quant. RT PCR
  (c) SAGE library
  (d) quantifiable controls
32
Comparison of Normalization Strategies
(1) Raw data: 0.845
(2) Median: 0.845
(3) ZScore: 0.859
(4) Overall (linear) Regression: 0.851
(5) Local Regression: 0.851
(6) Variance Stabilization: 0.853
(7) ANOVA: 0.854
33
Spoil the data
  • Plate specific effect
  • Random effect
  • Labeling effect
  • Scanner effect
34
Comparison of Normalization Strategies
35
Summary
  • Choice of Technology: crucial for design of experiment
  • Low Level Analysis Problems
    • Background: depending on the model, different results!
    • PM/MM: forget about MM, because mathematical subtraction does not translate into biological meaning
    • Summary Statistic: decide whether to normalize first or summarize first!
  • Comparison of different Low Level Analysis Procedures
    • MAS performs worst
    • Li/Wong performs well
    • RMA performs well
    • for good data it seems not to matter
  • Comparison of different Normalization Strategies
    • variance stabilization always seems to work quite well