Pei Wang1, Young Kim2 , Jonathan Pollack2, Hua Tang1 - PowerPoint PPT Presentation

Effects of DNA Copy Number Alteration on RNA Expression Level
Pei Wang1, Young Kim2, Jonathan Pollack2, Hua Tang1
1 Department of Biostatistics, Division of Public Health Science, Fred Hutchinson Cancer Research Center
2 Department of Pathology, Stanford University
Part B. How does DNA copy number alteration at
one gene affect the RNA expression level of other
genes?
Result
2. The observed distribution of mk and the null
distribution of mk are illustrated in Figure 3.
Abstract
DNA amplifications and deletions frequently
contribute to the development and progression of
cancers. It is of great interest to understand
how significantly the DNA copy number alterations
may affect RNA expression levels. To answer this
question, we study paired aCGH and RNA expression
profiles from 23 lung cancer cell lines. We
estimate the proportion of genes/clones in which
the DNA copy number affects RNA expression
levels. In addition, we investigate how the DNA
copy number alteration at one gene affects the
RNA expression level of other genes.
  • The numbers of genes having amplifications or
    deletions are summarized in the following table.

Null Distribution
Method
  • Define Set A to be the 17764 genes having DNA
    copy number alterations (Left Table).
  • Identify genes whose expression changes noticeably,
    using the following criterion: in at least 20 samples,
    the expression fluorescence ratio is greater than 2 or
    smaller than 1/2. This selects 1994 (out of 23913)
    genes/clones (Set B) based on the 23 expression profiles.
  • 3. Compare the expression of genes in Set B with
    the DNA copy numbers of genes in Set A
  • (1) For each pair of gene i ∈ A and gene j ∈ B,
    calculate the Pearson correlation coefficient
    between the DNA copy numbers of gene i and the RNA
    expression of gene j. The result is a
    correlation matrix CorM (17764 by 1994), with
    each row corresponding to the DNA copy numbers of
    a gene in Set A, and each column corresponding to
    the RNA expression of a gene in Set B. This
    process is illustrated in Figure 2.
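
The selection of Set B and the construction of CorM described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names (`select_set_b`, `pearson`, `correlation_matrix`) and the list-of-rows data layout (one row per gene, one column per sample) are hypothetical and not taken from the poster.

```python
import math

def select_set_b(expr_ratios, min_samples=20, fold=2.0):
    """Indices of genes whose fluorescence ratio is > fold or < 1/fold
    in at least min_samples samples (the Set B criterion)."""
    selected = []
    for idx, row in enumerate(expr_ratios):
        n_changed = sum(1 for r in row if r > fold or r < 1.0 / fold)
        if n_changed >= min_samples:
            selected.append(idx)
    return selected

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(copy_number, expression):
    """Rows: copy-number profiles (Set A); columns: expression profiles (Set B)."""
    return [[pearson(cn, ex) for ex in expression] for cn in copy_number]

# Toy data: 2 copy-number profiles and 3 expression profiles over 4 samples.
toy_copy_number = [[1, 2, 3, 4], [4, 3, 2, 1]]
toy_expression = [[2, 4, 6, 8], [1, 0, 1, 0], [8, 6, 4, 2]]
toy_cor_m = correlation_matrix(toy_copy_number, toy_expression)
```

With the real data, `copy_number` would hold 17764 rows and `expression` 1994 rows, each of length 23, so the result has the 17764 by 1994 shape of CorM.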

Empirical Distribution of mk
2. The 2291901311304 p-values of Wilcoxon rank-sum
tests for (a) H0: the expression levels of
this gene in the amplified samples are the same
as the expression levels in the no-alteration
samples, and the 229164608751 p-values of
Wilcoxon rank-sum tests for (b) H0: the
expression levels of this gene in the k2 deleted
samples are the same as the expression levels in
the k0 no-alteration samples, are illustrated
in Figure 1.
Figure 3. The histogram is the empirical
distribution of mk; the blue curve is the
density function of Binom(1994, 0.1).
Introduction
  • DNA copy number alterations are key genetic
    events in the development and progression of
    human cancers.
  • Array-based CGH (aCGH) is a new technique to map
    DNA copy number alterations genome-wide at
    sub-megabase resolutions.
  • aCGH and expression profiling can be performed
    on the same microarray platform, which provides
    paired measurements of DNA copy number and RNA
    expression level of each gene/clone in one tumor
    sample.
  • By studying the paired aCGH and expression
    profiles, we can find out how the DNA copy number
    alterations affect the RNA expression levels.
  • The results shed light on how random genome
    events (due to genome instability) ultimately
    lead to tumor development.

3. Denote the ith gene as gi. We can then calculate
pFDR(k); its value for different k is shown in
Figure 4.
Figure 2 panels: CGH arrays, 17764 genes (Set A) by 23 samples, rows labeled Gene 1, Gene 2, ..., Gene i; Expression arrays, 1994 selected genes (Set B) by 23 samples, columns labeled Gene s1, Gene s2, ..., Gene sj.
Materials
pFDR(226) = 0.1
Correlation arrays (CorM)
  • DNA and RNA materials were collected from 23
    SCLC cell lines (tissue culture repository).
  • cDNA microarrays were obtained from the Stanford
    Functional Genomics Facility; they represent
    more than 20000 mapped human genes and ESTs.
  • aCGH and expression profiling were performed by
    Young Kim in Jonathan Pollack's lab according to
    published protocols (Pollack et al. 2002).

Cor(i, j) = Cor(Gene i, Gene sj)
Figure 4.
Figure 1.
It requires $m_i > 226$ to achieve a pFDR (positive
False Discovery Rate) smaller than 0.1.
4. Among the 17764 genes/clones, 5486 genes/clones are
significantly correlated with at least 226 other
genes. Therefore, a conservative approximation
for the number of genes/clones coming from the
alternative hypothesis would be 5486 × 90% =
4937. In other words
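3. Denote the p-values as $p_i$. We can
assume the distribution of each $p_i$ is from a
mixture density
$$f(p) = \pi_0 f_0(p) + (1 - \pi_0) f_1(p),$$
where
$\pi_0$ = Prob (H0 is true),
$f_0$ = density of $p_i$ under H0,
$f_1$ = density of $p_i$ under H1.
Under H0, the p-values should be uniformly
distributed as U(0,1). Then, it would be
reasonable to assume that those large p-values
(e.g. from 0.5 to 1) are mainly contributed by
$f_0$. It follows (Storey et al. 2003) that
$$\hat{\pi}_0(\lambda) = \frac{\#\{p_i > \lambda\}}{m\,(1 - \lambda)},$$
where $m$ is the number of p-values.
With $\lambda$ fixed, we get $1 - \hat{\pi}_0 = 0.334$
for amplification cases and $1 - \hat{\pi}_0 = 0.440$
for deletion cases. In other words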
Part A. How does DNA copy number alteration at
one gene affect the RNA expression level of the
same gene?
Figure 2.
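(2) Then we permute the order of the 23 samples
(columns) of the expression arrays, and
re-calculate the correlation matrix. Denote this
new correlation matrix as PCorM.
(3) For $i = 1, 2, \ldots, 1994$:
(i) Let $C_{\mathrm{low}}$ = the 0.05 quantile of $\mathrm{PCorM}_{\cdot,i}$ and
$C_{\mathrm{up}}$ = the 0.95 quantile of $\mathrm{PCorM}_{\cdot,i}$.
(ii) $\mathrm{CorM}_{k,i}$ ($k = 1, 2, \ldots, 17764$) is called significant if
$\mathrm{CorM}_{k,i} < C_{\mathrm{low}}$ or $\mathrm{CorM}_{k,i} > C_{\mathrm{up}}$.
(4) For each $k = 1, 2, \ldots, 17764$, count
$$m_k = \#\{\, i : \mathrm{CorM}_{k,i} < C_{\mathrm{low}} \ \text{or}\ \mathrm{CorM}_{k,i} > C_{\mathrm{up}} \,\}.$$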
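
Steps (2) to (4) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the helper names, the single shared permutation of sample columns, and the linear-interpolation quantile convention are all choices made here, and the toy data are random numbers rather than real profiles.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cor_matrix(copy_number, expression):
    """Rows: Set A copy-number profiles; columns: Set B expression profiles."""
    return [[pearson(cn, ex) for ex in expression] for cn in copy_number]

def quantile(values, q):
    """Empirical quantile with linear interpolation (one common convention)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def count_significant(copy_number, expression, seed=0):
    """Return m_k for every copy-number profile k."""
    rng = random.Random(seed)
    # Step (2): permute the sample columns of the expression arrays.
    perm = list(range(len(expression[0])))
    rng.shuffle(perm)
    permuted = [[row[p] for p in perm] for row in expression]
    cor_m = cor_matrix(copy_number, expression)
    pcor_m = cor_matrix(copy_number, permuted)
    m = [0] * len(copy_number)
    # Step (3): per-column thresholds from the permuted correlations.
    for i in range(len(expression)):
        null_col = [pcor_m[k][i] for k in range(len(copy_number))]
        c_low = quantile(null_col, 0.05)
        c_up = quantile(null_col, 0.95)
        # Step (4): count significant correlations for each row k.
        for k in range(len(copy_number)):
            if cor_m[k][i] < c_low or cor_m[k][i] > c_up:
                m[k] += 1
    return m

# Toy data: 40 copy-number profiles and 15 expression profiles, 23 samples each.
data_rng = random.Random(1)
toy_cn = [[data_rng.gauss(0, 1) for _ in range(23)] for _ in range(40)]
toy_ex = [[data_rng.gauss(0, 1) for _ in range(23)] for _ in range(15)]
toy_m = count_significant(toy_cn, toy_ex)
```

Each entry of `toy_m` lies between 0 and the number of expression profiles; on real data the same counts would range from 0 to 1994.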
In the lung cancer cell lines we examined, at
least 27.79 percent of genes have DNA copy
number alterations that affect the expression of
some other genes at a statistically significant level.
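
The 27.79 percent figure can be checked from the counts reported on the poster. The reading of the 90 percent factor as one minus the pFDR bound of 0.1 is an interpretation made here, not something the poster states explicitly.

```latex
\[
5486 \times (1 - 0.1) = 4937.4 \approx 4937,
\qquad
\frac{4937}{17764} \approx 0.2779 = 27.79\%.
\]
```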
Method
  • 1. Each CGH array is first processed by the CLAC
    method, such that gain and loss regions are
    identified.
  • CLAC (Cluster Along Chromosome) is a
    statistical technique for calling gains and
    losses for CGH array data (Wang et al. 2004).
  • For each gene/clone, suppose it has amplification
    in k1 samples, deletion in k2 samples and no
    alteration in k0 samples (k1 + k2 + k0 = 23). We
    perform Wilcoxon rank-sum tests for
  • (a) If k0 > 1, k1 > 1
  • H0: The expression levels of this gene in
    the k1 amplified samples are no greater than the
    expression levels in the k0 no-alteration
    samples.
  • (b) If k0 > 1, k2 > 1
  • H0: The expression levels of this gene in
    the k2 deleted samples are no less than the
    expression levels in the k0 no-alteration samples.

Reference
Result
Wang, P., Kim, Y., Pollack, J.R., Narasimhan, B., and Tibshirani, R. A method for calling gains and losses in array CGH data. Biostatistics, accepted for publication, 4/2004.
Pollack, J.R., Sorlie, T., Perou, C.M., Rees, C.A., Jeffrey, S.S., Lonning, R.E., Tibshirani, R., Botstein, D., Borresen-Dale, A.L., and Brown, P.O. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci USA, 99:12963-12968, 2002.
Storey, J.D. and Tibshirani, R. Statistical Significance for Genome-Wide Experiments. Technical Report, Department of Statistics, Stanford University.
  • The null hypothesis for each gene can be stated
    as
  • H0k: the DNA copy number alteration of gene k
    in A does not affect any other gene's expression
    (uncorrelated).
  • Then if H0k is true for gene k, from the
    definition of $C_{\mathrm{low}}$ and $C_{\mathrm{up}}$, we have
  • $\mathrm{Prob}(\mathrm{CorM}_{k,j} < C_{\mathrm{low}} \ \text{or}\ \mathrm{CorM}_{k,j} > C_{\mathrm{up}}) = 0.1$,
  • ($j = 1, 2, \ldots, 1994$). It follows that
  • $m_k \sim \mathrm{Binom}(1994, 0.1)$.
  • Thus, we call Binom(1994, 0.1) the null
    distribution of $m_k$.
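
To get a feel for this null distribution, the sketch below computes its mean, standard deviation, and the upper-tail probability at 226, the threshold quoted elsewhere on the poster. The helper names are hypothetical, and the log-gamma route is simply one numerically safe way to evaluate binomial probabilities with the standard library.

```python
import math

def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def binom_upper_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.exp(binom_log_pmf(k, n, p)) for k in range(k_min, n + 1))

n_genes_b, p_null = 1994, 0.1
null_mean = n_genes_b * p_null                          # expected count under H0
null_sd = math.sqrt(n_genes_b * p_null * (1 - p_null))  # spread under H0
tail_226 = binom_upper_tail(226, n_genes_b, p_null)     # P(m_k >= 226) under H0

print(f"mean = {null_mean:.1f}, sd = {null_sd:.2f}, P(m_k >= 226) = {tail_226:.4f}")
```

Under the null, a gene is expected to show about 199 significant correlations by chance alone, so counts well above that are what separate the empirical distribution from the binomial curve.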

(a) When amplifications are observed, with a
probability of 33.4%, the RNA expression level of
the corresponding gene will increase (by a
statistically significant amount). (b) When
deletions are observed, with a probability of
44.0%, the RNA expression level of the
corresponding gene will decrease (by a
statistically significant amount).
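
Proportions like these come from estimating the share of true null hypotheses among the p-values and taking its complement. The sketch below shows the commonly used estimator from Storey and Tibshirani, which counts p-values above a threshold. The function name, the threshold of 0.5, and the toy p-values are assumptions for illustration and are not the poster's data.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimate the proportion of true nulls as #{p > lam} / (m * (1 - lam)),
    capped at 1. Assumes null p-values are uniform on (0, 1)."""
    m = len(p_values)
    n_large = sum(1 for p in p_values if p > lam)
    return min(1.0, n_large / (m * (1.0 - lam)))

# Toy example: 100 evenly spread "null" p-values plus 50 very small ones.
toy_null = [(i + 0.5) / 100 for i in range(100)]
toy_alt = [0.001] * 50
toy_pi0 = estimate_pi0(toy_null + toy_alt)
toy_affected = 1.0 - toy_pi0  # estimated share of genes from the alternative

print(f"pi0 = {toy_pi0:.3f}, affected = {toy_affected:.3f}")
```

In the toy data one third of the p-values come from the alternative, and the estimator recovers that share because the small p-values never exceed the threshold.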