DNA Copy Number Analysis - PowerPoint PPT Presentation

Description:

Department of Genetics & Center for Genome Sciences. Washington ... Base Line Array (linear); Quantile Normalization; Contrast Normalization; etc. S Mean of S ...


1
DNA Copy Number Analysis
  • Qunyuan Zhang, Ph.D.
  • Division of Statistical Genomics
  • Department of Genetics & Center for Genome
    Sciences
  • Washington University School of Medicine
  • 03-25-2008
  • GEMS Course M 21-621: Computational Statistical
    Genetics

2
Four Questions
  • What is Copy Number?
  • What can Copy Number tell us?
  • How to measure/quantify Copy Number?
  • How to analyze Copy Number?

3
What is Copy Number?
  • Gene Copy Number
  • The gene copy number (also "copy number
    variants" or CNVs) is the amount of copies of a
    particular gene in the genotype of an individual.
    Recent evidence shows that the gene copy number
    can be elevated in cancer cells. For instance,
    the EGFR copy number can be higher than normal in
    Non-small cell lung cancer. Elevating the gene
    copy number of a particular gene can increase the
    expression of the protein that it encodes.
  • From Wikipedia (www.wikipedia.org)

4
  • DNA Copy Number
  • A Copy Number Variant (CNV) represents a copy
    number change involving a DNA fragment that is 1
    kilobase or larger.
  • From Nature Reviews Genetics, Feuk et al. 2006
  • DNA Copy Number ≠ DNA Tandem Repeat Number
    (e.g. microsatellites, < 10 bases)
  • DNA Copy Number ≠ RNA Copy Number
  • RNA Copy Number = Gene Expression Level
  • DNA → (transcription) → mRNA
  • Copy number is the number of copies of a
    particular fragment of a nucleic acid chain.
    In most publications it refers to DNA copy
    number.

5
What can Copy Number tell us?
  • Genetic Diversity/Polymorphisms
  • - restriction fragment length polymorphism (RFLP)
  • - amplified fragment length polymorphism (AFLP)
  • - random amplification of polymorphic DNA (RAPD)
  • - variable number of tandem repeats (VNTR, e.g.,
    mini- and microsatellites)
  • - single nucleotide polymorphism (SNP)
  • - presence/absence of transposable elements
  • - structural alterations (e.g., deletions,
    duplications, inversions)
  • - DNA copy number variant (CNV)
  • Association with phenotypes/diseases
    genes/genetic factors

6
Genetic Alterations in Tumor Cells (DNA Copy Number Changes)
7
How to measure/quantify Copy Number?
8
SNP Array: From Image to Copy Number
  • Tumor: red intensity
  • Normal: green intensity
  • More DNA copy number → more DNA hybridization →
    higher intensity
  • Red < Green: Deletion (CN < 2)
  • Red > Green: Amplification (CN > 2)
  • Red = Green: No Alteration (CN = 2)
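The red/green rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the log2-ratio threshold of 0.3, and the assumption that intensity scales linearly with copy number against a two-copy normal are hypothetical choices, not taken from the slides.

```python
import math

def raw_copy_number(tumor_intensity, normal_intensity):
    """Raw CN estimate, assuming intensity is proportional to copy
    number and the normal sample carries two copies (assumption)."""
    return 2.0 * tumor_intensity / normal_intensity

def classify_probe(tumor_intensity, normal_intensity, threshold=0.3):
    """Label one probe from its red (tumor) and green (normal) signal.

    threshold is a hypothetical tolerance on the log2 ratio, so that
    small technical differences are not called as alterations.
    """
    log_ratio = math.log2(tumor_intensity / normal_intensity)
    if log_ratio > threshold:
        return "amplification"   # red > green, CN > 2
    if log_ratio < -threshold:
        return "deletion"        # red < green, CN < 2
    return "no alteration"       # red = green, CN = 2
```

In practice a call is rarely made from a single probe; the smoothing and significance-testing steps later in the deck exist for that reason.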
9
Array CGH: From Image to Copy Number
10
How to Analyze Copy Number?
11
  • General Procedures for Copy Number Analysis

12
Background Adjustment/Correction
  • Reduces unevenness of a single chip
  • Makes intensities of different positions on a
    chip comparable
[Figure: chip image before adjustment and after adjustment]
Corrected Intensity (S') = Observed Intensity (S) - Background Intensity (B)
For each region i, B(i) = mean of the lowest 2% of intensities in region i
(Affymetrix MAS 5.0)
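A minimal sketch of the region-wise subtraction described above, assuming the chip has already been split into regions (each a list of intensities). The function name and the `fraction` parameter are illustrative; the full MAS 5.0 procedure also smooths the background between regions and guards against negative values, which this sketch leaves out.

```python
def background_correct(regions, fraction=0.02):
    """Subtract a per-region background from every intensity.

    regions  : list of lists, one list of observed intensities per region
    fraction : share of the lowest intensities averaged to estimate the
               background B(i) of region i
    """
    corrected = []
    for intensities in regions:
        ordered = sorted(intensities)
        # Use at least one value so that small regions still work.
        k = max(1, int(len(ordered) * fraction))
        background = sum(ordered[:k]) / k
        corrected.append([s - background for s in intensities])
    return corrected
```

For a region holding the values 1 to 100, the two lowest values give B = 1.5, so the corrected intensities run from -0.5 to 98.5.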
13
(No Transcript)
14
Normalization
  • Reduces technical variation between chips
  • Makes intensities from different chips
    comparable
[Figure: intensity distributions before normalization and after normalization]
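Quantile normalization is one of the methods named in the presentation description. The sketch below shows the basic idea under simplifying assumptions: all chips have the same number of probes, and ties are broken by position rather than averaged. The function name is illustrative.

```python
def quantile_normalize(chips):
    """Give every chip the same intensity distribution.

    chips : list of equal-length lists, one list of intensities per chip
    """
    n_probes = len(chips[0])
    # Reference distribution: mean of the k-th smallest value across chips.
    sorted_chips = [sorted(chip) for chip in chips]
    reference = [sum(col) / len(chips) for col in zip(*sorted_chips)]

    normalized = []
    for chip in chips:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized
```

After this step each chip contains exactly the same set of values, so differences between chips reflect probe rank rather than overall brightness.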
15
(No Transcript)
16
Raw Copy Number Data
17
Individual Level Analysis
  • Analysis for each individual sample (or each
    sample pair)
  • Smoothing
  • Significance test of CN amplification and
    deletion
  • Boundary finding (smoothing and segmentation)
  • CN estimation

18
Smoothing via Sliding Window
19
Smoothing (sliding window = 30 SNPs)
[Figure: smoothed CN along Chrom. 7 for Affymetrix and Illumina arrays; x-axis: position (Mbp), y-axis: CN]
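Sliding-window smoothing can be sketched as a moving average over consecutive SNPs. The version below is an assumption about the details: it uses a plain mean and returns one value per complete window, so the output is shorter than the input by `window - 1`.

```python
def sliding_window_means(values, window=30):
    """Mean of each run of `window` consecutive raw CN values."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    means = [running / window]
    for i in range(window, len(values)):
        # Slide by one SNP: add the new value, drop the oldest one.
        running += values[i] - values[i - window]
        means.append(running / window)
    return means
```

A wider window suppresses more probe-level noise but blurs the boundaries of short amplified or deleted regions.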
20
Significance Test of CN Changes: An Example
21
Sliding Window Smoothing
22
Normalization
23
P-value calculation
24
Calculate FDR for each window
25
Select windows (FDR < 0.05)
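The steps on slides 21 to 25 can be strung together as below. The slides do not say which normalization, test, or FDR procedure was used, so everything specific here is an assumption: window means are standardized with the median and the scaled median absolute deviation, p-values are two-sided under a standard normal, and the FDR is computed with the Benjamini-Hochberg procedure.

```python
import math
import statistics

def two_sided_normal_p(z):
    """Two-sided p-value of a z-score under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_windows(window_means, fdr=0.05):
    """Indices of windows whose CN change passes the FDR cutoff."""
    center = statistics.median(window_means)
    mad = statistics.median(abs(x - center) for x in window_means)
    if mad == 0:
        raise ValueError("window means have zero spread")
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    p_values = [two_sided_normal_p((x - center) / scale) for x in window_means]
    q_values = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(q_values) if q < fdr]
```

The robust center and scale are chosen so that a few strongly altered windows do not inflate the spread estimate and hide themselves.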
26
Another Example: Intensities and Raw CNs, Chr. 1 (Pair101)
Black: Normal, Red: Tumor, Green: Tumor - Normal
27
Significance Test for Copy Number Changes
-log(p) values, TSP data, chr. 1, pair101
28
Segmentation (break chrom. into CN-homogeneous
pieces)
BioConductor R Packages (www.bioconductor.org)
  • GLAD package, adaptive weights smoothing (AWS)
    method
  • DNAcopy package, circular binary segmentation
    method
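To make the idea of segmentation concrete, here is a deliberately simplified stand-in: plain recursive binary segmentation that splits wherever the split most reduces the within-segment sum of squares. It is not the circular binary segmentation implemented in DNAcopy (which considers segments bounded by two change points and uses permutation tests), nor the AWS method in GLAD. The `min_gain` stopping threshold is a hypothetical parameter.

```python
def best_split(values):
    """Split index that most reduces the sum of squared errors."""
    n = len(values)
    total = sum(values)
    best_gain, best_k = 0.0, None
    left = 0.0
    for k in range(1, n):
        left += values[k - 1]
        right = total - left
        # Reduction in squared error when values[:k] and values[k:]
        # each get their own mean instead of one shared mean.
        gain = left * left / k + right * right / (n - k) - total * total / n
        if gain > best_gain:
            best_gain, best_k = gain, k
    return best_k, best_gain

def segment(values, min_gain=1.0, start=0):
    """Return (start, end, mean) pieces with roughly constant raw CN.

    values must be non-empty; end is exclusive.
    """
    k, gain = best_split(values)
    if k is None or gain < min_gain:
        return [(start, start + len(values), sum(values) / len(values))]
    return (segment(values[:k], min_gain, start)
            + segment(values[k:], min_gain, start + k))
```

For real data the Bioconductor packages listed above are the appropriate tools; the sketch only shows what "breaking a chromosome into pieces" means computationally.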
29
CN Estimation: Hidden Markov Model (HMM)
CNAT (www.affymetrix.com), dChip (www.dchip.org),
CNAG (www.genome.umin.jp)
[Figure: HMM diagram along position, with hidden status (unknown CN) and observed status (raw CN, log ratio of intensities)]
CN estimation: finding a sequence of CN values
which maximizes the likelihood of the observed raw
CN.
Algorithm: Viterbi algorithm (can be iterative)
The information/assumptions below are needed:
  • Background probabilities: overall probabilities
    of possible CN values.
    P(CN = x), x = 0, 1, 2, 3, 4, ..., n (usually n < 10)
  • Transition probabilities: probabilities of the CN
    value of each SNP conditional on the previous one.
    P(CN_{i+1} = x_i | CN_i = x_j), x = 0, 1, 2, 3, 4, ..., or n
  • Emission probabilities: probabilities of the
    observed raw CN value of each SNP conditional on
    the hidden/unknown/true CN status.
    P(log ratio < x | CN = y) = f(x | CN = y),
    x = any real number, y = 0, 1, 2, 3, 4, ..., or n
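The three ingredients listed above (background, transition, and emission probabilities) are all that the Viterbi algorithm needs. The sketch below is a toy version with assumed parameters, not the model used by CNAT, dChip, or CNAG: states are CN 1 to 4, the background is uniform unless given, the chain stays in its state with probability `stay`, and the log2 ratio is Gaussian around log2(CN/2) with standard deviation `sd`.

```python
import math

def viterbi_cn(log_ratios, states=(1, 2, 3, 4), stay=0.99, sd=0.25, prior=None):
    """Most likely CN sequence for a non-empty list of log2 ratios."""
    n_states = len(states)
    if prior is None:
        prior = [1.0 / n_states] * n_states          # background probabilities
    means = [math.log2(cn / 2.0) for cn in states]   # expected log2 ratio per CN
    move = (1.0 - stay) / (n_states - 1)

    def log_emit(x, k):
        z = (x - means[k]) / sd
        return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))

    def log_trans(j, k):
        return math.log(stay if j == k else move)

    # score[k]: best log-likelihood of any path ending in state k.
    score = [math.log(prior[k]) + log_emit(log_ratios[0], k)
             for k in range(n_states)]
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = [], []
        for k in range(n_states):
            best_j = max(range(n_states),
                         key=lambda j: score[j] + log_trans(j, k))
            pointers.append(best_j)
            new_score.append(score[best_j] + log_trans(best_j, k) + log_emit(x, k))
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    k = max(range(n_states), key=lambda j: score[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [states[k] for k in path]
```

A high `stay` probability is what makes the estimate piecewise constant: a single noisy SNP is not enough to pay for two state changes, while a sustained shift is.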
30
HMM Estimation of CN for Chr. 1 (Pair101)
Black: Normal Intensities, Red: Tumor Intensities,
Green: Tumor - Normal, Blue: HMM estimated CNs in
Tumor Tissue
31
Population Level Analysis
  • Analysis for the whole group (or sub-group) of
    samples
  • Overall significance test
  • Amplification and deletion frequencies
    summarization
  • Common/concurrent region finding
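The frequency summarization step can be sketched as follows, assuming each sample already has an estimated CN at every position (for example from an HMM) and that the normal state is two copies. The function name and the input layout (one row per sample) are illustrative.

```python
def alteration_frequencies(cn_matrix, normal_cn=2):
    """Per-position amplification and deletion frequencies.

    cn_matrix : list of rows, one row of CN estimates per sample,
                all rows covering the same positions
    Returns (amplification_freq, deletion_freq), one value per position.
    """
    n_samples = len(cn_matrix)
    amplification, deletion = [], []
    for column in zip(*cn_matrix):  # one column per genomic position
        amplification.append(sum(cn > normal_cn for cn in column) / n_samples)
        deletion.append(sum(cn < normal_cn for cn in column) / n_samples)
    return amplification, deletion
```

Positions where one of the two frequencies is high across many samples are natural candidates for the common/concurrent regions mentioned above.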

32
Raw CN Changes of Chr. 14 (average over 400 pairs)
33
Genome-wide Raw Copy Number Changes (sliding
window plot, averaged over 400 pairs)
34
Sliding Window Test of Significance of CN
Changes: -log(p) values, based on 400 pairs
35
Visualization of Concurrent Regions of Chr. 14 (400 pairs)
[Figure axes: samples, positions]
36
Software
  • Affymetrix Chips (www.affymetrix.com)
  • Illumina Chips (www.illumina.com)
  • CNAT (www.affymetrix.com)
  • dChip (www.dchip.org)
  • CNAG (www.genome.umin.jp)
  • GenePattern (www.broad.mit.edu/cancer/software/genepattern/)
  • BioConductor R Packages (www.bioconductor.org)
  • GLAD package, adaptive weights smoothing (AWS)
    method
  • DNAcopy package, circular binary segmentation
    method