Analysis of High-throughput Gene Expression Profiling

Slides: 36
Provided by: mcsy103

1
Analysis of High-throughput Gene Expression
Profiling
2
Why Measure Gene Expression?
  • 1. Determines which genes are induced/repressed in response to a
    developmental phase or to an environmental change.
  • 2. Sets of genes whose expression rises and falls under the same
    condition are likely to have a related function.
  • 3. Features such as a common regulatory motif can be detected
    within co-expressed genes.
  • 4. A pattern of gene expression may be used as an indicator of
    abnormal cellular regulation.
    • A useful tool for cancer diagnosis

3
Traditional vs. High-throughput Approaches
Why Measure Gene Expression on a Large Scale?
4
Techniques Used to Detect Gene Expression Level
  • Microarray (single or dual channel)
  • SAGE
  • EST/cDNA library
  • Northern Blots
  • Subtractive hybridisation
  • Differential hybridisation
  • Representational difference analysis (RDA)
  • DNA/RNA Fingerprinting (RAP-PCR)
  • Differential Display (DD-PCR)
  • aCGH: array CGH (DNA level)

High-throughput
5
Basic Information on Microarray, SAGE and cDNA
Libraries
6
(DNA) Microarray
  • 1. Developed around 1987.
  • 2. Employs methods previously exploited in the immunoassay
    context: specific binding and marking techniques.
  • 3. Two types of probes:
  • Format I: probe cDNA (500-5,000 bases long) is immobilized to a
    solid surface such as glass. Widely considered to have been
    developed at Stanford University. Traditionally called DNA
    microarrays.
  • Format II: an array of oligonucleotide probes (20-80-mer oligos)
    is synthesized either in situ (on-chip) or by conventional
    synthesis followed by on-chip immobilization. Developed at
    Affymetrix, Inc. Many companies are manufacturing
    oligonucleotide-based chips using alternative in situ synthesis
    or deposition technologies. Historically called DNA chips.

7
Microarray
  • Single channel: sub-type classification
  • Dual channel: differential expression gene screening
  • Tissue microarray
  • Protein microarray

8
Array CGH
  • Detects DNA copy number variation via a microarray approach
  • A hotspot in recent research, especially in cancer research

9
Microarray Analysis
Which genes are up-regulated, down-regulated,
co-regulated, not-regulated?
  • gene discovery
  • pattern discovery
  • inferences about biological processes
  • classification of biological processes

10
SAGE
  • Experimental technique designed to give a quantitative measure
    of gene expression.
  • Tags of 10-20 bases are produced (immediately adjacent to the
    3' end of the 3'-most NlaIII restriction site).
  • The SAGE technique does not measure the expression level of a
    gene directly; it quantifies a "tag" that represents the
    transcription product of a gene.

11
SAGE
Tags are isolated and concatemerized. Relative
expression levels can be compared between cells
in different states.
12
SAGEmap (http://cgap.nci.nih.gov)
13
SAGE: comparing two relational libraries
14
EST library (UniGene)
15
Gene expression info from the UniGene library
16
An Example of In-house EST Library Analysis
17
The Algorithms and Challenges of High-throughput
Gene Expression Analysis
18
Seeing is believing?
No, need to correct errors.
19
SAGE
  • A typical experiment requires 30,000 gene expression
    comparisons between a normal and a diseased cell.
  • The results depend on the size and reliability of the SAGE
    libraries.
  • Statistical measures are used to filter candidate genes and
    reduce the dimensionality of the data, but it is tedious and
    time-consuming to tune these measures until a good set is
    found.

20
SAGE
  • TPM (tags per million): a simple normalization method
  • TPM = Count × 1,000,000 / TotalCount
  • Bayesian approach:
    http://cancerres.aacrjournals.org/cgi/content/full/59/21/5403
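
The TPM scaling above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function name `tags_per_million` and the tag counts are hypothetical, and each library is assumed to be a simple mapping from tag to raw count.

```python
def tags_per_million(tag_counts):
    """Scale raw SAGE tag counts to tags per million (TPM)."""
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    # TPM = Count * 1,000,000 / TotalCount
    return {tag: count * 1_000_000 / total for tag, count in tag_counts.items()}

# Hypothetical libraries of different sizes (100 vs. 1,000 tags in total)
normal = {"TAG_A": 30, "TAG_B": 10, "TAG_C": 60}
tumor = {"TAG_A": 300, "TAG_B": 500, "TAG_C": 200}

normal_tpm = tags_per_million(normal)
tumor_tpm = tags_per_million(tumor)
```

After scaling, TAG_A has the same TPM in both libraries even though its raw count differs tenfold, which is the point of normalizing by library size.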

21
Microarray: Sources of Errors
  • systematic
  • random

Figure axes: log signal intensity vs. log RNA abundance
22
Sources of Errors (Cont.)
  • Printing and/or tip problems
  • Labeling and dye effects (differing amounts of
    RNA labeled between the 2 channels)
  • Differences in the power of the two lasers (or
    other scanner problems)
  • Difference in DNA concentration on arrays (plate
    effects)
  • Spatial biases in ratios across the surface of
    the microarray due to uneven hybridization
  • cDNA arrays cannot distinguish alternatively
    spliced forms

23
Errors that cannot be corrected by statistics
  • Competitive hybridization of different targets on
    the chip
  • Failure to distinguish different splicing forms
  • Misinterpretation of time course data when there
    are not sufficient points
  • Misinterpretation of relative intensity

24
Does clustered time course really mean
co-expression?
Picture taken from
http://genomics.stanford.edu/yeast/additional_figures_link.html
Yes, you can study a known system (such as the cell
cycle) this way, but what about unknown systems?
25
Normalization by iterative linear regression
  • fit a line (y = mx + b) to the data set
  • set aside outliers (residuals > 2 × s.e.)
  • D Finkelstein et al.
  • http://www.camda.duke.edu/CAMDA00/abstracts.asp
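
The fit-and-trim loop described above might look like the following sketch. It is an illustration under stated assumptions, not the published procedure of Finkelstein et al.: "s.e." is taken to mean the residual standard error of the current fit, the loop stops when the set of retained points no longer changes, and the function name, the simulated data, and the final rescaling step are all hypothetical.

```python
import numpy as np

def iterative_linear_fit(x, y, threshold=2.0, max_iter=20):
    """Fit y = m*x + b repeatedly, setting aside points whose residual
    exceeds threshold * (residual standard error of the current fit)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.ones(x.size, dtype=bool)
    for _ in range(max_iter):
        m, b = np.polyfit(x[keep], y[keep], 1)
        residuals = y - (m * x + b)
        # Assumption: s.e. = standard error of residuals among kept points
        s = residuals[keep].std(ddof=2)
        new_keep = np.abs(residuals) <= threshold * s
        if new_keep.sum() < 3 or np.array_equal(new_keep, keep):
            break
        keep = new_keep
    # Final fit on the points that were retained
    m, b = np.polyfit(x[keep], y[keep], 1)
    return m, b, keep

# Simulated log intensities for two channels, with three spiked outliers
rng = np.random.default_rng(0)
x = np.linspace(6.0, 14.0, 50)
y = 1.2 * x + 0.5 + rng.normal(0.0, 0.05, size=50)
y[[5, 20, 40]] += 3.0

m, b, keep = iterative_linear_fit(x, y)
y_normalized = (y - b) / m  # map channel y onto the scale of channel x
```

Setting outliers aside before the final fit keeps a handful of strongly changed genes from pulling the normalization line toward themselves.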

26
Normalization (Curvilinear)
G Tseng et al., NAR 2001
27
After Normalization
  • Differentially expressed (DE) gene screening
    • T-test
    • T-statistics
    • SVM
  • Clustering
    • Hierarchical
    • SOM
    • K-means
  • Network (pathway) analysis
    • BioCarta, KEGG, GO databases
    • Bayesian network learning
    • Topology
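
As one concrete instance of the t-statistic screening listed above, here is a small sketch that scores each gene with a Welch (unequal-variance) t-statistic. The slides do not say which t-test variant was meant, so the Welch form is an assumption, and the function name and the expression values are hypothetical.

```python
import numpy as np

def welch_t(group_a, group_b):
    """Per-gene Welch t-statistic; rows are genes, columns are samples."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    mean_diff = a.mean(axis=1) - b.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return mean_diff / se

# Hypothetical normalized log expression: 2 genes x 3 samples per group
normal = [[1.0, 1.1, 0.9],
          [2.0, 2.1, 1.9]]
tumor = [[1.0, 0.9, 1.1],
         [3.0, 3.1, 2.9]]

t = welch_t(normal, tumor)
ranking = np.argsort(-np.abs(t))  # genes ordered by strongest evidence first
```

Here the second gene, whose mean shifts by one log unit between groups, ranks ahead of the first gene, whose means are identical.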

28
Bioinformatics challenges
  • 1. data management
  • 2. utilizing data from multiple experiments
  • 3. utilizing data from multiple groups
    • with different technologies
    • with only processed data available

29
Integrated Bioinformatics Analysis of
Gene Expression Profiling
30
  • Large-scale meta-analysis of cancer microarray
    data identifies common transcriptional profiles
    of neoplastic transformation and progression
  • Daniel R. Rhodes et al., PNAS 2004, 101: 9309-9314
  • T-test
  • Q values (estimated false discovery rates) were
    calculated as Q = (P × n) / i,
  • where P is the P value, n is the total number of
    genes, and i is the sorted rank of the P value.
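
A rank-based Q value of this kind can be sketched as follows, assuming the form Q = P × n / i that the symbol definitions above imply. Capping Q at 1 is an added assumption, no monotonicity adjustment is applied, and the function name and the P values are hypothetical.

```python
import numpy as np

def q_values(p_values):
    """Rank-based FDR estimate: Q = P * n / i, with i the ascending rank of P."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(1, n + 1)  # the smallest P value gets rank 1
    # Assumption: cap at 1 so the result can be read as a rate
    return np.minimum(p * n / ranks, 1.0)

# Hypothetical P values for four genes
p = [0.01, 0.04, 0.03, 0.20]
q = q_values(p)
```

With n = 4, the smallest P value (0.01, rank 1) gives Q = 0.04, while the largest (0.20, rank 4) keeps Q = 0.20.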

31
Cont. Meta-Profiling.
  • The purpose of meta-profiling is to address the
    hypothesis that a selected set of differential
    expression signatures shares a significant
    intersection of genes (a meta-signature), thus
    inferring a biological relatedness.
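
The intersection step behind a meta-signature can be illustrated with a simple count of how many signatures each gene appears in. This is only a sketch: the function name, the `min_count` cutoff, and the gene sets are hypothetical, and testing whether the overlap is larger than expected by chance is a separate step that is not shown.

```python
from collections import Counter

def meta_signature(signatures, min_count):
    """Return genes that appear in at least min_count signatures."""
    counts = Counter(gene for sig in signatures for gene in set(sig))
    return {gene for gene, c in counts.items() if c >= min_count}

# Hypothetical differential expression signatures from three studies
signatures = [
    {"TOP2A", "MYC", "GAPDH"},
    {"TOP2A", "MYC"},
    {"TOP2A", "TP53"},
]

shared_by_all = meta_signature(signatures, 3)
shared_by_two = meta_signature(signatures, 2)
```

Lowering `min_count` trades specificity for coverage: a gene shared by every signature is stronger evidence of relatedness than one shared by only two.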

32
67 genes were screened by meta-analysis
33
Integrated Cancer Gene Expression Map
34
7 genes were discovered by the system
35
THANX!!