Title: Microarray Cancer Data Visualization Analysis in Relation to Pharmacogenomics
1 Microarray Cancer Data Visualization Analysis in
Relation to Pharmacogenomics
2 Microarray Data Acquisition
- What is a microarray?
- Microarray data (scanned image data of expressed
genes) are obtained from microscope slides that
contain an ordered series of samples (DNA, RNA,
protein, tissue).
- The type of microarray depends on the material
placed on it: DNA for a DNA microarray, RNA for an
RNA microarray, and so on. The most commonly used
microarray is the DNA microarray.
- DNA microarrays are ordered sets of gene-specific
probes fixed to a solid support. Fluorescently
labeled samples (RNA copied into labeled cDNA by
reverse transcriptase) are hybridized to the probes
for use in massively parallel gene expression
studies.
3 Background: Definition of Keywords
- Genetics has been the primary discovery engine
for modern biomedical science.
- Genetics is the study of heredity and how traits
are passed on through generations.
- Genomics is the study of genes and their
functions.
- Every human cell (with some rare exceptions)
contains 46 linear chromosomes (pieces of DNA),
organized as 23 pairs.
- The chromosomes contain genetic information,
which is organized into thousands of different
genes.
- A gene is a stretch of DNA that codes for a
particular protein, whether a structural protein
(one that makes up part of a structure of the
cell, for example the cytoskeleton) or an enzyme.
4 Microarray Technology and Pharmacogenomics
- Microarray technology has enabled many advances
in gene study (genomic science).
- It provides a method of collecting thousands of
individual qualitative measurements (such as gene
category) and/or quantitative measurements (such
as RNA level across an entire experiment)
simultaneously from a single sample.
- The oncology field has been especially active,
and to an extent successful, in using microarrays
to differentiate between cancer cell types and to
obtain molecular signatures of the state of
activity of diseased cells in patient samples.
- This approach to studying cancer provides a
better understanding of the underlying mechanism
of tumorigenesis, more accurate diagnosis, more
comprehensive prognosis, and more effective
therapeutic interventions.
5 Microarray Data and Pharmacogenomics (contd.)
- Pharmacogenomics
- - Studies the way a person responds to a drug by
examining the inherited variations in genes that
dictate drug response (negative, positive, or no
response).
- General practice
- - Current drug therapy is empirically prescribed
to fit the needs of the average patient.
- Effect
- - Empirical prescription leads to undue toxicity
in cured patients, and in resistant patients it
causes unnecessary toxicity while delaying
alternative active therapies.
- Goal
- - To obtain new, widely applicable, validated
predictors of the likelihood of optimal drug
therapy response that will enable individually
tailored prescriptions.
6 Visualization of Microarray
- Visualization of microarray data
- - Enables the simultaneous visualization of
multiple expressed-gene data attributes
- - Provides visualized summaries of gene
expression data
- - Provides genome researchers with meaningful
details (gene cluster summary, map position
within the genome, gene/protein sequences) for
effective disease recognition
- Visualization attributes
- - Quantitative: RNA level, p-value, size of
expressed genes
- - Qualitative: color
- Size and color are two attributes that most
visualization tools can use to display
quantitative differences in data
- Visualization methods are most useful when they
show multiple data attributes at once:
qualitative information about gene families or
biological function together with quantitative
information such as RNA level and p-value.
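
As a minimal sketch of how two quantitative attributes can share one display, the Python snippet below maps a log ratio to a green-to-red color and a p-value to a marker size. The function name `encode_gene`, the `Glyph` record, the scaling constants, and the p-value in the usage line are all hypothetical choices made for illustration; they do not come from any particular visualization tool.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str
    size: float   # marker size in arbitrary display units
    color: tuple  # (red, green, blue), each in the range 0..1


def encode_gene(name, log_ratio, p_value, max_abs_ratio=8.0):
    """Map a log ratio to color and a p-value to marker size (illustrative only)."""
    # Clamp the scaled ratio to [-1, 1] so extreme values stay on the color scale.
    t = max(-1.0, min(1.0, log_ratio / max_abs_ratio))
    # Negative ratios shade toward green, positive ratios toward red.
    color = (max(t, 0.0), max(-t, 0.0), 0.0)
    # Smaller p-values get larger markers; p is clamped to [0, 1].
    p = min(max(p_value, 0.0), 1.0)
    size = 10.0 + 90.0 * (1.0 - p)
    return Glyph(name, size, color)


# Hypothetical p-value; the log ratio is only an example input.
print(encode_gene("IMAGE199180", -7.986, 0.01))
```

Keeping the mapping in one small function makes it easy to swap in a different color scale or size range without touching the plotting code.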
7 Source Microarray Visualization data
- BRCA1 and BRCA2 (onset) microarray real-time data
- Control data from healthy cells
- Cells from patients who have undergone
neoadjuvant chemotherapy (treatment given before
surgery for locally advanced and inoperable
breast cancer)
- - Aims at reducing tumor size and increasing
rates of breast-conserving treatment
8 GenePix Sample Data Format
Rank  NAME         Ch1 Net (Mean)  Ch2 Normalized Net (Mean)  Log(base2) of R/G Normalized Ratio (Mean)  Regression Correlation  Spot Flag
1     IMAGE199180  223             1                          -7.986                                     0.799                   0
2     IMAGE810625  119             1                          -7.08                                      0.635                   0
3     IMAGE52228   119             1                          -7.08                                      0.611                   0
4     IMAGE141726  631             10                         -6.027                                     0.705                   0
5     IMAGE74537   17093           330                        -5.696                                     0.787                   0
6     PEROU5D10    451             12                         -5.195                                     0.946                   0
7     IMAGE436741  11601           474                        -4.613                                     0.686                   0
8     IMAGE682522  711             30                         -4.571                                     0.842                   0
9     IMAGE782730  2852            126                        -4.503                                     0.776                   0
10    IMAGE41648   5618            269                        -4.384                                     0.662                   0
11    IMAGE46620   47              3                          -4.155                                     0.782                   0
12    IMAGE51865   4595            352                        -3.707                                     0.615                   0
13    IMAGE587847  4140            318                        -3.701                                     0.71                    0
14    IMAGE109440  10454           821                        -3.671                                     0.679                   0
15    IMAGE199367  21288           1749                       -3.605                                     0.96                    0
16    IMAGE247281  177             15                         -3.565                                     0.674                   0
17    IMAGE276688  110             10                         -3.507                                     0.621                   0
18    IMAGE80186   7123            636                        -3.486                                     0.913                   0
19    IMAGE214572  3764            339                        -3.475                                     0.791                   0
20    IMAGE810911  1497            136                        -3.457                                     0.622                   0
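
The relationship between the intensity columns and the log-ratio column can be checked with a short Python sketch. It assumes that R is the channel 2 value and G is the channel 1 value, so that the ratio column is log2(Ch2 / Ch1); the slide does not state this, but the three rows copied from the table above are consistent with it.

```python
import math

# (name, Ch1 Net, Ch2 Normalized Net, reported log2 ratio), copied from the table.
rows = [
    ("IMAGE74537", 17093, 330, -5.696),
    ("IMAGE109440", 10454, 821, -3.671),
    ("IMAGE199367", 21288, 1749, -3.605),
]


def log2_ratio(ch1, ch2):
    # Assumption: R is channel 2 and G is channel 1.
    return math.log2(ch2 / ch1)


for name, ch1, ch2, reported in rows:
    computed = log2_ratio(ch1, ch2)
    print(f"{name}: computed {computed:.3f}, reported {reported}")
```

Rows with very small channel 2 values (for example 1) agree less closely, presumably because the displayed intensities are rounded to whole numbers.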
9 MATLAB Gene Spatial Image Representations
- The command gprread reads the data from the file
into a structure (here pd).
- pd.ColumnNames lists the field names in the
structure that can be displayed, giving the
following spatial images of the microarray data.
- figure; maimage(pd, 'F635 Median')
- Notice the very high background levels down the
right side of the array. Areas of high color
intensity signify high levels of gene expression.
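
The idea behind a spatial image can be sketched in a few lines of Python: each spot value is placed at its row and column position, and a per-column median then shows numerically whether one edge of the array is elevated. This is not how maimage is implemented. The helper names and the 3 x 4 grid of intensities are hypothetical, constructed so that the rightmost column is high.

```python
from statistics import median


def spatial_grid(spots, n_rows, n_cols):
    """Arrange (row, col, value) spot records into a 2D grid of values."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for row, col, value in spots:
        grid[row][col] = value
    return grid


def column_medians(grid):
    """Median value of each grid column, ignoring positions with no spot."""
    return [median(v for v in column if v is not None) for column in zip(*grid)]


# Hypothetical intensities: the last column carries an extra 900 units.
spots = [
    (r, c, 100 + 10 * r + (900 if c == 3 else 0))
    for r in range(3)
    for c in range(4)
]
grid = spatial_grid(spots, n_rows=3, n_cols=4)
print(column_medians(grid))
```

A column whose median stands far above the others is a hint that the slide has a position-dependent artifact and not a biological signal.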
10 Visualization results (contd.)
- Visualization results for breast cancer cells
scanned at 532 nm (F532)
- The "F532 Median" field corresponds to the
foreground of the green (Cy3) channel.
- figure; maimage(pd, 'F532 Median')
11 Visualization results for the untreated control
sample scanned at 532 nm (F532)
12 Clustering Commands
- The xlsread function can be used to read the
data from the XLS file and load it into MATLAB
- [numericData, textData] = xlsread('cancerdata.xls')
- This reads the spreadsheet into two variables:
numericData (numeric values) and textData (text
values)
- giValues = numericData(:, 2:end)
- drugMechanism = textData(2:end, 1)
- To perform the clustering, the command below is
used (drug and tumorTypes are label variables
whose definitions are not shown here)
- clustergram(giValues, 'RowLabels', drug,
'ColumnLabels', tumorTypes)
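
To make the clustering step concrete, here is a minimal Python sketch of agglomerative hierarchical clustering on the rows of a small matrix. It assumes Euclidean distance and average linkage, which may differ from the settings used for the figures in this deck. The values and the labels `drugA` to `drugD` are made up. Unlike clustergram, it clusters rows only and draws nothing; it just records the order in which groups merge.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows, labels):
    """Return the merges as (labels_a, labels_b, distance), closest pair first."""
    # Each cluster is a tuple of row indices; start with one cluster per row.
    clusters = [(i,) for i in range(len(rows))]
    merges = []

    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise distance between members.
        total = sum(euclidean(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Pick the closest pair among the current clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append(([labels[i] for i in a],
                       [labels[i] for i in b],
                       cluster_distance(a, b)))
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical data: two rows with positive values, two with negative values.
values = [
    [1.0, 1.2, 0.9],
    [1.1, 1.0, 1.0],
    [-2.0, -1.8, -2.2],
    [-2.1, -2.0, -1.9],
]
labels = ["drugA", "drugB", "drugC", "drugD"]
merges = average_linkage(values, labels)
for left, right, dist in merges:
    print(left, "+", right, f"at distance {dist:.3f}")
```

The recorded merge order is the information a dendrogram draws: pairs that merge at small distances sit on low branches, and the final merge joins the two most dissimilar groups.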
13 Cluster figure
14 Visualization results
- A subsection example of unsupervised hierarchical
clustering
15 Microarray Data Visualization Results
- Cross section of hierarchical clustering of
expressed genes
16 Conclusion
- Significant differences in gene expression in
cancer specimens before and after treatment were
observed.
- Differences in the microarray spatial images
between the control and diseased cancer cells
were observed.
- Further confirmation of whether the drug used is
providing effective therapy is an oncologist's
call.
17 Q & A