Title: Blank slide/colon data
1Blank slide/colon data
CLUSTER ANALYSIS OF DNA AND ANTIGEN CHIP DATA
EYTAN DOMANY
DEAD SEA, OCT 2002
2The data
THE DATA: EXPRESSION LEVEL OF GENE i IN SAMPLE j
[Figure: expression matrix, genes 1 to 358 by samples 1 to 36]
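As a minimal sketch of this layout, the expression levels can be held in a 358 by 36 table and read either by row (one gene across all samples) or by column (one sample across all genes). The names `expr`, `gene_profile`, and `sample_profile` are hypothetical, and the values are placeholders rather than measured data.

```python
N_GENES, N_SAMPLES = 358, 36  # sizes shown on the slide

# Placeholder matrix: expr[i][j] stands for the expression level of gene i in sample j.
expr = [[0.0] * N_SAMPLES for _ in range(N_GENES)]
expr[4][2] = 1.5  # illustrative value only (0-based indices)


def gene_profile(expr, i):
    """Expression of gene i across all samples (one row)."""
    return list(expr[i])


def sample_profile(expr, j):
    """Expression of all genes in sample j (one column)."""
    return [row[j] for row in expr]


print(len(gene_profile(expr, 4)), len(sample_profile(expr, 2)))
```

Clustering genes compares rows of this table; clustering samples compares columns.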
3THE METHOD 1
THE METHOD: CLUSTER ANALYSIS. GIVEN N OBJECTS, BREAK
THEM INTO GROUPS ON THE BASIS OF SIMILARITY.
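To make "groups on the basis of similarity" concrete, here is a minimal sketch that links any two objects closer than a cutoff and reports the connected groups (single-linkage with a threshold). This is a generic stand-in chosen for brevity, not the clustering algorithm used in the talk; the function names, the Euclidean distance, and the toy points are all assumptions.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_by_threshold(objects, max_dist):
    """Group objects so that any pair closer than max_dist shares a cluster."""
    n = len(objects)
    parent = list(range(n))  # each object starts as its own cluster

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if euclidean(objects[i], objects[j]) < max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy 2-D points, not data from the talk.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.2, 0.1)]
clusters = cluster_by_threshold(points, max_dist=1.0)
print(clusters)
```

The objects can be gene profiles or sample profiles; only the choice of vectors changes.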
4THE METHOD 2
- THE METHOD: CLUSTER ANALYSIS.
- GIVEN N OBJECTS, BREAK THEM INTO GROUPS ON THE BASIS OF SIMILARITY.
- OBJECTS = GENES
  - GENES WITH SIMILAR EXPRESSION PROFILES MAY BE CO-REGULATED
  - PROVIDE A GUESS FOR THE ROLE OF PROTEINS
- OBJECTS = SAMPLES
  - CLASSIFY TUMORS: DIAGNOSIS, PROGNOSIS, THERAPY
THE PROBLEM (RAISED BY ESHEL BEN-JACOB): CLASSIFYING
THE PATIENTS ON THE BASIS OF EXPRESSION OF
THOUSANDS OF GENES DOES NOT WORK, SINCE MOST
GENES ARE NOT RELEVANT TO THE QUESTION OF
INTEREST AND INTRODUCE ONLY NOISE.
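A toy calculation shows why irrelevant genes drown the signal. Assume, purely for illustration, that every gene carries independent noise of variance sigma^2 and that only a few "relevant" genes differ between two patient classes by a mean shift delta. The expected squared distance between two samples of the same class is then 2·sigma^2 per gene, and each relevant gene adds delta^2 for samples of different classes. The model, the function name `contrast`, and all numbers below are assumptions, not values from the talk.

```python
def contrast(n_relevant, n_total, delta, sigma):
    """Expected squared distance between classes divided by that within a class."""
    within = 2 * sigma**2 * n_total           # noise contribution of every gene
    between = within + n_relevant * delta**2  # relevant genes add the class shift
    return between / within


# 10 relevant genes hidden among 2000, versus the 10 relevant genes alone.
all_genes = contrast(n_relevant=10, n_total=2000, delta=2.0, sigma=1.0)
subset_only = contrast(n_relevant=10, n_total=10, delta=2.0, sigma=1.0)
print(all_genes, subset_only)
```

With all genes the two classes are almost indistinguishable (ratio close to 1), while on the relevant subset the between-class distance clearly exceeds the within-class one.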
5football
6THE SOLUTION: WORK WITH SMALL SUBSETS OF GENES AND SAMPLES.
COUPLED TWO-WAY CLUSTERING: Getz et al., PNAS (2000)
Califano et al., Proc. Int. Conf. Intell. Syst. Mol. Biol. (2000)
Y. Cheng and G. M. Church, Proc. Int. Conf. Intell. Syst. Mol. Biol. (2000)
IDENTIFY (CORRELATED) GROUPS OF GENES AND USE THEIR EXPRESSION LEVELS TO
STUDY (CLUSTER) THE SAMPLES.
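The coupling idea can be sketched as a loop: cluster the genes using a subset of samples, cluster the samples using a subset of genes, add every cluster found to the pool of subsets, and repeat until nothing new appears. The sketch below is a simplified illustration under stated assumptions, not the published implementation: the talk uses Super-Paramagnetic Clustering and keeps only stable clusters, whereas `threshold_clusters` here is a plain distance-cutoff stand-in, and all function names and the toy matrix are hypothetical.

```python
import math


def threshold_clusters(vectors, ids, max_dist=1.0, min_size=2):
    """Stand-in for SPC: connected components under a distance cutoff."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if math.dist(vectors[i], vectors[j]) < max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i, obj in enumerate(ids):
        groups.setdefault(find(i), []).append(obj)
    return [tuple(g) for g in groups.values() if len(g) >= min_size]


def ctwc(expr, cluster_fn=threshold_clusters, max_rounds=10):
    """Cluster genes by sample subsets and samples by gene subsets, iteratively."""
    gene_sets = [tuple(range(len(expr)))]       # G1: all genes
    sample_sets = [tuple(range(len(expr[0])))]  # S1: all samples
    done = set()                                # (gene set, sample set) pairs already used
    for _ in range(max_rounds):
        found_new = False
        for g in list(gene_sets):
            for s in list(sample_sets):
                if (g, s) in done:
                    continue
                done.add((g, s))
                sub = [[expr[i][j] for j in s] for i in g]
                cols = [list(c) for c in zip(*sub)]
                for c in cluster_fn(sub, g):    # genes of g, using samples of s
                    if c not in gene_sets:
                        gene_sets.append(c)
                        found_new = True
                for c in cluster_fn(cols, s):   # samples of s, using genes of g
                    if c not in sample_sets:
                        sample_sets.append(c)
                        found_new = True
        if not found_new:
            break
    return gene_sets, sample_sets


# Toy matrix (rows are genes, columns are samples), not data from the talk.
expr = [[0, 0, 5, 5], [0, 0, 5, 5], [0, 5, 0, 5], [0, 5, 0, 5]]
gene_sets, sample_sets = ctwc(expr)
print(gene_sets)
print(sample_sets)
```

In this toy matrix no two samples are close when all four genes are used, so the first pass finds no sample clusters. Restricting to the gene cluster (0, 1) splits the samples into (0, 1) and (2, 3), and the gene cluster (2, 3) yields a different partition, (0, 2) and (1, 3).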
7glioblastoma
GLIOBLASTOMA
M. Hegi et al., CHUV; G. Getz
CLONTECH ARRAYS
Coupled Two-Way Clustering (CTWC) of 358 Genes and 36 Samples
[Figure: two-way clustered expression matrix. Labels: S1(G1), S2, S3, G1(S1), G5, G12, T, GENES. Sample groups: Astrocytoma (II), Secondary GBM, Primary Glioblastoma, Cell Lines]
8S1(G5)
Super-Paramagnetic Clustering of All Samples Using Stable Gene Cluster G5
[Figure (Fig. 2B): S1(G5), with sample clusters S10, S11, S12, S13, S14]
9validation
G5Ver
10THE GENES OF G5
THE GENES OF G5
AB004904   STAT-induced STAT inhibitor 3
M32977     VEGF
M35410     IGFBP2
X51602     VEGFR1
M96322     gravin
AB004903   STAT-induced STAT inhibitor 2
X52946     PTN
J04111     c-jun
X79067     TIS11B
VEGF AND ITS RECEPTORS ARE INSTRUMENTAL IN ANGIOGENESIS
(INDUCED GROWTH OF BLOOD VESSELS), WHICH IS
ESSENTIAL FOR GROWTH BEYOND A CRITICAL SIZE.
THE COEXPRESSION OF IGFBP2 WAS INDEPENDENTLY VERIFIED:
FIRST EVIDENCE FOR ITS POSSIBLE ROLE IN ANGIOGENESIS.
11colon paired G1(S1)
COLON CANCER: 18 PAIRED CARCINOMA/NORMAL, 4 PAIRED ADENOMA/NORMAL
Notterman et al., Cancer Res. (2001)
Getz et al., Bioinformatics (in print)
12colon cancer carcinoma adenoma
S1(G8) tumor/normal
S1(G3) protocol A /protocol B
13colon cancer tumor/normal dist.mat
S1(G8) tumor/normal
distance matrix
14colon cancer protocol A/B dist.mat
COLON CANCER: 18 PAIRED CARCINOMA/NORMAL
Notterman et al., Cancer Res. (2001)
S1(G3) protocol A /protocol B
distance matrix
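The distance matrices shown for S1(G8) and S1(G3) compare samples using only the genes of one gene cluster. A minimal sketch of that computation follows; the Euclidean metric, the function name, and the toy numbers are assumptions, since the slides do not state which distance was used.

```python
import math


def sample_distance_matrix(expr, genes):
    """Pairwise distances between samples, restricted to the given gene rows."""
    n_samples = len(expr[0])
    # One profile per sample, containing only the selected genes.
    profiles = [[expr[g][s] for g in genes] for s in range(n_samples)]
    return [[math.dist(p, q) for q in profiles] for p in profiles]


# Toy matrix (rows are genes, columns are samples), not data from the study.
expr = [
    [1.0, 1.0, 4.0],
    [2.0, 2.0, 6.0],
    [9.0, 0.0, 5.0],
]
dist = sample_distance_matrix(expr, genes=[0, 1])
for row in dist:
    print(row)
```

Reordering the rows and columns of such a matrix by cluster membership makes groups of mutually close samples appear as blocks along the diagonal.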
15Predicting response to doxorubicin treatment: successful for 3/20 patients
BREAST CANCER DATA (BOTSTEIN/BROWN LAB; PEROU ET AL., NATURE 2000)
I. Kela, G. Getz
S1(G46)
20 patients, sampled before and after chemotherapy. 10 of the
"before" samples fall in cluster b, including the samples of all 3
successfully treated patients. An intermediate expression level of the G46
genes may therefore serve as a marker for a relatively high
success rate of doxorubicin treatment.
16survival S1(G33) Sorlie
BREAST CANCER DATA (BOTSTEIN/BROWN LAB)
Sorlie et al., PNAS (2001)
Getz et al., Bioinformatics (in print)
[Figure: S1(G33), annotated with survival and p53 status]
Cluster (a): high expression levels of the genes
of G33, low survival, mutant p53. The expression of G33 is a predictor of
survival.
17nointerpret
BREAST CANCER DATA (BOTSTEIN/BROWN LAB), Sorlie
et al., PNAS (2001)
Gene cluster G36 induces a clear partition into two
classes with no known clinical interpretation.
18skin cancer, UV
Givol, Rechavi, Dazard, ..., Hilah Gal
NORMAL HUMAN EPIDERMAL KERATINOCYTES (NHEK)
SQUAMOUS CARCINOMA CELLS (SCC)
IRRADIATE (2m) BY UVB, MEASURE EXPRESSION VS TIME
NHEK, UV: t = 0.5, 3, 6, 12, 24 hours
NO UV: t = 0, 0.5, 12, 24
SCC, UV: t = 0, 6, 12
UV INDUCES DNA DAMAGE, WHICH ELICITS AN
APOPTOTIC RESPONSE. NHEK RESIST APOPTOSIS BY
SECRETION OF SURVIVAL FACTORS. THIS RESISTANCE TO
APOPTOSIS MAY PROMOTE THE EMERGENCE OF MALIGNANCY.
19S1(G28)
Squamous Carcinoma Cells UVB (SU)
UV/NON-UV SEPARATION INDUCED BY G28:
DNA REPAIR (GADD45A, B), ANTIOXIDANT (MT1G), GROWTH FACTORS,
INFLAMMATORY MEDIATORS
20S1(G24), S1(G18)
HIGH IN NHEK, ELEVATED WITH UV
S1(G18) TUMOR/NORMAL (5 genes)
Reordered Genes
Reordered Samples
PRO-APOPTOTIC GENES (PARP, CAS)
21Antigen chips
ANTIGEN CHIPS
IRUN COHEN, FRANCISCO QUINTANA
Guy Hed, Gaddy Getz, Hila Shtark, Dafna Tsafrir, Gadi Elitzur
DIABETES, ARTHRITIS
22SAMPLES: N = 40 SERA, FROM 20 HEALTHY SUBJECTS AND 20 WITH TYPE 1 DIABETES
78 TESTED ANTIGENS (1 BLANK), EACH WITH TWO MARKERS: IgG+IgM AND IgM
MEASURE Aij = REACTIVITY OF SERUM j TO ANTIGEN i
M = 176 MEASUREMENTS PER SAMPLE
THE DATA FORM A 176 X 40 ARRAY
23EACH ONE OF THE 40 SUBJECTS IS REPRESENTED BY 176
NUMBERS: THE REACTIVITY OF THE SUBJECT'S SERUM WITH THE
176 ANTIGENS. THESE 176 NUMBERS A1,j, A2,j, ..., A176,j
CONSTITUTE THE REACTIVITY PROFILE OF SUBJECT j.
QUESTION: CAN ONE IDENTIFY PATTERNS OF SIMILARITY BETWEEN THE
REACTIVITY PROFILES OF SUBJECTS WITH DIABETES? DO THEY FORM
A DISTINCT GROUP FROM HEALTHY SUBJECTS?
ANSWER: NO SEPARATION INTO DIABETES VS HEALTHY
IS FOUND WHEN WE USE ALL ANTIGENS
TO CHARACTERIZE THE SUBJECTS.
24CLUSTERING 176 ANTIGENS, USING THEIR
REACTIVITIES WITH 40 SERA
ANTIGEN CLUSTERS
THE ANTIGENS FORM DISTINCT GROUPS. THERE IS
STRUCTURE IN THEIR REACTIVITY PATTERNS.
25USE THE STABLE (SIGNIFICANT) ANTIGEN
CLUSTERS, ONE AT A TIME, TO CLUSTER THE SUBJECTS.
USING ANTIGEN CLUSTER 1 (Insulin GM, Collagen1
both) WE GET A GOOD CLASSIFIER: A DIABETES
CLUSTER CATCHES 17/20 OF THE DIABETES SUBJECTS,
WITH 4/21 MISTAKES.
USING A MAJORITY VOTE OF 5 CLASSIFIERS, WE GET 90%.
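A majority vote over several cluster-based classifiers can be sketched as follows. Each classifier is assumed to cast one yes/no vote per subject (for example, "this subject falls in the diabetes cluster obtained with antigen cluster k"); that reading, the function names, and the made-up votes are assumptions, since the slide does not describe how the five classifiers were built or combined.

```python
def majority_vote(votes):
    """Return True ('diabetes') when more than half of the classifiers vote True."""
    return sum(votes) > len(votes) / 2


def accuracy(predicted, actual):
    """Fraction of subjects whose predicted label matches the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up votes of 5 classifiers for 4 subjects (one row each), not study data.
votes_per_subject = [
    [True, True, False, True, True],
    [False, True, False, False, False],
    [True, False, True, True, False],
    [False, False, True, False, True],
]
actual = [True, False, True, True]

predicted = [majority_vote(v) for v in votes_per_subject]
print(predicted, accuracy(predicted, actual))

# The single-classifier figure quoted on the slide: 17 of 20 diabetes subjects caught.
sensitivity_cluster_1 = 17 / 20
print(sensitivity_cluster_1)
```

Using an odd number of classifiers, as on the slide, avoids tied votes.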
26collaborators
Projects: Collaborators; Students/Postdocs
Cancer
Colon: D. Notterman; G. Getz, M. Mashiah, H. Gal
Breast: D. Botstein; I. Kela, G. Getz
Glioblastoma: M. Hegi, S. Goddard; G. Getz
Skin: D. Givol, G. Rechavi, J. Dazard; H. Gal
Leukemia: E. Canaani; O. Ravid, G. Getz, H. Agrawal
P53 primary targets: D. Givol, K. Kannan, G. Rechavi; G. Getz, I. Kela
P73 primary targets: D. Givol, G. Rechavi; I. Kela
MutP53 as oncogene: V. Rotter; O. Ravid
Bone development: D. Gazit; O. Ravid
Antigen Chips
Diabetes: I. Cohen, F. Quintana; G. Getz, G. Hed, D. Tsafrir
Arthritis: I. Cohen, F. Quintana; I. Tsafrir, H. Shtern, G. Elitzur
Neurotransmitters: M. Levite; D. Tsafrir, I. Tsafrir
Tissue dependence: D. Lancet; H. Shtern
Apoptosis, IL6: L. Sachs, Y. Lotem, D. Givol; H. Gal
Meiosis in yeast: M. Primig; M. Katzenelenbogen
Yeast cell cycle: M. Zhang; G. Getz, E. Levine
Protein Struct. Classif.: M. Vendruscolo, G. Getz
Low-T phase, SpinGlass: D. Stauffer, P. Young, A. Hartmann, M. Palassini; G. Hed
SPC: M. Blatt, S. Wiseman, H. Agrawal, N. Shental
CTWC: G. Getz, E. Levine, O. Barad
re/preprint available
Weizmann Inst.
Currently postdoc/student
27Summary
SUMMARY
THE COUPLED TWO-WAY CLUSTERING METHOD
SUCCESSFULLY IDENTIFIED RELEVANT STRUCTURE AND
MEANING IN CANCER-RELATED GENE MICROARRAY
DATA.
CTWC SERVER: http://ctwc.weizmann.ac.il
www.weizmann.ac.il/home/fedomany/
www.weizmann.ac.il/physics/complex/compphys
FUNDING: ISF, GIF, Ridgefield Found., Levine
Found., NIH, EC