Title: Gene Expression analysis in complex diseases
1Gene Expression analysis in complex diseases
- CHINDO HICKS, Ph.D.
- Department of Preventive Medicine and
Epidemiology
- Bioinformatics Lecture for Graduate Students
- August 15th, 2007
2Outline of lecture
- Part I
- Microarray platforms and data processing
- Analysis to identify differentially expressed
genes distinguishing disease from controls
- Example 1: Obese vs lean subjects
- Example 2: Schizophrenia vs normal controls
- Analysis to identify genes distinguishing
subtypes of cancer
- What is clustering?
- Example 3: Classification of subtypes of cancer
- Example 4: Drug effectiveness
3Outline of lecture
- Part II
- Bioinformatics tools for research
- Bioinformatics resources (databases)
- Problems for solving??
- Exam questions???
4Part I Spotted microarray platform
5Identification of differentially expressed genes
Disease
Control
Probe array
6Gene expression data (N = 54,000)
7Example 1: Identify genes that distinguish obese
from lean subjects in the Pima Indian population
Lean: n = 19
Obese: n = 20
Fold change: $\mathrm{FC} = \mu_1 / \mu_2$
T-test, null hypothesis $H_0: \mu_1 = \mu_2$
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$
Gene expression profiling based on RNA extracted
from the adipose tissue
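The fold change and t-test on this slide compare the mean expression of one gene between two groups. The following is a minimal sketch of that calculation, not the lecture's actual pipeline. It assumes Welch's unequal-variance form of the t-test (the slide only says "T-test"), and the intensity values are made up for illustration. The function names `fold_change` and `welch_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def fold_change(group1, group2):
    """Ratio of mean expression between two groups (mu1 / mu2)."""
    return mean(group1) / mean(group2)


def welch_t(group1, group2):
    """Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances
    se2 = v1 / n1 + v2 / n2                      # squared standard error
    t = (mean(group1) - mean(group2)) / sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Made-up intensities for one hypothetical gene (not data from the study).
obese = [8.0, 10.0, 12.0, 10.0]
lean = [4.0, 5.0, 6.0, 5.0]

fc = fold_change(obese, lean)
t, df = welch_t(obese, lean)
print(f"FC = {fc:.2f}, t = {t:.2f}, df = {df:.1f}")
```

A p-value would then be read from the t distribution with `df` degrees of freedom. In a real analysis the test is repeated for every gene on the array.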
8Result: Set of up- and down-regulated genes
distinguishing obese (OB) from lean (L) subjects
9Example 2: Identify genes that distinguish
schizophrenia from control subjects
14 Matched controls
14 Schizophrenia
Fold change: $\mathrm{FC} = \mu_1 / \mu_2$
T-test, null hypothesis $H_0: \mu_1 = \mu_2$
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$
Gene expression profiling based on RNA extracted
from the brain tissue
10Result: Set of genes distinguishing schizophrenia
(SCH) from controls (CNTL)
11Clustering: What is a cluster? Identify genes
distinguishing the subgroups
12Aim of clustering: Group genes according to their
similarity
- Given genes $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$
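The slide does not say which similarity measure is used. As an illustration only, the sketch below computes two common choices for a pair of expression profiles x and y: Euclidean distance and Pearson correlation. The profiles are made up, and the function names are hypothetical.

```python
from math import sqrt


def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up profiles across four samples.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # same trend as x, larger magnitude
z = [4.0, 3.0, 2.0, 1.0]   # opposite trend to x

print(euclidean_distance(x, y))
print(pearson_correlation(x, y))
print(pearson_correlation(x, z))
```

Correlation treats x and y as similar because they rise together, even though their magnitudes differ. Euclidean distance penalizes the difference in magnitude.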
13Hierarchical clustering
- Similarity of objects is represented in a tree
structure (dendrogram).
- Advantage: no need to specify the number of
clusters in advance.
- Nested clusters can be represented.
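The idea of building a tree by repeated merging can be sketched in a few lines. This is a minimal agglomerative example, assuming Euclidean distance and average linkage. The lecture does not state either choice, and the gene names and profiles are made up. The merge history returned here is the information a dendrogram displays.

```python
from itertools import combinations
from math import sqrt


def euclidean(x, y):
    """Straight-line distance between two expression profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def hierarchical_clustering(profiles):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        merges.append((a, b, average_linkage(a, b, profiles)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Made-up expression profiles for four hypothetical genes.
profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [1.1, 2.1, 3.1],
    "geneC": [9.0, 8.0, 7.0],
    "geneD": [9.2, 8.1, 7.0],
}

merges = hierarchical_clustering(profiles)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.2f}")
```

No cluster count is supplied anywhere. Cutting the merge history at a chosen distance yields the clusters, and the earlier merges remain nested inside the later ones.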
14Example 3: Classification of subtypes of ovarian
cancer
104 patients
Goal: Find gene clusters distinguishing subtypes
of ovarian cancer
15Clusters of genes distinguishing subtypes of
ovarian cancer
16Major drugs ineffective for many
17Cluster analysis of 118 anticancer drugs against
highly correlated genes
18Part II Bioinformatics tools and resources for
research
- Bioinformatics tools
- Nucleotide Sequence Analysis
- Protein Sequence Analysis and Proteomics
- Structures
- Genome Analysis
- Gene Expression
- Tools for Programmers
- Resources
- We labs
- Databases
24Source: http://www.ncbi.nlm.nih.gov/Tools/
25http://homepage.univie.ac.at/herbert.mayer/AtoZ.html#1to9
26Problem solving
- (1) Use Entrez (PubMed) or SWISS-PROT to find a
protein sequence for the CD44 gene, download the
sequence in FASTA format, and use the sequence or
gene name with the ProDom database to find the
domains for this gene.
- (2) Use Entrez (PubMed) or SWISS-PROT to find a
protein sequence for the MMP-19 gene, download the
sequence in FASTA format, and use the sequence or
gene name with the Pfam database to find other
members of the MMP family of proteins and their
domains.
- Explain the functions of those domains.
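Once a sequence has been downloaded in FASTA format, it helps to check what the file contains before pasting it into a domain database. The sketch below parses FASTA text using only the Python standard library. The record shown is a made-up placeholder, not the real CD44 or MMP-19 sequence, and `parse_fasta` is a hypothetical helper name.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Made-up placeholder record for illustration only.
example = """>example_protein made-up record for illustration
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(example)
for header, seq in records.items():
    print(header, len(seq))
```

For a downloaded file, read its contents as text and pass the string to `parse_fasta`.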