Introduction to - PowerPoint PPT Presentation

Provided by: Wes57

1
Introduction to
Bioinformatics
2
Introduction to Bioinformatics
LECTURE 9: Clustering gene expression
Chapter 9: The genomics of wine-making
3
9.1 Chateau Hajji Feruz Tepe
Wine making dates back to at least 5000 BC, based
on archeological finds at Hajji Feruz Tepe, Iran.
Overview of Neolithic houses at Hajji Feruz Tepe
that yielded six wine jars in the floor along
one wall of the room.
4
One of six jars once filled with wine from the
Neolithic residence at Hajji Feruz Tepe (Iran).
Chemical analysis of patches of a reddish
residue covering the interior of this vessel
showed that this originally was resinated wine.
5
Recipe for wine making:
1. fruit juice (or other sugar-rich liquid)
2. yeast (Saccharomyces cerevisiae)
7
Introduction to Bioinformatics: 9.1 CHATEAU HAJJI
FERUZ TEPE
Yeast (Saccharomyces cerevisiae) is a unicellular
fungus found naturally in grapevines and
responsible for wine making: it ferments sugars
and produces alcohol.
10
From being budded off from its parent cell, to
reproducing its own offspring, each yeast cell
goes through a number of typical steps that also
involve changes in gene expression, turning whole
pathways on and off.
14
Remember, a gene is an on-off switch, and RNA and
proteins are messengers between the genes. If a
gene is on, the gene is expressed. The degree
to which the gene is expressed is called the
expression level of the gene. If a gene is off,
it can be said to have expression level zero.
15
Today the study of such phenomena is possible
through microarray technology, which can
measure the expression level of every gene in a
cell. With the gene expression data, genes can
be clustered on the basis of the similarity of
their expression profiles.
16
With water, sugar and flour, yeast ferments the
sugars in the dough and produces carbon dioxide
(CO2); this causes the dough to rise. In this
process it produces alcohol as a by-product
(originally perhaps as near-toxic
protection!). When the sugar supply is
exhausted, S. cerevisiae must find a new source
of energy: when oxygen is available it shifts to
respiration, and alcohol now becomes the source
of energy. This state change is called the
diauxic shift.
17
S. cerevisiae is one of the most studied
organisms in biology. S. cerevisiae is a complex
unicellular eukaryote: a 12.5 Mbp genome in 16
linear chromosomes (excluding the mitochondrial
genome), containing 6400 genes (2000 more than
E. coli).
19
S. cerevisiae can be regarded as a complex
factory transforming many raw materials into
final materials, involving many conveyor belts
between the genes. Such a conveyor belt of
coupled expressed genes is called a genetic
pathway. The diauxic shift means that the whole
system has to be transformed from the old process
to the new process: entirely new pathways are
formed, and old pathways are shut off.
20
Therefore it is useful to monitor the
genome-wide expression of S. cerevisiae over
time, including the diauxic shift. This
monitoring can be done with microarrays, among
the most important tools in bioinformatics.
Other dynamical processes, such as the cell
cycle, can also be studied with microarrays.
This requires data analysis of the microarrays:
here we study the clustering of expression
profiles, i.e. time series of expression levels.
21
  • 9.2 Monitoring cellular communication
  • Purpose of microarrays: a snapshot of the
    expression levels in the cell.
  • Expressed gene: DNA → mRNA → proteins.
  • In the cell, expressed genes therefore give
    rise to high numbers of mRNA molecules.
  • Idea of microarrays: measure the concentrations
    of mRNA, and reverse-compute the DNA belonging to
    this mRNA.
  • Because introns are spliced out of the mRNA,
    the reverse-computed DNA is not identical to
    the genomic DNA; it is called cDNA
    (complementary DNA).

22
Introduction to Bioinformatics: 9.2 MONITORING
CELLULAR COMMUNICATION
The cDNA computed from mRNA points to an
expressed gene; the cDNA is stored as an EST
(Expressed Sequence Tag). EST sequencing can
identify genes that are missed by ab initio
gene-finding methods, such as ORF finder.
23
  • 9.3 Microarray technologies
  • A microarray is an array of sensitive spots,
    each containing a stretch of DNA, e.g. based on
    an EST.
  • Hybridization (chemical binding) of the DNA
    with components in the substrate indicates the
    presence of the associated mRNA.
  • The hybridization can be made visible by
    attaching fluorescent molecules to the DNA (red,
    green) and later illuminating them with a
    suitable laser.

25
Until recently we lacked tools to observe
genome-wide expression. 1989 saw the introduction
of the microarray technique by Stephen Fodor,
but only in 1992 did this technique become
generally available, though still very costly.
28
Introduction to Bioinformatics: 9.3 MICROARRAY
TECHNOLOGIES
Example of an Affymetrix microarray simulation.
(a) Example of the simulated single-channel
oligonucleotide microarray slide image (crop from
top left corner). An Affymetrix .cel file was
used as the ground-truth data, so the text
about the slide type is observable.
(b) A real Affymetrix slide image, shown for
comparison.
29
  • 9.4 The diauxic shift and yeast gene expression
  • In 1997 DeRisi et al. used microarrays to
    measure the genome-wide expression of S.
    cerevisiae during the diauxic shift.
  • 9 initial hours of growth, 6 hours before the
    diauxic shift, and 6 hours thereafter.
  • They measured the mRNAs in the array at t
    time steps before the diauxic shift, and compared
    those with the mRNA levels at time 0.

30
Introduction to Bioinformatics: 9.4 THE DIAUXIC
SHIFT AND YEAST GENE EXPRESSION
This experiment gave a set of 43,000 ratios:
seven time points (t1, t2, ..., t7) of 6400 gene
expression levels, normalized to their start
value. This is the reference design in
microarray literature.
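The normalization to the start value can be sketched in a few lines. This is only an illustration: the function name `ratios_to_start` and the intensity values are made up for the example and do not come from the DeRisi data.

```python
def ratios_to_start(levels, start_level):
    """Divide each measured level by the level at time 0 (the reference)."""
    if start_level <= 0:
        raise ValueError("the reference level must be positive")
    return [level / start_level for level in levels]


# Hypothetical intensities for one gene at three later time points.
measured = [220.0, 400.0, 100.0]
profile = ratios_to_start(measured, start_level=200.0)
print(profile)  # [1.1, 2.0, 0.5]
```

A ratio above 1 means the gene is more strongly expressed than at the start, a ratio below 1 means it is less strongly expressed.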
31
This experiment typically provides a time
series that is small relative to the size of the
genome: here m = 7 time points for n = 6400 genes.
This is due to the cost of an array: 1000
euro/array. With this kind of experiment we can
in principle also reconstruct the gene regulatory
networks.
32
  • 9.4.1 Data Description
  • First analyse the relative change in activity.
  • Less than 5% of the genes change more than
    1.5-fold, or less than 0.67-fold.
  • fold-change: f = new_value/old_value; if f > 1
    the fold-change is f, if f < 1 the
    fold-change is -1/f.
  • Example: x0 = 1, x1 = 0.3333, fold-change is
    -3; x0 = 1, x1 = 3, fold-change is 3.
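The signed fold-change used in the example can be written as a small function. This is a minimal sketch; the name `signed_fold_change` is chosen here for illustration, and the sign convention (negative for down-regulation) is read off the example with x1 = 0.3333.

```python
def signed_fold_change(old_value, new_value):
    """Return f = new/old if f >= 1, otherwise -1/f (down-regulation)."""
    if old_value <= 0 or new_value <= 0:
        raise ValueError("expression levels must be positive")
    f = new_value / old_value
    return f if f >= 1 else -1.0 / f


print(signed_fold_change(1.0, 3.0))               # 3.0
print(signed_fold_change(1.0, 0.25))              # -4.0
print(round(signed_fold_change(1.0, 0.3333), 2))  # -3.0
```

With this convention a three-fold increase and a three-fold decrease have the same absolute value, which is what makes a single threshold on the absolute fold-change meaningful.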

33
  • 9.4.1 Data Description
  • Now select only those genes with an absolute
    fold-change above a certain threshold:
  • abs(fold-change) > threshold
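A possible way to apply this selection to a table of profiles is sketched here. It assumes that each profile is a list of positive ratios relative to the start value and that a gene is kept if it exceeds the threshold at any time point; the gene names and numbers are invented for the example.

```python
def max_abs_fold_change(ratios):
    """Largest absolute fold-change over a profile of positive ratios."""
    return max(r if r >= 1 else 1.0 / r for r in ratios)


def select_genes(profiles, threshold=1.5):
    """Keep genes whose absolute fold-change exceeds the threshold somewhere."""
    return {gene: ratios for gene, ratios in profiles.items()
            if max_abs_fold_change(ratios) > threshold}


# Hypothetical gene names and made-up ratios.
profiles = {
    "geneA": [1.0, 1.1, 0.9, 1.2],    # barely changes
    "geneB": [1.0, 1.4, 2.5, 4.0],    # induced
    "geneC": [1.0, 0.8, 0.5, 0.25],   # repressed
}
selected = select_genes(profiles, threshold=1.5)
print(sorted(selected))  # ['geneB', 'geneC']
```

Other selection rules are conceivable, for instance requiring the threshold to be exceeded at several time points; the slides do not say which rule was used.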

34
  • 9.4.2 Data Clustering
  • Next, cluster the genes according to their
    expression levels.
  • Aim: high intra-cluster similarity and low
    inter-cluster similarity.
  • Use a distance/similarity measure and a
    clustering algorithm.

35
  • Data Clustering
  • 1. Define a suitable distance measure d(x1,x2),
    e.g. one based on Pearson's correlation
    coefficient, a normalized distance like the
    Mahalanobis distance, or a metric like the
    generalized p-norm.
  • 2. Define a clustering criterion, e.g.
    $C = \sum_{i,j \text{ in same cluster}} d_{ij} - \sum_{i,j \text{ in different clusters}} d_{ij}$.
  • 3. Apply a suitable clustering algorithm, e.g.
    hierarchical, or K-means clustering.
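Steps 1 and 2 can be illustrated with a short sketch. It assumes the distance is taken as 1 minus Pearson's correlation coefficient (one common way to turn the correlation into a distance) and that the sums in the criterion run over unordered pairs of genes; both choices, and the function names, are assumptions made for this example.

```python
from math import sqrt


def pearson_distance(x, y):
    """1 - Pearson correlation: 0 for perfectly correlated profiles, 2 for anti-correlated."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("constant profile: correlation is undefined")
    return 1.0 - cov / (norm_x * norm_y)


def clustering_criterion(profiles, labels, dist=pearson_distance):
    """C = sum of d_ij within clusters minus sum of d_ij between clusters."""
    c = 0.0
    for i in range(len(profiles)):
        for j in range(i + 1, len(profiles)):
            d = dist(profiles[i], profiles[j])
            c += d if labels[i] == labels[j] else -d
    return c


# Made-up profiles: the first two rise together, the third falls.
profiles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
good = clustering_criterion(profiles, [0, 0, 1])
bad = clustering_criterion(profiles, [0, 1, 1])
print(good < bad)  # True
```

Because d is a distance, a lower value of C corresponds to a better clustering: small distances inside clusters and large distances between them.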

36
Hierarchical clustering
37
K-means clustering
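Since the slide itself carries no transcript, here is a minimal sketch of the standard K-means procedure (Lloyd's algorithm) in plain Python. The deterministic farthest-point initialisation is a choice made for this example so that the result is reproducible; random initialisation is more usual. The data points are invented.

```python
def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean_profile(points):
    return [sum(column) / len(points) for column in zip(*points)]


def initial_centroids(points, k):
    """Farthest-point initialisation (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)))
    return centroids


def k_means(points, k, iterations=100):
    """Alternate between assigning points and updating centroids."""
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_profile(members)
    return labels, centroids


# Made-up two-dimensional profiles forming two obvious groups.
points = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
          [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
labels, centroids = k_means(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Unlike hierarchical clustering, the number of clusters k has to be chosen in advance.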
38
  • Gene function and Clustering
  • 1. Genes with similar expression profiles are
    assumed to have similar functions.
  • 2. Define a clustering criterion, e.g.
    $C = \sum_{i,j \text{ in same cluster}} d_{ij} - \sum_{i,j \text{ in different clusters}} d_{ij}$.
  • 3. Apply a suitable clustering algorithm, e.g.
    hierarchical, or K-means clustering.

39
  • Gene function and Clustering
  • 1. Single linkage: $d_{AB} = \min_{i,j} \lVert x_i - y_j \rVert$,
    with $x_i$ in cluster A and $y_j$ in cluster B.
  • 2. Average linkage: $d_{AB} = \operatorname{mean}_{i,j} \lVert x_i - y_j \rVert$.
  • 3. Centroid distance: $d_{AB} = \lVert m_A - m_B \rVert$.
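The three cluster-to-cluster distances can be sketched together with a simple agglomerative procedure that uses them. This assumes Euclidean distance between profiles and reads x_i and y_j as the members of the two clusters; the `cutoff` parameter and all function names are introduced here for illustration only.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def single_linkage(a, b):
    return min(dist(x, y) for x in a for y in b)


def average_linkage(a, b):
    return sum(dist(x, y) for x in a for y in b) / (len(a) * len(b))


def centroid_distance(a, b):
    m_a = [sum(column) / len(a) for column in zip(*a)]
    m_b = [sum(column) / len(b) for column in zip(*b)]
    return dist(m_a, m_b)


def agglomerate(points, linkage=single_linkage, cutoff=float("inf")):
    """Merge the two closest clusters until the closest pair exceeds cutoff."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters)))
        if d > cutoff:
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Made-up points: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters = agglomerate(points, linkage=single_linkage, cutoff=2.0)
print(len(clusters))  # 3
```

Without a cut-off the loop runs until a single cluster remains; recording the sequence of merges would give the tree of the hierarchical clustering.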

40
  • 9.4.3 Data Visualisation
  • In a tree, using hierarchical clustering.
  • In a plane, using MDS (multidimensional
    scaling).

41
  • Gene function and Clustering
  • 2. Multidimensional scaling

42
  • Gene function and Clustering
  • 1. Hierarchical clustering: level of cut-off

43
  • Pre-processing
  • Select only genes with enough fold-change
  • Delete missing values
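The missing-value step might look like the sketch below. It assumes that missing measurements are stored as `None` or NaN and that a gene is dropped entirely when any time point is missing; the slides do not state how missing values were encoded or handled, and the data here is invented.

```python
def drop_incomplete(profiles):
    """Remove genes whose profile contains a missing value (None or NaN)."""
    def is_missing(value):
        return value is None or value != value  # NaN is not equal to itself

    return {gene: profile for gene, profile in profiles.items()
            if not any(is_missing(value) for value in profile)}


# Hypothetical gene names and made-up ratios.
raw = {
    "geneA": [1.0, 1.2, None, 0.9],
    "geneB": [1.0, 2.0, 3.5, 4.0],
    "geneC": [1.0, float("nan"), 0.5, 0.4],
}
complete = drop_incomplete(raw)
print(sorted(complete))  # ['geneB']
```

Running this step before the fold-change selection avoids computing ratios on incomplete profiles.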

48
Heatmap: time steps against genes, with genes
ordered by hierarchical cluster.
50
  • 9.5 CASE STUDY: Cell-cycle regulated genes
  • A set of microarrays over the cell cycle of
    yeast.

51
From being budded off from its parent cell, to
reproducing its own offspring, each yeast cell
goes through a number of typical steps that also
involve changes in gene expression, turning whole
pathways on and off.
52
Here we examine the expression of the entire
yeast genome through two rounds of the cell
cycle. The temporal expression of the genes is
measured by microarray at 24 time points, every
five hours. In detail, we have the expression
profiles of about 6400 genes.
59
END of LECTURE 9