Chip arrays and gene expression data

Provided by: tal8
1
Chip arrays
Chip arrays and gene expression data
2
Chip arrays
With chip array technology, one can
  • measure the expression of 10,000 (all) genes at
    once
  • answer questions such as:
    1. Which genes are expressed in a muscle cell?
    2. Which genes are expressed during the first week
       of pregnancy in the mother? In the fetus?
    3. Which genes are expressed in cancer?

3
Chip arrays
Classical chip array questions (continued)
4. If one mutates a TF, which genes are no longer
   expressed following this change?
5. Which genes are not expressed in the brain of a
   baby with an intellectual disability?
6. Which genes are expressed when one is asleep
   versus when the same person is awake?
4
Chip arrays
DNA chip: in each cell (spot) of the chip there is
a specific DNA molecule. Upon hybridization with an
mRNA molecule (or a cDNA one), the intensity of the
hybridization can be quantified by light.
5
Chip arrays
Various technologies
The two most common companies: Affymetrix (uses
photolithography) and Agilent (uses phosphoramidite
chemistry).
6
Chip arrays
Affymetrix
Affymetrix: each probe is 25 bp, a part of an
exon.
The reader
The chip itself
In one cm^2: more than 10^6 different oligos
7
Chip arrays
Affymetrix
Affymetrix: each probe is 25 nucleotides. Above
this length a technological problem exists: the
synthesis becomes inaccurate. With such short
probes, each mRNA can hybridize to more than one
probe. The solution: each gene is covered by
several distinct probes.
8
Chip arrays
Affymetrix
Affymetrix: one can buy ready-made chips (human
genome, mouse genome), or design (print) one's
own chip (more expensive).
9
Chip arrays
Affymetrix
  • Detection
  • mRNA is isolated from the tissue (cells, viruses)
  • cDNA is synthesized
  • The cDNA is fluorescently labeled
  • Sometimes, the cDNA is amplified using PCR
  • The intensity in each cell (probe) is measured by
    the reader

10
Chip arrays
Agilent
Agilent: developed DNA printers; in each spot,
picoliters of nucleotides are added. They can
make probes of up to 60-mers (Agilent is derived
from Hewlett-Packard).
Standard phosphoramidite chemistry
11
Chip arrays
Agilent
Hybridization to Agilent probes is more
accurate: if there is a hybridization to a
probe, the gene it represents is probably
expressed.
12
Chip arrays
Agilent
But it is impossible to know how many probe
molecules are in each cell, so absolute
fluorescence intensities are meaningless.
13
Chip arrays
Agilent
Solution: in the same experiment, hybridize
samples from two conditions, e.g. healthy cells
versus tumor cells. The Agilent reader will give
the ratio of the two colors.
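The two-color ratio is commonly reported on a log2 scale, so that equal fold changes up and down are symmetric around zero. A minimal sketch, assuming one background-corrected intensity per spot and channel; the function name, the `floor` parameter and the sample numbers are illustrative and not from the slides:

```python
import math

def log2_ratios(tumor, healthy, floor=1.0):
    """Return log2(tumor / healthy) for each spot.

    `floor` is an assumed lower bound that avoids dividing by zero
    or taking the log of zero for very dim spots.
    """
    ratios = []
    for t, h in zip(tumor, healthy):
        t = max(t, floor)
        h = max(h, floor)
        ratios.append(math.log2(t / h))
    return ratios

# Hypothetical intensities for three spots (tumor channel vs healthy channel).
tumor_signal = [800.0, 200.0, 50.0]
healthy_signal = [200.0, 200.0, 400.0]
print(log2_ratios(tumor_signal, healthy_signal))  # [2.0, 0.0, -3.0]
```

A value of 2 means four times more signal in the tumor channel, 0 means no change, and -3 means eight times less.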
14
Chip arrays
Stanford cDNA chips
In this approach, long cDNA sequences (>300 bp)
are produced in a cell (a clone) and are linked
to each chip cell. This produces long cDNAs and
saves synthesizing them a nucleotide at a time
(cheaper!). As in the case of Agilent, it is
impossible to control the number of probes in
each cell.
15
Chip arrays
Output
Each cell yields a measurement, which is the
output of an optical scanner.
16
Chip arrays
Output
Each gene is represented by several cells
(usually distributed in various places around the
chip)
Gene 2
Gene 1
Gene 3
17
Chip arrays
Output
Programs specific to each technology convert the
data from oligos to genes
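As a rough illustration of the probe-to-gene step, the sketch below collapses several probe intensities into one value per gene by taking the median. This is a simple stand-in, not what any vendor program actually does; the probe IDs, gene names and function name are hypothetical:

```python
from collections import defaultdict
from statistics import median

def summarize_by_gene(probe_intensity, probe_to_gene):
    """Collapse probe-level intensities into one value per gene (median)."""
    per_gene = defaultdict(list)
    for probe, value in probe_intensity.items():
        per_gene[probe_to_gene[probe]].append(value)
    # The median is less sensitive than the mean to a single bad probe.
    return {gene: median(values) for gene, values in per_gene.items()}

# Hypothetical data: three probes for gene g1, one probe for gene g2.
intensities = {"p1": 10, "p2": 30, "p3": 20, "p4": 5}
mapping = {"p1": "g1", "p2": "g1", "p3": "g1", "p4": "g2"}
print(summarize_by_gene(intensities, mapping))  # {'g1': 20, 'g2': 5}
```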
18
Chip arrays
Technical noise
  • Microarray data are noisy because of technical
    issues
  • Variation introduced during sample preparation
  • Array manufacture (variation between supposedly
    identical arrays)
  • Hybridization (variation in the amount of a
    sample), and more

19
Chip arrays
Normalization
Microarray data are normalized to remove
technical noise. This step is done both within an
array and among arrays
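One simple form of among-array normalization is to shift each array so that its median log intensity is zero, which removes array-wide brightness differences. The sketch below assumes the values are already log-transformed; it is only one of several possible schemes and is not named in the slides:

```python
from statistics import median

def median_center(arrays):
    """Subtract each array's own median from all of its values."""
    centered = []
    for values in arrays:
        m = median(values)
        centered.append([v - m for v in values])
    return centered

# Two hypothetical arrays measuring the same three genes; the second
# array is uniformly brighter by 10 units on the log scale.
array_a = [1.0, 2.0, 3.0]
array_b = [11.0, 12.0, 13.0]
print(median_center([array_a, array_b]))
# [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
```

After centering, the two arrays agree, so the remaining differences between arrays are easier to attribute to biology rather than to overall brightness.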
20
Chip arrays
Repeats
A repeat can be either a technical repeat (the
same sample on a different chip) or a real
biological repeat (a different sample).
21
Chip arrays
Differential expression
Genes 1 and 3 are not expressed the same in wt
versus treatment -> they are differentially
expressed. Statistically, a t-test and/or ANOVA
are used to test if a specific gene is
differentially expressed.
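For a single gene, the test statistic can be sketched as follows. This uses Welch's version of the t-test (unequal variances), which is an assumption, since the slides do not say which variant is meant. It only computes the statistic; turning it into a P value requires the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two groups of replicate measurements."""
    # Standard error of the difference between the two group means.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical log expression of one gene: 3 wt chips vs 3 treated chips.
wt = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
print(round(welch_t(wt, treated), 3))  # -3.674
```

Each group needs at least two replicates, otherwise the variance is undefined.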
22
Chip arrays
Correcting for multiple tests
  • Because there are thousands of genes in each chip
    array experiment, even if none of the genes is
    differentially expressed, many false positive
    predictions are expected
  • Two approaches for correction:
  • Bonferroni: divides the P value cutoff by the
    number of genes -> many potential genes may be
    missed
  • False Discovery Rate (FDR): allows for a certain
    percent of false discoveries (e.g., 5%)
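The two corrections can be compared on a small list of P values. The FDR function below uses the Benjamini-Hochberg procedure, one common way to control the FDR; the slides do not name a specific procedure, so that choice is an assumption, as are the example P values:

```python
def bonferroni(pvalues, alpha=0.05):
    """Flag P values that pass the cutoff alpha / (number of tests)."""
    cutoff = alpha / len(pvalues)
    return [p <= cutoff for p in pvalues]

def benjamini_hochberg(pvalues, fdr=0.05):
    """Flag P values accepted by the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose P value is <= fdr * k / m.
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            largest_rank = rank
    # Every test ranked at or below k is called significant.
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[i] = True
    return significant

pvals = [0.001, 0.012, 0.025, 0.035, 0.5]  # hypothetical, one per gene
print(bonferroni(pvals))          # [True, False, False, False, False]
print(benjamini_hochberg(pvals))  # [True, True, True, True, False]
```

On these numbers Bonferroni keeps only one gene while the FDR approach keeps four, which illustrates why Bonferroni may miss many potential genes.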

23
Chip arrays
Expression profiles
Genes 1 and 2 show the same expression profile.
Same is true for genes 3 and 4 (highly expressed)
24
Chip arrays
Expression profiles
Genes with the same expression profile ->
suggestive of a functional linkage (in this
example, g1 and g3 may be specific to the brain,
rather than just being housekeeping genes that
are highly expressed in all tissues, like g2).
25
Chip arrays
Clustering
In general, we want to find all the genes which
share the same expression profile -> suggestive
of a functional linkage. This is done by
clustering the genes with the same profile.
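A toy version of profile clustering: genes are grouped when their profiles across conditions are strongly correlated. Real analyses typically use hierarchical or k-means clustering; the greedy grouping, the 0.9 threshold and the gene profiles below are illustrative assumptions only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_profiles(profiles, min_corr=0.9):
    """Greedily put each gene in the first cluster whose first member it matches."""
    clusters = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            representative = profiles[cluster[0]]
            if pearson(profile, representative) >= min_corr:
                cluster.append(gene)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append([gene])
    return clusters

# Hypothetical expression of four genes over four conditions.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 2.5],
}
print(cluster_profiles(profiles))  # [['g1', 'g2'], ['g3', 'g4']]
```

Correlation compares the shape of the profiles rather than their absolute level, so g2 clusters with g1 even though it is expressed at roughly twice the level.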
26
Chip arrays
Clustering
Clustering of the conditions can suggest two
types of brain tumors (bt). Bi-clustering:
clustering both on the conditions and on the genes.
27
Chip arrays
Applications
Think of increasing the glucose concentration in
an E. coli culture and running a chip array at
each of these concentrations. One can potentially
discover all genes in the glycolysis pathway.
Knocking out a gene -> discover all genes that
interact with it.
28
Chip arrays
Applications
Analyzing expression of genes can help reveal the
gene network of a given organism
29
Chip arrays
Gene network
30
Chip arrays
Classification (clinical)
Do I have a brain tumor?
31
Chip arrays
Presentation: heat map
500 genes from (14 chips of) normal and (32 of)
ischemic human hearts.
32
Chip arrays
From a list of genes for characterization
It is often impossible to make sense of a gene
list just by reading the gene names and functions.
The Gene Ontology (GO) project makes it possible
to find whether these differentially expressed
genes share something in common. GO is a
controlled vocabulary that describes all
annotated genes.
33
Chip arrays
From a list of genes for characterization
One can test whether a GO category, for example
"extracellular", is more prevalent among the
differentially expressed genes relative to its
frequency among all genes.
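Such a comparison is often done with a one-sided hypergeometric (Fisher-type) test: how likely is it to see at least this many category members in the differentially expressed list by chance? A minimal sketch; the function name and the gene counts are made up for illustration:

```python
from math import comb

def enrichment_pvalue(total_genes, genes_in_category,
                      selected_genes, selected_in_category):
    """P(at least `selected_in_category` category genes among the selected).

    Hypergeometric upper tail: the selected genes are treated as a random
    draw without replacement from all genes on the chip.
    """
    upper = min(genes_in_category, selected_genes)
    tail = sum(
        comb(genes_in_category, k)
        * comb(total_genes - genes_in_category, selected_genes - k)
        for k in range(selected_in_category, upper + 1)
    )
    return tail / comb(total_genes, selected_genes)

# Hypothetical chip: 10 genes, 3 annotated "extracellular"; all 3 of the
# 3 differentially expressed genes fall in that category.
print(enrichment_pvalue(10, 3, 3, 3))  # 1/120, about 0.0083
```

Because many GO categories are usually tested at once, the resulting P values also need a multiple-testing correction.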
34
Chip arrays
Using chips to study evolution of expression
It is very problematic to use a human chip to
study gene expression in a gorilla. Observed
differences in expression may reflect true
differences in expression levels. However, they
may also reflect bias introduced by the fact that
many gorilla mRNAs differ in sequence from the
human mRNAs, resulting in different levels of
hybridization.
35
Chip arrays
Using chips to study evolution of expression
36
Chip arrays
Using chips to study evolution of expression
Compared expression levels between humans,
chimpanzees, orangutans, and rhesus monkeys using
specially designed chips for each
genome. Concluded that for most genes the
expression level is conserved among primates
(this is expected since too high or too low
levels should be selected against)
37
Chip arrays
Using chips to study evolution of expression
Found shifts in expression level of TFs specific
to human (TFs are highly expressed in humans
compared to other primates). This supports the
theory that most of the significant differences
between human and chimp are in gene regulation
rather than in protein sequences.
38
Chip arrays
Using chips to study evolution of expression
Genes that are highly expressed are slow evolving
39
Chip arrays
Sequence by hybridization
  • It was thought that the following procedure could
    work for sequencing a genome:
  • Make a chip containing all x-mers (e.g., x = 25)
  • Hybridize a genome to the chip
  • By analyzing all the hybridizations with their
    overlaps, assemble the genome
  • Problem: it doesn't work
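The slides do not say why the approach fails. Two commonly cited difficulties are the sheer number of probes needed and the ambiguity caused by repeats; the sketch below illustrates both, under the simplifying assumption that the chip only reports which x-mers are present (not how many times):

```python
def spectrum(sequence, x):
    """Set of all x-mers present in the sequence (presence only, no counts)."""
    return {sequence[i:i + x] for i in range(len(sequence) - x + 1)}

# Number of distinct probes needed for all 25-mers over A, C, G, T.
print(4 ** 25)  # 1125899906842624

# Two different sequences that light up exactly the same 3-mer probes,
# so the chip readout cannot tell how long the repeat is.
short_repeat = "ACACAC"
long_repeat = "ACACACAC"
print(spectrum(short_repeat, 3) == spectrum(long_repeat, 3))  # True
```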

40
Chip arrays
ChIP-chip
41
Chip arrays
ChIP-chip
ChIP-chip: a method for measuring protein-DNA
interactions. Proteins that bind DNA include:
  • Those responsible for transcription regulation:
    transcription factors (TFs)
  • Replication proteins
  • Histones
42
Chip arrays
ChIP-chip
ChIP-chip: the first "ChIP" stands for Chromatin
ImmunoPrecipitation and the second "chip" stands
for DNA microarrays. The method is used mostly to
detect TF binding sites.
43
Chip arrays
ChIP-chip
  • There must be an antibody to the TF
  • The DNA is broken into fragments
  • The DNA is chemically linked to the TF
  • The DNA-TF complex is precipitated using the Ab
  • The bonds between DNA and TF are removed
  • DNA is determined by hybridization to the DNA chip

44
Chip arrays
ChIP-chip
Control with irrelevant antibodies
Reverse cross-linking and DNA extraction
Amplification and labeling
45
Chip arrays
Tiling arrays
Here the chip array should include not only
protein coding genes but also control regions, or
simply the entire genome
46
Chip arrays
Deep sequencing
Solexa and other methods for deep sequencing.
Today, technologies are being developed that can
sequence a lot of data at an incredible rate. A
variant of the ChIP-chip method is to sequence
rather than hybridize the DNA.
47
Chip arrays
Protein-protein interactions
48
Chip arrays
Protein-protein interactions
Databases of protein-protein interactions: DIP,
IntAct, MINT, MIPS, iHOP
49
Chip arrays
Protein-protein interactions
Protein-protein interactions are fundamental for
functional annotation. If X interacts with Y, and
Y is known to be related to muscle development,
maybe X is also related to muscle development:
guilt by association.
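Guilt by association can be sketched as copying a known annotation to unannotated interaction partners. The protein names and the annotation below mirror the X/Y example; the function name and data layout are hypothetical, and the output is only a set of suggestions to be checked, not established function:

```python
def transfer_annotations(interactions, known):
    """Suggest annotations for unannotated proteins from their partners."""
    suggestions = {}
    for a, b in interactions:
        # Interactions are undirected, so look at each pair both ways.
        for protein, partner in ((a, b), (b, a)):
            if protein not in known and partner in known:
                suggestions.setdefault(protein, set()).add(known[partner])
    return suggestions

interactions = [("X", "Y"), ("Y", "Z"), ("W", "Q")]
known = {"Y": "muscle development"}
print(transfer_annotations(interactions, known))
# {'X': {'muscle development'}, 'Z': {'muscle development'}}
```

W and Q get no suggestion because neither of them has an annotated partner.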