Title: Introduction into Micro Array Analysis
1Introduction into Micro Array Analysis
- Winfried Krueger, Ph. D.
- UCHC M.A.C.
2- Basic Concepts in Molecular Biology
- Introduction into Micro Array Methodology
- Image and Data Analysis
3All properties of a cell/organism are the result
of the action of proteins (biological robots
with a defined function)
4Proteins are heteropolymers composed of
amino acids.
The primary amino acid structure and the assembly
reaction itself determine the higher-order structure
of a protein, which is responsible for its function.
Proteins are synthesized according to a
blueprint stored as genetic information in the
chromosomes.
Introduction
5Chromosomes are Nucleoprotein Complexes
Histones
DNA
6DNA
7From DNA to Protein
DNA
G A C T G A C A G T T G
G A T A C G A T
nucleus
Transcription
mRNA
Processing/Export
Cytoplasm
tRNA
Translation
Aminoacids
Ribosome
Protein (polypeptide)
8The Genetic Code Encoding the Protein Building
Blocks
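As a minimal sketch of the path from DNA to protein, the following Python snippet transcribes a coding-strand DNA sequence into mRNA and translates it with the standard genetic code. The function names (`transcribe`, `translate`) and the example sequence are illustrative assumptions and do not come from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T replaced by U)."""
    return coding_dna.upper().replace(" ", "").replace("T", "U")

def translate(mrna):
    """Translate mRNA codon by codon until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATG GAC TGA")  # hypothetical example sequence
print(mrna, translate(mrna))
```

The one-letter amino acid string encodes the 64 codons in a fixed order, which keeps the table compact; processing and export of the mRNA are not modeled here.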
9Genes and Genome
The genome represents all the
DNA contained in an organism or a cell. Genes
are the functional and physical units of heredity
passed from parent to offspring. Genes are
pieces of DNA, and most genes contain the
information for making a specific protein.
10There are three billion (3 x 10^9) base pairs (bp)
of DNA in the human genome
Chromosome    DNA amount (bp)
1             257 million
6             179 million
13            116 million
22            48 million
X             152 million
11Organization of a Eucaryotic Gene
Gene
12The Genome Projects
- Genome projects are currently under way for
several model organisms, including the human and
the mouse. They are international
research projects to
- Completely determine the linear order of the
nucleotide building blocks (sequencing of the
genomic DNA)
- Map each gene within the genome of the particular
species
- The genome sizes range from 6 x 10^6 bp (E. coli)
to 3 x 10^9 bp (human and mouse), with 3,000
(E. coli) to 70,000 (estimated, human) genes.
There are 23 pairs of chromosomes in humans, and current
data for chromosome 22 identify so far only
761 genes.
13The EST and Unigene Concepts
The EST and Unigene concepts represent attempts
to determine, through sequencing, all genes in an
organism. Gene loci can be determined
1) Bioinformatically: requires knowledge of
identification rules
2) By sequence analysis of mRNA
Not all genes are expressed in every tissue/cell;
genes are expressed in a tissue/cell-specific manner.
Expressed Sequence Tags (EST): analysis of expressed
sequences from mRNAs isolated from all cells/tissues
through partial sequencing.
Sequence analysis of mRNA uses reverse
transcription into cDNA, and this process leads
to sequences starting at multiple random
positions within the mRNA.
Unigene Cluster: ESTs that can be allocated to a
single gene.
14UniGene Concept and Microarrays
- Genes are represented in an EST (expressed
sequence tag) clone library through multiple
orthogonal (non-overlapping) and non-orthogonal
(overlapping) EST clones
- Unigene concept assigns a Cluster ID to each
group of linked ESTs
- Unigene database also contains expression,
genomic and species information
- permits the generation of prearrayed EST
libraries
- with non-redundant representation of genes
Attention: Unigene assignment is generally in
flux through continued updating of genetic
information
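The assignment of a Cluster ID to each group of linked ESTs can be pictured as a connected-components problem. The sketch below is only an illustration under stated assumptions: pairwise links between ESTs (for example from a sequence similarity search) are taken as given, and the EST names and the `Cluster.N` ID format are hypothetical. It is not the actual Unigene build procedure.

```python
def cluster_ests(ests, links):
    """Group ESTs so that two ESTs share a cluster if a chain of links connects them."""
    parent = {est: est for est in ests}

    def find(x):
        # Follow parent pointers to the cluster representative (with path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for est in ests:
        clusters.setdefault(find(est), []).append(est)
    # Hypothetical cluster IDs, numbered in order of first appearance
    return {f"Cluster.{i}": members
            for i, members in enumerate(clusters.values(), start=1)}

ests = ["est1", "est2", "est3", "est4", "est5"]
links = [("est1", "est2"), ("est2", "est3"), ("est4", "est5")]
print(cluster_ests(ests, links))
```

Here `est1` and `est3` are not linked directly but still end up in one cluster through `est2`, which mirrors how non-overlapping ESTs of the same gene can be joined through overlapping ones.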
15Unigene Clonesets
Tissue specific cDNA Library
Sequencing
Robotic Clone selection
Pre-arrayed Unigene cDNA Library
DB curation
16Identification of an Unigene Cluster
EST linkage is accomplished through BLAST
analysis of the ESTs against reference
databases, e.g.
EST DB
ESTs
Non Orthogonal ESTs
Genbank
Orthogonal ESTs
Ensembl
RefSeq
Unigene DB
17- Basic Concepts in Molecular Biology
- Introduction into Micro Array Methodology
- Image and Data Analysis
18Introduction
- Microarray technology is a high-throughput
method to simultaneously assess the expression
state of a large number of genes
- gene expression/protein profiling
- Technology is based on immobilization of probes
with known identity onto a substrate in known
locations
- Measures the steady-state levels of mRNA in cell
populations during their transition from one
biological state to a new one
- through hybridization of a labeled copy from the
mRNA targets
- Large number of data points obtained requires
advanced statistical approaches for validation
and pattern recognition
19Micro Arrays and the Physiological State of the
Cell
Micro array analysis
Micro array analysis captures the cellular
transcriptional state
20Microarrays measure Gene Expression Levels
Fold Expression Levels
Conditions/time points/etc
21Methodologies relevant to Micro Array Analysis
Exp. Systems
Tissue specific cDNA Library
Custom/Commercial micro arrays
cDNA synthesis
Tissue banks
Data Interpretation
Array Analysis
RNA isolation
Image Analysis
Data Analysis
Class. Pattern Recognition (hierarchical clustering, SOMs,
neural network analysis, SVMs, PCA)
Adv. Pattern Recognition (annotation-based clustering)
Gene Ontology
Pathway Association
CGAP
CGH
Homologene
Epidemiology
Molecular Pathology
Toxicology
Modif. after D.Botwell
22Micro Array Types
DNA array methodology is an RNA-based technology
in which either
- arrays of cDNAs/Expressed Sequence Tags (ESTs), or
- arrays of oligonucleotides, or
- arrays of selected genomic DNA (known genomes)
are utilized for gene expression profiling, SNP
analysis or site recognition
23Types of robotic cDNA Arrayers
Pin Technology
Quill Pen Technology
Ink jet Technology
Pin Ring Technology
24Types of robotic Oligonucleotide Arrayers
Photolithography (Affymetrix)
Ink Jet Technology (Agilent)
25Probe preparation via PCR
Bacterial Suspension
Plasmid DNA
P1/P2
20-40x
Purification
26DNA Substrate Interactions
27- Target preparation
- Most genes are expressed ubiquitously at low
levels, and only a few genes are expressed in a
cell-specific, regulated manner
- Microarrays measure gene expression profiles of
large numbers of genes but not of marker genes
- Increasing heterogeneity of a cell population
decreases the sensitivity of array-derived
expression profiles
- Increased need for highly purified cell
populations
- 1) FACS-based purification
- Cell-specific markers amenable to fluorescent
labeling
- GFP-based sorting of cells from transgenic
animals
- 2) Laser capture microdissection
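Why heterogeneity lowers sensitivity can be shown with simple arithmetic. The sketch below assumes, purely for illustration, that every cell contributes the same amount of mRNA and that the gene is unchanged (fold 1.0) in the cells that do not respond; the function name and the numbers are hypothetical.

```python
def observed_fold_change(fraction_responding, fold_in_responders):
    """Fold change measured on a mixed population.

    Assumes equal mRNA yield per cell and unchanged expression (fold 1.0)
    in the non-responding cells.
    """
    if not 0.0 <= fraction_responding <= 1.0:
        raise ValueError("fraction_responding must be between 0 and 1")
    unchanged = (1.0 - fraction_responding) * 1.0
    changed = fraction_responding * fold_in_responders
    return unchanged + changed

# A gene induced 10-fold in the responding cells only
for fraction in (1.0, 0.5, 0.1):
    print(fraction, observed_fold_change(fraction, 10.0))
```

Under these assumptions a 10-fold induction that occurs in only 10% of the cells is measured as roughly 1.9-fold on the whole population, which is why purified cell populations are preferred.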
28Laser Capture Microdissection
29The Hybridization Technique
The hybridization technique is based on the
molecular recognition of each nucleotide via base
pairing
Probe
Target
Hybrid
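Base pairing (A with T, G with C) can be sketched in a few lines of Python. The snippet is a simplification that only accepts perfect, full-length complementarity; real hybridization tolerates some mismatches and depends on temperature and salt conditions. The function names are illustrative, and the probe sequence is the one shown on the "From DNA to Protein" slide.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the strand that base-pairs with seq, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def hybridizes(probe, target):
    """True if target is perfectly complementary to probe over its full length."""
    return target.upper() == reverse_complement(probe)

probe = "GACTGACAGTTG"
target = reverse_complement(probe)
print(target, hybridizes(probe, target))
```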
30Micro Arrays of cDNAs/expressed Sequence Tags
(EST)
fluorescent image 1
robotic arrayer
RNA 1
target 1
target 2
RNA 2
cell type 2
fluorescent image 2
31Standard Target Labeling methods
odT20/Cx-dUTP
ss cDNA
Cx-cDNA
odT20
NH2
NH2
NH2
aa-dUTP
odT20 / aa-dUTP
odT20
T7odT20
P1-odT20
Streptavidin
Tyramide based Labeling
32Acceptor groups for Deoxynucleotides
- amino allyl dUTP
- Fluorescein coupled dUTP (Tyramide based
labeling)
- biotin coupled dUTP (Tyramide based labeling)
- oligonucleotide (Dendrimer)
33Sample Images obtained by members of the M.A.C.
T7 amplified target
TSA Labeling
34Competitive Hybridization of cDNA/EST Micro Arrays
primary images
overlay of normalized fluorescence intensities
expression levels
0.2 cm
cell type 1
cell type 2
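Turning the two normalized fluorescence channels into expression levels can be sketched as a per-spot log2 ratio. The example below uses a simple global normalization (both channels scaled to the same total intensity), which is one common choice and an assumption here, not necessarily the method behind the slide; the intensity values are hypothetical.

```python
import math

def log2_ratios(intensities_1, intensities_2):
    """Per-spot log2(channel 2 / channel 1) after scaling to equal total intensity."""
    # Global normalization factor: assumes most genes do not change
    scale = sum(intensities_1) / sum(intensities_2)
    return [math.log2((i2 * scale) / i1)
            for i1, i2 in zip(intensities_1, intensities_2)]

# Hypothetical background-corrected spot intensities
cell_type_1 = [100.0, 200.0, 400.0, 100.0]
cell_type_2 = [200.0, 400.0, 200.0, 800.0]
for ratio in log2_ratios(cell_type_1, cell_type_2):
    print(round(ratio, 2))
```

A log2 ratio of 0 means equal expression in both cell types, positive values mean higher expression in cell type 2, and negative values mean higher expression in cell type 1.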
35P19 cDNA hybridization to ms 6400 genechip (known
genes)
Chip contains two replicate arrays at 180 µm
inter-spot distance on poly-lysine.
Each array contains 6912 genes printed in duplicate
(13824 features).
288 of the 6912 genes are control genes, printed in
16 clones/384 genes, from Arabidopsis and Aspergillus.
36Quality Control in cDNA Micro Arrays
- Dual probe hybridization with simultaneous
detection of the dyes used
- Arraying of replicates
- Incorporation of genes with constant expression
levels into the micro array
- Arraying of heterologous cDNAs and spiking of the
probes with those cDNAs for positive/negative
controls and for normalization of the
fluorescence intensities
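Two of these controls lend themselves to a small numerical sketch: scaling a channel by the signal of spiked control spots, and checking the agreement of replicate spots. The spot names, values and function names below are hypothetical, and the mean-based scaling is only one possible way to use such controls.

```python
from statistics import mean, stdev

def normalize_to_controls(signals, control_ids):
    """Scale one channel so that the mean signal of the control spots equals 1.0."""
    factor = mean(signals[c] for c in control_ids)
    return {spot: value / factor for spot, value in signals.items()}

def replicate_cv(values):
    """Coefficient of variation (sample stdev / mean) across replicate spots."""
    return stdev(values) / mean(values)

# Hypothetical spot intensities; spike_A and spike_B are heterologous controls
signals = {"spike_A": 500.0, "spike_B": 1500.0, "gene_1": 2000.0, "gene_2": 250.0}
normalized = normalize_to_controls(signals, ["spike_A", "spike_B"])
print(normalized["gene_1"], normalized["gene_2"])

# Two replicate spots of the same gene
print(round(replicate_cv([90.0, 110.0]), 3))
```

A low coefficient of variation between replicate spots suggests consistent printing and hybridization; what counts as acceptable depends on the platform.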
37Micro Array Imaging
- Micro array imaging methods
- Phosphoimaging (33P)
- Fluorescence Imaging
- Confocal
- epifluorescence microscopy
- CCD technology