PPT – Pr PowerPoint presentation | free to view

About This Presentation

Title:

Pr

Slides: 44

Provided by: Vatou

Transcript and Presenter's Notes

Title: Pr

1
(No Transcript)
2
What next?

High quality genome sequencing and annotation
(2003)
Complete sequencing of the genomes of other model
organisms (e.g. mouse)
The next step: Functional Genomics
Determine what our genes do through systematic
studies of function on a large scale
Transcriptomics - Comparative analysis of mRNA
expression /splicing
Proteomics - Comparative analysis of protein
expression and post-translational modifications
Structural genomics - Determine 3-D structures of
key family members
Intervention studies - Effects of inhibiting gene
expression
Comparative genomics - Analysis of DNA sequence
patterns of humans and well-studied model
organisms

3
Beyond Genomics: Systems Biology

Human Genome: 30,000 to 60,000 genes
Human Proteome: 300,000 to 1,200,000 protein
variants
Human Metabolome: metabolic products of the
organism (lipids, carbohydrates, amino acids,
peptides, prostaglandins, etc.)
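
As a rough worked example of the figures above, dividing the estimated number of protein variants by the estimated number of genes gives the implied average number of variants per gene. Pairing the low estimates together and the high estimates together is an assumption made here purely for illustration; the slide does not say how the ranges relate.

```python
# Estimates quoted on the slide (used as given, not updated).
genes_low, genes_high = 30_000, 60_000
variants_low, variants_high = 300_000, 1_200_000


def variants_per_gene(variants, genes):
    """Average number of protein variants implied per gene."""
    return variants / genes


# Assumption: low estimate paired with low, high with high.
ratio_low = variants_per_gene(variants_low, genes_low)
ratio_high = variants_per_gene(variants_high, genes_high)

print(f"Implied protein variants per gene: {ratio_low:.0f} to {ratio_high:.0f}")
```

On these numbers each gene would correspond, on average, to roughly 10 to 20 protein variants, which is why the proteome is treated as a much larger problem than the genome.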

4
Functional Genomics

Whole genome
Once the whole genome is truly known and the
whole genome sequences become available for an
organism, the challenge turns from identifying
parts to understanding function
Functional genomics
The post-genomic era is defined as functional
genomics
Assignation of function to identified genes
Organisation and control of genetic pathways that
come together to make up the physiology of an
organism

5
Functional Genomics

42% of the genes found in the human genome are
of unknown function
Assigning function to these genes requires
systematic high-throughput methods

6
The Periodic Table
Functional grouping of Chemical Elements
7
Biologists' Periodic Table
Genomics

Will not be two-dimensional
Will reflect similarities at diverse levels
Primary DNA sequence in coding and regulatory
regions
Polymorphic variation within a species or
subgroup
Time and place of expression of RNAs during
development, physiological response and disease
Subcellular localisation and intermolecular
interaction of protein products

8
Gene Expression analysis

Array of hope?
Arrays offer hope for global views of
biological processes
Systematic way to study DNA and RNA variation
Standard tool for molecular biology research
and clinical diagnostics
Labelled nucleic acid molecules can be used to
interrogate nucleic acid molecules attached to
solid support (remember Southern Blotting?)
(Refer to January 1999, Nature Genetics
Supplement, Volume 21)

9
Gene Expression analysis

DNA chips: also known as gene chips, biochips or
microarrays. Basically DNA-covered pieces of glass
(or plastic) capable of analysing thousands of
genes simultaneously. They can be high-density
arrays of oligonucleotides or cDNA
Chips allow the monitoring of mRNA expression on
a large scale (i.e. many genes at the same
time)

Pre-1995, Northern blots were used to look at gene
expression
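
The slides do not specify how chip data are analysed. One common summary, assumed here, is the log2 ratio of signal between a test sample and a reference sample for each gene. The gene names, intensities and the 2-fold threshold below are all made up for illustration.

```python
import math


def log2_ratios(sample, reference, floor=1.0):
    """Return log2(sample / reference) per gene for genes present in both dicts.

    Intensities below `floor` are raised to `floor` to avoid taking log of zero.
    """
    ratios = {}
    for gene in sample.keys() & reference.keys():
        s = max(sample[gene], floor)
        r = max(reference[gene], floor)
        ratios[gene] = math.log2(s / r)
    return ratios


def changed_genes(ratios, threshold=1.0):
    """Genes whose absolute log2 ratio is at least `threshold` (2-fold by default)."""
    return sorted(g for g, lr in ratios.items() if abs(lr) >= threshold)


# Hypothetical spot intensities for four genes on one array.
treated = {"geneA": 800.0, "geneB": 100.0, "geneC": 250.0, "geneD": 0.0}
control = {"geneA": 200.0, "geneB": 400.0, "geneC": 240.0, "geneD": 16.0}

ratios = log2_ratios(treated, control)
print(changed_genes(ratios))
```

Working on a log scale makes a 4-fold increase (+2) and a 4-fold decrease (-2) symmetric, which is convenient when thousands of genes are compared at once.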
10
Gene Expression analysis
Incyte
11
Gene Expression analysis
Affymetrix
12
Nanogen_Movie_1
Nanogen_Movie_2
Nanogen_Movie_3
Affymetrix_Movie_3
http://www.learner.org/channel/courses/biology/units/genom/images.html
13
(No Transcript)
14
Determining gene function
sequence homology
sequence motif
tissue distribution
chromosome localisation
function
expression in disease
biochemical assays
proteomics
expression in models
15
Protein synthesis
16
RNA synthesis and processing
17
Alternatively spliced mRNA
18
The transcriptome

DEFINITION
The mRNA collection content, present at any
given moment in a cell or a tissue, and its
behaviour over time and cell states
(Adam Sartel, COMPUGEN).
The complete collection of mRNAs and their
alternative splice forms is sometimes referred to
as the transcriptome. The transcriptome is the
set of instructions for creating all of the
different proteins found in an organism.
(From Genome to Transcriptome, Incyte)

19
Genome, proteome and transcriptome
The Genome (golden path)
- Index to a range of possible proteins
- Useful as a map and for inter-organism
analysis
The Transcriptome
- Proteome information in DNA technology
The Proteome
- Describes what actually happens in the cell
- Complex tools, partial results
20
Use of transcriptome analysis

Discovery of new proteins
that are present in specific tissues
that have specific cell locations
that respond to specific cell states
Discovery of new variants
of important genes
that work to increase/decrease the activity of
the native protein
The transcriptome reflects tissue source (cell
type, organ) and also tissue activity and state,
such as the stage of development, growth and
death, cell cycle, diseased or healthy, response
to therapy or stress.
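
A minimal sketch of how transcriptome data might be screened for transcripts present in specific tissues. The expression table, the transcript and tissue names, and the detection threshold are all hypothetical; real analyses would use replicate measurements and statistical testing.

```python
def tissue_specific(expression, min_level=10.0):
    """Map each transcript detected in exactly one tissue to that tissue.

    `expression` maps transcript -> {tissue: level}. A transcript counts as
    detected in a tissue when its level is at least `min_level`.
    """
    specific = {}
    for transcript, levels in expression.items():
        detected = [tissue for tissue, level in levels.items() if level >= min_level]
        if len(detected) == 1:
            specific[transcript] = detected[0]
    return specific


# Hypothetical expression levels (arbitrary units).
expression = {
    "tx1": {"liver": 120.0, "brain": 2.0, "muscle": 0.0},
    "tx2": {"liver": 50.0, "brain": 60.0, "muscle": 55.0},
    "tx3": {"liver": 1.0, "brain": 0.0, "muscle": 300.0},
    "tx3_variant": {"liver": 0.0, "brain": 45.0, "muscle": 3.0},
}

print(tissue_specific(expression))
```

In this toy table `tx2` is expressed everywhere and is filtered out, while a splice variant (`tx3_variant`) shows a different tissue pattern from its parent transcript.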

21
Beyond genomics: proteomics

Proteomics: where the genome hits the road
Proteomics refers to the simultaneous, large-scale
analysis of all (or many) of the proteins
made in a cell at one time, to get a global
picture of what proteins are made in cells and
when
Hopefully we can then determine the whys, and
what we can do about it: very important for
drug development
The proteome is the protein complement encoded by
a genome; the term was first proposed by an
Australian PhD student, Marc Wilkins, in 1994

22
Beyond the genome: Proteomics

Genomics involves the study of mRNA expression; the
full set of genetic information in an organism
contains the recipes for making proteins
Proteins constitute the bricks and mortar of
cells and do most of the work
Proteins distinguish various types of cells,
since all cells have essentially the same
genome; their differences are dictated by which
genes are active and the corresponding proteins
that are made
Similarly, diseased cells may produce dissimilar
proteins to healthy cells
However, the task of studying proteins is often
more difficult than that of studying genes (e.g.
post-translational modifications can dramatically
alter protein function)

23
Beyond the genome: Proteomics

Identification of all the proteins made in a
given cell, tissue or organism
Identification of the intracellular networks
associated with these proteins
Identification of the precise 3D structure of
relevant proteins to enable researchers to
identify potential drug targets to turn proteins
on or off
Proteomics very much requires a coordinated focus
involving physicists, chemists, biologists and
computer scientists

24
Beyond the genome: Proteomics

Major challenge: how do we go from the treasure
chest of information yielded by genomics to
understanding cellular function?
Genomics based approaches initially use
computer-based similarity searches against
proteins of known function
Results may allow some broad inferences to be
made about possible function
However, a significant percentage (>30%) of the
sequences thus far ascertained seem to code for
proteins that are unrelated at this level to
proteins of known function
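
The similarity-search idea above can be sketched as follows. This is a deliberately simplified stand-in: real searches use alignment tools such as BLAST, whereas this example scores shared 3-letter words (k-mers) with a Jaccard index. The sequences, protein names and the 0.3 cut-off are hypothetical.

```python
def kmers(seq, k=3):
    """Set of all overlapping k-letter words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Shared fraction of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_hit(query, database, k=3, min_score=0.3):
    """Return (name, score) of the most similar known protein.

    Returns (None, score) when nothing reaches `min_score`, i.e. the query
    looks unrelated to every protein of known function in `database`.
    """
    query_words = kmers(query, k)
    best_name, best_score = None, 0.0
    for name, seq in database.items():
        score = jaccard(query_words, kmers(seq, k))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_score:
        return None, best_score
    return best_name, best_score


# Hypothetical database of proteins with known function.
known = {
    "kinase_like": "MGSSKSKPKDPSQRRRSLE",
    "globin_like": "MVLSPADKTNVKAAWGKVGA",
}

related = best_hit("MGSSKSKPKDASQRRRSLE", known)   # one substitution vs kinase_like
orphan = best_hit("WWHHCCYYFFWWHHCC", known)       # shares no words with the database

print(related, orphan)
```

The second query illustrates the problem stated on the slide: a sequence with no detectable similarity to anything of known function yields no inference at all.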

25
Beyond the genome: Proteomics

Beyond the genetic make-up of an individual or
organism, many other factors determine gene and
ultimately protein expression and therefore
affect proteins directly
These include environmental factors such as pH,
hypoxia and drug treatment, to name a few
Examination of the genome alone cannot take into
account complex multigenic processes such as
ageing, stress, disease or the fact that the
cellular phenotype is influenced by the networks
created by interaction between pathways that are
regulated in a coordinated way or that overlap

26
Beyond the genome: Proteomics

Genomic analysis has certainly provided us with
much insight into the possible role of particular
genes in disease
However, proteins are the functional output of the
cell and their dynamic nature in specific
biological contexts is critical
The expression or function of proteins is
modulated at many diverse points from
transcription to post-translation and very little
of this can be predicted from a simple analysis
of nucleic acids alone
There is generally poor correlation between the
abundance of mRNA transcribed from the DNA and
the respective proteins translated from that mRNA
Furthermore, transcript splicing can yield
different protein forms
Proteins can undergo extensive modifications such
as glycosylation, acetylation, and
phosphorylation which can lead to multiple
protein products from the same gene
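
The point about mRNA and protein abundance can be made concrete with a correlation coefficient. The sketch below computes a plain Pearson correlation; the paired abundance values are invented purely to show the calculation and are not measurements.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical abundances for five genes (arbitrary units).
mrna_level = [10.0, 50.0, 200.0, 400.0, 800.0]
protein_level = [900.0, 30.0, 500.0, 100.0, 450.0]

r = pearson(mrna_level, protein_level)
print(f"mRNA vs protein correlation: r = {r:.2f}")
```

A value of r near 1 would mean transcript levels predict protein levels well; values near 0 correspond to the poor correlation described above.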

27
Proteomics Tools

The core methodologies for displaying the
proteome are a combination of advanced separation
techniques principally involving two-dimensional
electrophoresis (2D-GE) and mass spectrometry

http://www.learner.org/channel/courses/biology/units/proteo/images.html
http://www.childrenshospital.org/cfapps/research/data_admin/Site602/mainpageS602P0.html
28
2D-GE basic methodology

Sample (tissue, serum, cell extract) is
solubilized and the proteins are denatured into
polypeptide components
This mixture is separated by isoelectric focusing
(IEF): on application of a current, the
charged polypeptide subunits migrate in a
polyacrylamide gel strip that contains an
immobilized pH gradient until they reach the pH
at which their overall charge is neutral
(isoelectric point or pI), hence producing a gel
strip with distinct protein bands along its
length
This strip is applied to the edge of a
rectangular slab of polyacrylamide gel containing
SDS. The focused polypeptides migrate in an
electric current into the second gel and undergo
separation on the basis of their molecular size
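
The first-dimension separation can be illustrated by estimating a polypeptide's pI from its sequence: find the pH at which the summed charge of its ionizable groups is zero. This is a minimal sketch. The pKa values are approximate textbook-style figures chosen here as an assumption (published pKa sets differ), and the two peptides are made up.

```python
# Approximate pKa values (assumed; different published sets give different pIs).
POSITIVE_PKA = {"N_TERM": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"C_TERM": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(seq, ph):
    """Net charge of a polypeptide at a given pH (Henderson-Hasselbalch)."""
    counts = {"N_TERM": 1, "C_TERM": 1}
    for aa in seq:
        counts[aa] = counts.get(aa, 0) + 1
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        charge += counts.get(group, 0) / (1.0 + 10 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        charge -= counts.get(group, 0) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq, tol=0.001):
    """Bisect pH 0-14 for the point where the net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid    # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


basic_peptide = "KKRKHG"    # hypothetical, rich in basic residues
acidic_peptide = "DDEEDG"   # hypothetical, rich in acidic residues

print(round(isoelectric_point(basic_peptide), 2),
      round(isoelectric_point(acidic_peptide), 2))
```

On an immobilized pH gradient the basic peptide would stop migrating near the alkaline end of the strip and the acidic peptide near the acidic end.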

29
2D-GE basic methodology

The resultant gel is stained (Coomassie, silver,
fluorescent stains) and spots are visualized by
eye or an imager. Typically 1000-3000 spots can
be visualized with silver. Complementary
techniques, e.g. immunoblotting, allow greater
sensitivity for specific molecules.
Multiple forms of individual proteins can be
visualized and the particular subset of proteins
examined from the proteome is determined by
factors such as initial solubilization
conditions, pH range of the IPG and gel gradient

30
General schematic of 2D-PAGE for protein
identification in Toxicology
31
General strategy for proteomic analysis
Sample solubilization
Sample growth
Isoelectric focusing (IPG)
2D-PAGE
Immunoblot (Western)
Image analysis
Isolation of spots of interest
Trypsin digestion of proteins
MS analysis of tryptic fragments
Identification of proteins
32
Nature of IPG determines spot location on 2D-PAGE
33
Limitations of 2D-GE

In large-scale proteomic analysis, 2D-GE
has been the major workhorse over the last 20
years: it is unique in being able to
distinguish post-translational modifications and
is analytically quantitative
However, despite the significant improvements
(e.g. immobilized pH gradients) to the technique
and its coupling with MS analysis, it is still
difficult to automate
Although at first glance the resolution of 2D
seems very impressive, it still lags behind the
enormous diversity of proteins and thus
comigrating protein spots are not uncommon
This is especially of concern when trying to
distinguish between highly abundant proteins, e.g.
actin (10^8 molecules/cell), and low-abundance ones
like transcription factors (100-1000 molecules/cell):
this is beyond the dynamic range of 2D
Enrichment or prefractionation can often overcome
such discrepancies
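
The dynamic-range problem is simple arithmetic, sketched below. It assumes the actin figure is read as 10^8 molecules per cell and takes the two ends of the transcription-factor range quoted above.

```python
import math


def orders_of_magnitude(high_copies, low_copies):
    """Number of powers of ten separating two copy numbers per cell."""
    return math.log10(high_copies / low_copies)


actin_copies = 1e8                    # assumed reading of the slide's figure
tf_copies_low, tf_copies_high = 100, 1000

span_min = orders_of_magnitude(actin_copies, tf_copies_high)
span_max = orders_of_magnitude(actin_copies, tf_copies_low)

print(f"Abundance gap: {span_min:.0f} to {span_max:.0f} orders of magnitude")
```

A gel would therefore need to display spots differing 100,000-fold to 1,000,000-fold in amount at the same time, which is why enrichment or prefractionation is used.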

34
Limitations of 2D-GE

Chemical heterogeneity of proteins also presents
a major limitation
Thus the full range of pIs and MWs of proteins
exceeds what can routinely be analyzed on 2D-GE.
However, improvements to IPGs are expected to
overcome some of these constraints and greatly
improve the coverage of the entire proteome of
the cell
Problems linked with extraction and solubilization
of proteins prior to 2D-GE present an even
greater challenge, especially for extremely
hydrophobic proteins, such as membrane and
nuclear proteins. Again, recent advances in buffer
composition have diminished the scale of this
problem

35
Differential Gel Electrophoresis (DiGE)
36
Protein identification and characterization

Specialized imaging software allows for a more
detailed analysis of spot identification and
comparison between gels, and treatments
By a process of subtraction, differences (e.g.
presence, absence, or intensity of proteins or
different forms) between healthy and diseased
samples can be revealed
Cross-references to protein databases allow
assignment by known pIs and apparent molecular
size. Ultimate protein identification requires
spot digestion (enzymatic) and analysis of charge
and mass by mass spectrometry (MS)
Spot cutter tools can be coupled to image
analysis tools, and in-gel tryptic digestion
techniques in 96- or 384-well format can greatly
reduce the bottleneck in sample identification
by MS
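
The "process of subtraction" between gels can be sketched as a comparison of matched spot intensities. This is an illustrative simplification of what imaging software does: it assumes spots have already been detected and matched across gels. The spot IDs, intensities and 2-fold threshold are hypothetical.

```python
def compare_gels(healthy, diseased, fold=2.0):
    """Classify each spot by how it differs between two matched gels.

    `healthy` and `diseased` map spot ID -> normalized intensity; a spot
    missing from a dict is treated as intensity 0 (not detected).
    """
    result = {}
    for spot in sorted(healthy.keys() | diseased.keys()):
        h = healthy.get(spot, 0.0)
        d = diseased.get(spot, 0.0)
        if h == 0.0 and d > 0.0:
            result[spot] = "present only in diseased"
        elif d == 0.0 and h > 0.0:
            result[spot] = "absent in diseased"
        elif h > 0.0 and d / h >= fold:
            result[spot] = "increased"
        elif h > 0.0 and d / h <= 1.0 / fold:
            result[spot] = "decreased"
        else:
            result[spot] = "unchanged"
    return result


# Hypothetical normalized spot intensities.
healthy_gel = {"s1": 100.0, "s2": 50.0, "s3": 80.0, "s5": 30.0}
diseased_gel = {"s1": 105.0, "s2": 200.0, "s3": 20.0, "s4": 40.0}

for spot, call in compare_gels(healthy_gel, diseased_gel).items():
    print(spot, call)
```

Only the spots flagged as changed would then be cut out and sent on for digestion and MS, which keeps the identification workload manageable.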

37
Protein analysis by MS

Compared to sequencing, MS is more sensitive
(femtomole to attomole concentrations) and is
higher throughput
Digestion of an excised spot with trypsin results in
a mixture of peptides. These are ionized by
electrospray ionization from the liquid state or
matrix-assisted laser desorption ionization
(MALDI) from the solid state, and the mass of the
ions is measured by various coupled analyzers (e.g.
time of flight (TOF), which measures the time for
ions to travel from the source to the detector),
resulting in a peptide fingerprint
The resultant signature is compared with the
peptide masses predicted from theoretical
digestion of protein sequences found in
databases: identification of the protein!
Tandem MS allows one to obtain actual protein
sequence information: discrete peptide ions can be
selected and further fragmented, and complex
algorithms employed to correlate experimental data
with database-derived peptide sequences
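
Peptide mass fingerprinting as described above can be sketched in a few steps: digest each database sequence in silico with trypsin rules, compute the peptide masses, and count how many observed masses match. This is a minimal illustration, not a real search engine. The two database sequences, the 0.2 Da tolerance and the simple match-count score are assumptions; masses are neutral monoisotopic values, whereas an instrument would report ions such as [M+H]+.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(seq):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint_score(observed, seq, tol=0.2):
    """Number of observed masses within `tol` Da of a theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
    return sum(1 for m in observed if any(abs(m - t) <= tol for t in theoretical))


def identify(observed, database, tol=0.2):
    """Return the best-scoring database entry and all scores."""
    scores = {name: fingerprint_score(observed, seq, tol)
              for name, seq in database.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-entry sequence database.
database = {
    "protA": "MKWVTFISLLLLFSSAYSRGVFRR",
    "protB": "GLSDGEWQLVLNVWGKVEADIPGHGQEVLIR",
}

# Simulated measurement: the tryptic peptide masses of protA.
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]

best, scores = identify(observed, database)
print(best, scores)
```

Real search tools also account for missed cleavages, modifications and mass accuracy, and report a statistical score rather than a raw count.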

38
MS analysis
39
MS analysis
40
Antibody arrays
Good for low-abundance proteins
Problem is antibody specificity
41
Protein microarrays
42
Caveats

The technology of proteomics is not as mature as
genomics, owing to the lack of amplification
schemes akin to PCR. Only proteins from a natural
source can be analyzed
The complexities of the proteome arise because
most proteins seem to be processed and modified
in complex ways and can be the products of
differential splicing
In addition, protein abundance spans a range
estimated to be 5 to 6 orders of magnitude in
yeast and 10 orders of magnitude in humans.