Day 7: Using genomics to predict new pathways - PowerPoint PPT Presentation

Provided by: huynenm
1
Day 7: Using genomics to predict new pathways
2
  • Genome sequences
  • Allowing us to interpret the function of proteins
    within the context in which they occur
  • Reverse this process: predict the function of a
    protein from the context in which it tends to
    occur → prediction of protein function/pathways
    from genome sequences

3
What on earth does the 2-ketoglutarate:ferredoxin
oxidoreductase do in P. abyssi when there are no
connecting enzymes of the citric acid cycle?
4
2-ketoglutarate likely derived from glutamate
5
Succinyl-CoA can be broken down via
methylmalonyl-CoA
6
Instead of interpreting, actually predicting
protein function using genomic association
[Figure: nucleoside salvage pathway. Compound labels:
deoxycytidine; deoxyuridine, deoxythymidine; purine
deoxyribonucleosides; deoxyribose-1-P; deoxyribose-5-P;
glyceraldehyde-3-P, acetaldehyde. Enzyme labels: Cdd,
DeoA, DeoD, deoB (marked "?"), deoC.]
M. genitalium, M. tuberculosis: deoD deoC deoA cdd pmm

7
  • The prediction that the pmm gene encodes a protein
    that (also) functions as a phosphoribomutase is
    based on:
  • Genomic association (operon) with genes involved
    in the nucleoside salvage pathway.
  • Conservation of this association among distantly
    related species.
  • Substrate specificity is less conserved than
    catalytic function: the mutase function is
    conserved, while the substrate specificity has
    changed from mannose/glucose to ribose.
  • A phosphoribose mutase is required, and otherwise
    absent from the genome.
  • Such predictions of course have to be confirmed
    by experimental research.

8
Define "distantly related species".
Remember the rapid shuffling of genomes (compared
to 16S rRNA identity).
9
Variation in genome rearrangements depends
on the relative direction of transcription →
hints at the operon organization of genes in
prokaryotes
10
Besides the theoretical argument (the proteins
are not only encoded in the same operon, but
this organization is also conserved in
evolution), we also need experimental benchmarks
(compare: protein sequence similarity →
homology, benchmarked via structure).
Dandekar, Snel, Huynen and Bork, TIBS 1998:
"Conservation of gene order: a fingerprint of
proteins that physically interact"
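The idea of conserved gene order can be sketched in a few lines of Python. This is only an illustration, not the benchmark from the cited paper: the three toy genomes, the placeholder genes geneX and geneY, and the helper names `adjacent_same_strand_pairs` and `conserved_pairs` are all hypothetical. It also simply counts genomes, whereas the slides stress that the genomes should be distantly related.

```python
from collections import defaultdict

# Hypothetical toy genomes: each is an ordered list of (gene, strand).
# Gene names stand for orthologous groups; the arrangements are invented.
GENOMES = {
    "genome_A": [("trpA", "+"), ("trpB", "+"), ("geneX", "-"),
                 ("dnaK", "+"), ("dnaJ", "+")],
    "genome_B": [("dnaJ", "-"), ("dnaK", "-"), ("geneY", "+"),
                 ("trpB", "+"), ("trpA", "+")],
    "genome_C": [("trpB", "+"), ("geneX", "+"), ("trpA", "-"),
                 ("dnaK", "+"), ("geneY", "-")],
}


def adjacent_same_strand_pairs(genome):
    """Return unordered pairs of neighbouring genes on the same strand."""
    pairs = set()
    for (gene1, strand1), (gene2, strand2) in zip(genome, genome[1:]):
        if strand1 == strand2:  # same direction of transcription
            pairs.add(frozenset((gene1, gene2)))
    return pairs


def conserved_pairs(genomes, min_genomes=2):
    """Count in how many genomes each pair is adjacent; keep frequent ones."""
    counts = defaultdict(int)
    for genome in genomes.values():
        for pair in adjacent_same_strand_pairs(genome):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_genomes}


if __name__ == "__main__":
    for pair, n in conserved_pairs(GENOMES).items():
        print(sorted(pair), "adjacent in", n, "genomes")
```

Requiring the same strand mirrors the observation that the relative direction of transcription matters for operon membership. A real analysis would also need orthology assignment and a correction for phylogenetic distance, both of which are left out here.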
11
Benchmarking
12
Conservation of the tryptophan synthesis operon
among the compared genomes
13
Types of Genomic Association for the Prediction
of Functional Interaction
  • I. Gene fusion/fission
  • II. Conservation of gene order (operons)
  • III. Co-occurrence of genes in genomes
  • IV. Shared regulatory elements
  • V. Coexpression data

14
Gene fission in the evolution of carbamoyl
phosphate synthase B (carB)
15
Predicting functional interactions between
proteins by the co-occurrence of their genes in
genomes.
Distribution of four M. genitalium genes among 25
genomes (1 = present, 0 = absent):
MG299 (pta)   0 0 0 1 1 0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 0 1 1 1 1
MG357 (ackA)  0 0 0 1 1 0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 0 1 1 1 1
MG019 (dnaJ)  0 0 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 0 0 1 1 1 1 1 1
MG305 (dnaK)  0 0 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 0 0 1 1 1 1 1 1
Using the mutual information between genes as a
scoring heuristic for their co-occurrence:
M(pta, ackA) = 0.69 (phosphotransacetylase, acetate kinase)
M(dnaJ, dnaK) = 0.55 (heat shock proteins)
M(dnaJ, ackA) = 0.19
16
Entropy and mutual information
$H(i) = -\sum_i P_i \log P_i$
$H(j) = -\sum_j P_j \log P_j$
$H(i,j) = -\sum_{i,j} P_{i,j} \log P_{i,j}$
Entropy (H) is the disorder of the system. It
is maximal when all states occur with equal
frequency and minimal when one state dominates the
distribution. In terms of the distribution of
genes, it is maximal when a gene occurs in 50% of
the genomes.
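As a small worked example of the statement above: for a gene that is present in a fraction P of the genomes there are only two states, so the entropy reduces to the binary form below. The natural logarithm is assumed here; the slides do not state the base.

```latex
H(P) = -P \log P - (1 - P) \log (1 - P)

% Maximum at P = 0.5:
H(0.5) = -2 \cdot 0.5 \log 0.5 = \log 2 \approx 0.693

% Minimum when one state dominates:
\lim_{P \to 0} H(P) = \lim_{P \to 1} H(P) = 0
```

With a different logarithm base only the scale changes (for base 2 the maximum is exactly 1 bit); the location of the maximum at P = 0.5 stays the same.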
$M(i,j) = H(i) + H(j) - H(i,j)$
Mutual information (M) is the sum of the
individual entropies minus the combined entropy.
It is maximal when the individual entropies are
maximal (P = 0.5) and the combined entropy is
minimal (of the four possibilities, 0 0, 0 1, 1 0
and 1 1, only two are occupied: either 0 0 and
1 1, or 0 1 and 1 0).
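The definitions above can be checked against the presence/absence profiles from slide 15 with a short Python sketch. The profiles are copied from that slide; the function names `entropy` and `mutual_information` are illustrative, and the natural logarithm is an assumption (the slides do not name the base, but this choice reproduces the quoted scores of 0.69, 0.55 and 0.19 after rounding).

```python
from collections import Counter
from math import log

# Presence (1) / absence (0) of four M. genitalium genes in 25 genomes,
# copied from slide 15.
PROFILES = {
    "pta":  "0001100001101011000101111",
    "ackA": "0001100001101011000101111",
    "dnaJ": "0011111101111011100111111",
    "dnaK": "0011111101111011100111111",
}


def entropy(states):
    """H = -sum_s P_s log P_s over the observed states (natural log assumed)."""
    n = len(states)
    return -sum((c / n) * log(c / n) for c in Counter(states).values())


def mutual_information(profile_a, profile_b):
    """M(i,j) = H(i) + H(j) - H(i,j) for two equally long profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    joint = list(zip(profile_a, profile_b))  # joint states such as ('1', '0')
    return entropy(profile_a) + entropy(profile_b) - entropy(joint)


if __name__ == "__main__":
    for a, b in [("pta", "ackA"), ("dnaJ", "dnaK"), ("dnaJ", "ackA")]:
        score = mutual_information(PROFILES[a], PROFILES[b])
        print(f"M({a}, {b}) = {score:.2f}")
```

Because pta and ackA have identical profiles, their joint entropy equals each individual entropy, so M is simply H(pta). The same holds for dnaJ and dnaK, whose score is lower because those genes are present in most genomes and therefore have lower entropy.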
17
Applicability of using genomic context
information for M. genitalium genes
(480 genes in total)
  • Gene order: 215
  • Fusion: 27
  • Co-occurrence: 54
18
Selectivity of Genomic Context for function
prediction
19
Correlation between the strength of the genomic
and functional associations (operon)
20
Correlation between the strength of the genomic
and functional associations (fusion)
21
Correlation between the strength of the genomic
and functional associations (co-occurrence)
22
Genomic context vs. homology-based function
prediction in M. genitalium
[Figure values: Context 238; Homology 368; 21; 26.
Annotation: added info from genomic context]
23
Combining homology information with genomic
association for function prediction
Repeated occurrence of MG009, which encodes a
phosphohydrolase and is one of the most
widespread enzymes on earth, with thymidylate
kinase (tmk) suggests a role of MG009 in
pyrimidine metabolism.
24
Conservation of gene order of the hypothetical
gene MG134 with dnaX and recR suggests physical
interaction between their gene products
25
From pairwise interactions to functional modules,
pathways
26
The first iteration starting from trpB in
M. jannaschii (MJ1038) retrieves trpA (MJ1037),
with which trpB physically interacts
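Going from pairwise links to a module by iterating from a seed gene can be sketched as a breadth-first expansion. This is a minimal illustration, not the procedure used for the slides: only the MJ1038 (trpB) to MJ1037 (trpA) link comes from the slide above, while geneC, geneD, geneE and the function name `expand` are hypothetical placeholders.

```python
# Hypothetical table of genomic-context links (undirected).
# Only the MJ1038 <-> MJ1037 link is taken from the slide.
LINKS = {
    "MJ1038": {"MJ1037"},
    "MJ1037": {"MJ1038", "geneC"},
    "geneC": {"MJ1037", "geneD"},
    "geneD": {"geneC"},
    "geneE": set(),  # unconnected gene, never retrieved
}


def expand(seed, links, max_iterations):
    """Return {gene: iteration at which it was first retrieved}."""
    found = {seed: 0}
    frontier = [seed]
    for iteration in range(1, max_iterations + 1):
        next_frontier = []
        for gene in frontier:
            for partner in sorted(links.get(gene, ())):
                if partner not in found:
                    found[partner] = iteration
                    next_frontier.append(partner)
        if not next_frontier:  # nothing new: the module is complete
            break
        frontier = next_frontier
    return found


if __name__ == "__main__":
    print(expand("MJ1038", LINKS, max_iterations=1))
    print(expand("MJ1038", LINKS, max_iterations=5))
```

Recording the iteration number keeps track of how indirect each association is; genes retrieved in later iterations are linked to the seed only through intermediates.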
27
(No Transcript)
28
Genomic context indicates a link between the
shikimate and tryptophan synthesis pathways
[Figure: network of genes linked by genomic context.
Gene labels: tyrA, aroB, asd, truA, aroE, aroC, hemK,
hyp, trpF, trpC, trpE, trpG, trpA, trpD, trpB, hyp,
2c-rr. Group labels: Shikimate pathway; Tryptophan
synthesis pathway.]
29
Modular gain and loss of genes in the Pyrococci
30
Enzymes that are encoded in conserved operons and
that are lost/gained together catalyze reactions
that are closer in metabolic space than enzymes
that are encoded in conserved operons but are not
gained/lost together
31
Limited Relevance of Gene Order for Functional
Interaction in Eukaryotes
  • Operons in nematodes
  • Gene-order conservation of co-expressed genes
    between the fungi C. albicans and S. cerevisiae

32
Divergently transcribed, co-regulated gene pairs
tend to be conserved between S. cerevisiae and
C. albicans
33
Finding Interaction Partners for a Human Disease
Gene: frataxin
  • Friedreich's ataxia
  • No (homolog with) known function
  • No gene fusion or gene order conservation

34
(No Transcript)
35
[Figure: Ancestor Proteobacteria, drawn along a time
axis. Gene labels: fdx, IscS, IscU, IscR, RnaM.]
36
The mitochondrial HSP70 protein that is involved
in iron-sulfur cluster (isc) assembly in yeast is
derived from DnaK, rather than from HscA (the
proteobacterial isc HSP70), indicating a
paralogous switch in isc assembly from the
proteobacteria to the eukaryotes.
37
A comparative genome analysis based system view
of iron-sulfur cluster assembly
[Figure labels: Isa1,2p; IscR; Nfu1p; Nfs1p; Isu1p;
Ssq1p, Jac1p; EC2524]
38
Mitochondrial iron-sulfur assembly
[Figure labels: Arh1/fpr; Atm1; Cys; Ala; S; Fe; e-;
fdx; NifS; NifU; 2Fe2S; e.g. fdx, Complex I;
HscA/SSQ1, HscB; frataxin ?]
39
Large-scale (omics) experimental approaches to
physical interaction
  • 2-hybrid
  • Co-precipitation, mass spectrometry
40
Further Reading
  • Genomic context: Huynen M, Snel B, Lathe W 3rd,
    Bork P. (2000) Predicting protein function by
    genomic context: quantitative evaluation and
    qualitative inferences. Genome Res.
    10(8):1204-10.
  • Genomic context: Gabaldon T, Huynen MA. (2004)
    Prediction of protein function and pathways in
    the genome era. Cell Mol Life Sci. 61:930-44.