Gipsi LimaMendez - PowerPoint PPT Presentation

1
The ACLAME database of Mobile Genetic Elements
and associated tools for in silico analysis of
(pro)phages
Service de Bioinformatique des Génomes et des
Réseaux (BiGRe) Université Libre de Bruxelles,
Bvd du Triomphe, 1050 Bruxelles
Gipsi Lima-Mendez gipsi_at_scmbb.ulb.ac.be
Raphaël Leplae raphael_at_scmbb.ulb.ac.be
Ariane Toussaint ariane_at_scmbb.ulb.ac.be
2
ACLAME project (1) (A CLAssification of Mobile
Elements)
The ACLAME project aims at offering a repository
for the collection and analysis of prokaryotic
Mobile Genetic Elements (MGEs) i.e. phages,
plasmids and all MGEs that reside integrated in
their host genome, from IS sequences (see also
the IS-Finder database, http://www-is.biotoul.fr).
The general schema of the ACLAME database rests on
the mosaic structure of MGEs. Elements in different
traditional classes perform similar functions using
similar proteins. Hence, ACLAME aims at describing
MGEs as composites of functional modules.
3
ACLAME project (2)
  • The next slide provides a few examples:
  • To move from one position to another on their
    host genome, most IS sequences and the mutator
    phages use a DDE transposase, which recognizes,
    binds and cleaves the ends of those sequences.
    The transposase gene and the target sites (called
    IR in IS and att in phages) form a module (in
    yellow).
  • Similarly, integrons, conjugative transposons and
    phages (the latter not shown here) integrate and
    excise by means of an integrase of the
    tyrosine-based families acting on att sites (in
    green).
  • Conjugative transposons and conjugative
    plasmids share conjugation machinery (including
    the type IV secretion-related mating pair
    formation apparatus, in red).

4
Modular structure of MGEs
[Figure: gene maps of six elements, with shared modules colour-coded.
  • IS: tnpA, IRL, IRR
  • Integron: attI, int
  • Integron with cassette: attI, int, Pant, ant (cassette)
  • Plasmid: tnpA, Pant, ant, orfA, orfB
  • Mutator phage (Mu): attL, attR, c, ner, A, B, late genes,
    terminal repeats (L1 L2 L3, R3 R2 R1)
  • Conjugative transposon or ICE (Tn4371): int, attL, attR,
    13 bph genes (biphenyl degradation), tnpA, parA, parB, repA,
    traF, traG, traR, trbB, trbC, trbE, trbF, trbG, trbI, trbJ, trbL
    and ORFs labeled with RO identifiers (e.g. RO00024, RO00055);
    the type IV secretion system is highlighted.]
5
ACLAME project (3)
The problem of classifying MGEs thus moves from
the daunting task of deciding fixed categories
for combinatorial elements, to that of
identifying and classifying their constituent
modules. The next slide illustrates a vision of
what such modules could be, in terms of individual
MGE proteins organized into families of related
proteins that act within a complex/functional
module (phage heads and tails, conjugation/secretion
apparatus etc.), various MGEs being
assortments of those modules.
6
Basic ACLAME concept (Merlin et al. 2000)
Reconstruction of various bacterial MGEs
Proteins
7
The next slides illustrate how MGE protein
families are assembled and analyzed within the
ACLAME schema.
1- MGE (so far phage and plasmid) protein sequences
are extracted from the NCBI database and are
- compared all vs. all using Blastp, which provides
  a matrix of pairwise similarity scores;
- all compared to the protein sequences in the
  NRDB-NCBI, Swiss-Prot and SCOP databases using
  PSI-BLAST.
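The all-vs-all comparison can be pictured with a minimal sketch that turns Blastp hits into a symmetric table of pairwise scores. The tuple layout, the function name `similarity_matrix` and the 1e-3 E-value cutoff are assumptions made for illustration; the slides do not state how ACLAME stores or filters its scores.

```python
from collections import defaultdict

def similarity_matrix(hits, max_evalue=1e-3):
    """Build a symmetric pairwise score table from all-vs-all Blastp hits.

    `hits` is an iterable of (query_id, subject_id, evalue, bit_score)
    tuples, e.g. parsed from tabular Blastp output. The cutoff value is
    illustrative, not the one used by ACLAME.
    """
    matrix = defaultdict(dict)
    for query, subject, evalue, score in hits:
        if query == subject or evalue > max_evalue:
            continue  # skip self-hits and hits above the E-value cutoff
        # Keep the best score seen for the pair, in both directions.
        best = max(score, matrix[query].get(subject, 0.0))
        matrix[query][subject] = best
        matrix[subject][query] = best
    return {protein: dict(row) for protein, row in matrix.items()}

# Hypothetical hits: p1 and p2 are similar, p2 and p3 are not.
hits = [
    ("p1", "p1", 0.0, 500.0),
    ("p1", "p2", 1e-40, 180.0),
    ("p2", "p1", 1e-38, 175.0),
    ("p2", "p3", 0.5, 20.0),
]
matrix = similarity_matrix(hits)
print(matrix)
```

Keeping the best score per pair is one simple way to make the table symmetric; other choices (averaging both directions, using E-values instead of bit scores) are equally plausible.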
8
Generating protein families
[Diagram: MGE proteins, all-vs-all comparison, protein clustering]
9
2- The similarity matrix is used for clustering
with the TRIBE-MCL clustering algorithm (Enright
et al. 2002), with E-value threshold and
inflation values that were shown to best
reproduce the SCOP protein families (clusters,
Leplae et al. 2004) and the IS sequence families
(IS-Finder database, Siguier et al. 2006).
3- Multiple sequence alignments (MSA) are generated
for all protein families of 3 or more members.
4- The MSA is used to generate an HMM profile for
each family.
5- The HMM profiles are compared with protein
sequences in NRDB, Swiss-Prot and SCOP.
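To give a feel for step 2, here is a toy pure-Python version of Markov clustering (alternating expansion and inflation of a column-stochastic matrix). It is only a sketch of the general technique: ACLAME uses the TRIBE-MCL software itself, and the function names, the default inflation of 2.0, the iteration count and the self-loop weighting below are all assumptions.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m

def mcl(similarity, inflation=2.0, iterations=30, eps=1e-6):
    """Toy Markov clustering over a dict-of-dicts similarity table."""
    nodes = sorted(set(similarity) | {b for row in similarity.values() for b in row})
    n = len(nodes)
    index = {node: i for i, node in enumerate(nodes)}
    m = [[0.0] * n for _ in range(n)]
    for a, row in similarity.items():
        for b, score in row.items():
            m[index[a]][index[b]] = m[index[b]][index[a]] = float(score)
    for i in range(n):
        m[i][i] = max(max(m[i]), 1.0)  # self-loops (assumed weighting)
    normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix (flow spreads along paths).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: element-wise power, then renormalize columns.
        m = [[value ** inflation for value in row] for row in m]
        normalize_columns(m)
    # Each row that still holds flow defines one cluster (its columns).
    clusters = set()
    for i in range(n):
        members = frozenset(nodes[j] for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)

# Hypothetical scores: two unconnected groups of proteins.
scores = {
    "p1": {"p2": 100, "p3": 100},
    "p2": {"p1": 100, "p3": 100},
    "p3": {"p1": 100, "p2": 100},
    "p4": {"p5": 80},
    "p5": {"p4": 80},
}
families = mcl(scores)
print(families)
```

A higher inflation value tends to split the graph into more, smaller families, which is why the slide mentions tuning it against reference classifications.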
10
Generating protein families
[Diagram: MGE proteins, all-vs-all comparison, protein clustering, MSA, HMM profile]
11
6- All that information and additional
experimental evidence available in the literature
are used to assign a function to the families.
For the purpose of this functional annotation,
a list of functions has been assembled and is
progressively implemented into a structured
ontology based on the Gene Ontology (GO,
http://www.geneontology.org) format (see more
about the PhiGO ontology below).
12
Generating protein families
[Diagram: MGE proteins, all-vs-all comparison, protein clustering, MSA, HMM profile, functional annotation, ACLAME classification]
13
ACLAME is a relational database. It contains a
number of linked tables. Each table can be browsed
and it is possible to navigate between the tables:
  • MGE genomes, linked to NCBI
  • MGE hosts
  • MGE proteins
  • Families with a functional annotation
  • ACLAME list of functions
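As an illustration of what "linked tables" means here, the sketch below builds a small in-memory SQLite database and navigates from a protein to its genome, host and family function. The table names, column names and rows are entirely hypothetical placeholders; the actual ACLAME schema is not given in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one table per entity listed on the slide.
CREATE TABLE host (host_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE genome (
    genome_id INTEGER PRIMARY KEY, name TEXT, ncbi_accession TEXT,
    host_id INTEGER REFERENCES host(host_id));
CREATE TABLE mge_function (function_id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE family (
    family_id INTEGER PRIMARY KEY,
    function_id INTEGER REFERENCES mge_function(function_id));
CREATE TABLE protein (
    protein_id INTEGER PRIMARY KEY, name TEXT,
    genome_id INTEGER REFERENCES genome(genome_id),
    family_id INTEGER REFERENCES family(family_id));

-- Placeholder rows, not real ACLAME content.
INSERT INTO host VALUES (1, 'example_host');
INSERT INTO genome VALUES (1, 'example_phage', 'EXAMPLE_ACC', 1);
INSERT INTO mge_function VALUES (1, 'example function');
INSERT INTO family VALUES (1, 1);
INSERT INTO protein VALUES (1, 'example_protein', 1, 1);
""")

# Navigate between tables: protein -> genome -> host, protein -> family -> function.
rows = conn.execute("""
SELECT protein.name, genome.name, host.name, mge_function.label
FROM protein
JOIN genome ON genome.genome_id = protein.genome_id
JOIN host ON host.host_id = genome.host_id
JOIN family ON family.family_id = protein.family_id
JOIN mge_function ON mge_function.function_id = family.function_id
""").fetchall()
print(rows)
```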
14
View of protein families in ACLAME version 0.2.
Hits of HMM in databases
Click to view family
15
View of one phage protein family in ACLAME
version 0.2.
View MSA of the family
ACLAME function
Link to GO ontology
View Hits in databases
Click to View protein
View hits of HMM in databases
16
View of one phage protein in ACLAME version 0.2.
ACLAME function
View secondary structure prediction
Link to NCBI
Back to family view
17
View of genomes list in ACLAME version 0.2
go to genome view
18
One genome view
go to protein view
go to family view
go to NCBI
19
Blast over the ACLAME content
Access to Blastp of ACLAME content
20
View of the ACLAME Blastp output
Go to best hit protein view
Go to Family of best hit protein view
Query has no significant similarity with ACLAME
content
21
  • ACLAME TOOLS (1)
  • PhiGO ontology for annotation of phage proteins.

22
The PhiGO Phage Ontology.
  • Structured list of terms that should capture
    everything that's known about phage gene products
    in terms of
  • Molecular functions
  • Biological processes
  • Components

To fit the Gene Ontology (Harris et al. 2004),
PhiGO is in the OBO format and formalized as a
Directed Acyclic Graph (DAG) where nodes are
terms and edges are the type of relationship (is-a
or part-of) that relates them.
23
Acyclic graph of the term "viral genome
replication" as it presently stands in GO
[Figure: DAG whose edges are labeled is_a or part_of, linking the
terms biological process (GO:0008150), reproduction (GO:0000003),
reproductive process (GO:0022414), viral reproduction (GO:0016032),
viral reproduction process (GO:0022415), viral infectious cycle
(GO:0019058) and viral genome replication (GO:0019079).]
The term labeling a node refers to this node and
all of its children.
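The idea that a term stands for itself and all of its children can be sketched with a small DAG traversal. The term names come from the slide, but the specific edges below are assumptions chosen for illustration and are not claimed to match GO or PhiGO.

```python
# child -> list of (relationship, parent). Edges are illustrative assumptions.
EDGES = {
    "viral genome replication": [("part_of", "viral infectious cycle")],
    "viral infectious cycle": [("part_of", "viral reproduction")],
    "viral reproduction": [("is_a", "reproduction")],
    "reproduction": [("is_a", "biological process")],
    "biological process": [],
}

def ancestors(term, edges):
    """Return every term reachable by following is_a/part_of edges upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in edges.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def descendants(term, edges):
    """Return every term that has `term` among its ancestors."""
    return {child for child in edges if term in ancestors(child, edges)}

# A protein annotated with "viral genome replication" is implicitly
# annotated with all of that term's ancestors.
print(sorted(ancestors("viral genome replication", EDGES)))
print(sorted(descendants("viral reproduction", EDGES)))
```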
24
View of one term and its definition with the AmiGO
viewer: http://aclame.ulb.ac.be/Classification/phage_functions.html
25
ACLAME TOOLS (2) - Prophinder: prediction of
prophages in complete bacterial genome sequences.
26
Prophinder general outline (1)
  • Download all translated CDS (protein) sequences
    of bacterial genomes
  • Compare to phage proteins in ACLAME
  • Encode hits on bacterial genome sequence
  • Walk along that genome with a window of
    adjustable size
  • Use the binomial formula to calculate the probability
    (Pval) of observing at least n hits in a window of size w

- Calculate the significance score:
  - Nb tests = Nb CDS - (w size - 1)
  - Eval = Pval × Nb tests
  - Sig = -log(Eval)
For a window of a given size at a given position
along the genome, search for segments with the best
sig values.
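The window scoring can be sketched as follows, under several stated assumptions: each CDS is reduced to a 0/1 flag (hit or no hit in ACLAME), the per-CDS hit probability is estimated as the genome-wide hit frequency, the number of tests is the number of window positions, and the logarithm is taken in base 10. None of these choices is confirmed by the slides, and the function names are hypothetical.

```python
import math

def prob_at_least(n_hits, window, p_hit):
    """Binomial upper tail: P(X >= n_hits) among `window` CDS."""
    return sum(
        math.comb(window, k) * p_hit ** k * (1 - p_hit) ** (window - k)
        for k in range(n_hits, window + 1)
    )

def scan(hit_flags, window):
    """Slide a window along the CDS list and score each position."""
    n_cds = len(hit_flags)
    p_hit = sum(hit_flags) / n_cds      # assumption: genome-wide hit frequency
    n_tests = n_cds - (window - 1)      # number of window positions
    results = []
    for start in range(n_tests):
        n_hits = sum(hit_flags[start:start + window])
        pval = prob_at_least(n_hits, window, p_hit)
        evalue = pval * n_tests
        sig = -math.log10(evalue)       # assumption: base-10 logarithm
        results.append((start, n_hits, sig))
    return results

# Hypothetical genome of 20 CDS with a run of 5 phage-like CDS.
hit_flags = [0] * 10 + [1] * 5 + [0] * 5
results = scan(hit_flags, window=5)
best = max(results, key=lambda r: r[2])
print(best)
```

In this toy genome the best-scoring window is the one that exactly covers the run of hits; repeating the scan with several window sizes would correspond to the "window of adjustable size" mentioned above.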
27
Prophinder general outline (2)
  • Implementation of biological criteria
    - Presence of an integrase gene at one extremity
    - Detection of short direct flanking repeats
    - No repeat of typical phage genes (e.g. head and
      tail major proteins)

28
Sliding window
Significance Matrix
int
Direct repeat
29
Access to the list of genomes analyzed with
Prophinder
View predictions for that genome
30
View of prediction on the host genome map
31
View of ORFs in prediction
Hit in ACLAME
No hit in ACLAME
Link to best hit in ACLAME
32
Heatmap view of ACLAME hits
33
ACLAME database: http://aclame.ulb.ac.be
ACLAME Prophinder viewer: http://aclame.ulb.ac.be/prophinder
ACLAME list of functions: http://aclame.ulb.ac.be//functions
PhiGO viewer and download of PhiGO flat files with definitions: http://aclame.ulb.ac.be/Classification/phage_functions.html
GO database: http://www.godatabase.org/cgi-bin/amigo/go.cgi/
Download OBO-edit: http://www.godatabase.org/dev/java/oboedit/docs/index.html
34
References
  • Enright, A.J., S. Van Dongen, and C.A. Ouzounis.
    2002. An efficient algorithm for large-scale
    detection of protein families. Nucleic Acids Res
    30: 1575-84.
  • Harris, M.A. et al. 2004. The Gene Ontology (GO)
    database and informatics resource. Nucleic Acids
    Res 32: D258-261.
  • Leplae, R., A. Hebrant, S.J. Wodak, and A.
    Toussaint. 2004. ACLAME: a CLAssification of
    Mobile genetic Elements. Nucleic Acids Res 32
    (Database issue): D45-49.
  • Merlin, C., J. Mahillon, J. Nesvera, and A.
    Toussaint. 2000. Gene recruiters and
    transporters: the modular structure of bacterial
    mobile elements. In The horizontal gene pool:
    bacterial plasmids and gene spread (ed. C.M.
    Thomas), pp. 363-409. Harwood Academic
    Publishers, Amsterdam.
  • Siguier, P., J. Perochon, L. Lestrade, J.
    Mahillon, and M. Chandler. 2006. ISfinder: the
    reference centre for bacterial insertion
    sequences. Nucleic Acids Res 34: D32-36.