Hybrid Protein Model (HPM) - PowerPoint PPT Presentation

Slides: 18
Provided by: dsimbI
Tags: hpm | helix | hybrid | model | protein

Transcript and Presenter's Notes

1
Hybrid Protein Model (HPM): A Method for
Building a Library of Overlapping Local
Structural Prototypes. Sensitivity Study and
Improvements of the Training.
C. Benros, A.G. de Brevern and Pr. S. Hazout
Equipe de Bioinformatique Génomique et
Moléculaire, INSERM E0346, Université Paris 7,
case 7113, 2, place Jussieu, 75251 Paris, France.
2
Structural Genomics
Amino acid sequence (MSTPALRKR RDKFRMQNR DPNPTSPAN)
Protein 3D structure
Support of the biological function(s)
Limits of the experimental methods of protein
structure determination: X-ray crystallography,
NMR spectroscopy
In silico prediction
The amino acid sequence specifies the 3D
structure.
3
Global Protein Structure Prediction
Sequence identity (%)
100 to 30: Comparative Modelling (Baker & Sali, 2001)
30 to 25: Threading (Xu et al., 2001)
Below 25: Ab Initio (Bonneau et al., 2002)
4
Protein Structure Description
  • 3-state secondary structures: α-helix, β-strand, coil (50 %)
  • N-state description

(Unger et al., 1989; Rooman et al., 1990;
Schuchhardt et al., 1996; Fetrow et al., 1997;
Bystroff & Baker, 1998; Camproux et al., 1999;
de Brevern et al., 2000)
Specific sequence-structure relationship at a
local level
5
Goal of the study
Compression of a non-redundant protein structure
databank
→ Building of a library of overlapping local
structural prototypes
Hybrid Protein Model (HPM)
  • Method
  • A self-organizing linear neural network that
    carries out a structural alignment of protein
    fragments by stacking.
  • Definition of dependent structural prototypes
    ensuring the continuity of the local folds.
  • Studied aspects

(i) Sensitivity study
(ii) Improvement of the training: introduction
of gaps
6
Structural Alphabet
  • 16 Protein Blocks (PBs), labeled from a to p.
  • Each PB: 5 successive Cα, defined by
    8 dihedral angles (φ, ψ).
  • Specific PBs
  • PB d: regular β-strand
  • PB m: regular α-helix
  • Accurate 3D structure approximation
  • rmsd < 0.42 Å

Protein encoding
(de Brevern et al., 2000)
3D structure → 1D representation (e.g. fklmmmmmmmn)
- Databank of 675 protein structures → 144,229
PBs
- Overlapping protein fragments of 7 consecutive
PBs (L = 7): 139,503 fragments
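The cutting of an encoded protein into overlapping fragments can be sketched as a sliding window. This is a minimal illustration, not the authors' code: the function name `overlapping_fragments` is hypothetical, and the input is the PB string shown on the slide.

```python
def overlapping_fragments(pb_sequence: str, length: int = 7) -> list[str]:
    """Return every window of `length` consecutive PBs, shifted by one PB."""
    if length < 1:
        raise ValueError("length must be positive")
    # A sequence shorter than `length` yields no fragment at all.
    return [pb_sequence[i:i + length]
            for i in range(len(pb_sequence) - length + 1)]


# PB string shown on the slide; each letter (a to p) is one Protein Block.
encoded = "fklmmmmmmmn"
fragments = overlapping_fragments(encoded)
for fragment in fragments:
    print(fragment)
```

A sequence of n PBs gives n - L + 1 fragments, and two successive fragments share L - 1 PBs.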
7
Hybrid Protein Model
→ Ring of N neurons
(i) A neuron is defined by L PB
distributions.
(ii) Two consecutive neurons (s and s+1) share (L-1)
distributions.
→ Matrix N x 16
(iii) The HPM is characterized by N PB
distributions.
(iv) 1 neuron ⇔ 1 structural prototype ⇔
cluster of fragments sharing similar
local structure.
[Figure: ring of neurons numbered 1 to N; HPM matrix with PBs a to p as rows and sites 1 to N as columns, showing a window of length L from s-w to s+w.]
Structural prototypes or neurons
Information sharing → Continuity
(de Brevern & Hazout, 2001, 2003)
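Properties (i) to (iii) can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `uniform_hpm` and `neuron_sites` are hypothetical, sites are numbered from 0 rather than 1, and a neuron is assumed to cover the sites s-w to s+w with L = 2w + 1.

```python
PBS = "abcdefghijklmnop"  # the 16 Protein Blocks


def uniform_hpm(n_sites: int) -> list[list[float]]:
    """N x 16 matrix: one PB frequency distribution per HPM site."""
    return [[1.0 / len(PBS)] * len(PBS) for _ in range(n_sites)]


def neuron_sites(s: int, n_sites: int, length: int = 7) -> list[int]:
    """Sites covered by neuron s (assumed window s-w .. s+w, L = 2w + 1).

    Indices wrap around because the neurons form a ring.
    """
    if length % 2 == 0:
        raise ValueError("this sketch assumes an odd window length")
    w = length // 2
    return [(s + offset) % n_sites for offset in range(-w, w + 1)]


hpm = uniform_hpm(120)
first = neuron_sites(0, 120)
second = neuron_sites(1, 120)
shared = set(first) & set(second)
print(first)        # sites of neuron 0, wrapping around the ring
print(len(shared))  # number of distributions shared with neuron 1
```

Because each neuron is a window over the same N x 16 matrix, consecutive neurons share L - 1 distributions by construction, and the whole model needs only N distributions.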
8
Training
  • Identification phase
  • Adequacy score Sc(s)

A fragment of L PBs (e.g. fklmmmm) is compared with the HPM sites s-w to s+w, for each of the N neurons.
Fs(PBk): frequency of PBk in the HPM site s
FR(PBk): frequency of PBk in the databank
[Figure: Sc(s) plotted for s = 1 to N; the maximum Smax is reached at sopt.]
  • Enrichment phase
  • Winning neuron sopt

The PB distributions of the sites sopt-w to sopt+w are updated with the PBs of the fragment (e.g. f, k, l, m), with one rule for each case:
- If PB = PBk
- elsewhere (the 15 other PBs)
  • Training coefficient

α0: initial value
K: control parameter
t: number of fragments presented to the HPM
T: total number of fragments
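The two training phases can be sketched as follows. The slide formulas do not survive in this transcript, so three forms below are assumptions, flagged in the comments: a log-odds adequacy score built from Fs(PBk) and FR(PBk), a training coefficient (written alpha) that decays as alpha0 / (1 + K t / T), and an update that adds alpha to the observed PB before renormalising. All function names are hypothetical.

```python
import math

PBS = "abcdefghijklmnop"  # the 16 Protein Blocks


def adequacy_score(hpm, background, fragment, s):
    """ASSUMED score: sum of ln(Fs(PBk) / FR(PBk)) over the window s-w..s+w."""
    n, w = len(hpm), len(fragment) // 2
    total = 0.0
    for offset, pb in zip(range(-w, w + 1), fragment):
        k = PBS.index(pb)
        total += math.log(hpm[(s + offset) % n][k] / background[k])
    return total


def winning_site(hpm, background, fragment):
    """Identification: the site sopt where the score is maximal (Smax)."""
    return max(range(len(hpm)),
               key=lambda s: adequacy_score(hpm, background, fragment, s))


def learning_coefficient(alpha0, K, t, T):
    """ASSUMED decay: equals alpha0 at t = 0 and decreases as t approaches T."""
    return alpha0 / (1.0 + K * t / T)


def enrich(hpm, fragment, s_opt, alpha):
    """Enrichment (ASSUMED rule): raise the observed PB, lower the 15 others.

    Each updated distribution still sums to 1.
    """
    n, w = len(hpm), len(fragment) // 2
    for offset, pb in zip(range(-w, w + 1), fragment):
        row = hpm[(s_opt + offset) % n]
        observed = PBS.index(pb)
        for k in range(len(PBS)):
            bonus = alpha if k == observed else 0.0
            row[k] = (row[k] + bonus) / (1.0 + alpha)


# Toy run: a ring of 10 sites, uniform at the start (not the real databank).
hpm = [[1.0 / 16] * 16 for _ in range(10)]
background = [1.0 / 16] * 16
alpha = learning_coefficient(alpha0=0.20, K=0.5, t=0, T=1000)
enrich(hpm, "fklmmmm", s_opt=5, alpha=alpha)
print(winning_site(hpm, background, "fklmmmm"))
```

After one enrichment at site 5, the same fragment is identified at site 5 again, which is the stacking behaviour the training relies on.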
9
Final Hybrid Protein Model
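  • Parameters
  • - N = 120 neurons
  • - α0 = 0.20
  • - K = 0.5

[Figure: structural profile of the HPM, PBs a to p (rows, with PB d and PB m marked) against sites 1 to N = 120 (columns).]
- Distribution of the structural fragments
[Plot: number of fragments along the HPM, axis from 0 to 3000.]
- Specificity along the HPM: Neq = exp(H)
Average Neq = 2.19 PBs; range 1.0 to 6.6
[Plot: Neq along the HPM, axis from 2 to 14.]
- Structural stability (rmsd): average rmsd =
0.95 Å; range 0.25 Å to 1.48 Å
[Plot: rmsd (Å) along the HPM, axis from 0.0 to 2.0.]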
10
Final Hybrid Protein Model
Library of overlapping local structural
prototypes
[Figure: HPM structural profile (PBs a to p, sites 1 to N = 120) with prototypes 10, 13, 16 and 19 drawn from N- to C-terminus.]
10 → ddddddd (β-strand, 1.02 Å)
13 → ddddfkl (transition, 0.99 Å)
16 → dfklcfk (1.09 Å)
19 → lcfklmm (0.60 Å)
Fold continuity
11
Final Hybrid Protein Model
[Figure: HPM structural profile (PBs a to p, sites 1 to N = 120) with prototypes 25, 28, 85 and 109 drawn from N- to C-terminus.]
25 and 28 → mmmmmmb and mmmmmno (α-helix
region, 0.82 Å and 0.56 Å)
85 → dfknopa (turn, 1.15 Å)
109 → cdddehj (β-strand region, 1.08 Å)
12
Improvement: Training with structural gaps
  • Main interest

Structural alignment with gaps to take account of
- the variability in length of the regular
secondary structures.
- the heterogeneity of some local structures.
  • Adequacy score calculation

- Gap of length g (0 ≤ g ≤ L-1)
Ex: gap of length 3: fklm---mmm
  • No cost for the gap is introduced
    in the score calculation.
  • Selection rule for the fragments with gap:
    the score without gap is compared with the
    maximal score with a gap g, using a coefficient (< 1).

[Figure: identification of the gapped fragment at sopt along the N neurons, then enrichment of the HPM (PBs a to p).]
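Scoring a fragment with a gap can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, the matrix `log_odds` holds toy per-site scores (+1 for the favoured PB, -1 otherwise), and the gap is assumed to sit at a single position inside the fragment. Only the two points stated on the slide are taken from the source: g ranges from 0 to L-1, and the gap itself costs nothing.

```python
PBS = "abcdefghijklmnop"  # the 16 Protein Blocks


def gapped_score(log_odds, fragment, start, split, gap):
    """Score `fragment` with `gap` empty sites after its first `split` PBs.

    The skipped sites contribute nothing: no cost for the gap.
    """
    n = len(log_odds)
    total = 0.0
    for i, pb in enumerate(fragment):
        site = start + i + (gap if i >= split else 0)
        total += log_odds[site % n][PBS.index(pb)]
    return total


def best_gapped_alignment(log_odds, fragment, start):
    """Try every gap length g (0 <= g <= L-1) and every split point.

    Returns (score, g, split); ties go to the shorter gap.
    """
    L = len(fragment)
    candidates = [(gapped_score(log_odds, fragment, start, split, g), g, split)
                  for g in range(L)
                  for split in range(1, L)]
    return max(candidates, key=lambda c: (c[0], -c[1]))


# Toy HPM region of 10 sites whose favoured PBs spell "fklmabcmmm".
consensus = "fklmabcmmm"
log_odds = [[1.0 if pb == best else -1.0 for pb in PBS] for best in consensus]

fragment = "fklmmmm"
no_gap = gapped_score(log_odds, fragment, start=0, split=1, gap=0)
score, g, split = best_gapped_alignment(log_odds, fragment, start=0)
aligned = fragment[:split] + "-" * g + fragment[split:]
print(no_gap, score, aligned)
```

In this toy case the best alignment has the same shape as the slide example, a gap of length 3 after the first four PBs.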
13
Improvement results (1)
Introduction of gaps in the structural fragments
  • Illustration: structural fragments with
    different C-terminal extremities

[Figure: HPM region around neuron 27 (PBs m, n, o, p, a, c) with the stacked fragments.]
27 without gap:
mmmmmmn mmmmmmn mmmmmmn
27 with gap:
mmmm_ _ _ _ _ccd mmmm_ _ _ _pcc mmmm_ _ _ _pcb
mmmm_ _ _ _pcc
14
Improvement results (2)
  • Neq → specificity improvement

Neq(s) = exp(H(s)),
with H(s) the Shannon entropy in HPM site s
[Plot: specificity along the HPM for fragments with gaps (in black) and without gaps (in white).]
→ Improvement of the structural alignment
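The specificity measure Neq can be computed from any site distribution. A minimal sketch, assuming the entropy is taken with the natural logarithm so that Neq ranges from 1 (a single PB) to 16 (all PBs equally frequent); the function name `neq` is hypothetical.

```python
import math


def neq(distribution):
    """Equivalent number of PBs: exponential of the Shannon entropy (in nats)."""
    # Zero frequencies contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in distribution if p > 0.0)
    return math.exp(entropy)


uniform_site = [1.0 / 16] * 16      # no specificity at all
specific_site = [1.0] + [0.0] * 15  # a single PB observed
print(neq(uniform_site), neq(specific_site))
```

A lower Neq means a more specific site.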
15
Sensitivity study results
  • Optimal HPM size (N, number of neurons)

Compromise between
- Representativeness: for each fragment, only 1 representative, and
- Continuity: sequentiality along the HPM.
→ Choice of N = 120 neurons
  • Learning coefficients α0 and K

A strong initial training (high α0) globally
leads to an improvement of the PB specificity.
Varying the control parameter K brings no
significant improvement of the training.
16
Interests and limitations of HPM
  • Interests

(1) Fast and efficient tool for performing a
multiple structural alignment, with gaps,
of protein fragments.
(2) Basic concept: information sharing →
continuity between the neurons → clustering
into overlapping structural prototypes.
  • Limitations

(1) Determination of the optimal number N of
neurons.
(2) Linearity of the HPM.
17
Perspectives
Analysis of the sequence-structure relationship
Search for structural similarity between proteins
Protein local structure prediction from sequence
  • Strategy developed

[Figure: HPM (PBs a to p, sites 1 to N, window L) used with a scoring matrix and the amino acid target sequence.]
Proposition of long local structural candidates
Ex: candidates proposed for 19GS: 1.33 Å (34 aa), 0.45 Å (26 aa)