Protein Secondary Structures - PowerPoint PPT Presentation

Provided by: cbs6
1
Protein Secondary Structures
  • Assignment and prediction

Pernille Haste Andersen
2
Secondary Structure Elements
β-strand
3
Use of secondary structure
  • Classification of protein structures
  • Definition of loops (active sites)
  • Use in fold recognition methods
  • Improvements of alignments
  • Definition of domain boundaries

4
Classification of secondary structure
  • Defining features
  • Dihedral angles
  • Hydrogen bonds
  • Geometry
  • Assigned manually by crystallographers or
  • Automatic
  • DSSP (Kabsch & Sander, 1983)
  • STRIDE (Frishman & Argos, 1995)
  • DSSPcont (Andersen et al., 2002)

5
Dihedral Angles
6
Helices
                            phi(deg)  psi(deg)  H-bond pattern
------------------------------------------------------------------
right-handed alpha-helix      -57.8     -47.0   i+4
pi-helix                      -57.1     -69.7   i+5
3-10 helix                    -74.0      -4.0   i+3
------------------------------------------------------------------
(omega is 180 deg in all cases)
From http://www.imb-jena.de
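
The ideal angles in the helix table can be used for a rough comparison. The sketch below picks the helix type whose ideal (phi, psi) pair lies closest to a measured pair. It is only an illustration: the function names are made up here, and assignment programs such as DSSP work from hydrogen bond patterns, not from this kind of nearest-angle lookup.

```python
import math

# Ideal backbone angles in degrees, copied from the helix table.
IDEAL_HELICES = {
    "alpha-helix": (-57.8, -47.0),
    "pi-helix": (-57.1, -69.7),
    "3-10 helix": (-74.0, -4.0),
}


def angle_diff(a, b):
    """Smallest signed difference a - b between two angles, in degrees."""
    return (a - b + 180.0) % 360.0 - 180.0


def nearest_helix(phi, psi):
    """Return the helix type whose ideal (phi, psi) is closest to the input."""
    def distance(name):
        ideal_phi, ideal_psi = IDEAL_HELICES[name]
        return math.hypot(angle_diff(phi, ideal_phi), angle_diff(psi, ideal_psi))

    return min(IDEAL_HELICES, key=distance)
```

For example, nearest_helix(-60.0, -45.0) selects the alpha-helix entry, because both angles are within a few degrees of its ideal values.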
7
Beta Strands
Hydrogen bond patterns in beta sheets. Here a
four-stranded beta sheet is drawn schematically
which contains three antiparallel and one
parallel strand. Hydrogen bonds are indicated
with red lines (antiparallel strands) and green
lines (parallel strands) connecting the hydrogen
and receptor oxygen.
8
Secondary Structure Elements
β-strand
9
Secondary Structure Type Descriptions
10
Automatic assignment programs
  • DSSP ( http://www.cmbi.kun.nl/gv/dssp/ )
  • STRIDE ( http://www.hgmp.mrc.ac.uk/Registered/Option/stride.html )
  • DSSPcont ( http://cubic.bioc.columbia.edu/services/DSSPcont/ )

  #  RESIDUE AA STRUCTURE BP1 BP2  ACC     N-H-->O    O-->H-N    N-H-->O    O-->H-N    TCO  KAPPA ALPHA  PHI   PSI    X-CA   Y-CA   Z-CA
    1    4 A E              0   0  205      0, 0.0     2,-0.3     0, 0.0     0, 0.0   0.000 360.0 360.0 360.0 113.5    5.7   42.2   25.1
    2    5 A H        -     0   0  127      2, 0.0     2,-0.4    21, 0.0    21, 0.0  -0.987 360.0-152.8-149.1 154.0    9.4   41.3   24.7
    3    6 A V        -     0   0   66     -2,-0.3    21,-2.6     2, 0.0     2,-0.5  -0.995   4.6-170.2-134.3 126.3   11.5   38.4   23.5
    4    7 A I  E     -A   23   0A 106     -2,-0.4     2,-0.4    19,-0.2    19,-0.2  -0.976  13.9-170.8-114.8 126.6   15.0   37.6   24.5
    5    8 A I  E     -A   22   0A  74     17,-2.8    17,-2.8    -2,-0.5     2,-0.9  -0.972  20.8-158.4-125.4 129.1   16.6   34.9   22.4
    6    9 A Q  E     -A   21   0A  86     -2,-0.4     2,-0.4    15,-0.2    15,-0.2  -0.910  29.5-170.4 -98.9 106.4   19.9   33.0   23.0
    7   10 A A  E     +A   20   0A  18     13,-2.5    13,-2.5    -2,-0.9     2,-0.3  -0.852  11.5 172.8-108.1 141.7   20.7   31.8   19.5
    8   11 A E  E     +A   19   0A  63     -2,-0.4     2,-0.3    11,-0.2    11,-0.2  -0.933   4.4 175.4-139.1 156.9   23.4   29.4   18.4
    9   12 A F  E     -A   18   0A  31      9,-1.5     9,-1.8    -2,-0.3     2,-0.4  -0.967  13.3-160.9-160.6 151.3   24.4   27.6   15.3
   10   13 A Y  E     -A   17   0A  36     -2,-0.3     2,-0.4     7,-0.2     7,-0.2  -0.994  16.5-156.0-136.8 132.1   27.2   25.3   14.1
   11   14 A L  E  >> -A   16   0A  24      5,-3.2     4,-1.7    -2,-0.4     5,-1.3  -0.929  11.7-122.6-120.0 133.5   28.0   24.8   10.4
   12   15 A N  T  45S+     0   0   54     -2,-0.4    -2, 0.0     2,-0.2     0, 0.0  -0.884  84.3   9.0-113.8 150.9   29.7   22.0    8.6
   13   16 A P  T  45S+     0   0  114      0, 0.0    -1,-0.2     0, 0.0    -2, 0.0  -0.963 125.4  60.5 -86.5   8.5   32.0   21.6    6.8
   14   17 A D  T  45S-     0   0   66      2,-0.1    -2,-0.2     1,-0.1     3,-0.1   0.752  89.3-146.2 -64.6 -23.0   33.0   25.2    7.6
   15   18 A Q  T  <5 +     0   0  132     -4,-1.7     2,-0.3     1,-0.2    -3,-0.2   0.936  51.1 134.1  52.9  50.0   33.3   24.2   11.2
   16   19 A S  E   < +A   11   0A  44     -5,-1.3    -5,-3.2     2, 0.0     2,-0.3  -0.877  28.9 174.9-124.8 156.8   32.1   27.7   12.3
   17   20 A G  E     -A   10   0A  28     -2,-0.3     2,-0.3    -7,-0.2    -7,-0.2  -0.893  15.9-146.5-151.0-178.9   29.6   28.7   14.8
   18   21 A E  E     -A    9   0A  14     -9,-1.8    -9,-1.5    -2,-0.3     2,-0.4  -0.979   5.0-169.6-158.6 146.0   28.0   31.5   16.7
   19   22 A F  E     +A    8   0A   3     12,-0.4    12,-2.3    -2,-0.3     2,-0.3  -0.982  27.8 149.2-139.1 120.3   26.5   32.2   20.1
   20   23 A M  E     -AB   7  30A   0    -13,-2.5   -13,-2.5    -2,-0.4     2,-0.4  -0.983  39.7-127.8-152.1 161.6   24.5   35.4   20.6
   21   24 A F  E     -AB   6  29A  45      8,-2.4     7,-2.9    -2,-0.3     8,-1.0  -0.934  23.9-164.1-112.5 137.7   21.7   37.0   22.6
   22   25 A D  E     -AB   5  27A   6    -17,-2.8   -17,-2.8    -2,-0.4     2,-0.5  -0.948   6.9-165.0-123.7 138.3   18.9   38.9   20.8
   23   26 A F  E >  S-AB   4  26A  76      3,-3.5     3,-2.1    -2,-0.4   -19,-0.2  -0.947  78.4 -27.2-127.3 111.5   16.4   41.3   22.3
   24   27 A D  T 3  S-     0   0   74    -21,-2.6   -20,-0.1    -2,-0.5    -1,-0.1   0.904 128.9 -46.6  50.4  45.0   13.4   42.1   20.2
   25   28 A G  T 3  S+     0   0   20    -22,-0.3     2,-0.4     1,-0.2    -1,-0.3   0.291 118.8 109.3  84.7 -11.1   15.4   41.4   17.0
   26   29 A D  E <  S-B   23   0A 114     -3,-2.1    -3,-3.5   109, 0.0     2,-0.3  -0.822  71.8-114.7-103.1 140.3   18.4   43.4   18.1
   27   30 A E  E     -B   22   0A   8     -2,-0.4    -5,-0.3    -5,-0.2     3,-0.1  -0.525  24.9-177.7 -74.1 127.5   21.8   41.8   19.1
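
DSSP residue records are fixed-width, so the fields can be read by slicing each line at known character positions. The sketch below assumes the classic DSSP column layout (amino acid at character 14, secondary structure at character 17, and so on, counting from 1). The function name and the sample line, rebuilt from residue 7 of the listing, are illustrative only.

```python
def parse_dssp_line(line):
    """Slice selected fields out of one fixed-width DSSP residue record."""
    return {
        "seq": int(line[0:5]),         # sequential residue number
        "resnum": int(line[5:10]),     # residue number from the PDB file
        "chain": line[11],
        "aa": line[13],                # one-letter amino acid code
        "ss": line[16],                # secondary structure class, may be blank
        "acc": int(line[34:38]),       # solvent accessibility
        "phi": float(line[103:109]),
        "psi": float(line[109:115]),
    }


# Sample record assembled piecewise so that the column widths stay visible.
SAMPLE = (
    "    4    7 A I  E     -A   23   0A 106 "            # 39 characters
    "    -2,-0.4     2,-0.4    19,-0.2    19,-0.2"       # four H-bond fields
    "  -0.976  13.9-170.8-114.8 126.6"                   # TCO KAPPA ALPHA PHI PSI
    "   15.0   37.6   24.5"                              # CA coordinates
)

record = parse_dssp_line(SAMPLE)
```

Slicing is used instead of splitting on whitespace because neighbouring numeric fields, such as ALPHA and PHI, can run together without a separating space.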
11
Secondary Structure Prediction
DSSP
  • What to predict?
  • All 8 types or pool types into groups

H  alpha helix
G  3-10 helix
I  5-helix (pi helix)
E  extended strand
B  beta-bridge
T  hydrogen-bonded turn
S  bend
C  coil
12
Secondary Structure Prediction
Straight HEC
  • What to predict?
  • All 8 types or pool types into groups

H  alpha helix            -> H
E  extended strand        -> E
T  hydrogen-bonded turn   -> C
S  bend                   -> C
C  coil                   -> C
G  3-10 helix             -> C
I  5-helix (pi helix)     -> C
B  beta-bridge            -> C
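
Pooling the eight DSSP classes into three groups is a simple character mapping. The sketch below assumes the "straight HEC" reading of this slide, in which H stays H, E stays E, and every other class (including a blank) becomes C. The function name is made up for illustration.

```python
def reduce_to_three_states(dssp_string):
    """Map an 8-state DSSP string to 3 states: H stays H, E stays E, rest C."""
    return "".join(c if c in "HE" else "C" for c in dssp_string)


three_state = reduce_to_three_states("HHGGEEBTS C")
```

Other pooling schemes exist (for example, counting G as helix), so the mapping should be stated whenever accuracies are compared.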
13
Secondary Structure Prediction
  • Simple alignments
  • Align to a close homolog for which the structure
    has been experimentally solved.
  • Heuristic Methods (e.g., Chou-Fasman, 1974)
  • Apply scores for each amino acid and sum up over a
    window.
  • Neural Networks (different inputs)
  • Raw Sequence (late 80s)
  • BLOSUM matrix (e.g., PHD, early 90s)
  • Position specific alignment profiles (e.g.,
    PsiPred, late 90s)
  • Multiple networks balloting, probability
    conversion, output expansion (Petersen et al.,
    2000).

14
Improvement of accuracy
15
Simple Alignments
  • Solved structure of a homolog to query is needed
  • Homologous proteins have 88% identical (3-state)
    secondary structure
  • If no close homolog can be identified,
    alignments will give almost random results

16
Amino acid preferences in α-Helix
17
Amino acid preferences in β-Strand
18
Amino acid preferences in coil
19
Chou-Fasman
20
Chou-Fasman
1. Assign all of the residues in the peptide the
   appropriate set of parameters.
2. Scan through the peptide and identify regions where 4 out of 6
   contiguous residues have P(a-helix) > 100. That region is
   declared an alpha-helix. Extend the helix in both directions
   until a set of four contiguous residues that have an average
   P(a-helix) < 100 is reached. That is declared the end of the
   helix. If the segment defined by this procedure is longer than
   5 residues and the average P(a-helix) > P(b-sheet) for that
   segment, the segment can be assigned as a helix.
3. Repeat this procedure to locate all of the helical regions in
   the sequence.
4. Scan through the peptide and identify a region where 3 out of
   5 of the residues have a value of P(b-sheet) > 100. That region
   is declared a beta-sheet. Extend the sheet in both directions
   until a set of four contiguous residues that have an average
   P(b-sheet) < 100 is reached. That is declared the end of the
   beta-sheet. Any segment of the region located by this procedure
   is assigned as a beta-sheet if the average P(b-sheet) > 105 and
   the average P(b-sheet) > P(a-helix) for that region.
5. Any region containing overlapping alpha-helical and beta-sheet
   assignments is taken to be helical if the average
   P(a-helix) > P(b-sheet) for that region. It is a beta-sheet if
   the average P(b-sheet) > P(a-helix) for that region.
6. To identify a bend at residue number j, calculate the value
   p(t) = f(j) f(j+1) f(j+2) f(j+3), where the f(j+1) value is
   used for residue j+1, the f(j+2) value for residue j+2 and the
   f(j+3) value for residue j+3. If (1) p(t) > 0.000075, (2) the
   average value for P(turn) > 1.00 in the tetrapeptide, and
   (3) the averages for the tetrapeptide obey the inequality
   P(a-helix) < P(turn) > P(b-sheet), then a beta-turn is
   predicted at that location.
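
Two of the steps above are easy to sketch in code: the helix nucleation scan of step 2 (4 out of 6 residues with P(a-helix) above 100) and the turn product p(t) of step 6. The function names are made up for illustration, and the parameter tables are toy values, not the published Chou-Fasman parameters. Extension of the nuclei and resolution of overlaps are left out.

```python
def helix_nuclei(seq, p_helix, window=6, min_hits=4, threshold=100):
    """Start indices of windows with at least min_hits residues above threshold."""
    starts = []
    for i in range(len(seq) - window + 1):
        hits = sum(1 for aa in seq[i:i + window] if p_helix[aa] > threshold)
        if hits >= min_hits:
            starts.append(i)
    return starts


def turn_product(seq, j, f_tables):
    """p(t) = f(j) * f(j+1) * f(j+2) * f(j+3), using one table per position."""
    p = 1.0
    for offset, table in enumerate(f_tables):
        p *= table[seq[j + offset]]
    return p


# Toy parameters for illustration only; NOT the published values.
TOY_P_HELIX = {"A": 120, "G": 60}
TOY_F = [{"A": 0.05, "G": 0.1}] * 4  # same toy table reused at all four positions

nuclei = helix_nuclei("AAAAGGG", TOY_P_HELIX)
is_turn_candidate = turn_product("GGGG", 0, TOY_F) > 0.000075
```

With real parameters, each residue would be looked up in a full 20-residue table, and the four bend tables would differ by position.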
21
Chou-Fasman
  • Generally applicable
  • Works for sequences with no solved homologs
  • But the accuracy is low!

22
Neural Networks
  • Benefits
  • Generally applicable
  • Can capture higher order correlations
  • Inputs other than sequence information
  • Drawbacks
  • Needs a lot of data (many different solved structures).
    However, these exist today (nearly 2500
    solved structures with low sequence identity and high
    resolution).
  • Complex method with several pitfalls.

23
Architecture
24
Sparse encoding
Inp Neuron   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
AAcid
A            1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
R            0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
N            0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
D            0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
C            0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Q            0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
E            0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
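
Sparse encoding gives each residue a vector of 20 input values with a single 1 at the position of its amino acid. A minimal sketch follows. The first seven letters of the ordering (A R N D C Q E) match the slide, while the remaining thirteen follow the common BLOSUM ordering and are an assumption. The function name is illustrative.

```python
# A R N D C Q E are taken from the slide; the rest of the order is assumed.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def sparse_encode(sequence):
    """Return one 20-element 0/1 vector per residue of the sequence."""
    vectors = []
    for aa in sequence:
        vec = [0] * len(AMINO_ACIDS)
        vec[AMINO_ACIDS.index(aa)] = 1  # raises ValueError for unknown letters
        vectors.append(vec)
    return vectors


encoded = sparse_encode("ARE")
```

For a network that reads a window of residues, the vectors of all residues in the window would be concatenated into one input layer.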
25
Input Layer
26
BLOSUM 62
27
Input Layer
28
Secondary networks (Structure-to-Structure)
29
PHD method (Rost and Sander)
  • Combine neural networks with sequence profiles
  • 6-8 percentage point increase in prediction
    accuracy over standard neural networks
  • Use a second-layer structure-to-structure network
    to filter predictions
  • Jury of predictors
  • Set up as mail server

30
PSI-Pred (Jones)
  • Use alignments from iterative sequence searches
    (PSI-Blast) as input to a neural network
  • Better predictions due to better sequence
    profiles
  • Available as a stand-alone program and via the web

31
Position specific scoring matrices (PSI-BLAST
profiles)
32
Several different architectures
Output C C H H C C C
Output C C C C C C C
33
Activities to probabilities
34
Balloting procedure
35
Benchmarking secondary structure predictions
  • CASP
  • Critical Assessment of Structure Predictions
  • Sequences from about-to-be-deposited-structures
    are given to groups who submit their predictions
    before the structure is published
  • Every second year
  • EVA
  • Newly solved structures are sent to prediction
    servers.
  • Every week

36
EVA results (Rost et al., 2001)
  • PROFphd 77.0
  • PSIPRED 76.8
  • SAM-T99sec 76.1
  • SSpro 76.0
  • Jpred2 75.5
  • PHD 71.7
  • Cubic.columbia.edu/eva
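
The scores listed above are percentages. Assuming they are per-residue three-state accuracies (often called Q3), which the slide does not state explicitly, such a score can be computed by comparing a predicted string with the observed one position by position. The function name is made up for illustration.

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


score = q3("HHHEEC", "HHHECC")  # five of six positions agree
```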

37
Links to servers
  • Database of links
  • http://mmtsb.scripps.edu/cgi-bin/renderrelres?protmodel
  • ProfPHD
  • http://www.predictprotein.org/
  • PSIPRED
  • http://bioinf.cs.ucl.ac.uk/psipred/
  • JPred
  • http://www.compbio.dundee.ac.uk/www-jpred/

38
Practical Conclusion
  • If you need a secondary structure prediction, use
    one of the newer methods, such as
  • ProfPHD,
  • PSIPRED, and
  • JPred
  • And not one of the older ones such as
  • Chou-Fasman
  • Garnier