Protein Structure Alignment - PowerPoint PPT Presentation

1
Protein Structure Alignment
  • William R. Taylor and Christine A. Orengo
  • J. Mol. Biol. (1989) 208, 1-22

Presented by Jie Bao
Dept of Computer Science, Iowa State University
baojie_at_cs.iastate.edu
http://www.cs.iastate.edu/baojie
2
Abstract
  • Focus of this research: find a good similarity
    score for two residues in the structures being
    compared, and then do the matching
  • The score is based on distance plot analysis
  • Matching is done by dynamic programming
  • Advantages:
  • Insensitive to insertions and deletions
  • No initial seeding is needed
  • Tested on sample structures:
  • Globins, calcium-binding proteins, rhodanese,
    immunoglobulin domains, plastocyanin/azurin,
    lysozyme

3
Outline
  • Problem
  • Methods
  • Result
  • Discussion

4
What's the problem
  • Align two sequences based on their structural
    positions

5
Existing Methods
  • Least-squares: Matthews & Rossmann 1985
  • Limitation 1: the equivalence of positions must be
    established before the superposition is performed
  • Limitation 2: the displacement of a subdomain
    within one structure can result in a poor overall
    fit between topologically equivalent structures
  • Search in rotational space: Rossmann et al.
    1973-1977
  • Comparing all pairs of structural fragments
  • Both approaches are computationally demanding, and
    the latter is sensitive to insertions/deletions

6
Outline
  • Problem
  • Methods
  • Result
  • Discussion

7
Dynamic Programming - Basic sequence alignment
  • Sequence alignment: AADADEFGH / AADCDEAGH
  • Identical pair = 2; insertion gap penalty g = 1;
    window = 4
  • $S_{ij} = D_{ij} + \max\{\, S_{i+1,j+1};\ \max_{k > i+2} (S_{k,j+1} - g);\ \max_{l > j+2} (S_{i+1,l} - g) \,\}$
  • Start from the lower right; end at the upper left
  • The highest score is found and its inheritance
    path is traced back

[Figure: example score matrix with entries of 2 and -1]
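The recurrence on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `align` is hypothetical, the mismatch score of -1 is an assumption suggested by the matrix entries in the figure, and the window of 4 is not implemented. The gap terms here start at k = i+2 and l = j+2, the first offsets that skip a residue; that lower bound is an assumption of this sketch and is one step more permissive than the k > i+2 written on the slide.

```python
def align(seq_a, seq_b, match=2, mismatch=-1, gap=1):
    """Fill the score matrix from the lower right, then trace from the best cell."""
    n, m = len(seq_a), len(seq_b)
    # One extra row and column of zeros so that S[i+1][j+1] always exists.
    S = [[0] * (m + 1) for _ in range(n + 1)]
    nxt = {}  # best successor of each cell, kept for the trace-back
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            d = match if seq_a[i] == seq_b[j] else mismatch
            best, step = S[i + 1][j + 1], (i + 1, j + 1)
            for k in range(i + 2, n):      # skip residues in seq_a
                if S[k][j + 1] - gap > best:
                    best, step = S[k][j + 1] - gap, (k, j + 1)
            for l in range(j + 2, m):      # skip residues in seq_b
                if S[i + 1][l] - gap > best:
                    best, step = S[i + 1][l] - gap, (i + 1, l)
            S[i][j] = d + best
            nxt[i, j] = step
    # The highest-scoring cell starts the inheritance path.
    i, j = max(nxt, key=lambda c: S[c[0]][c[1]])
    score, path = S[i][j], []
    while i < n and j < m:
        path.append((i, j))
        i, j = nxt[i, j]
    return score, path


score, path = align("AADADEFGH", "AADCDEAGH")
print(score, path)
```

For the two example sequences the best path is the ungapped diagonal: seven identical pairs and two mismatches.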
8
Dynamic Programming - Basic interatomic distance
matching (1)
  • Basic interatomic distance matching method
  • Consider only the alpha-carbon atoms of the two
    structures and compare the distances between them
  • Similarity score: $s = a / (|d^A_{ij} - d^B_{kl}| + b)$,
    where $d^A_{ij}$ and $d^B_{kl}$ are interatomic
    distances, a limits the maximum possible score,
    and b prevents division by zero
  • Score between i in A and k in B:
    $S_{ik} = \sum_{m=-n}^{n} a / (|d^A_{i,i+m} - d^B_{k,k+m}| + b)$
  • Based on this score, the standard dynamic
    programming algorithm can be used to align
    positions
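As a rough illustration of the windowed score S_ik, the sketch below sums the distance comparisons over offsets -n to n. The function name `residue_score` is hypothetical, structures are assumed to be lists of alpha-carbon (x, y, z) coordinates, offsets that fall off either chain end are simply skipped (an assumption), and the defaults a = 50 and b = 5 are borrowed from the values quoted later for the environment method.

```python
import math


def residue_score(A, B, i, k, n, a=50.0, b=5.0):
    """Sum of a / (|dA - dB| + b) over offsets m = -n..n around residues i and k."""
    total = 0.0
    for m in range(-n, n + 1):
        # Assumption: offsets that run past either chain end are ignored.
        if 0 <= i + m < len(A) and 0 <= k + m < len(B):
            d_a = math.dist(A[i], A[i + m])   # distance within structure A
            d_b = math.dist(B[k], B[k + m])   # distance within structure B
            total += a / (abs(d_a - d_b) + b)
    return total


# Three collinear alpha carbons, compared with themselves.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
print(residue_score(chain, chain, 1, 1, n=1))
```

Each perfectly matching distance contributes a / b, which is why a bounds the score and b keeps the denominator away from zero.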

9
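Dynamic Programming - Basic interatomic distance
matching (2)
  • Explanation of the position similarity score

[Figure: residue i in structure A and residue k in structure B, each with neighbours from i-n to i+n (k-n to k+n), and the distance d_ij from i to a residue j]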
The overall score is given by the sum of
individual distance comparisons
10
Dynamic Programming - Structural Environment
matching
  • The basic method is adequate for matching local
    structures,
  • but it is disrupted if the range of comparison
    (-n to n) spans an insertion/deletion
    discontinuity
  • Apply DP at the lower level of distance comparison
    to produce the best equivalence of positions between
    two environments:
    $S_{ik} = \max \sum a / (|d^A_{ij} - d^B_{kl}| + b)$,
    summed over the matched pairs (j, l) on the DP path
  • a = 50, b = 5
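A minimal sketch of the lower-level comparison, under stated assumptions: the names `best_path` and `environment_score` are hypothetical, structures are lists of alpha-carbon (x, y, z) coordinates, and the path is not forced through the pair (i, k). The gap penalty of 5 is taken from the later slide on the two levels; whether it applies at this level is an assumption.

```python
import math


def best_path(D, gap):
    """Highest-scoring monotone path through matrix D, filled from the lower right."""
    n, m = len(D), len(D[0])
    S = [[0.0] * (m + 1) for _ in range(n + 1)]
    nxt = {}
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best, step = S[i + 1][j + 1], (i + 1, j + 1)
            for k in range(i + 2, n):
                if S[k][j + 1] - gap > best:
                    best, step = S[k][j + 1] - gap, (k, j + 1)
            for l in range(j + 2, m):
                if S[i + 1][l] - gap > best:
                    best, step = S[i + 1][l] - gap, (i + 1, l)
            S[i][j] = D[i][j] + best
            nxt[i, j] = step
    i, j = max(nxt, key=lambda c: S[c[0]][c[1]])
    score, path = S[i][j], []
    while i < n and j < m:
        path.append((i, j))
        i, j = nxt[i, j]
    return score, path


def environment_score(A, B, i, k, a=50.0, b=5.0, gap=5.0):
    """Compare the distances seen from residue i of A with those seen from k of B."""
    lower = [[a / (abs(math.dist(A[i], A[j]) - math.dist(B[k], B[l])) + b)
              for l in range(len(B))]
             for j in range(len(A))]
    return best_path(lower, gap)


chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(environment_score(chain, chain, 1, 1))
```

Because the equivalence between the two environments is itself found by DP, an insertion or deletion inside the environment costs a gap penalty rather than misaligning every later offset.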

11
Dynamic Programming - Alignment dependency
between levels
  • Every comparison in the previous step produces an
    alignment of matched distances, which is also an
    alignment of the two sequences.
  • The values along the trace-back path in the lower
    level matrix are accumulated in the corresponding
    elements of the higher level matrix
  • Apply a cutoff on S to prevent the excessive
    accumulation of background noise:
    cutoff = $\sqrt{200N}$, where N is the length of
    the shorter sequence; values below the cutoff
    are ignored
  • Advantage: large contributions are made to the
    upper matrix only for regions that match well in
    the lower comparison.
  • Weakness: the method is based solely on
    interatomic distances, so similar distances
    achieve a high score even when they are between
    pairs of atoms that lie in completely
    different relative directions

12
Dynamic Programming - Alignment dependency
between levels (2)
  • The method is used at 2 levels
  • First, to find the best equivalence of distances
    for the 2 residues being compared
  • Then, at a higher level, to find the best
    equivalence of residues within the 2 sequences
  • A gap penalty of 5 is applied
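The two levels can be put together roughly as follows. This is a sketch under assumptions, not the SSAP implementation: `best_path` and `double_dp` are hypothetical names, structures are lists of alpha-carbon (x, y, z) coordinates, the same gap penalty is used at both levels, and the cutoff is left as a plain parameter (default 0.0, meaning no filtering) that is applied to the lower-level path score.

```python
import math


def best_path(D, gap):
    """Highest-scoring monotone path through matrix D, filled from the lower right."""
    n, m = len(D), len(D[0])
    S = [[0.0] * (m + 1) for _ in range(n + 1)]
    nxt = {}
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best, step = S[i + 1][j + 1], (i + 1, j + 1)
            for k in range(i + 2, n):
                if S[k][j + 1] - gap > best:
                    best, step = S[k][j + 1] - gap, (k, j + 1)
            for l in range(j + 2, m):
                if S[i + 1][l] - gap > best:
                    best, step = S[i + 1][l] - gap, (i + 1, l)
            S[i][j] = D[i][j] + best
            nxt[i, j] = step
    i, j = max(nxt, key=lambda c: S[c[0]][c[1]])
    score, path = S[i][j], []
    while i < n and j < m:
        path.append((i, j))
        i, j = nxt[i, j]
    return score, path


def double_dp(A, B, a=50.0, b=5.0, gap=5.0, cutoff=0.0):
    """Accumulate lower-level trace-back paths into an upper matrix, then align it."""
    dA = [[math.dist(p, q) for q in A] for p in A]
    dB = [[math.dist(p, q) for q in B] for p in B]
    upper = [[0.0] * len(B) for _ in A]
    for i in range(len(A)):
        for k in range(len(B)):
            # Lower level: compare the environments of residues i and k.
            lower = [[a / (abs(dA[i][j] - dB[k][l]) + b) for l in range(len(B))]
                     for j in range(len(A))]
            score, path = best_path(lower, gap)
            if score < cutoff:      # assumption: weak comparisons are skipped whole
                continue
            for j, l in path:       # add the values along the trace-back path
                upper[j][l] += lower[j][l]
    # Upper level: align residues using the accumulated evidence.
    return best_path(upper, gap)


coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (8.5, 4.0, 1.0)]
print(double_dp(coords, coords, cutoff=40.0))
```

In the usage line the cutoff is set so that only perfectly matching environments contribute, which leaves the upper matrix non-zero only along the diagonal.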

13
Dynamic Programming - Vector comparison method
  • Solution: compare interatomic vectors rather than
    simple distances
  • The vectors are defined in the local frame of
    reference of each residue.
  • The similarity score is changed to
    $s = a / ((V^A_{ij} - V^B_{kl})^2 + b)$
  • Setting up the X-Y-Z axes:
  • The X-axis is defined by the N-C vector
  • A tentative Y-axis is given by the Cbeta-H vector
  • The Z-axis is their mutual perpendicular vector
  • The Y-axis is redefined as perpendicular to X and Z.

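[Figure: residue i with neighbours from i-n to i+n, showing the distance d_ij and the vector V_ij from i to residue j]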
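The frame construction and the vector score might be sketched as below. All function names are hypothetical, the two input vectors stand in for the N-C and Cbeta-H vectors (their sign conventions are not given on the slide), and the values of a and b in the usage lines are purely illustrative.

```python
import math


def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    norm = math.sqrt(dot(u, u))
    return tuple(p / norm for p in u)


def local_frame(x_vec, y_tentative):
    """X along x_vec, Z perpendicular to X and the tentative Y, Y rebuilt from Z and X."""
    x = unit(x_vec)
    z = unit(cross(x, y_tentative))
    y = cross(z, x)  # unit length already, since z and x are orthonormal
    return x, y, z


def to_local(frame, origin, point):
    """Express the vector origin -> point in the residue's local frame."""
    r = sub(point, origin)
    return tuple(dot(r, axis) for axis in frame)


def vector_score(v_a, v_b, a, b):
    """s = a / (|V_a - V_b|^2 + b)"""
    diff_sq = sum((p - q) ** 2 for p, q in zip(v_a, v_b))
    return a / (diff_sq + b)


frame = local_frame((2.0, 0.0, 0.0), (1.0, 1.0, 0.0))
v = to_local(frame, (1.0, 1.0, 1.0), (2.0, 3.0, 4.0))
print(v, vector_score(v, v, a=40.0, b=2.0))
```

Because each vector is expressed in its own residue's frame, two vectors of equal length that point in different relative directions no longer score as a match.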
14
Dynamic Programming - the final method
  • The vector-based method uses 3-D distances
  • Higher dimensions could also incorporate any data
    that can be defined at the residue level.
  • The nature of the amino acid can be used:
    $S_{ik} = \max \sum (w D_{R_i R_k} + a) / (|d^A_{ij} - d^B_{kl}| + b)$,
    where $D_{XY}$ is the value in the Dayhoff matrix
    for the exchange of amino acids of type X and Y,
    and w is the weight (default 1.0); a = 40, b = 2
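A single term of the final score could be computed as in the sketch below. The name `combined_score` is hypothetical, and `TOY_EXCHANGE` holds made-up placeholder numbers, not real Dayhoff matrix values; a real table would be substituted in practice.

```python
def combined_score(d_a, d_b, res_i, res_k, exchange, w=1.0, a=40.0, b=2.0):
    """One term (w * D[Ri, Rk] + a) / (|dA - dB| + b) of the final score."""
    pair = tuple(sorted((res_i, res_k)))  # the exchange table is symmetric
    return (w * exchange[pair] + a) / (abs(d_a - d_b) + b)


# Placeholder exchange values for illustration only (NOT the Dayhoff matrix).
TOY_EXCHANGE = {("A", "A"): 2.0, ("A", "G"): 1.0, ("G", "G"): 5.0}

print(combined_score(3.8, 3.8, "G", "A", TOY_EXCHANGE))
print(combined_score(6.0, 4.0, "A", "A", TOY_EXCHANGE))
```

Setting w to 0 removes the sequence term and leaves a purely structural score.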

15
Implementation
  • Implemented in SSAP (Structure and Sequence
    Alignment Program)
  • Written in C, run on a VAX-II under VMS
  • A separate FORTRAN program was used to prepare
    the data
  • Data from the PDB

CPU time of run (minutes)
SeqLen  Win  Time
50      20   5.2
50      30   10.3
100     30   36.0
100     60   121.0
150     20   36.2
150     40   130.4
150     60   269.6
16
Outline
  • Problem
  • Methods
  • Result
  • Discussion

17
Data Set
PDB ID used in the paper/ Current ID
18
Results - Globins (MBN, HHB)
4HHB: haemoglobin (alpha-chain, beta-chain)
5MBN: myoglobin
  • Compared with Lesk & Chothia 1980 (LC),
    conventional superposition
  • Result:
  • alpha and beta chains of 4HHB are the most
    similar
  • s(MBN, beta) > s(MBN, alpha)

[Figure: structural comparison of the globins, this work versus LC]
19
Results - Calcium-binding proteins (Parvalbumin,
CIB)
  • Calcium-binding proteins contain two motifs
    (helix pairs): CD and the eponymous EF
  • Parvalbumin also contains the helix pair AB
  • The algorithm aligned the correct ion-binding
    motifs in both structures, ignoring the redundant
    motif in parvalbumin (AB)
  • Compared with Gariepy & Hodges 1983

20
Results - Rhodanese
  • Align the two halves of rhodanese
  • Compared with the least-squares fit of Ploegman
    1978, the result is identical in all but minor
    aspects
  • Graphical inspection suggests the alignment from
    this work is more plausible

1RHD, with two similar alternating beta/alpha-type
domains
21
Results - Immunoglobulin domains
3FAB: antigen-binding fragment
1FC1: constant fragment
22
Results - Immunoglobulin domains
Structure of Immunoglobulin
  • Compares each domain against every other
  • Produces the correct alignment in 9 of 15 comparisons
  • Compared with Lesk & Chothia 1982

The immunoglobulin heavy (H) and light (L) chains
include 6 all-beta domains of two types:
constant (C) and variable (V)
23
Results - Lysozyme
  • Lysozyme (hen egg-white): 6LYZ
  • Lysozyme (T4): 1LZM/2LZM
  • Compared with Rossmann & Argos 1976, 1977 and
    Matthews 1981
  • Found the first common helix to be displaced by
    one residue from both published comparisons
  • Alignment of the initial beta-strand agrees with
    that of R & A 1976
  • The remainder of the alignment is the same for
    both methods, except for trivial displacements at
    the fringes of equivalent blocks

6LYZ
2LZM
24
Results - Plastocyanin/azurin
  • Plastocyanin: 3PCY
  • Azurin: 1AZU
  • Compared with Chothia & Lesk 1982 and Adman 1985
  • Except for some minor insertions and deletions at
    the fringes, the alignment agrees with the others.

3PCY
1AZU
25
Outline
  • Problem
  • Methods
  • Result
  • Discussion

26
Discussion
  • The results demonstrate that the alignments this
    method produces are equal in quality, and in some
    cases superior, to those reported.
  • The method is insensitive to the displacement of
    equivalent substructures
  • Future work: develop statistical criteria for
    evaluating the significance of structural
    comparisons.
  • The method can also be extended beyond the residue
    level, e.g. to secondary structure