A computational study of protein folding pathways

1
A computational study of protein folding pathways
  • Reducing the computational complexity of the
    folding process using the building block folding
    model.
  • Nurit Haspel, Chung-Jung Tsai, Haim Wolfson and
    Ruth Nussinov

2
The building blocks model (Chung-Jung Tsai)
  • Protein folding is a hierarchical process.
  • A protein is constructed from hydrophobic folding
    units (HFUs).
  • HFU - the result of a combinatorial assembly of
    building blocks.
  • Building block - a contiguous, highly populated
    fragment.
  • The building block model allows illustrating the
    protein folding pathway.

3
An outline of the building blocks algorithm
  • Scoring function - measures the relative
    stability of a candidate building block
  • Three ingredients
  • Compactness
  • Degree of isolation
  • Hydrophobicity
  • The result - an anatomy tree that illustrates
    the most probable folding route.

4
The Scoring Function
Z - compactness, H - hydrophobicity, I - isolation
5
Compactness, Hydrophobicity and Isolation
definitions
  • Compactness -
  • Hydrophobicity -
  • Isolation -

6
The Cutting Procedure
  • Locate a basket of candidate building blocks
    (relatively stable contiguous fragments).
  • Assign a stability score to every candidate
    fragment.
  • Collect the local minima in the fragment map
    (best score within a given radius).
  • Recursively split the protein top-down.
  • Search the basket for a set of fragments that
    together constitute the whole fragment, allowing
    a short overlap (7 residues) and a gap of up to
    15 residues.
  • Minimum building block size - 15 residues.
  • No node can have only one child (except for the
    root).
  • Stop when a node cannot be split any further.
  • In this work, building blocks are computed up to
    level 6.
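
The first steps of the cutting procedure can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the fragment map is assumed to be a dictionary from (start, end) residue indices to a stability score where lower means more stable, and the function names, the default radius, and the rule that uncovered ends count as gaps are all hypothetical choices. The root exception to the one-child rule is not handled.

```python
def local_minima(scores, radius=2, min_size=15):
    """Collect fragments whose score is the best within `radius` of both ends.

    scores: dict mapping (start, end) -> stability score.
    Assumption: a lower score means a more stable fragment.
    """
    basket = []
    for (start, end), value in scores.items():
        if end - start + 1 < min_size:
            continue  # below the minimum building block size
        is_minimum = True
        for d_start in range(-radius, radius + 1):
            for d_end in range(-radius, radius + 1):
                neighbour = scores.get((start + d_start, end + d_end))
                if neighbour is not None and neighbour < value:
                    is_minimum = False
        if is_minimum:
            basket.append((start, end))
    return sorted(basket)


def is_valid_split(parent, parts, max_overlap=7, max_gap=15):
    """Check whether `parts` can serve as the children of `parent`."""
    parts = sorted(parts)
    if len(parts) < 2:
        return False  # a node may not have a single child
    p_start, p_end = parent
    if any(s < p_start or e > p_end for s, e in parts):
        return False  # children must lie inside the parent
    # Assumption: uncovered residues at either end count as a gap.
    if parts[0][0] - p_start > max_gap:
        return False
    if p_end - max(e for _, e in parts) > max_gap:
        return False
    for (_, e1), (s2, _) in zip(parts, parts[1:]):
        gap = s2 - e1 - 1  # positive: gap, negative: overlap
        if gap > max_gap or -gap > max_overlap:
            return False
    return True


fragment_map = {(0, 19): -1.0, (1, 19): -3.0, (2, 19): -2.0, (40, 60): -0.5}
basket = local_minima(fragment_map)
ok = is_valid_split((0, 99), [(0, 40), (35, 99)])
```

A full implementation would apply `is_valid_split` recursively to each child until no further split is found.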

7
Example - Annexin III
8
Example (cont.)
9
Example (cont.)
10
Usefulness of the anatomy tree
  • It is possible to see whether a protein folds
    through single or multiple route(s).
  • These routes can be observed by inspecting the
    fragment map (there can be more than one way to
    construct a tree).
  • Sequential versus non-sequential folding.
  • Sequential - contacts are made only between
    consecutive building blocks.
  • Binary anatomy tree - sequential folder.
  • Fast versus slow folding
  • Sequential folding proteins usually fold faster.
  • Climbing up the tree allows us to illustrate the
    folding process.

11
Critical building blocks (Sandeep Kumar)
  • Some building blocks may be considered critical
    for correct folding.
  • A critical building block is in contact with
    other building blocks in the protein.
  • It is likely to be inserted between sequentially
    connected building blocks.
  • Without it, the other building blocks are likely
    to mis-associate.
  • The structure and sequence of a critical building
    block are more likely to be conserved.

12
Critical building block algorithm
  • For each building block
  • Compute its diff. contacting surface area.
  • Compute its critical building block index
    (CIndex).
  • Compute its Z-score.

13
Critical building blocks (cont.)
A building block is critical if
  • It is found at most levels below the hydrophobic
    folding unit level
  • It has a consistently high CIndex at different
    levels
  • Its CIndex is significant by at least 2 standard
    deviations in at least one level of protein
    anatomy
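
The significance criterion (a CIndex at least 2 standard deviations above the mean) can be illustrated with a short sketch. It assumes the Z-score is taken over all building blocks at one level of the anatomy tree and uses the population standard deviation. Both are assumptions, since the slides do not give the exact definition, and the function names and CIndex values below are made up for illustration.

```python
from statistics import mean, pstdev


def z_scores(cindex_by_block):
    """Z-score of each block's CIndex within one level (assumed population)."""
    values = list(cindex_by_block.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All blocks have the same CIndex, so none stands out.
        return {name: 0.0 for name in cindex_by_block}
    return {name: (v - mu) / sigma for name, v in cindex_by_block.items()}


def significant_blocks(cindex_by_block, threshold=2.0):
    """Names of blocks whose CIndex is >= `threshold` std. devs. above the mean."""
    scores = z_scores(cindex_by_block)
    return sorted(name for name, z in scores.items() if z >= threshold)


# Hypothetical CIndex values for six building blocks at one level.
level = {"bb1": 1.0, "bb2": 1.0, "bb3": 1.0, "bb4": 1.0, "bb5": 1.0, "bb6": 10.0}
critical_candidates = significant_blocks(level)
```

This covers only the third criterion. The other two (presence at most levels and a consistently high CIndex) would need the same computation repeated per level.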

14
The goals of my research
  • Clustering the building blocks according to their
    3-D structures, using a rigid matching algorithm.
  • Analyzing the building blocks - sequence,
    stability distribution, size.
  • Analyzing the clusters - size, stability score
    distribution, sequence conservation, criticalness
    conservation.

15
The goals of my research (cont.)
  • Analyzing the critical building blocks - position
    within the protein, relative stability, sequence
    and structure conservation.
  • Developing an algorithm that assigns a set of
    building blocks to a protein sequence, using
    sequence similarity, relative stability and more
    information.

16
Clustering the building blocks
  • Each cluster has representative members (one or
    more)
  • For each building block structure
  • Go over the clusters.
  • Match with cluster representative(s).
  • If it matches (1.5 Å RMSD, 70% size) - join the
    building block to the cluster.
  • If no match is found - open a new cluster with
    this building block as a representative.

Problem - O(n²) comparisons (n - number of clusters)
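
The clustering loop can be sketched as a greedy procedure with one representative per cluster. This is an illustrative sketch only: the real method uses a rigid structural matching algorithm, which is replaced here by `toy_match`, a hypothetical stand-in that compares coordinates position by position without any superposition. The thresholds follow the slide (1.5 Å RMSD, 70% size).

```python
import math


def cluster_blocks(blocks, matches):
    """Greedy clustering: the first member of each cluster is its representative."""
    clusters = []
    for block in blocks:
        for members in clusters:
            if matches(block, members[0]):
                members.append(block)  # join the first matching cluster
                break
        else:
            clusters.append([block])  # no match: open a new cluster
    return clusters


def toy_match(a, b, max_rmsd=1.5, min_size_ratio=0.7):
    """Stand-in for rigid matching (assumption): no superposition is done."""
    n = min(len(a), len(b))
    if n < min_size_ratio * max(len(a), len(b)):
        return False  # the fragments differ too much in size
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / n) <= max_rmsd


# Three toy fragments given as lists of 3-D coordinates.
frag_a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
frag_b = [(0, 0.5, 0), (1, 0.5, 0), (2, 0.5, 0)]
frag_c = [(0, 5, 0), (1, 5, 0), (2, 5, 0)]
clusters = cluster_blocks([frag_a, frag_b, frag_c], toy_match)
```

Each new block is compared with every existing representative in the worst case, which is where the comparison cost noted on the slide comes from.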
17
Clustering of the building blocks
[Figure: Cluster 1, Cluster 2, ..., Cluster n]
18
Making clustering more efficient
  • Dividing the building blocks into SCOP families
    (proteins from the same family usually produce
    the same building blocks).
  • Clustering each family separately and then
    merging all the clusters - reduces the number of
    clusters at each instance.
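
The divide-and-merge idea might look like the following sketch. The way clusters are merged (by comparing their representatives) is an assumption, as are the function names. The toy blocks are (family, value) pairs compared by a simple numeric threshold instead of a structural match.

```python
from collections import defaultdict


def cluster_greedy(items, matches):
    """Greedy clustering with the first member as the representative."""
    clusters = []
    for item in items:
        for members in clusters:
            if matches(item, members[0]):
                members.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def cluster_by_family(blocks, family_of, matches):
    """Cluster each family on its own, then merge clusters across families."""
    by_family = defaultdict(list)
    for block in blocks:
        by_family[family_of(block)].append(block)

    merged = []
    for family in sorted(by_family):
        for members in cluster_greedy(by_family[family], matches):
            # Assumption: two clusters merge when their representatives match.
            for target in merged:
                if matches(members[0], target[0]):
                    target.extend(members)
                    break
            else:
                merged.append(members)
    return merged


# Toy data: (family label, 1-D "structure") pairs, hypothetical values.
blocks = [("a.1", 0.0), ("a.1", 0.5), ("b.2", 1.0), ("b.2", 10.0)]
merged = cluster_by_family(
    blocks,
    family_of=lambda b: b[0],
    matches=lambda x, y: abs(x[1] - y[1]) <= 1.5,
)
```

Within a family most blocks fall into a few clusters, so the merge step compares whole clusters and not individual building blocks.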

19
Building block and cluster data
20
Distribution of number of clusters
21
An example of a cluster
22
Sequence analysis of the clusters
  • Sequence clustering of each structural cluster
    (using BLAST).
  • Creating a non-redundant sequence dataset.
  • Goal - finding a connection between (short)
    sequences and structures.

23
Statistical analysis of the clusters and of the
critical building blocks
  • Stability score distribution among cluster
    members.
  • Criticalness score distribution among cluster
    members.
  • Position distribution of the critical building
    blocks.
  • Stability score as a function of criticalness
    score.

24
An example of stability distribution
25
Criticalness score distribution within a cluster
26
An N-terminus critical building block example
27
A C-terminus critical building block example
28
A mid-sequence critical building block example
29
Distribution of the position inside the protein -
all-alpha, level 3
30
Stability vs. Criticalness score example
31
Stability score of critical and non-critical
building blocks (histogram)
[Histogram series: non-critical, critical]
32
Final goal
  • Given a sequence and using the information
    accumulated so far - is there a way of matching a
    set of building blocks to it?

33
The building block assignment algorithm
  • Perform sequence alignment of the protein
    sequence against the building block sequence
    database.
  • Construct a directed, acyclic graph.
  • Each matching building block is a graph vertex
    and is assigned a score depending on the sequence
    alignment score, building block stability and
    other parameters.
  • Directed edges connect fragments that match
    consecutive regions of the protein sequence,
    allowing short overlaps and small gaps.
  • Edge score - average score of the connected
    vertices.

34
The building block assignment algorithm (cont.)
  • Add fictitious start and target vertices.
  • Connect start to all starting vertices
  • Connect all ending vertices to target.
  • Find the shortest path from start to target using
    a single-source shortest path algorithm.
  • The path is an optimal building block
    assignment covering the protein sequence.
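
The graph construction and path search can be sketched as below. Several details are assumptions not stated in the slides: a higher vertex score is taken to be better, so each edge costs the negated average of its two vertex scores; the fictitious start and target vertices are given a score of 0; and the overlap and gap limits are placeholder values. Because matches are processed in order of start position, the graph is acyclic and a single pass in that order gives the shortest path.

```python
from collections import namedtuple

# name: building block id; start/end: matched residues (0-based, inclusive);
# score: combined alignment/stability score (assumed: higher is better).
Match = namedtuple("Match", "name start end score")


def assign_building_blocks(matches, seq_len, max_overlap=3, max_gap=5):
    """Return the names on the lowest-cost start-to-target path."""
    order = sorted(matches, key=lambda m: (m.start, m.end))  # topological order
    inf = float("inf")
    dist = [inf] * len(order)
    prev = [None] * len(order)
    for j, m in enumerate(order):
        if m.start <= max_gap:
            dist[j] = -(0 + m.score) / 2  # edge from the fictitious start
        for i in range(j):
            p = order[i]
            if dist[i] == inf or not (p.start < m.start and p.end < m.end):
                continue
            gap = m.start - p.end - 1  # negative values are overlaps
            if -max_overlap <= gap <= max_gap:
                cost = dist[i] - (p.score + m.score) / 2
                if cost < dist[j]:
                    dist[j], prev[j] = cost, i
    # Ending vertices connect to the fictitious target vertex.
    ends = [j for j, m in enumerate(order)
            if dist[j] < inf and seq_len - 1 - m.end <= max_gap]
    if not ends:
        return []  # no assignment covers the sequence
    j = min(ends, key=lambda k: dist[k] - order[k].score / 2)
    path = []
    while j is not None:
        path.append(order[j].name)
        j = prev[j]
    return path[::-1]


# Hypothetical matches against a 60-residue sequence.
hits = [
    Match("A", 0, 19, 5.0),
    Match("B", 18, 39, 4.0),
    Match("C", 20, 39, 1.0),
    Match("D", 40, 59, 6.0),
    Match("E", 0, 59, 3.0),
]
assignment = assign_building_blocks(hits, seq_len=60)
```

With these assumptions the cost of a path equals the negated sum of its vertex scores, so the shortest path is the chain of compatible matches with the highest total score.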

35
Illustration of the algorithm
36
Example - ROP protein from E. coli (1rpo)
37
Example - Myoglobin from sea hare (1mba)
38
Suggestions for future work
  • Improving the algorithm and adding new parameters
    to it (secondary structure alignment, trying
    other building blocks from the same cluster as
    the matching building blocks etc.).
  • Combinatorial assembly - Yuval's work.
  • Further cluster analysis - inquiring into
    sequence conservation.
  • Conformation stability measurements (molecular
    dynamics)

39
Conclusions
  • Using the hierarchical folding model, it may be
    possible to reduce the folding complexity,
    assigning local substructures and then assembling
    them.