Yang Zhang - PowerPoint PPT Presentation

Slides: 17
Provided by: geneK8

Transcript and Presenter's Notes

Title: Yang Zhang


1
The protein structure prediction problem could be
solved using the current PDB library
  • Yang Zhang and Jeffrey Skolnick
  • PNAS, vol. 102 (25 Jan. 2005)

THE 8TH PROTEIN FOLDING SCHOOL
2
Objective
  • To examine:
  • 1. whether all single-domain proteins are foldable
    based on the set of solved structures currently
    deposited in PDB
  • 2. whether the templates can be further improved
    by rearranging the fragments (TASSER)

3
PDB information
Database: > 23,000 solved protein structures
(December 30, 2003), with 300 new entries added
each month. New fold entries keep decreasing
(e.g., the percentage of new folds fell from 27% in 1995
to 5% in 2001).
4
Our experimental Method
  • 1. Template identification
  • - SAL (Needleman-Wunsch dynamic programming, global
    alignment)
  • score(i, j) = 20 / (1 + d_ij^3 / 5)
  • 2. Force Field construction
  • Cα and side-chain group (SG) regularities/correlations
    from the statistics of the PDB
  • propensities for predicted secondary structure
    from PSIPRED
  • tertiary consensus contact/distance restraints
  • a protein-specific SG pair potential, both
    extracted from the identified multiple templates.
  • 3. Structure assembly
  • full-length models are constructed by assembling
    the continuous fragments from the templates,
    guided by the optimized force field
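
The template identification step above can be illustrated with a minimal sketch of Needleman-Wunsch global alignment driven by a distance-based score of the form shown on the slide. This is not the SAL program itself. The exponent in the score is garbled in the transcript, so it is left as a parameter here; the gap penalty, the function names, and the assumption that d_ij is the distance in Å between residue i of the target and residue j of the template after some initial superposition are all illustrative choices, not taken from the slides.

```python
from typing import List, Tuple


def pair_score(d: float, exponent: int = 3) -> float:
    """Similarity of one residue pair at distance d (angstroms).

    Form read from the slide: 20 / (1 + d**exponent / 5).
    The exponent is an assumption because the transcript is garbled.
    """
    return 20.0 / (1.0 + d ** exponent / 5.0)


def needleman_wunsch(
    dist: List[List[float]], gap: float = -1.0
) -> Tuple[float, List[Tuple[int, int]]]:
    """Global alignment of target rows against template columns.

    dist[i][j] is the distance between target residue i and template
    residue j. The linear gap penalty is a hypothetical choice.
    Returns the optimal score and the list of aligned (i, j) pairs.
    """
    n = len(dist)
    m = len(dist[0]) if n else 0
    # dp[i][j]: best score for the first i target and first j template residues
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(dist[i - 1][j - 1]),  # align i with j
                dp[i - 1][j] + gap,  # target residue left unaligned
                dp[i][j - 1] + gap,  # template residue left unaligned
            )
    # Trace back from the corner to recover the aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + pair_score(dist[i - 1][j - 1]):
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


if __name__ == "__main__":
    # Toy distances: target residue 0 sits on template residue 0,
    # target residue 1 sits on template residue 2.
    toy = [[0.0, 10.0, 10.0], [10.0, 10.0, 0.0]]
    print(needleman_wunsch(toy))
```

Because the score decays quickly with distance, only residue pairs that are already close in space contribute much, so the alignment depends on the superposition used to compute the distances.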

5
Overview of the TASSER method
By PROSPECTOR_3 (threading)
6
Benchmark set of targets and templates for test
proteins
  • Developed a representative benchmark set of all
    single-domain structures in PDB with 41-200 amino acids
  • Target set: 1,489 non-homologous proteins
  • 448 α-proteins
  • 434 β-proteins
  • 550 α/β-proteins
  • 57 others (Cα-only targets or targets with irregular
    secondary structures)
  • Template set:
  • 3,575 representative proteins from PDB
  • Pairwise sequence identity to each other: 35%
    maximum

7
Summary of folding results
8
Improvements of initial alignment
9
Modeling Unaligned/Loop Regions
  • 1) RMSDlocal
  • - measures the modeling accuracy of the local
    conformation
  • 2) RMSDglobal
  • - measures the modeling accuracy of the local
    conformation and global orientation
  • - larger loop size → lower accuracy of loop modeling
  • - T < M always
  • - Cutoff: RMSDglobal < 7 Å → reasonable model
  • M: loops up to 10 residues
  • T: loops up to 28 residues
  • Changing the RMSDglobal cutoff
  • → changes the acceptable loop size in T and M
  • → changes the difference between M and T
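
The distinction between the two loop measures can be sketched with a toy example. The code below works in 2D so that the least-squares rotation has a closed form; real structures are 3D and would need a 3D superposition such as the Kabsch algorithm. The slides do not say which residues are superposed for RMSDglobal, so superposing all non-loop residues is an assumption here, as are all function names.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def centroid(pts: Sequence[Point]) -> Point:
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def fit_transform(mobile: Sequence[Point], ref: Sequence[Point]) -> Callable[[Point], Point]:
    """Least-squares rotation plus translation (2D) mapping mobile onto ref."""
    cm, cr = centroid(mobile), centroid(ref)
    a = b = 0.0
    for (mx, my), (rx, ry) in zip(mobile, ref):
        mx, my, rx, ry = mx - cm[0], my - cm[1], rx - cr[0], ry - cr[1]
        a += mx * rx + my * ry  # coefficient of cos(theta) in the overlap
        b += mx * ry - my * rx  # coefficient of sin(theta) in the overlap
    theta = math.atan2(b, a)  # angle that maximizes a*cos + b*sin
    c, s = math.cos(theta), math.sin(theta)

    def apply(p: Point) -> Point:
        x, y = p[0] - cm[0], p[1] - cm[1]
        return (c * x - s * y + cr[0], s * x + c * y + cr[1])

    return apply


def rmsd(xs: Sequence[Point], ys: Sequence[Point]) -> float:
    sq = sum((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))


def rmsd_local(model: List[Point], native: List[Point], loop: List[int]) -> float:
    """Superpose the loop on itself, then measure the loop: local conformation only."""
    m = [model[i] for i in loop]
    r = [native[i] for i in loop]
    fit = fit_transform(m, r)
    return rmsd([fit(p) for p in m], r)


def rmsd_global(model: List[Point], native: List[Point], loop: List[int]) -> float:
    """Superpose the non-loop residues (assumption), then measure the loop."""
    loop_set = set(loop)
    frame = [i for i in range(len(native)) if i not in loop_set]
    fit = fit_transform([model[i] for i in frame], [native[i] for i in frame])
    return rmsd([fit(model[i]) for i in loop], [native[i] for i in loop])


if __name__ == "__main__":
    native = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 2), (5, 1), (6, 0), (7, 0), (8, 0)]
    # Same loop shape as the native, but displaced relative to the rest of the chain.
    model = native[:3] + [(3, 2), (4, 3), (5, 2)] + native[6:]
    loop = [3, 4, 5]
    print(rmsd_local(model, native, loop), rmsd_global(model, native, loop))
```

In this toy case the loop has the correct shape but the wrong placement, so the local measure is near zero while the global measure is not. This is the sense in which RMSDglobal also captures global orientation.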

10
Representative examples
- Template topologies in the core identified by
  SAL are quite similar to native (< 5 Å)
- Local packing of the fragments, and sometimes the
  termini, are misoriented
- Rearrangement using the T force field gives
  > 2 Å improvement in the aligned region
Blue: N terminus. Red: C terminus.
11
Representative examples (fail)
Fails to model the configuration of the tail
→ gives a full-length RMSD of 7.8 Å to native
→ interactions with partner chains are not included.
Proof: cutting the first 22 residues at the
N terminus, which are associated with intermolecular
interactions → the core region of the first model has
a 1.4 Å RMSD
12
New Fold Targets in CASP5
Acceptable models can be built from the initial
template alignments using TASSER
13
Strong points
  • 1. The force field includes multiple sources of
    knowledge-based potentials and consensus tertiary
    restraints from multiple templates (consensus
    spatial information)
  • 2. Combination of the different types of energy
    terms
  • → improvement, because of better correlation
    between model quality and energy
  • 3. Templates usually contain unphysical
    alignments because chain connectivity is not
    considered in the initial alignments
  • → the T reassembly procedure converts these
    unphysical alignments into physical models
  • 4. Only in TASSER can the relative
    orientation of template fragments change

14
Weak points
  • 1. Average sequence identity between target proteins
    and the best template identified: 13%
  • → challenge: correctly aligning the target sequence
    to these templates
  • 2. Structure alignment program SAL is not perfect
  • → not guaranteed to find the best structural
    alignment, because the final alignment in this
    algorithm is sensitive to the initial guess of the
    superposition
  • 3. The representative example has shown that the
    models can be
  • bad when interactions of the tails have to be
    taken into account

15
Conclusions
  • All single-domain proteins are foldable based on
    the set of solved structures currently deposited
    in PDB
  • The best template can be further improved by
    rearranging the fragments (TASSER)

16
RMSD to native of the templates identified by the
structure alignment program SAL versus the
alignment coverage.