Comparative Protein Modeling - PowerPoint PPT Presentation

About This Presentation
Title:

Comparative Protein Modeling

Description:

Title: No Slide Title Author: Medical Illustration Unit Last modified by: McNair1 Created Date: 10/24/1997 5:44:18 AM Document presentation format – PowerPoint PPT presentation

Slides: 2
Provided by: Medical82

Transcript and Presenter's Notes

Comparative Protein Modeling
Jason Wiscarson (jwiscarson_at_gmail.com), Lloyd
Spaine (llspaine_at_gmail.com)
  • Sequence Alignment and Modeling System with
    Hidden Markov Models (SAM)-T02 aligns the target
    sequence to all templates in the following
    steps:
      • Find sequences similar to the target
        sequence.
      • Predict the secondary structure.
      • Find probable templates for threading.
      • Align the target with the templates.
      • Construct a fragment library for the target.
      • Build a 3-D model of the target.
  • Threading aligns different proteins that have
    similar structures:
      • Creates pseudo-protein models based on solved
        proteins.
      • Calculates an energy value for each
        pseudo-protein model.
      • Ranks the alignments based on that energy
        value.

Introduction
Comparative modeling, also called homology
modeling, is a computational tool used to predict
the three-dimensional structure of proteins whose
structures are unknown. If the target sequence
shares sequence similarity with proteins of known
3-D structure, those proteins may serve as
templates to predict the unknown protein
structure. The term homology refers to an
evolutionary relationship between two or more
proteins that share a common ancestor in an
evolutionary tree, regardless of their sequence
similarity. Proteins from similar families often
have similar functions, yet there are many
instances in which proteins have similar
structures but different functions. Therefore the
process used to construct 3-D models of proteins,
shown in Figure 1, is paramount.
Protein Model Refinement
  • Side-Chains with Rotamer Library (SCWRL)
    determines the most likely side-chain
    conformations by:
      • Reading the initial structure and determining
        possible low-energy side-chain conformations
        (rotamers).
      • Defining disulfide bridges and performing
        dead-end elimination to discard rotamers that
        cannot be part of the lowest-energy solution.
      • Constructing a residue graph, determining the
        rotamer clusters, and outputting the final
        structure.
  • Molecular Mechanics (MM) is a method that
    removes repulsive contacts between side chains by
    allowing the side chains to relax to low-energy
    rotamers.
  • Molecular Dynamics (MD) simulation involves:
      • Warm-up, equilibration, and cool-down.
      • Sampling the trajectory during a production
        run and analyzing the results.
  • Molecular Dynamics with Simulated Annealing
    (MD-SA) is an optimization method that works by
    heating a system, sampling many energy states,
    and then slowly cooling the system so that
    low-energy structures are found.
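The heat, sample, and cool cycle of simulated annealing can be illustrated with a toy example. The sketch below is not an MD-SA protocol: it applies Metropolis Monte Carlo annealing to a hypothetical one-dimensional double-well energy function. The function names, temperatures, cooling rate, and step size are all assumptions chosen for illustration.

```python
import math
import random


def toy_energy(x):
    """Hypothetical 1-D double-well energy; global minimum near x = -1.04."""
    return x**4 - 2.0 * x**2 + 0.3 * x


def simulated_annealing(energy, x0, t_start=5.0, t_end=0.01,
                        cooling=0.995, step=0.5, seed=0):
    """Metropolis-style annealing: start hot, sample, and cool slowly."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        candidate = x + rng.uniform(-step, step)
        e_new = energy(candidate)
        # Always accept downhill moves; accept uphill moves with a
        # Boltzmann probability so the search can escape local minima.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling  # slow geometric cooling schedule (an assumption)
    return best_x, best_e


# Start in the shallower well (near x = +1) and let annealing explore.
best_x, best_e = simulated_annealing(toy_energy, x0=1.0)
```

At high temperature nearly every move is accepted, so the walker can cross the barrier between the two wells; as the temperature drops, uphill moves become rare and the walker settles into a low-energy region.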

Selecting Templates and Improving Alignments
The first step is to improve the alignment and
select the template. The sequence of interest
(the target) is aligned with other sequences and
structures (the templates). Afterwards, the best
templates are chosen based on evolutionary
distance as determined by a phylogenetic tree.
Selecting Templates: The template structure for a
protein model is chosen by considering the
R-factor (residual index), a value that measures
how well the template's structural model agrees
with the experimental X-ray data.
Improving Sequence Alignment With Primary and
Secondary Structure Analysis: This analysis is
used to reveal regions rich in proline, glutamic
acid, serine, and threonine (PEST regions),
locate sequence repeats, predict the percentage
of buried versus accessible residues, and provide
information about the protein's isoelectric
point.
Pattern and Motif-Based Secondary Structure
Prediction: AA sequence → 3D structure.
Well-known pattern and motif-based secondary
structure prediction methods include PSIPRED,
GenTHREADER, PREDATOR, PROF, MEMSAT, and PHD.
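As one illustration of primary structure analysis, the sketch below scans a sequence for windows rich in proline, glutamic acid, serine, and threonine. It is a simple composition count, not a published PEST-finding algorithm. The window length, threshold, function name, and example sequence are assumptions chosen for illustration.

```python
PEST_RESIDUES = set("PEST")


def pest_rich_windows(sequence, window=10, threshold=0.6):
    """Return (start, fraction) for each window whose P/E/S/T fraction
    is at least `threshold`. Start positions are 0-based."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        fraction = sum(res in PEST_RESIDUES for res in segment) / window
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


# Hypothetical sequence: a P/E/S/T-rich stretch flanked by other residues.
example = "MKVLAAGLLKPESTPSEETPLGAVKKLM"
hits = pest_rich_windows(example)
```

Overlapping windows that pass the threshold can be merged into a single candidate region before inspecting it by eye.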
Sequence Alignment
Find known sequences and 3-D structures related
to the target protein
  • Alignment based on evolutionary history is
    applied to the amino acid residues of the target
    protein. The types of alignment are:
      • Global alignment, which aligns the sequences
        over their entire length, including regions
        that lack similarity.
      • Local alignment, which first finds regions
        with significant similarity and then aligns
        only those optimally matching regions.
  • To prepare sequences, the Sequence to
    Coordinates (S2C) database is used to examine
    differences between a sequence and its
    structure, such as those that originate from
    mutagenesis studies.
  • Alignment programs differ in the methods used,
    but they score or evaluate the final alignment
    using gap penalties, similarity matrices, and
    alignment scores.
  • Similarity matrices describe the probability of
    a specific amino acid residue mutating to a
    different residue type. Common similarity
    matrices include:
      • Point Accepted Mutation (PAM) matrices, which
        are based on the probability of an amino acid
        residue mutating to another amino acid
        residue, expressed per 100 amino acid
        residues.
      • BLOcks SUbstitution Matrix (BLOSUM) matrices,
        which are similar to PAM but are derived from
        a more diverse set of sequences.
      • Gonnet similarity matrices, which are derived
        from a sequence database that was indexed and
        organized using a tree on a small cluster of
        computers.
  • Clustal is an alignment program that aligns
    large sets of sequences of varying similarity
    quickly. Sequences are progressively aligned
    based on the branching order in the phylogenetic
    tree.
  • Tree-Based Consistency Objective Function for
    Alignment Evaluation (T-Coffee) is a method
    designed to remedy a weakness of
    progressive-alignment (heuristic) methods, in
    which errors in the first alignment cannot be
    corrected as other sequences are added to the
    alignment. Progressive methods suffer from this
    greediness, that is, the inability to correct
    errors such as the addition or extension of a
    gap.
  • The Divide-and-Conquer Alignment (DCA) method
    aligns sequences simultaneously. It uses the
    simultaneous multiple sequence alignment (MSA)
    methodology.
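The difference between global and local alignment, and the role of gap penalties and similarity scores, can be sketched with a small dynamic programming routine. This is a minimal illustration in the style of Needleman-Wunsch (global) and Smith-Waterman (local) scoring. As a simplifying assumption it uses a flat match/mismatch score instead of a PAM or BLOSUM matrix and a linear gap penalty; the function name and score values are hypothetical.

```python
def alignment_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best alignment score by dynamic programming.

    local=False: global scoring, both sequences aligned end to end.
    local=True:  local scoring, only the best-matching region counts.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays a gap penalty for leading unaligned residues.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(score[i - 1][j - 1] + sub,  # align two residues
                       score[i - 1][j] + gap,      # gap in b
                       score[i][j - 1] + gap)      # gap in a
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Two hypothetical sequences that share only the core "HEAG".
global_score = alignment_score("AAAHEAGAAA", "CCCHEAGCCC")
local_score = alignment_score("AAAHEAGAAA", "CCCHEAGCCC", local=True)
```

The global score is dragged down by the dissimilar flanking residues, while the local score reflects only the shared core, which is why local alignment is preferred when similarity is confined to a region.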

Figure 1 Flow chart that shows the construction of
comparative protein models. The solid lines
represent comparative modeling steps, and dotted
lines represent parameters (template, alignment,
construction environment, or refinement method)
that can improve the quality of the protein model.
Flow chart box labels: Align the target and
template amino acid residues; Select templates
and adjust/improve the alignments; Construct
Model; Refine Model; Evaluate Model; Final Model.
Evaluating Protein Models
Several methods exist to check for imperfections
in the models. PROCHECK performs statistical
checks and indicates regions of a protein
structure that might require modification because
of nonoptimal stereochemistry. Verify3D scores
3-D models against a probability table and
assesses the probability that each amino acid
residue would occupy its specific position in the
3-D structure. ERRAT examines nonbonded distances
between C-C, C-N, C-O, N-N, N-O, and O-O atom
pairs. Protein Structure Analysis (ProSa) uses a
potential of mean force, which is the change in
potential energy of a system caused by the
variation of a specific coordinate, to locate
regions of the protein structure that may contain
improper or unsuitable geometries. Protein Volume
Evaluation (PROVE) uses the computed volume of
individual atoms as a means of evaluating the
viability of a protein model. Model clustering
analysis uses NMRCLUST, NMRCORE, and OLDERADO,
programs that aid in the superposition and
clustering of protein structures.
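The kind of raw data a nonbonded-distance check works from can be sketched as a tally of close atom pairs by element type. The code below is not ERRAT's statistic; it only counts C, N, and O pairs that fall within a distance cutoff. The function name, the 3.5 angstrom cutoff, the way bonded pairs are supplied, and the toy coordinates are all assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations

HEAVY = {"C", "N", "O"}


def nonbonded_pair_counts(atoms, bonded=frozenset(), cutoff=3.5):
    """Count C/N/O atom pairs closer than `cutoff` angstroms, by pair type.

    atoms:  list of (element, (x, y, z)) tuples
    bonded: set of frozenset({i, j}) index pairs to skip (covalent bonds)
    """
    counts = Counter()
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(
            enumerate(atoms), 2):
        if frozenset((i, j)) in bonded:
            continue  # covalently bonded pairs are not nonbonded contacts
        if el_i not in HEAVY or el_j not in HEAVY:
            continue
        if math.dist(xyz_i, xyz_j) <= cutoff:
            counts["-".join(sorted((el_i, el_j)))] += 1
    return counts


# Toy coordinates (hypothetical): atoms 0 and 1 are treated as bonded.
atoms = [("C", (0.0, 0.0, 0.0)), ("N", (1.3, 0.0, 0.0)),
         ("O", (3.0, 0.0, 0.0)), ("C", (10.0, 0.0, 0.0))]
counts = nonbonded_pair_counts(atoms, bonded={frozenset((0, 1))})
```

Sorting the two element symbols before joining them means each pair is filed under one of the six labels named above (C-C, C-N, C-O, N-N, N-O, O-O) regardless of atom order.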
Constructing Protein Models
  • Satisfaction of Spatial Restraints (SSR)
    constructs a 3-D protein model using spatial
    restraints based on distances, bond angles,
    dihedral angles, dihedral pairs, etc.
  • Segment Match Modeling (SMM) constructs a
    protein model by:
      • Choosing a protein template.
      • Building a list of possible template matches.
      • Sorting the templates by best fit to the
        target's structure.
      • Using probabilities to select the best segment
        from a low pseudo-energy subset group.
      • Moving coordinates from the best segment's
        template protein.
  • Multiple Template Method (MTM) uses solved X-ray
    structures to build the target sequence's
    protein model.
  • 3D-JIGSAW creates a homology model in these
    steps:
      • Select and align templates, based on sequence.
      • Select template segments.
      • Create the backbone (framework, scaffold).
      • Add side chains, then refine and evaluate the
        target protein model.

Finding Related Sequences and Structures
In comparative protein modeling, several databases
are used to find genomic, amino acid, and protein
data. The Expert Protein Analysis System
(ExPASy) is the starting point for searching for
proteins and their related sequences. Swiss-Prot
contains data that has been refined by removing
unnecessary information, and TrEMBL receives and
stores initial genomics data. PROSITE classifies
proteins using tertiary structure and key amino
acid residues, based on biologically significant
patterns. ENZYME retrieves an enzyme's
recommended name, alternative names, catalytic
activity, cofactors, associated human genetic
diseases, and cross-references. SWISS-MODEL holds
comparative protein models for proteins that do
not have a known 3-D structure. Basic Local
Alignment Search Tool (BLAST) uses a protein
sequence to search for and analyze sequences of
interest; it locates similar protein sequences
and produces sequence alignments. Protein Data
Bank (PDB) is a repository for experimentally
determined protein 3-D structures.
References
1. Esposito, E. X.; Tobi, D.; Madura, J. D.
Comparative Protein Modeling. Reviews in
Computational Chemistry, Volume 22, 2006,
Wiley-VCH, John Wiley & Sons, Inc., to be
published.
2. Ramachandran plot and alanine structure.
http://www.cgl.ucsf.edu/home/glasfeld/tutorial/AAA/AAA.html
Figure 2 Peptide bonds create rigid plates which
rotate about phi and psi.
Figure 3 A Ramachandran plot for the tripeptide
in Figure 2.
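The phi and psi angles plotted in a Ramachandran plot are dihedral angles defined by four consecutive backbone atoms: phi by C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). The sketch below computes a signed dihedral angle from four points using plain Python. The helper names and test coordinates are hypothetical, and reading real coordinates from a structure file is left out.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Hypothetical points: the last atom is rotated a quarter turn
# about the central bond relative to the first.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Using atan2 on the two projections, rather than acos on a single dot product, keeps the sign of the angle, which is needed to place a residue in the correct quadrant of the plot.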