Protein Structure - PowerPoint PPT Presentation

Provided by: pevsnerlab

Title: Protein Structure


1
Protein Structure
  • Ingo Ruczinski
  • Department of Biostatistics, Johns Hopkins
    University

2
Structural Proteins
3
Membrane Proteins
4
Globular Proteins
5
Terminology
  • Primary Structure
  • Secondary Structure
  • Tertiary Structure
  • Quaternary Structure
  • Supersecondary Structure
  • Domain
  • Fold

6
Hierarchy of Protein Structure
7
Helices
                    α-helix   π-helix   3₁₀-helix
Amino acids/turn    3.6       4.4       3.0
Frequency           97%       rare      3%
H-bonding           i, i+4    i, i+5    i, i+3
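As a small worked example of the helix values listed above, the rotation per residue follows from dividing 360° by the number of residues per turn, and the H-bonding pattern gives the residue pairs that can bond in an ideal helix. The sketch below assumes the third column (3.0 residues per turn, i to i+3) is the 3₁₀-helix; the dictionary and function names are illustrative only.

```python
# Residues per turn and hydrogen-bond offset (i -> i+n), as listed on the slide.
# The "3_10" label for the third column is an assumption.
HELICES = {
    "alpha": {"residues_per_turn": 3.6, "hbond_offset": 4},
    "pi": {"residues_per_turn": 4.4, "hbond_offset": 5},
    "3_10": {"residues_per_turn": 3.0, "hbond_offset": 3},
}


def rotation_per_residue(helix):
    """Angle in degrees the helix turns per residue: 360 / residues per turn."""
    return 360.0 / HELICES[helix]["residues_per_turn"]


def hbond_pairs(helix, length):
    """Backbone H-bond pairs (i, i+n) available in an ideal helix of `length` residues."""
    n = HELICES[helix]["hbond_offset"]
    return [(i, i + n) for i in range(length - n)]
```

For the α-helix this gives 100° per residue, and a six-residue α-helix has room for two i, i+4 pairs.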
8
α-helices
α-helices have handedness
α-helices have a dipole

9
β-sheets
10
β-sheets
Have a right-handed twist!
11
β-sheets
Can form higher level structures!
12
Super Secondary Structure Motifs
13
What is a Domain?
  • Richardson (1981)
  • Within a single subunit polypeptide chain,
    contiguous portions of the polypeptide chain
    frequently fold into compact, local
    semi-independent units called domains.

14
More About Domains
  • Independent folding units.
  • Many contacts within a domain, few with the outside.
  • Domains create their own hydrophobic core.
  • Regions usually conserved during recombination.
  • Different domains of the same protein can have
    different functions.
  • Domains of the same protein may or may not
    interact.

15
Two Very Small Domains
16
Why Look for Domains?
Domains are the currency of protein function!
17
Homology and Analogy
  • Homology: Similarity in characteristics
    resulting from shared ancestry.
  • Analogy: The similarity of structure between two
    species that are not closely related,
    attributable to convergent evolution.

Homologous structures can be divided into
orthologues (resulting from changes in the same
gene between different organisms, such as
myoglobin) and paralogues (resulting from gene
duplication and subsequent changes within an
organism and its descendants, such as
hemoglobin).
19
The CATH Hierarchy
21
DALI: Distance Matrix Alignment
  • DALI generates alignments of structural
    fragments, and is able to find alignments
    involving chain reversals and different
    topologies.
  • The algorithm uses distance matrices to represent
    each structure to be compared.
  • Application of DALI to the entire PDB produces
    two classifications of structures: FSSP and DDD
    (3D).

Holm L, and Sander C (1993)
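The distance-matrix representation mentioned above can be illustrated with a short sketch. This is not the DALI algorithm itself, only the underlying idea that a matrix of intramolecular distances does not change when a structure is rotated or translated, so two structures can be compared without superimposing them first. The coordinates and function names are made up for illustration.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates, given as (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def matrix_difference(d1, d2):
    """Mean absolute difference between two equally sized distance matrices."""
    n = len(d1)
    total = sum(abs(d1[i][j] - d2[i][j]) for i in range(n) for j in range(n))
    return total / (n * n)


# Toy fragment (hypothetical coordinates) and a rotated, shifted copy of it
fragment = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
moved = [(y + 10.0, -x + 5.0, z + 1.0) for x, y, z in fragment]  # rotate about z, then translate

print(matrix_difference(distance_matrix(fragment), distance_matrix(moved)))
```

The two matrices agree up to floating-point rounding even though the raw coordinates differ.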
22
DALI
23
DALI
24
FSSP and DDD
  • The Families of Structurally Similar Proteins
    (FSSP) is a database of structural alignments of
    proteins in the Protein Data Bank (PDB). It
    presents the results of applying DALI to (almost)
    all chains of proteins in the PDB.
  • The DALI domain dictionary (DDD) is a
    corresponding classification of recurrent domains
    automatically extracted from known proteins.

25
Other Algorithms for Domain Decomposition
  • The Protein Domain Parser (PDP) uses compactness
    as a chief principle.
  • http://123d.ncifcrf.gov/pdp.html
  • DomainParser is graph theory based. The
    underlying principle used is that residue-residue
    contacts are denser within a domain than between
    domains.
  • http://compbio.ornl.gov/structure/domainparser/
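The principle that residue-residue contacts are denser within a domain than between domains can be made concrete with a toy calculation. The sketch below is not the PDP or DomainParser method; it only counts contacts for one candidate cut point along the chain. The 8 Å cutoff and the coordinates are assumptions chosen for illustration.

```python
import math


def contacts(coords, cutoff=8.0):
    """Residue index pairs (i, j), i < j, that lie closer than `cutoff` angstroms."""
    n = len(coords)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(coords[i], coords[j]) < cutoff}


def split_score(coords, cut, cutoff=8.0):
    """Count contacts within and between the segments [0, cut) and [cut, n)."""
    within = between = 0
    for i, j in contacts(coords, cutoff):
        if (i < cut) == (j < cut):
            within += 1   # both residues on the same side of the cut
        else:
            between += 1  # the contact crosses the cut
    return within, between


# Two compact clusters of four residues each, far apart (hypothetical coordinates)
cluster = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (3.0, 3.0, 0.0)]
coords = cluster + [(x + 50.0, y, z) for x, y, z in cluster]
```

Cutting this chain after residue 4 leaves no contacts between the two halves, while cutting after residue 2 splits the first cluster and produces contacts across the cut.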

26
Oh Dear
27
Parsing Sequence into Domains
  • Look for internal duplication.
  • Look for low complexity segments.
  • Look for transmembrane segments.
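One simple way to look for low complexity segments is to slide a window along the sequence and flag windows whose residue composition has low Shannon entropy. The sketch below illustrates that idea only; the window length and threshold are assumed values, not parameters taken from the presentation.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy in bits of the residue composition of a segment."""
    n = len(segment)
    return -sum(c / n * math.log2(c / n) for c in Counter(segment).values())


def low_complexity_windows(seq, window=12, threshold=2.2):
    """Start positions of windows whose entropy falls below `threshold` (assumed values)."""
    return [i for i in range(len(seq) - window + 1)
            if shannon_entropy(seq[i:i + window]) < threshold]
```

A run of twelve glutamines has entropy 0 and is flagged, while a window of twelve different residues has entropy log2(12), about 3.58 bits, and is not.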

28
Why is That Important?
  • Functional insights.
  • Improved database searching.
  • Fold recognition.
  • Structure determination.

PRODOM: http://protein.toulouse.inra.fr/prodom/current/html/home.php
PFAM: http://www.sanger.ac.uk/Software/Pfam/
29
Protein Structure Prediction
30
Homology Modeling
31
Fold Recognition
Sequence Known folds
32
Ab Initio Structure Prediction
33
Homology Modeling
  • Align sequence to protein sequences with known
    structure.
  • Construct and evaluate model of 3D structure from
    alignment.
  • Requirement: Close match to template sequences
    with known 3D structure (sequence similarity of
    at least 25%).

Note: about 25% of the protein sequences in the
Swiss-Prot database have templates for at least
part of the sequence!
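The template requirement can be checked on an existing pairwise alignment. The sketch below uses percent identity over aligned columns as a simple stand-in for sequence similarity, which is an assumption; the 25% threshold is the one quoted above, and the function names are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def usable_template(aligned_a, aligned_b, threshold=25.0):
    """True if the aligned pair reaches the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("ACDE", "ACDF")` is 75.0, and gap columns are left out of the count.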
34
Threshold for Structural Homology
Rost B, Protein Engineering 12 (1999).
35
Homology Modeling Approach
  • Find set of sequences related to target sequence.
  • Align target sequence to template sequences (key
    step).
  • Construct 3D model for core (backbone):
      - Conserved regions → conserved structure /
        coordinates.
      - Structure diverges → use sequence similarity,
        secondary structure prediction, manual
        prediction, etc. to fill in gaps.
  • Construct 3D models for loops:
      - Search loop conformation library, limited protein
        folding.
  • Model location of side chains:
      - Search rotamer library, use molecular dynamics.
  • Optimize / verify the model:
      - Improve likelihood / ensure legality of model.

36
Homology Modeling Web Pages
  • MODELLER: http://salilab.org/modeller/modeller.html
  • SWISS-MODEL: http://www.expasy.org/swissmod/SWISS-MODEL.html

37
Quality Assessment
  • Goal:
      - Ensure predicted 3D structure is possible /
        probable in practice.
      - Based on general knowledge of protein structures.
  • Criteria:
      - Carbon backbone conformations allowed
        (Ramachandran map)
      - Legal bond lengths, angles, dihedrals
      - Peptide bonds are planar
      - Side chain conformations correspond to ones in
        rotamer library
      - Hydrogen-bonding of polar atoms if buried
      - Proper environments for hydrophobic / hydrophilic
        residues
      - No bad atom-atom contacts
      - No holes inside 3D structure
      - Solvent accessibility
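Several of these criteria (Ramachandran map, dihedrals, planar peptide bonds) come down to computing dihedral angles from four atom positions. The following is a minimal sketch of that calculation in plain Python, with a planarity check whose 20° tolerance is an assumed value. It is not how VERIFY3D, PROCHECK or WHATIF implement their checks.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, e.g. four backbone atoms."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def is_planar_peptide(omega, tolerance=20.0):
    """True if omega is within `tolerance` degrees of trans (180) or cis (0)."""
    deviation = abs(omega) % 180.0
    return min(deviation, 180.0 - deviation) <= tolerance
```

The same `dihedral` function yields φ, ψ and ω depending on which four backbone atoms are passed in; a Ramachandran check would then compare (φ, ψ) against allowed regions, which are not encoded here.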

38
Quality Assessment Programs
  • VERIFY3D: http://shannon.mbi.ucla.edu/DOE/Services/Verify_3D
  • PROCHECK: http://www.biochem.ucl.ac.uk/roman/procheck/procheck.html
  • WHATIF: http://www.cmbi.kun.nl/whatif/

39
Fold Recognition
  • The input sequence is threaded on different folds
    from a library of known folds.
  • Using scoring functions, we get a score for the
    compatibility between the sequence and the
    structures.
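A toy version of threading can make the scoring idea concrete. In the sketch below each fold in the library is reduced to a string of environments, 'B' for buried and 'E' for exposed, and a residue scores +1 when its chemical character suits the environment and -1 otherwise. The hydrophobic residue set, the scoring values, and the gapless alignment are all simplifying assumptions, not the scoring function of any server listed in this presentation.

```python
# Assumed set of hydrophobic residues (one-letter codes), for illustration only
HYDROPHOBIC = set("AVLIMFWC")


def position_score(residue, environment):
    """+1 if the residue suits the environment ('B' buried, 'E' exposed), else -1."""
    is_hydrophobic = residue in HYDROPHOBIC
    return 1 if is_hydrophobic == (environment == "B") else -1


def thread_score(sequence, fold_environments):
    """Score a gapless threading of `sequence` onto a fold's environment string."""
    if len(sequence) != len(fold_environments):
        raise ValueError("gapless threading needs equal lengths")
    return sum(position_score(r, e) for r, e in zip(sequence, fold_environments))


def best_fold(sequence, library):
    """Name of the library fold with the highest threading score."""
    return max(library, key=lambda name: thread_score(sequence, library[name]))


# Hypothetical two-fold library
library = {"fold_a": "BBEE", "fold_b": "EEBB"}
```

Threading the sequence `LLKK` onto this library favours `fold_a`, where the two leucines land in buried positions and the two lysines in exposed ones.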

Amino acids with different chemical properties
Library of known folds
40
Fold Recognition
Hydrogen donor
Hydrogen acceptor
Hydrophobic
Glycine
Good score!
41
Web Sites for Fold Recognition
3D-PSSM: http://www.bmm.icnet.uk/3dpssm
LIBRA I: http://www.ddbj.nig.ac.jp/htmls/Email/libra/LIBRA_I.html
UCLA DOE: http://www.doe-mbi.ucla.edu/people/frsvr/frsvr.html
123D: http://www-Immb.ncifcrf.gov/nicka/123D.html
PROFIT: http://lore.came.sbg.ac.at/home.html
42
Ab Initio Methods
  • Ab initio: From the beginning.
  • Assumption 1: All the information about the
    structure of a protein is contained in its
    sequence of amino acids.
  • Assumption 2: The structure that a (globular)
    protein folds into is the structure with the
    lowest free energy.
  • Finding native-like conformations requires:
      - A scoring function (potential).
      - A search strategy.
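The two ingredients, a scoring function and a search strategy, can be sketched with a generic Metropolis Monte Carlo search. Everything specific here is a toy assumption: the "conformation" is just a list of torsion angles, the "potential" is lowest when every angle equals -60°, and the move set perturbs one angle at a time. It is not the Rosetta potential or fragment-insertion search.

```python
import math
import random


def metropolis_search(energy, start, propose, steps=2000, temperature=1.0, seed=0):
    """Minimize `energy` by Metropolis Monte Carlo; returns (best_state, best_energy)."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(steps):
        candidate = propose(current, rng)
        e_candidate = energy(candidate)
        # Accept downhill moves always, uphill moves with Boltzmann probability
        if (e_candidate <= e_current
                or rng.random() < math.exp((e_current - e_candidate) / temperature)):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


def toy_energy(angles):
    """Toy potential (assumption): zero when every angle is -60 degrees."""
    return sum(((a + 60.0) / 60.0) ** 2 for a in angles)


def toy_propose(angles, rng):
    """Perturb one randomly chosen angle by up to 20 degrees."""
    new = list(angles)
    i = rng.randrange(len(new))
    new[i] += rng.uniform(-20.0, 20.0)
    return new


start = [120.0] * 5
best, best_energy = metropolis_search(toy_energy, start, toy_propose)
```

Because the best state seen so far is tracked separately from the current state, the returned energy can never be higher than that of the starting conformation.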

43
Fragment Selection
44
Hydrophobic Burial
45
Residue Pair Interaction
46
The Sequence Independent Term
47
Strand Packing Helps!
Estimated φ-θ distribution
53
Rosetta in CASP4
54
CASP
55
Hubbard Plot
56
Hubbard Plots
58
Functional Annotation
60
Protein Design
61
Protein Design
62
Protein Secondary Structure Prediction
64
Secondary Structure Assignment
  • Eight states from DSSP:
      - H: α-helix
      - G: 3₁₀-helix
      - I: π-helix
      - E: β-strand
      - B: bridge
      - T: β-turn
      - S: bend
      - C: coil
  • CASP standard:
      - H = (H, G, I), E = (E, B), C = (C, T, S).
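The reduction from eight DSSP states to the three CASP states is a direct lookup. A minimal sketch, using the state letters exactly as listed above (with C for coil); the function name is illustrative.

```python
# Eight-state to three-state mapping as given on the slide
DSSP_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strand and bridge
    "C": "C", "T": "C", "S": "C",   # coil, turn, bend
}


def reduce_to_three_states(dssp):
    """Convert an eight-state assignment string to the three states H, E, C."""
    return "".join(DSSP_TO_THREE[state] for state in dssp)
```

For example, `reduce_to_three_states("HGIEBCTS")` returns `"HHHEECCC"`.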

65
Secondary Structure Prediction
Given the sequence of amino acids of a protein,
what is its secondary structure?
GHWIATRGQLIREAYEDYRHFSSECPFIP
Primary structure
CEEEEECHHHHHHHHHHHCCCHHCCCCCC
Secondary structure
Notation: H = Helix, E = Strand, C = Coil
66
Secondary Structure Prediction
Helix
Edge strand
Buried strand
By eye!
67
Conformational Preferences of Amino Acids
Helical Preference.
Strand Preference.
Turn Preference.
68
Conformational Preferences of Amino Acids
Extended flexible side chains.
Bulky side chains, beta-branched.
Restricted conformations, side chain main
chain interactions.
69
Secondary Structure Prediction
Given the sequence of amino acids of a protein,
what is its secondary structure?
GHWIATRGQLIREAYEDYRHFSSECPFIP
Primary structure
CEEEEECHHHHHHHHHHHCCCHHCCCCCC
Secondary structure
Notation: H = Helix, E = Strand, C = Coil
70
Measures for Prediction Accuracy
The standard measure for prediction accuracy is
(still) the Q3 measure. It is simply the
percentage of all amino acids whose predicted
state matches the observed one of the three
states C, E, H. In recent years, the segment
overlap measure (SOV) has been used more
extensively. It aims to measure how well
secondary structure elements have been predicted,
rather than individual residues.
Rost et al. (1994), JMB 235, pp. 13-26.
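Q3 as described above is straightforward to compute from a predicted and an observed three-state string of equal length. A minimal sketch, with an illustrative function name:

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state (C, E or H) matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

For example, `q3("HHEC", "HHCC")` gives 75.0, since three of the four residues match. SOV is not covered by this sketch because it scores overlapping segments rather than single residues.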
71
Automated Methods
The availability of large families of homologous
sequences together with advances in computing
techniques has pushed the prediction accuracy
well above 70%. Most methods are available as web
servers. They include:
PHD: http://www.embl-heidelberg.de/predictprotein/predictprotein.html
PSI-PRED: http://bioinf.cs.ucl.ac.uk/psipred/
JPRED: http://www.compbio.dundee.ac.uk/www-jpred/
72
Consensus