Computational Approaches to Receptor Structure Prediction - PowerPoint PPT Presentation

Title: Computational Approaches to Receptor Structure Prediction


1
Computational Approaches to Receptor Structure
Prediction
  • Ugur Sezerman
  • Biological Sciences and Bioengineering Program
  • Sabanci University, Istanbul

2
Determining Protein Structure
  • There are O(100,000) distinct proteins in the
    human proteome.
  • 3D structures have been determined for over
    60,000 proteins, from all organisms
  • Includes duplicates with different ligands bound,
    etc.
  • Coordinates are determined by X-ray
    crystallography or NMR

3
X-Ray Crystallography
  • The crystal is a mosaic of millions of copies of
    the protein.
  • As much as 70% is solvent (water)!
  • May take months (and a green thumb) to grow.

4
X-Ray diffraction
  • Image is averaged over:
  • Space (many copies)
  • Time (of the diffraction experiment)

5
Electron Density Maps
  • Resolution is dependent on the quality/regularity
    of the crystal
  • R-factor is a measure of leftover electron
    density
  • Solvent fitting
  • Refinement

6
The Protein Data Bank
  • http://www.rcsb.org/pdb/

ATOM   1  N    ALA  E  1  22.382  47.782  112.975  1.00  24.09  3APR  213
ATOM   2  CA   ALA  E  1  22.957  47.648  111.613  1.00  22.40  3APR  214
ATOM   3  C    ALA  E  1  23.572  46.251  111.545  1.00  21.32  3APR  215
ATOM   4  O    ALA  E  1  23.948  45.688  112.603  1.00  21.54  3APR  216
ATOM   5  CB   ALA  E  1  23.932  48.787  111.380  1.00  22.79  3APR  217
ATOM   6  N    GLY  E  2  23.656  45.723  110.336  1.00  19.17  3APR  218
ATOM   7  CA   GLY  E  2  24.216  44.393  110.087  1.00  17.35  3APR  219
ATOM   8  C    GLY  E  2  25.653  44.308  110.579  1.00  16.49  3APR  220
ATOM   9  O    GLY  E  2  26.258  45.296  110.994  1.00  15.35  3APR  221
ATOM  10  N    VAL  E  3  26.213  43.110  110.521  1.00  16.21  3APR  222
ATOM  11  CA   VAL  E  3  27.594  42.879  110.975  1.00  16.02  3APR  223
ATOM  12  C    VAL  E  3  28.569  43.613  110.055  1.00  15.69  3APR  224
ATOM  13  O    VAL  E  3  28.429  43.444  108.822  1.00  16.43  3APR  225
ATOM  14  CB   VAL  E  3  27.834  41.363  110.979  1.00  16.66  3APR  226
ATOM  15  CG1  VAL  E  3  29.259  41.013  111.404  1.00  17.35  3APR  227
ATOM  16  CG2  VAL  E  3  26.811  40.649  111.850  1.00  17.03  3APR  228
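
ATOM records like those in the excerpt can be read programmatically. The following is a minimal sketch, assuming whitespace-separated fields in the order shown (record type, serial, atom name, residue name, chain, residue number, x, y, z, occupancy, B-factor). Real PDB files are fixed-width, so a robust parser should slice by column instead. The names `Atom` and `parse_atom_line` are illustrative, not part of any library.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float

def parse_atom_line(line: str) -> Atom:
    """Parse one whitespace-separated ATOM record (assumed field order)."""
    f = line.split()
    if f[0] != "ATOM":
        raise ValueError("not an ATOM record")
    return Atom(int(f[1]), f[2], f[3], f[4], int(f[5]),
                float(f[6]), float(f[7]), float(f[8]),
                float(f[9]), float(f[10]))

# Three records copied from the excerpt (trailing ID and line number ignored).
records = [
    "ATOM 1 N ALA E 1 22.382 47.782 112.975 1.00 24.09 3APR 213",
    "ATOM 2 CA ALA E 1 22.957 47.648 111.613 1.00 22.40 3APR 214",
    "ATOM 7 CA GLY E 2 24.216 44.393 110.087 1.00 17.35 3APR 219",
]
atoms = [parse_atom_line(r) for r in records]

# Keep only alpha carbons, a common reduced view of the backbone.
ca_atoms = [a for a in atoms if a.name == "CA"]
for a in ca_atoms:
    print(a.res_name, a.res_seq, a.x, a.y, a.z)
```

Filtering on the atom name `CA` gives one coordinate per residue, which is often enough for distance or contact calculations.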
7
A Peek at Protein Function
  • Serine proteases cleave other proteins
  • Catalytic triad: Asp, His, Ser

8
Cleaving the peptide bond
9
Three Serine Proteases
  • Chymotrypsin: cleaves the peptide bond on the
    carboxyl side of aromatic (ring) residues (Trp,
    Phe, Tyr) and large hydrophobic residues (Met).
  • Trypsin: cleaves after Lys (K) or Arg (R)
  • Positive charge
  • Elastase: cleaves after small residues (Gly,
    Ala, Ser, Cys)
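
The specificity rules above can be turned into a small in-silico digest. This is a sketch under simplifying assumptions: it uses only the residue sets listed on the slide, ignores any neighboring-residue effects, and the function names and the example peptide are arbitrary choices made for illustration.

```python
# Residues after which each protease cleaves, as listed on the slide.
SPECIFICITY = {
    "chymotrypsin": set("WFYM"),
    "trypsin": set("KR"),
    "elastase": set("GASC"),
}

def cleavage_sites(sequence: str, protease: str) -> list[int]:
    """Return 1-based positions of residues whose carboxyl-side bond is cut."""
    targets = SPECIFICITY[protease]
    # The last residue has no peptide bond on its carboxyl side.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in targets]

def digest(sequence: str, protease: str) -> list[str]:
    """Split the sequence into fragments at every cleavage site."""
    fragments, start = [], 0
    for pos in cleavage_sites(sequence, protease):
        fragments.append(sequence[start:pos])
        start = pos
    fragments.append(sequence[start:])
    return fragments

peptide = "IVACIVSTEYDVMKAAR"  # arbitrary example peptide
print(digest(peptide, "trypsin"))
print(digest(peptide, "chymotrypsin"))
```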

10
Specificity Binding Pocket
11
Protein Folding: Biological perspective
  • Central dogma: sequence specifies structure
  • Denature: to unfold a protein back to random
    coil configuration
  • β-mercaptoethanol breaks disulfide bonds
  • Urea or guanidine hydrochloride: denaturants
  • Also heat or pH
  • Anfinsen's experiments
  • Denatured ribonuclease
  • Spontaneously regained enzymatic activity
  • Evidence that it re-folded to native conformation

12
PROTEIN FOLDING PROBLEM
  • FINDING THE STRUCTURE OF A PROTEIN, STARTING FROM
    ITS AMINO ACID SEQUENCE, IS CALLED THE PROTEIN
    FOLDING PROBLEM

13
The Protein Folding Problem
  • Central question of molecular biology: given a
    particular sequence of amino acid residues
    (primary structure), what will the
    tertiary/quaternary structure of the resulting
    protein be?
  • Input: AAVIKYGCAL
  • Output: φ1ψ1, φ2ψ2, ... backbone conformation
    (no side chains yet)

14
Folding intermediates
  • Levinthal's paradox: consider a 100-residue
    protein. If each residue can take only 3 x 3 = 9
    positions, there are 9^100 possible conformations.
  • Folding must proceed by progressive stabilization
    of intermediates
  • Molten globules: most secondary structure
    formed, but much less compact than native
    conformation.
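
The size of the search space in Levinthal's argument is easy to check numerically. The sketch below uses the slide's figures (100 residues, 9 states each). The sampling rate of one conformation per 1e-13 s is an assumption added for illustration and does not come from the slides.

```python
import math

residues = 100
states_per_residue = 9            # 3 x 3 positions per residue, as on the slide
conformations = states_per_residue ** residues

# Assumption (not from the slides): one conformation sampled every 1e-13 s.
seconds_per_sample = 1e-13
seconds_per_year = 365.25 * 24 * 3600
years = conformations * seconds_per_sample / seconds_per_year

print(f"conformations ~ 10^{math.log10(conformations):.1f}")
print(f"exhaustive search ~ 10^{math.log10(years):.1f} years")
```

With roughly 10^95 conformations, exhaustive sampling is out of the question at any plausible rate, which is why folding must proceed through stabilized intermediates.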

15
Protein Packing
  • occurs in the cytosol (60% bulk water, 40%
    water of hydration)
  • involves interaction between secondary structure
    elements and solvent
  • may be promoted by chaperones, membrane proteins
  • tumbles into molten globule states
  • overall entropy loss is small enough so enthalpy
    determines sign of ΔE, which decreases (loss in
    entropy from packing counteracted by gain from
    desolvation and reorganization of water, i.e.
    hydrophobic effect)
  • yields tertiary structure

16
Folding help
  • Proteins are, in fact, only marginally stable
  • Native state is typically only 5 to 10 kcal/mole
    more stable than the unfolded form
  • Many proteins help in folding
  • Protein disulfide isomerase catalyzes shuffling
    of disulfide bonds
  • Chaperones break up aggregates and (in theory)
    unfold misfolded proteins
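
To put the 5 to 10 kcal/mole figure in perspective, the equilibrium fraction of unfolded molecules can be estimated with a two-state model. This is a sketch only: the two-state assumption and the temperature of 298.15 K are choices made here and are not stated on the slide.

```python
import math

R = 1.987e-3      # gas constant in kcal/(mol*K)
T = 298.15        # assumed temperature in K (25 C); not given on the slide

def fraction_unfolded(delta_g_kcal: float) -> float:
    """Equilibrium fraction unfolded for a two-state protein.

    delta_g_kcal is the stability of the native state (positive means
    folded is favored), so K = [folded]/[unfolded] = exp(dG / RT).
    """
    k_fold = math.exp(delta_g_kcal / (R * T))
    return 1.0 / (1.0 + k_fold)

for dg in (5.0, 10.0):
    frac = fraction_unfolded(dg)
    print(f"dG = {dg:4.1f} kcal/mol -> fraction unfolded = {frac:.1e}")
```

Under these assumptions, well under 1% of molecules are unfolded at equilibrium even at the lower end of the range.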

17
Forces driving protein folding
  • It is believed that hydrophobic collapse is a key
    driving force for protein folding
  • Hydrophobic core
  • Polar surface interacting with solvent
  • Minimum volume (no cavities)
  • Disulfide bond formation stabilizes
  • Hydrogen bonds
  • Polar and electrostatic interactions

18
Secondary Structure
  • non-linear
  • 3 dimensional
  • localized to regions of an amino acid chain
  • formed and stabilized by hydrogen bonding,
    electrostatic and van der Waals interactions

19
Common motifs
20
The Hydrophobic Core
  • Hemoglobin A is the protein in red blood cells
    (erythrocytes) responsible for binding oxygen.
  • The mutation E6→V in the β chain places a
    hydrophobic Val on the surface of hemoglobin
  • The resulting sticky patch causes hemoglobin S
    to agglutinate (stick together) and form fibers
    which deform the red blood cell and do not carry
    oxygen efficiently
  • Sickle cell anemia was the first identified
    molecular disease

21
Sickle Cell Anemia
Sequestering hydrophobic residues in the protein
core protects proteins from hydrophobic
agglutination.
22
Computational Approaches
  • Ab initio methods
  • Threading
  • Comparative Modelling
  • Fragment Assembly

23
Why is ab-initio prediction hard?
24
Ab-initio protein structure prediction as an
optimization problem
  1. Define a function that maps protein structures
    to some quality measure.
  2. Solve the computational problem of finding an
    optimal structure.
  3. ?

25
  • A dream function
  • Has a clear minimum in the native structure.
  • Has a clear path towards the minimum.
  • Global optimization algorithm should find the
    native structure.

Chen Keasar BGU
26
  • An approximate function
  • Easier to design and compute.
  • Native structure not always the global
    minimum.
  • Global optimization methods do not converge.
    Many alternative models (decoys) should be
    generated.

Chen Keasar BGU
27
  • An approximate function
  • Easier to design and compute.
  • Native structure not always the global
    minimum.
  • Global optimization methods do not converge.
    Many alternative models (decoys) should be
    generated.
  • No clear way of choosing among them.

Chen Keasar BGU
28
Fold Optimization
  • Simple lattice models (HP-models)
  • Two types of residues: hydrophobic and polar
  • 2-D or 3-D lattice
  • The only force is hydrophobic collapse
  • Score: number of H-H contacts

29
Scoring Lattice Models
  • H/P model scoring: count noncovalent hydrophobic
    interactions.
  • Sometimes:
  • Penalize for buried polar or surface hydrophobic
    residues
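
The basic H/P score can be written in a few lines. This is a minimal sketch assuming a 2-D square lattice and counting only H-H pairs that touch on the lattice without being chain neighbors. The function name `hp_score` is illustrative, and the code does not check that consecutive residues sit on adjacent sites.

```python
def hp_score(sequence: str, coords: list[tuple[int, int]]) -> int:
    """Count H-H contacts between residues that are adjacent on a 2-D
    square lattice but not adjacent in the chain."""
    if len(sequence) != len(coords) or len(set(coords)) != len(coords):
        raise ValueError("need one distinct lattice site per residue")
    index_at = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts

# Hypothetical 4-residue chain: folded into a unit square, then extended.
folded = hp_score("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_score("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
print(folded, extended)
```

The optional penalties mentioned on the slide (buried polar or exposed hydrophobic residues) would be additional terms on top of this count.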

30
What can we do with lattice models?
  • For smaller polypeptides, exhaustive search can
    be used
  • Looking at the best fold, even in such a simple
    model, can teach us interesting things about the
    protein folding process
  • For larger chains, other optimization and search
    methods must be used
  • Greedy, branch and bound
  • Evolutionary computing, simulated annealing
  • Graph theoretical methods
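
For short chains, exhaustive search is straightforward to sketch. The code below enumerates every self-avoiding walk on a 2-D square lattice and keeps the conformation with the most H-H contacts. It assumes a chain of at least two residues. The example sequence is hypothetical, and the first bond is fixed only to avoid enumerating rotated copies.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def hh_contacts(seq, path):
    """Number of non-bonded H-H lattice contacts in a 2-D conformation."""
    site = {xy: i for i, xy in enumerate(path)}
    total = 0
    for i, (x, y) in enumerate(path):
        for nb in ((x + 1, y), (x, y + 1)):   # each pair counted once
            j = site.get(nb)
            if j is not None and abs(i - j) > 1 and seq[i] == seq[j] == "H":
                total += 1
    return total

def best_fold(seq):
    """Enumerate every self-avoiding walk and keep the highest-scoring one."""
    best_score, best_path = -1, None

    def extend(path, occupied):
        nonlocal best_score, best_path
        if len(path) == len(seq):
            score = hh_contacts(seq, path)
            if score > best_score:
                best_score, best_path = score, list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]  # fix the first bond to skip rotated copies
    extend(start, set(start))
    return best_score, best_path

score, path = best_fold("HPHPPHHPHH")   # hypothetical 10-residue sequence
print(score, path)
```

The number of walks grows exponentially with chain length, which is why the larger chains mentioned above need heuristic search instead.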

31
Learning from Lattice Models
  • The hydrophobic zipper effect

Ken Dill 1997
32
Threading: Fold recognition
  • Given:
  • Sequence: IVACIVSTEYDVMKAAR
  • A database of molecular coordinates
  • Map the sequence onto each fold
  • Evaluate
  • Objective 1: improve scoring function
  • Objective 2: folding
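
The map-and-evaluate loop can be illustrated with a deliberately simplified, gapless version of threading. Everything here is an assumption made for illustration: each fold is reduced to a made-up burial pattern ('b' buried, 'e' exposed), the hydrophobic residue set is a rough choice, and the score simply rewards hydrophobic residues in buried positions and polar residues in exposed ones. Real threading methods use structure-derived potentials and allow gaps.

```python
HYDROPHOBIC = set("AVILMFWC")   # simplifying assumption for this sketch

def thread_score(seq, burial, offset):
    """Score a gapless placement of seq on a template burial pattern."""
    score = 0
    for i, aa in enumerate(seq):
        buried = burial[offset + i] == "b"
        # Reward hydrophobic-in-core and polar-on-surface, penalize the rest.
        score += 1 if (aa in HYDROPHOBIC) == buried else -1
    return score

def best_threading(seq, fold_library):
    """Try every fold and every gapless offset; return (score, fold, offset)."""
    best = None
    for name, burial in fold_library.items():
        for offset in range(len(burial) - len(seq) + 1):
            s = thread_score(seq, burial, offset)
            if best is None or s > best[0]:
                best = (s, name, offset)
    return best

# Hypothetical templates: 'b' = buried position, 'e' = exposed position.
fold_library = {
    "fold_A": "bbebbbeeeebbbeeebbeee",
    "fold_B": "eeebbeebbeebbeebbeebb",
}
print(best_threading("IVACIVSTEYDVMKAAR", fold_library))
```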

33
Protein Fold Families
  • CATH website: www.cathdb.info

34
Secondary Structure Prediction
AGVGTVPMTAYGNDIQYYGQVT
A-VGIVPM-AYGQDIQY-GQVT
AG-GIIP--AYGNELQ--GQVT
AGVCTVPMTA---ELQYYG--T
AGVGTVPMTAYGNDIQYYGQVT
----hhhHHHHHHhhh--eeEE
35
Secondary Structure Prediction
  • Easier than folding
  • Current algorithms can predict secondary
    structure with 70-80% accuracy
  • Chou, P.Y. & Fasman, G.D. (1974). Biochemistry,
    13, 211-222.
  • Based on frequencies of occurrence of residues in
    helices and sheets
  • PHD: neural network based
  • Uses a multiple sequence alignment
  • Rost & Sander, Proteins, 1994, 19, 55-72
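
The frequency-based idea can be sketched as follows: a residue's propensity for a state is the fraction of its occurrences found in that state, divided by the overall fraction of residues in that state. The training data below is made up, and the window averaging is only the first step of such a method; the published Chou-Fasman procedure adds nucleation and extension rules that are not reproduced here.

```python
from collections import Counter

def propensities(examples, state):
    """P(aa) = (fraction of aa found in `state`) / (fraction of all residues in `state`)."""
    in_state, total = Counter(), Counter()
    for seq, labels in examples:
        for aa, lab in zip(seq, labels):
            total[aa] += 1
            if lab == state:
                in_state[aa] += 1
    overall = sum(in_state.values()) / sum(total.values())
    return {aa: (in_state[aa] / total[aa]) / overall for aa in total}

def window_average(seq, table, width=4):
    """Mean propensity over each window; values above 1 suggest the state."""
    return [sum(table.get(a, 1.0) for a in seq[i:i + width]) / width
            for i in range(len(seq) - width + 1)]

# Made-up training data: 'H' helix, 'E' strand, '-' other.
examples = [
    ("AEELLKKAGVVTV", "HHHHHHHH-EEEE"),
    ("GSAELLKEVTVYG", "--HHHHHHEEEE-"),
]
helix = propensities(examples, "H")
print(sorted(helix.items(), key=lambda kv: -kv[1])[:3])
print(window_average("AELLKGVTV", helix))
```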

36
Chou-Fasman Parameters
37
HOMOLOGY MODELLING
  • Using database search algorithms, find the
    sequence with known structure that best matches
    the query sequence
  • Assign the structure of the core regions obtained
    from the structure database to the query
    sequence
  • Find the structure of the intervening loops using
    loop closure algorithms

38
Homology Modeling: How it works
  • Find template
  • Align target sequence with template
  • Generate model
  • - add loops
  • - add sidechains
  • Refine model
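
The first two steps, finding a template and aligning the target to it, usually come down to ranking candidates by sequence similarity. The sketch below assumes the sequences are already aligned to equal length with '-' for gaps and ranks them by percent identity. The template names are hypothetical, and the sequences reuse rows of the alignment shown on the secondary structure slide purely as sample data.

```python
def percent_identity(a: str, b: str) -> float:
    """Identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def pick_template(target: str, templates: dict[str, str]) -> tuple[str, float]:
    """Return the name and identity of the closest pre-aligned template."""
    name = max(templates, key=lambda n: percent_identity(target, templates[n]))
    return name, percent_identity(target, templates[name])

# Hypothetical pre-aligned sequences, assumed to have known structures.
target = "AGVGTVPMTAYGNDIQYYGQVT"
templates = {
    "tmplA": "A-VGIVPM-AYGQDIQY-GQVT",
    "tmplB": "AG-GIIP--AYGNELQ--GQVT",
}
print(pick_template(target, templates))
```

Gapped columns mark the regions where loops would later have to be built, since the template offers no coordinates there.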

39
Prediction of Protein Structures
  • Examples: a few good examples

[Figure: pairs of actual and predicted structures]
40
Prediction of Protein Structures
  • Not so good example

41
1esr
44
How can we predict protein structures?
52
G-protein coupled receptors (GPCRs)
  • Vital protein bundles with versatile functions.
  • Play a key role in cellular signaling and the
    regulation of basic physiological processes, and
    interact with more than 50% of prescription drugs.
  • Therefore an excellent potential therapeutic
    target for drug design and the focus of current
    pharmaceutical research.

53
GPCR Functional Classification Problem
  • Although thousands of GPCR sequences are known,
    the crystal structure has been solved for only
    one GPCR sequence, at medium resolution, to date.
  • For many of them, the activating ligand is
    unknown.
  • Functional classification methods for automated
    characterization of such GPCRs are imperative.
  • Not suitable for homology modelling, but hybrid
    methods may work. A. Rayan, J. Mol. Modelling
    (2010), pp. 183-191

54
Schematic overview of the MHC-I antigen
processing and presentation pathway
55
Pathway and MHC Molecule
  • Cytotoxic T-cells recognize antigen peptides
    (8-10 residues) bound to an MHC class I molecule
    on the cell surface.
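
Because the presented peptides are 8-10 residues long, a common first step in epitope analysis is to enumerate every window of that length from a protein sequence. This is a minimal sketch of that enumeration only. The function name and example sequence are arbitrary, and no binding prediction is attempted.

```python
def candidate_peptides(protein: str, min_len: int = 8, max_len: int = 10):
    """Yield (start, peptide) for every window of min_len..max_len residues.

    start is the 1-based position of the first residue of the window.
    """
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield start + 1, protein[start:start + length]

protein = "IVACIVSTEYDVMKAAR"   # arbitrary example sequence
peptides = list(candidate_peptides(protein))
print(len(peptides), peptides[0])
```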

56
MHC-I bound epitope is scanned by T-cell receptor