The Protein Databank

Slides: 40
Provided by: glasnost

1
The Protein Databank
  • Working with protein data-files

2
Determining Biomolecule Structures
  • X-ray crystallography
  • Nuclear magnetic resonance

3
The Protein Databank

4
The PDB Growth Chart
  • figGROWTH.eps

5
Maxim 10.1
  • Beware of anything in the PDB Header Section

6
The PDB Data-File Formats

7
Example PDB structure 1LQT
  • fig1LQT.eps

8
Example PDB structure 1M7T
  • fig1M7T.eps

9
Downloading PDB data-files
  • http://www.rcsb.org/pdb/
  • http://www.ebi.ac.uk/services/

10
Accessing Data In PDB Entries
  • Accessing PDB Annotation Data
  • Free R and resolution

11
Example PDB data-file
  • REMARK 2
  • REMARK 2 RESOLUTION. 1.05 ANGSTROMS.
  • REMARK 215 NMR STUDY
  • REMARK 215 THE COORDINATES IN THIS ENTRY WERE GENERATED FROM SOLUTION
  • REMARK 215 NMR DATA. PROTEIN DATA BANK CONVENTIONS REQUIRE THAT
  • REMARK 215 CRYST1 AND SCALE RECORDS BE INCLUDED, BUT THE VALUES ON
  • REMARK 215 THESE RECORDS ARE MEANINGLESS.

12
Example PDB data-file, cont.
  • .
  • .
  • .
  • REMARK 3 FIT TO DATA USED IN REFINEMENT.
  • REMARK 3 CROSS-VALIDATION METHOD : THROUGHOUT
  • REMARK 3 FREE R VALUE TEST SET SELECTION : RANDOM
  • REMARK 3 R VALUE (WORKING + TEST SET) : 0.134
  • REMARK 3 R VALUE (WORKING SET) : 0.134
  • REMARK 3 FREE R VALUE : 0.153
  • REMARK 3 FREE R VALUE TEST SET SIZE (%) : NULL
  • REMARK 3 FREE R VALUE TEST SET COUNT : 2200
  • .
  • .
  • .

13
Plotting Free R Values against Resolution
  • figFREER.eps
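
The two values plotted here come from the REMARK 2 and REMARK 3 records shown on the earlier slides. A minimal Perl sketch of pulling them out of a data-file follows. It assumes the standard PDB layout in which a colon separates the "FREE R VALUE" label from its value. The subroutine name extract_quality is made up for illustration, and entries without these records (NMR structures, for instance) simply yield undefined values.

```perl
use strict;
use warnings;

# extract_quality - hypothetical helper: returns ( resolution, free R )
# taken from the REMARK 2 and REMARK 3 lines of a PDB data-file.
# Either value is undef when the corresponding record is missing.
sub extract_quality
{
    my @lines = @_;
    my ( $resolution, $free_r );

    foreach my $line ( @lines )
    {
        # e.g. "REMARK   2 RESOLUTION. 1.05 ANGSTROMS."
        if ( $line =~ /^REMARK\s+2\s+RESOLUTION\.\s+(\d+\.\d+)/ )
        {
            $resolution = $1;
        }
        # The "\s*:" stops "FREE R VALUE TEST SET ..." lines from matching.
        elsif ( $line =~ /^REMARK\s+3\s+FREE R VALUE\s*:\s*(\d*\.\d+)/ )
        {
            $free_r = $1;
        }
    }
    return ( $resolution, $free_r );
}

# Records taken from the example data-file slides (spacing is an assumption).
my @example = (
    "REMARK   2 RESOLUTION. 1.05 ANGSTROMS.",
    "REMARK   3   R VALUE            (WORKING SET) : 0.134",
    "REMARK   3   FREE R VALUE                     : 0.153",
    "REMARK   3   FREE R VALUE TEST SET COUNT      : 2200",
);

my ( $res, $free_r ) = extract_quality( @example );
print "Resolution: $res, Free R: $free_r\n";
```

Running the helper over many data-files and printing one "resolution free-R" pair per line would give input suitable for a plotting tool.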

14
Database cross references
  • DBREF 1LQT A 1 456 GB 13882996 AAK47528 1 456
  • DBREF 1LQT B 1 456 GB 13882996 AAK47528 1 456
  • DBREF 1AFI 1 72 SWS P04129 MERP_SHIFL 20 91
  • DBREF 1M7T A 1 66 SWS P10599 THIO_HUMAN 0 65
  • DBREF 1M7T A 67 106 SWS P00274 THIO_ECOLI 68 107
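
The 1AFI record above has a blank chain identifier, so splitting DBREF lines on whitespace would shift every later field by one. A hedged sketch of cutting the record by column instead, assuming the fixed-column DBREF layout of the PDB format guide (ID code in columns 8-11, chain in 13, and so on). The helper name parse_dbref and the hash keys are invented for this example, and the sample line is rebuilt with sprintf because the slide text has lost its original spacing.

```perl
use strict;
use warnings;

# parse_dbref - hypothetical helper: cuts one DBREF record into named
# fields by column position, so a blank chain ID does not shift the rest.
sub parse_dbref
{
    my ( $line ) = @_;
    my %ref;

    @ref{ qw( id chain seq_begin seq_end database
              accession db_id db_begin db_end ) } =
        unpack( "x7 A4 x1 A1 x1 A4 x2 A4 x2 A6 x1 A8 x1 A12 x1 A5 x2 A5",
                $line );

    # "A" strips trailing blanks only; right-justified numbers need this.
    s/^\s+// foreach values %ref;

    return %ref;
}

# Rebuild the 1AFI record from the slide in its assumed column layout.
my $layout = "DBREF  %-4s %1s %4d  %4d  %-6s %-8s %-12s %5d  %5d";
my $line   = sprintf( $layout, "1AFI", "", 1, 72,
                      "SWS", "P04129", "MERP_SHIFL", 20, 91 );

my %ref = parse_dbref( $line );
print "$ref{id} chain '$ref{chain}' residues $ref{seq_begin}-$ref{seq_end} ",
      "map to $ref{database} $ref{accession} ($ref{db_id}) ",
      "$ref{db_begin}-$ref{db_end}\n";
```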

15
Coordinates section
  • REMARK 210
  • REMARK 210 BEST REPRESENTATIVE CONFORMER IN THIS ENSEMBLE : 21
  • REMARK 210

16
Data section
  • ATOM 1 N ARG A 2 26.318 -8.010 39.090 1.00 20.71 N
  • ANISOU 1 N ARG A 2 2040 3071 2755 114 -339 -393 N
  • ATOM 2 CA ARG A 2 25.150 -8.702 38.505 1.00 18.85 C
  • ANISOU 2 CA ARG A 2 2029 2677 2455 67 -321 -209 C
  • ATOM 3 C ARG A 2 24.846 -8.176 37.123 1.00 17.23 C
  • ANISOU 3 C ARG A 2 1689 2429 2429 143 -282 -258 C
  • ATOM 4 O ARG A 2 25.151 -7.048 36.775 1.00 18.14 O
  • .
  • .
  • TER 7215 GLY A 456
  • ATOM 7216 N ARG B 2 -19.423 25.709 6.980 1.00 21.57 N
  • ANISOU 7216 N ARG B 2 2476 3012 2707 -165 -370 95 N
  • ATOM 7217 CA ARG B 2 -18.718 26.510 8.024 1.00 19.01 C
  • ANISOU 7217 CA ARG B 2 2127 2672 2424 -63 -285 91 C
  • ATOM 7218 C ARG B 2 -17.250 26.207 8.002 1.00 17.22 C
  • ANISOU 7218 C ARG B 2 1955 2392 2196 -91 -299 121 C
  • ATOM 7219 O ARG B 2 -16.851 25.158 7.535 1.00 18.15 O

17
Data section, cont.
  • TER 14289 GLY B 456
  • HETATM14290 C ACT 1866 -13.075 1.733 10.218 1.00 27.25 C
  • ANISOU14290 C ACT 1866 3493 3560 3299 -39 -36 -44 C
  • .
  • .
  • CONECT14290142911429214293
  • CONECT1429114290
  • CONECT1429214290
  • TER
  • .
  • .
  • CONECT1469014663
  • MASTER 389 0 15 46 38 0 0 620280 2 401 72
  • END

18
Data section, cont.
  • MODEL 1
  • ATOM 1 N MET A 1 3.110 -4.682 -3.025 1.00 0.00 N
  • ATOM 2 CA MET A 1 2.546 -3.712 -2.053 1.00 0.00 C
  • ATOM 3 C MET A 1 1.134 -3.295 -2.450 1.00 0.00 C
  • ATOM 4 O MET A 1 0.882 -2.130 -2.758 1.00 0.00 O
  • ATOM 5 CB MET A 1 3.466 -2.491 -2.002 1.00 0.00 C
  • ATOM 6 CG MET A 1 3.781 -1.903 -3.370 1.00 0.00 C
  • ATOM 7 SD MET A 1 4.256 -0.166 -3.285 1.00 0.00 S
  • ATOM 8 CE MET A 1 6.004 -0.307 -2.920 1.00 0.00 C
  • ATOM 9 1H MET A 1 2.906 -4.327 -3.980 1.00 0.00 H
  • ATOM 10 2H MET A 1 2.650 -5.601 -2.859 1.00 0.00 H
  • ATOM 11 3H MET A 1 4.134 -4.738 -2.858 1.00 0.00 H
  • ATOM 12 HA MET A 1 2.517 -4.178 -1.079 1.00 0.00 H
  • ATOM 13 1HB MET A 1 2.996 -1.724 -1.405 1.00 0.00 H
  • ATOM 14 2HB MET A 1 4.397 -2.778 -1.536 1.00 0.00 H
  • ATOM 15 1HG MET A 1 4.596 -2.461 -3.807 1.00 0.00 H
  • ATOM 16 2HG MET A 1 2.907 -1.993 -3.998 1.00 0.00 H
  • ATOM 17 1HE MET A 1 6.344 -1.302 -3.167 1.00 0.00 H
  • ATOM 18 2HE MET A 1 6.169 -0.120 -1.869 1.00 0.00 H

19
Data section, cont.
  • TER 1659 VAL A 107
  • ENDMDL
  • MODEL 2
  • ATOM 1 N MET A 1 2.750 -6.779 -1.627 1.00 0.00 N
  • ATOM 2 CA MET A 1 2.487 -5.475 -2.290 1.00 0.00 C
  • .
  • .
  • .
  • TER 1660 VAL A 107
  • ENDMDL
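
NMR entries repeat the whole ATOM list once per MODEL / ENDMDL pair, so code that reads every ATOM line will silently mix conformers. One possible way to keep them apart is sketched below. The helper name split_models is invented for this example, and the input is an abbreviated two-model list with spacing restored by assumption.

```perl
use strict;
use warnings;

# split_models - hypothetical helper: returns a hash that maps each MODEL
# serial number to a reference to the list of ATOM lines found inside it.
sub split_models
{
    my @lines = @_;
    my ( %atoms_of, $model );

    foreach my $line ( @lines )
    {
        if ( $line =~ /^MODEL\s+(\d+)/ )
        {
            $model = $1;
            $atoms_of{ $model } = [];
        }
        elsif ( $line =~ /^ENDMDL/ )
        {
            undef $model;                 # ATOM lines outside a model are ignored
        }
        elsif ( defined $model && $line =~ /^ATOM/ )
        {
            push @{ $atoms_of{ $model } }, $line;
        }
    }
    return %atoms_of;
}

# Abbreviated two-model input; only the record names matter here.
my @example = (
    "MODEL        1",
    "ATOM      1  N   MET A   1       3.110  -4.682  -3.025  1.00  0.00           N",
    "ATOM      2  CA  MET A   1       2.546  -3.712  -2.053  1.00  0.00           C",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   MET A   1       2.750  -6.779  -1.627  1.00  0.00           N",
    "ENDMDL",
);

my %atoms_of = split_models( @example );
foreach my $model ( sort { $a <=> $b } keys %atoms_of )
{
    print "Model $model: ", scalar @{ $atoms_of{ $model } }, " ATOM records\n";
}
```

With the models separated, a single conformer (for example the one named in a REMARK 210 "best representative conformer" line) can be selected by its serial number.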

20
Extracting 3D co-ordinate data
  • my ( $X, $Y, $Z ) = ( substr( $_, 30, 8 ),
  •                       substr( $_, 38, 8 ),
  •                       substr( $_, 46, 8 ) );

21
The simple_coord_extract program
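  • #! /usr/bin/perl -w
  • # simple_coord_extract <PDB File> - Demonstrates the extraction of
  • #     C-Alpha co-ordinates from a PDB data-file.
  • use strict;
  • while ( <> )
  • {
  •     # ATOM records only; the atom name field must hold "CA"
  •     if ( /^ATOM/ && substr( $_, 13, 4 ) eq "CA  " )
  •     {
  •         # columns 31-54 hold X, Y and Z, eight characters each
  •         my ( $X, $Y, $Z ) = ( substr( $_, 30, 8 ),
  •                               substr( $_, 38, 8 ),
  •                               substr( $_, 46, 8 ) );
  •         $X =~ s/ //g;
  •         $Y =~ s/ //g;
  •         $Z =~ s/ //g;
  •         print "X, Y & Z: $X, $Y, $Z\n";
  •     }
  • }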

22
Results from simple_coord_extract ...
  • X, Y & Z: 25.150, -8.702, 38.505
  • X, Y & Z: 23.675, -8.497, 35.069
  • X, Y & Z: 20.747, -6.252, 34.332
  • X, Y & Z: 17.545, -8.297, 34.292
  • X, Y & Z: 15.182, -7.484, 31.454
  • X, Y & Z: 11.736, -8.952, 30.942
  • X, Y & Z: 10.261, -9.014, 27.451
  • X, Y & Z: 6.507, -9.548, 27.173

23
The graphic image contact map
  • figCONTACTMAP.eps
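
A contact map marks the residue pairs whose C-alpha atoms lie close together in space. The sketch below shows one way to compute such a map from the co-ordinates printed by simple_coord_extract. The helper name contact_map and the 6.0 Angstrom cut-off are assumptions made for illustration, not values taken from the slides.

```perl
use strict;
use warnings;

# contact_map - hypothetical helper: given a cut-off in Angstroms and a list
# of [ x, y, z ] C-alpha positions, returns a matrix (a list of array refs)
# holding 1 where two residues lie within the cut-off and 0 elsewhere.
sub contact_map
{
    my ( $cutoff, @ca ) = @_;
    my @map;

    foreach my $i ( 0 .. $#ca )
    {
        foreach my $j ( 0 .. $#ca )
        {
            my $distance = sqrt( ( $ca[$i][0] - $ca[$j][0] ) ** 2 +
                                 ( $ca[$i][1] - $ca[$j][1] ) ** 2 +
                                 ( $ca[$i][2] - $ca[$j][2] ) ** 2 );
            $map[$i][$j] = $distance <= $cutoff ? 1 : 0;
        }
    }
    return @map;
}

# First four C-alpha positions from the simple_coord_extract results.
my @ca = ( [ 25.150, -8.702, 38.505 ],
           [ 23.675, -8.497, 35.069 ],
           [ 20.747, -6.252, 34.332 ],
           [ 17.545, -8.297, 34.292 ] );

# One row of 0s and 1s per residue.
foreach my $row ( contact_map( 6.0, @ca ) )
{
    print join( " ", @{ $row } ), "\n";
}
```

Neighbouring C-alpha atoms sit roughly 3.8 Angstroms apart, so the entries next to the diagonal are always set. The interesting contacts are the ones far from it.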

24
STRIDE Secondary Structure Assignment

25
Maxim 10.2
  • It is often easier and desirable to regenerate
    database annotation than trawl through entries
    reconstituting the annotation using custom code.

26
Installation of STRIDE
  • tar -zxvf stride.tar.gz
  • cd stride
  • make
  • ./stride

27
Assigning Secondary Structures

28
Simplified definition of a Hydrogen Bond
  • figSIMPLIFIED.eps

29
Example of Secondary Structure Elements in Proteins
  • figSSDEMO.eps

30
Definition of Dihedral angles in the backbone of protein structures
  • figPSIPSI.eps
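
For a residue i, phi is the dihedral angle defined by the atoms C(i-1), N(i), CA(i), C(i), and psi is the one defined by N(i), CA(i), C(i), N(i+1). A hedged sketch of the calculation for any four points follows, using the common atan2 formulation with cross products. The subroutine names are invented for this example, and the test points are synthetic rather than taken from a PDB entry.

```perl
use strict;
use warnings;

# Small vector helpers working on references to [ x, y, z ] lists.
sub subtract { my ( $p, $q ) = @_; return [ map { $p->[$_] - $q->[$_] } 0 .. 2 ]; }
sub dot      { my ( $p, $q ) = @_; return $p->[0] * $q->[0] + $p->[1] * $q->[1] + $p->[2] * $q->[2]; }
sub cross
{
    my ( $p, $q ) = @_;
    return [ $p->[1] * $q->[2] - $p->[2] * $q->[1],
             $p->[2] * $q->[0] - $p->[0] * $q->[2],
             $p->[0] * $q->[1] - $p->[1] * $q->[0] ];
}

# dihedral - hypothetical helper: angle in degrees (-180 to 180) between
# the plane through p0, p1, p2 and the plane through p1, p2, p3.
sub dihedral
{
    my ( $p0, $p1, $p2, $p3 ) = @_;

    my $b1 = subtract( $p1, $p0 );
    my $b2 = subtract( $p2, $p1 );
    my $b3 = subtract( $p3, $p2 );

    my $n1 = cross( $b1, $b2 );        # normal to the first plane
    my $n2 = cross( $b2, $b3 );        # normal to the second plane

    my $sin_part = sqrt( dot( $b2, $b2 ) ) * dot( $b1, $n2 );
    my $cos_part = dot( $n1, $n2 );

    my $pi = 4 * atan2( 1, 1 );
    return atan2( $sin_part, $cos_part ) * 180 / $pi;
}

# Synthetic points forming a right-angle twist about the z axis.
printf "%.1f\n", dihedral( [ 1, 0, 0 ], [ 0, 0, 0 ], [ 0, 0, 1 ], [ 0, 1, 1 ] );   # prints 90.0
```

Using atan2 rather than acos keeps the sign of the angle, which is what separates the left and right halves of a Ramachandran plot.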

31
Using STRIDE and parsing the output
  • ./stride
  • You must specify input file
  • Action: secondary structure assignment
  • Usage: stride [Options] InputFile [ > file ]
  • Options:
  • -f File Output file
  • -mFile MolScript file
  • -o Report secondary structure summary Only
  • -h Report Hydrogen bonds
  • -rId1Id2.. Read only chains Id1, Id2 ...
  • -cId1Id2.. Process only Chains Id1, Id2 ...
  • -q[File] Generate SeQuence file in FASTA format and die
  • Options are position and case insensitive
  • stride -cA 1lqt.pdb

32
Using gawk ...
  • gawk '/ASG/ { print $8 " " $9 }' 1lqt.A.stride
  • 360.00 156.52
  • -75.72 161.36
  • -71.26 145.24
  • -111.08 119.10
  • -118.65 131.78
  • .
  • .
  • gawk '(/ASG/ && /Strand/) { print $8 " " $9 }' 1lqt.A.stride
  • gawk '(/ASG/ && /AlphaHelix/) { print $8 " " $9 }' 1lqt.A.stride
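
The same selection can be done in Perl, which makes it easier to keep all secondary structure types in one pass. The sketch below assumes the whitespace-separated ASG layout implied by the gawk commands (structure name in field 7, phi in field 8, psi in field 9). The helper name phi_psi_by_type is invented, and the three input lines are synthetic, not real STRIDE output for 1LQT. STRIDE appears to print 360.00 where an angle is undefined, as in the first pair listed above, so such pairs are skipped here.

```perl
use strict;
use warnings;

# phi_psi_by_type - hypothetical helper: groups [ phi, psi ] pairs from
# STRIDE "ASG" lines by the secondary structure name in field 7.
sub phi_psi_by_type
{
    my @lines = @_;
    my %angles;

    foreach my $line ( @lines )
    {
        next unless $line =~ /^ASG/;

        # List indices 6, 7, 8 correspond to gawk's $7, $8 and $9.
        my ( $type, $phi, $psi ) = ( split ' ', $line )[ 6, 7, 8 ];

        # Assumption: 360.00 marks an undefined angle (chain ends).
        next if $phi == 360 || $psi == 360;

        push @{ $angles{ $type } }, [ $phi, $psi ];
    }
    return %angles;
}

# Synthetic ASG lines, invented for this example only.
my @example = (
    "ASG  ALA A    1    1    C          Coil    360.00    150.00       0.0      TEST",
    "ASG  ALA A    2    2    H    AlphaHelix    -60.00    -45.00       0.0      TEST",
    "ASG  ALA A    3    3    E        Strand   -120.00    130.00       0.0      TEST",
);

my %angles = phi_psi_by_type( @example );
foreach my $type ( sort keys %angles )
{
    print "$type: @{ $_ }\n" foreach @{ $angles{ $type } };
}
```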

33
Ramachandran Plot of dihedral angles of chain A from 1LQT
  • fig1LQTPHIPSI.eps

34
Extracting amino acid sequences using STRIDE
  • stride -q 1lqt.pdb
  • >1lqt.pdb A 452 1.050
  • RPYYIAIVGSGPSAFFAAASLLKAADTTEDLDMAVDMLEMLPTPWGLVRSGVAPDHPKIK
  • .
  • .
  • >1lqt.pdb B 454 1.050
  • RPYYIAIVGSGPSAFFAAASLLKAADTTEDLDMAVDMLEMLPTPWGLVRSGVAPDHPKIK
  • .
  • .
  • stride -cA -q 1lqt.pdb
  • >1lqt.pdb A 452 1.050
  • RPYYIAIVGSGPSAFFAAASLLKAADTTEDLDMAVDMLEMLPTPWGLVRSGVAPDHPKIK
  • .
  • .
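
The FASTA output is easy to load into a Perl hash keyed by chain. A minimal sketch follows, assuming each header has the form ">file chain ..." as shown above (the remaining two header fields look like a residue count and the resolution, but that reading is a guess). The helper name read_chain_sequences is invented, and the example holds only the first 60 residues of chain A that appear on the slide, split over two lines to show that sequence lines are concatenated.

```perl
use strict;
use warnings;

# read_chain_sequences - hypothetical helper: returns a hash mapping each
# chain identifier (second header field) to its concatenated sequence.
sub read_chain_sequences
{
    my @lines = @_;
    my ( %sequence_of, $chain );

    foreach my $line ( @lines )
    {
        if ( $line =~ /^>\S+\s+(\S+)/ )
        {
            $chain = $1;
            $sequence_of{ $chain } = "";
        }
        elsif ( defined $chain && $line =~ /^([A-Z]+)\s*$/ )
        {
            $sequence_of{ $chain } .= $1;
        }
    }
    return %sequence_of;
}

# Truncated example: only the residues visible on the slide.
my @example = (
    ">1lqt.pdb A 452 1.050",
    "RPYYIAIVGSGPSAFFAAASLLKAADTTEDLDMAVDMLEMLPTPWGLVRS",
    "GVAPDHPKIK",
);

my %sequence_of = read_chain_sequences( @example );
print "Chain A: ", length( $sequence_of{"A"} ), " residues read\n";
```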

35
Introducing The mmCIF Protein Format

36
Converting mmCIF
  • Converting mmCIF to PDB
  • Converting mmCIFs to PDB with CIFTr

37
The CIFTr program
  • cd
  • tar -zxvf ciftr-v2.0-linux.tar.gz
  • cd ciftr-v2.0-linux/
  • setenv RCSBROOT /ciftr-v2.0-linux (for csh or tcsh)
  • export RCSBROOT=/ciftr-v2.0-linux (for sh or bash)
  • ./CIFTr -i 1lqt.cif

38
More on mmCIF
  • Problems with the CIFTr conversion
  • Some advice on using mmCIF
  • Automated conversion of mmCIF to PDB
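
One possible shape for an automated conversion is a small Perl wrapper that builds one CIFTr command per mmCIF file. The sketch below uses only the "-i" option shown on the CIFTr slide. It assumes that CIFTr sits in the current directory, that RCSBROOT is already set, and that file names contain no spaces. The helper name ciftr_commands and the second and third file names are hypothetical. The call that would actually run each command is left commented out.

```perl
use strict;
use warnings;

# ciftr_commands - hypothetical helper: builds one "./CIFTr -i <file>"
# command per mmCIF file name, skipping names that do not end in ".cif".
sub ciftr_commands
{
    my @files = @_;
    return map { "./CIFTr -i $_" } grep { /\.cif$/ } @files;
}

foreach my $command ( ciftr_commands( "1lqt.cif", "1m7t.cif", "notes.txt" ) )
{
    print "$command\n";
    # Uncomment to run the conversion once CIFTr is installed:
    # system( $command ) == 0 or warn "CIFTr failed for: $command\n";
}
```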

39
Where To From Here