BCB 444/544 - PowerPoint PPT Presentation

BCB 444/544 F07 ISU Dobbs #21 - Protein Secondary Structure Prediction ... SCOP = Structural Classification of Proteins ... http://scop.mrc-lmb.cam.ac.uk/scop ... – PowerPoint PPT presentation

Slides: 39
Provided by: publicI
Tags: bcb | scop

Transcript and Presenter's Notes

1
BCB 444/544
  • Lecture 21
  • Protein Structure Visualization, Classification
    & Comparison
  • Secondary Structure Prediction
  • 21_Oct10

2
Required Reading (before lecture)
  • Mon Oct 8 - Lecture 20
  • Protein Secondary Structure Prediction
  • Chp 14 - pp 200 - 213
  • Wed Oct 10 - Lecture 21
  • Protein Tertiary Structure Prediction
  • Chp 15 - pp 214 - 230
  • Thurs Oct 11 & Fri Oct 12 - Lab 7 & Lecture 22
  • Protein Tertiary Structure Prediction
  • Chp 15 - pp 214 - 230

3
Assignments & Announcements
  • ALL: HomeWork 3
  • Due Mon Oct 8 by 5 PM
  • HW544: HW544Extra 1
  • Due: Task 1.1 - Mon Oct 1 by noon
  • Due: Task 1.2 & Task 2 - Fri Oct 12 by 5 PM
  • 444 "Project-instead-of-Final" students should
    also submit
  • HW544Extra 1
  • Due: Task 1.1 - Mon Oct 8 by noon
  • Due: Task 1.2 - Fri Oct 12 by 5 PM
  • <Task 2 NOT required for BCB444 students>

4
Seminars this Week - Thurs
  • BCB List of URLs for Seminars related to
    Bioinformatics
  • http://www.bcb.iastate.edu/seminars/index.html
  • Oct 11 Thurs
  • Dr. Klaus Schulten (Univ of Illinois) - Baker
    Center Seminar: "The Computational Microscope",
    2:10 PM in E164 Lagomarcino
  • http://www.bioinformatics.iastate.edu/seminars/abstracts/2007_2008/Klaus_Schulten_Seminar.pdf
  • Dr. Dan Gusfield (UC Davis) - Computer Science
    Colloquium: "ReCombinatorics: Combinatorial
    Algorithms for Studying History of Recombination
    in Populations", 3:30 PM in Howe Hall Auditorium
  • http://www.cs.iastate.edu/colloq/new/gusfield.shtml

5
Seminars this Week - Fri
  • BCB List of URLs for Seminars related to
    Bioinformatics
  • http://www.bcb.iastate.edu/seminars/index.html
  • Oct 12 Fri
  • Dr. Edward Yu (Physics/BBMB, ISU) - BCB Faculty
    Seminar: TBA, "Structural Biology" (see
    URL below), 2:10 PM in 102 Sci
  • http://webdev.its.iastate.edu/webnews/data/site_gdcb_dept_seminars/30/webnewsfilefield_abstract/Dr.-Ed-Yu.pdf
  • Dr. Srinivas Aluru (ECpE, ISU) - GDCB Seminar:
    "Consensus Genetic Maps: A Graph Theoretic
    Approach", 4:10 PM in 1414 MBB
  • http://webdev.its.iastate.edu/webnews/data/site_gdcb_dept_seminars/35/webnewsfilefield_abstract/Dr.-Srinivas-Aluru.pdf

6
Chp 12 - Protein Structure Basics
  • SECTION V: STRUCTURAL BIOINFORMATICS
  • Xiong: Chp 12 - Protein Structure Basics
  • Amino Acids
  • Peptide Bond Formation
  • Dihedral Angles
  • Hierarchy
  • Secondary Structures
  • Tertiary Structures
  • Determination of Protein 3-Dimensional Structure
  • Protein Structure DataBank (PDB)

7
Protein Structure & Function
  • Protein structure - primarily determined by
    sequence
  • Protein function - primarily determined by
    structure
  • Globular proteins: compact hydrophobic core &
    hydrophilic surface
  • Membrane proteins: special hydrophobic surfaces
  • Folded proteins are only marginally stable
  • Some proteins do not assume a stable "fold" until
    they bind to something = intrinsically disordered
  • Predicting protein structure and function can be
    very hard
  • -- and fun!

8
6 Main Classes of Protein Structure
  • 1) α-Domains
  • Bundles of helices connected by loops
  • 2) β-Domains
  • Mainly antiparallel sheets, usually 2 sheets
    forming a sandwich
  • 3) α/β Domains
  • Mainly parallel sheets with intervening helices,
    mixed sheets
  • 4) α+β Domains
  • Mainly segregated helices and sheets
  • 5) Multidomain (α & β)
  • Containing domains from more than one class
  • 6) Membrane & cell-surface proteins

9
Protein Structure Databases
  • PDB - Protein Data Bank
  • http://www.rcsb.org/pdb/
  • (RCSB) - THE protein structure database
  • MMDB - Molecular Modeling Database
  • http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure
  • (NCBI Entrez) - has "added" value
  • MSD - Molecular Structure Database
    http://www.ebi.ac.uk/msd
  • Especially good for interactions & binding sites

10
PDB (RCSB) - recently "remediated"
http://www.rcsb.org/pdb
11
Structure at NCBI http://www.ncbi.nlm.nih.gov/Structure
12
MMDB at NCBI http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml
13
MMDB: Molecular Modeling Data Base
  • Derived from PDB structure records
  • "Value-added" to PDB records includes:
  • Integration with other ENTREZ databases & tools
  • Conversion to parseable ASN.1 data description
    language
  • Data also available in mmCIF & XML (also true
    for PDB now)
  • Correction of numbering discrepancies in
    structure vs sequence
  • Validation
  • Explicit chemical graph information (covalent
    bonds)
  • Integrated tool for identifying structural
    neighbors: Vector Alignment Search Tool (VAST)
  • http://www.ncbi.nlm.nih.gov/Structure/VAST/vastsearch.html

14
MSD: Molecular Structure Database http://www.ebi.ac.uk/msd/
15
wwPDB: World Wide PDB http://www.wwpdb.org
16
Experimental Determination of 3D Structure
  • 2 Major Methods to obtain high-resolution
    structures
  • X-ray Crystallography (most PDB structures)
  • Nuclear Magnetic Resonance (NMR) Spectroscopy
  • Note: Advantages & Limitations of each method
  • (See your lecture notes & textbook)
  • For more info: http://en.wikipedia.org/wiki/Protein_structure
  • Other methods (usually lower resolution, at
    present)
  • Electron Paramagnetic Resonance (EPR - also
    called ESR, EMR)
  • Electron microscopy (EM)
  • Cryo-EM
  • Scanning Probe Microscopies (AFM - Atomic Force
    Microscopy)
  • http://www.uweb.engr.washington.edu/research/tutorials/SPM.pdf
  • Circular Dichroism (CD), several other
    spectroscopic methods

17
Chp 13 - Protein Structure Visualization,
Comparison & Classification
  • SECTION V: STRUCTURAL BIOINFORMATICS
  • Xiong: Chp 13
  • Protein Structure Visualization, Comparison &
    Classification
  • Protein Structural Visualization
  • Protein Structure Comparison
  • Protein Structure Classification

18
Protein Structure Visualization
  • RASMOL & descendants: PyMol, MolMol
  • http://www.umass.edu/microbio/rasmol/index2.htm
  • Cn3D - esp. good for structural alignments
  • http://www.biosino.org/mirror/www.ncbi.nlm.nih.gov/Structure/cn3d/
  • CHIME (Protein Explorer)
  • http://www.umass.edu/microbio/chime/getchime.htm
  • MolviZ.Org
  • http://www.umass.edu/microbio/chime
  • Deep View = Swiss-PDB Viewer
  • http://www.expasy.org/spdbv

19
PyMol http://pymol.sourceforge.net/
20
Cn3D http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml
21
Cn3D: Displaying 3° Structures
Chloroquine
22
Cn3D: Structural Alignments
NADH
23
Protein Explorer (Chime) http://www.umass.edu/microbio/chime/pe_beta/pe/protexpl/frntdoor.htm
24
Protein Structure Comparison Methods
We will skip this for now
  • 3 Basic Approaches for Aligning Structures
  • Intermolecular -
  • Intramolecular -
  • Combined -
  • DALI/FSSP (most commonly used)
  • Fully automated structure alignments
  • DALI server: http://www.ebi.ac.uk/dali/index.html
  • DALI Database (fold classification):
    http://ekhidna.biocenter.helsinki.fi/dali/start

25
Protein Structure Classification
  • SCOP = Structural Classification of Proteins
  • Levels reflect both evolutionary and structural
    relationships
  • http://scop.mrc-lmb.cam.ac.uk/scop
  • CATH = Classification by Class, Architecture,
    Topology & Homology
    http://cathwww.biochem.ucl.ac.uk/latest/
  • DALI - (recently moved to EBI & reorganized)
  • DALI Database (fold classification):
    http://ekhidna.biocenter.helsinki.fi/dali/start

Each method has strengths & weaknesses.
26
SCOP - Structure Classification http://scop.mrc-lmb.cam.ac.uk/scop/
27
CATH - Structure Classification
http://www.cathdb.info/latest/index.html
28
Chp 14 - Secondary Structure Prediction
  • SECTION V: STRUCTURAL BIOINFORMATICS
  • Xiong: Chp 14
  • Protein Secondary Structure Prediction
  • Secondary Structure Prediction for Globular
    Proteins
  • Secondary Structure Prediction for Transmembrane
    Proteins
  • Coiled-Coil Prediction

29
Secondary Structure Prediction
  • Has become highly accurate in recent years (>85%)
  • Usually 3 (or 4) state predictions
  • H = α-helix
  • E = β-strand
  • C = coil (or loop)
  • (T = turn)

30
Secondary Structure Prediction Methods
  • 1st Generation methods
  • Ab initio - used relatively small dataset of
    structures available
  • Chou-Fasman - based on amino acid propensities
    (3-state)
  • GOR - also propensity-based (4-state)
  • 2nd Generation methods
  • based on much larger datasets of structures now
    available
  • GOR II, III, IV, SOPM
  • 3rd Generation methods
  • Homology-based & Neural network based
  • PHD, PSIPRED, SSPRO, PROF, HMMSTR
  • Meta-Servers
  • combine several different methods
  • Consensus & Ensemble based
  • JPRED, PredictProtein
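
The consensus idea behind meta-servers can be sketched as a per-residue majority vote over several 3-state predictions. This is only an illustration of the principle, not how JPRED or PredictProtein actually combine their inputs; the function name, the tie-breaking rule (fall back to coil) and the example strings are all assumptions made for this sketch.

```python
from collections import Counter


def consensus(predictions: list[str], tie_break: str = "C") -> str:
    """Per-residue majority vote over equal-length H/E/C prediction strings.

    Hypothetical helper: ties are resolved to `tie_break` (coil by default),
    which is an assumption of this sketch, not a rule from any real server.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    result = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top_state, top_count = counts[0]
        # Collect every state that shares the highest count.
        tied = [state for state, count in counts if count == top_count]
        result.append(top_state if len(tied) == 1 else tie_break)
    return "".join(result)


# Made-up outputs from three hypothetical predictors for one sequence.
example_predictions = [
    "CHHHHCCEEEC",
    "CCHHHCCEEEC",
    "CHHHHHCEECC",
]
print(consensus(example_predictions))
```

A real ensemble would usually weight each method by its reliability rather than count votes equally.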

31
Secondary Structure Prediction Servers
  • Prediction Evaluation?
  • Q3 score - % of residues correctly predicted
    (3-state)
  • in cross-validation experiments
  • Best results? Meta-servers
  • http://expasy.org/tools/ (scroll for 2°
    structure prediction)
  • http://www.russell.embl-heidelberg.de/gtsp/secstrucpred.html
  • JPred: www.compbio.dundee.ac.uk/www-jpred
  • PredictProtein: http://www.predictprotein.org/
    (Rost, Columbia)
  • Best individual programs? ??
  • CDM: http://gor.bb.iastate.edu/cdm/
    (Sen & Jernigan, ISU)
  • GOR V: http://gor.bb.iastate.edu/
    (Kloczkowski & Jernigan, ISU)
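
The Q3 score mentioned above is simple enough to compute by hand: it is the percentage of residues whose predicted state (H, E or C) matches the observed state. A minimal sketch, with a made-up pair of strings as input:

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose 3-state label (H, E or C) is correct."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Illustrative strings only; they do not come from a real protein.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEEECC"
print(f"Q3 = {q3_score(predicted, observed):.2f}%")
```

In a cross-validation experiment this score would be averaged over proteins that were held out of training.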

32
Consensus Data Mining (CDM)
  • Developed by Jernigan Group at ISU
  • Basic premise: combining 2 complementary
    methods can enhance performance by harnessing
    the distinct advantages of both; CDM combines
    FDM & GOR V
  • FDM - Fragment Data Mining - exploits
    availability of sequence-similar fragments in the
    PDB, which can lead to highly accurate prediction
    - much better than GOR V - for such fragments,
    but such fragments are not available in many
    cases
  • GOR V - Garnier, Osguthorpe, Robson V - predicts
    secondary structure of less similar fragments
    with good performance; these are protein
    fragments for which the FDM method cannot find
    suitable structures
  • For references & additional details:
    http://gor.bb.iastate.edu/cdm/
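
The premise above (use FDM where similar fragments exist, fall back on GOR V elsewhere) can be illustrated with a deliberately simplified sketch. The actual CDM decision rule is described at the URL above and is not reproduced here; the "-" placeholder for "no FDM prediction", the function name and the example strings are all assumptions made for illustration.

```python
def combine_fdm_gor(fdm: str, gor: str, missing: str = "-") -> str:
    """Take the FDM state where one exists, otherwise the GOR V state.

    Hypothetical simplification: `missing` marks residues for which no
    sufficiently similar PDB fragment was found by the FDM step.
    """
    if len(fdm) != len(gor):
        raise ValueError("both predictions must cover the same residues")
    return "".join(g if f == missing else f for f, g in zip(fdm, gor))


# Made-up per-residue predictions for a 14-residue sequence.
fdm_prediction = "--HHHH----EEE-"
gor_prediction = "CCHHCHHCCCEECC"
print(combine_fdm_gor(fdm_prediction, gor_prediction))
```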

33
Secondary Structure Prediction for Different
Types of Proteins/Domains
  • For Complete proteins:
  • Globular Proteins - use methods previously
    described
  • Transmembrane (TM) Proteins - use special
    methods
  • (next slides)
  • For Structural Domains: many under development
  • Coiled-Coil Domains (Protein interaction
    domains)
  • Zinc Finger Domains (DNA binding domains),
  • others

34
SS Prediction for Transmembrane Proteins
  • Transmembrane (TM) Proteins
  • Only a few in the PDB - but 30% of cellular
    proteins are membrane-associated!
  • Hard to determine experimentally, so prediction
    is important
  • TM domains are relatively 'easy' to predict!
  • Why? Constraints due to hydrophobic environment
  • 2 main classes of TM proteins:
  • α-helical
  • β-barrel

35
SS Prediction for TM α-Helices
  • α-Helical TM domains
  • Helices are 17-25 amino acids long (span the
    membrane)
  • Predominantly hydrophobic residues
  • Helices oriented perpendicular to membrane
  • Orientation can be predicted using "positive
    inside" rule
  • Residues at cytosolic (inside or cytoplasmic)
    side of TM helix, near hydrophobic anchor, are
    more positively charged than those on lumenal
    (inside an organelle in eukaryotes) or
    periplasmic side (space between inner & outer
    membrane in gram-negative bacteria)
  • Alternating polar & hydrophobic residues provide
    clues to interactions among helices within
    membrane
  • Servers?
  • TMHMM or HMMTOP - 70% accuracy - confused by
    hydrophobic signal peptides (short hydrophobic
    sequences that target proteins to the
    endoplasmic reticulum, ER)
  • Phobius - 94% accuracy - uses distinct HMM
    models for TM helices & signal peptide sequences
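
The two sequence signals listed above (a long hydrophobic stretch, and more positive charge on the cytoplasmic side) can be illustrated without an HMM. The sketch below is not what TMHMM, HMMTOP or Phobius do; it swaps in the classic sliding-window hydropathy approach with the Kyte-Doolittle scale. The window of 19 and threshold of 1.6 are commonly quoted heuristics treated here as assumptions, and the function names, the flank size of 15 and the toy sequence are invented for this example.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every window; unknown residues score 0.0."""
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19,
                         threshold: float = 1.6) -> list[int]:
    """Start positions of windows hydrophobic enough to be TM candidates."""
    profile = hydropathy_profile(seq, window)
    return [i for i, mean in enumerate(profile) if mean >= threshold]


def positive_inside(seq: str, start: int, end: int, flank: int = 15) -> str:
    """Apply the positive-inside rule to a helix spanning seq[start:end].

    Counts Lys/Arg in the flanking regions; the flank with more positive
    residues is guessed to be cytoplasmic.
    """
    seq = seq.upper()
    before = seq[max(0, start - flank):start]
    after = seq[end:end + flank]
    n_before = before.count("K") + before.count("R")
    n_after = after.count("K") + after.count("R")
    if n_before == n_after:
        return "undetermined"
    if n_before > n_after:
        return "N-terminal flank inside"
    return "C-terminal flank inside"


# Toy sequence: acidic flank, 20 leucines, lysine-rich flank.
toy = "D" * 10 + "L" * 20 + "K" * 10
print(candidate_tm_windows(toy))
print(positive_inside(toy, 10, 30))
```

A plain hydropathy scan has the same weakness noted above for TMHMM and HMMTOP: a hydrophobic signal peptide looks just like a TM helix to it.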

36
SS Prediction for TM β-Barrels
  • β-Barrel TM domains
  • β-strands are amphipathic (partly hydrophobic,
    partly hydrophilic)
  • Strands are 10 - 22 amino acids long
  • Every 2nd residue is hydrophobic, facing lipid
    bilayer
  • Other residues are hydrophilic, facing "pore" or
    opening
  • Servers? Harder problem, fewer servers
  • TBBPred - uses NN or SVM (more on these ML
    methods later)
  • Accuracy ?
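
The "every 2nd residue is hydrophobic" pattern can be turned into a crude score. The sketch below is not TBBPred (which uses a neural network or SVM); it only measures how well a segment fits strict hydrophobic/hydrophilic alternation. The hydrophobic residue set, the window of 10, the 0.8 cutoff and all names are assumptions chosen for illustration.

```python
# Assumption: one common choice of "hydrophobic" residues.
HYDROPHOBIC = set("AVLIMFWYC")


def alternation_score(segment: str) -> float:
    """Fraction of positions that fit an alternating pattern (best phase).

    1.0 means perfect hydrophobic/hydrophilic alternation; 0.5 is the
    minimum possible, e.g. for an all-hydrophobic segment.
    """
    segment = segment.upper()
    if not segment:
        return 0.0
    best = 0.0
    for phase in (0, 1):
        hits = sum(
            (i % 2 == phase) == (aa in HYDROPHOBIC)
            for i, aa in enumerate(segment)
        )
        best = max(best, hits / len(segment))
    return best


def candidate_strands(seq: str, window: int = 10,
                      cutoff: float = 0.8) -> list[int]:
    """Start positions of windows whose alternation score reaches cutoff."""
    return [
        i for i in range(len(seq) - window + 1)
        if alternation_score(seq[i:i + window]) >= cutoff
    ]


print(alternation_score("VTLSVNYTFK"))  # alternating toy strand
print(alternation_score("LLLLLLLLLL"))  # uniformly hydrophobic
```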

37
Prediction of Coiled-Coil Domains
  • Coiled-coils
  • Superhelical protein motifs or domains, with two
    or more interacting α-helices that form a
    "bundle"
  • Often mediate inter-protein (& intra-protein)
    interactions
  • 'Easy' to detect in primary sequence
  • Internal repeat of 7 residues (heptad)
  • 1 & 4 hydrophobic (facing helical interface)
  • 2,3,5,6,7 hydrophilic (exposed to solvent)
  • Helical wheel representation - can be used to
    manually detect these, based on amino acid
    sequence
  • Servers?
  • Coils, Multicoil - probability-based methods
  • 2Zip - for Leucine zippers, a special type of CC
    in TFs
  • characterized by Leu-rich motif
    L-X(6)-L-X(6)-L-X(6)-L
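
The leucine-zipper motif L-X(6)-L-X(6)-L-X(6)-L maps directly onto a regular expression, and the heptad rule (positions 1 & 4 hydrophobic) onto a simple count. The sketch below is not the method used by Coils, Multicoil or 2Zip; the hydrophobic set, the function names and the toy sequence are assumptions made for illustration.

```python
import re

# L-X(6)-L-X(6)-L-X(6)-L; the lookahead lets overlapping matches be found.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

# Assumption: one common choice of "hydrophobic" residues.
HYDROPHOBIC = set("AVLIMFWYC")


def find_leucine_zippers(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 22-residue segment) for each hit."""
    return [(m.start(), m.group(1)) for m in LEUCINE_ZIPPER.finditer(seq.upper())]


def heptad_score(seq: str, offset: int = 0) -> float:
    """Fraction of heptad positions 1 and 4 that are hydrophobic.

    `offset` is the index of the residue taken as heptad position 1.
    """
    core = [aa for i, aa in enumerate(seq.upper()[offset:]) if i % 7 in (0, 3)]
    if not core:
        return 0.0
    return sum(aa in HYDROPHOBIC for aa in core) / len(core)


# Toy sequence: three "LEDVKEE" heptads plus a closing leucine.
toy = "MK" + "LEDVKEE" * 3 + "L" + "GG"
print(find_leucine_zippers(toy))
print(heptad_score(toy, offset=2))
```

In practice the heptad register is unknown, so a scan would try all 7 offsets and keep the best-scoring one.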

38
Chp 15 - Tertiary Structure Prediction
  • SECTION V: STRUCTURAL BIOINFORMATICS
  • Xiong: Chp 15
  • Protein Tertiary Structure Prediction
  • Methods
  • Homology Modeling
  • Threading and Fold Recognition
  • Ab Initio Protein Structural Prediction
  • CASP