Lecture 6.3: From DNA to Protein

Slides: 62
Provided by: Comp684

1
Lecture 6.3: From DNA to Protein
  • Dr. Joanne Fox
  • Day 6: Saturday, February 21st, 2004
  • 13:45 to 15:15

2
From DNA to Protein
3
Objectives
  • Review protein sequence features and databases
  • Review the structural diversity of amino acids
    and protein sequences
  • Highlight several physicochemical and structural
    features which can be calculated from protein
    sequences
  • Show how proteomics utilizes methods and
    techniques for measuring, comparing and assessing
    protein features

4
Outline
  • Protein sequence features
  • Databases of protein sequences
  • Basics of protein structure
  • 1° structure, prediction of Mw and pI
  • 2° structure, prediction methods
  • 3° structure, methods for predicting folds
  • Proteomics
  • Current methods
  • Cutting edge technology

5
Amino Acids
amino group
alpha carbon
  • The general formula for an amino acid
  • R is commonly one of 20 different side chains
  • At pH 7 both the amino and carboxyl groups are
    ionized

carboxyl group
side chain group
6
Peptide Bonds
  • Amino acids are joined together by an amide
    linkage called a peptide bond.
  • The two bonds on either side of the rigid planar
    peptide unit exhibit a high degree of rotation

peptide bonds
rotation occurs here
7
Families of Amino Acids
  • The common amino acids are grouped according to
    whether their side chains are
  • acidic: D, E
  • basic: K, R, H
  • uncharged polar: N, Q, S, T, Y
  • nonpolar: G, A, V, L, I, P, F, M, W, C
  • Hydrophilic amino acids (uncharged polar) are
    usually on the outside of a protein, whereas
    nonpolar residues cluster on the inside of a
    protein
  • Basic or acidic amino acids are very polar and
    are generally found on the outside of protein
    molecules
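
The grouping on this slide can be written down as a small lookup table. The sketch below uses exactly the four families and one-letter codes listed above; the function names `family_of` and `family_composition` are illustrative choices, not part of the lecture.

```python
# Side-chain families exactly as listed on the slide (one-letter codes).
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "uncharged polar": set("NQSTY"),
    "nonpolar": set("GAVLIPFMWC"),
}


def family_of(residue: str) -> str:
    """Return the side-chain family of a one-letter amino acid code."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def family_composition(sequence: str) -> dict:
    """Count how many residues of each family a sequence contains."""
    counts = {name: 0 for name in FAMILIES}
    for residue in sequence:
        counts[family_of(residue)] += 1
    return counts
```

A sequence rich in the "nonpolar" family would, by the rule of thumb above, be expected to bury more of its residues in the protein interior.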

8
Protein Sequence Features
  • Proteins exhibit far more sequence and chemical
    complexity than DNA or RNA
  • Properties and structure are defined by the
    sequence and side chains of their constituent
    amino acids
  • The engines of life
  • >95% of all drugs target proteins
  • Favorite topic of post-genomic era

9
Protein Sequence Databases
  • Where does protein sequence information reside?
  • Entrez Cross Database Search
  • http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi
  • Swissprot and TrEMBL
  • http://ca.expasy.org/sprot/
  • PIR
  • http://pir.georgetown.edu/
  • As of December 2003, all of this information is
    integrated into a unified protein database called
    Uniprot.
  • Uniprot
  • http://www.pir.uniprot.org/

10
Entrez Cross Database Search
  • Protein sequence database gives access to
    translated protein sequences from
    Genbank/EMBL/DDBJ
  • Complete set of deduced protein sequences
  • Redundancy problem

11
Swissprot and TrEMBL
  • Swissprot is an expert curated database
  • Function, domain structure, post-translational
    modifications, variants, reactions, similarities
  • TrEMBL (translated EMBL)
  • Computer annotated supplement to Swissprot

12
PIR: Protein Information Resource
  • Annotated database which includes protein family
    classification information

13
The Uniprot Knowledgebase
  • Contains all of the information in Swiss-Prot,
    TrEMBL, and PIR. This new unified database was
    launched in December 2003.

14
Basics of Protein Structure
  • Primary
  • Secondary
  • Tertiary

15
Molecular Weight
  • Quick formula: Mw ≈ 110 Da × number of residues
  • Accurate determination of mass by mass
    spectrometry
  • Tools exist for accurately calculating mass of
    peptides based on amino acid composition
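
Both estimates on this slide are easy to sketch in code. The residue masses below are the commonly tabulated average masses (amino acid minus one water); they are not taken from the lecture and should be checked against a reference table before any real use. The function names are illustrative.

```python
# Average residue masses in daltons (free amino acid minus one water).
# Assumption: commonly tabulated average values, not from the lecture.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def quick_mw(sequence: str) -> float:
    """Quick formula from the slide: 110 Da per residue."""
    return 110 * len(sequence)


def composition_mw(sequence: str) -> float:
    """Average mass from amino acid composition (unmodified chain)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
```

The quick formula works best for long, compositionally average proteins; for short peptides the composition-based sum can differ noticeably from 110 Da per residue.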

16
Molecular Weight and Proteomics
2-D Gel, QTOF Mass Spectrometry
17
Isoelectric Point
  • The pH at which a protein has a net charge of 0
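
One common way to estimate pI from sequence is to sum the Henderson-Hasselbalch charge of every ionizable group and search for the pH where the total is zero. The sketch below does this by bisection. The pKa values are an assumption (published sets differ by a few tenths of a unit), and the model ignores the structural environment of each group.

```python
# Assumed pKa values for ionizable groups; published tables vary.
PKA_POSITIVE = {"N_term": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"C_term": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of an unmodified chain at a given pH."""
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["N_term"] = 1
    counts["C_term"] = 1
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        # Fraction of the group that is protonated (charge +1).
        charge += counts[group] / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_NEGATIVE.items():
        # Fraction of the group that is deprotonated (charge -1).
        charge -= counts[group] / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Bisection search for the pH where the net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid  # negative or zero: the pI lies at lower pH
    return (low + high) / 2
```

Bisection works here because the net charge only decreases as the pH rises, so there is a single crossing point between pH 0 and pH 14.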

18
Basics of Protein Structure
  • Primary
  • Secondary
  • Tertiary

19
Common Secondary Structure Elements
  • The Alpha Helix

20
Common Secondary Structure Elements
  • The Beta Sheet

21
Secondary Structure: Phi and Psi Angles Defined
  • Rotational constraints emerge from interactions
    with bulky groups (i.e., side chains).
  • Phi and psi angles define the secondary structure
    adopted by a protein.
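
Phi and psi are dihedral (torsion) angles, so each can be computed from the coordinates of four consecutive backbone atoms: phi from C(i-1), N(i), CA(i), C(i), and psi from N(i), CA(i), C(i), N(i+1). The following is a minimal sketch of a generic dihedral calculation in plain Python; the helper names are illustrative, and coordinates are assumed to be (x, y, z) tuples, for example read from a PDB file.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3) -> float:
    """Torsion angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Calling `dihedral` once per residue with the phi atoms and once with the psi atoms gives the pair of values that is plotted as one point on a Ramachandran plot.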

22
Ramachandran Plot
23
Supersecondary Structure
24
Secondary Structure and Protein Folding
  • Understanding the forces of hydrophobicity

[Figure: an unfolded or partially folded polypeptide, carrying polar and nonpolar side chains, collapses into the folded conformation. The hydrophobic core contains nonpolar side chains; hydrogen bonds can form with polar side chains on the outside of the protein.]
25
Hydrophobicity is a property which can be calculated for protein sequences
  • Hydrophobicity Scales
  • Used to calculate hydrophobicity
  • Based on experimental evidence indicating
    hydrophobic/hydrophilic properties of each amino
    acid
  • Solubility, Stability, Location and/or
    Globularity of protein sequences can be predicted

26
Hydrophobicity Profile
  • Moving segment approach
  • Correlation of this technique with 3D structure

[Figure: hydrophobicity profile. Score (hydrophobic +, hydrophilic -) plotted along the protein sequence from NH2 to COOH, with exterior and interior residues marked.]
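
The moving segment approach can be sketched as a sliding-window average over a per-residue hydrophobicity scale. The lecture does not say which scale it used; the Kyte-Doolittle scale below is a widely used choice and is an assumption here, as is the default window of 7 residues.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the slide names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(sequence: str, window: int = 7) -> list:
    """Average hydropathy of every full window along the sequence.

    Element i of the result covers residues i .. i + window - 1, so the
    profile has len(sequence) - window + 1 points.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

Plotting the returned values against window position gives the kind of profile sketched on this slide: stretches with high averages are candidates for buried or membrane-spanning segments, and stretches with low averages for exposed ones. Shorter windows give noisier profiles; longer ones smooth out short features.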
27
The α-helix is a common secondary structure element
  • A helical wheel is a representation of the 3D
    structure of the α-helix.
  • Projection of amino acid side chains onto a plane
    perpendicular to the axis of the helix
  • Hydrophobic arcs stabilize helical interactions
  • Amphipathic helices are common

[Figure labels: acidic, nonpolar]
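
The projection behind a helical wheel is simple to compute: an ideal α-helix has about 3.6 residues per turn, so each successive residue is rotated by roughly 100° around the helix axis. A minimal sketch, with the function name and the 100° default as illustrative assumptions:

```python
def helical_wheel(sequence: str, degrees_per_residue: float = 100.0) -> list:
    """Return (residue, angle) pairs for a helical wheel projection.

    The angle is the position of each side chain around the helix axis,
    in degrees from 0 up to (but not including) 360, with the first
    residue placed at 0.
    """
    return [
        (aa, (index * degrees_per_residue) % 360.0)
        for index, aa in enumerate(sequence)
    ]
```

Residues whose angles fall within the same arc sit on the same face of the helix. If the nonpolar residues cluster in one arc and the polar or charged ones in the opposite arc, the helix is amphipathic.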
28
Secondary Structure Prediction
  • The presence of secondary structure elements can
    be predicted.
  • Current algorithms rely on
  • statistics (Chou-Fasman, GOR)
  • homology or nearest neighbor comparisons (Levin)
  • physico-chemical properties (Lim, Eisenberg)
  • pattern matching (Cohen, Rooman)
  • neural networks (Qian & Sejnowski, Karplus)
  • evolutionary methods (Barton, Niemann)
  • and combined approaches (Rost, Levin, Argos)

29
Chou-Fasman Algorithm
  1. Assign each residue a Pα, Pβ, Pc value
  2. Take a window of 7 residues and calculate a
     window-averaged value for all Pα, Pβ, Pc
  3. Assign the average value for each of the
     secondary structures to the middle residue
  4. Move down one residue and repeat steps 2 through 3
     until finished
  5. Scan and assign SS to the highest P for each
     residue
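
The windowed procedure listed above can be sketched as follows. This follows the simplified steps on the slide, not the full published method, and the propensity table is a toy: the three-residue `TOY_PROPENSITIES` values are rounded illustrative numbers, not the published Chou-Fasman parameters, so a real table must be supplied for actual use. Residues too close to either end to sit in the middle of a full window are left unassigned ("-"), which is one of several possible conventions.

```python
# Toy propensities for three residues only (illustrative, NOT the
# published Chou-Fasman table). Keys: H = helix, E = strand, C = coil.
TOY_PROPENSITIES = {
    "H": {"A": 1.4, "V": 1.0, "G": 0.6},
    "E": {"A": 0.8, "V": 1.7, "G": 0.7},
    "C": {"A": 0.7, "V": 0.5, "G": 1.5},
}


def window_average_prediction(sequence: str, propensities: dict,
                              window: int = 7) -> str:
    """Assign each residue the state with the highest window-averaged P."""
    half = window // 2
    prediction = []
    for center in range(len(sequence)):
        start, end = center - half, center + half + 1
        if start < 0 or end > len(sequence):
            prediction.append("-")  # no full window centered here
            continue
        segment = sequence[start:end]
        averages = {
            state: sum(table[aa] for aa in segment) / window
            for state, table in propensities.items()
        }
        prediction.append(max(averages, key=averages.get))
    return "".join(prediction)
```

For example, `window_average_prediction("AAAAAAAVVVVVVV", TOY_PROPENSITIES)` assigns helix to the residue centered in the all-alanine window and strand to the residue centered in the all-valine window.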

30
Chou-Fasman Statistics
31
The PhD Approach
PRFILE...
32
The PhD Algorithm
  • Search the SWISS-PROT database and select high
    scoring homologues
  • Create a sequence profile from the resulting
    multiple alignment
  • Include global sequence info in the profile
  • Input the profile into a trained two-layer neural
    network to predict the structure and to
    clean up the prediction
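
The profile step can be illustrated on its own. The sketch below turns a multiple alignment into per-column residue frequencies, which is the simplest form of a sequence profile; it is not the PHD implementation, it omits the database search and the neural network, and the function name and gap handling (gaps written as "-" and ignored) are assumptions.

```python
from collections import Counter


def sequence_profile(alignment: list) -> list:
    """Per-column residue frequencies for equal-length aligned sequences.

    Returns one dict per alignment column mapping residue -> frequency.
    Gap characters ("-") are ignored; an all-gap column gives {}.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(aa for aa in column if aa != "-")
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile
```

A column where one residue has frequency 1.0 is fully conserved among the homologues; a predictor fed with such a profile sees this evolutionary information instead of a single sequence.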

33
Predicting via Neural Nets PSSM
  • PHDhtm
  • http://www.embl-heidelberg.de/predictprotein/
  • TMAP
  • http://www.mbb.ki.se/tmap/index.html
  • TMPred
  • http://www.ch.embnet.org/software/TMPRED_form.html

34
Prediction Performance
35
Best of the Best
  • PredictProtein-PHD (72%)
  • http://cubic.bioc.columbia.edu/predictprotein
  • Jpred (73-75%)
  • http://www.compbio.dundee.ac.uk/www-jpred/
  • PREDATOR (75%)
  • http://www.hgmp.mrc.ac.uk/Registered/Option/predator.html
  • PSIpred (77%)
  • http://bioinf.cs.ucl.ac.uk/psipred/

36
Basics of Protein Structure
  • Primary
  • Secondary
  • Tertiary

37
Tertiary Structure
38
Protein Structure Databases
  • Where does protein structural information reside?
  • PDB
  • http://www.rcsb.org/pdb/
  • MMDB
  • http://www.ncbi.nlm.nih.gov/Structure/
  • FSSP
  • http://www.ebi.ac.uk/dali/fssp/
  • SCOP
  • http://scop.mrc-lmb.cam.ac.uk/scop/
  • CATH
  • http://www.biochem.ucl.ac.uk/bsm/cath_new/

39
Structural Proteomics
  • Aim to delineate total repertoire of protein
    folds
  • Provide 3D portraits for all proteins in an
    organism
  • Goal: Use structure to infer function.
  • Compare structure of unknown protein to known set
    of structures
  • More sensitive than primary sequence comparisons

40
The Protein Fold Universe
How big is it? 500? 2000? 10000? ∞?
41
Structures in PDB
  • PDB: 19860 structures (Jan 03)
  • PDB: 23997 structures (Jan 04)
  • "structural genomics" search: 156 structures (Jan 03)
  • "structural genomics" search: 478 structures (Jan 04)
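
The year-over-year change implied by these counts can be worked out directly. The short sketch below uses only the four numbers on this slide; the helper name `percent_growth` is illustrative.

```python
def percent_growth(before: int, after: int) -> float:
    """Percentage increase from `before` to `after`."""
    return 100.0 * (after - before) / before


# Counts from the slide, January 2003 to January 2004.
pdb_growth = percent_growth(19860, 23997)  # all PDB structures
sg_growth = percent_growth(156, 478)       # "structural genomics" search hits
```

The whole archive grew by roughly a fifth over the year, while the structural genomics subset roughly tripled, from a much smaller base.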
42
Structural Proteomics
[Figure: number of sequences versus number of structures; axis from 0 to 100000]
43
Unique folds in PDB
44
Prediction Methods for 3D structure
  • Intermediate Steps
  • Predict secondary structure
  • Calculate solvent accessibility
  • Methods for 3D structure prediction based on
  • Threading, Homology Modeling or Fold recognition
  • Similarity in amino acid sequence implies similar
    structure/function
  • Ab Initio Techniques
  • Numerical methods designed to simulate the
    structure and dynamics of macromolecules

45
Proteomics
  • The study of the expression, location,
    interaction, function and structure of all the
    proteins in a given cell or organism
  • Expressional Proteomics
  • Functional Proteomics
  • Structural Proteomics

46
Proteomics
  • Expressional Proteomics
  • 2D or Capillary Electrophoresis, protein chips
  • Mass Spectrometry, Laser induced fluorescence
  • Functional Proteomics
  • Mass Spectrometry, micro-assays, protein chips
  • Yeast or Bacterial 2-hybrid systems
  • Structural Proteomics
  • High throughput X-ray crystallography
  • High throughput NMR spectroscopy

47
2D Gel Principles
SDS PAGE
48
Mass Spec Principles
[Figure: Sample → Ionizer → Mass Filter → Detector]
49
Ionization Methods
  • MALDI: 370 nm UV laser, cyano-hydroxy cinnamic acid
  • ESI: fluid (no salt), gold tip needle
50
Protein ID Protocol
51
Computational Tools for Protein Identification
  • PeptIdent
  • http://us.expasy.org/tools/peptident.html
  • Mascot
  • http://www.matrixscience.com/search_form_select.html
  • ProteinProspector
  • http://prospector.ucsf.edu/
  • MOWSE
  • http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse
  • PeptideSearch
  • http://www.mann.embl-heidelberg.de/GroupPages/PageLink/peptidesearchpage.html
  • AACompSim/AACompIdent
  • http://www.expasy.ch/tools

Covered in Lab 6.4
52
Proteomics
  • Human proteome estimated to contain 500,000
    proteins
  • The next big wave in bioinformatics
  • How to deal with so much data?
  • How to link structure to function to sequence?
  • How to show or store temporal and spatial data?
  • How to use it in drug discovery and development?

Proteomics Workshop, July 19 to 24th, 2004
Calgary, Alberta
53
The Cutting Edge of Proteomics
  • Evolution of Proteomes
  • Structural Genomics
  • Quantitative Mass Spectrometry and Protein
    Chip Technology
  • Chemical Proteomics
  • Proteome Scale Analysis of Networks, i.e., signal
    transduction, Y2H experiments

54
Global Proteome Interaction Mapping in C. elegans
Science, 23 January 2004, 303: 540
see also
Science, 7 January 2000, 287: 116
55
Yeast Two Hybrid (Y2H) on the genomic scale
  • Global interaction map of C. elegans
  • Use proteome as bait in Y2H experiment
  • Detect all pairwise interactions
  • Create global protein-protein interaction network
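
The last step, turning detected pairwise interactions into a network, can be sketched with a plain adjacency structure. The protein identifiers in the usage note are hypothetical placeholders, not data from the C. elegans study, and the function names are illustrative.

```python
from collections import defaultdict


def build_interaction_network(pairs) -> dict:
    """Build an undirected network from (bait, prey) interaction pairs.

    Returns a dict mapping each protein to the set of its partners.
    """
    network = defaultdict(set)
    for bait, prey in pairs:
        network[bait].add(prey)
        network[prey].add(bait)
    return dict(network)


def hubs(network: dict, min_partners: int) -> list:
    """Proteins with at least `min_partners` interaction partners."""
    return sorted(
        protein for protein, partners in network.items()
        if len(partners) >= min_partners
    )
```

For instance, with the hypothetical pairs `[("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]`, `hubs(network, 3)` returns `["P1"]`. Storing each interaction in both directions treats a bait-prey hit as a symmetric edge, which discards the information about which protein was the bait.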

56
Protein-Protein Interaction Networks
57
DNA vs Protein Chip Technology
  • DNA microtechnology
  • Can successfully read 1000s of side by side
    measurements of RNA levels
  • BUT RNA ≠ protein function
  • Protein Microarray Technology
  • Goal: develop protein chip with proteins in
    active state.
  • Proteins more challenging to prepare than DNA/RNA
  • Protein functionality depends on state,
    modifications, binding partners, localization,
    etc.

58
Protein Chip - Methods
  • Attachment Methods
  • Diffusion
  • Absorption
  • nitrocellulose
  • Covalent Crosslinking
  • Reactive surfaces
  • Affinity Attachment
  • Affinity tags

59
Protein Chip - Applications
  • Antibody Chip
  • Detect Ag-Ab interactions
  • Protein Chip
  • Protein-protein
  • Protein-drug
  • Enzyme-substrate
  • Ligand Chip
  • And more.

60
Protein Chips
61
Summary
  • Protein sequences, and consequently protein
    sequence databases, are much more complex than
    DNA
  • Prediction of protein structure is a complex
    problem at both the secondary (2°) and tertiary
    (3°) structure levels
  • Proteomics initiatives based on different
    technologies are making inroads into the study of
    protein structure and function on a global level