Bioinformatics how to - PowerPoint PPT Presentation

Slides: 16
Provided by: AdamG63

Transcript and Presenter's Notes

1
Bioinformatics how to
  • use publicly available free tools to predict
    protein structure by comparative modeling

2
Proteins are 3D objects with complex shapes
  • Over 30,000 protein structures have been
    determined, mostly by X-ray crystallography (PDB)
  • The 3D structure of 70% of bacterial and 50% of
    human proteins can be predicted (comparative
    modeling)

3
A predicted model simply illustrates our assumptions
No assumptions, this is nature telling us how it is
GNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPA QNTAHLDQFERIKTL
GTGSFGRVMLVKHKETGNH FAMKILDKQKVVKLKQIEHTLNEKRILQAV
NFPF LVKLEYSFKDNSNLYMVMEYVPGGEMFSHLRRIG RFSEPHARFY
AAQIVLTFEYLHSLDLIYRDLKPE NLLIDQQGYIQVTDFGFAKRVKGRT
WTLCGTPEY LAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPF FADQP
IQIYEKIVSGKVRFPSHFSSDLKDLLRNL LQVDLTKRFGNLKDGVNDIK
NHKWFATTDWIAIY QRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSIN
EKCGKEFSEF
Assumption (protein A is similar to protein B)
Result (protein A is similar to protein B)
Sequence
4
How do we know that these proteins are similar?
  • Well studied protein:
    SRRSASHPTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAA
  • Unknown protein:
    GLLTTKFVSLLQEAKDGVLDLKLAADTLAVRQKRRIYDITNVLEGIGLIEKKSKNSIQW

[Diagram labels: similarity, prediction]
5
How can we make such assumptions?
  • Statistical reliability of the prediction
  • E-value: the number of hits one can "expect" to
    see just by chance when searching a database of a
    particular size (the closer to zero, the better)
  • Z-score: the score expressed as a distance from
    the mean, measured in standard deviations (the
    bigger, the better)
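
The two measures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the background scores are made up, and the E-value is approximated as a per-comparison p-value multiplied by the number of database sequences, which is not how any particular search tool computes it.

```python
from statistics import mean, stdev

def z_score(score, background_scores):
    """Distance of `score` from the background mean, in standard deviations."""
    mu = mean(background_scores)
    sigma = stdev(background_scores)  # sample standard deviation
    return (score - mu) / sigma

def expected_chance_hits(p_value, database_size):
    """Rough E-value: chance hits expected across `database_size` comparisons.

    Assumes `p_value` is the probability that a single comparison reaches
    the observed score by chance.
    """
    return p_value * database_size

# Made-up scores of the query against unrelated sequences (illustration only).
background = [20, 22, 18, 25, 21, 19, 23, 20]
print("Z-score:", z_score(60, background))
print("Expected chance hits:", expected_chance_hits(1e-6, 30000))
```

A large Z-score together with an expected number of chance hits far below 1 is what the slide calls a statistically reliable prediction.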

6
Similar, but not homologous
  • Phosphoribosyltransferase and viral coat
    protein: 42% identity, different folds,
    different functions
  • Alignment:
     99 IRLKSYCNDQSTGDIKVIGGDDLSTLTGKNVLIVEDIIDTGKTMQTLLSLVRQY.NPKMVKVASLLVKRTPRSVGY 173
    214 VPLKTDANDQ.IGDSLY....SAMTVDDFGVLAVRVVNDHNPTKVT..SKVRIYMKPKHVRV...WCPRPPRAVPY 279

7
Different, but homologous
  • Histone H5 and transcription factor E2F4:
    7% identity, similar fold, similar function (DNA
    binding)
  • Alignment:
    PTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAAGVLKQTKGVGASGSFRL
    GLLTTKFVSLLQEAKD-GVLDLKLAADTLA------VRQKRRIYDITNVLEGIGLIEKKS----KNSIQW
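
Percent identity can be computed directly from an alignment like the one on this slide. The sketch below is one possible convention, not necessarily the one used for the figure quoted above: it counts identical residues over the columns where neither sequence has a gap, and it assumes "-" is the gap character. Other denominators (full alignment length, length of the shorter sequence) give somewhat different numbers.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0

# The two aligned rows from this slide, joined across the line breaks.
first = ("PTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRL"
         "LAAGVLKQTKGVGASGSFRL")
second = ("GLLTTKFVSLLQEAKD-GVLDLKLAADTLA------VRQKRRIYDITNVL"
          "EGIGLIEKKS----KNSIQW")
print(f"{percent_identity(first, second):.1f}% identity over aligned positions")
```

Whatever convention is used, the point of the slide stands: identity this low says little on its own about whether two proteins share a fold.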

8
Steps in comparative modeling
  • Recognition: Are there any well characterized
    proteins similar to my protein?
  • Alignment: What is the position-by-position
    target/template equivalence?
  • Modeling: What is the detailed 3D structure of
    my protein?
  • Model analysis: Is my model any good?
9
Recognition
  • BLAST, PSI-BLAST or PFAM, FFAS, metaserver
    (bioinfo)
  • Name (PDB code) of the template
  • Statistical significance of the match (Z-score,
    E-value, p-value, points)
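
Once a recognition tool has returned its hits, choosing a template comes down to keeping the statistically significant matches and taking the best one. The sketch below assumes the hits have already been parsed into simple records. The `Hit` class, the template names, the E-values, and the 1e-3 cutoff are all invented for illustration and do not reflect the output format of BLAST, FFAS, or any other tool.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    template: str   # placeholder name standing in for a PDB code
    e_value: float  # statistical significance reported for the match

def pick_template(hits, max_e_value=1e-3):
    """Return the significant hit with the lowest E-value, or None."""
    significant = [h for h in hits if h.e_value <= max_e_value]
    return min(significant, key=lambda h: h.e_value, default=None)

# Invented example hits (not real search results).
hits = [Hit("tmpl_A", 2e-15), Hit("tmpl_B", 0.5), Hit("tmpl_C", 1e-4)]
best = pick_template(hits)
print(best.template if best else "no significant template found")
```

Returning None when nothing passes the cutoff makes the "no usable template" case explicit rather than silently picking a poor match.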

10
Alignment
  • The same tools as in recognition (perhaps with
    different parameters), editing by hand
  • Position-by-position equivalence table
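
A position-by-position equivalence table can be derived from a pairwise alignment by walking along its columns and numbering the residues of each sequence separately. The sketch below assumes "-" marks a gap and that numbering starts at 1 for both sequences. The function name and the toy sequences are hypothetical.

```python
def equivalence_table(aligned_target, aligned_template, gap="-"):
    """List (target_index, target_residue, template_index, template_residue).

    An index is None in columns where that sequence has a gap.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    table = []
    target_pos = template_pos = 0
    for t, m in zip(aligned_target, aligned_template):
        t_idx = m_idx = None
        if t != gap:
            target_pos += 1
            t_idx = target_pos
        if m != gap:
            template_pos += 1
            m_idx = template_pos
        table.append((t_idx, t, m_idx, m))
    return table

# Toy alignment for illustration only.
for row in equivalence_table("AC-GT", "ACDG-"):
    print(row)
```

Rows where the target index is None are template residues with no counterpart in the target, and rows where the template index is None are target residues that the template cannot provide coordinates for.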

11
Modeling
  • Commercial programs: Accelrys (Insight),
    Tripos (Sybyl)
  • Freeware/shareware/servers: Modeller (Andrej
    Sali), Jackal (Barry Honig), SCWRL (Roland
    Dunbrack), SwissModel

12
Model quality
  • Empirical energy-based tools: PSQS
    (http://www1.jcsg.org/psqs/psqs.cgi), SwissPDB
    viewer
  • Geometric quality: Procheck, SFCHECK, etc.
    (http://www.jcsg.org/scripts/prod/validation/sv3.cgi)

13
Expectations of comparative modeling
  • Easy: 100-40% sequence identity. Strong
    sequence similarity, strong structure
    similarity, obvious function analogy
  • Difficult: 40-25% sequence identity. Twilight
    zone sequence similarity, increasing structure
    divergence, function diversification
  • Fold prediction: below 25% sequence identity.
    No apparent sequence similarity, extreme
    function divergence
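
The ranges on this slide can be turned into a small lookup. This is a sketch only: the slide gives overlapping endpoints (40 and 25 appear in two ranges), so assigning exactly 40% to "easy" and exactly 25% to "difficult" is an assumption made here, as are the function name and the returned labels.

```python
def modeling_regime(percent_identity):
    """Map target/template sequence identity (0-100) to a difficulty regime."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 40:
        return "easy"
    if percent_identity >= 25:
        return "difficult (twilight zone)"
    return "fold prediction"

for identity in (85, 32, 7):
    print(identity, "->", modeling_regime(identity))
```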
14
Challenges of comparative modeling
[Figure not preserved in the transcript; only the axis ticks 100, 80, 60, 40, 20 survive]
15
Hands-on Activity
  • Click below for a hands-on bioinformatics
    how-to activity
  • Go to http://bioinformatics.burnham.org/ and
    click the "Structure Biology Course - Protein
    Modeling Tutorial" link on the homepage
  • Or go to
    http://bioinformatics.burnham.org/~natasha/modeling.html