Bioinformatics approaches for

1
Bioinformatics approaches for
  • Teresa K Attwood
  • Faculty of Life Sciences & School of Computer
    Science
  • University of Manchester, Oxford Road
  • Manchester M13 9PT, UK
  • http://www.bioinf.man.ac.uk/dbbrowser/

2
...analysing GPCRs...
3
...which craft is best?
4
Overview
  • What are GPCRs?
  • why they're interesting & important
  • why bioinformatics approaches are important
  • In silico function prediction
  • a reality check
  • Family-based methods for characterising GPCRs
  • Understanding the tools
  • problems with pair-wise & family-based approaches
  • estimating (biological) significance
  • Seeking deeper functional insights
  • Conclusions

5
What are GPCRs? G protein-coupled receptors
  • A functionally diverse family of cell-surface 7TM
    proteins
  • Functional diversity achieved via
  • interaction with a variety of ligands
  • stimulation of various intracellular pathways via
    coupling to different G proteins

6
Why are GPCRs interesting? Attwood, TK & Flower,
DR (2002) Trawling the genome for G
protein-coupled receptors: the importance of
integrating bioinformatic approaches. In Drug
Design: Cutting Edge Approaches, pp. 60-71.
  • They are ubiquitous
  • >800 GPCR genes in the human genome, from 3 major
    superfamilies
  • rhodopsin-, secretin- & metabotropic glutamate
    receptor-like
  • Share almost no sequence similarity
  • but are united by common 7TM architecture
  • Constitute a complex multi-gene family
  • populated by >50 families & >350 subtypes

7
Isn't just stamp collecting! Attwood, TK &
Flower, DR (2002) Trawling the genome for G
protein-coupled receptors: the importance of
integrating bioinformatic approaches. In Drug
Design: Cutting Edge Approaches, pp. 60-71.
  • GPCRs are of profound biomedical importance
  • targets for >50% of prescription drugs
  • yield sales >16 billion/annum
  • they're big business!
  • Given their importance, we need to
  • characterise the ones we know about
  • identify new ones
  • discover what they do!
  • e.g., as potential new drug targets

8
Why studying GPCRs is difficult
  • Only 2 crystal structures available
  • bovine rhodopsin (2000) & human β2-adrenergic
    receptor (2007)
  • Many GPCRs haven't been characterised
    experimentally
  • remain 'orphans', with unknown ligand specificity
  • With >800 human GPCRs, this isn't much to go on!

9
Why use bioinformatics approaches?
  • Computational approaches are important
  • can be used to help identify, characterise &
    model novel receptors
  • usually by similarity & extrapolation of known
    characteristics
  • Bioinformatics thus offers complementary tools
    for elucidating the structures & functions of
    receptors
  • But the task is non-trivial
  • GPCRs exhibit rich relationships & complex
    molecular interactions
  • present many challenges for in silico analysis
  • in trying to derive meaningful functional
    insights, traditional methods are likely to be
    limited

10
We've been using biology-unaware search tools to
analyse such complex systems
How far can we truly expect to understand
cellular function with such naïve approaches?
11
In silico function prediction: a reality check
  • What is the function of this structure?
  • What is the function of this sequence?
  • What is the function of this motif?
  • the fold provides a scaffold, which can be
    decorated in different ways by different
    sequences to confer different functions - knowing
    the fold & function allows us to rationalise how
    the structure effects its function at the
    molecular level

12
A test case for structural genomics
Structure-based assignment of the biochemical
function of hypothetical protein mj0577
(Zarembinski et al., PNAS 95 1998)
Although the structure co-crystallised with ATP,
the biochemical function of the protein is
unknown
13
What's in a sequence?
14
Methods for family analysis. Attwood, TK (2000).
The quest to deduce protein function from
sequence: the role of pattern databases. Int. J.
Biochem. Cell Biol., 32(2), 139-155.
  • Single motif methods
  • Fuzzy regex (eMOTIF)
  • Exact regex (PROSITE)
  • Full domain alignment methods
  • Profiles (Profile Library)
  • HMMs (Pfam)
  • Multiple motif methods
  • Identity matrices (PRINTS)
  • Weight matrices (Blocks)
15
The challenge of family analysis
  • highly divergent family with single function?
  • superfamily with many diverse functional
    families?
  • must distinguish if function analysis done in
    silico
  • a tough challenge!

16
In the beginning was PROSITE
TM domain
[GSTALIVMYWC]-[GSTANCPDE]-{EDPKRH}-X(2)-[LIVMNQGA]
-X(2)-[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R
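A PROSITE pattern of this kind maps directly onto a regular expression. The sketch below is a minimal illustration, not the PROSITE scanning software: it assumes standard PROSITE syntax ([..] for allowed residues, {..} for excluded residues, x for any residue, (n) for repeats) and uses the PS00237 pattern with its brackets restored, which is itself an assumption because the transcript lost them. The helper name prosite_to_regex and both peptides are hypothetical.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a Python regular expression.

    Handles [..], {..}, x and (n) / (n,m) repeats only; the < and > anchors
    are not supported in this sketch.
    """
    tokens = []
    for element in pattern.strip().rstrip(".").split("-"):
        core, _, repeat = element.partition("(")
        if core == "x":                      # any residue
            token = "."
        elif core.startswith("{"):           # excluded residues
            token = "[^" + core[1:-1] + "]"
        else:                                # [allowed residues] or a literal
            token = core
        if repeat:                           # "2)" becomes "{2}"
            token += "{" + repeat.rstrip(")") + "}"
        tokens.append(token)
    return "".join(tokens)

# PS00237 with brackets restored (an assumption; see the note above).
PS00237 = ("[GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-"
           "[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-[LIVM].")
GPCR_REGEX = re.compile(prosite_to_regex(PS00237))

# Hypothetical peptides: the second differs only at the invariant R.
matching = "MKASAAALAALSLDRYAALGG"
one_change = "MKASAAALAALSLDKYAALGG"
print(bool(GPCR_REGEX.search(matching)))    # True
print(bool(GPCR_REGEX.search(one_change)))  # False
```

The second peptide illustrates why exact patterns are brittle: a single substitution at a fixed position is enough to lose the match entirely.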
17
Diagnostic limitations of PROSITE
  • ID G_PROTEIN_RECEP_F1_1; PATTERN.
  • AC PS00237;
  • DT APR-1990 (CREATED); NOV-1997 (DATA UPDATE);
    SEP-2004 (INFO UPDATE).
  • DE G-protein coupled receptors family 1
    signature.
  • PA [GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-[LIVMFT]-
  • PA [GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-[LIVM].
  • NR /RELEASE=44.6,159201;
  • NR /TOTAL=1622(1621); /POSITIVE=1530(1529);
    /UNKNOWN=0(0);
  • NR /FALSE_POS=92(92); /FALSE_NEG=261;
    /PARTIAL=61;
  • This represents an apparent 22% error rate
  • the actual rate is probably higher
  • Thus, a match to a pattern is not necessarily
    true
  • a mis-match is not necessarily false!
  • False-negatives are a fundamental limitation to
    this type of pattern matching
  • if you don't know what you're looking for, you'll
    never know you missed it!
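The slide does not say how the apparent error rate was derived. One reading that reproduces it, offered here only as an assumption, is to count false positives plus false negatives against the total number of reported matches, using the NR figures quoted above.

```python
# Counts from the NR lines of PS00237 (release 44.6) quoted above.
total_matches = 1622
true_positives = 1530
false_positives = 92
false_negatives = 261

# Assumed definition: misclassified sequences relative to all reported matches.
error_rate = (false_positives + false_negatives) / total_matches

# Fraction of known family members that the pattern actually finds.
sensitivity = true_positives / (true_positives + false_negatives)

print(f"apparent error rate: {error_rate:.1%}")
print(f"sensitivity: {sensitivity:.1%}")
```

Under this reading most of the error comes from the false negatives, which is the point the slide goes on to make.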

18
Where do motifs (fingerprints) fit in?
(fingerprints are hierarchical)
19
Rhodopsin-like superfamily, family & subtype
GPCRs in PRINTS. Attwood, TK (2001) A compendium
of specific motifs for diagnosing GPCR subtypes.
TiPS, 22(4), 162-165.
20
Searching PRINTS - FingerPRINTScan. Scordis, P,
Flower, DR & Attwood, TK (1999) FingerPRINTScan:
intelligent searching of the PRINTS motif
database. Bioinformatics, 15, 523-524.
  • GPCR fingerprints are embedded in PRINTS
  • allows diagnosis of GPCR mosaics

21
(No Transcript)
22
Visualising fingerprints. Attwood, TK & Findlay,
JBC (1993) Design of a discriminating fingerprint
for G-protein-coupled receptors. Protein Eng.,
6(2), 167-176.
N
C
23
Visualising fingerprints. Attwood, TK & Findlay,
JBC (1993) Design of a discriminating fingerprint
for G-protein-coupled receptors. Protein Eng.,
6(2), 167-176.
24
Diagnosing partial matches
  • Missed by PROSITE
  • wasn't annotated as a FN

25
An integrated approach. Mulder, NJ, Apweiler, R,
Attwood, TK, Bairoch, A et al. (2007) New
developments in InterPro. NAR, 35, D224-8.
  • To simplify sequence analysis, the family dbs
    were integrated within a unified annotation
    resource: InterPro
  • initial partners were PRINTS, PROSITE, profiles &
    Pfam
  • now many more partners
  • linked to its satellite dbs
  • but lags behind their coverage
  • by Oct 2007, it had 14,768 entries & covered 76%
    of UniProtKB
  • major role in fly & human genome annotation

26
InterPro method comparison
27
Where has this got us?
28
Understanding the tools: estimating significance
  • How do we know what to believe?
  • Let's explore some of the difficulties that arise
    when pair-wise search tools (BLAST & FastA) &
    family-based methods are used naïvely
  • these examples caution us to think about what the
    results actually mean in biological terms...

29
Identifying sequence similarity
  • GPCRs present many challenges for in silico
    functional analysis
  • Several signature-based methods now available
  • with different areas of optimum application
  • Yet naïve, pair-wise similarity searching has
    been the mainstay of functional annotation
    efforts
  • it allows us to identify/quantify relationships
    between sequences
  • But quantifying similarity between sequences is
    not the same as identifying their functions

30
Problems with pairwise similarity tools. Gaulton,
A & Attwood, TK (2003) Bioinformatics approaches
for the classification of G protein-coupled
receptors. Current Opinion in Pharmacology, 3,
114-120.
  • For identifying precise families to which
    receptors belong & the ligands they bind,
    pair-wise tools are limited
  • at what level of seq ID is ligand specificity
    conserved?
  • some GPCRs with 25% ID share a common ligand
  • others, with greater levels, don't
  • It may be impossible to tell from BLAST if an
    orphan belongs to a known family (the top hit),
    or if it will bind a novel ligand
  • e.g., for the now de-orphaned UR2R, BLAST
    indicates most similarity to the type 4 SSRs, yet
    it is known to bind a different (related) ligand

31
When is a GPCR not an SSR?
  • Query length: 389 AA
  • Date run: 2002-10-18 09:08:29 UTC+0100 on
    sib-blast.unil.ch
  • Taxon: Homo sapiens; Database: XXswissprot
  • 120,412 sequences; 45,523,583 total letters;
    SWISS-PROT Release 40.29 of 10-Oct-2002

    Db  AC      Description                                                 Score  E-value
    sp  Q9UKP6  Q9UKP6 Orphan receptor Homo sapiens...                        782  0.0
    sp  P31391  SSR4_HUMAN Somatostatin receptor type 4 (SS4R) SSTR4...       167  3e-41
    sp  O43603  GALS_HUMAN Galanin receptor type 2 (GAL2-R) (GALR2) G...      147  4e-35
    sp  P30872  SSR1_HUMAN Somatostatin receptor type 1 (SS1R) (SRIF-2...     144  3e-34
    sp  P32745  SSR3_HUMAN Somatostatin receptor type 3 (SS3R) (SSR-28...     140  3e-33
    sp  P35346  SSR5_HUMAN Somatostatin receptor type 5 (SS5R) (SSTR5)...     140  6e-33
    sp  P30874  SPLICE ISOFORM B of P30874 SSTR2 Homo sapiens...              134  3e-31
    sp  P30874  SSR2_HUMAN Somatostatin receptor type 2 (SS2R) (SRIF-1...     134  3e-31
    sp  P48145  GPR7_HUMAN Neuropeptides B/W receptor type 1 (G protei...     133  7e-31
    sp  O60755  GALT_HUMAN Galanin receptor type 3 (GAL3-R) (GALR3) G...      132  2e-30
    sp  P41143  OPRD_HUMAN Delta-type opioid receptor (DOR-1) OPRD1 ...       128  2e-29
    sp  P35372  SPLICE ISOFORM 1A of P35372 OPRM1 Homo sapien...              125  1e-28
    sp  P35372  OPRM_HUMAN Mu-type opioid receptor (MOR-1) OPRM1 Ho...        125  1e-28

32
When is a GPCR not an SSR? ...when it's a UR2R
  • Query length: 389 AA
  • Date run: 2002-10-18 09:08:29 UTC+0100 on
    sib-blast.unil.ch
  • Taxon: Homo sapiens; Database: XXswissprot
  • 120,412 sequences; 45,523,583 total letters;
    SWISS-PROT Release 40.29 of 10-Oct-2002

    Db  AC      Description                                                 Score  E-value
    sp  Q9UKP6  UR2R_HUMAN Urotensin II receptor (UR-II-R) GPR14 Ho...        782  0.0
    sp  P31391  SSR4_HUMAN Somatostatin receptor type 4 (SS4R) SSTR4...       167  3e-41
    sp  O43603  GALS_HUMAN Galanin receptor type 2 (GAL2-R) (GALR2) G...      147  4e-35
    sp  P30872  SSR1_HUMAN Somatostatin receptor type 1 (SS1R) (SRIF-2...     144  3e-34
    sp  P32745  SSR3_HUMAN Somatostatin receptor type 3 (SS3R) (SSR-28...     140  3e-33
    sp  P35346  SSR5_HUMAN Somatostatin receptor type 5 (SS5R) (SSTR5)...     140  6e-33
    sp  P30874  SPLICE ISOFORM B of P30874 SSTR2 Homo sapiens...              134  3e-31
    sp  P30874  SSR2_HUMAN Somatostatin receptor type 2 (SS2R) (SRIF-1...     134  3e-31
    sp  P48145  GPR7_HUMAN Neuropeptides B/W receptor type 1 (G protei...     133  7e-31
    sp  O60755  GALT_HUMAN Galanin receptor type 3 (GAL3-R) (GALR3) G...      132  2e-30
    sp  P41143  OPRD_HUMAN Delta-type opioid receptor (DOR-1) OPRD1 ...       128  2e-29
    sp  P35372  SPLICE ISOFORM 1A of P35372 OPRM1 Homo sapien...              125  1e-28
    sp  P35372  OPRM_HUMAN Mu-type opioid receptor (MOR-1) OPRM1 Ho...        125  1e-28

33
(No Transcript)
34
The trouble with top hits
  • The most statistically significant hit is not
    always the most biologically relevant
  • Yet many rule-based expert systems still rely
    on top BLAST or FastA hits to make their
    diagnoses
  • BLAST/FastA see generic similarity, not the
    often-subtle differences that constitute the
    functional determinants between closely-related
    receptor families & subtypes
  • Failure to appreciate this fundamental point has
    generated numerous annotation errors in our
    databases
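One way to guard against over-trusting the top hit is to check whether hits from other families score almost as well. The sketch below is a heuristic of my own for illustration, not a method from the talk: it takes some of the non-self hits from the UR2R search shown earlier, keeps those within an arbitrary 7.5 orders of magnitude of the best E-value, and flags the query if they span more than one family. The family labels and the threshold are assumptions.

```python
import math

# (entry name, E-value) for non-self hits from the UR2R search shown earlier.
hits = [
    ("SSR4_HUMAN", 3e-41),
    ("GALS_HUMAN", 4e-35),
    ("SSR1_HUMAN", 3e-34),
    ("SSR3_HUMAN", 3e-33),
    ("GPR7_HUMAN", 7e-31),
    ("OPRD_HUMAN", 2e-29),
]

# Hypothetical family labels, assigned by hand from the hit descriptions.
families = {
    "SSR4_HUMAN": "somatostatin", "SSR1_HUMAN": "somatostatin",
    "SSR3_HUMAN": "somatostatin", "GALS_HUMAN": "galanin",
    "GPR7_HUMAN": "neuropeptides B/W", "OPRD_HUMAN": "opioid",
}

def close_competitors(hits, max_orders=7.5):
    """Hits whose E-value lies within max_orders orders of magnitude of the best hit."""
    best = min(evalue for _, evalue in hits)
    return [name for name, evalue in hits
            if math.log10(evalue) - math.log10(best) <= max_orders]

competitors = close_competitors(hits)
ambiguous = len({families[name] for name in competitors}) > 1
print(competitors)
print("needs manual review:", ambiguous)
```

Even this check could not reveal that the receptor binds urotensin II; it only signals that the top hit alone should not decide the annotation.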

35
Misleading annotation via FastA
36
Misleading results from BLAST
  • As we've seen, it's tempting to use top hits from
    BLAST or FastA results to classify unknown
    proteins
  • but this may lead us (& especially computer
    programs) to false functional conclusions
  • PSI-BLAST is more sensitive than BLAST, because
    it creates a profile from hits above a given
    threshold
  • but this too can cause problems
  • let's take a closer look
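The problem with building a profile from every hit above a threshold can be shown with a toy example. This is not PSI-BLAST's scoring scheme (which uses log-odds position-specific matrices with pseudocounts); it is a minimal sketch with plain column frequencies, and all sequences are hypothetical. Admitting one borderline hit into the profile raises the score of a sequence that is not a family member.

```python
from collections import Counter

def column_profile(aligned):
    """Per-column residue frequencies for equal-length, pre-aligned sequences."""
    n = len(aligned)
    return [{aa: count / n for aa, count in Counter(column).items()}
            for column in zip(*aligned)]

def profile_score(profile, sequence):
    """Sum of the profile frequencies of the residues observed in the sequence."""
    return sum(column.get(aa, 0.0) for column, aa in zip(profile, sequence))

family = ["DRYLA", "DRYVA", "DRYLS"]   # hypothetical aligned family members
borderline = "ERWLA"                   # hypothetical hit just above threshold
unrelated = "ERWIT"                    # hypothetical non-member

strict = column_profile(family)
diluted = column_profile(family + [borderline])

print(profile_score(strict, unrelated))    # 1.0
print(profile_score(diluted, unrelated))   # 1.5
```

Each further iteration can admit more such sequences, so the profile drifts away from the family it started with.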

37
(No Transcript)
38
So, is UL78 a GPCR? & if so, what sort?
39
What PSI-BLAST said (profile dilution in action)



40
What GeneQuiz said
a thrombin receptor
41
What GeneQuiz said later
42
Overview of results: pair-wise & family-based
methods
43
What is UL78?
Tool No hit Poor hit Significant hit
BLAST GPCRs in list
PSI-BLAST thrombin receptor chemokine opioid receptors
PROSITE profile GPCR
Pfam
PRINTS
Blocks-PRINTS GPCR
GeneQuiz thrombin receptor C5A receptor
?
?
Bioinformatics tools, alone, cannot tell us!
44
So, beware top hits... but also beware bottom hits!
  • Let us now compare & contrast some InterPro
    results with those of its source dbs

45
Rhodopsin-like superfamily GPCRs in InterPro 2005
  • IPR000276 GPCR_Rhodopsn 7752
    proteins
  • PS50262 G_PROTEIN_RECEP_F1_2 7702 proteins
  • PF00001 7tm_1 7064 proteins
  • PS00237 G_PROTEIN_RECEP_F1_1 6527 proteins
  • PR00237 GPCRRHODOPSN 5821 proteins
    (don't include partials)

46
Rhodopsin-like superfamily GPCRs in the source
databases
  • Pfam: FP ?; FN ?; U ?; TP? 8776; matches 7064
  • PROSITE (profile): FP 3; FN 3; U 12; TP 1837;
    matches 7702
  • PROSITE (regex): FP 92; FN 261; U 0; TP 1530;
    matches 6527
  • PRINTS: FP 0; FN ?; U 0; TP 1154; matches 5821
  • >2165 updated

47
Rhodopsin-like superfamily GPCRs in InterPro 2007
  • IPR000276 GPCR_Rhodopsn 16,845
    proteins
  • PS50262 G_PROTEIN_RECEP_F1_2 16,714 proteins
  • PF00001 7tm_1 15,712 proteins
  • PR00237 GPCRRHODOPSN 13,405 proteins
  • PS00237 G_PROTEIN_RECEP_F1_1 13,723 proteins

No human curator has time to validate all these
matches
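The scale of the curation problem is easier to see when the counts are put side by side. The sketch below simply does the arithmetic on the figures from the 2005 and 2007 slides; expressing each signature as a fraction of the IPR000276 total is my own framing, not something the slides state.

```python
# Protein counts per signature for IPR000276, copied from the 2005 and 2007 slides.
counts = {
    2005: {"IPR000276": 7752, "PS50262": 7702, "PF00001": 7064,
           "PS00237": 6527, "PR00237": 5821},
    2007: {"IPR000276": 16845, "PS50262": 16714, "PF00001": 15712,
           "PS00237": 13723, "PR00237": 13405},
}

def coverage(year):
    """Fraction of the InterPro entry's proteins matched by each member signature."""
    total = counts[year]["IPR000276"]
    return {sig: n / total
            for sig, n in counts[year].items() if sig != "IPR000276"}

growth = counts[2007]["IPR000276"] / counts[2005]["IPR000276"]

for year in counts:
    for signature, fraction in coverage(year).items():
        print(year, signature, f"{fraction:.1%}")
print(f"growth 2005 -> 2007: {growth:.2f}x")
```

The entry more than doubles in two years, while the member signatures continue to disagree about a substantial fraction of its proteins.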
48
14,615 rhodopsin-like superfamily GPCRs in Pfam?
49
Pfam match: Q6NV75/24-297
ID   Q6NV75      PRELIMINARY;      PRT;   609 AA.
AC   Q6NV75;
DT   05-JUL-2004 (TrEMBLrel. 27, Created)
DT   05-JUL-2004 (TrEMBLrel. 27, Last sequence update)
DT   05-JUL-2004 (TrEMBLrel. 27, Last annotation update)
DE   G protein-coupled receptor 153.
GN   Name=GPR153;
OS   Homo sapiens (Human).
OX   NCBI_TaxID=9606;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=Brain;
RA   Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA   Jones S.J., Marra M.A.;
RT   "Generation and initial analysis of more than 15,000 full-length
RT   human and mouse cDNA sequences.";
RL   Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).
RP   SEQUENCE FROM N.A.
RC   TISSUE=Brain;
RA   Strausberg R.;
RL   Submitted (MAR-2004) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; BC068275; AAH68275.1; -.
DR   GO; GO:0004872
DR   InterPro; IPR000276; GPCR_Rhodpsn.
DR   Pfam; PF00001; 7tm_1; 1.
DR   PROSITE; PS50262; G_PROTEIN_RECEP_F1_2; 1.
KW   Receptor
SQ   SEQUENCE   609 AA;  65341 MW;  E525CC7F60D0891C CRC64;
     MSDERRLPGS AVGWLVCGGL SLLANAWGIL SVGAKQKKWK PLEFLLCTLA ATHMLNVAVP
     IATYSVVQLR RQRPDFEWNE GLCKVFVSTF YTLTLATCFS VTSLSYHRMW MVCWPVNYRL
     SNAKKQAVHT VMGIWMVSFI LSALPAVGWH DTSERFYTHG CRFIVAEIGL GFGVCFLLLV
     GGSVAMGVIC TAIALFQTLA VQVGRQADHR AFTVPTIVVE DAQGKRRSSI DGSEPAKTSL
     QTTGLVTTIV FIYDCLMGFP VLVVSFSSLR ADASAPWMAL CVLWCSVAQA LLLPVFLWAC
     DRYRADLKAV REKCMALMAN DEESDDETSL EGGISPDLVL ERSLDYGYGG DFVALDRMAK
     YEISALEGGL PQLYPLRPLQ EDKMQYLQVP PTRRFSHDDA DVWAAVPLPA FLPRWGSGED
     LAALAHLVLP AGPERRRASL LAFAEDAPPS RARRRSAESL LSLRPSALDS GPRGARDSPP
     GSPRRRPGPG PRSASASLLP DAFALTAFEC EPQALRRPPG PFPAAPAAPD GADPGEAPTP
     PSSAQRSPGP RPSAHSHAGS LRPGLSASWG EPGGLRAAGG GGSTSSFLSS PSESSGYATL
     HSDSLGSAS
//
?
PROSITE (profile): no match
false negative
PROSITE (regex): no match
PRINTS: no match
ClustalW: sequences too divergent to be aligned
GPCR?
50
Beware top & bottom hits... but also beware
simplistic analysis tools coupled with wet
experiments!
  • Let's finally look at how hydropathy profiles
    can compel biologists to make strange deductions
  • & still get their results published in
    Science!
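For reference, a hydropathy profile is nothing more than a sliding-window average over a residue scale, which is why it is such weak evidence on its own. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a 1.6 cut-off, values commonly quoted for spotting candidate membrane-spanning segments; the function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy scale.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def hydropathy_profile(sequence, window=19):
    """Mean Kyte-Doolittle hydropathy for every full window along the sequence."""
    values = [KD[aa] for aa in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def count_hydrophobic_segments(profile, threshold=1.6):
    """Count contiguous runs of windows whose mean hydropathy exceeds the threshold."""
    segments, inside = 0, False
    for value in profile:
        if value > threshold and not inside:
            segments += 1
        inside = value > threshold
    return segments

# Hypothetical toy sequence: one hydrophobic stretch between charged flanks.
toy = "D" * 20 + "L" * 20 + "D" * 20
print(count_hydrophobic_segments(hydropathy_profile(toy)))   # 1
```

A count of hydrophobic segments, even seven of them, says nothing about G protein coupling or ligand binding; many unrelated proteins contain hydrophobic stretches.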

51
ID   Q9C929_ARATH   Unreviewed;   401 AA.
AC   Q9C929;
DT   01-JUN-2001, integrated into UniProtKB/TrEMBL.
DT   01-JUN-2001, sequence version 1.
DT   24-JUL-2007, entry version 23.
DE   Putative G protein-coupled receptor; 80093-78432.
GN   Name=F14G24.19; OrderedLocusNames=At1g52920;
OS   Arabidopsis thaliana (Mouse-ear cress).
OC   Eukaryota; Viridiplantae; Streptophyta; ...; Arabidopsis.
OX   NCBI_TaxID=3702;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RA   Lin X., Kaul S., Town C.D., Benito M., Creasy T.H., Haas B.J., Wu D.,
RA   Maiti R., Ronning C.M., Koo H., Fujii C.Y., Utterback T.R.,
RA   Barnstead M.E., Bowman C.L., White O., Nierman W.C., Fraser C.M.;
RT   "Arabidopsis thaliana chromosome 1 BAC F14G24 genomic sequence.";
RL   Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases.
RN   [2]
RP   NUCLEOTIDE SEQUENCE.
RA   Town C.D., Kaul S.;
RL   Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; AC019018; AAG52264.1; -; Genomic_DNA.
DR   PIR; E96570; E96570.
DR   UniGene; At.66935; -.
DR   GenomeReviews; CT485782_GR; AT1G52920.
DR   KEGG; ath:At1g52920; -.
DR   TAIR; At1g52920; -.
DR   GO; GO:0004872; F:receptor activity; IEA:UniProtKB-KW.
DR   InterPro; IPR007822; LANC_like.
DR   Pfam; PF05147; LANC_like; 1.
KW   Receptor.
SQ   SEQUENCE   401 AA;  45284 MW;  C9D3BF8CC8F0FE0B CRC64;
     MPEFVPEDLS GEEETVTECK DSLTKLLSLP YKSFSEKLHR YALSIKDKVV WETWERSGKR
     VRDYNLYTGV LGTAYLLFKS YQVTRNEDDL KLCLENVEAC DVASRDSERV TFICGYAGVC
     ALGAVAAKCL GDDQLYDRYL ARFRGIRLPS DLPYELLYGR AGYLWACLFL NKHIGQESIS
     SERMRSVVEE IFRAGRQLGN KGTCPLMYEW HGKRYWGAAH GLAGIMNVLM HTELEPDEIK
     DVKGTLSYMI QNRFPSGNYL SSEGSKSDRL VHWCHGAPGV ALTLVKAAQV YNTKEFVEAA
     MEAGEVVWSR GLLKRVGICH GISGNTYVFL SLYRLTRNPK YLYRAKAFAS FLLDKSEKLI
     SEGQMHGGDR PFSLFEGIGG MAYMLLDMND PTQALFPGYE L
//
Pfam: Lanthionine synthetase C-like protein
PROSITE (profile): no match
PROSITE (regex): no match
PRINTS: no match
ClustalW: sequences too divergent to be aligned
GPCR?
52
They do sums (quickly) & crude string matching
53
Seeking deeper functional insights. Attwood, TK,
Croning, MD & Gaulton, A (2002) Deriving
structural and functional insights from a
ligand-based hierarchical classification of G
protein-coupled receptors. Protein Eng., 15,
7-12.
  • Superfamily, family & subtype motifs have
    different locations
  • If superfamily motifs define the common scaffold,
    hypothesis:
  • family motifs relate to ligand binding?
  • subtype motifs relate to G protein coupling?
  • powerful tools for subtyping & potentially
    de-orphaning GPCRs

54
Locations of ligand-binding residues & motif
distribution
55
Locations of G protein-coupling residues &
distribution of motifs
56
Seeking deeper functional insights? Attwood, TK,
Croning, MD & Gaulton, A (2002) Deriving
structural and functional insights from a
ligand-based hierarchical classification of G
protein-coupled receptors. Protein Eng., 15,
7-12.
  • Clearly, many family- & subtype-level motifs are
    simply in the wrong place for the initial
    hypothesis to be true

57
Refining the hypothesis
  • Besides, it's not that simple
  • only part of the answer
  • Need to consider that GPCRs don't function in
    isolation
  • their functions are modulated via interactions
    with other proteins
  • Also, the phenomenon of dimerisation challenges
    the view of the GPCR monomer as functional unit
  • many GPCRs exist as homo- & heterodimers
  • Such observations demand a more systematic
    analysis of motifs & their likely functional
    roles

58
Oligomerisation & protein-protein interaction
residues/regions: a pilot study with adrenergic,
bradykinin & dopamine receptors
59
Where next?
  • Based on location, some family-level motifs
    couldn't be involved in ligand binding & some
    subtype-level motifs couldn't be involved in G
    protein coupling
  • clearly, 3D location must be taken into account
  • functional correlations would then be stronger
  • The remaining motifs are likely to be involved in
    other molecular interactions
  • e.g., dimerisation, effector proteins... (early
    results promising)
  • this will help us to build a knowledge-based
    system to help suggest the likely functional
    roles for family- & subtype-level motifs in future

60
Conclusions
  • There are many barriers to success for the
    jobbing bioinformatician, e.g.
  • not fully understanding the processes we're
    trying to model & predict (e.g., protein folding)
  • the dynamic nature of biological data
  • not having been rigorous in the way we define
    &/or describe biology/biological processes in the
    literature
  • the volume of data, data heterogeneity
  • maintenance of data, propagation of errors
  • Possibly the largest hurdle is that computers are
    number crunchers
  • they don't do biology, & trying to teach them is
    hard
  • the harder we try, the clearer it is how naïve
    we've been

61
Conclusions
  • In silico functional annotation requires several
    dbs to be searched & several tools to be used
  • different methods provide different perspectives
  • dbs aren't complete & their contents don't fully
    overlap
  • The more dbs searched, the harder it is to
    interpret results
  • The more computers are involved in automating
    annotation, the greater the need for
    collaboration
  • especially between s/w developers, annotators &
    wet experimentalists
  • The more data we have, the more rigorous we must
    be in thinking/writing if we are to make sense of
    the complexities

62
Conclusions. Flower, DR & Attwood, TK (2004)
Integrative bioinformatics for functional genome
annotation: trawling for G protein-coupled
receptors. Semin Cell Dev Biol., 15(6), 693-701.
  • For GPCRs, there are many analysis tools
    available
  • BLAST, FastA, family databases, modelling tools,
    etc.
  • We must understand the limitations of the methods
  • no method is infallible or able to replace the
    need for biological validation
  • use all available resources & understand their
    problems: none is best!
  • Used wisely, bioinformatics tools are useful
  • BLAST/FastA offer broad brush strokes,
    motif-methods add fine detail
  • together, they facilitate receptor
    characterisation & prediction of ligand
    specificity, & allow identification of novel
    ligand-binding, G protein-coupling or other
    likely molecular interaction motifs
  • We are a long way from having reliable tools for
    deducing GPCR function & structure from sequence
  • but with the right approach, there is hope

63
(No Transcript)