Protein Pathways and Pathway Databases

1
Protein Pathways and Pathway Databases
  • Shan Sundararaj
  • University of Alberta
  • Edmonton, AB
  • ss23_at_ualberta.ca

2
Interactions → Networks → Pathways
  • A collection of interactions defines a network
  • Pathways are a subset of networks
  • All pathways are networks of interactions,
    however not all networks are pathways!
  • Difference in the level of annotation/understanding
  • We can define a pathway as a biological network
    that relates to a known physiological process or
    phenotype

3
Pathways
  • However, there is no precise biological
    definition of a pathway
  • Our partitioning of networks into pathways is
    somewhat arbitrary
  • We choose the start/finish points based on
    important or easily understood compounds
  • Gives us the ability to conceptualize the mapping
    of genotype → phenotype

4
Biological pathways
  • There are 3 types of interactions that can be
    mapped to pathways
  • 1) enzyme-ligand
      • metabolic pathways
  • 2) protein-protein
      • cell signaling pathways
      • complexes for cell processes
  • 3) gene regulatory elements - gene products
      • genetic networks

5
Pathways are inter-linked
[Figure: diagram linking a stimulus, a signalling pathway, a genetic network, and a metabolic pathway]
6
Metabolic Pathways
1993 Boehringer Mannheim GmbH - Biochemica
7
What the pathway represents
  • Metabolites involved
  • Enzymes/transport proteins
  • Order of reactions
  • General biological function
  • Reaction rates
  • Expression data
  • Inhibitors, activators, alternate pathways
  • Genetic regulatory information

8
Describing metabolic networks
  • Classical biochemical pathways
      • glycolysis, TCA cycle, etc.
  • Stoichiometric modeling
      • flux balance analysis, extreme pathways
  • Kinetic modeling (CyberCell, E-cell, ...)
      • Need to accumulate comprehensive kinetic
        information
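
Stoichiometric modeling rests on the steady-state condition S·v = 0, where S is the stoichiometric matrix (metabolites × reactions) and v is the vector of reaction fluxes; flux balance analysis then searches among such flux vectors for one that optimises a chosen objective. A minimal sketch of the steady-state check follows. The three-reaction network and all names in it are hypothetical, chosen only for illustration.

```python
# Toy network (hypothetical): uptake -> A, A -> B, B -> export.
# Rows are metabolites, columns are reactions.
METABOLITES = ["A", "B"]
REACTIONS = ["uptake", "A_to_B", "export"]
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by A_to_B
    [0, 1, -1],   # B: produced by A_to_B, consumed by export
]


def net_production(S, fluxes):
    """Return S.v, the net production rate of each metabolite."""
    return [sum(coef * v for coef, v in zip(row, fluxes)) for row in S]


def is_steady_state(S, fluxes, tol=1e-9):
    """True if no metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(S, fluxes))


balanced = [2.0, 2.0, 2.0]     # every flux equal: nothing accumulates
unbalanced = [2.0, 1.0, 1.0]   # uptake exceeds conversion: A accumulates

print(is_steady_state(S, balanced))
print(is_steady_state(S, unbalanced))
print(dict(zip(METABOLITES, net_production(S, unbalanced))))
```

A full flux balance analysis would add flux bounds and a linear objective and hand the problem to a linear-programming solver; that step is omitted here.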

9
Complexity
  • Pathways involve multiple enzymes, which may have
    multiple subunits, alternate forms, alternate
    specificities
  • Enzymes may be involved in multiple pathways
  • Malate dehydrogenase appears in 6 different
    metabolic pathways in some databases
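
One way to see how enzymes are shared between pathways is to invert a pathway-to-enzyme table into an enzyme-to-pathway index. The sketch below assumes a small, hand-made membership table keyed by EC number (1.1.1.37 is malate dehydrogenase); it is illustrative only and is not taken from any particular database.

```python
from collections import defaultdict

# Hypothetical, heavily abbreviated pathway membership (EC numbers).
pathway_enzymes = {
    "TCA cycle": {"1.1.1.37", "4.2.1.2"},
    "Glyoxylate cycle": {"1.1.1.37", "4.1.3.1"},
    "Glycolysis": {"2.7.1.1"},
}


def invert(pathway_enzymes):
    """Map each enzyme to the set of pathways it appears in."""
    index = defaultdict(set)
    for pathway, enzymes in pathway_enzymes.items():
        for ec in enzymes:
            index[ec].add(pathway)
    return index


def shared_enzymes(pathway_enzymes):
    """Return only the enzymes that occur in more than one pathway."""
    return {
        ec: sorted(pathways)
        for ec, pathways in invert(pathway_enzymes).items()
        if len(pathways) > 1
    }


print(shared_enzymes(pathway_enzymes))
```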

10
Metabolic Pathway Reconstruction
  • Given a genomic sequence, we can infer what
    metabolic pathways are available to an organism
  • Used to design culture medium for Tropheryma
    whipplei by seeing what nutrients were essential
    for growth (Renesto et al., Lancet, 362, 447-449,
    2003)

11
Co-expression within pathways
  • Tempting thought: genes that occur within the
    same pathway will show similar expression
    profiles
  • Reality: it depends greatly on how you identify
    your pathways; KEGG pathways show at best 50%
    co-expression in a survey of available yeast
    expression data (Ihmels et al., Nat Biotechnol.
    22, 86-92, 2004).
  • Expression levels do not correlate very well with
    protein interactions (unless they are stable
    complexes, maintained in many different
    conditions)
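
A simple way to quantify co-expression within a pathway is to compute the Pearson correlation between the expression profiles of every pair of its genes and report the fraction of pairs above a cutoff. The sketch below does this for made-up profiles; the gene names, values, and the 0.8 threshold are assumptions for illustration, and this is not the method used in the cited survey.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_fraction(profiles, genes, threshold=0.8):
    """Fraction of gene pairs whose correlation is >= threshold."""
    pairs = list(combinations(genes, 2))
    hits = sum(
        1 for g, h in pairs if pearson(profiles[g], profiles[h]) >= threshold
    )
    return hits / len(pairs)


# Hypothetical expression profiles across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
}

print(coexpressed_fraction(profiles, ["geneA", "geneB", "geneC"]))
```

Only one of the three pairs (geneA, geneB) is positively correlated here, so a pathway can score poorly even when some of its members track each other closely.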

12
Pathway Databases
  • KEGG
  • BioCyc
  • Reactome
  • GenMAPP
  • BioCarta
  • TransPATH
  • 175 more at Pathway Resource List
    http://www.cbio.mskcc.org/prl/index.php

13
BioPAX (www.biopax.org)
  • Collaborative effort to create a data exchange
    format for biological pathway data

14
KEGG
  • 5904 chemical reactions
  • 15,037 pathways
  • 229 reference pathways
  • 85 ortholog tables
  • 181 organisms

http://www.genome.ad.jp/kegg/
15
KEGG
  • GENES Database
      • The universe of genes and proteins in complete
        genomes
  • LIGAND Database
      • The universe of chemical reactions involving
        metabolites and other biochemical compounds
  • Pathway Database
      • Molecular interaction networks, metabolic and
        regulatory pathways, and molecular complexes

16
Connection between KEGG and other Databases
17
Pathways
  • Represented as diagrams, manually created, stored
    as gifs
  • Easy to link to, highlight genes of interest
  • Generate orthologous pathways in other organisms

[Figure: example pathway diagram with enzymes labelled by EC number: 2.7.2.4, 1.2.1.11, 1.1.1.3, 2.3.1.46, 2.5.1.48, 4.4.1.8, 2.1.1.13, 2.5.1.6]
18
http://www.biocyc.org/
19
BioCyc
  • The primary database was EcoCyc (E. coli)
  • 21 more curated pathway/genome databases (PGDB),
    each focusing on one organism (e.g. HumanCyc)
  • Also 142 more non-curated (computationally
    generated) pathways
  • MetaCyc database contains non-redundant reference
    pathways from more than 240 organisms
  • Supports Pathway Tools software suite to
    analyze PGDBs, and PathoLogic pathway
    prediction program for new genomes

20
BioCyc
  • Each PGDB includes info about
      • Pathways, reactions, substrates
      • Enzymes, transporters
      • Genes, replicons
      • Transcription factors, promoters, operons, DNA
        binding sites
  • MetaCyc and EcoCyc are literature-based; the
    others are computationally derived

[Figure: PGDB object types: Pathways; Reactions; Compounds; Proteins; Operons, Promoters, DNA Binding Sites; Genes; Chromosomes, Plasmids]
21
  • 164 datasets
  • Query by protein, gene, compound, reaction,
    pathway
  • BLAST sequence if protein name unknown
22
MetaCyc Statistics
23
EcoCyc Statistics
24
BioCyc Pathway Tools
(Adapted from Pathway Tools tutorial,
http://bioinformatics.ai.sri.com/ptools/)
  • Full Metabolic Map
      • Paint gene expression data on metabolic network;
        compare metabolic networks
  • Pathways
      • Pathway prediction (PathoLogic)
  • Reactions
      • Balance checker
  • Compounds
      • Chemical substructure comparison
  • Enzymes, Transcription Factors
  • Genes: Blast search
  • Operons
      • Operon prediction
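
The idea behind a reaction balance checker can be sketched as counting the atoms of each element on both sides of a reaction and comparing the totals. The code below is an illustrative assumption of how such a check might work, not the Pathway Tools implementation; it handles only simple formulas without parentheses or charges.

```python
import re
from collections import Counter


def atom_counts(formula):
    """Count atoms in a simple formula such as 'C6H12O6'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_counts(side):
    """Total atoms for one side, given (coefficient, formula) pairs."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in atom_counts(formula).items():
            total[element] += coefficient * n
    return total


def is_balanced(substrates, products):
    """True if every element has the same count on both sides."""
    return side_counts(substrates) == side_counts(products)


# Complete oxidation of glucose: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
substrates = [(1, "C6H12O6"), (6, "O2")]
products = [(6, "CO2"), (6, "H2O")]
print(is_balanced(substrates, products))

# Same reaction with the coefficients left out: not balanced.
print(is_balanced([(1, "C6H12O6"), (1, "O2")], [(1, "CO2"), (1, "H2O")]))
```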

25
PathoLogic: Making PGDBs
26
Completeness of Pathways
27
Completeness of Pathways
28
Issues with predicting pathways
  • Predicting metabolic pathways from genome
      • Predict genes
      • Assign enzymatic function to genes
      • Look for enzymes unique to pathway
      • Check if pathway is balanced (no holes)
      • Try to fill holes by re-searching genome
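
The hole-checking step can be sketched as comparing the enzymes a reference pathway requires against the enzymatic functions assigned to a genome's genes. The code below is a simplified illustration, not the PathoLogic algorithm; the gene identifiers and the four-step reference pathway are hypothetical.

```python
def assess_pathway(reference_steps, gene_to_ec):
    """Return (completeness, holes) for one reference pathway.

    reference_steps: ordered list of EC numbers the pathway requires.
    gene_to_ec: mapping of predicted gene -> assigned EC number.
    """
    found = set(gene_to_ec.values())
    holes = [ec for ec in reference_steps if ec not in found]
    completeness = 1 - len(holes) / len(reference_steps)
    return completeness, holes


# Hypothetical reference pathway and genome annotation.
reference_steps = ["2.7.2.4", "1.2.1.11", "1.1.1.3", "2.3.1.46"]
gene_to_ec = {
    "gene1": "2.7.2.4",
    "gene2": "1.2.1.11",
    "gene3": "2.3.1.46",
    "gene4": "2.7.1.1",   # enzyme not used by this pathway
}

completeness, holes = assess_pathway(reference_steps, gene_to_ec)
print(completeness, holes)
```

Each reported hole is a candidate for the last step in the list: re-searching the genome for an unannotated or misannotated gene that could supply the missing activity.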

29
Reactome (http://www.reactome.org/)
30
Reactome
  • Joint venture of CSHL and EBI (supersedes the
    Genome Knowledgebase project)
  • Curated database of biological processes in
    humans
  • Also rat, mouse, fugu, zebrafish, chicken
  • Everything referenced by curators to literature
    citation or inference based on sequence
    similarity

31
Reactome model
  • Model reactions: (input_entities)
    → (output_entities)
  • Distinguishes between modified/unmodified
    proteins (modification is an explicit reaction)
  • Highly annotated at every step, very
    micromanaged, hope to find interesting links
    between reactions
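
A reaction-centred model of this kind can be sketched with two record types: an entity whose modification state is part of its identity, and a reaction that maps input entities to output entities and carries its evidence. The class and field names below are assumptions for illustration and do not reflect the actual Reactome schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    modification: str = ""   # empty string means unmodified


@dataclass(frozen=True)
class Reaction:
    name: str
    inputs: tuple
    outputs: tuple
    evidence: str            # literature citation or similarity inference


def consumers(reactions, entity):
    """Names of reactions that take the given entity as an input."""
    return [r.name for r in reactions if entity in r.inputs]


# Hypothetical entities: the modified form is a distinct entity.
protein = Entity("ProteinX")
phospho = Entity("ProteinX", "phospho")
atp, adp = Entity("ATP"), Entity("ADP")

# The modification itself is modelled as an explicit reaction.
phosphorylation = Reaction(
    "phosphorylation of ProteinX",
    inputs=(protein, atp),
    outputs=(phospho, adp),
    evidence="literature citation (placeholder)",
)
binding = Reaction(
    "phospho-ProteinX binds ProteinY",
    inputs=(phospho, Entity("ProteinY")),
    outputs=(Entity("ProteinX:ProteinY complex", "phospho"),),
    evidence="inferred by sequence similarity (placeholder)",
)

reactions = [phosphorylation, binding]
print(consumers(reactions, protein))
print(consumers(reactions, phospho))
```

Because the modified and unmodified forms compare as different entities, a query for one never returns reactions that involve only the other.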

32
Reactome PathFinder
  • Pathfinding between distant processes
  • Enter two molecules or events and see if they can
    be joined together by reactions
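
Pathfinding of this kind can be sketched as a breadth-first search in which a molecule is linked to every product of any reaction that consumes it. The reaction table below is hypothetical, and the search is a simplification: it does not check that a reaction's other inputs are available, and it is not the Reactome implementation.

```python
from collections import deque

# Hypothetical reactions: id -> (input molecules, output molecules).
REACTIONS = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"B"}, {"C"}),
    "r3": ({"C", "X"}, {"D"}),
    "r4": ({"E"}, {"F"}),
}


def find_route(reactions, start, goal):
    """Return the shortest list of reaction ids linking start to goal,
    or None if the two molecules cannot be joined."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        molecule, route = queue.popleft()
        if molecule == goal:
            return route
        for reaction_id, (inputs, outputs) in reactions.items():
            if molecule not in inputs:
                continue
            for product in outputs:
                if product not in seen:
                    seen.add(product)
                    queue.append((product, route + [reaction_id]))
    return None


print(find_route(REACTIONS, "A", "D"))
print(find_route(REACTIONS, "A", "F"))
```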

33
Reactome SkyPainter
  • Find all reactions that contain a molecule or
    event
  • Very flexible input, any one or more of:
      • protein/gene ID (UniProt, Genbank or others)
      • protein/gene sequence
      • GO or OMIM identifier
      • time series from a gene expression study

34
Reactome SkyPainter
  • Starry sky output
  • If expression data used, you get different
    colours for different levels of expression
  • If time series available, you can make an
    animation

35
GenMAPP (www.genmapp.org)
  • Designed to rapidly analyze gene profiling data
    in the context of known biochemical pathways
  • Pathways (MAPPs) are authored by experts; several
    are also adapted from KEGG
  • Pathways easily web-queryable
  • Free for all users
  • But Windows platform only

36
GenMAPP
  • Easy to draw/edit pathways
  • Color genes from user-imported expression data
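
Coloring genes by expression amounts to mapping each gene's measured change onto a small set of display colors. The sketch below assumes log2 expression ratios and a cutoff of 1.0 (a two-fold change); the color names, threshold, and gene names are illustrative choices, not GenMAPP's actual settings.

```python
def expression_color(log2_ratio, threshold=1.0):
    """Pick a display color for one gene from its log2 expression ratio."""
    if log2_ratio >= threshold:
        return "red"     # up-regulated
    if log2_ratio <= -threshold:
        return "blue"    # down-regulated
    return "grey"        # no substantial change


def color_pathway(pathway_genes, log2_ratios):
    """Color every gene on a pathway; genes without data stay white."""
    return {
        gene: expression_color(log2_ratios[gene])
        if gene in log2_ratios
        else "white"
        for gene in pathway_genes
    }


# Hypothetical pathway and user-imported expression data.
pathway_genes = ["geneA", "geneB", "geneC", "geneD"]
log2_ratios = {"geneA": 2.3, "geneB": -1.4, "geneC": 0.2}

print(color_pathway(pathway_genes, log2_ratios))
```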

37
MAPPFinder maps to GO ontology
38
BioCarta (www.biocarta.com)
39
BioCarta
  • Not a public database, but offers a free,
    clickable, graphics-rich pathway database and
    gene information
  • Community annotation
  • Easy to use glyph system for genes
  • 355 pathways
  • mostly human/mouse metabolic and signaling
    pathways

40
TransPATH
41
TransPATH
  • Part of larger BioBase package (commercial)
  • PathwayBuilder package for network visualization
  • Highly integrated with signaling networks and
    transcription factor networks (TransFAC)
  • Linked to extensive enzyme information in BRENDA
    (www.brenda.uni-koeln.de/)
  • 28,456 molecules, 52,007 reactions, 54 hand-drawn
    pathways

42
Pathway Database Comparison
43
Conclusion
  • Pathway databases are continually evolving, and
    provide an important intermediate level of
    abstraction for expressing data between
    genes/proteins and observable phenotypes
  • Metabolic pathways are the best studied/modeled
  • Many different formats of storage and display,
    but moving towards standards (PSI-MI, BioPAX)