1
The OBO Foundry
Ontology: A Vision for the Future and Its
Realization
  • Barry Smith
  • University at Buffalo
  • http://ontology.buffalo.edu/smith

2
  • how do we know what data we have?
  • how do I know what data you have?
  • how do we know what data we don't have?
  • how do we make different sorts of data
    combinable, as we need to do in large domains
    such as neurodevelopment, immunology, cancer ...?

we are accumulating huge amounts of sequence
data, image data, pharma data, ...
3
  • genomic medicine, molecular medicine,
    translational medicine, personalized medicine ...
    need
  • methods for data integration to enable reasoning
    across data at multiple granularities

to identify biomedically relevant relations on
the side of the entities themselves
4
(No Transcript)
5
where in the body?
what kind of disease process?
we need semantic annotation of data
we need ontologies
6
  • Semantic Web, Moby, wikis, etc.
  • let a million flowers (and weeds) bloom
  • to create integration, rely on (automatically
    generated?) post hoc mappings

how to create broad-coverage semantic annotation
systems for biomedicine?
7
most successful thus far: UMLS
  • built by trained experts
  • massively useful for information retrieval and
    information integration
  • UMLS Metathesaurus: a system of post hoc mappings
    between separately built source vocabularies

8
(No Transcript)
9
UMLS-based mappings fall short of creating
interoperability
  • because local usage is respected
  • regimentation frowned upon, no concern for
    cross-framework consistency
  • UMLS terminologies have different grades of
    formal rigor, different degrees of completeness,
    different update policies

10
with UMLS-based annotations
  • we can know what data we have (via term
    searches), but it is noisy
  • we can map between data at single granularities
    (via synonyms), but synonymy information is
    noisy
  • how do we know what data we don't have?
  • how do we reason with data (as at the molecular
    level), when there is no common logical backbone?

11
for science
  • to develop high quality annotation resources in a
    collaborative, community effort?
  • create an evolutionary path towards improvement
    of terminologies, of the sort we find elsewhere
    in science
  • find ways to reward early adopters of the results

what is to be done?
12
for science
  • science works out from a consensus core, and
    strives to isolate and resolve inconsistencies as
    it extends at the fringes
  • we need to create a consensus core
  • start with what for human beings are trivialities
    (low hanging fruit) and work out from there

for science, consistency is a sine qua non
13
[Figure: fragment of the Foundational Model of Anatomy (FMA), with terms
linked by is_a and part_of relations. Terms shown: Anatomical Space,
Anatomical Structure, Organ Cavity Subdivision, Organ Cavity, Organ,
Serous Sac, Organ Component, Serous Sac Cavity, Tissue, Serous Sac Cavity
Subdivision, Pleural Sac, Pleura (Wall of Sac), Pleural Cavity, Parietal
Pleura, Visceral Pleura, Interlobar recess, Mediastinal Pleura,
Mesothelium of Pleura.]
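
The FMA fragment on this slide uses two relations, is_a and part_of. A minimal Python sketch of how such a hierarchy might be stored and queried is given here. The term names come from the slide, but the specific edges are assumptions made for illustration (the transcript does not record which term is linked to which), and `build_index` and `ancestors` are hypothetical helper names.

```python
from collections import defaultdict

# (child, relation, parent) triples. Term names are from the slide;
# the edges themselves are illustrative assumptions, not read from it.
EDGES = [
    ("Pleural Sac", "is_a", "Serous Sac"),
    ("Serous Sac", "is_a", "Organ"),
    ("Organ", "is_a", "Anatomical Structure"),
    ("Pleural Cavity", "is_a", "Serous Sac Cavity"),
    ("Serous Sac Cavity", "is_a", "Organ Cavity"),
    ("Organ Cavity", "is_a", "Anatomical Space"),
    ("Pleura (Wall of Sac)", "part_of", "Pleural Sac"),
    ("Parietal Pleura", "part_of", "Pleura (Wall of Sac)"),
    ("Visceral Pleura", "part_of", "Pleura (Wall of Sac)"),
    ("Mesothelium of Pleura", "part_of", "Pleura (Wall of Sac)"),
]


def build_index(edges):
    """Map relation -> child -> set of direct parents."""
    index = defaultdict(lambda: defaultdict(set))
    for child, relation, parent in edges:
        index[relation][child].add(parent)
    return index


def ancestors(index, term, relation):
    """All terms reachable from `term` by following `relation` transitively."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index[relation].get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


INDEX = build_index(EDGES)
print(sorted(ancestors(INDEX, "Pleural Sac", "is_a")))
print(sorted(ancestors(INDEX, "Parietal Pleura", "part_of")))
```

Keeping the two relations in separate indexes means a query never mixes "is a kind of" with "is a part of", which is the distinction the figure draws.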
14
for science
  • include ontologies corresponding to the basic
    biomedical sciences in the core

clinical medicine relies on anatomy and molecular
biology to provide integration across medical
specialisms
15
for science
  • where do we find scientifically validated
    information linking gene products and other
    entities represented in biochemical databases to
    semantically meaningful terms pertaining to
    disease, anatomy, development, histology in
    different model organisms?

but we need more
16
(No Transcript)
17
what makes GO so wildly successful?
18
  • science basis of the GO: trained experts curating
    peer-reviewed literature
  • different model organism databases employ
    scientific curators who use the experimental
    observations reported in the biomedical
    literature to associate GO terms with gene
    products in a coordinated way

The methodology of annotations
19
A set of standardized textual descriptions of
  • cellular locations
  • molecular functions
  • biological processes
  • used to annotate the entities represented in the
    major biochemical databases
  • thereby creating integration across these
    databases and making them available to semantic
    search

20
what cellular component?
what molecular function?
what biological process?
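
These three questions correspond to the three aspects under which a gene product is annotated. As a rough sketch of how shared terms can integrate separately maintained databases, the Python below groups annotation records from two hypothetical model organism databases by GO term. The database names, gene product names, references, and record layout are all assumptions for illustration; the GO identifiers serve only as example keys.

```python
from collections import defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    database: str      # hypothetical source database
    gene_product: str  # hypothetical gene product name
    aspect: str        # cellular_component, molecular_function or biological_process
    term_id: str       # GO identifier used as the shared key
    reference: str     # placeholder for the supporting publication


ANNOTATIONS = [
    Annotation("MouseDB", "geneA", "cellular_component", "GO:0005634", "ref-1"),
    Annotation("MouseDB", "geneA", "molecular_function", "GO:0003677", "ref-1"),
    Annotation("FlyDB", "geneB", "cellular_component", "GO:0005634", "ref-2"),
    Annotation("FlyDB", "geneC", "biological_process", "GO:0006915", "ref-3"),
]


def products_by_term(annotations):
    """Group (database, gene product) pairs under the term they share."""
    grouped = defaultdict(set)
    for annotation in annotations:
        grouped[annotation.term_id].add(
            (annotation.database, annotation.gene_product)
        )
    return dict(grouped)


def search(annotations, term_id):
    """Return every gene product annotated with `term_id`, across databases."""
    return sorted(products_by_term(annotations).get(term_id, set()))


print(search(ANNOTATIONS, "GO:0005634"))
```

Because both databases use the same term identifier, a single lookup returns gene products from both species without any mapping step.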
21
This process
  • leads to improvements and extensions of the
    ontology
  • which in turn leads to better annotations
  • → a virtuous cycle of improvement in the quality
    and reach of both future annotations and the
    ontology itself
  • RESULT: a slowly growing computer-interpretable
    map of biological reality within which major
    databases are automatically integrated in
    semantically searchable form

22
Five bangs for your GO buck
  • science base
  • cross-species database integration
  • cross-granularity database integration
  • through links to the things which are of
    biomedical relevance
  • → semantic searchability links people to software

23
but now
  • need to improve the quality of GO to support
    more rigorous logic-based reasoning across the
    data annotated in its terms
  • need to extend the GO by engaging ever broader
    community support for the addition of new terms
    and for the correction of errors
24
but also
need to extend the methodology to other domains,
including clinical domains → need for
  • disease ontology
  • immunology ontology
  • symptom (phenotype) ontology
  • clinical trial ontology
  • ...
25
the problem
  • existing clinical vocabularies are of variable
    quality and low mutual consistency
  • need for prospective standards to ensure mutual
    consistency and high quality of clinical
    counterparts of GO
  • need to ensure consistency of the new clinical
    ontologies with the basic biomedical sciences
  • if we do not start now, the problem will only
    get worse
26
the solution
  • establish common rules governing best practices
    for creating ontologies and for using these in
    annotations
  • apply these rules to create a complete suite of
    orthogonal interoperable biomedical reference
    ontologies
  • this solution is already being implemented

27
First step (2003)
  • a shared portal for (so far) 58 ontologies
  • (low regimentation)
  • http://obo.sourceforge.net → NCBO BioPortal

28
(No Transcript)
29
Second step (2004): reform efforts initiated, e.g.
linking GO to other OBO ontologies to ensure
orthogonality
GO

Cell type

New Definition
Osteoblast differentiation: Processes whereby an
osteoprogenitor cell or a cranial neural crest
cell acquires the specialized features of an
osteoblast, a bone-forming cell which secretes
extracellular matrix.
30
Third step (2006)
The OBO Foundry http://obofoundry.org/
31
  • a family of interoperable gold standard
    biomedical reference ontologies to serve the
    annotation of inter alia
  • scientific literature
  • model organism databases
  • clinical trial data

The OBO Foundry
32
A prospective standard
  • designed to guarantee interoperability of
    ontologies from the very start (contrast to post
    hoc mapping)
  • established March 2006
  • 12 initial candidate OBO ontologies focused
    primarily on basic science domains
  • several being constructed ab initio
  • by influential consortia who have the authority
    to impose their use on large parts of the
    relevant communities.

33
  • GO Gene Ontology
  • ChEBI Chemical Ontology
  • CL Cell Ontology
  • FMA Foundational Model of Anatomy
  • PaTO Phenotype Quality Ontology
  • SO Sequence Ontology
  • CARO Common Anatomy Reference Ontology
  • CTO Clinical Trial Ontology
  • FuGO Functional Genomics Investigation Ontology
  • PrO Protein Ontology
  • RnaO RNA Ontology
  • RO Relation Ontology

new
34
(No Transcript)
35
Annotations plus ontologies yield an ever-growing
computer-interpretable map of biological
reality.
36
Building out from the original GO
37
  • Disease Ontology (DO)
  • Biomedical Image Ontology (BIO)
  • Upper Biomedical Ontology (OBO UBO)
  • Environment Ontology (EnvO)
  • Systems Biology Ontology (SBO)

Under consideration
38
  • OBO Foundry: a subset of OBO ontologies, whose
    developers have agreed in advance to accept a
    common set of principles reflecting best practice
    in ontology development, designed to ensure
  • tight connection to the biomedical basic
    sciences
  • compatibility
  • interoperability, common relations
  • formal robustness
  • support for logic-based reasoning

39
CRITERIA
  • The ontology is OPEN and available to be used by
    all.
  • The ontology is in, or can be instantiated in, a
    COMMON FORMAL LANGUAGE.
  • The developers of the ontology agree in advance
    to COLLABORATE with developers of other OBO
    Foundry ontologies where domains overlap.

40
  • UPDATE: The developers of each ontology commit to
    its maintenance in light of scientific advance,
    and to soliciting community feedback for its
    improvement.
  • ORTHOGONALITY: They commit to working with other
    Foundry members to ensure that, for any
    particular domain, there is community convergence
    on a single controlled vocabulary.

CRITERIA
41
for science
  • if we annotate a database or body of literature
    with one high-quality biomedical ontology, we
    should be able to add annotations from a second
    such ontology without conflicts

orthogonality of ontologies implies additivity of
annotations
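
One way to read "orthogonality implies additivity" is sketched in Python here: if two ontologies define no term identifier in common, annotation sets built from each can be merged by simple union, with nothing to reconcile. The term tables, the record names (`paper-1`, `paper-2`), and the helper functions are hypothetical, and the identifiers are illustrative keys only; real ontologies would be loaded from their published files.

```python
def are_orthogonal(ontology_a, ontology_b):
    """True when the two ontologies define no term identifier in common."""
    return not (set(ontology_a) & set(ontology_b))


def add_annotations(existing, new):
    """Merge two {record_id: set_of_term_ids} mappings by union.

    Neither input is modified; a fresh mapping is returned.
    """
    merged = {record: set(terms) for record, terms in existing.items()}
    for record, terms in new.items():
        merged.setdefault(record, set()).update(terms)
    return merged


# Hypothetical term tables: identifier -> label (labels for readability only).
GO_TERMS = {"GO:0005634": "nucleus", "GO:0006915": "apoptotic process"}
CL_TERMS = {"CL:0000062": "osteoblast"}

# Hypothetical annotations of two pieces of literature.
GO_ANNOTATIONS = {"paper-1": {"GO:0006915"}, "paper-2": {"GO:0005634"}}
CL_ANNOTATIONS = {"paper-1": {"CL:0000062"}}

if are_orthogonal(GO_TERMS, CL_TERMS):
    COMBINED = add_annotations(GO_ANNOTATIONS, CL_ANNOTATIONS)
else:
    raise ValueError("overlapping term identifiers need reconciliation first")

print(sorted(COMBINED["paper-1"]))
```

The check covers only identifier overlap. Whether two ontologies also avoid describing the same domain under different identifiers is a curation question that code like this cannot settle.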
42
CRITERIA
  • IDENTIFIERS: The ontology possesses a unique
    identifier space within OBO.
  • VERSIONING: The ontology provider has procedures
    for identifying distinct successive versions to
    ensure BACKWARDS COMPATIBILITY with annotation
    resources already in common use.
  • The ontology includes TEXTUAL DEFINITIONS and,
    where possible, equivalent formal definitions of
    its terms.
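
The IDENTIFIERS and VERSIONING criteria lend themselves to simple automated checks. The Python sketch below assumes identifiers of the form PREFIX:digits and treats a release as backwards compatible when every identifier from the previous release is still present (possibly flagged obsolete). Both the identifier pattern and this reading of compatibility are assumptions for illustration, not rules quoted from the Foundry, and the `XO` ontology and its releases are invented.

```python
import re

# Assumed identifier shape: alphabetic prefix, colon, digits.
ID_PATTERN = re.compile(r"^([A-Za-z]+):(\d+)$")


def id_prefix(term_id):
    """Return the prefix of an identifier, or raise if it is malformed."""
    match = ID_PATTERN.match(term_id)
    if match is None:
        raise ValueError(f"malformed identifier: {term_id!r}")
    return match.group(1)


def foreign_ids(term_ids, expected_prefix):
    """Identifiers that fall outside the ontology's own identifier space."""
    return sorted(t for t in term_ids if id_prefix(t) != expected_prefix)


def dropped_ids(previous_release, current_release):
    """Identifiers present in the previous release but missing from the current one."""
    return sorted(set(previous_release) - set(current_release))


# Hypothetical releases of an invented ontology "XO": identifier -> term record.
V1 = {
    "XO:0000001": {"name": "term one", "obsolete": False},
    "XO:0000002": {"name": "term two", "obsolete": False},
}
V2 = {
    "XO:0000001": {"name": "term one", "obsolete": False},
    "XO:0000002": {"name": "term two", "obsolete": True},  # retired, not deleted
    "XO:0000003": {"name": "term three", "obsolete": False},
}
V3_BAD = {
    "XO:0000001": {"name": "term one", "obsolete": False},
    "YO:0000009": {"name": "borrowed term", "obsolete": False},
}

print(foreign_ids(V2, "XO"), dropped_ids(V1, V2))
print(foreign_ids(V3_BAD, "XO"), dropped_ids(V1, V3_BAD))
```

Marking a term obsolete while keeping its identifier, as in `V2`, lets existing annotations continue to resolve; deleting it, as in `V3_BAD`, would leave them dangling.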

43
  • CLEARLY BOUNDED: The ontology has a clearly
    specified and clearly delineated content.
  • DOCUMENTATION: The ontology is well-documented.
  • USERS: The ontology has a plurality of
    independent users.

CRITERIA
44
  • COMMON ARCHITECTURE: The ontology uses relations
    which are unambiguously defined following the
    pattern of definitions laid down in the OBO
    Relation Ontology.
  • Smith et al., Genome Biology 2005, 6:R46

CRITERIA
45
OBO Relation Ontology
46
  • Further criteria will be added over time in light
    of lessons learned in order to bring about a
    gradual improvement in the quality of Foundry
    ontologies
  • ALL FOUNDRY ONTOLOGIES WILL BE SUBJECT TO
    CONSTANT UPDATE IN LIGHT OF SCIENTIFIC ADVANCE

IT WILL GET HARDER
47
  • But not everyone needs to join
  • The Foundry is not seeking to serve as a check on
    flexibility or creativity
  • ALL FOUNDRY ONTOLOGIES WILL ENCOURAGE COMMUNITY
    CRITICISM, CORRECTION AND EXTENSION WITH NEW
    TERMS

IT WILL GET HARDER
48
  • to introduce some of the features of SCIENTIFIC
    PEER REVIEW into biomedical ontology development
  • CREDIT for high quality ontology development work
  • KUDOS for early adopters of high quality
    ontologies / terminologies e.g. in reporting
    clinical trial results

GOALS
49
  • to provide a FRAMEWORK OF RULES to counteract
    the current policy of ad hoc creation of new
    annotation schemas by each clinical research
    group
  • REUSABILITY: if data-schemas are formulated using
    a single well-integrated framework ontology
    system in widespread use, then this data will
    itself to this degree become more widely
    accessible and usable

GOALS
50
  • to serve as BENCHMARK FOR IMPROVEMENTS in
    discipline-focused terminology resources
  • once a system of interoperable reference
    ontologies is there, it will make sense to
    calibrate existing terminologies in its terms in
    order to achieve more robust alignment and
    greater domain coverage
  • exploit the avenue of EVIDENCE-BASED MEDICINE
    (NIH CLINICAL RESEARCH NETWORKS) to foster their
    use by clinicians

GOALS
51
  • June 2006: establishment of MICheck
  • reflects growing need for prescriptive
    checklists specifying the key information to
    include when reporting experimental results
    (concerning methods, data, analyses and results).

the vision is spreading
52
  • MICheck: a common resource for minimum
    information checklists, analogous to OBO / NCBO
    BioPortal
  • MICheck Foundry will create a suite of
    self-consistent, clearly bounded, orthogonal,
    integrable checklist modules
  • Taylor CF, et al. Nature Biotech, in press

MICheck Foundry
53
  • Transcriptomics (MIAME Working Group)
  • Proteomics (Proteomics Standards Initiative)
  • Metabolomics (Metabolomics Standards Initiative)
  • Genomics and Metagenomics (Genomic Standards
    Consortium)
  • In Situ Hybridization and Immunohistochemistry
    (MISFISHIE Working Group)
  • Phylogenetics (Phylogenetics Community)
  • RNA Interference (RNAi Community)
  • Toxicogenomics (Toxicogenomics WG)
  • Environmental Genomics (Environmental Genomics
    WG)
  • Nutrigenomics (Nutrigenomics WG)
  • Flow Cytometry (Flow Cytometry Community)

MICheck/Foundry communities
54
  • how to replicate the successes of the GO in
    clinical medicine?
  • choose two or three representative disease
    domains
  • work out reasoning challenges for those domains
  • work with specialists to create ontologies
    interoperable with OBO Foundry basic science
    ontologies to address these reasoning challenges
  • work with leaders of professional associations
    and of clinical trial initiatives to foster the
    collection of clinical data annotated in their
    terms

Fourth Step (the future)
55
OBO Foundry coverage (canonical ontologies)
56
(No Transcript)
57
Draft Ontology for Acute Respiratory Distress
Syndrome
58
Draft Ontology for Multiple Sclerosis
59
Draft Ontology for Multiple Sclerosis
to apprehend what is unknown requires a complete
demarcation of the relevant space of alternatives


60
with thanks (inter alia) to
with thanks to