Title: Part III Putting Ontologies to Work
1Part III: Putting Ontologies to Work
- Examples of use by us.
- Ontologies and the Grid.
- Survey of other examples.
- The Story of Gene Ontology.
2What are they used for taster
- A community reference.
- Support for community practice.
- Specification of database or database content.
- Controlled vocabulary for annotation.
- Application database interoperability.
- Consensual vocabulary.
- Searching, query formation indexing.
- Classification, interface/portal driving.
- Agent or service metadata management.
- Grid services description, advert, discover.
3What are they used for taster
- Support for knowledge intensive applications.
- Text extraction, decision support, resource planning, intelligent interfaces.
- Knowledge repository structure.
- Knowledge acquisition.
4Examples of ontologies in use by us
- AKT: reference ontology for Depts of CS
- http://www.aktors.org
- COHSE: Open Hypermedia linking using ontology
- http://cohse.semanticweb.org
- Geodise: e-Science design optimisation ontology
- http://www.geodise.org
- Web services information integration study: Qinetiq
- myGrid: e-Science Grid Services ontology.
- http://www.mygrid.org.uk
- TAMBIS: repository integration using an ontology
- http://www.cs.man.ac.uk/tambis/
5The Semantic Web
OILEd, Protégé
OntoPortal, SEAL, COHSE
DAML+OIL
FaCT, Racer
RDF
COHSE
CREAM
http://www.semanticweb.org
6Agents on the Web
Studer2000
7The AKT project - Knowledge Services on the Semantic Web
- Content with extensive markup and annotation
- Markup will be significantly driven by ontologies
- Services will exploit this markup to deliver knowledge: the right content, to the right person/system, in the right form, at the right time
- Navigating
- Semantic querying
- Deliver integrated answers extended by derived facts
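A minimal sketch of what "answers extended by derived facts" can mean in practice. It is not the AKT implementation; the triples, class names and the two rules (transitive subClassOf, type inheritance) are assumptions chosen for illustration.

```python
# Toy triple store; every name below is hypothetical.
SUBCLASS_OF = "subClassOf"
TYPE = "type"


def derive(triples):
    """Return the triples plus facts derived by two rules, to a fixed point:
    subClassOf is transitive, and instances inherit superclass types."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if p2 == SUBCLASS_OF and o == s2:
                    if p == SUBCLASS_OF:
                        new.add((s, SUBCLASS_OF, o2))
                    elif p == TYPE:
                        new.add((s, TYPE, o2))
        if new <= facts:  # nothing new was derived
            return facts
        facts |= new


def instances_of(cls, facts):
    """Semantic query: all subjects typed with the given class."""
    return sorted(s for (s, p, o) in facts if p == TYPE and o == cls)


asserted = {
    ("Professor", SUBCLASS_OF, "Academic"),
    ("Academic", SUBCLASS_OF, "Person"),
    ("alice", TYPE, "Professor"),
}
derived = derive(asserted)

print(instances_of("Person", asserted))  # query over asserted facts only
print(instances_of("Person", derived))   # query over asserted + derived facts
```

The same query returns more when it runs over the derived closure, which is the point of the last bullet above.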
8Ontologies: The Semantic Backbone
9Ontologies: Putting them to Work
10Communities of Practice
- Objective
- Increase skills
- Agents
- Homogeneous
- Cognitive ability
- Accumulate and circulate best practice
- Recruitment
- Self selecting
- Knowledge production
- Learning through the work
- Community glue
- Common passion
11The Community of Practice (COP) Example
12Visualising COPs
13Integrating Knowledge Services through ontologies
14COHSE demo http://cohse.semanticweb.org
- Annotation of web resources with DAML+OIL concepts from ontologies
- Dynamic linking of resources based on concepts and reasoning.
Linking
Annotation
15COHSE
16Web Services
17Semantic Networking
The interoperability layer migrates from the
syntactic to the semantic
18Service/Resource Descriptions
- Automated:
- Discovery and search
- Selection
- Matching
- Composition
- Interoperation
- Invocation
- Execution monitoring
- Need:
- Fine grain description of operations and message parameters.
- Coarse grain description of service and behaviour as a whole.
19Web Services Stack (adapted from IBM)
Ontologies
20Description and Discovery (from IBM)
Wire
Description
Discovery
21Grid Enabled Optimisation and Design Search
(GEODISE)
Geodise will provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools, industrial strength analysis codes, and distributed computing and data resources
Funded by EPSRC
http://www.geodise.org
22GEODISE Demo
(1) Security Infrastructure: Authentication and Authorisation
(2) Define Geometry to optimise
Nacelle Design
Axisymmetric (2D)
3D
(3) Sample Objective function to build Response
Surface Model
Grid Computing
23GEODISE Demo
(4) Optimise over Response Surface Model
(5) Grid Database Query and Postprocessing of
Results
Automated Data Archiving
24Ontologies and Knowledge in GEODISE
Ontology and Knowledge Capture for Design Process
Ontology Driven Service Composition
25Ontologies and Web Services: information and knowledge integration - Qinetiq
- Combine Web Services and Semantic Web
- Ontologies of services
- Ontologically-informed service parameters
- Build on emergent DAML Services work
- Separate domain-independent components of messages from domain-specific parts
- Domain-independent pragmatics
- Domain-specific contents
- Agent Communication Languages
- FIPA ACL Process Ontology
26Agentifying Web Services
27Domain Ontology - entities
28Domain Ontology - events
29FIPA ACL Ontology - processes
30Situational Awareness in Humanitarian Relief
- Unseasonal rainfall brings about a rapid rise in water levels and widespread flooding in a heavily populated river delta.
- Emerging humanitarian crisis requires that flood-displaced persons be moved to a place of safety and provided with food and shelter.
31Delta Flood Scenario
- Number of possible data sources
- Media reports
- Meteorological forecasts and reports
- ELINT
- Reports from NGOs
- Other field reports
- Multiple foci of situational awareness
- Refugee concentrations and movements
- Communications and transport infrastructure
- Weather conditions and river levels, current and
predicted
32System Architecture
33Core Knowledge Technologies in this Scenario
- Information Customisation
- Views on data for needs of different users
- Service descriptions used to select sets of services to provide data to user
- Information Integration
- Large number of possible data sources
- Integration of data from different services presents challenges
- Confirmation of reports
- Certainty
- Provenance
- Timeliness
- Information Extraction
- Newsfeeds from media agencies
- Text-based
- Geospecific references in stories
- Push medium
- Extract salient details
- Time, place, etc
- Confirm against other sources
34myGrid http://www.mygrid.org.uk
Building personalised extensible environments for
data-intensive in silico experiments in biology
myGrid Middleware
35myGrid http://www.mygrid.org.uk
Specialises. All concepts are subclassed from those in the more general ontology.
Upper level ontology
Contributes concepts to form definitions.
Task ontology
Publishing ontology
Informatics ontology
Molecular biology ontology
Organisation ontology
Bioinformatics ontology
Web service ontology
36myGrid demo
Portal
Repository Client
Ontology Client
Workflow Client
Meta Data Ontology
Personal Repository
Workflow Repository
Meta Data Service Type Directory
37myGrid http://www.mygrid.org.uk
- Data intensive Grid for bioinformatics
- Services described using DAML+OIL ontology.
- Services organised, queried and matched using subsumption reasoning.
- Descriptions controlled by concept satisfiability reasoning.
38Ontologies and Services
- Typing
- Controlling inputs and outputs of services.
- Mapping between WSDL / OGSA XML Schema types and (DAML+OIL) concepts
- Classifying
- Indexing services and data
- OGSA factories and service instances.
- Organising services based on reasoning over the
service descriptions.
39Capturing Semantics (Tuecke)
- Service description obviously captures interface syntax
- But capturing semantic meaning is critical for discovery
- Not only does the service accept an operation request with a particular signature
- But it should also respond as expected
- "As expected" is usually defined offline in specifications
- Approach: name everything
- Use names as a basis for reasoning about semantics
- THIS IS NOT ENOUGH!
40Semantic Web Services
UDDI
types
XML Schema
businessEntity
messages
businessService
portType
operation
bindingTemplate
binding
service
tModel
WSDL
41SemanticGrid Services
types
XML Schema
messages
serviceType
Registry
portType
operation
serviceDataDescription
compatibilityAssertion
serviceImplementation
42SemanticGrid Services
types
XML Schema
messages
serviceType
Registry FindServiceData SubscribeServiceData
portType
operation
serviceDataDescription
serviceType
serviceData
43Ontology opportunities
- Types and Messages
- <message name="findFunctionRequest">
- <part name="protein" type="myxsd:sequence"/>
- </message>
44Ontology opportunities
- portType
- <portType name="ProteinServiceBean">
- <operation name="findFunction">
- <input name="findFunctionRequest" message="tns:findFunctionRequest"/>
- <output name="findFunctionResponse" message="tns:findFunctionResponse"/>
- </operation>
- </portType>
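A minimal sketch of where an ontology could attach to the WSDL fragment on these two slides: read the message parts behind each operation and look up a concept for each XML Schema type. The surrounding definitions element, the namespace URIs bound to tns and myxsd, the TYPE_TO_CONCEPT table and the concept name ProteinSequence are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# The slide fragment, wrapped so that it parses on its own.
# The tns and myxsd namespace URIs are placeholders.
WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="urn:example:protein"
             xmlns:myxsd="urn:example:types">
  <message name="findFunctionRequest">
    <part name="protein" type="myxsd:sequence"/>
  </message>
  <portType name="ProteinServiceBean">
    <operation name="findFunction">
      <input name="findFunctionRequest" message="tns:findFunctionRequest"/>
      <output name="findFunctionResponse" message="tns:findFunctionResponse"/>
    </operation>
  </portType>
</definitions>
"""

# Hypothetical mapping from XML Schema type names to ontology concepts.
TYPE_TO_CONCEPT = {"myxsd:sequence": "ProteinSequence"}


def annotate_inputs(wsdl_text):
    """Map each operation name to (part name, concept) pairs for its input."""
    root = ET.fromstring(wsdl_text)
    ns = {"w": WSDL_NS}
    # Index messages by name: message name -> [(part name, xsd type), ...]
    messages = {
        m.get("name"): [(p.get("name"), p.get("type"))
                        for p in m.findall("w:part", ns)]
        for m in root.findall("w:message", ns)
    }
    result = {}
    for port in root.findall("w:portType", ns):
        for op in port.findall("w:operation", ns):
            # Strip the namespace prefix from e.g. "tns:findFunctionRequest".
            msg_name = op.find("w:input", ns).get("message").split(":", 1)[-1]
            result[op.get("name")] = [
                (part, TYPE_TO_CONCEPT.get(xsd_type, "unmapped"))
                for part, xsd_type in messages.get(msg_name, [])
            ]
    return result


print(annotate_inputs(WSDL))
```

With the parts typed by concepts rather than bare schema names, a registry can reason about what an operation consumes instead of comparing strings.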
45Service Data Description
Metadata: drawn from an ontology acting as a schema
serviceDataDescription
describes
Data values: drawn from an ontology acting as a controlled vocabulary for content
serviceData
46Finding and substituting
- Building registries
- Topology of services used for discovery
- Finding based on property descriptions on Service Data
- Matching "close enough" (cf. Condor ClassAds)
- A service that maps a protein sequence to a gene product function
- A service that maps an enzyme sequence to a gene product function
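A minimal sketch of "close enough" matching for the two services named above, assuming a toy is-a taxonomy in which an enzyme sequence is a kind of protein sequence. The concept names, the IS_A table and the substitution rule are illustrative assumptions, not the matching algorithm of any particular registry.

```python
# Hypothetical is-a taxonomy: child concept -> parent concept.
IS_A = {
    "EnzymeSequence": "ProteinSequence",
    "ProteinSequence": "Sequence",
    "GeneProductFunction": "Function",
}


def subsumes(general, specific):
    """True if `specific` is `general` or one of its descendants."""
    while specific is not None:
        if specific == general:
            return True
        specific = IS_A.get(specific)
    return False


def can_substitute(offered, requested):
    """An offered service can stand in for a requested one if it accepts
    everything the request supplies and returns nothing more general
    than the request expects."""
    return (subsumes(offered["input"], requested["input"])
            and subsumes(requested["output"], offered["output"]))


protein_service = {"input": "ProteinSequence", "output": "GeneProductFunction"}
enzyme_service = {"input": "EnzymeSequence", "output": "GeneProductFunction"}

# The protein service also accepts enzymes, so it can replace the enzyme one.
print(can_substitute(protein_service, enzyme_service))
# The enzyme service cannot handle arbitrary proteins.
print(can_substitute(enzyme_service, protein_service))
```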
47Finding/Matching Grid Services
- FindServiceData
- queryByServiceDataName
- query by properties of service
48DAML-S http://www.daml.org
- Defines ontologies for the construction of
service models.
WSDL
UDDI
49DAML-S Service Profile
- Derive service adverts and service requests, populating service registries, automated service discovery, matching of capability adverts to requests
DAML-S
DAML+OIL based types
- Functionality
- High level overview of purpose
- Preconditions and requirements
- Inputs and Outputs
- Effects of execution
- Levels of quality
- Service provider
Process Model
Inputs/Outputs
Atomic Process
Operation
Message
Binding to SOAP, HTTP etc
WSDL
50DAML-S Service Models
- Automated service invocation, composition, interoperation, coordination, monitoring.
- Capability matching
- More detailed description of operations
- Described in terms of a set of processes
- Preconditions
- Inputs and Outputs
- Effects of execution
- Statements about runtime behaviour and interaction patterns of services through workflow constructs: sequence, fork
- Service Ontologies
- Ontologies of service types
- Ontologies to augment service requests
- Quality
- Costs
- Security
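A minimal sketch of automated composition from input/output descriptions, assuming each service is described only by one input concept and one output concept. This is plain Python, not DAML-S syntax; the Profile class, the registry entries and the concept names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Simplified service description: a name, one input, one output."""
    name: str
    input: str
    output: str


def compose(profiles, have, want):
    """Breadth-first search for the shortest sequence of services that
    turns `have` into `want`. Returns service names, or None if no
    chain exists."""
    queue = deque([(have, [])])
    seen = {have}
    while queue:
        concept, path = queue.popleft()
        if concept == want:
            return path
        for p in profiles:
            if p.input == concept and p.output not in seen:
                seen.add(p.output)
                queue.append((p.output, path + [p.name]))
    return None


REGISTRY = [
    Profile("translate", "DNASequence", "ProteinSequence"),
    Profile("findFunction", "ProteinSequence", "GeneProductFunction"),
    Profile("findStructure", "ProteinSequence", "ProteinStructure"),
]

print(compose(REGISTRY, "DNASequence", "GeneProductFunction"))
```

A fuller treatment would also check preconditions and effects, and would compare concepts by subsumption rather than by equality.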
51What do ontologies offer?
- Common framework for integration
- OpenMMS, TAMBIS, ONION
- Search support, querying and matching
- GO, MGED, UMLS, MeSH
- Intelligent interfaces for queries and data capture
- Ingenuity web based products, TAMBIS.
- Control Semantics
52Integration through shared descriptions using ontologies
- Conformance to shared controlled vocabularies for content.
- Shared semantic model for metadata and data
- Gene Ontology for describing gene function in FlyBase, MouseBase, SGD, Swiss-Prot, InterPro etc
- UMLS, PharmGKB, MGED
53From Bertram Ludäscher, SDSC
54Information integration through mediation
- TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources.
- Domain ontology as global schema
- Sources not visible to users.
- http://img.cs.man.ac.uk/tambis
55TAMBIS Architecture
- Ontology described using Description Logic.
- Query formulation: ontology browsing and concept construction.
- Reasoning about concept models, answering questions like:
- What can I say about Proteins?
- What are the parents of concept X?
- Wrapper service: Kleisli.
Biological Bioinformatics Ontology
Query Formulation Interface
56TAMBIS
- Query multiple databases using an ontology as a
global model
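A minimal sketch of mediation through a global model, assuming two in-memory sources with different local field names. It is not how TAMBIS itself is built (the slides describe a Description Logic ontology and Kleisli wrappers); the sources, wrappers and records are invented for illustration.

```python
# Hypothetical in-memory sources, each with its own local vocabulary.
SOURCE_A = [{"acc": "P1", "desc": "kinase"}, {"acc": "P2", "desc": "receptor"}]
SOURCE_B = [{"id": "X9", "function": "kinase"}]


# Wrappers translate a query phrased in the global model into each
# source's own field names and return results in one shared shape.
def wrap_a(function):
    return [{"protein": r["acc"], "source": "A"}
            for r in SOURCE_A if r["desc"] == function]


def wrap_b(function):
    return [{"protein": r["id"], "source": "B"}
            for r in SOURCE_B if r["function"] == function]


# Global model: concept -> wrappers able to answer questions about it.
GLOBAL_MODEL = {"Protein": [wrap_a, wrap_b]}


def query(concept, function):
    """Ask every source that knows about `concept`; the caller never
    names or sees the individual sources."""
    results = []
    for wrapper in GLOBAL_MODEL.get(concept, []):
        results.extend(wrapper(function))
    return results


print(query("Protein", "kinase"))
```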
57Specification of content
- Controlled vocabularies
- Controlled description
- Accurate data collection
- Enhanced and accurate retrieval
- Expectation setting
- Step towards unification and interoperation
- Gene Ontology, MGED, UMLS, Mouse Anatomy etc
- Taxonomy
- Organisational classification
- Index
- Systematic explicitness
58Knowledge Driven content
Controlled Vocabularies for Describing Alleles
59Ontology driven forms.
- Clinergy.
- Intelligent interfaces.
60What do ontologies offer?
- Support for knowledge intensive applications.
- Text extraction.
- Text generation.
- Problem Solving Environments.
- Decision support.
- Control Semantics
PASTA: Protein structure extraction from texts to support the annotation of PDB http://www.dcs.shef.ac.uk/research/groups/nlp/
61What do ontologies offer?
- Knowledge repository.
- Knowledge discovery.
- Decision support.
- Hypothesis generation.
- RiboWeb, Ingenuity, PharmGKB, EcoCyc, PlasmoCyc, Biopathways, AROM
- Control Semantics Inference
62Role of Knowledge
Problem solving Environments
Describing, Advertising, Discovering Matching
Services
Intelligent portals
Model and content
Ontologies
Composition Constraining, Description generation
controlled content annotating results
Typing inputs outputs
Typing, controlled content
Metadata
Databases
Workflows
63http://www.geneontology.org
64Gene Ontology http://www.geneontology.org
- A dynamic controlled vocabulary that can be applied to all eukaryotes
- Built by the community for the community.
- Three organising principles
- Molecular function, Biological process, Cellular component
- Is-a and part-of taxonomy, but not good!
- 10,000 concepts
- Lightweight ontology, poor semantic rigour.
- OK when small and used for annotation.
- Obstacle when large, evolving and used for mining.
65Gene Ontology used for
- Controlled annotation: quality and consistency
- Mediation and navigation through content
- Classification / query / index
InterPro
SWISS-PROT
BLAST
66Descriptive knowledge
- Integration of structured(?) and semi-structured
data.
67Annotation
GO annotations
Gene detail page in MGD for the vitamin D
receptor gene, Vdr
68Opens browser
69Search returns children
70Returns annotated terms
73Gene Ontology as index
- Relies on the Gene Ontology's taxonomy.
- Lack of semantics and rigour begins to be
problematic.
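A minimal sketch of using a taxonomy as an index: a search on a term also returns whatever is annotated to its children, as in the browser slides above. The hierarchy is a simplified illustration rather than a copy of GO, and the gene names are invented.

```python
# Illustrative is-a hierarchy: parent term -> child terms.
CHILDREN = {
    "binding": ["receptor binding", "DNA binding"],
    "receptor binding": ["hormone receptor binding"],
}

# Illustrative annotations: term -> genes annotated directly to it.
ANNOTATIONS = {
    "binding": ["geneC"],
    "DNA binding": ["geneA"],
    "hormone receptor binding": ["geneB"],
}


def descendants(term):
    """All terms below `term` in the hierarchy."""
    found, stack = [], [term]
    while stack:
        current = stack.pop()
        for child in CHILDREN.get(current, []):
            if child not in found:
                found.append(child)
                stack.append(child)
    return found


def genes_for(term):
    """Genes annotated to `term` or to any of its descendants."""
    terms = [term] + descendants(term)
    return sorted({g for t in terms for g in ANNOTATIONS.get(t, [])})


print(genes_for("binding"))
print(genes_for("receptor binding"))
```

The retrieval is only as good as the hierarchy: a wrong or missing is-a link silently drops or adds results, which is why weak semantics start to matter at this point.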
74How did GO happen?
- A respected leader with vision and influence who got some financial backing.
- A large collection of data resources.
- A motivated community with a problem, prepared to do the work.
- A culture of curation.
- Early adopters.
- A legacy of controlled vocabularies.
- A desire to do something NOW.
75Approach Phase 1
- Childhood
- A small team built version 0 practically in their spare time.
- Published using primitive tools. No attempt to get it right before publication.
- Responsive feedback and curation from the community, who feel that they will be listened to and that it is theirs.
- Used to annotate repositories. One purpose.
76Approach Phase 2
- Adolescence
- Funded team of curators.
- Larger teams of specialists.
- Regular face-to-face meetings.
- Development and deployment tools and environments.
- Open source, simple.
- Methodology for evolution and change
- Deprecation of terms
- Linkage to other Controlled Vocabularies, e.g. Swiss-Prot keywords.
77Approach Phase 3
- Adulthood
- Adoption of better methodologies and tools for cleaning up, managing and evolving the ontology.
- DAML+OIL
- GONG, GOET projects.
- Different purposes to original intent.
78Summary
- Building ontologies for the sake of it is pointless, but this is sometimes forgotten!
- Use the ontology before it is finished because it never will be.
- Lots of work on ontology creation.
- Less work on ontology deployment.
- Most architectures use an ontology server.
- If it is useful it will be used.
79Further Reading
80Components in an Ontology Driven Approach
Reasoner
Ontology Language
Lexicon
Ontology Server
Ontologies
Parser