Title: Ontology Engineering
1 Ontology Engineering: Ontology Acquisition
- Primarily Ontology Acquisition
2 Building Ontologies
- No field of Ontological Engineering equivalent to Knowledge or Software Engineering
- No standard methodologies for building ontologies
- Such a methodology would include
  - a set of stages that occur when building ontologies
  - guidelines and principles to assist in the different stages
  - an ontology life-cycle which indicates the relationships among stages.
- Gruber's guidelines for constructing ontologies are well known.
3 The Development Lifecycle
- Two kinds of complementary methodologies emerged
  - Stage-based, e.g. TOVE (Uschold96)
  - Iterative evolving prototypes, e.g. MethOntology (Gomez Perez94).
- Most have two stages
  - Informal stage
    - ontology is sketched out using either natural language descriptions or some diagram technique
  - Formal stage
    - ontology is encoded in a formal knowledge representation language that is machine computable
- An ontology should ideally be communicated to people and unambiguously interpreted by software
  - the informal representation helps the former
  - the formal representation helps the latter.
4 A Provisional Methodology
- A skeletal methodology and life-cycle for building ontologies
- Inspired by the software engineering V-process model
- The overall process moves through a life-cycle.
The left side charts the processes in building an ontology
The right side charts the guidelines, principles and evaluation used to quality assure the ontology
5 Methodology
[Figure: V-process model of the methodology. Process labels: identify purpose and scope; knowledge acquisition; conceptualisation; integrating existing ontologies; ontology learning; encoding; representation; ontology in use; maintenance. Model labels: user model; conceptualisation model; implementation model. Quality assurance labels: evaluation (coverage, verification, granularity); conceptualisation principles (commitment, conciseness, clarity, extensibility, coherency); encoding/representation principles (encoding bias, consistency, house styles and standards, reasoning system exploitation).]
6 An Ontology Building Life-cycle
[Figure: ontology building life-cycle. Labels: identify purpose and scope; knowledge acquisition; conceptualisation; integrating existing ontologies; encoding; ontology learning; building; language and representation; available development tools; consistency checking; evaluation.]
7 Questions
- How do we obtain our conceptualisation?
- The role of texts, experts and other sources
- How do we derive a conceptualisation from texts etc.?
- How do we cope with tacit conceptualisations?
- How do we use models with the expert?
- How do we validate the conceptualisation?
8 Knowledge Acquisition
- The process of capturing knowledge, including various forms of conceptualisation, from whatever source, including experts, documents, manuals, case studies etc.
- Knowledge Elicitation
  - techniques that are used to acquire knowledge directly from human experts
- Machine Learning
  - use of AI pattern recognition methods to infer patterns from sets of examples
9 Problems of Knowledge Elicitation
- Techniques
  - limited range
  - ignorance
- Experts
  - poor appreciation of different types
  - ignorance
- Expertise
  - poor appreciation of different types
  - ignorance
  - need to organise knowledge into higher level units
10 First Steps - Initial Understanding of the Domain
- Problem Description
- List knowledge resources (verify that knowledge really exists)
  - Experts, Technical Authorities
  - Text Books, Training Material
  - Manuals and Procedures
  - Databases and Case Histories
- Produce domain yellow pages
- Establish performance metrics
- Initial task environment analysis
11 Document and Text Analysis
- Look at the structure
  - how material is organised into topics and sub-topics
- Content analysis
  - Extract major linguistic categories
    - nouns: objects and concepts
    - verbs: relations
    - modifiers: properties and values
    - connectives: rules and links
- Use intermediate representations
  - Pseudo production rules
  - Small concept networks and hierarchies
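The mapping from linguistic categories to candidate ontology elements can be sketched in a few lines of Python. This is only an illustration: the tokens are tagged by hand, and the sentence fragment, the `CATEGORY_TO_ELEMENT` table name and the `candidate_elements` helper are assumptions introduced here, not part of any tool mentioned in these slides.

```python
from collections import defaultdict

# Linguistic category -> kind of ontology element it suggests
# (taken from the content-analysis bullets above).
CATEGORY_TO_ELEMENT = {
    "noun": "concept",
    "verb": "relation",
    "modifier": "property",
    "connective": "rule",
}


def candidate_elements(tagged_tokens):
    """Group hand-tagged (word, category) pairs by the element they suggest."""
    candidates = defaultdict(list)
    for word, category in tagged_tokens:
        element = CATEGORY_TO_ELEMENT.get(category)
        # Skip categories we do not map, and avoid duplicate entries.
        if element is not None and word not in candidates[element]:
            candidates[element].append(word)
    return dict(candidates)


# Hypothetical hand-tagged fragment: "if the granite contains coarse crystals"
tagged = [
    ("if", "connective"),
    ("the", "other"),
    ("granite", "noun"),
    ("contains", "verb"),
    ("coarse", "modifier"),
    ("crystals", "noun"),
]
result = candidate_elements(tagged)
```

The output is only a list of candidates; deciding which of them belong in the conceptualisation still needs the expert.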
12 Problems of Document and Text Analysis
- Documents and texts are written for specific purposes that may not reveal real knowledge or explicit conceptualisations
  - Duty logs and rotas
  - Teaching texts
- All textual analysis is a form of content analysis: the interpreter may or may not be imputing the correct conceptualisation
- Difficult to reconstruct the context: need to capture acquisition and design rationales
13 Types of Expert
- The Academic
  - Values logical consistency
- The Professional
  - Solutions that work in the context of information overload
- The Samurai
  - Pure performance
- State of knowledge varies
- Required solutions vary
14 Session Plan
- The importance of an acquisition plan
- A detailed agenda of what is to be covered during a KA session.
- Should include
  - an introduction describing the objectives
  - a description of the techniques to be used
  - questions to be asked (if required)
  - timings
- Should be sent to the expert at least one day in advance of the session
15 KA Techniques
- Methods that help acquire and validate knowledge from an expert during a KA session.
- Three important types
  - natural techniques
  - contrived techniques
  - modelling and mediating representation techniques
16 KA Typology
17 KA Techniques 1
18 Natural Techniques
- KA techniques that involve the expert performing tasks they would normally do as part of their job.
- Variations
  - Interviews
  - Observational techniques
  - (Group meetings)
  - (Questionnaires)
19 Interviews
- KA technique in which the knowledge engineer asks questions of the expert or end user.
- Essential method for acquiring explicit conceptualisations and knowledge, but poor for tacit knowledge.
- Variations
  - Unstructured interview
  - Semi-structured interview
  - Structured interview
20 Unstructured Interview
- An interview in which the knowledge engineer has no pre-defined questions.
- Basically a chat to find out broad aspects of the expert's knowledge.
- An aid to designing a KA session plan.
21 Semi-structured Interview
- An interview in which pre-prepared questions are used to focus and scope what is covered
- Also involves unprepared supplementary questions for clarification and probing.
- Questions should be
  - designed carefully
  - sent to the expert beforehand
  - asked verbatim (read out as written)
  - given timings
- The recommended interview technique at the start of most KA projects.
22 Structured Interview
- An interview in which the knowledge engineer follows a pre-defined set of structured questions but can ask no supplementary questions.
- Often involves filling in a matrix or generic headings.
23 KA Techniques 2
24 Contrived Techniques
- KA techniques that involve the expert performing tasks they would not normally do as part of their job.
- Most of these techniques come from psychology
- Useful for capturing tacit knowledge, excellent for conceptualisations.
- Important types
  - card sorting
  - three card trick
  - repertory grid technique
  - constrained tasks
  - 20-questions
  - commentating
  - teach back
25 Card Sorting
- KA technique in which a collection of concepts (or other knowledge objects) is written on separate cards and sorted into piles by an expert in order to elicit classes based on attributes.
- Also enables significant elicitation of properties and dimensions
- Used to capture concept knowledge and tacit knowledge
- Use in conjunction with triadic method
- Can also sort objects or pictures instead of cards
26 Triadic Elicitation Method
- KA technique used to capture the way in which an expert views the concepts in a domain.
- Involves presenting three random concepts and asking in what way two of them are similar but different from the other one.
- Answer will give an attribute.
- A good way of acquiring tacit knowledge.
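The selection step of the triadic method is easy to mechanise. The sketch below draws three concepts at random and phrases the question for the expert; the function name `triad_prompt`, the wording of the question and the example concepts are assumptions made for illustration.

```python
import random


def triad_prompt(concepts, rng=None):
    """Pick three distinct concepts at random and build the triadic question."""
    if len(concepts) < 3:
        raise ValueError("need at least three concepts for a triad")
    rng = rng or random.Random()
    a, b, c = rng.sample(concepts, 3)  # sampling without replacement
    question = (
        f"In what way are two of {a}, {b} and {c} similar "
        "but different from the third?"
    )
    return (a, b, c), question


# Hypothetical concepts from the igneous rocks domain.
concepts = ["granite", "basalt", "gabbro", "rhyolite"]
triad, question = triad_prompt(concepts, random.Random(0))
```

Each answer from the expert is recorded as an attribute; repeating the draw with fresh triads builds up the attribute list.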
27 Repertory Grid Technique
- KA technique used for a number of purposes
  - to elicit attributes for a set of concepts
  - to rate concepts against attributes using a numerical scale
  - uses statistical analysis to arrange and group similar concepts and attributes
- A useful way of capturing concept knowledge and tacit knowledge
- Requires special software (PC-PACK)
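To make the idea of a grid concrete, here is a minimal sketch in which concepts are rated against attributes on a 1 to 5 scale and then compared pairwise. The ratings are invented for illustration, and the simple city-block distance stands in for the statistical analysis a dedicated tool would perform; it is not a description of how PC-PACK works.

```python
from itertools import combinations

# Hypothetical grid: each concept rated 1-5 against each attribute.
ATTRIBUTES = ["grain size", "silica content", "darkness"]
GRID = {
    "granite": [5, 5, 1],
    "rhyolite": [1, 5, 1],
    "gabbro": [5, 1, 5],
    "basalt": [1, 1, 5],
}


def distance(ratings_a, ratings_b):
    """City-block distance between two rows of ratings."""
    return sum(abs(x - y) for x, y in zip(ratings_a, ratings_b))


def closest_pairs(grid):
    """All concept pairs, most similar first (ties broken alphabetically)."""
    pairs = [
        (distance(grid[a], grid[b]), a, b)
        for a, b in combinations(sorted(grid), 2)
    ]
    return sorted(pairs)


ranked = closest_pairs(GRID)
```

With these made-up ratings the closest pairs differ only in grain size, which is the kind of grouping the expert would then be asked to confirm or correct.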
28 Repertory Grid Example
29 Repertory Grids - Demonstration using PC-PACK Laddering Tool
30 Constrained Tasks
- KA technique in which the expert performs a task they would normally do, but with constraints.
- Variations
  - limited time
  - limited data
- Useful for focusing the expert on essential knowledge and priorities
31 20-Questions
- KA technique in which the expert asks yes/no questions of the knowledge engineer in order to deduce an answer.
- The knowledge engineer need not know much about the domain, or have an answer in mind; they can just answer yes or no at random
- The questions asked provide a good way of quickly acquiring attributes in a prioritised order.
32 Commentating and protocol generation
- KA technique in which the expert provides a running commentary of their own or another's task performance.
- A valuable method for acquiring process knowledge and tacit knowledge.
- Variations
  - self-reporting
  - imaginary self-reporting
  - self-retrospective
  - shadowing
  - retrospective shadowing
33 Teach back
- KA technique in which the knowledge engineer explains knowledge from part of the domain back to the expert.
- The expert then makes comments.
- Helps reveal misunderstandings and clarifies terminology.
34 Laddering
- KA technique that involves the construction, modification and validation of trees.
- A valuable method for acquiring concept knowledge and, to a lesser extent, process knowledge.
- Can make use of various trees
  - concept tree
  - composition tree
  - attribute tree
  - process tree
  - decision tree
  - cause tree
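A concept tree of the kind built during laddering can be held as a nested dictionary. The sketch below is an assumption-laden illustration: the tree contents are a hypothetical fragment of the igneous rocks domain and `path_to` is a helper invented here to show how a ladder from the root down to a concept can be read back for validation.

```python
# Hypothetical concept tree (is-a links only); children are nested dicts.
CONCEPT_TREE = {
    "igneous rock": {
        "intrusive rock": {"granite": {}, "gabbro": {}},
        "extrusive rock": {"rhyolite": {}, "basalt": {}},
    }
}


def path_to(tree, target, trail=()):
    """Return the ladder from the root to `target`, or None if absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(children, target, here)
        if found:
            return found
    return None


ladder = path_to(CONCEPT_TREE, "basalt")
```

Reading a ladder back to the expert ("basalt is an extrusive rock, which is an igneous rock") is one simple way to validate the tree.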
35 Laddering - Demonstration using PC-PACK Laddering Tool
36 KA Techniques 3
37 Modelling Techniques
- KA techniques that use knowledge models as the focus for discussion, validation and modification of knowledge.
- Can use any form of model, but important types include
  - process mapping
  - concept mapping
  - state diagram mapping
38 Process Mapping
- KA technique that involves the construction, modification and validation of process maps.
- A valuable method for acquiring process knowledge and tacit knowledge.
39 Process Map - Example
40 Process Mapping - Demonstration using PC-PACK Diagram Tool
41 Concept Mapping
- KA technique that involves the construction, modification and validation of concept maps.
- A good method for acquiring concept knowledge.
42 Concept Map - Example
[Figure: concept map. Nodes: Author; Charles Dickens; Dostoevsky; Oliver Twist; Bleak House; Book; Russia; Page; Paper. Link labels: is a; wrote; written by; admired; shorter than; wrote on; born in; has part; made from.]
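A concept map is essentially a set of labelled links, so it can be stored as subject, link, object triples. The sketch below uses node and link names from the example, but the exact wiring of the original figure is not recoverable here, so the particular triples are assumptions; `objects_of` and `instances_of` are helper names invented for illustration.

```python
# Assumed triples (subject, link, object) built from the example's labels.
TRIPLES = [
    ("Charles Dickens", "is a", "Author"),
    ("Dostoevsky", "is a", "Author"),
    ("Charles Dickens", "wrote", "Oliver Twist"),
    ("Charles Dickens", "wrote", "Bleak House"),
    ("Oliver Twist", "is a", "Book"),
    ("Bleak House", "is a", "Book"),
    ("Dostoevsky", "born in", "Russia"),
    ("Book", "has part", "Page"),
    ("Page", "made from", "Paper"),
]


def objects_of(subject, link, triples=TRIPLES):
    """Everything reached from `subject` by following `link`."""
    return [o for s, l, o in triples if s == subject and l == link]


def instances_of(concept, triples=TRIPLES):
    """Everything connected to `concept` by an 'is a' link."""
    return [s for s, l, o in triples if l == "is a" and o == concept]


dickens_books = objects_of("Charles Dickens", "wrote")
books = instances_of("Book")
```

Listing the triples back to the expert one at a time is a cheap way to validate a map link by link.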
43 State Diagram Mapping
- KA technique that involves the construction, modification and validation of a state diagram.
- A different approach to process mapping.
- Useful for capturing process knowledge, concept knowledge and tacit knowledge.
44 State Diagram - Example
[Figure: state diagram of a telephone. States: on hook - no ringing; on hook - ringing; off hook - dialing tone; off hook - dialing; off hook - ringing tone; off hook - conversation. Transition labels: your number is dialed; lift receiver; dial number; complete dialing; phone is answered at other end; person at other end rings off; hang up.]
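A state diagram like the telephone example can be captured as a transition table keyed by (state, event). The states and events below come from the example's labels, but which event leads where is a plausible reconstruction rather than a copy of the original figure; `TRANSITIONS` and `run` are names introduced for this sketch.

```python
# Assumed transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("on hook - no ringing", "your number is dialed"): "on hook - ringing",
    ("on hook - ringing", "lift receiver"): "off hook - conversation",
    ("on hook - no ringing", "lift receiver"): "off hook - dialing tone",
    ("off hook - dialing tone", "dial number"): "off hook - dialing",
    ("off hook - dialing", "complete dialing"): "off hook - ringing tone",
    ("off hook - ringing tone", "phone is answered at other end"):
        "off hook - conversation",
    ("off hook - conversation", "person at other end rings off"):
        "off hook - dialing tone",
    ("off hook - conversation", "hang up"): "on hook - no ringing",
    ("off hook - dialing tone", "hang up"): "on hook - no ringing",
}


def run(state, events):
    """Replay a sequence of events, rejecting any transition not in the table."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition for {event!r} from {state!r}")
        state = TRANSITIONS[key]
    return state


outgoing_call = run(
    "on hook - no ringing",
    ["lift receiver", "dial number", "complete dialing",
     "phone is answered at other end"],
)
```

Walking the expert through event sequences like this one, and noting which ones the table rejects, is one way to validate the diagram.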
45 How do you design a KA session plan? Your 10 step guide
46 Designing a KA plan
- We need different techniques because
  - there are different types of knowledge
  - acquiring a certain type of knowledge is made more efficient using the right technique
  - e.g. can't get tacit knowledge using interviews
- Three types of KA techniques
  - Natural (e.g. interviews, observation)
  - Contrived (e.g. commentary, rep grid, 20-questions)
  - Modelling (e.g. process mapping)
47 Designing a KA Session Plan
- Be clear what knowledge you want from the session.
- Write an introduction summarising what knowledge you want from the session.
- Select the best KA technique/s to use.
  - How do we do this? ...
48 Designing a KA Session Plan
- Place the techniques selected in a clear and logical order
  - e.g. interview questions first
  - e.g. commentary and protocols before process mapping
- Always end the session plan with the following question
  - "Bearing in mind the goals of this session, what vital knowledge have we not yet covered?"
- Assign timings to each section.
49 Designing a KA Session Plan
- If possible, check the session plan with your project manager or colleague and make amendments if necessary.
- Send (email, fax) the session plan to the expert at least one day before the session.
- Make any changes the expert suggests.
- During the session, stick to the plan and keep to the timings
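Several of the steps in this guide are mechanical enough to check automatically before a plan is sent out. The sketch below is one possible way to do that; the `Section` class, the `plan_problems` function and the example plan are all assumptions made for illustration.

```python
from dataclasses import dataclass

CLOSING_QUESTION = (
    "Bearing in mind the goals of this session, "
    "what vital knowledge have we not yet covered?"
)


@dataclass
class Section:
    technique: str  # technique used, or the question asked, in this section
    minutes: int    # timing assigned to the section


def plan_problems(introduction, sections, session_minutes):
    """Return a list of ways the plan departs from the guide."""
    problems = []
    if not introduction.strip():
        problems.append("no introduction")
    if any(section.minutes <= 0 for section in sections):
        problems.append("section without a timing")
    if sum(section.minutes for section in sections) > session_minutes:
        problems.append("timings exceed the session length")
    if not sections or sections[-1].technique != CLOSING_QUESTION:
        problems.append("plan does not end with the closing question")
    return problems


# Hypothetical 90-minute plan.
plan = [
    Section("semi-structured interview", 30),
    Section("commentary", 20),
    Section("process mapping", 30),
    Section(CLOSING_QUESTION, 10),
]
ok = plan_problems("Capture how samples are classified.", plan, 90)
bad = plan_problems("", plan[:-1], 60)
```

A check like this covers only the mechanical steps; choosing the techniques and their order still needs judgement.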
50 Which KA technique to use
- Decide what type/s of conceptualisation and knowledge you need from the expert
  - Is it structural, object-oriented knowledge? (i.e. of concepts, attributes, states, relationships)
  - Is it process knowledge? (i.e. how to do things)
  - Is it explicit knowledge? (i.e. easily explained)
  - Is it tacit knowledge? (i.e. not easily explained)
- Use the diagram shown next to select the best technique/s to use.
51
[Figure: KA techniques plotted against two axes, process knowledge vs concept knowledge and tacit knowledge vs explicit knowledge. Techniques shown: process mapping; observation; protocols and commentaries; state diagram mapping; teach back; constrained tasks; 20-questions; repertory grid; interviews; laddering; triadic method; concept mapping; card sorting.]
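The selection step can also be expressed as a small lookup. The table below is compiled from the descriptions of the individual techniques on the earlier slides, not from the positions in the diagram, so it is an approximation; the `SUITED_FOR` table and `techniques_for` function are names assumed for this sketch.

```python
# Knowledge types each technique is described as suited to on earlier slides.
SUITED_FOR = {
    "interviews": {"explicit"},
    "card sorting": {"concept", "tacit"},
    "triadic method": {"concept", "tacit"},
    "repertory grid": {"concept", "tacit"},
    "commentating": {"process", "tacit"},
    "laddering": {"concept", "process"},
    "process mapping": {"process", "tacit"},
    "concept mapping": {"concept"},
    "state diagram mapping": {"process", "concept", "tacit"},
}


def techniques_for(*needed):
    """Techniques whose description covers every requested knowledge type."""
    wanted = set(needed)
    return sorted(
        name for name, kinds in SUITED_FOR.items() if wanted <= kinds
    )


process_tacit = techniques_for("process", "tacit")
concept_tacit = techniques_for("concept", "tacit")
```

The result is a shortlist to choose from, not a ranking; the diagram conveys degrees of suitability that a yes/no table cannot.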
52 Building an ontology: a quick tutorial example
- Look at the following materials and consider how you might extract and build a conceptual model of part of this domain: igneous rocks
  - Structured interview
  - Self report
  - Repertory Grid
  - Laddered Interview
  - Item/Card Sort
  - Photos (thin section and hand specimen)
53 Ontology Capture
- Scope
- Brainstorm
- Group
- Main Phases Knowledge Acquisition
- Produce Definitions
- Do not commit to meta-ontology early
- Terms (proceed middle out)
- Consensus
- Handling ambiguity
- Guidelines
- Wording
- Review
- Meta-ontology
54 Ontology Documentation: Skuce's 4 Layer Model
- Level 0: Meta Assumptions
  - MA 1: The ontology is divided into units termed categories that are hierarchically organised
- Level 1: Category Assumptions (4-tuple)
  - Conceptual assumptions: Explain what are the assumptions or rationale underlying the category. Why have such a category?
  - Terminological assumptions: List the term or terms used. Explain why chosen. What terms in other languages are equivalent?
  - Definitional assumptions: Define as in a dictionary
  - Examples
- Level 2: NS Properties and Dimensions
- Level 3: Adding Non-logical Properties
  - Typical, optional
55 D3E Discussion Spaces
56 Conceptualisation Model Pitfalls
- Pitfall: Missing ontological elements
  - Missing classes
  - Missing attributes
  - Confusing 1:1 with 1:Many, or 1:Many with Many:Many
  - Important data is stored within text/comment fields
- Pitfall: Extra ontological elements
- Pitfall: Over-elaborating (when do I stop?)
  - Proteins -> amino acid residues -> side chains -> physical chemical properties ...
- Pitfall: Relevance (do I really need all this detail?)
  - Do we need to mention all the types of nucleic acid?
57 Integrating Existing Ontologies
- Reuse or adapt existing ontologies when possible
  - Save time
  - Correctness
  - Facilitate interoperation
- Integration of ontologies
  - Ontologies have to be aligned
  - Hindered by poor documentation and argumentation
  - Hindered by implicit assumptions
  - Shared generic upper level ontologies should make integration easier
58 Encoding: Implementation Toolkit
- Construct ontology using an ontology-development system
- Does the data model have the right expressivity?
  - Is it just a taxonomy or are relationships needed?
  - Is multiple parentage needed? Inverse relationships?
  - What types of constraints are needed?
  - Are reasoning services needed?
- What are the authoring features of the development tool?
  - Can the ontology be exported to a DBMS schema?
  - Can the ontology be exported to an ontology exchange language?
  - Is simultaneous updating by multiple authors needed?
  - Size limitations of the development tool?
59 Encoding: Ontology Implementation Pitfalls
- Pitfall: Semantic ambiguity
  - Multiple ways to encode the same information
  - Meaning of class definitions unclear
- Pitfall: Encoding bias
  - Encoding the ontology changes the ontology
- Pitfall: Redundancy (lack of normalization)
  - Exact same information repeated
  - Presence of computationally derivable information
    - Date of birth and age
    - DNA sequence and reverse complement
  - More effort required for entry and update
  - Partial updates lead to inconsistency
  - OK if redundant information is maintained automatically
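One way to avoid the redundancy pitfall is to store only the primary value and compute the derivable one on demand, so a partial update cannot leave the two out of step. The sketch below applies this to the two examples on the slide; the class names `DnaRecord` and `Person` and their fields are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass
class DnaRecord:
    sequence: str  # the only stored value

    @property
    def reverse_complement(self) -> str:
        """Derived on demand, so it always matches `sequence`."""
        return self.sequence.translate(_COMPLEMENT)[::-1]


@dataclass
class Person:
    date_of_birth: date  # the only stored value

    def age_on(self, today: date) -> int:
        """Age in whole years on `today`, derived from the date of birth."""
        birthday_passed = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if birthday_passed else 1)


record = DnaRecord("ATGC")
before = record.reverse_complement
record.sequence = "AAGC"           # update the stored value only
after = record.reverse_complement  # the derived value follows automatically
```

This is the "maintained automatically" case from the slide: the redundancy exists only at read time, never in storage.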
60 Encoding: The Interaction Problem
- Task influences what knowledge is represented and how it's represented
  - Molecular biology: chemical and physical properties of proteins
  - Bioinformatics: accession number, function, gene
- Underlying perspectives mean they may not be reconcilable
- If an ontology has too many conflicting tasks it can end up compromised (TAMBIS Ontology experience)
61 Evaluate it - A guide for reusability
- Conciseness
  - No redundancy
  - Appropriateness: protein molecules at the atomic resolution when amino acid level is enough
- Clarity
- Consistency
  - Satisfiability: it doesn't contradict itself
  - Enzyme is both a protein which catalyses a reaction and does not catalyse a reaction
- Extensibility
- Minimal Commitment
  - Do I have to buy into a load of stuff I don't really need or want just to get the bit I do?
- Minimal Encoding Bias
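The satisfiability check can be illustrated with a toy example based on the enzyme case. This is a deliberately naive sketch that only spots a property asserted both positively and negatively on one class; a real reasoning system does far more. The `contradictions` function and the `ENZYME` assertion list are assumptions for illustration.

```python
def contradictions(assertions):
    """Properties asserted both to hold and not to hold for one class."""
    positive = {prop for prop, holds in assertions if holds}
    negative = {prop for prop, holds in assertions if not holds}
    return sorted(positive & negative)


# Hypothetical encoding of the contradictory enzyme definition.
ENZYME = [
    ("is a protein", True),
    ("catalyses a reaction", True),
    ("catalyses a reaction", False),
]

clashes = contradictions(ENZYME)
```

An empty result means only that no direct clash was found, not that the definition is satisfiable in the full logical sense.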
62 Documentation: Make Ontology Understandable!
- Produce clear informal and formal documentation
- An ontology that cannot be understood will not be reused
- There exists a space of alternative ontology design decisions
  - Semantics / Granularity
  - Terminology
- Pitfall: Neglecting to record design rationale
63 Publish the Ontology
- Formal and informal specifications
- Intended domain of application
- Design rationale
- Limitations
64 Further Reading