
1

A Process Ontology for Cell Biology
  • Stuart Aitken
  • Artificial Intelligence Applications Institute

2
Outline
  • Rapid Knowledge Formation (RKF) Project
  • RKF Project goals and domain
  • The Cyc knowledge-based system
  • RKF Tools
  • Process Ontology
  • General approach
  • Formalisation
  • Example

3
Rapid Knowledge Formation
  • The RKF project aims to develop tools which will
    allow domain experts to enter knowledge directly
    into the KBS.
  • DARPA-funded, two teams
  • CYCORP
  • SRI
  • Organised around Challenge Problems: Cell
    Biology

4
RKF
  • Aim: To enable biologists to construct an
    ontology/KB from a textbook source

[Figure: Alberts et al., Essential Cell Biology, 1998 -> formalise -> Ontology]
5
Rapid Knowledge Formation
  • Key techniques
  • The KBS has knowledge of the KA process
  • Knowledge of salience
  • Knowledge of the requirements of an adequate
    formalisation
  • There is a dialogue between expert and system,
    which clarifies the concept being defined.

6
Rapid Knowledge Formation
  • Evaluation
  • After a period of tool development, trials are organised
  • Both expert performance and knowledge engineer (KE)
    performance are measured and assessed independently
  • The evaluation is extensive, running over a period of 2
    weeks

7
The Cyc KBS
  • Cyc (Doug Lenat) is a knowledge-based system,
    under development since 1984, aiming to
    represent common sense knowledge.
  • Cyc uses a large upper-level ontology
  • Uses a logical language based on first-order logic

8
The Cyc KBS
  • Concepts in the Upper Ontology
  • Thing, Agent, Event
  • TangibleThing, InformationBearingObject
  • ... Dog, Book
  • subclass (genls), instance-of (isa)
  • parts, subevent, role predicates
  • 1600 concepts in total in the public release
    (1998), a small part of Cyc
  • Classification
  • Stuff-like vs Object-like
  • Individual vs Set
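
The genls (subclass) and isa (instance-of) relations listed above can be illustrated with a small sketch. This is not Cyc code: the specific links (for example Dog under TangibleThing) and the individual Fido are assumptions made for illustration, and the function names are hypothetical.

```python
# Toy taxonomy. The links below are illustrative assumptions,
# not the actual content of the Cyc upper ontology.
GENLS = {  # class -> its direct superclasses
    "Agent": {"Thing"},
    "Event": {"Thing"},
    "TangibleThing": {"Thing"},
    "InformationBearingObject": {"Thing"},
    "Dog": {"TangibleThing"},
    "Book": {"InformationBearingObject"},
}
ISA = {"Fido": {"Dog"}}  # hypothetical individual -> its direct classes


def all_genls(cls):
    """Return cls together with every class reachable through genls links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(GENLS.get(current, ()))
    return seen


def isa(individual, cls):
    """True if the individual is an instance of cls, directly or via genls."""
    return any(cls in all_genls(direct) for direct in ISA.get(individual, ()))
```

Here isa("Fido", "Thing") holds because Dog reaches Thing through genls, while isa("Fido", "Event") does not.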

9
The Cyc KBS
  • The upper-ontology supports application
    development

[Figure: ontology layers beneath Thing: Upper-level, Intermediate-level, Application-level]
10
The Cyc KBS
  • Cyc includes
  • An inference engine,
  • GUI,
  • tools for ontology development.
  • Until the RKF project, ontology development was
    by trained knowledge engineers, working with
    domain experts.

11
RKF
  • New tools in Cyc
  • Define a new concept, and place it correctly in
    the ontology
  • Refine a concept definition
  • Define a new predicate
  • Assert a new fact
  • Define a new rule
  • State an analogy
  • Construct a new process

12
RKF
  • User interaction
  • Selection of items in the interface
  • Choices are determined intelligently: the KBS has
    knowledge of salience and of the KA process, and this
    knowledge must be authored
  • Browsing of the ontology
  • Search
  • Natural language dialogue

13
Process Models
[Figure: RNA Transcription process model with BindsTogether and Move]
14
Process Descriptor
  • Q: Name the process
  • A: RNA Transcription
  • Q: Select the type of Process that describes the
    category best
  • event localised
  • creation or destruction event
  • say this _ _ _ _ _ _
  • Q: Define
  • affected object _ _ _ _ _
  • location _ _ _ _ _
  • actor _ _ _ _ _

15
Process Models
  • Describing Processes
  • Complex expressions at the instance level
  • Simpler to describe in terms of types

Upper-level: subevent(Event,Event), doneBy(Event,Agent)
Intermediate-level: ?
Application-level:
ForAll ?E ?F ?G
  implies((subevent(?E,?G) and isa(?E,BindsTogether) and
           subevent(?F,?G) and isa(?F,Move)),
          before(startOf(?E),startOf(?F)))
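
The instance-level ordering rule on this slide (every BindsTogether subevent starts before every Move subevent of the same event) can be checked over concrete events as in the sketch below. The Event record, its fields, and the numeric start times are assumptions for illustration; Cyc itself reasons over logical assertions rather than Python objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    kind: str     # event type, e.g. "BindsTogether" or "Move"
    parent: str   # name of the event this one is a subevent of
    start: float  # start time in arbitrary units (assumed representation)


def ordering_violations(events):
    """Return (e, f) name pairs that break the rule: e is a BindsTogether,
    f is a Move with the same parent, and e does not start before f."""
    return [
        (e.name, f.name)
        for e in events if e.kind == "BindsTogether"
        for f in events if f.kind == "Move" and f.parent == e.parent
        and not e.start < f.start
    ]
```

An empty result means the rule holds for the given events. Writing the check already needs three quantified variables, which suggests why the slide calls instance-level expressions complex.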
16
Script Vocabulary
  • The Script theory defines the semantics of
    Type-Level assertions
  • (typePlaysRoleInScene RNATranscription DNAMolecule
    BindsTogether objectActedOn)
  • Requires rules for identity
  • Can require complex reasoning
  • Good for user input
  • Can be extended to cover pre- and postconditions
    of actions

17
Scripts
  • subevents

[Figure: RNA Transcription (t) with subevents BindsTogether (e) and Move (f), linked by startsAfterStartingOfInScript]
For all subevents f of t of type Move, and all subevents e of t of type
BindsTogether, (startsAfterStartingOf f e), where t is of type
RNATranscription
18
Scripts
  • Type playing role

[Figure: type BindsTogether with instance e; type Nucleotide with instances N; role objectActedOn]
For some n in N, (objectActedOn e n)
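
The reading "for some n in N, (objectActedOn e n)" can be sketched as an existential check over role facts. The data layout (triples of event, role, object, and a map from objects to their types) and the function name are hypothetical choices made for this example.

```python
def plays_role(event, role, type_name, role_facts, instance_types):
    """True if some instance of type_name fills `role` in `event`.

    role_facts: iterable of (event, role, object) triples.
    instance_types: dict mapping each object to the set of its types.
    """
    return any(
        type_name in instance_types.get(obj, ())
        for (ev, r, obj) in role_facts
        if ev == event and r == role
    )


# Hypothetical instance-level facts for one BindsTogether event e1.
role_facts = {("e1", "objectActedOn", "n1")}
instance_types = {"n1": {"Nucleotide"}}
```

With these facts, plays_role("e1", "objectActedOn", "Nucleotide", role_facts, instance_types) is true, and it becomes false if the type asked for is one that n1 does not have.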
19
New Script Vocabulary
  • Pre and Post conditions

(preconditionOfScene-negated BindsTogether
  touchingDirectly <Ribonucleotide Nucleotide>)
(postconditionOfScene BindsTogether
  connectedTo <Ribonucleotide Nucleotide>)
[Figure: BindsTogether takes R and N from not touchingDirectly to connectedTo]
20
New Script Vocabulary
[Figure: type BindsTogether with instance e; types Ribonucleotide and Nucleotide with sets of instances R and N; labelled links: role, role, identity]
Precondition: Some ?n in N, some ?r in R (not (touchingDirectly ?n ?r))
Postcondition: Some ?n in N, some ?r in R (connectedTo ?n ?r)
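
The two existential conditions for BindsTogether can be checked with the sketch below. It assumes the relations are given as sets of (n, r) pairs holding before and after the event, and all names are hypothetical. Like the formulas on the slide, it quantifies the precondition and postcondition separately and does not model the identity link between them.

```python
from itertools import product


def some_pair(relation, left, right):
    """True if relation(x, y) holds for some x in left and some y in right."""
    return any(relation(x, y) for x, y in product(left, right))


def check_binds_together(N, R, touching_before, connected_after):
    """Check the negated precondition and the postcondition for one event.

    touching_before: (n, r) pairs touching directly before the event.
    connected_after: (n, r) pairs connected after the event.
    """
    pre = some_pair(lambda n, r: (n, r) not in touching_before, N, R)
    post = some_pair(lambda n, r: (n, r) in connected_after, N, R)
    return pre and post
```

Enforcing that the same pair satisfies both conditions would require the identity rules mentioned earlier in the talk.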
21
Script Vocabulary
  • The Script vocabulary forms an intermediate
    level, which
  • lies behind the Process descriptor GUI (i.e. the
    textboxes)
  • Not, in itself, a taxonomy of processes, but
    allows processes to be described in detail.
  • Defining the subclass relation is just one task.

22
Vaccinia Virus Life Cycle
  • The vaccinia virus life cycle was selected as an
    example of a complex model to formalise as a set
    of Scripts.
  • The model includes actions, decomposition,
    ordering, objects-playing-roles and
    pre/postconditions
  • It is a good test for the Script vocabulary

23
Vaccinia Virus Life Cycle
[Figure: three views of the same script]
Temporal: mRNATranscription-Early, ViralGeneTranslation-Early, MovementOfProtein
Participants: mRNATranscription-Early (Outputs: messengerRNA), ViralGeneTranslation-Early (Inputs: messengerRNA), MovementOfProtein
Conditions: mRNATranscription-Early; Pre: spatiallySubsumes Cell VirusCore; ViralGeneTranslation-Early; Post: spatiallySubsumes CellCytoplasm Vitf2; MovementOfProtein
24
Evaluation
  • 8 biologists were selected, and trained in the
    tools, 4 per team
  • The knowledge to be formalised was selected
    (chapter 7 in Alberts)
  • The knowledge base was allowed to contain
    pump-priming knowledge
  • The biologists entered knowledge using the
    tools, then tested it against a set of questions
  • The ontology/KB was then revised

25
Evaluation
  • Results (outline)
  • A huge amount of data was collected, but analysis
    is complex (IET Inc)
  • Domain experts were able to develop ontologies
    after light training
  • Knowledge engineers out-perform domain experts in
    ontology construction

26
Summary
  • Power Tools for ontology development are being
    implemented and tested in the RKF project.
  • A Script/Process vocabulary has been developed
    and applied to processes in cell biology,
    covering
  • Temporal order
  • Participants
  • Pre/postconditions
  • Repetition