Title: Knowtator
Knowtator
- A knowledge-based text annotation tool
Philip Ogren (Philip.Ogren_at_uchsc.edu)
Larry Hunter, PhD (Larry.Hunter_at_uchsc.edu)
Center for Computational Pharmacology
University of Colorado Health Sciences Center
Aurora, CO
- Availability
- bionlp.sourceforge.net/Knowtator
- Source code will be available under MPL soon.
- Comments and suggestions welcome!
This work was supported by NIH grant R01-LM008111
- Knowtator is
  - A general-purpose text annotation tool
  - A Protégé plugin
Knowtator screenshot
Synopsis
- Knowtator facilitates the manual creation of
  training and evaluation corpora for a variety of
  biomedical language processing tasks.
- Knowtator's key strength is the ability to define
  an annotation schema using a Protégé knowledge
  base.
Features
- Stand-off annotation
  - Original text is not modified
- Inter-annotator agreement metrics
- Simple API allows annotation of any arbitrary
  text source.
- Annotation filters
  - All annotations are assigned an annotator and
    (optionally) one or more annotation sets.
  - Annotations of many types, from multiple
    annotators and annotation sets, can clutter the
    user interface.
  - Filters allow viewing only selected annotations.
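
The stand-off and filtering ideas above can be illustrated with a small sketch. This is not Knowtator's API: the `Annotation` class, the `span_of` and `filter_annotations` helpers, the sample sentence, and the annotator and set names are all hypothetical, chosen only to show that annotations live apart from the text and can be narrowed by annotator or annotation set.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Annotation:
    """A stand-off annotation: character offsets into a text, never the text itself."""
    start: int
    end: int
    concept: str
    annotator: str
    annotation_sets: FrozenSet[str] = frozenset()


def span_of(text: str, phrase: str, concept: str, annotator: str,
            annotation_sets: FrozenSet[str] = frozenset()) -> Annotation:
    """Build an annotation for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), concept, annotator, annotation_sets)


def filter_annotations(annotations: List[Annotation],
                       annotator: Optional[str] = None,
                       annotation_set: Optional[str] = None) -> List[Annotation]:
    """Keep annotations matching every criterion that is not None."""
    kept = []
    for ann in annotations:
        if annotator is not None and ann.annotator != annotator:
            continue
        if annotation_set is not None and annotation_set not in ann.annotation_sets:
            continue
        kept.append(ann)
    return kept


# Hypothetical sample text and annotations (not taken from the slides).
TEXT = "The receptor moves from the endosome to the cell surface."
ANNOTATIONS = [
    span_of(TEXT, "receptor", "molecule", "annotator A", frozenset({"pilot"})),
    span_of(TEXT, "endosome", "cellular component", "annotator A"),
    span_of(TEXT, "cell surface", "cellular component", "annotator B", frozenset({"pilot"})),
]

# Only annotator A's annotations; TEXT itself is untouched.
for ann in filter_annotations(ANNOTATIONS, annotator="annotator A"):
    print(ann.concept, "->", TEXT[ann.start:ann.end])
```

Because each annotation stores only offsets, the same text can carry any number of overlapping annotations from different annotators, and a filter is just a selection over that list.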
Knowtator annotation schemas are defined by a
Protégé knowledge base
Biological and linguistic concepts can be modeled
in Protégé.
Entities in an annotation schema are defined by
Protégé class definitions. Protégé slots and
constraints on those slots can be used to relate
annotations in meaningful ways.
Class definition for endocytosis
Example endocytosis annotation
- Annotations of endocytosis relate to annotations
  of cellular component and molecule via the slot
  definitions of the endocytosis class definition.
- Six slots of endocytosis
  - location: filled by cellular component
    annotations
    - origin: subslot of location
    - destination: subslot of location
  - transport participants: filled by molecule
    annotations
    - transported entities: subslot of transport
      participants
    - transporters: subslot of transport participants
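
As a rough illustration of how slot definitions can constrain annotations, the six slots can be written down as data and checked against proposed fillers. This is a minimal sketch, not Protégé's or Knowtator's API: `SlotDef` and `validate_fillers` are hypothetical names, and the sketch assumes that each subslot accepts the same filler class as its parent slot, which the slides imply but do not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SlotDef:
    """A slot name, the class of annotation allowed to fill it, and its parent slot."""
    name: str
    filler_class: str
    parent: Optional[str] = None


# The six slots of endocytosis. Assumption: subslots share the parent's filler class.
ENDOCYTOSIS_SLOTS = [
    SlotDef("location", "cellular component"),
    SlotDef("origin", "cellular component", parent="location"),
    SlotDef("destination", "cellular component", parent="location"),
    SlotDef("transport participants", "molecule"),
    SlotDef("transported entities", "molecule", parent="transport participants"),
    SlotDef("transporters", "molecule", parent="transport participants"),
]


def validate_fillers(slots: List[SlotDef], fillers: Dict[str, List[str]]) -> List[str]:
    """Check proposed fillers (slot name -> classes of the filling annotations).

    Returns a list of error messages; an empty list means every filler is allowed.
    """
    by_name = {slot.name: slot for slot in slots}
    errors = []
    for slot_name, classes in fillers.items():
        slot = by_name.get(slot_name)
        if slot is None:
            errors.append(f"unknown slot: {slot_name}")
            continue
        for cls in classes:
            if cls != slot.filler_class:
                errors.append(f"{slot_name} expects {slot.filler_class}, got {cls}")
    return errors


# A molecule annotation placed in 'origin' violates the schema.
print(validate_fillers(ENDOCYTOSIS_SLOTS, {"origin": ["molecule"]}))
```

A real Protégé knowledge base would also let a filler be any subclass of the allowed class; the exact-match comparison here is a simplification.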
Example endocytosis annotation
Knowtator data model
The goal of Knowtator is to create mappings
between concepts represented in a knowledge base
and texts that talk about those concepts.
The Knowtator data model has three parts
1. Ontology/knowledge base of concepts and
   relationships (Protégé frames)
2. Mentions of concepts and assertions about
   relationships between concepts found in text
3. A mapping between the target text and members of
   1 and 2 (annotations)
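
One way to picture the three parts is as three small record types. The sketch below is an assumption-laden illustration, not Knowtator's actual classes: `Concept`, `Mention`, `Annotation`, and `annotate` are hypothetical names, and the sample phrase is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Concept:
    """Part 1: a concept from the ontology/knowledge base."""
    name: str


@dataclass
class Mention:
    """Part 2: a mention of a concept, with slot values asserting relationships."""
    concept: Concept
    slot_values: Dict[str, List["Mention"]] = field(default_factory=dict)


@dataclass
class Annotation:
    """Part 3: a mapping from a span of the target text to a mention."""
    start: int
    end: int
    mention: Mention


def annotate(text: str, phrase: str, mention: Mention) -> Annotation:
    """Map the first occurrence of `phrase` in `text` to `mention`."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), mention)


# Part 1: concepts (hypothetical stand-ins for Protégé classes).
ENDOCYTOSIS = Concept("endocytosis")
MOLECULE = Concept("molecule")

# Part 2: mentions, related through the 'transported entities' slot.
receptor_mention = Mention(MOLECULE)
endocytosis_mention = Mention(
    ENDOCYTOSIS, {"transported entities": [receptor_mention]}
)

# Part 3: annotations tying spans of an invented phrase to those mentions.
TEXT = "endocytosis of the thromboxane A2 receptor"
annotations = [
    annotate(TEXT, "endocytosis", endocytosis_mention),
    annotate(TEXT, "thromboxane A2 receptor", receptor_mention),
]

for ann in annotations:
    print(ann.mention.concept.name, "->", TEXT[ann.start:ann.end])
```

Keeping mentions separate from annotations means a relationship between two concepts is recorded once, on the mention, regardless of which text spans point to it.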
III. Annotations
II. Mentions/Assertions
I. Ontology/KB
Endocytosis of molecule with
thromboxane A2 receptor from endosome to
cell surface
To do
- report on annotation efforts
- mechanism for semi-automated annotation
- import/export scripts for other annotation
formats (e.g. ATLAS)