Title: Université Paris IV/Sorbonne - LaLICC
1 Université Paris IV/Sorbonne - LaLICC
- Semantic/Conceptual Annotation Making Use of the
NKRL Technology
- Gian Piero ZARRI
- LaLICC
- Université Paris IV/Sorbonne
- Maison de la Recherche
- 28, rue Serpente 75006 Paris, France
- gpzarri_at_paris4.sorbonne.fr, zarri_at_noos.fr
2 O U T L I N E
- A rough classification of annotation techniques
- Semantic annotations: a promising way
- Standard semantic annotations: some problems
- A quick reminder of NKRL
- The two ontologies (concepts and events)
- The inference rules
- Conclusion: NKRL as an annotation tool
3 Classification of annotation
techniques (1)
- Free-form annotations
- a way of associating generic remarks (usually in
natural language) with an existing document,
e.g., "The text in this document does not make
much sense." Realised (Annotea) by means of a
simple RDF/XML schema, and anchored to specific
locations in the document.
4 Classification of annotation
techniques (2)
- Utilisation: reading supports for remembering,
emphasising, commenting etc., or for searching
for specific text fragments. Typically, they are
i) not formalised, and impossible to exploit
beyond simple keyword searches; ii) often
ephemeral, and then unavailable for search
operations involving the use of permanent
knowledge repositories; iii) not very
expressive from the point of view of the search for
explicit and implicit information.
- An ideal vector for the implementation of
annotation-for-collaboration procedures.
5 Classification of annotation
techniques (3)
- Linguistically-motivated annotations
- applied (mainly) to natural language (NL)
documents. They use Computational Linguistics
techniques to recognise the morphological,
syntactic (through Part-of-Speech, PoS, tagging)
and semantic categories (e.g., named entities)
of specific terms of the document.
- Given their dependence on an automatic
process of linguistic analysis,
linguistically-motivated annotations are not very
suitable for collaborative annotation
procedures.
6 Classification of annotation
techniques (4)
- Two basic forms:
- Simply associating (annotating, tagging)
terms from the original documents with their
corresponding linguistic categories. This form of
linguistic annotation can be considered as an
extension of the free-form annotation.
- Making use of the linguistic properties
(mainly semantic) to fill pre-determined
templates that represent the general topic
(terrorism, commercial activities, airline
crashes etc.) of a given document (see the Message
Understanding Conferences, MUC).
7 Classification of annotation
techniques (5)
- A recent IST project, Parmenides
- three layers: i) structural annotations, used
to define the physical structure of the
documents; ii) lexical annotations, used to
mark interesting text units like Named
Entities, Temporal Expressions, Events,
Descriptive Phrases, etc.; iii) semantic
annotations, used to express the relationships
that exist among lexical entities (e.g.,
lexically identified people can be associated
with their organization and their job title).
8 Classification of annotation
techniques (6)
- Semantic/conceptual annotations are permanent,
not anchored to specific locations but associated
with the whole document. More importantly, they
are supposed to represent the semantic content
of the documents using standard ontologies and
W3C languages like RDF(S) and OWL.
- A tool commonly used in this context is Protégé,
now endowed with an OWL plugin that allows
loading and saving OWL and RDF ontologies,
editing and visualizing OWL classes and their
properties, and supporting reasoners such as
description logic classifiers.
9 Classification of annotation
techniques (7)
- A lot of activity in this domain, concerning,
e.g., the conceptual annotation of collections of
still images.
- Still images projects: the co-depiction
experiment. Two people are co-depicted if there
exists some digital image that depicts them both.
- If we knew who was depicted in an image, we could
explore a Web of relationships between people
that were co-depicted, then constructing chains
of images that can lead from ordinary citizens to
John Fitzgerald Kennedy or Frank Sinatra. Makes
use of FOAF (Friend of a Friend), a vocabulary
that provides a way for RDF documents to talk
about people and their characteristics.
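- The co-depiction definition and the idea of a chain of images can be illustrated with a minimal Python sketch. The image identifiers and most of the person names are made-up sample data, and the breadth-first search is only one possible way of building such chains; it is not taken from the co-depiction experiment itself.

```python
from collections import deque

# Hypothetical sample data: image identifier -> people depicted in it.
IMAGES = {
    "img1": {"alice", "bob"},
    "img2": {"bob", "carol"},
    "img3": {"carol", "frank_sinatra"},
    "img4": {"dave"},
}

def codepicted(a, b, images):
    """Two people are co-depicted if some image depicts them both."""
    return any(a in people and b in people for people in images.values())

def image_chain(start, goal, images):
    """Breadth-first search for a shortest chain of images linking two people."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        person, chain = queue.popleft()
        if person == goal:
            return chain
        for image_id, people in sorted(images.items()):
            if person in people:
                for other in sorted(people - seen):
                    seen.add(other)
                    queue.append((other, chain + [image_id]))
    return None  # no chain of co-depictions exists

chain = image_chain("alice", "frank_sinatra", IMAGES)
```

Here `chain` lists the images to traverse, one co-depiction per step; `None` means the two people are not connected.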
10 Classification of annotation
techniques (8)
- Still images projects: the widely publicized
project for NASA image management, based on
the use of an annotation environment (PhotoStuff)
that enables users to annotate information about
NASA images and/or their regions using, as
metadata, concepts in OWL and/or RDF(S)
ontologies.
11 Problems of standard semantic
annotations (1)
- Standard ontologies (W3C languages) may not be
sufficient, however, to fully render the semantic
content of all the information that can be of
interest in an annotation framework.
- Again in a still images context: difficulties
in using simple binary languages like OWL and
RDF(S) to represent correctly an n-ary
situation like the central episode in the
"Surrender of Breda" masterpiece by Velázquez. We
need there i) an ontology in the W3C style to
describe correctly the two characters and the
"key of the city" element, but we must also
introduce ii) a ternary predicate like GIVE or
RECEIVE to characterize correctly the situation,
and iii) specify the roles of the two
characters and the key (SUBJECT, OBJECT and
BENEFICIARY in a GIVE perspective) with respect
to the predicate.
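- The gap between an n-ary situation and a binary language can be made concrete with a small Python sketch. It is only an illustration: the `Event` class, the `reify` helper and the identifiers `justin_of_nassau`, `ambrogio_spinola`, `key_of_breda` and `breda_1` are assumptions introduced here, not part of OWL, RDF(S) or NKRL. The point is that the single GIVE predication must be broken into several binary statements hanging off an artificial event node.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """An n-ary predication: one predicate plus role-labelled arguments."""
    predicate: str
    roles: tuple  # pairs (role, filler)

def reify(event_id, event):
    """Decompose an n-ary event into binary (subject, property, object) triples."""
    triples = [(event_id, "predicate", event.predicate)]
    triples += [(event_id, role, filler) for role, filler in event.roles]
    return triples

# The central episode of the painting, seen from a GIVE perspective.
SURRENDER = Event(
    "GIVE",
    (
        ("SUBJECT", "justin_of_nassau"),      # the character handing over the key
        ("OBJECT", "key_of_breda"),           # the "key of the city" element
        ("BENEFICIARY", "ambrogio_spinola"),  # the character receiving it
    ),
)

TRIPLES = reify("breda_1", SURRENDER)
```

One ternary predication becomes four binary triples, and the unity of the episode survives only through the shared identifier `breda_1`.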
12 Problems of standard semantic annotations (2)
13 Problems of standard semantic
annotations (3)
- In general, it is difficult to conceptually
annotate narrative documents (texts, images)
making use only of the W3C languages.
- Narrative documents are really pervasive: they
concern, e.g., the corporate knowledge domain
(memos, policy statements, reports, minutes
etc.), the news, normative and legal texts,
medical records, many intelligence messages,
as well as a huge fraction of the documents
stored on the Web. Exploiting this narrative
information is mandatory, e.g., for all the
different monitoring applications, from
technological monitoring to strategic
monitoring.
14 Quick reminder of NKRL,
knowledge representation (1)
- NKRL (Narrative Knowledge Representation
Language)
- A conceptual language designed for
representing, in a standardised way (metadata),
the semantic content (the meaning) of (complex)
narrative events.
- The term "narrative event" is very general, and
also covers related notions like fact, action,
state, situation etc. In a narrative event, the
information to be represented concerns the real
or intended behaviour of some actors (or
personages, characters etc.). These try to
attain a specific result, experience particular
situations, manipulate some (concrete or
abstract) materials, send or receive messages,
buy, sell, deliver etc.
15 Quick reminder of NKRL, knowledge
representation (2)
- The main novelty of NKRL with respect to the
usual knowledge representation languages
consists in the presence of two ontologies:
- a (quite standard) ontology of concepts (like in Protégé, OWL etc.)
- a (new) ontology of events.
16 Quick reminder of NKRL, knowledge
representation (3)
- In NKRL, a concept is, substantially,
a frame-like data structure associated with
a symbolic label like human_being, location_,
city_, etc. Concepts are inserted into a
generalisation/specialisation hierarchy that,
for historical reasons, is called H_Class(es),
and which corresponds well to the usual
ontologies of terms. The instances of the NKRL
concepts (lucy_, taxi_53, paris_) are called
individuals.
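- A generalisation/specialisation hierarchy with individuals attached to concepts can be sketched in a few lines of Python. This is a toy model under stated assumptions: the link from city_ to location_ and the concept taxi_ are guesses made for illustration, and the real H_Class hierarchy is far richer than a single-parent dictionary.

```python
# Assumed fragment of the hierarchy: concept -> more general concept.
PARENT = {"city_": "location_"}

# Individuals (instances) and the concept they belong to; taxi_ is hypothetical.
INDIVIDUALS = {"lucy_": "human_being", "paris_": "city_", "taxi_53": "taxi_"}

def ancestors(concept):
    """All concepts that generalise the given concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def is_instance_of(individual, concept):
    """An individual belongs to its own concept and to all its generalisations."""
    direct = INDIVIDUALS[individual]
    return concept == direct or concept in ancestors(direct)
```

For example, `is_instance_of("paris_", "location_")` holds because city_ specialises location_, while `is_instance_of("lucy_", "location_")` does not.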
17 Quick reminder of NKRL, knowledge
representation (4)
- Ontology of events: a hierarchy of complex
threefold structures (templates) having the
following format:
- (Li (Pj (R1 a1) (R2 a2) … (Rn an)))
- The instances of templates are called
predicative occurrences.
18 Quick reminder of NKRL, knowledge
representation (5)
- name: MoveTransferOfServiceToSomeone
- father: MoveTransferToSomeone
- position: 4.23
- NL description: "Supply a Service to Someone"
-
- MOVE SUBJ var1 (var2)
- OBJ var3
- SOURCE var4 (var5)
- BENF var6 (var7)
- MODAL var8
- TOPIC var9
- CONTEXT var10
- modulators , ?abs
-
- var1 <human_being_or_social_body>
- var3 <service_>
- var4 <human_being_or_social_body>
- var6 <human_being_or_social_body>
- var8 <process_> <sector_specific_activity>
- var9 <sortal_concept>
- var10 <situation_> <symbolic_label>
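- How the constraints on the variables restrict the possible fillers of the roles can be sketched as follows. This is a simplified Python illustration, not the NKRL software: only the SUBJ, OBJ and BENF constraints are modelled, and the hierarchy fragment (company_, internet_service) and the helper names are assumptions.

```python
# Assumed fragment of the concept hierarchy (child -> parent); not from the slides.
PARENT = {
    "company_": "human_being_or_social_body",
    "internet_service": "service_",
}
# Assumed mapping from individuals to their concepts.
CONCEPT_OF = {
    "british_telecom": "company_",
    "payg_internet_service_1": "internet_service",
}

# Role -> admissible concepts, following var1, var3 and var6 of the template.
TEMPLATE = {
    "predicate": "MOVE",
    "constraints": {
        "SUBJ": {"human_being_or_social_body"},
        "OBJ": {"service_"},
        "BENF": {"human_being_or_social_body"},
    },
}

def subsumed_by(term, concept):
    """True if the term is the concept, a specialisation of it, or an instance."""
    current = CONCEPT_OF.get(term, term)
    while current is not None:
        if current == concept:
            return True
        current = PARENT.get(current)
    return False

def violations(occurrence, template):
    """List the roles whose filler breaks the template constraints."""
    if occurrence["predicate"] != template["predicate"]:
        return ["predicate"]
    bad = []
    for role, allowed in template["constraints"].items():
        filler = occurrence["roles"].get(role)
        if filler is not None and not any(subsumed_by(filler, c) for c in allowed):
            bad.append(role)
    return bad

GOOD = {"predicate": "MOVE",
        "roles": {"SUBJ": "british_telecom", "OBJ": "payg_internet_service_1"}}
BAD = {"predicate": "MOVE",
       "roles": {"SUBJ": "british_telecom", "OBJ": "british_telecom"}}
```

`violations(GOOD, TEMPLATE)` is empty, whereas `violations(BAD, TEMPLATE)` reports the OBJ role, since a company is not a service.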
19 Quick reminder of NKRL, knowledge
representation (6)
- "We notice today, 10 June 1998, that British
Telecom intends to offer its customers a pay-as-you-go (payg)
Internet service."
- c4) (GOAL c5 c6)
- (the aim of event c5 is to realise event c6;
NKRL representation of the connectivity
phenomena)
- c5) BEHAVE SUBJ british_telecom
- obs
- date1 10-june-1998
- date2
- (we note, at a given moment, that British Telecom
wants to do something)
- c6) MOVE SUBJ british_telecom
- OBJ payg_internet_service_1
- BENF (SPECIF customer_ british_telecom)
- date1 after-10-june-1998
- date2
- (an instance, i.e., a predicative occurrence, of the
previous "move service" template)
21 Quick reminder of NKRL, knowledge
representation (8)
- The expressiveness of this threefold format
is enhanced by the use of two additional tools:
- the AECS sub-language, which allows the
construction of complex (structured) predicate arguments.
Ex.: (SPECIF customer_ british_telecom), i.e.,
"the customers of British Telecom"
- the second-order tools (binding structures
and completive construction), used to code the
connectivity phenomena between single narrative
fragments. Ex.: c4) (GOAL c5 c6), i.e., "the aim
of what is described in c5 is to obtain the result
c6"
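- Structured arguments and binding structures can be pictured as nested data. The Python sketch below is one possible encoding, chosen for illustration only: dicts for predicative occurrences, nested tuples for AECS expressions, and the helper names `terms` and `expand` are all assumptions rather than NKRL conventions.

```python
# Predicative occurrences of the British Telecom example, as plain dicts.
# An AECS expression is a nested tuple: (operator, arg1, arg2, ...).
OCCURRENCES = {
    "c5": {"predicate": "BEHAVE", "SUBJ": "british_telecom",
           "modulators": ["obs"], "date1": "10-june-1998"},
    "c6": {"predicate": "MOVE", "SUBJ": "british_telecom",
           "OBJ": "payg_internet_service_1",
           "BENF": ("SPECIF", "customer_", "british_telecom"),
           "date1": "after-10-june-1998"},
}

# Binding structure: a second-order link between two occurrences.
BINDINGS = {"c4": ("GOAL", "c5", "c6")}

def terms(arg):
    """Collect the atomic terms of a (possibly nested) AECS expression."""
    if isinstance(arg, tuple):
        found = []
        for sub in arg[1:]:  # skip the operator in first position
            found.extend(terms(sub))
        return found
    return [arg]

def expand(label):
    """Replace the occurrence labels of a binding structure by the occurrences."""
    operator, *members = BINDINGS[label]
    return operator, [OCCURRENCES[m] for m in members]

OPERATOR, EVENTS = expand("c4")
```

`expand("c4")` yields the GOAL operator together with the BEHAVE and MOVE occurrences it links, which is the "aim of c5 is c6" reading in data form.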
22 Quick reminder of NKRL, inference
rules (1)
- Hypothesis rules automatically link some
information found by querying an NKRL knowledge
base to other information present in this base.
If this is possible, this latter information
represents a sort of causal explanation of the
information originally retrieved.
- E.g., having found in the base a piece of information
like "Pharmacopeia has received some money from
Schering", automatically link this event to
information in the style of "Pharmacopeia and
Schering have concluded an agreement for the
production by Pharmacopeia of a given compound"
and "We observe that Pharmacopeia has really
produced the compound."
23 Quick reminder of NKRL, inference
rules (2)
- Transformation rules try to automatically
replace (transform) some retrieval queries that
failed with one or more different queries that
are not strictly equivalent but only
semantically close to the original one.
- "Search for the existence of links between Osama
bin Laden and Abubakar Abdurajak Janjalani" is transformed into
"Search for the attestation of a specific
transfer of economic/financial items between
the two",
- then retrieving: "Abubakar Abdurajak Janjalani
has received an undetermined amount of money from
bin Laden through an intermediate agent."
24 Quick reminder of NKRL, inference
rules (3)
- Representation of the NKRL inference rules
- HYPOTHESIS h1
-
- premise:
-
- RECEIVE SUBJ var1
- OBJ money_
- SOURCE var2
-
- var1 company_
- var2 human_being, company_
-
- "A company has received money from another company
or a physical person."
-
- first condition schema (cond1):
-
- PRODUCE SUBJ (COORD var1 var2)
- OBJ var3
- BENF (COORD var1 var2)
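- The way the premise binds var1 and var2, and the way these bindings are then carried into the first condition schema, can be sketched in Python. This is a deliberately reduced illustration: the type constraints on var1 and var2 are not checked, the dict/tuple encoding and the function names are assumptions, and the identifiers pharmacopeia_ and schering_ are sample individuals.

```python
PREMISE = {"predicate": "RECEIVE", "SUBJ": "var1", "OBJ": "money_", "SOURCE": "var2"}
COND1 = {"predicate": "PRODUCE",
         "SUBJ": ("COORD", "var1", "var2"),
         "OBJ": "var3",
         "BENF": ("COORD", "var1", "var2")}

def is_var(term):
    return isinstance(term, str) and term.startswith("var")

def match(schema, event):
    """Bind the schema variables to the event fillers; None if a constant differs."""
    bindings = {}
    for slot, expected in schema.items():
        actual = event.get(slot)
        if is_var(expected):
            if actual is None:
                return None
            bindings[expected] = actual
        elif expected != actual:
            return None
    return bindings

def instantiate(schema, bindings):
    """Turn a condition schema into a search pattern using the current bindings."""
    def substitute(term):
        if isinstance(term, tuple):
            return tuple(substitute(t) for t in term)
        return bindings.get(term, term)  # unbound variables are left as they are
    return {slot: substitute(value) for slot, value in schema.items()}

# Sample event: a company has received money from another company.
EVENT = {"predicate": "RECEIVE", "SUBJ": "pharmacopeia_",
         "OBJ": "money_", "SOURCE": "schering_"}

BINDINGS = match(PREMISE, EVENT)
SEARCH_PATTERN = instantiate(COND1, BINDINGS)
```

The resulting search pattern asks whether the two companies have jointly produced something (var3 is still free), which is the first reasoning step of the hypothesis.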
25 Quick reminder of NKRL, inference
rules (4)
- Economic/financial transfer transformation
-
- t1) BEHAVE SUBJ (COORD1 var1 var2)
- OBJ (COORD1 var1 var2)
- MODAL var3
-
- =>
- RECEIVE SUBJ var2
- OBJ var4
- SOURCE var1
-
- var1 human_being_or_social_body
- var2 human_being_or_social_body
- var3 business_agreement, mutual_relationship
- var4 economic/financial_entity
-
- To verify the existence of a relationship or of a
business agreement between two persons, verify if
one of these persons has received a financial
entity (e.g., money) from the other.
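- The rewriting performed by this transformation can be sketched as a unification of the query with the antecedent, followed by the construction of the consequent. The Python below is a minimal illustration under assumptions: the encoding, the function names and the individual identifiers are invented here, var3 is checked against a flat list instead of a concept hierarchy, and the unbound var4 is simply replaced by its constraint.

```python
ANTECEDENT = {"predicate": "BEHAVE",
              "SUBJ": ("COORD1", "var1", "var2"),
              "OBJ": ("COORD1", "var1", "var2"),
              "MODAL": "var3"}
CONSEQUENT = {"predicate": "RECEIVE", "SUBJ": "var2", "OBJ": "var4", "SOURCE": "var1"}

ALLOWED_VAR3 = {"business_agreement", "mutual_relationship"}
UNBOUND_DEFAULTS = {"var4": "economic/financial_entity"}

def unify(schema, value, bindings):
    """Unify a schema element with a query element, extending the bindings."""
    if isinstance(schema, str) and schema.startswith("var"):
        # setdefault binds a new variable or returns the existing binding
        return bindings.setdefault(schema, value) == value
    if isinstance(schema, tuple):
        return (isinstance(value, tuple) and len(schema) == len(value)
                and all(unify(s, v, bindings) for s, v in zip(schema, value)))
    return schema == value

def transform(query):
    """Return the transformed search pattern, or None if the rule does not apply."""
    if set(query) != set(ANTECEDENT):
        return None
    bindings = {}
    for slot, schema in ANTECEDENT.items():
        if not unify(schema, query[slot], bindings):
            return None
    if bindings["var3"] not in ALLOWED_VAR3:
        return None
    bindings = {**UNBOUND_DEFAULTS, **bindings}
    return {slot: bindings.get(term, term) for slot, term in CONSEQUENT.items()}

PAIR = ("COORD1", "osama_bin_laden", "abubakar_abdurajak_janjalani")
QUERY = {"predicate": "BEHAVE", "SUBJ": PAIR, "OBJ": PAIR,
         "MODAL": "mutual_relationship"}
NEW_QUERY = transform(QUERY)
```

`NEW_QUERY` asks whether the second person has received an economic/financial entity from the first, i.e., a query that is semantically close to the original one rather than equivalent to it.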
26 Quick reminder of NKRL, inference
rules (5)
- FUM (Filtering/Unification Module)
- Allows unifying an NKRL search pattern (the NKRL
equivalent of a natural language query) with a
knowledge base of NKRL occurrences. This module
includes a first level of inferencing:
unification is executed taking into account the
fact that a generic concept in the search
pattern can unify with one of its specific concepts
(or an instance) in the occurrence.
- During the execution of the inference rules,
all the reasoning steps are automatically
transformed into search patterns.
27 Quick reminder of NKRL, inference rules (6)
- mod3.c5) PRODUCE SUBJ (SPECIF INDIVIDUAL_PERSON_20
weapon_wearing (SPECIF cardinality_ several_)) (VILLAGE_1)
- OBJ kidnapping_
- BENF ROBUSTINIANO_HABLO
- CONTEXT mod3.c6
- date-1 20/11/1999
- date-2
-
- "On November 20, 1999, in an unspecified village,
an armed group of people kidnapped Robustiniano Hablo."
-
- Search pattern:
- PRODUCE SUBJ human_being
- OBJ violence_
- BENF human_being
-
- date1 1/1/1999
- date2 31/12/1999
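- The kind of unification performed between a pattern with generic concepts and an occurrence with specific concepts or individuals can be sketched as follows. This is a toy Python model and not the FUM algorithm: the hierarchy links (kidnapping_ under violence_, individual_person under human_being), the treatment of a SPECIF expression through its first argument, and all function names are assumptions.

```python
from datetime import date

# Assumed hierarchy fragment (child -> parent) and individual -> concept links.
PARENT = {"kidnapping_": "violence_", "individual_person": "human_being"}
CONCEPT_OF = {"INDIVIDUAL_PERSON_20": "individual_person",
              "ROBUSTINIANO_HABLO": "individual_person"}

OCCURRENCE = {
    "predicate": "PRODUCE",
    "SUBJ": ("SPECIF", "INDIVIDUAL_PERSON_20", "weapon_wearing",
             ("SPECIF", "cardinality_", "several_")),
    "OBJ": "kidnapping_",
    "BENF": "ROBUSTINIANO_HABLO",
    "date": date(1999, 11, 20),
}
PATTERN = {
    "predicate": "PRODUCE",
    "roles": {"SUBJ": "human_being", "OBJ": "violence_", "BENF": "human_being"},
    "date1": date(1999, 1, 1),
    "date2": date(1999, 12, 31),
}

def head(filler):
    """Main term of a filler: the first argument of a structured expression."""
    while isinstance(filler, tuple):
        filler = filler[1]
    return filler

def subsumes(concept, term):
    """A generic concept unifies with itself, its specialisations and their instances."""
    current = CONCEPT_OF.get(term, term)
    while current is not None:
        if current == concept:
            return True
        current = PARENT.get(current)
    return False

def unifies(pattern, occurrence):
    if pattern["predicate"] != occurrence["predicate"]:
        return False
    if not pattern["date1"] <= occurrence["date"] <= pattern["date2"]:
        return False
    return all(role in occurrence and subsumes(concept, head(occurrence[role]))
               for role, concept in pattern["roles"].items())
```

With these assumed links, `unifies(PATTERN, OCCURRENCE)` succeeds: the predicate is the same, the date falls within 1999, and each generic concept of the pattern subsumes the corresponding filler of the occurrence.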
28 Quick reminder of NKRL, inference
rules (7)
- A hypothesis corresponds to a fixed scenario
formed by a given number of reasoning steps:
i) try to prove the existence of an agreement
about a given work; ii) try to see if this work
has been really accomplished.
- These steps correspond to queries (search
patterns) on the knowledge base of (NKRL-coded)
events.
- Integration of transformations and hypotheses:
use of the transformation rules to randomly
transform the original steps (original search
patterns) into semantically equivalent ones.
29 Quick reminder of NKRL, inference rules (8)
- Integration aims to introduce a certain degree of
fuzziness in the execution of hypotheses, and
to increase the probability of discovering implicit
information.
- Main principle: to be executed, the reasoning
steps of a hypothesis must be reduced to search
patterns; any NKRL search pattern can be
automatically converted into a new search pattern
by means of transformation rules. For this, it is
sufficient that the original pattern unifies with the
antecedent part of one of the transformation
rules.
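- The principle can be sketched with a very small Python model in which a reasoning step is first tried directly and, if it fails, is rewritten by a transformation into one or more new search patterns. Everything here is hypothetical: the facts, the predicates PART_OF, CONTROLS and MEMBER_OF (which are not NKRL predicates), the "?g" variable syntax and the deterministic choice of the rule are simplifications made only to show the control flow.

```python
# Hypothetical knowledge base of ground facts.
KB = {
    ("CONTROLS", "movement_1", "subgroup_7"),
    ("MEMBER_OF", "kidnappers_1", "subgroup_7"),
}

# Reasoning step already reduced to a search pattern.
STEP = ("PART_OF", "kidnappers_1", "movement_1")

# Hypothetical transformation: one pattern becomes two patterns sharing ?g.
TRANSFORMATIONS = {
    "PART_OF": lambda who, org: [("CONTROLS", org, "?g"), ("MEMBER_OF", who, "?g")],
}

def match(pattern, fact, env):
    """Unify one pattern with one fact; return the extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, f in zip(pattern, fact):
        p = env.get(p, p)  # use the binding of an already bound variable
        if p.startswith("?"):
            env[p] = f
        elif p != f:
            return None
    return env

def solve(patterns, kb, env=None):
    """Find bindings satisfying all the patterns, by simple backtracking."""
    env = env or {}
    if not patterns:
        return env
    for fact in sorted(kb):
        new_env = match(patterns[0], fact, env)
        if new_env is not None:
            result = solve(patterns[1:], kb, new_env)
            if result is not None:
                return result
    return None

def prove(step, kb, transformations):
    """Try the step directly, then through a transformation if one applies."""
    direct = solve([step], kb)
    if direct is not None:
        return "direct", direct
    rule = transformations.get(step[0])
    if rule is not None:
        env = solve(rule(*step[1:]), kb)
        if env is not None:
            return "transformed", env
    return None

RESULT = prove(STEP, KB, TRANSFORMATIONS)
```

In this sample base the direct query fails, but the transformed one succeeds through the shared sub-group, so implicit information is recovered that the original step alone would have missed.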
30 Quick reminder of NKRL, inference
rules (9)
- Inference steps in a kidnapping context:
- (Cond1) The kidnappers are part of a separatist
movement or of a terrorist organization.
- (Cond2) This separatist movement or terrorist
organization currently practices ransom
kidnapping of specific categories of people.
- (Cond3) In particular, executives are concerned
(other rules will deal with civil servants,
servicemen, members of the clergy etc.).
- (Cond4) It can be proved that the kidnapped person is
really a businessperson.
31 Quick reminder of NKRL, inference
rules (10)
- Hypothesis rule in the presence of
transformations concerning the intermediary
inference steps:
- (Cond1) The kidnappers are part of a separatist
movement or of a terrorist organization.
- (Rule T3, Consequent1) Try to verify whether a
given separatist movement or terrorist
organization is in control of a specific
sub-group and, in this case,
- (Rule T3, Consequent2) check if the kidnappers
are members of this sub-group. We will then
assimilate the kidnappers to members of the
movement or organization.
- (Cond2) This movement or organization practices
ransom kidnapping of given categories of
people.
- (Rule T2, Consequent) The family of the
kidnapped person has received a ransom request from
the separatist movement or terrorist
organization.
- (Rule T4, Consequent1) The family of the
kidnapped person has received a ransom request from a
group or an individual person, and
- (Rule T4, Consequent2) this second group or
individual person is part of the separatist
movement or terrorist organization.
- (Rule T5, Consequent1) Try to verify if a
particular sub-group of the separatist
movement or terrorist organization exists, and
32 C O N C L U S I O N
- Advantages of NKRL as a conceptual annotation
tool:
- In-depth conceptual representation of the
annotations.
- Powerful inference rules allowing a real
reasoning about the annotated material.