Title: Department of Computer Science
1Department of Computer Science and Engineering
University of California, San Diego, CSE-291
Ontologies in Data Integration, Spring 2003
- Bertram Ludäscher
- LUDAESCH_at_SDSC.EDU
2Outline
- Wrapping up last week
- What is a representation?
- Thesauri, Topic Maps
- Predicate Logic Primer
- Description logics
- RDF, RDF Schema
- F-logic
- Topic Selection
- Special thanks
- Alexander Maedche, Steffen Staab
- ECAI2002 Tutorial on Ontologies
3Ontologies For What?
- Lack of a shared understanding leads to poor
communication
- People, organizations and software systems
must communicate between and among themselves
- Disparate modeling paradigms, languages and
software tools limit
- Interoperability
- Knowledge sharing and reuse
Uschold, Gruninger, 96
4Origin and History (I)
- Ontology ....
- a philosophical discipline, branch of philosophy
that
- deals with the nature and the organisation of
reality
- Science of Being (Aristotle, Metaphysics, IV, 1)
- Tries to answer the questions
- What is being?
- What are the features common to all beings?
5Origin and History (II)
Humans require words (or at least symbols) to
communicate efficiently. The mapping of words to
things is only indirectly possible. We do it by
creating concepts that refer to things.
The relation between symbols and things has been
described in the form of the meaning triangle.
6Origin and History (III)
In recent years ontologies have become a hot
topic of interest. Here, an ontology refers to
an engineering artifact. It is constituted by a
specific vocabulary used to describe a certain
reality, plus a set of explicit assumptions
regarding the intended meaning of the vocabulary.
Thus, ontologies describe a formal partial
specification of a specific domain:
- Shared understanding of a domain of interest
- Formal and machine-executable model of a domain
of interest
7Human and machine communication (I)
Maedche et al., 2002
[Figure: Human agents (HA1, HA2) exchange symbols, e.g. via natural language;
machine agents (MA1, MA2) exchange symbols, e.g. via protocols. The agents
commit to an ontology for a specific domain, e.g. animals. The meaning triangle
links the symbol "JAGUAR", the concept, and the things. Other labels: Ontology
Description, Formal Semantics, Formal models, Internal models.]
8Ontology and Natural Language
- It is important to emphasize that there is an m:n
relationship between words and concepts
- This means practically
- different words may refer to the same concept
- a word may refer to several concepts
- Ontology languages should provide means for
making this difference explicit.
9Example
Ontology: $C = \{c1, c2, c3\}$, $R = \{r1\}$, $HC(c2, c1)$,
$r1(c2, c3)$, ...
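The example structure can be written down as a small data structure. This is only a sketch under assumptions: `HC(c2, c1)` is read as "c2 is a subconcept of c1", `r1(c2, c3)` as "relation r1 links concept c2 to concept c3", and the class and method names (`Ontology`, `superconcepts`) are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    concepts: set[str]
    relations: set[str]
    # HC(sub, super) pairs; the reading "sub is a subconcept of super" is an assumption.
    hierarchy: set[tuple[str, str]] = field(default_factory=set)
    # (relation, domain concept, range concept) triples, e.g. r1(c2, c3).
    signatures: set[tuple[str, str, str]] = field(default_factory=set)

    def superconcepts(self, concept: str) -> set[str]:
        """Return all concepts reachable from `concept` by following HC upwards."""
        found: set[str] = set()
        frontier = [concept]
        while frontier:
            current = frontier.pop()
            for sub, sup in self.hierarchy:
                if sub == current and sup not in found:
                    found.add(sup)
                    frontier.append(sup)
        return found


# The ontology from the slide: C = {c1, c2, c3}, R = {r1}, HC(c2, c1), r1(c2, c3).
onto = Ontology(
    concepts={"c1", "c2", "c3"},
    relations={"r1"},
    hierarchy={("c2", "c1")},
    signatures={("r1", "c2", "c3")},
)
```

Under these assumptions, `onto.superconcepts("c2")` returns the set containing only `c1`.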
10Ontology vs. Knowledge Bases
- There is no clear separation between ontology and
knowledge base
- Example
- Often it remains a modeling decision whether something
is modeled as a concept or as an instance. In many
applications meta-modeling means are required.
11Types of Ontologies (I)
Guarino, 98
- Top-level ontologies describe very general concepts like space, time,
event, which are independent of a particular
problem or domain. It seems reasonable to have
unified top-level ontologies for large
communities of users.
- Domain ontologies describe the vocabulary related to a generic
domain by specializing the concepts introduced in
the top-level ontology.
- Task ontologies describe the vocabulary related to a generic task
or activity by specializing the top-level
ontologies.
- Application ontologies are the most specific ontologies. Concepts
in application ontologies often correspond to
roles played by domain entities while performing
a certain activity.
12Ontologies and their Relatives (I)
- There are many relatives around
- Controlled vocabularies, thesauri and
classification systems available in the WWW, see
http://www.lub.lu.se/metadata/subject-help.html
- Classification Systems (e.g. UNSPSC, Library
Science, etc.)
- Thesauri (e.g. Art & Architecture, Agrovoc,
etc.)
- Lexical Semantic Nets
- WordNet, see http://www.cogsci.princeton.edu/wn/
- EuroWordNet, see http://www.hum.uva.nl/ewn/
- Topic Maps, http://www.topicmaps.org (e.g. used
within knowledge management applications)
- In general it is difficult to find the
borderline!
13Ontologies and their Relatives (II)
[Figure: spectrum of ontology-like artifacts, from least to most expressive:
Catalog / ID; Terms / Glossary; Thesauri; Informal Is-a; Formal Is-a;
Formal Instance; Frames; Value Restrictions; Axioms (Disjoint, Inverse
Relations, ...); General logical constraints]
14Some Ontologies (and Friends) in Action
- (coming soon to a project near you)
15GEON Architecture
16SMART (Meta)data I: Logical Data Views
Adoption of a standard (meta)data model: wrap
data sets into unified virtual views
Source: NADAM Team (Boyan Brodaric et al.)
17SMART Metadata II: Multihierarchical Rock
Classification for Thematic Queries (GSC)
or: Taxonomies are not only for biologists ...
[Figure labels: Genesis, Fabric, Composition, Texture]
18SMART Metadata III: Source Contextualization and
Ontology Refinement
Biomedical Informatics Research Network, http://nbirn.net
Focused GEON ontology working meeting
last week ... (GEON, SCEC/KR, GSC, ESRI)
19EcoCyc
20Gene Ontology, http://www.geneontology.org
- a dynamic controlled vocabulary that can be
applied to all eukaryotes
- Built by the community for the community
- Three organising principles: Molecular function,
Biological process, Cellular component
- Is-a and Part-of taxonomy, but not good!
- 10,000 concepts
- Lightweight ontology, poor semantic rigour
- OK when small and used for annotation
- Obstacle when large, evolving and used for mining
21Controlled vocabulary
- AGROVOC: Agricultural Vocabulary
22Thesauri
- AAT: Art & Architecture Thesaurus
23Ontologies - Some Examples
- General purpose ontologies
- WordNet / EuroWordNet, http://www.cogsci.princeton.edu/wn
- The Upper Cyc Ontology, http://www.cyc.com/cyc-2-1/index.html
- IEEE Standard Upper Ontology, http://suo.ieee.org/
- Domain and application-specific ontologies
- RDF Site Summary RSS, http://groups.yahoo.com/group/rss-dev/files/schema.rdf
- UMLS, http://www.nlm.nih.gov/research/umls/
- KA2 / Science Ontology, http://ontobroker.semanticweb.org/ontos/ka2.html
- RETSINA Calendaring Agent, http://ilrt.org/discovery/2001/06/schemas/ical-full/hybrid.rdf
- AIFB Web Page Ontology, http://ontobroker.semanticweb.org/ontos/aifb.html
- Web-KB Ontology, http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
- Dublin Core, http://dublincore.org/
- Meta-Ontologies
- Semantic Translation, http://www.ecimf.org/contrib/onto/ST/index.html
- RDFT, http://www.cs.vu.nl/borys/RDFT/0.27/RDFT.rdfs
- Evolution Ontology, http://kaon.semanticweb.org/examples/Evolution.rdfs
24Ontology Representation
- What is a representation?
25Ontology Representation Languages
- Machines need communication with formal content
to restrict meaning
- What makes a language formal?
- model theory (1st order predicate logic)
- proof theory (Gentzen calculus)
- But also
- conventions (e.g. Java)
26What makes a language suitable?
- For machine communication
- ? model theory ?
- ? proof theory
- ? tractability
- ? strong conventions of use
- ? human readable names ?
- For human communication
- ? strong conventions of use ?
- ? human readable names ?
- ? natural primitives ?
27Representation Paradigms (incomplete)
TopicMaps
Thesauri
Taxonomies
Ontologies
Semantic Nets
Extended ER model
Predicate Logics /Description Logics
28Thesaurus
29Thesauri
Example (labeled graph):
- Fruit similarTo Vegetable
- Fruit NarrowerTerm Orange
- Orange synonymWith Apfelsine (German)
- Graph with labeled edges (similar, nt, bt,
synonym)
- Fixed set of edge labels (aka relations)
- No instances
- Well known in library science
- cf. terminologies / classifications (Dewey)
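A thesaurus of this kind can be sketched as a set of labeled edges over terms, with the label set fixed in advance. The sketch below is illustrative only: the class name `Thesaurus`, the spelled-out label names, and the edge directions (Fruit has the narrower term Orange) are assumptions, not part of any thesaurus standard.

```python
# Fixed set of edge labels, following the slide (similar, nt, bt, synonym).
EDGE_LABELS = {"similarTo", "narrowerTerm", "broaderTerm", "synonymWith"}


class Thesaurus:
    """A labeled graph over terms; there are no instances, only terms."""

    def __init__(self):
        self.edges = set()

    def add(self, source, label, target):
        if label not in EDGE_LABELS:
            raise ValueError(f"unknown edge label: {label}")
        self.edges.add((source, label, target))
        # narrowerTerm and broaderTerm are inverses of each other;
        # similarTo and synonymWith are treated as symmetric.
        if label == "narrowerTerm":
            self.edges.add((target, "broaderTerm", source))
        elif label == "broaderTerm":
            self.edges.add((target, "narrowerTerm", source))
        else:
            self.edges.add((target, label, source))

    def related(self, term, label):
        return {t for s, l, t in self.edges if s == term and l == label}


thesaurus = Thesaurus()
thesaurus.add("Fruit", "similarTo", "Vegetable")
thesaurus.add("Fruit", "narrowerTerm", "Orange")
thesaurus.add("Orange", "synonymWith", "Apfelsine")
```

Because the label set is closed, adding an edge with any other label raises an error, which is the "few primitives" restriction in code form.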
31Topic Maps are ...
- Standardized: ISO/IEC 13250:2000
- ISO standard published Jan. 2000
- enabling standard to describe knowledge
structures, electronic indices, classification
schemes, ...
- Web enabled
- XML Topic Maps (XTM) are ready to use
- Designed to
- manage the info glut
- build valuable information networks above any
kind of resources / data objects
- enable the structuring of unstructured information
32Back-of-the-Book Index British Virgin Islands
Gorda Sound, see North Sound
Little Dix Bay .................... 89
North Sound ....................... 90
Road Harbour, see also Road Town .. 73
Road Town ......................... 69, 71
Spanish Town ...................... 81, 82
Tortola ........................... 67
Virgin Gorda ...................... 77
33Back-of-the-Book Index British Virgin Islands
Topics
34Back-of-the-Book Index British Virgin Islands
Occurrences
35Back-of-the-Book Index British Virgin Islands
Different topic classes
36Back-of-the-Book Index British Virgin Islands
Different occurrences classes
37Back-of-the-Book Index British Virgin Islands
Multiple topic names
38Back-of-the-Book Index British Virgin Islands
Association
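The index annotations (topics, occurrences, topic classes, multiple names, associations) can be mirrored in a small data model. This is a sketch, not the XTM data model: the topic identifiers, the topic class names, and the helpers `find_topic` and `see_also` are hypothetical, and page numbers stand in for occurrences.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: list[str]            # a topic may have multiple names
    topic_class: str            # hypothetical topic classes, for illustration only
    occurrences: list[int] = field(default_factory=list)  # page numbers


# A few entries of the back-of-the-book index, modeled as topics.
topics = {
    "north-sound": Topic(["North Sound", "Gorda Sound"], "sound", [90]),
    "road-harbour": Topic(["Road Harbour"], "harbour", [73]),
    "road-town": Topic(["Road Town"], "town", [69, 71]),
    "tortola": Topic(["Tortola"], "island", [67]),
}

# "see also" in the index becomes an association between two topics.
associations = [("see-also", "road-harbour", "road-town")]


def find_topic(name):
    """Resolve any of a topic's names ("see" references) to its topic id."""
    for topic_id, topic in topics.items():
        if name in topic.names:
            return topic_id
    return None


def see_also(topic_id):
    """Follow "see-also" associations starting at the given topic."""
    return [right for kind, left, right in associations
            if kind == "see-also" and left == topic_id]
```

Here the "see" reference (Gorda Sound, see North Sound) is treated as a second name of the same topic, while "see also" links two distinct topics.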
39Topics: Computerized Subjects
[Figure labels: Topic classes; Topics; Subjects; Resources (SurfBVI, BVI Welcome, CaribNet)]
40Occurrences
[Figure labels: Topics; Occurrences; Occurrence classes; Resources]
41Occurrences
[Figure labels: Topics; Occurrences; Occurrence classes; Resources (SurfBVI, BVI Welcome, CaribNet)]
42Associations
[Figure labels: Association classes; Associations; Topics]
44Class Hierarchies
[Figure labels: Topic classes; Topics]
45Class Hierarchies
[Figure labels: Super-classes; Sub-classes; Topics]
46Scopes
47Scopes
[Figure label: Scopes]
48Scopes
[Figure labels: Scopes; Names: English, Deutsch; SurfBVI, BVI Welcome, CaribNet]
49Scopes
[Figure labels: Scopes; Names: English, Deutsch; Occurrences: Public, Confidential; SurfBVI, BVI Welcome, CaribNet]
50Scopes
[Figure labels: Scopes; Associations: Geography, Politics; Names: English, Deutsch; Occurrences: Public, Confidential; SurfBVI, BVI Welcome, CaribNet]
51Scope Examples: English, Public, Politics
[Figure labels: Scopes; Associations: Geography, Politics; Names: English, Deutsch; Occurrences: Public, Confidential; SurfBVI, BVI Welcome, CaribNet. Other labels: Geo Containment, Political Dependency, Article, Map, Image]
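Scoping can be read as a filter: every name, occurrence, or association carries a scope, and only items whose scope is currently active are shown. The sketch below assumes this reading. The scope assigned to each resource is invented for illustration (the slides do not say which resource is public or confidential), and `Scoped` and `in_scope` are hypothetical names.

```python
from typing import NamedTuple


class Scoped(NamedTuple):
    value: str
    scope: str


# Names scoped by language.
names = [
    Scoped("British Virgin Islands", "English"),
    Scoped("Britische Jungferninseln", "Deutsch"),
]

# Occurrences scoped by visibility; these assignments are made up for the example.
occurrences = [
    Scoped("SurfBVI", "Public"),
    Scoped("BVI Welcome", "Public"),
    Scoped("CaribNet", "Confidential"),
]


def in_scope(items, active_scopes):
    """Keep only the items whose scope is among the active scopes."""
    return [item.value for item in items if item.scope in active_scopes]


# The scope selection from the slide title: English, Public, Politics.
active = {"English", "Public", "Politics"}
visible_names = in_scope(names, active)
visible_occurrences = in_scope(occurrences, active)
```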
52In-/Semi-formal approaches: Topic Maps, Thesauri
- Advantages
- Capture a lot of modeling experience
- Intuitive
- Interesting primitives that are not available in
other approaches (TM)
- Disadvantages
- No characterization independent of a particular
implementation
- May be misinterpreted (TM) / few primitives
(Thesauri)
53Common errors about ontology representation
languages
- AI people's errors
- it is good if it is formal
- it is good if someone with a logic background
may easily use it
- it is good if the language allows everything
- Engineers' errors
- it works in my application, thus it is good
- who needs formality anyway?
- it did not work when I looked at it 10 years
ago
54Review/Introduction: (Classical) First-order
Predicate Logic (short: FO or PL1)
55But first: Propositional Logic Syntax
propositional logic (or "propositional calculus"):
A system of symbolic logic using symbols
to stand for whole propositions and logical
connectives. Propositional logic only considers
whether a proposition is true or false. In
contrast to predicate logic, it does not consider
the internal structure of propositions.
http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?propositionallogic
- propositions (no internal structure) can be
assigned a truth-value
- either true or false (classical 2-valued logic:
tertium non datur)
- Logical symbols
- conjunction $\land$, disjunction $\lor$, negation $\lnot$,
- implication $\rightarrow$, equivalence $\leftrightarrow$, parentheses ( )
- Non-logical symbols
- propositional variables p, q, r, ...
- signature: set of propositional variables $\Sigma = \{p, q, r, \ldots\}$
- Formation rules for well-formed formulas (wff)
- an atomic formula (propositional variable) is a
formula
- if F, G are formulas, so are
- $F \land G$, $F \lor G$, $\lnot F$, $F \rightarrow G$, $F \leftrightarrow G$, $(F)$
56Propositional Logic Semantics
- An interpretation I over a signature $\Sigma$ is a
mapping
- $I: \Sigma \rightarrow \{true, false\}$, associating a truth
value to every propositional variable
- Truth tables describe how to extend I from $\Sigma$ to
composite formulas (Boolean Algebra)
- $F \land G$, $F \lor G$, $\lnot F$, $F \rightarrow G$, $F \leftrightarrow G$
57Boolean Algebra, Truth Tables
http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?two-valuedlogic
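Extending an interpretation from propositional variables to composite formulas can be sketched as a recursive evaluator. The encoding is an assumption made for this example: a variable is a string, and composite formulas are nested tuples such as `("and", F, G)` or `("not", F)`; the function names are likewise illustrative.

```python
from itertools import product


def evaluate(formula, interpretation):
    """Truth value of a formula under an interpretation (dict: variable -> bool)."""
    if isinstance(formula, str):          # atomic formula: look up the variable
        return interpretation[formula]
    op, *args = formula
    values = [evaluate(arg, interpretation) for arg in args]
    if op == "not":
        return not values[0]
    if op == "and":
        return values[0] and values[1]
    if op == "or":
        return values[0] or values[1]
    if op == "implies":
        return (not values[0]) or values[1]
    if op == "iff":
        return values[0] == values[1]
    raise ValueError(f"unknown connective: {op}")


def truth_table(formula, variables):
    """One row per interpretation of the given variables."""
    rows = []
    for values in product([True, False], repeat=len(variables)):
        interpretation = dict(zip(variables, values))
        rows.append((values, evaluate(formula, interpretation)))
    return rows


# Truth table of the implication p -> q.
table = truth_table(("implies", "p", "q"), ["p", "q"])
```

The implication is false only in the row where p is true and q is false, as in the classical truth table.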
58Syntax of First-Order Logic (FO)
- Logical symbols
- $\land$, $\lor$, $\lnot$, $\rightarrow$, $\leftrightarrow$, ( ), $\forall$ (for all), $\exists$
(exists), ...
- Non-logical symbols: A FO signature $\Sigma$ consists
of
- constant symbols a, b, c, ...
- function symbols f, g, ...
- predicate (relation) symbols p, q, r, ...
- function and predicate symbols have an associated
arity
- we can write, e.g., p/3, f/2 to denote the
ternary predicate p and the function f with two
arguments
- First-order variables x, y, ...
- Formation rules for terms
- constants and variables are terms
- if t_1,...,t_k are terms and f is a k-ary function
symbol, then f(t_1,...,t_k) is a term
59Syntax of First-Order Logic (FO)
- Formation rules for formulas
- if t_1,...,t_k are terms and p/k is a predicate
symbol (of arity k) then p(t_1,...,t_k) is an
atomic formula (short: atom)
- all variable occurrences in p(t_1,...,t_k) are
free
- if F, G are formulas and x is a variable, then
the following are formulas
- $F \land G$, $F \lor G$, $\lnot F$, $F \rightarrow G$, $F \leftrightarrow G$, $(F)$,
- $\forall x\, F$ (for all x, F(x,...) is true)
- $\exists x\, F$ (there exists x such that F(x,...) is
true)
- the occurrences of a variable x within the scope
of a quantifier are called bound occurrences.
60Examples
- $\forall x$ malePerson(x) $\rightarrow$ person(x).
- malePerson(bill).
- child(marriage(bill,hillary),chelsea).
- Variable: x
- Constants (0-ary function symbols): bill/0,
hillary/0, chelsea/0
- Function symbols: marriage/2
- Predicate symbols: malePerson/1, person/1, child/2
61Semantics of Predicate Logic
- Let D be a non-empty domain (a.k.a. domain of
discourse, universe). A structure is a pair $\mathcal{I} = (D, I)$,
with an interpretation I that maps ...
- each constant c to an element $I(c) \in D$
- each predicate symbol p/k to a k-ary relation
$I(p) \subseteq D^k$,
- each function symbol f/k to a k-ary function
$I(f): D^k \rightarrow D$
- Given a structure $\mathcal{I}$, and a set of variables X, a
valuation is a mapping $val: X \rightarrow D$, used to
evaluate terms and formulas over a given FO
signature $\Sigma$
- with this, term evaluation val(t) yields a domain
element, and formula evaluation val(F) yields a
truth value
62Example
- Formula F = $\forall x$ malePerson(x) $\rightarrow$ person(x).
- Domain $D = \{b, h, c, d, e\}$
- Let's pick an interpretation I
- I(bill) = b, I(hillary) = h, I(chelsea) = c
- I(person) = $\{b, h, c\}$
- I(malePerson) = $\{b\}$
- Under this I, the formula F evaluates to true.
- If we choose I' like I but with I'(malePerson) =
$\{b, d\}$, then F evaluates to false, because d is not in I'(person)
- Thus, I is a model of F, while I' is not
- $I \models F$, $I' \not\models F$
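Over a finite domain, the example can be checked mechanically by testing the implication for every domain element. The sketch below hard-codes the formula F and represents each interpretation as a dictionary from predicate names to sets; the names `I_prime` and `is_model` are chosen for this example.

```python
# Domain and the two interpretations from the example.
D = {"b", "h", "c", "d", "e"}
I = {"person": {"b", "h", "c"}, "malePerson": {"b"}}
I_prime = {**I, "malePerson": {"b", "d"}}   # like I, but d is also a malePerson


def is_model(interpretation, domain):
    """Check F = (forall x: malePerson(x) -> person(x)) over a finite domain."""
    return all(
        (x not in interpretation["malePerson"]) or (x in interpretation["person"])
        for x in domain
    )


i_is_model = is_model(I, D)                  # every malePerson is a person
i_prime_is_model = is_model(I_prime, D)      # d is a malePerson but not a person
```

This only works because D is finite; for arbitrary structures the universal quantifier cannot be checked by enumeration.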
63FO Semantics (contd)
- F entails G (G is a logical consequence of F) if
every model of F is also a model of G: $F \models G$
- F is consistent or satisfiable if it has at least
one model
- F is valid or a tautology if every interpretation
of F is a model
- Proof Theory
- Let F, G, ... be FO sentences (no free variables).
- Then the following are equivalent
- $F_1, \ldots, F_k \models G$
- $F_1 \land \ldots \land F_k \rightarrow G$ is valid
- $F_1 \land \ldots \land F_k \land \lnot G$ is unsatisfiable
(inconsistent)
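The three equivalent conditions can be checked by brute force in the propositional case, where all interpretations can be enumerated. This is an analogue for illustration only, since first-order entailment cannot be decided by enumeration. Formulas are encoded here as Python functions from an interpretation (a dict) to a truth value, which is an assumption of this sketch.

```python
from itertools import product


def interpretations(variables):
    """Enumerate all truth assignments to the given variables."""
    for values in product([True, False], repeat=len(variables)):
        yield dict(zip(variables, values))


def entails(premises, conclusion, variables):
    """Every model of all premises is also a model of the conclusion."""
    return all(conclusion(i) for i in interpretations(variables)
               if all(p(i) for p in premises))


def valid(formula, variables):
    return all(formula(i) for i in interpretations(variables))


def satisfiable(formula, variables):
    return any(formula(i) for i in interpretations(variables))


# Modus ponens as an example: F1 = p, F2 = p -> q, G = q.
variables = ["p", "q"]
f1 = lambda i: i["p"]
f2 = lambda i: (not i["p"]) or i["q"]
g = lambda i: i["q"]

by_entailment = entails([f1, f2], g, variables)
by_validity = valid(lambda i: (not (f1(i) and f2(i))) or g(i), variables)
by_refutation = not satisfiable(lambda i: f1(i) and f2(i) and not g(i), variables)
```

All three checks agree, and dropping the premise p makes the entailment fail.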
64Proof Theory
- A calculus is a formal proof system to establish
- $F_1, \ldots, F_k \models G$
- via formal (syntactic) derivations
- $F_1, \ldots, F_k \vdash \ldots \vdash G$, where $\vdash$
denotes allowed proof steps
- Examples
- Hilbert Calculus, Gentzen Calculus, Tableaux
Calculus, Natural Deduction, Resolution, ...
- First-order logic is semi-decidable
- the set of valid sentences is recursively
enumerable, but not recursive (decidable)
- Some inference engines
- http://www.semanticweb.org/inference.html
65Description Logics: Decidable Fragments of FO
- (aka terminological logics, member of concept
languages)
66Formalism for Ontologies: Description Logic
- DL definition of Happy Father
(Example from Ian Horrocks, U Manchester, UK)
67Description Logic Statements as Rules
- Another syntax: first-order logic in rule form
(implicit quantifiers)
- happyFather(X) $\leftarrow$
man(X), child(X,C1), child(X,C2), blue(C1),
green(C2),
not ( child(X,C3), poorunhappyChild(C3) ).
- poorunhappyChild(C) $\leftarrow$
not rich(C), not happy(C).
- Note
- the direction $\rightarrow$ is implicit here (sigh)
- see, e.g., Clark's completion in Logic
Programming
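The two rules can be evaluated directly over a small set of facts, reading `not` as negation as failure under a closed-world assumption. The individuals and facts below are invented for illustration; only the rule structure follows the slide.

```python
# Hypothetical facts; anything not listed is assumed false (closed world).
facts = {
    "man": {"john", "paul"},
    "child": {("john", "ann"), ("john", "bob"), ("paul", "cid")},
    "blue": {"ann", "cid"},
    "green": {"bob", "cid"},
    "rich": {"ann"},
    "happy": {"bob"},
}


def poor_unhappy_child(c):
    # poorunhappyChild(C) <- not rich(C), not happy(C).
    return c not in facts["rich"] and c not in facts["happy"]


def happy_father(x):
    # happyFather(X) <- man(X), child(X,C1), child(X,C2), blue(C1), green(C2),
    #                   not (child(X,C3), poorunhappyChild(C3)).
    children = {c for (parent, c) in facts["child"] if parent == x}
    return (
        x in facts["man"]
        and any(c in facts["blue"] for c in children)    # some C1 is blue
        and any(c in facts["green"] for c in children)   # some C2 is green
        and not any(poor_unhappy_child(c) for c in children)
    )


happy_fathers = {x for x in facts["man"] if happy_father(x)}
```

With these facts, john qualifies, while paul does not because his only child is neither rich nor happy. Note that C1 and C2 may be bound to the same child, as in the rule.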
68Description Logics
- Terminological Knowledge (TBox)
- Concept Definition (naming of concepts)
- Axiom (constraining of concepts)
- a mediator's "glue knowledge" source
- Assertional Knowledge (ABox)
- the marked neuron in image 27
- the concrete instances/individuals of the
concepts/classes that your sources export
69Querying vs. Reasoning
- Querying
- given a DB instance I (= logic interpretation),
evaluate a query expression (e.g. SQL, FO
formula, Prolog program, ...)
- boolean query: check if $I \models \varphi$ (i.e.,
if I is a model of $\varphi$)
- (ternary) query: $\{(X, Y, Z) \mid I \models \varphi(X,Y,Z)\}$
- check happyFathers in a given database
- Reasoning
- check if $I \models \varphi$ implies $I \models \psi$ for all
databases I,
- i.e., if $\varphi \models \psi$
- undecidable for FO, F-logic, etc.
- Description Logics are decidable fragments
- concept subsumption, concept hierarchy,
classification
- semantic tableaux, resolution, specialized
algorithms
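Concept subsumption is decidable in description logics because the language is restricted. As a toy illustration, consider a fragment with only conjunctions of primitive concept names: there, C is subsumed by D exactly when every conjunct of D also occurs in C. The sketch below covers only this fragment and is not a general DL reasoner; the concept names are hypothetical.

```python
# Each defined concept is a conjunction of primitive concept names.
definitions = {
    "Person": frozenset({"Person"}),
    "Man": frozenset({"Person", "Male"}),
    "Father": frozenset({"Person", "Male", "Parent"}),
}


def subsumes(general, specific):
    """True if `general` subsumes `specific`: all of its conjuncts occur in `specific`."""
    return definitions[general] <= definitions[specific]


def subsumers(concept):
    """Classification: all other defined concepts that subsume the given one."""
    return sorted(name for name in definitions
                  if name != concept and subsumes(name, concept))


father_subsumers = subsumers("Father")
```

This check holds for every database at once, which is what distinguishes reasoning from evaluating a query over one instance.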
70Formalizing Glue Knowledge: Domain Map for
SYNAPSE and NCMIR
Domain Map = labeled graph with concepts ("classes")
and roles ("associations"); additional
semantics expressed as logic rules
71Source Contextualization and DM Refinement
sources can register new concepts at the
mediator ...