Title: An Introduction to Ontologies
1 An Introduction to Ontologies
- Tim Finin
- University of Maryland Baltimore County
2 What is an ontology?
- The subject of ontology is the study of the categories of things that exist or may exist in some domain.
- The word ontology is from the Greek ontos, for being, and logos, for word.
- Aristotle offered an ontology which included 10 categories, shown as the leaves in this tree (from Sowa, after Brentano).
3 Tree of Porphyry
- The oldest known tree diagram is the 3rd century AD work by the Greek philosopher Porphyry in a commentary on Aristotle.
- Substance was identified as the supreme genus, or the most general supertype.
4 Top down vs. bottom up
- Philosophers build from the top down and are interested in capturing the most general concepts.
- Programmers tend to work from the bottom up, supporting a set of applications, with a little generality to help reuse and future development.
- Ex: the CHAT-80 system (Pereira and Warren, 1982), which answered NL questions about a geographic database.
- It is an example of a microworld ontology that supported NLP, query answering, and generation.
5 Blocks world
- The blocks world is another microworld often used for NLP, vision, and planning.
- It consists of a table, a set of blocks of different shapes, sizes and colors, and a robot hand.
- Some typical domain constraints:
  - Only one block can be on another block.
  - Any number of blocks can be on the table.
  - The hand can only hold one block.
- Typical representation:
  - ontable(a) ontable(c)
  - on(b,a) handempty
  - clear(b) clear(c)
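The representation above can be sketched in Python as a set of ground facts, together with a check of the domain constraints. This is only a minimal sketch: the tuple encoding, the `violations` helper, and the `holding` predicate are assumptions made for illustration and do not appear in the slides.

```python
# Facts from the slide, encoded as tuples: (predicate, arg1, arg2, ...).
state = {
    ("ontable", "a"), ("ontable", "c"),
    ("on", "b", "a"), ("handempty",),
    ("clear", "b"), ("clear", "c"),
}

def violations(facts):
    """Return a list of domain-constraint violations found in `facts`."""
    problems = []
    # For every on(x, y) fact, y is the supporting block.
    supports = [f[2] for f in facts if f[0] == "on"]
    # Only one block can be on another block.
    for block in sorted(set(supports)):
        if supports.count(block) > 1:
            problems.append(f"more than one block on {block}")
    # Assumed meaning of clear(x): nothing is on x.
    for f in sorted(facts):
        if f[0] == "clear" and f[1] in supports:
            problems.append(f"{f[1]} is marked clear but supports a block")
    # The hand can only hold one block ("holding" is a hypothetical predicate).
    held = [f[1] for f in facts if f[0] == "holding"]
    if len(held) > 1 or (held and ("handempty",) in facts):
        problems.append("hand holds more than one block or is also empty")
    return problems

print(violations(state))  # the slide's state satisfies all constraints
```

Adding a fact such as `("on", "c", "a")` to the state makes the first constraint fail, because block a would then support two blocks.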
7 Trees, Lattices, and Other Hierarchies
- Most systems for expressing ontologies make heavy use of familiar representation schemes, including trees, lattices, acyclic graphs and general graphs.
- A lattice has a TOP (everything) and a BOTTOM (nothing).
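As a rough illustration of a lattice with TOP and BOTTOM, the sketch below stores a small type hierarchy as a graph of direct supertypes and computes the least common supertype of two types. The type names and the helper functions are hypothetical and chosen only for this example.

```python
# Each type maps to its direct supertypes. TOP is above everything and
# BOTTOM is below everything. The type names are illustrative only.
PARENTS = {
    "TOP": [],
    "Animal": ["TOP"],
    "Pet": ["TOP"],
    "Dog": ["Animal"],
    "Cat": ["Animal"],
    "PetDog": ["Dog", "Pet"],
    "BOTTOM": ["PetDog", "Cat"],
}

def supertypes(t):
    """All types at or above `t` (reflexive and transitive)."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def join(a, b):
    """Least common supertype of `a` and `b`; unique in a lattice."""
    common = supertypes(a) & supertypes(b)
    # The least element is the one that lies below every common supertype.
    least = [t for t in common if common <= supertypes(t)]
    assert len(least) == 1, "not a lattice: no unique least upper bound"
    return least[0]

print(join("Dog", "Cat"))     # Animal
print(join("Pet", "Cat"))     # TOP
print(join("BOTTOM", "Dog"))  # Dog
```

A tree is the special case in which every type has at most one direct supertype; here `PetDog` has two, which a tree cannot express.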
8 Ontologies in Computer Science
- Ontology: A common vocabulary and agreed upon meanings to describe a subject domain.
- Dictionary definition: Ontology, n. [Gr. "the things which exist" (pl. neut. of "being", p. pr. of "to be") + -logy; cf. F. ontologie.] That department of the science of metaphysics which investigates and explains the nature and essential properties and relations of all beings, as such, or the principles and causes of being. (Webster's Revised Unabridged Dictionary, G. & C. Merriam Co., 1913, edited by Noah Porter)
- This is not a profoundly new idea
  - Vocabulary specification
  - Domain theory
  - Conceptual schema (for a data base)
  - Class-subclass taxonomy
  - Object schema
9 Importance of ontologies in communication
- An example of the importance of ontologies in communication is the fate of NASA's Mars Climate Orbiter.
- It crashed into Mars on September 23, 1999.
- JPL used metric units in their program controlling the thrusters and Lockheed-Martin used imperial units.
- Instead of establishing an orbit at an altitude of 140 km, it did so at 60 km, causing it to burn up in the Martian atmosphere.
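One way to avoid this kind of mismatch is to make the unit an explicit part of every value that is exchanged, so that it belongs to the shared vocabulary instead of being an unstated convention. The sketch below assumes the exchanged quantity is a thruster impulse; the class name `Impulse` and the unit strings are hypothetical.

```python
from dataclasses import dataclass

# Exact conversion factor: 1 pound-force = 4.4482216152605 newtons.
NEWTONS_PER_POUND_FORCE = 4.4482216152605

@dataclass(frozen=True)
class Impulse:
    """A thruster impulse together with the unit it is expressed in."""
    value: float
    unit: str  # "N*s" (metric) or "lbf*s" (imperial)

    def to_newton_seconds(self) -> float:
        """Convert to newton-seconds, refusing units we do not know."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * NEWTONS_PER_POUND_FORCE
        raise ValueError(f"unknown unit: {self.unit}")

# The producer states its unit; the consumer converts explicitly.
reading = Impulse(10.0, "lbf*s")
print(reading.to_newton_seconds())
```

A bare number such as `10.0` carries no unit, so the consumer can only guess; the tagged value makes the agreement explicit and checkable.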
10 Conceptual Schemas
- A conceptual schema specifies the intended meaning of concepts used in a data base.
- Data Base Schema: Table price (stockNo: integer, cost: float)
- Conceptual Schema: price(x, y) => exists (x', y'): auto_part(x') and part_no(x') = x and retail_price(x', y', Value-Inc) and magnitude(y', US_dollars) = y
- [Figure: the Data Base and its Data Base Schema are linked through the Conceptual Schema to the Auto Product Ontology, the Product Ontology, and the Units & Measures Ontology]
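One possible reading of this conceptual schema is that a row of the price table stands for an auto part with that part number whose retail price, measured in US dollars, is the stored cost. The sketch below spells that reading out as ontology-level facts. The function name, the generated identifiers, and the sample row are all hypothetical.

```python
def interpret_price_row(stock_no, cost):
    """Spell out the assumed intended meaning of one `price` row as facts."""
    part = f"part_{stock_no}"      # hypothetical identifier for the auto part
    amount = f"amount_{stock_no}"  # hypothetical identifier for the price quantity
    return [
        ("auto_part", part),
        ("part_no", part, stock_no),
        ("retail_price", part, amount, "Value-Inc"),
        ("magnitude", amount, "US_dollars", cost),
    ]

# A sample row (stockNo, cost); the values are made up.
for fact in interpret_price_row(4711, 19.99):
    print(fact)
```

The data base schema only says that cost is a float; the unit, and the fact that it is a retail price, come from the ontologies.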
11 Implicit vs. Explicit Ontologies
- Systems which communicate and work together must share an ontology.
- The shared ontology can be implicit or explicit.
- Implicit ontologies are typically represented only by procedures.
- Explicit ontologies are (ideally) given a declarative representation in a well defined knowledge representation language.
12 Conceptualizations, Vocabularies and Axiomatization
- Three important aspects to explicit ontologies:
  - Conceptualization involves the underlying model of the domain in terms of objects, attributes and relations.
  - Vocabulary involves assigning symbols or terms to refer to those objects, attributes and relations.
  - Axiomatization involves encoding rules and constraints which capture significant aspects of the domain model.
- Two ontologies may
  - be based on different conceptualizations
  - be based on the same conceptualization but use different vocabularies
  - differ in how much they attempt to axiomatize the ontologies
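The three aspects can be kept apart in code. The sketch below assumes a tiny fruit domain: the conceptualization uses neutral internal identifiers, two vocabularies name the same concepts differently, and a single disjointness axiom constrains the model. All identifiers, terms, and the axiom itself are hypothetical.

```python
# Conceptualization: the underlying model, using neutral internal ids.
CONCEPTS = {"c_fruit", "c_tropical_fruit", "c_temperate_fruit"}
SUBCLASS_OF = {
    ("c_tropical_fruit", "c_fruit"),
    ("c_temperate_fruit", "c_fruit"),
}

# Vocabulary: two different sets of terms for the same concepts.
VOCAB_A = {
    "Fruit": "c_fruit",
    "TropicalFruit": "c_tropical_fruit",
    "TemperateFruit": "c_temperate_fruit",
}
VOCAB_B = {
    "FRT": "c_fruit",
    "FRT-TROP": "c_tropical_fruit",
    "FRT-TEMP": "c_temperate_fruit",
}

# Axiomatization: an assumed constraint that two classes share no instances.
DISJOINT = {frozenset({"c_tropical_fruit", "c_temperate_fruit"})}

def translate(term, source, target):
    """Map a term between vocabularies via the shared conceptualization."""
    concept = source[term]
    for candidate, target_concept in target.items():
        if target_concept == concept:
            return candidate
    raise KeyError(f"no term for {concept} in target vocabulary")

def consistent(instance_types):
    """True if one instance's set of concepts violates no disjointness axiom."""
    return not any(pair <= instance_types for pair in DISJOINT)

print(translate("TropicalFruit", VOCAB_A, VOCAB_B))
print(consistent({"c_tropical_fruit", "c_temperate_fruit"}))
```

Translation between the two vocabularies is trivial here only because both rest on the same conceptualization; when the conceptualizations differ, no term-by-term mapping may exist.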
13 Simple examples
[Figure: simple example with the labels fruit, tropical, and temperate]
14 Ontologies vs. KBs
- Ontologies are distinguished from KBs not by their form, but by the role they play in representing knowledge
- Consensus models for a domain
- Emphasis on properties that hold in all situations
- Emphasis on classes rather than instances
- Intended to support multiple tasks and methods
- Don't change during problem solving and are suited for compiling into tools
- Need to satisfy a community of use
- Emphasis on collaborative development
- Emphasis on translation to multiple logical formalisms
- Useful for education
15 Ontology Library and Editing Tools
- Ontolingua is a language for building, publishing, and sharing ontologies.
- A web-based interface to a browser/editor server is available at http://ontolingua.stanford.edu/ and mirror sites.
- Ontologies can be translated into a number of content languages, including KIF, LOOM, Prolog, CLIPS, etc.
- Chimaera is a tool for merging existing ontologies.
16 Big Ontologies
- There are several large, general ontologies that are freely available.
- Some examples are
  - Cyc - Original general purpose ontology
  - WordNet - a large, on-line lexical reference system
  - World Fact Book - 5 MB of KIF sentences!
  - UMLS - NLM's Unified Medical Language System
- See http://www.cs.utexas.edu/users/mfkb/related.html for more
17 WordNet
- WordNet is an on-line lexical reference system whose design is inspired by psycholinguistic theories of human lexical memory.
- English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept.
- Synsets: {board, plank}, {board, committee}
- Different relations link the synonym sets (e.g. antonyms, generalizations, etc.)
- 140K words
- Developed by the Cognitive Science Laboratory at Princeton and available online
- Although linguistically motivated, many groups have used it as a general ontology of concepts.
- http://www.cogsci.princeton.edu/wn/
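The synset idea can be sketched with plain Python sets, using the two synsets from the slide that share the word board: {board, plank} and {board, committee}. This is not WordNet's actual data format or API; the helper names are hypothetical.

```python
# Each synset is a set of words that express one underlying concept.
SYNSETS = [
    frozenset({"board", "plank"}),      # a piece of lumber
    frozenset({"board", "committee"}),  # a group of people
]

def synsets_for(word):
    """All synsets (concepts) that the word can express."""
    return [s for s in SYNSETS if word in s]

def synonyms(word):
    """Words that share at least one synset with `word`."""
    return {w for s in synsets_for(word) for w in s} - {word}

print(len(synsets_for("board")))   # 2: the word is ambiguous
print(sorted(synonyms("board")))   # ['committee', 'plank']
print(sorted(synonyms("plank")))   # ['board']
```

Because relations such as antonymy and generalization link synsets, not individual words, an ambiguous word like board takes part in each relation once per sense.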
18 EDR Electronic Dictionary
- http://www.iijnet.or.jp/edr/
- A dictionary with over 400,000 concepts, with their mappings to both English and Japanese words.
19 Cyc
- CYC is a large KB which has been under continual development since about 1985.
- The CYC KB is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life.
- CYC is encoded in the KR language CycL.
- The Upper CYC Ontology contains approximately 3,000 terms capturing the most general concepts of human consensus reality.
- http://www.cyc.com/cyc-2-1/cover.html
20 Cyc's top level concepts
21 OpenCyc
- http://www.opencyc.org/
- OpenCyc 1.0 (summer 2002?) will include the following.
  - 6,000 concepts: an upper ontology for all of human consensus reality.
  - 60,000 assertions about the 6,000 concepts, interrelating them, constraining them, in effect (partially) defining them.
  - A compiled version of the Cyc Inference Engine and the Cyc Knowledge Base Browser.
  - A specification of CycL, the language in which Cyc (and hence OpenCyc) is written. There are CycL-to-Lisp, CycL-to-C, etc. translators.
  - A specification of the Cyc API
  - A few sample programs that demonstrate use of the Cyc API for application development.
22 IEEE Standard Upper Ontology
- An IEEE standards working group
- This standard will specify an upper ontology that will enable computers to utilize it for applications such as data interoperability, information search and retrieval, automated inferencing, and natural language processing.
- http://suo.ieee.org/
- See site for documents and archives of mailing list discussions
- Two starter documents for the SUO: SUMO and IFF
23 World Fact Book
- Stanford's WFB aims to semi-automatically construct a substantial KB of basic geographic, economic, political, and demographic knowledge about the world's nations.
- Source: CIA World Fact Book
- 5.2 MB, 5K classes, 64K facts and rules encoded in KIF
- Available from http://www-ksl-svc.stanford.edu:5915/doc/wfb/ in several forms
- Example: resources, industries, commodities
  - Interrelated: crude-oil reserves, production, exports
  - Coal mining, computer industry, auto parts industry, ...
- Specify basic definitions
  - A natural resource is a deposit of stuff; an industry is a collection of businesses; a commodity is an item whose sales can be measured as a continuous quantity
- Examine related classes; identify key factors
  - E.g., material, process, product, customer, location, task
- Define each industry as a conjunction of factors
  - 6 generative factors discriminate 500 industries
- Organize values of factors (mining
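Defining an industry as a conjunction of factors can be sketched as follows. The six factor names come from the slide; the factor values, the sample business, and the `matches` helper are hypothetical and are not taken from the WFB knowledge base.

```python
# Factor names are from the slide; every value below is made up.
FACTORS = ("material", "process", "product", "customer", "location", "task")

# Each industry is a conjunction of required factor values.
INDUSTRIES = {
    "coal mining": {"material": "coal", "process": "mining"},
    "auto parts industry": {"product": "auto parts", "process": "manufacturing"},
}

def matches(industry, business):
    """A business belongs to an industry if it satisfies every required factor."""
    required = INDUSTRIES[industry]
    return all(business.get(factor) == value for factor, value in required.items())

business = {"material": "coal", "process": "mining", "location": "unspecified"}
print(matches("coal mining", business))
print(matches("auto parts industry", business))
```

With only a handful of values per factor, six factors already yield many distinct combinations, which is how a small set of generative factors can discriminate a large number of industries.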
24 Unified Medical Language System
- Under development since 1986 by the National Library of Medicine
- Supports standardized medical terminology via a central dictionary, thesaurus, semantic network, and search engine
- Purpose is to aid the development of systems that help health professionals and researchers retrieve and integrate electronic biomedical information from a variety of sources and to make it easy for users to link disparate information systems, including computer-based patient records, bibliographic databases, factual databases, and expert systems.
- There are four UMLS knowledge sources
  - UMLS Metathesaurus
  - SPECIALIST Lexicon
  - UMLS Semantic Network
  - UMLS Information Sources Map
25 Ontology Conclusions
- Shared ontologies are essential for agent communication and knowledge sharing
- Ontology tools and standards are important
  - Ontolingua and OKBC are good examples
  - XML and RDF may be a next step
- Some large general ontologies are available
  - Cyc, WFB, WordNet, ...
- For more information
  - http://www.kr.org/top describes projects addressing major ontology construction issues
  - Ontology mailing list: send mail to majordomo_at_cs.umbc.edu with "info ontology" in the message body for information.
  - ANSI Ad Hoc Group on Ontology Standards: http://WWW-KSL.Stanford.EDU/onto-std/