Title: Ontology Basics
1Ontology Basics
- Email: chingyeh_at_cse.ttu.edu.tw
- URL: http://www.cse.ttu.edu.tw/chingyeh
2Selected From
- Natalya Fridman Noy and Deborah L. McGuinness. "Ontology Development 101: A Guide to Creating Your First Ontology". Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880, March 2001.
- Deborah L. McGuinness. "Ontologies Come of Age". In Dieter Fensel, Jim Hendler, Henry Lieberman, and Wolfgang Wahlster, editors. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press, 2002.
- Ora Lassila and Deborah L. McGuinness. "The Role of Frame-Based Representation on the Semantic Web". KSL Tech Report Number KSL-01-02. Submitted for publication, January, 2001.
3Ontologies Come of Age
4Abstract
- Ontologies have moved beyond the domains of library science, philosophy, and knowledge representation, and have become the concern of marketing departments, CEOs, and mainstream business.
- Ontologies play critical roles (Forrester Research report) in
- support of browsing and search for e-commerce, and
- support of interoperability for facilitation of knowledge management and configuration.
- Ontologies are used as central controlled vocabularies that are integrated into catalogues, databases, web publications, knowledge management applications, etc.
5Abstract
- Large ontologies are essential components in many online applications, including
- search (such as Yahoo and Lycos),
- e-commerce (such as Amazon and eBay),
- configuration (such as Dell and PC-Order), etc.
- Ontologies have long life spans, sometimes spanning multiple projects (such as UMLS, SIC codes, etc.).
- Such diverse usage has many implications for ontology environments.
6The Web's Growing Needs
- HTML pages contain information about how to present content on a page. They target human readers rather than programs or automatic readers.
- With current search engines such as Google, finding the exact information is not as easy as one would hope.
- The result is a rank-ordered list of pages rather than the specific information sought.
- Web pages typically do not contain markup information about the contents of the page.
7The Web's Growing Needs
- The next generation of the web aims at pages for consumption by machines or programs.
- Markup languages aimed at marking up content and services instead of just presentation information (XML, RDF, RDFS, DAML, etc.) are becoming more accepted as users and application developers see the need for more understanding of what is available from web pages.
8The Web's Growing Needs
Berners-Lee's Architecture
9The Web's Growing Needs
- The markup languages at the base (just above Unicode) are used for term specification (or, in web speak, resource definition).
- In the ontology layer, we can define terms and their relationships to other terms.
- In the logic layer, we can deduce the implications of the term definitions and relationships.
10Ontologies
- Merriam-Webster (which dates the term to circa 1721) provides two definitions:
- a branch of metaphysics concerned with the nature and relations of being, and
- a particular theory about the nature of being or the kinds of existents.
- While ontologies (even formal ontologies) have had a long history, they remained largely a topic of academic interest among philosophers, linguists, librarians, and knowledge representation researchers until fairly recently.
11Ontologies
- Ontologies have been gaining interest and acceptance in computational audiences (in addition to philosophical audiences).
- Fields embracing ontologies [Guarino 1998]:
- knowledge engineering, knowledge representation, qualitative modeling, language engineering, database design, information retrieval and extraction, and knowledge management and organization.
- Other areas include
- library science [Dublin Core 1999], ontology-enhanced search (e.g., eCyc (http://www.e-Cyc.com/) and FindUR [McGuinness 1998]), e-commerce, possibly the largest one (e.g., Amazon.com, Yahoo Shopping, etc.), and configuration.
12Ontologies
- Here we restrict our sense of ontologies to those we see emerging on the web.
- One widely cited definition of an ontology is Gruber's [Gruber 1993]: "A specification of a conceptualization."
13Ontologies
- Ontologies can be used to provide a concrete
specification of term names and term meanings.
14Ontology Spectrum
- One of the simplest notions of a possible ontology is a controlled vocabulary, i.e., a finite list of terms.
- Catalogs are an example of this category.
- Catalogs can provide an unambiguous interpretation of terms. For example, every use of a term, say "car", will denote exactly the same identifier, say 25.
- Another potential ontology specification is a glossary (a list of terms and meanings).
- The meanings are typically specified as natural language statements.
- This provides a kind of semantics or meaning, since humans can read the natural language statements and interpret them.
- Typically, interpretations are not unambiguous, so these specifications are not adequate for computer agents and would not meet the criterion of being machine processable.
15Ontology Spectrum
- Thesauri provide some additional semantics in their relations between terms.
- They provide information such as synonym relationships.
- In many cases their relationships may be interpreted unambiguously by agents.
- Typically thesauri do not provide an explicit hierarchy (although with narrower and broader term specifications, one could deduce a simple hierarchy).
16Ontology Spectrum
- Informal is-a
- Early web specifications of term hierarchies, such as Yahoo's, provide a basic notion of generalization and specialization.
- The hierarchy is not a strict subclass or is-a hierarchy, however.
- This mixing of categories (such as accessories) is not unique to Yahoo; it appears in many web classification schemes.
- Without true subclass (or true is-a) relationships, certain kinds of deductive uses of ontologies become problematic.
17Ontology Spectrum
- Formal is-a
- If A is a superclass of B, then if an object is an instance of B it necessarily follows that the object is an instance of A.
- For example, if Dress is a subclass of Apparel and MyFavoriteDress is an instance of Dress, then it follows that MyFavoriteDress is an instance of Apparel.
- Strict subclass hierarchies are necessary for exploitation of inheritance.
- Formal instance relationships
- Some classification schemes only include class names, while others include ground individual content.
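The formal is-a rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the source papers: the dictionaries and function names are hypothetical, and the ConsumerProduct superclass is an assumption added to show that the inference chains through more than one level.

```python
# Hypothetical toy hierarchy: each class maps to its direct superclass.
SUPERCLASS = {"Dress": "Apparel", "Apparel": "ConsumerProduct"}

# Each individual maps to the class it was directly asserted to belong to.
INSTANCE_OF = {"MyFavoriteDress": "Dress"}


def ancestors(cls):
    """Yield cls itself and every superclass reachable through is-a links."""
    while cls is not None:
        yield cls
        cls = SUPERCLASS.get(cls)


def is_instance(individual, cls):
    """True if the individual is an instance of cls, directly or by inheritance."""
    return cls in ancestors(INSTANCE_OF[individual])
```

With a strict hierarchy, is_instance("MyFavoriteDress", "Apparel") follows from the two stated facts alone. In an informal hierarchy the same walk up the tree could produce wrong conclusions, which is why the deduction is only safe under formal is-a.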
18Ontology Spectrum
- Frames
- Here classes include property information.
- For example,
- the Apparel class may include properties of price and isMadeFrom.
- My specific dress may have a price of 100 and may be made from cotton.
- Properties become more useful when they are specified at a general class level and then inherited consistently by subclasses and instances.
- In a consumer hierarchy, a general category like consumer product might have a price property associated with it.
- Frames were introduced by Minsky [Minsky 1975] and have been widely adopted.
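Property inheritance in a frame system can be sketched as follows. This is a hedged illustration: the FRAMES table and the inherited_slots helper are hypothetical, and placing price on ConsumerProduct and isMadeFrom on Apparel is one possible assignment chosen for the example.

```python
# Hypothetical frames: each class names its parent and the slots it declares itself.
FRAMES = {
    "ConsumerProduct": {"parent": None, "slots": {"price"}},
    "Apparel": {"parent": "ConsumerProduct", "slots": {"isMadeFrom"}},
    "Dress": {"parent": "Apparel", "slots": set()},
}


def inherited_slots(cls):
    """Collect the slots declared on cls and on all of its superclasses."""
    slots = set()
    while cls is not None:
        slots |= FRAMES[cls]["slots"]
        cls = FRAMES[cls]["parent"]
    return slots
```

Dress declares no slots of its own, yet inherited_slots("Dress") contains both price and isMadeFrom because they are specified once at the more general levels.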
19Ontology Spectrum
- A more expressive point in the ontology spectrum includes value restrictions.
- Here we may place restrictions on what can fill a property.
- For example, a price property might be restricted to have a filler that is a number (or a number in a certain range), and isMadeFrom may be restricted to have fillers that are a kind of material.
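A value restriction can be thought of as a predicate attached to a property. The sketch below assumes a hypothetical list of materials and an arbitrary price range (0 to 10,000); neither comes from the source.

```python
from numbers import Number

# Assumed set of known materials, for illustration only.
MATERIALS = {"cotton", "wool", "silk"}

# Each restriction is a predicate that a filler must satisfy.
RESTRICTIONS = {
    # price must be a number within an assumed range
    "price": lambda v: isinstance(v, Number)
    and not isinstance(v, bool)
    and 0 <= v <= 10_000,
    # isMadeFrom must name a known kind of material
    "isMadeFrom": lambda v: v in MATERIALS,
}


def violations(frame):
    """Return the sorted names of properties whose fillers break a restriction."""
    return sorted(
        prop
        for prop, ok in RESTRICTIONS.items()
        if prop in frame and not ok(frame[prop])
    )
```

For instance, violations({"price": 100, "isMadeFrom": "cotton"}) returns an empty list, while a frame with a price of "cheap" is flagged.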
20Ontology Spectrum
- As ontologies need to express more information, their expressive requirements grow.
- For example, we may want to fill in the value of one property based on a mathematical equation using values from other properties.
- Some languages allow ontologists to state arbitrary logical statements.
- Very expressive ontology languages, such as that seen in Ontolingua [Farquhar et al 1997] or CycL, allow ontologists to specify first-order logic constraints between terms and more detailed relationships such as disjoint classes, disjoint coverings, inverse relationships, part-whole relationships, etc.
21Ontology Spectrum
- Here we require the following properties to hold in order to consider something an ontology:
- Finite controlled (extensible) vocabulary
- Unambiguous interpretation of classes and term relationships
- Strict hierarchical subclass relationships between classes
- Properties that are typical but not mandatory:
- Property specification on a per-class basis
- Individual inclusion in the ontology
- Value restriction specification on a per-class basis
- Properties that may be desirable but are neither mandatory nor typical:
- Specification of disjoint classes
- Specification of arbitrary logical relationships between terms
- Distinguished relationships such as inverse and part-whole
22Simple Ontologies and Their Uses
- Simple ontologies are not as costly to build and, potentially more importantly, many are available.
- Examples:
- DMOZ (www.dmoz.com) leverages over 35,000 volunteer editors and, at publication time, had over 360,000 classes in a taxonomy.
- The Unified Medical Language System (UMLS, http://www.nlm.nih.gov/research/umls/), developed by the National Library of Medicine, is a large, sophisticated ontology about medical terminology.
- Some companies, such as Cycorp (www.cyc.com), are making available portions of large, detailed ontologies.
23Simple Ontologies and Their Uses
- They provide a controlled vocabulary. Common term usage is a start for interoperability.
- A simple taxonomy may be used for site organization and navigation support.
- Taxonomies may be used to support expectation setting. It is an important user interface feature that users be able to have realistic expectations of a site.
- Taxonomies may be used as umbrella structures from which to extend content, e.g., UNSPSC (Universal Standard Products and Services Classification, www.unspsc.org).
- Taxonomies may provide browsing support.
- Taxonomies may be used to provide search support.
- Taxonomies may be used to provide sense disambiguation support.
24Structured Ontologies and Their Uses
- They can be used for simple kinds of consistency checking.
- Ontologies may be used to provide completion.
- Ontologies may be able to provide interoperability support.
- Ontologies may be used to support validation and verification testing of data (and schemas).
- Ontologies containing markup information may encode entire test suites.
- Ontologies can provide the foundation for configuration support.
- Ontologies can support structured, comparative, and customized search.
- Ontologies may be used to exploit generalization/specialization information.
25Ontologies provide completion
- For example, a user may state that she is looking for a high-resolution screen on a PC, and the ontology can then expand this into the exact pixel range that is to be expected.
- This is done simply by defining the term HighResolutionPc with respect to a particular pixel range on two roles, verticalResolution and horizontalResolution.
- As another example, a medical system may obtain information from an ontology that if a patient is stated to be a man, then the gender of the patient is male.
- That information may be used to determine that a question about whether the patient is pregnant should not be asked, since the system could contain information that things whose gender is male are disjoint from things that are pregnant.
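The medical example combines two pieces of ontology knowledge: an implication (a man has gender male) and a disjointness statement (male things are disjoint from pregnant things). A minimal sketch, with hypothetical table and function names:

```python
# Assumed ontology fragments, for illustration only.
IMPLIES = {"Man": {"gender": "male"}}  # class membership implies property values
DISJOINT = {("male", "pregnant")}      # a gender value disjoint from a condition


def complete(record):
    """Fill in property values that follow from the record's stated classes."""
    completed = dict(record)
    for cls in record.get("types", ()):
        for prop, value in IMPLIES.get(cls, {}).items():
            completed.setdefault(prop, value)
    return completed


def should_ask_pregnancy(record):
    """Skip the pregnancy question when the completed gender rules it out."""
    gender = complete(record).get("gender")
    return (gender, "pregnant") not in DISJOINT
```

The patient record never states a gender explicitly. The ontology supplies it, and the disjointness statement then suppresses the question.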
26Ontologies provide interoperability support
- In the case of controlled vocabularies, there is enhanced interoperability support since different users/applications are using the same set of terms.
- In simple taxonomies, we can recognize when one application is using a term that is more general or more specific than another term, and so better facilitate interoperability.
- In more expressive ontologies, we may have a complete operational definition for how one term relates to another term. We can thus use equality axioms or mappings to express one term precisely in terms of another and thereby support more intelligent interoperability.
27Ontologies support validation and verification
testing
- If an ontology contains class descriptions, such as StanfordEmployee, these definitions may be used as queries to databases to discover what kind of coverage currently exists in datasets.
- For example, if one was going to expose the class StanfordEmployee on an interface to some application, it would be useful to know first if the dataset contained any instances of Person whose employer property was filled with the value Stanford.
- Similarly, checks could be done to see if there were currently Persons in the dataset that were known to be Employees yet did not have a value for the employer property (thereby showing that the dataset is not complete).
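Both checks can be sketched as simple queries over a dataset. The rows below are invented sample data, and the function names are hypothetical.

```python
# Invented sample rows; "types" lists the classes each individual belongs to.
PEOPLE = [
    {"name": "Ann", "types": {"Person", "Employee"}, "employer": "Stanford"},
    {"name": "Bob", "types": {"Person", "Employee"}},
    {"name": "Cy", "types": {"Person"}},
]


def stanford_employees(rows):
    """Coverage check: Persons whose employer property is filled with Stanford."""
    return [
        r["name"]
        for r in rows
        if "Person" in r["types"] and r.get("employer") == "Stanford"
    ]


def employees_missing_employer(rows):
    """Completeness check: known Employees with no value for employer."""
    return [
        r["name"]
        for r in rows
        if "Employee" in r["types"] and "employer" not in r
    ]
```

A non-empty result from the first function shows the StanfordEmployee class has coverage. A non-empty result from the second shows the dataset is incomplete.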
28Ontologies may encode entire test suites
- An ontology may contain a number of definitions of terms, some instance definitions, and then include a term definition that is considered to be a query: find all terms that meet the following conditions.
- Markup information could be encoded with this query to include what the answer should be, thus providing enough information to encode regression testing data.
- Example ontology:
- http://ksl.stanford.edu/projects/DAML/chimaera-jtp-cardinality-test1.daml
29Ontologies provide for configuration support
- Class terms may be defined so that they contain descriptions of what kinds of parts may be in a system.
- Additionally, interactions between properties can be defined so that filling in a value for one property can cause another value to be filled in for another slot.
- For example, one may generate an ontology of information about home theatre products, as is done in a small configurator example using a simple description logic-based system.
- Terms such as television, amplifier, tuner, etc. are defined.
- Additionally, information connecting the terms together is included.
- A class of HighQualityTelevisions is defined so that users may choose from this class and the configurator will automatically fill in limited sets of manufacturers to choose from, minimum diagonal values, minimum price ranges, etc.
30 Ontologies support structured, comparative, and
customized search
- For example, if one is looking for televisions, a class description for television may be obtained from an ontology, its properties may be obtained (such as diagonal, price, manufacturer, etc.), and then a comparative presentation may be made of televisions by presenting the values of each of the properties.
- Those properties can also be used to provide a form for users to fill in so that they may provide a detailed set of specifications about the items they are looking to find.
31Ontologies exploit generalization/specialization
information
- If a search application finds that a user's query generates too many answers, one may dissect the query to see if any terms in it appear in an ontology, and if so, the search application may suggest specializing that term.
- For example, if one did a search for concerts in the San Francisco Bay area and got too many answers, a search engine might look up concert in an ontology and discover that there are subclasses of concert (and it may also discover that there are specific concert locations in the Bay area).
- The search engine could then choose to present the user with the option of looking for a particular kind of concert (say rock concert).
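The query-refinement idea can be sketched as a lookup of query words against the ontology's subclass table. Everything here is hypothetical: the subclass list, the result-count limit of 50, and the crude handling of plurals are assumptions for illustration.

```python
# Assumed subclass table for the concert example.
SUBCLASSES = {"concert": ["rock concert", "jazz concert", "classical concert"]}


def suggest_refinements(query, result_count, limit=50):
    """Suggest more specific terms when a query returns too many answers."""
    if result_count <= limit:
        return []
    suggestions = []
    for word in query.lower().split():
        # Try the word as typed, then a naive singular form (trailing "s" removed).
        singular = word[:-1] if word.endswith("s") else word
        for term in (word, singular):
            if term in SUBCLASSES:
                suggestions.extend(SUBCLASSES[term])
                break
    return suggestions
```

A real search application would use proper term normalization, but the control flow is the same: detect an overly broad result, find query terms that have subclasses, and offer those subclasses to the user.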
32Ontology Acquisition
- One methodology for obtaining ontologies is to begin with an industry standard ontology and then modify or extend it.
- Another methodology is to semi-automatically generate a starting point for an ontology.
- Many taxonomic structures exist on the web or in the tables of contents of documents.
- One might crawl certain sites to obtain a starting taxonomic structure and then analyze, modify, and extend that.
33Ontology Acquisition
- Q: Where should one look for existing ontologies or sources of information to be crawled?
- Standards organizations, such as NIST (the National Institute of Standards and Technology, http://www.nist.gov/), support efforts to produce controlled vocabularies and ontologies.
- Consortiums are forming to generate ontologies.
- For example, RosettaNet (http://www.rosettanet.org) works in the areas of information technology, electronic technology, electronic components, and semiconductor manufacturing.
- They are creating industry-wide open e-business standards and providing a language for business processes.
- Trade organizations provide class hierarchies on their sites that can also be used as a standard structured controlled vocabulary.
- Every e-commerce site today encodes at least a taxonomic organization of terms, as Amazon does in organizing its book and music information.
34Ontology Acquisition
- Another emerging trend is the use of markup languages.
- Some pages are being annotated using markup languages such as XML, RDF, DAML, etc. The pages including the annotations may be using markup terms from controlled vocabularies.
- Libraries of ontologies that are potentially of use for markup are emerging. For example, the DAML program maintains a library of DAML ontologies at http://www.daml.org/ontologies/.
- Much of this section has introduced the idea of obtaining either a simple or complex ontology as a starting point and then analyzing, modifying, and maintaining it over time.
35Ontology-related Implications and Needs
- When starting an ontology-based application, the two major concerns will be
- language and
- environment.
36Language Issues
- If one is using a simple ontology, few issues arise.
- However, if one is considering a more complex ontology, the expressive power of the representation and reasoning language needs to be considered.
- For example, if one wants to do range checking in an e-commerce application, then it would be unwise to choose a simple language that only contains subclass and instance relationships and does not include property specification with value restrictions.
- Candidate ontology languages:
- Standard specification languages (such as KRSS, the Knowledge Representation System Specification effort),
- Interchange formats (such as KIF, the Knowledge Interchange Format, which is now a proposed ANSI standard), and
- Common application programming interface standards (such as OKBC, Open Knowledge Base Connectivity).
37Language Issues
- One does not just want to consider representational constructs in a language; one also wants to consider the reasoning that may be supported in the language.
- Some fields, such as description logics (www.dl.kr.org), make this a central focus in language design.
- Also, a language should be usable with existing platforms and should be something that non-experts can use to do their conceptual modeling.
- The web is clearly the most important platform with which to be compatible today, thus any language choice should be able to leverage the web.
- Additionally, frame-based systems have had a long history of being thought of as conceptually easy to use, thus a frame paradigm may be worth considering.
38Language Issues
- The DARPA Agent Markup Language program, for example, attempted to take the emerging web languages of today, such as XML and RDF, and create a language that is web compatible but draws on the 20-year history of description logics in choosing language constructs along with reasoning paradigms.
- The resulting language, DAML+OIL, attempts to merge the best of existing web languages, description logics, and frame reasoning systems.
39Environment Issues
- How to analyze, modify, and maintain an ontology over time?
- If the ontology is to be maintained by subject matter experts (and not by knowledge experts), most likely some ontology tools will be needed.
- Verity, an information retrieval company, has provided a topic editor which supports users in generating taxonomies and utilizing them in search queries.
- Research efforts have existed for many years in producing ontology toolkits:
- Stanford University's previously mentioned tools, Ontolingua and Chimaera,
- OilEd (http://img.cs.man.ac.uk/oil/) from Manchester University, and
- Protégé from Stanford Medical Informatics.
40Environment Issues
- Some companies with extensive ontology needs, such as VerticalNet (http://www.verticalnet.com/), have developed or are developing their own ontology tools in order to build ontologies that meet the needs of a sophisticated commercial ontologist.
- Their tools were built after analyzing existing research prototypes and were then designed to meet the commercial standards required in the diverse, collaborative e-commerce applications of today.
41Environment Issues
- When choosing to use or build an ontology environment, the following issues should be considered:
- Collaboration and distributed workforce support.
- Platform interconnectivity.
- Scale.
- Versioning.
- Security.
- Analysis.
- Lifecycle issues.
- Ease of use.
- Diverse user support
- Presentation Style.
- Extensibility.
42Conclusions
- The emergence of ontologies from academic obscurity into mainstream business and practice on the web
- Ontology along with a spectrum of properties
- Criteria (necessary, prototypical, and desirable) for simple and complex ontologies
- Ways that ontologies are being and may be used to provide value in many types of applications
- Issues of acquiring ontologies and then maintaining and evolving ontologies
- Ontology-related issues that arise from the emergence of ontologies, focusing on ontology language and environment
43Ontology Development 101: A Guide to Creating
Your First Ontology
- From Natalya Fridman Noy and Deborah L. McGuinness. "Ontology Development 101: A Guide to Creating Your First Ontology". Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880, March 2001. Available at http://www.ksl.stanford.edu/people/dlm/papers/ontology101/ontology101-noy-mcguinness.html
44Why Develop an Ontology
- The development of ontologies has been moving from the realm of Artificial-Intelligence laboratories to the desktops of domain experts.
- Ontologies have become common on the World-Wide Web.
- The ontologies on the Web range from large taxonomies categorizing Web sites (such as on Yahoo!) to categorizations of products for sale and their features (such as on Amazon.com).
- The WWW Consortium (W3C) is developing the Resource Description Framework (RDF), a language for encoding knowledge on Web pages to make it understandable to electronic agents searching for information.
- The Defense Advanced Research Projects Agency (DARPA), in conjunction with the W3C, is developing the DARPA Agent Markup Language (DAML) by extending RDF with more expressive constructs aimed at facilitating agent interaction on the Web.
45Why Develop an Ontology
- Many disciplines now develop standardized ontologies that domain experts can use to share and annotate information in their fields.
- Medicine, for example, has produced large, standardized, structured vocabularies such as SNOMED and the semantic network of the Unified Medical Language System.
- Broad general-purpose ontologies are emerging as well. For example, the UNSPSC ontology provides terminology for products and services.
46Why Develop an Ontology
- Reasons for developing an ontology:
- To share common understanding of the structure of information among people or software agents
- To enable reuse of domain knowledge
- To make domain assumptions explicit
- To separate domain knowledge from the operational knowledge
- To analyze domain knowledge
47What Is in an Ontology?
- For the purposes of this guide, an ontology is a formal explicit description of
- concepts in a domain of discourse (classes, sometimes called concepts),
- properties of each concept describing various features and attributes of the concept (slots, sometimes called roles or properties), and
- restrictions on slots (facets, sometimes called role restrictions).
- An ontology together with a set of individual instances of classes constitutes a knowledge base.
48What Is in an Ontology?
- Classes describe concepts in the domain.
- Specific wines are instances of the class of wines.
- The Bordeaux wine in the glass in front of you while you read this document is an instance of the class of Bordeaux wines.
- A class can have subclasses that represent concepts that are more specific than the superclass.
- For example, we can divide the class of all wines into red, white, and rosé wines. Alternatively, we can divide a class of all wines into sparkling and non-sparkling wines.
49What Is in an Ontology?
- Slots describe properties of classes and instances.
- Château Lafite Rothschild Pauillac wine has a full body; it is produced by the Château Lafite Rothschild winery.
- We have two slots describing the wine in this example:
- the slot body with the value full, and
- the slot maker with the value Château Lafite Rothschild winery.
- At the class level, we can say that instances of the class Wine will have slots describing their flavor, body, sugar level, the maker of the wine, and so on.
50What Is in an Ontology?
- In practical terms, developing an ontology includes
- defining classes in the ontology,
- arranging the classes in a taxonomic (subclass-superclass) hierarchy,
- defining slots and describing allowed values for these slots,
- filling in the values for slots for instances.
- We can then create a knowledge base by defining individual instances of these classes, filling in specific slot value information and additional slot restrictions.
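The vocabulary of classes, slots, facets, and instances can be sketched as a small data structure. This is an illustrative model only, not the representation used by any particular ontology tool: the type names, the allowed values for body, and the choice of Pauillac as a direct subclass of Wine are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slot:
    name: str
    allowed_values: Optional[frozenset] = None  # facet; None means unrestricted


@dataclass
class OntologyClass:
    name: str
    parent: Optional["OntologyClass"] = None
    slots: dict = field(default_factory=dict)

    def all_slots(self):
        """Own slots plus those inherited from superclasses."""
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.slots}


@dataclass
class Instance:
    name: str
    cls: OntologyClass
    values: dict

    def problems(self):
        """List slot values that are undeclared or violate a facet."""
        slots = self.cls.all_slots()
        found = []
        for slot_name, value in self.values.items():
            slot = slots.get(slot_name)
            if slot is None:
                found.append(f"unknown slot: {slot_name}")
            elif slot.allowed_values is not None and value not in slot.allowed_values:
                found.append(f"bad value for {slot_name}: {value}")
        return found


# Define classes and slots, arrange them in a hierarchy, then fill in an instance.
wine = OntologyClass("Wine", slots={
    "body": Slot("body", frozenset({"light", "medium", "full"})),
    "maker": Slot("maker"),
})
pauillac = OntologyClass("Pauillac", parent=wine)
lafite = Instance(
    "Château Lafite Rothschild Pauillac",
    pauillac,
    {"body": "full", "maker": "Château Lafite Rothschild"},
)
```

The ontology is the pair of class definitions; adding lafite and other instances turns it into a knowledge base. Calling lafite.problems() checks the instance's slot values against the slots and facets inherited from Wine.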
51Some classes, instances, and relations among them
in the wine domain
52A Simple Knowledge-Engineering Methodology
- There is no one correct way or methodology for developing ontologies.
- Here we discuss general issues to consider and offer one possible process for developing an ontology.
- An iterative approach to ontology development:
- We start with a rough first pass at the ontology.
- We then revise and refine the evolving ontology and fill in the details.
- Along the way, we discuss the modeling decisions that a designer needs to make, as well as the pros, cons, and implications of different solutions.
53A Simple Knowledge-Engineering Methodology
- Some fundamental rules in ontology design:
- There is no one correct way to model a domain; there are always viable alternatives. The best solution almost always depends on the application that you have in mind and the extensions that you anticipate.
- Ontology development is necessarily an iterative process.
- Concepts in the ontology should be close to objects (physical or logical) and relationships in your domain of interest. These are most likely to be nouns (objects) or verbs (relationships) in sentences that describe your domain.
54Step 1 Determine the domain and scope of the
ontology
- Answer several basic questions:
- What is the domain that the ontology will cover?
- For what are we going to use the ontology?
- For what types of questions should the information in the ontology provide answers?
- Who will use and maintain the ontology?
- The answers to these questions may change during the ontology-design process, but at any given time they help limit the scope of the model.
55Step 1 Determine the domain and scope of the
ontology
- For the ontology of wine and food, representation of food and wines is the domain of the ontology.
- We plan to use this ontology for applications that suggest good combinations of wines and food.
- Naturally, the concepts describing different types of wines, main food types, and the notion of a good combination of wine and food and a bad combination will figure into our ontology.
- At the same time, it is unlikely that the ontology will include concepts for managing inventory in a winery or employees in a restaurant, even though these concepts are somewhat related to the notions of wine and food.
56Step 1 Determine the domain and scope of the
ontology
- The intended uses of the ontology affect what it must contain:
- To assist in natural-language processing of articles in wine magazines, it may be important to include synonyms and part-of-speech information for concepts in the ontology.
- To help restaurant customers decide which wine to order, we need to include retail-pricing information.
- For wine buyers stocking a wine cellar, wholesale pricing and availability may be necessary.
- If the people who will maintain the ontology describe the domain in a language that is different from the language of the ontology users, we may need to provide the mapping between the languages.
57Step 1 Determine the domain and scope of the
ontology
- Competency questions
- Sketch a list of questions that a knowledge base based on the ontology should be able to answer.
- These questions will serve as the litmus test later:
- Does the ontology contain enough information to answer these types of questions?
- Do the answers require a particular level of detail or representation of a particular area?
- These competency questions are just a sketch and do not need to be exhaustive.
58Step 1 Determine the domain and scope of the
ontology
- In the wine and food domain, the following are possible competency questions:
- Which wine characteristics should I consider when choosing a wine?
- Is Bordeaux a red or white wine?
- Does Cabernet Sauvignon go well with seafood?
- What is the best choice of wine for grilled meat?
- Which characteristics of a wine affect its appropriateness for a dish?
- Does a bouquet or body of a specific wine change with vintage year?
- What were good vintages for Napa Zinfandel?
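One way to use competency questions as a litmus test is to phrase each as a query and check which ones the knowledge base cannot yet answer. The sketch below is hypothetical throughout: the toy knowledge base entry, the slot names color and pairs_with, and the helper functions are assumptions for illustration, not data from the guide.

```python
# Toy knowledge base; the single entry is an illustrative assumption.
KB = {
    "Cabernet Sauvignon": {"color": "red", "pairs_with": {"grilled meat"}},
}


def wine_color(wine):
    """Return the recorded color, or None if the knowledge base has no answer."""
    return KB.get(wine, {}).get("color")


def goes_with(wine, food):
    """Return True/False from recorded pairings, or None if nothing is recorded."""
    pairs = KB.get(wine, {}).get("pairs_with")
    return None if pairs is None else food in pairs


# Each competency question is paired with the query that should answer it.
COMPETENCY_QUESTIONS = {
    "Is Bordeaux a red or white wine?": lambda: wine_color("Bordeaux"),
    "Does Cabernet Sauvignon go well with seafood?": lambda: goes_with(
        "Cabernet Sauvignon", "seafood"
    ),
}


def unanswered(questions):
    """Return the questions for which the knowledge base yields no answer."""
    return [q for q, ask in questions.items() if ask() is None]
```

Questions returned by unanswered point at gaps in the ontology's scope or level of detail; here the toy knowledge base has no Bordeaux entry, so that question is reported.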
59Step 2 Consider reusing existing ontologies
- It is almost always worth considering what someone else has done and checking if we can refine and extend existing sources for our particular domain and task.
- Reusing existing ontologies may be a requirement if our system needs to interact with other applications that have already committed to particular ontologies or controlled vocabularies.
- Many ontologies are already available in electronic form and can be imported into the ontology-development environment that you are using.
- The formalism in which an ontology is expressed often does not matter, since many knowledge-representation systems can import and export ontologies.
- Even if a knowledge-representation system cannot work directly with a particular formalism, the task of translating an ontology from one formalism to another is usually not a difficult one.
60Step 2 Consider reusing existing ontologies
- There are libraries of reusable ontologies on the Web and in the literature.
- For example, the Ontolingua ontology library (http://www.ksl.stanford.edu/software/ontolingua/) or
- the DAML ontology library (http://www.daml.org/ontologies/).
- There are also a number of publicly available commercial ontologies (e.g., UNSPSC (www.unspsc.org), RosettaNet (www.rosettanet.org), DMOZ (www.dmoz.org)).
- For example, a knowledge base of French wines may already exist.
- By importing this knowledge base and the ontology on which it is based, we will have not only the classification of French wines but also the first pass at the classification of wine characteristics used to distinguish and describe the wines.
- Lists of wine properties that customers use when buying wines may already be available from commercial Web sites such as www.wines.com.
- Here we will assume that no relevant ontologies already exist and start developing the ontology from scratch.
61Step 3 Enumerate important terms in the ontology
- It is useful to write down a list of all terms we would like either to make statements about or to explain to a user.
- What are the terms we would like to talk about?
- What properties do those terms have?
- What would we like to say about those terms?
- For example, important wine-related terms will include
- wine, grape, winery, location, a wine's color, body, flavor and sugar content,
- different types of food, such as fish and red meat,
- subtypes of wine such as white wine, and so on.
- Initially, it is important to get a comprehensive list of terms without worrying about overlap between the concepts they represent, relations among the terms, any properties that the concepts may have, or whether the concepts are classes or slots.
62Step 3 Enumerate important terms in the ontology
- The next two steps are closely intertwined. It is
hard to do one of them first and then do the
other:
- developing the class hierarchy and
- defining properties of concepts (slots)
- Typically, we create a few definitions of the
concepts in the hierarchy and then continue by
describing properties of these concepts and so
on.
- These two steps are also the most important steps
in the ontology-design process.
63Step 4 Define the classes and the class hierarchy
- Approaches in developing a class hierarchy:
- A top-down development process starts with the
definition of the most general concepts in the
domain and subsequent specialization of the
concepts.
- A bottom-up development process starts with the
definition of the most specific classes, the
leaves of the hierarchy, with subsequent grouping
of these classes into more general concepts.
- A combination development process is a
combination of the top-down and bottom-up
approaches: we define the more salient concepts
first and then generalize and specialize them
appropriately.
64Different levels of generality
65Step 4 Define the classes and the class hierarchy
- None of these three methods is inherently better
than any of the others. The approach to take
depends strongly on the personal view of the
domain.
- Whichever approach we choose, we usually start by
defining classes.
- From the list created in Step 3, we select the
terms that describe objects having independent
existence rather than terms that describe these
objects.
- We organize the classes into a hierarchical
taxonomy by asking whether, by being an instance of
one class, the object will necessarily (i.e., by
definition) be an instance of some other class.
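The "is an instance of one class necessarily an instance of another?" test can be sketched with ordinary Python classes. This is only an illustration, not the slides' own tooling; the class names are borrowed from the wine example and Winery is included as an unrelated concept.

```python
class Wine:
    """Most general concept in this sketch."""


class RedWine(Wine):
    """Every red wine is, by definition, a wine."""


class Beaujolais(RedWine):
    """Every Beaujolais is, by definition, a red wine."""


class Winery:
    """An independent concept: a winery is not a kind of wine."""


bottle = Beaujolais()

# An instance of Beaujolais is necessarily an instance of its superclasses.
print(isinstance(bottle, RedWine))  # True
print(isinstance(bottle, Wine))     # True
# It is not an instance of an unrelated class.
print(isinstance(bottle, Winery))   # False
```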
66Step 5 Define the properties of classes (slots)
- We have already selected classes from the list of
terms we created in Step 3.
- Most of the remaining terms are likely to be
properties of these classes.
- These terms include, for example, a wine's color,
body, flavor and sugar content, and the location of
a winery.
- For each property in the list, we must determine
which class it describes.
- These properties become slots attached to
classes.
- Thus, the Wine class will have the following
slots: color, body, flavor, and sugar. The
class Winery will have a location slot.
67Step 5 Define the properties of classes (slots)
- In general, there are several types of object
properties that can become slots in an ontology:
- intrinsic properties such as the flavor of a
wine
- extrinsic properties such as a wine's name and
the area it comes from
- parts, if the object is structured; these can be
both physical and abstract parts (e.g., the
courses of a meal)
- relationships to other individuals; these are the
relationships between individual members of the
class and other items (e.g., the maker of a wine,
representing a relationship between a wine and a
winery, and the grape the wine is made from)
68Step 5 Define the properties of classes (slots)
- All subclasses of a class inherit the slots of
that class.
- For example, all the slots of the class Wine will
be inherited by all subclasses of Wine, including
Red Wine and White Wine.
- We will add an additional slot, tannin level
(low, moderate, or high), to the Red Wine class.
- The tannin level slot will be inherited by all
the classes representing red wines (such as
Bordeaux and Beaujolais).
- A slot should be attached at the most general
class that can have that property.
- For instance, body and color of a wine should be
attached at the class Wine, since it is the most
general class whose instances will have body and
color.
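Slot inheritance can be sketched with a small dictionary-based frame table. The table layout (`CLASSES`, `parent`, `slots`) and the helper `all_slots` are hypothetical names chosen for this illustration; only the class and slot names come from the wine example.

```python
# Hypothetical mini frame system: each class records its parent
# and the slots attached directly to it.
CLASSES = {
    "Wine": {"parent": None, "slots": ["color", "body", "flavor", "sugar"]},
    "Red Wine": {"parent": "Wine", "slots": ["tannin level"]},
    "White Wine": {"parent": "Wine", "slots": []},
    "Bordeaux": {"parent": "Red Wine", "slots": []},
}


def all_slots(name):
    """Return the slots of a class, including those inherited from ancestors."""
    slots = []
    while name is not None:
        # Prepend so that slots of more general classes come first.
        slots = CLASSES[name]["slots"] + slots
        name = CLASSES[name]["parent"]
    return slots


print(all_slots("Bordeaux"))
# ['color', 'body', 'flavor', 'sugar', 'tannin level']
print(all_slots("White Wine"))
# ['color', 'body', 'flavor', 'sugar']
```

Because tannin level is attached to Red Wine rather than Wine, White Wine never sees it, which is the point of attaching each slot at the most general class that can have the property.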
69Step 6 Define the facets of the slots
- Slots can have different facets describing:
- the value type,
- allowed values,
- the number of the values (cardinality), and
- other features of the values the slot can take.
- For example:
- The value of a name slot (as in the name of a
wine) is one string.
- A slot produces (as in a winery produces these
wines) can have multiple values, and the values
are instances of the class Wine.
70Step 6 Define the facets of the slots
- Slot cardinality defines how many values a slot
can have.
- Some systems distinguish only between single
cardinality and multiple cardinality.
- The body of a wine will be a single-cardinality
slot (a wine can have only one body).
- Wines produced by a particular winery fill in a
multiple-cardinality slot produces for a Winery
class.
- Some systems allow specification of a minimum and
maximum cardinality to describe the number of
slot values more precisely.
- The grape slot of a Wine has a minimum
cardinality of 1: each wine is made of at least
one variety of grape.
- The maximum cardinality for the grape slot for
single varietal wines is 1: these wines are made
from only one variety of grape.
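A minimum/maximum cardinality facet amounts to a bound on the number of values in a slot. The following is a minimal sketch; the function name `check_cardinality` and its parameters are assumptions made for illustration, not part of any particular ontology tool.

```python
def check_cardinality(values, minimum=0, maximum=None):
    """Return True if the number of slot values lies within the facet's bounds.

    A maximum of None means the slot has no upper bound.
    """
    count = len(values)
    if count < minimum:
        return False
    return maximum is None or count <= maximum


# The grape slot of any Wine: at least one grape variety.
print(check_cardinality(["Gamay"], minimum=1))                # True
print(check_cardinality([], minimum=1))                       # False

# The grape slot of a single varietal wine: exactly one grape variety.
print(check_cardinality(["Gamay"], minimum=1, maximum=1))     # True
print(check_cardinality(["Merlot", "Cabernet Franc"], 1, 1))  # False
```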
71Step 6 Define the facets of the slots
- Slot-value type describes what types of values
can fill in the slot. Here is a list of the more
common value types:
- String
- Number
- Boolean
- Enumerated
- Instance-type
73Step 6 Define the facets of the slots
- Domain and range of a slot:
- Allowed classes for slots of type Instance are
often called the range of a slot.
- For example, the class Wine is the range of the
produces slot.
- The classes to which a slot is attached, or the
classes whose property a slot describes, are
called the domain of the slot.
- The Winery class is the domain of the produces
slot.
74Step 6 Define the facets of the slots
- Basic rules for determining a domain and a range
of a slot:
- When defining a domain or a range for a slot,
find the most general class or classes that can
be, respectively, the domain or the range for the
slot.
- If a list of classes defining a range or a
domain of a slot includes a class and its
subclass, remove the subclass.
- If a list of classes defining a range or a
domain of a slot contains all subclasses of a
class A, but not the class A itself, the list
should contain only the class A and not the
subclasses.
- If a list of classes defining a range or a
domain of a slot contains all but a few
subclasses of a class A, consider whether the
class A would make a more appropriate range
definition.
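The second and third rules above are mechanical enough to sketch in code. This is an illustrative sketch under simplifying assumptions: the `PARENT` table is a hypothetical single-inheritance hierarchy, and the "all subclasses" rule is applied to direct subclasses only.

```python
# Hypothetical hierarchy: class name -> direct superclass (None for a root).
PARENT = {
    "Wine": None,
    "Red Wine": "Wine",
    "White Wine": "Wine",
    "Rose Wine": "Wine",
    "Winery": None,
}


def ancestors(name):
    """Return every direct and indirect superclass of a class."""
    result = set()
    name = PARENT[name]
    while name is not None:
        result.add(name)
        name = PARENT[name]
    return result


def simplify(classes):
    """Simplify a domain or range list using the two removal rules."""
    classes = set(classes)
    # Rule: if a class and its subclass are both listed, remove the subclass.
    classes = {c for c in classes if not (ancestors(c) & classes)}
    # Rule: if all direct subclasses of A are listed but A is not, list A instead.
    for a in PARENT:
        children = {c for c, p in PARENT.items() if p == a}
        if children and children <= classes:
            classes = (classes - children) | {a}
    return classes


print(simplify(["Wine", "Red Wine"]))                     # {'Wine'}
print(simplify(["Red Wine", "White Wine", "Rose Wine"]))  # {'Wine'}
```

The last rule (all but a few subclasses) is a judgment call for the modeler, so it is deliberately left out of the sketch.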
75Step 7 Create instances
- Defining an individual instance of a class
requires:
- choosing a class,
- creating an individual instance of that class,
and
- filling in the slot values.
- For example, we can create an individual instance
Chateau-Morgon-Beaujolais to represent a specific
type of Beaujolais wine.
- Chateau-Morgon-Beaujolais is an instance of the
class Beaujolais representing all Beaujolais
wines.
- This instance has the following slot values
defined:
- Body: Light
- Color: Red
- Flavor: Delicate
- Tannin level: Low
- Grape: Gamay (instance of the Wine grape class)
- Maker: Chateau-Morgon (instance of the Winery
class)
- Region: Beaujolais (instance of the Wine-Region
class)
- Sugar: Dry
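The three steps (choose a class, create an instance, fill in the slot values) can be sketched with Python dataclasses. The field names are assumptions derived from the slot names above; for brevity, grape and region are plain strings here, although in the ontology they are instances of the Wine grape and Wine-Region classes.

```python
from dataclasses import dataclass


@dataclass
class Winery:
    name: str


@dataclass
class Beaujolais:
    name: str
    body: str
    color: str
    flavor: str
    tannin_level: str
    grape: str     # simplified: an instance of the Wine grape class in the ontology
    maker: Winery  # an instance-type slot whose value is a Winery instance
    region: str    # simplified: an instance of the Wine-Region class in the ontology
    sugar: str


chateau_morgon = Winery(name="Chateau-Morgon")

wine = Beaujolais(
    name="Chateau-Morgon-Beaujolais",
    body="Light",
    color="Red",
    flavor="Delicate",
    tannin_level="Low",
    grape="Gamay",
    maker=chateau_morgon,
    region="Beaujolais",
    sugar="Dry",
)

print(wine.maker.name)  # Chateau-Morgon
```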
76Defining Classes and a Class Hierarchy
- As we have mentioned before, there is no single
correct class hierarchy for any given domain.
- The hierarchy depends on:
- the possible uses of the ontology,
- the level of detail that is necessary for the
application,
- personal preferences, and sometimes
- requirements for compatibility with other models.
- Here, we discuss several guidelines to keep in
mind when developing a class hierarchy.
77Ensuring that the class hierarchy is correct
- An is-a relation
- A single wine is not a subclass of all wines.
- For example, it is wrong to define a class Wines
and a class Wine as a subclass of Wines.
- Transitivity of the hierarchical relations
- If B is a subclass of A and C is a subclass of B,
then C is a subclass of A.
- Evolution of a class hierarchy
- Maintaining a consistent class hierarchy may
become challenging as domains evolve.
- Classes and their names
- Classes represent concepts in the domain and not
the words that denote these concepts.
- Synonyms for the same concept do not represent
different classes.
- Avoiding class cycles
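Transitivity and cycle avoidance can both be checked by walking the subclass links. The sketch below assumes a hypothetical `parents` mapping from each class name to a list of its direct superclasses; the function name `superclasses` is likewise an illustration, not an API of any ontology editor.

```python
def superclasses(cls, parents):
    """Return all direct and indirect superclasses of cls.

    Raises ValueError if cls turns out to be its own superclass,
    that is, if there is a class cycle through cls.
    """
    seen = set()
    stack = list(parents.get(cls, []))
    while stack:
        current = stack.pop()
        if current == cls:
            raise ValueError(f"class cycle through {cls}")
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


# Transitivity: C is a subclass of B, B is a subclass of A,
# so A is among the superclasses of C.
hierarchy = {"C": ["B"], "B": ["A"]}
print(superclasses("C", hierarchy) == {"A", "B"})  # True

# A cycle: A is a subclass of B and B is a subclass of A.
cyclic = {"A": ["B"], "B": ["A"]}
try:
    superclasses("A", cyclic)
except ValueError as error:
    print(error)  # class cycle through A
```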
78Analyzing siblings in a class hierarchy
- Siblings in a class hierarchy
- Siblings in the hierarchy are classes that are
direct subclasses of the same class.
- All the siblings in the hierarchy (except for the
ones at the root) must be at the same level of
generality.
- How many is too many and how few is too few?
- If a class has only one direct subclass, there may
be a modeling problem or the ontology is not
complete.
- If there are more than a dozen subclasses for a
given class, then additional intermediate
categories may be necessary.
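The two sibling-count guidelines can be turned into a simple lint pass over a hierarchy. This is a rough sketch: the `children` mapping and the function name `sibling_warnings` are hypothetical, and the threshold of twelve simply encodes "more than a dozen". The warnings are hints for a modeler to review, not errors.

```python
def sibling_warnings(children, too_many=12):
    """Flag classes with exactly one, or more than too_many, direct subclasses."""
    warnings = {}
    for cls, subclasses in children.items():
        if len(subclasses) == 1:
            warnings[cls] = "only one direct subclass"
        elif len(subclasses) > too_many:
            warnings[cls] = "more than a dozen direct subclasses"
    return warnings


children = {
    "Wine": ["Red Wine", "White Wine", "Rose Wine"],
    "Rose Wine": ["White Zinfandel"],  # a single child: worth a second look
}
print(sibling_warnings(children))
# {'Rose Wine': 'only one direct subclass'}
```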
80Multiple inheritance
- A class can be a subclass of several classes.
- Suppose we would like to create a separate class
of dessert wines, the Dessert wine class.
- Port wine is both a red wine and a dessert
wine.
- All instances of the Port class will be instances
of both the Red wine class and the Dessert wine
class.
- Thus, the Port class will inherit the value SWEET
for the slot Sugar from the Dessert wine class, and
the tannin level slot and the value for its color
slot from the Red wine class.
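Python supports multiple inheritance directly, so the Port example can be sketched as follows. The attribute names are assumptions standing in for the slots; class attributes are used here to play the role of inherited slot values.

```python
class Wine:
    pass


class RedWine(Wine):
    color = "red"
    tannin_level = None  # slot declared here; the value is filled in lower down


class DessertWine(Wine):
    sugar = "sweet"


class Port(RedWine, DessertWine):
    """Port inherits from both the Red wine and the Dessert wine classes."""


port = Port()
print(port.color, port.sugar)          # red sweet
print(isinstance(port, RedWine))       # True
print(isinstance(port, DessertWine))   # True
```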
81When to introduce a new class (or not)
- Rules of thumb
- Subclasses of a class usually (1) have additional
properties that the superclass does not have, or
(2) have restrictions different from those of the
superclass, or (3) participate in different
relationships than the superclass.
- In other words, we introduce a new class in the
hierarchy usually only when there is something
that we can say about this class that we cannot
say about the superclass.
- Classes in terminological hierarchies do not have
to introduce new properties.
- For example, an ontology underlying an electronic
medical-record system may include a
classification of various diseases. This
classification may be just that: a hierarchy of
terms, without properties (or with the same set
of properties). In that case, it is still useful
to organize the terms in a hierarchy rather than
a flat list because it will (1) allow easier
exploration and navigation and (2) enable a
doctor to easily choose a level of generality of
the term that is appropriate for the situation.
- We should not create subclasses of a class for
each additional restriction.
82A new class or a property value?
- When modeling a domain, we often need to decide
whether to model a specific distinction (such as
white, red, or rosé wine) as a property value or
as a set of classes. The choice again depends on
the scope of the domain and the task at hand.
- Do we create a class White wine or do we simply
create a class Wine and fill in different values
for the slot color?
- The answer usually lies in the scope that we
defined for the ontology.
83A new class or a property value?
- If the concepts with different slot values become
restrictions for different slots in other
classes, then we represent the distinction as
classes. Otherwise, we represent the distinction
in a slot value.
- If a distinction is important in the domain and
we think of the objects with different values for
the distinction as different kinds of objects,
then we should create a new class for the
distinction.
- Our wine ontology has such classes as Red Merlot
and White Merlot, rather than a single class for
all Merlot wines.
- A class to which an individual instance belongs
should not change often.
- Chilled wine should not be a class in an ontology
describing wine bottles in a restaurant.
- Usually numbers, colors, and locations are slot
values and do not cause the creation of new
classes. Wine, however, is a notable exception
since the color of the wine is so paramount to
the description of wine.
84A new class or a property value?
- Consider, for example, the human-anatomy
ontology. When we represent ribs,
- do we create a class for each of the 1st left
rib, 2nd left rib, and so on? Or
- do we have a class Rib with slots for the order
and the lateral position (left-right)?
- If the information about each of the ribs that we
represent in the ontology is significantly
different, then we should indeed create a class
for each of the ribs.
- If we are modeling anatomy at a slightly lesser
level of generality, and all ribs are very
similar as far as our potential applications are
concerned, we may want to simplify our hierarchy
and have just the class Rib, with two slots:
lateral position and order.
85An instance or a class?
- Deciding where classes end and individual
instances begin starts with deciding what is the
lowest level of granularity in the
representation.
- The level of granularity is in turn determined by
a potential application of the ontology.
- Individual instances are the most specific
concepts represented in a knowledge base.
- If concepts form a natural hierarchy, then we
should represent them as classes.
86Hierarchy of wine regions. The "A" icons next to
class names indicate that the classes are
abstract and cannot have any direct instances.
87Limiting the scope
- Helpful rules in deciding when an ontology
definition is complete:
- The ontology should not contain all the possible
information about the domain: you do not need to
specialize (or generalize) more than you need for
your application (at most one extra level each
way).
- In our ontology, we certainly do not include all
the properties that a wine or food could have.
We represented the most salient properties of the
classes of items in our ontology.
- Even though wine books would tell us the size of
grapes, we have not included this knowledge.
- Similarly, we have not added all relationships
that one could imagine among all the terms in our
system.
- For example, we do not include relationships such
as favorite wine and favorite food in the
ontology just to allow a more complete
representation of all of the interconnections
between the terms we have defined.
88Disjoint subclasses
- Classes are disjoint if they cannot have any
instances in common.
- For example, the Dessert wine and the White wine
classes in our ontology are not disjoint: there
are many wines that are instances of both.
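A disjointness declaration can be checked against the instances currently in a knowledge base. The sketch below is only that kind of check: the instance names are made up for illustration, and treating Red wine and White wine as disjoint is an assumption of this example. Note that a real disjointness axiom constrains all possible instances, not just the ones present today.

```python
def are_disjoint(instances_a, instances_b):
    """Return True if the two collections of instances share no member."""
    return set(instances_a).isdisjoint(instances_b)


# Hypothetical instance names, chosen only for this illustration.
white_wines = {"sample-sauternes", "sample-chablis"}
dessert_wines = {"sample-sauternes", "sample-port"}
red_wines = {"sample-port", "sample-bordeaux"}

# A sweet white wine belongs to both classes, so they are not disjoint.
print(are_disjoint(white_wines, dessert_wines))  # False
# No known instance is both red and white.
print(are_disjoint(white_wines, red_wines))      # True
```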
89Defining Properties: More Details
- We discuss inverse slots and default values for a
slot.
90Inverse slots
- The two relations, maker and produces, are called
inverse relations.
- If a wine was produced by a winery, then the
winery produces that wine.
- Storing the information in both directions is
redundant.
- Given one direction, an application can always
infer the value for the inverse relation.
- However, from the knowledge-acquisition
perspective it is convenient to have both pieces
of information explicitly available.
91Inverse slots
- Example of inverse slots:
- the maker slot of the Wine class and the produces
slot of the Winery class.
- When a user creates an instance of the Wine class
and fills in the value for the maker slot, the
system automatically adds the newly created
instance to the produces slot of the
corresponding Winery instance.
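The automatic update of an inverse slot can be sketched in a few lines. This is an illustration of the idea rather than how any particular knowledge-representation system implements it; the class and attribute names mirror the maker and produces slots.

```python
class Winery:
    def __init__(self, name):
        self.name = name
        self.produces = []  # multiple-cardinality slot, inverse of Wine.maker


class Wine:
    def __init__(self, name, maker):
        self.name = name
        self.maker = maker
        # Filling in maker automatically updates the inverse slot.
        maker.produces.append(self)


winery = Winery("Chateau-Morgon")
wine = Wine("Chateau-Morgon-Beaujolais", maker=winery)

print([w.name for w in winery.produces])  # ['Chateau-Morgon-Beaujolais']
```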
92Default values
- If a particular slot value is the same for most
instances of a class, we can define this value to
be a default value for the slot.
- Then, when each new instance of a class
containing this slot is created, the system fills
in the default value automatically.
- We can then change the value to any other value
that the facets will allow.
- That is, default values are there for
convenience: they do not enforce any new
restrictions on the model or change the model in
any way.
- For example, if the majority of wines we are
going to discuss are full-bodied wines, we can
have full as a default value for the body of
the wine. Then, unless we say otherwise, all
wines we define would be full-bodied.
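A default value maps naturally onto a default argument, while the allowed-values facet remains a separate check. In this sketch the set `ALLOWED_BODY` is an assumed facet (the slides name only light and full explicitly), and the class is a simplified stand-in for the Wine class.

```python
# Assumed allowed-values facet for the body slot; "medium" is a guess.
ALLOWED_BODY = {"light", "medium", "full"}


class Wine:
    def __init__(self, name, body="full"):
        # The default only saves typing; the facet is what restricts values.
        if body not in ALLOWED_BODY:
            raise ValueError(f"body must be one of {sorted(ALLOWED_BODY)}")
        self.name = name
        self.body = body


print(Wine("Wine A").body)                # full
print(Wine("Wine B", body="light").body)  # light
```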
93What's in a Name?
- Defining naming conventions for concepts in an
ontology and then strictly adhering to these
conventions not only makes the ontology easier to
understand but also helps avoid some common
modeling mistakes.
- We need to:
- Define a naming convention for classes and slots
and adhere to it.
- Features of a knowledge representation system
affect the choice of naming conventions:
- Does the system have the same name space for
classes, slots, and instances?
- Is the system case-sensitive?
- What delimiters does the system allow in the
names?
94Capitalization and delimiters
- We can greatly improve the readability of an
ontology if we use consistent capitalization for
concept names.
- For example, it is common to capitalize class
names and use lower case for slot names (assuming
the system is case-sensitive).
- When a concept name contains more than one word
we need to delimit the words. Here are some
possible choices:
- Use a space: Meal course
- Run the words together and capitalize each new
word: MealCourse
- Use an underscore or dash or other delimiter in
the name: Meal_Course, Meal_course, Meal-Course,
Meal-course.
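Whichever delimiter convention is chosen, applying it mechanically helps keep names consistent. The helpers below are a small sketch with hypothetical function names; they normalize a multi-word concept name into the run-together style or a delimited style.

```python
import re


def words(name):
    """Split a concept name on spaces, underscores, and dashes."""
    return [w for w in re.split(r"[ _-]+", name) if w]


def run_together(name):
    """Run the words together, capitalizing each word: MealCourse."""
    return "".join(w.capitalize() for w in words(name))


def with_delimiter(name, delimiter="_"):
    """Join capitalized words with a delimiter: Meal_Course or Meal-Course."""
    return delimiter.join(w.capitalize() for w in words(name))


print(run_together("Meal course"))         # MealCourse
print(with_delimiter("Meal course"))       # Meal_Course
print(with_delimiter("meal-course", "-"))  # Meal-Course
```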
95Singular or plural
- A class name represents a collection of objects.
- For example, a class Wine actually represents all
wines. - Therefore, i