Title: An Ontology Creation Methodology: A Phased Approach
1 An Ontology Creation Methodology: A Phased Approach
- Jon Atle Gulla
- Norwegian University of Science and Technology, Norway
- jag_at_idi.ntnu.no
- Vijay Sugumaran
- Oakland University, USA
- sugumara_at_oakland.edu
2 Agenda
- Ontology development
- Traditional ontology learning
- Limitations of ontology learning
- A phased approach to ontology learning
3 The Challenge
- How to develop large complex ontologies?
- How to keep ontologies updated in dynamic domains?
4 Ontology Modeling vs. Learning
- Traditional ontology engineering approach
- Project: Form team of ontology and domain experts
- Ontology and domain experts: Collaborative manual modeling process
- Domain experts: Verify ontology against domain knowledge
- Ontology experts: Verify ontology against syntactic and semantic quality measures
- Expensive and time-consuming approach
- Stable domains assumed
- Ontology learning approach
- Domain experts: Find representative domain text
- Tool: Extract candidate classes, individuals and properties automatically from domain texts
- Ontology and domain experts: Verify candidate structures and complete ontology
- Can also be used to verify domain quality of existing ontology
- Cost-effective approach
- Not unproblematic in dynamic domains
5 Agenda
- Ontology development
- Traditional ontology learning
- Limitations of ontology learning
- A phased approach to ontology learning
6 Ontology Learning Basis
- People communicate using domain-specific concepts
- People document using domain-specific concepts
- Ontology learning: Extract ontology structures from written documentation
- Requirements
- Documents representative for domain terminology
- Documents cover all the terminology
- Well-defined and consistent use of terminology in domain
[Figure labels: Realm of ontology engineering; Ontology discussions; Realm of ontology learning; Ontology in use]
7 Levels of Ontology Learning
Levels in decreasing degree of difficulty:
- Rules: ∀x,y (manager(x,y) → report(y,x))
- Relations: FINANCE(ag: SPONSOR, go: PROJECT)
- Concept hierarchies: is_a(MANAGER, EMPLOYEE)
- Concepts: PROJECT
- Synonyms: (leader, manager, lead)
- Terms: sponsors, costs, charter
8 Ontology Learning Strategies
- Term extraction
- Linguistic analysis
- Statistical analysis
- Synonyms
- Classification-based techniques
- Distribution-based techniques
- Concept formation
- Structure recognition
- Keyphrase generation
- Instance learning
- Concept hierarchy
- Clustering
- Lexico-syntactic patterns
- Head-modifier approaches
- Subsumption approaches
- Classification-based techniques
- Relations
- Association rules
- Concept vectors
- Rules
- Structure recognition for meta-property recognition
- Dependency trees and path similarities
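One of the concept-hierarchy strategies listed above, lexico-syntactic patterns, can be illustrated with a minimal sketch. It assumes a single "X such as A, B and C" pattern over plain English text. The function name `hearst_is_a` and the sample sentence are hypothetical, and real systems use many more patterns plus linguistic preprocessing.

```python
import re

# One lexico-syntactic pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Single-word terms only; this is an illustrative simplification.
SEPARATOR = r", and |, or |, | and | or "
PATTERN = re.compile(r"(\w+) such as (\w+(?:(?:" + SEPARATOR + r")\w+)*)")


def hearst_is_a(text):
    """Return (hyponym, hypernym) candidate pairs found by the pattern."""
    pairs = []
    for match in PATTERN.finditer(text):
        hypernym = match.group(1)
        for hyponym in re.split(SEPARATOR, match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs


# Hypothetical sample sentence, not taken from the project data.
sample = "The service lists genres such as drama, comedy and thriller."
candidates = hearst_is_a(sample)
for hyponym, hypernym in candidates:
    print(f"is_a({hyponym.upper()}, {hypernym.upper()})")
```

Each pair is only a candidate for the concept hierarchy; an ontology or domain expert would still need to verify it.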
9 Ontology Learning Process
[Diagram labels]
- Domain text: PMBOK
- Automatic extraction of concept and relationship candidates
- Concept candidates: Scope management, WBS, Business need, Constituent components, Product description, ...
- Abstract elements, Constraints, Properties, Rules
- Reference set
- Manual selection of candidates and completion of model
- Search ontology
10 Ex 1. Learning Concept/Individual Candidates
Example sentence: "Scope planning is the process of progressively elaborating and documenting the project work (project scope) that produces the product of the project."
- POS tagging
- Stopword removal (571 words)
- Lemmatization/stemming (POS tags not shown)
- Select consecutive nouns as candidate phrases
- Calculate tf.idf score for phrases
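The steps above can be sketched roughly as follows. This is a minimal illustration under several assumptions: the POS tags are written by hand (Penn Treebank style) instead of coming from a tagger, the stopword list is a tiny stand-in for the 571-word list, lemmatization is reduced to lowercasing, any stopword or non-noun ends a noun run, and the second document is invented so that the idf term has something to compare against.

```python
import math
from collections import Counter

# Tiny stand-in for the 571-word stopword list (assumption).
STOPWORDS = {"is", "the", "of", "and", "that"}

# Hand-tagged version of the example sentence (tags are an assumption).
TAGGED = [
    ("Scope", "NN"), ("planning", "NN"), ("is", "VBZ"), ("the", "DT"),
    ("process", "NN"), ("of", "IN"), ("progressively", "RB"),
    ("elaborating", "VBG"), ("and", "CC"), ("documenting", "VBG"),
    ("the", "DT"), ("project", "NN"), ("work", "NN"), ("(", "("),
    ("project", "NN"), ("scope", "NN"), (")", ")"), ("that", "WDT"),
    ("produces", "VBZ"), ("the", "DT"), ("product", "NN"), ("of", "IN"),
    ("the", "DT"), ("project", "NN"), (".", "."),
]


def candidate_phrases(tagged):
    """Collect runs of consecutive nouns as candidate phrases."""
    phrases, run = [], []
    for word, tag in tagged:
        if tag.startswith("NN") and word.lower() not in STOPWORDS:
            run.append(word.lower())  # lowercasing stands in for lemmatization
        else:
            if run:
                phrases.append(" ".join(run))
            run = []
    if run:
        phrases.append(" ".join(run))
    return phrases


def tf_idf(docs):
    """docs: list of phrase lists. Returns one {phrase: score} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter(p for doc in docs for p in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        scores.append({
            p: (count / len(doc)) * math.log(n_docs / doc_freq[p])
            for p, count in term_freq.items()
        })
    return scores


phrases = candidate_phrases(TAGGED)
# Second document is hypothetical, added only to make idf meaningful.
scores = tf_idf([phrases, ["project", "cost estimate"]])
for phrase, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{phrase}: {score:.3f}")
```

Phrases that occur in every document, such as "project" here, score zero, which is how tf.idf pushes domain-generic terms down the candidate list.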
11 Classes Relevant to the Drama Genre
- Data sources: IMDB, Wikipedia, Videoload
- Keyphrase extraction technique
- Noun phrases ranked according to various statistical measures
12 Ex 2. Learning Relationship Candidates
13 Relationships Relevant to Drama Genre
- Association rules on extracted concepts
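As a rough illustration of association rules over extracted concepts, the sketch below counts how often two concepts occur in the same document and keeps the directed rules that pass support and confidence thresholds. The function name `association_rules`, the thresholds, and the four sample documents are assumptions made for this example, not the project's data or settings.

```python
from collections import Counter
from itertools import permutations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for concept pairs.

    transactions: one set of extracted concepts per document.
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for concepts in transactions:
        items = set(concepts)
        item_count.update(items)
        # Ordered pairs, so a -> b and b -> a are scored separately.
        pair_count.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), together in pair_count.items():
        support = together / n
        confidence = together / item_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Hypothetical concept sets extracted from four documents.
documents = [
    {"actor", "director", "drama"},
    {"actor", "drama"},
    {"director", "producer"},
    {"actor", "director"},
]
rules = association_rules(documents)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}  support={support:.2f}  confidence={confidence:.2f}")
```

A rule such as drama -> actor only says that the two concepts tend to co-occur. Naming the relationship and deciding whether it belongs in the ontology remains a manual step.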
14 Automatic OWL Generation
15 Agenda
- Ontology development
- Traditional ontology learning
- Limitations of ontology learning
- A phased approach to ontology learning
16 Limitations of Ontology Learning
- Different techniques produce different results
- Different data sources produce different results
- Lost control over process
- Extensive verification of final ontology needed
- New data hard to combine with old data
17 Agenda
- Ontology development
- Traditional ontology learning
- Limitations of ontology learning
- A phased approach to ontology learning
18 Ontology Learning for Entertainment Domain
- Ontology evolution for Deutsche Telekom's Videoload download service
- What does Brangelina mean?
- Should Pitt be Brad Pitt or Michael Pitt?
- Actor vs. Schauspieler?
- All movies of Brad Pitt?
- Last movie of Pitt?
19 Ontology Learning Project
- Duration: Nov 2007 to Nov 2009
- Domain: movie download service
- Ontology analysis and creation based on indexed noun phrases from movie documents
- Ontology used for search and navigation on top of FAST search platform
- Ontology learning challenges
- Domain changes from one day to another
- No consistent domain terminology
- No professional domain terminology
- Multiple languages
- Movies about anything... unlimited domain
- Ontology needs to be up to date to support search
20 Ontology Workbench
- 3 phases that are carried out independently
- Crawling into Lucene indices
- Supervised extraction of candidates
- Combining candidates into ontology structures
21 Interactive Ontology Development
Expandable indices
Subset of data source
Focus of analysis
List of techniques
Partial results
Stored results
Set operations for combining results
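The idea of combining stored results with set operations can be sketched as below. The result names, their contents, and the helper `combine` are hypothetical. The sketch only assumes that each technique and data source leaves behind a stored set of candidate terms.

```python
from functools import reduce

# Hypothetical stored results, one candidate set per technique and data source.
stored_results = {
    "tfidf_imdb": {"actor", "director", "drama", "plot"},
    "tfidf_wikipedia": {"actor", "drama", "genre", "award"},
    "keyphrase_videoload": {"actor", "download", "genre"},
}


def combine(operation, *names):
    """Apply a set operation left to right over the named stored results."""
    return reduce(operation, (stored_results[name] for name in names))


# Candidates every run agrees on.
agreed = combine(set.intersection, "tfidf_imdb", "tfidf_wikipedia", "keyphrase_videoload")
# Everything any run proposed.
everything = combine(set.union, "tfidf_imdb", "tfidf_wikipedia", "keyphrase_videoload")
# Candidates found in IMDB but not in Wikipedia.
imdb_only = combine(set.difference, "tfidf_imdb", "tfidf_wikipedia")

print(sorted(agreed))
print(sorted(everything))
print(sorted(imdb_only))
```

Keeping partial results as separate sets lets a new crawl or technique be added later without recomputing the earlier ones.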
22 Thank you