OnToKnowledge - PowerPoint PPT Presentation

Provided by: yor90

Title: OnToKnowledge

1

On-To-Knowledge
IST-1999-10132
Content-driven Knowledge Management through
Evolving Ontologies
Rob Engels, Dieter Fensel, Frank van
Harmelen, Victor Iosif, Arjohn Kampman, Uwe
Krohn, Ulrich Reimer, Rudi Studer and York Sure
www.ontoknowledge.org

2
Contents

The overall goals
The overall architecture and language
Ontology building and instantiation
Storing and manipulating meta information
Querying the semantic web
Case Studies
Conclusions

3
1. The overall goals

The competitiveness of companies active in areas
with high change rate depends heavily on how they
maintain and access their knowledge.
Large companies have intranets with several
million pages. Finding, creating and maintaining
information is a rather hard problem in this
weakly structured representation medium.
Knowledge Management is about acquiring,
maintaining, and accessing knowledge of an
organization.

4
The overall goals

With the large number of on-line documents,
several document management systems arose.
However, these systems have severe weaknesses:
Word matching as search method.
Information retrieval instead of query answering.
Document exchange between departments is only
possible with severe effort.
Different views on documents are not supported.
Information maintenance is not supported.

5
The overall goals

Ontologies will allow structural and semantic
definitions of documents, providing completely new
possibilities compared with existing document
management systems:
Intelligent search instead of keyword matching.
Query answering instead of information retrieval.
Document exchange between departments via
transformation operators.
Definition of views on documents.
Support of information maintenance.

6
The overall goals

The goal of the On-To-Knowledge project is to
support efficient and effective knowledge
management. It focuses on acquiring,
representing, and accessing weakly-structured
on-line information sources.

Acquiring: Text mining and extraction techniques
are applied to extract semantic information from
textual information.
Representing: XML, RDF, and OIL are used for
describing syntax and semantics of
semi-structured information sources.
Accessing: Novel semantic web search technology
and knowledge sharing facilities.

7
2. The overall architecture and language

The On-To-Knowledge tool suite consists of:
an Ontology-based knowledge sharing facility
an Ontology-based presentation platform
an Ontology-based search engine
an Ontology editor and semi-automatic Ontology
construction tools
inference engines and query engine for meta data
and schema information
persistent storage of Ontologies and meta data
and
extraction tools for meta data.

8
The overall architecture and language

Open architecture, maximal reliance on existing
standards:
XML, RDF, HTTP, SOAP, JDBC, RQL, ...
Client-server approach:
allows tools to be used over the Internet
requires minimal installation of tools locally

All tools use DAML+OIL as ontology language.
OIL Core is the minimum requirement.
Tool scalability is targeted at supporting:
O(10^3) classes
O(10^5) data statements

9
The overall architecture and language
10
The overall architecture and language: OIL
(Ontology Inference Layer)

RDF provides a simple data model for representing
formal semantics of information, i.e.
meta-information.

RDF Schema defines a simple ontology modeling
language on top of RDF that can be used to define
the vocabulary and structure of meta information.
OIL adds a simple Description Logic to RDF
Schema. It allows axioms to be defined that
logically describe classes, properties, and their
hierarchies.
Currently, a sub-dialect of OIL called DAML+OIL
is the starting point for a W3C web ontology
language standardization group, which should
start soon.

11
The overall architecture and language: OIL
(Ontology Inference Layer)
Credits: Thanks to Ian Horrocks from Manchester!
12
The overall architecture and language: OIL
(Ontology Inference Layer)

OIL provides a layered architecture that offers
different layers of complexity

13
3. Ontology building and instantiation

OntoEdit: Manual building of Ontologies.
OntoExtract: Semi-automatic Ontology construction
from natural language sources.
OntoWrapper: Semi-automatic Ontology construction
from semi-structured and structured information
sources.

14
Ontology building and instantiation: OntoEdit

OntoEdit is a graphical Ontology Engineering
Environment.
It helps in creating, modifying and browsing
ontologies.
It is flexible and expandable through a plug-in
framework.

15
Ontology building and instantiation: OntoEdit
16
Ontology building and instantiation: CORPORUM

Structured documents: OntoWrapper and
screen-scraping extract information from places
on specific sites (e.g. names, email addresses,
telephone numbers).
Unstructured documents: OntoExtract extracts
initial ontologies from natural language on web
pages. OntoExtract is able to:
provide initial ontologies,
refine existing ontologies,
find relations between key terms in documents,
find instances of concepts within documents.

17
Ontology building and instantiation: CORPORUM

CORPORUM's linguistic functionality is based on a
tokenizer, a morphologic component, and a
relation-determining engine.
This allows CORPORUM to extract concepts from
texts that are more than just words; concepts can
also be generated by the engine.
Relations between such concepts are defined (e.g.
subClassOf relations, or InstanceOf relations).
Through semantic analysis of a domain, the tool
can automatically generate thesauri of words
within a domain.
Visualisation of such semantic structures can
then be used for navigation and browsing through
document sets.

18
Corporum-OntoExtract

OntoExtract allows for analysis of natural
language.
The component exports its Central Concept Area in
the form of a light-weight ontology (syntax:
DAML+OIL).
It finds classes, sub-class relationships and
instances.
Finally, there are cross-taxonomic relations
describing relations between concepts that are
often not easily recovered from standard
ontologies.

19
Corporum-OntoExtract

How does OntoExtract currently work? It
parses, tokenizes and analyses text,
generates nodes and relations between them,
enhances specific aspects of the discovered
knowledge items using a background repository
(containing general knowledge of the world,
represented in Sesame),
and submits the final analysis results to
the RDFS server Sesame.

20
Sesame domain knowledge
Sesame background knowledge
22
4. Storing and manipulating meta information

Sesame: A repository and querying facility for
RDF Schema.
Functionality:
Persistent storage of RDF data and RDF Schema.
Query engine for RDF Query Language (RQL).
Data upload and download in RDF format.
Communication over HTTP.

World's first!

Features:
Designed for scalability.
Independent of repository types (databases,
files, in-memory data structures, ...).
Modular design allows for other functional
modules.
Architecture allows support for other
communication protocols.

23
Sesame architecture
Protocol handlers: HTTP Handler, ??? Handler
Request Router: routes requests to modules
Functional modules: RDF Admin, RQL Engine, ??? Module
Repository Abstraction Layer: provides database independence
Repository: persistent storage
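
The layering on this slide can be illustrated with a minimal Python sketch. This is not Sesame's actual API: the class names, the method names (`add`, `triples`, `handle`) and the request names ("upload", "export") are all assumptions made for illustration. It only shows how a router can dispatch to modules that talk to storage through one abstract interface, so the backend can be swapped.

```python
from typing import Iterable, Protocol, Set, Tuple

Triple = Tuple[str, str, str]


class Repository(Protocol):
    """Hypothetical storage interface standing in for the abstraction layer."""

    def add(self, triple: Triple) -> None: ...

    def triples(self) -> Iterable[Triple]: ...


class InMemoryRepository:
    """One possible backend; a database or file backend would offer the same methods."""

    def __init__(self) -> None:
        self._store: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._store.add(triple)

    def triples(self) -> Iterable[Triple]:
        return iter(self._store)


class RequestRouter:
    """Routes a named request to a functional module (request names are made up)."""

    def __init__(self, repo: Repository) -> None:
        self._modules = {
            "upload": lambda triple: repo.add(triple),
            "export": lambda: sorted(repo.triples()),
        }

    def handle(self, request: str, *args):
        return self._modules[request](*args)


router = RequestRouter(InMemoryRepository())
router.handle("upload", ("ex:doc1", "ex:topic", "ex:Insurance"))
print(router.handle("export"))
```

Because the router only depends on the `Repository` interface, replacing `InMemoryRepository` with another backend would not change the modules.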
24
Sesame: RQL query engine

RQL:
tailored to the RDF graph model
currently the only query language for RDF Schema
semantics

based on OQL, with features like:
Set of core queries (Class, Property, subClassOf,
...)
Filters (select-from-where)
Boolean expressions
Functional composition of queries
supports path expressions:
For navigating the RDF graph model
Allowing mixed RDF data and schema queries
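
To make the idea of a mixed data and schema query concrete, here is a small sketch in Python. It is not RQL syntax and not Sesame code; the triples, the "ex:" names and the function names are hypothetical. It shows a schema query (subclasses), a data query that uses it (instances), and a select-from-where style filter composed from both.

```python
# Hypothetical statements; the "ex:" names are not from the presentation.
TRIPLES = {
    ("ex:Manager", "rdfs:subClassOf", "ex:Employee"),  # schema statement
    ("ex:alice", "rdf:type", "ex:Manager"),            # data statement
    ("ex:bob", "rdf:type", "ex:Employee"),             # data statement
    ("ex:alice", "ex:worksOn", "ex:projectX"),         # data statement
}


def subclasses(cls):
    """Schema query: cls plus all direct and indirect subclasses."""
    result, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in TRIPLES:
            if p == "rdfs:subClassOf" and o == current and s not in result:
                result.add(s)
                todo.append(s)
    return result


def instances(cls):
    """Mixed query: rdf:type data interpreted through the subclass hierarchy."""
    classes = subclasses(cls)
    return {s for s, p, o in TRIPLES if p == "rdf:type" and o in classes}


def select(cls, prop):
    """Select-from-where style filter: (instance, value) pairs for one property."""
    members = instances(cls)
    return sorted((s, o) for s, p, o in TRIPLES if p == prop and s in members)


print(sorted(instances("ex:Employee")))
print(select("ex:Employee", "ex:worksOn"))
```

A plain triple match on `rdf:type ex:Employee` would miss `ex:alice`; interpreting the schema is what makes her an employee.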

25
Sesame: RQL query examples

Basic queries

26
5. Querying the Semantic Web

OTK provides the following for user access:
Querying the Semantic Web: RDFferret
An Ontology-based presentation platform: Spectacle
An Ontology-based knowledge sharing facility: OntoShare

27
Querying the Semantic Web: RDFferret

It is impractical to create RDF annotations that
exhaustively cover the content of a given
document.
RDF searches might produce low recall.
RDF searches produce high precision.
Search both RDF annotations and text content.
Use well-proven IR techniques (ranking, stemming,
...).

Low threshold
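
A rough Python sketch of combining the two kinds of evidence follows. It is not RDFferret's implementation: the documents, the concept names, the scoring rule and the `annotation_boost` weight are assumptions chosen for illustration, and stemming is left out for brevity.

```python
# Hypothetical documents: free text plus a possibly incomplete set of annotations.
DOCS = {
    "doc1": {"text": "pension plans for employees", "concepts": {"ex:Pension"}},
    "doc2": {"text": "overview of pension regulation", "concepts": set()},
    "doc3": {"text": "car insurance pricing", "concepts": {"ex:CarInsurance"}},
}


def search(keyword, concept, annotation_boost=2.0):
    """Rank documents by text match plus a boost for a matching annotation."""
    ranked = []
    for doc_id, doc in DOCS.items():
        score = 0.0
        if keyword.lower() in doc["text"].lower().split():
            score += 1.0               # full-text hit: keeps recall up
        if concept in doc["concepts"]:
            score += annotation_boost  # annotation hit: more precise evidence
        if score > 0:
            ranked.append((score, doc_id))
    # Highest score first; ties broken by document id for a stable order.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in ranked]


print(search("pension", "ex:Pension"))
```

Here the unannotated `doc2` is still found through its text, while the annotated `doc1` ranks first.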
28
Querying the Semantic Web: RDFferret
29
Ontology-based presentation: Spectacle

Spectacle: personalized information disclosure
Personalization is ontology-based
Spectacle can personalize:
the content itself (WHAT)
the content presentation (HOW)
the content organisation/navigation (WHERE)
Example personalizations based on:
Experience: beginner vs. expert user
Role: maintainer vs. end-user
Task: learning vs. problem solving
Etc.

30
Ontology-based presentation: Spectacle
[Figure: ontology-based information presentation. Labels: information sources (RDF, DBMS, docs), classification, classified data, presentation data (profiles, navigation, layout), presentation.]
31
Proactively sharing information: OntoShare

Sharing information through an organisation is a
key knowledge management issue.
OntoShare supports and encourages information
sharing:
User requests OntoShare to share some information.
On sharing, the page is assigned to an ontological
category (class) and matched against each user's
ontology-based profile.
OntoShare automatically extracts keywords and
summaries from the information and suggests
changes to the user profile based on user activity.
OntoShare proactively emails selected users when
information of interest is shared.
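
The matching step can be sketched in Python as follows. This is an assumption-laden illustration, not OntoShare's algorithm: the class hierarchy, the profiles and the rule "notify a user if the profile contains the page's category or one of its ancestors" are all hypothetical.

```python
# Hypothetical ontology fragment (child -> parent) and user profiles.
PARENT = {"ex:LifeInsurance": "ex:Insurance", "ex:Insurance": "ex:Product"}
PROFILES = {
    "anna": {"ex:Insurance"},
    "ben": {"ex:Telecom"},
}


def ancestors_and_self(category):
    """The category itself followed by its ancestors in the class hierarchy."""
    chain = [category]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def users_to_notify(page_category):
    """Users whose profile contains the page's category or one of its ancestors."""
    relevant = set(ancestors_and_self(page_category))
    return sorted(user for user, interests in PROFILES.items() if interests & relevant)


print(users_to_notify("ex:LifeInsurance"))
```

Matching against ancestors means a user interested in a general class is also told about pages filed under its more specific subclasses.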

32
6. Case Studies

Large Intranets: Swiss Life
Customer Relationship Management: BT
Virtual Enterprise: EnerSearch
A general Methodology: AIFB

33
Large Intranets: Swiss Life

Swiss Life is a large insurance company with a
huge intranet and other distributed information
sources.
Efficient knowledge management for this
information is of high strategic importance.
OTK technology is applied in two case studies
with Swiss Life:
Searching a Large Document on IAS (International
Accounting Standard).
Skills Management (SkiM).

34
Searching a Large Document on IAS (International
Accounting Standard)

Goals
Fast and reliable access to relevant passages
of a large document on IAS

Approach
Automatic generation of a light-weight ontology
with weighted semantic associations between
concepts
Use of that ontology to support query
reformulation by adding relevant ontology terms
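
Query reformulation with weighted associations could look roughly like the Python sketch below. The association table, its weights and the `min_weight` and `max_added` thresholds are invented for illustration and do not come from the Swiss Life case study.

```python
# Hypothetical weighted associations between concepts (weights are made up).
ASSOCIATIONS = {
    "lease": {"lessee": 0.8, "asset": 0.5, "interest": 0.2},
    "goodwill": {"acquisition": 0.7},
}


def expand_query(terms, min_weight=0.4, max_added=3):
    """Append the most strongly associated ontology terms to the query."""
    candidates = {}
    for term in terms:
        for related, weight in ASSOCIATIONS.get(term, {}).items():
            if weight >= min_weight and related not in terms:
                # Keep the strongest association if a term is reachable twice.
                candidates[related] = max(weight, candidates.get(related, 0.0))
    best = sorted(candidates, key=lambda t: (-candidates[t], t))[:max_added]
    return list(terms) + best


print(expand_query(["lease"]))
```

Weakly associated terms such as "interest" stay out of the query, which limits the loss of precision that expansion can cause.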

35
Skills Management (SkiM)

Goals
Access the knowledge and skills of employees
Identify, manage, use and advance skills

Approach
Use of manually built ontologies
- to describe skills, job functions, education
in a controlled vocabulary
- to generate annotated homepages from the
skills descriptions
Exploit ontologies for a more specific search on
homepages for people with certain skills

36
Customer Relationship Management: BT

Disseminating Customer Handling Rules
Offer a cost-effective channel for the
dissemination of all sorts of rules and
instructions.
Health & Safety.
Sales scripts for Telesales Representatives.
Timely information on new products and services.
Disseminating Best Practice
Promote behaviours that are acceptably consistent
across call centres.
Help managers to become more aware of Best
Practice resources on the BT Intranet.
Help to build communities of best practice.

37
Customer Relationship Management: BT

Staying Alert: Interest Profiles
Learn a user's interests and preferences
autonomously (with minimal feedback from the
user).
Adapt to changing needs of the user over time.
Where possible, user profiles should be acquired
automatically, with the user's role being one of
review, to correct/refine their profile.

38
Virtual Enterprise: EnerSearch

Goals
Improve usefulness of EnS website by semantic
methods
Especially for the shareholder representatives in
virtual organizations

Approach
Ontology development
Manual (OntoEdit/AIFB)
Automatic (OntoExtract/CognIT)
Information modes: (i) keyword search, (ii)
semantic ontology-based search (RDFferret), (iii)
browsing with knowledge visualization (Spectacle)
Evaluation by end-user studies: (i) pre-trial
interviews, (ii) end-user test, (iii) post-trial
studies

39
KM Methodology
Phases: Feasibility study, Ontology Kickoff,
Refinement, Evaluation, Maintenance & Evolution

Feasibility study:
Identify people
Focus domain
Select tools from OTK tool suite
GO / No GO decision

Ontology Kickoff:
Requirement specification
Analyze knowledge sources
Develop baseline ontology

Refinement:
Knowledge elicitation with domain experts
Develop and refine target ontology

Evaluation:
Check requirements
Test in target application
Analyze usage patterns
Deployment

Maintenance & Evolution:
Manage organizational maintenance process (Who is
responsible? How is it done?)

40
7. Conclusions

The semantic web is based on machine-processable
semantics of data.
It will significantly change our information
access based on a higher level of service
provided by computers.
It is based on new web languages such as XML,
RDF, and OIL, and tools that make use of these
languages.
Applications are in areas such as knowledge
management and electronic commerce.
Many research projects have been started in the
EU and US on these topics.
On-To-Knowledge is one of the first projects to
break the ice.