Title: OnToKnowledge
1- On-To-Knowledge
- IST-1999-10132
- Content-driven Knowledge Management through
Evolving Ontologies
- Rob Engels, Dieter Fensel, Frank van
Harmelen, Victor Iosif, Arjohn Kampman, Uwe
Krohn, Ulrich Reimer, Rudi Studer and York Sure
- www.ontoknowledge.org
2Contents
- The overall goals
- The overall architecture and language
- Ontology building and instantiation
- Storing and manipulating meta information
- Querying the semantic web
- Case Studies
- Conclusions
31. The overall goals
- The competitiveness of companies active in areas
with a high rate of change depends heavily on how
they maintain and access their knowledge.
- Large companies have intranets with several
million pages. Finding, creating and maintaining
information is a rather hard problem in this
weakly structured representation medium.
- Knowledge Management is about acquiring,
maintaining, and accessing knowledge of an
organization.
4The overall goals
- With the large number of on-line documents,
several document management systems arose.
However, these systems have severe weaknesses:
- Word matching as search method.
- Information retrieval instead of query answering.
- Document exchange between departments is only
possible with severe effort.
- Different views on documents are not supported.
- Information maintenance is not supported.
5The overall goals
- Ontologies will allow structural and semantic
definitions of documents, providing completely new
possibilities compared with existing document
management systems:
- Intelligent search instead of keyword matching.
- Query answering instead of information retrieval.
- Document exchange between departments via
transformation operators.
- Definition of views on documents.
- Support of information maintenance.
6The overall goals
- The goal of the On-To-Knowledge project is to
support efficient and effective knowledge
management. It focuses on acquiring,
representing, and accessing weakly-structured
on-line information sources:
- Acquiring: Text mining and extraction techniques
are applied to extract semantic information from
textual information.
- Representing: XML, RDF, and OIL are used for
describing syntax and semantics of
semi-structured information sources.
- Accessing: Novel semantic web search technology
and knowledge sharing facilities.
72. The overall architecture and language
- The On-To-Knowledge tool suite consists of:
- an Ontology-based knowledge sharing facility,
- an Ontology-based presentation platform,
- an Ontology-based search engine,
- an Ontology editor and semi-automatic Ontology
construction tools,
- inference engines and query engine for meta data
and schema information,
- persistent storage of Ontologies and meta data,
and
- extraction tools for meta data.
8The overall architecture and language
9The overall architecture and language OIL
(Ontology Inference Layer)
- RDF provides a simple data model for representing
formal semantics of information, i.e.
meta-information.
- RDF Schema defines a simple ontology modeling
language on top of RDF that can be used to define
the vocabulary and structure of meta information.
- OIL adds a simple Description Logic to RDF
Schema. It allows axioms to be defined that
logically describe classes, properties, and their
hierarchies.
- Currently, a sub-dialect of OIL called DAML+OIL
is the starting point of a web ontology language
standardization group of the W3C which should
start soon.
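The layering of RDF and RDF Schema described above can be illustrated with a small sketch. It is not taken from the project tools: it stores RDF statements as plain Python tuples and derives the types implied by a class hierarchy. All `ex:` names are hypothetical, and the richer axioms that OIL adds are not modelled.

```python
# Triples are (subject, predicate, object); every ex: name is made up.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("ex:Employee", SUBCLASS, "ex:Person"),
    ("ex:Manager", SUBCLASS, "ex:Employee"),
    ("ex:alice", TYPE, "ex:Manager"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(resource, triples):
    """Asserted types plus the types implied by the class hierarchy."""
    asserted = {o for s, p, o in triples if s == resource and p == TYPE}
    inferred = set()
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return asserted | inferred


print(sorted(types_of("ex:alice", triples)))
# → ['ex:Employee', 'ex:Manager', 'ex:Person']
```

Only `ex:Manager` is asserted for `ex:alice`; the other two types follow from the schema, which is the kind of conclusion a schema-aware repository can draw.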
10The overall architecture and language OIL
(Ontology Inference Layer)
Credits: Thanks to Ian Horrocks from Manchester!
11The overall architecture and language OIL
(Ontology Inference Layer)
- OIL provides a layered architecture that offers
different layers of complexity
123. Ontology building and instantiation
- OntoEdit: Manual building of Ontologies.
- OntoExtract: Semi-automatic Ontology construction
from natural language sources.
- OntoWrapper: Semi-automatic Ontology construction
from semi-structured and structured information
sources.
13Ontology building and instantiation OntoEdit
- OntoEdit is a graphical Ontology Engineering
Environment.
- Helps in creating, modifying and browsing
ontologies.
- It is flexible and expandable through a plug-in
framework.
14Ontology building and instantiation OntoEdit
15Ontology building and instantiation
- Structured documents: OntoWrapper and
screen-scraping extract information from places
on specific sites (e.g. names, email addresses,
telephone numbers).
- Unstructured documents: OntoExtract extracts
initial ontologies from natural language on web
pages. OntoExtract is able to:
- provide initial ontologies,
- refine existing ontologies,
- find relations between key terms in documents,
- find instances of concepts within documents.
16Corporum-OntoExtract
- How does OntoExtract currently work?
- parses, tokenizes and analyses text,
- generates nodes and relations between them,
- enhances specific aspects of the discovered
knowledge item using a background repository
(containing general knowledge of the world,
represented in Sesame),
- and the final analysis results are submitted to
the RDFS server Sesame.
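The steps above can be sketched in a few lines. This is only an illustration and not how OntoExtract is implemented: sentence-level co-occurrence stands in for its linguistic analysis, a small dictionary stands in for the background repository, and the upload to Sesame is left out. The function names, the `ex:relatedTo` property and the sample data are all assumptions.

```python
import re
from collections import Counter
from itertools import combinations

STOPWORDS = {"a", "an", "and", "the", "of", "is", "are", "in", "on", "to"}


def tokenize(sentence):
    """Lower-case the sentence and keep alphabetic non-stopword tokens."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]


def extract(text, background, min_cooccurrence=2):
    """Return triples: co-occurrence relations plus background links."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    term_counts, pair_counts = Counter(), Counter()
    for sentence in sentences:
        terms = sorted(set(tokenize(sentence)))
        term_counts.update(terms)                    # nodes
        pair_counts.update(combinations(terms, 2))   # candidate relations
    triples = [
        (a, "ex:relatedTo", b)
        for (a, b), n in pair_counts.items()
        if n >= min_cooccurrence
    ]
    # Enrich discovered terms with (hypothetical) background knowledge.
    for term in term_counts:
        if term in background:
            triples.append((term, "rdfs:subClassOf", background[term]))
    return sorted(triples)


text = ("Ontologies describe documents. Ontologies structure documents. "
        "Queries use ontologies.")
background = {"ontologies": "knowledge_model"}
print(extract(text, background))
```

A real pipeline would hand the resulting triples to the repository instead of printing them.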
17Sesame domain knowledge
Sesame background knowledge
194. Storing and manipulating meta information
- Sesame: A repository and querying facility for
RDF Schema.
- Functionality:
- Persistent storage of RDF data and RDF Schema.
- Query engine for RDF Query Language (RQL).
- Data upload and download in RDF format.
- Communication over HTTP.
World's first!
- Features:
- Designed for scalability.
- Independent of repository types (databases,
files, in-memory data structures, ...).
- Modular design allows for other functional
modules.
- Architecture allows support for other
communication protocols.
20Sesame architecture
[Figure: Sesame architecture, layers as listed on the slide]
- Protocol handlers: HTTP Handler, ??? Handler
- Request Router: routes requests to modules
- Functional modules: RDF Admin, RQL Engine,
??? Module
- Repository Abstraction Layer: provides database
independence
- Repository: persistent storage
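The layering on this slide can be mimicked in a short sketch. It is not Sesame's code or API: every class and method name below is hypothetical, the query module does simple pattern matching instead of RQL, and the protocol handler is reduced to direct calls on the router.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Repository Abstraction Layer: hides the storage backend."""

    @abstractmethod
    def add(self, triple):
        ...

    @abstractmethod
    def match(self, s=None, p=None, o=None):
        ...


class InMemoryRepository(Repository):
    """One possible backend; a database-backed class could replace it."""

    def __init__(self):
        self._triples = set()

    def add(self, triple):
        self._triples.add(tuple(triple))

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard for that position.
        return sorted(
            t for t in self._triples
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])
        )


class AdminModule:
    """Stands in for the RDF Admin module: uploads triples."""

    def __init__(self, repository):
        self.repository = repository

    def handle(self, payload):
        for triple in payload:
            self.repository.add(triple)
        return len(payload)


class PatternQueryModule:
    """Stands in for the query module; pattern matching, not RQL."""

    def __init__(self, repository):
        self.repository = repository

    def handle(self, payload):
        return self.repository.match(**payload)


class RequestRouter:
    """Routes a request from any protocol handler to a named module."""

    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        self._modules[name] = module

    def route(self, name, payload):
        return self._modules[name].handle(payload)


repository = InMemoryRepository()
router = RequestRouter()
router.register("admin", AdminModule(repository))
router.register("query", PatternQueryModule(repository))

router.route("admin", [("ex:alice", "rdf:type", "ex:Manager"),
                       ("ex:bob", "rdf:type", "ex:Employee")])
print(router.route("query", {"o": "ex:Manager"}))
```

Because the modules only see the `Repository` interface, swapping the in-memory backend for another store would not touch the router or the modules, which is the point of the abstraction layer.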
21Sesame RQL query examples
225. Querying the Semantic Web
- OTK provides the following user access:
- Querying the Semantic Web: RDFferret
- An Ontology-based presentation platform:
Spectacle
- An Ontology-based knowledge sharing facility:
OntoShare
23Querying the Semantic Web RDFferret
- Impractical to create RDF annotations that
exhaustively cover the content of a given
document
- RDF searches might produce low recall
- RDF searches produce high precision
- Search both RDF annotations and text content
- Use well proven IR techniques (ranking, stemming,
...)
Low threshold
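A minimal sketch of the combined search idea, assuming a simple additive score: documents annotated with the requested concept get a fixed bonus (precision), while keyword matches in the text still surface unannotated documents (recall). This is not RDFferret's algorithm; the scoring formula, the crude stemmer, the `annotation_weight` parameter and the sample documents are all assumptions.

```python
import re


def stem(word):
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def terms(text):
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def search(docs, keywords, concept=None, annotation_weight=2.0):
    """Rank documents by text match plus a bonus for a matching annotation."""
    query_terms = set(terms(keywords))
    results = []
    for doc_id, doc in docs.items():
        doc_terms = terms(doc["text"])
        hits = sum(doc_terms.count(t) for t in query_terms)
        text_score = hits / max(len(doc_terms), 1)
        bonus = annotation_weight if concept in doc["annotations"] else 0.0
        score = text_score + bonus
        if score > 0:
            results.append((doc_id, round(score, 3)))
    # Highest score first; ties broken by document id.
    return sorted(results, key=lambda r: (-r[1], r[0]))


docs = {
    "d1": {"text": "Pricing of energy contracts", "annotations": {"ex:Energy"}},
    "d2": {"text": "Energy markets and energy pricing", "annotations": set()},
    "d3": {"text": "Holiday schedule", "annotations": set()},
}
print(search(docs, "energy pricing", concept="ex:Energy"))
print(search(docs, "energy pricing"))
```

With the concept given, the annotated document ranks first; without it, ranking falls back to the text score alone.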
24Querying the Semantic Web RDFferret
25Ontology-based presentation Spectacle
- Spectacle: personalized information disclosure
- Personalization is ontology-based
- Spectacle can personalize
- the content itself (WHAT)
- the content presentation (HOW)
- the content organisation/navigation (WHERE)
- Example personalizations based on
- Experience: beginner vs expert user
- Role: maintainer vs end-user
- Task: learning vs problem solving
- Etc.
26Ontology-based presentation Spectacle
[Figure: ontology-based information presentation.
Labels on the slide: information sources (RDF,
DBMS, docs), classification, classified data,
presentation data (profiles, navigation, layout),
presentation.]
27Proactively sharing information: OntoShare
- User requests OntoShare to store a page.
- On storage, the page is matched against users'
ontology-based profiles.
- OntoShare automatically extracts keywords and
summaries.
- OntoShare automatically suggests changes to the
user profile.
- OntoShare emails the URL, annotation and
keywords to selected users.
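The matching step can be sketched as a set overlap between the concepts found on a page and each user's profile. This is an illustration only: the function name `share_page`, the `min_overlap` parameter, the representation of profiles as sets, and the choice to compute suggestions for each notified user are assumptions, and keyword extraction and emailing are left out.

```python
def share_page(url, page_concepts, profiles, min_overlap=1):
    """Return, per matching user, what to send and what to suggest."""
    notifications = {}
    for user, interests in profiles.items():
        matched = page_concepts & interests
        if len(matched) >= min_overlap:
            notifications[user] = {
                "url": url,
                "matched": sorted(matched),
                # Page concepts the profile lacks become suggested additions.
                "suggested": sorted(page_concepts - interests),
            }
    return notifications


profiles = {
    "ann": {"ex:Ontology", "ex:Search"},
    "ben": {"ex:Insurance"},
}
result = share_page("http://example.org/page",
                    {"ex:Ontology", "ex:RDF"}, profiles)
print(result)
```

Here only `ann` would be notified, and `ex:RDF` would be offered as a possible addition to her profile.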
286. Case Studies
- Large Intranets: Swiss Life
- Customer Relationship Management: British Telecom
- Virtual Enterprise: EnerSearch
- A general Methodology: AIFB
29Large Intranets Swiss Life
- Swiss Life is a large insurance company with a
huge intranet and other distributed information
sources.
- Efficient knowledge management for this
information is of high strategic importance.
- OTK technology is applied in two case studies
with Swiss Life:
- Searching a Large Document on IAS (International
Accounting Standard).
- Skills Management (SkiM).
30Searching a Large Document on IAS (International
Accounting Standard)
- Goals
- Fast and reliable access to relevant passages
of a large document on IAS
- Approach
- Automatic generation of a light-weight ontology
with weighted semantic associations between
concepts
- Use of that ontology to support query
reformulation by adding relevant ontology terms
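Query reformulation with weighted associations can be sketched as follows. The association table, its weights, the `threshold` and `max_added` parameters and the function name are invented for illustration; the case study's actual ontology and weighting scheme are not given on the slide.

```python
# Hypothetical light-weight ontology: term -> {associated term: weight}.
associations = {
    "lease": {"lessee": 0.9, "lessor": 0.8, "asset": 0.4},
    "asset": {"depreciation": 0.7, "impairment": 0.6},
}


def expand_query(query_terms, associations, threshold=0.5, max_added=3):
    """Append the strongest associated terms to the original query."""
    candidates = {}
    for term in query_terms:
        for related, weight in associations.get(term, {}).items():
            if weight >= threshold and related not in query_terms:
                # Keep the best weight if a term is reachable twice.
                candidates[related] = max(weight, candidates.get(related, 0.0))
    ranked = sorted(candidates, key=lambda t: (-candidates[t], t))
    return list(query_terms) + ranked[:max_added]


print(expand_query(["lease"], associations))
# → ['lease', 'lessee', 'lessor']
```

Raising `threshold` keeps the expanded query close to the user's wording, while lowering it trades precision for recall.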
31Skills Management (SkiM)
- Goals
- Access the knowledge and skills of employees
- Identify, manage, use and advance skills
- Approach
- Use of manually built ontologies
- to describe skills, job functions, education
in a controlled vocabulary
- to generate annotated homepages from the
skills descriptions
- Exploit ontologies for a more specific search on
homepages for people with certain skills
32Customer Relationship Management British Telecom
- Disseminating Customer Handling Rules
- Offer a cost-effective channel for the
dissemination of all sorts of rules and
instructions.
- Health & Safety.
- Sales scripts for Telesales Representatives.
- Timely information on new products and services.
- Disseminating Best Practice
- Promote behaviours that are acceptably consistent
across call centres.
- Help managers to become more aware of Best
Practice resources on the BT Intranet.
- Help to build communities of best practice.
33Customer Relationship Management British Telecom
- Staying Alert: Interest Profiles
- Learn a user's interests and preferences
autonomously (with minimal feedback from the
user).
- Adapt to changing needs of the user over time.
- Where possible, user profiles should be acquired
automatically, with the user's role being one of
review to correct/refine their profile.
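One simple way to meet these requirements is a weighted profile in which old interests decay and recently seen concepts gain weight. This is a sketch under that assumption, not the method used in the case study; the `decay` and `gain` parameters and both function names are hypothetical.

```python
def update_profile(profile, viewed_concepts, decay=0.8, gain=1.0):
    """Fade all existing weights, then reinforce the concepts just viewed."""
    updated = {concept: weight * decay for concept, weight in profile.items()}
    for concept in viewed_concepts:
        updated[concept] = updated.get(concept, 0.0) + gain
    return updated


def top_interests(profile, n=3):
    """Concepts with the highest weights, for the user to review."""
    return sorted(profile, key=lambda c: (-profile[c], c))[:n]


profile = {}
profile = update_profile(profile, ["broadband"])
profile = update_profile(profile, ["mobile"])
profile = update_profile(profile, ["mobile"])
print(top_interests(profile))
# → ['mobile', 'broadband']
```

The user never edits weights directly; reviewing the output of `top_interests` and removing or adding a concept would be the "correct/refine" step.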
34Virtual Enterprise EnerSearch
- Goals
- Improve usefulness of EnS website by semantic
methods
- Especially for the shareholder representatives in
virtual organizations
- Approach
- Ontology development
- Manual (OntoEdit/AIFB)
- Automatic (OntoExtract/CognIT)
- Information modes: (i) key-word search (ii)
semantic ontology-based search (RDFferret) (iii)
browsing with knowledge visualization (Spectacle)
- Evaluation by end user studies: (i) pre-trial
interviews (ii) end user test (iii) post-trial
studies
357. Conclusions
- The semantic web is based on machine-processable
semantics of data.
- It will significantly change our information
access based on a higher level of service
provided by computers.
- It is based on new web languages such as XML,
RDF, and OIL, and tools that make use of these
languages.
- Applications are in areas such as knowledge
management and electronic commerce.
- Many research projects have been started in the
EU and in the US on these topics.
- On-To-Knowledge is one of the first ones breaking
the ice.