OnToKnowledge - PowerPoint PPT Presentation

Slides: 41
Provided by: yor90

Transcript and Presenter's Notes


1
  • On-To-Knowledge
  • IST-1999-10132
  • Content-driven Knowledge Management through
    Evolving Ontologies
  • Rob Engels, Dieter Fensel, Frank van
    Harmelen, Victor Iosif, Arjohn Kampman, Uwe
    Krohn, Ulrich Reimer, Rudi Studer and York Sure
  • www.ontoknowledge.org

2
Contents
  • The overall goals
  • The overall architecture and language
  • Ontology building and instantiation
  • Storing and manipulating meta information
  • Querying the semantic web
  • Case Studies
  • Conclusions

3
1. The overall goals
  • The competitiveness of companies active in areas
    with a high rate of change depends heavily on how
    they maintain and access their knowledge.
  • Large companies have intranets with several
    million pages. Finding, creating and maintaining
    information is a hard problem in this weakly
    structured representation medium.
  • Knowledge Management is about acquiring,
    maintaining, and accessing knowledge of an
    organization.

4
The overall goals
  • With the large number of on-line documents,
    several document management systems arose.
    However, these systems have severe weaknesses:
  • Word matching as search method.
  • Information retrieval instead of query answering.
  • Document exchange between departments is only
    possible with severe effort.
  • Different views on documents are not supported.
  • Information maintenance is not supported.

5
The overall goals
  • Ontologies will allow structural and semantic
    definitions of documents, providing completely new
    possibilities compared with existing document
    management systems:
  • Intelligent search instead of keyword matching.
  • Query answering instead of information retrieval.
  • Document exchange between departments via
    transformation operators.
  • Definition of views on documents.
  • Support of information maintenance.

6
The overall goals
  • The goal of the On-To-Knowledge project is to
    support efficient and effective knowledge
    management. It focuses on acquiring,
    representing, and accessing weakly-structured
    on-line information sources:
  • Acquiring: Text mining and extraction techniques
    are applied to extract semantic information from
    textual information.
  • Representing: XML, RDF, and OIL are used for
    describing the syntax and semantics of
    semi-structured information sources.
  • Accessing: Novel semantic web search technology
    and knowledge sharing facilities.

7
2. The overall architecture and language
  • The On-To-Knowledge tool suite consists of:
  • an Ontology-based knowledge sharing facility
  • an Ontology-based presentation platform
  • an Ontology-based search engine
  • an Ontology editor and semi-automatic Ontology
    construction tools
  • inference engines and query engine for meta data
    and schema information
  • persistent storage of Ontologies and meta data
    and
  • extraction tools for meta data.

8
The overall architecture and language
  • Open architecture, maximal reliance on existing
    standards
  • XML, RDF, HTTP, SOAP, JDBC, RQL, ...
  • Client-server approach
  • allows tools to be used over the Internet
  • requires minimal installation of tools locally
  • All tools use DAML+OIL as ontology language
  • OIL Core is the minimum requirement
  • Tool scalability is targeted at supporting:
  • O(10^3) classes
  • O(10^5) data statements

9
The overall architecture and language
10
The overall architecture and language: OIL
(Ontology Inference Layer)
  • RDF provides a simple data model for representing
    formal semantics of information, i.e.
    meta-information.
  • RDF Schema defines a simple ontology modeling
    language on top of RDF that can be used to define
    the vocabulary and structure of meta-information.
  • OIL adds a simple Description Logic to RDF
    Schema. It allows axioms to be defined that
    logically describe classes, properties, and their
    hierarchies.
  • Currently, a sub-dialect of OIL called DAML+OIL
    is the starting point for a W3C web ontology
    language standardization group, which should
    start soon.

11
The overall architecture and language: OIL
(Ontology Inference Layer)
Credits: Thanks to Ian Horrocks from Manchester!
12
The overall architecture and language: OIL
(Ontology Inference Layer)
  • OIL provides a layered architecture that offers
    different layers of complexity

13
3. Ontology building and instantiation
  • OntoEdit: Manual building of ontologies.
  • OntoExtract: Semi-automatic ontology construction
    from natural language sources.
  • OntoWrapper: Semi-automatic ontology construction
    from semi-structured and structured information
    sources.

14
Ontology building and instantiation: OntoEdit
  • OntoEdit is a graphical Ontology Engineering
    Environment
  • Helps in creating, modifying and browsing of
    ontologies.
  • It is flexible and expandable through a plug-in
    framework.

15
Ontology building and instantiation: OntoEdit
16
Ontology building and instantiation: CORPORUM
  • Structured documents: OntoWrapper and
    screen-scraping extract information from places
    on specific sites (e.g. names, email addresses,
    telephone numbers).
  • Unstructured documents: OntoExtract extracts
    initial ontologies from natural language on web
    pages. OntoExtract is able to:
  • provide initial ontologies,
  • refine existing ontologies,
  • find relations between key terms in documents,
  • find instances of concepts within documents.

17
Ontology building and instantiation: CORPORUM
  • CORPORUM's linguistic functionality is based on a
    tokenizer, a morphological component, and a
    relation-determining engine.
  • This allows CORPORUM to extract concepts from
    texts that are more than just words; concepts can
    also be generated by the engine.
  • Relations between such concepts are defined (e.g.
    subClassOf relations, or InstanceOf relations).
  • Through semantic analysis of a domain, the tool
    can automatically generate thesauri of words
    within a domain.
  • Visualisation of such semantic structures can
    then be used for navigation and browsing through
    document sets.

18
Corporum-OntoExtract
  • OntoExtract allows for analysis of natural
    language.
  • The component exports its Central Concept Area in
    the form of a light-weight ontology (syntax:
    DAML+OIL).
  • It finds classes, sub-class relationships and
    instances.
  • Finally, there are cross-taxonomic relations,
    which describe relations between concepts that
    are often not easily recovered from standard
    ontologies.

19
Corporum-OntoExtract
  • How does OntoExtract currently work? It:
  • parses, tokenizes and analyses text,
  • generates nodes and relations between them,
  • enhances specific aspects of the discovered
    knowledge item using a background repository
    (containing general knowledge of the world,
    represented in Sesame),
  • and the final analysis results are submitted to
    the RDFS server Sesame.


20
Sesame domain knowledge
Sesame background knowledge
21
(No Transcript)
22
4. Storing and manipulating meta information
  • Sesame: A repository and querying facility for
    RDF Schema.
  • Functionality:
  • Persistent storage of RDF data and RDF Schema.
  • Query engine for RDF Query Language (RQL).
  • Data upload and download in RDF format.
  • Communication over HTTP.

World's first!
  • Features:
  • Designed for scalability.
  • Independent of repository types (databases,
    files, in-memory data structures, ...).
  • Modular design allows for other functional
    modules.
  • Architecture allows support for other
    communication protocols.

23
Sesame architecture
  • Protocol handlers: HTTP Handler, ??? Handler
  • Request Router: routes requests to modules
  • Functional modules: RDF Admin, RQL Engine,
    ??? Module
  • Repository Abstraction Layer: provides database
    independence
  • Repository: persistent storage
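
The layering on this slide can be illustrated with a minimal Python sketch. It is not Sesame's actual API: every class and method name below (Repository, InMemoryRepository, AdminModule, QueryModule, RequestRouter) is a hypothetical stand-in, and the sample statement is made up. It only shows how a router can dispatch requests to functional modules that talk to storage through an abstraction layer.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Repository abstraction layer: modules depend only on this interface."""

    @abstractmethod
    def add(self, triple):
        """Store one (subject, predicate, object) statement."""

    @abstractmethod
    def match(self, pattern):
        """Return stored triples matching a pattern; None is a wildcard."""


class InMemoryRepository(Repository):
    """One possible backend; a database-backed class could replace it."""

    def __init__(self):
        self._triples = []

    def add(self, triple):
        if triple not in self._triples:
            self._triples.append(triple)

    def match(self, pattern):
        return [t for t in self._triples
                if all(p is None or p == v for p, v in zip(pattern, t))]


class AdminModule:
    """Functional module (hypothetical): uploads statements."""

    def __init__(self, repository):
        self.repository = repository

    def handle(self, payload):
        for triple in payload:
            self.repository.add(tuple(triple))
        return len(payload)


class QueryModule:
    """Functional module (hypothetical): answers triple-pattern queries."""

    def __init__(self, repository):
        self.repository = repository

    def handle(self, payload):
        return self.repository.match(tuple(payload))


class RequestRouter:
    """Routes a request from any protocol handler to the named module."""

    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        self._modules[name] = module

    def route(self, name, payload):
        if name not in self._modules:
            raise KeyError(f"no module registered for {name!r}")
        return self._modules[name].handle(payload)


repository = InMemoryRepository()
router = RequestRouter()
router.register("admin", AdminModule(repository))
router.register("query", QueryModule(repository))

router.route("admin", [("doc1", "dc:creator", "alice")])
print(router.route("query", (None, "dc:creator", None)))
```

Because the modules only see the Repository interface, swapping the in-memory list for a database-backed class would not require changes to the router or the modules, which is the point of the abstraction layer.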
24
Sesame: RQL query engine
  • RQL:
  • tailored to the RDF graph model
  • currently the only query language for RDF Schema
    semantics
  • based on OQL, with features like:
  • set of core queries (Class, Property, subClassOf,
    ...)
  • filters (select-from-where)
  • Boolean expressions
  • functional composition of queries
  • supports path expressions:
  • for navigating the RDF graph model
  • allowing mixed RDF data and schema queries
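
What "RDF Schema semantics" and "mixed data and schema queries" mean can be sketched in plain Python. This is not RQL syntax and not how the Sesame engine is implemented; the triples, class names, and helper functions are hypothetical. The sketch shows a schema-aware query (instances of a class include instances of its subclasses) combined with a select-from-where style filter over a data property.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Made-up schema and data statements for illustration only.
triples = {
    ("Painter", SUBCLASS, "Artist"),
    ("Cubist", SUBCLASS, "Painter"),
    ("picasso", RDF_TYPE, "Cubist"),
    ("rodin", RDF_TYPE, "Artist"),
    ("picasso", "paints", "guernica"),
    ("gallery1", "exhibits", "guernica"),
}


def subclasses(cls, triples):
    """Return cls plus all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls, triples):
    """Schema-aware lookup: instances of cls or of any subclass of cls."""
    classes = subclasses(cls, triples)
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


def select(triples, prop, where=lambda s, o: True):
    """Select (subject, object) pairs along one property, with a filter."""
    return sorted((s, o) for s, p, o in triples if p == prop and where(s, o))


# Mixed data and schema query: what do instances of Artist paint?
artists = instances_of("Artist", triples)
print(select(triples, "paints", where=lambda s, o: s in artists))
```

A purely syntactic match on `rdf:type Artist` would return only `rodin`; following `rdfs:subClassOf` is what brings `picasso` into the result.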

25
Sesame: RQL query examples
  • Basic queries

26
5. Querying the Semantic Web
  • OTK provides the following user access tools:
  • Querying the Semantic Web: RDFferret
  • An ontology-based presentation platform:
    Spectacle
  • An ontology-based knowledge sharing facility:
    OntoShare

27
Querying the Semantic Web: RDFferret
  • Impractical to create RDF annotations that
    exhaustively cover the content of a given
    document
  • RDF searches might produce low recall
  • RDF searches produce high precision
  • Search both RDF annotations and text content
  • Use well proven IR techniques (ranking, stemming,
    ...)
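
The idea of searching both RDF annotations and text content can be sketched as follows. This is an illustrative toy, not RDFferret's algorithm: the document fields, the `annotation_weight` parameter, and the scoring formula are assumptions, and stemming is left out for brevity. Annotation matches are weighted higher because they are more precise; text matches are kept so that recall does not depend on exhaustive annotation.

```python
def tokenize(text):
    """Lower-case words with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return [w for w in words if w]


def hybrid_search(query, documents, annotation_weight=2.0):
    """Rank documents by annotation matches plus full-text matches."""
    terms = set(tokenize(query))
    scored = []
    for doc in documents:
        text_hits = sum(1 for w in tokenize(doc["text"]) if w in terms)
        annotations = {a.lower() for a in doc["annotations"]}
        annotation_hits = len(terms & annotations)
        score = annotation_weight * annotation_hits + text_hits
        if score > 0:
            scored.append((score, doc["id"]))
    # Highest score first; ties broken by document id for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored]


# Made-up documents for illustration only.
documents = [
    {"id": "d1", "text": "Heat pumps reduce energy use in homes.",
     "annotations": ["Energy", "HeatPump"]},
    {"id": "d2", "text": "Quarterly report on energy markets and energy prices.",
     "annotations": []},
    {"id": "d3", "text": "Staff canteen menu.", "annotations": []},
]

print(hybrid_search("energy", documents))
```

Here `d2` has no annotations at all but is still found through its text, while `d1` ranks first because the query term also appears as an annotation.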

Low threshold
28
Querying the Semantic Web: RDFferret
29
Ontology-based presentation: Spectacle
  • Spectacle: personalized information disclosure
  • Personalization is ontology-based
  • Spectacle can personalize:
  • the content itself (WHAT)
  • the content presentation (HOW)
  • the content organisation/navigation (WHERE)
  • Example personalizations based on:
  • Experience: beginner vs expert user
  • Role: maintainer vs end-user
  • Task: learning vs problem solving
  • Etc.

30
Ontology-based presentation: Spectacle
(Figure: ontology-based information presentation. Diagram labels:
information sources, RDF, DBMS, docs, classification, classified data,
presentation data, profiles, navigation, layout, presentation.)
31
Proactively sharing information: OntoShare
  • Sharing information across an organisation is a
    key knowledge management issue
  • OntoShare supports and encourages information
    sharing
  • User requests OntoShare to share some information
  • On sharing, the page is assigned to an ontological
    category (class) and matched against each user's
    ontology-based profile.
  • OntoShare automatically extracts keywords and
    summaries from the information and suggests
    changes to the user's profile based on user
    activity
  • OntoShare proactively emails selected users when
    information of interest is shared
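
The matching step described above can be sketched in Python. This is not OntoShare's actual matching algorithm: the profile layout, the scoring (2 points for a class match, 1 per shared keyword), the `threshold` parameter, and the sample ontology are all assumptions made for illustration.

```python
def ancestors(category, parents):
    """Return the category followed by its superclasses (single-parent links)."""
    chain = [category]
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain


def users_to_notify(page, profiles, parents, threshold=2):
    """Select users whose ontology-based profile matches a shared page."""
    page_classes = set(ancestors(page["category"], parents))
    selected = []
    for user, profile in profiles.items():
        # A profile class matches if it is the page's class or a superclass.
        class_score = 2 if page_classes & profile["classes"] else 0
        keyword_score = len(set(page["keywords"]) & profile["keywords"])
        if class_score + keyword_score >= threshold:
            selected.append(user)
    return sorted(selected)


# Made-up ontology fragment, page, and profiles for illustration only.
parents = {"VoIP": "Telecoms", "Telecoms": "Technology"}
page = {"category": "VoIP", "keywords": {"sip", "codec"}}
profiles = {
    "alice": {"classes": {"Telecoms"}, "keywords": set()},
    "bob": {"classes": {"Insurance"}, "keywords": {"codec"}},
    "carol": {"classes": set(), "keywords": {"sip", "codec"}},
}

print(users_to_notify(page, profiles, parents))
```

In this sketch `alice` is selected through the class hierarchy alone (VoIP is a kind of Telecoms), `carol` through keywords alone, and `bob` falls below the threshold.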

32
6. Case Studies
  • Large Intranets: Swiss Life
  • Customer Relationship Management: BT
  • Virtual Enterprise: EnerSearch
  • A general methodology: AIFB

33
Large Intranets: Swiss Life
  • Swiss Life is a large insurance company with a
    huge intranet and other distributed information
    sources.
  • Efficient knowledge management for this
    information is of high strategic importance.
  • OTK technology is applied in two case studies
    with Swiss Life:
  • Searching a Large Document on IAS (International
    Accounting Standard).
  • Skills Management (SkiM).

34
Searching a Large Document on IAS (International
Accounting Standard)
  • Goals
  • Fast and reliable access to relevant passages
    of a large document on IAS
  • Approach
  • Automatic generation of a light-weight ontology
    with weighted semantic associations between
    concepts
  • Use of that ontology to support query
    reformulation by adding relevant ontology terms
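
Query reformulation with weighted associations can be sketched as below. The association weights, the `min_weight` and `max_added` parameters, and the sample terms are hypothetical; the slides do not say how the Swiss Life prototype selects the terms it adds.

```python
def expand_query(terms, associations, min_weight=0.5, max_added=3):
    """Append the most strongly associated ontology terms to a query."""
    original = [t.lower() for t in terms]
    candidates = {}
    for term in original:
        for related, weight in associations.get(term, {}).items():
            if related not in original and weight >= min_weight:
                # Keep the strongest association seen for each candidate.
                candidates[related] = max(weight, candidates.get(related, 0.0))
    ranked = sorted(candidates, key=lambda t: (-candidates[t], t))
    return original + ranked[:max_added]


# Made-up weighted associations between concepts, for illustration only.
associations = {
    "goodwill": {"impairment": 0.8, "acquisition": 0.6, "lease": 0.1},
    "lease": {"lessee": 0.9},
}

print(expand_query(["Goodwill"], associations))
```

Weakly associated terms (here `lease`, at 0.1) are left out so that the expanded query stays focused on passages close to the user's original intent.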

35
Skills Management (SkiM)
  • Goals
  • Access the knowledge and skills of employees
  • Identify, manage, use and advance skills
  • Approach
  • Use of manually built ontologies
  • - to describe skills, job functions, education
    in a controlled vocabulary
  • - to generate annotated homepages from the
    skills descriptions
  • Exploit ontologies for a more specific search on
    homepages for people with certain skills

36
Customer Relationship Management: BT
  • Disseminating Customer Handling Rules
  • Offer a cost-effective channel for the
    dissemination of all sorts of rules and
    instructions.
  • Health & Safety.
  • Sales scripts for Telesales Representatives.
  • Timely information on new products and services.
  • Disseminating Best Practice
  • Promote behaviours that are acceptably consistent
    across call centres.
  • Help managers to become more aware of Best
    Practice resources on the BT Intranet.
  • Help to build communities of best practice.

37
Customer Relationship Management: BT
  • Staying Alert: Interest Profiles
  • Learn a user's interests and preferences
    autonomously (with minimal feedback from the
    user).
  • Adapt to changing needs of the user over time.
  • Where possible, user profiles should be acquired
    automatically, with the user's role being one of
    review, to correct or refine their profile.

38
Virtual Enterprise: EnerSearch
  • Goals
  • Improve usefulness of EnS website by semantic
    methods
  • Especially for the shareholder representatives in
    virtual organizations
  • Approach
  • Ontology development
  • Manual (OntoEdit/AIFB)
  • Automatic (OntoExtract/CognIT)
  • Information modes: (i) keyword search, (ii)
    semantic ontology-based search (RDFferret), (iii)
    browsing with knowledge visualization (Spectacle)
  • Evaluation by end-user studies: (i) pre-trial
    interviews, (ii) end-user test, (iii) post-trial
    studies

39
KM Methodology
Maintenance & Evolution
Ontology Kickoff
Feasibility study
Refinement
Evaluation
  • Check requirements
  • Test in target application
  • Analyze usage patterns
  • Deployment
  • Requirement specification
  • Analyze knowledge sources
  • Develop baseline ontology
  • Knowledge elicitation with domain experts
  • Develop and refine target ontology
  • Manage organizational maintenance process (Who is
    responsible? How is it done?)
  • Identify people
  • Focus domain
  • Select tools from OTK tool suite
  • GO / No GO decision

40
7. Conclusions
  • The semantic web is based on machine-processable
    semantics of data.
  • It will significantly change our information
    access based on a higher level of service
    provided by computers.
  • It is based on new web languages such as XML,
    RDF, and OIL, and tools that make use of these
    languages.
  • Applications are in areas such as knowledge
    management and electronic commerce.
  • Many research projects have been started in the
    EU and US on these topics.
  • On-To-Knowledge is one of the first to break
    the ice.