Title: Introduction to Semantic Web Design
1. Introduction to Semantic Web Design
2. Introduction
- The Web in its current form is an application on the Internet that delivers information. Ex: browsing daily news.
- Current applications involving the Web integrate data and information. Ex: online shopping.
- The next-generation Web is expected to integrate a variety of resources and devices and to support knowledge sharing among machines.
- Exploit the economies of scale made possible by machine processing of knowledge.
- How do we tell machines about resources, and how do we specify concepts? How can machines acquire knowledge? How can they share knowledge with each other? How do we enable them to make decisions based on it?
- We need to specify resources, concepts, knowledge, and other artifacts used in human decision making in a form usable by machines.
- Machines can then integrate and analyze information, make decisions, and collect knowledge.
- In this lecture we will examine the technologies, tools, frameworks, and applications enabling the next-generation Web, the Semantic Web.
- We will also discuss an intelligent search engine serving municipal services, a real Semantic Web application (Chapter 4).
3. References for today's discussion
- W3Schools tutorials (http://www.w3schools.com)
- Taxonomies and the semantic web, by Alistair Miles, CISTRANA workshop, Feb 2006, Rutherford Appleton Lab
4. HTML, XML, RDF, and OWL
- HTML
  - HTML stands for HyperText Markup Language
  - An HTML file is a text file containing small markup tags
  - The markup tags tell the Web browser how to display the page
- XML
  - XML stands for eXtensible Markup Language
  - XML is a markup language much like HTML
  - XML was designed to carry data, not to display data
  - XML tags are not predefined; you must define your own tags
  - XML is designed to be self-descriptive
  - XML is a W3C Recommendation
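To make the point that XML carries data in tags you define yourself, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The tag names (`service`, `name`, `channel`) and the sample values are invented for illustration and do not come from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names are made up for this example,
# which is exactly what XML permits (no predefined tags).
SERVICE_XML = """
<service id="s1">
  <name>Special collection of large items</name>
  <channel>online</channel>
</service>
"""

def read_service(xml_text):
    """Return the data carried by the XML document as a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),                 # attribute on the root element
        "name": root.findtext("name"),        # text of the <name> child
        "channel": root.findtext("channel"),  # text of the <channel> child
    }

service = read_service(SERVICE_XML)
print(service)
```

Nothing in the document says how the data should be displayed; the program decides what to do with it, which is the contrast with HTML drawn above.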
5. HTML, XML, RDF, ...
- RDF
  - RDF stands for Resource Description Framework
  - RDF is a framework for describing resources on the Web
  - RDF provides a model for data, and a syntax so that independent parties can exchange and use it
  - RDF is designed to be read and understood by computers
  - RDF is not designed to be displayed to people
  - RDF is commonly written in XML (the RDF/XML syntax)
  - RDF is a part of the W3C's Semantic Web Activity
  - RDF is a W3C Recommendation
  - Let's discuss the details.
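As a rough illustration of RDF's data model, the sketch below reads a tiny RDF/XML document and lists the (subject, predicate, object) statements it contains. The `rdf:` namespace URI is the standard one; the `ex:` vocabulary, the resource URI, and the property values are assumptions made up for the example. The parser handles only this simple `rdf:Description` shape and is not a general RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical description of one city service; the ex: terms are invented.
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/services/large-item-collection">
    <ex:title>Special collection of large items</ex:title>
    <ex:offeredBy>City government</ex:offeredBy>
  </rdf:Description>
</rdf:RDF>
"""

def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for literal-valued properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; join them into one URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, prop.text))
    return triples

for triple in extract_triples(RDF_XML):
    print(triple)
```

Each printed tuple is one statement about the resource, which is the form in which independent parties can exchange and combine descriptions.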
6. HTML, ..., OWL
- OWL
  - OWL stands for Web Ontology Language
  - OWL is built on top of RDF
  - OWL is for processing information on the Web
  - OWL was designed to be interpreted by computers
  - OWL was not designed to be read by people
  - OWL is commonly written in XML
  - OWL has three sublanguages (OWL Lite, OWL DL, and OWL Full)
  - OWL is a web standard
7. Web ontology
[Figure: analogy diagram with the labels natural language (Ex: English), ontology, programming language (Ex: Pascal), and web ontology]
A programming language is a language with a strict syntax for expressing algorithms (steps) for execution by a computing device. A web ontology is for expressing Web-related concepts. The Web Ontology Language (OWL) is a technology for accomplishing this. Protégé-OWL is an editor that supports OWL.
8. Taxonomy and web ontology
- Taxonomy: the science of classification.
- Ontology: a specification of a conceptualization.
- XML: allows for meaningful tags.
- RDF: the Resource Description Framework is an XML language for defining resources on the Web (WWW).
- OWL: the Web Ontology Language.
- RDF is an assertional language intended to be used to express propositions using precise formal vocabularies, particularly those specified using RDFS [RDF-VOCABULARY], for access and use over the World Wide Web, and is intended to provide a basic foundation for more advanced assertional languages with a similar purpose. The overall design goals emphasize generality and precision in expressing propositions about any topic, rather than conformity to any particular processing model.
9. RDF and OWL
- OWL is a semantic extension of RDF; it allows for the specification of logical dependencies between information structures (as defined by Miles, ref. 2).
- OWL works on structured information.
- RDF is for structuring information.
- OWL is an information model.
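One way to picture a "logical dependency between information structures" is the OWL property `owl:inverseOf`: once two properties are declared inverses, every statement using one implies a statement using the other. The sketch below applies that single rule by hand to a set of triples. The `ex:` names are hypothetical, and this is a toy rule application, not an OWL reasoner.

```python
OWL_INVERSE_OF = "owl:inverseOf"

def apply_inverse_rule(triples):
    """Add (o, q, s) for every (s, p, o) where p and q are declared inverses."""
    inverse = {}
    for s, p, o in triples:
        if p == OWL_INVERSE_OF:
            # inverseOf is symmetric, so record both directions.
            inverse[s] = o
            inverse[o] = s
    inferred = set(triples)
    for s, p, o in triples:
        if p in inverse:
            inferred.add((o, inverse[p], s))
    return inferred

# Hypothetical facts: RDF supplies the structure (triples),
# the owl:inverseOf declaration supplies the logical dependency.
facts = {
    ("ex:offers", OWL_INVERSE_OF, "ex:offeredBy"),
    ("ex:CityGovernment", "ex:offers", "ex:LargeItemCollection"),
}

for triple in sorted(apply_inverse_rule(facts)):
    print(triple)
```

The output contains a third statement, that the collection service is offered by the city government, which was never asserted directly.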
10. Semantic stack
[Figure: the Semantic Web stack, with the layers OWL, RDFS, RDF, URI, and XML]
11. Intelligent search engine for online access to municipal services (Ch. 4): problem definition
- Citizens can perform 80% of the city services from home.
- Someone looking for a service must be able to locate it easily.
- You can collect, categorize, and list all the services (a taxonomy).
- However, searching through this list with traditional search engines may not yield the expected results.
- Search results are based on the description of the services and on the co-occurrence of the words in the query.
- Ex: a citizen who wants to dispose of a washing machine has to search for "special collection of large items".
- We cannot force citizens to learn government language.
- When a service is looked up, a set of related services should be made available.
- The search engine is a first step in the roadmap to citizen self-service.
12. Zaragoza municipal services roadmap (Fig. 4.1)
[Figure: roadmap diagram with the labels Positioning, Intelligent search engine, Citizen channels, Citizen self-service, Interface, Functionality, Content scope, and Technology]
13. Application of the semantic web
- Zaragoza used the semantic web in three ways:
- A statistical approach to the interpretation of citizen requests (fig. 4.3).
- An enhanced keyword-based approach to the interpretation of citizen requests (fig. 4.4).
- Applying semantic distance to the interpretation of citizen requests (fig. 4.5).
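The slides do not define how semantic distance is computed. One common reading, assumed here, is the number of links on the shortest path between two concepts in the ontology graph. The sketch below implements that reading with a breadth-first search; the concept names and links are invented for illustration.

```python
from collections import deque

# Hypothetical links between concepts (treated as undirected).
CONCEPT_LINKS = {
    "washing machine": ["household appliance"],
    "household appliance": ["large item"],
    "large item": ["special collection of large items"],
    "bicycle": ["vehicle"],
    "vehicle": ["vehicle tax"],
}

def build_graph(links):
    """Turn the link table into an undirected adjacency map."""
    graph = {}
    for node, neighbours in links.items():
        for other in neighbours:
            graph.setdefault(node, set()).add(other)
            graph.setdefault(other, set()).add(node)
    return graph

def semantic_distance(graph, start, goal):
    """Number of links on the shortest path, or None if the concepts are unconnected."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

GRAPH = build_graph(CONCEPT_LINKS)
print(semantic_distance(GRAPH, "washing machine", "special collection of large items"))
```

Under this assumption, a request is matched to the service whose concepts lie closest to the concepts found in the request, even when the two share no keywords.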
14. Usage of the three methods
- The first approach is the cheapest and consumes the fewest resources; the semantic web approach is the most expensive.
- The Zaragoza architecture arranges the three in a pipeline, where each stage is triggered only when the previous stage did not produce satisfactory results.
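The fallback pipeline described above can be sketched as a loop that tries the cheapest stage first and moves on only when the results are unsatisfactory. The three stage functions below are trivial placeholders with invented lookup tables, and "satisfactory" is assumed to mean "non-empty"; the real criteria are not given in the slides.

```python
def statistical(query):
    # Placeholder for the statistical approach: exact match on known phrases.
    known = {"large item collection": ["Special collection of large items"]}
    return known.get(query.lower(), [])

def enhanced_keyword(query):
    # Placeholder for the enhanced keyword-based approach.
    keywords = {"permit": ["Construction work permit"]}
    return [svc for word in query.lower().split() for svc in keywords.get(word, [])]

def semantic_distance_stage(query):
    # Placeholder for the semantic-distance approach (most expensive).
    concepts = {"washing": ["Special collection of large items"]}
    return [svc for word in query.lower().split() for svc in concepts.get(word, [])]

# Ordered from cheapest to most expensive.
STAGES = [
    ("statistical", statistical),
    ("enhanced keyword", enhanced_keyword),
    ("semantic distance", semantic_distance_stage),
]

def non_empty(results):
    """Assumed satisfaction test: any result at all counts."""
    return len(results) > 0

def run_pipeline(query, stages=STAGES, is_satisfactory=non_empty):
    """Run stages in order; return (stage name, results) for the first satisfactory one."""
    results = []
    for name, stage in stages:
        results = stage(query)
        if is_satisfactory(results):
            return name, results
    return None, results

print(run_pipeline("dispose washing machine"))
print(run_pipeline("building permit"))
```

Because later stages run only on a miss, the expensive semantic stage is paid for only by the queries that actually need it.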
15. How does it work?
- Traditional search engines retrieve documents based on occurrences of keywords, whereas the Zaragoza SOA (ZS) has an understanding of its services, information, and data.
- ZS knows that persons can change addresses, car owners pay taxes, construction work requires permits, building bars near schools is not good, etc.
- All this information is stored in an ontology: a computer-understandable description of what e-services are.
- This ontology allows ZS to understand a citizen's query and thus return meaningful results.
- ZS also uses natural language understanding software to translate citizens' free-text queries into the ontology (see fig. 4.6).
16. Citizen-city government interaction (Fig. 4.6, modified)
[Figure: diagram with the components NLP, Knowledge Tagger (KT), Semantic Distance Analyzer (SD), and ZS domain ontology, and the data-flow labels natural language query, semantic query, and result]
17. Search vs. intelligent search
Search:
- Searches for keywords
- Results in a ranked list of documents
- Users need to invest time and effort to filter the right piece of information out of the overall results
Intelligent search:
- Searches for keywords and semantic concepts
- Returns the actual relevant document
- Is perceived as a search engine that understands the user
18. ZS domain ontology
- Development of an ontology starts with a detailed study of the services offered by the city.
- The objective is to extract all relevant terms belonging to this domain from existing documents.
- The ZS ontology contains four main classes: agent, process, event, and object.
19. ZS domain ontology (contd.)
- Agent: an entity participating in an action.
- Process: a series of actions that a citizen can perform using the online services offered by the city government.
- Event: any social gathering or activity.
- Object: any entity that exists in the city and can be used for or by a service offered by the city government.
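The four top-level classes can be sketched as a small data structure. Only the class names (agent, process, event, object) come from the slides; the example concepts and the `Concept` type are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class TopClass(Enum):
    """The four main classes of the ZS ontology named in the slides."""
    AGENT = "agent"
    PROCESS = "process"
    EVENT = "event"
    OBJECT = "object"

@dataclass(frozen=True)
class Concept:
    name: str
    top_class: TopClass

# Hypothetical concepts, one per top-level class.
CONCEPTS = [
    Concept("citizen", TopClass.AGENT),
    Concept("change of address", TopClass.PROCESS),
    Concept("street festival", TopClass.EVENT),
    Concept("washing machine", TopClass.OBJECT),
]

def concepts_of(top_class, concepts=CONCEPTS):
    """Names of all concepts filed under the given top-level class."""
    return [c.name for c in concepts if c.top_class is top_class]

print(concepts_of(TopClass.PROCESS))
```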
20. Using the ontology
- The approach is to establish a semantic similarity between a question provided by a citizen and the FAQs already available.
- The ontology needs to be complete, containing all the terms necessary to satisfy the requests.
- The ontology is supplemented with a number of thesauri to identify synonyms. Ex: baby and infant.
- Context information is used to resolve any ambiguity.
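A minimal sketch of matching a citizen's question against existing FAQs, assuming a thesaurus that maps synonyms to a preferred term (the baby/infant pair is from the slide; the other entries and the FAQ texts are invented). Word-overlap (Jaccard) similarity stands in for the semantic similarity measure, which the slides do not specify.

```python
# Hypothetical thesaurus: synonym -> preferred term.
THESAURUS = {"infant": "baby", "newborn": "baby"}

# Hypothetical FAQ list.
FAQS = [
    "How do I register a baby",
    "How do I pay car tax",
]

def normalise(text):
    """Lower-case, drop question marks, and replace synonyms by preferred terms."""
    words = text.lower().replace("?", "").split()
    return {THESAURUS.get(w, w) for w in words}

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two word sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def best_faq(question, faqs=FAQS):
    """Return the FAQ whose normalised words overlap most with the question."""
    q = normalise(question)
    return max(faqs, key=lambda faq: jaccard(q, normalise(faq)))

print(best_faq("Where to register an infant?"))
```

Without the thesaurus step, "infant" and "baby" would not match at all, which is the gap the thesauri are meant to close.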
21. Natural Language Processing for ZS
- The knowledge tagger automatically annotates text according to the domain ontology.
- It relies on a series of linguistic analyzers, sentence splitters, simple tokenizers, spell checkers, and morphological databases.
- The outcome of this analysis is an annotated text equivalent of the query.
- The query is then synthesized in terms of the domain ontology, in a query language such as RDQL, SPARQL, or SQL.
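The steps above can be sketched as a toy pipeline: split sentences, tokenize, annotate tokens with ontology concepts, and synthesize a SPARQL query string. The lexicon, the `zs:` prefix and its URI, and the `zs:relatedTo` property are all hypothetical; spell checking and morphological analysis are omitted.

```python
import re

# Hypothetical lexicon: surface word -> ontology concept.
LEXICON = {
    "dispose": "zs:Disposal",
    "machine": "zs:LargeItem",
}

def split_sentences(text):
    """Very simple sentence splitter on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def tokenize(sentence):
    """Simple tokenizer: lower-case alphabetic words only."""
    return re.findall(r"[a-z]+", sentence.lower())

def annotate(text):
    """Return (token, concept-or-None) pairs for every token in the text."""
    return [(tok, LEXICON.get(tok))
            for sentence in split_sentences(text)
            for tok in tokenize(sentence)]

def to_sparql(annotations):
    """Build a SPARQL query asking for services related to the tagged concepts."""
    concepts = sorted({c for _, c in annotations if c})
    patterns = "\n".join(f"  ?service zs:relatedTo {c} ." for c in concepts)
    return ("PREFIX zs: <http://example.org/zs#>\n"
            "SELECT ?service WHERE {\n" + patterns + "\n}")

annotated = annotate("I want to dispose a washing machine.")
print(to_sparql(annotated))
```

The annotated token list plays the role of the "annotated text equivalent of the query"; the generated string is the query expressed in ontology terms.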
22. Semantic annotation of city services
- Collect and index the information about services.
- Semantic processing results in ontological entities: concepts, instances, attributes, and relations.
- The output of this process is a set of semantically described services that can be checked against citizens' queries.
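A sketch of how semantically described services might be indexed and checked against a query, assuming each service is annotated with a set of concept identifiers. The service names, the `zs:` concepts, and the ranking by number of shared concepts are illustrative assumptions, not the documented ZS behavior.

```python
from collections import defaultdict

# Hypothetical annotations: service -> ontology concepts describing it.
SERVICE_ANNOTATIONS = {
    "Special collection of large items": {"zs:LargeItem", "zs:Disposal"},
    "Change of address": {"zs:Address", "zs:Citizen"},
}

def build_index(annotations):
    """Invert the annotations into concept -> set of services."""
    index = defaultdict(set)
    for service, concepts in annotations.items():
        for concept in concepts:
            index[concept].add(service)
    return index

def match_services(index, query_concepts):
    """Rank services by how many query concepts they share (ties broken by name)."""
    scores = defaultdict(int)
    for concept in query_concepts:
        for service in index.get(concept, ()):
            scores[service] += 1
    return sorted(scores, key=lambda s: (-scores[s], s))

INDEX = build_index(SERVICE_ANNOTATIONS)
print(match_services(INDEX, {"zs:LargeItem", "zs:Disposal"}))
```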
23. Overall architecture of ZS
- Search clients
- Search systems: Web services
- Ontology systems: ontology cache, ontology subsystem, Web services
- NLP systems: NLP cache, NLP subsystem, Web services
- Persistence: RDBMS
24. Summary
- Zaragoza is a powerful SOA that uses semantic knowledge to better serve its citizens.
- Its roadmap is open, with the ability to extend the system through its WS interface.
25. Networked SOA for Zaragoza
[Figure: diagram with the labels Enterprise layer, Intermediary layer, and Semantic search]