Title: Postcards from the Semantic Web: Metadata for Machines
1Postcards from the Semantic Web. Metadata for
Machines
- INF 384C
- Miles Efron
- School of Information
- University of Texas
2Metadata for Humans and for Machines
- We can understand the role of metadata in modern
information access along several axes:
- Metadata type (descriptive, structural, admin)
- Who is the metadata for?
- How informative is the structure latent in the
metadata?
- Today's lecture has two goals:
- To contextualize our earlier discussion of
metadata use in the real world.
- To articulate an important current in the
metadata world: creating metadata for use by
machines instead of humans (grandly exemplified
by the Semantic Web initiatives).
3Debating the merits of the Semantic Web
- How do the proponents of Semantic Web
technologies and initiatives argue that SW is
different from (and better than) the Web as we
know it?
- Why does the Semantic Web have so many, and such
vocal, detractors?
4Metadata and the Semantic Web
- The semantic web is a conglomeration of projects
that share the desire to create and maintain a
distributed information system based on highly
structured metadata.
- The operation of the semantic web is predicated
on the existence of shareable ontologies.
- The lingua franca of these ontologies is the
Resource Description Framework (RDF).
5The Idea(l) of the Semantic Web
- Instead of a Web of pages intended for human
interpretation, the Semantic Web consists of
linked pages intended for interpretation by
semi-autonomous software agents.
- Thus instead of the browser serving as your
gateway to Web-based info, you might have, for
example, a Calendar agent that helps you
negotiate people, places, and times.
6The Idea(l) of the Semantic Web
For this to work, such an agent would need to
know not only that people, places, and times
exist, but also how they relate to each other and
how they are structured. For example, a boss is a
kind of person, and so is an assistant. Also, two
people can occupy one place, but a person can
only occupy one place at a time.
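To make the calendar-agent idea concrete, here is a minimal Python sketch of the two kinds of knowledge the slide mentions: "is a kind of" relations and the one-place-at-a-time rule. The names IS_A, Booking, and conflicts are hypothetical, chosen only for illustration; a real agent would read this knowledge from a shared ontology rather than hard-code it.

```python
from dataclasses import dataclass

# Hypothetical mini-ontology: each class name maps to its parent ("is a kind of").
IS_A = {"boss": "person", "assistant": "person", "person": "thing"}


def is_a(cls, ancestor):
    """Follow is-a links upward until we reach the ancestor or run out of parents."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False


@dataclass(frozen=True)
class Booking:
    person: str
    place: str
    start: int  # hour of day, inclusive
    end: int    # hour of day, exclusive


def conflicts(a, b):
    """One person cannot be in two places at once; two people may share a place."""
    same_person = a.person == b.person
    overlapping = a.start < b.end and b.start < a.end
    return same_person and overlapping and a.place != b.place
```

With this in place, is_a("boss", "person") holds, two bookings for different people in the same room do not conflict, and two overlapping bookings for the same person in different rooms do.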
7What do we talk about when we talk about an
Ontology?
Short answer: "An ontology is a specification of
a conceptualization."
More generally: "an ontology is a description ...
of the concepts and relationships that can exist
for an agent or a community of agents."
(T. Gruber, "What is an Ontology?")
8In less lofty terms: Ontology
An ontology is a representation of the things in
a system, expressed in a consistent,
agreed-upon fashion.
Typically, the term ontology implies not only a
representation of things, but also a
representation of their relationships, along with
a semantic understanding of those relationships.
For example, a relational database may or may not
count as an ontology, depending on your
definition.
9Ontologies aid in Knowledge Sharing
knowledge about physics
knowledge about chemistry
how to compare/combine knowledge that is
represented in different ways?
10Ontologies aid in Knowledge Sharing
knowledge about physics
knowledge about chemistry
analogous representations allow for direct
comparisons
11Ontologies aid in Knowledge Sharing
knowledge about physics
knowledge about chemistry
analogous representations also encourage
combining knowledge bases
12Things you can do with an Ontology
- Which uranium isotopes are most prone to chain
reaction?
- What's a nice red wine to drink with Muenster?
- Show me all the people who have siblings named
Dave.
- Rich people make at least 100,000.
- It is not the case that I make at least
100,000.
- Therefore I am not a rich person.
The point of building an ontology is to enable
processing of knowledge by other programs. The
ontology itself is rarely the goal.
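As a rough illustration of "processing of knowledge by other programs", the sketch below answers the sibling question and applies the rich-person rule over a handful of made-up facts. Everything here (the FACTS list, the property names hasSibling, name, and income, and the people) is hypothetical; only the 100,000 figure comes from the slide.

```python
# Hypothetical facts, stored as (subject, property, value) tuples.
FACTS = [
    ("alice", "hasSibling", "p1"),
    ("bob", "hasSibling", "p2"),
    ("p1", "name", "Dave"),
    ("p2", "name", "Carol"),
    ("alice", "income", 60_000),
]

RICH_MINIMUM = 100_000  # "rich people make at least 100,000"


def people_with_sibling_named(name):
    """Query: who has a sibling whose name property equals `name`?"""
    named = {s for s, p, v in FACTS if p == "name" and v == name}
    return sorted(s for s, p, v in FACTS if p == "hasSibling" and v in named)


def is_not_rich(person):
    """Contrapositive of the rule: a known income below the minimum means not rich.

    Returns False when no income is recorded, i.e. nothing can be concluded.
    """
    incomes = [v for s, p, v in FACTS if s == person and p == "income"]
    return any(v < RICH_MINIMUM for v in incomes)
```

The second function only ever concludes "not rich"; the rule on the slide says nothing about who is rich, so the sketch does not try to infer that.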
13Elements of an Ontology
English            Ontology-speak
- concepts         - classes (e.g. wine, car)
- features         - slots (e.g. type, cost)
- actual things    - instances
[Diagram: car, linked to make, model, and year]
14Elements of an Ontology
[Diagram: car, linked to make, model, and year;
name and country appear alongside make]
Is it a class or a slot?
15Elements of an Ontology
Classes are the things that comprise the little
universe described by an ontology.
These classes are represented primarily by slots,
i.e. variables. Slots may themselves be classes.
16Organization of an Ontology
Ontologies typically encode associative
relationships between classes.
[Diagram: sedan is-a car (is-a relation)]
17Organization of an Ontology
Ontologies typically encode associative
relationships between classes.
[Diagram: bird has-a beak (has-a relation, i.e.
meronymy)]
18Organization of an Ontology
The notion of inheritance adds efficiency to
representation.
[Diagram: classes thing, car, sedan, and coupe,
with slots weight, make, and model; superclasses
vs. sub-classes]
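One way to see why inheritance is efficient is to write the hierarchy as Python dataclasses. This is only a sketch, and it assumes a particular reading of the diagram: that weight belongs to thing and that make and model belong to car. The helper slot_names is hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Thing:
    weight: float  # assumed to be declared once, on the top-level class


@dataclass
class Car(Thing):
    make: str
    model: str


@dataclass
class Sedan(Car):
    pass  # declares nothing new: every slot is inherited


@dataclass
class Coupe(Car):
    pass


def slot_names(cls):
    """List every slot a class carries, including inherited ones."""
    return [f.name for f in fields(cls)]
```

Sedan and Coupe declare no slots of their own, yet slot_names reports weight, make, and model for both. Each slot is written down once, on the most general class that has it.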
19Expressing Ontologies with XML
- The core idea underlying semantic web efforts is
that there should exist numerous ontologies, all
of which should be expressed in a way that allows
them to be shared/combined/used by a variety of
systems.
- The semantic web community has focused on using
XML as the lingua franca for expressing
ontologies.
- In particular, several XML markup languages
underpin the use of ontologies for semantic web
activity:
20Expressing Ontologies with XML
- RDF: The Resource Description Framework. Used
to express simple assertions about the
relationships between things.
- RDF Schema: A language used to describe the
relationship between classes of things, or
relationships between concepts in a particular
domain.
- OWL: The Web Ontology Language. Relies on both
RDF and RDF Schema to express the ontologies
themselves.
21The Resource Description Framework (RDF)
RDF describes anything.
Many of the following examples are adapted from
http://www.w3.org/TR/rdf-primer and
http://www.dlib.org/dlib/may98/miller/05miller.html.
22RDF Overview
- When we talk about RDF, we commonly refer to at
least three distinct ideas:
- RDF is a conceptual structure, a model for
expressing descriptive statements about
resources.
- RDF is often expressed (aka "serialized") using
XML.
- RDF/XML is informed by ontological knowledge,
expressed using another XML-based language: RDF
Schema.
23Problem: DTDs specify syntax, not semantics
<!ELEMENT letter (greeting,body,closing)>
<!ELEMENT greeting (salutation?,recipient)>
<!ELEMENT body (#PCDATA)>
<!ELEMENT closing (signoff?,sender)>
<!ELEMENT salutation (#PCDATA)>
<!ELEMENT recipient (#PCDATA)>
<!ELEMENT signoff (#PCDATA)>
<!ELEMENT sender (#PCDATA)>
<!ATTLIST letter letterDate CDATA #REQUIRED>
What is a sender? What is a sender's
relationship to a letter?
24Problem: DTDs specify syntax, not semantics
In fact, it can be argued that DTDs do provide
semantics. For example, Dublin Core offers a
creator element, with fairly obvious meaning. But
what we lack in this circumstance is a way to
compare semantics across DTDs. RDF seeks to
remedy this. The goal of RDF is to promote
semantic interoperability, much as DC promotes
syntactic interoperability.
25The RDF Model: Thinking of Metadata in terms of
Semantics
- http://www.ibiblio.org/mefron/ has a creator
with name Miles Efron
- http://www.ibiblio.org/mefron/ has a creation
date with value August 1998
- http://www.ibiblio.org/mefron/ has a language
with value English
26Thinking of Metadata in terms of Semantics
http://www.ibiblio.org/mefron/ (resource) has a
creation date (property) with value August 1998
(atomic value)
27Thinking of Metadata in terms of Semantics
http://www.ibiblio.org/mefron/ (resource) has a
creation date (property) with value August 1998
(atomic value)
RDF data types
- a resource is something that has a URI
- a property is something possessed by a resource
- an atomic value is a basic piece of data (such
as a number)
28RDF as a Language: a sample Statement
http://www.ibiblio.org/mefron/ (subject) has a
creation date (predicate) with value August 1998
(object)
29RDF as a Language: a sample Statement
http://www.ibiblio.org/mefron/ (subject) has a
creation date (predicate) with value August 1998
(object)
Together, these are called an RDF triple.
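A triple maps naturally onto a three-field record. The sketch below stores the three sample statements from the earlier slide as Python named tuples. The class name Triple and the helper describe are illustrative only, and the predicates are plain strings here, whereas real RDF would identify them with URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


PAGE = "http://www.ibiblio.org/mefron/"

# The three statements about the page, one triple each.
statements = [
    Triple(PAGE, "creator", "Miles Efron"),
    Triple(PAGE, "creation date", "August 1998"),
    Triple(PAGE, "language", "English"),
]


def describe(subject, triples):
    """Collect every predicate/object pair asserted about one subject."""
    return {t.predicate: t.object for t in triples if t.subject == subject}
```

Calling describe(PAGE, statements) gathers the three assertions back into a single description of the page, which is roughly what a metadata record is.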
30RDF as a Directed Graph
[Diagram: http://www.ibiblio.org/mefron (resource)
--creation date (property)--> August 1998 (atomic
value)]
31RDF as a Directed Graph
[Diagram: http://www.ibiblio.org/mefron (resource)
--author (property)--> miles_at_utexas.edu
(resource)]
32RDF as a Directed Graph
[Diagram: the resource Miles Efron, with
properties:
email --> miles_at_utexas.edu
name --> Efron, Miles
affiliation --> www.utexas.edu]
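Because the object of one statement can be the subject of another, a set of triples forms a directed graph that a program can walk. The sketch below is one simple way to do that in Python. It assumes, for illustration, that the author node is identified by miles_at_utexas.edu and carries the name and affiliation properties. The helpers build_graph and follow are hypothetical.

```python
from collections import defaultdict

PAGE = "http://www.ibiblio.org/mefron"
PERSON = "miles_at_utexas.edu"  # assumed identifier for the author node

edges = [
    (PAGE, "creation date", "August 1998"),
    (PAGE, "author", PERSON),
    (PERSON, "name", "Efron, Miles"),
    (PERSON, "affiliation", "www.utexas.edu"),
]


def build_graph(triples):
    """Index triples as node -> {property: target}.

    Simplification: a repeated property on the same node overwrites the earlier value.
    """
    graph = defaultdict(dict)
    for subject, prop, obj in triples:
        graph[subject][prop] = obj
    return graph


def follow(graph, start, *path):
    """Walk a chain of properties from a starting node and return where it ends."""
    node = start
    for prop in path:
        node = graph[node][prop]
    return node
```

For example, follow(build_graph(edges), PAGE, "author", "name") walks two arcs: from the page to its author, then from the author to the author's name.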
33Serializing RDF: RDF/XML
- To make our RDF conceptual models
machine-readable, the RDF working group at W3C
has defined an XML vocabulary for describing
resources.
- Unlike many of the other XML standards we've
discussed, RDF/XML uses shallow, repetitive
statements to articulate descriptions. This has
implications for the kind of info RDF is good at
expressing.
34Structure of an RDF/XML document
root element is RDF
the RDF element contains 1 or more Description
elements. Each Description has a resource's URI
as its about attribute
the Description element contains 1 or more child
nodes, each of which contains a predicate-value
pair.
35Structure of an RDF/XML document
Description elements are not strictly necessary,
but they will be present in all of our RDF/XML
documents.
36Structure of an RDF/XML document
RDF
  Description (about: http://www.ibiblio.org/mefron)
    author: Miles Efron
    title: Miles Efron's web page
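The outline above can be turned into an actual RDF/XML document with Python's standard xml.etree.ElementTree module. This is a sketch under one loud assumption: the slide does not say which vocabulary author and title come from, so the namespace http://example.org/terms# below is a placeholder, not a real vocabulary. The function name describe is also illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/terms#"  # placeholder vocabulary (assumption)

# Ask ElementTree to use readable prefixes when it serializes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ex", EX_NS)


def describe(about, properties):
    """Build rdf:RDF > rdf:Description with one child element per property."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in properties.items():
        child = ET.SubElement(desc, f"{{{EX_NS}}}{name}")
        child.text = value
    return root


doc = describe(
    "http://www.ibiblio.org/mefron",
    {"author": "Miles Efron", "title": "Miles Efron's web page"},
)
xml_text = ET.tostring(doc, encoding="unicode")
```

The structure matches the slide: one root element, one Description carrying the resource's URI in its about attribute, and one shallow child per property.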
37Properties are given semantics
RDF uses XML namespaces to specify inter-resource
relations.
[Diagram: http://www.ibiblio.org/mefron
--DC:creator--> Miles Efron]
<RDF:RDF
  xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:DC="http://dublincore.org/2003/03/24/dces#">
  <RDF:Description RDF:about="http://www.ibiblio.org/mefron">
    <DC:creator>Miles Efron</DC:creator>
  </RDF:Description>
</RDF:RDF>
38Properties are given Semantics
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:DC="http://www.dublincore.org/2003/03/24/dces#">
  <rdf:Description rdf:about="http://www.ibiblio.org/mefron">
    <DC:creator>Miles Efron</DC:creator>
    <DC:title>Miles Efron's web page</DC:title>
  </rdf:Description>
</rdf:RDF>
39Some Example RDF/XML Docs
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://dublincore.org/2003/03/24/dces#">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:date>August 16, 1999</dc:date>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>
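The example above uses two Description elements for the same resource. A short Python sketch shows why that is harmless: once the XML is flattened into triples, both statements simply share a subject. This uses only the standard library and handles just the simple shape seen here (Description elements with literal-valued children). It is not a general RDF/XML parser, and the function name triples is illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SAMPLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://dublincore.org/2003/03/24/dces#">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:date>August 16, 1999</dc:date>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Flatten simple RDF/XML into (subject, predicate, object) tuples."""
    root = ET.fromstring(xml_text)
    out = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree reports tags as "{namespace}localname".
            out.append((subject, prop.tag, (prop.text or "").strip()))
    return out
```

Both resulting triples have http://www.example.org/index.html as their subject, so a consumer can merge them into one description regardless of how the statements were grouped in the file.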
40Expressing Dublin Core in RDF/XML
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://dublincore.org/2003/03/24/dces#">
  <rdf:Description rdf:about="http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf/">
    <dc:title>Guidance on expressing the Dublin Core within the Resource Description Framework (RDF)</dc:title>
    <dc:creator>Eric Miller</dc:creator>
    <dc:subject>Dublin Core; Resource Description Framework; RDF; eXtensible Markup Language; XML</dc:subject>
    <dc:publisher>Dublin Core Metadata Initiative</dc:publisher>
    <dc:contributor>Dublin Core Data Model Working Group</dc:contributor>
    <dc:date>1999-07-01</dc:date>
    <dc:format>text/html</dc:format>
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>
Adapted from http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf
41Expressing Dublin Core in RDF/XML
So what exactly do these namespace URIs point
to? What is a title, or a subject? Ideally,
these terms should be drawn from an ontology,
expressed in RDF Schema or a related language.
Adapted from http://www.ukoln.ac.uk/metadata/resources/dc/datamodel/WD-dc-rdf
42RDF Schema
- RDF gives us a syntax for expressing
relationships between things in the world.
- But it doesn't explicitly lend these
relationships any meaning.
- To enable systems to share meaningful, structured
information (i.e., sharing knowledge), RDF
statements often adhere to ontologies expressed
in RDF Schema.
43RDF Schema
- In our earlier discussion of XML, we talked about
DTDs and Schemas.
- These tools define a metadata vocabulary.
- Similarly, RDF Schemas define the vocabularies
for ontologies. Additionally, RDF Schema
provides constructs for defining ontological
relationships among an ontology's classes and
properties (slots).
44A simple Schema: Bread
- Imagine that we are going to create an electronic
database of bread recipes.
- The database will be expressed using RDF so that
we can share information with other services.
- To enable that sharing, we need to define the
kinds of knowledge that our database contains.
Hence the schema.
45Outline of the Ontology
[Diagram: bread --has a--> shape;
bread --has an--> ingredient;
leaven --is a--> ingredient]
46Outline of the Ontology
[Diagram: bread --has a--> shape;
bread --has an--> ingredient;
leaven --is a--> ingredient]
Our ovals will become classes, and the arrows
will become properties. N.B. This is not a formal
RDF graph.
47Writing the Ontology
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
>
</rdf:RDF>
RDF Schema documents are valid RDF, so we start
as usual, adding a namespace for the Schema
language (this gives us access to the ontology
constructs).
48Writing the Ontology
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
>
  <!-- top-level classes -->
  <rdfs:Class rdf:ID='Shape' />
  <rdfs:Class rdf:ID='Ingredient' />
  <rdfs:Class rdf:ID='Bread' />
</rdf:RDF>
49Writing the Ontology
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
>
  <!-- top-level classes -->
  <rdfs:Class rdf:ID='Shape' />
  <rdfs:Class rdf:ID='Ingredient' />
  <rdfs:Class rdf:ID='Bread' />
  <!-- sub-classes -->
  <rdfs:Class rdf:ID='Leaven'>
    <rdfs:subClassOf rdf:resource='#Ingredient' />
  </rdfs:Class>
</rdf:RDF>
50Writing the Ontology
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
>
  <!-- class declarations omitted due to space -->
  <!-- properties -->
  <rdf:Property rdf:ID='breadName'>
    <rdfs:domain rdf:resource='#Bread' />
  </rdf:Property>
  <rdf:Property rdf:ID='breadShape'>
    <rdfs:domain rdf:resource='#Bread' />
    <rdfs:range rdf:resource='#Shape' />
  </rdf:Property>
  <rdf:Property rdf:ID='breadIngredient'>
    <rdfs:domain rdf:resource='#Bread' />
    <rdfs:range rdf:resource='#Ingredient' />
  </rdf:Property>
  <rdf:Property rdf:ID='breadHydration'>
    <rdfs:domain rdf:resource='#Bread' />
  </rdf:Property>
</rdf:RDF>
The ontology is available at
http://www.ibiblio.org/mefron/xml/rdf/bread.rdf
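What does a program gain from the subClassOf declaration? A minimal Python sketch: if Leaven is a sub-class of Ingredient, anything typed as a Leaven can also be treated as an Ingredient. The class names come from the schema. The instance names and the helpers superclasses and is_instance_of are illustrative assumptions, not part of RDF Schema itself.

```python
# Sub-class links from the bread schema: child -> parent.
SUBCLASS_OF = {"Leaven": "Ingredient"}

# Illustrative instances with their declared types (assumption for this sketch).
INSTANCE_TYPES = {
    "instantYeast": "Leaven",
    "salt": "Ingredient",
    "batard": "Shape",
}


def superclasses(cls):
    """Return every ancestor of a class, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance_of(instance, cls):
    """True if the instance's declared type is cls or a sub-class of cls."""
    declared = INSTANCE_TYPES.get(instance)
    if declared is None:
        return False
    return declared == cls or cls in superclasses(declared)
```

Here is_instance_of("instantYeast", "Ingredient") holds even though the yeast was only ever declared to be a Leaven, while salt, declared as a plain Ingredient, is not thereby a Leaven.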
51Using the ontology
- Imagine that we have a recipe for the wonderful
Italian bread called pugliese.
- The recipe is stored at the URI
http://www.bread.com/recipe1.html
- We want to create an entry in our database that
describes this recipe using the knowledge we've
just formalized.
52
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:bread="http://www.ibiblio.org/mefron/xml/rdf/bread#"
>
</rdf:RDF>
We declare a namespace for the ontology,
attaching it to a copy of the Schema (this is
where I've put it online).
53
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:bread="http://www.ibiblio.org/mefron/xml/rdf/bread#"
>
  <rdf:Description rdf:ID="batard"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Shape">
  </rdf:Description>
  <rdf:Description rdf:ID="durumFlour"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Ingredient">
  </rdf:Description>
  <rdf:Description rdf:ID="wheatFlour"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Ingredient">
  </rdf:Description>
  <rdf:Description rdf:ID="salt"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Ingredient">
  </rdf:Description>
  <rdf:Description rdf:ID="water"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Ingredient">
  </rdf:Description>
  <rdf:Description rdf:ID="instantYeast"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Leaven">
  </rdf:Description>
</rdf:RDF>
54
<?xml version="1.0" ?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:bread="http://www.ibiblio.org/mefron/xml/rdf/bread#"
>
  <!-- instances omitted for space -->
  <rdf:Description rdf:about="http://www.bread.com/recipe1.html"
    rdf:type="http://www.ibiblio.org/mefron/xml/rdf/bread#Bread">
    <bread:breadName>pugliese</bread:breadName>
    <bread:breadShape>batard</bread:breadShape>
    <bread:breadIngredient>durumFlour</bread:breadIngredient>
    <bread:breadIngredient>wheatFlour</bread:breadIngredient>
    <bread:breadIngredient>salt</bread:breadIngredient>
    <bread:breadLeaven>instantYeast</bread:breadLeaven>
    <bread:breadHydration>0.78</bread:breadHydration>
  </rdf:Description>
</rdf:RDF>
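A program that shares this ontology could check the recipe's statements against the schema's domain and range declarations. The Python sketch below does this over hand-copied data. It is a simplification: strictly speaking, RDF Schema uses domain and range to infer types rather than to reject data, so treating them as validation rules is an assumption of this example. The short node name recipe1 (standing in for the recipe's URI) and the helpers has_type and check are hypothetical.

```python
# Property declarations copied from the bread schema (None = no range declared).
PROPERTIES = {
    "breadName": {"domain": "Bread", "range": None},
    "breadShape": {"domain": "Bread", "range": "Shape"},
    "breadIngredient": {"domain": "Bread", "range": "Ingredient"},
    "breadHydration": {"domain": "Bread", "range": None},
}
SUBCLASS_OF = {"Leaven": "Ingredient"}

# Declared types; "recipe1" is shorthand for the recipe's URI (assumption).
TYPES = {
    "recipe1": "Bread",
    "batard": "Shape",
    "durumFlour": "Ingredient",
    "instantYeast": "Leaven",
}


def has_type(node, cls):
    """True if the node's declared type is cls or one of its sub-classes."""
    current = TYPES.get(node)
    while current is not None:
        if current == cls:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def check(subject, prop, value):
    """Return a list of problems with one statement; an empty list means it fits."""
    spec = PROPERTIES.get(prop)
    if spec is None:
        return [f"{prop} is not declared in the schema"]
    problems = []
    if not has_type(subject, spec["domain"]):
        problems.append(f"{subject} is not a {spec['domain']}")
    if spec["range"] is not None and not has_type(value, spec["range"]):
        problems.append(f"{value} is not a {spec['range']}")
    return problems
```

Run against the recipe, the check accepts breadShape with batard and breadIngredient with durumFlour, but reports breadLeaven as undeclared: the recipe uses that property, yet the property declarations shown earlier do not include it.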
55Again: Why the Skepticism?
56Again: Why the Skepticism?
- Some critics have argued that the Semantic Web
has been slow to catch on because it hinges on
persuading data owners to structure their
information manually, often in the absence of a
clear economic incentive for doing so. While the
Semantic Web approach may work well for targeted
vertical applications where there is a built-in
economic incentive to support expensive markup
work (such as biomedical information), such a
labor-intensive platform will never scale to the
Web as a whole.
- Rather than relying on Web site owners to mark up
their data, couldn't search engines simply do it
for them?
- Google now sends a spider to pull up individual
query forms and indexes the contents of the form
for clues about the topic it covers.
- --Wright, Alex. (2008). Searching the Deep Web.
Communications of the ACM, 51(10), 14-15.
57Again: Why the Skepticism?
People sometimes talk about the semantic web (vs.
the Semantic Web). What's the difference?
58Metadata and the Semantic Web
- As we've already discussed, the idea behind the
Semantic Web is to put in place metadata that is
intended to help machines learn about things in a
way that is useful to people.
- To guide this learning, stakeholders in the
Semantic Web use a host of XML-based tools (RDF,
OWL (the Web Ontology Language), etc.) to encode
ontologies: expressions of knowledge about the
world that machines can use.
- Let's look at some Semantic Web examples.
59Postcards from the Semantic Web
- http://www.semantic-systems-biology.org/biogateway/querying
- W3C's RDF Validator
(http://www.w3.org/RDF/Validator/): validates RDF
and will generate a graph visualization of the
marked-up relationships.