Title: Semantic Basics: Markup, Querying, and Reasoning
1Semantic Basics Markup, Querying, and Reasoning
- Marlon Pierce
- Community Grids Lab
- Indiana University
- With Slides and Help from Sean Bechhofer, Carole
Goble, Line Pouchard, and Dave De Roure
2Preface Beyond XML
3Reductio ad Absurdum
- "Physics is the study of the harmonic oscillator." - H. L. Richards
- "Statistical Mechanics is the study of the Ising Model." - H. L. Richards
- "Web Service standards are the study of <xsd:any> sequences." - M. E. Pierce, soon to be anonymous
4Which Web Service Specs?
<xs:element name="Header" type="tns:Header" />
<xs:complexType name="Header">
  <xs:sequence>
    <xs:any namespace="##any" processContents="lax" minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="lax" />
</xs:complexType>

<xsd:complexType name="SecurityHeaderType">
  <xsd:sequence>
    <xsd:any processContents="lax" minOccurs="0" maxOccurs="unbounded">
    </xsd:any>
  </xsd:sequence>
  <xsd:anyAttribute namespace="##other" processContents="lax" />
</xsd:complexType>
5Which, What, and Why?
- Which is what?
- Left (the listing with the xs prefix) is the definition of the SOAP header.
- Right (the listing with the xsd prefix) is taken from the Web Service Secure Messaging Specification.
- You will find this pattern repeated pretty often in web service specifications.
- Why?
- We have limited ways of linking several XML schema data models.
- XML maps relationships to trees.
- Graphs are a more natural way of expressing many inter-relationships of concepts.
6XML for KR
- Definition of self-describing data in a worldwide standardized, non-proprietary format.
- Structured data and knowledge exchange for enterprises in various industries.
- Integration of information from different sources into uniform documents.
- Exchange of knowledge bases between different AI languages, knowledge bases and databases, application systems, etc.
- But...
7XML is not enough
The Creator of the Resource http://www.w3.org/Home/Lassila is Ora Lassila
- XML defines grammars to verify and structure documents.
- The grammar enforces constraints on tags.
- Different grammars can define the same content.
- XML lacks a semantic model; it only has a surface model, which is a tree.
8XML is not enough
- The meaning of XML documents is intuitively clear to people.
- "Semantic" markup tags are domain terms.
- But computers do not have intuition.
- Tag names per se do not provide semantics.
- The semantics are encoded outside the XML specification.
- XML makes no commitment on:
- (1) Domain-specific ontological vocabulary
- (2) Ontological modeling primitives
- So it requires pre-arranged agreement on (1) and (2).
- Feasible for closed collaboration
- agents in a small, stable community
- pages on a small, stable intranet
- Semantic Web markups are often expressed in XML, but they carry extra meaning.
9Enter the Semantic Web/Grid
- The Semantic Web is the representation of data
on the World Wide Web. It is a collaborative
effort led by W3C with participation from a large
number of researchers and industrial partners. It
is based on the Resource Description Framework
(RDF), which integrates a variety of applications
using XML for syntax and URIs for naming.
10Resource Description Framework
- Overview of RDF basic ideas and XML encoding.
11Building Semantic Markup Languages
- XML essentially defines syntax rules for markup languages.
- "Human readable" means humans provide the meaning.
- We also would like some limited ability to encode meaning directly within markup languages.
- The semantic markup languages attempt to do that, with increasing sophistication.
- The stack indicates direct dependencies: DAML is defined in terms of RDF and RDFS.
Eric Miller, http://www.w3.org/2002/Talks/www2002-w3ct-swintro-em/
12Resource Description Framework (RDF)
- RDF is the simplest of the semantic languages.
- Basic Idea 1: Triples
- RDF is based on a subject-verb-object statement structure.
- RDF subjects are called resources.
- Verbs are called properties.
- Basic Idea 2: Everything is a resource that is named with a URI.
- RDF subjects, verbs, and (non-literal) objects are all labeled with URIs.
- Recall that a URI is just a name for a resource.
- It may be a URL, but not necessarily.
- A URI can name anything that can be described:
- Web pages, creators of web pages, organizations that the creator works for, ...
13What Does This Have to Do with Grid Computing?
- RDF resources aren't just web pages.
- They can be computer codes, simulation and experimental data, hardware, research groups, algorithms, ...
- Recall from the CMCS chemistry example that they needed to describe the provenance, annotation, and curation of chemistry data.
- "Compound X's properties were calculated by Dr. Y."
- CMCS maps all of their metadata to the Dublin Core.
- The Dublin Core is encoded quite nicely as RDF.
14RDF Graph Model
- RDF is defined by a graph model.
- Resources are denoted by ovals.
- Lines (arcs) indicate properties.
- Squares indicate string literals (no URI).
- Resources and properties are labeled by a URI.
[Figure: RDF graph. The resource http://.../CMCS/Entries/X has a http://purl.org/dc/elements/1.1/creator arc to the resource http://.../CMCS/People/DrY and a http://purl.org/dc/elements/1.1/title arc to the string literal "H2O".]
15Encoding RDF as Triplets
- RDF graphs may be written as triple sentences.
- A triple is just the subject, predicate, and object (in that order) of a graph segment.
- <http://.../CMCS/Entries/X> <http://purl.org/dc/elements/1.1/creator> <http://.../CMCS/People/DrY>
- This structure may look trivial but is useful in expressing queries (more later).
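A minimal Python sketch of writing a graph as triple sentences, assuming the CMCS URIs from the graph slide (the "..." elisions are kept exactly as the slides show them, so these are names, not fetchable addresses). The helper name `to_sentence` and the convention of marking literals with surrounding quotes are assumptions made for this illustration only.

```python
# Each RDF statement is a (subject, predicate, object) tuple.
DC = "http://purl.org/dc/elements/1.1/"

triples = [
    ("http://.../CMCS/Entries/X", DC + "creator", "http://.../CMCS/People/DrY"),
    ("http://.../CMCS/Entries/X", DC + "title", '"H2O"'),  # quoted: a string literal
]

def to_sentence(triple):
    """Write one triple as a line: URIs in angle brackets, literals unchanged."""
    def term(t):
        return t if t.startswith('"') else "<" + t + ">"
    s, p, o = triple
    return " ".join([term(s), term(p), term(o)]) + " ."

for t in triples:
    print(to_sentence(t))
```

Keeping the statements as plain tuples is what makes the later query slides easy to follow: a query is just a triple with one or more positions left open.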
16Encoding RDF in XML
- The graph represents two statements.
- Entry X has a creator, Dr. Y.
- Entry X has a title, H2O.
- In RDF XML, we have the following tags:
- <RDF> </RDF> denote the beginning and end of the RDF description.
- <Description>'s "about" attribute identifies the subject of the sentence.
- <Description></Description> enclose the properties and their values.
- We import Dublin Core conventional properties (creator, title) from outside RDF proper.
17RDF XML The Gory Details
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:dc='http://purl.org/dc/elements/1.0/'>
  <rdf:Description rdf:about='http://.../X'>
    <dc:creator rdf:resource='http://.../people/MEP'/>
    <dc:title rdf:resource='H2O'/>
  </rdf:Description>
</rdf:RDF>
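To see how the RDF/XML tags map back to triples, here is a rough Python sketch using only the standard library. It is not a real RDF parser: it assumes flat rdf:Description blocks only. The example.org URIs and the function name `read_triples` are hypothetical, and the title is written as element text (a literal) rather than as an rdf:resource attribute, which is a choice made for this sketch.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

doc = """<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
                  xmlns:dc='http://purl.org/dc/elements/1.0/'>
  <rdf:Description rdf:about='http://example.org/X'>
    <dc:creator rdf:resource='http://example.org/people/MEP'/>
    <dc:title>H2O</dc:title>
  </rdf:Description>
</rdf:RDF>"""

def read_triples(xml_text):
    """Return (subject, predicate, object) tuples from flat rdf:Description blocks."""
    triples = []
    root = ET.fromstring(xml_text)
    for desc in root.findall("{%s}Description" % RDF):
        subject = desc.get("{%s}about" % RDF)
        for prop in desc:
            # ElementTree names look like "{namespace}local"; joining the two
            # parts gives the property URI.
            predicate = prop.tag[1:].replace("}", "")
            obj = prop.get("{%s}resource" % RDF)
            if obj is None:
                obj = prop.text  # no rdf:resource attribute: treat as a literal
            triples.append((subject, predicate, obj))
    return triples

for triple in read_triples(doc):
    print(triple)
```

For anything beyond a toy document, a real toolkit such as Jena is the better choice, since RDF/XML allows many equivalent spellings that this sketch ignores.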
18Creating RDF Documents
- Writing RDF XML (or DAML or OWL) by hand is not easy.
- It's a good way to learn to read and write it, but once you understand it, automate it.
- Authoring tools are available:
- OntoMat: buggy
- Protégé: preferred by CGL grad students
- IsaViz: another nice tool with very good graphics.
- You can also generate these programmatically using Hewlett-Packard Labs' Jena toolkit for Java.
- This is what I did in the previous example.
19What is the Advantage?
- So far, properties are just conventional URI names.
- All semantic web properties are conventional assertions about relationships between resources.
- RDFS and DAML will offer more precise property capabilities.
- But there is a powerful feature we are about to explore:
- Properties provide a powerful way of linking different RDF resources.
- "Nuggets" of information.
- For example, a publication is a resource that can be described by RDF.
- Author, publication date, and URL are all metadata property values.
- But publications have references that are just other publications.
- DC's hasReference can be used to point from one publication to another.
- Publications also have authors.
- An author is more than a name.
- Also an RDF resource with collections of properties
- Name, email, telephone number, ...
20vCard Representing People with RDF Properties
- The Dublin Core tags are best used to represent metadata about published content.
- Documents, published data
- vCards are an IETF standard for representing people.
- Typical properties include name, email, organization membership, mailing address, title, etc.
- See http://www.ietf.org/rfc/rfc2426.txt
- Like the DC, vCards are independent of (and predate) RDF but map naturally into RDF.
- Each of these maps naturally to an RDF property.
- See http://www.w3.org/TR/2001/NOTE-vcard-rdf-20010222/
21Example A vCard in RDF/XML
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:vcard='http://www.w3.org/2001/vcard-rdf/3.0#'>
  <rdf:Description rdf:about='http://cgl.indiana.edu/people/GCF'
                   vcard:EMAIL='gcf_at_indiana.edu'>
    <vcard:FN>Geoffrey Fox</vcard:FN>
    <vcard:N vcard:Given='Geoffrey' vcard:Family='Fox'/>
  </rdf:Description>
</rdf:RDF>
22Linking vCard and Dublin Core Resources
- The real power of RDF is that you can link two independently specified resources through the use of properties.
- We do this using URIs as universal pointers.
- They identify specific resources (nouns) and specifications for properties (verbs).
- The URIs may optionally be URLs that can be used to fetch the information.
- Linking these resource nuggets allows us to pose queries like:
- What is the email address of the creator of this entry in the chemical database?
- What other entries refer, directly or indirectly, to this data entry?
- Linkages can be made at any time.
- They don't have to be designed into the system.
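The "directly or indirectly" query can be sketched in a few lines of Python. This is only an illustration under assumptions: the `entry:N` names and the `dcterms:references` shorthand are made up for the example, and the function name `referrers` is hypothetical.

```python
REFERENCES = "dcterms:references"  # shorthand label, an assumption for this sketch

triples = {
    ("entry:1", REFERENCES, "entry:2"),
    ("entry:2", REFERENCES, "entry:3"),
    ("entry:4", REFERENCES, "entry:1"),
    ("entry:5", "dc:title", "unrelated"),
}

def referrers(target, graph):
    """Entries that reference `target` directly or through a chain of references."""
    found = set()
    frontier = [target]
    while frontier:
        node = frontier.pop()
        for s, p, o in graph:
            # Walk reference arcs backwards, skipping entries already visited.
            if p == REFERENCES and o == node and s not in found:
                found.add(s)
                frontier.append(s)
    return found

print(sorted(referrers("entry:3", triples)))
```

Because new triples can be added to the set at any time, a link made after the fact shows up in the answer without changing the query.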
23Graph Model Depicting vCard and DC Linking
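[Figure: RDF graph. The entry http://.../CMCS/Entry/1 has a dc:title arc to the literal "H2O" and a dc:creator arc to the resource http://.../People/DrY. That resource has a vcard:EMAIL arc to the literal dry_at_stateu.edu and a vcard:N arc to a name node, which carries vcard:Family and vcard:Given arcs.]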
24What Else Does RDF Do?
- Collections: typically used as the object of an RDF statement.
- Bag: unordered collection of resources or literals.
- Sequence: ordered collection of resources or literals.
- Alternative: collection of resources or literals, from which only one value may be chosen.
- And that's about it. RDF does not define properties, it just tells you where to put them.
- Definitions are done by specific groups for specific fields (Dublin Core Metadata Initiative, for example).
- RDF Schema provides the rules for defining specific resource classes and properties.
25RDF Schema
26Other Semantic Markup Languages
- RDF Schema (RDFS)
- Provides formal definitions of RDF.
- Also provides language tools for writing more specialized languages.
- We'll examine it in more detail.
- DARPA Agent Markup Language (DAML)
- DAML+OIL is the language component of the DAML project.
- Defined using RDF/RDFS.
- We'll examine it in more detail.
- Ontology Inference Layer (OIL)
- The OIL language is expressed in terms of RDF/RDFS.
- The OIL project is sponsored by the European Union.
- Web Ontology Language (OWL)
- Developed by the W3C's Web-Ontology Working Group.
- Based on DAML+OIL.
27RDF Schema
- RDF Schema is a rules system for building RDF languages.
- RDF and RDFS are defined in terms of RDFS.
- DAML+OIL is defined by RDFS.
- Take the Dublin Core RDF encoding as an example.
- Can we formalize this process, defining a consistent set of rules?
- Can we place restrictions and use inheritance to define resources?
- What really is the value of "creator"? Can I derive it from another class, like "person"?
- Can we provide restrictions and rules for properties?
- How can I express the fact that "title" should only appear once?
- The current DC encoding in fact is defined by RDFS.
28Some RDFS Classes
29Some RDFS Properties
30Sample RDFS Defining <Property>
<rdfs:Class rdf:about="http://.../some/uri">
  <rdfs:isDefinedBy rdf:resource="http://.../some/uri"/>
  <rdfs:label>Property</rdfs:label>
  <rdfs:comment>The class of RDF properties.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="http://.../Resource"/>
</rdfs:Class>
- This is the definition of <Property>, taken from the RDF schema.
- The "about" attribute names this nugget.
- <Property> has several properties.
- <label> and <comment> are self-explanatory.
- <subClassOf> means <Property> is a subclass of <Resource>.
- <isDefinedBy> points to the human-readable documentation.
31RDFS Takeaway
- RDFS defines a set of classes and properties that can be used to define new RDF-like languages.
- RDFS actually bootstraps itself.
- You can express inheritance and restriction.
- If you want to learn more, see the specification:
- http://www.w3.org/TR/2003/WD-rdf-schema-20030123/
- But don't trust the write-up.
- Concepts are best understood by looking at the RDF XML. English descriptions get convoluted.
- If you want to see RDFS in action, see the DC:
- http://dublincore.org/2003/03/24/dces
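As a rough illustration of what "expressing inheritance" buys you, the Python sketch below follows subClassOf links to work out every class a resource belongs to. The `ex:` vocabulary (Creator, Person, Agent) and the function names are hypothetical, and the prefixed names are shorthand labels rather than full URIs.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical vocabulary: a creator is a kind of person, a person a kind of agent.
schema = {
    ("ex:Creator", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
}
data = {("ex:DrY", TYPE, "ex:Creator")}

def superclasses(cls, graph):
    """All classes reachable from `cls` by following rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

def types_of(resource, graph):
    """Asserted types plus the types inherited through the class hierarchy."""
    direct = {o for s, p, o in graph if s == resource and p == TYPE}
    inherited = set()
    for cls in direct:
        inherited |= superclasses(cls, graph)
    return direct | inherited

print(sorted(types_of("ex:DrY", schema | data)))
```

Only the single fact "DrY is a Creator" was asserted; the other two memberships come from the schema.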
32Web Ontology Language(OWL)
- Eeyore: "W-O-L. That spells owl."
- Owl: "Bless my soul! So it does!"
33Whats an Ontology?
- "Ontology" is an often-used term in the fields of Knowledge Representation, Information Modeling, etc.
- English definitions tend to be vague to non-specialists.
- "A formal, explicit specification of a shared conceptualization."
- Clearer definition: an ontology is a taxonomy combined with inference rules.
- T. Berners-Lee, J. Hendler, O. Lassila
- But really, if you sit down to describe a subject in terms of its classes and their relationships, you are creating an ontology.
- You can express this in RDFS or OWL.
34Problems with RDFS
- RDFS is too weak to describe resources in sufficient detail.
- No localised range and domain constraints
- Can't say that the range of hasChild is person when applied to persons and elephant when applied to elephants.
- No existence/cardinality constraints
- Can't say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents.
- No transitive, inverse or symmetrical properties
- Can't say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf, or that touches is symmetrical.
- Difficult to provide reasoning support
- No "native" reasoners for non-standard semantics
- May be possible to reason via FO axiomatisation
35Web Ontology Language Requirements
- Desirable features identified for the Web Ontology Language:
- Extends existing Web standards
- Such as XML, RDF, RDFS
- Easy to understand and use
- Should be based on familiar KR idioms
- Of adequate expressive power
- Formally specified
- Possible to provide automated reasoning support
36History From RDF to OWL
- Two languages were developed by extending (part of) RDF.
- OIL: developed by a group of (largely) European researchers.
- DAML-ONT: developed by a group of (largely) US researchers (in the DARPA DAML programme).
- Efforts merged to produce DAML+OIL.
- Development was carried out by the Joint EU/US Committee on Agent Markup Languages.
- Extends (a subset of) RDF.
- DAML+OIL was submitted to W3C as the basis for standardisation.
- The Web-Ontology (WebOnt) Working Group was formed.
- The WebOnt group developed the OWL language based on DAML+OIL.
- The OWL language is now a W3C Recommendation (Feb 2004).
37What Are Description Logics?
- A family of logic-based Knowledge Representation formalisms
- Descendants of semantic networks and KL-ONE
- Describe a domain in terms of concepts (classes), roles (relationships) and individuals
- Distinguished by:
- Formal semantics (typically model theoretic)
- Decidable fragments of FOL
- Closely related to Propositional Modal and Dynamic Logics
- Provision of inference services
- Sound and complete decision procedures for key problems
- Implemented systems (highly optimised)
38Short History of Description Logics
- Phase 1
- Incomplete systems (Back, Classic, Loom, ...)
- Based on structural algorithms
- Phase 2
- Development of tableau algorithms and complexity results
- Tableau-based systems for Pspace logics (e.g., Kris, Crack)
- Investigation of optimisation techniques
- Phase 3
- Tableau algorithms for very expressive DLs
- Highly optimised tableau systems for ExpTime logics (e.g., FaCT, DLP, Racer)
- Relationship to modal logic and decidable fragments of FOL
39Latest Developments
- Phase 4
- Mature implementations
- Mainstream applications and tools
- Databases
- Consistency of conceptual schemata (EER, UML, etc.)
- Schema integration
- Query subsumption (w.r.t. a conceptual schema)
- Ontologies and Semantic Web (and Grid)
- Ontology engineering (design, maintenance, integration)
- Reasoning with ontology-based markup (meta-data)
- Service description and discovery
- Commercial implementations
- Cerebra system from Network Inference Ltd
40OWL Semantic Layering
- Three language layers
- OWL Full
- Union of OWL vocabulary and RDFS
- OWL DL
- Restricted to the DL/FOL fragment (roughly DAML+OIL)
- OWL Lite
- Subset of OWL DL
- Syntactic layering
- Semantic layering
- Layers should agree on semantics.
[Figure: the three layers, labeled Full, DL, and Lite.]
41OWL Full
- No restriction on use of OWL vocabulary (as long as it is legal RDF)
- Classes as instances
- Assertions about vocabulary
- RDF-style model theory
- Reasoning using FOL engines
- via axiomatisation
- Semantics should correspond with OWL DL for suitably restricted KBs
42OWL DL
- Use of OWL vocabulary is restricted.
- Can't be used to do "nasty things" (i.e., modify OWL).
- No classes as instances
- Standard DL/FOL model theory (definitive)
- Direct correspondence with (first order) logic
- Reasoning via DL engines
- Some problems with oneOf/inverse
- Reasoning for the full language via FOL engines
- Would need built-in datatypes for performance
43OWL Lite
- Like DL, but fewer constructs
- No explicit negation or union
- Restricted cardinality (zero or one)
- No nominals (oneOf)
- Semantics as per DL
- Reasoning via standard DL engines (datatypes)
- E.g., FaCT, RACER, Cerebra
44An OWL Example
- An Earth Systems Grid example
45An Example Ontology Climate Data
- The example shows how to construct a really simple ontology and instance.
- Two classes
- dataset
- Parameter
- One property
- hasParameter
- Several parameters: cloud_medium, bounds_latitude, temperature
- Line Pouchard (ORNL) created this for ESG using Protégé and OilEd.
- Full ontology shown at the end for reference.
46Ontology header With Dublin Core Parameters.
Class Definitions
hasParameter Definition
47Parameter Cloud_medium
Parameter Bounds_latitude
Parameter temperature
48OWL Enriched RDF Metadata about
PCM.B06.10.dataset1
49OWL Equivalence and Inheritance
<owl:Class rdf:ID="user">
  <owl:equivalentClass rdf:resource="person"/>
</owl:Class>
<owl:Class rdf:about="magneticSpectrometer">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="hasMagnets"/>
      <owl:allValuesFrom rdf:resource="Spectrometer"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
- Other logical relationships that can be asserted:
- inverseOf,
- TransitiveProperty,
- SymmetricProperty,
- FunctionalProperty,
- InverseFunctionalProperty
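What a reasoner can do with these property declarations can be sketched in plain Python. The sketch below covers only transitive, symmetric, and inverse properties, applied by brute force until nothing new appears; it is not how a DL reasoner works internally. The property names isPartOf, hasPart, and touches come from the RDFS-problems slide, while the individuals (magnet, spectrometer, beamline, yoke) and the function name `infer` are made up for illustration.

```python
def infer(facts, transitive=(), symmetric=(), inverse_of=None):
    """Apply transitive, symmetric, and inverse rules until nothing new appears."""
    inverse_of = inverse_of or {}
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:  # fixed point reached
            return facts
        facts |= new

facts = {
    ("magnet", "isPartOf", "spectrometer"),
    ("spectrometer", "isPartOf", "beamline"),
    ("magnet", "touches", "yoke"),
}
closed = infer(
    facts,
    transitive={"isPartOf"},
    symmetric={"touches"},
    inverse_of={"isPartOf": "hasPart"},
)
for fact in sorted(closed):
    print(fact)
```

Three asserted facts grow to eight, including the statement that the beamline has the magnet as a part, which nobody wrote down explicitly.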
50Illustration of Inverse Properties
51Querying Semantic Data
- The Data Access Working Group (DAWG)
52What Is Semantic Querying?
- Don't confuse querying with inference.
- Querying just means retrieving data from semantic data models.
- Post a query to the world of distributed RDF data nuggets.
- For RDF-like structures, this amounts to querying triples.
- Examples
- Finding an email address from a person's vCard.
- Searching across subgraphs: get me the email of the author of this document (Dublin Core + vCard).
- Persistent/scheduled queries on updates to several multimedia databases.
53The DAWG Working Group
- Unfortunately, there are no standards for querying RDF, etc.
- There are solutions, like RDQL/SquishQL.
- These are just not official.
- The W3C Data Access Working Group (DAWG) is filling the query gap.
- Formed Feb 2004.
- This is a work in progress.
- Use Cases and Requirements: http://www.w3.org/TR/rdf-dawg-uc/
- BRQL Query Language: http://www.w3.org/2001/sw/DataAccess/rq23/
54A Simple Query
- Consider the following RDF triple:
- <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> "BRQL Tutorial"
- Recall this is equivalent to the sentence "book1 has title BRQL Tutorial".
- We may have a large set of such triples in our data store.
- We want to make a query on this data like this: "What is the title of book1?"
55The Query and the Results
- We can construct queries on any of the parts of the triple, such as:
- SELECT ?title
- WHERE <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .
- This just means "what is the title of book1?"
- Result: ?title = "BRQL Tutorial"
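The matching step behind such a query can be sketched in Python without any RDF library. This is an illustration of the idea, not BRQL or RDQL itself: the function name `match`, the second book row, and the convention that pattern terms starting with "?" are variables are all assumptions of the sketch.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

store = [
    ("http://example.org/book/book1", DC_TITLE, "BRQL Tutorial"),
    ("http://example.org/book/book2", DC_TITLE, "Another Book"),  # hypothetical extra row
]

def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern.

    Pattern terms starting with '?' are variables; all others must match exactly.
    """
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                if binding.setdefault(want, have) != have:
                    break  # same variable bound to two different values
            elif want != have:
                break
        else:
            yield binding

results = list(match(("http://example.org/book/book1", DC_TITLE, "?title"), store))
print(results)
```

Moving the variable to the subject position, as in `("?book", DC_TITLE, "BRQL Tutorial")`, asks the reverse question with the same code.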
56So What?
- This was a trivial example in which we posed a query on the triple's object, which was a string.
- But the object of the triple may be a URI (an RDF resource), not just a literal.
- Or we may construct queries against subjects or verbs of triples.
- For complicated graphs, this means that the query returns a pointer to another section of the graph.
- This means that we can make linked queries that allow us to navigate graphs.
57Linked Queries Across Graph Sections
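[Figure: the vCard and Dublin Core graph again. http://.../CMCS/Entry/1 has dc:title "H2O" and dc:creator http://.../People/DrY; that resource has vcard:EMAIL dry_at_stateu.edu and a vcard:N node with vcard:Family and vcard:Given. The figure poses the question: What is the given name of the creator of Entry 1?]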
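The question in the figure can be answered by chaining lookups, each one starting where the previous one ended. A small Python sketch follows, assuming prefixed names as shorthand for the full property URIs. The name node label `_:name` and the values "Dana" and "Young" are hypothetical, since the figure does not give them, and the helper names `objects` and `follow` are made up for this example.

```python
graph = [
    ("http://.../CMCS/Entry/1", "dc:creator", "http://.../People/DrY"),
    ("http://.../CMCS/Entry/1", "dc:title", "H2O"),
    ("http://.../People/DrY", "vcard:EMAIL", "dry_at_stateu.edu"),
    ("http://.../People/DrY", "vcard:N", "_:name"),  # node for the structured name
    ("_:name", "vcard:Given", "Dana"),               # hypothetical value
    ("_:name", "vcard:Family", "Young"),             # hypothetical value
]

def objects(subject, predicate, triples):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def follow(start, path, triples):
    """Follow a chain of properties from `start`, returning every node reached."""
    nodes = [start]
    for predicate in path:
        nodes = [o for n in nodes for o in objects(n, predicate, triples)]
    return nodes

entry = "http://.../CMCS/Entry/1"
print(follow(entry, ["dc:creator", "vcard:N", "vcard:Given"], graph))
print(follow(entry, ["dc:creator", "vcard:EMAIL"], graph))
```

The first hop stays in the Dublin Core part of the graph and the later hops are in the vCard part; the creator's URI is the only thing that joins them.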
58What If You Cant Wait?
- BRQL is still a work in progress.
- If you need something now, there is Jena's RDQL.
- RDQL allows you to pose triplet queries similar to BRQL.
- Jena has a programming interface that allows you to construct and execute these queries against RDF.
59A Simple Jena RDQL Example
Model model = new ModelMem();
// The second argument to read() is the base URI for relative names.
model.read(new FileReader("a.rdf"), "");
String queryString = "SELECT ?x, ?fname WHERE "
    + "(?x, <http://www.w3.org/2001/vcard-rdf/3.0#EMAIL>, ?fname)";
Query query = new Query(queryString);
query.setSource(model);
QueryExecution qe = new QueryEngine(query);
QueryResults results = qe.exec();
60Advanced OWL Tutorial
61OWL Syntaxes
- Abstract Syntax
- Used in the definition of the language and the DL/Lite semantics
- OWL as RDF triples (and thus as, e.g., RDF/XML or N3)
- The official concrete syntax
- Mapping rules describe how to translate from abstract syntax to triples.
- XML Presentation Syntax
- XML Schema definition
62OWL Ontologies
- An OWL ontology consists of a number of Classes, Properties and Individuals.
- All identified via URIs.
- Classes
- Have definitions providing their characteristics
- Properties
- Characteristics such as transitivity or functionality
- Domains and Ranges
- Individuals
- Class membership
- Relationships to other individuals
- Concrete values.
63XML Datatypes in OWL
- OWL supports XML Schema primitive datatypes.
- Clean separation between "object" classes and datatypes
- Philosophical reasons
- Datatypes structured by built-in predicates
- Not appropriate to form new datatypes using the ontology language
- Practical reasons
- Ontology language remains simple and compact
- Implementability not compromised: can use a hybrid reasoner
64OWL Class constructors
- OWL has a number of operators for constructing class expressions.
- Boolean operators
- and, or, not
- Restrictions
- slot fillers with explicit quantification
- Enumerated classes
- explicit enumerations of the class members
65OWL Class Constructors
66OWL Class constructors
- The operators have an associated semantics,
- given in terms of a domain
- $\Delta$
- and an interpretation function $I$:
- $I : \text{concepts} \to \wp(\Delta)$
- $I : \text{properties} \to \wp(\Delta \times \Delta)$
- $I : \text{individuals} \to \Delta$
- $I$ is then extended to concept expressions.
67OWL Constructor Semantics
68OWL Constructor Semantics
69OWL Axioms
- Axioms allow us to add further statements about arbitrary concept expressions and properties.
- Disjointness, equivalence, transitivity of properties, etc.
- An interpretation is then a model of the axioms iff it satisfies every axiom in the ontology.
70Basic Inference Tasks
- Inference can now be defined w.r.t. interpretations/models.
- C subsumes D w.r.t. K iff for every model $I$ of K, $I(D) \subseteq I(C)$
- C is equivalent to D w.r.t. K iff for every model $I$ of K, $I(C) = I(D)$
- C is satisfiable w.r.t. K iff there exists some model $I$ of K s.t. $I(C) \neq \emptyset$
- Querying knowledge
- x is an instance of C w.r.t. K iff for every model $I$ of K, $I(x) \in I(C)$
- $\langle x, y \rangle$ is an instance of R w.r.t. K iff for every model $I$ of K, $(I(x), I(y)) \in I(R)$
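The set conditions in these definitions can be tried out on one small, hand-built interpretation in Python. This is deliberately limited: the definitions quantify over every model of K, while the sketch inspects a single interpretation, so a failed subset test refutes subsumption but a passed one does not prove it. The domain elements, concept names, and function names are all invented for illustration.

```python
# One finite interpretation: a domain and the extension of each named concept.
domain = {"alice", "bob", "dumbo"}
I = {
    "Person": {"alice", "bob"},
    "Parent": {"alice"},
    "Elephant": {"dumbo"},
    "Unicorn": set(),
}
individuals = {"a": "alice"}  # I(a) = alice

def subsumes(c, d):
    """In this interpretation, C subsumes D when I(D) is a subset of I(C)."""
    return I[d] <= I[c]

def equivalent(c, d):
    """In this interpretation, C and D have the same extension."""
    return I[c] == I[d]

def is_nonempty(c):
    """This interpretation witnesses satisfiability of C when I(C) is not empty."""
    return len(I[c]) > 0

def is_instance(x, c):
    """In this interpretation, I(x) is a member of I(C)."""
    return individuals[x] in I[c]

print(subsumes("Person", "Parent"), subsumes("Parent", "Person"))
print(is_nonempty("Unicorn"), is_instance("a", "Person"))
```

A DL reasoner answers the same questions for all models at once, which is why decidability of the logic matters.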
71Why Reasoning?
- Why do we want it?
- The Semantic Web aims at "machine understanding".
- Understanding is closely related to reasoning.
- Given the key role of ontologies in the Semantic Web, it will be essential to provide tools and services to help users:
- Design and maintain high quality ontologies, e.g.:
- Meaningful: all named classes can have instances
- Correct: captured intuitions of domain experts
- Minimally redundant: no unintended synonyms
- Richly axiomatised: (sufficiently) detailed descriptions
- Answer queries over ontology classes and instances, e.g.:
- Find more general/specific classes
- Retrieve annotations/pages matching a given description
- Integrate and align multiple ontologies
72Why Decidable Reasoning?
- OWL DL constructors/axioms are restricted so that reasoning is decidable.
- Consistent with the Semantic Web's layered architecture
- XML provides the syntax transport layer
- RDF(S) provides a basic relational language and simple ontological primitives
- OWL DL provides a powerful but still decidable ontology language
- Further layers may (will) extend OWL
- Will almost certainly be undecidable
- Facilitates provision of reasoning services
- Known "practical" algorithms
- Several implemented systems
- Evidence of empirical tractability
- Understanding is dependent on reliable and consistent reasoning.
73Other Links
74Tools for Playing with Things
- Jena Toolkit: Java packages from HP Labs for building Semantic Web applications.
- http://www.hpl.hp.com/semweb/
- Both IsaViz and Protégé use this.
- IsaViz: a nice authoring/graphing tool
- http://www.w3.org/2001/11/IsaViz/
- Protégé: another ontology authoring tool
- http://protege.stanford.edu/
- SiRPAC
- Allows you to parse RDF and convert RDF/XML into graphs and triplets.
- http://www.w3.org/RDF/Validator/
75Other Tutorials
- Original Semantic Grid GGF tutorial material is here:
- http://www.semanticgrid.org/presentations/ontologies-tutorial/
- Beginner and Advanced OWL tutorials are here:
- http://www.co-ode.org/resources/
- Lectures cover working examples (pizza ontology) built with Protégé.
- http://www.semanticgrid.org/presentations/ontologies-tutorial/
76XML Primer
- General characteristics of XML
77Basic XML
- XML consists of human-readable tags.
- Schemas define rules for a particular dialect.
- XML Schema is the root: it defines the rules for making other XML schemas.
- Tree structure: tags must be closed in the reverse of the order in which they were opened.
- Tags can be modified by attributes
- name, minOccurs
- Tags enclose either strings or structured XML
<complexType name="FaultType">
  <sequence>
    <element name="FaultName" type="xsd:string" />
    <element name="MapView" />
    <element name="CartView" />
    <element name="MaterialProps" minOccurs="0" />
    <choice>
      <element name="Slip" />
      <element name="Rate" />
    </choice>
  </sequence>
</complexType>
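The tree rule is easy to check with any XML parser. Here is a short Python sketch using the standard library, with two made-up fragments built from the tag names in the schema above; the function name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True when the parser accepts the text, False on any syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Tags closed in reverse order of opening: a proper tree.
nested = "<sequence><choice><element name='Slip'/></choice></sequence>"
# Closing tags swapped: the elements overlap instead of nesting.
crossed = "<sequence><choice><element name='Slip'/></sequence></choice>"

print(is_well_formed(nested), is_well_formed(crossed))
```

Note that this checks well-formedness only. Validating a document against a schema such as FaultType is a separate step that the standard library parser does not perform.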
78 Namespaces and URIs
- XML documents can be composed of several different schemas.
- Namespaces are used to identify the source schema for a particular tag.
- Resolves name conflicts (full path).
- Values of namespaces are URIs.
- URIs are just structured names.
- They may point to something not electronically retrievable.
- URLs are special cases.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:gem="http://commgrids.indiana.edu/GCWS/Schema/GEMCodes/Faults">
  <xsd:annotation>
    ...
  </xsd:annotation>
  <gem:fault>
    ...
  </gem:fault>
</xsd:schema>
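A quick way to see namespaces resolving a name conflict is to parse a document in which two schemas both define a tag called fault. In the Python sketch below the two example.org namespace URIs are hypothetical; the standard library parser expands each prefixed tag to its namespace URI plus local name.

```python
import xml.etree.ElementTree as ET

doc = """<root xmlns:a="http://example.org/schemaA"
               xmlns:b="http://example.org/schemaB">
  <a:fault>from schema A</a:fault>
  <b:fault>from schema B</b:fault>
</root>"""

root = ET.fromstring(doc)
# Each tag is reported as "{namespace URI}local name", so the two
# <fault> elements stay distinct even though their local names match.
tags = [child.tag for child in root]
print(tags)
```

The prefixes a and b are only abbreviations chosen by the document author; the URI is what identifies the source schema.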
79Metadata and the Dublin Core
- Define metadata and describe its use in physical
and computer science.
80What is Metadata?
- Common definition: data about data
- "Traditional" examples
- Prescriptions of database structure and contents.
- File names and permissions in a file system.
- HDF5 metadata describes scientific/numerical data set characteristics such as array sizes, data formats, etc.
- Metadata may be queried to learn the characteristics of the data it describes.
- Traditional metadata systems are functionally tightly coupled to the data they describe.
- Prescriptive, needed to interact directly with data.
81Descriptive Metadata and the Web
- Traditional metadata concepts must be extended as systems become more distributed and information becomes broader.
- Tight functional integration is not as important.
- Metadata is used for information and becomes descriptive.
- Metadata may need to describe resources, not just data.
- Everything is a resource:
- People, computers, software, conference presentations, conferences, activities, projects.
- We'll next look at several examples that use metadata, featuring:
- Dublin Core: digital libraries
- CMCS: chemistry
82The Dublin Core Metadata for Digital Libraries
- The Dublin Core is a set of simple name/value properties that can describe online resources.
- Usually Web content but generally usable (CMCS).
- Intended to help classify and search online resources.
- DC elements may be either embedded in the data or in a separate repository.
- Initial set defined by a 1995 Dublin, Ohio meeting.
83Thought Experiment Construct Your Own Metadata
Set
- Describe yourself: your occupation, your interests, your place of residence, your parents, spouse, children, ...
- Take each sentence:
- The verbs become properties.
- The verbs' objects are property values.
- Metadata is just a collection of these name/value pairs.
- For particular fields (like publishing), we can define a conventional set of property names.
84The Dublin Core Metadata for Digital Libraries
- The Dublin Core is a set of simple name/value properties that can describe online resources.
- Usually Web content but generally usable (CMCS).
- Intended to help classify and search online library resources.
- Digital library card catalog.
- DC elements may be either embedded in the data or in a separate repository.
- Initial set defined by a 1995 Dublin, Ohio meeting.
85Dublin Core Elements
- Content elements
- Subject, title, description, type, relation, source, coverage.
- Intellectual property elements
- Contributor, creator, publisher, rights
- Instantiation elements
- Date, format, identifier, language
- In RDF, these are called properties.
86Encoding the Dublin Core
- DC elements are independent of the encoding syntax.
- Rules exist to map the DC into
- HTML
- RDF/XML
- We provide more detailed info on RDF/XML encoding
in this seminar.
87Sample RDF/HTML
<head>
  <title>Expressing Dublin Core in HTML/XHTML meta and link elements</title>
  <meta name="DC.title" content="Expressing Dublin Core in HTML/XHTML meta and link elements" />
  <meta name="DC.creator" content="Andy Powell, UKOLN, University of Bath" />
  <meta name="DC.type" content="Text" />
</head>
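A DC-aware application has to pull these meta tags back out of the page. A minimal Python sketch with the standard library HTML parser follows, assuming the "DC." name prefix convention shown in the sample; the class name `DublinCoreParser` is made up for this example.

```python
from html.parser import HTMLParser

class DublinCoreParser(HTMLParser):
    """Collect <meta name="DC.*" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if name.startswith("DC."):
            # Strip the "DC." prefix to get the element name.
            self.elements[name[3:]] = attrs.get("content")

page = """<head>
<title>Expressing Dublin Core in HTML/XHTML meta and link elements</title>
<meta name="DC.title" content="Expressing Dublin Core in HTML/XHTML meta and link elements" />
<meta name="DC.creator" content="Andy Powell, UKOLN, University of Bath" />
<meta name="DC.type" content="Text" />
</head>"""

parser = DublinCoreParser()
parser.feed(page)
print(parser.elements)
```

The result is exactly the kind of name/value collection the Dublin Core is meant to be, ready to be loaded into a repository or rewritten as RDF properties.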
88Where Do I Put the Dublin Core Metadata?
- Dublin Core elements may be placed directly in HTML pages.
- Still need DC-aware crawlers or applications to find and use them.
- Or you may have a large database of DC entries that are used by DC-aware applications.
- We'll examine a WebDAV-based scheme for chemistry in a second.
89Dublin Core Element Refinements
- Many of these, and extensible.
- See http://dublincore.org/documents/dcmi-terms/ for the comprehensive list of elements and refinements.
- Examples
- isVersionOf, hasVersion, isReplacedBy, references, isReferencedBy.