1 An introduction to the Semantic Web for Museums
Presented at Museums and the Web 2006, Albuquerque
- Dr Mike Lowndes,
- Interactive Media Manager,
- Natural History Museum, London
- mikel_at_nhm.ac.uk
2 The Semantic Web: Contents
- Web futures: context
- What is it?
- Web problems: digital objects and other issues
- Building blocks
- Steps along the way
- Current applications
- Other advances: Web 2.0
- Activity in the cultural sector
- Is it actually going to happen?
- Conclusions for Museums
3 Web Futures
- We can be assured that:
- Whatever we propose today, the future will be different.
- Technology progresses and conceptual thought keeps playing catch-up. New ideas supplant old.
- The future web will be as messy and tricky to predict as the past.
- So:
- For museum web users, we should strive toward a greater signal-to-noise ratio.
4 Web Futures: Other Developments
- Web 2.0: the web as application platform
- Convergence
- The web becomes more TV-like, but remains interactive and always available.
- More layers of information on more channels.
- It will become optionally immersive: degrees of immersion depending on how you interact with it.
- Internet 2: www.Internet2.org
- Infrastructure for massive bandwidth.
- Grid computing: www.gridcomputing.com
- Shared processing, increasing available power when connected.
- Computing power becoming a utility like electricity.
- Towards instant processing of everyday tasks (in the human timeframe).
5 Web Futures: Internet Ubiquity
- All technological devices connected.
- The intelligent fridge, RFID, mobiles with GSM, GPRS, 3G.
- Future mobile devices will operate your bank account, hi-fi and front door lock, and turn the car heater on before you get to it: these things are not that far away.
- The web is already old-school.
- We don't yet have a simple word for the continuum between digital radio, TV, the web, mobile internet, SMS and multimedia kiosk interactions, though internet technology underpins it all.
- We can no longer limit our thinking to the needs of the desktop browser-based web.
6 Problems with the current Web
7 Problems: A Digital Object
- Named Anomalocaris.
- Did that help?
- If we need help to make sense of many digital objects, Google needs even more.
- So: a digital object should include or connect to the supporting data that allows both humans and machines to understand it.
- Answer: the semantic web
- Provides a framework, standards and tools for attaching, extending, making available and understanding the meaning of digital objects.
- Makes the digital medium self-explaining.
8 Problems: the worst things about today's web?
- It's manual.
- Google is currently the most popular way to begin exploring a topic.
- It relies on humans to link sensibly to interesting and relevant content. This only works when a LOT of humans are making the links.
- Hyperlinks. They are dumb.
- They do not explain themselves.
- Can you trust them?
- When you create them, you need to keep validating them.
- Searching for new links to make requires a search engine.
- Metadata can improve this, but metadata is poorly used.
- Answer: the semantic web
9 Problems: how many logins do you have?
- Bank1
- Bank2
- Sharedealing
- Work VPN
- Basecamp
- Amazon
- eBay
- eBuyer
- Picstop
- Flickr
- Email 1
- Email 2
- Etc.
- What about an infrastructure that allows you to log on to the internet just once?
- Answer: the semantic web.
10 Some Web Issues for Museums
- People trust online museums' content and their links more than others, perhaps.
- But our knowledge and collections are not easily available to the public as a single collection relevant to their needs.
- This requires breaking down the digital walls between institutions: digital access, interoperability, flexible context.
- Interoperability is a difficult thing.
- Our metadata is easy to publish, but nothing out there uses it to improve searching.
- Attempts are being made (e.g. OAI-PMH)
- GBIF
- Other portals
- Answer: the semantic web.
11
12 The Semantic Web
- Tim Berners-Lee: web visionary and head of W3C.
- Formally set off in 1998. The goal is a solution to information overload and the personalisation of the web.
- Adding logic to the web
- If you're 38 and some available content is aimed at six-year-olds, then it's not appropriate to prioritise its display (unless you're searching for your kids). This kind of logic is built into the semantic web.
- Turning the web into a global database
- Semantic web software should be able to find, sort, classify, interpret, and present relevant content in context.
- Achieved via global use of metadata, leading to vastly improved browsing, and agents which may seem intelligent because they can process a web that describes itself.
13 W3C Definition
- Tim Berners-Lee:
- "The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
- For the Web to become a truly machine-readable resource, the information it contains must be structured in a logical, comprehensible and transparent fashion.
- This is the primary work required to enable the semantic web.
14 The Building Blocks
15 What the Semantic Web Will Require
- Adoption of metadata standards.
- Usable tools for automatic and semi-automatic multilingual knowledge mark-up.
- Modelling relationships, e.g. between types of metadata.
- Construction of ontologies (and mappings between them).
- Stay awake!
- Plus, intelligent agents to mine the above for a particular person's needs.
- Defining a particular person requires a user profile.
16 Boxes and Arrows: No Clouds!
[Diagram. Box labels: Context; User profile; Other ontologies; User query, or query generated by user behaviour; Semantic Web Agent; Identified ontologies; Associated metadata. Arrow labels: "Maps to"; "Maps to and is constrained by". Output: accurate, meaningful answers, actions and views of information.]
17 W3C: Current Semantic Web Work (2006)
- A roadmap.
- Two formal XML technologies are now part of the first-generation semantic web:
- RDF for holding and communicating the metadata.
- OWL for describing relationships and inferring meaning.
18 W3C Semantic Web RoadMap
19 1. XML
- XML underpins the next step.
- It can describe the 'data' on the web by wrapping that data in tags that explain it.
- E.g. <product><fruit>orange</fruit><price>20</price><currency>gbp</currency></product>
- XML is a framework.
- Ad-hoc files can be created in it for specific uses, using any tags you like.
- There is no need to formally describe them unless you want them to be understood outside your particular use.
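As a rough illustration of what "tags that explain the data" buys a program, the sketch below reads the product fragment from this slide using Python's standard library. The helper name `read_product` is made up for this example; only the tag names come from the slide.

```python
import xml.etree.ElementTree as ET

# The ad-hoc product fragment from the slide; the tag names are the slide's own.
PRODUCT_XML = (
    "<product><fruit>orange</fruit><price>20</price>"
    "<currency>gbp</currency></product>"
)


def read_product(xml_text):
    """Turn a flat <product> element into a plain dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


product = read_product(PRODUCT_XML)
print(product["fruit"], product["price"], product["currency"])
```

The program can pick out the price without guessing where it sits in the text, but it still has no idea what "price" means. That gap is what the later layers (RDF, OWL) try to close.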
20 XML: Languages for Describing Content
- You can formalise a tag set written in XML by creating a config file for it, known as a Document Type Definition, or more recently, a Schema.
- E.g.
- Summary metadata: Dublin Core and its derivatives.
- Data markup: Encoded Archival Description, RSS.
- XML can also format and transform itself with XML stylesheets: XSL/XSLT.
- Formal XML languages underpin the semantic web.
- XML over the internet enables machine-to-machine communication.
21 2. RDF: Resource Description Framework
- W3C supports the development of the Resource Description Framework.
- RDF is the official current encoding format for semantic web data.
- Can contain data, metadata and relationships.
- E.g. Dublin Core, RSS.
- Makes web resources self-describing.
- RDF-S (a more recent development)
- Schema provides some ontology support to RDF.
- E.g. simple DC file
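The example file shown on this slide is not reproduced in the transcript. As a stand-in, here is a minimal sketch of how a simple Dublin Core description could be written out as RDF/XML with Python's standard library. The function name `describe`, the resource URL and the field values are assumptions for illustration. The two namespace URIs are the standard RDF and Dublin Core element set ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def describe(resource_uri, fields):
    """Build an rdf:RDF document holding one rdf:Description with dc:* fields."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource and values, invented for this sketch.
print(describe(
    "http://example.org/evolution",
    {
        "title": "Evolution",
        "creator": "Natural History Museum, London",
        "language": "en",
    },
))
```

Each `dc:` element is one statement about the resource named in `rdf:about`, which is what lets other software read the description without knowing anything about the site that published it.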
22 3. Ontologies: OWL
- W3C supports the development of the Web Ontology Language, usually abbreviated as OWL.
- What is an ontology?
- A dictionary defines the meaning of words.
- A taxonomy or classification system describes hierarchical relationships between things, but not usually other kinds of relationships.
- A thesaurus deals with wider relationships between words, but meaning by inference only.
- Ontologies join taxonomies and thesauri together and can derive relationships of meaning through logic and inference.
- OWL is the latest iteration of this idea as applied to the web.
- It is a vocabulary extension of RDF, not something different.
23 Brainbreak - FenFire
24 Definitions and Properties of an Ontology
- James Hendler:
- a set of knowledge terms, including the vocabulary, the interconnections in meaning, and some simple rules of inference and logic for some particular topic.
- DigiCULT:
- The most typical kind of ontology for the Web has a taxonomy and a set of inference rules.
- What does it do? Describes relationships between data.
- TBL:
- An ontology may express the rule "If a city code is associated with a state code, and an address uses that city code, then that address has the associated state code."
- The functionality of a database (query) and a thesaurus (meaning by context).
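The city code rule quoted above can be sketched in a few lines of Python. This is only an illustration of the inference step, not how an OWL reasoner works. The function name and the sample codes and addresses are invented for the example.

```python
def infer_state_codes(city_to_state, address_to_city):
    """Apply the rule: if a city code is associated with a state code, and an
    address uses that city code, the address has the associated state code."""
    inferred = {}
    for address, city in address_to_city.items():
        if city in city_to_state:
            inferred[address] = city_to_state[city]
    return inferred


# Hypothetical facts, invented for illustration.
city_to_state = {"ABQ": "NM"}
address_to_city = {"address-1": "ABQ", "address-2": "LON"}

print(infer_state_codes(city_to_state, address_to_city))
```

Nobody recorded a state code for `address-1`. It is derived from two other facts, which is the "database plus thesaurus" behaviour the slide describes. Nothing is inferred for `address-2` because its city code has no known state code.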
25 How Will Ontologies Be Used in the Semantic Web?
- Ontologies can be domain-oriented, task-oriented, application-oriented or general purpose. Also called class taxonomies.
- Upper ontologies are more general and can tie more specific ones together by mapping them.
- E.g. how can we make a machine understand that watercolours are linked to jewellery semantically?
- Concept of 'watercolour' links to a definition URI (URL).
- Local ontology: a watercolour is a type of painting.
- Local ontology: a necklace is a type of jewellery.
- Upper ontology: painting and jewellery are both types of art.
- Someone needs to build these mappings.
- Now, do it all again in multiple languages.
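The watercolour and jewellery example can be sketched as three small "is a type of" tables, two local and one upper. The dictionaries and function names below are assumptions made for this sketch. Real ontologies would be expressed in RDF-S or OWL and identified by URIs.

```python
# Each ontology is a simple child -> parent ("is a type of") table.
LOCAL_PAINTING = {"watercolour": "painting"}
LOCAL_JEWELLERY = {"necklace": "jewellery"}
UPPER = {"painting": "art", "jewellery": "art"}


def ancestors(term, *ontologies):
    """Follow 'is a type of' links upward through the merged ontologies."""
    merged = {}
    for ontology in ontologies:
        merged.update(ontology)
    chain, seen = [], {term}
    while term in merged:
        term = merged[term]
        if term in seen:  # guard against accidental loops in the mappings
            break
        seen.add(term)
        chain.append(term)
    return chain


def shared_concept(a, b, *ontologies):
    """Return the nearest concept that both terms are a type of, if any."""
    b_ancestors = set(ancestors(b, *ontologies))
    for concept in ancestors(a, *ontologies):
        if concept in b_ancestors:
            return concept
    return None


print(shared_concept("watercolour", "necklace",
                     LOCAL_PAINTING, LOCAL_JEWELLERY, UPPER))
```

With only the two local ontologies loaded, `shared_concept` finds nothing. The link between a watercolour and a necklace appears only once the upper ontology is mapped in, which is why someone has to build those mappings.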
26 A lot of talk
- Foundational ontologies: shared understanding, providing intended meaning of a vocabulary.
- Completeness, precision and overlap between ontologies: agreement on all is needed for 'establishing consensus'.
- Gets philosophical very quickly:
- Is a hole different from the region of space it occupies? I.e. are there holes, or only holed objects?
- Is a statue different from the stuff it is constituted by? I.e. are there statues, or only statue-shaped stuffs?
- Is a person different from their body?
- Ontologies: there will be a lot of them.
27 W3C Semantic Web RoadMap
28 Higher Layers of the Roadmap
- Rules layer: early-stage work
- Initial proposals to provide standardised languages for the querying of RDF: SPARQL, Joseki query engine.
- Experiments with rule languages: RuleML
- Proof
- Authority, encryption
- Trust
- (PICS)
- Profiles
- FOAF: friend of a friend. EARL
29 Long Term: Agents
- DigiCULT:
- Agents are the final product of the semantic web: automatic, even artificially intelligent software that does all your searching for you (the process of narrowing down) and much more. However, this is a very long-term goal and there are many steps on the way, each of which can help.
- Examples:
- The agent attached to your diary automatically organises travel etc., and can change your travel tickets when you alter your diary.
- The agent attached to your house automatically organises food purchasing, bill payment, lighting, heating, alarms etc.
30 Visual navigation of ontology (Sculpteur)
- Visualising RDF metadata: an aid for museum professionals, not the public.
- Addis, M., et al., New Ways to Search, Navigate and Use Multimedia Museum Collections over the Web, Figure 3, in J. Trant and D. Bearman (eds.), Museums and the Web 2005: Proceedings, CD-ROM, ISBN 1-885626-31-2. Toronto: Archives & Museum Informatics, March 31, 2005.
31 Boxes, Arrows and Acronyms
[Diagram. Box labels: Context; User profile; Other ontologies; User query, or query generated by user behaviour; Semantic Web Agent; Identified ontology; Associated metadata. Arrow labels: "Maps to"; "Maps to and is constrained by". Technology labels: OWL; FOAF/EARL; RDF-S/OWL (CIDOC-CRM, SKOS); SPARQL, RuleML; RDF (DC, RSS). Output: accurate, meaningful answers, actions and views of information.]
32 Q. Why isn't the semantic web here?
- A. It's hard to do.
33
34 Short Term: some current applications
- Making digital resources self-describing
- RSS in RDF
- Was Rich Site Summary, now Really Simple Syndication: making simple summary information self-describing.
- Mobile devices: CC/PP.
- Called Composite Capability/Preference Profiles (CC/PP).
- Will let cell phones and other non-standard Web clients describe their characteristics to other software and agents.
- Business: XBRL.
- Describes/classifies content of financial statements.
- Makes report generation easier.
- FOAF
- Friend of a Friend.
- Describes people and their interests, plus network of peers.
- www.foaf-project.org/
- Topic Maps.
- A framework for creating and browsing relationships.
- Works within and between systems and disciplines.
- Works with RDF.
- Human friendly: relatively easy to grasp how it works. Browsers are in development.
35 Haystack (MIT): an RDF-PIM
36 Medium Term: e.g. 'smart links'
- As semantic content appears, browsers can be modified to use it.
- On mouseover.
- Metadata of target.
- More information on evolution.
- Multiple targets.
- More information on evolution.
- These do not even need to be defined as links: simply highlighting words could initiate the semantic web browser.
- It's automatic for the people.
- As well as smart links, more and more local domains of knowledge will be related by their linking ontologies. More semantic portals will appear.
Author: the Natural History Museum, London. Date published: July 2005. Description: A website exploring evolution by natural selection. Audience: 12 years plus. Language: English (international).
Link: definition of evolution. Link: evolution at the Natural History Museum. Link: evolution at the American Museum of Natural History. Link: Evolution on god.com. Link: evolution at New Scientist magazine. Definition: Evolution, part of natural history. Browse: evolution.
37 Magpie: IE plugin (Open University)
38 So nothing practical even yet?
- (Semagix: can you afford it?)
- Semantic web portals?
39 An aside? The Web 2.0 tag cloud or folksonomy
40 Web 2.0: the Web as Application Platform
- First uses: social networking, content authoring and sharing, real-time GIS, feedback
- Flickr
- Google Maps / Earth
- Del.icio.us (bookmarking)
- Technorati (blog-tracking)
- Wikipedia
- Basecamp, ACEproject
- Blogger
- Open source frameworks (e.g. Drupal)
- Amazon, Yahoo
41 AJAX
- Advanced JavaScript to send/receive content and update parts of pages, using XML over the web
- Can use other messaging formats as well
- Thus getting around another issue with the web from the start: pages being static.
- Real-time response to user input
- I.e. approaching true desktop applications on the web.
- Beyond the original Berners-Lee vision?
- Examples:
- Google Maps (map data)
- Basecamp (saving changes/state without reloading pages)
- Writely (word processor for online collaboration)
- Shell Wildlife Photographer of the Year
42 Social Tagging (folksonomies)
- The old Yahoo / Google (DMOZ) directories method of classifying sites is hardly used as a search aid
- Also ungainly, complex and impossible to maintain
- 'Tagging' is communities of web users freely keywording their content
- These sites then use popularity and associations of keywords to infer relevance/closeness of meaning
- http://www.flickr.com/photos/tags/family/clusters/
- Provides a simple way to group content
- What about specialist knowledge?
- Specialist knowledge: fewer people, worse tagging?
- Go visit the steve project
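One simple way a tagging site could use "associations of keywords" is to count how often two tags appear on the same item. The sketch below shows that idea only. It is not a description of how Flickr or any other site actually computes its clusters, and the sample tags are invented.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(tagged_items):
    """Count how often each pair of tags is applied to the same item."""
    counts = Counter()
    for tags in tagged_items:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def related_tags(tag, tagged_items, top=3):
    """Rank other tags by how often they co-occur with the given tag."""
    scores = Counter()
    for (a, b), n in tag_cooccurrence(tagged_items).items():
        if a == tag:
            scores[b] += n
        elif b == tag:
            scores[a] += n
    return [other for other, _ in scores.most_common(top)]


# Hypothetical photo tags, invented for illustration.
photos = [
    ["family", "wedding", "portrait"],
    ["family", "wedding"],
    ["family", "beach"],
    ["beach", "sunset"],
]
print(related_tags("family", photos, top=1))
```

The ranking only becomes reliable when many people tag many items, which is the point behind the question about specialist knowledge: fewer taggers means sparser counts.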
43 RSS
- Newsfeed reading via Really Simple Syndication is now huge
- A simple but structured way to syndicate information or broadcast change
- A subscription model: people get the information they want delivered to them as it is generated
- Gets around an original web turn-off: having to revisit favourite sites regularly.
- Content from many sources can be aggregated into themed feeds
- Use of truly semantic ideas is at an early stage,
- RDF is extensible
- Will improve as the sheer number of newsfeeds requires new layers of interpretation.
- Will be embedded in next-generation operating systems
- E.g. 24 Hour Museum
44 Web 2.0 and the Semantic Web
- Joshua Allen, 2001 (Making a Semantic Web):
- Until anyone can create metadata about any page and share it with everyone, there will not be a semantic web
- Web 2.0?!
- Web 2.0 is NOT a new infrastructure for the web.
- It won't do the job of providing the global database.
- It does take steps in the right direction.
45 What has the cultural sector done?
- Done? This is mostly old stuff.
46 We Have a Role.
- We are the holders of knowledge and authority, and can help to define the semantic web.
- Thesauri owned and created by museums could become ontologies and act as part of the backbone.
- Museums are behind and will remain behind as other areas see competitive advantage: business, commerce and research.
- DigiCULT Thematic Issue 3, 2003: museums need to take a lead. We need to do a big project together: standardise thesauri, develop ontologies.
47 Infrastructure: The CIDOC Conceptual Reference Model
- A common language and extensible semantic framework to which any cultural heritage information can be mapped. The interoperability glue.
- Provides the words and relationships we can use to map our stuff together.
- I.e. an agreed framework for our ontologies
- An international standard.
- Exposed in RDF already: RDF-S/OWL to follow?
- http://cidoc.ics.forth.gr/
- For an introduction, download:
- http://www.rlg.org/en/downloads/2002metadata/gill/gill.PPT
48 Portal example: Sculpteur
- Several collections brought together into one place, one meta-database or portal.
- Content from the V&A among others.
- Visual display of relationships.
- A published ontology in RDF.
- Concept-based searching based on a semantic network.
- Content-based searching of images and 3D models.
- http://www.sculpteurweb.org/ (browser needs downloading)
49 Richard Light: Museum thesauri in Topic Maps
- Ontology framework written to thesaurus standards.
- Museum thesauri turned into ontologies in Topic Map format.
- Topic Map browser (Omnigator): a visual environment.
- Aims to provide meaning: an authoritative reference that software can use when searching the web.
- Could become part of the future semantic web backbone.
- Topic Map / RDF interoperability now a focus at W3C
- Museums Computer Group Newsletter, April 2004.
50 VICODI: Visual Contextualization of Digital Content
- Semi-automatic creation of contextual semantic metadata for digital historical resources, by users.
- Visualisation of richly structured, contextualised content.
- Interface uses historical maps and colour-coded links.
- In hindsight, felt by the developers to be not generally usable, but still in some development.
- http://www.vicodi.org/
51 VICODI powered (http://www.eurohistory.net/Index.do)
52 Finnish Museums on the Semantic Web
- The most ambitious and most fully realised attempt.
- Uses RDF-encoded Dublin Core metadata.
- Brings 15 museum collections together.
- Difficult to assess as it is in Finnish, but has good critical reports from users.
- A semantic web HTML generator is in development.
- http://museosuomi.cs.helsinki.fi/
53 Finnish Museums on the Semantic Web
All the right buzzwords in all the right places.
54 Finnish Museums on the Semantic Web
55 Finnish Museums on the Semantic Web
56 Swed-E
57 First generation sites
- Expose the workings of the semantic web too much
- Not simple enough for most web users
- But work is ongoing.
- http://cipher.uiah.fi/project/trials/mapsvisu/one_visu/ihala_monsters_1024.jpg
58 Coda: Reality Check
- Is the semantic web the right approach to
information overload?
59 Nay-sayers?
- DigiCULT: Janneke Van Kersen, Dutch Digital Heritage Association.
- I do not believe in developing a fundamental ontology to give meaning to information on the Net. It looks to me like the 18th-century endeavour to write an encyclopaedia that contains all the knowledge in the world. I am afraid it does not work that way. A lot of knowledge, even scientific knowledge, cannot be described in a logical way. Especially in the arts a lot of knowledge is the result of heuristics and associative thinking.
- Patel-Schneider and Siméon, Bell Labs Research.
- there is a semantic discontinuity at the very bottom of the Semantic Web, interfering with the stated goal of the Semantic Web: if semantic languages do not respect World-Wide Web data, then how can the semantic web be an extension of the World-Wide Web at all?
60 Then it's impossible?
- TBL sees the Semantic Web as based upon a whole bunch of ontologies mapped together.
- Instead of asking machines to understand people's language, ask people to make the extra effort
- It is acknowledged that this is a vast and difficult thing to do.
- The tools are not yet there.
- The consensus:
- It's hard to do, not easy like the current web.
- It's utopian, but the main goals are achievable.
- It will be a part of the future web, but never all of it.
- Any movement towards it increases the signal-to-noise ratio of the web.
- It should and will be done where it can be.
- Better for formalised knowledge anyway; informal knowledge can associate loosely or closely
61 Who will use it?
- Initially it will be used by the formal web:
- commerce
- b2b
- research
- education
- institutions
- The informal web (most blogs/wikis, personal pages, link sets etc.) will benefit from the work, and buy in at some levels.
- Consider the speed of technological advance
- Other things will come along.
62 Conclusions for Museums
- DigiCULT:
- The Semantic Web is a direction, it is like North. You go North but you never arrive and say 'here it is'.
- It's going to be a large-scale, collaborative, community thing.
- Requires leadership and opportunity from the State.
- We can and should make more starts now.
- There are many valuable steps on the way.
- It can make what you have to say far more accessible to those people who want to know.
63 Further Reading
- Tim Berners-Lee
- Berners-Lee, T., J. Hendler, O. Lassila: The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American, 17 May 2001.
- http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
- DigiCULT
- Themed Issue 3: Towards a Semantic Web for Heritage Resources, May 2003.
- http://www.digicult.info/pages/themiss.php
- For more references please refer to the associated paper.
64 Thank you