Title: Tools for the Semantic Web
1. Tools for the Semantic Web
- Jim Hendler
- http://www.mindswap.org
2. Sem Web: What it's all about
"Knowledge representation, as this technology is often called, is currently in a state comparable to that of hypertext before the advent of the web: it is clearly a good idea, and some very nice demonstrations exist, but it has not yet changed the world. It contains the seeds of important applications, but to unleash its full power it must be linked into a single global system." -- Tim Berners-Lee, inventor of the WWW, 2001.
3. Part 1: Review of the Semantic Web
4. On the Web -- links are critical!
[Diagram: a Web page (HTML) points to any Web resource with <a href="URI">, e.g. <a href="http://...">]
On the Semantic Web -- links are critical!
[Diagram: RDF statements link URIs to other URIs]
RDF is like the web!
5. Sem Web models start from RDF
DOC1
<mind:Person rdf:id="Hendler">
  <mind:title jobs:Professor>
  <jobs:placeOfWork http://www.cs.umd.edu>
</mind:Person>
[Diagram: the DOC1 graph. Hendler has Mind:title Professor and Jobs:placeOfWork the Web page http://www...; the terms point into the Mind and Jobs ontologies]
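A minimal sketch of the DOC1 graph as plain subject-predicate-object triples, in Python with no RDF library. The `mind:` and `jobs:` prefixes come from the slide; the `match` helper, and the use of prefixed strings instead of full URIs, are assumptions made for illustration only.

```python
# Each statement is a (subject, predicate, object) triple, as in the DOC1 graph.
DOC1 = [
    ("Hendler", "rdf:type", "mind:Person"),
    ("Hendler", "mind:title", "jobs:Professor"),
    ("Hendler", "jobs:placeOfWork", "http://www.cs.umd.edu"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Where does Hendler work?
workplaces = [
    o for (_, _, o) in match(DOC1, subject="Hendler", predicate="jobs:placeOfWork")
]
```

Because every statement has the same three-part shape, statements from other documents can simply be appended to the same list, which is the sense in which RDF "is like the web".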
6. XML is NOT semantics
7. XML is NOT semantics
<photo>
  <subject> http://www.w3.org/timbl </subject>
  <name> Tim Berners-Lee </name>
</photo>
8. XML is NOT semantics
XML Schema is DOCUMENT checking: a photo has multiple subject fields, a photo has one physical location, etc.
<photo>
  <subject> http://www.w3.org/timbl </subject>
  <name> Tim Berners-Lee </name>
</photo>
9. XML is NOT semantics
XML Schema is DOCUMENT checking: a photo has multiple subject fields, a photo has one physical location, etc. WHICH SAYS NOTHING ABOUT TALKS, SUBJECTS, PEOPLE, EVENTS, etc.
<photo>
  <subject> http://www.w3.org/timbl </subject>
  <name> Tim Berners-Lee </name>
</photo>
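A small sketch of the point these slides make, using Python's standard `xml.etree.ElementTree`. The `children` and `looks_like_photo` helpers are hypothetical names, and the check is only a stand-in for what an XML Schema validator would do, not a real schema.

```python
import xml.etree.ElementTree as ET

PHOTO_XML = """<photo>
  <subject>http://www.w3.org/timbl</subject>
  <name>Tim Berners-Lee</name>
</photo>"""


def children(xml_text):
    """Return (tag, text) for each child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, (child.text or "").strip()) for child in root]


def looks_like_photo(xml_text):
    """Schema-style document check: a <photo> root with at least one <subject>."""
    root = ET.fromstring(xml_text)
    return root.tag == "photo" and len(root.findall("subject")) >= 1


pairs = children(PHOTO_XML)
valid = looks_like_photo(PHOTO_XML)
# The document passes the check, yet nothing here says whether <name> names
# the photo, its subject, or the photographer: that meaning is not in the XML.
```

The parser and the check see tags and strings only; deciding that the subject is a person who spoke at an event needs links to a model outside the document.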
10. The SEMANTICS is in the links (e.g. to ontologies)!
Event:title
<daml:ObjectProperty rdf:ID="photograph">
  <rdfs:domain rdf:resource="#Picture"/>
  <rdfs:range rdf:resource="#person"/>
</daml:ObjectProperty>
Event:WebPage
<> rdf:type photo:Photograph, Photo:File http://.../images#image1, Photo:topic event1eventspeaker.
Event1 a Event:event, date May 7-11, speaker http://...timbl.html, Title WWW 2002
TimBL rdf:type w3c-ont:person, name Tim Berners-Lee
<s:Class rdf:about="http://www.semanticweb.org/ontologies/swrc-onto-2000-09-10.daml#Conference">
  <s:comment> describes a generic concept about events </s:comment>
  <s:subClassOf rdf:resource="http://www.semanticweb.org/ontologies/swrc-onto-2000-09-10.daml#Event"/>
  <a:disjointFrom rdf:resource="http://www.semanticweb.org/ontologies/swrc-onto-2000-09-10.daml#Workshop"/>
  <a:restrictedBy rdf:resource="http://www.semanticweb.org/ontologies/swrc-onto-2000-09-10.daml#genid18"/>
<rdf:Description rdf:about="http://www.w3.org/2001/03/earl/0.95#Person">
  <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/2001/03/earl/0.95#Assertor"/>
</rdf:Description>
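One way to picture "the semantics is in the links" is to follow subclass links from an instance's declared type. The sketch below is plain Python, not a real reasoner. The two subclass statements are the ones on this slide; the instance `ex:www2002` and the helper names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
SWRC = "http://www.semanticweb.org/ontologies/swrc-onto-2000-09-10.daml#"
EARL = "http://www.w3.org/2001/03/earl/0.95#"

TRIPLES = {
    (SWRC + "Conference", SUBCLASS_OF, SWRC + "Event"),  # from the slide
    (EARL + "Person", SUBCLASS_OF, EARL + "Assertor"),   # from the slide
    ("ex:www2002", RDF_TYPE, SWRC + "Conference"),       # hypothetical instance
}


def superclasses(cls, triples):
    """Follow rdfs:subClassOf links transitively from cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def types_of(resource, triples):
    """Declared types plus every type reachable through subclass links."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


event_types = types_of("ex:www2002", TRIPLES)
```

Nothing in the instance data says `ex:www2002` is an Event; that conclusion comes only from following the link into the ontology.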
11. Semantic Web Ontologies are models
- New SW languages add models to provide mappings and structure.
- XML necessary, not sufficient.
12. Semantics on the WEB
- Web ontologies, like the WWW itself, are not separable
- Thinking about the ontologies, without considering
- The links to other ontologies
- The instances that link to them
- The crawling and collecting of ontological terminologies
- Is like thinking about the Web without the links!!
[Diagram: the DOC1 graph from slide 5 (Hendler, Mind:title, Professor, Jobs:placeOfWork, Web page http://www...) surrounded by links to other professors, other titles, other pages, other URIs, and other descriptions]
13. Part 2: OWL - The Web Ontology Language
14. OWL extends RDF
<rdfs:Class rdf:ID="Meeting">
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#MeetingName"/>
      <daml:toClass rdf:resource="http://www.w3.org/2000/10/XMLSchema#string"/>
      <daml:cardinality>1</daml:cardinality>
    </daml:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#uri"/>
      <daml:toClass rdf:resource="http://www.w3.org/2000/10/XMLSchema#uriReference"/>
      <daml:maxCardinality>1</daml:maxCardinality>
    </daml:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#location"/>
      <daml:toClass rdf:resource="http://www.w3.org/2000/10/XMLSchema#string"/>
      <daml:cardinality>1</daml:cardinality>
    </daml:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#Issues"/>
      <daml:toClass rdf:resource="#Issue"/>
      <daml:minCardinality>0</daml:minCardinality>
    </daml:Restriction>
  </rdfs:subClassOf>
</rdfs:Class>
- RDF-schema
- Class, subclass
- Property, subproperty
- Restrictions
- Range, domain
- Local, global
- Existential
- Cardinality
- Combinators
- Union, Intersection
- Complement
- Symmetric, transitive
- Mapping
- Equivalence
- Inverse
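The cardinality restrictions on the Meeting class in this slide can be read as a table of minimum and maximum value counts per property. The Python sketch below checks an instance against such a table. It is a closed-world, form-validation style check and an assumption for illustration only: under the open-world semantics of DAML+OIL and OWL, a missing value is merely unknown, not an error. The function and variable names are hypothetical.

```python
# property -> (minimum count, maximum count); None means no upper bound.
MEETING_RESTRICTIONS = {
    "MeetingName": (1, 1),   # daml:cardinality 1
    "uri": (0, 1),           # daml:maxCardinality 1
    "location": (1, 1),      # daml:cardinality 1
    "Issues": (0, None),     # daml:minCardinality 0
}


def cardinality_violations(instance, restrictions):
    """instance maps each property name to the list of values entered for it."""
    problems = []
    for prop, (lowest, highest) in restrictions.items():
        count = len(instance.get(prop, []))
        if count < lowest:
            problems.append(f"{prop}: expected at least {lowest}, found {count}")
        if highest is not None and count > highest:
            problems.append(f"{prop}: expected at most {highest}, found {count}")
    return problems


# A hypothetical, partly filled-in meeting record.
draft_meeting = {
    "MeetingName": ["Kickoff"],
    "uri": ["http://example.org/a", "http://example.org/b"],
}
problems = cardinality_violations(draft_meeting, MEETING_RESTRICTIONS)
```

Here the draft is flagged twice: it carries two `uri` values where at most one is allowed, and it has no `location`.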
15. Into a usable Modeling language
- In science, models provide interoperability across jargons
- Mathematical models: equations of a system
- Physical models: sticks and balls of the atom
- Virtual models: the visualization of a complex data set
- INFORMATION MODELS: taxonomies and thesauri
- Ontologies extend thesaurus information models to provide
- Semantic restrictions on property relations
- Must have vs. May have vs. Doesn't have
- Has some vs. has N vs. has 1
- Some vs. All property restrictions
- Formal underpinnings
- Logical entailments
- Note: rules, logics, proofs are parts of ontologies, but not yet at a consensus level for standardization
- Should build as add-ons to OWL to take advantage of terminology features
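The "some vs. all" distinction in the list above is easy to blur, so here is a minimal Python sketch. It treats a class as a plain set of known members, which is a closed-world simplification; the class contents and the `ex:` names are hypothetical.

```python
def has_some(values, cls):
    """Existential restriction: at least one value belongs to cls."""
    return any(v in cls for v in values)


def has_only(values, cls):
    """Universal restriction: every value belongs to cls.

    Vacuously true when there are no values at all.
    """
    return all(v in cls for v in values)


CANCERS = {"ex:Lymphoma", "ex:Melanoma"}   # hypothetical class extension

mixed = ["ex:Lymphoma", "ex:Flu"]          # one cancer, one non-cancer value
empty = []                                 # no values recorded

results = {
    "mixed_some": has_some(mixed, CANCERS),
    "mixed_only": has_only(mixed, CANCERS),
    "empty_some": has_some(empty, CANCERS),
    "empty_only": has_only(empty, CANCERS),
}
```

The empty case is the one that usually surprises people: a thing with no values for the property satisfies the "all" restriction but not the "some" restriction.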
16. OWL is not
- OWL is NOT
- A knowledge representation language per se
- Definitely not The standard for KR
- A Description Logic per se
- It does support DL idioms
- E.g. Lymphoma is restricted to be a subClassOf
those things whose disease property is Cancer
- It will include a subset which is
- Complete, decidable, in DL complexity class
- But, it will allow uses that DLs do not
- Maybe outside the semantics of the model theory
- The right thing to use in KR/KA research per se
- But do use it to distribute your results
- But do use it to test your theories
17. OWL is a WEB ontology language
- OWL is
- WEB-BASED
- DISTRIBUTED
- MACHINE-PROCESSIBLE
- BASED ON DAML+OIL
- By charter!
- It may become a Web recommendation
- Same language status as HTML, XML, XML Schema
- And SMIL, P3P, ...
- Standard ≠ Use
- A starting place for further evolution
18. Part 3: KA in the (OWL supported) Sem Web
- The good news
- DAML+OIL is already the most used ontology language in history
- Sept 30, 02: Crawler finds 5M DAML statements on 20,000 web pages
- Doesn't include many instance KBs tied to ontologies
- Doesn't include many very large RDFS-based KBs that include some OWL
- OWL is being supported by large corporation labs
- Web tool developers: IBM, HP, Sun, Intel, Fujitsu
- Content providers: Daimler-Chrysler, Nokia, Motorola, EDS, Agfa
- OWL is starting to be used by thesaurus distributors
- Cf. National Cancer Institute metathesaurus to be released in OWL
- The bad news
- On the web it is a statistical blip -- the web is HUGE (HUMONGOUS!!)
- The big players are still on the sidelines
- We could become the next XML or the next SMIL
19. Do we need KA?
- Tom Mitchell made an interesting point
- He says users are lazy: they won't do mark-up
- He says we should use NLP + machine learning (primarily)
- He's WRONG
- Greatest impact likely to be non-textual, non-document content
20. So who is going to mark it up?
- There are not now, and never will be, enough knowledge engineers to support the important, critical applications of our technology
- Government applications: NASA, US DoD
- Health Care applications: Open Health, Swiss hospitals
- Genomics/Bioinformatics: NCI metathesaurus, Gene Ontology
- ...
- Historians: Freedmans project
- Let alone the really important stuff out there
- MY information
- My photo archives, my home page, my daughter's home page, my project pages, my favorite hobby pages, etc. etc. etc.
Personal information created the Web!!!
21. Then a miracle occurs
22. Key: The Value Proposition
- Tools must consider work v. value
- People will NOT use tools that require a lot of work and have little (perceived) value
- People WILL use tools that save them work and/or provide high (perceived) value
- Perceived value ≠ real value in many cases
- Creating Web pages (ca. 1993) was cool
- No study has yet shown a positive work value for the Web as a whole
- But it has changed the way we live
- Viral: My friend sees it, wants one. My competitor sees it, needs one
TBL's secret advice: Start small but viral and you can change many things (July, 02)
23. Value Proposition 1: Semantic Page Creation
The personal info killer application?
[Diagram labels: Ont Library; "Tell me about your: Important Person, Hobby, Job"; Marked Up Pages; Query; "I know about: Scuba shop, Scuba vacation 1, Scuba vacation 2, Scuba instructor classes"; Choice; XHTML+OWL]
- Many people don't have home pages
- Value: Hints for useful properties (using ontology classes). Help create content (using ontology instances).
- Note: Useful libraries (lots of stuff) already exist (see daml.org)
24. Value Proposition 2: Semantic Web Portals
The MOSAIC of the Semantic Web?
[Diagram label: KB]
- Combine browsing, search, and authoring
- Value: As I link to concepts, I find useful resources: Pages, Databases, programs, etc.
25. Value prop 3: Semantic Web Services
26. VP 3: And service composition
Buy the French version of a book from amazon.fr
and have it sent to my mother
27. Semantic Web Knowledge Acquisition
- Virtually no one will create ontologies from scratch
- High-End ontology developers will be a tiny percentage (10,000 High end Web Designers 1/10,000 of users)
- It is easier to read than to create ontologies
- Expect cut and paste (HTML analogy)
- Most used OWL editor to date is Emacs
- Can Bootstrap from existing content
- HTML screen scrapers, structured data, Excel spread sheets, ...
- No training allowed
- Motivated users will skim the docs on occasion
- Most users want to use it now
- Everyone has a browser - deploy tools through that
- Common metaphors must be used: Form fill, menu, search
- Note: No formal justification for any of these - but it worked before!
28. Adding power via Semantic Web
- Tools can be domain independent
- Your tool should be usable in lots of contexts!
- Use the standards
- OWL and its successors crucial
- Tools should assume multiple ontologies
- It's the links, stupid
- Ontology search, collection, integration crucial
- Check out the DAML crawler (http://www.daml.org/crawler)
- BackEnd technologies must be scaleable
- Can co-evolve with Semantic Web size
- But remember, the Web is HUGE
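A tool that assumes multiple ontologies needs some way to treat linked terms as one concept. A rough Python sketch of that integration step, using a small union-find over equivalence statements, follows. Apart from `jobs:Professor`, every term and the `canonical_terms` helper are hypothetical, and real ontology integration involves far more than merging names.

```python
def canonical_terms(equivalences):
    """Group terms linked by equivalence statements; map each to one representative."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root so results are deterministic.
            small, large = sorted((root_a, root_b))
            parent[large] = small
    return {term: find(term) for term in list(parent)}


# Equivalence links as they might be collected from several ontologies.
mapping = canonical_terms([
    ("jobs:Professor", "uni:Faculty"),
    ("uni:Faculty", "hr:Academic"),
    ("mind:title", "hr:jobTitle"),
])
```

With a mapping like this, a query phrased in one ontology's vocabulary can be rewritten to the shared representative before it is run against data marked up with another.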
29. Allow extensibility
- Users MUST be able to add their own concepts
- Semantic Web (and OWL) allow this
- Advanced users will become ontology providers
- It will be cool to have yours be the ontology of choice in a domain
- Consistency CANNOT be maintained on the web
- May be a useful heuristic
- Insist on consistency and the Semantic Web fails!
30. GIVE IT AWAY!!!!!
- There is, and will be, no market for any of this unless we create it!
- No one will make money selling their tools until we have MANY more users
- Make small, cheap, easy to download versions of your tools available
- Give it away
- The big winners on the web made it available for free
- Browsers: Mosaic, Netscape, IE
- Plug-ins: Flash, RealPlayer, Quicktime
- Tools: Adobe, Real Media
31. Part 4: Mindswap tools
Maryland Information and Network Dynamics Laboratory Semantic Web Agents Project http://www.mindswap.org/
32. Practicing what I preach
- Open source Tools at http://www.mindswap.org
- Described in proceedings
- But out of date - open source moves fast
- Based on the principles outlined in this talk
- RIC: Ontologies make it EASIER to enter knowledge
- Turn properties into forms, use restrictions to check form filling
- Creates a KB of the results that can be used for search
- Coming soon: create a nice web page (using SXMLT)
- SMORE: Create content and markup as you go
- Multiple ontology
- ConvertToRDF: Dump spreadsheets to RDF using mapping ontology
- RDFScreenScraper: turn semi-structured web pages into
- ParkaSW: Scaleable, data-based KB back-end
- Some built in inferencing
- Pulled from the patent system to become open source!
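To give a feel for the spreadsheet-to-RDF idea, here is a minimal Python sketch that turns CSV rows into N-Triples lines using a column-to-property mapping. It is not the ConvertToRDF tool or its mapping ontology: the function name, the `example.org` URIs, and the plain-dictionary mapping are all assumptions made for illustration, and key values are assumed to be safe to append to a URI.

```python
import csv
import io


def rows_to_ntriples(csv_text, base_uri, key_column, column_to_property):
    """Emit one N-Triples line per mapped, non-empty cell."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_uri}{row[key_column]}>"
        for column, prop in column_to_property.items():
            value = row.get(column)
            if value:
                # Escape backslashes and quotes so the literal stays well-formed.
                escaped = value.replace("\\", "\\\\").replace('"', '\\"')
                lines.append(f'{subject} <{prop}> "{escaped}" .')
    return lines


CSV_TEXT = "id,name,workplace\nhendler,Jim Hendler,University of Maryland\n"
MAPPING = {
    "name": "http://example.org/mind#name",                # hypothetical property
    "workplace": "http://example.org/jobs#placeOfWork",    # hypothetical property
}
ntriples = rows_to_ntriples(CSV_TEXT, "http://example.org/people/", "id", MAPPING)
```

The mapping is the only part a user has to supply, which fits the talk's point that existing content should be bootstrapped with as little extra work as possible.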
33. Conclusions
- The Semantic Web is real, and it is moving fast
- Two years ago you hadn't heard of it, now it's on the cover of your proceedings
- We'll win if we remember the rules of the web
- Berners-Lee Principle: Build small but viral
- Hendler's Rule: On the web there is no THE
- Yours is ONE of the ways of doing it
- Consensus is hard, but critical
- We did it once and created DAML+OIL, the most-used AI language ever
- Everyone's application is needed
- Value proposition: Make it fun, cool, and useful and people will kill to do the markup (The Web proves this)
- Give it away: Create the markets and we'll all win
- YOUR work is important!
- This time it could be for real!