Title: Representing URI Resolution in OWL
1. Representing URI Resolution in OWL
- Alan Ruttenberg (responsible for errors)
- Matthias Samwald
- Jonathan Rees
2. What goes wrong with URLs
- The server disappears (s)
- The content disappears - 404 (c)
- The content might change and you want to know and communicate what it used to be (d)
- Access to the content is too slow (s)
- Access to the content is too public (p)
- The content is very big (b)
- You don't know if a URI is an information resource or not (w)
- You want to record and access metadata - information about some information resource - and you don't know where to get it. (m)
- You don't know what format an information resource is encoded in. (f)
3. Existing proposals: LSIDs
- Authority
- Location independence
- Data/Metadata distinction
- Access method independence
- Versioning
- (similar: ARK, PURL, DOI)
4. Existing proposals: httpRange-14
- Use http.
- Use result code to recognize potential non-information resources
- Result code 2xx: information resource
- Result code 3xx: any resource, pointer to more information
- Result code 4xx: unknown type
- 303 to get more information about the thing
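The result-code rule above can be sketched as a small lookup. This is only an illustration of the three-way reading given on the slide; the function name `classify_by_status` and the returned labels are hypothetical, not part of any proposal.

```python
def classify_by_status(status: int) -> str:
    """Map an HTTP result code to what it says about the thing a URI names."""
    if 200 <= status < 300:
        return "information resource"
    if 300 <= status < 400:
        # Could be any kind of resource; the response points to more
        # information (303 is the case singled out on the slide).
        return "any resource"
    if 400 <= status < 500:
        return "unknown type"
    return "not covered"  # 1xx and 5xx are not addressed on the slide


for code in (200, 303, 404):
    print(code, "->", classify_by_status(code))
```

Note that the answer is only available after a request has been made, which is the "late" objection raised later in the deck.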
5. Existing proposals: Content negotiation
- Agent asks for resource
- Server responds with list of content types and where to get each
- Agent chooses which to retrieve
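The agent's side of that exchange might look like the sketch below. It assumes the server's reply has already been parsed into a mapping from content type to location; the function name `choose_variant` and the example.org URLs are hypothetical.

```python
from typing import Optional


def choose_variant(
    variants: dict[str, str], preferences: list[str]
) -> Optional[tuple[str, str]]:
    """Return (content_type, url) for the first preferred type on offer."""
    for content_type in preferences:
        if content_type in variants:
            return content_type, variants[content_type]
    return None  # nothing the agent can use


# Hypothetical server reply: content type -> where to get it.
offered = {
    "text/html": "http://example.org/foo.html",
    "application/rdf+xml": "http://example.org/foo.rdf",
}

# The agent prefers Turtle, then RDF/XML, then HTML.
print(choose_variant(offered, ["text/turtle", "application/rdf+xml", "text/html"]))
```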
6. Short statements
- John Barkley
- Identify MIME Types of URIs.
- Identify versions of URIs
- URIs should dereference to something, even if it is only documentation, e.g., rdfs:comment
- Use LSIDs
7. Short statements
- Phil Lord
- URIs should identify one thing only.
- URI allocation schemes should encourage stability over time.
- Resources identified by URIs should be checksummable.
8. Short statements
- Matthias Samwald
- Be careful what you are talking about - use separate names
- Distinguish information resource from non-information resource
- Distinguish making a query from resolution
9. Short statements
- David Booth (Information resources/metadata)
- http://example.org/foobar might identify a thing (non-information resource), http://example.org/foo can be used to seek metadata about it.
- Dereferencing non-information resources yields an HTTP 303 result code
10. Selected issues with proposed solutions
- httpRange-14 - late: you don't know anything until you do the retrieval
- Content negotiation - confusion over what the thing is - e.g. FOAF: human-readable document, RDF at same address. Try talking about the ugly font.
- LSID - requires server deployment. Based on web services (slow). Unclear semantics of versions, metadata, data
- No single proposal deals with all issues
11. An Alternative
- Use our SW tools to help solve this problem
- Represent the information that you want to know about URIs in OWL.
- Build an ontology to represent consensus/schema.
- Take advantage of consistency checking, inheritance
12. Goals
- Transparent/explicit. Contract based.
- Adjustable
- Extendable
- Ontologically sound
13. Different things
- The temperature of a patient (not an information resource)
- An instrument that measures and reports temperature (not an information resource)
- The record retrieved when you query the instrument (an information resource)
- The record that you retrieved at a certain time and you copied and saved (an information resource)
but related
14. A sketch
- Some useful distinctions
- Top level classes
- Some properties
- Only about instances (not properties or classes)
15. InformationResource / NotAnInformationResource
- Information resources are conceptually Gettable
- They might not be able to be retrieved at a particular time
- They might change
- Ask yourself: Would it be possible to get the thing itself over a network?
- Disjoint
16. UnchangingInformationResource / EvolveableInformationResource
- UnchangingInformationResource is like LSID data. A promise is made that the content will never change.
- EvolveableInformationResource are resources that might change (even if we don't want them to, e.g. NCBI gene records)
- Disjoint
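Declaring these classes disjoint is what lets a consistency check catch a resource described in contradictory ways. The sketch below imitates that check in plain Python rather than with an OWL reasoner. The two disjoint pairs come from the slides; the subclass links to InformationResource and the helper names `expand` and `violations` are assumptions made for illustration.

```python
# Disjoint class pairs as stated on the slides.
DISJOINT = [
    ("InformationResource", "NotAnInformationResource"),
    ("UnchangingInformationResource", "EvolveableInformationResource"),
]

# Assumed subclass links: both kinds sit under InformationResource.
SUBCLASS_OF = {
    "UnchangingInformationResource": "InformationResource",
    "EvolveableInformationResource": "InformationResource",
}


def expand(types):
    """Add every superclass implied by the asserted types."""
    result = set()
    for t in types:
        while t is not None:
            result.add(t)
            t = SUBCLASS_OF.get(t)
    return result


def violations(types):
    """Return the disjoint pairs that the asserted types jointly violate."""
    inferred = expand(types)
    return [pair for pair in DISJOINT if pair[0] in inferred and pair[1] in inferred]


print(violations({"UnchangingInformationResource"}))
print(violations({"EvolveableInformationResource", "NotAnInformationResource"}))
```

The second call reports a clash only because the assumed subclass link makes the resource an InformationResource as well.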
17. RetrievalMethod
- A way to get an information resource.
- Some examples
- StandardURIRetrieval
- TransformUriRetrieval: http://genesdbs.org/entrez/7157 => http://cache.ibm.com/?generetrieveidhttp://genesdbs.org/entrez/
- SPARQLMethod: http://genesdbs.org/entrez/7157 =>
- select ?data FROM <http://sparql.ibm.com/lifesci> WHERE { <http://genesdbs.org/entrez/7157> :data ?data }
- WebServiceMethod: Supply a WSDL, name of parameter
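As a rough sketch, each retrieval method can be seen as a function from the URI to the request that should actually be issued. Nothing here touches the network. The cache prefix `http://cache.example.org/?id=` and the predicate `http://example.org/data` are placeholders invented for the example, since the slide's own cache URL and predicate are not fully recoverable.

```python
from urllib.parse import quote


def standard_uri_retrieval(uri: str) -> str:
    """StandardURIRetrieval: dereference the URI itself."""
    return uri


def transform_uri_retrieval(uri: str, cache_prefix: str) -> str:
    """TransformUriRetrieval: rewrite the URI into a request against a cache."""
    # Percent-encode the whole URI so it can travel as a query value.
    return cache_prefix + quote(uri, safe="")


def sparql_retrieval(uri: str) -> str:
    """SPARQLMethod: build a query asking an endpoint for data about the URI."""
    return f"SELECT ?data WHERE {{ <{uri}> <http://example.org/data> ?data }}"


gene = "http://genesdbs.org/entrez/7157"
print(standard_uri_retrieval(gene))
print(transform_uri_retrieval(gene, "http://cache.example.org/?id="))
print(sparql_retrieval(gene))
```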
18. RetrievalMethod (notes)
- There may be more than one.
- When there is more than one, try them all in random order, or explicitly represent preference.
- For company-specific retrieval, add another RetrievalMethod to an appropriate upper class (one more triple)
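The "random order, or explicit preference" rule could be sketched as follows. The function names, the use of a rank dictionary to express preference, and the two stand-in fetchers are all assumptions for illustration; real fetchers would perform the actual retrieval.

```python
import random


def order_methods(methods, preference=None, rng=random):
    """Return methods in try-order: by explicit preference if given, else shuffled."""
    if preference is not None:
        # Lower rank is tried earlier; methods without a rank go last.
        return sorted(methods, key=lambda m: preference.get(m, float("inf")))
    shuffled = list(methods)
    rng.shuffle(shuffled)
    return shuffled


def retrieve(uri, methods, fetchers, preference=None):
    """Try each method in turn and return the first successful result."""
    for name in order_methods(methods, preference):
        try:
            return name, fetchers[name](uri)
        except OSError:
            continue  # this method failed; move on to the next one
    raise LookupError(f"no retrieval method succeeded for {uri}")


# Stand-in fetchers: one always fails, one always succeeds.
def failing(uri):
    raise OSError("server disappeared")


def cached(uri):
    return b"cached copy of " + uri.encode()


fetchers = {"StandardURIRetrieval": failing, "TransformUriRetrieval": cached}
print(retrieve("http://genesdbs.org/entrez/7157", list(fetchers), fetchers))
```

Whatever order is drawn, the failing method is skipped and the cached copy is returned.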
19. InformationResourceFormat
- Explicitly give enough information to know what you will have to parse should you retrieve the resource (so you can choose whether or not to retrieve)
- Like mime/type - BUT only the format, not the type (that's for defining by class)
- RDFXML
- RDFTurtle
- JPEG
- TIFF
- HTML
- ...
- Note: Different formats of the same digital thing should be given different names. (but they may be related by some property, e.g. hasOtherRepresentation)
20. Some classes of InformationResource
- XrayImage
- Triples (rdf statements)
- MedicalRecord
- VersionInformation
- ProtocolDocumentation
- ProvenanceDescription
- SPARQLEndpoint
- ...
- (ask me about metadata!)
21. Properties
- Relate a NotAnInformationResource to an InformationResource
- seeAlso, subjectOfMedicalRecord, foaf:homepage, biozend-data (described by data)
22. Properties
- Relating an InformationResource to metadata
- previousVersion (UnchangingInformationResource)
- hasVersionDescription
- hasChangeDescription
- generatedBy (not an information resource)
- cachedDate
- hasMD5 (UnchangingInformationResource)
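A recorded hasMD5 value gives a direct way to test the "never changes" promise of an UnchangingInformationResource. A minimal sketch using Python's standard `hashlib` follows; the function name `matches_recorded_md5` and the sample payloads are hypothetical.

```python
import hashlib


def matches_recorded_md5(content: bytes, recorded_md5: str) -> bool:
    """Compare the MD5 of retrieved bytes with the recorded hasMD5 value."""
    return hashlib.md5(content).hexdigest() == recorded_md5.lower()


original = b"example payload"
recorded = hashlib.md5(original).hexdigest()  # value that would be stored as hasMD5

print(matches_recorded_md5(original, recorded))            # same bytes
print(matches_recorded_md5(b"changed payload", recorded))  # content has changed
```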
23. An example
Class: PartnersDigitalXray
  subclassOf NeverChangingInformationResource
  subclassOf DigitalXray
  mediaType hasValue jpegType
  retrievalmethod hasValue webServiceMethod1002
http://partners.org/radiology/817277366
  rdf:type PartnersDigitalXray
  mediaType jpegType (inherited)
  retrievalmethod webServiceMethod1002 (inherited)
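The "(inherited)" lines are what a reasoner would infer from the class-level hasValue restrictions. The sketch below mimics that inference with plain dictionaries. It is a simplification: the data layout and the function `inherited_values` are invented for illustration, and a real OWL reasoner would report conflicting values as an inconsistency rather than let one overwrite another.

```python
# Class-level hasValue restrictions and subclass links from the example.
CLASSES = {
    "PartnersDigitalXray": {
        "subclass_of": ["NeverChangingInformationResource", "DigitalXray"],
        "has_value": {
            "mediaType": "jpegType",
            "retrievalmethod": "webServiceMethod1002",
        },
    },
    "NeverChangingInformationResource": {"subclass_of": [], "has_value": {}},
    "DigitalXray": {"subclass_of": [], "has_value": {}},
}


def inherited_values(class_name):
    """Collect hasValue restrictions from a class and all of its superclasses."""
    values = {}
    for parent in CLASSES[class_name]["subclass_of"]:
        values.update(inherited_values(parent))
    # The class's own restrictions are applied last.
    values.update(CLASSES[class_name]["has_value"])
    return values


instance = {
    "uri": "http://partners.org/radiology/817277366",
    "rdf:type": "PartnersDigitalXray",
}
print(inherited_values(instance["rdf:type"]))
```

Only the rdf:type triple has to be stated for the instance; the media type and retrieval method follow from the class.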
24. Sharing bare URIs
- Don't, unless you have to. Generally messages should be a set of triples giving adequate information about type, resolution
- If you do, use existing best practices to make them last, e.g. PURL