Identifiers SIG Status Report

Provided by: scott136


1
Identifiers SIG Status Report
  • Moses Hohman
  • Northwestern University
  • mmhohman_at_northwestern.edu

2
Guiding Principles
  • Identifiers should:
    • Be locally assigned, globally unique
    • Be resolvable (if data is available)
    • Play nice with caching
    • Be permanent/resolve to data in a repeatable
      manner (or not resolve at all)
    • Be as simple as possible (for clients, for data
      providers, for the infrastructure, etc.)

3
LSID
  • urn:lsid:authority:namespace:object:revision
  • authority: DNS name under control of data
    provider (recommended) that identifies the
    provider uniquely
  • namespace: additional category whose use is up to
    the provider (e.g., may differentiate different
    data sources offered by the provider)
  • object: identifies the data object within the
    given data source
  • revision: optional field, use up to the provider
    (may be used to distinguish among different
    versions of a data object)
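
A minimal Python sketch of how a client might split an identifier of the form urn:lsid:authority:namespace:object[:revision] into the fields described on this slide. The names `Lsid` and `parse_lsid` are hypothetical and chosen only for illustration; they do not come from the presentation or from any LSID library.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str           # DNS name controlled by the data provider
    namespace: str           # provider-defined category
    object_id: str           # data object within the data source
    revision: Optional[str]  # optional, meaning is up to the provider


def parse_lsid(text: str) -> Lsid:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into fields."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    # The 'urn' and 'lsid' prefixes are compared case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty field in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)
```

For example, `parse_lsid("urn:lsid:auth.edu:ns:123:A1")` yields the authority `auth.edu`, namespace `ns`, object `123` and revision `A1`.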

4
Locally Assigned, Globally Unique
  • urn:lsid:authority:namespace:object:revision
  • authority: DNS name under control of provider
  • Does not make any requirements on resolution
    method
  • Why DNS name then?
  • Data provider has control over DNS name, which is
    guaranteed to be unique to that provider's
    organization
  • Minimal administrative overhead

5
Resolvable
  • Given an identifier, return data
  • Recommended resolution method: DDDS/DNS
  • LSID specification provides some recommendations
    for data service interface (as a web service)
  • Pros
  • Leverages existing, widely-deployed, distributed
    database technology
  • Lookup caching built in
  • Will work with both silver and gold services
  • Cons
  • Additional dependency
  • Doesn't start with "WS-" or "G" or "Grid" or
    "OGSA" - not a core grid technology
  • DNS zone file protectionism?
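
A rough Python sketch of the DNS-based discovery step, kept free of real network calls. It assumes the `_lsid._tcp.<authority>` SRV naming convention from the LSID specification's DNS discovery; treat that prefix as an assumption to check against the specification. `SrvLookup`, `srv_query_name` and `locate_service` are hypothetical names, and the lookup itself is injected so any DNS client (or a test stub) can supply it.

```python
from typing import Callable, Tuple

# Hypothetical signature: maps a DNS SRV query name to (host, port).
SrvLookup = Callable[[str], Tuple[str, int]]


def srv_query_name(lsid: str) -> str:
    """Build the SRV query name for the authority field of an LSID."""
    parts = lsid.split(":")
    if len(parts) < 5 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    # Assumption: the '_lsid._tcp.' prefix follows the LSID specification's
    # DNS-based discovery convention.
    return f"_lsid._tcp.{parts[2]}"


def locate_service(lsid: str, lookup: SrvLookup) -> str:
    """Return 'host:port' of the service responsible for this identifier."""
    host, port = lookup(srv_query_name(lsid))
    return f"{host}:{port}"


# Usage with a stub in place of a real DNS query (host and port are made up).
fake_dns = {"_lsid._tcp.auth.edu": ("lsid.auth.edu", 8080)}
endpoint = locate_service("urn:lsid:auth.edu:ns:123:A1", fake_dns.__getitem__)
```

Because the identifier carries only the authority's DNS name, the resolution method stays swappable: replacing `lookup` changes how services are found without changing any identifier.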

6
Caching, Repeatability and Permanency
  • Difference between retrieving data from cache vs.
    originating service?
  • Many kinds of caches at many scales for many
    reasons
  • Two approaches
  • Data freshness/TTL: not repeatable
  • Byte-identical service contract (enforceable?):
    recommended
  • Early grid will not have caches, but
    repeatability may well be a QoS concern
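
A small Python sketch of what the byte-identical contract buys a cache, and of one way a client could spot-check it. `ByteIdenticalCache` and the `fetch` callable (standing in for a call to the originating service) are hypothetical; nothing here comes from the LSID specification.

```python
import hashlib
from typing import Callable, Dict


class ByteIdenticalCache:
    """Cache keyed by identifier, relying on a byte-identical contract."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # hypothetical call to the originating service
        self._store: Dict[str, bytes] = {}

    def get(self, lsid: str) -> bytes:
        # No TTL: under the contract, cached bytes never go stale.
        if lsid not in self._store:
            self._store[lsid] = self._fetch(lsid)
        return self._store[lsid]

    def verify(self, lsid: str) -> bool:
        """Spot-check the contract by comparing digests with the origin."""
        cached = hashlib.sha256(self.get(lsid)).digest()
        fresh = hashlib.sha256(self._fetch(lsid)).digest()
        return cached == fresh
```

With a TTL approach the same identifier can return different bytes over time, so a result cannot be reproduced from the identifier alone. Here `verify` returning `False` signals that the provider broke the contract, which is about as much enforcement as a client can apply on its own.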

7
Example
  • Hibernate versions: optimistic locking
    mechanism
  • Every change to data object increments version
  • Also, software must trigger a higher-order
    increment whenever the data serialization
    software or data schema changes, e.g.
  • urn:lsid:auth.edu:ns:123:A1
  • "A" is incremented (B, ..., Z, AA, AB, ...) with
    software change
  • "1" is incremented by Hibernate
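
The two-part revision scheme on this slide can be sketched in Python as follows. The function names are hypothetical, and one detail is an assumption: the slide does not say whether the numeric part resets when the letter part changes, so this sketch leaves the number untouched on a software change.

```python
import re


def next_letters(letters: str) -> str:
    """Increment A, B, ..., Z, AA, AB, ... (like spreadsheet column names)."""
    chars = list(letters)
    i = len(chars) - 1
    while i >= 0:
        if chars[i] != "Z":
            chars[i] = chr(ord(chars[i]) + 1)
            return "".join(chars)
        chars[i] = "A"  # carry into the next position to the left
        i -= 1
    return "A" + "".join(chars)  # every position was Z, so add a letter


def bump_revision(revision: str, software_changed: bool) -> str:
    """Return the next revision string for a value such as 'A1'."""
    match = re.fullmatch(r"([A-Z]+)([0-9]+)", revision)
    if match is None:
        raise ValueError(f"unexpected revision format: {revision!r}")
    letters, number = match.group(1), int(match.group(2))
    if software_changed:
        # Assumption: the data version number is kept as-is.
        return f"{next_letters(letters)}{number}"
    # Ordinary data change: the step a version counter would perform.
    return f"{letters}{number + 1}"
```

Either kind of change produces a new revision string, so bytes already served under the old identifier never have to change.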

8
Simplicity
  • What do we really need right now?
  • What can we do away with at first, in order to
    simplify things?
  • What are the irreversible decisions?

9
Byte-identity and XML
  • Overkill?
  • Canonical XML (W3C Recommendation)
  • Not necessary for byte-identity
  • To me, byte-identity is really the simplest
    solution
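
To make the trade-off concrete, here is a short Python sketch (Python 3.8 or later) showing two serializations of the same XML data that differ byte for byte but agree after canonicalization. The sample documents are invented for illustration, and `xml.etree.ElementTree.canonicalize` implements Canonical XML version 2.0, which may differ from the version the slide had in mind.

```python
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Two made-up serializations of the same element: attribute order,
# quote style and empty-element form all differ.
doc_a = '<gene id="123" symbol="TP53"/>'
doc_b = "<gene symbol='TP53' id='123'></gene>"

# Byte comparison sees two different documents.
same_bytes = doc_a.encode("utf-8") == doc_b.encode("utf-8")

# Canonicalization normalizes those differences away.
same_canonical = canonicalize(doc_a) == canonicalize(doc_b)
```

Under a byte-identity contract the provider simply stores and re-serves the bytes it first published for an identifier, so no canonicalization step is needed; canonicalization matters only if the data is re-serialized on each request.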

10
Next steps
  • Complete communication with NCICB
  • Prioritize use case document
  • Develop a practical, phased plan for
    implementation
  • Develop plans to mitigate risks - what if X
    doesn't work? (e.g., wrapper library)
  • Get something out there for people to use!

11
Potential Phases
  • Phase 1: syntax, byte-identity, basic resolution
  • Phase 2: metadata (?), richer data service
    interface (retrieving most recent version of
    data, etc.)
  • Phase 3: address integration with query,
    potentially reconsider byte-identity and
    resolution protocol