Title: Identifying the Identifiers
1Identifying the Identifiers
- Douglas Campbell, National Library of New Zealand Te Puna Matauranga o Aotearoa
- DC-2007, 29th August 2007
2Designing Identifiers
- Identifier Theory
- Identifier Qualities
- Identifier Design Checklist
3Identifiers 101
Chair ? Seat Plastic and metal
thing 1 metre tall with four legs
4Identifiers 101
Chair Seat Plastic and metal thing 1 metre tall
with four legs The one on the left The blue one
5Identifiers 101
Chair Seat Plastic and metal thing 1 metre tall
with four legs The one on the left The blue
one The stacking chair Vitra Tom Vac Chair Asset
no. 1123-33
7Identifiers 999
Identify these
Application Profile (namespace)
Metadata schema (namespace)
Age (data field)
Economics (social networking tag)
Donor (relationship)
Economics (subject authority)
Boy with dog (photograph in collection)
Romeo and Juliet (FRBR work)
Boy with dog (digitised photo)
Craft Society (harvested, archived website)
Picasso, Pablo (1881-1973) (name authority)
Film director (agent role)
8Identifiers in Communication
- Identifiers are part of communicating to refer
to a thing
- The goal is sameness: Latin idem (the same) + facere (to make)
9Prerequisites for identifying
- To Identify we need to Differentiate
- To Differentiate we need to Compare sameness
- To Compare we need to Define / Describe
- To Describe we need to:
- Observe characteristics, e.g. size, location
- Interpret characteristics, e.g. smell, topic
- Assign new characteristics, e.g. name, logo, id string
10Identify by comparing characteristics
- We identify by comparing the sameness, or not, of
characteristics
?
Bring a seat in
?
?
11Identify by comparing characteristics
- We identify by comparing the sameness, or not, of
characteristics
?
Bring an office seat in
?
?
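The two requests above ("a seat" versus "an office seat") can be sketched as a comparison of characteristics. This is a minimal illustration only; the room contents and the characteristic names (`kind`, `style`, `colour`) are invented for the example and do not come from the slides.

```python
# Hypothetical room contents; every characteristic here is an assumption.
things = [
    {"kind": "seat", "style": "office", "colour": "blue"},
    {"kind": "seat", "style": "stacking", "colour": "blue"},
    {"kind": "seat", "style": "dining", "colour": "red"},
    {"kind": "table", "style": "office", "colour": "grey"},
]


def matching(candidates, **wanted):
    """Return the candidates whose characteristics all equal the wanted values."""
    return [
        thing
        for thing in candidates
        if all(thing.get(name) == value for name, value in wanted.items())
    ]


seats = matching(things, kind="seat")                         # "Bring a seat in"
office_seats = matching(things, kind="seat", style="office")  # "Bring an office seat in"

print(len(seats))         # 3: the request is still ambiguous
print(len(office_seats))  # 1: enough characteristics compared to identify the thing
```

Adding one more characteristic to the comparison is what turns an ambiguous request into an identification.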
12Promoting characteristics to identify
- Use description characteristic(s) as surrogate (substitute)
- A symbol to represent the thing
- Convenience: compare single identifiers instead of lots of characteristics
- Promotion changes the characteristic's role: purpose is differentiation, not description
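A rough sketch of promotion, assuming an assigned asset number is the characteristic chosen as the surrogate. The asset number 1123-33 echoes the earlier chair example; the second number and the field names are hypothetical.

```python
# Two chairs whose observed characteristics are identical.
# "1123-34" and the field names are assumptions made for this sketch.
chairs = [
    {"asset_no": "1123-33", "model": "Vitra Tom Vac", "colour": "blue", "legs": 4},
    {"asset_no": "1123-34", "model": "Vitra Tom Vac", "colour": "blue", "legs": 4},
]


def same_by_description(a, b):
    """Compare every descriptive characteristic except the promoted one."""
    keys = (set(a) | set(b)) - {"asset_no"}
    return all(a.get(key) == b.get(key) for key in keys)


def same_by_identifier(a, b):
    """Compare only the promoted characteristic."""
    return a["asset_no"] == b["asset_no"]


print(same_by_description(chairs[0], chairs[1]))  # True: description cannot tell them apart
print(same_by_identifier(chairs[0], chairs[1]))   # False: the identifier differentiates
```

The asset number says nothing useful about the chair; its only job is differentiation.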
13Context
?
Invitation Street Party at No. 4
?
?
?
14Describing and Identifying
Description
characteristics
characteristics
describe
contexts
associate
15Identifier Definitions
- Identifier
- A stated association between a symbol and a thing, such that the symbol may be used to unambiguously refer to the thing within a given context.
- Thing: Any entity, idea, action, resource, object, etc.
- Symbol: Any mark, token, sensory stimulus, character string, etc.
- Identifier system
- Policies, processes, and/or mechanisms for assigning, managing, and using identifiers.
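One way to read these definitions is as a small data model: an identifier is a (symbol, thing, context) association, and an identifier system is whatever assigns and resolves them. The sketch below is an assumption-laden illustration, not a prescribed design; the class names, method names, and street names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A stated association between a symbol and a thing within a context."""
    symbol: str
    thing: str
    context: str


class IdentifierSystem:
    """Hypothetical minimal system for assigning and using identifiers."""

    def __init__(self):
        self._by_key = {}

    def assign(self, symbol, thing, context):
        key = (context, symbol)
        existing = self._by_key.get(key)
        if existing is not None and existing.thing != thing:
            # Within one context a symbol must stay unambiguous.
            raise ValueError(f"{symbol!r} already refers to another thing in {context!r}")
        identifier = Identifier(symbol, thing, context)
        self._by_key[key] = identifier
        return identifier

    def resolve(self, symbol, context):
        return self._by_key[(context, symbol)].thing


system = IdentifierSystem()
system.assign("No. 4", "house on Rata Street", context="Rata Street")
system.assign("No. 4", "house on Kowhai Street", context="Kowhai Street")
print(system.resolve("No. 4", context="Rata Street"))  # house on Rata Street
```

The same symbol, "No. 4", refers to different things in different contexts, which is why the context is part of the key.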
16Semiotics
- Study of how we communicate using signs and symbols
- We use symbols with no intrinsic meaning; we add meaning around these symbols
17Semiotic Triangle
- "Nothing is a sign unless it is interpreted as a sign" (Charles Peirce)
Symbol
Object
18Semiotic Triangle
agent
Concept
Symbol
Object
19Semiotic Triangle
agent
Concept
Symbol
Object
(implied relationship)
- An identifier is a thought
20Semiotic Triangle
remembrance of association and context
agent
Concept
Symbol
Object
(implied relationship)
- Identifiers are the manifestation of the act of
identifying
21The Deconstructed Identifier
- The deconstructed identifier has six aspects
- A Thing
- A Symbol built from characteristics in a description
- An Association between the symbol and the thing
- A Context
- An Agent that states the association and context
- A Remembrance of the association and context
22Identifier Qualities
- Scope
- Uniqueness
- Granularity
- Intelligence
- Actionability
- Persistence
- Extensibility
- Context
23Scope
- Draw identifier from description, but be clear what is being described
- Newspaper article
?
DB
24Multiple scopes in one record
- A MARC record contains multiple scopes (that's not wrong, just be aware of it)
- Dublin Core's one-to-one rule is a useful alternative
100 Creator, 245 Title, 650 Subject, 856 URL
Vocabulary term
25Uniqueness
- Often want to refer to a thing unambiguously
Thing
Thing
Thing
26The unique Johns
- Cannot uniquely identify John
John
John
John
John
27Uniqueness
- A thing has only one identifier
- An identifier only relates to one thing
Thing
Thing
Thing
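The two uniqueness rules can be checked mechanically over a list of assignments. The following is a minimal sketch; the function name and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def uniqueness_problems(assignments):
    """Check (identifier, thing) pairs against both uniqueness rules.

    Returns two dicts: identifiers that relate to more than one thing,
    and things that carry more than one identifier.
    """
    things_by_id = defaultdict(set)
    ids_by_thing = defaultdict(set)
    for identifier, thing in assignments:
        things_by_id[identifier].add(thing)
        ids_by_thing[thing].add(identifier)
    ambiguous = {i: t for i, t in things_by_id.items() if len(t) > 1}
    duplicated = {t: i for t, i in ids_by_thing.items() if len(i) > 1}
    return ambiguous, duplicated


# Hypothetical assignments for illustration only.
assignments = [
    ("John", "John Smith"),
    ("John", "John Jones"),
    ("A1", "blue chair"),
    ("A2", "blue chair"),
    ("A3", "red chair"),
]
ambiguous, duplicated = uniqueness_problems(assignments)
print(ambiguous)   # "John" relates to two people
print(duplicated)  # "blue chair" has two identifiers
```

A check of this kind is also a reasonable starting point for auditing existing identifiers.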
29Global uniqueness
- Wrap naming authority identifier around local
identifiers
Org 1: 123
Org 2: 123
Org 3: 123
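Wrapping can be as simple as prefixing the local identifier with an identifier for the naming authority. In this sketch the authority names, the `:` separator, and the function name are assumptions, not part of any particular standard.

```python
def globally_unique(authority, local_id):
    """Wrap a local identifier with its naming authority's identifier."""
    return f"{authority}:{local_id}"


# Three organisations that each assigned the local identifier "123".
local = [("org1", "123"), ("org2", "123"), ("org3", "123")]
global_ids = [globally_unique(authority, local_id) for authority, local_id in local]

print(global_ids)  # ['org1:123', 'org2:123', 'org3:123']
```

Each organisation keeps control of its own numbering; only the authority identifiers need to be coordinated.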
30Granularity
- Question: How deeply should we break groups into separately identified things?
- Answer: If you have a need to identify it, then identify it!
- Methodology examples: FRBR, <indecs>
- In practice, the identifier system may dictate
constraints
- Journal
- Newspaper
- Page
- Article
- Photo
31Intelligence
- Adding roles of description and remembrance to identifiers
- Intelligent (semantic, transparent) vs. dumb (opaque)
- Intelligent identifiers are time-dependent, based on your world view at the time, e.g. country names, gay, email address
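The time-dependence of intelligent identifiers can be illustrated with a sketch that builds one identifier from descriptive characteristics and another with no embedded meaning. The identifier format, function names, and sample values are hypothetical.

```python
import uuid


def intelligent_id(country, year, title):
    """Build a semantic identifier from descriptive characteristics."""
    slug = title.lower().replace(" ", "-")
    return f"{country}-{year}-{slug}"


def opaque_id():
    """Build an identifier that carries no description at all."""
    return uuid.uuid4().hex


# The same report, described before and after its country was renamed.
before = intelligent_id("zaire", 1990, "Annual Report")
after = intelligent_id("drc", 1990, "Annual Report")
print(before)           # zaire-1990-annual-report
print(before == after)  # False: the world view changed, so the identifier no longer fits

# An opaque identifier is assigned once and is unaffected by later re-description.
assigned = opaque_id()
print(len(assigned))    # 32
```

The trade-off is that the opaque form needs its remembrance stored somewhere else, because nothing can be read off the string itself.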
32Location as Identifier
- Lazy identifiers, e.g.
- System file path to HTML file becomes the Web URL
- Location on shelf becomes the identifier
- Location identifiers are needed to access the thing, but may not be the best identifier to publicise
- Dilution: different identifiers for copies of the same thing
a.com/x.html b.org/y.html c.com/z.jsp
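Dilution can be sketched by counting how many identifiers end up in circulation for one thing. The three locations are the ones shown above; the assigned identifier `nlnz-0001` and the function name are hypothetical.

```python
# Three copies of the same thing, each at its own location.
# "nlnz-0001" is an invented identifier used only for this sketch.
copies = {
    "a.com/x.html": "nlnz-0001",
    "b.org/y.html": "nlnz-0001",
    "c.com/z.jsp": "nlnz-0001",
}


def identifiers_in_circulation(records, use_location):
    """Return the set of identifiers people would end up citing."""
    if use_location:
        return set(records)       # each copy's location doubles as its identifier
    return set(records.values())  # one assigned identifier, many locations


print(len(identifiers_in_circulation(copies, use_location=True)))   # 3
print(len(identifiers_in_circulation(copies, use_location=False)))  # 1
```

The locations are still needed for access; they are simply kept behind the identifier that is publicised.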
33Sidebar http URIs
- URI: Uniform Resource Identifier
- URL: Uniform Resource Locator (a kind of URI)
- Many original URLs were lazy identifiers (using file location)
- Many are now more considered identifiers; they just happen to start with http://, so are called "http URIs"
- Often are intelligent identifiers, so beware (as discussed)
34Actionability
Live / Dead
Actionable, Resolvable, De-referenceable
?
35Persistence
- How long does an identifier need to live?
- How do we keep it alive that long?
- Not a technology issue, is a commitment issue
- Need policies for handling changes in environment
- When an identifier is retired
- When the thing itself changes
- When the identifier system becomes obsolete
- When the custodian of the identifier changes
- Degree of mutability (allow identifiers to be
re-associated?)
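Some of these policies can be made explicit in the identifier system itself. The sketch below assumes one possible policy set: identifiers are never re-associated, the thing may move, and a retired identifier is remembered but stops resolving. The class, method names, identifier, and locations are all hypothetical.

```python
class Registry:
    """Hypothetical registry that encodes a few persistence policies."""

    def __init__(self):
        self._records = {}

    def register(self, identifier, location):
        if identifier in self._records:
            # Policy assumed here: no mutability, even after retirement.
            raise ValueError(f"{identifier!r} has already been assigned")
        self._records[identifier] = {"location": location, "retired": False}

    def move(self, identifier, new_location):
        """The environment changed; the identifier stays the same."""
        self._records[identifier]["location"] = new_location

    def retire(self, identifier):
        """Keep the remembrance, but stop resolving."""
        self._records[identifier]["retired"] = True
        self._records[identifier]["location"] = None

    def resolve(self, identifier):
        record = self._records[identifier]
        return None if record["retired"] else record["location"]


registry = Registry()
registry.register("id-1", "old-server/x.html")
registry.move("id-1", "new-server/x.html")
print(registry.resolve("id-1"))  # new-server/x.html
registry.retire("id-1")
print(registry.resolve("id-1"))  # None
```

The code is trivial; the commitment to keep running it, and to keep the policies stable, is the hard part.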
36Extensibility
- Persistence of identifier systems
- Risk from unanticipated demand
- Risk from re-use in unanticipated situations
- Risk from changes to the environment
- Future-proof by including capacity to be adapted
- As generic form as possible
- Hooks for community-defined extensions
- Consider scalability
- Follow international standards
- Keep application independent
37Context
- Remembrance of association and its context
- Remembrance of context alongside or combined
- ... except need to know what a URN is!
- Often context is missing, assuming the reader can infer it!
- Multiple identifiers in multiple contexts is not undesirable
- Sameness is different for different communities
- Though helpful for similar contexts to be combined
38Checklist for designing identifiers
- Audience
- Consider how the identifiers are intended to be used and potential downstream uses
- Scope
- Determine the thing(s) being identified/described (scope, granularity)
- Context
- Determine the context(s) things are being identified within (granularity). For example, is it a concept/item/component/instance/etc., or what communities will it serve?
39Checklist for designing identifiers
- Overlap
- Consider the relationship of the identifiers to other similar identifiers and/or contexts, consider merging
- Persistence
- Determine the expected identifier lifespan and strategies to preserve the relationship to the associated thing for that long (e.g. commitment level, resourcing, and policies)
- Design the identifier system
- Identifier structure design: uniqueness, intelligence, actionability, persistence, extensibility, and communication of context
- Addressability: combine identifiers or standalone identifiers?
- Support: policies, processes, and mechanisms
40Checklist for designing identifiers
- Assign locally
- Implementation (within your scope of control)
- Global uniqueness
- Wrap local identifiers with global authority identifiers for wider use
- Use them
- i.e. avoid using equivalent identifiers that may cause duplication or confusion!
41Conclusion
42Describing and Identifying
Description
characteristics
characteristics
describe
contexts
associate
43The Deconstructed Identifier
- The deconstructed identifier has six aspects
- A Thing
- A Symbol
- An Association
- A Context
- An Agent
- A Remembrance
remembrance of association and context
agent
Concept
Symbol
Object
(implied relationship)
44Action List
- Look backwards
- Identifier audit
- Look forwards
- Identifier goals and requirements
- Take action!
- Make identifiers unique internally
45DCMI Identifiers Community
- DCMI is not the right place to do identifiers work, but is a good place to disseminate
- Key issues raised in DC Conference Special Session
- Identification vs. resolution
- Importance of being able to resolve
- New DCMI Identifiers Community: October 2007
- http://dublincore.org/groups/identifiers/
- http://dublincore.org/identifierswiki/
46Identifiers Special Session
- Thursday 30th August 2007, 11.30am to 1pm, Karimata Room
- Identifier work
- NISO identifiers roundtable: John Kunze
- DOIs and URNs: Juha Hakala
- Identifier principles: Stu Weibel
- Identifiers at National Library of New Zealand: Douglas Campbell
- Discussion on identifier issues
- What are the issues or areas of confusion?
- What are the areas we DO understand?
- What are some best practices?
47Questions?
- http://www.dcmipubs.org/ojs/index.php/pubs/article/view/34
- douglas.campbell_at_natlib.govt.nz