Title: Citation Linking for Electronic Journal Articles
1Citation Linking for Electronic Journal Articles
- CNI Fall Task Force Meeting
- Phoenix AZ
- December 1999
2What we're going to talk about
- General model for reference linking (me)
- NISO / DLF / SSP / NFAIS workshops
- February 1999, Washington DC
- June 1999, Boston MA
- Appropriate copy problem (Dale)
- DLF Architecture Committee
- SFX
3Why talk about it?
- Publicize the activity so far
- Seeking interested parties
- How can we move this effort forward?
- Who can/should participate?
4What are we talking about?
- Reference (or citation) linking
- providing an actionable link from a reference to an object
- focus on electronic journal articles
- References
- from index databases (A&I services, search services, citation databases)
- from article references section (bibliography)
5What are we talking about (cont)
- Links
- maybe URL
- maybe some other link-key (identifier)
- Objects
- works / manifestations / items > creations
- content vs. surrogates / substitutes
6Puddles
- Closed systems where single agency controls both citations and content
- Publisher(s)
- Elsevier's ScienceDirect, Wiley's InterScience
- Aggregator service
- OCLC ECO
- Discipline
- NASA Astrophysics Data System, PubMed
7Puddles (cont.)
- User Community
- OhioLink, University of Toronto
- Problems with Puddles
- ok when everything a user wants is inside the puddle
- not ok when content is limited, arbitrary, or incongruent with user needs
8Open Reference Linking
- Any link to any object, regardless of which system the link, the object, or the user is in
- Assume multiplicity
- Require interoperability
9WHAT WE ARE TRYING TO ACCOMPLISH
[Diagram: a citation in any old system carries a LINK; a CLICK on the LINK leads, by MAGIC, to the cited article.]
10Model for open reference linking
[Diagram. Boxes: Publisher, Reference Database, Location Database, Content, Client. Labeled flows: Citation, Identifier(s), URL(s), Content.]
11Pieces of the problem
- Get a link for a reference
- Resolve the link to one or more locations of the target document
- Identify the most appropriate copy or copies of the target document for the user
12URL or Identifier
- Multiple locations
- Persistence
- Data management
- Nearly all implementations find identifier necessary
- Identifier: name-based
13How to get a link: derived vs. dumb
- Derived: Construct it from data in the reference
- shared within a discipline (ADS)
- national standard (SICI)
- cope with multiplicity (S-Link-S)
- Dumb: Look it up from data in the reference
- e.g. DOI-X
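The derived/dumb distinction above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Reference` fields, the key format, and the lookup table are hypothetical and do not reproduce SICI, ADS, S-Link-S, or DOI-X.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Reference:
    """Citation data as it might appear in a reference (hypothetical fields)."""
    issn: str
    year: str
    volume: str
    first_page: str


def derived_link_key(ref: Reference) -> str:
    """Derived: construct the key from data in the reference alone.

    The key format here is invented for illustration only.
    """
    return f"{ref.issn}/{ref.year}/{ref.volume}/{ref.first_page}"


# Hypothetical table standing in for a lookup service.
LOOKUP_TABLE: Dict[Tuple[str, str, str, str], str] = {
    ("1234-5678", "1999", "12", "345"): "id:example-0001",
}


def dumb_link_key(ref: Reference) -> Optional[str]:
    """Dumb: the key carries no meaning, so it has to be looked up."""
    return LOOKUP_TABLE.get((ref.issn, ref.year, ref.volume, ref.first_page))


ref = Reference(issn="1234-5678", year="1999", volume="12", first_page="345")
print(derived_link_key(ref))
print(dumb_link_key(ref))
```

In this sketch the derived key needs no registry but depends on everyone agreeing on the construction rule, while the dumb key depends on someone maintaining the table.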
14How to get a link: static vs. dynamic
- Static: Pre-constructed
- embedded in the source document
- stored in a table associated with the source
- Advantage: opportunity to review and correct
- Dynamic: Supplied on-the-fly
- looked up or calculated when citation or reference displayed
- Advantage: currency and flexibility
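A small sketch of the trade-off on this slide, under stated assumptions: the reference strings, the `links` dictionary, and the function names are hypothetical stand-ins for a real link source.

```python
from typing import Callable, Dict, Iterable, Optional

# A lookup function maps a reference string to a link, or None if unknown.
Lookup = Callable[[str], Optional[str]]


def build_static_table(references: Iterable[str], lookup: Lookup) -> Dict[str, Optional[str]]:
    """Static: resolve once, ahead of time; the table can be reviewed and corrected."""
    return {reference: lookup(reference) for reference in references}


def render_static(reference: str, table: Dict[str, Optional[str]]) -> Optional[str]:
    """Return whatever link was stored when the table was built."""
    return table.get(reference)


def render_dynamic(reference: str, lookup: Lookup) -> Optional[str]:
    """Dynamic: resolve when the reference is displayed."""
    return lookup(reference)


links = {"Smith 1998": "id:a1"}      # hypothetical link source
lookup = links.get
table = build_static_table(["Smith 1998", "Jones 1999"], lookup)

links["Jones 1999"] = "id:b2"        # a link that appears after the table was built

print(render_static("Jones 1999", table))    # the stored table does not know it yet
print(render_dynamic("Jones 1999", lookup))  # the on-the-fly lookup does
```

The static table is stale for the late-arriving link but is available for review; the dynamic lookup is current but leaves no stored result to correct.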
15Static and Dynamic Linking
16Model: how to get a link
[Diagram. Boxes: Publisher, Reference Database, Client. Labeled flows: citation, Identifier(s).]
17Resolve the link to location(s)
- For given identifier
- look up in database mapping identifier to location(s)
- return list of locations where items may be found
- return additional information to distinguish between items (e.g. format)
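The resolution step described above might look like the following sketch. The identifier, the URLs, and the in-memory `LOCATION_DB` are hypothetical; a real location database would be a shared service.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Location:
    """One place where an item may be found, plus data to tell items apart."""
    url: str
    format: str


# Hypothetical location database: identifier -> known locations.
LOCATION_DB: Dict[str, List[Location]] = {
    "id:example-0001": [
        Location("http://publisher.example/0001.pdf", "PDF"),
        Location("http://mirror.example/0001.html", "HTML"),
    ],
}


def resolve(identifier: str) -> List[Location]:
    """Return every known location for the identifier (empty list if unknown)."""
    return list(LOCATION_DB.get(identifier, []))


for location in resolve("id:example-0001"):
    print(location.format, location.url)
```

Returning all locations, rather than one, leaves the choice of the most appropriate copy to a later step.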
18Model: how to resolve a link
[Diagram. Boxes: Publisher, Location Database, Client. Labeled flows: Identifier, URL(s).]
19How to resolve the link
- In puddles
- may be single type of link
- may be handled by system software
- In open reference linking
- will be multiple types of links
- need to find appropriate resolution service(s)
- need protocol for communicating with resolution
service
20How to find appropriate resolver
- Currently
- Browser plug-in
- Proxy server
- Tunnel identifier in URL
- Future ?
- URN model of distributed resolution
- web browser support for user configuration of a
hierarchy of identifier resolution services
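One reading of "tunnel identifier in URL" is that an ordinary HTTP URL points at a resolver and carries the identifier as a parameter. The sketch below assumes that reading; the resolver address, the `id` parameter name, and the identifier are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical resolver address; not a real service.
RESOLVER_BASE = "http://resolver.example/resolve"


def tunnel(identifier: str) -> str:
    """Wrap an identifier in a plain URL that any browser can follow."""
    return f"{RESOLVER_BASE}?{urlencode({'id': identifier})}"


def untunnel(url: str) -> str:
    """Recover the identifier on the resolver side."""
    return parse_qs(urlparse(url).query)["id"][0]


url = tunnel("id:example/0001")
print(url)
print(untunnel(url))
```

The appeal of this approach is that it needs no browser changes; the cost is that the resolver's address is fixed inside every link.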
21WHAT IF MORE THAN 1 COPY EXISTS?
- Elsevier journals, for example, are available from
- Elsevier ScienceDirect
- University of Michigan PEAK
- OhioLink
- University of Toronto
- Florida Center for Library Automation
22WHICH URL?
[Diagram: a NAME goes to the Name Resolver; which URL comes back? Sciencedirect.com? Ohiolink.edu? Utoronto.ca? Umich.edu? FCLA.edu?]
IT SOMETIMES DEPENDS ON WHO THE USER IS...
23SOURCES OF MULTIPLE COPIES
- Aggregators
- OCLC, EBSCO, Bell & Howell, Lexis/Nexis, IAC
- Local loading
- OhioLink, University of Toronto, University of Florida
- E-print
- xxx (LANL), Cogprints, RePEc
24WHY MULTIPLE COPIES
- Performance -- may want highly used objects closer to the user in network terms
- Different players can provide different service models using same content
- e.g., gathering topically related materials into knowledge bases (Ovid)
- published and unpublished articles in a single e-print service
25WHY MULTIPLE COPIES (continued)
- Competition in repository services
- Encourages functional innovation
- Rationalizes prices for services
- Archiving
- Institutional failure is as great a danger as
technological failure, particularly when dealing
with commercial players
26CURRENT STATE
- Few working solutions (LinkOut @ NIH, SFX prototype @ UGhent and LANL)
- DLF/CNRI discussion of the following 3 models
- All intervene in the name resolution process to select the appropriate URL to return
27LOCAL CACHE
[Diagram. Components: Local Name Resolver, Universal Name Resolver. Numbered flows:]
1. Name Resolution Request
2. Address (if found locally)
OR
2. Name Resolution Request (if address not found locally)
3. Address
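The local-cache flow on this slide can be sketched as follows. The class name, the table of local addresses, and the stand-in universal resolver are assumptions for illustration; how the local table gets populated is not specified on the slide.

```python
from typing import Callable, Dict, List, Optional


class LocalNameResolver:
    """Answers from a local table when it can, otherwise forwards upstream."""

    def __init__(self, local_addresses: Dict[str, str],
                 universal: Callable[[str], Optional[str]]) -> None:
        self.local_addresses = local_addresses  # hypothetical local table
        self.universal = universal              # stand-in for the universal resolver

    def resolve(self, name: str) -> Optional[str]:
        address = self.local_addresses.get(name)
        if address is not None:
            return address            # address found locally
        return self.universal(name)   # not found locally: forward the request


forwarded: List[str] = []  # records which requests reached the universal resolver


def universal_resolver(name: str) -> Optional[str]:
    forwarded.append(name)
    return {"id:b": "http://publisher.example/b"}.get(name)


resolver = LocalNameResolver({"id:a": "http://library.example/a"}, universal_resolver)
print(resolver.resolve("id:a"))  # answered locally
print(resolver.resolve("id:b"))  # forwarded to the universal resolver
```

Only names missing from the local table ever reach the universal resolver, which is what lets an institution substitute its own copy for the default one.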
28PROFILE-BASED FILTER
[Diagram. Components: Universal Name Resolver, Filter Server, Reference Server. Numbered flows:]
1. Name Resolution Request
2. Name Resolution Request
3. Addresses (URL1, URL2, URL3, ...)
4. Request Bibliographic Data (if appropriate source is ambiguous)
5. Bibliographic Data
6. Address
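A minimal sketch of the filtering step in the profile-based model, assuming the profile is simply an ordered list of preferred hosts. That profile shape, the fallback to the first address, and the example URLs are assumptions; the bibliographic-data step for ambiguous cases is omitted.

```python
from typing import List, Optional
from urllib.parse import urlparse


def filter_by_profile(urls: List[str], preferred_hosts: List[str]) -> Optional[str]:
    """Pick the address whose host ranks highest in the user's profile.

    Falls back to the first address when no host matches (an assumption),
    and returns None when there are no addresses at all.
    """
    by_host = {urlparse(url).hostname: url for url in urls}
    for host in preferred_hosts:
        if host in by_host:
            return by_host[host]
    return urls[0] if urls else None


# Hypothetical addresses returned by the universal name resolver.
addresses = [
    "http://publisher.example/article1",
    "http://consortium.example/article1",
]
print(filter_by_profile(addresses, ["consortium.example", "publisher.example"]))
```

The filter needs no contact with the content services, but it is only as good as the profile it is given.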
29BROADCAST-RESULT-BASED FILTER
[Diagram. Components: Universal Name Resolver, Filter Server, Content Service 1, Content Service 2, Content Service 3. Numbered flows:]
1. Name Resolution Request
2. Name Resolution Request
3. Addresses (URL1, URL2, URL3, ...)
4. Availability Query (sent three times, once per content service)
5. Availability
6. Availability
7. Availability
8. Address
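The broadcast model might be sketched like this. The availability check is a hypothetical callable standing in for a query to each content service, and "return the first available address" is an assumed selection rule.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (url, user) -> True if that content service will serve this user.
AvailabilityCheck = Callable[[str, str], bool]


def broadcast_filter(urls: List[str], user: str,
                     check_availability: AvailabilityCheck) -> Optional[str]:
    """Query every content service, then return the first available address."""
    answers: List[Tuple[str, bool]] = [
        (url, check_availability(url, user)) for url in urls
    ]
    for url, available in answers:
        if available:
            return url
    return None


# Hypothetical rights data standing in for the services' answers.
RIGHTS: Dict[Tuple[str, str], bool] = {
    ("http://service1.example/a", "alice"): False,
    ("http://service2.example/a", "alice"): True,
    ("http://service3.example/a", "alice"): True,
}


def check(url: str, user: str) -> bool:
    return RIGHTS.get((url, user), False)


candidates = [
    "http://service1.example/a",
    "http://service2.example/a",
    "http://service3.example/a",
]
print(broadcast_filter(candidates, "alice", check))
```

Unlike the profile-based filter, this asks the services themselves, at the cost of one query per candidate address on every resolution.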
30SOME ISSUES
- Ugly, ugly, ugly
- In part because linking is to articles, most access based on serial title and year
- All solutions require a lot of coordination
- Users who are members of multiple rights communities are a major complication
31SFX LINKING SYSTEM
[Diagram. Components: Portal, SFX Server, External SFX Aware Service, Service, Cookie Pusher. Numbered flows:]
1. Service cookie-pusher URL
2. Cookie info
3. Cookie redirect to service
4. Service request cookie
5. Article or citation SFX links based on cookie
6. Request for links
7. Page of links
32SFX vs Name Based Linking
- SFX
- generalized for many kinds of links (including to paper copies)
- requires explicit cooperation of citation source
- SFX does not simplify providing appropriate link
- but can work with both algorithmic and name-based links
- and methodology provides bibliographic context for link derivation
33So
- Different approaches have different strengths
- mix and match possible
- The big issue: who has the motivation to address this seriously?
- Interested? Contact us!
34The requisite URLs
Report on NISO/DLF/SSP/NFAIS meetings
http://www.dlib.org/dlib/july99/caplan/07caplan.html
Paper on DLF/CNRI appropriate copy discussion
http://www.niso.org/DLFarch.html
and contact information
pcaplan@nersp.nerdc.ufl.edu
dale_flecker@harvard.edu