The Emerging Framework for Scholarly Communication

1
The Emerging Framework for Scholarly Communication

Steve Hitchcock
The Open Citation Project (OpCit), Southampton University
These slides prepared for The Future of Journal Publishing, at Nottingham University, 22 March 2002
OpCit is a joint JISC-NSF International Digital Libraries Project, 1999-2002

2
Emerging framework: the hypothesis
Scholarly electronic information will be seamless and integrated

3
Scholarly electronic information will be
seamless and integrated

The provable truth, using Google:
"seamless integration of information": 500 results, mostly companies offering network and inter-application software
"seamless access to information": almost 1000 sites, portals and gateways to the fore
"seamless linking": 450 sites, leading with journal publishers and databases
Results based on Google searches, November 2001

4
What is seamless integration?

From any given document the user might expect to
be able to retrieve any related document within
one mouse click.
Typically what is related is defined, and linked,
by the author or publisher or other service
provider, and is constrained by the tools and
information services at their disposal.
In the longer term, the relation may be anything
the user might consider to be related.

5
Achieving seamless integration: Web services

Emerging Web services standards are motivated by
the need to connect business processes,
especially databases, across the Web. The basic
platform for Web services is XML plus HTTP,
maintaining the ubiquity and simplicity of the
Web. Web services are based on three mechanisms:
to describe a service (e.g. Web Services Description Language, WSDL)
to find a service (e.g. a registry such as Universal Description, Discovery, and Integration, UDDI)
to communicate (e.g. Simple Object Access Protocol, SOAP)
http://www.w3.org/2002/ws/
Digital library architectures are evolving to
include Web services-like components, and may
ultimately migrate to these emerging standards.
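As a rough illustration of the "XML plus HTTP" platform, the sketch below builds the kind of SOAP message a client would POST to a Web service. It is a minimal sketch only: the service namespace, the operation name FindCitations and its parameter are hypothetical, and only the SOAP 1.1 envelope namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/citation-service"     # hypothetical service namespace


def build_soap_request(operation, params):
    """Return a SOAP 1.1 envelope (as text) that calls `operation` with `params`."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation and its arguments live in the service's own namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical operation and argument, for illustration only.
    print(build_soap_request("FindCitations", {"identifier": "hypothetical-paper-id"}))
```

In a real deployment the operation names and message shapes would come from the service's WSDL description rather than being hard-coded.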

6
Is seamless integration possible for the refereed
scholarly literature?

For scholarly research papers (those destined
for peer-reviewed journal publication, written by
authors who do not seek direct payment for
publishing their work) this prospect raises two
subsidiary questions about the seamlessly
integrated literature:
Will it be complete (from the viewpoint of every user)?
Will it be free (or appear to be free)? A work may appear to be free to the user when it is accessed via a library, for example.
The refereed scholarly literature will need to be
complete, everywhere, if seamless integration,
even on a modest scale, is to be achieved.

7
Progress in libraries

Site licences for electronic journals, and more aggregated content from database services
Alternative journals, e.g. support for the Scholarly Publishing and Academic Resources Coalition (SPARC), to increase competition in the journal market by facilitating partnerships with publishers and other journal producers
Open Archives Initiative: interoperability standards to facilitate the efficient dissemination of content
Fast-track standardization of OpenURL, to link users to these subscription and document services, recognising that this vast new array of electronic content needs to be accessible and navigable by users within the library's information environment

8
Site licences

By licensing access to bundled collections of
e-journals, libraries can claim to have satisfied
their objective of better value for money in
terms of cost per page delivered to users.
The site from which users access content could
be an institution, a state-wide group of
institutions (e.g. OhioLINK), a national
collective, such as in Canada, or even all the
people of a nation, as in Iceland. The UK has the
National Electronic Site Licence Initiative
(NESLI), which brokers deals between publishers
and participating institutions.
The OhioLINK strategy: enablers rather than gatekeepers
OhioLINK claims to have overcome the "library-imposed, self-limiting, collection development mentality of information rationing that pervades our community." (Thomas Sanville, Executive Director, OhioLINK)

9
Making appropriate connections

Site licences give libraries access to more
journal titles. Another outcome of the serials
crisis is that fewer non-core journals are
subscribed to, and libraries have resorted to
just-in-time document delivery and collections
from licensed full-text aggregators.
Library users may thus have authority to access a
paper free of charge via one library subscription
or another. Directing each user to the copy they
are entitled to has become known as the
"appropriate copy" problem.
OpenURL is a generalized framework for
communicating and resolving links and supports
software solutions to the appropriate copy
problem. OpenURL is described as an
interoperability specification.
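The appropriate copy problem can be sketched as a small selection step inside a link resolver: of all the places a paper is available, keep only those the user's library has licensed, in the library's order of preference. This is a minimal sketch under stated assumptions; the provider names, URLs and the Copy type are hypothetical and do not describe any particular resolver product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    provider: str  # who hosts this copy (hypothetical labels below)
    url: str       # where this copy lives


def appropriate_copies(copies, licensed_providers):
    """Return the copies this library's users may access, most preferred first.

    `licensed_providers` lists the providers the library has licensed,
    in the library's own order of preference.
    """
    rank = {provider: i for i, provider in enumerate(licensed_providers)}
    usable = [c for c in copies if c.provider in rank]
    return sorted(usable, key=lambda c: rank[c.provider])


if __name__ == "__main__":
    # The same paper, available from three hypothetical sources.
    copies = [
        Copy("publisher", "http://publisher.example.com/paper/1"),
        Copy("aggregator", "http://aggregator.example.com/paper/1"),
        Copy("document-delivery", "http://docdel.example.com/order/1"),
    ]
    # This library prefers its aggregator licence, then the publisher site.
    for copy in appropriate_copies(copies, ["aggregator", "publisher"]):
        print(copy.provider, copy.url)
```

The point of the design is that the choice depends on who is asking: the same citation resolves to different copies for users at different institutions.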

10
Syntax of OpenURL

http://(who you are, where you are, your institution)/(where you want to go)
An OpenURL is mediated by the HTTP protocol
BASEURL: data about the user, typically inserted during transport between servers. One interim mechanism is to store the BASEURL as a cookie in the user's browser. The cookie identifies the resolver that provides context-sensitive services for the user.
QUERY: points to the referenced object, which might be
an identifier, e.g. Digital Object Identifier (DOI)
metadata derived from an authored reference
partial metadata, where a secondary service identifies the required document
OpenURL has been proposed as a National Information Standards Organization (NISO) standard: http://library.caltech.edu/openurl/
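The BASEURL and QUERY split can be sketched in a few lines. This is an illustration only: the resolver address and the DOI are hypothetical, and the query keys (id, genre, issn, volume, spage) follow the general style of the draft OpenURL syntax but should be treated as assumptions, not as a statement of the standard.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(base_url, *, doi=None, metadata=None):
    """Join a resolver BASEURL to a QUERY describing the referenced object."""
    query = {}
    if doi:
        query["id"] = f"doi:{doi}"   # identifier-based reference
    if metadata:
        query.update(metadata)       # metadata-based reference
    if not query:
        raise ValueError("an OpenURL needs an identifier or some metadata")
    return f"{base_url}?{urlencode(query)}"


def parse_openurl(openurl):
    """Split an OpenURL back into (BASEURL, QUERY dictionary)."""
    parts = urlsplit(openurl)
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return base_url, query


if __name__ == "__main__":
    resolver = "http://resolver.example.edu/menu"  # hypothetical BASEURL
    print(build_openurl(resolver, doi="10.9999/example.1"))
    print(build_openurl(resolver, metadata={
        "genre": "article", "issn": "1234-5678", "volume": "12", "spage": "1"}))
```

Because the BASEURL is supplied by the user's context (for example from a cookie), the same QUERY can be sent to whichever resolver serves that user's institution.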

11
Example OpenURL architecture

OpenURLs might be based on CrossRef DOI services
(from Beit-Arie et al., 2001, D-Lib Magazine,
September) http://www.dlib.org/dlib/september01/caplan/09caplan.html

12
The Open Archives Initiative (OAI)

The OAI (http://www.openarchives.org/) defines:
A Metadata Harvesting Protocol (MHP), an application-independent interoperability framework that can be used by a variety of communities engaged in publishing content on the Web
Two classes of participants:
Data providers expose metadata about content
Service providers issue protocol requests to data providers
OAI is a very simple, low-barrier-to-entry interface, shifting implementation complexity and operational processing load away from the data repositories to the developers of federated search services, repository redistribution services, etc.
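The division of labour between data providers and service providers can be sketched from the service provider's side: form a harvesting request, then pull the metadata out of the XML that comes back. This is a minimal sketch with loud assumptions: it uses the request parameters and namespaces of OAI-PMH version 2.0, which postdates these slides; the archive address is hypothetical; and the sample response is hand-written for illustration, not harvested from a real repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces as used by OAI-PMH 2.0 with unqualified Dublin Core (assumption: v2.0).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the request a service provider sends to a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    return [
        (
            record.findtext("oai:header/oai:identifier", namespaces=NS),
            record.findtext(".//dc:title", namespaces=NS),
        )
        for record in root.iterfind(".//oai:record", NS)
    ]


if __name__ == "__main__":
    print(list_records_url("http://archive.example.org/oai", set_spec="physics"))
    print(extract_titles(SAMPLE_RESPONSE))
```

All the work of indexing, merging and presenting the harvested records falls on the service provider, which is what keeps the barrier to entry low for the repositories themselves.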

13
OAI service providers: an example

The Open Citation project: interposing an OAI
service provider between document (eprints)
source and user interface

14
Creating information interfaces: portals

We have to manage the underlying complexity in
the form of interfaces. Portals have become
important interfaces in the scholarly
environment. Portal strategies
by publishers (e.g. Elsevier's ScienceDirect),
by associated networked information services (e.g. Ingenta),
by library resource discovery networks (e.g. JISC's RDN)
have yet to establish a pre-eminent model. This
is because all have concentrated on content,
mostly owned content. The best next-generation
portals will build services on top of content,
and for researchers will become the starting
point for all lines of enquiry.

15
Information interfaces: RDN example

JISC RDN is a good example of building on content
to provide new services and adaptable interfaces.
The individual subject networks, in medicine,
engineering, humanities and others, can be
searched as though they were one unified
repository, and an interface presenting users
with this search facility can be embedded in any
library Web page.

Guiding the implementation of these services is
the JISC Information Environment (from Powell and
Lyon 2001) http://www.ukoln.ac.uk/distributed-systems/dner/arch/dner-arch.html

16
Multiple cooperating services in the communication chain

FROM: documents on a server, delivered over HTTP to a user interface on the client
TO: OpenURL, OAI and JISC IE mediating content from site licences, eprint archives, etc.

17
Access and interfaces: implications for journals

Digital information, rich in media and resources,
formal and informal, mediated by multiple
services, presents the user with an array of
choices that might answer his or her queries most
efficiently.
Those queries might be expressed as input to a
search engine, or by selecting a link. Where
might these citations come from? Personal emails,
discussion lists, open access services such as
OAI, eprint archives, newsletters, library
services, Z-gateways and academic subject
portals, as well as formal research papers and
commercial indexing services. There will be many
more.
The journal package has traditionally been bound
in issues and volumes. With the advent of
multiple networked sources mediated by services
such as OpenURL, the binding has been unstitched.

18
What are digital journals for?

Journals will be scaled back to the single
essential function of quality control, in the
form of managed peer review
Access to journal contents will be mediated by
multiple interfaces (open access services,
portals and information interfaces), not just by
the journal itself.
Journals cannot remain the exclusive provider of
peer-reviewed papers

19
A post-Google information environment

Electronic journals exist in a post-Gutenberg and
a post-Google information environment
By March 2001 the Internet Archive had stored 10
billion Web pages (100 terabytes of data)
The ability to locate a specified item of
information precisely and instantly among the
mass of information available on the Web has
profound implications. In the electronic
environment the search engine has become the de
facto interface to information, rather than the
fragmented packages that have migrated from the
print world.

20
Building eprint archives

EPrints.org: software for building institutional eprint archives for author self-archiving
Version 2.0, February 2002
OAI-compliant
Free open source software
Developed at the Electronics and Computer Science Department, University of Southampton
http://www.eprints.org/

21
A maximising strategy for authors

Authors who self-archive their papers in
OAI-compliant institutional or discipline-based
eprint archives will:
Maximise interfaces to their work
Maximise access to their work
Maximise impact of their work

22
Maximising access: arXiv example

Decreasing citation latencies: the latency of the citation peak has been reducing over the period of the archive, i.e. each year papers are cited sooner and more often
Mining the Social Life of an Eprint Archive: http://opcit.eprints.org/tdb198/opcit/
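One way to picture "latency of the citation peak" is as the most common gap, in months, between a paper being deposited and a citing paper being deposited. The sketch below computes that from (cited date, citing date) pairs. It is an illustration under assumptions: the function names are hypothetical, the dates are invented, and it is not the method used in the study cited above.

```python
from collections import Counter
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (negative if reversed)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def latency_histogram(citation_pairs):
    """Count citations by latency in months.

    `citation_pairs` holds (deposit date of cited paper, deposit date of
    citing paper) tuples; pairs with a negative latency are ignored.
    """
    latencies = (months_between(cited, citing) for cited, citing in citation_pairs)
    return Counter(months for months in latencies if months >= 0)


def peak_latency(histogram):
    """Latency (in months) at which citations are most frequent; ties go to the shortest."""
    most = max(histogram.values())
    return min(months for months, count in histogram.items() if count == most)


if __name__ == "__main__":
    # Invented dates, for illustration only.
    pairs = [
        (date(2000, 1, 10), date(2000, 3, 5)),
        (date(2000, 1, 10), date(2000, 3, 20)),
        (date(2000, 1, 10), date(2000, 7, 1)),
    ]
    histogram = latency_histogram(pairs)
    print(dict(histogram), peak_latency(histogram))
```

Running the same calculation separately for each deposit year would show whether the peak is moving earlier over the life of the archive.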

23
Maximising impact: arXiv example

More highly cited papers show higher and more
sustained download frequencies
Mining the Social Life of an Eprint Archive: http://opcit.eprints.org/tdb198/opcit/

24
Maximising interfaces

To measure arXiv access and impact, the Open Citation project has mined:
Usage data from selected arXiv mirror server logs
Reference lists from 155,000 arXiv papers, to build CiteBase, an open citation database

CiteBase, a new interface to the refereed literature: http://citebase.eprints.org
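The core of a citation database built from reference lists is an inversion: each paper lists what it cites, and the index records who cites each paper. The sketch below shows that step only. It assumes the references have already been parsed and matched to paper identifiers, uses hypothetical identifiers, and is not a description of how CiteBase itself is implemented.

```python
from collections import defaultdict


def build_citation_index(reference_lists):
    """Invert {citing paper: [cited papers]} into {cited paper: [citing papers]}."""
    cited_by = defaultdict(set)
    for citing, references in reference_lists.items():
        for cited in references:
            if cited != citing:  # ignore a paper listing itself
                cited_by[cited].add(citing)
    return {paper: sorted(citers) for paper, citers in cited_by.items()}


def rank_by_citations(cited_by):
    """Order papers from most to least cited, breaking ties by identifier."""
    return sorted(cited_by, key=lambda paper: (-len(cited_by[paper]), paper))


if __name__ == "__main__":
    # Hypothetical identifiers and reference lists, for illustration only.
    reference_lists = {
        "paper-A": ["paper-B", "paper-C"],
        "paper-B": ["paper-C"],
        "paper-D": ["paper-C", "paper-B"],
    }
    index = build_citation_index(reference_lists)
    for paper in rank_by_citations(index):
        print(paper, "cited by", index[paper])
```

With the index in place, citation counts can be set beside the download counts from server logs, which is the comparison the preceding slides draw on.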

25
Initiatives promoting open access to scholarly
research papers

Budapest Open Access Initiative (BOAI), funded by George Soros' Open Society Institute. Open access "gives readers extraordinary power to find and make use of relevant literature, and gives authors and their works vast and measurable new visibility, readership, and impact." Launched February 2002; almost 1800 signatories to date.
http://www.soros.org/openaccess/read.shtml
Public Library of Science: scientists urge publishers to allow the research reports that have appeared in their journals to be distributed freely by independent, online public libraries of science. Open letter, March 2001; almost 30,000 signatories.
http://www.publiclibraryofscience.org/

26
A dynamic digital archive

Scientists and researchers, Nobel Laureates among them, have produced the clearest declaration of their requirement for access to published research papers: a comprehensive collection that can be efficiently indexed, searched, and linked.
"Unimpeded access to these archives and open distribution of their contents will enable researchers to take on the challenge of integrating and interconnecting the fantastically rich, but extremely fragmented and chaotic, scientific literature."
Roberts et al. (2001) Science, 23 March 2001
http://www.sciencemag.org/cgi/content/full/291/5512/2318a
27
Credits

The Open Citation project is a collaboration between Southampton University, Cornell University and arXiv
The project is led by Stevan Harnad and Carl Lagoze
Technical development at Southampton is directed by Les Carr
EPrints.org software is being developed by Chris Gutteridge
CiteBase is produced and managed by Tim Brody
A copy of these slides can be found on the OpCit Web site, http://opcit.eprints.org/. Look for Papers and Presentations
Contact: Steve Hitchcock, sh94r_at_ecs.soton.ac.uk