Title: Open Archives Initiative
1. Open Archives Initiative Object Re-Use and Exchange
- Herbert Van de Sompel (1)
- Carl Lagoze (2), Pete Johnston (3), Michael Nelson (4), Robert Sanderson (5), Simeon Warner (2)
- (1) Digital Library Research and Prototyping Team, Research Library, Los Alamos National Laboratory - herbertv_at_lanl.gov
- (2) Information Science, Cornell University - lagoze_at_cs.cornell.edu
- (3) EduServ Foundation - pete.johnston_at_eduserv.org.uk
- (4) Computer Science, Old Dominion University - mln_at_cs.odu.edu
- (5) Computer Science, University of Liverpool - azaroth_at_liv.ac.uk
2. OAI Object Re-Use and Exchange
- OAI-ORE is a new interoperability effort conducted under the umbrella of the OAI
- Supported by the Andrew W. Mellon Foundation; additional support from the National Science Foundation and Microsoft
- International effort: October 2006 - September 2008
- Coordinators: Carl Lagoze and Herbert Van de Sompel
- ORE Technical Committee: 13 international members
- ORE Liaison Group: 8 international members
- ORE Advisory Committee: 16 international members
- Representing scholarly publishers and aggregators, eScience, eHumanities, education, search engines, various repository systems, digital library efforts, related standardization efforts, etc.
- See http://www.openarchives.org/ore/
- See http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/ for a paper
3. OAI-PMH
- Interoperability through metadata harvesting
- The content (resource) is purposely minimized in the data model
- Primary application: resource discovery over distributed collections
- Repository-centric
4. OAI-PMH
[Figure: OAI-PMH diagram; surviving label: metadata]
5. OAI-ORE (Object Reuse and Exchange)
- Interoperability that focuses on the content (resource) itself
- Accommodating rich, compound information content
- Web-centric, resource-centric
6. Compound Information Objects
- Units of scholarly communication are compound information objects
- Identified, bounded aggregations of related information units that form a logical whole.
- Components of a compound object may vary according to:
- Semantic type: book, article, software, dataset, simulation, ...
- Media type: text, image, audio, video, mixed
- Media format: PDF, HTML, JPEG, MP3, ...
- Network location
- Relationships: internal, external
7. Scholarly Examples
http://arxiv.org/abs/astro-ph/0611775
8. And more scholarly examples
- Published scientific results that, in addition to the features of a scholarly publication such as the one from arXiv, incorporate data plus the tools to visualize and analyze that data.
- An ARTstor image object that is the aggregation of various renderings of the same source image.
- A semantically-linked group of cellular images, each image available in distributed repositories from research laboratories, museums, libraries, and the like, in the manner implemented in the ImageWeb Project.
- Archaeological assemblies of images, maps, charts, and find lists.
9. But these things are not only scholarly!
http://www.flickr.com/photos/avocado8/sets/72157602034407182/
10. OAI Object Re-Use and Exchange
- Original Context
- Augment Interoperability across repositories to facilitate cross-repository applications and value chains
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
11. Repositories
- Preprint repositories,
- Publisher repositories,
- Postprint repositories,
- Dataset repositories,
- Software repositories,
- Cultural heritage collections,
- Learning and Teaching object repositories,
- Digitized book repositories,
- ...
- Can be institution-based, discipline-based, corporate, ...
12. Value Chains Starting in Repositories
- We must leverage the value of the materials that become available in those distributed Repositories.
- Think about these Repositories as active nodes in a global environment, not as passive local nodes
- These Repositories are not about creating services for local users (only)
- These Repositories are not about creating one service (user interface) for all users
- Materials from Repositories must be re-usable in different contexts.
- Life for those materials starts in Repositories; it does not end there.
13. http://dx.doi.org/10.1045/september2004-vandesompel
14. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
15. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
Why is this an issue? What again is the problem with compound information objects on the Web?
16. Publishing a Compound Object to the Web
17. Publishing a Compound Object to the Web
18. Publishing a Compound Object to the Web: Issue
19. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
By adding/integrating compound object information to the Web.
How to do this in a manner that is in sync with the Web architecture?
20. W3C Web Architecture
[Figure: W3C Web Architecture diagram; surviving label: Identifies]
- So, the tools we have to solve the problem are:
- Resource
- URI
- Representation
21. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
By adding/integrating compound object information to the Web.
OK, let's work with the Web tools we have.
22. It starts with some resources that belong together
23. Bring these resources together: Aggregation
24. Describe this Aggregation: Resource Map
Convention: A-1 = ReM-1#aggregation
25. The Resource Map can describe more
26. The Resource Map can describe more
27. The Resource Map can describe more
28. So, the Resource Map can describe a lot
29. But minimally it describes this
30. Resource Map publishing: Adding compound object information to the Web
31. The ORE Data Model
- All of this is formalized in the ORE Abstract Data Model
- The Data Model leverages:
- Web Architecture
- Semantic Web
- Named Graphs
- This is explained in a simple manner in the Data Model Overview User Guide
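The idea of a Resource Map as a named graph that describes an Aggregation can be sketched in a few lines of Python. This is an illustrative sketch only, not the ORE Abstract Data Model itself: the predicate labels `ore:describes` and `ore:aggregates`, the `ResourceMap` class, and all `example.org` URIs are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Predicate labels assumed for this sketch.
DESCRIBES = "ore:describes"
AGGREGATES = "ore:aggregates"


@dataclass
class ResourceMap:
    """A named graph: a URI plus the set of triples asserted under it."""
    uri: str
    triples: set = field(default_factory=set)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def aggregated_resources(self, aggregation_uri):
        # All objects of (aggregation, aggregates, ?) triples, sorted.
        return sorted(o for s, p, o in self.triples
                      if s == aggregation_uri and p == AGGREGATES)


def build_resource_map(rem_uri, aggregation_uri, resource_uris):
    """Minimal Resource Map: it describes one Aggregation and its parts."""
    rem = ResourceMap(rem_uri)
    rem.add(rem_uri, DESCRIBES, aggregation_uri)
    for resource_uri in resource_uris:
        rem.add(aggregation_uri, AGGREGATES, resource_uri)
    return rem


# Hypothetical URIs, for illustration only.
rem = build_resource_map(
    "http://example.org/rem-1",
    "http://example.org/aggregation-1",
    ["http://example.org/paper.pdf", "http://example.org/data.csv"],
)
print(rem.aggregated_resources("http://example.org/aggregation-1"))
```

A richer Resource Map would simply add more triples to the same graph, for instance relationships among the aggregated resources.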
32. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
By adding/integrating compound object information to the Web.
We still need real Resource Map Documents.
33. Serializing a Resource Map: the Resource Map Document
34. Serializing a Resource Map: the Resource Map Document
http://www.mkbergman.com/?p=391
35. Serializing a Resource Map: the Resource Map Document
36. Atom Syndication Format
- RFC 4287
- XML-based Document Format
- Describes a list of related information known as a feed
- Feed consists of items known as entries
- Atom defines metadata for feed and entries
- Atom allows extensibility for feed and entries
37. Atom Syndication Format and ORE
- Resource Map Document = Atom feed document
- Each Aggregated Resource is provided in an Atom entry using <atom:link> element with @rel="alternate"
- Rule of thumb re the use of Atom in ORE:
- Elements in Atom namespace pertain to the Atom feed and Atom entries
- Elements in other namespaces pertain to the Aggregation and the Aggregated Resources
- <atom:link> elements make the connections between the Atom and the ORE world
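The mapping described here (one feed per Resource Map, one entry with an `<atom:link rel="alternate">` per Aggregated Resource) might be sketched with Python's standard `xml.etree.ElementTree`. This is a simplified sketch, not the Resource Map Profile of Atom: the function `resource_map_feed` and the `example.org` URIs are hypothetical, and elements that RFC 4287 requires (such as `updated` and `author`) are left out for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM_NS)


def atom(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def resource_map_feed(rem_uri, title, resource_uris):
    """Build a feed element with one entry per aggregated resource."""
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("id")).text = rem_uri
    ET.SubElement(feed, atom("title")).text = title
    for uri in resource_uris:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("id")).text = uri
        # The link element connects the Atom entry to the aggregated resource.
        ET.SubElement(entry, atom("link"), {"rel": "alternate", "href": uri})
    return feed


# Hypothetical URIs, for illustration only.
feed = resource_map_feed(
    "http://example.org/rem-1",
    "Example Resource Map",
    ["http://example.org/paper.pdf", "http://example.org/data.csv"],
)
xml_text = ET.tostring(feed, encoding="unicode")
print(xml_text)
```

Elements from other namespaces, describing the Aggregation and the Aggregated Resources, would be added next to the Atom elements in the same feed and entries.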
38. Resource Map Profile of Atom
- All of this is described in the ORE Resource Map Profile of Atom
- And explained for implementers in the Resource Map Implementation in Atom User Guide
39. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
How are Resource Maps discovered?
40. Resource Map Discovery
- Harvest type discovery
- Expose Resource Maps via OAI-PMH, Atom, RSS, Sitemaps
- Resource Embedding
- HTML link element points at Resource Maps
- Response Embedding
- HTTP Link Header points at Resource Map
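The two embedding approaches can be illustrated with a small client-side sketch that looks for Resource Map pointers in an HTML page and in an HTTP `Link` header value. The relation name `resourcemap`, the helper names, and the sample URIs are assumptions made for this example. The header parsing is deliberately simplified and does not handle commas or semicolons inside quoted parameter values.

```python
import re
from html.parser import HTMLParser

REL = "resourcemap"  # relation name assumed for this sketch


class LinkFinder(HTMLParser):
    """Collect href values of <link> elements carrying the given rel."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if self.rel in rels and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def resource_maps_from_html(html, rel=REL):
    finder = LinkFinder(rel)
    finder.feed(html)
    return finder.hrefs


# One link-value: <target> followed by zero or more ;param segments.
LINK_VALUE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def resource_maps_from_link_header(value, rel=REL):
    found = []
    for target, params in LINK_VALUE.findall(value):
        match = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        if match and rel in match.group(1).lower().split():
            found.append(target)
    return found


# Hypothetical inputs, for illustration only.
page = ('<html><head><link rel="resourcemap" '
        'href="http://example.org/rem-1"/></head></html>')
header = ('<http://example.org/rem-1>; rel="resourcemap", '
          '<http://example.org/style.css>; rel="stylesheet"')
print(resource_maps_from_html(page))
print(resource_maps_from_link_header(header))
```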
41. Resource Map Discovery
- All of this is described in the Resource Map Discovery User Guide
42. Current Status
- Since December 10th 2007: Alpha ORE Specifications openly available
- Community feedback via Google Groups
- Experiments 01/2008-07/2008: approximately 500k funding from Mellon Foundation and JISC
- Several groups started experimenting in the context of existing projects
43. Next Steps
- ORE Open Meetings
- March 3rd 2008, Johns Hopkins University: USA ORE Open Meeting
- Register at http://www.regonline.com/oai-ore (limited to 150)
- April 4th 2008, University of Southampton: European ORE Open Meeting
- March 3rd 2008: Beta release of the specifications
- September 2008: Public release of stable ORE Specifications
44. Demo: Preservation of Aggregations
http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/
45. Demos: Preservation of Aggregations
46. Demo: Browser plug-in detects Aggregations
47. Demo: Citation leveraging ORE
48. Questions
49. OAI Object Re-Use and Exchange
- Core goal of OAI-ORE
- Facilitate Use and Re-Use of Compound Information Objects (and of their component parts)
Let's assume we can solve that Compound Object issue by adding Compound Object information to the Web graph.
50. Examples of what could be achieved (in interoperable ways)
- Grouping of search engine results according to compound object boundaries instead of or in addition to listing ungrouped results.
- Grouping all citations to a paper, instead of having different citation counts, e.g. a count for the PDF version, a count for the PS version, a count for the splash page.
- Print all components of a Compound Object in one go.
- Provide navigation map of all components of a Compound Object.
- Image assembly.
- Group a resource and annotations pertaining to the resource.
- Submit compound object to a repository (cf. SWORD).
- Preservation of compound object by leveraging existing Internet infrastructure (cf. later).
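As an illustration of the first example, the sketch below groups a flat list of search-result URIs by the Aggregation each one belongs to. It assumes the aggregation membership has already been obtained from Resource Maps and is available as a plain dictionary. The function name and all URIs are hypothetical.

```python
from collections import defaultdict


def group_by_aggregation(result_uris, aggregations):
    """Group result URIs by the aggregation that contains them.

    aggregations maps an aggregation URI to its aggregated resource URIs.
    Returns (grouped, ungrouped): a dict of aggregation URI -> results,
    and the list of results that belong to no known aggregation.
    """
    owner = {}
    for aggregation_uri, resources in aggregations.items():
        for resource_uri in resources:
            # The first aggregation seen wins in this simplified sketch.
            owner.setdefault(resource_uri, aggregation_uri)

    grouped = defaultdict(list)
    ungrouped = []
    for uri in result_uris:
        if uri in owner:
            grouped[owner[uri]].append(uri)
        else:
            ungrouped.append(uri)
    return dict(grouped), ungrouped


# Hypothetical data, for illustration only.
aggregations = {
    "http://example.org/aggregation-1": [
        "http://example.org/paper.pdf",
        "http://example.org/paper.ps",
        "http://example.org/abs/paper",
    ],
}
results = [
    "http://example.org/paper.pdf",
    "http://example.org/unrelated.html",
    "http://example.org/abs/paper",
]
grouped, ungrouped = group_by_aggregation(results, aggregations)
print(grouped)
print(ungrouped)
```

The same grouping step could serve the citation example: counts recorded against the PDF, PS, and splash-page URIs would be summed per aggregation rather than reported separately.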