Title: the OAI Protocol for Metadata Harvesting
1 the OAI Protocol for Metadata Harvesting: an update
Herbert Van de Sompel Los Alamos National
Laboratory Research Library
2 "The Open Archives Initiative has been set up to
create a forum to discuss and solve matters of
interoperability between preprint solutions, as a
way to promote their global acceptance." Paul
Ginsparg, Rick Luce, Herbert Van de Sompel
3 Luce, Van de Sompel, Ginsparg
4 two core motivations
- as a systems librarian: change the system
- as a researcher: find (technical) ways to
facilitate the change
5 as a systems librarian
optimizing the output
the input is far from optimal
6 eprint systems
- xxx e-print archive
- (Physics - 1991 - Los Alamos - Ginsparg)
- RePEc
- (Economics - Surrey U - Krichel)
- NCSTRL
- (Computer Science - Cornell U - Lagoze)
- NDLTD
- (Theses - Virginia Tech - Fox)
- CogPrints
- (Cognitive Sciences - Southampton U - Harnad)
7 as a researcher
- eprints are an attractive building block in the ongoing
transformation of scholarly communication
- but interoperability could increase the impact of
e-prints
- amongst e-print solutions
- with building blocks that implement other
functions of scholarly communication
- with the established communication system
8 UPS Prototype: eprints discovery
- 1999: Van de Sompel, Krichel, Nelson
- results
- insights regarding how un-interoperable the
systems were
- a cross-repository searching and linking service
- recommendations to the Santa Fe meeting
- data provider / service provider model
- metadata harvesting
- simplicity
9 evolution towards OAI-PMH v.2.0
- Santa Fe Convention 02/2000
10 Santa Fe convention
OAI-PMH v.1.0/1.1
OAI-PMH v.2.0
11 OAI-PMH model
service provider
data provider
6 OAI-PMH
12 OAI-PMH model
service provider
data provider
- Supporting protocol requests
- Identify
- ListMetadataFormats
- ListSets
- Harvesting protocol requests
- ListRecords
- ListIdentifiers
- GetRecord
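As a rough illustration of the six requests listed above: in OAI-PMH v.2.0 each request is an HTTP query carrying a `verb` argument. The sketch below builds such request URLs. The base URL is a placeholder and the helper name `build_request` is hypothetical; neither comes from the slides.

```python
from urllib.parse import urlencode

# The six OAI-PMH requests, grouped as on the slide.
SUPPORTING = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING = ("ListRecords", "ListIdentifiers", "GetRecord")


def build_request(base_url, verb, **args):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in SUPPORTING + HARVESTING:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for key, value in args.items():
        # 'from' is a Python keyword, so callers pass from_ instead.
        params[key.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL, assumed for illustration only.
BASE = "http://repository.example.org/oai"

print(build_request(BASE, "Identify"))
print(build_request(BASE, "ListRecords",
                    metadataPrefix="oai_dc", from_="2002-01-01"))
```

The second call shows selective harvesting by date: only records changed since the given datestamp are requested, in the `oai_dc` (unqualified Dublin Core) format.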
13 OAI-PMH model
service provider
data provider
Datestamp Identifier Set
Records
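To make the three record attributes concrete: under OAI-PMH v.2.0 every record carries a header holding its identifier, its datestamp and zero or more set memberships. The sketch below reads those three values from a record. The sample XML is invented for illustration and `parse_header` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# XML namespace of OAI-PMH v.2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented sample record; identifier, date and set name are placeholders.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:123</identifier>
    <datestamp>2002-03-15</datestamp>
    <setSpec>physics</setSpec>
  </header>
</record>"""


def parse_header(record_xml):
    """Extract identifier, datestamp and set membership from one record."""
    record = ET.fromstring(record_xml)
    header = record.find("oai:header", OAI_NS)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
        # A record may belong to several sets, or to none.
        "sets": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
    }


print(parse_header(SAMPLE))
```

A service provider would typically store the identifier and datestamp per record, so that a later harvest can ask only for what changed.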
14 federated services
15 metadata harvesting via OAI-PMH
metadata
FTXT
16 metadata harvesting via OAI-PMH
metadata
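A minimal sketch of what a harvesting loop on the service provider side might look like, assuming OAI-PMH v.2.0 flow control: a data provider may return a large list in chunks, each ending in a `resumptionToken` that the harvester sends back to get the next chunk. The function names, the `fetch` callable and the canned responses are assumptions made so that the example runs without a network; a real harvester would issue HTTP GET requests instead.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Yield record identifiers; fetch(params) must return response XML."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(NS + "header"):
            yield header.findtext(NS + "identifier")
        token = root.findtext(f"{NS}ListIdentifiers/{NS}resumptionToken")
        if not token:  # missing or empty token: the list is complete
            break
        # A resumed request carries only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


# Canned two-chunk response standing in for a data provider (invented data).
_OPEN = '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
_CLOSE = "</ListIdentifiers></OAI-PMH>"
PAGES = {
    None: _OPEN
    + "<header><identifier>oai:example.org:1</identifier></header>"
    + "<resumptionToken>page2</resumptionToken>" + _CLOSE,
    "page2": _OPEN
    + "<header><identifier>oai:example.org:2</identifier></header>"
    + "<resumptionToken/>" + _CLOSE,
}


def fake_fetch(params):
    """Stand-in for an HTTP request: pick the chunk matching the token."""
    return PAGES[params.get("resumptionToken")]


print(list(harvest_identifiers(fake_fetch)))
```

Passing the fetch step in as a function keeps the protocol logic separate from the transport, which also makes it easy to test against recorded responses.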
17 issue solved?
- no, just a tiny part of the technical challenges
to support discovery
- many more technical issues
- even more non-technical issues
18 issue solved? technical
awareness
certification
rewarding
registration
archiving
19 issue solved? non-technical
- I am happy to leave those to you
- but even for non-technological issues, part of
the answer might be found in applying technology
20 indicators of adoption of OAI-PMH
21 data providers
- 49 registered repositories 11/2001
- 65 registered repositories 03/2002
- 5 million records
- many unregistered repositories
22 service providers
- Arc: cross-searching of registered repositories
(Old Dominion U)
- http://arc.cs.odu.edu
- OLAC: cross-searching of Language Archive
Community repositories
- http://www.language-archives.org/index.html
23 service providers
- Scirus: scientific search engine (Elsevier)
- http://www.scirus.com
- my.OAI: user-tailorable cross-searching of
registered repositories (FS Consulting, Inc.)
- http://www.myoai.com
- growing interest from web search engines
24 OAI-PMH tools
- Repository Explorer: interactive exploration of
repositories (Virginia Tech)
- http://www.purl.org/NET/oai_explorer
- eprints.org: generic OAI-PMH compliant
repository software (U of Southampton)
- http://www.eprints.org
- ALCME: repository and harvester software (OCLC)
- http://alcme.oclc.org/index.html
25 OAI-PMH flies: structural support
- Metadata Harvesting Initiative of the Mellon
Foundation
- NSDL (NSF funded)
- UK FAIR call for proposals to support disclosure
of institutional assets (papers, learning
materials, etc.)
- Institute of Museum and Library Services
- several EC projects exploring/supporting usage
of OAI-PMH: TEL, Leaf, Cyclades, OA Forum
26 OAI-PMH flies: and also
- Australian Museums Online CIMI OAI
conference
- NIMH white paper on data archiving for Animal
Cognition Research
- Library of Congress
- National Library of Canada
- OCLC thesis database
- Illinois State Library Catalogue
27 future
28 the OAI-PMH
- release of OAI-PMH v.2.0 06/2002
- no backwards compatibility with v.1.0/1.1
- stable
- migration process for registered repos
- ? formal standardization ?
- ? SOAP version: web services framework (SOAP,
WSDL, UDDI) ?
29 communities
- proliferation of community-specific add-ons for
- collection set level metadata
- expressive metadata formats (e.g. qualified DC
XML Schema)
- shared set-structures
- machine readable rights (about the metadata)
30 adoption
- evolution
- from talking about OAI-PMH
- to talking about projects that use OAI-PMH
- to talking about projects and failing to mention
they use OAI-PMH
- => OAI-PMH becomes part of the infrastructure
31 "I just wanted to report what I consider an OAI
success. I discovered that RLG had harvested
records for two of the American Memory
collections I had made available and integrated
them into their Cultural Materials Initiative
service without the need for a single e-mail or
phone call. They reported that it was working
very well for them." Caroline Arms, Library of
Congress
32 http://www.openarchives.org openarchives@openarchives.org
33 the OAI: not really an organization
- Executive: Carl Lagoze, Herbert Van de Sompel
- 2000-2002 funding from CNI and DLF
- Steering Committee
- Technical Committee
- protocol revision, stabilization
- Alpha testers
34 OAI-tech
US representatives: Thomas Krichel (Long Island U)
- Jeff Young (OCLC) - Tim Cole (U of Illinois
at Urbana-Champaign) - Hussein Suleman (Virginia
Tech) - Simeon Warner (Cornell U) - Michael
Nelson (NASA) - Caroline Arms (LoC) - Muhammad
Zubair (Old Dominion U) - Steven Bird (U Penn.)
European representatives: Andy Powell (Bath U.,
UKOLN) - Mogens Sandfaer (DTV) - Thomas Baron
(CERN) - Les Carr (U of Southampton)
35 OAI-PMH 2.0 alpha testers (1/2)
- The British Library
- Cornell U. -- NSDL project e-print arXiv
- Ex Libris
- FS Consulting Inc -- harvester for my.OAI
- Humboldt-Universität zu Berlin
- InQuirion Pty Ltd, RMIT University
- Library of Congress
- NASA
- OCLC
36 OAI-PMH 2.0 alpha testers (2/2)
- Old Dominion U. -- ARC, DP9
- U. of Illinois at Urbana-Champaign
- U. of Southampton -- OAIA, CiteBase, eprints.org
- UCLA, Johns Hopkins U., Indiana U., NYU -- sheet
music collection
- UKOLN, U. of Bath -- RDN
- Virginia Tech -- repository explorer