Title: Search Interoperability, OAI, and Metadata
1Search Interoperability, OAI, and Metadata
- Sarah Shreeves
- University of Illinois at Urbana-Champaign
- Basics and Beyond
- Grainger Engineering Library
- April 18, 2005
2Scenario: A teacher is putting together a lesson
plan comparing immigration in the early 20th
century to immigration today and wants to include a
variety of primary sources
3IMLS-funded digital collections with relevant
content
4Search interoperability
- "The ability to perform a search over diverse sets of metadata records and obtain meaningful results."
- Priscilla Caplan, Metadata Fundamentals for All Librarians
5Keys to Search Interoperability
- Communication protocol (Z39.50, OAI, etc.)
- Organizational commitment
- Standards
- Standards
- And More Standards
6Sharing metadata: Federated search
- The distributed databases are searched directly.
- For example: Z39.50, SRU/SRW
7Sharing metadata: Data aggregation
- The user searches a pre-aggregated database of metadata from diverse sources.
- For example: search engines, union catalogs, OAI
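The difference between the two sharing models above can be sketched in a few lines. This is only a toy illustration: the repositories, record titles, and function names are hypothetical, and real systems would use Z39.50, SRU/SRW, or OAI-PMH rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str
    title: str


# Hypothetical repositories, each holding its own metadata records.
REPOSITORIES = {
    "library_a": [
        Record("library_a", "Ellis Island arrivals, 1907"),
        Record("library_a", "Railroad maps of Illinois"),
    ],
    "museum_b": [Record("museum_b", "Immigrant family portrait, 1912")],
}


def matches(record: Record, term: str) -> bool:
    """Case-insensitive substring match on the title."""
    return term.lower() in record.title.lower()


def federated_search(term: str) -> list[Record]:
    """Query every repository at search time and merge the hits."""
    hits = []
    for records in REPOSITORIES.values():  # one live query per source
        hits.extend(r for r in records if matches(r, term))
    return hits


def build_aggregate() -> list[Record]:
    """Copy all records into one local store ahead of time (the harvest step)."""
    return [r for records in REPOSITORIES.values() for r in records]


def aggregated_search(aggregate: list[Record], term: str) -> list[Record]:
    """Search only the pre-built local store; the sources are not contacted."""
    return [r for r in aggregate if matches(r, term)]
```

In the federated case every search touches every source; in the aggregated case the cost of contacting the sources is paid once, at harvest time, and searches run against the local copy.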
8Open Archives Initiative Protocol for Metadata Harvesting
- The OAI-PMH is a tool
- Moves metadata (not content, for the most part, yet) from a data provider to a service provider (or harvester)
- A set of rules that defines the communication between two systems (like FTP and HTTP)
- Facilitates the aggregation of metadata (like a union catalog)
9Basic OAI-PMH Concepts
- Aggregated search rather than federated search
- Data providers support OAI-PMH as a means to expose metadata
- Service providers harvest metadata from data providers via the OAI-PMH
- OAI-PMH is based upon HTTP and XML
- OAI-PMH requires use of simple Dublin Core
- BUT supports and encourages use of other metadata schemas
- Unique and persistent identifiers and a datestamp for each OAI record
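To make the record concepts concrete, here is a minimal sketch of reading one OAI record that carries simple Dublin Core. The sample record (identifier, datestamp, and field values) is invented for illustration; the namespaces are the ones defined for OAI-PMH 2.0, oai_dc, and the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical record: the identifier, datestamp, and values are made up.
SAMPLE_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-42</identifier>
    <datestamp>2005-04-18</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Immigrant family portrait</dc:title>
      <dc:type>image</dc:type>
    </oai_dc:dc>
  </metadata>
</record>
"""


def parse_record(xml_text):
    """Return the identifier, datestamp, and Dublin Core fields of one record."""
    root = ET.fromstring(xml_text.strip())
    header = root.find("oai:header", NS)
    dc = root.find("oai:metadata/oai_dc:dc", NS)
    fields = {}
    for element in dc:
        # Tags look like "{namespace}title"; keep only the local name.
        name = element.tag.split("}", 1)[1]
        # Dublin Core elements are repeatable, so collect values in lists.
        fields.setdefault(name, []).append(element.text)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "metadata": fields,
    }
```

The header carries the identifier and datestamp that every OAI record needs; the metadata part is where simple Dublin Core (or another schema) lives.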
10OAIster: http://www.oaister.org/o/oaister/
CIC Metadata Portal: http://nergal.grainger.uiuc.edu/cgi/b/bib/oaister
11How OAI Works (Technically)
- 6 distinct verbs or requests
- OAI requests are sent via HTTP
- Responses are sent in valid XML
[Diagram: the service provider (OAI harvester, holding the aggregated metadata) sends an HTTP request (OAI verb) to the data provider (Dig. Mngt. Sys. with an OAI data provider), which returns an HTTP response (valid XML)]
12Examples of OAI Service Providers
- OAIster: http://oaister.umdl.umich.edu/o/oaister/
- Engineering, Computer Science, and Physics: http://g118.grainger.uiuc.edu/engroai/
- Open Language Archives Community: http://www.language-archives.org/
13OAI VERBS
- Identify
- ListMetadataFormats
- ListSets
- ListIdentifiers
- GetRecord
- ListRecords
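Since each verb is sent as an ordinary HTTP request, a harvester mostly needs to build query strings. The sketch below shows one way to do that; the base URL and the `build_request` helper are hypothetical, while `verb` and `metadataPrefix` are standard OAI-PMH request arguments.

```python
from urllib.parse import urlencode

OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "GetRecord",
    "ListRecords",
)


def build_request(base_url, verb, **arguments):
    """Build the URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb!r}")
    # The verb goes first, followed by any verb-specific arguments.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Assumed base URL; a real data provider publishes its own.
BASE_URL = "http://example.org/oai"

identify_url = build_request(BASE_URL, "Identify")
list_records_url = build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
```

Arguments whose names are Python keywords, such as `from`, can be passed with dictionary unpacking, for example `build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc", **{"from": "2005-01-01"})`.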
14 Challenges for the OAI Community
- No best practices (yet)
- Shareability of metadata
- Heterogeneity of items described
- Loss of Context / Information loss
- Knowledge structures differ so.
- Native metadata schemas differ
- Controlled vocabularies differ
- Use and presentation of items differ
15 OAI ≠ Dublin Core
- DC is OAI's lowest common denominator
- BUT
- OAI supports and encourages use of other community-driven metadata schemas
16Metadata Interoperability
- Semantics
- What is the metadata format used?
- Mapping from one format to another
- Content rules
- How are values for the metadata elements selected and represented?
- Syntax
- How are the metadata elements encoded in machine-readable form?
- Documentation
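Mapping from one format to another is often done with a crosswalk table. The following is a rough sketch only: the native field names, the sample values, and the mapping itself are hypothetical, and only the target element names come from simple Dublin Core.

```python
# Hypothetical crosswalk from a local museum schema to simple Dublin Core.
CROSSWALK = {
    "object_name": "title",
    "maker": "creator",
    "date_made": "date",
    "notes": "description",
}


def to_simple_dc(native_record):
    """Map a native record to simple DC and report fields with no mapping."""
    dc_record = {}
    unmapped = []
    for field, value in native_record.items():
        dc_element = CROSSWALK.get(field)
        if dc_element is None:
            unmapped.append(field)  # this information is lost in the mapping
            continue
        # DC elements are repeatable, so values are collected in lists.
        dc_record.setdefault(dc_element, []).append(value)
    return dc_record, unmapped


# Made-up native record for illustration.
native = {
    "object_name": "Woven coverlet",
    "maker": "Unknown",
    "gallery_location": "Room 3",
}
dc_record, unmapped = to_simple_dc(native)
```

Reporting the unmapped fields makes the cost of the mapping visible: anything without a target element simply does not reach the aggregation.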
18Metadata for different communities
19Metadata for different communities
20- Loss of Context
- Record in OAI aggregation
21- Context
- Record in native database
22Loss of context / data
23Loss of context / data
24Granularity of Description Excerpt of Metadata
Record Describing American Woven Coverlet
25Granularity of Description Excerpt of Metadata
Record Describing "Cotton coverlet with
embroidered butterfly design"
26What does this record represent?
- identifier: http://images.umdl.umich.edu/cgi/i/image/image-idx?viewentrysubviewdetailccfish3icentryidX-0802viewid1004_112
- publisher: UMMZ Fish Division
- format: jpeg
- type: image
- subject: 1926-05-181926081218Trib. to Sixteen Cr. Trib. Pine River, Manistee R.R10WS26 S27JAM26-46005T21N1926/05/18
- language: UND
- description: Flora and Fauna of the Great Lakes Region
28Data providers can
- Create metadata for interoperability
- Reusable metadata: think beyond your local users and environment
- Use well-structured and defined schemas: move beyond simple DC
- Use and identify controlled vocabularies
- Document, document, document
29Service Providers can
- Analyze metadata and cluster and normalize some aspects
- Provide contextual information (such as collection descriptions)
- Custom interfaces and selective views for target audiences / domains
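As one small example of normalizing and clustering, a service provider might fold the many spellings of a `type` value into a controlled list. This is a sketch under assumptions: the variant table is hypothetical, and the target terms are taken from the DCMI Type Vocabulary.

```python
from collections import Counter

# Hypothetical table of harvested variants mapped to DCMI Type terms.
TYPE_VARIANTS = {
    "image": "Image",
    "photograph": "StillImage",
    "still image": "StillImage",
    "text": "Text",
    "book": "Text",
    "sound recording": "Sound",
}


def normalize_type(raw):
    """Return the controlled term for a raw value, or None if unknown."""
    key = " ".join(raw.lower().split())  # lower-case and collapse whitespace
    return TYPE_VARIANTS.get(key)


def cluster_by_type(raw_values):
    """Count harvested records per normalized type."""
    return Counter(normalize_type(v) or "(unmapped)" for v in raw_values)


clusters = cluster_by_type(["image", " Image ", "Photograph", "map"])
```

The "(unmapped)" bucket shows which values still need attention, which is also useful feedback for the data providers.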
30Contact Information
- Sarah Shreeves
- Project Coordinator,
- IMLS Digital Collections and Content Project
- University of Illinois at Urbana-Champaign
- sshreeve_at_uiuc.edu
- 217-244-7809
- Presentation available:
- http://imlsdcc.grainger.uiuc.edu/basicsbeyondMar2005.ppt