OAI and Metadata Aggregation - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: OAI and Metadata Aggregation


1
OAI and Metadata Aggregation
Sarah Shreeves, University of Illinois at Urbana-Champaign
LIS 450 RO: Representing and Organizing Information Resources
March 7, 2004
2
Outline
  • What is the Open Archives Initiative Protocol for
    Metadata Harvesting (OAI-PMH)?
  • OAI Projects at the University of Illinois and
    what we've learned

3
OAI is a tool
  • Set of rules that defines the communication
    between systems (like FTP and HTTP)
  • All about moving metadata (not data) around
  • Assumes widely distributed content, but
    centralized services
  • A building block for digital library services
  • The purpose of OAI is to foster interoperability

4
OAI is not...
  • Metadata
  • A search tool
  • A database

5
How OAI Works
(Diagram: Service Provider, Data Provider)
  • Data providers and service providers
  • OAI requests are sent via HTTP
  • Responses are sent in valid XML
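
The request/response cycle above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not code from the talk: the base URL `http://example.org/oai` and the canned response are hypothetical, while the verb and argument names (`ListRecords`, `ListIdentifiers`, `metadataPrefix`) and the XML namespace follow the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request(base_url, verb, **args):
    """Return the HTTP GET URL a service provider would send."""
    params = {"verb": verb}
    params.update(args)
    return base_url + "?" + urlencode(params)


# Hypothetical response from a data provider (no network call is made).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-03-07T12:00:00Z</responseDate>
  <request verb="ListIdentifiers" metadataPrefix="oai_dc">http://example.org/oai</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:item-1</identifier>
      <datestamp>2004-01-15</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:item-2</identifier>
      <datestamp>2004-02-01</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def list_identifiers(xml_text):
    """Extract the record identifiers from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    return [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


if __name__ == "__main__":
    print(build_request("http://example.org/oai", "ListRecords",
                        metadataPrefix="oai_dc"))
    print(list_identifiers(SAMPLE_RESPONSE))
```

A real harvester would send the URL over HTTP and would also need to handle error responses and partial result lists; both are left out here to keep the sketch short.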

6
OAI Use of Dublin Core
  • DC is OAI's lowest common denominator
  • BUT
  • OAI supports and encourages use of other
    community-driven metadata schemas
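
To make the "lowest common denominator" concrete, here is a minimal sketch of reading a simple Dublin Core (`oai_dc`) record in Python. The two namespace URIs are the standard ones for `oai_dc` and the DC element set; the sample record itself is hypothetical (the title is borrowed from a later slide, the subject values are invented for illustration).

```python
import xml.etree.ElementTree as ET

# Standard namespace URI of the Dublin Core element set, version 1.1.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical oai_dc record; subject values are made up for this example.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Cotton coverlet with embroidered butterfly design</dc:title>
  <dc:type>Image</dc:type>
  <dc:subject>coverlets</dc:subject>
  <dc:subject>embroidery</dc:subject>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Map each DC element name to the list of its values.

    Every DC element is optional and repeatable, so values are
    always collected into lists.
    """
    record = {}
    for element in ET.fromstring(xml_text):
        if element.tag.startswith("{" + DC + "}"):
            name = element.tag.split("}", 1)[1]
            record.setdefault(name, []).append((element.text or "").strip())
    return record


if __name__ == "__main__":
    print(dc_fields(SAMPLE))
```

Because simple DC has only fifteen flat, optional elements, richer community schemas lose structure when mapped down to it, which is why other formats can be offered alongside it.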

7
Harvesting vs. Federation
  • Different approaches to interoperability
  • Federation: services are run remotely on remote
    data (e.g. broadcast searching)
  • Harvesting: metadata is transferred from the
    remote source to the destination where the
    services are located
  • OAI is a harvesting tool.
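
The difference between the two approaches can be sketched in a few lines of Python. Everything here is a toy stand-in: the repository names and the in-memory `REMOTES` dictionary are hypothetical, and a real federated search would issue network queries (for example over Z39.50) rather than call a local function.

```python
# Hypothetical remote repositories: name -> list of metadata records.
REMOTES = {
    "repo_a": [{"title": "American Woven Coverlet", "type": "physical object"}],
    "repo_b": [{"title": "Cotton coverlet with embroidered butterfly design",
                "type": "image"}],
}


def remote_search(repo, term):
    """Stand-in for a search executed by the remote system on its own data."""
    return [r for r in REMOTES[repo] if term.lower() in r["title"].lower()]


def federated_search(term):
    """Federation: broadcast the query to every remote at query time."""
    hits = []
    for repo in REMOTES:
        hits.extend(remote_search(repo, term))
    return hits


def harvest():
    """Harvesting: copy metadata to the service's side ahead of time."""
    local_index = []
    for repo, records in REMOTES.items():
        for record in records:
            # Keep track of where each record came from.
            local_index.append({**record, "source": repo})
    return local_index


def harvested_search(local_index, term):
    """The service runs locally, on the harvested copy of the metadata."""
    return [r for r in local_index if term.lower() in r["title"].lower()]


if __name__ == "__main__":
    print(federated_search("coverlet"))
    print(harvested_search(harvest(), "butterfly"))
```

With harvesting, the service provider holds the metadata and can normalize or re-index it, at the cost of the copy going stale until the next harvest.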

8
OAI Compared to Z39.50
9
Why Use OAI?
  • Content in non-Z39.50 enabled locations
  • Metadata provider is more lightweight than Z39.50
    and scales well
  • Service provider wishes to augment search
    services, or metadata normalization is needed
  • Portals can use both Z39.50 and OAI

10
Who uses OAI?
  • Approximately 400 data providers
  • Basic building block of the National Science
    Digital Library (NSDL)
  • Incorporated into DSpace and Eprints.org
  • Part of ContentDM, Michigan's DLXS, and other
    products
  • International use: Open Archives Forum in Europe,
    UK and EU

11
OAI Projects at UIUC
  • NSF-funded Second Generation Digital Mathematics
    Resources
  • Mellon-funded OAI Metadata Harvesting Project
  • http://nergal.grainger.uiuc.edu/search/
  • IMLS Digital Collections and Content Project

12
(No Transcript)
13
Challenges of Metadata Aggregation
  • Heterogeneity of items described
  • Loss of context / information loss
  • Knowledge structures differ
  • So...
  • Native metadata schemas differ
  • Controlled vocabularies differ
  • Use and presentation of items differ

14
Challenges of Metadata Aggregation
  • Metadata quality issues emphasized
  • Completeness
  • Provenance
  • Accuracy
  • Conformance to expectations
  • Logical consistency/coherence
  • Timeliness
  • Accessibility

15
Metadata for different communities
16
Metadata for different communities
17
  • Loss of Context: Record in OAI aggregation

18
  • Context: Record in native database

19
Loss of context / data
20
Loss of context / data
21
Completeness of Metadata
  • identifier: http://images.umdl.umich.edu/cgi/i/image/image-idx?viewentrysubviewdetailccfish3icentryidX-0802viewid1004_112
  • publisher: UMMZ Fish Division
  • format: jpeg
  • type: image
  • subject: 1926-05-18
  • subject: 1926081218Trib. to Sixteen Cr. Trib.
    Pine River, Manistee R.R10WS26
    S27JAM26-46005T21N1926/05/18
  • language: UND
  • description: Flora and Fauna of the Great Lakes
    Region
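
A rough sketch of how an aggregator might flag problems in a record like the one above. The field values are transcribed from the slide (the identifier is left out); the list of expected fields and the three checks are assumptions chosen for illustration, not rules from the talk or from the OAI protocol.

```python
import re

# Field values transcribed from the slide; identifier omitted.
RECORD = {
    "publisher": ["UMMZ Fish Division"],
    "format": ["jpeg"],
    "type": ["image"],
    "subject": ["1926-05-18"],
    "language": ["UND"],
    "description": ["Flora and Fauna of the Great Lakes Region"],
}

# Assumed expectations of a hypothetical service provider.
EXPECTED = ["title", "creator", "date", "subject", "description", "type"]


def quality_report(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Completeness: which expected fields are absent or empty?
    for field in EXPECTED:
        if not record.get(field):
            problems.append(f"missing {field}")
    # Conformance to expectations: a bare date in the subject field.
    for value in record.get("subject", []):
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            problems.append(f"subject looks like a date: {value}")
    # "und" is the ISO 639-2 code for an undetermined language.
    if any(v.lower() == "und" for v in record.get("language", [])):
        problems.append("language undetermined")
    return problems


if __name__ == "__main__":
    for problem in quality_report(RECORD):
        print(problem)
```

Run on this record, the report shows no title, creator, or date, and a collection-level description standing in for an item-level one would still pass unnoticed, which is a reminder that automated checks only catch part of the problem.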

22
(No Transcript)
23
Granularity of Description: Excerpt of Metadata
Record Describing "Cotton coverlet with
embroidered butterfly design"
  • Description: Digital image of a single-sized
    cotton coverlet for a bed with embroidered
    butterfly design. Handmade by Anna F. Ginsberg
    Hayutin.
  • Source: Materials: cotton and embroidery floss.
    Dimensions: 71 in. x 86 in. Markings: top right
    hand corner has 1 1/2 in. x 1/2 in. label cut
    outs at upper left and right hand side for head
    board fabric is woven in a variation of a rib
    weave color each of yellow and gray
    hand-embroidered cotton butterflies and flowers
    from two shades of each color of embroidery floss
    - blue, pink, green and purple and single top 20
    in. bordered with blue and black cotton
    embroidery thread stitches used for embroidery:
    running stitch, chain stitch, French knot and
    back stitches selvage edges left unfinished
    lower edges turned under and finished with large
    gray running stitches made with embroidery floss.
  • Format: Epson Expression 836 XL Scanner with
    Adobe Photoshop version 5.5; 300 dpi; 21-53K
    bytes. Available via the World Wide Web.
  • Coverage:
  • Date: Created: 2001-09-19 09:45:18; Updated:
    20011107162451; Created: 2001-04-05; Created:
    1912-1920?
  • Type: Image

24
Granularity of Description: Excerpt of Metadata
Record Describing "American Woven Coverlet"
  • Description: Materials: Textile--Multi,
    PigmentDye Manufacturing Process:
    Weaving--Hand, Spinning, Dyeing, Hand-loomed blue
    wool and white linen coverlet, worked in overshot
    weave in plain geometric variant of a
    checkerboard pattern. Coverlet is constructed from
    finely spun, indigo-dyed wool and undyed linen,
    woven with considerable skill. Although the
    pattern is simpler, the overall craftsmanship is
    higher than 1934.01.0094A. - D. Schrishuhn,
    11/19/99 This coverlet is an example of early
    "overshot" weaving construction, probably dating
    to the 1820's and is not attributable to any
    particular weaver. -- Georgette Meredith,
    10/9/1973
  • Source:
  • Format: 228 x 169 x 1.2 cm (1,629 g)
  • Coverage: Euro-American America, North United
    States Indiana? Illinois?
  • Date: Early 19th c. CE
  • Type: cultural physical object original

25
Challenge: Range of vocabularies in use
Controlled Vocabularies in use for IMLS NLG
projects (results from survey of 65 NLG projects
with digital content)
26
Data providers can
  • Create metadata for interoperability
  • Reusable metadata - think beyond your local users
    and environment
  • Use well-structured and defined schemas; move
    beyond simple DC
  • Use and identify controlled vocabularies

27
Service Providers can
  • Analyze metadata and cluster and normalize some
    aspects
  • Build indexes based on type of resource (image,
    text, physical object) rather than collection
  • Custom interfaces and selective views for target
    audiences / domains
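
As a rough illustration of the first two points, the sketch below normalizes free-text type values and builds an index keyed by resource type instead of by collection. The mapping rules, the `collection` labels, and the substring matching are all assumptions made for this example; the titles and type strings are taken from the two coverlet records shown earlier.

```python
from collections import defaultdict

# Assumed mapping from free-text type values to three buckets.
# Order matters: the most specific phrase is checked first.
TYPE_RULES = [
    ("physical object", "physical object"),
    ("image", "image"),
    ("text", "text"),
]


def normalize_type(raw):
    """Map a provider's type value onto one of a few shared buckets.

    Substring matching is deliberately crude; a real service would
    likely match against a controlled vocabulary instead.
    """
    value = raw.strip().lower()
    for needle, bucket in TYPE_RULES:
        if needle in value:
            return bucket
    return "unknown"


def index_by_type(records):
    """Group record titles by normalized type, ignoring collection."""
    index = defaultdict(list)
    for record in records:
        index[normalize_type(record.get("type", ""))].append(record["title"])
    return dict(index)


# Titles and type values from the slides; collection labels are hypothetical.
RECORDS = [
    {"title": "Cotton coverlet with embroidered butterfly design",
     "type": "Image", "collection": "A"},
    {"title": "American Woven Coverlet",
     "type": "cultural physical object original", "collection": "B"},
]

if __name__ == "__main__":
    print(index_by_type(RECORDS))
```

Here the two coverlet records end up in different buckets because one describes a digital image and the other the object itself, which is exactly the kind of distinction a type-based index lets a user see.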

28
Recap
  • OAI is a tool
  • OAI is easy - metadata is hard
  • Better metadata = better interoperability

29
Resources
  • Open Archives Initiative
  • http://www.openarchives.org
  • Mellon Illinois OAI project
  • http://oai.grainger.uiuc.edu
  • IMLS Digital Collections and Content Project
  • http://imlsdcc.grainger.uiuc.edu

30
Contact Information
Sarah Shreeves
Project Coordinator, IMLS Digital Collections and Content
Visiting Assistant Professor of Library Administration
University of Illinois Library at Urbana-Champaign
Email: sshreeve_at_uiuc.edu
Phone: 217.244.7809