METS at UCB - PowerPoint PPT Presentation

Slides: 41
Provided by: loc
Learn more at: https://www.loc.gov

1
METS at UCB
  • Themes in the Implementation of METS
  • Rick Beaubien
  • UC Berkeley Library

2
METS Themes
  • METS creation
  • Gathering metadata: structural, descriptive,
    administrative
  • Generating METS objects
  • METS Repositories
  • Providing search access to METS objects
  • Presenting METS objects

3
METS Themes (2)
  • METS sharing
  • Sharing METS objects between METS Repositories
  • Sharing METS objects across standards and
    communities.

4
Brief History: 3 eras
  • Paleozoic: pre-MOA2 (pre-1997)
  • 1994: California Heritage. Preliminary DB
  • 1996: Honeyman Collection. Desc. MD elements
    defined
  • Mesozoic: MOA2 (1997-2001)
  • Consistent, non-central DB: struct., desc.,
    tech. md
  • MOA2 presentation tool
  • Cenozoic: METS (2001-2004)
  • Central database: struct., desc., tech. md
  • Expanding, adapting MOA2 tools

5
METS Projects at UCB
  • Archival collections: Project-based, grant-funded
    initiatives
  • New projects
  • Data Rescue projects
  • Stored materials: Online access to tables of
    contents
  • Technical Reports: heir of Dienst

6
METS Creation at UCB
  • Centralized approach
  • Centralized database with associated metadata
    input and METS generation modules
  • Dispersed approach
  • Perl scripts, mostly ad hoc, extract necessary
    metadata from multiple sources

7
METS creation: Centralized Approach
  • Components
  • GenDB
  • Relational database
  • Web-based input modules
  • Batch load input modules
  • GenX
  • Generates METS objects from GenDB.

8
METS creation: GenDB Functionality
  • Metadata gathering tool
  • Facilitates input of structural and descriptive md
    (manual and batch)
  • Processing control tool
  • Guides the digitization process by vendors
    (image, transcription/structured text content)
  • Imports content file names and associated
    technical metadata coming out of digitization
    process

9
GenDB Typical Flow
[Flow diagram. Elements: GenDB User; Imaging/Transcription Work Orders; Vendor; Technical MD Spreadsheets; SQL Server Database; METS]

10
METS creation (Centralized): GenDB Database
  • SQL Server database (Relational database)
  • Structural, descriptive and admin md recorded in
    flat table structure
  • GenDB element sets antedate METS, MODS, etc.
  • MOA2.DTD determined structural metadata.
  • Multiple standards influenced descriptive md
    element set.
  • Accommodated to MODS
  • MOA2.DTD determined image technical md element
    set
  • Accommodated to MIX

11
METS creation (Centralized): GenDB Web UI
  • Web based interface for manual input
  • Java Servlet Driven
  • Java Server Backend
  • Key Features
  • Configurable (by Project Managers)
  • Shields users from the complexities of METS and
    standards-specific vocabularies

12
METS creation (Centralized): GenDB Batch Interface
  • Components
  • Batchload schema to which MD to be loaded must
    conform.
  • Batch processor (Java module)
  • Other components shared with Web Interface
  • When useful
  • Anytime input can be programmatically generated
    from existing sources

13
METS creation (Centralized): GenX, Generating METS
  • Components
  • Java Program with graphical UI
  • Key Features
  • Shows list of Objects available for export
  • On demand, queries the database to gather md
    pertaining to the selected object(s) and packages
    it as METS with MODS, MIX extensions
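
The packaging step described above can be illustrated with a minimal sketch. It is not GenX itself (which the slides describe as a Java program); it is a Python illustration that assumes a hypothetical flat row layout of (order, page label, file URL, MIME type) and emits only a fileSec and a physical structMap. The MODS and MIX sections are left out, and the function name, attribute values such as USE="image", and the FILE0001-style IDs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(ns, tag):
    """Return a namespace-qualified tag or attribute name."""
    return "{%s}%s" % (ns, tag)


def build_mets(object_id, label, rows):
    """Package flat rows as a METS element (hypothetical row layout).

    rows: iterable of (order, page_label, href, mimetype) tuples,
    standing in for the result of a query against a flat table.
    """
    mets = ET.Element(q(METS_NS, "mets"), {"OBJID": object_id, "LABEL": label})

    # fileSec: inventory of the content files.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"), {"USE": "image"})

    # structMap: physical page sequence pointing into the fileSec.
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"), {"TYPE": "physical"})
    top_div = ET.SubElement(
        struct_map, q(METS_NS, "div"), {"TYPE": "item", "LABEL": label}
    )

    for order, page_label, href, mimetype in sorted(rows):
        file_id = "FILE%04d" % order
        file_el = ET.SubElement(
            file_grp, q(METS_NS, "file"), {"ID": file_id, "MIMETYPE": mimetype}
        )
        ET.SubElement(
            file_el,
            q(METS_NS, "FLocat"),
            {"LOCTYPE": "URL", q(XLINK_NS, "href"): href},
        )
        page = ET.SubElement(
            top_div,
            q(METS_NS, "div"),
            {"TYPE": "page", "ORDER": str(order), "LABEL": page_label},
        )
        ET.SubElement(page, q(METS_NS, "fptr"), {"FILEID": file_id})
    return mets
```

Calling ET.tostring(build_mets(...), encoding="unicode") serializes the result. Rows are sorted by their order value, so the page sequence in the structMap does not depend on the order in which rows arrive.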

14
METS creation (Centralized): Reality Check
  • Main GenDB limitations
  • Better at physical than logical structuring
  • No support yet for video/audio content
  • Redundant Keying of DescMD
  • No Collection level input

15
METS Creation: Dispersed Approach
  • When used
  • Any time requisite metadata and content files
    are already available and just need to be
    harvested and packaged
  • Legacy databases
  • Projects not requiring GenDB to control
    digitization process
  • Method: Ad hoc Perl scripts gather md and package
    it as METS.
  • Why used
  • Expedient. Lots of Perl programming expertise.

16
Stored Materials Project Flow
[Flow diagram, steps numbered 1-6. Elements: JPEG TOC scans; Perl Script 1; Perl Script 2; GLADIS Catalog; MARC Records; MARCtoMODS; MODS Records; METS Objects]

17
METS Creation: Future Trends
  • Trend toward centralization will continue,
    replacing the dispersed approach
  • Batch interface can handle most dispersed
    situations
  • Makes future maintenance easy
  • Helps ensure consistency in METS output

18
METS Creation: Common Issues
  • Immaturity/Lack of Extension schemas
  • Problems for expressing MD
  • Problems for gathering MD
  • METS related schema status
  • METS: stable
  • Descriptive Metadata: MODS, DC Simple, MARCXML
  • Technical MD: still immature, if available at all

19
METS Access
  • Main sub-themes
  • Discovery
  • Presentation of content and associated metadata

20
METS Access (Discovery): Search Support at UCB
  • No centralized search support for our METS/MOA2
    repository
  • Current discovery mechanisms
  • Online catalog links
  • Finding Aids, OAC supported searching
  • Project home pages and Finding Tools

21
METS Access (Discovery): Projected Support
  • Options considered
  • Tamino/XML database
  • Abandoned
  • Too many limits on XML support
  • Still have to build search interface from scratch
  • Cheshire
  • Greenstone

22
METS Access (Discovery): Cheshire Option
  • What is it
  • Developed by Ray Larson at U.C. Berkeley
  • next-generation online catalog and full-text
    information retrieval system using advanced IR
    techniques
  • Advantages
  • Free
  • Indexes hub documents (like METS) and content
    files where they reside
  • Very sophisticated searching/ranking algorithms
    including Boolean
  • OAI interface

23
METS Access (Discovery): Cheshire Option (2)
  • Disadvantages
  • Does not support Unicode yet
  • coming in version 3
  • Limited collection management support
  • Adding collections
  • Developing search interface
  • No object-level presentation support

24
METS Access (Discovery): Greenstone Option
  • What is it
  • Developed by New Zealand Digital Library Project
    at University of Waikato
  • suite of software for building and distributing
    digital library collections. It provides a new
    way of organizing information and publishing it
    on the Internet or on CD-ROM.
  • Advantages
  • Free/Open Source
  • Next version will be METS-based
  • Strong collection management support

25
METS Access (Discovery): Greenstone Option (2)
  • Advantages (cont'd)
  • Unicode support now
  • Fairly sophisticated search support
  • Some presentation support
  • OAI support in progress
  • Disadvantages
  • Does not index objects where they reside
  • This limitation may apply to METS-based version
    as well

26
METS Access (Presentation): GenView
  • Java-based Software suite developed at UCB for
    MOA2/METS presentation
  • History
  • Originates in Making of America II (1997)
  • XSLT in infancy
  • Web Services non-existent
  • CORBA/RMI and servlet technology were hot
  • GenView originally supported MOA2 objects
  • GenView adapted to accommodate METS

27
GenView Basic Architecture
[Architecture diagram. Elements: Web Interface; Java Servlet; XSLT; RMI; Repository Manager (Java); METS Java Objects; METS XML Documents]

28
METS Access (Presentation): GenView Evaluated
  • Advantages
  • It exists
  • Presentation very efficient
  • Meets basic presentation needs well
  • Disadvantages
  • Geared towards image/native browser content
  • Limited configuration options
  • Complex; difficult to maintain

29
METS Access (Presentation): GenView In Context
  • XSLT-based approaches to METS presentation
  • NYU: Native METS
  • Library of Congress: Transformed METS
  • University of Chicago:
  • Prebuilding HTML pages as part of an XSLT
    transformation to load METS objects into
    Greenstone.

30
Sharing METS Objects
  • Sharing METS objects between METS repositories
  • Plea for Profiles
  • Sharing METS objects across standards
  • METS and Learning objects standards

31
METS Sharing (METS to METS): METS as Transfer Syntax
  • METS, like MARC, can function as transfer syntax
  • Problem: METS offers much more leeway to
    implementation than MARC
  • Key areas of variation
  • Structure of <fileSec> and <structMap> and
    relations between the two
  • Extension schemas used and required elements
  • Attribute vocabularies
  • mets/@TYPE
  • fileGrp/@USE, file/@USE
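
One way to see how much two repositories diverge on these points is to pull the relevant attribute values out of a METS document and compare them. The sketch below is an illustration only, not a profile validator: the function name and the shape of the returned dictionary are assumptions, and it reads just mets/@TYPE, fileGrp/@USE, file/@USE, structMap/@TYPE, and the MDTYPE values on mdWrap (as a rough indicator of which extension schemas are used).

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"


def _values(root, tag, attr):
    """Sorted distinct values of `attr` on all METS `tag` elements."""
    found = {el.get(attr) for el in root.iter(METS + tag)}
    return sorted(v for v in found if v)


def summarize_mets(xml_text):
    """Report the attribute vocabularies a METS document actually uses."""
    root = ET.fromstring(xml_text)
    return {
        "mets_type": root.get("TYPE"),
        "filegrp_use": _values(root, "fileGrp", "USE"),
        "file_use": _values(root, "file", "USE"),
        "structmap_type": _values(root, "structMap", "TYPE"),
        "mdtypes": _values(root, "mdWrap", "MDTYPE"),
    }
```

Running summarize_mets over sample objects from two repositories and comparing the dictionaries gives a quick picture of where a shared profile would need to pin down vocabulary.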

32
METS Sharing (METS to METS): Sharing in UC System
  • Not a theoretical goal but a reality
  • All UC campus libraries participate in OAC/CDL
  • Moving towards profiles
  • Common starting point: MOA2
  • Working groups under auspices of OAC
  • Desired Result: Submission Profiles

33
METS Sharing (Across Standards): METS and other standards
  • METS originates in library world
  • especially suited to library needs
  • Focus/ primary concerns of other communities
    somewhat different
  • developing their own digital object standards
  • Does this matter and why?

34
METS Sharing (Across Standards): METS and IMS-CP
  • IMS Global Learning Consortium developing
    learning object standards
  • IMS-CP analogous to METS
  • Goal: enable production of learning objects that
    can be played in IMS standards-savvy tools
  • Importance of compatibility with METS
  • Incorporating library resources (METS) into
    learning objects
  • Archiving learning objects in METS-based
    repositories

35
METS Sharing (Across Standards): UCB Library Efforts
  • METS/IMS-CP Cross Walk project
  • Headed by Raymond Yee of Interactive University
    at UCB
  • Results of effort thus far
  • Analysis of key similarities and differences
    between two schemas
  • Preliminary x-walk
  • Published in Library Hi Tech

36
METS Sharing (Across Standards): UCB Library Efforts (2)
  • Summary of analysis
  • Two schemas share many high level similarities
  • Hierarchical structMap
  • fileSec for inventorying resources referenced
    from structMap
  • Accommodation for MD defined by other schemas
  • Key difference: IMS-CP does not distinguish
    between presentation and content
  • Future
  • Standards Merge?
  • Some provision for sharing across communities

37
Links
  • California Heritage Collection.
    http://sunsite.berkeley.edu/CalHeritage/
  • MOA2 Project. http://sunsite.berkeley.edu/moa2/
  • GenDB Web Interface Demo.
    http://sunsite2.berkeley.edu/GenDB
    (Account: demoman; Password: demoman)

38
Links
  • MODS. http://www.loc.gov/mods
  • MIX. http://www.loc.gov/mix
  • Cheshire II. http://cheshire.lib.berkeley.edu/
  • Greenstone.
    http://www.greenstone.org/cgi-bin/library

39
Links
  • GenView demo.
    http://metsviewer.lib.berkeley.edu/metstest/BreenMETS.xml
  • NYU METS Page-Turner.
    http://dlib.nyu.edu/metstools/
  • U. Chicago Chopin Early Editions
    (Greenstone-based collection).
    http://chopin.lib.uchicago.edu/

40
Links
  • IMS-CP.
    http://www.imsglobal.org/content/packaging/index.cfm
  • METS/Educational Technology Interoperability.
    http://iu.berkeley.edu/crosswalk/