Title: METS at UCB
1. METS at UCB
- Themes in the Implementation of METS
- Rick Beaubien
- UC Berkeley Library
2. METS Themes
- METS creation
- Gathering metadata: structural, descriptive, administrative
- Generating METS objects
- METS Repositories
- Providing search access to METS objects
- Presenting METS objects
3. METS Themes (2)
- METS sharing
- Sharing METS objects between METS Repositories
- Sharing METS objects across standards and communities
4. Brief History: 3 eras
- Paleozoic: pre-MOA2 (pre-1997)
- 1994: California Heritage, preliminary DB
- 1996: Honeyman Collection, desc. MD elements defined
- Mesozoic: MOA2 (1997-2001)
- Consistent, non-central DB: struct., desc., and tech. md
- MOA2 presentation tool
- Cenozoic: METS (2001-2004)
- Central database: struct., desc., and tech. md
- Expanding, adapting MOA2 tools
5. METS Projects at UCB
- Archival collections: project-based, grant-funded initiatives
- New projects
- Data Rescue projects
- Stored materials: online access to tables of contents
- Technical Reports: heir of Dienst
6. METS Creation at UCB
- Centralized approach
- Centralized database with associated metadata input and METS generation modules
- Dispersed approach
- Perl scripts, mostly ad hoc, extract necessary metadata from multiple sources
7. METS creation: Centralized Approach
- Components
- GenDB
- Relational database
- Web-based input modules
- Batch load input modules
- GenX
- Generates METS objects from GenDB.
8. METS creation: GenDB Functionality
- Metadata gathering tool
- Facilitates input of structural and descriptive md (manual and batch)
- Processing control tool
- Guides the digitization process by vendors (image, transcription/structured text content)
- Imports content file names and associated technical metadata coming out of the digitization process
9. GenDB Typical Flow
[Diagram: GenDB User, Imaging/Transcription Work Orders, Vendor, Technical MD Spreadsheets, SQL Server Database, METS]
10. METS creation, Centralized: GenDB Database
- SQL Server database (relational database)
- Structural, descriptive, and admin md recorded in flat table structure
- GenDB element sets antedate METS, MODS, etc.
- MOA2.DTD determined structural metadata.
- Multiple standards influenced descriptive md element set.
- Accommodated to MODS
- MOA2.DTD determined image technical md element set
- Accommodated to MIX
11. METS creation, Centralized: GenDB Web UI
- Web-based interface for manual input
- Java Servlet Driven
- Java Server Backend
- Key Features
- Configurable (by Project Managers)
- Shields users from the complexities of METS and standards-specific vocabularies
12. METS creation, Centralized: GenDB Batch Interface
- Components
- Batchload schema to which MD to be loaded must conform
- Batch processor (Java module)
- Other components shared with Web Interface
- When useful
- Anytime input can be programmatically generated from existing sources
13. METS creation, Centralized: GenX Generating METS
- Components
- Java Program with graphical UI
- Key Features
- Shows list of Objects available for export
- On demand, queries database to gather md pertaining to selected object(s) and packages it as METS with MODS and MIX extensions
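The packaging step described above can be sketched in a few lines. This is not GenX itself (which the slides describe as a Java program); it is a minimal Python illustration, assuming a flat descriptive record has already been read from the database. The function name `build_mets` and the record keys `objid`, `type`, and `title` are hypothetical, and technical metadata (MIX) is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
XLINK_NS = "http://www.w3.org/1999/xlink"

for prefix, uri in (("mets", METS_NS), ("mods", MODS_NS), ("xlink", XLINK_NS)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return "{%s}%s" % (ns, tag)


def build_mets(record, image_files):
    """Package one flat record plus its image files as a METS element.

    `record` is a hypothetical dict with keys objid, type, and title.
    """
    mets = ET.Element(q(METS_NS, "mets"),
                      {"OBJID": record["objid"], "TYPE": record["type"]})

    # Descriptive metadata: a MODS record wrapped inside a dmdSec.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = record["title"]

    # File inventory and a physical structMap with one div per page image.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"), {"USE": "image"})
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"), {"TYPE": "physical"})
    top = ET.SubElement(struct_map, q(METS_NS, "div"),
                        {"LABEL": record["title"], "DMDID": "DMD1"})

    for order, href in enumerate(image_files, start=1):
        file_id = "FILE%d" % order
        file_el = ET.SubElement(file_grp, q(METS_NS, "file"), {"ID": file_id})
        ET.SubElement(file_el, q(METS_NS, "FLocat"),
                      {"LOCTYPE": "URL", q(XLINK_NS, "href"): href})
        page = ET.SubElement(top, q(METS_NS, "div"),
                             {"TYPE": "page", "ORDER": str(order)})
        ET.SubElement(page, q(METS_NS, "fptr"), {"FILEID": file_id})

    return mets
```

Calling `ET.tostring(build_mets(record, files), encoding="unicode")` serializes the result. The flat input record mirrors the flat table structure mentioned earlier; a real generator would also emit an amdSec for technical metadata.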
14. METS creation, Centralized: Reality Check
- Main GenDB limitations
- Better at physical than logical structuring
- No support yet for video/audio content
- Redundant Keying of DescMD
- No Collection level input
15. METS Creation: Dispersed Approach
- When used
- Any time requisite metadata and content files are already available and just need to be harvested and packaged
- Legacy databases
- Projects not requiring GenDB to control digitization process
- Method: ad hoc Perl scripts gather md and package it as METS.
- Why used
- Expedient. Lots of Perl programming expertise.
16. Stored Materials Project Flow
[Diagram, steps numbered 1-6: JPEG TOC scans, Perl Script 1, GLADIS Catalog, MARC Records, MARCtoMODS, MODS Records, Perl Script 2, METS Objects]
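A dispersed-approach harvest of this kind has to pair each set of table-of-contents scans with its catalog record before anything can be packaged. The slide does not say how that pairing was done, so the following is only a sketch under an invented assumption: that scan file names carry a record identifier and a page sequence, as in `b1234_001.jpg`. The pattern `SCAN_NAME` and the function `group_scans` are hypothetical, and the sketch is in Python rather than the Perl used in the project.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: <record id>_<page sequence>.jpg
SCAN_NAME = re.compile(r"^(?P<record_id>[A-Za-z0-9]+)_(?P<seq>\d+)\.jpe?g$",
                       re.IGNORECASE)


def group_scans(filenames):
    """Group scan file names by record id, ordered by page sequence.

    Names that do not match the assumed convention are ignored.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = SCAN_NAME.match(name)
        if match is None:
            continue
        groups[match.group("record_id")].append((int(match.group("seq")), name))
    return {record_id: [name for _, name in sorted(pages)]
            for record_id, pages in groups.items()}
```

Each resulting key could then drive a catalog lookup, with the ordered file list feeding the page sequence of the METS object.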
17. METS Creation: Future Trends
- Trend toward centralization will continue and replace the dispersed approach
- Batch interface can handle most dispersed situations
- Makes future maintenance easy
- Helps ensure consistency in METS output
18. METS Creation: Common Issues
- Immaturity/Lack of Extension schemas
- Problems for expressing MD
- Problems for gathering MD
- METS related schema status
- METS: stable
- Descriptive Metadata: MODS, DC Simple, MARCXML
- Technical MD: still immature, if available at all
19. METS Access
- Main sub-themes
- Discovery
- Presentation of content and associated metadata
20. METS Access, Discovery: Search Support at UCB
- No centralized search support for our METS/MOA2 repository
- Current discovery mechanisms
- Online catalog links
- Finding Aids, OAC supported searching
- Project home pages and Finding Tools
21. METS Access, Discovery: Projected Support
- Options considered
- Tamino/XML database
- Abandoned
- Too many limits on XML support
- Still have to build search interface from scratch
- Cheshire
- Greenstone
22. METS Access, Discovery: Cheshire Option
- What is it
- Developed by Ray Larson at U.C. Berkeley
- Next-generation online catalog and full-text information retrieval system using advanced IR techniques
- Advantages
- Free
- Indexes hub documents (like METS) and content files where they reside
- Very sophisticated searching/ranking algorithms, including Boolean
- OAI interface
23. METS Access, Discovery: Cheshire Option (2)
- Disadvantages
- Does not support Unicode yet
- Coming in version 3
- Limited collection management support
- Adding collections
- Developing search interface
- No object-level presentation support
24. METS Access, Discovery: Greenstone Option
- What is it
- Developed by New Zealand Digital Library Project at University of Waikato
- Suite of software for building and distributing digital library collections. It provides a new way of organizing information and publishing it on the Internet or on CD-ROM.
- Advantages
- Free/Open Source
- Next version will be METS-based
- Strong collection management support
25. METS Access, Discovery: Greenstone Option (2)
- Advantages (cont'd)
- Unicode support now
- Fairly sophisticated search support
- Some presentation support
- OAI support in progress
- Disadvantages
- Does not index objects where they reside
- This limitation may apply to METS-based version as well
26. METS Access, Presentation: GenView
- Java-based software suite developed at UCB for MOA2/METS presentation
- History
- Originates in Making of America II (1997)
- XSLT in infancy
- Web Services non-existent
- CORBA/RMI and servlet technology were hot
- GenView originally supported MOA2 objects
- GenView adapted to accommodate METS
27. GenView Basic Architecture
[Diagram: Web Interface, Java Servlet, XSLT, RMI, Repository Manager (Java), METS Java Object, METS Java Objects, METS XML Documents]
28. METS Access, Presentation: GenView Evaluated
- Advantages
- It exists
- Presentation very efficient
- Meets basic presentation needs well
- Disadvantages
- Geared towards image/native browser content
- Limited configuration options
- Complex, difficult to maintain
29. METS Access, Presentation: GenView In Context
- XSLT-based approaches to METS presentation
- NYU: Native METS
- Library of Congress: Transformed METS
- University of Chicago
- Prebuilding HTML pages as part of an XSLT transformation to load METS objects into Greenstone.
30. Sharing METS Objects
- Sharing METS objects between METS repositories
- Plea for Profiles
- Sharing METS objects across standards
- METS and learning object standards
31. METS Sharing, METS to METS: METS as Transfer Syntax
- METS, like MARC, can function as transfer syntax
- Problem: METS offers much more leeway to implementation than MARC
- Key areas of variation
- Structure of <fileSec> and <structMap> and relations between the two
- Extension schemas used and required elements
- Attribute vocabularies
- mets/@TYPE
- fileGrp/@USE, file/@USE
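Before two repositories agree on a profile, it helps to see where their METS objects actually differ on these points. The sketch below is one possible way to do that, not an established tool: a hypothetical function `summarize_mets` reports the TYPE and USE vocabularies in use, the extension schemas declared in mdWrap, and any structMap file pointers that do not resolve to an entry in the fileSec.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"


def summarize_mets(xml_text):
    """Summarize the implementation choices made in one METS document."""
    root = ET.fromstring(xml_text)

    def attr_values(tag, attr):
        # Distinct, non-empty values of one attribute across all such elements.
        return sorted({el.get(attr) for el in root.iter(METS + tag) if el.get(attr)})

    file_ids = set(attr_values("file", "ID"))
    pointer_ids = set(attr_values("fptr", "FILEID"))

    return {
        "mets_type": root.get("TYPE"),
        "filegrp_use": attr_values("fileGrp", "USE"),
        "file_use": attr_values("file", "USE"),
        "mdtypes": attr_values("mdWrap", "MDTYPE"),
        # fptr elements in the structMap that point at no file in the fileSec
        "dangling_fptrs": sorted(pointer_ids - file_ids),
    }
```

Running this over sample objects from each partner and comparing the summaries gives a concrete list of the vocabularies and structures a shared profile would have to settle.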
32. METS Sharing, METS to METS: Sharing in UC System
- Not a theoretical goal but a reality
- All UC campus libraries participate in OAC/CDL
- Moving towards profiles
- Common starting point: MOA2
- Working groups under auspices of OAC
- Desired result: Submission Profiles
33. METS Sharing Across Standards: METS and other standards
- METS originates in library world
- Especially suited to library needs
- Focus/primary concerns of other communities somewhat different
- Developing their own digital object standards
- Does this matter and why?
34. METS Sharing Across Standards: METS and IMS-CP
- IMS Global Learning Consortium developing learning object standards
- IMS-CP analogous to METS
- Goal: enable production of learning objects that can be played in IMS standards-savvy tools
- Importance of compatibility with METS
- Incorporating library resources (METS) into learning objects
- Archiving learning objects in METS-based repositories
35. METS Sharing Across Standards: UCB Library Efforts
- METS/IMS-CP Cross Walk project
- Headed by Raymond Yee of Interactive University at UCB
- Results of effort thus far
- Analysis of key similarities and differences between two schemas
- Preliminary x-walk
- Published in Library Hi Tech
36. METS Sharing Across Standards: UCB Library Efforts (2)
- Summary of analysis
- Two schemas share many high-level similarities
- Hierarchical structMap
- fileSec for inventorying resources referenced from structMap
- Accommodation for MD defined by other schemas
- Key difference: IMS-CP does not distinguish between presentation and content
- Future
- Standards merge?
- Some provision for sharing across communities
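The structural similarities listed above suggest what the simplest part of a crosswalk could look like. The following is a rough sketch only, not the published crosswalk: it assumes METS div elements map to IMS-CP item elements and METS file entries map to IMS-CP resource elements. The function name `mets_to_imscp`, the generated identifiers, and the choice of `type="webcontent"` for every resource are illustrative assumptions, and metadata sections are ignored entirely.

```python
import itertools
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
IMSCP = "{http://www.imsglobal.org/xsd/imscp_v1p1}"


def mets_to_imscp(mets_root):
    """Map the fileSec and first structMap of a METS element to an IMS-CP manifest."""
    manifest = ET.Element(IMSCP + "manifest",
                          {"identifier": mets_root.get("OBJID", "MANIFEST1")})
    organizations = ET.SubElement(manifest, IMSCP + "organizations")
    resources = ET.SubElement(manifest, IMSCP + "resources")

    # fileSec inventory -> resources inventory (files without a URL are skipped)
    for file_el in mets_root.iter(METS + "file"):
        locat = file_el.find(METS + "FLocat")
        href = locat.get(XLINK_HREF) if locat is not None else None
        if href is None or file_el.get("ID") is None:
            continue
        resource = ET.SubElement(resources, IMSCP + "resource",
                                 {"identifier": file_el.get("ID"),
                                  "type": "webcontent", "href": href})
        ET.SubElement(resource, IMSCP + "file", {"href": href})

    counter = itertools.count(1)

    def copy_div(div, parent):
        # Lossy step: an item can reference one resource, so only the
        # first fptr of a div is carried over.
        attrs = {"identifier": "ITEM%d" % next(counter)}
        fptr = div.find(METS + "fptr")
        if fptr is not None and fptr.get("FILEID"):
            attrs["identifierref"] = fptr.get("FILEID")
        item = ET.SubElement(parent, IMSCP + "item", attrs)
        ET.SubElement(item, IMSCP + "title").text = div.get("LABEL", "")
        for child in div.findall(METS + "div"):
            copy_div(child, item)

    struct_map = mets_root.find(METS + "structMap")
    if struct_map is not None:
        organization = ET.SubElement(organizations, IMSCP + "organization",
                                     {"identifier": "ORG1"})
        for div in struct_map.findall(METS + "div"):
            copy_div(div, organization)
    return manifest
```

Even this small mapping loses information: a div that points to several files, or a document with more than one structMap, has no direct equivalent in the output. That is the kind of gap a merge or a cross-community sharing provision would need to address.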
37. Links
- California Heritage Collection. http://sunsite.berkeley.edu/CalHeritage/
- MOA2 Project. http://sunsite.berkeley.edu/moa2/
- GenDB Web Interface Demo. http://sunsite2.berkeley.edu/GenDB (Account: demoman; Password: demoman)
38. Links
- MODS. http://www.loc.gov/mods
- MIX. http://www.loc.gov/mix
- Cheshire II. http://cheshire.lib.berkeley.edu/
- Greenstone. http://www.greenstone.org/cgi-bin/library
39. Links
- GenView demo. http://metsviewer.lib.berkeley.edu/metstest/BreenMETS.xml
- NYU METS Page-Turner. http://dlib.nyu.edu/metstools/
- U. Chicago Chopin Early Editions (Greenstone-based collection). http://chopin.lib.uchicago.edu/
40. Links
- IMS-CP. http://www.imsglobal.org/content/packaging/index.cfm
- METS/Educational Technology Interoperability. http://iu.berkeley.edu/crosswalk/