Title: Visual Materials: Metadata, Standards, and Best Practices for Digital Libraries
1. Visual Materials: Metadata, Standards, and Best Practices for Digital Libraries
- Howard Besser
- UCLA School of Education & Information
- http://www.gseis.ucla.edu/howard
2. Metadata for Digital Libraries
- Models for Digital Libraries
- Importance of Metadata Standards
- Types and Uses of Metadata
- Discovery Metadata: The Dublin Core
- Administrative and Structural Metadata: The Making of America II Project
- Longevity Metadata
- Identification/Provenance
- The 4/99 NISO/DLF Image Metadata Workshop
- Various other Metadata
3. Key problems we're facing
- Discovery
- Longevity
- Interoperability
4. Serious Longevity Problems
- What we know from prior widespread digital file formats
- Images separating from their metadata
- Inaccessibility of software needed to view an image
- Inability to even decode the file format of an image
5. Traditional Digital Library Model
6. Ideal Digital Library Model
7. For Interoperability, Digital Libraries Need Standards
- Discovery Metadata for finding
- Administrative Metadata for viewing and maintaining
- Structural Metadata for navigation
- ... IP Rights Management Metadata for controlling access...
8. Why are Standards and Metadata consensus important?
- Managing digital files over time
- Longevity
- Interoperability
- Veracity
- Recording in a consistent manner
- Will give vendors incentive to create
applications that support this
9. Why Standards?
- Why do we need standards?
- To make information universally available to users
- To facilitate sharing and interchange of information
- To preserve information (make it safe from changes in hardware and software)
- Standards are the work of communities
- They are necessary so that communities can work.
10. Why are you Managing this Information?
- Organizational mission & type
- Users
- Uses
11. Questions to Ask
- What communities is this standard designed for?
- What type of information is this standard designed to handle?
- What functions is this standard designed to serve?
- What previous standards is it built upon?
- Does the standard prescribe how to create new records (or parts of records), or how to map from existing records?
- How far does the standard go? Semantics: does it define element sets? Rules? Syntax?
12. What is Metadata?
- Structured data describing other data, used to find or help manage information resources
- Aids in interoperability
- Titles, dates, captions, cataloging and indexing data, file headers, rights info, provenance, code books, transaction logs, ...
- One person's metadata is another's data
13. Sorting through the Standards Morass
- Data Structures (DC, CDWA, MARC, VRA Core, TEI, EAD, MESL data dict)
- Data Interchange (Z39.50)
- Data Values/vocabularies (LCSH, AAT, ULAN, TGN)
- Data Content/syntax (AACR2)
14. Semantics/Syntax/Structure
- Semantics
  - meaning, as defined by a community to meet their particular needs (DC)
- Syntax
  - a systematic arrangement of data elements for machine processing
  - facilitates the exchange and use of metadata among various applications (HTML, XML, RDF)
- Structure
  - a formal arrangement of the syntax with the goal of consistent representation of the semantics (rules defining field contents like 1/11/99)
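The date in the Structure bullet shows why content rules matter: a value such as 1/11/99 can be read month-first or day-first. A minimal Python sketch, assuming a hypothetical helper `normalize_date` and two hypothetical rule names that are not part of any standard cited in these slides, normalizes the same raw field to ISO 8601 under each rule:

```python
from datetime import datetime

# Hypothetical content rules: the date order each community has agreed on.
FORMATS = {"month_first": "%m/%d/%y", "day_first": "%d/%m/%y"}

def normalize_date(raw: str, rule: str) -> str:
    """Parse a raw date field under the named rule and return ISO 8601."""
    return datetime.strptime(raw, FORMATS[rule]).date().isoformat()

print(normalize_date("1/11/99", "month_first"))  # 1999-01-11
print(normalize_date("1/11/99", "day_first"))    # 1999-11-01
```

Without an agreed rule the two readings differ by almost ten months, which is the kind of inconsistency a structure standard is meant to prevent.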
15. What is Metadata: Types & Uses
- lots of different ways of dividing the clusters
16. Uses of Metadata
- Discovery & Retrieval
- Identification/Provenance
- Rights Management
- Viewing
- Integrity
- Longevity
- Content rating
17. Types of Metadata
- Descriptive
- Discovery & Retrieval
- Structural
- Administrative
- Intellectual
- Other Metadata
18. Metadata -- Detailed Types
- Identification metadata
- Instance or Fixation metadata
- Source image metadata
- Content metadata
- Subject metadata
- Form and format metadata
- Context metadata
- Structure metadata
- Relationships metadata
- Terms & Conditions metadata
- Use history metadata
19. Containers and Packages of Metadata: Warwick, not MARC
- modular
- overlapping
- extensible
- community-based
- designed for a networked world to aid commonality between communities while still providing full functionality within each community
20. Some different schemes where Metadata is kept
- embedded within the object (HTML tags)
- in a separate related DB maintained by the same organization (OPAC, MOA II)
- in a separate DB maintained by a separate organization (Books in Print, ratings systems)
- derived on-the-fly from a different scheme (MARC-to-DC)
21. Some Standards/Metadata Efforts
- Dublin Core
- Visual Resources Association (VRA) Core
- Encoded Archival Description (EAD)
- Computerized Interchange of Museum Information (CIMI)
- Records Export for Art and Cultural Heritage (REACH)
22. Dublin Core (3/95)
- improve resource discovery
- anticipate precision problems of Web Crawler-based searching tools
- existing metadata could be dumbed down
- elements should be simple to understand and use, so that any individual should be able to assign terms him/herself
- software might eventually automatically generate very base-level metadata
23. Dublin Core
- Title
- Creator
- Subject
- Description
- Publisher
- Contributor
- Date
- Type
- Format
- Identifier
- Source
- Language
- Relation
- Coverage
- Rights
24. Dublin Core
- every element is both optional and repeatable
- elements are cross-disciplinary
- elements are extensible by organized communities
- can employ a syntax such as HTML's <META> tagset
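To make "optional and repeatable" concrete, here is a minimal Python sketch that renders a Dublin Core record as HTML META tags. The function name `to_meta_tags`, the `DC.` name prefix, and the sample values are assumptions made for illustration; the slides do not prescribe a particular encoding.

```python
from html import escape

def to_meta_tags(record: dict[str, list[str]]) -> str:
    """Render a Dublin Core record as HTML <meta> tags.

    Every element is optional (absent keys are simply skipped) and
    repeatable (each key maps to a list of values).
    """
    lines = []
    for element, values in record.items():
        for value in values:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)

# Illustrative record; the values are invented for this example.
record = {
    "title": ["View of a harbor"],
    "subject": ["Harbors", "Ships"],  # a repeated element
    # no "creator" key: the element is optional, so it is just omitted
}
print(to_meta_tags(record))
```

Storing each element as a list keeps repetition uniform, so a record with one subject and a record with ten are handled by the same loop.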
25. DC Qualifiers
- allows one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs
- our community can reflect alternate title, transliterated title, and main title, yet they will all be found under a simple Web search under title
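The qualifier idea can be sketched in a few lines of Python. The dotted key convention (`title.alternative`, `title.transliterated`) and the `search` function are hypothetical names chosen for this example, not the syntax of any Dublin Core specification; the point is only that a simple search on the base element ignores the qualifier, while a community with richer needs can still search the qualified form.

```python
def search(records, element, term):
    """Return records where a matching element contains `term`.

    Searching a bare element such as "title" matches every qualified
    variant of it; searching "title.alternative" matches only that key.
    """
    term = term.lower()
    hits = []
    for record in records:
        for key, values in record.items():
            base = key.split(".", 1)[0]  # strip the qualifier, if any
            if (key == element or base == element) and any(
                term in value.lower() for value in values
            ):
                hits.append(record)
                break  # one hit per record is enough
    return hits

# Illustrative records; the values are invented for this example.
records = [
    {"title.transliterated": ["Voina i mir"],
     "title.alternative": ["War and Peace"]},
    {"title": ["Peaceful Harbor"]},
    {"description": ["A print about peace"]},
]

simple = search(records, "title", "peace")               # both title records
precise = search(records, "title.alternative", "peace")  # first record only
print(len(simple), len(precise))
```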
26. Discovery Metadata: Recent History
- Dublin Core (3/95)
- Warwick Framework (4/96)
- Image Metadata Workshop (9/96)
- Canberra, Helsinki, ... DC (98)
- Digital Library Collaboratory (97-)
- DC-8, Frankfurt 10/99
27. Dublin Core -- further work
- Warwick Framework
  - metadata packages for extensible functions
  - laid the groundwork for RDF
- Canberra Qualifiers
  - refining the semantics of the element set to provide more precise info
  - SUBELEMENT, SCHEME, LANG
- Granularity
  - no hierarchical relationships within a given DC record: only one record per discrete object (collection or item-level), and a relationship field plus qualifier links them
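The Granularity point can be illustrated with flat records that are linked only through a qualified relation field. In this minimal Python sketch the record ids, the `relation.IsPartOf` key, and the `members_of` helper are all hypothetical names chosen for the example.

```python
# One flat record per discrete object; no record nests inside another.
# Ids and values are invented for this example.
records = {
    "coll-1": {"title": ["Harbor photographs"], "type": ["collection"]},
    "img-1": {"title": ["Harbor at dawn"], "relation.IsPartOf": ["coll-1"]},
    "img-2": {"title": ["Harbor at dusk"], "relation.IsPartOf": ["coll-1"]},
}

def members_of(records, collection_id):
    """Return ids of records whose qualified relation points at the collection."""
    return sorted(
        rec_id
        for rec_id, rec in records.items()
        if collection_id in rec.get("relation.IsPartOf", [])
    )

print(members_of(records, "coll-1"))  # ['img-1', 'img-2']
```

The hierarchy is reconstructed at query time from the links, rather than being stored inside any single record.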
28. The Research Process and Functional Categories of Metadata
- Discovery
- Retrieval
- Collation
- Analysis
- Re-presentation
29. Making of America II
- Background of the DLF Project
- Administrative Metadata
- Structural Metadata
30. Other Types of Metadata
- Longevity
- Identification/Provenance
- Rights Management
31. The Short Life of Digital Info: Digital Longevity Problems
- Disappearing Information
- The Viewing Problem
- The Scrambling Problem
- The Inter-relation Problem
- The Custodial Problem
- The Translation Problem
32. Identification/Provenance (Images)
- The number of variant forms of a work can be enormous
- Image Families
- A digital image frequently has many layers of parentage
- Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation")
- how to deal with different versions derived from the same scan or different encoding schemes
- Vocabulary Standards to express this
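The "layers of parentage" in an image family can be modeled as a chain of Source links. The following Python sketch assumes each record has at most one `source` value; the `lineage` helper, the record ids, and the formats are hypothetical and only illustrate the idea.

```python
def lineage(records, image_id):
    """Follow each record's single 'source' link back to the original.

    Returns the ids visited, starting with `image_id`. The `seen` set
    guards against accidental cycles in the metadata.
    """
    chain = []
    seen = set()
    current = image_id
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = records[current].get("source")
    return chain

# Illustrative image family; ids and formats are invented.
images = {
    "painting": {"format": "oil on canvas"},
    "slide": {"format": "35mm slide", "source": "painting"},
    "scan": {"format": "TIFF", "source": "slide"},
    "thumb": {"format": "JPEG", "source": "scan"},
}

print(lineage(images, "thumb"))  # ['thumb', 'scan', 'slide', 'painting']
```

A viewer can use the length and content of the chain as a rough indicator of how far a derivative is removed from the original object.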
33. NISO/DLF Image Metadata Workshop: Possible Goals
- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents
(headers)
34. Other Metadata
- Description of depiction/surrogate (What VRA calls its "Surrogate Categories")
- Description of original object
- Rights and Reproduction Information
- Location Information
35. Data Structures: The VRA Core
- 28 elements specifically for visual resource collections
- Work Description Categories
- Visual Document Description Categories
- http://www.oberlin.edu/art/vra/dsc.html
36. Data Value Metadata (vocabularies)
- LCSH
- TGM
- AAT
- ULAN
- TGN
- VRA Core
37. Metadata for Digital Commerce
38. Metadata Mapping
- Crosswalks
- Resource Description Framework (RDF)
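A crosswalk is essentially a lookup table from one scheme's fields to another's. The Python sketch below uses a deliberately simplified MARC-to-Dublin Core table for illustration; it is not a complete or authoritative crosswalk, and the flattened `260b`/`260c` keys (tag plus subfield) and the `crosswalk` function are assumptions of this example.

```python
# Simplified, illustrative mapping from MARC tags to Dublin Core elements.
# Keys such as "260b" are a hypothetical tag-plus-subfield shorthand.
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260b": "publisher",
    "260c": "date",
    "520": "description",
    "650": "subject",
}

def crosswalk(marc_fields):
    """Map (tag, value) pairs to a DC record; unmapped tags are dropped."""
    dc = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc

# Invented sample input for this example.
marc = [
    ("245", "Harbor at dawn"),
    ("650", "Harbors"),
    ("650", "Ships"),
    ("010", "not mapped in this table"),
]
print(crosswalk(marc))
```

Fields with no target element are silently lost, which is why mapping from a richer scheme to a simpler one is a one-way reduction.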
39. Metadata Philosophies
- Minimalists vs. Structuralists
- From Pidgin to Creole (add structure and tenses)
40. Collaborative Metadata Projects
- OCLC CORC Project
- Computerized Interchange of Museum Information
(CIMI)
41. Visual Materials: Metadata, Standards, and Best Practices for Digital Libraries
- Howard Besser
- UCLA School of Education & Information
- Baca, Murtha (ed.). Introduction to Metadata. Los Angeles: Getty Information Institute, 1998
- http://www.gseis.ucla.edu/howard/image-meta.html
- http://www.gseis.ucla.edu/howard
- http://sunsite.Berkeley.EDU/Imaging/Databases/standards
- http://sunsite.Berkeley.EDU/moa2/
- http://sunsite.Berkeley.EDU/Longevity/
- http://www.gii.getty.edu/timeandbits/
- http://www.nlc-bnc.ca/ifla/II/metadata.htm
- http://purl.oclc.org/metadata/dublin_core/
- http://purl.oclc.org/corc//
- http://lcweb.loc.gov/ead/
- http://www.cimi.org/