Title: Introduction to Metadata for Digital Asset Management
1Introduction to Metadata for Digital Asset
Management
- Howard Besser
- UCLA School of Education & Information
- http://www.gseis.ucla.edu/howard
2Metadata: A fancy word for something familiar
- Cataloging
- Indexing
- Description
- But also new elements of technical description
(file format, compression schemes, file names, ...)
3Metadata for Digital Asset Management
- Importance of Metadata Standards
- Types and Uses of Metadata
- Discovery Metadata: The Dublin Core
- Administrative and Structural Metadata: The
Making of America II Project
- Longevity Metadata
- Identification/Provenance
- The 4/99 NISO/DLF Image Metadata Workshop
- Various other Metadata
4What is Metadata?
- Structured data describing other data used to
find or help manage information resources
- Aids in interoperability
- Titles, dates, captions, cataloging and indexing
data, file headers, rights info, provenance, code
books, transaction logs, ...
- One person's metadata is another's data
5Sorting through the Standards Morass
- Data Structures (DC, CDWA, MARC, VRA Core, TEI,
EAD, MESL data dict)
- Data Interchange (Z39.50)
- Data Values/vocabularies (LCSH, AAT, ULAN, TGN)
- Data Content/syntax (AACR2)
6Semantics/Syntax/Structure
- Semantics
- meaning, as defined by a community to meet their
particular needs (DC)
- Syntax
- a systematic arrangement of data elements for
machine processing
- facilitates the exchange and use of metadata
among various applications (HTML, XML, RDF)
- Structure
- a formal arrangement of the syntax with the goal
of consistent representation of the semantics
(rules defining field contents like 1/11/99)
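The date in the Structure bullet shows why rules for field contents matter: 1/11/99 reads as 11 January under a month-first convention and as 1 November under a day-first one. A minimal sketch of normalizing such a value to ISO 8601 follows. The helper name `normalize_date` and the `day_first` flag are hypothetical, and the sketch assumes the two-digit year means 1999.

```python
from datetime import datetime

def normalize_date(raw: str, day_first: bool = False) -> str:
    """Return an ISO 8601 date (YYYY-MM-DD) for a string like '1/11/99'.

    The day_first flag stands in for a community's content rule; it is
    an assumption of this sketch, not part of any metadata standard.
    """
    fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
    return datetime.strptime(raw, fmt).date().isoformat()

us = normalize_date("1/11/99")                   # month-first reading
eu = normalize_date("1/11/99", day_first=True)   # day-first reading
print(us, eu)
```

The same string yields two different dates, so the content rule has to travel with the metadata or be fixed by the standard.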
7What is Metadata: Types & Uses
- lots of different ways of dividing the clusters
8Uses of Metadata
- Discovery & Retrieval
- Identification/Provenance
- Rights Management
- Viewing
- Integrity
- Longevity
- Content rating
9Containers and Packages of Metadata: Warwick, not
MARC
- modular
- overlapping
- extensible
- community-based
- designed for a networked world to aid commonality
btwn communities while still providing full
functionality within each community
10Some different schemes where Metadata is kept
- embedded within the object (HTML tags)
- in a separate related DB maintained by same
organization (OPAC, MOA II)
- in a separate DB maintained by a separate
organization (Books in Print, ratings systems)
- derived on-the-fly from a different scheme
(MARC-to-DC)
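For the first scheme in this list (metadata embedded within the object), a harvester only needs to read the tags out of the page. A minimal sketch using Python's standard `html.parser` follows. The sample page, the `DC.` name prefix, and the class name `MetaTagCollector` are assumptions made for illustration.

```python
from html.parser import HTMLParser

class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if "name" in attr and "content" in attr:
            self.found.append((attr["name"], attr["content"]))

# Hypothetical page, invented for illustration.
page = """<html><head>
<meta name="DC.Title" content="Sample Photo Album">
<meta name="DC.Creator" content="Example, Author">
</head><body>...</body></html>"""

collector = MetaTagCollector()
collector.feed(page)
embedded = collector.found
print(embedded)
```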
11Collaborative Metadata Projects
- Dublin Core
- NSF/ERCIM Digital Collaboratory
- OCLC CORC Project
- Visual Resources Association (VRA) Core
- Encoded Archival Description (EAD)
- Computerized Interchange of Museum Information
(CIMI)
- Records Export for Art and Cultural Heritage
(REACH)
12CORC--Cooperative Online Resource Catalog
- both bib records & webliographies (pathfinders)
- supports both AACR2/MARC and DC
- began 1/99, scheduled availability 7/00
- 100-200 participants
- Academic libraries
- OCLC networks, special libraries, public
libraries, state & national libraries, consortia
13Dublin Core (3/95)
- improve resource discovery
- anticipate precision problems of Web
Crawler-based searching tools
- existing metadata could be dumbed down
- elements should be simple to understand and use,
so that any individual should be able to assign
terms him/herself
- software might eventually automatically generate
very base-level metadata
14Dublin Core
- Title
- Creator
- Subject
- Description
- Publisher
- Contributors
- Date
- Type
- Format
- Identifier
- Source
- Language
- Relation
- Coverage
- Rights
15Dublin Core
- every element is both optional and repeatable
- elements are cross-disciplinary
- elements are extensible by organized communities
- can employ a syntax such as HTML's META
tagset for use by Spiders and Harvesters
- May 2000 DLF Metadata Harvesting Project
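Because every element is both optional and repeatable, one simple record shape is a mapping from element name to a list of values. A minimal sketch follows, assuming the fifteen element names as listed on the previous slide and a `DC.` name prefix in the generated tags. The function name `to_meta_tags` and the sample values are hypothetical.

```python
from html import escape

DC_ELEMENTS = {
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributors", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
}

def to_meta_tags(record: dict) -> list:
    """Render a record as HTML meta tags, one tag per value."""
    tags = []
    for element, values in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        for value in values:  # repeatable: emit every value
            tags.append(
                f'<meta name="DC.{element}" content="{escape(value)}">'
            )
    return tags

# Optional: most elements are simply absent. Repeatable: two subjects.
record = {
    "Title": ["Sample Photo Album"],
    "Subject": ["Railroads", "Photography"],
}
tags = to_meta_tags(record)
print("\n".join(tags))
```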
16DC Qualifiers
- allows one community to express important nuances
and qualifications, while still making the basic
importance available to communities with simple
needs - our community can reflect alternate title,
transliterated title, and main title, yet they
will all be found under a simple Web search under
title
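The behaviour described here, where qualified values remain findable under the plain element, can be sketched as a collapse from qualified names to unqualified ones. This is an illustration only: the dotted `Element.Qualifier` naming, the qualifier names, and the function names are assumptions, not a published qualifier scheme.

```python
def dumb_down(qualified_record: dict) -> dict:
    """Collapse qualified names such as 'Title.Alternative' to 'Title'."""
    simple = {}
    for name, values in qualified_record.items():
        element = name.split(".", 1)[0]  # drop the qualifier, if any
        simple.setdefault(element, []).extend(values)
    return simple

def search(record: dict, element: str, term: str) -> bool:
    """Case-insensitive substring match over one unqualified element."""
    return any(term.lower() in v.lower() for v in record.get(element, []))

# Hypothetical qualified record; the values are invented.
qualified = {
    "Title.Main": ["Sample Main Title"],
    "Title.Alternative": ["Another Name for the Work"],
    "Title.Transliterated": ["Sampl Mein Taitl"],
    "Creator": ["Example, Author"],
}
simple = dumb_down(qualified)
print(search(simple, "Title", "another name"))
```

A community with richer needs keeps the qualified record; a simple Web search sees only the collapsed one.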
17Discovery Metadata: Recent History
- Dublin Core (3/95)
- Warwick Framework (4/96)
- Image Metadata Workshop (9/96)
- Canberra, Helsinki, ... DC (98)
- Digital Library Collaboratory (97-)
- DC-8, Frankfurt 10/99
18Dublin Core--further work
- Warwick Framework
- metadata packages for extensible functions
- laid groundwork for RDF
- Canberra Qualifiers
- refining the semantics of the element set to
provide more precise info
- SUBELEMENT, SCHEME, LANG
- Granularity
- no hierarchical relationships w/i a given DC
record: only one record per discrete object
(collection or item-level), and a Relation
field plus qualifier links them
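The granularity rule can be sketched with flat records that point at each other instead of nesting. The record ids, titles, and the `IsPartOf` qualifier value below are illustrative assumptions.

```python
# Each discrete object gets its own flat record; hierarchy is expressed
# only through a qualified Relation value pointing at another record.
records = {
    "coll-1": {"Title": ["Sample Collection"], "Relation": []},
    "item-1": {"Title": ["Photo 1"], "Relation": [("IsPartOf", "coll-1")]},
    "item-2": {"Title": ["Photo 2"], "Relation": [("IsPartOf", "coll-1")]},
}

def members_of(collection_id: str) -> list:
    """Return ids of records that declare IsPartOf the given collection."""
    return sorted(
        rec_id
        for rec_id, rec in records.items()
        if ("IsPartOf", collection_id) in rec["Relation"]
    )

items = members_of("coll-1")
print(items)
```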
19The Research Process and Functional Categories
of Metadata
- Discovery
- Retrieval
- Collation
- Analysis
- Re-presentation
20Making of America II
- Background of the DLF Project
- Administrative Metadata
- Structural Metadata
21MOA2 Goal is Interoperability
22DLF Metadata for Interoperability Testbed: the
MOA II Project
- R&D
- Distributed Repositories
- Transportation, 1869-1900
- Testbed Project
- Best Practices
- Structural and administrative metadata
23Previous Projects/Background
- Library Standards Background
- UC Berkeley Background
- Finding Aids
- EAD
- SGML
- EAD Digital Archives
24MOA II Classes of Objects
- Continuous Tone Photos
- Photo Albums
- Diaries, journals, letterpress books
- Ledgers
- Correspondence
25MOA II Metadata
- Administrative Metadata
- for enhancing resource management
- Structural Metadata
- for reflecting internal hierarchies and
relationships btwn parts
- Raw/Seared/Cooked
26Administrative Metadata: to uniquely identify a
digital resource and manage it over time
- Information about where the various
pieces/versions of the object reside
- Information to view the digital object
- Information about the scanning process
27Structural Metadata: that which is relevant to
presentation of the digital object to the user
- metadata defining the "object": a book, a diary,
a photo album
- metadata defining the sub-objects: pages
(physical) or chapters and subheads (intellectual)
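The split between physical and intellectual sub-objects can be sketched as two parallel lists on one object, with a lookup that resolves a chapter to its page images. All class names, field names, and file names here are hypothetical; this is not the MOA II encoding.

```python
from dataclasses import dataclass, field

@dataclass
class Page:
    """Physical sub-object: one scanned page image."""
    sequence: int
    image_file: str

@dataclass
class Chapter:
    """Intellectual sub-object: a titled span of pages."""
    title: str
    first_page: int
    last_page: int

@dataclass
class DigitalObject:
    """The object as a whole, e.g. a book, a diary, a photo album."""
    object_type: str
    pages: list = field(default_factory=list)
    chapters: list = field(default_factory=list)

    def pages_in(self, chapter: Chapter) -> list:
        """Resolve an intellectual unit to its physical pages."""
        return [p for p in self.pages
                if chapter.first_page <= p.sequence <= chapter.last_page]

diary = DigitalObject(
    object_type="diary",
    pages=[Page(n, f"page{n:03d}.tif") for n in range(1, 7)],
    chapters=[Chapter("January", 1, 4), Chapter("February", 5, 6)],
)
february_files = [p.image_file for p in diary.pages_in(diary.chapters[1])]
print(february_files)
```

Keeping the two views separate lets a viewer page through the scans in order or jump by chapter without duplicating the image files.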
28SGML, XML, HTML
- TEI for structured humanities text
- EAD for Finding Aids
29Other Types of Metadata
- Longevity
- Identification/Provenance
- Rights Management
30NISO/DLF Image Metadata Workshop: Possible Goals
- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents
(headers)
31Image Metadata: Focus on Metadata that may prove
helpful for
- management
- use
- preservation
- ...
32Image Metadata: Break-out Groups Work Done
- Characteristics and Features of Images
- Image Production and Reformatting Features
- Image Identification and Integrity
33Other Metadata
- Description of depiction/surrogate (What VRA
calls its "Surrogate Categories")
- Description of original object
- Rights and Reproduction Information
- Location Information
34Data Structures: The VRA Core
- 28 elements specifically for visual resource
collections
- Work Description Categories
- Visual Document Description Categories
- http://www.oberlin.edu/art/vra/dsc.html
35VRA Core: Work Description Categories
- Work type
- Title
- Measurements
- Material
- Technique
- Creator
- Role
- Date
- Repository name
- Repository place
- Repository number
- Current site
- Original site
- Style/period/group/movement
- Nationality/culture
- Subject
- Related work
- Relationship type
- Notes
36VRA Core: Visual Document Description Categories
- Visual document type
- Visual document format
- Visual document measurements
- Visual document date
- Visual document owner
- Visual document owner number
- Visual document view description
- Visual document subject
- Visual document source
37Data Value Metadata (vocabularies)
- LCSH
- TGM
- AAT
- ULAN
- TGN
- VRA Core
38LCSH
39Thesaurus for Graphic Materials
- designed for subject indexing of pictorial
materials, particularly large general collections
of historical images
- for cataloging and retrieval
- good for general audiences and broad approaches
to the material
- TGM-I: Subject Terms; TGM-II: Genre and Physical
Characteristic Terms
- http://lcweb.loc.gov/rr/print/tgm/toc.html
40AAT
- 120,000 terms
- for describing objects, textual materials,
images, architecture, and material culture from
antiquity to present
- large and complex
- http://www.getty.edu/gri/vocabularies/
41ULAN
- name authority
- http://www.getty.edu/gri/vocabularies/
42Thesaurus of Geographic Names
- over 1 million records
- hierarchical and global
- throughout history
- most records include coordinates and descriptive
notes
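A hierarchical vocabulary lets a search for a broad place also find records indexed under narrower places. A minimal sketch follows, using a tiny hand-made hierarchy that is not drawn from the thesaurus itself. The function names are hypothetical.

```python
# Tiny hand-made hierarchy for illustration only; real thesaurus records
# carry identifiers, coordinates, variant names, and notes.
BROADER = {
    "Los Angeles": "California",
    "California": "United States",
    "United States": "North America",
    "North America": "World",
}

def ancestors(place: str) -> list:
    """Walk broader-term links from a place up to the top of the tree."""
    chain = []
    while place in BROADER:
        place = BROADER[place]
        chain.append(place)
    return chain

def matches(indexed_place: str, query_place: str) -> bool:
    """True if a record indexed at one place falls within the query place."""
    if indexed_place == query_place:
        return True
    return query_place in ancestors(indexed_place)

path = ancestors("Los Angeles")
print(path)
```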
43Metadata for Digital Commerce
44
- formal structure for describing and uniquely
identifying intellectual property itself, the
people and businesses involved in its trading,
and the agreements which they make about it
(primarily for publishing, music, and visual
arts)
- will develop high-level specifications for the
services that will be required to implement a
global IP trading system based on this
generic data model
- focus is on encoding rights at a high level, not
on resource discovery
- likely to involve metadata schema registration and
directory to allow interoperation of personal
identifiers for rightsholders and users
- supported by EEC DG-13
- First meeting July 1999
- http://www.indecs.org/
45Metadata Mapping
- Crosswalks
- Resource Description Framework (RDF)
46Crosswalks
- mapping btwn differing metadata structures
- eliminate the need for monolithic, universally
adopted standards
- focus on flexibility and interoperability
- RDF-based metadata registries
47Crosswalk Example
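A crosswalk can be sketched as a lookup table from one scheme's fields to another's. The mapping below is a simplified subset in the spirit of a MARC-to-DC crosswalk. The sample record, its flattening into (tag, value) pairs, and the function name `crosswalk` are assumptions made for illustration.

```python
# Simplified subset of a MARC-to-Dublin-Core mapping; a real crosswalk
# covers many more fields and works at the subfield level.
MARC_TO_DC = {
    "100": "Creator",
    "245": "Title",
    "260": "Publisher",
    "520": "Description",
    "650": "Subject",
}

def crosswalk(marc_fields: list) -> dict:
    """Map (tag, value) pairs to a DC record; unmapped tags are dropped."""
    dc = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc

# Hypothetical record, flattened to (tag, value) pairs for illustration.
marc_record = [
    ("100", "Example, Author"),
    ("245", "Sample Title"),
    ("650", "Railroads"),
    ("650", "Photography"),
    ("999", "local field with no DC equivalent"),
]
dc_record = crosswalk(marc_record)
print(dc_record)
```

The dropped local field shows the usual cost of mapping from a richer scheme to a simpler one: the crosswalk is lossy in that direction.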
48Resource Description Framework (RDF, spec
released 2/99)
- W3C Metadata activity
- designed to move the Web beyond simple links to
semantically-rich relationships btwn resources
- metadata application using XML as a common syntax
for exchange and processing
- flexible architecture for managing diverse
application-specific metadata packets that can be
processed by machines
- associates resources, property types, and
corresponding values
- http://www.w3.org/RDF/
49RDF
- Resources (character strings, names, digital
objects)
- Property (is the author of)
- Value
- resources, properties, relationships
- many different relationships can be reflected
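The resource/property/value model can be sketched as a list of three-part statements that are filtered in either direction. The URLs, names, and helper functions below are placeholders, not part of the RDF specification.

```python
from collections import namedtuple

# One statement: a resource, a property type, and a value.
Statement = namedtuple("Statement", ["resource", "property_type", "value"])

# Hypothetical statements; the URLs and names are placeholders.
statements = [
    Statement("http://example.org/doc1", "Creator", "Example, Author"),
    Statement("http://example.org/doc1", "Title", "Sample Document"),
    Statement("http://example.org/doc2", "Creator", "Example, Author"),
]

def values_of(resource: str, prop: str) -> list:
    """All values asserted for one property of one resource."""
    return [s.value for s in statements
            if s.resource == resource and s.property_type == prop]

def resources_with(prop: str, value: str) -> list:
    """All resources that carry a given property/value pair."""
    return [s.resource for s in statements
            if s.property_type == prop and s.value == value]

by_author = resources_with("Creator", "Example, Author")
print(by_author)
```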
50XML-encoded RDF
- [XML example; markup not preserved. Surviving
fragments: namespace declarations with prefix="RDF"
and prefix="DC", and the value "Howard Besser"]
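A minimal sketch of what an XML-encoded RDF description with a Dublin Core creator can look like, built with Python's standard `xml.etree.ElementTree`. The namespace URIs, the lowercase `rdf` and `dc` prefixes, the lowercase `creator` element name, and the placeholder resource URL are assumptions of this sketch, not a reconstruction of the slide's own markup.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for this sketch.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

# One description: a resource, one property, one value.
root = ET.Element(f"{{{RDF_NS}}}RDF")
desc = ET.SubElement(
    root,
    f"{{{RDF_NS}}}Description",
    {f"{{{RDF_NS}}}about": "http://example.org/doc1"},  # placeholder URL
)
creator = ET.SubElement(desc, f"{{{DC_NS}}}creator")
creator.text = "Howard Besser"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```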
51Should you start building with RDF today?
- Tools are primitive
- Standard still likely to evolve
52Metadata for Digital Asset Mgmt
- Howard Besser
- UCLA School of Education & Information
- Baca, Murtha (ed.). Introduction to Metadata. Los
Angeles: Getty Information Institute, 1998
- http://www.getty.edu/gri/standard/intrometadata/
- http://sunsite.berkeley.edu/Imaging/Databases/standards
- http://sunsite.berkeley.edu/moa2/
- http://sunsite.berkeley.edu/Longevity/
- http://www.ifla.org/II/metadata.htm
- http://purl.oclc.org/metadata/dublin_core/
- http://purl.oclc.org/corc/
- http://lcweb.loc.gov/ead/
- http://www.gseis.ucla.edu/howard/image-meta.html
- http://www.gseis.ucla.edu/howard/Metadata/UC-May00/
- http://sunsite.berkeley.edu/Metadata/sp2000.html