Emerging Standards for Complex Works

1
Emerging Standards for Complex Works
  • Howard Besser
  • UCLA School of Education & Information Studies
  • http://www.gseis.ucla.edu/howard

2
Emerging Standards for Complex Works
  • Background & Context for Standards
  • MOA2 Structural & Administrative Metadata
  • NISO/DLF Technical Imaging Standards
  • Identification/provenance
  • Rich Media
  • Longevity

3
Key problems we're facing
  • Discovery
  • Longevity
  • Interoperability

4
Traditional Digital Library Model
5
Ideal Digital Library Model
6
For Interoperability, Digital Libraries Need
Standards
  • Descriptive Metadata for consistent description
  • Discovery Metadata for finding
  • Administrative Metadata for viewing and
    maintaining
  • Structural Metadata for navigation
  • ... Terms & Conditions Metadata for controlling
    access...

7
Why are Standards and Metadata consensus
important?
  • Managing digital files over time
  • Longevity
  • Interoperability
  • Veracity
  • Recording in a consistent manner
  • Will give vendors incentive to create
    applications that support this

8
Collaborative Metadata Projects
  • Dublin Core
  • NSF/ERCIM Digital Collaboratory
  • OCLC CORC Project
  • Visual Resources Association (VRA) Core
  • Encoded Archival Description (EAD)
  • Computerized Interchange of Museum Information
    (CIMI)
  • Records Export for Art and Cultural Heritage
    (REACH)

9
CORC--Cooperative Online Resource Catalog
  • both bib records & webliographies (pathfinders)
  • supports both AACR2/MARC and DC
  • began 1/99, scheduled availability 7/00
  • 100-200 participants
  • Academic libraries
  • OCLC networks, special libraries, public
    libraries, state & national libraries, consortia

10
Making of America II
  • Background of the DLF Project
  • Administrative Metadata
  • Structural Metadata

11
MOA2 Goal is Interoperability
  • Book example

12
DLF Metadata for Interoperability Testbed: the
MOA II Project
  • R&D
  • Distributed Repositories
  • Transportation, 1869-1900
  • Testbed Project
  • Best Practices
  • Structural and administrative metadata

13
Previous Projects/Background
  • Library Standards Background
  • UC Berkeley Background
  • Finding Aids
  • EAD
  • SGML
  • EAD Digital Archives

14
MOA II Classes of Objects
  • Continuous Tone Photos
  • Photo Albums
  • Diaries, journals, letterpress books
  • Ledgers
  • Correspondence

15
MOA II Metadata
  • Administrative Metadata
  • for enhancing resource management
  • Structural Metadata
  • for reflecting internal hierarchies and
    relationships btwn parts
  • Raw/Seared/Cooked

16
MOA II Behaviors
  • Navigation
  • Display/Print

17
MOA II Best practices
  • Use/Users/Collection
  • Benchmarking
  • Masters vs. Derivatives
  • Scanning
  • Administrative Metadata
  • Structural Metadata

18
Scanning Best Practices
  • Think about users (and potential users), uses,
    and type of material/collection
  • Scan at the highest quality that does not exceed
    the needs of the likely potential
    users/uses/material
  • Do not let today's delivery limitations influence
    your scanning file sizes; understand the
    difference between digital masters and derivative
    files used for delivery
  • Many documents which appear to be bitonal
    actually are better represented with greyscale
    scans
  • Include color bar and ruler in the scan
  • Use objective measurements to determine scanner
    settings (do NOT attempt to make the image good
    on your particular monitor or use image
    processing to color correct)
  • Don't use lossy compression
  • Store in a common (standardized) file format
  • Capture as much metadata as is reasonably
    possible (including metadata about the scanning
    process itself)

19
Why Scale is important
20
Administrative Metadata: to uniquely identify a
digital resource and manage it over time
  • Information about where the various
    pieces/versions of the object reside
  • Information to view the digital object
  • Information about the scanning process

21
Structural Metadata: that which is relevant to
presentation of the digital object to the user
  • metadata defining the "object": a book, a diary,
    a photo album
  • metadata defining the sub-objects: pages
    (physical) or chapters and subheads (intellectual)
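
As a rough illustration of the distinction on this slide, the sketch below models an object with physical sub-objects (pages) and intellectual sub-objects (chapters that point at page ranges). All class names, field names, and file names are hypothetical; they are not taken from the MOA II specification.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Physical sub-object: one scanned page image."""
    sequence: int      # position within the physical object
    image_file: str    # hypothetical file name of the page image


@dataclass
class Chapter:
    """Intellectual sub-object: a named span of pages."""
    title: str
    first_page: int
    last_page: int


@dataclass
class ComplexObject:
    """The 'object' itself: a book, a diary, a photo album."""
    object_type: str
    title: str
    pages: list = field(default_factory=list)
    chapters: list = field(default_factory=list)

    def pages_in(self, chapter):
        """Navigation helper: the pages that fall inside a chapter."""
        return [p for p in self.pages
                if chapter.first_page <= p.sequence <= chapter.last_page]


diary = ComplexObject("diary", "Example diary")
diary.pages = [Page(n, f"page_{n:04d}.tif") for n in range(1, 7)]
diary.chapters = [Chapter("January", 1, 4), Chapter("February", 5, 6)]
february_files = [p.image_file for p in diary.pages_in(diary.chapters[1])]
```

Keeping the intellectual structure as references into the physical sequence lets the same page images serve both kinds of navigation.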

22
SGML, XML, HTML
  • TEI for structured humanities text
  • EAD for Finding Aids

23
NISO/DLF Image Metadata Workshop: Possible Goals
  • Metadata fields
  • Rules for Field Contents (authority control)
  • Core set of necessary fields
  • Syntax for expressing fields and contents
    (headers)

24
Image Metadata: Focus on Metadata that may prove
helpful for
  • management
  • use
  • preservation
  • ...

25
Image Metadata: Break-out Groups & Work Done
  • Characteristics and Features of Images
  • Image Production and Reformatting Features
  • Image Identification and Integrity

26
NISO/DLF Image Metadata Workshop (4/99): Image
Technical Information, Possible Goals
  • Metadata fields
  • Rules for Field Contents (authority control)
  • Core set of necessary fields
  • Syntax for expressing fields and contents
    (headers)

29
Image Metadata Elements for Data Dictionary: Data
Dictionary Entries
  • Element Name
  • Definition (short) of the element name
  • Is the element required? (Identified as
    Mandatory, Mandatory if Applicable, Recommended,
    Optional)
  • How is the value of the element represented?
  • Examples
  • When is this data collected?
  • What is the purpose of this data?
  • Who would the identified users be?
  • How is the metadata used?
  • What other metadata standards reference it?
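
One way to picture a data dictionary entry of the kind listed above is as a small record type plus a check for missing required elements. This is only a sketch: the element names, definitions, and requirement levels assigned below are invented for illustration and do not come from the NISO/DLF data dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class Requirement(Enum):
    MANDATORY = "Mandatory"
    MANDATORY_IF_APPLICABLE = "Mandatory if Applicable"
    RECOMMENDED = "Recommended"
    OPTIONAL = "Optional"


@dataclass
class DictionaryEntry:
    element_name: str
    definition: str
    requirement: Requirement
    representation: str   # how the value of the element is represented
    examples: tuple
    purpose: str


def missing_mandatory(entries, record):
    """Names of Mandatory elements that a metadata record does not supply."""
    return [e.element_name for e in entries
            if e.requirement is Requirement.MANDATORY
            and e.element_name not in record]


# Hypothetical entries; names and levels are assumptions for the example.
entries = [
    DictionaryEntry("Compression", "Compression scheme applied to the image file",
                    Requirement.MANDATORY, "controlled term",
                    ("none", "LZW"), "preservation"),
    DictionaryEntry("Rationale", "Why the image was produced this way",
                    Requirement.OPTIONAL, "free text",
                    ("access copy",), "management"),
]
record = {"Rationale": "access copy"}
gaps = missing_mandatory(entries, record)
```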

30
Image Metadata Elements for Data
Dictionary: Characteristics and Features Element
List
  • Format Issues
  • Resolution Issues
  • Encoding
  • Compression
  • Others

31
Image Metadata Elements for Data Dictionary:
Image Production Element List (Pertaining to the
Image)
  • In-image target(s)
  • System target(s), associated with the object
  • Responsible agent
  • Rationale
  • Hardware
  • Software

32
Image Metadata Elements for Data Dictionary:
Image Production Element List (Pertaining to the
Process)
  • Format of the image
  • Intrinsic characteristics of the image
  • Identification
  • Provides a means for defining methodology
    including documentation and rationale
  • Who is involved with the file?
  • Who created the image file?
  • Who commissioned the creation of the image file
    (i.e., the chartering entity), as opposed to Who
    is the responsible agency? Who is the owner?
  • Where
  • What
  • When: necessary dates including capture
    date/time, modification
  • Checksum
  • Navigational aid
  • Encoding tools
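
The checksum and date elements above could be recorded along the following lines. This is a minimal sketch under stated assumptions: the slide names only "Checksum", so the choice of SHA-256 is mine, and the field names and the agent name are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def describe_file(data: bytes, created_by: str) -> dict:
    """Build a small integrity record for the bytes of an image file."""
    return {
        "checksum_algorithm": "SHA-256",   # assumed algorithm, not from the slide
        "checksum": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "created_by": created_by,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def is_intact(data: bytes, record: dict) -> bool:
    """True when the bytes still match the checksum stored in the record."""
    return hashlib.sha256(data).hexdigest() == record["checksum"]


original = b"stand-in bytes for an image file"
record = describe_file(original, "Example Imaging Lab")
```

Re-running `is_intact` after each copy or migration gives a simple test of whether the file has changed since the record was made.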

33
Image Metadata: NISO/DLF Image Metadata
In Progress
  • Data Dictionary for both Characteristics &
    Features and for Image Production Elements due
    end of 6/00

34
Finding Image Origins
35
Identification/Provenance (Images)
  • The number of variant forms of a work can be
    enormous
  • Image Families
  • A digital image frequently has many layers of
    parentage
  • Information about the parentage that can indicate
    the quality and veracity of the image (Dublin
    Core "Source" and "Relation")
  • how to deal with different versions derived from
    the same scan or different encoding schemes
  • Vocabulary Standards to express this

36
The number of variant forms of a work can be
enormous
  • different views of the same object
  • different lighting of the same object
  • different scans of the same photo
  • different resolutions
  • different compression schemes
  • different compression ratios
  • different file storage formats
  • different details of the same image
  • ...

37
Image Families
38
Identification/Provenance
  • how to deal with different versions (browse,
    hi-res, medium res) derived from the same scan or
    different encoding schemes (TIFF, PICT, JFIF)
  • Vocabulary Standards to express this
  • VRA Surrogate Categories
  • CIMI's "Image Elements"
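
To make the idea of image families and layers of parentage concrete, the sketch below links each version to the version it was derived from, in the spirit of a "Source" relation, and walks the chain back to the master. The class, the identifiers, and the role labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageVersion:
    identifier: str
    role: str                                  # e.g. "master", "hi-res", "browse"
    file_format: str                           # e.g. "TIFF", "JFIF"
    source: Optional["ImageVersion"] = None    # immediate parent, if any

    def lineage(self):
        """Identifiers from this version back to its oldest known ancestor."""
        chain, node = [], self
        while node is not None:
            chain.append(node.identifier)
            node = node.source
        return chain


master = ImageVersion("scan-001", "master", "TIFF")
hi_res = ImageVersion("scan-001-hi", "hi-res", "JFIF", source=master)
browse = ImageVersion("scan-001-browse", "browse", "JFIF", source=hi_res)
family_path = browse.lineage()
```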

39
Other Metadata
  • Description of depiction/surrogate (What VRA
    calls its "Surrogate Categories")
  • Description of original object
  • Rights and Reproduction Information
  • Location Information

40
Metadata for Digital Commerce
  • DOI
  • <indecs>

41
<indecs>
  • formal structure for describing and uniquely
    identifying intellectual property itself, the
    people and businesses involved in its trading,
    and the agreements which they make about it
    (primarily for publishing, music, and visual
    arts)
  • will develop high-level specifications for the
    services that will be required to implement a
    global IP trading system based on this <indecs>
    generic data model
  • focus is on encoding rights at a high level, not
    on resource discovery
  • likely to involve metadata schema registration and
    directory to allow interoperation of personal
    identifiers for rightsholders and users
  • supported by EEC DG-13
  • First meeting July 1999
  • http://www.indecs.org/

42
Problems & Potentials of Rich Media
  • Types of Rich Media
  • Technologies and problems
  • Opportunities--a scenario
  • Metadata
  • Indexing

43
Some Types of Rich Media
  • Moving image materials
  • Multimedia
  • Interactive programs
  • Computer art

44
After an uphill battle, tech and Tinseltown find
common ground (USA Today, 3/3/00)
45
Projected Changes: Prospect of digitized movies
already has some mourning loss of film (SF
Chronicle, 3/5/00)
46
Video Technology to Make the Head Spin (NYT
3/2/00)
47
ECI - Hole in Space (both)
48
ECI - 84-locations
49
ECI - 84-Community Memory
50
ECI - 84-MOCA
51
ECI - Avatars Humans
52
ECI - Avatar Stage
53
Complexity of Rich Media
  • Works often have artistic nature (including video
    games)
  • Enormous number of elements can, at times, be
    very important to preserve (pacing, original
    artifact, elements used to construct the
    artifact)
  • Too complex to save every one of these aspects
    for every type of material
  • Importance of saving documentation

54
Rich Media Technologies
  • Streaming media vs. Downloaded files
  • Bandwidth and compression
  • Need to offload functions onto clients

55
The Inter-relation Problem
  • -Info is increasingly inter-related to other info
  • -How do we make our own Info persist when it
    points to and integrates with Info owned by
    others?
  • -What is the boundary of a set of information (or
    even of a digital object)?

56
The Translation Problem
  • Content translated into new delivery devices
    changes meaning
  • -A photo vs. a painting
  • -If Info is produced originally in digital form
    in one encoded format, will it be the same when
    translated into another format?
  • Behaviors

57
Problems of Rich Media
  • Complexity of formats (storage & compression)
  • Synchronicity between media/streams
  • Pieces and Boundaries
  • Persistent IDs
  • Interactivity
  • Historical context
  • Content
  • Recontextualization (Postmodernism)

58
Opportunities--a scenario
  • Huge stable online DB of rich media (Prelinger
    Archives)
  • Creators create new works that consist mainly of
    links to and transitions btwn pieces of the rich
    media DB
  • Works are not really assembled until run-time
  • Securing IP permission may shift from
    capital-intensive producer to end-user
  • Economics of media production may change
    drastically

59
Structural Metadata for Complex Objects
  • MPEG 4
  • SMIL

60
Synchronized Multimedia Integration Language
(SMIL)
  • For repurposing and reuse in different ways
  • Use XML to reference various pieces in different
    ways
  • Supported by Realmedia but not Microsoft or
    Macromedia
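
As a sketch of what "use XML to reference various pieces" can look like, the code below builds a tiny SMIL document in which `seq` plays its children one after another and `par` plays them together. The element names follow SMIL 1.0; the file paths are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Build <smil><body><seq>...</seq></body></smil>
smil = ET.Element("smil")
body = ET.SubElement(smil, "body")
seq = ET.SubElement(body, "seq")                 # children play in sequence

ET.SubElement(seq, "video", src="clips/opening.rm")        # hypothetical clip

par = ET.SubElement(seq, "par")                  # children play in parallel
ET.SubElement(par, "video", src="clips/interview.rm")      # hypothetical clip
ET.SubElement(par, "text", src="captions/interview.rt")    # hypothetical captions

document = ET.tostring(smil, encoding="unicode")
```

Because the document only points at the media files, the same clips can be reused in a different order or combination by writing a different SMIL file.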

61
MPEG 4
  • Object-oriented
  • Very low level of granularity (even objects vs
    backgrounds)
  • Scalable bandwidth use
  • Binary Format for Scenes (BIFS) borrows concepts
    from VRML

62
Indexing of Moving Image Materials
  • Whole works vs. parts of Works
  • MPEG 7
  • Approaches to segmentation & thumbnail
    representation
  • Closed caption indexing
  • Audio description indexing
  • Semiotics

63
Other Types of Metadata
  • Longevity
  • Identification/Provenance
  • Rights Management

64
The Short Life of Digital Info: Digital Longevity
Problems
  • Disappearing Information
  • The Viewing Problem
  • The Scrambling Problem
  • The Inter-relation Problem
  • The Custodial Problem
  • The Translation Problem

65
The Viewing Problem
  • Digital Info requires a whole infrastructure to
    view it
  • Each piece of that infrastructure is changing at
    an incredibly rapid rate
  • How can we ever hope to deal with all the
    permutations and combinations

66
The Scrambling Problem: Dangers from
  • Compression to ease storage & delivery
  • Container Architecture to enhance digital commerce

67
The Inter-relation Problem
  • -Info is increasingly inter-related to other info
  • -How do we make our own Info persist when it
    points to and integrates with Info owned by
    others?
  • -What is the boundary of a set of information (or
    even of a digital object)?

68
The Custodial Problem
  • How do we decide what to save?
  • Who should save it?
  • How should they save it?
  • -methods for later access: emulation, migration,
    etc.
  • -issues of authenticity and evidence

69
The Translation Problem
  • Content translated into new delivery devices
    changes meaning
  • -A photo vs. a painting
  • -If Info is produced originally in digital form
    in one encoded format, will it be the same when
    translated into another format?
  • Behaviors

70
Pieces of the Solution (1/2)
  • -We need to insist upon clearly readable
    standardized ways for digital objects to
    self-identify their formats
  • -We should discourage scrambling
  • -We need to better understand how information
    inter-relates to other Info, and what constitutes
    boundaries of Info objects

71
Pieces of the Solution (2/2)
  • -People and organizations wishing to make
    information persist need guidelines of how to go
    about doing it
  • -We need to better understand how translating
    from one storage or display format to another
    affects the meaning of a work
  • -We need to save the behaviors of a digital
    object, not just its contents

72
Metadata can be the first line of defense
  • Can tell you
  • where the file is (if you can't find the file)
  • where more info about the file is (if you have
    the file but most other metadata has become
    separated)
  • what the file format is
  • what the compression scheme is
  • what application program and version is needed
    for the file

73
Groups Working on the Big Longevity Problem
http://sunsite.Berkeley.EDU/Imaging/Databases/Longevity/
  • CPA Task Force
  • CPA Study Group
  • Getty Time Bits Conference
  • Internet Archive
  • Long Now

74
Migration/Refreshing
  • Impact on evidential value

75
Emerging Standards for Complex Works
  • Howard Besser
  • UCLA School of Education & Information Studies
  • http://www.gseis.ucla.edu/howard/image-meta.html
  • http://sunsite.Berkeley.EDU/moa2/
  • http://www.gseis.ucla.edu/howard/Classes/287-moving.html
  • http://www.gseis.ucla.edu/howard/Classes/287-mov-index-bib.html
  • http://www.gseis.ucla.edu/howard/Metadata/UC-May00/
  • http://www.getty.edu/gri/standard/intrometadata/
  • http://sunsite.Berkeley.EDU/Imaging/Databases/standards
  • http://sunsite.Berkeley.EDU/Longevity/
  • http://www.ifla.org/II/metadata.htm
  • http://purl.oclc.org/metadata/dublin_core/
  • http://purl.oclc.org/corc/
  • http://lcweb.loc.gov/ead/
  • http://sunsite.berkeley.edu/Metadata/sp2000.html

76
Data Structures: The VRA Core
  • 28 elements specifically for visual resource
    collections
  • Work Description Categories
  • Visual Document Description Categories
  • http://www.oberlin.edu/art/vra/dsc.html

77
VRA Core: Work Description Categories
  • Work type
  • Title
  • Measurements
  • Material
  • Technique
  • Creator
  • Role
  • Date
  • Repository name
  • Repository place
  • Repository number
  • Current site
  • Original site
  • Style/period/group/movement
  • Nationality/culture
  • Subject
  • Related work
  • Relationship type
  • Notes

78
VRA Core: Visual Document Description Categories
  • Visual document type
  • Visual document format
  • Visual document measurements
  • Visual document date
  • Visual document owner
  • Visual document owner number
  • Visual document view description
  • Visual document subject
  • Visual document source

79
Thesaurus for Graphic Materials
  • designed for subject indexing of pictorial
    materials, particularly large general collections
    of historical images
  • for cataloging and retrieval
  • good for general audiences and broad approaches
    to the material
  • TGM-I: Subject Terms; TGM-II: Genre and Physical
    Characteristic Terms
  • http://lcweb.loc.gov/rr/print/tgm/toc.html

80
AAT
  • 120,000 terms
  • for describing objects, textual materials,
    images, architecture, and material culture from
    antiquity to present
  • large and complex
  • http://www.getty.edu/gri/vocabularies/

81
ULAN
  • name authority
  • http://www.getty.edu/gri/vocabularies/

82
Thesaurus of Geographic Names
  • over 1 million records
  • hierarchical and global
  • throughout history
  • most records include coordinates and descriptive
    notes

83
Semantics/Syntax/Structure
  • Semantics
  • meaning, as defined by a community to meet their
    particular needs (DC)
  • Syntax
  • a systematic arrangement of data elements for
    machine processing
  • facilitates the exchange and use of metadata
    among various applications (HTML, XML, RDF)
  • Structure
  • a formal arrangement of the syntax with the goal
    of consistent representation of the semantics
    (rules defining field contents like 1/11/99)
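
The "1/11/99" example shows why rules for field contents matter: the same string reads as two different dates depending on the convention. The sketch below parses it both ways. Normalizing to ISO 8601 is my own choice for the illustration, not something the slide prescribes.

```python
from datetime import datetime

raw = "1/11/99"

# Month-first reading (common in the US): 11 January 1999
month_first = datetime.strptime(raw, "%m/%d/%y").date()

# Day-first reading (common elsewhere): 1 November 1999
day_first = datetime.strptime(raw, "%d/%m/%y").date()

# An unambiguous representation once the convention is known
normalized = month_first.isoformat()
```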

84
Metadata Mapping
  • Crosswalks
  • Resource Description Framework (RDF)

85
Crosswalks
  • mapping btwn differing metadata structures
  • eliminate the need for monolithic, universally
    adopted standards
  • focus on flexibility and interoperability
  • RDF-based metadata registries
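
In code, a crosswalk can be as simple as a lookup table from one scheme's element names to another's. The mapping below is illustrative only: it borrows a few element names from the VRA Core and Dublin Core, but it is not the published crosswalk between them.

```python
# Illustrative mapping only; not the published VRA Core to Dublin Core crosswalk.
VRA_TO_DC = {
    "Title": "Title",
    "Creator": "Creator",
    "Date": "Date",
    "Subject": "Subject",
    "Work type": "Type",
    "Related work": "Relation",
}


def crosswalk(record, mapping):
    """Translate element names; keep whatever has no target separately."""
    converted, unmapped = {}, {}
    for name, value in record.items():
        target = mapping.get(name)
        if target is None:
            unmapped[name] = value          # no equivalent in the target scheme
        else:
            converted.setdefault(target, []).append(value)
    return converted, unmapped


vra_record = {
    "Title": "Example photograph",
    "Creator": "Unknown",
    "Technique": "albumen print",
}
dc_record, leftovers = crosswalk(vra_record, VRA_TO_DC)
```

Returning the unmapped elements makes the loss visible: a richer scheme rarely maps completely onto a simpler one.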

86
Crosswalk Example
87
Resource Description Framework (RDF, spec
released 2/99)
  • W3C Metadata activity
  • designed to move the Web beyond simple links to
    semantically-rich relationships btwn resources
  • metadata application using XML as a common syntax
    for exchange and processing
  • flexible architecture for managing diverse
    application-specific metadata packets that can be
    processed by machines
  • associates resources, property types, and
    corresponding values
  • http://www.w3.org/RDF/

88
RDF
  • Resources (character strings, names, digital
    objects)
  • Property (is the author of)
  • Value
  • resources + properties = relationships
  • many different relationships can be reflected

89
XML-encoded RDF
  • <?xml:namespace ns="http://www.w3.org/RDF/RDF"
    prefix="RDF" ?>
  • <?xml:namespace ns="http://purl.oclc.org/DC/"
    prefix="DC" ?>
  • <RDF:RDF>
  • <RDF:Description>
  • <DC:Creator>Howard Besser</DC:Creator>
  • </RDF:Description>
  • </RDF:RDF>
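
For comparison, here is a sketch of producing the same kind of statement (a resource, a creator property, and a value) with Python's standard library. It uses the namespace URIs from the later RDF and Dublin Core specifications, which differ from the draft-era declarations on the slide. The resource URI is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_uri, creator):
    """Serialize one statement: <resource> dc:creator "creator"."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": resource_uri},
    )
    ET.SubElement(description, f"{{{DC_NS}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")


# "http://example.org/slides" is a placeholder URI for illustration.
rdf_xml = describe("http://example.org/slides", "Howard Besser")
```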
