Planning to Maximize Longevity of Digital Information

1
Planning to Maximize Longevity of Digital
Information
  • Howard Besser
  • UCLA School of Education & Information
  • http://www.gseis.ucla.edu/howard

2
Planning to Maximize Longevity of Digital Info
  • The Ecology Metaphor
  • Why are you Managing this Information?
  • Major Issues Facing Digital Projects
  • The Short Life of Digital Info
  • Important Planning Considerations
  • Key Considerations for Imaging Projects

3
The Ecology Metaphor

4
Why are you Managing this Information?
  • Organizational mission & type
  • Users
  • Uses

5
Major Issues Facing Digital Projects
  • Dangerous Changes in Intellectual Property Law
  • Intellectual Access
  • Storage
  • Delivery
  • Integration with other tools
  • Interoperability

6
Serious Longevity Problems
  • What we know from prior widespread digital file
    formats
  • Images separating from their metadata
  • Inaccessibility of software needed to view a work
  • Inability to even decode the file format of a work

7
The Short Life of Digital Info: Digital Longevity
Problems
  • Disappearing Information
  • The Viewing Problem
  • The Scrambling Problem
  • The Inter-relation Problem
  • The Custodial Problem
  • The Translation Problem

8
The Viewing Problem
  • Digital Info requires a whole infrastructure to
    view it
  • Each piece of that infrastructure is changing at
    an incredibly rapid rate
  • How can we ever hope to deal with all the
    permutations and combinations?

9
The Scrambling Problem: Dangers from
  • Compression to ease storage & delivery
  • Container Architecture to enhance digital commerce

10
The Inter-relation Problem
  • Info is increasingly inter-related to other info
  • How do we make our own Info persist when it
    points to and integrates with Info owned by
    others?
  • What is the boundary of a set of information (or
    even of a digital object)?

11
The Custodial Problem
  • In the past, much of survival was due to
    redundancy
  • How do we decide what to save?
  • Who should save it?
  • Mellon-funded E-Journal Archives
  • How should they save it?

12
The Custodial Problem: How to save information?
  • Methods for later access
  • Refreshing
  • Migration
  • Emulation
  • Issues of authenticity and evidence

13
The Translation Problem
  • Content translated into new delivery devices
    changes meaning
  • A photo vs. a painting
  • If Info is produced originally in digital form
    in one encoded format, will it be the same when
    translated into another format?
  • Behaviors

14
Pieces of the Solution (1/2)
  • We need to insist upon clearly readable,
    standardized ways for digital objects to
    self-identify their formats
  • We should discourage scrambling
  • We need to better understand how information
    inter-relates to other Info, and what constitutes
    the boundaries of Info objects
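
One way to picture format self-identification is the leading "magic" bytes that several common formats already carry. The sketch below is only an illustration, not part of the original slides: the function name `identify_format` and the small signature table are assumptions chosen for the example, and real identification tools cover far more formats.

```python
# Illustrative only: a small table of well-known leading "magic" bytes.
# The table and the function name are assumptions made for this sketch.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"%PDF-", "PDF"),
]


def identify_format(header: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for signature, name in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return name
    return "unknown"


if __name__ == "__main__":
    # A file that announces its own format can be recognized without
    # relying on a file name, an extension, or a separate catalog record.
    print(identify_format(b"GIF89a" + b"\x00" * 10))
    print(identify_format(b"no recognizable signature"))
```

A file whose first bytes match nothing in the table comes back as "unknown", which is the situation the slides warn about: the bits survive but nobody can tell what they are.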

15
Pieces of the Solution (2/2)
  • People and organizations wishing to make
    information persist need guidelines on how to go
    about doing it
  • We need to better understand how translating
    from one storage or display format to another
    affects the meaning of a work
  • We need to save the behaviors of a digital
    object, not just its contents

16
Conceptual Approaches to Digital Preservation
  • Refreshing -- always necessary due to volatility of
    physical strata
  • Impact on evidential value
  • Migration -- advantages & disadvantages
  • Emulation -- advantages & disadvantages
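
Refreshing, in the sense of copying the same bits onto fresh media, can be sketched as a copy followed by a checksum comparison. This is a minimal sketch under stated assumptions: the function names `sha256_of` and `refresh` are hypothetical, SHA-256 is simply one reasonable choice of checksum, and a real workflow would also record the digest alongside the object's metadata.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, destination: Path) -> str:
    """Copy source to destination and verify the copy is bit-identical.

    Returns the digest so it can be stored for later checks.
    """
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"refresh failed: checksum mismatch for {destination}")
    return expected
```

Because refreshing leaves the bit stream unchanged, a matching digest is also a simple piece of evidence that the copy is the same object. Migration changes the bits, so the same check cannot be used to compare a file before and after migration.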

17
To deal with Immediately
  • Persistent IDs
  • Metadata

18
Persistent IDs--the Problem
  • Need to separate work ID from work location
  • URNs probably won't be ready until 2003
  • Becomes a business process issue when one
    organization maintains the resource and another
    organization references it (i.e., licensed from
    vendors or managed by separate administrative
    structures)

19
More Persistent IDs--the Approach for today
  • PURLs
  • Handles
  • HTTP redirects
  • And worry about costs now and conversion costs
    when URNs become feasible
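
The common idea behind PURLs, Handles, and HTTP redirects is a lookup table that maps a stable ID to whatever the current location happens to be. The sketch below is a toy model of that idea, not the actual PURL or Handle software: the `Resolver` class, the `id:demo/...` identifiers, and the example.org URLs are all hypothetical.

```python
class Resolver:
    """Toy resolver: maps a stable work ID to its current location."""

    def __init__(self):
        self._locations = {}

    def register(self, persistent_id, url):
        """Record (or update) where the work currently lives."""
        self._locations[persistent_id] = url

    def resolve(self, persistent_id):
        """Answer the way an HTTP redirect would: (status, headers)."""
        url = self._locations.get(persistent_id)
        if url is None:
            return 404, {}
        return 302, {"Location": url}


if __name__ == "__main__":
    resolver = Resolver()
    resolver.register("id:demo/photo-1", "http://old.example.org/photos/1")
    print(resolver.resolve("id:demo/photo-1"))

    # The work moves; only the resolver's table changes.
    # Everyone who cited the ID keeps a working reference.
    resolver.register("id:demo/photo-1", "http://new.example.org/archive/1")
    print(resolver.resolve("id:demo/photo-1"))
```

The maintenance cost the slide mentions shows up here as the need for someone to keep the table current every time a resource moves.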

20
Data Set Management: More issues with referencing
IDs
  • References for mirror sites
  • References for back-up sites when main site is
    down or bottle-necked
  • References for off-site copies and archival copies
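
If one ID can point to several copies (main site, mirrors, back-up, archival), resolution becomes a matter of trying them in a preferred order. The following is a rough sketch under that assumption; the function `first_available` and the idea of passing in an `is_up` check are illustrative choices, and the URLs are placeholders.

```python
from typing import Callable, Iterable, Optional


def first_available(
    locations: Iterable[str], is_up: Callable[[str], bool]
) -> Optional[str]:
    """Return the first location that passes the availability check.

    `locations` is ordered by preference (main site, then mirrors,
    then back-up copies). `is_up` is supplied by the caller, so this
    sketch makes no network calls of its own.
    """
    for url in locations:
        if is_up(url):
            return url
    return None


if __name__ == "__main__":
    copies = [
        "http://main.example.org/set/42",
        "http://mirror.example.org/set/42",
        "http://backup.example.org/set/42",
    ]
    # Pretend the main site is down or bottle-necked.
    down = {"http://main.example.org/set/42"}
    print(first_available(copies, lambda url: url not in down))
```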

21
Metadata can be the first line of defense
  • Can tell you:
  • where the file is (if you can't find the file)
  • where more info about the file is (if you have
    the file but most other metadata has become
    separated)
  • what the file format is
  • what the compression scheme is
  • what application program and version is needed
    for the file
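
The items on this slide can be read as fields of a small record kept with (or near) each file. The sketch below is one possible shape for such a record, not a published metadata standard: the class name `PreservationMetadata`, its field names, and the sample values are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PreservationMetadata:
    """Hypothetical record mirroring the items on the slide."""

    file_location: str        # where the file is
    more_info_location: str   # where more info about the file is
    file_format: str          # what the file format is
    compression: str          # what the compression scheme is
    application: str          # application program needed for the file
    application_version: str  # and its version


def to_sidecar_json(record: PreservationMetadata) -> str:
    """Serialize the record as plain-text JSON that can travel with the file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    record = PreservationMetadata(
        file_location="http://archive.example.org/masters/0001.tif",
        more_info_location="http://archive.example.org/records/0001",
        file_format="TIFF",
        compression="none",
        application="ExampleViewer",
        application_version="1.0",
    )
    print(to_sidecar_json(record))
```

Keeping the record in a plain, human-readable form means that even if it is the only thing that survives with the file, a reader can still tell what the file is and where to look for the rest.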

22
Structural Metadata Issues
  • http://sunsite.berkeley.edu/moa2

23
Architecture Separating Longevity and Delivery
Servers

24
Groups Working on the Big Problem
http://sunsite.Berkeley.EDU/Longevity/
  • CPA Task Force
  • Getty Time Bits Conference Follow-ups
  • Emulation experiments in US and Europe
  • NEDLIB, CURL, Michigan
  • Mellon-funded E-Journal Archive experiments
  • Internet Archive
  • Long Now

25
Time Bits

26
Time Bits Participants
  • Stewart Brand
  • Howard Besser
  • Brian Eno
  • Danny Hillis
  • Peter Lyman
  • Brewster Kahle
  • Kevin Kelly
  • Jaron Lanier
  • Doug Carlston
  • John Heilemann
  • Ben Davis
  • Margaret MacLean
  • Bruce Sterling
  • Paul Saffo

27
Groups Working on Pieces of the Big Problem
http://sunsite.berkeley.edu/Longevity/
  • Internet Archive
  • Long Now
  • Emulation experiments in US and Europe
  • NEDLIB, CURL, Michigan

28
Journal Archiving
  • License, don't own; may not even be able to
    obtain the right to make an archival copy
  • Increasingly no paper back-up at all
  • Usually we don't have the important redundancy
    factor
  • Stanford's LOCKSS Project (Lots of Copies Keeps
    Stuff Safe) and its problems
    (http://lockss.stanford.edu)

29
Complexity of Rich Media
  • Works often have artistic nature (including video
    games)
  • Enormous number of elements can, at times, be
    very important to preserve (pacing, original
    artifact, elements used to construct the
    artifact)
  • Too complex to save every one of these aspects
    for every type of material
  • Importance of saving documentation

30
Important Planning Considerations
  • File Formats
  • Choosing Interoperable Systems
  • Adhere to standards
  • Vendors with large installed base
  • Refreshing and/or Migration

31
Key Considerations for Imaging Projects
  • Users' Needs
  • Image Quality
  • Intellectual Property
  • Standards
  • Topology
  • Tools & Processes

32
Key Considerations for Imaging Projects (1 of 3)
  • Users' Needs
  • Quality of Digital Surrogate
  • Interoperable desktop applications
  • Image Quality
  • Archival
  • Current online delivery

33
Key Considerations for Imaging Projects (2 of 3)
  • Intellectual Property
  • Standards
  • Modular and Layered Architecture
  • Terminology
  • Technical imaging information
  • Topology

34
Key Considerations for Imaging Projects (3 of 3)
  • Tools & Processes
  • Scanners
  • Compression techniques
  • Linking files
  • Workflow
  • Interoperable desktop applications

35
Some nuts-and-bolts Planning Considerations
  • Think about users (and potential users), uses,
    and type of material/collection
  • Scan at the highest quality that does not exceed
    the needs of the likely potential
    users/uses/material
  • Do not let today's delivery limitations influence
    your scanning file sizes; understand the
    difference between digital masters and derivative
    files used for delivery
  • Many documents which appear to be bitonal
    actually are better represented with greyscale
    scans
  • Include color bar and ruler in the scan
  • Use objective measurements to determine scanner
    settings (do NOT attempt to make the image good
    on your particular monitor or use image
    processing to color correct)
  • Don't use lossy compression
  • Store in a common (standardized) file format
  • Capture as much metadata as is reasonably
    possible (including metadata about the scanning
    process itself)
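
The advice against lossy compression for masters comes down to whether the original bits can be recovered. The sketch below contrasts the two cases on a stand-in for pixel data. It is an illustration only: `zlib` is used as one example of a lossless scheme, and the `quantize` function is a deliberately crude stand-in for a lossy step, not how any real image codec works.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress with zlib and decompress again; the result equals the input."""
    return zlib.decompress(zlib.compress(data, 9))


def quantize(data: bytes, step: int = 16) -> bytes:
    """Crude stand-in for a lossy step: round each byte down to a multiple of step.

    The discarded low-order detail cannot be recovered afterwards.
    """
    return bytes((value // step) * step for value in data)


if __name__ == "__main__":
    # Stand-in for greyscale pixel values of a digital master.
    master = bytes(range(256))

    print(lossless_round_trip(master) == master)  # master is fully recoverable
    print(quantize(master) == master)             # detail has been thrown away
```

A lossy derivative made for delivery is fine as long as the uncompressed or losslessly compressed master is what gets stored.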

36
One Final Question: Who will collect the digital
works of today that should become the Special
Collections of tomorrow?
  • web sites
  • zines
  • electronic journals
  • listserv and email discussions
  • drafts of works that later become famous

37
Planning to Maximize Longevity of Digital
Information
  • Howard Besser
  • UCLA School of Education & Information
  • http://sunsite.berkeley.edu/Longevity/
  • http://www.gseis.ucla.edu/howard
  • http://sunsite.berkeley.edu/moa2
  • http://lockss.stanford.edu
  • http://www.longnow.com/10klibrary/TimeBitsDisc/
  • http://www.archive.org/