Digital Collections: Storage and Access - PowerPoint PPT Presentation

Provided by: JonD6
Learn more at: https://dlib.indiana.edu

1
Digital Collections: Storage and Access
  • Jon Dunn
  • Assistant Director for Technology
  • IU Digital Library Program
  • jwd_at_indiana.edu

2
Storage
  • Why is storage an issue?
  • Space requirements
  • Persistence
  • Accessibility
  • Needs depend on purpose of storage
  • Capture/encoding
  • Access/delivery
  • Preservation

3
Storage: Working Space
  • Space for storage of digital files during
    capture/encoding/quality control process
  • Possibilities
  • PC hard drive
  • File server / LAN
  • Issues
  • Capacity, backup, speed, accessibility

4
Storage: Access/Delivery
  • Storage of derivative files for web delivery
  • Image, audio, video, text files, etc.
  • Possibilities
  • Local web server
  • Commercially-hosted web site
  • Consortial service provider
  • Issues: capacity, backup, performance, software
    integration, maintenance/migration

5
Storage: Preservation
  • Much harder problem
  • Longer term
  • Issues of longevity of media, hardware, file
    format
  • Where did we put the files?
  • Larger files
  • Hard disk storage, traditional backup methods not
    cost-effective
  • Infrequency of access
  • Problems do not become immediately evident

6
Long-Term Storage Options
  • Removable media stored offline
  • Optical
  • CD-R (CD-Recordable)
  • DVD-R (DVD-Recordable), DVD+R, DVD+RW, DVD-RW, ...
  • Tape
  • DLT, 8mm, DAT, ...
  • Pros: cheap, easy, produces tangible item
  • Cons: low capacity, physical space requirements,
    unknown longevity, migration, potential format
    obsolescence
  • Online/nearline storage systems
  • HSM: Hierarchical Storage Management
  • Combine disk and automated tape storage with
    software to keep track of where files are located
  • Locally managed or remote provider
  • Pros: high capacity, migration can be handled by
    software, ...
  • Cons: expensive, complex, network bandwidth
    issues, must trust service provider, potential
    single point of failure
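
The HSM idea above (disk and tape combined, with software keeping track of where each file lives) can be illustrated with a toy catalog. This is only a sketch under stated assumptions: the class names, the two tier labels, and the idle-time migration rule are hypothetical and are not taken from HPSS or any real HSM product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FileRecord:
    """Catalog entry for one file (all names here are hypothetical)."""
    name: str
    size_bytes: int
    last_access: float   # timestamp in seconds
    tier: str = "disk"   # "disk" (online) or "tape" (nearline)


class HsmCatalog:
    """Toy catalog that records which storage tier holds each file."""

    def __init__(self) -> None:
        self.records: Dict[str, FileRecord] = {}

    def store(self, name: str, size_bytes: int, now: float) -> None:
        # New files land on disk first.
        self.records[name] = FileRecord(name, size_bytes, now)

    def migrate_idle(self, now: float, max_idle_seconds: float) -> List[str]:
        # Move files that have been idle longer than the limit to tape.
        moved = []
        for rec in self.records.values():
            if rec.tier == "disk" and now - rec.last_access > max_idle_seconds:
                rec.tier = "tape"
                moved.append(rec.name)
        return moved

    def open(self, name: str, now: float) -> str:
        # Recall the file to disk if needed; report where it was found.
        rec = self.records[name]
        found_on = rec.tier
        rec.tier = "disk"
        rec.last_access = now
        return found_on


catalog = HsmCatalog()
catalog.store("page_0001.tif", 30_000_000, now=0.0)
catalog.store("page_0001.jpg", 200_000, now=0.0)
catalog.open("page_0001.jpg", now=90.0)
moved = catalog.migrate_idle(now=100.0, max_idle_seconds=50.0)
```

The user only ever asks for a file by name; whether it is recalled from tape first is the catalog's concern, which is the point of the HSM approach.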

9
HSM Example: IU's Massive Data Storage Service
(MDSS)
  • HPSS (High Performance Storage System) software
  • Developed as collaboration of IBM and US national
    labs
  • Four tape robots
  • 2 in Bloomington, 2 in Indianapolis
  • Data can be mirrored
  • 540 terabytes (TB) total storage
  • 75 TB used as of April 2001

10
A digital object is more than just a file!
Metadata
Delivery page image files (JPEG)
Hi-res page image files (TIFF)
Text file (TEI/XML)
11
A digital object is more than just a file!
EAD Finding Aid
12
DL Objects
  • Digital library objects have many parts
  • Metadata
  • Preservation/archival files
  • Delivery files
  • How do we keep them connected?
  • Now: good practice in file naming, directory
    organization, project documentation (not
    scalable!)
  • Future: digital object repository
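
As a rough illustration of how a repository could keep the parts of an object connected, the sketch below ties metadata, preservation files, and delivery files to one object identifier. The class names, identifier, and file paths are hypothetical examples, not the design of any particular repository system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """One digital library object and the parts that belong to it."""
    object_id: str
    metadata: Dict[str, str]
    preservation_files: List[str] = field(default_factory=list)
    delivery_files: List[str] = field(default_factory=list)


class ObjectRepository:
    """Toy in-memory repository keyed by object identifier."""

    def __init__(self) -> None:
        self._objects: Dict[str, DigitalObject] = {}

    def add(self, obj: DigitalObject) -> None:
        self._objects[obj.object_id] = obj

    def get(self, object_id: str) -> DigitalObject:
        return self._objects[object_id]

    def owner_of(self, path: str) -> Optional[str]:
        # Answer "which object does this file belong to?"
        for obj in self._objects.values():
            if path in obj.preservation_files or path in obj.delivery_files:
                return obj.object_id
        return None


repo = ObjectRepository()
repo.add(DigitalObject(
    object_id="example-0001",
    metadata={"title": "Example sheet music"},
    preservation_files=["hsm/example-0001/page_0001.tif"],
    delivery_files=["web/example-0001/page_0001.jpg"],
))
owner = repo.owner_of("web/example-0001/page_0001.jpg")
```

Here the link between files lives in one record rather than in naming conventions alone, so it does not depend on every project following the same directory layout.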

13
Data Persistence
  • Key is migration
  • Keeping the bits alive
  • Physical media
  • Logical media format
  • Keeping the bits understandable
  • File format
  • Metadata
  • Small pockets of digital content pose a problem
    for migration
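
One way to check that the bits survive a move between physical media is to compare checksums before and after the copy. The slides do not name a verification method, so the use of SHA-256 here is an assumption, and the function names and file paths are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(source: Path, destination: Path) -> str:
    """Copy a file to new storage and confirm the bits are unchanged."""
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"checksum mismatch migrating {source}")
    return after


# Small demonstration using temporary directories as stand-ins for media.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_media" / "master.tif"
    old.parent.mkdir()
    old.write_bytes(b"example bits")
    new = Path(tmp) / "new_media" / "master.tif"
    checksum = migrate_file(old, new)
    copied_ok = new.read_bytes() == b"example bits"
```

This covers only the "keeping the bits alive" half; keeping them understandable still depends on file format and metadata.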

14
DL Object Repository
Preservation version in HSM
Repository System
Users and applications
Delivery version(s) on web server
Metadata records
15
Web Delivery Functions
  • Searching
  • Metadata
  • Full text
  • Browsing
  • By subject, date, author, ...
  • Navigation
  • Page turning, image panning/zooming, ...
  • Streaming
  • For audio/video
  • Reuse
  • Downloading, format conversion
  • Linking, persistent naming
  • Access control
  • If necessary

16
Digital Collection Delivery Software
  • Very complex systems
  • Need to integrate data from databases, full-text
    search engines, file systems, and other sources
  • Cross-collection searching
  • Commercial
  • CONTENTdm, Luna Insight, various library
    management system add-ons
  • Open source
  • UMich DLXS, Greenstone, Eprints, MIT DSpace, ...
  • Homegrown

18
Demonstration
  • Hoagy Carmichael Collection, IU Digital Library
    Program
  • http://www.dlib.indiana.edu/collections/hoagy/

20
Exposing Digital Resources Broadly
  • Pay services
  • RLG Cultural Materials, Archival Resources
  • Free services
  • University of Michigan OAIster
  • www.oaister.org
  • UIUC Digital Gateway to Cultural Heritage
    Materials
  • oai.grainger.uiuc.edu
  • OAI-PMH
  • Open Archives Initiative Protocol for Metadata
    Harvesting
  • www.openarchives.org
  • Google

21
OAI Metadata Harvesting
  • Extract metadata from various sources
  • Build services on local copies of metadata

[Diagram: a user sends a query (e.g. "search for Indiana") to the
service provider. All searching, browsing, etc. are performed on the
service provider's local copy of the metadata, which is harvested
offline from each of the data providers.]
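
The harvesting model can be sketched in a few lines: build an OAI-PMH ListRecords request, parse the returned Dublin Core records into a local index, and search that index. This is a minimal sketch, not a complete harvester. The base URL, record identifiers, and titles are made-up examples, the response is an inline sample rather than a live fetch, and resumption tokens and error handling are omitted.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response from a data provider (trimmed for illustration).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:item-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Indiana sheet music</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:item-2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Michigan photographs</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the request a service provider sends to a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvest(xml_text: str) -> Dict[str, List[str]]:
    """Parse a ListRecords response into {identifier: [titles]}."""
    root = ET.fromstring(xml_text)
    index: Dict[str, List[str]] = {}
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text or "" for t in record.findall(".//dc:title", NS)]
        index[identifier] = titles
    return index


def search(index: Dict[str, List[str]], term: str) -> List[str]:
    """Search the local copy of the metadata, not the remote sites."""
    term = term.lower()
    return [oid for oid, titles in index.items()
            if any(term in title.lower() for title in titles)]


url = list_records_url("http://example.org/oai")
local_index = harvest(SAMPLE_RESPONSE)
hits = search(local_index, "Indiana")
```

The search touches only `local_index`, which mirrors the diagram: the data providers are contacted during harvesting, not when a user searches.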
22
More Information
  • Bibliography to be made available at
  • http://www.dlib.indiana.edu/workshops/alioct03/