VRC: Preservation Risk Management for Web Resources

Transcript and Presenter's Notes

1
VRC: Preservation Risk Management for Web Resources
Nancy Y. McGovern, Information Day 2004
2
VRC Funding
  • Part of a 4(5)-year NSF-funded project
  • supported by the Digital Libraries Initiative,
    Phase 2 (Grant No. IIS-9905955, the Prism
    Project)
  • Also partially funded by a grant from
    The Andrew W. Mellon Foundation
  • Political Communications Web Archiving:
    http://www.crl.edu/content/PolitWeb.htm

3
Current Team
  • Anne R. Kenney, Research Advisor
  • Nancy Y. McGovern, Project Manager
  • Richard Entlich, Sr. Researcher
  • William R. Kehoe, Technology Coordinator
  • Ellie Buckley, Digital Research Specialist

4
Research
  • "Preservation Risk Management for Web Resources:
    Virtual Remote Control in Cornell's Project
    Prism"
  • by Kenney, McGovern, et al., in D-Lib Magazine,
    January 2002
  • http://www.dlib.org/dlib/january02/kenney/01kenney.html
  • "Virtual Remote Control: Building a Preservation
    Risk Management Toolbox for Web Resources"
  • by McGovern, Kenney, et al., in D-Lib Magazine,
    April 2004
  • http://www.dlib.org/dlib/april04/mcgovern/04mcgovern.html

5
Purpose
  • Risk management and records management
  • Passive (monitor) to active (capture)
  • Lifecycle support: selection to capture
  • Human (curator) and tool interaction
  • Structural and change models of resources
  • Promulgate preservation practices
  • Understand Web resources and risks

6
VRC Stages
  • Identification
  • Analysis
  • Appraisal
  • Strategy
  • Detection
  • Response

7
Human Tool Scenario
  • 1. Identification
  • Human: identify Web resources of interest
  • Toolbox: verify list, expand list
  • 2. Analysis
  • Toolbox: crawl sites, generate characterizations
  • Human: accept/revise characterizations
  • 3. Appraisal
  • Human: define/review attributes of value
  • Toolbox: support appraisal, capture results

8
Human Tool Scenario
  • 4. Strategy
  • Human: develop/review strategies
  • Toolbox: plot appraisals, compile strategies
  • 5. Detection
  • Human: define risk parameters
  • Toolbox: identify/assess risks, propose responses
  • 6. Response
  • Toolbox: propose risk response based on rules;
    automatic response for some risk categories
  • Human: monitor automated responses; select
    response based on recommended actions
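
The Response stage above can be sketched in code. This is a minimal illustration only: the risk categories, responses, and the RULES table are hypothetical, since the slides do not list the actual VRC rules. It shows a rule lookup in which some categories get an automatic response and everything else is queued for the human curator.

```python
from typing import NamedTuple

# Hypothetical rules, invented for illustration:
# risk category -> (proposed response, may be applied automatically?)
RULES = {
    "broken_link": ("recheck later", True),
    "content_changed": ("capture new copy", True),
    "site_unreachable": ("notify curator", False),
}

class Proposal(NamedTuple):
    risk: str
    response: str
    automatic: bool

def propose_response(risk):
    """Look up a rule; risks with no rule are always referred to the curator."""
    response, automatic = RULES.get(risk, ("refer to curator", False))
    return Proposal(risk, response, automatic)

def triage(risks):
    """Split proposals into automatic responses and ones awaiting human review."""
    proposals = [propose_response(r) for r in risks]
    auto = [p for p in proposals if p.automatic]
    manual = [p for p in proposals if not p.automatic]
    return auto, manual
```

Calling triage(["broken_link", "site_unreachable"]) would place the first risk in the automatic list and leave the second for the curator to decide.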

9
Risk Display Grid
10
Monitoring Layers
11
Web Crawling
  • traversing Web sites via links
  • a capability common to most tools, but with
    different purposes and results
  • the VRC toolkit needs more than just Web crawlers
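
As a rough illustration of "traversing Web sites via links", the sketch below does a breadth-first crawl using only the Python standard library. The SITE dictionary and the example.org URLs are made up and stand in for real HTTP fetches; a production crawler would also need politeness delays, robots.txt handling, and a same-site restriction, none of which is shown here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch):
    """Breadth-first traversal; fetch(url) returns HTML, or None if missing."""
    seen, order, queue = {start_url}, [], deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable page: a link checker would report this
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

# Hypothetical three-page site held in memory.
SITE = {
    "http://example.org/": '<a href="a.html">A</a> <a href="b.html">B</a>',
    "http://example.org/a.html": '<a href="/">home</a> <a href="missing.html">x</a>',
    "http://example.org/b.html": "<p>no links</p>",
}
print(crawl("http://example.org/", SITE.get))
```

The same traversal serves different purposes depending on what is done at each page: a link checker records the failed fetches, a site mapper records the edges, and a capture tool stores the HTML.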

12
VRC Toolbox
  • Identify tools for each stage (adopt, adapt,
    define, devise)
  • Leverage existing tools; apply them to longevity
  • Analyze steps - automated and manual
  • Formalize protocol
  • Provide a framework to map existing, plug gaps
    with developments

13
VRC Toolkit
  • Development steps
  • extensive literature review
  • development of tool categories
  • definition of categories and test protocols
  • survey existing tools for evaluation
  • select representative tools for testing
  • highlight findings in category summaries

14
Tool Categories
  • Link checkers
  • Site monitors
  • Web crawlers
  • Site managers
  • Change detectors
  • Site mappers (includes visualization)
  • HTML validators
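
To make one of these categories concrete, here is a minimal sketch of what a change detector does, assuming pages are compared by a hash of their raw bytes. The function names and the snapshot format are assumptions for illustration, not a description of any tool in the VRC inventory; real change detectors often compare structure or visible text rather than raw bytes.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """SHA-256 digest of a page's raw bytes."""
    return hashlib.sha256(content).hexdigest()

def detect_changes(previous, current):
    """Compare two {url: fingerprint} snapshots and classify each URL."""
    report = {"added": [], "removed": [], "changed": []}
    for url in sorted(set(previous) | set(current)):
        if url not in previous:
            report["added"].append(url)
        elif url not in current:
            report["removed"].append(url)
        elif previous[url] != current[url]:
            report["changed"].append(url)
    return report
```

Byte-level hashing flags any difference at all, including rotating ads or timestamps, so a curator would likely want a threshold or a content filter before treating a change as a risk.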

18
Strengths / Weaknesses
  • Positive evaluations of model
  • Addresses organizational context for method;
    link to selection
  • Leverage existing tools; gap analysis for missing
    ones
  • Move from method to toolkit; demonstrate
    applicability

19
Current Activities
  • VRC Preservation Risk Management Program:
  • Map stages to tool requirements
  • Apply to potential organizational scenarios
  • Enable risk/response scenario development
  • Toolkit:
  • Revise and finish populating tool inventory
  • Maintain VRC Control Site

20
Future Projects
  • Develop an approach for building a collection
    that captures Web blogs and other Internet
    communications
  • State Government Web site case study
  • Demonstrators for toolkit scenarios

21
Availability
  • VRC Test Site: public
  • http://irisresearch.library.cornell.edu/control/
  • Full access to Tool Inventory upon request
  • Contact: nm84_at_cornell.edu
  • http://irisresearch.library.cornell.edu/VRC/