Breaking down the walls - PowerPoint PPT Presentation

About This Presentation
Title:

Breaking down the walls


Slides: 46
Provided by: carll8

Transcript and Presenter's Notes

Title: Breaking down the walls


1
Breaking down the walls
  • Moving libraries from collectors to portals

Carl Lagoze
Cornell University
lagoze_at_cs.cornell.edu
2
The Library should selectively adopt the portal
model for targeted program areas. By creating
links from the Library's Web site, this approach
would make available the ever-increasing body of
research materials distributed across the
Internet. The Library would be responsible for
carefully selecting and arranging for access to
licensed commercial resources for its users, but
it would not house local copies of materials or
assume responsibility for long-term
preservation.
LC21: A Digital Strategy for the Library of Congress, page 5
3
LC21: A Digital Strategy for the Library of Congress, page 5
4
Towards a Virtual Control Zone
Some of the most fundamental aspects of library
operations entail the existence of a border,
across which objects of information are
transferred and maintained. Such a perimeter,
demarcating a single, distributed digital library
(the "control zone"), needs to be created and
managed by the academic library community at the
earliest opportunity.
Ross Atkinson, Library Quarterly, 1996
5
Why distributed collections?
  • Scale of the Web
  • Prevalence of new publishing models and agents
  • Increasing complexity of licensing and access
    management
  • Dynamic nature of content

6
Towards Hybrid Portals
  • Traditional portal (e.g., Yahoo!)
    • linkage without responsibility
  • Hybrid Portal
    • assertion of (some semblance of) a curatorial
      role over linked objects

7
New models have cultural/organizational
ramifications
  • Performance and ranking metrics: "bigger is
    better"
  • Levels of confidence
  • Trust

8
that can be assisted by new technical foundations
  • Digital object architectures
    • that enable aggregating and customizing content
      for local access and management
  • Metadata frameworks
    • that model changes in objects in time and loci
      of responsibility
  • OAI Harvesting Protocol
    • for exchange of structured information
  • Preservation models
    • that enable non-cooperative and cooperative
      offsite monitoring

9
Digital Object Architectures
  • Acknowledgements
  • Naomi Dushay
  • Sandy Payette
  • Thornton Staples (U. Va.)
  • Ross Wayland (U. Va.)

10
From Mediators to Value-Added Surrogates
  • Wiederhold: mediators between raw data and
    end-user applications for integration and
    transformation
  • Paepcke: mediators as foundation for digital
    library interoperability
  • Payette and Lagoze: mediators (V-A surrogates)
    to aggregate and create a localized service layer
    for distributed resources

11
FEDORA Digital Object Model
12
Establishing a Virtual Control Zone
13
V-A Surrogate Applications
  • Access management
    • Shared responsibility among trusted partners
  • Enhanced and customized functionality
    • Examples: reference linking, format translation,
      special needs
  • Preservation
    • Monitoring "significant" events and acting on
      them

14
  • DigitalObject A
  • View Slides
  • View Video
  • View synchronized presentation using applet

Context Broker A
15
Context Broker A
16
Where we are now
  • Ongoing FEDORA reference prototype
  • http://www.cs.cornell.edu/cdlrg/FEDORA.html
  • Policy enforcement research
  • Content mediation
  • Proposed joint deployment with University of
    Virginia
  • Open source scalable implementation of FEDORA
    architecture
  • Testing and deployment with a number of research
    library partners.

17
Event-Aware Metadata Frameworks
  • Acknowledgements
  • Dan Brickley (ILRT, Bristol)
  • Martin Doerr (FORTH, Crete)
  • Jane Hunter (DSTC, Brisbane)

18
Distributed Content: The Metadata Challenge
  • From fixed, contained physical artifacts to fluid,
    distributed digital objects
  • Need for basis of trust and authenticity in
    network environment
  • Decentralization and specialization of resource
    description and need for mapping formalisms

19
Multi-entity nature of object description
20
Attribute/Value approaches to metadata
The playwright of Hamlet was Shakespeare
[Diagram: Hamlet, has a creator, Shakespeare]
21
run into problems for richer descriptions
The playwright of Hamlet was Shakespeare, who was
born in Stratford
[Diagram labels: Hamlet, has a creator, birthplace, Stratford]
22
because of their failure to model entity
distinctions
[Diagram labels: nodes R1 and R2; properties title, creator, name,
birthplace; values Hamlet, Shakespeare, Stratford]
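The contrast between the two approaches can be made concrete with a small sketch. It is only an illustration, not the ABC/Harmony model: reading R1 as the work and R2 as the person is an assumption about the diagram, and the names `flat_record`, `triples` and `values` are invented here.

```python
# Flat attribute/value record: every value hangs off the one described resource.
flat_record = {
    "title": "Hamlet",
    "creator": "Shakespeare",
    "birthplace": "Stratford",  # ambiguous: reads as the birthplace of Hamlet
}

# Entity-aware description as (subject, property, value) triples.
# Assumption: R1 stands for the work and R2 for the person.
triples = [
    ("R1", "title", "Hamlet"),
    ("R1", "creator", "R2"),
    ("R2", "name", "Shakespeare"),
    ("R2", "birthplace", "Stratford"),
]


def values(subject, prop):
    """Return all values of `prop` attached to `subject`."""
    return [v for s, p, v in triples if s == subject and p == prop]


creator = values("R1", "creator")[0]
print(values(creator, "birthplace"))  # the birthplace belongs to the creator
print(values("R1", "birthplace"))     # the work itself has none
```

Giving the creator its own node is what lets "birthplace" attach to the right entity; the flat record has nowhere to put it except on the work.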
23
ABC/Harmony Event-aware metadata model
  • Recognizing inherent lifecycle aspects of
    description (esp. of digital content)
  • Modeling incorporates time (events and
    situations) as first-class objects
  • Supplies clear attachment points for agents,
    roles, occurrent properties
  • Resource description as a story-telling activity
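
One way to picture events as first-class objects is the sketch below. It is a loose illustration under stated assumptions rather than the ABC/Harmony vocabulary: the `Event` class, its field names and the sample events are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """An event is a node in its own right, not a property of a resource."""
    kind: str
    time: str
    place: str
    # role -> agent: the attachment point for agents and their roles
    participants: dict = field(default_factory=dict)
    inputs: list = field(default_factory=list)    # resources the event used
    outputs: list = field(default_factory=list)   # resources the event produced


def story(events):
    """Order events by time so the description reads as a story."""
    ordered = sorted(events, key=lambda e: e.time)
    return [f"{e.time}: {e.kind} at {e.place}" for e in ordered]


# Hypothetical lifecycle of one resource.
events = [
    Event("digitization", "2000-06", "Library B",
          {"scanner operator": "Agent Y"}, ["manuscript"], ["tiff-image"]),
    Event("creation", "1999-01", "Studio A",
          {"author": "Agent X"}, [], ["manuscript"]),
]

for line in story(events):
    print(line)
```

Because each event carries its own time, place and participants, a statement such as "Agent Y was the scanner operator" is tied to the digitization and not to the resource as a whole.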

24
Resource-centric Metadata
26
Queries over descriptive graphs
Rudolf Squish: http://swordfish.rdfweb.org/rdfquery
List details of events where Lagoze is a participating agent
SELECT ?title, ?type, ?time, ?place, ?name
FROM http://ilrt.org/discovery/harmony/oai.rdf
WHERE (web::type ?event abc::Event)
      (abc::context ?event ?context)
      ..
AND ?name lagoze
USING web FOR http://www.w3.org/1999/02/22-rdf-syntax-ns#
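
The idea behind such a query, matching patterns with shared variables against a graph of statements, can be sketched in a few lines. This is a toy matcher, not the Squish engine: the function names, the property names and the sample graph are all made up for illustration.

```python
def match(pattern, triple, binding):
    """Extend `binding` so `pattern` fits `triple`; return None on a mismatch."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):        # variable: bind it or check the old binding
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:             # constant: must match exactly
            return None
    return extended


def query(patterns, graph):
    """Return every variable binding that satisfies all patterns at once."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Hypothetical descriptive graph as (subject, property, value) triples.
graph = [
    ("event1", "type", "Event"),
    ("event1", "title", "Workshop A"),
    ("event1", "agent", "lagoze"),
    ("event2", "type", "Event"),
    ("event2", "title", "Workshop B"),
    ("event2", "agent", "someone-else"),
]

results = query(
    [("?event", "type", "Event"),
     ("?event", "agent", "lagoze"),
     ("?event", "title", "?title")],
    graph,
)
print(results)
```

Reusing `?event` across the patterns plays the part of the repeated variable in the WHERE clause: only events that satisfy every pattern survive.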
27
Where we are now
  • Stabilization of model
  • Collaboration with museum/CIDOC community for
    joint modeling principles
  • Plans
  • RDF API for model elements
  • UI for metadata creation
  • Query engine testing

28
Open Archives Initiative
  • Acknowledgements
  • Herbert Van de Sompel
  • OAI Steering and Technical Committees

29
Open Archives Initiative
  • Testing the hypotheses
  • exposing metadata in various forms will
    facilitate creation of value-added services
  • key to deployable DL infrastructure is low-entry
    cost
  • Individual communities can/will customize common
    infrastructure

30
Where we've come from
  • Late 1999, Santa Fe UPS meeting: increase impact
    of eprint initiatives through federation
  • Santa Fe Convention: metadata harvesting among
    eprint archives
  • Increasing interest outside the eprint community
    • Research libraries
    • Museums
    • Publishers

31
Progress over the past year
  • OAI workshops at US and EC DL conferences
  • Organizational stability
    • Executive committee and steering committee
  • September 2000 technical meeting
    • Reframe and rethink technical solutions for
      broader domain
  • Extensive testing and refinement of technical
    infrastructure

32
Technical Infrastructure: key technical features
  • Deploy-now technology: 80/20 rule
  • Two-party model: providers and consumers
  • Simple HTTP encoding
  • XML schema for some degree of protocol
    conformance
  • Extensibility
  • Multiple item-level metadata
  • Collection level metadata

33
OAI protocol requests
service provider
data provider
  • Supporting protocol requests
    • Identify
    • ListMetadataFormats
    • ListSets
  • Harvesting protocol requests
    • ListRecords
    • ListIdentifiers
    • GetRecord
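
Since the protocol uses a simple HTTP encoding, each of the six requests can be pictured as a URL with a `verb` argument. The sketch below only builds such URLs and sends nothing. The base URL is a hypothetical data provider, and the argument names `identifier` and `metadataPrefix` are assumptions not stated on the slides, so check them against the protocol specification.

```python
from urllib.parse import urlencode

BASE_URL = "http://archive.example.org/oai"  # hypothetical data provider

SUPPORTING = ("Identify", "ListMetadataFormats", "ListSets")
HARVESTING = ("ListRecords", "ListIdentifiers", "GetRecord")


def oai_request(verb, **args):
    """Build the request URL a service provider would send to a data provider."""
    if verb not in SUPPORTING + HARVESTING:
        raise ValueError(f"unknown OAI verb: {verb}")
    return BASE_URL + "?" + urlencode({"verb": verb, **args})


print(oai_request("Identify"))
# Argument names below are assumptions, not taken from the slides.
print(oai_request("GetRecord", identifier="record-1", metadataPrefix="oai_dc"))
```

A harvester would typically start with the supporting requests to learn what a data provider offers, then move on to the harvesting requests.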

34
Where we are now
  • Stable 1.0 protocol specification
  • Hopefully, self-documenting infrastructure
  • http://www.openarchives.org
  • 27 registered data providers
  • Increasing number of tools available
  • Research initiatives
    • NSF-funded NSDL
    • EC-funded Cyclades
    • Andrew W. Mellon service proposals
    • EC-funded community building

35
Where do we go from here
  • Controlling the stampede
  • Keeping the organizational model lean and mean
    while encouraging community-specific exploitation
  • Encouraging testing, especially through deployment
    and especially service development
  • Encouraging metadata diversification: this isn't
    just about Dublin Core!!!
  • Preservation
  • Document access
  • Authentication

36
OAI Metadata Research
  • Dictionary of metadata terms (Tom Baker)
  • Mandating usage rules has only limited
    effectiveness
  • Compiling usage of those terms is vital to
    machine understanding and interoperability
  • Provide context heuristics for search engine and
    indexer processing
  • Large-scale deployment of OAI and web crawling
    enables (partial) automation of usage compilation
    (e.g., data mining of term usage)

37
Preservation Models
  • Acknowledgements
  • Bill Arms
  • Peter Botticelli (CUL)
  • Anne Kenney (CUL)

38
Preservation Remote Control
  • Organization Issues
    • assured preservation may not be possible
      without direct custodial control.
    • what are the levels of acceptability and for
      which types of resources?
  • Technical Issues
    • what are the technologies for remote control at
      the various levels of assurance deemed
      acceptable by the library?
    • what is the probability of a reasonable level of
      preservation in the context of such
      technologies?

39
Cost vs. Functionality
40
Leveraging Current Work
  • Event-based metadata
  • Metadata harvesting
  • Longevity and threats to digital resources

41
Level 0 Experiment
42
Level 1 Experiment
43
One of Six Core Integration Demonstration
Projects for the NSDL
44
How Big might the NSDL be?
The NSDL aims to be comprehensive -- all
branches of science, all levels of education,
very broadly defined.
Five-year targets:
  • 1,000,000 different users
  • 10,000,000 digital objects
  • 100,000 independent sites
Requires low-cost, scalable technology:
automated collection building and maintenance
45
Levels of Interoperability: Metadata Harvesting
  • Agreements on simple protocol and metadata
    standard(s)
  • Example: Metadata harvesting protocol of the
    Open Archives Initiative (MHP)
  • Moderate-quality services
  • Low cost of entry to participating sites
  • Moderately large numbers of loosely
    collaborating sites
  • Promising but still an emerging approach
46
Levels of Interoperability: Gathering
  • Robots gather collections automatically with no
    participation from individual sites
  • Examples: Web search services (e.g., Google),
    CiteSeer (a.k.a. ResearchIndex)
  • Restricted but useful services
  • Zero cost of entry to gathered sites
  • Very large numbers of independent sites
  • Only suitable for open access collections