Title: Awareness Services for Digital Libraries


1
Awareness Services for Digital Libraries
  • Arturo Crespo
  • Hector Garcia-Molina
  • Stanford University

2
Awareness Services for Digital Libraries
  • Digital library repository
  • Data store
  • Other components
  • Indexers
  • Name manager
  • Replica manager
  • Etc

3
Data Stores and Clients
[Figure: data stores (DB, AI, and HCI Tech Reports) and clients (DB Indexer, CS Indexer)]

4
Data Store Services
  • Object access
      • Via a handle
  • Object awareness
      • Clients must be aware of changes at the store

5
A Case Study: CS-TR and SIFT
  • SIFT: a selective dissemination service
  • CS-TR: a digital library of technical reports
    from about 50 universities
  • Awareness based on timestamps
  • Problems
      • File system timestamps
      • Application timestamps
      • Deletions
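
One way to read the deletion problem is sketched below. It assumes a hypothetical store that answers "which handles changed since time t?" from per-object timestamps. The `TimestampStore` class and its methods are illustrative names and are not taken from CS-TR or SIFT.

```python
class TimestampStore:
    """Hypothetical data store that keeps a last-modified time per handle."""

    def __init__(self):
        self.modified = {}  # handle -> logical timestamp of the last write
        self.clock = 0

    def put(self, handle):
        self.clock += 1
        self.modified[handle] = self.clock

    def delete(self, handle):
        self.clock += 1
        del self.modified[handle]  # the timestamp disappears with the object

    def changed_since(self, t):
        return sorted(h for h, ts in self.modified.items() if ts > t)


store = TimestampStore()
store.put("tr-1")
store.put("tr-2")
last_seen = store.clock   # the client is up to date at this point

store.put("tr-2")         # update: visible through its new timestamp
store.delete("tr-1")      # deletion: leaves no timestamp behind

reported = store.changed_since(last_seen)
print(reported)
```

In this sketch the query reports only `tr-2`. The deleted `tr-1` is never mentioned, so a client relying on timestamps alone would keep a stale copy.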

6
Contributions
  • Survey of the spectrum of awareness options
      • Advantages and disadvantages of each one
  • All mechanisms can be captured by a single
    algorithm: the UNI-AWARE algorithm
  • Enhancements for signature-based schemes
      • Reduced computation
      • Reduced communication costs

7
Related Work
  • Database replica maintenance
  • Remote file comparison
  • Deployment of programs over the network

8
The Client-store Design Space
  • Push vs. Pull
  • Stateful versus stateless stores and clients
  • Cognizant clients and sources
  • Number of clients per data store

9
The UNI-AWARE Algorithm
  • A unified algorithm that covers known schemes
      • Snapshot algorithm
      • Timestamps and versions
      • Logs
      • Triggers
      • Signatures
  • Algorithm is tailored to a specific scheme
    through the definition of custom functions
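
The idea of tailoring one algorithm through custom functions might look like the sketch below. This is not the UNI-AWARE algorithm itself, which the slides do not spell out. `detect_changes`, `version_token`, and `signature_token` are hypothetical names, and the comparison loop is an assumption about how such a scheme could be structured.

```python
import zlib


def detect_changes(store_objects, client_tokens, token_of):
    """Return (changed, deleted) handles relative to what the client remembers.

    token_of is the scheme-specific function: it maps an object to the token
    (timestamp, version number, signature, ...) used to detect change.
    """
    changed = [
        handle
        for handle, obj in store_objects.items()
        if client_tokens.get(handle) != token_of(obj)
    ]
    deleted = [h for h in client_tokens if h not in store_objects]
    return sorted(changed), sorted(deleted)


# Two hypothetical custom functions, one per scheme.
def version_token(obj):
    return obj["version"]


def signature_token(obj):
    return zlib.crc32(obj["content"])


objects = {
    "tr-1": {"version": 3, "content": b"databases"},
    "tr-2": {"version": 1, "content": b"planning"},
}
seen_versions = {"tr-1": 2, "tr-2": 1, "tr-9": 4}
seen_signatures = {h: signature_token(o) for h, o in objects.items()}

by_version = detect_changes(objects, seen_versions, version_token)
by_signature = detect_changes(objects, seen_signatures, signature_token)
print(by_version, by_signature)
```

Only `token_of` changes between the two calls. The comparison logic is shared, which is the property the slide attributes to the unified algorithm.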

10
UNI-AWARE Signature Algorithm
  • Signature: a token associated with each document
    that has a high probability of being unique and
    changes when the content of the object changes
      • Examples: CRC, checksums
  • Advantages
      • Robust, as it does not require metadata
        maintenance
      • Easy to manage consistently when the store fails
        or an object migrates
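
A minimal sketch of computing such signatures with the Python standard library follows. CRC is the example named on the slide. The SHA-256 variant is an addition here, shown only as a token with a lower collision probability. Both function names are illustrative.

```python
import hashlib
import zlib


def crc_signature(content: bytes) -> int:
    """32-bit CRC: cheap to compute, but collisions are possible."""
    return zlib.crc32(content)


def digest_signature(content: bytes) -> str:
    """SHA-256 digest: a larger token with far lower collision probability."""
    return hashlib.sha256(content).hexdigest()


original = b"Awareness Services for Digital Libraries"
revised = original[:-1] + b"z"  # a one-byte edit to the content

# The signature depends only on the content, so it can be recomputed at any time.
stable = crc_signature(original) == crc_signature(bytes(original))
crc_changed = crc_signature(original) != crc_signature(revised)
digest_changed = digest_signature(original) != digest_signature(revised)
print(stable, crc_changed, digest_changed)
```

Because the token is derived from the content alone, it survives a store restart or an object migration without any separate metadata to keep consistent.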

11
UNI-AWARE Signature Algorithm
[Figure: the data store transfers all document signatures to the client; the client then requests the changed documents]
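
The exchange in this slide could be sketched as follows, assuming a pull model and a stateless store. `DataStore`, `Client`, and their methods are hypothetical names for illustration, not an API from the presentation.

```python
import zlib


class DataStore:
    """Hypothetical store: handle -> document bytes, with no per-client state."""

    def __init__(self, documents):
        self.documents = dict(documents)

    def all_signatures(self):
        # Signatures are recomputed from content, so nothing extra is stored.
        return {h: zlib.crc32(d) for h, d in self.documents.items()}

    def fetch(self, handles):
        return {h: self.documents[h] for h in handles}


class Client:
    """Hypothetical client that remembers the signatures it last saw."""

    def __init__(self):
        self.signatures = {}
        self.copies = {}

    def refresh(self, store):
        remote = store.all_signatures()         # all signatures transferred
        stale = [h for h, s in remote.items() if self.signatures.get(h) != s]
        gone = [h for h in self.signatures if h not in remote]
        self.copies.update(store.fetch(stale))  # request documents
        for h in gone:
            self.copies.pop(h, None)
        self.signatures = remote
        return sorted(stale), sorted(gone)


store = DataStore({"tr-1": b"v1", "tr-2": b"v1"})
client = Client()
first = client.refresh(store)

store.documents["tr-2"] = b"v2"   # update
del store.documents["tr-1"]       # deletion
store.documents["tr-3"] = b"v1"   # insertion
second = client.refresh(store)
print(first, second)
```

Every refresh transfers one signature per document, whether or not anything changed. That fixed cost is what a hierarchical variant tries to reduce.
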
12
DIST-UNI-AWARE Algorithm
  • Objective: reduce the amount of data exchanged
    between data store and clients
  • DIST-UNI-AWARE
      • Unified algorithm that can be tailored to
        different schemes
      • Hierarchical signatures
      • Hierarchical timestamps

13
DIST-UNI-AWARE
[Figure: the data store transfers signatures of buckets to the client; the client requests more signatures for buckets that differ, then requests the changed documents]
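
A simplified sketch of the hierarchical signature idea follows. It assumes that client and store hold the same ordered list of documents (no insertions or deletions), that buckets are contiguous index ranges split in two, and that a bucket signature is a CRC over its members' signatures. All of these are assumptions made for illustration; the slides do not specify how DIST-UNI-AWARE forms buckets.

```python
import zlib


def bucket_sig(sigs, lo, hi):
    """Signature of the bucket sigs[lo:hi], computed from its members' signatures."""
    data = b"".join(s.to_bytes(4, "big") for s in sigs[lo:hi])
    return zlib.crc32(data)


def find_changed(store_sigs, client_sigs):
    """Return (changed indexes, rounds, bucket signatures transferred)."""
    pending = [(0, len(store_sigs))]  # buckets whose signatures are sent next
    changed, rounds, transferred = [], 0, 0
    while pending:
        rounds += 1
        next_pending = []
        for lo, hi in pending:
            transferred += 1
            if bucket_sig(store_sigs, lo, hi) == bucket_sig(client_sigs, lo, hi):
                continue                  # whole bucket unchanged: prune it
            if hi - lo == 1:
                changed.append(lo)        # a single document: request it
            else:
                mid = (lo + hi) // 2      # request more signatures
                next_pending += [(lo, mid), (mid, hi)]
        pending = next_pending
    return changed, rounds, transferred


old_docs = [f"doc-{i} v1".encode() for i in range(8)]
new_docs = list(old_docs)
new_docs[5] = b"doc-5 v2"

client_sigs = [zlib.crc32(d) for d in old_docs]
store_sigs = [zlib.crc32(d) for d in new_docs]

unchanged = find_changed(client_sigs, client_sigs)
one_change = find_changed(store_sigs, client_sigs)
print(unchanged, one_change)
```

With nothing changed, a single bucket signature settles the comparison in one round. With one change among 8 documents, the sketch needs 4 rounds (counting the initial root comparison) and 7 bucket signatures.
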
14
Advantages of Signature Algorithms
  • Support the push and pull models
  • No need for reliable storage of additional data
    structures: if signatures are lost or corrupted,
    they can be recomputed
  • Efficient in their use of network, client, and
    data store resources
  • Scale well with the number of clients and documents

15
DIST-UNI-AWARE Performance
  • Performance depends on the number of changes
      • No changes: only one round is required
      • Single change: log2(n) rounds
      • 2 changes: log2(n) rounds, but twice as much data
  • As the number of changes grows, DIST-UNI-AWARE
    eventually performs worse than UNI-AWARE
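
The crossover can be illustrated with a small counting model. It assumes a binary hierarchy over n documents in which a bucket is split only when it contains a changed document, and it counts bucket signatures sent, compared with the n signatures a flat scheme always sends. The function name and the evenly spread changes are assumptions chosen for illustration, not measurements from the presentation.

```python
def dist_transfers(n, changed):
    """Bucket signatures sent by a binary hierarchical scheme over n documents.

    changed is the set of changed document indexes; a bucket is split only
    when it contains at least one changed document.
    """
    def visit(lo, hi):
        dirty = any(lo <= i < hi for i in changed)
        if not dirty or hi - lo == 1:
            return 1
        mid = (lo + hi) // 2
        return 1 + visit(lo, mid) + visit(mid, hi)

    return visit(0, n)


n = 1024  # a flat scheme sends n signatures on every refresh
costs = {
    k: dist_transfers(n, set(range(0, n, n // k)))  # k evenly spread changes
    for k in (1, 2, 64, 256)
}
print(costs)
```

Under these assumptions the model sends 21 signatures for one change, 39 for two, 639 for 64, and 1535 for 256. The last figure exceeds the 1024 signatures of the flat scheme, matching the qualitative claim on the slide.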

16
DIST-UNI-AWARE Enhancements
  • Increase group split factor
  • Client sends additional information at split time
  • Clustering of changed objects
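
The effect of a larger split factor can be sketched with a back-of-the-envelope model for a single changed document. It assumes each round sends one signature per child bucket of the one bucket that differs. The function name and the model are illustrative assumptions, not figures from the presentation.

```python
def single_change_cost(n, split):
    """Rounds and signatures needed to locate one changed document among n.

    Each differing bucket is divided into `split` child buckets. The signature
    count is exact when n is a power of `split` and an upper bound otherwise.
    """
    rounds, signatures, size = 1, 1, n  # round 1: the root signature
    while size > 1:
        size = -(-size // split)        # ceiling division: child bucket size
        rounds += 1
        signatures += split
    return rounds, signatures


table = {b: single_change_cost(4096, b) for b in (2, 4, 16, 64)}
for b, (rounds, signatures) in table.items():
    print(f"split={b:2d} rounds={rounds:2d} signatures={signatures}")
```

In this model a larger split factor trades fewer round trips for more signatures per round: for 4096 documents, a factor of 2 takes 13 rounds and 25 signatures, while a factor of 64 takes 3 rounds and 129 signatures.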

17
Conclusions
  • Awareness mechanism for digital libraries
  • Separation of storage functionality and other
    services
  • Awareness schemes must be resilient to computer
    environment changes and bugs
  • UNI-AWARE and DIST-UNI-AWARE
