Title: Digital Libraries: a Reference Model
1Digital Libraries: a Reference Model
Vittore Casarosa, ISTI-CNR, Pisa and University
of Parma, Italy
- LIDA 2009, 29 May 2009, Zadar, Croatia
2The traditional role of libraries
- Mediators between information and users
- Selection
- Definition of collections
- Acquisition
- Physical objects
- Description
- Catalogs
- Access
- Shelves
- Preservation
- Controlled environment
3Libraries: some figures
- Volumes (in millions)
- Journals
- From 10,000 in 1950 to 150,000 in 2002
- Alexandria principle beginning to fade
4Evolution of technology
- Computer technology
- CPU and integrated chips
- Random Access Memories
- RAM from KB to GB
- External memories
- Tapes, hard disks, floppy disks
- Memory sticks
- CDs
- DVDs
- from MB to GB to TB to PB to EB
- Communication technology (networks)
- (Telephone) line speed
- Point to point (leased lines)
- Local Area Networks
- Inter-networking (TCP/IP)
5The World Wide Web
- Combination of computer technology and
communication technology
- It all started with the hyperlink
- Then came the browser (Mosaic)
- Then came the first wave
- Then came the dot com, dot gone
- Then came the second wave
- Finally came the information explosion
- An estimate of 110 to 560 million hosts
- An estimate of 15 to 30 billion pages on line
- And now we have Web 2.0 (with Web 3.0 just around
the corner)
6DELOS main objective
- To define and conduct a joint program of
activities in order to integrate and coordinate
the on-going research activities of the major
European research teams in the field of digital
libraries for the purpose of developing the next
generation digital library technologies
7Digital Libraries in the Information Space
[Figure: information systems positioned by Structure of Data (low to high) against Structure of User Behavior (low to high); labelled regions: Databases/IR, Digital Libraries, Web]
8Definition of Digital Library
9What is a Digital Library ?
- A DL is the combination of content and
services
- A DL is an entity providing the functionality
to mediate between information objects and
information users in the context of distributed
collections of information objects. This
(external) functionality includes access,
publish, delivery, preservation, personalization,
etc.
- A Digital Library is a tool at the centre of
intellectual activity having no logical,
conceptual, physical, temporal, or personal
borders or barriers on information
- A Digital Library is an institution in charge
of providing at least the functionality of a
traditional library in the context of distributed
and networked collections of information objects
10DELOS - Grand 10-Year Vision 1
- Digital libraries should enable any citizen
- to access all human knowledge anytime and
anywhere, in a friendly, multi-modal, efficient,
and effective way, by overcoming barriers of
distance, language, and culture and by using
multiple Internet-connected devices
11DELOS - Grand 10-Year Vision 2
- The potential exists for digital libraries to
become the universal knowledge repositories and
communication conduits for the future, a common
vehicle by which everyone will access, discuss,
evaluate, and enhance information of all forms
12Conceptual Framework
[Figure: conceptual framework relating Usage, Management, Contents and the Digital Library System]
13Building a Digital Library
14Research Directions in DLs
15Foundations Research Issues
Reference Model for DLS
- Formalize a conceptual framework for Digital
Library systems
- to serve as a yardstick of quality and richness
- to specify features and properties of generic
DLMS
- to clarify relationships among
- digital libraries, digital repositories, digital
archives,
- search engines, information infrastructures,
- knowledge commons
16System-related Research Issues
Architectures
- Peer-to-peer architectures
- Grid middleware
- Service-oriented architectures
Information Access
- Indexing for complex and novel data
- Query routing in complex distributed Digital
Libraries
17System-related Research Issues
Audio/Visual
- Automatic metadata extraction
- Context-aware content-based retrieval
- Audio/visual interfaces
Semantic Interop.
- Methods for the integration of heterogeneous
ontologies and domain-specific knowledge
organization systems
- Interoperability with e-Learning applications
18User-related Research Issues
User Interfaces
- Framework for new digital library interfaces
- Task-oriented user interfaces
- Cooperation/collaboration tools, e.g.,
annotations
Visualization
- Self-adaptability to small screens
- Visual analysis and exploration of query results
Personalization
- Modeling foundations for user preferences and
context
- Personalization of user interactions
- Peer-similarity-based query routing decisions
- User log analysis for profiling
19Horizontal Research Issues
Curation/Preservation
- Integration of preservation functionality
- Establishment of a testbed and evaluation
framework for preservation techniques
- Automating selection and ingest processes
Evaluation Methodologies
- Standard frameworks for comparative evaluation of
DL Systems
- Definition of standard events in a DL environment
- Identification of appropriate metrics
- Establishment of information repositories
20Applications Research Issues
e-Health
- Virtual electronic health records
- Integration of multiple medical information
streams
e-Learning
- Interoperability of e-Learning applications
e-Culture
- Integration of upper-level ontologies
- Mapping of core ontologies to schemas and KOS
21New name for Digital Libraries ?
Digital REALM
Digital REsources for Archives, Libraries and
Museums
- <adjective> <content abstraction> <created entity>
- adjective: Digital
- content abstraction
- created entity: Library
22Alternatives for new name
23What's in a name ?
DIGITAL LIBRARIES
... what's in a name? that which we call a rose,
by any other name would smell as sweet...
24Need for a Reference Model
- A reference model is an abstract framework for
understanding significant relationships among the
entities of some environment, and for the
development of consistent standards or
specifications supporting that environment
- A reference model is based on a small number of
unifying concepts and may be used as a basis for
education and explaining standards to a
non-specialist
- A reference model is not directly tied to any
standards, technologies or other concrete
implementation details, but it does seek to
provide a common semantics that can be used
unambiguously across and between different
implementations
25A Three-Entity Framework
26The three Entities
- Digital Library
- An organization, which might be virtual, that
comprehensively collects, manages, and preserves
for the long term rich digital content, and
offers to its user communities specialized
functionality on that content, of measurable
quality and according to codified policies
- Digital Library System
- A software system that is based on a defined
(possibly distributed) architecture and provides
all functionality required by a particular
Digital Library. Users interact with a Digital
Library through the corresponding Digital Library
System
- Digital Library Management System
- A generic software system that provides the
appropriate software infrastructure both (i) to
produce and administer a Digital Library System
incorporating the suite of functionality
considered foundational for Digital Libraries and
(ii) to integrate additional software offering
more refined, specialized, or advanced
functionality
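The relationship among the three entities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `integrate` and `produce_system` methods, and the plugin mechanism are hypothetical and are not part of the reference model. The foundational functions (register, search, browse) are taken from the minimum set mentioned under the Functionality concept.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalLibrary:
    """The organization (possibly virtual) that collects and preserves content."""
    name: str
    policies: list[str] = field(default_factory=list)


@dataclass
class DigitalLibrarySystem:
    """The deployed software through which users reach one Digital Library."""
    serves: DigitalLibrary
    functions: set[str]


@dataclass
class DigitalLibraryManagementSystem:
    """Generic software used to produce and administer Digital Library Systems."""
    foundational: set[str]
    plugins: dict[str, set[str]] = field(default_factory=dict)

    def integrate(self, plugin_name: str, functions: set[str]) -> None:
        # Role (ii): integrate additional, more specialized software.
        self.plugins[plugin_name] = set(functions)

    def produce_system(self, library: DigitalLibrary,
                       use_plugins: tuple[str, ...] = ()) -> DigitalLibrarySystem:
        # Role (i): produce a DLS that incorporates the foundational suite.
        functions = set(self.foundational)
        for name in use_plugins:
            functions |= self.plugins[name]
        return DigitalLibrarySystem(serves=library, functions=functions)


# Hypothetical usage: one DLMS producing the DLS that serves one DL.
dlms = DigitalLibraryManagementSystem(foundational={"register", "search", "browse"})
dlms.integrate("annotation", {"annotate"})
dls = dlms.produce_system(DigitalLibrary("Example DL"), use_plugins=("annotation",))
```

The point of the sketch is the direction of the dependencies: the DLMS knows nothing about a particular library, the DLS is bound to exactly one.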
27Different types of DLMS
- Extensible Digital Library System
- A complete Digital Library System that is fully
operational with respect to the basic/foundational
functionality required. It is based on an open
software architecture, so that further software
components can be incorporated on top of the ones
already there with ease (DelosDLMS, GreenStone)
- Digital Library System Warehouse
- A collection of software components that
encapsulate the core suite of DL functionality
and a set of tools that can be used to combine
these components in a variety of ways (in
Lego-like fashion) to create Digital Library
Systems offering a tailored integration of
functionalities. New software components can
easily be incorporated into the Warehouse for
subsequent combination with those already there
(BRICKS, DILIGENT)
- Digital Library System Generator
- A highly parameterized software system that
encapsulates templates covering a broad range of
functionalities, including a defined core suite
of DL functionality as well as any advanced
functionality that has been deemed appropriate to
meet the needs of the specific application domain.
Through an initialization session, the
appropriate parameters are set and configured; at
the end of that session, an application is
automatically generated, and this constitutes the
Digital Library System ready for installation and
deployment (MARIAN)
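The Generator approach described above can be illustrated with a toy sketch: a set of templates, an initialization step that fixes the parameters, and a generated system description at the end. Everything here is an assumption made for illustration (the template names, the parameter keys `name` and `enable`, and the shape of the output); it does not reflect how MARIAN or any other system actually works.

```python
# Assumed core suite, always included in a generated system.
CORE_SUITE = ("register", "search", "browse")

# Hypothetical optional templates, each contributing extra functions.
TEMPLATES = {
    "annotation": ("annotate",),
    "preservation": ("preserve",),
}


def generate_dls(parameters: dict) -> dict:
    """Return a description of the generated Digital Library System."""
    enabled = parameters.get("enable", [])
    unknown = set(enabled) - set(TEMPLATES)
    if unknown:
        raise ValueError(f"no template for: {sorted(unknown)}")
    functions = list(CORE_SUITE)
    for name in enabled:
        functions.extend(TEMPLATES[name])
    return {"name": parameters["name"], "functions": functions}


# Parameters as they might stand at the end of an initialization session.
generated = generate_dls({"name": "Demo", "enable": ["annotation"]})
```

A Warehouse would differ mainly in who does the combining: there the components are assembled by hand with tools, rather than generated from parameters.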
28DELOS DLMS
29Containment of models
30Actors in the Digital Library
31Main roles of Actors (1)
- DL End-Users
- They exploit the DL functionality for providing,
consuming, and managing the DL Content as well as
some of its other constituents. They perceive the
DL as a stateful entity that serves their
functional needs. The behaviour and output of the
DL depend on its state at the time a particular
part of its functionality is activated. DL
end-users may be further partitioned into:
- Content Creators
- Content Consumers
- Librarians (end user)
- DL Designers (Digital Librarian)
- They exploit their knowledge of the semantics of
the application domain to define, customize, and
maintain the Digital Library so that it is
aligned with the information and functional needs
of its end-users. To perform this task, they
interact with the DLMS, providing functional and
content configuration parameters. The values of
these parameters, which can be modified during
the DL lifetime, configure the specific DL
perceived by the end-users because they determine
the particular Digital Library System instance
serving the Digital Library.
32Main roles of Actors (2)
- DL System Administrators (System Librarian)
- They select the software components necessary to
create the Digital Library System needed to serve
the required DL (as specified by the DL Designer)
and decide where and how to deploy them. They
interact with the DLMS by providing architectural
configuration parameters, such as the selected
software components, the hosting nodes, and the
component allocation. The values of the
architectural configuration parameters can be
changed over the DL lifetime. Any change of these
parameters may result in the provision of
different DL functionality and/or different
quality.
- DL Application Developers
- They develop the software components of DLMSs and
DLSs, implementing the necessary functionality
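The split between the parameters supplied by the DL Designer and those supplied by the DL System Administrator can be made concrete with a small sketch. The field names and the `validate_allocation` check are hypothetical; the slides only say that designers provide functional and content parameters, and that administrators provide the selected components, the hosting nodes and the component allocation.

```python
from dataclasses import dataclass


@dataclass
class DesignerParameters:
    """Functional and content configuration, set by the DL Designer."""
    functions: set[str]
    collections: set[str]


@dataclass
class AdministratorParameters:
    """Architectural configuration, set by the DL System Administrator."""
    components: set[str]
    hosting_nodes: set[str]
    allocation: dict[str, str]  # component name -> hosting node


def validate_allocation(params: AdministratorParameters) -> list[str]:
    """List every selected component that lacks a valid hosting node."""
    problems = []
    for component in sorted(params.components):
        node = params.allocation.get(component)
        if node is None:
            problems.append(f"{component}: not allocated")
        elif node not in params.hosting_nodes:
            problems.append(f"{component}: unknown node {node}")
    return problems
```

Because both parameter sets can change over the DL lifetime, a check of this kind would need to run again after every change, not only at first deployment.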
33Hierarchy of Actors Views
34Main concepts (1)
35Main concepts (2)
- Content
- The Content concept encompasses the data and
information that the Digital Library handles and
makes available to its users. Content is an
umbrella concept used to aggregate all forms of
information objects that a Digital Library
collects, manages, and delivers. It encompasses
the diverse range of information objects,
including such resources as objects, annotations,
and metadata.
- User
- The User concept covers the various actors
(whether human or machine) entitled to interact
with Digital Libraries. Digital Libraries
connect actors with information and support them
in their ability to consume and make creative use
of it to generate new information. User is an
umbrella concept including all notions related to
the representation and management of actor
entities within a Digital Library. It
encompasses such elements as the rights that
actors have within the system and the profiles of
the actors with characteristics that personalize
the system's behaviour or represent these actors
in collaborations.
- Functionality
- The Functionality concept encapsulates the
services that a Digital Library offers to its
different users, whether classes of users or
individual users. While the general expectation
is that DLs will be rich in capabilities and
services, the bare minimum of functions would
include such aspects as new information object
registration, search, and browse. Beyond that,
the system seeks to manage the functions of the
Digital Library to ensure that the functions
reflect the particular needs of the Digital
Library's community of users and/or the specific
requirements relating to the Content it contains.
36Main concepts (3)
- Policy
- The Policy concept represents the set (or sets)
of conditions, rules, terms and regulations
governing interaction between the Digital Library
and users, whether virtual or real. Examples of
policies include acceptable user behaviour,
digital rights management, privacy and
confidentiality, charges to users, and collection
delivery.
- Quality
- The Quality concept represents the parameters
that can be used to characterize and evaluate the
content and behaviour of a Digital Library.
Quality can be associated not only with each
class of content or functionality but also with
specific information objects or services. Some of
these parameters are objective in nature and can
be automatically measured, whereas others are
subjective in nature and can only be measured
through user evaluations.
- Architecture
- The Architecture concept refers to the Digital
Library System entity and represents a mapping of
the functionality and content offered by a
Digital Library onto hardware and software
components. There are two primary reasons for
having Architecture as a core concept: (i)
Digital Libraries are often assumed to be among
the most complex and advanced forms of
information systems and (ii) interoperability
across Digital Libraries is recognized as a
substantial research challenge. A clear
architectural framework for the Digital Library
System offers ammunition in addressing both these
issues effectively.
37The main concepts in perspective
38The Digital Library Development Framework
39The Reference Model
40Concept Maps
41Digital Library Domains
42The Resource Domain
43The Content Domain
44The User Domain
45The Functionality Domain
46Main functions
- C32 Access Resource
- C33 Discover
- C34 Browse
- C35 Search
- C36 Acquire
- C37 Visualize
- C38 Manage Resource
- C39 Create
- C40 Submit
- C41 Withdraw
- C42 Update
- C43 Validate
- C44 Annotate
- C45 Manage Information Object
- C64 Manage Actor
- C71 Manage Function
- C72 Manage Policy
- C73 Manage Quality Parameter
- C74 Collaborate
- C75 Exchange Information
- C76 Converse
- C77 Find Collaborator
- C78 Author Collaboratively
- C79 Manage DL
- C80 Manage Content
- C85 Manage User
- C90 Manage Functionality
- C92 Manage Quality
- C93 Manage Policy Domain
- C94 Manage & Configure DLS
- C95 Manage DLS
- C104 Configure DLS
47Manage Information Object
- C46 Disseminate
- C47 Publish
- C48 Author
- C49 Compose
- C50 Process
- C51 Analyze
- C52 Linguistic Analysis
- C53 Qualitative Analysis
- C54 Examine Preservation State
- C55 Statistical Analysis
- C56 Scientific Analysis
- C57 Create Structured Representation
- C58 Compare
- C59 Transform
- C60 Physically Convert
- C61 Translate
- C62 Convert to a Different Format
- C63 Extract
48Manage Actor
- C65 Establish Actor
- C66 Register
- C67 Sign Up
- C68 Login
- C69 Personalise
- C70 Apply Profile
49Manage DL
- C80 Manage Content
- C81 Manage Collection
- C82 Import Collection
- C83 Export Collection
- C84 Preserve
- C85 Manage User
- C86 Manage Membership
- C87 Manage Group
- C88 Manage Role
- C89 Manage Actor Profile
- C90 Manage Functionality
- C91 Monitor Usage
- C92 Manage Quality
- C93 Manage Policy Domain
50Manage & Configure DLS
- C95 Manage DLS
- C96 Create DLS
- C97 Withdraw DLS
- C98 Update DLS
- C99 Manage Architecture
- C100 Manage Architectural Component
- C101 Configure Architectural Component
- C102 Deploy Architectural Component
- C103 Monitor Architectural Component
- C104 Configure DLS
- C105 Configure Resource Format
- C106 Configure Content
- C107 Configure User
- C108 Configure Functionality
- C109 Configure Policy
- C110 Configure Quality
51Access Resource
52Manage Resource (1)
53Manage Resource (2)
54Manage Information Object
55Manage Actor
56Collaborate
57Manage DL
58Manage & Configure DLS
59The Policy domain
60Categorization of Policies
61The Quality Domain
62The Architecture Domain
63DLS Reference architecture
64DLS Concrete architecture
65Conclusions
Before you think that now you know everything
about (the technical aspects of) Digital
Libraries, there is one (recurring) question:
Will the Web become the ultimate Digital Library ?
66The Web vs. Digital Libraries
67Claim 1
- All the information needs of an IT Society
(research, education, entertainment, business,
etc.) will be provided by this huge heap of
information called the Web
NO
68Claim 2
- For all those activities that require organized
and controlled information, the actual
institutions (notably libraries, archives and
museums) will continue to have a significant role
NO
(NOT ONLY THEM)
69Digital Libraries in the Information Space
[Figure: information systems positioned by Structure of Data (low to high) against Structure of User Behavior (low to high); labelled regions: Databases, Info Retrieval, CMS/DAMS, Wikis/blogs, Digital Libraries, Web]
70Basic concepts of an Info Mgmt System
Policy
Quality
Functionality
Content
User
Architecture
71Refuting Claim 1
- All the information needs of an IT Society
(research, education, entertainment, business,
etc.) will NOT be provided by this huge heap of
information called the Web
72The Web as an Info Mgmt System
Architecture
73Three roads to Web Knowledge
- Handcrafted high-quality curated knowledge bases
(ontologies, encyclopedias, etc.)
- Large-scale information extraction and harvesting
(pattern matching, NLP, statistical learning, etc.)
- Social wisdom from communities
(social tagging, folksonomies, etc.)
74Refuting Claim 2
- For all those activities that require organized
and controlled information, the actual
institutions (notably libraries, archives and
museums) will continue to have a (much less)
significant role
75Libraries as Info Mgmt Systems
Policy
Quality
Functionality
Content
User
Architecture
76No barriers to knowledge exchange
- More and more of the world's info/knowledge lives
in specialized digital libraries that
- Have content added/created by members of a
community
- Are curated by specialists of that library's
topic
- Are maintained by (designated members of) the
community
- No strict separation between producer, curator,
and consumer roles with respect to which actors
play which
- Advanced services: annotation, personalization,
contextualization, preservation, collaboration,
etc.
77Hopeful Conclusion: DLs vs. the Web
- They are not going to fight or replace each
other, but in the end they are going to
complement each other.
Hm!
Yes!
Probably!
With lots of new technology help for both
78The Web vs. Digital Libraries
79Libraries - Digital Libraries - Web
- Will the Web be the
- Digital Library ? Library ?
- Content
- Structure
- Use(fullness)
80Libraries - Digital Libraries - Web
- Use(fullness)
- Visits to libraries
- Approx 10 / year
- Of which private/touristic visits approx 10 /
year
- Visits to Digital Libraries
- Several times per week
- Visits to the Web
- Several times per day
- Usually finding what is needed, specifically for
professional purposes
- People increasingly use simple Google-style
interfaces for search rather than more complex
DL-like interfaces, not to mention library
catalogues
81Libraries - Digital Libraries - Web
- Structure (Information and Metadata)
- Libraries
- Very high, professional
- But inflexible: one sorting, rest in (digital)
catalogues; searching these is actually like a DL
without content
- Digital Libraries
- Very high, manually cared for, sometimes
community-driven
- Flexible ways of interaction
- Web
- None, but
- Increasingly created automatically: CiteSeer,
DBWorld, GoogleBooks, Genre Classification, Topic
analysis, Named Entity Detection
82Libraries - Digital Libraries - Web
- Content
- Libraries
- High-quality, selected content
- But increasingly not the content I need:
proceedings, papers are first/only in DL before
they make it to the library catalogue
- Digital Libraries
- Sometimes the only place where I can find certain
material (sufficiently easily)
- Increasingly also older content digitized
- Web
- Sometimes even more comprehensive than the DLs
- What's not on-line does not exist (cf. CiteSeer)
- Increasing amounts of traditional, high-quality
content on the Web (or in a DL?) (e.g. Internet
Archive)
83Libraries - Digital Libraries - Web
- So
- Looks like (parts of) the Web will turn into a
- Digital Library, which will eventually replace
the
- Conventional Library
- Requiring only
- More content (will come)
- More structure (will be provided by better
computer programs), which will lead to
- More users using this, assisted by better
interfaces
84The Web vs. Digital Libraries
85Google's Mission
- Google's mission is to organize the world's
information and make it universally accessible
and useful.
- Organize
- By vertical/property: Scholar, Book Search,
Product Search, News, Maps, etc.
- By search
- World's information
- What we can reach through the web
- What we license
- Universally accessible
- Via internet
- Internationalized and localized
- Useful: focussing on and meeting our users' needs
Is Google (put here your favourite search engine)
the world's Digital Library?
86A Digital Library's Mission
- A Digital Library's mission is, for a selected
user community, to organize that community's
information and make it universally accessible
and useful to that community.
- Organize
- According to the needs of the user community
(art, photographs, scientific data, ... )
- Community's information
- Information (including data) generated by the
community
- That can be reached through the web
- That can be licensed (or purchased)
- (Universally) accessible
- Via internet (including via web search)
- Internationalized and localized
- Useful: focussing on and meeting users' needs
within the selected user community
87Major Differences in Missions
- Scale
- Information: broad versus deep coverage
- General versus specific communities (and
therefore needs)
- Organizing principles (can be very different)
- Services provided: how we add value to
information/data
- Other considerations
- Profit
- Quality, conservation and preservation
- Authority
88Conclusions (opinion 3)
- Web Search (Google) and Digital Libraries share
similar but complementary missions
- Celebrate the diversity of missions, and
concentrate on strengths whether as web search
engine or digital library
- Search engines: scale, universal delivery,
universal services
- Digital libraries: specialized collections,
specialized services, library services
- Focus on delivering value to users through useful
and relevant (web) services (Focus on the user
and all else will follow)
- Web search is a service that Digital Libraries
should exploit to ensure universal access to
information and services
89The evolution of libraries
paper, pictures, audio, video
digital surrogates, born digital objects
Digital Library
digital librarians, digital curators, etc
???????
90Final conclusion
The important thing is ....
Getting There