Title: Building the NSDL
1 Building the NSDL
William Y. Arms
Cornell University
Thinking aloud about the NSDL
2 Acknowledgement and Disclaimer
The NSDL is a program of the National Science
Foundation's Directorate for Education and Human
Resources, Division of Undergraduate
Education. The ideas discussed in this talk do
not represent the official views of the NSF (or
of anybody except the author).
3 What's in a name?
4 The NSDL
5 The NSDL
Can we build a comprehensive digital library for
science education, without building a National
Science Digital Library?
6 The National Science Digital Library
7 The National Science Digital Library
It's BIG!
8 How big might the NSDL be?
- To be comprehensive: all branches of science, all levels of education, very broadly defined
- Five-year targets
- 1,000,000 different users
- 10,000,000 digital objects
- 100,000 independent sites
9 Digital collections for science
15 Opportunities for the NSDL
- Categories of material that have been given lower priority by libraries and publishers, e.g., datasets, software, and other dynamic content, ...
- Materials that are accessible for automatic processing, e.g., scientific web sites and databases, image collections, ...
- Materials designed for education, e.g., learning objects, curricula, problem sets, ...
Less opportunity for the NSDL
- Conventional scientific literature with restricted access
17 The NSF's strategy
18 The NSF cannot fund all collections
19 The NSF is funding selected collections ...
20 ... and a Core Integration team
The Core Integration task is to provide a
coherent set of services for users across great
diversity.
21 Resources
Core Integration
- Budget: 4 million
- Staff: 25 - 30
- Management: Diffuse
How can a small team, without direct management
control, create a very large-scale digital
library?
22 A spectrum of interoperability
23 Approaches to interoperability
The conventional approach
- Wise people develop standards: protocols, formats, etc.
- Everybody implements the standards.
- This creates an integrated, distributed system.
Unfortunately ...
- Standards are expensive to adopt.
- Concepts are continually changing.
- Systems are continually changing.
24 Interoperability is about agreements
- Technical agreements cover formats, protocols, security systems, etc., so that messages can be exchanged.
- Content agreements cover the data and metadata, and include semantic agreements on the interpretation of the messages.
- Organizational agreements cover the ground rules for access, for changing collections and services, payment, authentication, etc.
The challenge is to create incentives for independent digital libraries to adopt agreements.
25 Function versus cost of acceptance
[Chart: cost of acceptance against function, with regions labeled "Few adopters" and "Many adopters"]
26 Example: Textual mark-up
[Chart: cost of acceptance against function, with points labeled SGML, XML, HTML, and ASCII]
27 Levels of interoperability
- Federations: Collections follow strict standards for content, metadata, protocols, authentication, etc.
- Harvested Collections: Each collection makes metadata about its collections available in a simple exchange format (Open Archives metadata harvesting protocol).
- Gathered Collections: Material is gathered automatically by selective web crawling.
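As an illustration of the harvesting level, here is a minimal sketch of what a harvester might do: build a request for a collection's records and pull identifiers and titles out of the reply. It assumes the conventions of the later OAI-PMH 2.0 specification (the `ListRecords` verb, the `oai_dc` metadata prefix, and the namespaces shown), which may differ from the protocol version current when this talk was given. The base URL, the sample response, and the function names are hypothetical, and no network access is involved.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces as defined by OAI-PMH 2.0 and unqualified Dublin Core (assumed here).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response from a hypothetical collection, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:item-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample learning object</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

The point of the sketch is the low cost of acceptance: the collection only has to answer one simple request, and everything else is the harvester's problem.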
28 Levels of interoperability
Level | Agreements | Example
Federation | Strict use of standards (syntax, semantic, and business) | AACR, MARC, Z 39.50
Harvesting | Digital libraries expose metadata; simple protocol and registry | Open Archives
Gathering | Digital libraries do not cooperate; services must seek out information | Web crawlers and search engines
29 Metadata is expensive
- The NSDL cannot afford to create it manually
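A back-of-envelope calculation shows why. The sketch below uses only two figures from this talk, the five-year target of 10,000,000 digital objects and the Core Integration budget of 4 million. The cost of one unit per manually created record is purely hypothetical, chosen as an optimistic illustration and not a measured figure, and the sketch assumes the budget and the per-record cost are in the same currency unit and that the whole budget could be spent on metadata.

```python
# Figures taken from the talk.
TARGET_OBJECTS = 10_000_000  # five-year target for digital objects
CI_BUDGET = 4_000_000        # Core Integration budget, unit as given on the Resources slide

# Hypothetical assumption for illustration only: one currency unit per record.
HYPOTHETICAL_COST_PER_RECORD = 1.0


def manual_cataloging_cost(objects, cost_per_record):
    """Total cost of creating one metadata record per object by hand."""
    return objects * cost_per_record


def affordable_cost_per_record(budget, objects):
    """Largest per-record cost the budget could cover if spent on nothing else."""
    return budget / objects


total = manual_cataloging_cost(TARGET_OBJECTS, HYPOTHETICAL_COST_PER_RECORD)
ratio = total / CI_BUDGET
ceiling = affordable_cost_per_record(CI_BUDGET, TARGET_OBJECTS)

if __name__ == "__main__":
    print(f"Manual cataloging cost: {total:,.0f}")
    print(f"Multiple of the budget: {ratio}")
    print(f"Affordable cost per record: {ceiling}")
```

Even under this optimistic assumption, manual record creation would cost a multiple of the entire budget, before any services are built.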
30 User portals
Metadata repository
Distributed collections
31 Every collection is different
32 From an NSF-funded collection:
We are pleased with the technical side of the database and web access but we are complete novices in terms of how to make our collection part of the digital library. I assume this hinges on appropriate metadata, but I am not sure exactly what kinds
33 Metadata strategy
- Support eight standard formats
- Collect all existing metadata in these formats
- Provide crosswalks to Dublin Core
- Expose records in the metadata repository for others to harvest
- Concentrate on collection-level metadata
- Use automatic generation to augment item-level metadata
- Most Core Integration services will be created automatically from collection-level metadata or directly from the content (e.g., automatic indexing of text, automatic reference linking).
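To make the crosswalk idea concrete, here is a minimal sketch of mapping one collection's native record onto Dublin Core elements. The native field names, the sample record, and the function name are all hypothetical. The target element names (title, creator, subject, description, identifier, date) are from the Dublin Core element set. A real crosswalk for any of the supported formats would be considerably more involved.

```python
# Hypothetical native field names mapped to Dublin Core element names.
CROSSWALK = {
    "name": "title",
    "author": "creator",
    "keywords": "subject",
    "summary": "description",
    "url": "identifier",
    "created": "date",
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Map a native record (a dict) to Dublin Core.

    Returns (dc_record, unmapped_fields). Dublin Core elements are
    repeatable, so every element holds a list of values.
    """
    dc_record = {}
    unmapped = []
    for field, value in record.items():
        element = crosswalk.get(field)
        if element is None:
            # No agreed mapping: report the field instead of guessing.
            unmapped.append(field)
            continue
        values = value if isinstance(value, list) else [value]
        dc_record.setdefault(element, []).extend(values)
    return dc_record, unmapped


# Hypothetical record from a hypothetical collection.
native = {
    "name": "Plate tectonics animation",
    "author": "A. Example",
    "keywords": ["geology", "plate tectonics"],
    "sensor_id": "x-17",
}

if __name__ == "__main__":
    dc_record, unmapped = to_dublin_core(native)
    print(dc_record)
    print("Unmapped fields:", unmapped)
```

Fields with no Dublin Core equivalent are reported and not forced into an element, which is the usual price of a crosswalk: what does not fit the common format is lost to the shared services.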
34 Managing the NSDL
- Responsibility without authority
35 A personal observation
Despite all the evidence to the contrary, ...
we repeatedly over-estimate the benefits of
collaboration ...
and under-estimate the obstacles.
36 The NSDL challenge
During the preliminary phases ...
- Each project worked independently (NSF grants have little control)
- Coordination was through a loose set of committees, with mailing lists, bulletin boards, etc.
37 The NSDL challenge
For the production phase ...
- We must develop a robust, reliable set of services
- We must make compromises, decide priorities, etc.
- Yet we must attract the energy of many independent individuals and organizations
38 What doesn't work
Decision making by online forums
- Become dominated by a few people, not necessarily the most knowledgeable.
- Either usage dies away, or too many low-value messages drive away the busy people.
Decision making without responsibility
- Vision is easy. Implementation is hard.
39 What does work?
Money
- Thank you NSF!
Online discussions on specific topics
- Structured discussions as part of a decision-making process are often productive
Patience and persistence
- Success builds on success
40 The last word
From the Lisle, NY Volunteer Fire Brigade, September 17, 2001
God bless America.
Bingo, Tuesday 7:30 - 10:00.
41 Building the National S Digital Library
William Y. Arms
Cornell University