Title: Developments and Trends in the LMS and Discovery Arenas
1Developments and Trends in the LMS and Discovery
Arenas
Marshall Breeding
Director for Innovative Technology and Research, Vanderbilt University Library
Founder and Publisher, Library Technology Guides
http://www.librarytechnology.org/
http://twitter.com/mbreeding
26 August 2010 Stockholm
Program on National Infrastructure
2Seminar Goal
- The aim of the seminar is to create an understanding of the infrastructural challenges and to contribute to a plan of action for the future.
- Library Directors and System managers will discuss different solutions of availability and management of e-resources in order to make strategic choices for the development of the infrastructure at a national level.
3Presentation Themes
- Trends and recent developments in the library system market
- Resource discovery services and resource management as indexing/knowledge bases
- Creation and management of data wells for metadata
- Ongoing discussion regarding options for building data wells in-house, with open source, or by partnering with commercial actors
4Summary
- Development and trends in the library system market, regarding resource discovery services and resource management as indexing/knowledge bases.
If I should emphasize something special, it is the question of data wells for metadata. We have been investigating the data well question in a report (please see below, Summary in English), and there is a discussion about building data wells in-house, with open source, or with commercial actors. We have also invited three commercial actors to the seminar. Not an easy question!
Related is also the topic of the national catalogue LIBRIS as a local OPAC for the libraries. How can LIBRIS work not only as the national catalogue, but also as a local OPAC?
The third topic is the future for Ex Libris MetaLib/SFX in Sweden. We're happy with SFX, but not with MetaLib/federated search; how to continue?
But the main focus at the seminar will be resource management/data well, although the LIBRIS and MetaLib/SFX questions need to be included in the discussions.
5Basic Discovery Concepts
6Crowded Landscape of Information Providers on the
Web
- Lots of non-library Web destinations deliver content to library patrons
- Google Search / Google Scholar
- Amazon.com
- Wikipedia
- Ask.com
7User expectations
8Evolution of library collection discovery tools
- Bound handwritten catalogs
- Card Catalogs
- Library online catalogs (OPACs)
- Next-Gen Catalogs / Discovery interfaces
- Web-scale discovery services
9Bound Catalog
10Card Catalog
11Online Card Catalog
12Web-based online catalog
13Next-generation Catalog
14Next-generation Catalog
15Modernized Interface
- Single search box
- Query tools
- Did you mean
- Type-ahead
- Relevance ranked results
- Faceted navigation
- Enhanced visual displays
- Cover art
- Summaries, reviews
- Recommendation services
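Two of the interface features listed above, faceted navigation and "Did you mean", can be illustrated with a minimal Python sketch. The records, field names, and helper functions are hypothetical and exist only for illustration; a production discovery layer would typically delegate both features to its search engine rather than compute them this way.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical result records; the field names are illustrative assumptions.
RECORDS = [
    {"title": "Library discovery services", "format": "Book", "year": 2009},
    {"title": "Discovery layers in practice", "format": "Article", "year": 2010},
    {"title": "Federated search revisited", "format": "Article", "year": 2009},
]

# Vocabulary of indexed title words, used for spelling suggestions.
VOCABULARY = sorted({w.lower() for r in RECORDS for w in r["title"].split()})


def facet_counts(records, field):
    """Count how many records fall under each value of one facet field."""
    return Counter(r[field] for r in records)


def narrow(records, field, value):
    """Apply a facet selection: keep only records matching the chosen value."""
    return [r for r in records if r[field] == value]


def did_you_mean(term, vocabulary):
    """Suggest the closest indexed word for a possibly misspelled query word."""
    matches = get_close_matches(term.lower(), vocabulary, n=1, cutoff=0.7)
    return matches[0] if matches else None
```

Calling `facet_counts(RECORDS, "format")` gives the counts a facet sidebar would display, and `narrow` applies the user's click on one of those values.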
16Web site as menu of search options
17Disjointed approach to information and service
delivery
- Silos prevail
- Books: Library OPAC (ILS module)
- Articles: Aggregated content products, e-journal collections
- OpenURL linking services
- E-journal finding aids (often managed by link resolver)
- Local digital collections
- ETDs, photos, rich media collections
- Metasearch engines
- All searched separately
18Lack of unified Web presence
- Users don't understand the distinctions we make
- Catalog?
- Articles and Databases?
- Digital Library?
- Search our Site?
- Search interfaces based on content formats or management applications
- Non-library Web sites are much more unified
19A simple vision
- A single point of entry to all the content and services offered by the library
- but with precision, nuanced sophistication, and multiple dimensions
21Web-scale discovery
22Online Catalog vs. Discovery Layer
- Online Catalog
- Interface conventions from an earlier Web era
- Scope: tied to the ILS and its content domain
- Discovery Layer
- Modern interface elements
- Scope: aims to address broad range of components that constitute library collections
23Discovery Products
http://www.librarytechnology.org/discovery.pl
24Decoupled from ILS
25Social discovery
- Tags, user-supplied ratings and reviews
- Leverage social networking interactions to assist readers in identifying interesting materials (BiblioCommons)
- Leverage use data for a recommendation service of scholarly content based on link resolver data (Ex Libris bX service)
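The idea of deriving recommendations from link resolver usage can be sketched as a simple co-usage count: items requested in the same session are treated as related. This is only an illustration of the general technique, not a description of how the bX service works; the session data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical link-resolver sessions: the set of article IDs one user requested.
SESSIONS = [
    {"a1", "a2", "a3"},
    {"a1", "a2"},
    {"a2", "a4"},
]


def build_co_usage(sessions):
    """Count how often each pair of items appears in the same session."""
    co = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for x, y in combinations(sorted(session), 2):
            co[x][y] += 1
            co[y][x] += 1
    return co


def recommend(co, item, limit=3):
    """Return the items most often used together with the given item."""
    neighbours = co.get(item, {})
    # Sort by descending co-usage count, then by ID for a stable order.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:limit]]
```

With the sample sessions, `recommend(build_co_usage(SESSIONS), "a1")` ranks `a2` ahead of `a3` because `a1` and `a2` were requested together twice.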
26Deep indexing
- Metadata can no longer serve as the only basis for discovery
- Increasing opportunities to search the full contents
- Google Library Print, Google Publisher, Open Content Alliance, government publications, etc.
- High-quality metadata will improve search precision
- Commercial search providers already offer search inside the book and searching across the full text of large book collections
- Important transition to full-text book search beginning in library projects
- HathiTrust indexing 6 million volumes
- Must become a routine component of library discovery
- Deep search highly improved by high-quality metadata
27Discovery product Trend
- Initial products focused on technology
- AquaBrowser, Endeca, Primo, Encore, VuFind
- Mostly locally-installed software
- Current phase focused on integrated access to both local content and remote articles to deliver Web-scale discovery. Examples:
- Summon (Serials Solutions)
- WorldCat Local (OCLC)
- EBSCO Discovery Service (EBSCO)
- Primo Central
- Encore Synergy
28Beyond Federated search
- Federated Search / Metasearch use real-time queries against multiple information targets
- No centralized index; presentation of dynamic results
- Shallow results -- only a few results initially fetched from each target
- Difficult to calculate relevancy
- Performance challenges
29Beyond local discovery interfaces
- Pre-populated indexes
- Web-scale
- Exploits the full depth and breadth of library collections
- Beyond the bounds of the local library's collection
- Targets the universe of objective, vetted library content
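The difference between real-time federated search and a pre-populated index can be shown in a small Python sketch. The sources, titles, and function names are hypothetical; real targets would be remote services and a real index would be built by a search engine, so this only illustrates the shape of the two approaches.

```python
from collections import defaultdict

# Hypothetical content sources; in practice these would be remote targets.
SOURCES = {
    "catalog": ["Swedish library history", "Discovery systems handbook"],
    "articles": [
        "Web-scale discovery in libraries",
        "Library metadata quality",
        "Discovery and delivery",
    ],
}


def federated_search(term, sources, per_target=1):
    """Query every target at search time and keep only a few hits from each."""
    hits = []
    for name, titles in sources.items():
        matched = [t for t in titles if term.lower() in t.lower()]
        hits.extend((name, t) for t in matched[:per_target])  # shallow results
    return hits


def build_index(sources):
    """Harvest all sources ahead of time into one inverted index."""
    index = defaultdict(set)
    for name, titles in sources.items():
        for title in titles:
            for word in title.lower().split():
                index[word].add((name, title))
    return index


def index_search(term, index):
    """Answer a query from the pre-built index without contacting the sources."""
    return sorted(index.get(term.lower(), set()))
```

With `per_target=1`, the federated query for "discovery" sees only one hit per source, while the pre-built index returns every matching record and can rank them together.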
30Pre-populated discovery services
- New-generation interface
- Harvested local content
- ILS metadata
- Institutional repositories, ETDs, Digital Collection platforms
- Vendor-supplied indexes of library content
- E-journals, databases, e-books
- Full-text and metadata corresponding to e-content subscriptions
- Book collections beyond local library collections
- Includes full-text indexing to the fullest extent possible
31Online Catalog
(Diagram labels: ILS Data; Search Results)
32Federated Search
(Diagram labels: ILS Data; Digital Collections; ProQuest; EBSCOhost; MLA Bibliography; ABC-CLIO; Real-time query and responses; Search Results)
33Discovery Interface
(Diagram labels: ILS Data; Digital Collections; Local Index; ProQuest; EBSCOhost; MLA Bibliography; ABC-CLIO; MetaSearch Engine; Real-time query and responses; Search Results)
34Web-scale Search
(Diagram labels: ILS Data; Digital Collections; ProQuest; EBSCOhost; MLA Bibliography; ABC-CLIO; Pre-built harvesting and indexing; Consolidated Index; Search Results)
35Web-scale Search Federated Search
(Diagram labels: ILS Data; Digital Collections; ProQuest; MLA Bibliography; ABC-CLIO; Pre-built harvesting and indexing; Consolidated Index; Non-harvestable Resources; FedSearch; Search Results)
36Discovery ? Delivery
- Discovered content delivered through original repositories
- Publisher agreements generally preclude exposing content for direct access
- Should necessarily circumvent core role of publisher
37Benefits
- Libraries: Increased access to high-cost electronic content
- Users: Easier access to research resources
- Publishers: Increased impact of content products
- IT perspective: Advance harvesting makes more efficient use of resources than simultaneous real-time queries
38Toward a Large-scale National Discovery
environment
39Obstacles and Challenges
- Scalable technology platform
- Acceptable relevancy-based retrieval for large heterogeneous collections
- Acquisition of data and metadata for aggregated index
40Opportunities
- Climate more favorable to harvesting e-content for indexing
- Highly scalable, open source tools for discovery infrastructure
- Lucene
- SOLR
- Many ongoing synergistic projects as possible collaborative partners
41Potential Commercial Partners
- Three commercial organizations will participate in the seminar
- Ex Libris
- Serials Solutions
- EBSCO
- Each has negotiated access to commercial content products
- Paved the way for library driven projects
42Other similar projects
43Summa
- State and University Library of Denmark
- Locally built integrated search
- Catalogs and articles
- Failed to receive EU funding due to lack of guarantees to receive article data from publishers
- Now partnering with Serials Solutions to use article index from Summon via API
44Trove
- National Library of Australia
- Previously called Single Business Discovery Project
- Brings together many previously separate discovery systems
- Built in-house at NLA
- Prototype released May 2009
- Includes some full-text as well as metadata
- Technology: Java, Lucene, SOLR, MySQL
- Details: http://www.nla.gov.au/pub/gateways/issues/101/story01.html
45What about OCLC?
- WorldCat: ever expanding repository of metadata
- Books mostly, increasing article metadata
- Focused on expanding WorldCat for broad discovery
- ArticleFirst: 23 million records
- April 2009 agreement with EBSCO for article metadata (withdrawn?)
- Quantity of article metadata apparently not on track to attain the same level of comprehensiveness as seen in Summon, EDS, Primo Central
46Developing the Data Well / Aggregated index
- Aggregation of metadata and content
- Normalization: map metadata to make indexing, facets, and presentation meaningful
- De-duplication of records within and between content sources
- FRBR: Collapsible groupings according to FRBR concepts
- work -- expression -- manifestation -- item
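The steps above (normalization, de-duplication, FRBR-style grouping) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the field names, the made-up identifiers, the rule that the first source seen wins, and the crude title-plus-author work key. Real pipelines use far more careful matching.

```python
import re
from collections import defaultdict

# Hypothetical records from two sources with different field names.
# The ISBN values are made-up placeholders, not real identifiers.
RAW = [
    {"source": "ils", "Title": "The Discovery Layer ", "Author": "Smith, A.",
     "isbn": "978-0-00-000000-2", "format": "print"},
    {"source": "vendor", "title": "the discovery layer", "creator": "Smith, A.",
     "isbn": "9780000000002", "format": "print"},
    {"source": "vendor", "title": "The Discovery Layer", "creator": "Smith, A.",
     "isbn": "9780000000019", "format": "ebook"},
]


def normalize(rec):
    """Map source-specific fields onto one common schema."""
    title = rec.get("title") or rec.get("Title") or ""
    author = rec.get("creator") or rec.get("Author") or ""
    return {
        "title": " ".join(title.split()),           # collapse stray whitespace
        "author": author.strip(),
        "isbn": re.sub(r"[^0-9Xx]", "", rec.get("isbn", "")),  # drop hyphens
        "format": rec.get("format", "unknown"),
        "source": rec["source"],
    }


def deduplicate(records):
    """Keep one record per identifier; the first source seen wins."""
    seen = {}
    for rec in records:
        # Records without an identifier are never merged with each other.
        key = rec["isbn"] or ("no-id", rec["source"], rec["title"])
        seen.setdefault(key, rec)
    return list(seen.values())


def group_by_work(records):
    """Collapse manifestations (print, ebook, ...) under a shared work key."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["title"].lower(), rec["author"].lower())].append(rec)
    return dict(groups)


records = deduplicate([normalize(r) for r in RAW])
works = group_by_work(records)
```

In this sample the two print records collapse into one after normalization, and the remaining print and e-book records group under a single work, which is the kind of collapsible display the slide describes.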
47Content sources populating the Aggregated Index
- Article metadata and full text
- Index views according to profile
- Coordinated with local OpenURL knowledge bases
- Digital Collections
- LMS Metadata
- Books, Microfilm, periodical titles, DVD, etc
- Blending of vendor provided metadata and locally managed unique content
- At the cusp of being able to represent library collections comprehensively
48Acquiring content for Aggregated Index
- Agreements with publishers and providers of article content to libraries
- Open access content
- Any OAI target
- Local digital collections
- Relevant library catalog data
- OK with OCLC record use policies when aggregated
at a national level?
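Harvesting from "any OAI target" usually means issuing OAI-PMH `ListRecords` requests and following resumption tokens until the repository has no more records. The sketch below only builds request URLs and reads the token from a response; the base URL and sample response are placeholders, and the network fetch and record parsing are deliberately left out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Minimal hand-written response used only to exercise next_token().
SAMPLE_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>abc123</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # A resumption token is an exclusive argument: send it alone.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # enables incremental harvesting
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token from a response, or None when finished."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is not None and token.text:
        return token.text.strip()
    return None
```

A harvester would loop: fetch the first URL, hand the records to the import process, then call `list_records_url` again with the value from `next_token` until it returns `None`.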
49Data Well Construction
- Technical
- Assembling technologies of adequate scale and capacity
- Indexing, Search and retrieval
- Normalizing
- Business / Political
- Agreements with commercial publishers to provide metadata or content
- Increasing expectation from libraries to allow harvesting for discovery
- (Similar to COUNTER compliance, OpenURL support)
- Improved performance at delivering library end users to publisher content
50Relationship with OpenURL Knowledgebase
- The aggregation of article-level citations and content relates to journal title-level profile and availability data in the OpenURL knowledgebase
- Important source of profiling needed to deliver appropriate views of the index for different libraries.
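One way to picture the relationship is as a filter: title-level holdings from the knowledgebase decide which article-level records each library sees. The Python sketch below assumes a deliberately simplified knowledgebase (journal ISSN mapped to the first year of access) with made-up identifiers; real knowledgebases carry far richer coverage and embargo data.

```python
# Hypothetical knowledgebase: per library, journal ISSN -> first year of access.
# The ISSNs are made-up placeholders.
KNOWLEDGEBASE = {
    "library_a": {"1234-5678": 2000, "2345-6789": 1995},
    "library_b": {"1234-5678": 2008},
}

# Hypothetical article-level records in the aggregated index.
ARTICLES = [
    {"title": "Article one", "issn": "1234-5678", "year": 2005},
    {"title": "Article two", "issn": "1234-5678", "year": 2009},
    {"title": "Article three", "issn": "2345-6789", "year": 1999},
]


def index_view(articles, holdings):
    """Return only the articles covered by one library's title-level holdings."""
    view = []
    for art in articles:
        start = holdings.get(art["issn"])
        if start is not None and art["year"] >= start:
            view.append(art)
    return view
```

Here `index_view(ARTICLES, KNOWLEDGEBASE["library_b"])` returns only "Article two", while library A, with broader holdings, sees all three records from the same shared index.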
51A labor-intensive project
- Business process
- Develop relationships with providers and publishers
- Construct contracts and licenses
- Technical
- Create import process for each source
- Normalization, Mapping, de-duplication, FRBR groupings
- Initial load plus constant incremental updates
- Creation of highly scalable indexing and retrieval platform
- Must scale up to 1 billion articles
- Develop algorithms and tunings for appropriate relevancy rankings
- Interface design
52Building Expectations for Article Discovery
- Libraries should require agreements for harvesting as part of content licensing process
- Library licenses have led to broad support for:
- COUNTER
- SUSHI
- OpenURL Linking
53Beyond Metadata
- Increasing expectation for full-text indexing
- Capacity present in e-journals for many years
- Full-text book indexing more problematic
- Much full text not available
- Complex to index
54Heterogeneous index
- Books: mere millions
- Articles: many hundreds of millions
- Digital objects: many hundreds of millions
55How to deal with non-harvestable resources
- Metasearch?
- Resource recommendation service
- Database spotlighting
56Positioning of Discovery vs native Interfaces
- Current generation of discovery interfaces lacks important features
- Service delivery (items borrowed, renewals, fee payments, etc.)
- Browse and other advanced search or retrieval features
- Many libraries use native Web-based catalog to supplement
- Native interfaces of major information products appeal to discipline specialists
57Content Services
- Must go beyond discovery to fulfillment
- Further integration of user services features into discovery interface
- Increased resource sharing capabilities
58LIBRIS
- National Union Catalog >
- Local catalog?
- Local LMS?
59LMS deployments in Sweden -- Academic
60LMS deployments in Sweden -- Public
61Mobile
- The next new front for Library Discovery
62Relevant Technology Trends
63Service-oriented architecture
- Key technology for interoperability among diverse software applications
- New applications built with SOA throughout
- Legacy applications with a services layer
64Aggregating data and metadata
- Open source
- Commercial partnerships
65Mobile access to library content and services
- New opportunity to retain and attract library users
- Mobile web and apps
- Working toward a unified Mobile library presence
- Unify disjointed mobile silos: the same ambitions as we have for the Web
66Questions and Discussion