Title: Federated Search
1Federated Search
- April Ens
- Rowena Koh
- Jovy Rosario
- Desy Wahyuni
- Suher Zaher-Mazawi
LIBR 557 November 2007
2Outline
- Definitions
- Federated vs. Metasearch
- Why Federated Search?
- Issues and Challenges
- Metasearch Initiative
- Examples
- MetaLib (Ex Libris)
- WebFeat
- AGent Search (Auto-Graphics, Inc.)
- 360 Search (Serials Solutions)
- Free and Open Source Options
- Conclusion
3Definitions
- Federated search broadcasts a single search
query across multiple sources of information and
aggregates results into a single point of access,
usually displayed in a common format.
- (Marshall et al. 2006)
4Definitions
- Federated search is the consolidated retrieval
of results in response to a query sent to several
databases hosted by different online information
systems. Federated searching consists of
transforming a query and broadcasting it to a
group of disparate databases with the appropriate
syntax, merging the results collected from the
databases, presenting them in a succinct and
unified format with minimal duplication, and
allowing the library patron to sort the merged
result set by various criteria.
- (Jacsó 2004)
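The steps in this definition (transform the query, broadcast it, merge, de-duplicate, sort) can be sketched in a few lines of Python. This is a minimal illustration only, not how any product discussed here works: the two in-memory "databases", the `make_source` helper, and the per-source `translate` functions are hypothetical stand-ins for remote systems and their native query syntaxes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two remote databases (illustrative records only).
CATALOGUE = [
    {"title": "Federated Search Engines", "year": 2004},
    {"title": "Thoughts About Federated Searching", "year": 2004},
]
INDEX = [
    {"title": "Federated search engines", "year": 2004},
    {"title": "Why Federated Search?", "year": 2005},
]


def make_source(name, records, translate):
    """Build a source whose search() first rewrites the query in its own syntax."""
    def search(query):
        native = translate(query)
        return [dict(r, source=name) for r in records
                if native in r["title"].lower()]
    return search


SOURCES = [
    make_source("catalogue", CATALOGUE, lambda q: q.lower()),
    make_source("index", INDEX, lambda q: q.strip().lower()),
]


def federated_search(query, sort_key="year"):
    # Broadcast the query to every source in parallel.
    with ThreadPoolExecutor() as pool:
        result_sets = list(pool.map(lambda s: s(query), SOURCES))
    # Merge, dropping records whose normalized title was already seen.
    merged, seen = [], set()
    for result_set in result_sets:
        for record in result_set:
            key = record["title"].lower()
            if key not in seen:
                seen.add(key)
                merged.append(record)
    # Let the caller choose the sort criterion.
    return sorted(merged, key=lambda r: r[sort_key])


results = federated_search("federated search")
```

Here the duplicate "Federated search engines" record from the second source is dropped, and the three remaining records come back sorted by year.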
5Federated vs Metasearch
- Metasearch, parallel search, federated search,
broadcast search, cross-database search, search
portal are a familiar part of the information
community's vocabulary. They speak to the need
for search and retrieval to span multiple
databases, sources, platforms, protocols, and
vendors at one time.
- (NISO)
- but federated search vendors prefer not to be
known as metasearch engines
6Federated vs Metasearch
7Why federated search?
- Growth in variety of information sources, formats
and platforms
- Limited budgets
- Web search engines especially
8Why federated search?
- OCLC study on the Perceptions of Libraries and
Information Resources
- Open web search engines are viewed as more
reliable, easy to use, convenient and
cost-effective
- 87% of respondents say that the web search engine
would be first choice next time they need a
source for information
- Patrons stop using resources after becoming
frustrated by the number of dissimilar search
interfaces they must use to access content
9Why federated search?
- But library-provided electronic resources are
still highly rated
- Libraries now have an excellent opportunity to
provide a simple, yet powerful interface that
out-Googles Google. (Fryer 2004)
10Why federated search?
11Why Federated Search? General Features
- Customized interface design
- Organizes databases by subject
- Start search by selecting subject or desired
databases
- Boolean and field searching
- One search box
- Most allow customization of basic search fields
- De-duplication of results
- Merging of results
- Can usually be sorted by date, relevancy, or
title
- May also be able to filter by field
- User interaction functions
12Why federated search?
- One begins to glimpse here a world where
information silos are a thing of the past, where
library provided content truly is presented in a
unified, integrated fashion, where hypertext
linking begins to realize its ultimate potential
and users can customize offerings according to
their preferences in a true portal environment.
- (Boss and Nelson 2005)
13Issues and Challenges
- Verification, authentication, certification,
licensing
- access to multiple, password-protected databases
- access only to licensed users
- Lack of advanced search functions
- Basic Boolean
- No proximity searching
- Multiple-field searching
- Default to keyword
14Issues and Challenges
- Limited by functions of native database
- De-duplication
- Exact vs. Variable de-duplication
- De-duplicates only first set of results
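One way to read the exact vs. variable distinction is sketched below. This is an assumption about what the two terms mean, not a description of any vendor's algorithm: "exact" treats records as duplicates only when the fields are identical, while "variable" first normalizes case, punctuation, and spacing. The function names and sample records are hypothetical.

```python
import re


def exact_key(record):
    """Exact matching: duplicates only if every field is identical."""
    return (record["title"], record["author"], record["year"])


def variable_key(record):
    """Variable matching: ignore case, punctuation, and extra whitespace."""
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(record["title"]), norm(record["author"]), record["year"])


def deduplicate(records, key):
    """Keep the first record seen for each key, preserving input order."""
    seen, unique = set(), []
    for record in records:
        k = key(record)
        if k not in seen:
            seen.add(k)
            unique.append(record)
    return unique


records = [
    {"title": "Federated search engines", "author": "Fryer, D.", "year": 2004},
    {"title": "Federated Search Engines.", "author": "Fryer, D", "year": 2004},
    {"title": "Federated search engines", "author": "Fryer, D.", "year": 2004},
]
exact = deduplicate(records, exact_key)        # keeps 2 records
variable = deduplicate(records, variable_key)  # keeps 1 record
```

Because each database formats citations slightly differently, exact matching misses the second record, which differs only in capitalization and punctuation.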
15Issues and Challenges
- Relevancy ranking
- Generally based on frequency of search keywords
in citations
- Full-text and abstracts not generally available
- Slow response time
- Too much information?
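Ranking by keyword frequency in citations can be illustrated with a short Python sketch. It is a simplified assumption of how such a score might be computed, with made-up citation strings, and it shows why the approach is crude: only whole-word matches in the short citation text are counted.

```python
import re
from collections import Counter


def keyword_frequency_score(query, citation):
    """Count how often each query keyword appears in the citation text."""
    words = Counter(re.findall(r"[a-z0-9]+", citation.lower()))
    return sum(words[term] for term in re.findall(r"[a-z0-9]+", query.lower()))


def rank(query, citations):
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(citations,
                  key=lambda c: keyword_frequency_score(query, c),
                  reverse=True)


# Illustrative citation strings, not real search output.
citations = [
    "Thoughts about federated searching",
    "Federated search engines: search tools for libraries",
    "The truth about federated searching",
]
ranked = rank("federated search", citations)
```

The second citation ranks first because "search" appears twice as a whole word, while "searching" in the other two does not count at all.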
16Metasearch Initiative
- NISO has gathered vendors, content providers, and
library systems to work on its Metasearch
Initiative to create standards
- proprietary vendor verification, authentication,
and certification to use certain databases
- search protocol standardization
- common descriptors for data and content tags, as
well as taxonomies
- how result sets should be sorted, ranked, and
ordered
17MetaLib
18MetaLib Overview
- Developed by Ex Libris Ltd.
- Version 1.0 released in 2001.
- Version 4.0 released on February 8, 2007
- One of the major federated search tools available
- Aimed mainly at academic and research
institutions
- Widely used internationally
19MetaLib in Practice
- Canada: University of British Columbia; University of Victoria; University of Quebec
- USA: West Texas A&M University's Cornette Library; National Aeronautics and Space Administration (NASA) Johnson Space Center; Cleveland Museum of Art
- Europe: University of Birmingham (UK); Louis Pasteur University (Strasbourg I); Groningen University (Netherlands); Humboldt University (Germany); Consejo Superior de Investigaciones Científicas (Spain); National Electronic Library of Finland (Finland)
- Asia: Korea Advanced Institute of Science and Technology (South Korea); Beijing University of Technology (China)
- Africa & Middle East: The American University (Cairo, Egypt); College of Management (Israel)
20MetaLib General Features
- Flexible consortia options
- Customizable and meets user needs
- Streamlined authentication and authorization
- Multilingual, accessible, and standards-based
- Works in conjunction with SFX
- Xerxes: open-source implementation of MetaLib
21MetaLib Search Features
- Search Levels
- QuickSearch vs. MetaSearch
- Simple search
- Advanced search
22MetaLib Search Features
- Syntax
- Boolean AND (default), OR, NOT
- Phrase uses quotation marks
- Truncation uses the symbol ?
- Field Searches
- subject, title, author, ISBN, ISSN, year
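The syntax rules on this slide (AND by default, quotation marks for phrases, ? for truncation) can be sketched as a small matcher. This is an illustrative assumption, not MetaLib's actual parser: the `matches` function is hypothetical, it ignores OR, NOT, and field searches, and it treats ? as end-of-word truncation only.

```python
import re
import shlex


def matches(query, text):
    """Return True if text satisfies every term in the query (implicit AND)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    normalized = " ".join(words)
    # shlex.split keeps a quoted phrase together as one term.
    for term in shlex.split(query.lower()):
        if term.endswith("?"):
            # Truncation: match any word that starts with the stem.
            stem = term[:-1]
            if not any(w.startswith(stem) for w in words):
                return False
        elif " " in term:
            # Quoted phrase: the words must appear together, in order.
            if f" {term} " not in f" {normalized} ":
                return False
        elif term not in words:
            return False
    return True


title = "Federated search tools for libraries"
both = matches('"federated search" librar?', title)
reversed_phrase = matches('"search federated"', title)
missing_term = matches("federated google", title)
```

Only the first query matches: the phrase appears in order and `librar?` matches "libraries", whereas the reversed phrase and the missing keyword each fail the implicit AND.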
23MetaLib QuickSearch Interface
24MetaLib QuickSearch Interface
25MetaLib PowerSearch Interface
26MetaLib Unique Feature
- MetaLib uses Z39.50 or XML gateway protocols,
thus bypassing the native search interface and
using its own search to fetch data from native
databases. (Chen, 2006)
- Consequences
- The search results look-and-feel
- Database ranking on the search result page
27MetaLib Search Process
28MetaLib Search Results
29MetaLib Search Results
30MetaLib Search Results
31MetaLib Strengths
- One clear, familiar user interface
- Customizable, flexible infrastructure
- Easy to manage
- Works with any integrated library system
- Provides a layer above the institution's system
- Vendor-hosted or locally hosted
32MetaLib Weaknesses
- Retrieves first 30 records from each database
based on relevance rankings
- Not all databases can be configured to work with
MetaLib's search software
- Does not really offer one-stop shopping
- Searches databases within one subject only
- Slow speed
- Internal navigation only
33MetaLib The UBC Library
34MetaLib The UBC Library
35MetaLib The UBC Library
36MetaLib The University of York Library
Archives
37MetaLib The University of York Library
Archives
38MetaLib The University of York Library
Archives
39WebFeat - the original federated search engine
40WebFeat Overview
- Three products
- WebFeat Custom API
- WebFeat Express for budget-conscious libraries
- WebFeat Enterprise edition for multi-library
networks
- Used primarily by large public library systems
and academic libraries
41WebFeat @ NYPL
42Selected Databases
44WebFeat Search Features
45WebFeat Search Features
46WebFeat Search Features
47WebFeat Search Features
48WebFeat Syntax
49WebFeat Advanced Search
50WebFeat Search Process
51WebFeat Search Process
52WebFeat Search Results
53Clicking the View button leads to the article
citation found on the interface of the native database
54WebFeat Strengths
- Flexible and highly customizable interface
55Some examples: Princeton's PULQuickSEARCH
56VPL's All in One Search
57University of Pittsburgh's Zoom!
58WebFeat Strengths
- Development and maintenance of translators
handled by WebFeat
- Translators are tools that allow WebFeat to
interact with the native database
- Vendor-hosted
- One-stop shopping: ability to combine databases
in different subject categories
59WebFeat Strength
60WebFeat Weaknesses
- Vendor-hosted
- WebFeat IP addresses need to be added to all
database vendors
- Inconsistent results
- Can only search databases with a search box on
the front page
- Does not offer de-duplication
- Response time can be slow
61AGent Search
62AGent Search Overview
- A product of Auto-Graphics, Inc.
- Previously known as AGent Portal
- Can be integrated into a library's existing
applications, or used entirely from a web portal.
63Who Uses AGent Search?
- 5,000 libraries use AGent Search
- Auto-Graphics, Inc. claims that more libraries
use AGent Search than any other federated search
tool
- Users include public libraries, academic
libraries, and library consortia
64Library Consortia JerseyClicks
65Who uses AGent Search locally?
- Coquitlam Public Library
- Port Moody Public Library
- These libraries worked with Auto-Graphics, Inc. to
create a shared search between their two
libraries
- Their case study has been featured in AGent
literature
66Coquitlam & Port Moody
- Billed as "Two Libraries. One Search."
- AGent Search can be accessed from either library's
website, using an appropriate library card.
- The search accesses both library catalogues, the
Outlook provincial catalogue, and a number of
encyclopedias and databases
- Guest Access is also available, but very few
databases are accessible.
67AGent Search Guest Access
68AGent Search Patron Access (Keyword Search)
69AGent Search Patron Access (Advanced)
70AGent Search Syntax
- Wildcards, phrase searching, and basic Boolean
(AND, OR, NOT) can be used in both keyword and
advanced search
- When not otherwise specified, AND is assumed when
multiple words are used in a given search box
71AGent Search Wildcard Syntax
- ? - Use within a word to represent a single
character, e.g. p?jama to retrieve pajama or pyjama.
- - Use this to replace multiple characters.
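The single-character wildcard can be illustrated by translating it into a regular expression. This is a minimal sketch of the general technique, not AGent's implementation: `wildcard_to_regex` is a hypothetical helper that handles only the ? symbol shown on the slide, and the case-insensitive matching is an assumption.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '?' into 'exactly one character'; escape everything else."""
    parts = (re.escape(piece) for piece in pattern.split("?"))
    return re.compile("^" + ".".join(parts) + "$", re.IGNORECASE)


rx = wildcard_to_regex("p?jama")
candidates = ["pajama", "pyjama", "pjama", "pajamas"]
hits = [word for word in candidates if rx.match(word)]
```

With these candidates, `p?jama` retrieves "pajama" and "pyjama" but not "pjama" (no character in the wildcard position) or "pajamas" (an extra trailing letter).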
72AGent Search Search Results
73AGent Search Results Sorted by Date
75AGent Search - Strengths
- Search history, up to 50 searches
- Saved search preferences
- Ability to hide seldom-used resources, or select
favourite resources to be automatically selected
- Boolean, phrase searching, and wildcards
available in all search interfaces
- Search results can be grouped by resource, title,
author, or date. AGent is able to recognize
identical items
- Full access to results (not just screen scraping)
- Statistical tracking for libraries
76AGent Strength Search History
77AGent Strength Saved Search Preferences
78AGent Search Weaknesses
- Results are delivered in random order.
- Selections offered by Port Moody / Coquitlam are
limited. A significant number of resources can
only be searched individually, or in the library
due to licensing issues.
79360 Search
80360 Search Overview
- Vendor: Serials Solutions/ProQuest
- Formerly Central Search
81Libraries using 360 Search
- James A. Cannavino Library at Marist College
- http://library.marist.edu/
- The University of Arizona Library
- http://www.library.arizona.edu/
- Minneapolis Public Library
- http://www.mpls.lib.mn.us/
- And about 90 other libraries
82360 Search Features
- Boolean operators
- AND, OR, NOT
- Phrase search using quotation marks
- Truncation
- wildcard symbol
83360 Search Basic Search
84360 Search Basic Search
85360 Search Basic Search
86360 Search Advanced Search
87360 Search Search Results
88360 Search Search Results
89360 Search Strengths
- Hosted application
- no server required on library site
- no maintenance, upgrading, troubleshooting by
library personnel
- Customizable
- the databases included
- interface
- results
- name of the service
90360 Search Strengths
- Results clustering
- Aggregates citations based on common terms
- Helps refine searches
- Vivisimo technology
91360 Search Results Clustering
92360 Search Results Clustering
93360 Search Results Clustering
94360 Search Results Clustering
95360 Search Weaknesses
- Clustered results can sometimes be weak and/or
random
- A basic Any search for Civil War in a history
group of databases resulted in the following
clusters (with number of hits in each cluster)
Review (44), History (15), Iraq (8), Memory (7),
Account (5), Economy (5), Party , Hollande
(5), President (5), Wars (5), Other (51). Some of
these clusters work but many do not (indicated by
brackets here) and, most surprisingly and oddly,
a cluster for United States does not get
generated. (Callicott, 2007, p. 7)
- Searches ISSN but not journal title
96Open Source Options
97- Please refer to Fall 2006 557 presentation for a
thorough discussion of this search engine
http://www.slais.ubc.ca/courses/libr557/06-07-wt1/presentations.htm
98Other Open Source Federated Search Engines
- LibraryFind (Oregon State University Libraries)
http://www.libraryfind.org/
- MasterKey, part of the Keystone Digital Library
Suite (Index Data) http://www.indexdata.dk/keystone/
- Xerxes (California State University)
http://xerxes.calstate.edu/
99Free Federated Search
100- "a free federated, vertical search portal"
101The debate
- Loss of controlled vocabularies and specialized
features of individual databases
- Not all library databases may be available on the
federated system
- Just another choice on your screen?
- It is dumbing down searching
102Final Thoughts
- It's time for librarians to accept that library
users are not interested in being more like us.
- (Luther 2003)
103Final Thoughts
- Learning about these databases'
availability is one thing. Getting to them by
clicking through the labyrinth on many library
Web sites is another. Making patrons use them
while applying the strict semantic and syntax
rules of Boolean and proximity operators to terms
looked up from the thesauri is yet another thing.
It's no surprise that patrons are happy if they
make it through one database and catch just a few
small "fish." They don't go to see if another
database may have more and/or better results.
Most give up and angrily leave whatever database
they were using. Then they'll go to Google and
type the query library anxiety information
overload help, which will find a few good-enough
full-text reports, case studies, and articles
among the first several of the more than 11,400
free results. They may never come back to the
digital library again.
- (Jacsó 2004)
104Resources
- Auto-Graphics, Inc. (2007). AGent Search. Retrieved November 12, 2007, from http://www4.auto-graphics.com/products/agentsearch/agentsearch.htm
- Auto-Graphics, Inc. (2007). AGent Search case study: Port Moody and Coquitlam Public Libraries of British Columbia (BC). Retrieved November 12, 2007, from http://www4.auto-graphics.com/products/agentsearch/cs_port_moody.htm
- Boss, S. C., & Nelson, M. L. (2005). Federated search tools: The next step in the quest for one-stop-shopping. Reference Librarian 44, 139-160.
- Boyd, J., Hampton, M., Morrison, P., Pugh, P., & Cervone, F. (2006). The one-box challenge: Providing a federated search that benefits the research process. Serials Review 32(4), 247-254.
- Callicott, B. (2007). 360 Search (formerly Central Search). The Charleston Advisor 9(1), 5-8.
- Chen, X. (2006). MetaLib, WebFeat, and Google: The strengths and weaknesses of federated search engines compared with Google. Online Information Review 30(4), 413-427.
105Resources
- Coquitlam Public Library (2007). Port Moody and Coquitlam Public Libraries: Two libraries. One search. Retrieved November 12, 2007, from http://coquitlam-agent.auto-graphics.com/agent/login.asp?cidscbclidCOQR
- Curtis, A. M., & Dorner, D. G. (2005). Why federated search? Knowledge Quest 33(3), 35-37.
- Dartmouth College Library (2007). About Search360. Retrieved November 12, 2007, from http://library.dartmouth.edu/search/search360/about360.shtml
- Ex Libris Ltd. (2007). MetaLib official website. Retrieved November 10, 2007, from http://www.exlibrisgroup.com/metalib.htm
- Fahey, S. (2007). Fed searchers? The debate about federated search engines. Feliciter 52(2), 62-63.
- Fryer, D. (2004). Federated search engines. Online 28(2), 16-19.
- Guinn, D. (2006). Federated search symposium panelist fact sheet. Retrieved November 10, 2007, from http://www.thealbertalibrary.ab.ca/viewPosting.asp?postingID182
106Resources
- Hane, P. J. (2003). The truth about federated searching. Information Today 20(9), 24.
- Hollandsworth, B. L., & Foy, J. (2007). Griffin search: How Westminster College implemented WebFeat. Library Hi Tech 25(2), 211-219.
- Jacsó, P. (2004). Thoughts about federated searching. Information Today 21(9), 17/20.
- Jacsó, P. (2007). Vivisimo, Central Search, TIME Magazine, and the Open Directory Project. Online 31(1), 58-60.
- Marshall, P., Herman, S., & Rajan, S. (2006). In search of more meaningful search. Serials Review 32(3), 172-180.
- Newton, V. W., & Silberger, K. (2007). Simplifying complexity through a single federated search box. Online 31(4), 19-21.
- NISO (2007). Metasearch initiative. Retrieved November 8, 2007, from http://www.niso.org/committees/MS_initiative.html
107Resources
- Port Moody Public Library (2007). Two libraries. One search. Retrieved November 12, 2007, from http://library.portmoody.ca/Resources/OneSearch/default.htm
- Rogers, M. (2006). Serials Solutions debuts Vivisimo. Library Journal 131(18), 27-28.
- Serials Solutions (n.d.). 360 Search help. Retrieved November 13, 2007, from http://va3wn8qp2m.cs.serialssolutions.com/csStatic/html/helpPages/clustering.html
- Tenopir, C. (2007). Can Johnny search? Library Journal 132(2), 30.
- The UBC Library (2007). How to search MetaLib. Retrieved November 10, 2007, from http://www.library.ubc.ca/pubs/HowtoSearchMetaLib.pdf
- The UBC Library (2007). MetaLib search engine. Retrieved November 10, 2007, from http://montcalm.util.itservices.ubc.ca:8331/V
- Xerxes (n.d.). Top ten reasons to use the Metalib X-Server. Retrieved November 10, 2007, from http://xerxes.calstate.edu/articles/1