Title: Interoperability Standards
1. Interoperability Standards: Searching Multiple Repositories
- Ralph LeVan/OCLC
- Ray Denenberg/Library of Congress
2. The Problem
- How do I provide a common interface for my users?
- How do I combine results from multiple sources?
3. How do I provide a common interface for my users?
- How do I convert my queries into the Content Providers' (CPs') queries?
- How do I ask for 10 records?
- How do I ask for more records?
- How do I interpret their response?
4. How do I convert my queries into the CPs' queries?
- My user said: author=twain and title=huck finn
- Google expects: twain huck finn
- Z39.50: twain/1=1003;4=2 huck finn/1=4;4=1 and
- Lucene: creator:twain and titlePhrase:huck finn
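The translation problem on this slide can be sketched in a few lines. This is only an illustration: the function names, the index-to-field mapping, and the rule of quoting multi-word terms are assumptions made here, not any provider's documented behaviour.

```python
# Hypothetical helpers: the names, the field mapping and the quoting rule are
# assumptions made for this sketch, not part of any provider's API.

def quote_if_phrase(term):
    """Wrap multi-word terms in double quotes so they stay together."""
    return f'"{term}"' if " " in term else term


def to_keyword_query(clauses):
    """Keyword engines take bare terms, so the index names are dropped."""
    return " ".join(quote_if_phrase(term) for _index, term in clauses)


# Maps the user's index names onto the Lucene field names shown on the slide.
LUCENE_FIELDS = {"author": "creator", "title": "titlePhrase"}


def to_lucene_query(clauses):
    """Fielded syntax, field:term, joined with the slide's 'and'."""
    return " and ".join(
        f"{LUCENE_FIELDS[index]}:{quote_if_phrase(term)}"
        for index, term in clauses
    )


user_query = [("author", "twain"), ("title", "huck finn")]
print(to_keyword_query(user_query))
print(to_lucene_query(user_query))
```

Every additional content provider needs another function of this kind, which is the maintenance burden the slide is pointing at.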
5. How do I ask for 10 records?
- Amazon: won't let you
- RedLightGreen: MAXRECORDS=n
- British Library: records=n
6. How do I ask for more records?
- Amazon: page=n
- RedLightGreen: STARTINDEX=n
- British Library: start=n
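A minimal sketch of what a client has to carry around to page through these three sites. The parameter names are the ones on the slides; the table layout, the function name, and the Amazon page size of 10 are assumptions for illustration.

```python
from urllib.parse import urlencode

# Parameter names are taken from the slides. Amazon only accepts a page number,
# so a start position has to be converted using an assumed page size.
PAGING = {
    "RedLightGreen": {"start": "STARTINDEX", "count": "MAXRECORDS"},
    "British Library": {"start": "start", "count": "records"},
    "Amazon": {"page": "page"},
}
AMAZON_PAGE_SIZE = 10  # assumption: not stated on the slides


def paging_params(provider, start, count):
    """Return the query-string fragment asking for `count` records from `start`."""
    names = PAGING[provider]
    if "page" in names:
        # No way to ask for a record count; request the page containing `start`.
        page = (start - 1) // AMAZON_PAGE_SIZE + 1
        return urlencode({names["page"]: page})
    return urlencode({names["start"]: start, names["count"]: count})


for provider in PAGING:
    print(provider, paging_params(provider, start=11, count=10))
```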
7. How do I interpret their response?
- How many records did I retrieve?
- Did something go wrong?
- How do I convert the CPs' records into something my users will recognize?
8. How many records did I retrieve?
- Amazon
- <a href="/gp/search/ref=sr_nr_i_0/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks&page=1">Books</a><span class="narrowValue">&nbsp;(334)</span>
- RedLightGreen
- <b>Viewing</b> 1-10 of 239 results
- British Library
- <opensearch:totalResults>190</opensearch:totalResults>
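To make the cost concrete, here is a rough sketch of the three separate extractors a client would need for the three responses above. The function names and the sample namespace URI are assumptions; the patterns are tied to the exact markup shown and would break as soon as a site changed its page.

```python
import re
import xml.etree.ElementTree as ET


def amazon_count(html):
    """Scrape the count from the 'narrowValue' span, e.g. '&nbsp;(334)'."""
    match = re.search(r'class="narrowValue">&nbsp;\((\d+)\)', html)
    return int(match.group(1)) if match else None


def redlightgreen_count(html):
    """Scrape the count from the 'Viewing 1-10 of 239 results' banner."""
    match = re.search(r"of (\d+) results", html)
    return int(match.group(1)) if match else None


def british_library_count(xml_text):
    """Read totalResults from the feed, matching on the local element name."""
    for element in ET.fromstring(xml_text).iter():
        if element.tag.endswith("}totalResults"):
            return int(element.text)
    return None


amazon_html = '<span class="narrowValue">&nbsp;(334)</span>'
rlg_html = "<b>Viewing</b> 1-10 of 239 results"
# The namespace URI below is an assumption; the slide shows only the prefix.
bl_xml = (
    '<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">'
    "<channel><opensearch:totalResults>190</opensearch:totalResults></channel></rss>"
)

print(amazon_count(amazon_html), redlightgreen_count(rlg_html), british_library_count(bl_xml))
```

Only the third extractor reads a value from a named element rather than from page layout.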
9. Did Something Go Wrong?
- RedLightGreen
- <span class=smallText>We didn't find any matches for <b>dog and</b>.</span>
- British Library
- <item>
- <title>Nothing found due to an error</title>
- <description>Too many hits. Refine your request.</description></item>
10. How do I convert the records?
- Amazon
- <table class="searchresults" border="0" width="100%" cellpadding="0" cellspacing="0">
- <tr><td width="100%" class="searchitem" id="Td0">
- <table border="0" width="100%" cellpadding="0" cellspacing="0"><tr valign="top">
- <td>
- <table class="n2" border="0" cellpadding="0" cellspacing="0">
- <tr>
- <td class="imageColumn" width="88"><table border="0" cellpadding="0" cellspacing="0">
- <tr><td align="center" width="80">
- <a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><img src="http://ec1.images-amazon.com/images/P/0060815221.01._PIsitb-st-arrow,TopLeft,-1,-14_SCTHUMBZZZ_.jpg" width="55" alt="Thud! (Discworld, Book 32)" height="82" border="0" /></a>
- </td><td width="8"></td></tr></table></td>
- <td class="dataColumn"><table cellpadding="0" cellspacing="0" border="0"><tr><td>
- <a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><span class="srTitle">Thud! (Discworld, Book 32)</span></a>
- by Terry Pratchett (<span class="binding">Hardcover</span>
- - Sep 13, 2005)</td></tr>
- <tr><td class="brandLink"><span class="aliasName">Books</span> <a href="/gp/search/ref=sr_nr_seeall_1/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks">See all 334 items</a></td></tr>
- <tr><td><span class="priceType"><a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8">Buy new</a> </span>&nbsp;<span class="listprice">24.95</span> <span class="saleprice">15.72</span>
11. Converting Records Cont.
- RedLightGreen
- <td class="highlightcell"><span class="titleText"><b><a title="View more information about this title." href="ucw.servlets.UCWController?ACTION=EDITION&amp;WORKID=21537371&amp;LANGUAGE=ENG&amp;MATERIAL=books&amp;FROM=RSLT3&amp;FROMWORK=1&amp;lang=english">Hogfather</a></b>, by Terry Pratchett <br>3 editions published between 1996 and 1998 in English.<br>Primary Subject: Discworld Imaginary Place - Fiction<br>
  <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/>
  <img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
  <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/>
  <img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
  <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/>
  <img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
  <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/>
  <img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
  <img src="/ucwprod/web/images/gray.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/>
  <img src="/ucwprod/web/images/white.gif" height="3" width="1"/></span></td></tr></table>
  <table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="recordsepcell" colspan="2"><img src="/ucwprod/web/images/clear.gif" height="1"/></td></tr></table>
  <table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="3" cellspacing="0" width="100%"><tr valign="top"><td width="25" align="right" class="highlightcell"><span class="titleText">2.</span></td>
12. Converting Records Cont.
- British Library
- <item><title>Thud! / Terry Pratchett.</title>
- <link>http://catalogue.bl.uk/F/-?func=direct-doc-set&doc_number=013220851&l_base=BLL01&from=A9OpenSearch</link>
- <description> Pratchett, Terry. London Doubleday, 2005. . ISBN 0385608675 (hbk.) 17.99 . (Added 20050614 )</description></item>
13. How do I combine results from multiple sources?
- Things you might want the server to do for you
- Common Record Format
- Common Sort Order
- Common Rank Order
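If the servers do not provide these, the client has to. A minimal sketch, assuming every source's records have already been converted into one hypothetical common shape (the `Record` fields and the choice of title as the sort key are illustrative only):

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical common record format shared by all sources."""
    title: str
    creator: str
    source: str


def merge(*result_lists):
    """Combine per-source result lists and impose one common sort order."""
    combined = [record for results in result_lists for record in results]
    # Sort by title, ignoring case; ties are broken by source name.
    return sorted(combined, key=lambda r: (r.title.lower(), r.source))


amazon = [Record("Thud! (Discworld, Book 32)", "Terry Pratchett", "Amazon")]
rlg = [Record("Hogfather", "Terry Pratchett", "RedLightGreen")]
bl = [Record("Thud! / Terry Pratchett.", "Pratchett, Terry", "British Library")]

for record in merge(amazon, rlg, bl):
    print(record.source, "|", record.title)
```

A common rank order is harder: relevance scores from different providers are not comparable, so this sketch does not attempt it.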
14. Functional Matrix
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema
15. The Old Solutions
- Screen Scraping
- Private APIs
- Z39.50
16. Screen Scraping
- A query has to be generated and embedded in a CP-specific URL
- Code has to be written to examine the HTML returned by a CP
- Prone to breakage
- Web sites change formatting frequently
- Every site is unique
- Separate code to be maintained for every site
17. Private APIs
- Often only a slight improvement over screen scraping
- Provides documentation on how to construct the URL
- Might provide documentation on how to construct the query
- Might guarantee a stable response format
- Still requires unique code for each site
18. Z39.50
- Guarantees a standard request and response
- But
- Not HTTP or HTML
- Binary encoding over raw TCP/IP
- Complicated
- 11 services
- 7 extended services
- Easy to be compliant and not interoperable
- Unfriendly
- The response to a protocol error was to drop the
connection
19. Why Use A Standard API?
- Defined requests and responses
- Reusable code across sites
- Open Source code
20. The New Solutions
- OpenSearch 1.1
- MXG
- Levels 0-2
- SRU
21. OpenSearch 1.1
- From Wikipedia:
- "OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication. It is a way for search engines to publish their search results in a standard and accessible format."
22. OpenSearch 1.1 (cont.)
- Defines a Description Record with information about the CP
- ShortName and LongName
- Description
- Tags
- URL template
- Example
- http://herbie.bl.uk:9080/opensearch.xml
23. OpenSearch 1.1 (cont.)
- URL Template
- Server indicates how to specify OpenSearch request parameters
- Parameters not specified in the template are unavailable
- The only mandatory parameter is searchTerms
- <Url type="application/rss+xml" template="http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q={searchTerms}&start={startIndex?}&records={count?}&format=rss" />
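Filling in a URL template is a small substitution step. The sketch below assumes the template uses the OpenSearch 1.1 `{name}` and `{name?}` placeholder syntax and that the British Library template reads as reconstructed here; the function name is hypothetical. Optional parameters with no value are replaced by an empty string.

```python
import re
from urllib.parse import quote


def fill_template(template, values):
    """Substitute {name} and optional {name?} placeholders in a URL template."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameter left empty
        raise KeyError(f"mandatory parameter missing: {name}")

    return re.sub(r"\{(\w+)(\?)?\}", substitute, template)


# Assumed reading of the template on the slide (punctuation was lost in extraction).
TEMPLATE = (
    "http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/"
    "?q={searchTerms}&start={startIndex?}&records={count?}&format=rss"
)

print(fill_template(TEMPLATE, {"searchTerms": "levan", "count": 10}))
```

The same function works for any provider that publishes a template, which is the point of the Description Record.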
24. OpenSearch 1.1 (cont.)
- Request Parameters
- searchTerms
- count
- startIndex
- startPage
- language
- outputEncoding
- inputEncoding
25. OpenSearch 1.1 (cont.)
- Uses RSS 2.0 with a few extra elements for the response
- RSS defines the title, description and link elements
- OpenSearch adds the totalResults, startIndex, itemsPerPage, link and Query elements
- http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q=levan&format=rss
26. Functional Matrix
OS 1.1
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
27. Cool Feature
- The RSS mechanism in OpenSearch provides the
ability to have persistent and periodic queries!
28. NISO MetaSearch XML Gateway (MXG)
- MXG has been designed to provide a low implementation barrier to content providers that want to make their databases available to metasearch engines. Interoperability across content providers was explicitly not a goal of MXG.
29. MXG Levels of Support
- Level 0: Requests are simple URLs using any query grammar and responses are XML records
- Level 1: Adds a description record for the database
- Level 2: Supports a limited subset of a standard query grammar (CQL)
30. MXG Request
- Version (mandatory)
- Query (mandatory)
- StartRecord
- MaximumRecords
- http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
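Because the request parameters are fixed, one small function can build the URL for any MXG server. A sketch, with a hypothetical function name; the parameter names are the ones listed above, and the query is percent-encoded rather than sent with raw quotes.

```python
from urllib.parse import urlencode


def mxg_request_url(base_url, query, start_record=None, maximum_records=None,
                    version="1.1"):
    """Build an MXG request URL; version and query are the mandatory parameters."""
    params = {"version": version, "query": query}
    if start_record is not None:
        params["startRecord"] = start_record
    if maximum_records is not None:
        params["maximumRecords"] = maximum_records
    return base_url + "?" + urlencode(params)


url = mxg_request_url(
    "http://alcme.oclc.org/MXG/search/ORPubs",
    '"levan"',
    start_record=1,
    maximum_records=10,
)
print(url)
```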
31. MXG Response
<?xml version="1.0" ?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>10</numberOfRecords>
  <records>
  </records>
  <nextRecordPosition>1</nextRecordPosition>
  <echoedSearchRetrieveRequest>
    <version>1.1</version>
    <query>&quot;stuff&quot;</query>
  </echoedSearchRetrieveRequest>
</searchRetrieveResponse>
32. MXG Response Records
<record>
  <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
  <recordPacking>xml</recordPacking>
  <recordData>
  </recordData>
  <recordPosition>1</recordPosition>
</record>
33. MXG Response recordData
<srw_dc:dc xmlns="http://www.w3.org/TR/xhtml1/strict"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
  <dc:identifier>rrl1234</dc:identifier>
  <dc:title>Dog and Cat</dc:title>
</srw_dc:dc>
34. MXG Error Messages
- <diagnostics>
- <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
- <uri>info:srw/diagnostic/1/51</uri>
- <details>66ntqk</details>
- </diagnostic>
- </diagnostics>
- http://www.loc.gov/z3950/agency/zing/srw/diagnostics-list.html
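With a defined response format, the record count and any diagnostics can be read by element name instead of by scraping. A sketch using the two namespaces shown on the slides; the function name and the sample document (including its record count of 0) are made up for illustration.

```python
import xml.etree.ElementTree as ET

SRW = "{http://www.loc.gov/zing/srw/}"
DIAG = "{http://www.loc.gov/zing/srw/diagnostic/}"


def summarize_response(xml_text):
    """Return (record count, list of (uri, details) diagnostics)."""
    root = ET.fromstring(xml_text)
    count = root.findtext(f"{SRW}numberOfRecords")
    diagnostics = [
        (d.findtext(f"{DIAG}uri"), d.findtext(f"{DIAG}details"))
        for d in root.iter(f"{DIAG}diagnostic")
    ]
    return (int(count) if count is not None else None), diagnostics


# Illustrative response assembled from the fragments on the slides.
SAMPLE = """
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>0</numberOfRecords>
  <diagnostics>
    <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
      <uri>info:srw/diagnostic/1/51</uri>
      <details>66ntqk</details>
    </diagnostic>
  </diagnostics>
</searchRetrieveResponse>
"""

print(summarize_response(SAMPLE))
```

The same code answers both "How many records did I retrieve?" and "Did something go wrong?" for every server that returns this format.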
35. Functional Matrix
MXG Level 0
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
36. MXG Level 1
- Adds a description record for the database
- http://www.loc.gov/z3950/agency/zing/srw/explain.html
- http://alcme.oclc.org/MXG/search/ORPubs
37. Functional Matrix
MXG Level 1
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
38. MXG Level 2
- Supports a limited subset of a standard query grammar: CQL
- Supports indexes and Booleans
- http://www.loc.gov/z3950/agency/zing/cql/
- http://alcme.oclc.org/srw/search/ORPublications?version=1.1&query=dc.author=levan&maximumRecords=1
39. Functional Matrix
MXG Level 2
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar ?
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
40. SRU
- MXG Level 2 Plus
- Full Query Grammar (CQL)
- Full Sort Specification
41. CQL: Common Query Language
- Loosely based on CCL Search
- Boolean and Proximity Operators
- Index Sets and Indexes
- String Indexes vs. Keyword Indexes
- Truncation Characters: *, ?
- Relations: =, all, any, exact, within
- Example
- dc.title=harry potter or bib1.isbn=123-456-78x
42. Sort
- sortKeys parameter with the following comma-separated values specified:
- Xpath (path to the element to be sorted on)
- Schema (that the xpath comes from)
- Ascending (value is 1=true or 0=false, default=true)
- CaseSensitive (value is 1=true or 0=false, default=false)
- missingValue (values are omit, abort, highValue or lowValue, default=highValue)
- e.g. sortKeys=title,onix,0
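A sort key is just the five values above joined with commas, in that order. A minimal sketch with a hypothetical helper name; trailing values that are left at their defaults are simply omitted.

```python
def sort_key(xpath, schema="", ascending=None, case_sensitive=None,
             missing_value=""):
    """Join the five sortKeys fields in order, dropping trailing defaults."""
    def flag(value):
        # Booleans are written as 1 or 0; None means "use the server default".
        return "" if value is None else str(int(value))

    fields = [xpath, schema, flag(ascending), flag(case_sensitive), missing_value]
    while fields and fields[-1] == "":
        fields.pop()
    return ",".join(fields)


# Reproduces the example on the slide: sort on title from the onix schema, descending.
print("sortKeys=" + sort_key("title", "onix", ascending=False))
```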
43. Functional Matrix
SRU
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar ?
Specify Sort Order ?
Specify Ranking Order ?
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
44. Cool Feature
- Combining SRU response data and echoed data with JavaScript and stylesheets allows for thin, browser-based clients
- http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
45. Functional Matrix
                               OS 1.1  MXG L0  MXG L1  MXG L2  SRU
Request Record Starting Point    ?       ?       ?       ?       ?
Request Number of Records        ?       ?       ?       ?       ?
Request Record Schema                    ?       ?       ?       ?
Defined Query Grammar                                    ?       ?
Specify Sort Order                                               ?
Specify Ranking Order                                            ?
Diagnostic Messages                      ?       ?       ?       ?
XML Response                     ?       ?       ?       ?       ?
Record Count In Response         ?       ?       ?       ?       ?
Records In Known Schema          ?       ?       ?       ?       ?
Key ?Full Support ?Limited Support