Interoperability Standards - PowerPoint PPT Presentation

1
Interoperability Standards & Searching Multiple
Repositories
  • Ralph LeVan/OCLC
  • Ray Denenberg/Library of Congress

2
The Problem
  • How do I provide a common interface for my users?
  • How do I combine results from multiple sources?

3
How do I provide a common interface for my users?
  • How do I convert my queries into the Content
    Providers' (CPs') queries?
  • How do I ask for 10 records?
  • How do I ask for more records?
  • How do I interpret their response?

4
How do I convert my queries into the CPs' queries?
  • My user said: author=twain and title="huck finn"
  • Google expects: twain "huck finn"
  • Z39.50: twain/1=1003;4=2 "huck finn"/1=4;4=1 and
  • Lucene: creator:twain and titlePhrase:"huck finn"
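
The translation problem on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the user's query is held as a list of fielded clauses; the `Clause` class, the helper names and the field mapping are hypothetical, and only the keyword (Google) and Lucene targets are covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    """One fielded term in the metasearch client's own query model."""
    field: str  # "author" or "title" in this sketch
    term: str


def quote_if_phrase(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are kept together."""
    return f'"{term}"' if " " in term else term


def to_keyword(clauses: List[Clause]) -> str:
    # A keyword engine has no field names, so only the terms are sent.
    return " ".join(quote_if_phrase(c.term) for c in clauses)


# Hypothetical mapping from our field names to the Lucene index names.
LUCENE_FIELDS = {"author": "creator", "title": "titlePhrase"}


def to_lucene(clauses: List[Clause]) -> str:
    parts = [f"{LUCENE_FIELDS[c.field]}:{quote_if_phrase(c.term)}" for c in clauses]
    return " and ".join(parts)


query = [Clause("author", "twain"), Clause("title", "huck finn")]
print(to_keyword(query))
print(to_lucene(query))
```

Every additional content provider needs another function like these, which is the maintenance cost the rest of the presentation tries to remove.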

5
How do I ask for 10 records?
  • Amazon: won't let you
  • RedLightGreen: MAXRECORDS=n
  • British Library: records=n

6
How do I ask for more records?
  • Amazon: page=n
  • RedLightGreen: STARTINDEX=n
  • British Library: start=n
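
A minimal sketch of what a client ends up writing to page through these three providers. The parameter names come from the slides; the provider keys, the function name, the fixed Amazon page size of 10 and the assumption that start positions are 1-based are illustrative guesses, not documented behavior.

```python
from urllib.parse import urlencode

AMAZON_PAGE_SIZE = 10  # assumption: fixed page size, since a count cannot be requested


def paging_params(provider: str, start: int, count: int) -> str:
    """Return the query-string fragment that asks `provider` for `count`
    records beginning at 1-based record number `start`."""
    if provider == "amazon":
        # Only a page number is accepted; the requested count is ignored.
        page = (start - 1) // AMAZON_PAGE_SIZE + 1
        return urlencode({"page": page})
    if provider == "redlightgreen":
        return urlencode({"MAXRECORDS": count, "STARTINDEX": start})
    if provider == "britishlibrary":
        return urlencode({"records": count, "start": start})
    raise ValueError(f"no paging rules for {provider!r}")


for name in ("amazon", "redlightgreen", "britishlibrary"):
    print(name, paging_params(name, start=11, count=10))
```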

7
How do I interpret their response?
  • How many records did I retrieve?
  • Did something go wrong?
  • How do I convert the CPs' records into something
    my users will recognize?

8
How many records did I retrieve?
  • Amazon
  • <a href="/gp/search/ref=sr_nr_i_0/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks&page=1">Books</a><span class="narrowValue">&nbsp;(334)</span>
  • RedLightGreen
  • <b>Viewing</b> 1-10 of 239 results
  • British Library
  • <opensearch:totalResults>190</opensearch:totalResults>
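
To make the point concrete, here is a rough sketch of the per-provider code needed just to read the hit count. The regular expressions are written against the three fragments quoted on this slide and are assumptions about the surrounding markup; any redesign of a page would break them.

```python
import re
from typing import Optional

# One pattern per provider, matching the fragments quoted on the slide.
HIT_COUNT_PATTERNS = {
    "amazon": re.compile(r'class="narrowValue">&nbsp;\((\d+)\)'),
    "redlightgreen": re.compile(r"<b>Viewing</b> \d+-\d+ of (\d+) results"),
    "britishlibrary": re.compile(
        r"<opensearch:totalResults>(\d+)</opensearch:totalResults>"
    ),
}


def hit_count(provider: str, body: str) -> Optional[int]:
    """Return the total hit count found in `body`, or None if nothing matched
    (for example because the provider changed its markup)."""
    match = HIT_COUNT_PATTERNS[provider].search(body)
    return int(match.group(1)) if match else None


print(hit_count("redlightgreen", "<b>Viewing</b> 1-10 of 239 results"))
```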

9
Did Something Go Wrong?
  • RedLightGreen
  • <span class=smallText>We didn't find any matches for <b>dog and</b>.</span>
  • British Library
  • <item >
  • <title >Nothing found due to an error</title>
  • <description >Too many hits. Refine your request.</description></item>

10
How do I convert the records?
  • Amazon
  • <table class="searchresults" border="0" width="100" cellpadding="0" cellspacing="0">
  • <tr><td width="100" class="searchitem" id="Td0">
  • <table border="0" width="100" cellpadding="0" cellspacing="0"><tr valign="top">
  • <td>
  • <table class="n2" border="0" cellpadding="0" cellspacing="0">
  • <tr>
  • <td class="imageColumn" width="88"><table border="0" cellpadding="0" cellspacing="0">
  • <tr><td align="center" width="80">
  • <a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><img src="http://ec1.images-amazon.com/images/P/0060815221.01._PIsitb-st-arrow,TopLeft,-1,-14_SCTHUMBZZZ_.jpg" width="55" alt="Thud! (Discworld, Book 32)" height="82" border="0" /></a>
  • </td><td width="8"></td></tr></table></td>
  • <td class="dataColumn"><table cellpadding="0" cellspacing="0" border="0"><tr><td>
  • <a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><span class="srTitle">Thud! (Discworld, Book 32)</span></a>
  • by Terry Pratchett (<span class="binding">Hardcover</span>
  • - Sep 13, 2005)</td></tr>
  • <tr><td class="brandLink"><span class="aliasName">Books</span> <a href="/gp/search/ref=sr_nr_seeall_1/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks">See all 334 items</a></td></tr>
  • <tr><td><span class="priceType"><a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8">Buy new</a>
    </span>&nbsp;<span class="listprice">24.95</span>
    <span class="saleprice">15.72</span>

11
Converting Records Cont.
  • RedLightGreen
  • <td class="highlightcell"><span class="titleText"><b><a title="View more information about this title." href="ucw.servlets.UCWController?ACTION=EDITION&amp;WORKID=21537371&amp;LANGUAGE=ENG&amp;MATERIAL=books&amp;FROMRSLT=3&amp;FROMWORK=1&amp;lang=english">Hogfather</a></b>, by Terry Pratchett <br>3 editions published between 1996 and 1998 in English.<br>Primary Subject Discworld Imaginary Place - Fiction<br>
    <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
    <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
    <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
    <img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
    <img src="/ucwprod/web/images/gray.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) &#xA;and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/>
    </span></td></tr></table>
    <table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="0" cellspacing="0" width="100"><tr><td class="recordsepcell" colspan="2"><img src="/ucwprod/web/images/clear.gif" height="1"/></td></tr></table>
    <table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="3" cellspacing="0" width="100"><tr valign="top"><td width="25" align="right" class="highlightcell"><span class="titleText">2.</span></td>

12
Converting Records Cont.
  • British Library
  • <item ><title >Thud! / Terry Pratchett.</title>
  • <link >http://catalogue.bl.uk/F/-?func=direct-doc-set&doc_number=013220851&l_base=BLL01&from=A9OpenSearch</link>
  • <description > Pratchett, Terry. London
    Doubleday, 2005. . ISBN 0385608675 (hbk.)
    17.99 . (Added 20050614 )</description></item>

13
How do I combine results from multiple sources?
  • Things you might want the server to do for you
  • Common Record Format
  • Common Sort Order
  • Common Rank Order
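
When the server does not do this work, the client has to. The following is a small sketch of that client-side step, assuming each provider's records have already been parsed into dictionaries; the `Record` class and the choice of title as the sort key are illustrative, not part of any of the standards discussed here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    """A hypothetical common record format used only in this sketch."""
    title: str
    creator: str
    source: str


def merge_results(result_sets: Dict[str, List[dict]]) -> List[Record]:
    """Convert every provider's records to Record and apply one sort order."""
    merged = [
        Record(raw.get("title", ""), raw.get("creator", ""), source)
        for source, records in result_sets.items()
        for raw in records
    ]
    # Common sort order: by title, ignoring case, then by source name.
    return sorted(merged, key=lambda r: (r.title.lower(), r.source))


results = merge_results({
    "British Library": [{"title": "Thud!", "creator": "Pratchett, Terry"}],
    "RedLightGreen": [{"title": "Hogfather", "creator": "Terry Pratchett"}],
})
for record in results:
    print(record.title, "-", record.source)
```

A common rank order is harder: relevance scores from different providers are not comparable, which is why the slide lists it as something the server might do.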

14
Functional Matrix
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema
15
The Old Solutions
  • Screen Scraping
  • Private APIs
  • Z39.50

16
Screen Scraping
  • A query has to be generated and embedded in a CP
    specific URL
  • Code has to be written to examine the HTML
    returned by a CP
  • Prone to breakage
  • Web sites change formatting frequently
  • Every site is unique
  • Separate code to be maintained for every site

17
Private APIs
  • Often only a slight improvement over screen
    scraping
  • Provides documentation on how to construct the
    URL
  • Might provide documentation on how to construct
    the query
  • Might guarantee a stable response format
  • Still requires unique code for each site

18
Z39.50
  • Guarantees a standard request and response
  • But
  • Not HTTP or HTML
  • Binary encoding over raw TCP/IP
  • Complicated
  • 11 services
  • 7 extended services
  • Easy to be compliant and not interoperable
  • Unfriendly
  • The response to a protocol error was to drop the
    connection

19
Why Use A Standard API?
  • Defined requests and responses
  • Reusable code across sites
  • Open Source code

20
The New Solutions
  • OpenSearch 1.1
  • MXG
  • Levels 0-2
  • SRU

21
OpenSearch 1.1
  • From Wikipedia
  • OpenSearch is a collection of technologies that
    allow publishing of search results in a format
    suitable for syndication. It is a way for search
    engines to publish their search results in a
    standard and accessible format

22
OpenSearch 1.1 (cont.)
  • Defines a Description Record with information
    about the CP
  • ShortName and LongName
  • Description
  • Tags
  • URL template
  • Example
  • http://herbie.bl.uk:9080/opensearch.xml

23
OpenSearch 1.1 (cont.)
  • URL Template
  • Server indicates how to specify OpenSearch
    request parameters
  • Parameters not specified in the template are
    unavailable
  • The only mandatory parameter is searchTerms
  • <Url type="application/rss+xml" template="http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q={searchTerms}&start={startIndex?}&records={count?}&format=rss" />
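
As a sketch of how a client might use such a template: the function below substitutes `{name}` and `{name?}` placeholders, leaving optional ones empty when no value is supplied. It assumes the template on this slide uses the curly-brace syntax of OpenSearch 1.1, with `?` marking optional parameters; `fill_template` is a hypothetical helper, and prefixed parameter names are not handled.

```python
import re
from urllib.parse import quote_plus

# Matches {name} and {name?}; the trailing "?" marks an optional parameter.
TEMPLATE_PARAM = re.compile(r"\{(\w+)(\??)\}")


def fill_template(template: str, **values) -> str:
    """Substitute the placeholders in an OpenSearch URL template."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote_plus(str(values[name]))
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter {name!r}")

    return TEMPLATE_PARAM.sub(substitute, template)


template = (
    "http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/"
    "?q={searchTerms}&start={startIndex?}&records={count?}&format=rss"
)
url = fill_template(template, searchTerms="huck finn", count=10)
print(url)
```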

24
OpenSearch 1.1 (cont.)
  • Request Parameters
  • searchTerms
  • count
  • startIndex
  • startPage
  • language
  • outputEncoding
  • inputEncoding

25
OpenSearch 1.1 (cont.)
  • Uses RSS 2.0 with a few extra elements for the
    response
  • RSS defines title, description and link elements
  • OpenSearch adds the totalResults, startIndex,
    itemsPerPage, link and Query elements
  • http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q=levan&format=rss
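
Reading such a response takes only the standard library. The sketch below assumes the OpenSearch 1.1 namespace `http://a9.com/-/spec/opensearch/1.1/`; the sample document is made up for illustration (its link is a placeholder) and is not output from the British Library server.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/"}

# Hypothetical response, shaped like RSS 2.0 plus the OpenSearch elements.
SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Sample results</title>
    <opensearch:totalResults>190</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item>
      <title>Thud! / Terry Pratchett.</title>
      <link>http://example.org/record/1</link>
    </item>
  </channel>
</rss>"""


def parse_opensearch_rss(xml_text: str) -> dict:
    """Pull the paging information and the items out of an RSS response."""
    channel = ET.fromstring(xml_text).find("channel")

    def number(name: str) -> int:
        return int(channel.findtext(f"opensearch:{name}", namespaces=OPENSEARCH_NS))

    return {
        "total": number("totalResults"),
        "start": number("startIndex"),
        "per_page": number("itemsPerPage"),
        "items": [
            {"title": item.findtext("title"), "link": item.findtext("link")}
            for item in channel.findall("item")
        ],
    }


parsed = parse_opensearch_rss(SAMPLE)
print(parsed["total"], parsed["items"][0]["title"])
```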

26
Functional Matrix
OS 1.1
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
27
Cool Feature
  • The RSS mechanism in OpenSearch provides the
    ability to have persistent and periodic queries!

28
NISO MetaSearch XML Gateway (MXG)
  • MXG has been designed to provide a low
    implementation barrier to content providers that
    want to make their databases available to
    metasearch engines.  Interoperability across
    content providers was explicitly not a goal of
    MXG

29
MXG Levels of Support
  • Level 0: Requests are simple URLs using any
    query grammar and responses are XML records
  • Level 1: Adds a description record for the
    database
  • Level 2: Supports a limited subset of a standard
    query grammar (CQL)

30
MXG Request
  • Version (mandatory)
  • Query (mandatory)
  • StartRecord
  • MaximumRecords
  • http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
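
A request like the one above can be assembled with the standard library, which also takes care of URL-encoding the query. This is a sketch; `mxg_request_url` is a hypothetical helper, and the parameter names are the ones listed on this slide.

```python
from urllib.parse import urlencode


def mxg_request_url(base_url, query, start_record=None,
                    maximum_records=None, version="1.1"):
    """Build an MXG request URL; version and query are mandatory."""
    params = {"version": version, "query": query}
    if start_record is not None:
        params["startRecord"] = start_record
    if maximum_records is not None:
        params["maximumRecords"] = maximum_records
    return f"{base_url}?{urlencode(params)}"


url = mxg_request_url("http://alcme.oclc.org/MXG/search/ORPubs", '"levan"', 1, 10)
print(url)
```

The same function serves every MXG provider; only `base_url` changes, which is the reuse argument made earlier for standard APIs.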

31
MXG Response
<?xml version="1.0" ?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>10</numberOfRecords>
  <records>
  </records>
  <nextRecordPosition>1</nextRecordPosition>
  <echoedSearchRetrieveRequest>
    <version>1.1</version>
    <query>&quot;stuff&quot;</query>
  </echoedSearchRetrieveRequest>
</searchRetrieveResponse>

32
MXG Response Records
<record>
  <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
  <recordPacking>xml</recordPacking>
  <recordData>
  </recordData>
  <recordPosition>1</recordPosition>
</record>

33
MXG Response recordData
  • <srw_dc:dc xmlns="http://www.w3.org/TR/xhtml1/strict"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
    <dc:identifier>rrl1234</dc:identifier>
    <dc:title>Dog and Cat</dc:title>
    </srw_dc:dc>
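
Because the response, the record wrapper and the record data are all namespaced XML, one parser works for every MXG provider. The sketch below uses a sample document assembled from the fragments on the last three slides (with the record count set to 1 so it is self-consistent); it is not a captured server response.

```python
import xml.etree.ElementTree as ET

NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative response assembled from the slide fragments.
SAMPLE = """<?xml version="1.0" ?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>1</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <srw_dc:dc xmlns:srw_dc="info:srw/schema/1/dc-v1.1"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>rrl1234</dc:identifier>
          <dc:title>Dog and Cat</dc:title>
        </srw_dc:dc>
      </recordData>
      <recordPosition>1</recordPosition>
    </record>
  </records>
</searchRetrieveResponse>"""


def parse_response(xml_text):
    """Return (numberOfRecords, list of simplified Dublin Core records)."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("srw:records/srw:record", NS):
        data = record.find("srw:recordData", NS)
        records.append({
            "position": int(record.findtext("srw:recordPosition", namespaces=NS)),
            "schema": record.findtext("srw:recordSchema", namespaces=NS),
            "identifier": data.findtext(".//dc:identifier", namespaces=NS),
            "title": data.findtext(".//dc:title", namespaces=NS),
        })
    return int(root.findtext("srw:numberOfRecords", namespaces=NS)), records


count, records = parse_response(SAMPLE)
print(count, records[0]["title"])
```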

34
MXG Error Messages
  • <diagnostics>
  • <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
  • <uri>info:srw/diagnostic/1/51</uri>
  • <details>66ntqk</details>
  • </diagnostic>
  • </diagnostics>
  • http://www.loc.gov/z3950/agency/zing/srw/diagnostics-list.html
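
Unlike the scraped error pages shown earlier, a structured diagnostic can be detected without guessing at wording. A minimal sketch, assuming the diagnostics element sits directly under the response element; the sample wraps the fragment from this slide in an otherwise empty response.

```python
import xml.etree.ElementTree as ET

NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "diag": "http://www.loc.gov/zing/srw/diagnostic/",
}

# Illustrative response wrapping the diagnostic fragment from the slide.
SAMPLE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <diagnostics>
    <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
      <uri>info:srw/diagnostic/1/51</uri>
      <details>66ntqk</details>
    </diagnostic>
  </diagnostics>
</searchRetrieveResponse>"""


def diagnostics(xml_text):
    """Return a list of (uri, details) pairs, empty when nothing went wrong."""
    root = ET.fromstring(xml_text)
    return [
        (d.findtext("diag:uri", namespaces=NS),
         d.findtext("diag:details", namespaces=NS))
        for d in root.findall("srw:diagnostics/diag:diagnostic", NS)
    ]


found = diagnostics(SAMPLE)
print(found)
```

The URI can then be looked up in the diagnostics list linked above to obtain a message for the user.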

35
Functional Matrix
MXG Level 0
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
36
MXG Level 1
  • Add a description record for the database
  • http://www.loc.gov/z3950/agency/zing/srw/explain.html
  • http://alcme.oclc.org/MXG/search/ORPubs

37
Functional Matrix
MXG Level 1
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
38
MXG Level 2
  • Support a limited subset of a standard query
    grammar (CQL)
  • Supports indexes and Booleans
  • http://www.loc.gov/z3950/agency/zing/cql/
  • http://alcme.oclc.org/srw/search/ORPublications?version=1.1&query=dc.author=levan&maximumRecords=1

39
Functional Matrix
MXG Level 2
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar ?
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
40
SRU
  • MXG Level 2 Plus
  • Full Query Grammar (CQL)
  • Full Sort Specification

41
CQL Common Query Language
  • Loosely based on CCL Search
  • Boolean and Proximity Operators
  • Index Sets and Indexes
  • String Indexes vs. Keyword Indexes
  • Truncation Characters: *, ?
  • Relations: =, all, any, exact, within
  • Example
  • dc.title="harry potter" or bib1.isbn=123-456-78x
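
A small sketch of how a client might build queries of this shape. It covers only single clauses joined by `or`, and it assumes that terms containing spaces or reserved characters must be double-quoted; both helper functions are hypothetical and do not implement the full CQL grammar.

```python
def cql_clause(index: str, term: str, relation: str = "=") -> str:
    """Build one CQL search clause, quoting the term when it needs it."""
    if any(ch in term for ch in ' "()=<>/'):
        term = '"' + term.replace('"', '\\"') + '"'
    # Symbolic relations are written without spaces, word relations with them.
    symbolic = relation in {"=", "<", ">", "<=", ">=", "<>"}
    separator = "" if symbolic else " "
    return f"{index}{separator}{relation}{separator}{term}"


def cql_or(*clauses: str) -> str:
    """Join clauses with the Boolean operator or."""
    return " or ".join(clauses)


example = cql_or(
    cql_clause("dc.title", "harry potter"),
    cql_clause("bib1.isbn", "123-456-78x"),
)
print(example)
```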

42
Sort
  • sortKeys parameter with the following
    comma-separated values specified
  • Xpath (path to the element to be sorted on)
  • Schema (that the xpath comes from)
  • Ascending (value is 1=true or 0=false,
    default=true)
  • CaseSensitive (value is 1=true or 0=false,
    default=false)
  • missingValue (values are omit, abort, highValue
    or lowValue, default=highValue)
  • e.g. sortKeys=title,onix,0
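
The value of a single sort key can be assembled from the five fields above. The sketch below assumes that trailing fields may be omitted when they hold their default value, as the slide's own example does; `sort_key` is a hypothetical helper and handles one key only.

```python
def sort_key(path, schema="", ascending=True, case_sensitive=False,
             missing_value="highValue"):
    """Build one sortKeys value: path,schema,ascending,caseSensitive,missingValue."""
    fields = [
        path,
        schema,
        "1" if ascending else "0",
        "1" if case_sensitive else "0",
        missing_value,
    ]
    defaults = [None, "", "1", "0", "highValue"]
    # Drop trailing fields that still hold their default value.
    while len(fields) > 1 and fields[-1] == defaults[len(fields) - 1]:
        fields.pop()
    return ",".join(fields)


print("sortKeys=" + sort_key("title", "onix", ascending=False))
```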

43
Functional Matrix
SRU
Request Record Starting Point ?
Request Number of Records ?
Request Record Schema ?
Defined Query Grammar ?
Specify Sort Order ?
Specify Ranking Order ?
Diagnostic Messages ?
XML Response ?
Record Count In Response ?
Records In Known Schema ?
Key ?Full Support ?Limited Support
44
Cool Feature
  • Combining SRU response data and echoed data with
    javascript and stylesheets allows for thin,
    browser based, clients
  • http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10

45
Functional Matrix
                                OS 1.1  MXG L0  MXG L1  MXG L2  SRU
Request Record Starting Point   ?       ?       ?       ?       ?
Request Number of Records       ?       ?       ?       ?       ?
Request Record Schema                   ?       ?       ?       ?
Defined Query Grammar                                   ?       ?
Specify Sort Order                                              ?
Specify Ranking Order                                           ?
Diagnostic Messages                     ?       ?       ?       ?
XML Response                    ?       ?       ?       ?       ?
Record Count In Response        ?       ?       ?       ?       ?
Records In Known Schema         ?       ?       ?       ?       ?
Key ?Full Support ?Limited Support