Lorcan Dempsey (with contributions from colleagues)


1
OCLC: some development and research directions in
the areas of metadata management and knowledge
organization. Presented to the Library of Congress
cataloging managers' retreat.
  • Lorcan Dempsey (with contributions from
    colleagues)
  • VP Research and Chief Strategist
  • Library of Congress, 15 June 2004

2
Topics
Metadata management and knowledge organization
Framework for WorldCat directions
Open WorldCat
Working with web services
Some research, some production
Making data work harder
3
Framework for WorldCat directions
4
Collections grid
(Figure: grid with two axes, stewardship and uniqueness, each running from low to high)
5
WorldCat: the what?
  • WorldCat - Grow - Version - Improve
  • Easier to use (FRBR)
  • Microcontent
  • Evaluative content

The Open Web: both surface and acquire WorldCat
content
Add special collections and institutional content
to WorldCat: dissertations, cultural heritage
collections, eprints, learning objects
6
WorldCat: the how?
7
Some issues
  • Metadata variety
  • Encoding, element sets, values/content
  • Provenance
  • Metadata manipulation
  • Validation, identification
  • Enhancement, augmentation
  • Relation, FRBR, deduplication
  • Transformation
  • Schematization and web services
  • Make data available in forms that allow machine
    services to be flexibly built on top of them
  • Everything is a service

8
Open WorldCat
9
Open WorldCat
  • Facilitate the rendezvous of users and library
    services on the web
  • Surface the library where the users are
  • Help release the value of library services in the
    working and learning lives of their users.

10
Open WorldCat Architecture
WorldCat; additional collections can be added to the
worldcatlibraries domain
Metadata
Schemas and Vocabularies
OCLC-developed geo-locator services match
users to extensive FirstSearch WorldCat
institution and user profiles
Profiles and Relationships
OCLC uses a host of authentication and
authorization tools to progressively match
content to rights
Content Owner
Access
OCLC organizes WorldCat content in a model
suitable for harvesting, anticipating unique
aspects of various portals
Aggregators
Distribution, Search, Display
Google, Yahoo and Book Vendors
Portals
Organization and Presentation
11
Current partners
Click in presentation mode to go through
to examples
  • Book vendors and bibliographies
  • ABE Books
  • ABAA
  • Alibris
  • HCBIB
  • BookPage
  • Search engines (pilot with 2M records exposed as
    web pages for harvesting)
  • Google
  • Yahoo!

Try a search for "A history of caricature and
grotesque in literature and art"
12
Google and Yahoo! timeline
13
Traffic
Full record displays. Projected for June.
14
Metadata management and knowledge organization
15
Research activities
  • Structures
  • FRBR
  • VIAF
  • BT
  • FAST
  • Vocabulary encoding and mappings
  • Services
  • xISBN
  • Metadata transformation services
  • Terminology services
  • Authority services
  • Automatic classification and cataloging
  • ePrints UK
  • Web harvesting

16
FRBR
Click in presentation mode to go through
to FictionFinder
  • OR Work-set algorithm
  • Work-based view incorporated into WorldCat in
    FirstSearch in late 2004
  • FictionFinder
  • 2.6 million fiction records from WorldCat,
    clustered by OCLC's FRBR algorithm
  • Make greater use of data (genres, settings,
    imaginary characters, etc)
  • Participate in ongoing FRBR refinement
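
The clustering idea behind a work-based view can be sketched roughly as follows. This is a toy illustration only, not OCLC's work-set algorithm: it assumes that a work key can be formed from a normalized author and title, and the names `normalize`, `work_key` and `cluster_into_work_sets` are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record: dict) -> str:
    """Assumed work key: normalized author plus normalized title."""
    return normalize(record.get("author", "")) + "/" + normalize(record["title"])


def cluster_into_work_sets(records: list[dict]) -> dict[str, list[dict]]:
    """Group records that share a work key into one work set."""
    work_sets = defaultdict(list)
    for record in records:
        work_sets[work_key(record)].append(record)
    return dict(work_sets)


# Small sample records, for illustration only.
records = [
    {"id": "r1", "author": "Austen, Jane", "title": "Emma"},
    {"id": "r2", "author": "AUSTEN, JANE.", "title": "Emma."},
    {"id": "r3", "author": "Austen, Jane", "title": "Persuasion"},
]
work_sets = cluster_into_work_sets(records)
for key, members in work_sets.items():
    print(key, [r["id"] for r in members])
```

Records r1 and r2 differ only in case and punctuation, so they fall into the same work set. A production algorithm would need to handle much more (uniform titles, translations, authority-controlled names).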

17
FAST
18
Vocabulary mappings
19
Services
  • Web services
  • Computer to computer applications over the web
  • Unplug and play
  • Unbundling monolithic applications and making
    functionality available in more modular ways
  • Reuse and sharing
  • Of services!
  • Release the value in a web environment of the
    historical library investment in vocabularies and
    structures

20
xISBN
  • An experimental web service
  • Leverages FRBRization work
  • Give it an ISBN, it returns all related ISBNs
  • Based on WorldCat
  • Designed for machine-to-machine data exchange
  • Examples
  • Check user ILL requests against all
    editions/versions in OPAC
  • Find library's editions when user finds any
    edition/version of item on Amazon
  • Check OPAC for all editions during
    selection/acquisitions/gift book processing
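
The "give it an ISBN, it returns all related ISBNs" behaviour, and the ILL check built on it, might look like the sketch below. It assumes the work sets are already available as an in-memory list; it does not call or reproduce the actual xISBN service, and all function names and identifiers are hypothetical placeholders.

```python
def build_isbn_index(work_sets: list[set[str]]) -> dict[str, set[str]]:
    """Map every ISBN to the full set of ISBNs in its work set."""
    index: dict[str, set[str]] = {}
    for isbns in work_sets:
        for isbn in isbns:
            index[isbn] = isbns
    return index


def related_isbns(isbn: str, index: dict[str, set[str]]) -> set[str]:
    """Return all ISBNs in the same work set; an unknown ISBN maps to itself."""
    return set(index.get(isbn, {isbn}))


def held_editions(isbn: str, index: dict[str, set[str]], opac_isbns: set[str]) -> set[str]:
    """Editions of the requested work that the library already holds."""
    return related_isbns(isbn, index) & opac_isbns


# Placeholder identifiers, not real ISBNs.
index = build_isbn_index([{"ISBN-A1", "ISBN-A2", "ISBN-A3"}, {"ISBN-B1"}])
opac = {"ISBN-A3", "ISBN-B1"}

# A user requests ISBN-A1 through ILL; the library holds another edition.
print(held_editions("ISBN-A1", index, opac))
```

If the result is non-empty, the request could be met from local holdings instead of going out through ILL.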

21
xISBN
Install FRBR Bookmarklets in your browser to see
xISBN working. See the Bookmarklets page at
www.oclc.org/research/researchworks/
Click cover to search Seattle Public Library
Click cover to search amazon.co.uk
22
Metadata schema transformations
  • Metadata Schema Transformation Services
  • Evaluate approaches to crosswalking metadata
  • Prototype transformation environments
  • The XSLT short path
  • Supports lightweight XML processing
  • Designed for public access
  • Deliverables
  • OAI repository of METS-captured crosswalks (new)
  • The long path option
  • Designed for high-fidelity translations
  • May be public or proprietary
  • Deliverables: toolkit, expertise in non-MARC
    formats
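
At its simplest, a crosswalk is a mapping from source elements to target elements. The sketch below swaps the XSLT approach named on the slide for a plain Python dictionary, purely for illustration. The three MARC-to-Dublin-Core pairs are a partial, simplified assumption, and `apply_crosswalk` is a hypothetical helper.

```python
# Assumed, partial crosswalk from MARC tags to Dublin Core elements.
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "650": "subject",
}


def apply_crosswalk(
    source: dict[str, list[str]], crosswalk: dict[str, str]
) -> tuple[dict[str, list[str]], list[str]]:
    """Translate a source record; return (target record, unmapped source fields)."""
    target: dict[str, list[str]] = {}
    unmapped: list[str] = []
    for field, values in source.items():
        element = crosswalk.get(field)
        if element is None:
            unmapped.append(field)  # report what the crosswalk cannot carry over
            continue
        target.setdefault(element, []).extend(values)
    return target, unmapped


# Illustrative record; the subject and local field values are invented.
record = {
    "245": ["A history of caricature and grotesque in literature and art"],
    "650": ["Caricature"],
    "999": ["local note"],
}
dc_record, unmapped = apply_crosswalk(record, MARC_TO_DC)
print(dc_record)
print("unmapped:", unmapped)
```

Reporting unmapped fields makes the loss of information visible, which is one of the things that separates a lightweight translation from a high-fidelity one.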

24
A crosswalk as a METS record
  • Describe the crosswalk object in the METS header.
  • Assemble and identify six objects in the METS
    structural map
  • The source metadata schema
  • The target metadata schema
  • The crosswalk
  • Human-readable and executable versions of each
  • Associate metadata for each file in the METS
    Descriptive Metadata Section.
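
A skeleton of such a record can be assembled with Python's standard XML library. This is a minimal sketch under stated assumptions: the element names follow the METS schema, but the file IDs and file names are hypothetical, the header and descriptive metadata sections are left as empty placeholders, and the output is not schema-validated.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_crosswalk_mets(label: str, files: dict[str, str]) -> ET.Element:
    """Build a METS skeleton: header, one dmdSec per file, fileSec, structMap."""
    root = ET.Element(q("mets"), {"LABEL": label})
    ET.SubElement(root, q("metsHdr"))  # would describe the crosswalk object
    for file_id in files:
        # Placeholder descriptive metadata section for each file.
        ET.SubElement(root, q("dmdSec"), {"ID": f"dmd-{file_id}"})
    file_grp = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    top_div = ET.SubElement(
        ET.SubElement(root, q("structMap")), q("div"), {"LABEL": label}
    )
    for file_id, href in files.items():
        file_el = ET.SubElement(file_grp, q("file"), {"ID": file_id})
        ET.SubElement(
            file_el, q("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href}
        )
        part = ET.SubElement(
            top_div, q("div"), {"LABEL": file_id, "DMDID": f"dmd-{file_id}"}
        )
        ET.SubElement(part, q("fptr"), {"FILEID": file_id})
    return root


# Six objects: human-readable and executable versions of the source schema,
# the target schema and the crosswalk. File names are hypothetical.
files = {
    "source-schema-human": "source-schema.html",
    "source-schema-exec": "source-schema.xsd",
    "target-schema-human": "target-schema.html",
    "target-schema-exec": "target-schema.xsd",
    "crosswalk-human": "crosswalk.html",
    "crosswalk-exec": "crosswalk.xsl",
}
root = build_crosswalk_mets("Example crosswalk", files)
print(ET.tostring(root, encoding="unicode"))
```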

25
Crosswalk METS record in OAI repository
28
What the METS encoding solves
  • The semantic and syntactic information required
    for interpreting and executing a crosswalk is
    collected into a single object.
  • The repository is searchable by humans and
    automated processes.
  • Services can be built on top of it.
  • It encourages the development and standardization
    of crosswalks.

These outcomes are possible because every
component in the system is a standard.
29
Terminology Services
  • Terminology services are web services for
    knowledge organization schemes (KOS)
  • e.g., authority files, subject heading systems,
    thesauri, taxonomies, and classification schemes
  • A web service that provides mappings from a term
    in one vocabulary to one or more terms in another
    vocabulary is an example of a terminology service
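
The mapping service described above could be modelled as a lookup from a (vocabulary, term) pair to terms in another vocabulary. The sketch below is an in-memory illustration only; the class name, vocabulary names and terms are all hypothetical, and a real service would sit behind a web interface.

```python
class MappingStore:
    """Holds inter-vocabulary mappings keyed by (vocabulary, term)."""

    def __init__(self) -> None:
        self._mappings: dict[tuple[str, str], set[tuple[str, str]]] = {}

    def add(self, source: tuple[str, str], target: tuple[str, str]) -> None:
        """Record that the source term maps to the target term."""
        self._mappings.setdefault(source, set()).add(target)

    def lookup(self, vocabulary: str, term: str, target_vocabulary: str) -> list[str]:
        """Return the terms in target_vocabulary mapped from the given term."""
        targets = self._mappings.get((vocabulary, term), set())
        return sorted(t for v, t in targets if v == target_vocabulary)


# Invented vocabularies and terms, for illustration only.
store = MappingStore()
store.add(("vocab-a", "Cookery"), ("vocab-b", "Cooking"))
store.add(("vocab-a", "Cookery"), ("vocab-b", "Food preparation"))
store.add(("vocab-a", "Cookery"), ("vocab-c", "641.5"))

print(store.lookup("vocab-a", "Cookery", "vocab-b"))
```

One term may map to several terms in the target vocabulary, so the lookup returns a list, not a single value.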

30
Current Situation
  • A plethora of vocabularies
  • Many encoding formats
  • Few inter-vocabulary connections
  • Identifiers inadequate
  • Unavailable
  • Temporary
  • Inconsistent

31
Terminology services system framework
  • Schema transformation
  • MARC XML
  • SKOS
  • Zthes
  • Record enhancement
  • Inter-vocabulary mappings
  • Persistent identifiers (info URI)
  • Access
  • Human-readable
  • Browse interface (ERRoLs)
  • Search/retrieve records (SRU/W)
  • Switch between schema-specific views (XSLT)
  • Machine-to-machine (m2m)
  • Publishing (OAI)
  • Search/retrieve records (SRU/W)
  • info URI resolution (OpenURL)
  • Open standards
  • MARC 21
  • XML/XSLT/XPath
  • SKOS
  • Zthes
  • SRU/SRW
  • OAI
  • info URI
  • OpenURL
  • Open source software
  • OCLC OAICat
  • OCLC SRU/SRW server
  • OCLC ERRoL J2EE webapp
  • Open content
  • GSAFD, others
  • Open access
  • Web services-oriented

32
Schema Transformation
  • MARC XML
  • Authority Format, Classification Format
  • SKOS
  • Simple Knowledge Organization System
  • Zthes
  • Z39.50 Profile for Thesaurus Navigation
  • Based on Z39.19 (NISO Thesaurus Standard)

33
Vocabulary Processing
schema transformation
data enhancement
  • Add
  • provenance (MARC Org. Codes)
  • persistent identifiers (info:kos)
  • Conversion from most formats
  • Z39.19
  • wordlists in PDF, etc.
  • Optionally, add
  • inter-vocabulary mappings
  • Concepts terms
  • persistent identifiers (info:kos)
  • Initial conversion to MARC XML
  • Authorities format, or
  • Classification format
34
info:kos
  • info URI
  • provides a mechanism for the registration of
    public namespaces that are used for the
    identification of information assets
  • The kos identifier
  • provides a mechanism for identifying knowledge
    organization schemes and the concepts used in
    those schemes. It has two elements:
  • scheme
  • concept
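
A two-element identifier of this kind can be represented as a small value type. The slide names only the two elements (scheme and concept); the string layout used below, `info:kos/<scheme>/<concept>`, is an assumption made for illustration, as are the class name and sample values.

```python
from dataclasses import dataclass

# Assumed prefix and layout; the slide does not specify the syntax.
PREFIX = "info:kos/"


@dataclass(frozen=True)
class KosIdentifier:
    """Identifies a concept within a knowledge organization scheme."""

    scheme: str
    concept: str

    def __str__(self) -> str:
        return f"{PREFIX}{self.scheme}/{self.concept}"

    @classmethod
    def parse(cls, text: str) -> "KosIdentifier":
        """Parse the assumed layout back into its two elements."""
        if not text.startswith(PREFIX):
            raise ValueError(f"not a kos identifier: {text!r}")
        scheme, sep, concept = text[len(PREFIX):].partition("/")
        if not (scheme and sep and concept):
            raise ValueError(f"expected scheme and concept in {text!r}")
        return cls(scheme, concept)


# Hypothetical scheme and concept values.
identifier = KosIdentifier("example-scheme", "c123")
print(identifier)
print(KosIdentifier.parse(str(identifier)) == identifier)
```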

35
New services environment
41
Name authority lookup
Lorcan Dempsey
  • Interactive
  • As a web service
  • An example: authority control service invoked
    from within DSpace?

Click in presentation mode.
46
Working with web services
47
Making data work harder
48
Data mining
  • Research
  • Production
  • Collection analysis service in development phase
  • Leverages WorldCat data in interactive mode
  • Compare my collection to my peers
  • Compare my collection to my neighbors
  • Profile my collection by subject, by age, etc.
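
Comparing a collection to its peers comes down to set operations over holdings. The sketch below assumes each collection is available as a set of title identifiers; the function name, the metrics chosen and the sample data are hypothetical and do not describe the collection analysis service itself.

```python
def compare_collections(mine: set[str], peers: dict[str, set[str]]) -> dict:
    """Summarize overlaps and gaps between my holdings and each peer's."""
    peer_union = set().union(*peers.values())
    held_by_all_peers = set.intersection(*peers.values()) if peers else set()
    return {
        # Titles no peer holds.
        "unique_to_me": mine - peer_union,
        # Share of my titles that each peer also holds.
        "overlap": {
            name: (len(mine & held) / len(mine) if mine else 0.0)
            for name, held in peers.items()
        },
        # Titles every peer holds but I do not.
        "gaps": held_by_all_peers - mine,
    }


# Invented title identifiers, for illustration only.
mine = {"t1", "t2", "t3", "t4"}
peers = {
    "peer-a": {"t1", "t2", "t5"},
    "peer-b": {"t2", "t5", "t6"},
}
report = compare_collections(mine, peers)
print(report)
```

The same approach extends to profiling by subject or age: partition the title sets on that attribute first, then compare within each partition.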

49
Collection
  • Change creates demand for better data.
  • Growing interest in knowing more about:
  • Characteristics
  • Gaps and overlaps
  • Use
  • Tuning collections based on data.
  • Focus collection spending where it creates most
    value.

50
Some projects
  • Characteristics of collections
  • WorldCat
  • CIC
  • Compare ILL, circulation and holdings data.
  • Last copy: what is irreplaceable?
  • ARL Global Resources.
  • Exploring coverage of overseas titles in ARL
    libraries.
  • Depends on consistency, coverage, currency

51
Comparing CIC Collection Profiles
52
Audience level
Forge
Letters
53
Profiles of Letters Forge Example
54
Topics
Metadata management and knowledge organization
Framework for WorldCat directions
Open WorldCat
Working with web services
Some research, some production
Making data work harder
55
Thoughts
  • Machines will do more work
  • Consistency becomes more important
  • Variety
  • Low precision
  • Make data work

56
The pattern is new
The knowledge imposes a pattern, and falsifies,
For the pattern is new in every moment
57
Further information
Thanks to colleagues in OCLC Research
for contributions to this presentation. Further
information about OCLC Research projects can be
found at http://www.oclc.org/research/
Thanks to colleagues in OCLC Collection
Management Services for contributions to this
presentation. Further information about Open
WorldCat at http://www.oclc.org/worldcat/pilot/