Building national and large-scale Internet Information Gateways

Transcript and Presenter's Notes


1
Building national and large-scale Internet
Information Gateways
  • A DESIRE Workshop
  • WELCOME!

2
Building national and large-scale Internet
Information Gateways
  • Introductions and Welcome
  • Nicky Ferguson, Institute for Learning and
    Research Technology, University of Bristol, UK
  • Titia van der Werf, National Library of the
    Netherlands

3
(No Transcript)
4
What is an information gateway?
  • Emma Place
  • Institute for Learning and Research Technology
  • University of Bristol, UK

5
  • "The Web is quickly becoming the world's
    fastest growing repository of data"
  • Tim Berners-Lee, W3C Director and creator of
    the WWW

6
People are increasingly ...
  • going to the Internet
  • before they go to the library

7
Librarians are increasingly ...
  • taking librarianship out of libraries and onto
    the Internet

8
Information gateways ...
  • doing for Internet
  • information resources
  • what librarians do for books

9
Gateways are an Internet search tool
  • to help people find resources on the Internet,
    e.g.
  • electronic journals
  • software
  • datasets
  • electronic books
  • mailing lists / discussion groups (and their
    archives)
  • articles / papers / reports
  • bibliographic databases
  • bibliographies
  • organisational home pages
  • educational materials
  • news
  • resource guides

10
They offer ...
  • Linked collections of Internet resources via a
    database of resource descriptions. This can be
  • browsed - thanks to classification
  • searched - thanks to cataloguing
  • quality controlled - thanks to selection

11
The key ingredient ...
  • the semantics that only the
  • human factor can bring
  • subject specialists
  • library / information professionals

12
Characteristics of an information gateway
  • An online service providing links to Internet
    resources with
  • semantic selection
  • semantic description
  • semantic classification
  • at least part semantic cataloguing
  • (Traugott Koch, Netlab)

13
What gateways are NOT!
  • Internet search engines
  • e.g. AltaVista or Excite

14
What gateways are not 2!
  • Web directories
  • e.g. Yahoo! / The Open Directory

15
What gateways can be ...
  • Virtual libraries involving
  • distributed teams of librarians
  • distributed databases that can be
    cross-searched

16
Some national gateway initiatives in Europe
  • UK - Resource Discovery Network
  • The Netherlands - DutchESS
  • Finland - Finnish Virtual Library Project
  • Germany - Virtual subject libraries
  • France - les Signets

17
A guided tour ...
  • SOSIG: The Social Science Information Gateway
  • http://www.sosig.ac.uk/
  • DutchESS
  • http://www.konbib.nl/dutchess/

18
URLs
  • SOSIG Scope Policy
  • http://www.sosig.ac.uk/desire/escope.html
  • SOSIG Selection Criteria
  • http://www.sosig.ac.uk/desire/ecrit.html
  • DutchESS Manual (in Dutch only)
  • http://www.konbib.nl/dutchess/manual/

19
(No Transcript)
20
Information gateways
  • 49 reasons for National Libraries to be cheerful
    :-)
  • Nicky Ferguson, Institute for Learning and
    Research Technology, University of Bristol, UK
21
Why Gateways at all ?
  • a familiar place - a community centre
  • intermediaries have always been important
  • subject focus leads gently

22
Why Gateways at all ?
  • many users are inexpert users
  • browsing serendipity
  • searching precision
  • both quality

23
Why libraries ?
  • the natural metaphor
  • browsing, reference desk
  • expertise in relevant areas
  • classification, acquisition, keywords
  • information seeking behaviour
  • guiding and helping users
  • who else will do it better ?

24
Why libraries ?
  • the natural metaphor
  • browsing, reference desk
  • expertise in relevant areas
  • classification, acquisition, keywords
  • information seeking behaviour
  • guiding and helping users
  • who else will do it better ?
  • the electronic librarian !

25
Why National Libraries ?
  • too much for one institution
  • too much for one country
  • the influence to collaborate externally
  • and to co-ordinate internally
  • eg The Finnish Virtual Library

26
Benefits, national
  • save the cost of duplicated national effort
  • in academic/public libraries and elsewhere
  • spotlight on nationally funded research
  • trade and business attracted as a result
  • national profile increased
  • by international collaboration

27
Benefits, library
  • leading the way into the information age
  • communicating with non-nerds
  • access to huge high quality collections
  • at lower cost than creating them
  • integrate into existing structures

28
Benefits, users
  • diverse resources brought together
  • research, learning, leisure, enrichment
  • also brought together
  • access for a far wider population
  • someone to ask
  • what's where? what's what? what's good?

29
Benefits of collaboration
  • access to many countries' efforts
  • much work done on
  • standards - technical and information
  • rules and procedures
  • formats, consistency
  • quality controls and quality standards

30
The Ideal
  • only create records for one nation's resources
  • access records from all nations
  • cross-searching
  • across discipline
  • across subject
  • across language

31
The Message
  • go forth and multiply (your gateways)

32
(No Transcript)
33
Information Gateways in perspective
  • Rachel Heery
  • UKOLN: The UK Office for Library and Information
    Networking, University of Bath

34
Information gateways in perspective: summary
  • Information gateways as part of the resource
    discovery landscape
  • spectrum of resource discovery initiatives
  • variety of service models
  • information gateways and metadata
  • variety of metadata creation models
  • metadata v. cataloguing
  • collaboration and integration
  • setting the stage (ROADS, DESIRE)
  • expansion of services
  • developing common approaches

35
Spectrum of resource discovery
  • Selective services
  • targeted coverage
  • explicit selection policy
  • value added description
  • RDN (eLib) gateways
  • Nordic Web Index
  • DutchESS
  • GEM etc.
  • Total services
  • complete coverage
  • business driven selection
  • shallow description
  • AltaVista
  • Google
  • Yahoo! etc.

36
Characteristics of selective services
  • breadth of coverage
  • (selection criteria)
  • quality of resource
  • by subject area
  • by region
  • target audience
  • depth of subject description
  • hand crafted
  • metadata aware harvesting
  • use of standard classification schemes
  • authority files applied

37
Metadata creation
  • Who creates metadata?
  • authors
  • experts
  • metadata creation agencies
  • Where is the metadata?
  • embedded in resource
  • local on site database
  • third party databases

38
Metadata creation aphorisms
  • do work as near source as possible
  • do it once, do it right!
  • but need to consider benefit of
  • pattern of enhancement, incremental approach

39
Metadata creation collaboration
  • working with information providers
  • linking libraries and publishers
  • BIBLINK
  • co-operation between libraries
  • Intercat
  • OCLC CORC project
  • enhancing harvested metadata

40
Description of BIBLINK Workspace
  • BIBLINK Workspace: a shared facility for storing
    and manipulating BIBLINK workspace records
  • Publishers
  • Third parties, e.g. identification agencies
    (ISBN, ISSN, etc.)
  • BIBLINK Workspace Administrator
  • National Bibliographic Agencies
41
Shared approaches
  • compatible technical solutions
  • shared semantics (common metadata sets)
  • shared syntax (HTML, RDF/XML)
  • consistency of content (cataloguing rules)

42
Support activities
  • ROADS
  • DESIRE
  • IMesh
  • Range of associated information gateways
  • DutchESS
  • Finnish Virtual Library project
  • EELS
  • NOVAGate
  • SOSIG
  • Internet Scout
  • etc.

43
(No Transcript)
44
Information Gateways and the international
perspective
  • Marianne Peereboom, KB, The Netherlands
  • Dan Brickley, ILRT, UK

45
Why co-operate?
  • enhancing Internet resource discovery for
    end-users
  • access to much broader collections than any
    single gateway could offer, including high
    quality Internet resources on many subjects, from
    many countries, in many languages
  • access to a large number of metadata records via
    a single, user-friendly interface
  • the ability to locate new gateways that they may
    not have heard about
  • the possibility to search a selection of gateways
    simultaneously as opposed to one by one

46
Why co-operate?
  • improving the efficiency and sustainability of
    gateway services
  • use established technologies, methods and
    practices - and avoid starting from scratch
  • divide responsibilities for creating or sharing
    metadata records - and avoid duplication of
    effort
  • combine effort for technical development - and
    avoid repetition of work and errors
  • create joint publicity, training and promotion
  • share staff effort (management/technical/
    administrative/cataloguing)
  • create shared strategies for long-term
    sustainability

47
Why not?
  • political or funding issues
  • competition instead of co-operation?
  • Safeguard own identity, position in the market
    place
  • possible disadvantages of co-operation
  • co-operation can incur extra expense
  • intellectual property rights
  • agreeing on aims and objectives

48
Models for co-operation
  • co-operative agreements for metadata records
  • creation of metadata records
  • use of metadata records
  • building integrated services
  • pointing to other services
  • mirroring other services
  • cross searching / cross browsing
  • integrated interface
  • customized interfaces to one collection of
    records
  • interoperability

49
Interoperability
  • being able to search, browse and retrieve
    information from distributed gateways based on
    (broadly) the same technologies, protocols and
    metadata formats
  • being able to search, browse and retrieve
    information from distributed gateways based on a
    variety of software solutions, search-retrieve
    protocols and metadata formats
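
The cross-searching idea can be sketched in a few lines of Python. This is only an illustration, not how ROADS or any other gateway software does it: it assumes each gateway can be wrapped as a function that takes a query string and returns records with `url` and `title` keys. The names `cross_search`, `gateway_a` and `gateway_b` and the sample records are hypothetical, and a real service would sit behind a search-retrieve protocol rather than a local function call.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Gateway = Callable[[str], Iterable[Record]]


def cross_search(query: str, gateways: Dict[str, Gateway]) -> List[Record]:
    """Send one query to every gateway and merge the hits.

    Records that point at the same URL are merged into a single hit that
    lists every gateway that returned it.
    """
    merged: Dict[str, Record] = {}
    for name, search in gateways.items():
        for record in search(query):
            url = record["url"]
            if url in merged:
                merged[url]["sources"] += ", " + name
            else:
                merged[url] = {**record, "sources": name}
    return list(merged.values())


# Hypothetical stand-ins for two remote gateways.
def gateway_a(query: str) -> List[Record]:
    data = [{"url": "http://example.org/econ", "title": "Economics data"}]
    return [r for r in data if query.lower() in r["title"].lower()]


def gateway_b(query: str) -> List[Record]:
    data = [
        {"url": "http://example.org/econ", "title": "Economics data"},
        {"url": "http://example.org/hist", "title": "Economic history"},
    ]
    return [r for r in data if query.lower() in r["title"].lower()]


hits = cross_search("econ", {"A": gateway_a, "B": gateway_b})
```

Merging on the URL means that a resource catalogued by several gateways shows up once, with a note of where it was found, instead of as a run of near-identical hits.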

50
Standards
  • search and retrieve (or indexing) protocols
  • Z39.50, Whois, LDAP
  • metadata formats
  • cataloguing standards
  • subject indexing schemes

51
Key initiatives
  • ROADS
  • software and standards for developing information
    gateways which can be cross-searched with any
    other gateway
  • http://www.ilrt.bris.ac.uk/roads/
  • DESIRE
  • European project which aims to promote the
    development of the gateway model in Europe
  • http://www.desire.org/

52
Key initiatives
  • ISAAC
  • a research project of Internet Scout in the USA.
  • aim to create an architecture that enables
    repositories of metadata records to be cross
    searched
  • http://scout.cs.wisc.edu/research/index.html

53
Key initiatives (2)
  • iMesh
  • informal discussion forum to promote
    collaboration amongst information gateways
  • http://www.desire.org/html/subjectgateways/community/imesh/
  • iMesh Toolkit
  • project funded by National Science Foundation
    (USA) and JISC (UK) to develop architecture
    toolkit for distributed subject gateways
    (building on ROADS and ISAAC)

54
Reynard project
  • European project, 5th Framework programme
  • Duration: Jan 2000 - June 2002
  • Aims
  • 1. to provide a one-point-access to, and to aim
    at a consistent presentation of national
    subject-services in Europe
  • 2. to exploit existing services by way of
    creating a shared test environment within which
    national initiatives will experiment with
    co-operative efforts, devise models for sharing
    metadata, agree on technical solutions and
    short-cyclic innovation, develop business
    models and foster standardisation activities

55
Reynard partners
  • national libraries and national resource
    discovery network organisations
  • research libraries who have acquired expertise in
    different areas of subject gateway development
  • library related technology centres and university
    computer centres

56
Demonstrations
  • http://www.desire.org/html/research/demonstrations/

57
(No Transcript)
58
Panel Session
  • Nicky Ferguson, ILRT (Chair)
  • Eric Miller, OCLC
  • Toini Alhainen, Finnish Virtual Library Project
  • Titia van der Werf, National Library of the
    Netherlands
  • Rachel Heery, UK Office for Library and
    Information Networking
  • Debra Hiom, The Social Science Information
    Gateway
  • Traugott Koch, Engineering Electronic Library,
    Sweden

59
Questions for the panel
  • 1) Why give Internet resources different
    treatment in cataloguing? (eg why use metadata
    such as Dublin Core rather than MARC or ISBD-ER
    and why catalogue resources into something
    separate from the library OPAC?)
  • 2) What are the key strengths and weaknesses of
    the information gateway approach?
  • 3) How far should there be a national strategy
    and is there one in your country?
  • 4) What lies in the future? Is creating
    national information gateways a sound foundation
    for future developments? (e.g. in the light of
    forthcoming technologies / metadata formats and
    possible international collaborations)

60
(No Transcript)
61
Sustaining resource description
  • Lorcan Dempsey
  • UKOLN: The UK Office for Library and Information
    Networking, University of Bath

62
(No Transcript)
63
Setting up a gateway - practical issues
  • Emma Place
  • Institute for Learning and Research Technology
  • University of Bristol, UK

64
Three parts ...
  • 1) Overview - Emma Place
  • 2) Information issues - Marianne Peereboom
  • 3) Technical issues - Paul Hollands

65
Coming Soon ...
  • The DESIRE
  • Information Gateways Handbook
  • www.desire.org

66
So what do you need to set up a gateway?
67
Basic ingredients
  • money
  • people
  • equipment
  • time

68
Phases in a gateway project
  • 1) planning
  • 2) set-up
  • 3) building your collection
  • 4) running the service
  • 5) ongoing maintenance / development
  • 6) adding new features
  • 7) managing a mature gateway

69
(1) Planning
  • what is ideal vs. what is possible
  • money / resources
  • strategy, aims, objectives

70
Scoping
  • Components
  • target audience
  • Your decisions!
  • Scope Policy
  • Selection Policy

71
Staff and skills required
  • Skills needed
  • subject expertise
  • information
  • technical
  • interface design
  • training / publicity
  • management
  • Your decisions!
  • central staff
  • distributed staff

72
System requirements
  • What you need
  • network connectivity
  • hardware
  • Web server software
  • database and software
  • PCs and materials for staff
  • Your decisions!
  • standard gateway software
  • your own system

73
(2) Set up
  • Components
  • database
  • user interface
  • admin interface
  • records
  • Your decisions!
  • metadata formats
  • classification scheme
  • tools and guidelines
  • cataloguing rules

74
(3) Building your collection
  • finding resources
  • selecting resources
  • describing resources
  • ensuring quality, consistency and coverage

75
(4) Running the service
  • building a user community
  • publicity and promotion
  • user support and training
  • announcing What's New
  • day-to-day management

76
(5) Ongoing maintenance
  • collection management
  • link checking
  • editing records
  • updating resource descriptions
  • server integrity and functionality

77
(6) Adding new features
  • a harvested index
  • thesaurus feature
  • primary content
  • community areas
  • cross-search features
  • user profiles / personalised views
  • mirrors

78
(7) Mature gateways
  • scalability issues
  • collaboration
  • displaying larger collections
  • rising maintenance costs
  • upgrading the system
  • future proofing
  • hardware
  • software
  • content

79
  • So that's the overview - now let's get down to
    detail ...

80
Information Issues
  • Marianne Peereboom
  • Koninklijke Bibliotheek

81
Workflow for information staff
  • selection of resources
  • cataloguing of resources
  • editing and adding resources to database
  • housekeeping maintenance of resources

82
Selection
  • Resources for the gateway will be selected by
    skilled staff (subject specialists, librarians,
    information specialists). Their value judgement
    will be guided by
  • Scope policy
  • Which type of resources will be included in the
    catalogue
  • Quality criteria
  • Criteria to judge whether a resource that falls
    within the scope of the gateway is of
    sufficiently high quality

83
Selection policy
  • helps users to appreciate that the service is
    selective and quality controlled
  • helps users to understand what type of quality
    information they will find when using the service
  • ensures consistency of selection by individual
    staff members
  • ensures consistency of selection among members of
    a (distributed) team
  • can be used in training new staff

84
Scope policy
  • First identify
  • target user group
  • the information needs of the user group
  • aims and objectives of the gateway
  • balance what you would like to cover with what
    you have the resources to cover

85
Scope policy
  • Metadata and cataloguing
  • granularity
  • resource description
  • Geographical issues
  • geographical restraints
  • language
  • Information coverage
  • subject matter
  • acceptable types of resource
  • acceptable sources
  • acceptable level of difficulty
  • advertising
  • Access
  • cost
  • technology
  • registration
  • special needs

86
Selection criteria
  • Content criteria
  • validity
  • authority and reputation of source
  • accuracy
  • comprehensiveness
  • uniqueness
  • composition and organisation
  • currency, and adequacy of maintenance
  • Form criteria
  • ease of navigation
  • provision of user support
  • use of recognised standards
  • appropriate use of technology
  • aesthetics
  • Process criteria (the system)
  • information integrity (info provider)
  • site integrity (webmaster)
  • system integrity (systems administrator)

87
Examples
  • Make your scope policy and selection criteria
    available for your users, so they will know what
    to expect
  • Scout Report
  • http://scout.cs.wisc.edu/report/sr/criteria.html
  • EELS: Engineering Electronic Library
  • http://www.ub2.lu.se/eel/qualcrit.html
  • SOSIG
  • http://sosig.ac.uk/desire/ecrit.html

88
Resource description
  • To be able to create and maintain resource
    descriptions you need
  • a metadata format
  • cataloguing rules
  • database maintenance tools

89
Resource description - types of data
  • A resource description will record different
    types of information
  • bibliographic-type descriptive information
  • author, title, publisher, location etc.
  • subject information
  • classification code, keywords, thesaurus terms
  • administrative metadata
  • record creation date, intellectual property,
    individuals responsible for selection,
    cataloguing, etc.

90
Metadata
  • Types of metadata formats
  • 1 relatively unstructured data automatically
    extracted from resources and indexed for use by
    robot-based Web services
  • 2 structured formats simple enough to be
    created by non-specialist users. Usually manually
    created, but some data may be extracted
    automatically.
  • Examples: ROADS, Dublin Core
  • 3 specialised formats, developed to organise
    complex relations between objects or collections
    of objects and are often based on implementations
    of SGML
  • Examples: TEI header, MARC
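
As an illustration of the first type, data extracted automatically from the resource itself, the following sketch pulls the page title and any embedded `<meta>` name/content pairs out of an HTML document using Python's standard `html.parser` module. The class name `MetaExtractor`, the function `extract_metadata` and the sample page are made up for this example; a robot-based service would do considerably more (fetching, link following, full-text indexing).

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and any <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        content = attrs.get("content")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name and content is not None:
            # Field names are lower-cased so DC.Creator and dc.creator match.
            self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html_text):
    parser = MetaExtractor()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), **parser.meta}


# Hypothetical page used only to exercise the function.
page = """<html><head><title> Sample gateway page </title>
<meta name="DC.Creator" content="A. Author">
<meta name="keywords" content="gateways, metadata">
</head><body><p>Body text</p></body></html>"""

record = extract_metadata(page)
```

Data harvested this way is only as good as what the author embedded, which is why the second and third types rely on manually created or specialist records.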

91
Metadata format
  • Considerations when choosing a metadata format
  • which minimum set of fields do you need to enable
    the modes of access/search functionality you want
    to provide?
  • do you want to enable interoperability with other
    services in the future? Possibilities for
    conversion.
  • are there any conventions in your subject
    community?
  • does the software you have chosen for your
    service impose restrictions on the format you
    may use?
  • who is going to do the cataloguing?

92
Dublin Core (DC)
  • Content: Title, Subject, Description, Source,
    Relation, Coverage, Type
  • Intellectual property: Creator, Publisher,
    Contributor, Rights
  • Instantiation: Date, Language, Format,
    Identifier
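
One common way to carry Dublin Core in a Web page is as HTML `<meta>` tags with names of the form `DC.Title`. The sketch below, which is an illustration and not part of the workshop material, renders a simple record that way. The function name `dc_meta_tags` and the sample values are assumptions, and the strict rejection of unknown keys is a design choice made for this example.

```python
from html import escape

# The 15 Dublin Core elements named on the slide.
DC_ELEMENTS = (
    "Title", "Subject", "Description", "Source", "Relation", "Coverage",
    "Type", "Creator", "Publisher", "Contributor", "Rights", "Date",
    "Language", "Format", "Identifier",
)


def dc_meta_tags(record):
    """Render a dict of Dublin Core values as HTML <meta> tags.

    Unknown keys raise ValueError so that typos do not slip into records.
    """
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    lines = []
    for element in DC_ELEMENTS:
        if element in record:
            value = escape(record[element], quote=True)
            lines.append(f'<meta name="DC.{element}" content="{value}">')
    return "\n".join(lines)


# Illustrative record; the values are made up for the example.
example = {
    "Title": "Social Science Information Gateway",
    "Identifier": "http://www.sosig.ac.uk/",
    "Language": "en",
}
tags = dc_meta_tags(example)
```

Every element is optional here, which reflects the simplicity of the format: a cataloguer can start with a title and an identifier and enrich the record later.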

93
DC Examples
  • EdNA: Education Network Australia
  • http://www.edna.edu.au/EdNA/
  • Combination of DC and local EdNA elements
  • AGRIGATE: Agriculture Information Gateway
    (Australia)
  • http://www.agrigate.edu.au/index.html
  • Overview of Metadata Fields

94
A simple format: DutchESS
  • Administrative: library subject specialist code,
    record creation date, record last verified date,
    record last update date
  • Mandatory elements: Title, BC code
    (classification), URL, Annotation (in English)
  • Optional elements: Author, Identification (not
    URL, e.g. ISSN)
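
A format this small is easy to validate at data-entry time. The sketch below models the mandatory and optional elements listed above as a Python dataclass with a `problems()` check. It is not the DutchESS software: the class name `GatewayRecord`, the field names, the URL-scheme check and the placeholder values (including the `00.00` classification code) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GatewayRecord:
    # Mandatory elements named on the slide.
    title: str
    bc_code: str          # classification code
    url: str
    annotation: str       # in English
    # Optional elements.
    author: Optional[str] = None
    identification: Optional[str] = None   # e.g. an ISSN, not the URL

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        found = []
        for name in ("title", "bc_code", "url", "annotation"):
            if not getattr(self, name).strip():
                found.append(f"missing mandatory element: {name}")
        # Assumed rule for this sketch: a URL must carry a known scheme.
        schemes = ("http://", "https://", "ftp://")
        if self.url and not self.url.startswith(schemes):
            found.append("url does not look like an Internet address")
        return found


# Placeholder values only; "00.00" is not a real classification code.
good = GatewayRecord("Example journal", "00.00", "http://example.org/", "An example.")
bad = GatewayRecord("", "00.00", "example.org", "An example.")
```

Returning a list of problems rather than raising on the first one lets a cataloguing form show everything that needs fixing in one pass.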

95
DutchESS format
  • Characteristics
  • home grown....
  • mappings ROADS (future DC?)
  • easy to maintain, cataloguing by subject
    specialist possible - no need for skilled
    cataloguers
  • functionality restricted by simple format and
    cataloguing rules

96
More complicated.... ROADS
  • ROADS offers a complete solution to setting up a
    gateway - with metadata templates included
  • has different templates for different types of
    resources e.g. software, document, service,
    mailarchive
  • some templates have up to 80 fields but gateways
    create minimum sets involving fewer
  • easily converted to other formats (MARC/Dublin
    Core)
  • increased functionality

97
Providing subject access
  • Classification
  • describing the broad subject area or discipline a
    resource belongs to
  • used to group documents in well defined subject
    areas
  • Keywords
  • give more detailed description of individual
    document
  • used as a searching aid
  • Thesauri
  • controlled vocabulary with defined (hierarchical)
    relationships between terms
  • structured search for relevant term by indexer
    and user possible

98
Classification
  • Types of schemes
  • universal scheme (DDC, UDC) - example: CyberDewey
  • national scheme (BC, SAB) - example: DutchESS
  • subject specific scheme (Ei, NLM) - example:
    EELS, Finnish Virtual Library
  • home grown (Yahoo!)
  • Main advantages
  • good basis for browsing structure
  • multilingual access possible
  • interoperability (cross browsing)

99
Keywords
  • Useful as an extra search aid
  • Uncontrolled keywords
  • problems with different spellings,
    (near-)synonyms
  • Controlled
  • general (LCSH: Library of Congress Subject
    Headings)
  • subject specific (MeSH: Medical Subject
    Headings)
  • user will need access to the vocabulary to be
    able to find the right term

100
Thesauri
  • Semantic relations between terms defined
  • Broader term
  • Narrower term
  • Top term
  • Related term
  • Preferred term
  • Non-preferred term
  • Use
  • to aid users to find the relevant term (SOSIG)
  • as a basis for the browsing structure, in place
    of a classification scheme (OMNI)
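
The relations listed above map naturally onto a small data structure. The following toy thesaurus is a sketch only: the terms, the dictionary layout and the function names `preferred`, `top_term` and `expand` are invented for illustration and are not taken from the SOSIG or OMNI thesauri.

```python
# A toy thesaurus: every preferred term maps to its relations.
# BT = broader term, NT = narrower terms, RT = related terms.
THESAURUS = {
    "economics": {"BT": None, "NT": ["macroeconomics", "microeconomics"], "RT": ["finance"]},
    "macroeconomics": {"BT": "economics", "NT": [], "RT": []},
    "microeconomics": {"BT": "economics", "NT": [], "RT": []},
    "finance": {"BT": None, "NT": [], "RT": ["economics"]},
}
# Non-preferred terms point at the preferred term to USE instead.
USE = {"political economy": "economics"}


def preferred(term):
    """Map a non-preferred term to its preferred term; others pass through."""
    term = term.lower().strip()
    return USE.get(term, term)


def top_term(term):
    """Follow broader-term links until a term with no broader term is reached."""
    term = preferred(term)
    while THESAURUS[term]["BT"] is not None:
        term = THESAURUS[term]["BT"]
    return term


def expand(term):
    """Return the preferred term plus all narrower terms, depth first."""
    term = preferred(term)
    result = [term]
    for narrower in THESAURUS[term]["NT"]:
        result.extend(expand(narrower))
    return result


search_terms = expand("Political Economy")
```

Expanding a user's term to its preferred form and narrower terms before searching is one way a thesaurus can help users who do not know the controlled vocabulary. Terms missing from the thesaurus raise `KeyError` in this sketch; a real service would need to handle them.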

101
Cataloguing a resource
  • Examples
  • HTML form to submit a resource for DutchESS
  • http://www.konbib.nl/nbw-cgi/usr/nbw_aanmeldform.pl
  • Cataloguing template used in SOSIG
  • http://www.ukoln.ac.uk/metadata/roads/templates/

102
Housekeeping
  • Identify tasks and the staff responsible for
    them
  • Validating records to ensure that the record is
    accurate
  • Link checking records to ensure that resources
    are available
  • Updating resource descriptions to ensure that the
    record still adequately reflects the content of
    the resource
  • Maintenance tool
  • to enable appointed staff to add, edit and remove
    records, provide access to link checker output,
    etc.
  • DutchESS Maintenance tool
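
Link checking is the easiest housekeeping task to automate. The sketch below is not the DutchESS maintenance tool; it is a minimal illustration using Python's standard `urllib`. It assumes records are dictionaries with a `url` key, and it takes the function that fetches a status code as a parameter so the sorting logic can be tried without network access. The names `http_status` and `check_links` are hypothetical.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        return None


def check_links(records, fetch=http_status):
    """Split records into those still available and those needing attention."""
    ok, broken = [], []
    for record in records:
        status = fetch(record["url"])
        if status is not None and 200 <= status < 400:
            ok.append(record)
        else:
            broken.append({**record, "status": status})
    return ok, broken


# Offline demonstration with a fake status table instead of real requests.
fake = {"http://example.org/a": 200, "http://example.org/b": 404}
ok, broken = check_links([{"url": u} for u in fake], fetch=fake.get)
```

A link that answers with a success code can still have changed its content, so automated checking complements the manual updating of resource descriptions and does not replace it.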

103
Staff support
  • Provide training
  • Face to face workshops
  • Online training
  • Provide online documentation
  • DutchESS Manual
  • http://www.konbib.nl/dutchess/manual/
  • SOSIG Admin Centre
  • http://sosig.ac.uk/admin/section-editors/
  • (password protected)

104
Technical issues
  • Paul Hollands
  • Institute for Learning and Research Technology
  • University of Bristol, UK

105
Components of a subject gateway architecture
  • Requirements for your back-end database
  • nested boolean and fielded searching
  • truncation / stemming
  • browseable indexes
  • ranking
  • stored searches
  • batched results
  • flexibility is important; database format is not
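
To make the first two requirements concrete, here is a small in-memory sketch of nested boolean, fielded searching with right-hand truncation. It is an illustration under simplifying assumptions, not a recommendation for a production back end: records are plain dictionaries, field values are split on whitespace only, and the query notation (nested tuples with `AND`, `OR`, `NOT`) is invented for this example.

```python
from fnmatch import fnmatchcase


def matches(record, query):
    """Evaluate a nested query against one record.

    A query is either (field, pattern) or (operator, sub-query, ...), where
    operator is "AND", "OR" or "NOT" and pattern may end in * for truncation.
    """
    head = query[0]
    if head == "AND":
        return all(matches(record, q) for q in query[1:])
    if head == "OR":
        return any(matches(record, q) for q in query[1:])
    if head == "NOT":
        return not matches(record, query[1])
    field, pattern = query
    words = record.get(field, "").lower().split()
    return any(fnmatchcase(word, pattern.lower()) for word in words)


def search(records, query):
    """Return the records that satisfy the query, in their original order."""
    return [r for r in records if matches(r, query)]


# Made-up records for the demonstration.
records = [
    {"title": "Economic history of Europe", "keywords": "economics history"},
    {"title": "Economics teaching resources", "keywords": "education"},
    {"title": "History of art", "keywords": "art history"},
]
query = ("AND", ("title", "econom*"), ("NOT", ("keywords", "education")))
hits = search(records, query)
```

Because the query is a tree, boolean operators can be nested to any depth, and each leaf names the field it applies to. Ranking, stemming and browseable indexes are left out of this sketch.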

106
Components - Web interfaces 1
  • Front of house
  • search forms (simple and advanced)
  • results with a variety of format options
  • browseable directory style subject listings
  • forms to suggest new entries
  • personalised, portal-style "my gateway"
    interfaces

107
Components - Web interfaces 2
  • Back office / administration interfaces
  • metadata creation / cataloguing
  • database administration
  • edit and delete records and authority lists
  • managing indexing (immediate and deferred)
  • link checking
  • handling new submissions
  • checking for duplicates and database integrity
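
Duplicate checking usually starts with the URL, since the same resource is often submitted with trivially different addresses. The sketch below normalises URLs with Python's standard `urllib.parse` before comparing them. The normalisation rules chosen here (lower-case host, drop default port and fragment, treat an empty path as `/`) are assumptions for the example, as are the function names and the `id`/`url` record layout.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url):
    """Reduce a URL to a canonical form for duplicate detection."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    default_ports = {"http": 80, "https": 443}
    if parts.port and parts.port != default_ports.get(scheme):
        host = f"{host}:{parts.port}"
    path = parts.path or "/"
    # The fragment is dropped because it does not identify a different resource.
    return urlunsplit((scheme, host, path, parts.query, ""))


def find_duplicates(records):
    """Group record ids by normalised URL; return groups with more than one id."""
    groups = {}
    for record in records:
        groups.setdefault(normalise_url(record["url"]), []).append(record["id"])
    return {url: ids for url, ids in groups.items() if len(ids) > 1}


# Made-up records for the demonstration.
sample = [
    {"id": 1, "url": "http://WWW.Example.org"},
    {"id": 2, "url": "http://www.example.org:80/#top"},
    {"id": 3, "url": "http://www.example.org/other"},
]
duplicates = find_duplicates(sample)
```

This only catches records whose addresses differ superficially. Mirrors and moved resources need human judgement, so the output is best treated as a list of candidates for a cataloguer to review.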

108
Interoperability
  • can you accept queries from and send results back
    to clients using
  • WHOIS ?
  • Z39.50 ?
  • SQL ?
  • LDAP ?
  • can you generate Centroids to become a part of a
    cross-searchable mesh of servers?
  • can you produce results in a range of metadata
    formats?
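
In WHOIS++ a centroid is, roughly, a summary of a server's holdings: for each attribute, the list of distinct words that occur in it. An index server can use such summaries to decide which servers are worth forwarding a query to. The sketch below is a simplification that ignores template types and the wire format; the function names `build_centroid` and `may_have_hits` and the sample records are made up for this example.

```python
import re
from collections import defaultdict


def build_centroid(records):
    """Map each field name to the sorted list of distinct words used in it."""
    words = defaultdict(set)
    for record in records:
        for field, value in record.items():
            words[field].update(re.findall(r"[a-z0-9]+", value.lower()))
    return {field: sorted(tokens) for field, tokens in words.items()}


def may_have_hits(centroid, field, word):
    """True if a server with this centroid could answer the query at all."""
    return word.lower() in centroid.get(field, [])


# Made-up records standing in for one gateway's database.
holdings = [
    {"title": "Economic history", "subject": "economics"},
    {"title": "History of art", "subject": "art"},
]
centroid = build_centroid(holdings)
```

A word appearing in the centroid does not guarantee a hit for a multi-word query, since the words may come from different records, but a word missing from it means the server can safely be skipped.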

109
Other tools
  • harvested databases
  • automated cataloguing
  • user profiles
  • thesauri

110
Open source
  • do others have the opportunity to build on your
    work to the benefit of you both?
  • will you have to pay a developer to make changes
    to the functionality of your system rather than
    do it yourself?
  • will you have to wait for your developer to
    produce bug fixes your staff could make
    themselves given the source code?

111
ROADS gateway toolkit
  • www.ilrt.bris.ac.uk/roads
  • www.opensource.ac.uk

112
(No Transcript)
113
Discussion/Surgery
  • 1. technical issues
  • 2. information management
  • 3. organisational and day-to-day management
  • 4. funding and business models

114
Closing Address
  • Nicky Ferguson
  • Institute for Learning and Research Technology
  • University of Bristol, UK

115
Lunch... then Surgery time