Title: Content Organisation for Internet-based Information Services
1Content Organisation for Internet-based
Information Services
T.B. Rajashekar, Visiting Scientist, Informatics (India) Ltd., Bangalore - 560 003
(E-Mail: raja_at_informindia.com)
2Content Organisation for Internet-based
Information Services
- Web information architecture
- Internet-based library and information services
- How are the applications accessed?
- Issues in content hosting
- Content formats
- Tools for content creation and processing
- Related sources
3Web Information Architecture
- Web servers and browsers
- Web servers store a variety of web-compatible documents and provide access to these on the Internet or an intranet
- PCs, RISC-based workstations/servers
- These documents are accessed using Web browsers like Netscape and IE
- Palm tops, laptops, PCs, workstations, etc.
4Web Information Architecture
- Web sites and URL
- One or more web servers identified with a unique web site address on the Internet (e.g. www.iisc.ernet.in)
- Documents available on a Web site are uniquely identified using the URL scheme: access protocol://host.domain:port/path/filename (Ex. http://www.ncsi.iisc.ernet.in/ncsi/database.html)
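The parts of a URL can be pulled apart with Python's standard `urllib.parse` module. This is only an illustrative sketch: the helper name `describe_url` and the default-port table are assumptions added here, not part of the original slides.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def describe_url(url):
    """Split a URL into access protocol, host, port and path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
    }

info = describe_url("http://www.ncsi.iisc.ernet.in/ncsi/database.html")
for name, value in info.items():
    print(f"{name}: {value}")
```

When the port is omitted, as in the example URL, the browser falls back to the protocol's default port, which is what the lookup table imitates.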
5Web Information Architecture...
- Anatomy of a web site
- Hardware, software (OS, web server, CGI, database, indexing and search, etc.)
- Dedicated Internet/intranet connectivity
- Information content: Documents stored in a variety of formats (HTML, SGML, PDF, databases, images, audio, video, etc.)
6Web Information Architecture...
- Anatomy of a web site
- HTML pages integrate access to this information
- Organized in a hierarchical manner
- Home page (root page) provides links to second level HTML pages, which in turn link to third level HTML pages, and so on
- These pages may contain images and provide access to databases through search forms, PDF files, audio and video, etc., or link to documents on other servers
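The hierarchical organisation of a site can be sketched as a link graph walked outward from the home page. The page names below are hypothetical, and the breadth-first walk is just one simple way to compute each page's level; it is not taken from the slides.

```python
from collections import deque

# Hypothetical link structure: page -> pages it links to.
SITE_LINKS = {
    "index.html": ["services.html", "collections.html"],
    "services.html": ["opac.html", "alerts.html"],
    "collections.html": ["ejournals.html"],
    "opac.html": [],
    "alerts.html": [],
    "ejournals.html": ["index.html"],  # link back to the home page
}

def page_levels(links, home):
    """Return the level of each page reachable from the home page (home = 1)."""
    levels = {home: 1}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in levels:  # skip pages already placed in the hierarchy
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels

levels = page_levels(SITE_LINKS, "index.html")
for page, level in levels.items():
    print(level, page)
```

A listing like this is a quick check that no important page sits too many clicks away from the home page.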
7Web Information Architecture...
8Applications
- Let us consider different Internet-based applications in the library
- Assumptions
- Library has a LAN with enough network computers for staff and users
- Library LAN is linked to institutional intranet
- Has full, round-the-clock Internet access
- Library web site is the integrating factor
10Applications
11Applications
- Information about the library and its services
- Aims and objectives, opening hours, rules and regulations, departments, collection, budget, staff, map/directions, contact, FAQs, etc.
- Locally owned electronic info sources
- OPAC
- Networked CD-ROMs
- Electronic journals
- Internal publications (e.g. staff publications,
dissertations, reports, projects, manuals)
12Applications
- Locally owned electronic info sources
- Reference sources (e.g. dictionaries, encyclopedias)
- Digital audio, video and multi-media collections
13Applications
- Remote information sources
- Available via the Internet
- Subscribed electronic journals, bibliographic databases, reference sources
- Free (e.g. PubMed, PubSci)
- Z39.50 based library catalogues and databases
- Content available on resource sharing networks (local, regional, national)
- Available on the intranet
- Content hosted on other web sites in the intranet
14Applications
- Push-based services
- Current awareness (e.g. new additions, news letters, content pages)
- Profile-based alerting services (e.g. SDIs, content pages)
- Discussion forums
- Housekeeping operations
- Book acquisitions
- Technical processing (cataloguing, classification)
- Serials management
15Applications
- Other content/ services
- Training material, guides
- Administrative, procedural manuals
- FAQs, Feedback
16How Are the Applications Accessed?
- Typically, access to all these services is integrated through the library web site (library web site design will be discussed in the next session)
- User visits the library web site using a web browser and browses through HTML pages
- Selects a link pointing to content likely to provide the information and accesses the information source/service/application
- Content has to be delivered in browser compatible format. How do we achieve this?
17Issues In Content Hosting
- Key questions to ask while hosting content
- What content do I want to provide access to?
- What is the delivery medium?
- Web
- Browser compatible content
- E-Mail
- Push-based services, discussion forums
- Telnet
- Legacy databases (non-web aware)
- FTP
- Download files
18Issues In Content Hosting
- Key questions to ask while hosting content
- Web-enabled content
- Browser aware? (HTML, GIF, JPEG)
- Other formats will require helper applications/plug-ins (e.g. PDF: Acrobat Reader, DOC: MS Word)
- How is this going to be accessed?
- Browse
- Search
- Site search
- Content specific search
- Search level: Full text, field specific
- Browse and search
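The distinction between browser-aware formats and formats needing a helper application can be sketched as a simple lookup on the file extension. The two tables below only mirror the examples named on this slide; the function name and the fallback message are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Formats the slide lists as directly displayable by the browser.
BROWSER_NATIVE = {".html", ".htm", ".gif", ".jpg", ".jpeg"}
# Formats the slide lists as needing a helper application or plug-in.
HELPERS = {".pdf": "Acrobat Reader", ".doc": "MS Word"}

def delivery_requirement(filename):
    """Say how a hosted file would reach the user, judged by its extension."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in BROWSER_NATIVE:
        return "displayed by the browser"
    if ext in HELPERS:
        return f"needs helper/plug-in: {HELPERS[ext]}"
    return "unknown format: offer as download"

for name in ["rules.html", "map.GIF", "annual-report.pdf", "manual.doc", "data.zip"]:
    print(name, "->", delivery_requirement(name))
```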
19Issues In Content Hosting
- Key questions to ask while hosting content
- Who is going to access this? (download time)
- Intranet user
- Internet user
- How is the content stored (granularity)?
- File
- Record (in a database)
- How is this content going to be created?
- Paper to Web
- Scanning and conversion
- Data entry
- Electronic to Web (e.g. Word to HTML or PDF)
20Issues In Content Hosting
- Key questions to ask while hosting content
- What content creation/conversion tools are to be used?
- Free
- Commercial
- Security considerations for purchased content (locally hosted, remote sources)
- How do we implement access restrictions?
- How do we facilitate easy access to all internal users?
- Based on IP range, domain name
- User name/password (how to administer this?)
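Access control based on IP range can be sketched with Python's standard `ipaddress` module. The address ranges below are placeholders, not the ranges of any real institution, and `needs_login` is a hypothetical helper: internal users pass straight through, everyone else falls back to user name/password.

```python
import ipaddress

# Placeholder intranet ranges; a real site would list its own networks here.
INTRANET_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]

def needs_login(client_ip):
    """Return True when the client is outside the intranet and must log in."""
    address = ipaddress.ip_address(client_ip)
    return not any(address in network for network in INTRANET_RANGES)

for ip in ["10.1.2.3", "192.168.1.50", "203.0.113.9"]:
    print(ip, "login required" if needs_login(ip) else "intranet user, no login")
```

This combination keeps access easy for internal users while leaving only the remote users to be administered with passwords.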
21Content Formats
- Content: Information about library and its services, policy/plan documents, e-journals, annual reports, manuals
- Running text with/without graphics, tables, charts, multimedia
- Storage: File
- Access: Browse and select (e.g. Doc title -> Content page -> Section/chapter)
- Content formats: HTML, PDF, ASCII (text), images (GIF, JPEG), audio (.WAV, .RA), video (.AVI, .MOV, .MPG), etc.
- Search: Site search, collection specific search
22File level content (example 1)
23File level content (example 1) (contd.)
24File level content (example 1) (contd.)
25File level content (example 2)
26File level content (example 2) (contd.)
27File level content (example 2) (contd.)
28Content Formats
- Content: OPAC
- How to handle non-web enabled OPACs (legacy library automation/DBMS packages)?
- Export OPAC records into text files
- Index using a web enabled software (e.g. MG, Free-WAIS)
- Export OPAC records into ISO 2709 format
- Import into a CDS/ISIS database
- Use CDS/ISIS web tools (e.g. WWWISIS)
- Use ODBC for providing web access via CGI
- CDS/ISIS databases
- Use CDS/ISIS web tools (e.g. WWWISIS)
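The "export to text, then index" route can be illustrated with a toy inverted index. This is not how MG or Free-WAIS work internally; the tagged export layout (ID/TI/AU) and the sample records are made up for the sketch.

```python
import re
from collections import defaultdict

# Made-up export: one tagged field per line, blank line between records.
EXPORT = """\
ID: 1
TI: Introduction to digital libraries
AU: Sharma, R.

ID: 2
TI: Cataloguing and classification
AU: Rao, K.

ID: 3
TI: Digital imaging for libraries
AU: Sharma, R.
"""

def parse_records(text):
    """Turn the exported text into a list of {tag: value} records."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            tag, _, value = line.partition(": ")
            record[tag] = value
        records.append(record)
    return records

def build_index(records):
    """Map each lower-cased word in title/author to the record IDs containing it."""
    index = defaultdict(set)
    for record in records:
        for field in ("TI", "AU"):
            for word in re.findall(r"[a-z0-9]+", record.get(field, "").lower()):
                index[word].add(record["ID"])
    return index

def search(index, query):
    """Return IDs of records containing every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(parse_records(EXPORT))
print(sorted(search(index, "digital libraries")))
```

A CGI script in front of such an index would take the query from a search form and return the matching records as an HTML page.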
29Content Formats
- Content: Bibliographic, experts, institutions, projects
- Document surrogates (meta data)
- Storage: Database
- Structured (RDBMS)
- Unstructured (text-oriented)
- Access
- Database title -> Record/field-based search form
- Tools: WWWISIS, MG, SQL, free-WAIS
- Content format: Database specific
- Could also be used to provide access to full text
at file level using hypertext linking
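A field-specific search over record-level content, with each record linking to a full-text file, might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for the RDBMS; the table, its columns, the sample rows and the file paths are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (title TEXT, institution TEXT, report_url TEXT)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        ("Web access to library catalogues", "Institute A", "/reports/web-opac.pdf"),
        ("Soil chemistry survey", "Institute B", "/reports/soil.pdf"),
        ("Digital library pilot", "Institute A", "/reports/dl-pilot.pdf"),
    ],
)

# Only these fields may be chosen on the search form.
SEARCHABLE_FIELDS = {"title", "institution"}

def field_search(conn, field, term):
    """Return (title, report_url) for records whose field contains the term."""
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"field not searchable: {field}")
    # The field name comes from the whitelist above; the term is passed as a parameter.
    sql = f"SELECT title, report_url FROM projects WHERE {field} LIKE ? ORDER BY title"
    return conn.execute(sql, (f"%{term}%",)).fetchall()

for title, url in field_search(conn, "institution", "institute a"):
    print(f'<a href="{url}">{title}</a>')
```

Each hit is printed as a hypertext link, which is how a record in the database can lead the user to the full text stored at file level.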
30Record level content
31Record level content(contd.)
32Record level content (contd.)
33Record level content (contd.)
34Record level content (RDBMS)
35Record level content (RDBMS) (contd.)
36Content Formats
- Content: Web access to networked CD-ROM databases
- Traditional solution: File server with access limited to PC clients
- Solutions now exist to provide platform independent access to Windows applications on the Internet
- Ex. Citrix's MetaFrame using ICA (www.citrix.com)
- ICA client can be used alone or as a plug-in to Web browsers to access Windows applications
- We can thus present an integrated list of networked databases on the library web site and allow users to access these using web browsers from different platforms
- Trend is towards hard disc hosting for web access (e.g. ERL, OVID)
37Hard disc hosting of CD-ROM content using ERL
38Hard disc hosting of CD-ROM content using ERL
(contd.)
39Hard disc hosting of CD-ROM content using ERL
(contd.)
40Hard disc hosting of CD-ROM content using ERL
(contd.)
41Content Formats
- Content: Remote information sources
- Most library web sites provide links to Internet sites (subscribed and/or free) of relevance to their users (Internet resource catalogue/gateway)
- Typical details include title, description, keywords, source type, access details, etc.
- Resource description (meta data) standards: Dublin Core
- Format
- Simple HTML-based listings (by subject, source type)
- Database driven catalogues, supporting keyword search and subject/source type browsing
- Z39.50 access to library catalogues (using client software like Bookwhere)
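A resource description using Dublin Core elements can be embedded in an HTML page as `<meta>` tags with names of the form `DC.element`. The sketch below generates such tags for one catalogue entry; the function name is an assumption, and the sample record simply reuses details of D-Lib Magazine given elsewhere in these slides.

```python
from html import escape

def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags."""
    lines = []
    for element, value in record.items():
        values = value if isinstance(value, list) else [value]
        for item in values:  # repeatable elements get one tag per value
            content = escape(str(item), quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)

record = {
    "title": "D-Lib Magazine",
    "description": "E-journal reporting new developments in digital libraries",
    "subject": ["digital libraries", "tools"],
    "type": "e-journal",
    "identifier": "www.dlib.org",
}
print(dublin_core_meta(record))
```

The same record could equally be stored as a row in a database driven catalogue; the element names stay the same and only the storage changes.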
42Internet resource catalogue - e-journals (RDBMS)
43Internet resource catalogue - e-journals (RDBMS)
(contd.)
44Internet resource catalogue - e-journals (RDBMS)
(contd.)
45Content Formats
- Content: Push services
- Current Awareness Services (e.g. list of additions, news letters)
- Format: HTML, ASCII
- Delivery: E-Mail
- Also hosted on the web site
- Profile-based alerting services
- Web-based profile set-up and maintenance by individual users
- Processing: Extraction from databases
- Delivery: E-Mail
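A profile-based alerting service boils down to matching new items against each user's keywords and mailing the hits. The sketch below only builds the e-mail messages and does not send them; the profiles, addresses and titles are invented, and a real service would extract new items from a database rather than a list.

```python
from email.message import EmailMessage

# Invented user profiles: e-mail address -> keywords of interest.
PROFILES = {
    "user1@example.org": {"digital", "xml"},
    "user2@example.org": {"serials"},
}
# Invented list of new additions.
NEW_ADDITIONS = [
    "XML for digital libraries",
    "Serials management handbook",
    "Soil chemistry",
]

def matches(keywords, title):
    """True when any profile keyword appears as a word in the title."""
    return bool(keywords & set(title.lower().split()))

def build_alerts(profiles, additions):
    """Build one e-mail per user who has at least one matching new item."""
    alerts = []
    for address, keywords in profiles.items():
        hits = [title for title in additions if matches(keywords, title)]
        if not hits:
            continue
        message = EmailMessage()
        message["To"] = address
        message["Subject"] = f"Library alert: {len(hits)} new item(s)"
        message.set_content("\n".join(f"- {title}" for title in hits))
        alerts.append(message)
    return alerts

alerts = build_alerts(PROFILES, NEW_ADDITIONS)
for message in alerts:
    print(message["To"], "|", message["Subject"])
```

The messages could then be handed to a mail server, for example with `smtplib`, and the same list of hits could be posted on the web site.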
46An example Push service
47An example Push service (profile set up)
48An example Push service (modification)
49Content Formats: Summary
- Major content types in library web sites
- Web pages (HTML), Images (GIF, JPEG), Text (ASCII), Text (PDF), Search-based content from databases
- Other content types
- DHTML (interactive web pages), Audio (WAV, RA, MP3), Video (AVI, MOV, MPEG), Animations (GIF, Flash), Multimedia presentations (Macromedia Director)
- Emerging formats
- XML (Extensible Markup Language)
50Tools for Content Creation and Processing
- Some content creation and conversion tools
- HTML: Netscape Composer, MS FrontPage
- Imaging (GIF, JPEG): Adobe Photoshop and Paint Shop Pro
- OCR: TextBridge
- PDF: Adobe Acrobat Exchange
- Conversion tools (Word-to-HTML, text-to-HTML)
- Audio capture and conversion (Real Audio and MP3)
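To give a feel for what a text-to-HTML conversion tool does, the sketch below turns plain ASCII text into a minimal HTML page, treating blank lines as paragraph breaks. It is a simplified illustration, not the behaviour of any of the products named above; the function name and sample text are assumptions.

```python
from html import escape

def text_to_html(text, title):
    """Convert plain text to a minimal HTML page, one <p> per paragraph."""
    paragraphs = []
    for block in text.split("\n\n"):
        clean = " ".join(block.split())  # join wrapped lines, collapse spaces
        if clean:
            # Escape &, < and > so they display as text instead of markup.
            paragraphs.append(f"<p>{escape(clean)}</p>")
    body = "\n".join(paragraphs)
    safe_title = escape(title)
    return (
        "<html>\n"
        f"<head><title>{safe_title}</title></head>\n"
        "<body>\n"
        f"<h1>{safe_title}</h1>\n"
        f"{body}\n"
        "</body>\n"
        "</html>"
    )

sample = "Library rules\nand regulations.\n\nFees < 10 & fines."
page = text_to_html(sample, "Rules")
print(page)
```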
51Tools for Content Creation and Processing
- Search tools
- Web site indexing (HtDig)
- Bibliographic databases (WWWISIS)
- Bibliographic and full text databases (MG)
- RDBMS access using ODBC (FOXPRO, ORACLE)
52Related Sources
- Network-Based Library and Information Services (NETLIS) (smart.ncsi.iisc.ernet.in/netlis)
- Digital Library Sunsite: Digitisation tools and resources (sunsite.berkeley.edu)
- D-Lib magazine: e-journal reporting new developments in digital libraries, including tools for handling digital content (www.dlib.org)
53Related Sources
- From Paper to Web: How to make information instantly accessible (Tony McKinley) (imagebiz.com)
- Digitisation of exam papers (Andrew Hampson et al) (The Electronic Library, 17(4), Aug 1999, 239-46)
- Web publishing with Acrobat and PDF (Bruce Page and Diana Holm, 1996. John Wiley)
54Related Sources
- Networked Digital Library of Theses and Dissertations (www.ndltd.org)
- Guide to Networked Resources and Tools (GNRT) (www.terena.nl/libr/gnrt)
- IFLA (www.ifla.org), particularly (www.ifla.org/I/training/html)
- The digital library tool kit (Dr. Peter Noerr, Sun Microsystems, April 1998) (www.edulib.com)
55Thank You