Title: Digital Library: The HKU Libraries
Digital Library: The HKU Libraries experience
- Kam-ming Ku
- HKUL
- kmku_at_hku.hk
The presentation is about
- How to deliver the right information to the right person at the right time, anywhere?
- HKUL resources/projects
- Going to do
- Challenges
- Overcome the challenges
- Discussion
1. HKUL resources/projects
- 1.1 Staffing
- 1.2 Networking
- 1.3 Hardware
- 1.4 Software
- 1.5 DL initiatives
1.1 Systems Staff
- Systems Librarian
- 2 Computer Officers
- Assistant Librarian
- Assistant Computer Officer
- Senior Library Assistant
- 5.5 Technicians
1.2 Networking
- From 10 → 100 → 1000 (Mbps) → wireless → Bluetooth??
- Gigabit Ethernet backbone, with Fast Ethernet running to users. About 1000 network points.
- ACENet connection (Access Everywhere Network, a plug-in network for roaming users): 450 fixed points and 18 wireless access points.
1.2 Networking (cont.)
- Libraries within the campus are connected to the campus backbone by a Gigabit Ethernet or Fast Ethernet link.
- 2 remote sites, the Dental and Medical Libraries, are each connected to the main campus by a 10 Mbps link.
- Gigabit firewall (Cisco PIX Firewall)
- Packeteer network shaper
1.3 Hardware
- Compaq AlphaServer GS60E (for library catalogue)
- SUN Enterprise 4000, 10000
- 3 Linux, 5 Windows and 3 Novell Servers
1.3 Hardware (cont.)
- 10 CDROM towers: 4 for staff, 2 in the Medical Library, 4 for the network
- 3 WinFrame/MetaFrame servers and 1 thin-client server:
- 1 network CD-ROM MetaFrame server
- 1 standalone CD-ROM MetaFrame server
- 1 network CD-ROM WinFrame server
- 1 Dell server for 6 thin clients
1.3 Hardware (cont.)
               PC    MAC
Office/Staff   289   6
Counter        35
Student        342   7

               Printer   Scanner
Office/Staff   107       17
Student        27        12
1.4 Software
- SUN Solaris 8, DEC UNIX, Windows 2000/NT, Novell Netware, Linux
- III Innopac library management system
- Oracle 9i database, 9iAS (Web) and Context (full-text indexing/searching)
- ERL server for SilverPlatter databases
- WinFrame server for legacy and network CDROM databases
- Apache Web servers
1.4 Software (cont.)
- TRS 4.0 server
- CJN server for hosting 6000 China full-text journals
- Proxy server, Samba server
- Pcounter server
- Tamino XML server
- VOD server (IBM Videocharger)
- Ezproxy Server
1.4 Software (cont.)
- ILLiad server (Inter-library Loan)
- Taiwan Newspaper database
- Chinese database server: Sibucongkan (四部叢刊), Sikuquanshu (四庫全書), ekangxi dictionary (康熙字典)
1.5 HKUL DL initiatives
- Imaging database
1.5 HKUL DL initiatives (cont.)
- 1.5.1 Digitization projects
- e.g. ExamBase
- First database developed in-house
- Imaging database for past exam. papers
- Released in 1996
- Used DMS with a client-server model
- Soon shifted to a web-based interface
- TIFF only (converted to GIF/JPG on the fly), no PDF!!!
- Hardware
- High-speed flatbed scanner (36 ppm)
- Software
- Kofax Capture 3.0
- Sophisticated software that includes scanning, OCR and verification.
- Logistics
- Scanning
- Automatic indexing
- Verification and manual inputting
- Data publishing
- Publish data to Oracle database
- Scanning
- Papers are scanned in batch mode (200 pages per batch)
- A separation sheet is used to separate different documents. (The separation sheet is printed with barcoded indexes, e.g. department and course code, and fixed-size font text. The separation sheets can be re-used.)
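The batch-splitting idea can be sketched roughly as follows. This is a minimal illustration, not the Kofax Capture workflow itself: the `Page` and `Document` classes, the `KEY=VALUE;KEY=VALUE` barcode encoding and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    """One scanned image; `barcode` is set only on separation sheets."""
    image_id: str
    barcode: Optional[str] = None  # hypothetical encoding: "DEPT=MATH;COURSE=MATH1001"


@dataclass
class Document:
    """One exam paper: its index fields plus the images that belong to it."""
    index: dict
    pages: list = field(default_factory=list)


def parse_barcode(value):
    """Turn 'KEY=VALUE;KEY=VALUE' into a dict of index fields."""
    return dict(item.split("=", 1) for item in value.split(";") if item)


def split_batch(pages):
    """Start a new document at every separation sheet; the sheet itself is not kept."""
    documents = []
    current = None
    for page in pages:
        if page.barcode is not None:
            current = Document(index=parse_barcode(page.barcode))
            documents.append(current)
        elif current is not None:
            current.pages.append(page.image_id)
        else:
            raise ValueError(f"page {page.image_id} precedes any separation sheet")
    return documents
```

Because the sheet carries the index values, one batch can mix papers from several departments without any manual sorting beforehand.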
- Automatic indexing
- Recognizes the barcoded indexes and text printed on the separation sheet
- Verification and manual inputting
- No need to verify the barcoded indexes, as the accuracy is > 99.999%
- In-doubt OCRed text is marked in red, so it is easy to verify
- Input other indexes manually (e.g. exam. date)
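A rough sketch of the verification step: barcoded values are accepted as they are, while OCRed words whose confidence falls below a threshold are marked for the operator. The per-word confidence scores, the 0.90 threshold and the `[[...]]` marker (standing in for the red highlight) are assumptions for illustration, not the behaviour of the actual software.

```python
def mark_in_doubt(words, threshold=0.90):
    """Return the OCR text with low-confidence words wrapped in [[...]].

    `words` is a list of (text, confidence) pairs with confidence in 0..1.
    """
    marked = []
    for text, confidence in words:
        marked.append(f"[[{text}]]" if confidence < threshold else text)
    return " ".join(marked)


def needs_verification(words, threshold=0.90):
    """True if at least one word should be checked by a person."""
    return any(confidence < threshold for _, confidence in words)
```

Only records for which `needs_verification` is true would need to be opened by a verifier; the rest can go straight to publishing.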
1.5 HKUL DL initiatives (cont.)
- e.g. Newspaper clippings
- Full-text imaging database
- Outsource scanning/indexing/OCR
- Oracle context cartridge as full-text search engine (supports no Chinese!)
- Decision: keep on using it, or buy 3rd-party full-text software??
1.5 HKUL DL initiatives (cont.)
- 1.5.2 Value-added bibliographic databases
- Subset of library catalogue
- e.g. TOC, Thesis Online, AV materials...
- Debate
- A single point of access or a number of subsets??
1.5 HKUL DL initiatives (cont.)
- e.g. Table of Contents
- To automate the inputting of TOC into bibliographic records
- Hardware
- Overhead book scanner (4 sec per image)
- Software
- Kofax Capture 3.0
- Sophisticated software that includes scanning, OCR and verification.
- Techniques
- Scanning
- Chinese OCR
- Proofreading
- Data publishing
- Publish data to Catalogue
- Scanning
- Use the book scanner to scan each book's TOC
- Benefits
- No need to flip the book for scanning
- Can scan two sides at one time
- Increases the speed of scanning
- Chinese OCR
- A plug-in module was written to interface with Kofax Capture for Chinese OCR (TH-OCR 7.5)
- Proofreading
- Use MS Word (Chinese) to do the proofreading
- A macro program was written to ease the step of assigning MARC sub-fields
- Publish data to Catalogue
- Done at night in batch mode
- Use a tcl/tk expect script to automate the upload process
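As an illustration of the sub-field assignment step, the sketch below turns proofread TOC entries into a MARC 21 enhanced contents note. The slides do not say which tag or sub-fields the macro produced, so the use of field 505 with $t (title) and $r (statement of responsibility), as well as the function name, are assumptions.

```python
def toc_to_505(entries):
    """Build an enhanced contents note from (title, author) pairs.

    Assumes MARC 21 field 505 with indicators "00", subfield $t for each
    title and $r for each author; entries are separated by " --".
    """
    parts = []
    for title, author in entries:
        part = f"$t{title.strip()}"
        if author:
            part += f" /$r{author.strip()}"
        parts.append(part)
    return "505 00 " + " --".join(parts) + "."
```

For example, `toc_to_505([("Introduction", "K. Chan"), ("Methods", "")])` gives `505 00 $tIntroduction /$rK. Chan --$tMethods.`, a line that a nightly batch job could then load into the catalogue.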
1.5 HKUL DL initiatives (cont.)
- 1.5.3 Subject-based e-resources
- Redesign tag 996
- A range of useful information on e-resources
- Grouping materials by subject fulfils users' needs
- Eases the extension of our further DL projects (e.g. portal)
- See HKUL HP (databases, EJ, Ebooks and ENews)
- 1.5.4 Internet resources
- 1.5.5 Electronic Delivery (ILLiad)
1.5 HKUL DL initiatives (cont.)
- 1.5.6 Virtual services
- E-forms (e.g. BRO)
- Online reference
- 1.5.7 Automation
- Increase efficiency
- e.g. amend thousands of records in batch
- Electronic submission
- Staff intranet
- Innoface
1.5 HKUL DL initiatives (cont.)
- 1.5.8 Collaboration
- Union catalogue w/ Jinan University
- 1.5.9 Authentication: proxy, ezproxy, IP control
- 1.5.10 Others: for accessing legacy CDROM databases
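IP control can be pictured as a simple range check: requests from recognised campus networks go straight to the resource, and everything else is sent to a login step (for instance through a proxy). The sketch below uses Python's standard `ipaddress` module; the network ranges are reserved documentation addresses and the function name is hypothetical, not the Libraries' real configuration.

```python
import ipaddress

# Hypothetical campus ranges (reserved documentation networks, not real HKU addresses).
CAMPUS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def access_decision(client_ip, networks=CAMPUS_NETWORKS):
    """Return 'direct' for on-campus clients and 'login' for everyone else."""
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in networks):
        return "direct"
    return "login"
```

Off-campus users who authenticate can then be routed through the proxy, so that the database vendor still sees a recognised campus address.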
2. Going to do
- Storage Area Network (SAN)
- Abundance of servers
- One-stop search
- Alert service
- Wireless applications
2.1 SAN
- Problem (a): Storage
- large data size of our hosted databases
- high monthly data increase rate
- Databases are hosted in different hosts/OS
2.1 SAN (cont.)
- Problem (b): Backup
- backup drive for every machine
- backup software license for every machine
- Need to handle a lot of backup tapes
2.1 SAN (cont.)
- Solution (SAN)
- Put all data storage into a single large, expandable storage device.
- The storage device is connected to the hosts by high-speed Fibre Channel links
- A Fibre Channel loop is used to connect to each host in order to ensure high availability
- Backup can be done on a single device
2.2 Abundance of servers
- Problem
- Hard to monitor the status and activities of each server
- Time is wasted tuning the performance of each server
2.2 Abundance of servers (cont.)
- Solution: server consolidation
- Buy several powerful servers instead of many cheap mid-range servers
- Keep the number of servers to a minimum
- Save space and UPS power ratings, i.e. cost saving
- Save manpower to administer/maintain server performance, i.e. cost saving
2.3 One-stop search
- Before searching, one needs to know which database suits one's needs
- Aim: to search multiple databases simultaneously
- e.g. OAI (http://www.openarchives.org/)
- e.g. CDL SearchLight (http://www.cdlib.org/cgi-bin/searchlight)
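A minimal sketch of the one-stop idea: the same query is sent to several databases at the same time and the hits are merged into one list, so the user no longer has to pick a database first. The in-memory backends and function names here are hypothetical stand-ins; a real service would call each database's own search interface.

```python
from concurrent.futures import ThreadPoolExecutor


def make_backend(name, titles):
    """Create a toy search function over a list of titles (stand-in for a real database)."""
    def search(query):
        needle = query.lower()
        return [(name, title) for title in titles if needle in title.lower()]
    return search


def one_stop_search(query, backends):
    """Run the query against every backend in parallel and merge the hits.

    Results keep the order of `backends`, because `map` preserves input order.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        result_lists = list(pool.map(lambda backend: backend(query), backends))
    return [hit for hits in result_lists for hit in hits]
```

Each hit is tagged with the database it came from, which lets the interface show the user where a record lives without asking them to choose in advance.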
2.4 Alert service
- To alert users to new information
- SDI
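SDI (selective dissemination of information) amounts to matching each user's stored interest profile against newly added records. The sketch below shows one simple way to do that with subject keywords; the record layout, profile format and function name are assumptions for illustration only.

```python
def match_alerts(profiles, new_records):
    """Map each user to the titles of new records that match their interests.

    `profiles` maps a user name to a list of subject keywords;
    each record is a dict with "title" and "subjects" keys.
    Matching is case-insensitive on whole subject terms.
    """
    alerts = {}
    for user, keywords in profiles.items():
        wanted = {keyword.lower() for keyword in keywords}
        hits = [
            record["title"]
            for record in new_records
            if wanted & {subject.lower() for subject in record["subjects"]}
        ]
        if hits:
            alerts[user] = hits
    return alerts
```

Users with no matching records are left out of the result, so an e-mail job driven by this mapping would only contact people who have something new to read.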
2.5 Wireless Application
- A study on mobile and PDA applications in the Library
3. Challenges
- Changes
- New Technologies
- Competitors
- What are the (future) standards?
- Contents
- Digital vs. printed
- Information overflow
- Lifelong education
3.1 The causes of changes
- Development of I.T.
- Network, telecommunications, digitization, storage format, access model, ...
- Economy
- Online, e-commerce, smart card, ...
- Learning environment
- Life-long learning
- Mode of communication
- Email, ICQ
3.2 New technologies
- Changing so fast
- Acronyms
- Help: http://www.webopedia.com
- Who knows what the future will be?
- Reluctant to change
- Don't be afraid to dig in
- See Editor's notes, Computers in Libraries, vol. 22, no. 8, p. 6
3.3 Competitors
- Who?
- See OCLC White Paper on the Information Habits of College Students (http://www2.oclc.org/oclc/pdf/printondemand/informationhabits.pdf)
- 79% use a search engine for every or most searches!!
Technology Adoption Life Cycle
[Figure: adoption curve with the segments Innovators, Early Adopters, Early Majority, Late Majority, Laggards]
Source: Crossing the Chasm, Geoffrey Moore
Crystal Ball??
- Number of visits ?
- Usage of physical materials ?
- Training to users and real-time support ?
- Demand for subject knowledge ?
- Competitors ?
- Fast services and high productivity
- Information provider and producer
- Cost-effectiveness
- Library workflow goes to e-business model
- Partnership
- Provide services that lead to income
4. Overcome the challenges
- What business are we in?
- What are our major strengths and weaknesses?
- Who are our competitors?
- Who are our customers? What are their needs?
- What factors are affecting the Library?
- Do we have the skills?
4. Overcome the challenges: how?
- Training - to keep abreast of new technologies
- Human resources - partners
- Value-added services
- User-oriented mindset
- Automation
- Improve the social image of librarians
- Co-operation
- Talk with other people in order to understand the technology in different areas
- Research
4. Overcome the challenges (cont.)
- Skills?
- Librarianship and IT knowledge
- Teamwork, commitment
- Thinking methodology: creativity, use of knowledge
- Outlook on the world
- Interpersonal skills
- Health!!
Principles for building DL
- Expect change
- Know your content
- Involve the right people
- Design usable systems
- Ensure open access
- Be aware of data rights
- Automate whenever possible
- Adopt and adhere to standards
- Ensure quality
- Be concerned about persistence
- McCray, A. & Gallagher, M. (2001). Principles for Digital Library Development, Communications of the ACM, 44(5), pp. 49-54.