Managing Libraries with Creative Data Mining - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Managing Libraries with Creative Data Mining


1
Managing Libraries with Creative Data Mining
Learning to Use Your Library's Data Warehouse to
Understand and Improve the Services You Provide
Ted Koppel, The Library Corporation
Computers in Libraries 2005, Session B203, March 17, 2005
2
The Plan
  • What is data mining and why is it useful?
  • Who else does it?
  • Does it make sense for libraries?
  • Are libraries already doing data mining?
  • What data can libraries mine?
  • How much sophistication do I need?

3
What is Data Mining?
  • Collection and analysis of one's own data in
    order to make better business decisions
  • More than simple data storage
  • Business intelligence technology for discerning
    unknown patterns from large databases
  • Uses statistics, artificial intelligence, and
    various modeling techniques
  • Related to, but different from, bibliomining

4
Value and Importance
  • By identifying patterns and predicting future
    trends:
  • Make decisions based on facts, not guesswork
  • Develop sensible processes
  • Reduce costs or increase services by efficient
    use of resources
  • Serve the customer better

5
High Level planning
  • Remember -- GIGO.
  • Define the data mining goals
  • Data collection
  • Data organization and normalization
  • Analysis
  • Analysis
  • Analysis
  • Reiteration

6
Who is Data Mining now?
  • Manufacturing: process control
  • Banks and financial institutions: full service
  • Government and law: fraud, abuse
  • Sports: RHP versus LHB? Sucker for a curve
    ball?
  • Service industries: almost all CRM systems
  • Retail: product stock and placement
  • Travel: airline overbooking
  • Las Vegas: guest tracking for comps and benefits
  • Groceries: affinity cards
  • Internet: GoogleAds

7
Nuggets Found by Mining
  • Chase Bank: minimum balance versus other bank
    business
  • Home Depot: hurricane planning
  • WalMart (UK): diapers and beer (actually a hoax,
    but an informative one)
  • Casino security in Las Vegas: fraud

8
Implementer Level Tools
  • Oracle Data Mining Suite
  • Microsoft SQL Server 2000
  • SPSS and similar
  • Statistica (StatSoft)
  • Open Source
    • Cornell Univ. Himalaya Data Mining Tools
    • WEKA: Waikato Environment for Knowledge
      Analysis (Univ. of Waikato, NZ)

9
Looking for the Dog that Doesn't Bark
  • NORA: Non-Obvious Relationship Awareness
  • Examines third-level relationships between
    datasets
  • ANNA: Anonymized Data
  • Double-blind application/offshoot of NORA that
    deals with personal attributes anonymously

10
Vocabulary Lesson
  • Bagging (averaging)
  • Boosting (calculating predictive data)
  • Drilling down
  • Stacking (combining predictions from different
    models)
  • Predictive mining (using X to predict Y)
  • Data Models
  • CRISP-DM: Cross-Industry Standard Process for
    Data Mining
  • SEMMA: Sample, Explore, Modify, Model, Assess
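
As a hedged illustration of the first term above, bagging can be sketched as fitting the same simple model on several bootstrap resamples and averaging the predictions. The weekly checkout figures, the straight-line model, and the function names are assumptions made up for this example; they do not come from the presentation.

```python
import random
from statistics import mean

def fit_line(points):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # every x in this resample is identical: use a flat line
        return y_bar, 0.0
    b = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return y_bar - b * x_bar, b

def bagged_predict(points, x_new, n_models=50, seed=0):
    """Average the predictions of models fitted on bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    predictions = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]  # draw with replacement
        a, b = fit_line(sample)
        predictions.append(a + b * x_new)
    return mean(predictions)

# Hypothetical history: (week number, checkouts that week)
history = [(1, 410), (2, 425), (3, 460), (4, 455), (5, 490), (6, 505)]
forecast = bagged_predict(history, x_new=7)
print(round(forecast))
```

Stacking would differ in that the averaged models are of different kinds, and the weights given to each are themselves learned.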

11
Value to Libraries: a Tool
  • Citizens demand more/better service at a time of
    reduced funding.
  • Anticipate USER behavior
  • Anticipate STAFF behavior
  • Service hours and staffing needs, facilities
    planning
  • Collection development: anticipating customer
    needs

12
Do Libraries Use DM?
  • Association of Research Libraries: ARL SPEC Kit
    274 (2003), Mento and Rapple
  • 124 surveys, 65 responses
  • 40% already doing some data mining
  • 90% had plans
  • Major areas of activity
    • Research and collection support
    • Administration
    • Repository management (future)

13
ARL Member Benefits Seen
  • Serials cancellation projects
  • Collection Development tuning
  • Budget allocation by material use
  • Workflow analysis
  • Weeding
  • OPAC and Web presence usability and redesign
  • Hacking and break-in analysis (defensive data
    mining)

14
Other Library Data Mining
  • Kun Shan University of Technology (Taiwan)
  • ABAMDM model: Acquisition Budget Allocation
    Model based on Data Mining
  • More material use → more money
  • Compared
    • Circulation
    • Collection size
    • Department size
    • Number of courses
    • Students/faculty per department
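
The "more use, more money" idea can be sketched as a proportional split. This is not the ABAMDM model itself, only a minimal illustration: each factor is turned into a department's share of the campus total, the shares are combined with weights, and the budget is divided accordingly. The department figures, the weights, and the function name are all hypothetical.

```python
def allocate_budget(total, departments, weights):
    """Split `total` by each department's weighted share of every factor."""
    # Campus-wide total for each factor, so raw numbers become shares
    factor_totals = {k: sum(d[k] for d in departments.values()) for k in weights}
    weight_sum = sum(weights.values())
    allocation = {}
    for name, stats in departments.items():
        share = sum(
            weights[k] * stats[k] / factor_totals[k] for k in weights
        ) / weight_sum
        allocation[name] = round(total * share, 2)
    return allocation

# Hypothetical inputs
departments = {
    "History": {"circulation": 6000, "courses": 40, "students": 500},
    "Physics": {"circulation": 3000, "courses": 40, "students": 300},
    "Nursing": {"circulation": 1000, "courses": 20, "students": 200},
}
weights = {"circulation": 0.5, "courses": 0.25, "students": 0.25}

allocation = allocate_budget(100_000, departments, weights)
for name, amount in allocation.items():
    print(f"{name}: {amount:,.2f}")
```

Converting each factor to a share first keeps a large raw number such as circulation from swamping a small one such as course count.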

15
Other Library Data Mining (2)
  • OCLC's ACAS (Automated Collection Analysis
    System) (recently upgraded!)
  • Analyzes bibliographic records by call number
    ranges (LC 4-digit, Dewey tens for example)
  • Subdivides by years and aggregated years
  • Subdivides by branch / collection
  • Collection conspectus as a way to
    • Compare library collections
    • Identify collection deficiencies
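
A rough sketch of this kind of analysis, counting records by Dewey tens and by age band, might look like the following. It is not how ACAS works internally; the record list, the five-year band width, and the helper names are assumptions chosen for illustration.

```python
from collections import Counter

def dewey_tens(call_number):
    """Map a Dewey number to its tens range, e.g. '641.5' -> '640-649'."""
    base = int(float(call_number)) // 10 * 10
    return f"{base:03d}-{base + 9:03d}"

def age_band(pub_year, current_year=2005, width=5):
    """Bucket an item's age into bands of `width` years."""
    age = current_year - pub_year
    low = age // width * width
    return f"{low}-{low + width - 1} yrs"

# Hypothetical bibliographic records: (Dewey call number, publication year)
records = [
    ("641.5", 2003),
    ("641.8", 1994),
    ("005.13", 2004),
    ("649.1", 2001),
    ("004.6", 1988),
]

table = Counter((dewey_tens(c), age_band(y)) for c, y in records)
for (subject, band), count in sorted(table.items()):
    print(subject, band, count)
```

Adding a branch code to each record and to the Counter key would give the per-branch subdivision.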

16
Other Library Data Mining (3)
  • Univ. of Florida with FCLA
  • Decision Support System for acquisitions
    activities
  • Extracted from NOTIS bib files saved to DB2
  • Screen scraped Acq files
  • Created large database of bib and in-process
    records which allowed querying
  • Circ history of approval versus firm orders?
  • Dollars spent on titles that never circulate
  • Do originally-cataloged items circulate? More or
    less than copy cataloged items?
  • How many items circulate more than n times?
  • Assesses collection development and tech service
    activity
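
Questions like these translate directly into queries once the records sit in one database. The sketch below uses Python's built-in sqlite3 with an in-memory table rather than DB2, and the table layout, column names, and sample rows are invented for illustration; they are not the Florida system's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE items (
        title TEXT, order_type TEXT, price REAL, circ_count INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("Title A", "approval", 40.0, 0),
        ("Title B", "approval", 25.0, 7),
        ("Title C", "firm", 60.0, 3),
        ("Title D", "firm", 35.0, 0),
        ("Title E", "firm", 20.0, 12),
    ],
)

def items_circulating_more_than(n):
    """How many items circulate more than n times?"""
    sql = "SELECT COUNT(*) FROM items WHERE circ_count > ?"
    return conn.execute(sql, (n,)).fetchone()[0]

def spent_on_never_circulated():
    """Total price of titles with zero circulations."""
    sql = "SELECT COALESCE(SUM(price), 0) FROM items WHERE circ_count = 0"
    return conn.execute(sql).fetchone()[0]

def avg_circ_by_order_type():
    """Average circulation for approval versus firm orders."""
    sql = "SELECT order_type, AVG(circ_count) FROM items GROUP BY order_type"
    return dict(conn.execute(sql).fetchall())

print(items_circulating_more_than(5))   # 2
print(spent_on_never_circulated())      # 75.0
print(avg_circ_by_order_type())
```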

17
Libraries are fountains of data
18
Everything is countable (example: circulation
transaction)
  • User
    • Age
    • Location
    • Language
    • Sex
    • Zipcode
    • Phone
    • School
    • Loan history
    • Delinquencies
  • Book
    • Branch
    • Location
    • Media type
    • Pubdate
    • Size
    • Color
    • Thickness
    • Circs
    • Cost
    • Vendor
    • Holds
  • Extractable
    • Census tract
    • Curriculum
    • Holds
    • Circ history
    • Repairs

Multiply this by 10 million times a year!
19
Expand to
  • Acquisitions information (book attributes, vendor
    history and performance, fund history, requester
    and department, etc.)
  • OPAC searching and navigation (databases,
    searches, not founds)
  • Metasearch usage (databases, usage)
  • Reference desk interactions (who, what, how
    long?). VRD by extension
  • Resource sharing (NCIP, ILL)
  • In-house usage transactions
  • Physical plant: elevator, restroom, copier use

20
Crunch (Data) Creatively
  • Unlikely variables give interesting data
  • Ideas
    • Sex of user versus color of book
    • Call range vs. age of item vs. circulation
      ratio by avg. paid per item
    • Story hour attendance vs. adult circ vs. fines
      collected
    • Best sellers by cost vs. trade books by cost
      per circ
    • Etc.
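
The cost-per-circ comparison is simple arithmetic: total dollars paid for a category divided by total circulations of that category. A minimal sketch, with made-up prices and circulation counts:

```python
from collections import defaultdict

def cost_per_circ(items):
    """Return {category: total cost / total circulations}."""
    cost = defaultdict(float)
    circs = defaultdict(int)
    for category, price, circ_count in items:
        cost[category] += price
        circs[category] += circ_count
    # None marks a category whose items never circulated
    return {c: (cost[c] / circs[c] if circs[c] else None) for c in cost}

# Hypothetical items: (category, price paid, times circulated)
items = [
    ("best seller", 25.0, 50),
    ("best seller", 27.0, 54),
    ("trade", 18.0, 6),
    ("trade", 22.0, 4),
]

ratios = cost_per_circ(items)
for category, ratio in ratios.items():
    print(f"{category}: {ratio:.2f} per circulation")
```

With these invented numbers the best sellers cost more per copy but far less per circulation, which is the kind of comparison the slide suggests.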

21
If you can count it, you can analyze it
  • But remember: QUALITY and CONSISTENCY
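
A small sketch of why consistency matters before counting: the same media type entered several different ways shows up as several different categories until the values are normalized. The alias table and sample values below are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from the spellings found in the data to one code
MEDIA_ALIASES = {
    "bk": "book",
    "book": "book",
    "books": "book",
    "dvd": "dvd",
    "dvds": "dvd",
    "video disc": "dvd",
}

def normalize_media(raw):
    """Trim, lowercase, and collapse spaces, then map to a standard code."""
    key = " ".join(raw.strip().lower().split())
    return MEDIA_ALIASES.get(key, "unknown")

raw_values = ["Book", " book", "BK", "DVD", "dvds", "Video  Disc", "??"]

raw_counts = Counter(raw_values)  # seven "different" media types
clean_counts = Counter(normalize_media(v) for v in raw_values)

print(len(raw_counts), "categories before cleaning")
print(dict(clean_counts))
```

Keeping an explicit "unknown" bucket, rather than silently dropping bad values, shows how much of the data still needs attention.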

22
  • Library Automation vendor for over 30 years
  • Family-owned, customer focused
  • Library.Solution
  • Library.Solution for Schools
  • CARL.Solution
  • CARL.X

23
Library.Solution Reports
  • Utilizes ReportNet software
  • Drag and Drop Report Design
  • Completely Web-based
  • Fitted to Library.Solution data framework
  • Zero footprint on workstations
  • Central reporting with enhanced distribution
  • Multiple export formats
  • Charts, tables, etc.
  • Powerful

24
Using Library Data Outside the Library
  • City, County, RCOG, State Planning and
    Development Authorities
  • Require solid statistics about population,
    educational level, etc.
  • Quality of Life and capital budget services
    planning
  • Preserve user anonymity but share trends
  • Input to GIS systems for real time projection of
    future library needs
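
One way to share trends while preserving anonymity is to publish only aggregate counts and to suppress cells too small to hide an individual. The sketch below is an assumption-laden illustration: the field names, the ZIP codes, and the minimum cell size of 5 are invented, and the right threshold is a local policy decision.

```python
from collections import Counter

def trend_by_area(transactions, min_count=5):
    """Count loans per (ZIP, month); drop user IDs and small cells."""
    counts = Counter((t["zip"], t["month"]) for t in transactions)
    return {cell: n for cell, n in counts.items() if n >= min_count}

# Hypothetical circulation transactions
transactions = [
    {"user_id": f"u{i}", "zip": "12345", "month": "2005-01"} for i in range(8)
] + [
    {"user_id": "u99", "zip": "12346", "month": "2005-01"},
]

shared = trend_by_area(transactions)
print(shared)
```

The single loan from the second ZIP code is withheld because one transaction in a small area could point to one person.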

25
Applying GIS in the Library Market
  • Library.Decision product
  • Works with ILS vendors including TLC
  • Focus collections development
  • Strengthen advocacy planning
  • Undertake cardholder development campaigns
  • Support grant applications
  • Site new facilities
  • Calculate service indicators
  • Evaluate service delivery in relation to the
    unique needs of your community

26
In closing
  • Libraries are producing data every minute of
    every day
  • You need
    • Some tools
    • Some creativity
    • Some analytical ability
  • Knowledge is Power!

27
Acknowledgements
  • Nicholson and Stanton, "Gaining Strategic
    Advantage through Bibliomining," at
    www.bibliomining.com
  • Banerjee, "Is Data Mining Right for Your
    Library?" Computers in Libraries, Nov. 1998
  • Kao, Chang, and Lin, "Decision Support for the
    Academic Library," Information Processing and
    Management 39 (2003)
  • Fabris, "Advanced Navigation," CIO, May 1998
  • Library Administration and Management (journal),
    Winter 1996, section on data mining

28
Thank You
  • Contact information
  • Ted Koppel
  • The Library Corporation
  • tedk_at_tlcdelivers.com
  • (800)624-0559