Title: Gathering Statistics
1Gathering Statistics
- NFAIS Philadelphia
- October 27, 2006
- Michele Newberry
2Spreadsheet Hell!!!
3ABC-CLIO
- Log on to the Stats interface at http://serials.abc-clio.com/reports/
- ID: <xxx>@ufl.edu  PW: <nnnn>
- http://serials.abc-clio.com/reports/start_formnameloginfoappnamereportsloginname<xxx>@ufl.edupassword<nnnn>forgotten_password0
- Choose Select All for Institutions
- Choose a reporting period (one or multiple months)
- Choose an Output Type (I choose Excel Friendly HTML)
- Click Run Report
- http://serials.abc-clio.com/reports/go/ABC-Clio-Serials-Reports_appnamereports_operationDoReportaddlcidI00271addlcidI00267addlcidI00268addlcidI00269addlcidI00270addlcidI00637addlcidI00725addlcidI00652addlcidI00697addlcidI00764startmonth20051001stopmonth20051101outputtypeC
- outputtype: C for CSV, E for Excel-friendly HTML, H for HTML
- startmonth and stopmonth: in YYYYMMDD format
- Save Page As an HTML file named FCLAReportMMYY.html, e.g., FCLAReport0905.html.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
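The report parameters named on this slide can be collected and checked before a request is built. The sketch below is only an illustration: the parameter names (addlcid, startmonth, stopmonth, outputtype) and the C/E/H codes come from the slide, while the function name `report_params` and the use of an ordinary `name=value&...` query string are assumptions, since the slide does not show how the parameters are separated.

```python
from urllib.parse import urlencode

# Output type codes as listed on the slide.
OUTPUT_TYPES = {"C": "CSV", "E": "Excel-friendly HTML", "H": "HTML"}


def report_params(institution_ids, start, stop, output_type="E"):
    """Return the report parameters as an ordered list of (name, value) pairs.

    Hypothetical helper: only the parameter names and codes are from the slide.
    """
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"outputtype must be one of {sorted(OUTPUT_TYPES)}")
    for value in (start, stop):
        # The slide gives the date format as YYYYMMDD.
        if len(value) != 8 or not value.isdigit():
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    params = [("addlcid", cid) for cid in institution_ids]
    params += [("startmonth", start), ("stopmonth", stop),
               ("outputtype", output_type)]
    return params


# Assumption: parameters are sent as a conventional query string.
query = urlencode(report_params(["I00271", "I00267"], "20051001", "20051101", "C"))
```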
4BePress
- BePress stats are VERY complicated to gather. Go to the admin URL http://www.bepress.com/cgi/myaccount.cgi
- Log in with the user ID and password for each school.
- Copy the URL for each school's report and paste it into the browser to get an Open or Save As popup window.
- Open the file in Excel, and make the following modifications:
- Add a line above line one that says "BePress Usage Report" (Arial, 14 point)
- Bold, italicize, and color purple the next line ("Full-Text Downloads")
- Delete the usage data for each title up to January of the current year (these reports keep data from the creation of any given university's account), and then move the data for the current year to the left so that the January data is in Column B.
- Resize Column A so that the full title of each journal is viewable
- Merge & Center these two header lines over the width of the report
- Once finished, Save As Web Page, naming by school and year, e.g., famu_2005.html
- Repeat for each school.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
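The "delete everything before January and shift left" step is mechanical enough to script. A minimal sketch, assuming the export has been read into a header row plus data rows, that column A holds the journal title, and that the remaining headers are "YYYY-MM" strings in chronological order; the actual BePress column labels are not shown on the slide, so this layout and the function name are hypothetical.

```python
def keep_current_year(header, rows, year):
    """Drop monthly columns before January of `year`, keeping the title column.

    Assumes header[0] is the title column and header[1:] are "YYYY-MM"
    labels in chronological order (a hypothetical layout).
    """
    cutoff = f"{year}-01"
    # Index of the first column at or after January of the requested year.
    first = next(
        (i for i, label in enumerate(header[1:], start=1) if label >= cutoff),
        len(header),
    )

    def trim(row):
        # Title stays in column A; January lands in column B.
        return [row[0]] + row[first:]

    return trim(header), [trim(row) for row in rows]
```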
5CSA
- Log in to CSA Illumina Usage at http://mars3.csa.com/usage/ou_login.aspx
- Log in with ID <xxx> and PW <nnnn>
- Choose either Live Reporting or Emailed Reports, depending on how current the data needs to be (explained on the site); normally choose Live for current data.
- Under Consortium Reports, select a date range (one or multiple months), and run the report.
- Save As HTML, naming the file csa_fcla_MMYY.html, e.g., csa_fcla_0905.html
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
6EBSCO
- Log in to EBSCO Admin at http://eadmin.epnet.com/eadmin/login.aspx
- ID: <xxxx>  PW: <nnnn>
- Click on the Reports & Statistics tab, then configure the report.
- Normally choose:
- By Database, Consortium: ALL
- Level: Site
- Date range: one or multiple months
- Include: All Sites
- Fields to show: Sessions, Searches, Total Full Text Requests
- (any other fields are fine as well, but those are the required fields)
- Then either Show, E-mail, or Schedule this report to be run.
- Save As HTML, naming the file ebscoYYYY_MM.html, e.g., ebsco2005_08.html.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
7Gale / IAC InfoTrac
- Log in to Gale InfoTrac Config at http://infotrac.galegroup.com/itconfig/fcla_000
- ID: <xxxx>  PW: <nnnn>
- Click on Reports in the navigation bar.
- Under Consortium, select E-mail Gale or COUNTER reports, or set up a monthly report. Normally we choose the Gale report, as the COUNTER report does not display stats in the searches by university and month style that FCLA prefers.
- So choose Gale report, and select a date range.
- Under Gale Standard Use Reports, check Usage Summary, Usage by Database, Library Location.
- Choose Format: Comma Separated Values, Compression: None, and Attachment: Yes.
- Recipient: enter the e-mail address for the report and click Get Report to have it sent via e-mail.
- Once received, format it to resemble the existing reports at http://www.fcla.edu/FCLAinfo/stats/iac/iac.html .
- Save as HTML with a filename of gale_MM_YYYY.html, e.g., gale_09_2005.html.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
8 LexisNexis Academic, Congressional & Statistical
- LexisNexis stats can be gathered by accessing their website at http://www3.lexisnexis.com/aur/signon.html
- ID: <XXXX>  PW: <NNNN>
- Once there, you can view HTML or download CSV reports. HTML reports cannot be downloaded. FCLA worked with LexisNexis to get an FTP account through which we download the HTML reports.
- FTP Setup Info
- FTP Host: ftp.lexisnexis.com
- Login: <XXXX>
- Password: <NNNN>
- Download all of the new reports for Academic, Congressional, Statistical, and the Rollup reports
- Rename the files as institutionYYMM.html, e.g., famu0508.html.
- Upload to the FCLA website for each product and transcribe total annual searches into the master spreadsheet.
9ProQuest
- Log in to ProQuest Local Admin at http://lad.proquest.com/ladweb
- ID: <XXXX>  PW: <NNNN>
- Click on the tab (or link) for
- Select Report Type
- Database Activity Detail
- Delivery Method: Download or Email now
- Show items with zero usage: Yes
- Include sub-accounts in this report: Yes
- Select a date range for the usage period (one or multiple months)
- Click Create Report.
- Save as HTML as pqYYMM.html, e.g., pq0905.html.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
- NOTE: There is also a cumulative report for ProQuest Digital Dissertations usage that is emailed once a month in HTML format and uploaded upon arrival as pqdd_stats.html.
10RLG
- Log in to RLG Stats at http://reports.rlg.org
- Invoice Account Code (IAC): <XXXX>
- Access via two reports:
- 6. Union Cat and Citation Files Searches for Month
- 7. Other Info Resources Search Activity for Month
- Aggregate stats by institution manually due to the shared IAC.
- Do not post on the FCLA website.
- Transcribe total annual searches into the master spreadsheet.
11Standard & Poor's
- Log in to S&P NetAdvantage at http://www.netadvantage.standardandpoors.com/NASApp/NetAdvantage/usage/Usage.do
- ID: <XXXX>  PW: <NNNN>
- S&P reports are only available for a single month and a single institution.
- 10 reports must be downloaded per month.
- If multiple institutions are selected, the report gives an aggregated total, rather than a breakdown by site, which FCLA needs.
- Select Month, Year and Institution, then click Show Report.
- Click Printer Friendly to generate a new window with the full report, then Save Page As HTML, as sp_institution_MM_YYYY.html, e.g., sp_famu_09_2005.html
- Repeat for all 10 reports.
- Once all reports have been saved, all hyperlinks must be removed, as they point to resources that are not available from the posted page.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
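The hyperlink cleanup described above could be scripted rather than done by hand. This is a rough sketch, not the process FCLA used: it drops the opening and closing anchor tags with a regular expression and keeps the link text. A regular expression is a blunt tool for HTML, so it assumes simple, well-formed anchor tags in the saved reports; the function name is hypothetical.

```python
import re

# Matches <a ...> and </a> (any case) but not other tags such as <abbr>.
ANCHOR_TAG = re.compile(r"</?a\b[^>]*>", re.IGNORECASE)


def strip_links(html):
    """Remove anchor tags from saved report HTML, keeping the linked text."""
    return ANCHOR_TAG.sub("", html)


cleaned = strip_links('<td><a href="/company?id=1">IBM</a></td>')
```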
12ValueLine
- ValueLine statistics are emailed to FCLA directly by the vendor rep.
- Excel format.
- Save as valueline_MM_YYYY.html, e.g., valueline_09_2005.html
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
13Wilson
- Log in to WilsonWeb at http://www.hwwstats.com/ng/
- Account Number: <NNNN> (then click Login)
- Password: <XXXX> (then click Continue)
- Click Database Usage
- (COUNTER reports are available, but the reports under Database Usage are more appropriate to the kind of data that FCLA gathers monthly)
- Select Bill To Account (Ship To Account will generate ZERO usage)
- Account: ALL (or you can run individual school reports)
- Product: ALL
- Detail Level: Complete Report
- Choose a date range (one or multiple months)
- Sort By: Number of Searches, then click Submit
- Once the report is generated, save as a file (HTML) or email the file. Save the HTML file as wilson_MM_YYYY.html, e.g., wilson_09_2005.html.
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
14ACM
- ACM provides usage stats only twice per year, after June 30 and after December 31.
- Data is provided personally by the ACM rep.
- <inject humorous stories about ACM and AMS here>
- Files are delivered in HTML format, by school.
- Add the following header:
- ACM Digital Library
- Usage Report for (school name)
- (Date Range of report)
- Save the file as HTML (acm_school_year.html, e.g., acm_famu_2005.html).
- Upload to the FCLA website and transcribe total annual searches into the master spreadsheet.
15CHEST-UK and ATHENS
- Does what the ACM suggested: uses the authentication conduit to monitor usage
- Athens: single sign-on access point for users
- Provides COUNTER-compliant stats over and above any that are supplied by the vendor
- for individual accounts, groups of accounts (using various grouping methods) or all accounts
- for individual resources or all
- for a single day or a date range
16BUT ATHENS Statistics
- can only count sessions; the libraries still have to rely on the publisher/vendor statistics for searches, journal article downloads, etc.
17FCLA Database Search Statistics
18EBSCOHost Usage Reports
19EBSCOadmin Database Usage Report January 2006
20The SPREADSHEET
21VIVA -- from Kathy Perry, Director
- I always think it's sad when publishers (and
some librarians) think usage stats are only
useful for collection development and they think
collection development only involves
cancellation. In fact, one of the primary uses
of our stats is actually collection development
when we ADD resources. This is particularly true
in considering leasing or buying the archives,
but it is also true of other collections.
22VIVA's Bottom Line
- But the best application of our stats is in speaking to decision makers at the State Council of Higher Education or in the legislature. We have to show that the state is getting a return on investment.
- It really helps that we show the increase in usage over time and the millions and millions of articles downloaded and searches conducted.
- ...I know there are serious problems with many of these numbers. They show trends and are an indication of the use by our faculty and students. This is one of our primary ways of keeping and increasing our funding each year.
23The Same Story - Coast to Coast
- CSU-SEIR also uses statistics
- to report to decision makers
- to demonstrate the value of the resources the consortium licenses to the legislature and to our stakeholders
- to formulate "measures of success"
- The reports are instrumental to receiving funding for our library programs.
- Lisa Moske, Director
24More from the West Coast: UC
- UC employs vendor usage data in the way the NFAIS conveners fear, as one of the indicators reviewed in decisions about which journals to keep or drop when titles are swapped in and out of packages.
- In addition to comparative use among titles in a package, usage trends are analyzed over time (up or down), cost per use, etc.
- Many other factors are examined besides usage, so this is far from a mechanistic or one-dimensional process.
25From the South - GALILEO
- Usage stats help determine marketing and training needs, particularly for our K-12 and public library arena.
- Did training result in an increase in use?
- Usage spikes, then settles down, but it's still more than before the training.
- Statistics play a big role, but so do all the other issues associated with vendor assessment - like easy-to-use access management, effective user interfaces, OpenURL compliance, etc.
26GALILEO's Bottom Line
- It is still a transition environment. Do we have a resource because it is used or because it is important and ought to be used?
- We are just in the early stages of trying to be data-driven vs. intuition/political (e.g., getting the journal for prof. so-and-so because he can make trouble) organizations.
- When it comes down to it, the reality is content and price and whether people see the content as being worth the price.
27Also From the South (Florida that is)
- We do use stats when products come up for renewal, although low stats don't always mean we will cancel.
- High stats don't mean renewal necessarily either.
- Low stats do send a red flag - we try to promote that resource in training and marketing, especially when it seems like there is no other reason for it not being used other than people aren't aware of it and its content.
28And from the SouthWest
- TexShare uses several customer surveys to analyze its services.
- 150 academic libraries complete an annual survey that just measures the overall user satisfaction with the various TexShare programs, the database program being one.
- 700 consortium members are surveyed on the databases.
29TexShare Electronic Resources
- Selected through recommendations of a collaborative working group
- Decisions are guided by criteria:
- Membership Surveys
- Usage Statistics
- Database Content
- Vendor Reliability
- Best Value
- Surveys are very important for this process.
30There's that Bottom Line Again
- The vendor statistics collected for the databases go into the cost avoidance performance measures, which are reported to the legislature and used to justify funding.
- Ann Mason, Texas State Library and Archives Commission
31Again, it's not just Statistics
- From the Massachusetts Board of Library Commissioners
- Online survey responses from 600 libraries on patron satisfaction with statewide licensed resources
- Data used to look at what is licensed and what should be acquired next.
- Compare most valued with actual usage.
- The top tier (about 25) of databases usually coincides (high usage, most valued); after that the two metrics diverge.
32Massachusetts - Use vs. Rank
33Canada Weighs In - Bottom Line Goes International
- We use statistics, at a very broad level, to illustrate the value of our services - those millions of downloads from one of our most popular databases doesn't hurt to explain to our funders why it makes sense to keep funding us.
- Heather Morrison, BC ELN
34Statistical analysis can illuminate "side problems"
- Case in point at CSU: the discovery of crawlers that created spikes of usage in a period of nanoseconds for a single vendor.
- Worked with the vendor for months to get them to comply with simple standards for reporting.
- As part of the process, the vendor's technical staff found that the sessions that bumped the shared-use scenario into turnaway-chaos were due to bizarre activity, at single points-in-time, initiated from five campuses.
35More side problems
- Checking stats showed that a library did not use a popular service for several months - turns out their access wasn't working
- Stats showed that a rarely used resource (12 hits per month) managed to have turnaways. This indicated a technical problem which the vendor had to resolve and for which compensation was required.
- Low usage in a particular region is a reason to prioritize in-person training in that area.
36From OHIOLink aka Tom Sanville
- Let's face it: much of what we buy are no-brainers. Databases and journals are core, unique resources.
- Stats show how much more bang we're getting for the buck as a consortium than we would as individual libraries
- Using the data for cancellations has limited application.
- Use is just one measure of value to us. Buying only things with phenomenally low cost per use would mean many core resources would be toast.
- That said, the vendors/publishers have to recognize that use is a factor and that easy access to those stats is vital.
37We're in this together
- Libraries and their consortia are valued partners in the process of standardization, needs assessment, and statistical analysis. We are not mulling over these numbers simply to determine what resources we can do away with. Much of the activity on our part is to make sure that the resources continue to be accessible, and, hence, of value to our users.
- Lisa Moske, CSU-SEIR
- Are the publishers/vendors with us too?
38OHIOLink - Definite Preferences
- Biggest need for collecting journal title usage stats: multiple-institution data that can be aggregated and delineated in multiple ways over multiple time frames
- The data must be transportable into Excel in pivot-table-friendly form that can regularly be added to, manipulated, and output in whatever views of the raw data are needed.
- The data must be at the institution level but must be retrievable in one report per period (month) or multiple self-defined periods. Not 85 individual reports!
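"Pivot table friendly" usually means one row per observation rather than one column per month. As an illustration only, the sketch below unpivots a wide report (identifying columns followed by one column per period) into such rows. The column layout, the sample labels, and the function name are assumptions, not a vendor format.

```python
def to_long(header, rows, id_cols=2):
    """Unpivot a wide usage table into (ids..., period, value) records.

    Assumes the first `id_cols` columns identify the row (for example
    institution and journal title) and every later column is one period.
    """
    periods = header[id_cols:]
    records = []
    for row in rows:
        ids = tuple(row[:id_cols])
        for period, value in zip(periods, row[id_cols:]):
            records.append(ids + (period, value))
    return records


wide_header = ["Institution", "Title", "01-2006", "02-2006"]
wide_rows = [["FAMU", "Journal A", 5, 7]]
long_rows = to_long(wide_header, wide_rows)
```

Each resulting record can be appended to a running sheet month after month, which is the "regularly be added to" property the slide asks for.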
39OHIOLink - Strong Opinions
- COUNTER reports may be great for a single institution. They are death for compiling consortium data member by member.
- Many publishers'/vendors' output is naturally compatible and extractable in the right chunks. Ebsco is an example that works pretty well.
- Others not so at all. Try to get ACS downloads by title by institution. Have to do it one by one. Others fall in-between.
- Britannica comes in one report but needs complicated Excel games to reverse out all the formatting to a more usable state.
40OHIOLink - What He Said!
- I'm interested in the raw data, not the presentation of it by the vendors.
- So if SUSHI can solve some of these problems, amen and I hope so.
- Tom Sanville, Ohiolink
41FCLA's Automated COUNTER Harvester
- Assumptions
- Retrieve data either via FTP or HTML. If FTP, only the address is needed. If HTML, one or more pages may need to be traversed.
- Design
- Create a harvesting script for each COUNTER source. Each script is maintained in a separate file.
- A COUNTER database holds the retrieved data and COUNTER source information.
42COUNTER Harvester Preliminary Design
SOURCE Table
RESOURCES Table
STATISTICS Table
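The slide names three tables but their columns are not recoverable here. Purely as an illustration of how such a database might be laid out, the sketch below creates the three tables in SQLite. Every column name is hypothetical, as is the choice of SQLite; only the table names come from the slide.

```python
import sqlite3

# Table names are from the slide; all columns are hypothetical.
SCHEMA = """
CREATE TABLE source (
    source_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,      -- vendor or publisher
    script_file TEXT NOT NULL       -- harvesting script for this source
);
CREATE TABLE resources (
    resource_id INTEGER PRIMARY KEY,
    source_id   INTEGER NOT NULL REFERENCES source(source_id),
    title       TEXT NOT NULL       -- journal or database title
);
CREATE TABLE statistics (
    resource_id  INTEGER NOT NULL REFERENCES resources(resource_id),
    institution  TEXT NOT NULL,
    period_start TEXT NOT NULL,     -- ISO date, first day of the period
    period_end   TEXT NOT NULL,     -- ISO date, last day of the period
    metric       TEXT NOT NULL,     -- e.g. searches or sessions
    value        INTEGER NOT NULL
);
"""


def create_db(path=":memory:"):
    """Create the three harvester tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```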
43COUNTER Harvester Preliminary Design
- Scripts
- XML documents that specify a set of actions to be performed and the information necessary to perform the action.
- Comments are formatted as XML, i.e., <!-- ... -->.
44Example: Gale
<CounterHarvesterScript>
  <debug/>
  <navigate>http://infotrac.galegroup.com/itconfig/fcla_000</navigate>
  <pause> 2 </pause>
  <navigate>http://infotrac.galegroup.com/itconfig/fcla_000?idfclareportsamppassfclareports</navigate>
  <pause> 2 </pause>
  <navigate>http://web6.infotrac.galegroup.com/infotrac_config/session/191/591/76110636w6/pgir!166ampui3CONSORTampui4fclaampul3ampun0ampuo9ampuiyY1ampuin1ampuifCSVampuicNoneampuilYESampuimmarkhi@ufl.eduampuzGetReport</navigate>
  <alert> Statistics will be sent by email to markhi@ufl.edu. </alert>
</CounterHarvesterScript>
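A script in this form can be executed by walking its child elements in order. The following is a minimal sketch of such an interpreter, not FCLA's harvester: the element names (debug, navigate, pause, alert) are taken from the example, while the function name, the injected `fetch` and `sleep` callables, and the returned log are assumptions made to keep the sketch testable without network access.

```python
import xml.etree.ElementTree as ET


def run_script(xml_text, fetch, sleep):
    """Run each action in a harvester script, in document order.

    `fetch(url)` and `sleep(seconds)` are supplied by the caller.
    Returns a list of (action, argument) pairs that were performed.
    """
    root = ET.fromstring(xml_text)  # XML comments are dropped by the parser
    performed = []
    for action in root:
        arg = (action.text or "").strip()
        if action.tag == "debug":
            pass  # a real harvester might switch on verbose logging here
        elif action.tag == "navigate":
            fetch(arg)
        elif action.tag == "pause":
            sleep(float(arg))
        elif action.tag == "alert":
            pass  # message is recorded in the log below
        else:
            raise ValueError(f"unknown action: {action.tag}")
        performed.append((action.tag, arg))
    return performed


SAMPLE = """<CounterHarvesterScript>
  <!-- log in, wait, then notify -->
  <navigate>http://example.org/login</navigate>
  <pause> 2 </pause>
  <alert> Statistics will be sent by email. </alert>
</CounterHarvesterScript>"""

visited, waits = [], []
log = run_script(SAMPLE, visited.append, waits.append)
```

Passing `fetch` and `sleep` in as arguments lets the same loop be exercised with stand-ins, as above, or with real HTTP and timing functions.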
45Problems encountered
- COUNTER data is treated as secure data, so creation of a web-walking robot is problematic.
- Session data must be extracted at some point in the session and then inserted into URLs to be sent later in the session.
- Reporting periods are defined in columns, one column per period. The column headers must be walked, looking for Total YTD after the monthly data column(s).
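The header walk in the last bullet might look like the sketch below. It assumes the caller already knows the column where periods begin and that the year-to-date header contains the text "YTD"; the exact header wording, the sample labels, and the function name are assumptions.

```python
def period_columns(header, first_period_col):
    """Collect (index, label) for each monthly column up to the YTD column.

    Walks right from `first_period_col` and stops at the first header
    containing "YTD" (wording assumed).
    """
    columns = []
    for index in range(first_period_col, len(header)):
        label = header[index].strip()
        if "ytd" in label.lower():
            return columns
        columns.append((index, label))
    raise ValueError("no YTD column found after the monthly columns")


header = ["Journal", "Publisher", "Platform", "Print ISSN", "Online ISSN",
          "Jan-2006", "Feb-2006", "Total YTD"]
months = period_columns(header, 5)  # index 5 is spreadsheet column F
```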
46COUNTER compliance is based on an Excel worksheet geared for human consumption, not computer processing.
- The various reports don't start reporting periods in the same column
- In Journal Report 1, periods start in column F
- In Journal Report 2, periods start in column G
- Institution name may be in a single cell row or a column
- Some CSV files have information that must be skipped
- Headings at the top of the sheet
- Subtotal and total lines interspersed and/or at the bottom of the sheet
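Skipping the headings and total lines could be sketched as below. The rules used are assumptions for illustration: data begins after the row whose first cell equals a known column label, and any later row whose first cell starts with "Total" or "Subtotal" is dropped. Real vendor files vary, which is the point of the slide, and a title that itself begins with "Total" would be wrongly skipped by this rule.

```python
import csv
import io


def data_rows(csv_text, header_first_cell="Journal"):
    """Return only the data rows of a report-style CSV.

    Skips everything up to and including the column-header row, blank
    rows, and subtotal/total rows (heuristics assumed, see lead-in).
    """
    rows = []
    seen_header = False
    for row in csv.reader(io.StringIO(csv_text)):
        if not any(cell.strip() for cell in row):
            continue  # blank line
        first = row[0].strip()
        if not seen_header:
            seen_header = first == header_first_cell
            continue  # heading lines and the header row itself
        if first.lower().startswith(("total", "subtotal")):
            continue
        rows.append(row)
    return rows


SAMPLE_CSV = (
    "Journal Report 1\n"
    "FCLA\n"
    "\n"
    "Journal,Jan-2006\n"
    "Total for all journals,30\n"
    "Journal A,10\n"
    "Subtotal,10\n"
    "Journal B,20\n"
)
cleaned_rows = data_rows(SAMPLE_CSV)
```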
47COUNTER Harvester Conclusion
- COUNTER compliance is claimed by many but delivered by few.
- In fact, as of this writing, we haven't found a single CSV that fully conforms.
- Many are very different from the standard format.
- Others, like EBSCO, are very close, but since they are not exact they are still not amenable to machine processing.
48- INTERNATIONAL COALITION OF LIBRARY CONSORTIA (ICOLC)
- PRESS RELEASE FOR IMMEDIATE DISTRIBUTION
- September 28, 2006
- REVISED GUIDELINES FOR STATISTICAL MEASURES OF USAGE OF WEB-BASED INFORMATION RESOURCES
- (Initially released in November 1998, revised December 2001, September 2006)
- With the continuing endorsement of 83 consortia from around the world (see list page 6), this revision reflects the ICOLC's previous endorsement of Project COUNTER and the ICOLC community's new endorsement of NISO's Standardized Usage Statistics Harvesting Initiative (SUSHI) protocol and reliance on XML as the standard delivery format for usage statistics.
49ICOLC Guidelines
- 5. DELIVERY: Usage reports must be delivered via an interactive web-based reporting system, preferably on a real-time basis, but at least within 15 days after the end of the month. Report content should be customizable, as specified in the Requirements section. Information providers are also encouraged to present data as graphs and charts. Vendors should maintain a minimum of three years of historical data. These data also should be available in flat files containing specified data elements that can be downloaded and manipulated locally. The preferred format is XML through the web services protocol described in the documents available from the NISO Standardized Usage Statistics Harvesting Initiative (SUSHI) <http://www.niso.org/committees/SUSHI/SUSHI_comm.html>.
50Hope for the future - SUSHI
- The SUSHI schema needs more specificity, e.g.,
- the format of dates is not specified in the schema, but different date formats may cause different servers to reject the request or, worse still, to fail.
- Integrating SUSHI and CSV-based data
- In SUSHI, the reporting period is generalized with a start/end date. In the Excel-based standard, the column headings are of the form mm-yyyy. To place both into a common database requires normalization.
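One way to do that normalization is to turn each mm-yyyy column heading into an explicit start and end date, so that spreadsheet periods and SUSHI periods can be stored in the same two fields. A minimal sketch, assuming headings are exactly two numeric parts separated by a hyphen; the function name is hypothetical.

```python
import calendar
from datetime import date


def period_from_heading(heading):
    """Convert an "mm-yyyy" column heading to (first_day, last_day)."""
    month_text, year_text = heading.strip().split("-")
    month, year = int(month_text), int(year_text)
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


start, end = period_from_heading("02-2006")
```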
52CONCLUSIONS
- Statistics are important
- We use them for many reasons
- Evaluating effectiveness of marketing and training
- Assessing value and usefulness of products
- Demonstrating value of resources to funding agencies
- Collecting statistics is not easy
- Manipulating them is even less easy
- COUNTER helps
- SUSHI could help a whole lot more - GO SUSHI!