Title: Gathering Statistics


1
Gathering Statistics
  • NFAIS Philadelphia
  • October 27, 2006
  • Michele Newberry
  • Or

2
Spreadsheet Hell!!!
3
ABC-CLIO
  • Log on to the Stats interface at
    http://serials.abc-clio.com/reports/
  • ID <xxx>@ufl.edu PW <nnnn>
  • http://serials.abc-clio.com/reports/start_formnameloginfoappnamereportsloginname<xxx>@ufl.edupassword<nnnn>forgotten_password0
  • Choose Select All for Institutions
  • Choose a reporting period (one or multiple
    months)
  • Choose an Output Type (I choose Excel Friendly
    HTML)
  • Click Run Report
  • http://serials.abc-clio.com/reports/go/ABC-Clio-Serials-Reports_appnamereports_operationDoReportaddlcidI00271addlcidI00267addlcidI00268addlcidI00269addlcidI00270addlcidI00637addlcidI00725addlcidI00652addlcidI00697addlcidI00764startmonth20051001stopmonth20051101outputtypeC
  • outputtype is C for CSV, E for Excel-friendly HTML,
    or H for HTML
  • startmonth and stopmonth are in YYYYMMDD format
  • Save Page As an HTML file named FCLAReportMMYY.html,
    e.g., FCLAReport0905.html.
  • Upload to FCLA website and transcribe total
    annual searches into the master spreadsheet.
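The date and file-name conventions on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the start and stop values are always the first day of a month, as in the example URL above.

```python
from datetime import date

def abc_clio_period(year: int, month: int) -> str:
    """First day of the month as YYYYMMDD (assumed form of startmonth/stopmonth)."""
    return date(year, month, 1).strftime("%Y%m%d")

def abc_clio_filename(year: int, month: int) -> str:
    """Local file name following the FCLAReportMMYY.html convention."""
    return f"FCLAReport{month:02d}{year % 100:02d}.html"

print(abc_clio_period(2005, 10))   # 20051001
print(abc_clio_filename(2005, 9))  # FCLAReport0905.html
```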

4
BePress
  • BePress stats are VERY complicated to gather. Go
    to the admin URL
    http://www.bepress.com/cgi/myaccount.cgi
  • Log in with the user ID and password for each
    school.
  • Copy the URL for each school's report and paste
    it into the browser to get an Open or Save As
    popup window.
  • Open the file in Excel, and make the following
    modifications
  • Add a line above line one that says BePress
    Usage Report (Arial, 14 point)
  • Make the next line (Full-Text Downloads) bold,
    italic, and purple
  • Delete the usage data for each title up to
    January of the current year (these reports keep
    data from the creation of any given university's
    account), and then move the data for the current
    year to the left so that the January data is in
    Column B.
  • Resize Column A so that the full title of each
    journal is viewable
  • Merge and center the two header lines over the
    width of the report
  • Once finished, Save As Web Page, naming by
    school and year, e.g., famu_2005.html
  • Repeat for each school.
  • Upload to FCLA website and transcribe total
    annual searches into the master spreadsheet.
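The column-trimming step (drop everything before January of the current year so that January lands in Column B) could be scripted rather than done by hand in Excel. A minimal sketch, assuming the report has already been read into lists and that month columns carry hypothetical "YYYY-MM" labels; the real BePress export may label them differently.

```python
def trim_to_current_year(header, rows, year):
    """Keep the title column plus the columns from January of `year` onward.

    Assumes (hypothetically) that header labels after the title column
    look like "YYYY-MM".
    """
    first = header.index(f"{year}-01")  # raises ValueError if January is absent
    new_header = [header[0]] + header[first:]
    new_rows = [[row[0]] + row[first:] for row in rows]
    return new_header, new_rows

# Illustrative data only
header = ["Title", "2004-11", "2004-12", "2005-01", "2005-02"]
rows = [["Journal A", 5, 7, 11, 13]]
print(trim_to_current_year(header, rows, 2005))
```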

5
CSA
  • Log in to CSA Illumina Usage at
    http://mars3.csa.com/usage/ou_login.aspx
  • Log in with ID <xxx> and PW <nnnn>
  • Choose either Live Reporting or Emailed Reports,
    depending on how current the data needs to be
    (explained on the site); normally choose Live
    for current data.
  • Under Consortium Reports, select a date range
    (one or multiple months), and run the report.
  • Save As HTML, naming the file
    csa_fcla_MMYY.html, e.g., csa_fcla_0905.html
  • Upload to FCLA website and transcribe total
    annual searches into the master spreadsheet.

6
EBSCO
  • Log in to EBSCO Admin at
    http://eadmin.epnet.com/eadmin/login.aspx
  • ID <xxxx> PW <nnnn>
  • Click on the Reports & Statistics tab, then
    configure the report.
  • Normally choose:
  • By Database, Consortium: ALL
  • Level: Site
  • Date range: one or multiple months
  • Include All Sites
  • Fields to show: Sessions, Searches, Total Full
    Text Requests
  • (any other fields are fine as well, but those
    are the required fields)
  • Then either Show, E-mail, or Schedule this report
    to be run.
  • Save As HTML, naming the file
    ebscoYYYY_MM.html, e.g., ebsco2005_08.html.
  • Upload to FCLA website and transcribe total
    annual searches into the master spreadsheet.

7
Gale / IAC InfoTrac
  • Log in to Gale InfoTrac Config at
    http://infotrac.galegroup.com/itconfig/fcla_000
  • ID <xxxx> PW <nnnn>
  • Click on Reports in the navigation bar.
  • Under Consortium, select E-mail Gale or COUNTER
    reports, or set up a monthly report. Normally we
    choose the Gale report, as the COUNTER report
    does not display stats in the
    searches-by-university-by-month style that FCLA
    prefers.
  • So choose Gale report, and select a date range.
  • Under Gale Standard Use Reports, check Usage
    Summary, Usage by Database, Library
    Location.
  • Choose Format: Comma Separated Values,
    Compression: None, and Attachment: Yes.
  • Recipient: enter the e-mail address for the
    report and click Get Report to have it sent by
    e-mail.
  • Once received, format it to resemble the existing
    reports at
    http://www.fcla.edu/FCLAinfo/stats/iac/iac.html.
  • Save as HTML with a filename of
    gale_MM_YYYY.html, e.g., gale_09_2005.html.
  • Upload to FCLA website and transcribe total
    annual searches into the master spreadsheet.

8
LexisNexis Academic, Congressional & Statistical
  • LexisNexis stats can be gathered by accessing
    their website at
    http://www3.lexisnexis.com/aur/signon.html
  • ID <XXXX> PW <NNNN>
  • Once there, you can view HTML or download CSV
    reports. HTML reports cannot be downloaded. FCLA
    worked with LexisNexis to get an FTP account
    through which we download the HTML reports.
  • FTP Setup Info
  • FTP Host: ftp.lexisnexis.com
  • Login: <XXXX>
  • Password: <NNNN>
  • Download all of the new reports for Academic,
    Congressional, Statistical, and the Rollup
    reports
  • Rename the files as institutionYYMM.html, e.g.,
    famu0508.html.
  • Upload to the FCLA website for each product and
    transcribe total annual searches into master
    spreadsheet.

9
ProQuest
  • Log in to ProQuest Local Admin at
    http://lad.proquest.com/ladweb
  • ID <XXXX> PW <NNNN>
  • Click on the tab (or link) for
  • Select Report Type
  • Database Activity Detail
  • Delivery Method: Download or Email now
  • Show items with zero usage: Yes
  • Include sub-accounts in this report: Yes
  • Select a date range for the usage period (one or
    multiple months)
  • Click Create Report.
  • Save as HTML as pqYYMM.html, e.g., pq0905.html.
  • Upload to FCLA website and transcribe total
    annual searches into the master spreadsheet.
  • NOTE: There is also a cumulative report for
    ProQuest Digital Dissertations usage that is
    emailed once a month in HTML format and uploaded
    upon arrival as pqdd_stats.html.

10
RLG
  • Log in to RLG Stats at http://reports.rlg.org
  • Invoice Account Code (IAC): <XXXX>
  • Access via two reports
  • 6. Union Cat and Citation Files Searches for
    Month
  • 7. Other Info Resources Search Activity for
    Month
  • Aggregate stats by institution manually due to
    the shared IAC.
  • Do not post on the FCLA website.
  • Transcribe total annual searches into master
    spreadsheet.
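The manual aggregation across the two RLG reports amounts to summing searches per institution. A minimal sketch, assuming each report has been keyed in as (institution, searches) pairs; the data layout and the sample figures are hypothetical, not RLG's actual report format.

```python
from collections import Counter

def aggregate_by_institution(*reports):
    """Sum searches per institution across any number of reports."""
    totals = Counter()
    for report in reports:
        for institution, searches in report:
            totals[institution] += searches
    return dict(totals)

# Illustrative figures only
union_cat = [("famu", 120), ("ufl", 300)]
other_info = [("ufl", 45), ("famu", 10)]
print(aggregate_by_institution(union_cat, other_info))
```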

11
Standard & Poor's
  • Log in to S&P NetAdvantage at
    http://www.netadvantage.standardandpoors.com/NASApp/NetAdvantage/usage/Usage.do
  • ID <XXXX> PW <NNNN>
  • S&P reports are only available for a single
    month and a single institution.
  • 10 reports must be downloaded per month.
  • If multiple institutions are selected, the report
    gives an aggregated total, rather than a
    breakdown by site, which FCLA needs.
  • Select Month, Year and Institution, then click
    Show Report.
  • Click Printer Friendly to generate a new window
    with the full report, then Save Page As HTML,
    as sp_institution_MM_YYYY.html, e.g.,
    sp_famu_09_2005.html
  • Repeat for all 10 reports.
  • Once all reports have been saved, all hyperlinks
    must be removed, as they point to resources that
    are not available from the posted page.
  • Upload to FCLA website and transcribe total
    annual searches into master spreadsheet.
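Removing the hyperlinks from ten saved pages a month is a natural candidate for a small script. The sketch below is one possible approach, not the method FCLA used: it strips anchor tags with a regular expression while keeping the link text, and builds file names in the sp_institution_MM_YYYY.html pattern. Function names are hypothetical.

```python
import re

# Matches opening and closing <a> tags only (not <abbr>, <address>, etc.)
ANCHOR_TAG = re.compile(r"</?a\b[^>]*>", re.IGNORECASE)

def strip_hyperlinks(html: str) -> str:
    """Remove <a ...> and </a> tags but keep the text between them."""
    return ANCHOR_TAG.sub("", html)

def sp_filename(institution: str, month: int, year: int) -> str:
    """File name in the sp_institution_MM_YYYY.html convention."""
    return f"sp_{institution}_{month:02d}_{year}.html"

print(strip_hyperlinks('<td><a href="detail.html">Total</a> 42</td>'))
print(sp_filename("famu", 9, 2005))
```

A regular expression is enough for simple saved pages; badly malformed HTML would call for a real parser.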

12
ValueLine
  • ValueLine statistics are emailed to FCLA
    directly from the vendor rep.
  • Excel format.
  • Save as valueline_MM_YYYY.html,
  • e.g., valueline_09_2005.html
  • Upload to the FCLA website and transcribe total
    annual searches into master spreadsheet.

13
Wilson
  • Log in to WilsonWeb at http://www.hwwstats.com/ng/
  • Account Number: <NNNN> (then click Login)
  • Password: <XXXX> (then click Continue)
  • Click Database Usage
  • (COUNTER reports are available, but the reports
    under Database Usage are more appropriate to the
    kind of data that FCLA gathers monthly)
  • Select Bill To Account (Ship To Account will
    generate ZERO usage)
  • Account: ALL (or you can run individual school
    reports)
  • Product: ALL
  • Detail Level: Complete Report
  • Choose a date range (one or multiple months)
  • Sort By: Number of Searches, then click Submit
  • Once the report is generated, save it as a file
    (HTML) or email the file. Save the HTML file as
    wilson_MM_YYYY.html, e.g., wilson_09_2005.html.
  • Upload to the FCLA website and transcribe total
    annual searches into master spreadsheet.

14
ACM
  • ACM provides usage stats only twice per year,
    after June 30 and after December 31.
  • Data is provided personally by ACM rep.
  • <inject humorous stories about ACM and AMS here>
  • Files are delivered in HTML format, by school.
  • Add the following header
  • ACM Digital Library
  • Usage Report for (school name)
  • (Date Range of report)
  • Save the file as HTML (acm_school_year.html,
    e.g., acm_famu_2005.html).
  • Upload to FCLA website and transcribe total
    annual searches into master spreadsheet.

15
CHEST-UK and ATHENS
  • Does what the ACM suggested: uses the
    authentication conduit to monitor usage
  • Athens: single sign-on access point for users
  • Provides COUNTER-compliant stats over and above
    any that are supplied by the vendor
  • for individual accounts, groups of accounts
    (using various grouping methods) or all accounts
  • for individual resources or all
  • for a single day or a date range

16
BUT ATHENS Statistics
  • can only count sessions; the libraries still
    have to rely on publisher/vendor statistics
    for searches, journal article downloads, etc.

17
FCLA Database Search Statistics
18
EBSCOHost Usage Reports
 
 
19
EBSCOadmin Database Usage Report January 2006
20
The SPREADSHEET
21
VIVA -- from Kathy Perry, Director
  • I always think it's sad when publishers (and
    some librarians) think usage stats are only
    useful for collection development and they think
    collection development only involves
    cancellation. In fact, one of the primary uses
    of our stats is actually collection development
    when we ADD resources. This is particularly true
    in considering leasing or buying the archives,
    but it is also true of other collections.

22
VIVA's Bottom Line
  • But the best application of our stats is in
    speaking to decision makers at the State Council
    of Higher Education or in the legislature. We
    have to show that the state is getting a return
    on investment.
  • It really helps that we show the increase in
    usage over time and the millions and millions of
    articles downloaded and searches conducted.
  • ...I know there are serious problems with many of
    these numbers. They show trends and are an
    indication of the use by our faculty and
    students. This is one of our primary ways of
    keeping and increasing our funding each year.

23
The Same Story - Coast to Coast
  • CSU-SEIR also uses statistics
  • to report to decision makers
  • to demonstrate the value of the resources the
    consortium licenses to the legislature and to our
    stakeholders
  • to formulate "measures of success"
  • The reports are instrumental to receiving
    funding for our library programs.
  • Lisa Moske, Director

24
More from the West Coast: UC
  • UC employs vendor usage data in the way the
    NFAIS conveners fear, as one of the indicators
    reviewed in decisions about which journals to
    keep or drop when titles are swapped in and out
    of packages.
  • In addition to comparative use among titles in a
    package, usage trends are analyzed over time (up
    or down), cost per use, etc.
  • Many other factors are examined besides usage, so
    this is far from a mechanistic or one-dimensional
    process.

25
From the South - GALILEO
  • Usage stats help determine marketing and
    training needs, particularly for our K-12 and
    public library arena.
  • Did training result in an increase in use?
  • Usage spikes, then settles down, but it's still
    more than before the training.
  • Statistics play a big role but so do all the
    other issues associated with vendor assessment -
    like easy to use access management, effective
    user interfaces, OpenURL compliance, etc...

26
GALILEO's Bottom Line
  • It is still a transition environment. Do we have
    a resource because it is used or because it is
    important and ought to be used?
  • We are just in the early stages of trying to be
    data driven vs. intuition/political (e.g.,
    getting the journal for prof. so and so because
    he can make trouble) organizations.
  • When it comes down to it the reality is content
    and price and whether people see the content as
    being worth the price.

27
Also From the South (Florida that is)
  • We do use stats when products come up for
    renewal, although low stats don't always mean we
    will cancel.
  • High stats don't mean renewal necessarily either.
  • Low stats do send a red flag - we try to promote
    that resource in training and marketing,
    especially when it seems like there is no other
    reason for it not being used other than people
    aren't aware of it and its content.

28
And from the SouthWest
  • TexShare uses several customer surveys to analyze
    its services.
  • 150 academic libraries complete an annual survey
    that just measures the overall user satisfaction
    with the various TexShare programs, the database
    program being one.
  • 700 consortium members are surveyed on the
    databases.

29
TexShare Electronic Resources
  • Selected through recommendations of a
    collaborative working group
  • Decisions are guided by criteria
  • Membership Surveys
  • Usage Statistics
  • Database Content
  • Vendor Reliability
  • Best Value
  • Surveys are very important for this process.

30
There's that Bottom Line Again
  • The vendor statistics collected for the
    databases go into the cost avoidance performance
    measures, which are reported to the legislature
    and used to justify funding.
  • Ann Mason, Texas State Library and Archives
    Commission

31
Again, it's not just Statistics
  • From the Massachusetts Board of Library
    Commissioners
  • Online survey responses from 600 libraries on
    patron satisfaction with statewide licensed
    resources
  • Data used to look at what is licensed and what
    should be acquired next.
  • Compare most valued with actual usage.
  • The top tier (about 25) of databases usually
    coincides (high usage, most valued), after that
    the two metrics diverge.

32
Massachusetts - Use vs. Rank
33
Canada Weighs In - Bottom Line Goes International
  • We use statistics, at a very broad level, to
    illustrate the value of our services - those
    millions of downloads from one of our most
    popular databases doesn't hurt to explain to our
    funders why it makes sense to keep funding us.
  • Heather Morrison, BC ELN

34
Statistical analysis can illuminate "side problems"
  • Case in point at CSU: the discovery of crawlers
    that created spikes of usage in a period of
    nanoseconds for a single vendor.
  • Worked with the vendor for months to get them to
    comply with simple standards for reporting.
  • As part of the process the vendor's technical
    staff found that the sessions that bumped the
    shared-use scenario into turnaway-chaos were due
    to bizarre activity, at single points-in-time,
    initiated from five campuses.

35
More side problems
  • Checking stats showed that a library did not use
    a popular service for several months - turns out
    their access wasn't working
  • Stats showed that a rarely used resource (12 hits
    per month) managed to have turnaways. This
    indicated a technical problem which the vendor
    had to resolve and for which compensation was
    required.
  • Low usage in a particular region is a reason to
    prioritize in-person training in that area.
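Checks like these can be run routinely once the monthly figures are in one place. A minimal sketch of two of them, assuming a list of monthly hit counts and a turnaway total per library and resource; the function name and the low-use threshold of 20 hits are hypothetical choices for illustration.

```python
def flag_problems(monthly_hits, turnaways, low_use=20):
    """Return warnings for two patterns described above.

    low_use is a hypothetical threshold for "rarely used" (hits per month).
    """
    flags = []
    if len(monthly_hits) >= 2 and all(h == 0 for h in monthly_hits):
        flags.append("no usage: check that access is working")
    if turnaways > 0 and max(monthly_hits, default=0) <= low_use:
        flags.append("turnaways despite low use: possible technical problem")
    return flags

print(flag_problems([0, 0, 0], turnaways=0))
print(flag_problems([12, 12], turnaways=3))
```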

36
From OHIOLink aka Tom Sanville
  • Let's face it: much of what we buy are
    no-brainers. Databases and journals are core,
    unique resources.
  • Stats show how much more bang we're getting for
    the buck as a consortium than we would as
    individual libraries
  • Using the data for cancellations has limited
    application.
  • Use is just one measure of value to us. Buying
    only things with phenomenally low cost per use
    would mean many core resources would be toast.
  • That said, the vendors/publishers have to
    recognize that use is a factor and that easy
    access to those stats is vital.

37
We're in this together
  • Libraries and their consortia are valued
    partners in the process of standardization, needs
    assessment, and statistical analysis. We are not
    mulling over these numbers simply to determine
    what resources we can do away with. Much of the
    activity on our part is to make sure that the
    resources continue to be accessible, and, hence,
    of value to our user. - Lisa Moske, CSU-SEIR
  • Are the publishers/vendors with us too?

38
OHIOLink - Definite Preferences
  • Biggest need: collecting journal title usage
    stats for multiple institutions, with data that
    can be aggregated and delineated in multiple ways
    over multiple time frames
  • The data must be transportable into Excel in
    pivot-table-friendly form that can regularly be
    added to, manipulated, and output in whatever
    views of the raw data are needed.
  • The data must be at the institution level but
    must be retrievable in one report per period
    (month) or multiple self-defined periods. Not 85
    individual reports!
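"Pivot table friendly" usually means one record per institution, title, and period rather than one column per month. A minimal sketch of that reshaping, assuming a hypothetical wide layout of [institution, title, month1, month2, ...]; vendor reports vary and would need their own column mapping.

```python
def to_long_rows(header, rows):
    """Unpivot [institution, title, m1, m2, ...] rows into one record per month."""
    periods = header[2:]
    long_rows = []
    for row in rows:
        institution, title, *counts = row
        for period, count in zip(periods, counts):
            long_rows.append((institution, title, period, count))
    return long_rows

# Illustrative data only
header = ["Institution", "Title", "01-2006", "02-2006"]
rows = [["famu", "Journal A", 3, 4]]
for record in to_long_rows(header, rows):
    print(record)
```

Records in this shape can be appended month after month and summarized by any combination of institution, title, and period.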

39
OHIOLink - Strong Opinions
  • COUNTER reports may be great for a single
    institution. They are death for compiling
    consortium data member by member.
  • Many publishers'/vendors' output is naturally
    compatible and extractable in the right chunks.
    EBSCO is an example that works pretty well.
  • Others not so at all. Try to get ACS downloads by
    title by institution. Have to do it one by one.
    Others fall in-between.
  • Britannica comes in one report but needs
    complicated Excel games to reverse out all the
    formatting to a more usable state.

40
OHIOLink - What He Said!
  • I'm interested in the raw data not the
    presentation of it by the vendors.
  • So if SUSHI can solve some of these problems,
    amen and I hope so.
  • Tom Sanville, Ohiolink

41
FCLA's Automated COUNTER Harvester
  • Assumptions
  • Retrieve data either via FTP or HTML. If FTP,
    only the address is needed. If HTML, one or more
    pages may need to be traversed.
  • Design
  • Create a harvesting script for each COUNTER
    source. Each script is maintained in a separate
    file.
  • A COUNTER database holds the retrieved data and
    COUNTER source information.

42
COUNTER Harvester Preliminary Design
SOURCE Table
RESOURCES Table
STATISTICS Table
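The slide names three tables but the transcript does not preserve their columns. The sketch below shows one way such a database might be laid out in SQLite; every column name here is an assumption made for illustration, not the FCLA design.

```python
import sqlite3

# Hypothetical columns; only the three table names come from the slide.
SCHEMA = """
CREATE TABLE source (
    source_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    script_file TEXT NOT NULL
);
CREATE TABLE resources (
    resource_id INTEGER PRIMARY KEY,
    source_id   INTEGER NOT NULL REFERENCES source(source_id),
    title       TEXT NOT NULL
);
CREATE TABLE statistics (
    resource_id INTEGER NOT NULL REFERENCES resources(resource_id),
    institution TEXT NOT NULL,
    period      TEXT NOT NULL,
    usage_count INTEGER NOT NULL,
    PRIMARY KEY (resource_id, institution, period)
);
"""

def create_counter_db(path=":memory:"):
    """Create the three tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

conn = create_counter_db()
conn.execute("INSERT INTO source (name, script_file) VALUES (?, ?)", ("Gale", "gale.xml"))
```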
43
COUNTER Harvester Preliminary Design
  • Scripts
  • XML documents that specify a set of actions to be
    performed and the information necessary to
    perform the action.
  • Comments are formatted as XML, i.e., <!-- ... -->.

44
Example: Gale
  • <CounterHarvesterScript>
  • <debug/>
  • <navigate>http://infotrac.galegroup.com/itconfig/fcla_000</navigate>
  • <pause> 2 </pause> <navigate>http://infotrac.galegroup.com/itconfig/fcla_000?idfclareports&amp;passfclareports
  • </navigate>
  • <pause> 2 </pause> <navigate>http://web6.infotrac.galegroup.com/infotrac_config/session/191/591/76110636w6/pgir!166&amp;ui3CONSORT&amp;ui4fcla&amp;ul3&amp;un0&amp;uo9&amp;uiyY1&amp;uin1&amp;uifCSV&amp;uicNone&amp;uilYES&amp;uimmarkhi@ufl.edu&amp;uzGetReport
  • </navigate>
  • <alert> Statistics will be sent by email to
    markhi@ufl.edu. </alert>
  • </CounterHarvesterScript>
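A script like this one can be read with a standard XML parser and turned into an ordered list of actions. The sketch below only parses; it does not navigate anywhere. The function name is hypothetical, the sample URL is a placeholder, and how the real harvester interpreted each element is not stated in the slides.

```python
import xml.etree.ElementTree as ET

def read_script(xml_text):
    """Return the script's actions as (tag, text) pairs, in document order."""
    root = ET.fromstring(xml_text)
    if root.tag != "CounterHarvesterScript":
        raise ValueError("not a harvester script")
    # XML comments are dropped by the default parser, so only actions remain
    return [(step.tag, (step.text or "").strip()) for step in root]

SCRIPT = """<CounterHarvesterScript>
  <debug/>
  <!-- placeholder URL, for illustration only -->
  <navigate>http://example.org/login?id=demo&amp;pass=demo</navigate>
  <pause> 2 </pause>
  <alert> Statistics will be sent by email. </alert>
</CounterHarvesterScript>"""

for tag, text in read_script(SCRIPT):
    print(tag, text)
```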

45
Problems encountered
  • COUNTER data is treated as secure data, so
    creation of a web-walking robot is problematic.
  • Session data must be extracted at some point in
    the session and then inserted into URLs to be
    sent later in the session.
  • Reporting periods are defined in columns, one
    column per period. The column headers must be
    walked, looking for Total YTD after the monthly
    data column(s).
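Walking the column headers might look like the sketch below. It collects the monthly columns and stops at the year-to-date column. The accepted label patterns ("01-2006" or "Jan-2006") and the test for "YTD" anywhere in the cell are assumptions, since vendors label these columns inconsistently.

```python
import re

# Assumed month-label forms: "01-2006" or "Jan-2006"
PERIOD = re.compile(r"^(\d{2}|[A-Za-z]{3})-\d{4}$")

def period_columns(header):
    """Return (index, label) for each monthly column before the YTD column."""
    columns = []
    for index, cell in enumerate(header):
        label = cell.strip()
        if "ytd" in label.lower():
            break  # monthly data ends where the year-to-date total begins
        if PERIOD.match(label):
            columns.append((index, label))
    return columns

header = ["", "Publisher", "Platform", "Print ISSN", "Online ISSN",
          "01-2006", "02-2006", "Total YTD"]
print(period_columns(header))  # [(5, '01-2006'), (6, '02-2006')]
```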

46
COUNTER compliance is based on an Excel worksheet
geared for human consumption not computer
processing.
  • The various reports don't start reporting periods
    in the same column
  • In Journal Report 1, periods start in column F
  • In Journal Report 2, periods start in column G
  • Institution name may be in a single cell row or a
    column
  • Some csv files have information that must be
    skipped
  • Headings at the top of the sheet
  • Subtotal and total lines interspersed and/or at
    the bottom of the sheet
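A reader that tolerates these quirks has to be told where the periods start and must skip anything that is not a data row. A minimal sketch using Python's csv module; the sample layout is invented for illustration, and the "total" test is a crude heuristic that would also skip a journal whose title contains that word.

```python
import csv
import io

def data_rows(csv_text, first_period_col):
    """Yield (title, counts) for data rows, skipping headings and total lines."""
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= first_period_col:
            continue  # short heading or blank line
        title = row[0].strip()
        counts = row[first_period_col:]
        if not title or "total" in title.lower():
            continue  # column-header row, subtotal, or total line
        if not all(cell.strip().isdigit() for cell in counts):
            continue  # heading text where numbers were expected
        yield title, [int(cell) for cell in counts]

# Invented sample; periods start in column F (index 5), as in Journal Report 1
SAMPLE = (
    "Journal Report 1,,,,,,\n"
    "Some University,,,,,,\n"
    ",Publisher,Platform,Print ISSN,Online ISSN,01-2006,02-2006\n"
    "Journal A,Pub,Plat,1111-1111,2222-2222,3,4\n"
    "Total for all journals,,,,,3,4\n"
)
print(list(data_rows(SAMPLE, 5)))
```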

47
Counter Harvester Conclusion
  • COUNTER compliance is claimed by many but
    delivered by few.
  • In fact, as of this writing, we haven't found a
    single CSV that fully conforms.
  • Many are very different from the standard format.
  • Others, like EBSCO, are very close but since they
    are not exact are still not amenable to machine
    processing.

48
  • INTERNATIONAL COALITION OF LIBRARY CONSORTIA
    (ICOLC)
  • PRESS RELEASE FOR IMMEDIATE DISTRIBUTION
  • September 28, 2006
  • REVISED GUIDELINES FOR STATISTICAL MEASURES OF
    USAGE OF WEB-BASED INFORMATION RESOURCES
  • (Initially released in November 1998, revised
    December 2001, September 2006)
  • With the continuing endorsement of 83
    consortia from around the world (see list page
    6), this revision reflects the ICOLC's previous
    endorsement of Project COUNTER and the ICOLC
    community's new endorsement of NISO's
    Standardized Usage Statistics Harvesting
    Initiative (SUSHI) protocol and reliance on XML
    as the standard delivery format for usage
    statistics.

49
ICOLC Guidelines
  • 5. DELIVERY Usage reports must be delivered via
    an interactive web-based reporting system
    preferably on a real time basis, but at least
    within 15 days after the end of the month. Report
    content should be customizable, as specified in
    the Requirements section. Information providers
    are also encouraged to present data as graphs and
    charts. Vendors should maintain a minimum of
    three years of historical data. These data also
    should be available in flat files containing
    specified data elements that can be downloaded
    and manipulated locally. The preferred format is
    XML through the web services protocol described
    in the documents available from the NISO
    Standardized Usage Statistics Harvesting
    Initiative (SUSHI)
  • <http://www.niso.org/committees/SUSHI/SUSHI_comm.html>.

50
Hope for the future - SUSHI
  • The SUSHI schema needs more specificity, e.g.,
  • the format of dates is not specified in the
    schema but different date formats may cause
    different servers to reject the request or, worse
    still, to fail.
  • Integrating SUSHI and csv-based data
  • In SUSHI, the reporting period is generalized
    with a start/end date. In the Excel-based
    standard, the column headings are of the form
    mm-yyyy. To place both into a common database
    requires normalization.
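The normalization in question can be as simple as converting each mm-yyyy column heading into the start and end dates that a SUSHI-style reporting period uses. A minimal sketch; the function name is hypothetical and it assumes headings are strictly of the form mm-yyyy.

```python
import calendar
from datetime import date

def period_from_heading(heading):
    """Convert an "mm-yyyy" column heading to (first_day, last_day)."""
    month, year = (int(part) for part in heading.split("-"))
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)

start, end = period_from_heading("02-2006")
print(start.isoformat(), end.isoformat())  # 2006-02-01 2006-02-28
```

Storing both sources as start/end dates lets monthly spreadsheet columns and arbitrary SUSHI periods share one table.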

52
CONCLUSIONS
  • Statistics are important
  • We use them for many reasons
  • Evaluating effectiveness of marketing and
    training
  • Assessing value and usefulness of products
  • Demonstrating value of resources to funding
    agencies
  • Collecting statistics is not easy
  • Manipulating them is even less easy
  • COUNTER helps
  • SUSHI could help a whole lot more. GO SUSHI!