1. Bibliometrics uncovered: principles & practices of citation-based research evaluation
- Rosarie Coughlan
- Research Support Librarian (STM)
- Rosarie.coughlan_at_nuigalway.ie
2. Outline
- Research evaluation using bibliometrics: the context
- What to measure & when
- Tools to retrieve the data
- Limitations & considerations
- Evolving models & metrics
- Bibliometrics best practice
- Influencing the debate
3. Drivers for increased evaluation and assessment
4. Scholarship is increasingly collaborative & global
Largest collaboration in 2006: 2,512 authors, a "collaboration of collaborations"
5. "Justify budget of €2bn, colleges warned"
"UNIVERSITIES and colleges will have to justify their €2bn budget as part of an unprecedented government investigation into third-level spending." By Ralph Riegel and John Walshe, Irish Independent, Thursday August 14 2008
6. "According to Higher Education Authority figures, the number of permanent research-oriented staff in universities has risen 30 per cent in just six years. No wonder the Department of Education can't devote resources to reducing pupil/teacher ratios or helping autistic children. It's too busy funding research papers on post-modernist underwater hang gliding techniques in Outer Mongolia."
Marc Coleman, Sunday Independent, May 18th 2008
7. The metrics of bibliometrics
8. Bibliometrics
- Bibliometrics: the discipline of measuring the performance of a researcher, a collection of articles, a journal, a research discipline or an institution. This process involves the application of statistical analyses to study patterns of authorship, publication, and literature use (Lancaster 1977).
- Citation analysis: counting how many times a paper or researcher is cited by other scholars in the field. This performance measure assumes that influential scientists and important works are cited more often than others.
Citation counts are a measure of impact, and impact is closely related to quality. Nonetheless, the two concepts are not synonymous!
9. Stakeholders in research evaluation
Source: Thomson (2007), Using Bibliometrics
10. What to measure (1)?
Academic Peer Review, Employer Review, Faculty/Student Ratio, Citations per Faculty, International Faculty, International Students, Teaching Excellence, Student Satisfaction, Heads/Peer Assessments, A-level/Higher Points, Unemployment, Firsts/2:1s Awarded, Total Papers, Total Citations, Citation Impact (cites per paper), Percent Cited, Paper Impact Relative to Field, Percentile Rank in Field, Collaboration Indicators, Expected Citation Count, Ratio of Citations to Expected Citation Count, Expected Citation Rate for Category, Mean/Median Citation, H-Index, Citation Frequency Distribution, Time Series Trends, Eigenfactor, Impact Factor, Bibliometrics, Scientometrics, Informetrics, Citation Analysis, Webometrics, Virtual Ethnography, Web Mining, Conference Papers, Number of Patents, Co-Citation, Immediacy Factor, PageRank, Weighted In-Degree, Weighted Out-Degree, In-Degree Entropy, Out-Degree Entropy, g-index
11. When to measure (2): variables of scale?
"Higher education's most critical goals are difficult, if not impossible, to measure" (Birnbaum, 2001, p.84)
- (1) Applicability
- Impact vs. quality
- Scholarly publishing practices
- Knowledge dissemination methods differ in SSH
- Regional/national significance (regional readership?)
- Field or discipline variation
- Not all influences are counted, i.e. books, gov pubs, grey literature etc.
- (2) Accuracy
- Citation bias may exist
- Publication exclusion
- Same-name authors
- English language bias
- Bias towards international titles
- Number of authors (distribution of work)
- Cronyism
- (3) Validity
- Quality vs. quantity
- Time-span
- Only a small percentage of articles are highly cited
- Controversial papers
12. Evaluating research is difficult!
"Not everything that counts is countable, and not everything that is countable counts." (Albert Einstein, 1879-1955)
- but...
- Is a partial portrait an invalid portrait?
13. What to measure?
"Higher education's most critical goals are difficult, if not impossible, to measure" (Birnbaum, 2001, p.84)
- Funding proposals?
- Tenure & promotion?
- Field strengths / collaborators?
- Publication dissemination?
- University rankings?
14. The tools: strengths & weaknesses
Strengths:
- Complete indexing of authors' addresses
- Complete indexing of a known proportion of academic journals
- Multidisciplinary coverage
- Indexing of citations
- Better foreign language coverage
- 1,200 Open Access journals
- Increased SSH pubs, including journals & book chapters
- 3,500 titles (June 2009)
Weaknesses:
- Limited coverage
- English language bias
- Exclusion of certain types of documents
- Classification of journals by discipline
- Changes in journal titles
- Names spelled the same way (homographs)
(SCI, SSCI & A&HCI)
______________________________
Strengths:
- Complete indexing of authors' addresses
- Complete indexing of a known proportion of academic journals
- Multidisciplinary coverage
- Indexing of citations
- International coverage
Weaknesses:
- Limited coverage
- English language bias
- Exclusion of certain types of documents
- Classification of journals by discipline
- Changes in journal titles
- Names spelled the same way (homographs)
______________________________
Strengths:
- Peer-reviewed papers, theses, books, abstracts, and other scholarly literature
- Variety of publishers & professional societies
- OA journals / institutional repositories
- View citing articles
- Articles ranked by weighing the full text of each article, author & publication by relevance & times cited
- Full-text via NUI Galway Library
- Link to full-text via Library (on campus)
Weaknesses:
- Data dirty; duplication
- Data structure not suited for citation analysis; these points ruin its great potential for a wide range of subjects
15. Issues (1): mining the data - attribution
- NUIGalway
- NUI, Galway
- NUI Galway
- NUIG
- National University of Ireland Galway
- National University of Ireland, Galway
- UCG
- University College Galway
- University College Hospital Galway
- UCHG
- University Hospital Galway
- University Hospital
- OÉ Gaillimh
- Ollscoil na hÉireann Gaillimh
- Ollscoil na hÉireann, Gaillimh
14 of 83 for "National University of Ireland, Galway"
16. Impact of inconsistent identifier use
Loss of output: approx. 19%
Source: Scopus
17. Issue (2): mining the data - author name syntax
- Murphy, Paul V.
- Murphy, P. V.
- Murphy, P.
- Murphy, Paul
- Murphy, Paul Vincent
Use of multiple author name variants
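Grouping such variants is the first step of author disambiguation. A minimal sketch, assuming a simple "surname + first initial" key (a crude heuristic for illustration only; reliable profile cleaning also checks affiliation and publication history):

```python
import re

def name_key(name):
    """Collapse an author name to (surname, first initial) so that
    common "Surname, Given" variants group together."""
    surname, _, given = name.partition(",")
    initial = given.strip()[:1].upper()
    return (re.sub(r"\W", "", surname).lower(), initial)

variants = ["Murphy, Paul V.", "Murphy, P. V.", "Murphy, P.",
            "Murphy, Paul", "Murphy, Paul Vincent"]
print({name_key(v) for v in variants})  # one key for all five variants
```

Note the limits of the heuristic: it merges genuinely different authors who share a surname and initial (the homograph problem on the earlier slide), which is why citation indices also use affiliation data.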
18. Examples
Institutions
Countries
Research Groups
Papers
Journals
Authors
19. Bibliometric outputs
20. Data source
21. Jiao Tong University in Shanghai: Top 500 Universities
22. European Commission Science Indicators 2007
23. Benchmarking countries (a bibliometric overview)
(1) ISI Essential Science Indicators
24. Which country has the highest impact in Biology & Biochemistry?
- by:
- Citations
- Papers
- Cites per paper
Source: ISI Essential Science Indicators (coverage 1954- )
25. What are the highest ranking disciplines nationally?
- What are the highest ranking disciplines nationally by:
- Citations
- Papers
- Cites per paper
Source: ISI Essential Science Indicators (coverage 199?)
26. Benchmarking institutions (a bibliometric overview)
- ISI Essential Science Indicators
- Scopus
27. What are the emerging areas of research excellence in institution n? (ISI Essential Science Indicators (c.1954- ))
NB: data derived from ISI journal categories
28. Sample Irish HEIs by discipline: Biochemistry, Genetics & Molecular Biology (Scopus 1996- )
- Getting the most from the metric to identify local / inter-institutional hotspots
- Express it in relative terms, i.e. number of scholars and level of funding (productivity indicator)
- Expected citation ratio (aligned to world discipline average): percentile position of the paper based on citations in the same field
- Disciplinarity: level of multidisciplinarity in a set of papers, from 0 to 1 (the lower the number, the greater the multidisciplinarity). (Herfindahl Index)
NB: data derived from Scopus journal categories
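The Herfindahl-style disciplinarity measure can be sketched directly: it is the sum of squared shares of a paper set across fields, so 1 means fully monodisciplinary and lower values mean more multidisciplinary (the field labels below are invented for illustration):

```python
from collections import Counter

def herfindahl_disciplinarity(paper_fields):
    """Herfindahl index over the field labels of a set of papers.

    Returns a value in (0, 1]: 1.0 means every paper falls in one
    field (monodisciplinary); lower values mean the output is
    spread across more fields (greater multidisciplinarity)."""
    counts = Counter(paper_fields)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

# Two hypothetical paper sets:
focused = ["Biochemistry"] * 10
spread = ["Biochemistry"] * 4 + ["Genetics"] * 3 + ["Molecular Biology"] * 3

print(herfindahl_disciplinarity(focused))          # 1.0
print(round(herfindahl_disciplinarity(spread), 2))  # 0.34
```

The spread set's shares are 0.4, 0.3 and 0.3, giving 0.16 + 0.09 + 0.09 = 0.34, i.e. noticeably more multidisciplinary than the focused set.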
29. Benchmarking disciplines (a bibliometric overview)
- (1) ISI Essential Science Indicators
- (2) Scopus (Elsevier)
- (3) ISI Web of Science
30. What are the highest ranking disciplines within the University (using citation analysis)?
NB: data derived from Scopus journal categories
31. Benchmarking authors / groups
Bibliometric outputs: Scopus, ISI Web of Science
32. "Pro-Intelligent Design Astronomer Denied Tenure Ranks Top in His Department According to Smithsonian/NASA Database, Say Analysts" (The Chronicle of Higher Education, 1st June 2007)
"Mr. Gonzalez has a normalized h-index of 13, the second highest of the 10 astronomers in his department. The only person who ranks higher is Curtis J. Struck, a professor with an h-index of 17."
33. Metrics
- Total number of pubs
- Total number of cites
- H-Index: J. E. Hirsch (2005). The number of papers (N) in a given dataset having N or more citations.
- Number of papers in journals with IF > N
- Key variables:
- Career profile
- Early career researchers!
- Only meaningful when compared to others within the same discipline area (e.g. Life Sciences vs Physics)
- Bias against exceptional papers:
- Paper A: 1,000 cites
- Paper B: 3 cites!
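The h-index definition above can be computed in a few lines; the citation counts below are invented to illustrate the slide's point that a single exceptional paper barely moves the index:

```python
def h_index(citations):
    """h-index (Hirsch 2005): the largest h such that the author has
    h papers with at least h citations each."""
    h = 0
    for rank, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= rank:
            h = rank  # the rank-th paper still has >= rank citations
        else:
            break
    return h

print(h_index([1000]))           # 1 -> one 1000-cite paper gives h = 1
print(h_index([1000, 3]))        # 2
print(h_index([9, 7, 6, 2, 1]))  # 3
```

This is the "bias against exceptional papers" in action: Paper A's 1,000 citations and Paper B's 3 citations contribute identically once both clear the h threshold.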
34. Data integrity: author profiles in citation indices (1)
Select relevant author name variants
35. Ensuring author profile accuracy for improved precision tracking
- Checklist
- Coverage
- Publication years
- Inclusion of all publication types (i.e. peer-reviewed articles and conference proceedings)
- Accuracy (name syntax)
- Are all possible name variants for your profile included?
- Are there any gaps?
- Are there any inaccuracies?
- How an author set may be affected by a name change
- Accuracy (affiliation syntax)
- Your current affiliation
- Your affiliation history?
- Accuracy (paper groups)
- Whether all papers in one set are indeed by the same author
- Whether an article has been omitted from a set
- Select the feedback link
- Use your NUIG email (this is used as verification by Scopus)
- 3-4 weeks time lag
- You will receive notification when your profile has been updated
36. Example
- Functions:
- Top cited papers
- Citations
- H-index
37. The h-index: the number of papers (N) in the list that have N or more citations. Discounts the disproportionate weight of highly cited papers or papers that have not yet been cited.
Scopus h-index: 42
38. ISI Web of Science Distinct Author ID
Select "Provide Feedback" to update your profile. Can also do institutional batch load.
39. Assessing grant publication productivity & impact (using citation analysis)
Scopus, ISI Web of Science, Google Scholar
40. Assessing publication impact
- Sources
- Researchers in the field/discipline
- Institutions / Departments / Labs
- Journals, core collections within the discipline
- Discipline impact by
- Citing researchers
- Journals
- Institutions / Departments / Labs
- Knowledge transfer
- Across disciplines
- Across countries
41. Cited Reference Searching
[Diagram: a traditional search starts from a recent paper (2004) and follows its reference list backwards in time (2003, 1993, 1987, 1982, 1957 papers); a cited reference search starts from a known paper (e.g. 1996) and moves forward to later papers (e.g. 2001) that cite it.]
42. Cited Reference Search: ISI Web of Science
Esashi F, Christ N, Cannon J, Liu Y, Hunt T, Jasin M, West SC. CDK-dependent phosphorylation of BRCA2 as a regulatory mechanism for recombinational repair. Nature 2005, 434(7033):598-604.
Funded by Cancer Research UK.
43. ISI Web of Science (1)
- View citing articles: 2nd generation cites, indirect recognition (must be normalised to field)
- Find related: articles that share the same references
- View Journal IF
44. ISI Web of Science (2)
Analyse by: Author, Doc type, Source title, Subject
45. Scopus (1)
Receive RSS feed for new citing articles
Source: Scopus. Cited 112 times.
46. Scopus (2): Results analysis
- Knowledge transfer: what are the domains/disciplines using the knowledge created in the original work?
- Journals where the citing research is being published
- Authors?
47. Google Scholar
Google Scholar: Cited 107 times.
48. Benchmarking authors / groups: evolving metrics
- Authors/Groups
- Successive h-index: h2 index (h individuals with an h-index of at least h)
- hP index (ranked list of authors & their number of pubs)
- hC index (ranked list of authors & their number of cites)
- Market power value
- Social network analysis (co-authorship / co-publication collaboration activity)
- Webometrics
- Network-based metrics: see the MESUR project, which seeks to capture ALL interaction as attention metrics, i.e. PDF downloads etc. (when, where, who, why?); clickstream data yields:
- Behaviours between web sites & web pubs
- Google Web-URL citations
- Google Analytics: who, how and where web searchers are accessing sites; time periods, i.e. peaks of activity etc.
49. Measure field trends or journals
ISI Web of Knowledge: Journal Citation Reports
50. Journal evaluation criteria
- Peer evaluation: respondents in specific disciplines are requested to rate or classify the journals into several categories, so that rankings can be produced.
- Journals' compliance with publication criteria, i.e. periodicity, blind review of manuscripts and so on.
- Diffusion: diffusion of journals through a variety of methods: presence in the main international databases of their discipline, presence in libraries, inclusion in directories of periodicals, presence on the Internet, or inclusion of contributions from foreign authors.
- Citation analysis
51. Citation counts as a measure for evaluation (1)
- Measure
- Numbers of publications: level of production of new knowledge
- Number of citations
- Citation networks: establishing knowledge transfer across titles / disciplines
- Metrics
- Impact Factor
- Eigenfactor Score
- Article Influence Score
- SCImago Journal Rank
- ISI Essential Science Indicators
- SCImago Journal Rank (data source: Scopus)
52. The Impact Factor (1)
[Diagram: citations received in 2008 to items published in all previous years, highlighting 2007 and 2006 (the Impact Factor window) and 2008 itself (the Immediacy Index).]
Impact Factor: citations during the current year (2008 in this example) to articles published within the prior two years.
IF(2008) = (No. of citations in 2008 to articles published in the journal in 2007 and 2006) / (No. of articles published in the journal in 2007 and 2006)
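The two-year calculation above is just a ratio; the journal and figures below are invented to show it worked through:

```python
def impact_factor(cites_to_prior_two_years, articles_prior_two_years):
    """Two-year Impact Factor for a given JCR year.

    cites_to_prior_two_years: citations received in the JCR year
        (e.g. 2008) by items the journal published in the two
        preceding years (2006-2007).
    articles_prior_two_years: citable items the journal published
        in those two preceding years."""
    return cites_to_prior_two_years / articles_prior_two_years

# Hypothetical journal: 480 citations in 2008 to its 150 articles
# published across 2006 and 2007.
print(impact_factor(480, 150))  # 3.2
```

The same ratio with the windows shifted gives the Immediacy Index: citations in 2008 to articles published in 2008, divided by the number of 2008 articles.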
53. The Impact Factor (2)
- Interpreting the metric
- Negates research longevity: a given IF for any journal only presents an average; it cannot be used to measure the performance of an individual author.
- Quantity vs. quality
- Negative cites & citation bias
- Time
- Not all research work is published and cited in the citation indices.
- Different fields of research publish at different rates, i.e. Biomedicine vs. Engineering.
- Internationally recognized measurement of journal influence
- Shows the most influential journals by discipline, publisher etc.
- Used mainly for:
- making decisions on where to publish
- making decisions on which journals to subscribe to (cost effectiveness)
- Calculated against activity in the 2 previous years
54. Journal impact: the Impact Factor
55. Journal impact: number of articles published
56. SCImago Journal Rank (2008)
Data source: Scopus
- The SJR indicator attributes different weight to citations depending on the "prestige" of the citing journal (excluding journal self-citations); prestige is estimated using the Google PageRank algorithm in the network of journals.
- The SJR indicator includes the total number of documents of a journal in the denominator of the relevant calculation.
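A minimal sketch of the prestige idea: citations are propagated through the journal network with a PageRank-style iteration, so a citation from a prestigious journal is worth more than one from an obscure journal. This illustrates the principle only, not the actual SJR formula (which also normalises by the journal's document count); the toy citation matrix is invented:

```python
def prestige_scores(cite_matrix, damping=0.85, iters=100):
    """PageRank-style prestige over a journal citation network.

    cite_matrix[i][j] = citations from journal i to journal j.
    Self-citations are zeroed, as SJR excludes them."""
    n = len(cite_matrix)
    # Zero the diagonal and row-normalise into transition probabilities.
    P = []
    for i, row in enumerate(cite_matrix):
        row = [0 if i == j else c for j, c in enumerate(row)]
        total = sum(row) or 1
        P.append([c / total for c in row])
    r = [1.0 / n] * n
    for _ in range(iters):
        # Each journal keeps a baseline share plus damped incoming prestige.
        r = [(1 - damping) / n +
             damping * sum(r[i] * P[i][j] for i in range(n))
             for j in range(n)]
    s = sum(r)
    return [x / s for x in r]

# Toy network of three journals: A and B cite C heavily; C cites only A.
scores = prestige_scores([[0, 1, 9],
                          [1, 0, 9],
                          [2, 0, 0]])
print(max(range(3), key=lambda j: scores[j]))  # 2 -> journal C is most prestigious
```

Journal C receives the bulk of the citations, and those citations come from journals that are themselves cited, so C ends up with the highest prestige score.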
57. Reputation metrics: citation networks
58. Prestige metrics, e.g. UK ABS Academic Journals Quality Guide
- Criteria used to rank journals include:
- Originality
- Well executed, etc.
- Top journals have the highest citation Impact Factor in field
- (IF available for 493 out of 1,041 journals in the ABS guide)
- Academic Journals Quality Guide: "Aim to benefit membership and academics to make informed decisions, whether at the level of the business school or at the level of the individual academic"
59. Outputs
Books, monographs, book chapters, reports, working papers etc.
60. Citation counts as a measure for evaluating books & other publications (1)
- Measure
- Numbers of publications: level of production of new knowledge
- Metrics & means
- Classification based on scholars' quality perceptions, e.g. case study: Flemish Law
- Single & multi-authored books
- PhD theses
- Pubs of length > 5 pages
- Library collections analysis
- Number of academic library copies per book title
- E.g. use of WorldCat (Lianmans, CWTS 2007)
- Google Books (GB; world books unknown)
- Journal & book weights
- Lists of journals & publishers rated in surveys of national & international experts; trimmed statistical weights computed (Lewel et al., 1999; Moed et al., 2002; Nederhof et al., 2001)
- Need application to international standards
61. Evaluating books & other publications: evolving metrics (2)
Opportunities for evaluation of heterogeneous publication types:
- Establishing robust institutional systems
- Research Support System
- Institutional Repositories
- Libraries: key agencies in helping to collect & curate research outputs, analysis & interpretations
Outputs, outputs, outputs
62. Limits of bibliometrics
"If you want to fatten a pig, you don't keep weighing it."
64. Bad Science: "Funding, findings and the impact factor" (Guardian, 14th Feb 2009)
"Studies of flu shots funded by pharmaceutical companies are more likely to be published in prestigious journals than those funded by other sources, in spite of the fact that they have the same sample size and comparable methodology" (study by Thomas Jefferson, Cochrane Vaccine Institute: "Relation of study quality, concordance, take home message, funding, and impact in studies of influenza vaccines: systematic review")
65. Limitations: the tools
- Applicability
- Accuracy
- Validity
- WoS & Scopus not strong in coverage of humanities journals
- Not strong on non-English sources
- The humanities, engineering and computer science are less dependent on journals than other subject areas
- Google Scholar data is very dirty and there is duplication; the data structure is not suited for citation analysis; these points ruin its great potential for a wide range of subjects
- Limited number of articles in any indices
- Same-name authors, also known as homographs
THES-QS 2008: net change of rank by country
67. Evolving metrics (1)
- Institutions
- G Factor: links to university web sites by other international universities (perspective based)
- Authors/Groups
- Successive h-index: h2 index (h individuals with an h-index of at least h)
- hP index (ranked list of authors & their number of pubs)
- hC index (ranked list of authors & their number of cites)
- Market power value
- Topics / fields
- Networks of Science: social network analysis (co-authorship / co-publication collaboration activity)
- Journals
- Journal h-index
- Eigenfactor
- Diffusion Factor
- Webometrics
- Network-based metrics: see the MESUR project, which seeks to capture ALL interaction as attention metrics, i.e. PDF downloads etc. (when, where, who, why?); clickstream data yields:
- Behaviours between web sites & web pubs
- Google Web-URL citations
- Google Analytics: who, how and where web searchers are accessing sites; time periods, i.e. peaks of activity etc.
68. Evolving metrics (2): Open Access
- Citations (C)
- CiteRank
- Co-citations
- Downloads (D)
- C/D correlations
- Hub/Authority index
- Chronometrics
- Latency/Longevity
- Endogamy/Exogamy
- Book citation index
- Research funding
- Students
- Prizes
- h-index
- Co-authorships
- Number of articles
- Number of publishing years
- Semiometrics (latent semantic indexing, text overlap, etc.)
Source: Harnad, S. (2008) Metrics & Mandates
69. Establishing best practices
- Consider whether available data can address the question
- Choose publication types, field definitions, and years of data
- Decide on whole or fractional counting
- Judge whether data require editing to remove artifacts
- Ask whether the results are reasonable
- Use relative measures, not just absolute counts
- Obtain multiple measures
- Recognize the skewed nature of citation data
- Confirm data collected are relevant to the question
- Compare like with like
A 2-pronged approach: it takes experts to evaluate experts.
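The whole vs. fractional counting decision in the checklist above can be illustrated as follows (the author names and paper lists are invented):

```python
def whole_counts(papers):
    """Whole counting: each listed author receives full credit
    (1.0) for every paper they appear on."""
    credit = {}
    for authors in papers:
        for a in authors:
            credit[a] = credit.get(a, 0.0) + 1.0
    return credit

def fractional_counts(papers):
    """Fractional counting: each paper carries one unit of credit,
    split equally among its co-authors."""
    credit = {}
    for authors in papers:
        share = 1.0 / len(authors)
        for a in authors:
            credit[a] = credit.get(a, 0.0) + share
    return credit

papers = [["Murphy", "Smith"],
          ["Murphy"],
          ["Murphy", "Smith", "Chen", "Byrne"]]
print(whole_counts(papers)["Murphy"])       # 3.0
print(fractional_counts(papers)["Murphy"])  # 1.75
```

The choice matters for heavily co-authored fields: whole counting inflates totals for large collaborations (recall the 2,512-author paper earlier in the deck), while fractional counting keeps the sum of credit equal to the number of papers.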
70. Considerations
- Relevant and appropriate
- Are metrics correlated with other performance estimates?
- Do metrics really distinguish excellence as we see it? Are these the metrics the researchers would use?
- Cost effective
- The beauty of citations: data accessibility, coverage, cost & validation
- Few studies undertaken of specialised databases (indexing non-journal pubs); validity & coverage yet to be proven
- Transparent, equitable and stable
- Is it clear what the metrics do?
- Are all institutions, staff & subjects treated equitably?
- Peer review?
- Influence on behaviour?
- Articles: publish less / only the very best papers?
- Collaborate more intensely?
- Search engine optimisation?
- Strategic planning & benchmarking
- Are bibliometrics retrospective only? Or do they provide a reliable predictive value with which to evaluate future research strategy, i.e. time series trends?
71. The Library: supporting institutional research evaluation
- Provide customized research performance profiles and workshops for disciplines and schools
- Getting Published & Making an Impact
- Benchmarking Your Research Performance using Bibliometrics
- University rankings
- Local, national & international research benchmarking requirements
- Advocacy and awareness of evolving metrics, measures and their application
- Customized bibliometric data profiles
- Data analysis of university profile in key citation indices
Key collaborations:
- Research Office
- Quality Office
- Deans of Research
- Research / Centre Heads / PIs
- Library colleagues
72. Some conclusions
- Evaluating research is difficult! "Not everything that counts is countable, and not everything that is countable counts." (Albert Einstein, 1879-1955)
- Do the metrics corroborate or validate peer review, and equally, does peer review moderate the metrics?
- Need to move towards research impact, not outputs
- A 2-pronged approach: it takes experts to evaluate experts
- Use multiple metrics, fit for purpose and to best advantage
- Content is king but context is God: maximising research tracking & dissemination = exposure = greater impact!
73. Bibliometrics uncovered: principles & practices of citation-based research evaluation. Thank You!
- Rosarie Coughlan
- Research Support Librarian (STM)
- Rosarie.coughlan_at_nuigalway.ie