Title: ICPSR
1. ICPSR's Approach to Data Citation and Persistent
Identifiers
- Mary Vardigan, Assistant Director, ICPSR
- Workshop on Persistent Identifiers in the Social
Sciences -- Bonn, Germany, February 1, 2011
2. Today's Presentation
- ICPSR's use of data citations and persistent
identifiers
- Ways that ICPSR encourages good practice
- Issues to be resolved
- Future directions
3. ICPSR's Use of Citations
- ICPSR has been providing citations to its data
since 1990
- Citations based on Cataloging Machine-Readable
Data Files by Sue Dodd, American Library
Association, 1982
4. What Makes Up an ICPSR Citation?
- Content Creator/Principal Investigator
- Title
- Distributor: ICPSR
- Distribution place and date
- ICPSR study number
- Version number
- Materials designation: Computer file
- DOI
5. Example
- Schneider, Barbara, and Linda J. Waite. The 500
Family Study [1998-2000: United States] [Computer
file]. ICPSR04549-v1. Ann Arbor, MI:
Inter-university Consortium for Political and
Social Research [distributor], 2008-05-30.
doi:10.3886/ICPSR04549
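
The citation elements listed above can be assembled mechanically. The following is a minimal sketch, not ICPSR's actual code: the class name `DataCitation`, its field names, and the five-digit zero padding of the study number are assumptions inferred from the example citation.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements on the slide."""
    creators: str
    title: str
    study_number: int
    version: int
    place: str
    distributor: str
    date: str
    doi_prefix: str = "10.3886"  # prefix taken from the example DOI

    def study_id(self) -> str:
        # Assumption: study numbers are zero-padded to five digits.
        return f"ICPSR{self.study_number:05d}"

    def render(self) -> str:
        # Element order follows the example citation on the slide.
        return (
            f"{self.creators}. {self.title} [Computer file]. "
            f"{self.study_id()}-v{self.version}. {self.place}: "
            f"{self.distributor} [distributor], {self.date}. "
            f"doi:{self.doi_prefix}/{self.study_id()}"
        )


citation = DataCitation(
    creators="Schneider, Barbara, and Linda J. Waite",
    title="The 500 Family Study [1998-2000: United States]",
    study_number=4549,
    version=1,
    place="Ann Arbor, MI",
    distributor="Inter-university Consortium for Political and Social Research",
    date="2008-05-30",
)
print(citation.render())
```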
6. ICPSR's Use of DOIs
- ICPSR started assigning DOIs in 2008
- DOIs apply at the study or collection level (a
study can have multiple datasets)
- DOIs are of the form doi:10.3886/ICPSR04549
- DOIs resolve to the study homepage (metadata
record)
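
As a rough illustration of the DOI form described above, the sketch below parses a study-level DOI and builds a resolver link. The regular expression and the function names are assumptions based only on the example DOI; `https://doi.org/` is the public DOI resolver, which redirects to whatever URL the registrant has recorded (here, the study homepage).

```python
import re

# Assumption: study-level DOIs look like 10.3886/ICPSR04549,
# optionally written with a leading "doi:" label.
DOI_PATTERN = re.compile(r"^(?:doi:)?(10\.\d{4,})/(ICPSR(\d{5}))$")


def parse_study_doi(text: str) -> dict:
    """Split a study-level DOI into prefix, suffix, and study number."""
    match = DOI_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a study-level DOI: {text!r}")
    prefix, suffix, digits = match.groups()
    return {"prefix": prefix, "suffix": suffix, "study_number": int(digits)}


def resolver_url(text: str) -> str:
    """Return the resolver link for a study-level DOI."""
    parts = parse_study_doi(text)
    return f"https://doi.org/{parts['prefix']}/{parts['suffix']}"


print(parse_study_doi("doi:10.3886/ICPSR04549"))
print(resolver_url("doi:10.3886/ICPSR04549"))
```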
7. How ICPSR Obtains DOIs
- ICPSR uses the CrossRef service, the official
DOI link registration agency for scholarly and
professional publications
- ICPSR pays a modest annual Publisher Fee (based
on publishing revenues) and pays 6 cents per DOI
- To begin assigning DOIs, in 2008 ICPSR sent
CrossRef an XML file containing metadata on all
7,000 ICPSR studies
- ICPSR now obtains DOIs weekly
8. Weekly Process
- ICPSR runs a script to create XML metadata in
CrossRef format, with the following fields:
- Contributors and their roles
- Title
- Publication date
- Update date
- Study number
- DOI
- URL
9. Weekly Process, continued
- ICPSR submits an XML file to register new DOIs
- CrossRef sends an email confirming the file is
correct
- At that point, the DOI has an associated URL on
the ICPSR Web site
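
The weekly metadata step might look something like the sketch below. It only illustrates packaging the listed fields as XML: the element names (`weekly_batch`, `study_record`, and so on) are hypothetical and do not follow the actual CrossRef deposit schema, and the update date and URL are placeholders, not real values.

```python
import xml.etree.ElementTree as ET

# Field names mirror the slide's list; they are not CrossRef element names.
SIMPLE_FIELDS = ("title", "publication_date", "update_date",
                 "study_number", "doi", "url")


def build_record(study: dict) -> ET.Element:
    """Build one hypothetical <study_record> element for a study."""
    record = ET.Element("study_record")
    contributors = ET.SubElement(record, "contributors")
    for name, role in study["contributors"]:
        person = ET.SubElement(contributors, "contributor", role=role)
        person.text = name
    for field in SIMPLE_FIELDS:
        ET.SubElement(record, field).text = str(study[field])
    return record


def build_batch(studies: list) -> str:
    """Serialize all new studies into one XML document."""
    batch = ET.Element("weekly_batch")
    for study in studies:
        batch.append(build_record(study))
    return ET.tostring(batch, encoding="unicode")


example_study = {
    "contributors": [("Schneider, Barbara", "principal investigator"),
                     ("Waite, Linda J.", "principal investigator")],
    "title": "The 500 Family Study",
    "publication_date": "2008-05-30",
    "update_date": "2008-05-30",                # placeholder value
    "study_number": 4549,
    "doi": "10.3886/ICPSR04549",
    "url": "https://example.org/studies/4549",  # placeholder URL
}
xml_text = build_batch([example_study])
print(xml_text)
```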
10. Alternative Process
- Registration could happen in a script-driven
manner through an API
- This would happen without human intervention
- ICPSR database could communicate with the
CrossRef database, with DOIs registered
automatically
11. Requests for DOIs
- Journals are requiring that authors provide
persistent identifiers (PIDs) for the data they
analyzed in their articles
- Authors are coming to ICPSR for DOIs
pre-publication, generally depositing data into
the Publication-Related Archive
12. Encouraging Good Practice
- Bibliography of Data-Related Literature includes
60,000 citations to publications based on ICPSR
data
- Two-way linking: Studies link to publications,
Bibliography links back to studies
- Widely used DOIs for data would make searching
for and harvesting related publications much
easier
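
Two-way linking can be pictured as a pair of lookup tables kept in sync. This is a minimal sketch under assumed names (`LinkIndex`, `pub-A`, `pub-B`); it does not describe how the ICPSR Bibliography is actually stored.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical two-way index between study DOIs and publications."""

    def __init__(self):
        self.pubs_by_study = defaultdict(set)
        self.studies_by_pub = defaultdict(set)

    def link(self, study_doi: str, publication_id: str) -> None:
        # Record the link in both directions so either side can be queried.
        self.pubs_by_study[study_doi].add(publication_id)
        self.studies_by_pub[publication_id].add(study_doi)

    def publications_for(self, study_doi: str) -> list:
        return sorted(self.pubs_by_study.get(study_doi, ()))

    def studies_for(self, publication_id: str) -> list:
        return sorted(self.studies_by_pub.get(publication_id, ()))


index = LinkIndex()
# "pub-A" and "pub-B" are placeholder identifiers, not real publications.
index.link("10.3886/ICPSR04549", "pub-A")
index.link("10.3886/ICPSR04549", "pub-B")
index.link("10.3886/ICPSR28186", "pub-B")
print(index.publications_for("10.3886/ICPSR04549"))
print(index.studies_for("pub-B"))
```

Keying the index on the study DOI is what makes harvesting easier: any publication that cites the DOI can be matched to its study without parsing free-text citations.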
16. Making Citations and DOIs More Prominent
- ICPSR provides RIS export for data citations into
bibliographic citation software
- ICPSR highlights the data citation and DOI in
several places
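
For readers unfamiliar with RIS, the sketch below writes a data citation as a tagged RIS record. The choice of tags (`TY  - DATA` for the record type, `DO` for the DOI, and so on) is an assumption about a reasonable mapping, not a copy of ICPSR's actual export, and the function name `to_ris` is hypothetical.

```python
def to_ris(authors, title, year, publisher, place, doi):
    """Return one RIS record; each line is 'TAG  - value'."""
    lines = ["TY  - DATA"]                     # record type: dataset
    lines += [f"AU  - {name}" for name in authors]
    lines += [
        f"TI  - {title}",
        f"PY  - {year}",
        f"PB  - {publisher}",
        f"CY  - {place}",
        f"DO  - {doi}",
        "ER  - ",                              # end-of-record marker
    ]
    return "\n".join(lines) + "\n"


record = to_ris(
    authors=["Schneider, Barbara", "Waite, Linda J."],
    title="The 500 Family Study",
    year=2008,
    publisher="Inter-university Consortium for Political and Social Research",
    place="Ann Arbor, MI",
    doi="10.3886/ICPSR04549",
)
print(record)
```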
18. For each study
19. Working with Vendors to Promote Links to Data
- ICPSR has a project with Thomson Reuters (TR) to
display data linkages in Web of Knowledge
- Full and summary records in Web of Knowledge will
link to related data when appropriate
- ICPSR is providing a periodic data feed of
datasets and related publications to TR
- TR is integrating data feeds from others,
including UK Data Archive
20. Influencing Journals
- On behalf of the Data-PASS partners, ICPSR wrote
to professional associations in sociology,
political science, and economics
- Letters urged them to raise the standards for
data citations in their journals
- Professional associations are in a position to
set standards for their members and for journal
editors (including copy editors)
21. More on Influencing Journals
- Approach was to point to the variety of ways that
data were cited in specific journal issues
- The letter stressed the importance of citing data
the same way that publications are cited and the
value of persistent identifiers
- Organizations discussed the letters at recent
national meetings
- American Sociological Review just revised its
Notice to Contributors to reflect the importance
of data citations and DOIs
22. Updating Citation Software
- ICPSR worked with EndNote (owned by Thomson
Reuters) to ensure that data citations display
correctly
- The result is that "Dataset" is now a Reference
Type in EndNote
- Zotero also needs adjustment for datasets
23. Working with the Community
- ICPSR has joined DataCite as an associate member
- ICPSR has joined ORCID (Open Researcher and
Contributor ID). ORCID aims to create a central
registry of unique identifiers for individual
researchers
- ICPSR is heading up an IASSIST special interest
group on data citation (SIGDC)
24. IASSIST Session
- IASSIST SIGDC has proposed a session as part of a
data citation track including DataCite
- Tracking Data Reuse: Motivations, Methods, and
Obstacles -- Heather Piwowar, NESCent, University
of British Columbia
- Building Data Citations for Discovery -- Hailey
Mooney, Michigan State University, and Mark
Newton, Purdue University
- ICPSR's Efforts to Encourage Data Citation --
Elizabeth Moss, Inter-university Consortium for
Political and Social Research (ICPSR)
- Reactor Panel from SIGDC
25. Issues to Resolve
- With the community, address situations when data
resources have multiple distributors (and
multiple DOIs)
- Implement versioning in DOIs
- Address level of granularity for DOIs
- Move to DataCite
26. Multiple DOIs for Same Data
- Eurobarometer 72.2 (Nuclear Energy, Corruption,
Gender Equality, Healthcare, and Civil
Protection)
DOI: doi:10.4232/1.10009
Principal Investigator: Antonis Papacostas
Publication Agent: GESIS - Leibniz-Institut für
Sozialwissenschaften
- Papacostas, Antonis. Eurobarometer 72.2: Nuclear
Energy, Corruption, Gender Equality, Healthcare,
and Civil Protection, September-October 2009
[Computer file]. ICPSR28186-v1. Cologne, Germany:
GESIS/Ann Arbor, MI: Inter-university Consortium
for Political and Social Research [distributors],
2010-07-19. doi:10.3886/ICPSR28186
27. From CrossRef's Publisher Rules
- CrossRef only registers DOIs for Definitive
Works but not for Duplicative Works, as defined
in the CrossRef Glossary. CrossRef does not
permit multiple DOIs to be assigned to certain
closely related versions of a work. Where a
CrossRef member has content which is
substantially Duplicative of Definitive Works,
the member must retrieve the DOIs of Definitive
Works for display in such substantially
Duplicative Works and must link from the
substantially Duplicative Works to the Definitive
Works.
28. More on Multiple DOIs
- CrossRef policy is oriented toward publications,
not data
- Arrangement between ICPSR and GESIS is clear, but
there are other co-distributor relationships
- How much of a problem is this, and can we develop
a community solution?
- Can we use the DataCite metadata kernel
(relationType) to specify relationships?
- Would providing explanatory text and
cross-referencing DOIs in archives' metadata
records be useful?
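
One way to picture cross-referencing DOIs for co-distributed data is sketched below, using the two Eurobarometer 72.2 DOIs from the earlier slide. The function name and the grouping key are hypothetical, and the sketch deliberately leaves open which relationType value, if any, would be appropriate in DataCite metadata.

```python
from collections import defaultdict


def cross_reference(records):
    """Map each DOI to the other DOIs registered for the same work.

    `records` is a list of (work_label, doi) pairs; the work label is a
    hypothetical grouping key, for example a normalized title.
    """
    by_work = defaultdict(list)
    for work, doi in records:
        by_work[work].append(doi)
    related = {}
    for dois in by_work.values():
        for doi in dois:
            related[doi] = [other for other in dois if other != doi]
    return related


records = [
    ("Eurobarometer 72.2", "10.4232/1.10009"),     # GESIS
    ("Eurobarometer 72.2", "10.3886/ICPSR28186"),  # ICPSR
    ("The 500 Family Study", "10.3886/ICPSR04549"),
]
for doi, others in cross_reference(records).items():
    print(doi, "->", others)
```

Each archive could then display the related DOIs in its own metadata record, alongside explanatory text about the co-distribution arrangement.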
29. Versioning and DOIs
- ICPSR has decided to add version numbers to its
DOIs
- ICPSR may not have previous versions online
- User will have to contact ICPSR for access
- So far the number of users requesting older
versions has been very small
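
The slides do not state what a versioned DOI would look like. Purely as an illustration, the sketch below derives one from the study-plus-version identifier used in the citations (for example `ICPSR04549-v1`); the `.vN` suffix, the pattern, and the function name are assumptions, not ICPSR's announced format.

```python
import re

# Matches identifiers such as "ICPSR04549-v1" as they appear in citations.
STUDY_VERSION = re.compile(r"^ICPSR(\d{5})-v(\d+)$")


def versioned_doi(study_version_id: str, prefix: str = "10.3886") -> str:
    """Build a hypothetical versioned DOI from a study-version identifier."""
    match = STUDY_VERSION.match(study_version_id)
    if match is None:
        raise ValueError(f"unrecognized identifier: {study_version_id!r}")
    number, version = match.groups()
    # Assumption: the version is appended to the study-level DOI as ".vN".
    return f"{prefix}/ICPSR{number}.v{version}"


print(versioned_doi("ICPSR04549-v1"))
print(versioned_doi("ICPSR28186-v1"))
```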
30. Level of Granularity for DOIs
- ICPSR's current practice is to assign the DOI at
the study level
- DOI resolves to the study homepage, which
includes Version History detailing changes to all
files in the collection
- Assigning dataset-level DOIs is a challenge
because ICPSR has over 65,000 datasets
- ICPSR is undertaking a large project to revamp
archival management, and dataset-level DOIs will
be integrated in the new infrastructure
31. Moving to DataCite for DOIs
- DataCite offers several advantages because of its
focus on data
- Metadata kernel more robust and intended to
describe data
- Community of trusted data centers is a shared
goal
32. Future Directions
- Address situations when data resources have
multiple distributors and multiple DOIs
- Approach other vendors, including Google Scholar,
after TR service is deployed
- Contact other professional associations and
journals
- Work with other data producers on providing
visible citations and DOIs and encouraging their
use
- Continue spreading the word about data citation
and persistent identifiers!
33. Thank you