Crosslinking and Referencing Data and Publications in CLADDIER

1
Cross-linking and Referencing Data and
Publications in CLADDIER
  • Brian Matthews,
  • E-Science Centre,
  • STFC Rutherford Appleton Laboratory

2
About CLADDIER
Citation, Location and Deposition in Discipline
and Institutional Repositories
Funded via a JISC grant, through the Digital
Repositories programme - July 2005-Oct 2007
  • Bryan Lawrence (PI, BADC)
  • Sam Pepler (Project Manager, BADC)
  • Sue Latham (BADC)
  • Pauline Simpson (NOCS)
  • Jessie Hey (Southampton)
  • Brian Matthews (STFC)
  • Catherine Jones (STFC)
  • Alistair Miles (STFC)
  • Katie Portwin (STFC)
  • Shoaib Sufi (STFC)
  • Kevin ONeil (STFC)
  • Katherine Bouton (Reading, NCAS)

3
(No Transcript)
4
Citation and linking in repositories
  • In order to achieve this scenario we need to
    provide a set of key mechanisms:
  • Publishing of data
  • Conventions for the citation of data
    • Can then treat data citation in a similar way to
      publications
  • Browsing and searching
    • across different repositories
    • across data and publications
  • Cross-citation of data and publications
    • forward and backward citation
    • need to maintain currency of citation links
  • A simple mechanism to push citation information
    between repositories
  • A practical look at citation of data and how
    repositories could communicate citation
    information.

5
Data Publication
  • In this context publication is defined as the
    process through which data is fixed and made
    retrievable over the long term, and may imply
    that there has been some quality control process.
  • Defining data: fixing and encapsulating a
    meaningful data set
  • Quality control: publishers, data centres

Natural Environment Research Council,
Mesosphere-Stratosphere-Troposphere Radar
Facility, [Thomas, L.; Vaughan, G.].
Mesosphere-Stratosphere-Troposphere Radar
Facility at Aberystwyth, [Internet]. Version 2,
Cartesian products. British Atmospheric Data
Centre (BADC), 1990- [cited 2006 Apr 25].
Available from http://badc.nerc.ac.uk/data/mst.
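
The example citation above can be read as a fixed sequence of fields, which is what lets a data set be cited in a similar way to a publication. The sketch below is illustrative only: the class, the field names and the square brackets around the authors, "Internet" and the cited date are assumptions, not a format defined by CLADDIER.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Fields of a data set citation (field names are illustrative assumptions)."""
    funder: str
    facility: str
    authors: list[str]
    title: str
    version: str
    publisher: str
    start_year: str
    cited: str
    url: str


def format_citation(c: DataCitation) -> str:
    """Render the fields in the order used by the example citation."""
    authors = "; ".join(c.authors)
    return (
        f"{c.funder}, {c.facility}, [{authors}]. "
        f"{c.title}, [Internet]. {c.version}. "
        f"{c.publisher}, {c.start_year}- [cited {c.cited}]. "
        f"Available from {c.url}."
    )


mst = DataCitation(
    funder="Natural Environment Research Council",
    facility="Mesosphere-Stratosphere-Troposphere Radar Facility",
    authors=["Thomas, L.", "Vaughan, G."],
    title="Mesosphere-Stratosphere-Troposphere Radar Facility at Aberystwyth",
    version="Version 2, Cartesian products",
    publisher="British Atmospheric Data Centre (BADC)",
    start_year="1990",
    cited="2006 Apr 25",
    url="http://badc.nerc.ac.uk/data/mst",
)
print(format_citation(mst))
```

Keeping the version and the cited date as separate fields reflects the point that a published data set is fixed: the citation identifies which version was used and when it was retrieved.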
6
Browsing and Searching
  • Browsing and searching
    • across different repositories
    • across data and publications
  • CLADDIER has provided a harvesting and search
    tool to support cross-repository searching

7
Discovery Service
  • The Discovery Service gives a broad-brush search
    • Gives you both publications and data sets,
      indexed by keyword
    • Like Google across repositories
  • Uses OAI-PMH: a conventional approach
    • Simple, but it works!
    • Simple keyword searching
  • Three participating repositories in the pilot:
    BADC, STFC ePubs, SOTON ePrints
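
A minimal sketch of the harvest-then-index idea, assuming each repository answers an OAI-PMH ListRecords request (verb=ListRecords&metadataPrefix=oai_dc) with simple Dublin Core records. The sample response, the identifiers and the function names are made up for illustration; this is not the CLADDIER Discovery Service code.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response; a real one would be fetched over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:data-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>MST radar wind profiles</dc:title>
          <dc:type>Dataset</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:pub-7</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Turbulence observed with the MST radar</dc:title>
          <dc:type>Text</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest(xml_text):
    """Extract identifier, title and type from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "id": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "type": rec.findtext(".//dc:type", default="", namespaces=NS),
        })
    return records


def build_index(records):
    """Map each lower-cased title word to the set of record identifiers."""
    index = defaultdict(set)
    for rec in records:
        for word in re.findall(r"[a-z0-9]+", rec["title"].lower()):
            index[word].add(rec["id"])
    return index


def search(index, keyword):
    """Return the identifiers of records whose title contains the keyword."""
    return sorted(index.get(keyword.lower(), set()))


index = build_index(harvest(SAMPLE_RESPONSE))
print(search(index, "radar"))
```

A search for "radar" returns both the data set and the publication, which is exactly the broad-brush behaviour described: the index knows they share a keyword, not that one cites the other.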

8
Adding Cross-Citations
  • Keyword search alone cannot tell whether the data
    and publications it returns are actually related:
    • what data and publications inspire a piece of
      work (generating a new data set)
    • what publications arise from a data set
  • We need to exploit the concept of cross-citation
    to see whether items are actually related.
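
As a small illustration of forward and backward citation, the sketch below derives cited-by links from a table of outgoing citations. The identifiers are invented for the example, and a real repository would hold these links in its metadata records rather than in a dictionary.

```python
from collections import defaultdict

# Outgoing citations: each item lists what it cites (hypothetical identifiers).
cites = {
    "paper:A": ["dataset:D1", "paper:B"],
    "dataset:D2": ["dataset:D1", "paper:A"],
}


def invert(cites):
    """Build the cited-by view by reversing every citation link."""
    cited_by = defaultdict(list)
    for source, targets in cites.items():
        for target in targets:
            cited_by[target].append(source)
    return dict(cited_by)


cited_by = invert(cites)
print(cited_by["dataset:D1"])  # items that cite data set D1
```

Inverting is easy when all links live in one place. Across separate repositories the cited item has no way of seeing the citing record, which is why a notification mechanism is needed.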

(Figure: Traditional Citation compared with Cross Citation)
9
Maintaining Links
  • Ideally the archives holding the datasets and
    publications would be notified that a paper
    citing them had been submitted.
  • Metadata associated with those records would be
    updated to reflect the citations.
  • The metadata in the publication repository should
    also link to the metadata in the data archives
    and vice versa.
  • It would be great if this notification could be
    done automatically.
    • Citations are tedious to enter by hand
    • Forward citations (cited-by) are hard to
      track
  • We adapted a protocol from the world of blogging:
    Trackback
    • Designed to allow cross-referencing of blog
      articles
    • Extended to allow richer metadata

10
Trackback Protocol
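
A minimal sketch of a standard Trackback exchange: the sender POSTs a form-encoded body (url, title, excerpt, blog_name) to the receiver's Trackback URI, and the receiver replies with a small XML document whose error element is 0 on success. The dc_ prefixed fields standing in for the richer citation metadata are hypothetical, as are the function names and example URLs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_ping(citing_url, title, excerpt, repository, extra_metadata=None):
    """Build the form-encoded body of a Trackback ping.

    url, title, excerpt and blog_name are the standard Trackback fields.
    extra_metadata is an assumed extension: each key is sent as dc_<key>.
    """
    fields = {
        "url": citing_url,
        "title": title,
        "excerpt": excerpt,
        "blog_name": repository,
    }
    for key, value in (extra_metadata or {}).items():
        fields[f"dc_{key}"] = value  # hypothetical extension field
    return urlencode(fields)


def parse_response(xml_text):
    """Return (ok, message) from a Trackback response document."""
    root = ET.fromstring(xml_text)
    ok = root.findtext("error", default="1").strip() == "0"
    return ok, root.findtext("message", default="")


body = build_ping(
    citing_url="http://pubs.example.org/record/101",
    title="A paper that cites a technical report",
    excerpt="Cites record 42",
    repository="Example publications repository",
    extra_metadata={"creator": "A. Author", "date": "2007"},
)
print(body)
print(parse_response("<response><error>0</error></response>"))
```

The body would be sent with Content-Type application/x-www-form-urlencoded. Sending it is left out here so that the sketch runs without a network.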
11
Sender Publication
This publication has a citation to a technical
report
12
Adds Citation
Sends trackback call to this URI
13
Embedded Metadata
Trackback URI
Formats accepted
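
Standard Trackback auto-discovery embeds a small RDF block in the page, usually inside an HTML comment, with a trackback:ping attribute giving the Trackback URI. The sketch below reads that attribute. The ext:accepts attribute and its namespace are invented here to stand in for the "formats accepted" information; how CLADDIER actually encodes it is not stated in this transcript.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TB_NS = "http://madskills.com/public/xml/rss/module/trackback/"
EXT_NS = "http://example.org/citation-ext#"  # hypothetical namespace

# Hypothetical record page with embedded Trackback metadata.
PAGE = """<html><body>
<h1>Technical report</h1>
<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/"
         xmlns:ext="http://example.org/citation-ext#">
  <rdf:Description
      rdf:about="http://repo.example.org/record/42"
      dc:identifier="http://repo.example.org/record/42"
      dc:title="Technical report"
      trackback:ping="http://repo.example.org/trackback/42"
      ext:accepts="simple-dc eprints-ap" />
</rdf:RDF>
-->
</body></html>"""


def discover(html):
    """Find Trackback URIs (and assumed accepted formats) embedded in a page."""
    results = []
    for block in re.findall(r"<rdf:RDF.*?</rdf:RDF>", html, flags=re.DOTALL):
        root = ET.fromstring(block)
        for desc in root.iter(f"{{{RDF_NS}}}Description"):
            ping = desc.get(f"{{{TB_NS}}}ping")
            if ping:
                results.append({
                    "about": desc.get(f"{{{RDF_NS}}}about"),
                    "ping": ping,
                    "accepts": desc.get(f"{{{EXT_NS}}}accepts", "").split(),
                })
    return results


print(discover(PAGE))
```

A sender would run this against the page of each cited item and, if a Trackback URI is found, send its ping there in one of the listed formats.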
14
After Trackback: cited-by link added
Receiver Publication
Added this cited-by link
15
Notes on Trackback
  • A simple existing protocol
    • P2P: loosely federates repositories
  • Extended to carry metadata of the citation
    • To add cited-by links
  • Can also indicate which metadata is expected
    • Simple Dublin Core
    • ePrints Application Profile
  • Can also use the metadata of the receiver
    • Improves the citation metadata
  • Implemented in ePubs
    • Also partially in BADC
    • Receiver only: sends email to admin.
  • Some problems or extensions are under
    consideration
    • Link to metadata, not full text
    • Spamming: anyone could send trackbacks
      • Whitelists
      • Administrator intervention
    • Multiple entries
      • Same citation multiple times
      • Same citation in different repositories
    • Retraction of citation
      • A delete protocol
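
The issues under consideration can be sketched from the receiver's side as follows. This is an illustration of the ideas only (whitelist, administrator queue, duplicate check, retraction), with hypothetical class and method names; it is not the ePubs or BADC implementation.

```python
from urllib.parse import urlparse


class TrackbackReceiver:
    """Toy receiver that records cited-by links for its own records."""

    def __init__(self, trusted_hosts):
        self.trusted_hosts = set(trusted_hosts)  # whitelist of sender hosts
        self.cited_by = {}   # record id -> list of citing URLs
        self.pending = []    # pings held for administrator intervention

    def receive(self, record_id, citing_url):
        """Handle one ping and report what was done with it."""
        host = urlparse(citing_url).hostname
        if host not in self.trusted_hosts:
            # Unknown sender: possible spam, so queue it for the administrator.
            self.pending.append((record_id, citing_url))
            return "pending"
        links = self.cited_by.setdefault(record_id, [])
        if citing_url in links:
            return "duplicate"  # same citation sent more than once
        links.append(citing_url)
        return "added"

    def retract(self, record_id, citing_url):
        """Remove a cited-by link; returns True if a link was removed."""
        links = self.cited_by.get(record_id, [])
        if citing_url in links:
            links.remove(citing_url)
            return True
        return False


receiver = TrackbackReceiver(trusted_hosts=["pubs.example.org"])
print(receiver.receive("record/42", "http://pubs.example.org/record/101"))
print(receiver.receive("record/42", "http://pubs.example.org/record/101"))
print(receiver.receive("record/42", "http://spam.example.net/page"))
print(receiver.retract("record/42", "http://pubs.example.org/record/101"))
```

Comparing URLs only catches the same citation sent twice from one repository. Recognising the same citing paper held in two different repositories would need a shared identifier, which this sketch does not attempt.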

16
Conclusions
  • CLADDIER supports the scientific process with
    federated repositories
  • This requires a cross-linked network of
    information objects, which needs to be stored,
    maintained and searched
  • Now doing some user testing
  • Tools and ideas relatively straightforward
  • Lots of gluing of existing components
  • Keep it simple so it will get used
  • http://claddier.badc.ac.uk/