Monica Duke, m.duke_at_ukoln.ac.uk, Project Manager, SageCite Project, http://blogs.ukoln.ac.uk/sagecite/ #sagecite. Developing Data Attribution and Citation Practices and ...

Transcript and Presenter's Notes

Title: UKOLN is supported by:


1
Monica Duke, m.duke_at_ukoln.ac.uk
Project Manager, SageCite Project
http://blogs.ukoln.ac.uk/sagecite/ #sagecite
Developing Data Attribution and Citation Practices and Standards: An International Symposium and Workshop, August 22-23, 2011
UKOLN is supported by:
2
  • Citation in the domain of disease network
    modelling
  • Funded August 2010 to July 2011

3
SageCite project overview
  • Review of data citation (issues, technology)
  • Understanding the domain
  • Sage Bionetworks partners in project
  • Site visit
  • Documenting processes (workflow tools)

4
SageCite project overview
  • Demonstrator
  • Adding support for data citation
  • Using DataCite services
  • Working with publishers
  • Benefits analysis (KRDS taxonomy)
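
As an illustration of what adding support for data citation could involve, the sketch below builds a citation string from a small metadata record. It loosely follows the creator, year, title, publisher, identifier pattern associated with DataCite. The record fields, the example values, and the DOI are hypothetical and are not taken from the SageCite demonstrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Minimal citation metadata for a dataset (hypothetical field names)."""
    creators: tuple
    year: int
    title: str
    publisher: str
    doi: str


def format_citation(record: DatasetRecord) -> str:
    """Render 'Creators (Year): Title. Publisher. DOI link' as plain text."""
    names = "; ".join(record.creators)
    return (
        f"{names} ({record.year}): {record.title}. "
        f"{record.publisher}. https://doi.org/{record.doi}"
    )


# Example values are invented for illustration only.
example = DatasetRecord(
    creators=("Doe, J.", "Roe, R."),
    year=2011,
    title="Example curated expression dataset",
    publisher="Example Repository",
    doi="10.1234/example-dataset",
)
print(format_citation(example))
```

Keeping the record separate from the formatter means the same metadata could feed a different citation style without being re-entered.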

5
(No Transcript)
6
www.sagebase.org
  • US-based non-profit organisation
  • Creating a resource for community-based,
    data-intensive biological discovery
  • Community-based analysis is required to build
    accurate models

7

9
Slide by Lara Mangravite, Sage Bionetworks
10
Sage data and processes
  • Idealised 7-stage process
  • A combination of phenotypic, genetic, and
    expression data are processed to determine a list
    of genes associated with diseases
  • Different people are responsible for different
    stages of the modelling process. One person
    oversees the whole process.

11
  • Stage 1: Data Curation
  • basic data validation to ensure integrity and
    completeness
  • datasets include microarray data and clinical
    data.  
  • ensures that the format of the data is understood
    and the required metadata is present.
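
The curation checks listed above (integrity, completeness, required metadata) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the required metadata fields and the use of a SHA-256 checksum recorded at deposit are hypothetical choices, not the checks Sage Bionetworks actually runs.

```python
import hashlib

# Hypothetical required fields; the slides do not list the actual ones.
REQUIRED_METADATA = ("title", "creator", "data_type")


def sha256_of(payload: bytes) -> str:
    """Hex SHA-256 digest of the dataset contents."""
    return hashlib.sha256(payload).hexdigest()


def validate_dataset(payload: bytes, metadata: dict, expected_sha256: str) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Completeness: the file must not be empty.
    if not payload:
        problems.append("dataset is empty")
    # Integrity: contents must match the checksum recorded at deposit.
    if sha256_of(payload) != expected_sha256:
        problems.append("checksum mismatch")
    # Metadata: every required field must be present and non-blank.
    for field in REQUIRED_METADATA:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing metadata field: {field}")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a curator see everything that needs fixing in one pass.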

12
(No Transcript)
13
Agreeing standards to support sharing
  • Derry J et al. Developing Predictive Molecular
    Maps of Human Disease through Community-based
    Modeling.
  • http://precedings.nature.com/documents/5883/version/1/files/npre20115883-1.pdf

14
Workflow capture using Taverna http://www.vimeo.com/27287109
  • Documenting data processes through workflow tools
  • supports better citation
  • makes the cited resource more re-usable
  • strengthens the reproducibility and validation
    of the research.

15
Data Citation Purposes
  • For attribution
  • Leading to credit and reward
  • For reproducibility
  • Supports validation, re-use
  • Eric Schadt at Sage Bionetworks Congress 2011
  • http://fora.tv/2011/04/16/Eric_Schadt_Map_Building
    (start at 4.28)

16
Open challenges: attribution
  • Preserving link with original data
  • Some discipline-based repositories have their own
    identifiers
  • Bi-directional links
  • Attributing data creators
  • including individuals?
  • Defining creation of new intellectual object e.g.
    curated dataset?
  • Cultural challenge in recognising non-standard
    contributions: microattribution
  • New metrics
  • Identification of contributors
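
One way to picture the bi-directional links mentioned above is a small registry that records each derivation once and can be queried in both directions. This is only a sketch: the class, its method names, and the identifiers in the usage example are hypothetical, and a real system would persist these links with the repository or identifier service.

```python
from collections import defaultdict


class LinkRegistry:
    """In-memory record of which datasets were derived from which sources."""

    def __init__(self):
        self._sources = defaultdict(set)   # derived id -> source ids
        self._derived = defaultdict(set)   # source id -> derived ids

    def record_derivation(self, derived_id: str, source_id: str) -> None:
        # Store the link in both directions so neither side loses it.
        self._sources[derived_id].add(source_id)
        self._derived[source_id].add(derived_id)

    def sources_of(self, identifier: str) -> list:
        """Original data that the given dataset was derived from."""
        return sorted(self._sources.get(identifier, ()))

    def derived_from(self, identifier: str) -> list:
        """Datasets that were derived from the given original data."""
        return sorted(self._derived.get(identifier, ()))


# Hypothetical identifiers: a repository-local ID and an invented DOI.
registry = LinkRegistry()
registry.record_derivation("doi:10.1234/curated-001", "repo:raw-001")
print(registry.sources_of("doi:10.1234/curated-001"))
print(registry.derived_from("repo:raw-001"))
```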

17
Open challenges: reproducibility
  • Identification and granularity
  • Discipline identifiers, global identifiers
  • How much value has been added since the data
    entered the workflow?
  • Identifying processes and software

18
Acknowledgements
  • University of Manchester
  • Carole Goble
  • Peter Li
  • British Library
  • Max Wilkinson
  • Tom Pollard
  • Sage Bionetworks
  • UKOLN
  • Liz Lyon
  • Monica Duke
  • Nature Genetics
  • Myles Axton
  • PLoS Comp Bio
  • Phil Bourne