Title: Bloomsbury Conference, London
1C21st Scholarship: Data as an Agent for Change
Dr Liz Lyon, Director, UKOLN, University of Bath, UK; Associate Director, UK Digital Curation Centre
3rd Bloomsbury Conference, London, June 2009
UKOLN is supported by
This work is licensed under a Creative Commons
Licence: Attribution-ShareAlike 2.0
2Perspectives
- The 21stC Scholar: Team Science in the Cloud
- Chemical Crystallography: Data Publishing Showcase
- The Future: a Transformational Agenda
3The 21stC Scholar: Team Science in the Cloud
http://www.flickr.com/photos/wwarby/3632317031/
4What does the C21st research(er) look like?
- From users to choosers (Yanosky)
- Pro-sumers (Toffler)
- Digital nomads
- Work on the Webtop
http://www.flickr.com/photos/shankrad/2905938179/
- Multi-scale complex
- Highly data-intensive
- Increasingly open
http://www.flickr.com/photos/stormsriver/2286011597/
5Continuum of Openness?
OPEN
CLOSED
7What do we mean by Team Science?
- Science as a social activity
- Tweet
- Blog
- Comment
- Rate
- Vote
- Recommend
- Tag
- Share
- Mash
- Trust is key
- Inter-institutional collaboration leads to better science
(Brian Uzzi, 2008)
- Highly collaborative
- Multi-disciplinary
- Core team skills
8A new digital economy?
- Data is
- On demand
- A utility
- Commoditised
- Un-differentiated
- Publish then filter (Shirky)
- Traded
- Cloud model?
- Brokers & aggregators are key roles
- Free, pay per use, pay as you grow ...
http://www.flickr.com/photos/will-lion/2738252562/
- Economies of scale
- Network effects
- New data publishing business models
9Chemical Crystallography: Data Publishing Showcase
http://www.flickr.com/photos/thomasreichart/2130018485/sizes/l/
10Data Deluge
Slide: Dr Simon Coles, Univ Soton
40 years ago a PhD student would determine about
3 crystal structures for their thesis; this can
now easily be achieved in a day
0.5 million
Few thousand
2.5 million
35 million
A bottleneck: the primary cause is the current
data publication process, which is tied to
journal articles and peer review
11Multi-scale from Diamond Light Source ..
12..to the Laboratory bench
13eCrystals Team
- Domain (Chemists): Simon Coles, Mike Hursthouse, Jeremy Frey,
Cameron Neylon, Andrew Milsted, Richard
Stephenson, Jamie Robinson, Steven Wilson, Andrew
Bailey, Mark Borkum
- Computer science: Dave DeRoure, Les Carr, Monica Schraefel, Chris
Gutteridge, Tim Myles-Board, Arouna Woukei, Dave
Tarrant, Stuart Middleton
- Informatics: Liz Lyon, Manjula Patel, Rachel Heery, Monica
Duke, Michael Day, Traugott Koch, Pete Cliff
14eCrystals Data Repository
- Quick & simple to deposit
- Software tools
- Laboratory archive
- Community involvement
- Embargo facility
- Structured foundations
- Discoverable & harvestable
http://ecrystals.chem.soton.ac.uk
15Data sustainability
- Trust
- Standards
- Audit and certification tools
- TRAC
- DRAMBORA
- PLATTER
- NESTOR
- Data Seal of Approval
- eCrystals Curation Reports (3)
- Preservation metadata
- PREMIS Data Dictionary
- OAIS
- Representation Information
- Registry/Repository RRORI
16Data Discovery & Access
- Community Criteria for Interoperability (Scaling Up Report 2008)
- Domain data format standard: CIF
- Domain data validation standard: CheckCIF
- Metadata schema: eCrystals Application Profile
- http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
- Crystallography Data Commons
- TIDCC Data Model in development
- Embargo & Rights: http://ecrystals.chem.soton.ac.uk/rights.html
- Domain identifier: International Chemical Identifier
- Citation & linking: DOI http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
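The citation link on this slide is a DOI appended to a resolver address. As a minimal sketch of that mapping (the helper name `doi_to_url` is hypothetical and not part of eCrystals; the resolver prefix is simply the one shown on the slide), a bare DOI can be turned into a citable link like this:

```python
from urllib.parse import quote

# Resolver prefix as it appears on the slide; treated here as an assumption.
RESOLVER = "http://dx.doi.org/"


def doi_to_url(doi: str, resolver: str = RESOLVER) -> str:
    """Return a resolver URL for a DOI (hypothetical helper for illustration)."""
    doi = doi.strip()
    # Tolerate the common "doi:" label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # A DOI has the form "10.<registrant>/<suffix>".
    prefix, sep, suffix = doi.partition("/")
    if not prefix.startswith("10.") or not sep or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unsafe characters but keep "/" as a path separator.
    return resolver + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("10.1594/ecrystals.chem.soton.ac.uk/145"))
```

The point of the design is that the identifier, not the web address, is what gets cited: the resolver prefix can change without invalidating the DOI itself.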
17Paris, March 2009
18Memorandum of Understanding
19Dr Simon Coles, Univ Southampton
http://wiki.ecrystals.chem.soton.ac.uk/index.php/Main_Page
21Slide of data services: CrystalEye, Crystal Web,
Chemxseer etc.; search structures; check PMR stuff;
aggregate, syndicate, filter etc.
New Web service to aggregate published
crystallography data...
22... federated search.....
23structure search...
24Data casts & Lab Blogs
Original slide: Dr Simon Coles, Univ Soton
Tools
Machines
Sensors
25Publishing and sharing methodologies ...
26... and workflows ...
... data for re-use, mash-ups, mining,
computation, models, simulations ...
27Slide: Dr Simon Coles, Univ Soton
28The Future: a Transformational Agenda?
http://www.flickr.com/photos/cyber_chof/1246303241/sizes/m/
29- We need to understand the value and benefits of
data publishing and associated data curation /
management ... and articulate them clearly
- Values & benefits may be:
- political
- economic
- societal...
- DCC Research Data Management Forum 3
- Some issues and challenges.....
301. Research quality
- Publications based on closed peer review
- Maintain reputation
- Demonstrate provenance
- Open pilots Nature
- Use collective intelligence
- Ratings, polls, recommender systems
- Data publishing policy?
312. Research sustainability
- Ensure curation & preservation of the long-term
scientific record, including the data
- Requires significant investment in infrastructure
- Assure data security
- Demonstrate resilience & robustness
- Establish trust
- New business models
- Understand full costs
323. Research capacity & capability
- Multi-disciplinary team
- Hybrid skills
- New field - data informatics
- New roles for information professionals?
33IJDC 2009 (in press)
34Take homes
- Team science is a social activity
- We need to advocate the value & benefits of data
publishing
- Data informatics underpins C21st scholarship
35Moving to Multi-Scale Science: Managing
Complexity & Diversity
Thank you. Slides will be available at
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html
http://www.dcc.ac.uk/