Title: My perception of the fundamental problem
1 My perception of the fundamental problem
- Data are not sexy
- PIs readily sign up to the need to deliver data
- But the published paper takes priority
- and then the next proposal takes priority
- Data managers are viewed as geeks
- who can only cajole, not enforce, data delivery
2 My perception of fundamental questions
- How can we raise the profile of data management?
- How can we persuade scientists to take it really seriously?
- NB carrot not stick, e.g.
- Funding specifically for data management
- Show PIs how it can help them
3 Background - some of the GEOTRACES recommendations
- Appoint a Data Liaison Officer (DLO) at the IPO
- Appoint a Data Specialist on each Project to be responsible for data
- Agree time scales for data submission and release
- Use CCHDO for CTD and bottle data
- Involve data management professionals in all GEOTRACES data activities from the start
4 DLO responsibilities
- Maintain a list of IMBER projects
- Keep track of project metadata
- Maintain a catalogue of actual and expected data sets (DIF) or equivalent discovery metadata records (ISO 19115 standard)
- Ensure that standardized parameter descriptions are adopted (e.g. the BODC parameter usage vocabulary)
- Ensure that protocols for naming of cruises, station positions, etc., adhere to a rule system developed by the DMC
- Interact with DAC(s) to coordinate their activities and interactions with PIs
- In particular, ensure timely delivery of metadata and actual data to the DAC(s)
- Contribute to (and possibly maintain) the project web-site
The DLO will be an ex officio member of the DMC, and will report to the Director of the IPO and to the DMC
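As an illustration only, the catalogue of actual and expected data sets could be tracked along the following lines. This is a minimal sketch, not an IMBER or GEOTRACES tool: the `DatasetEntry` fields, the `overdue` helper, and the sample projects and PIs are all hypothetical, and a real catalogue would hold DIF or ISO 19115 records rather than this simplified structure.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DatasetEntry:
    """One catalogue row (hypothetical fields, not a DIF or ISO 19115 record)."""
    project: str
    title: str
    pi: str
    due: date                         # agreed submission date
    delivered: Optional[date] = None  # None means still expected


def overdue(catalogue: List[DatasetEntry], today: date) -> List[DatasetEntry]:
    """Return entries that are past their due date and not yet delivered."""
    return [e for e in catalogue if e.delivered is None and e.due < today]


# Invented example entries, for illustration only.
catalogue = [
    DatasetEntry("Project A", "CTD profiles", "PI One",
                 due=date(2007, 1, 31), delivered=date(2007, 1, 15)),
    DatasetEntry("Project A", "Bottle nutrients", "PI Two",
                 due=date(2007, 1, 31)),
    DatasetEntry("Project B", "Zooplankton counts", "PI Three",
                 due=date(2007, 12, 31)),
]

for entry in overdue(catalogue, today=date(2007, 6, 1)):
    print(f"{entry.project}: '{entry.title}' overdue, contact {entry.pi}")
```

A list like this gives the DLO a simple basis for reminding PIs, in keeping with the emphasis on cajoling rather than enforcing.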
5 Data Specialist
- Ensure that suitable log sheets have been provided for all activities
- Assist and support scientists in preparation of metadata
- Maintain regular checks that all logs are being correctly completed
- Assemble all metadata from a section or process study
- Assist with preparation of data files, ensuring that all necessary parameters are included
- Evaluate the quality of data, either by personal expertise or by discussion with PIs, and help to document quality and missing or suspect data
- Facilitate assembly of shipboard data sets and data integrity checks
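The checks on included parameters and missing data could, for example, be automated roughly as follows. This is a sketch under stated assumptions, not a prescribed procedure: the required parameter names, the -999 missing-value sentinel, and the `check_records` function are invented for the example.

```python
# Hypothetical required parameters and missing-value sentinel (assumptions).
REQUIRED = ("station", "depth_m", "temperature_c", "salinity")
MISSING = -999.0


def check_records(records):
    """Return (row_index, problem) pairs for a list of dict rows."""
    problems = []
    for i, row in enumerate(records):
        for name in REQUIRED:
            if name not in row:
                # The parameter column is absent from this row altogether.
                problems.append((i, f"parameter '{name}' not included"))
            elif row[name] is None or row[name] == MISSING:
                # The parameter is present but holds no usable value.
                problems.append((i, f"parameter '{name}' missing"))
    return problems


# Invented sample rows, for illustration only.
bottle_data = [
    {"station": "S1", "depth_m": 10.0, "temperature_c": 12.3, "salinity": 35.1},
    {"station": "S1", "depth_m": 50.0, "temperature_c": -999.0, "salinity": 35.2},
    {"station": "S2", "depth_m": 10.0, "temperature_c": 11.8},
]

for index, problem in check_records(bottle_data):
    print(f"row {index}: {problem}")
```

Judging whether a value is suspect still needs the expertise of the Data Specialist or the PI; a script of this kind only catches the mechanical gaps.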
6 Data Management for IMBER
- IMBER is a relatively young project
- But has taken early initiatives on data management
- Sophie Beauvais appointed as Data Liaison Officer (DLO) in May 2006
- Data Management Committee (DMC) appointed recently
- DMC discussions by email/blog? Getting off to a slow start - need to meet face to face
7 Data Management Committee
- Observationalists
- Raymond Pollard, Jay Cullen
- Modellers (data users)
- Wilco Hazeleger, Reiner Schlitzer
- Data specialists
- Todd O'Brien, Gwen Moncoiffe, Toru Suzuki
Balanced to improve communication and mutual
understanding
8 Data v metadata
- IMBER manages no projects so strictly owns no data
- IMBER can encourage but not enforce improved standards for both metadata and data
- Should DMC emphasize the minimum (basic metadata) or the maximum (streamlined access to data)?
- Answer: both
9 Metadata
- One view says it is hard enough to get decent metadata, so if we can achieve that it will be a major step forward
- Cruise Summary Report (CSR) - can we persuade all nations/labs to complete them (USA included)?
- Create DIFs at GCMD
10 Data
- Another view is that we must aim for accessible, high quality end data, arguing that nothing is more irritating than going through a dozen metadata links only to end up at "data - contact PI" - and of course the PI never answers
- Possible ways to achieve this
- Seamless access to widely distributed data
- Relatively small number of specialist data centres
- Both cost serious money
11 Specialist data centres (by which I mean, specializing in a particular type of data)
- CCHDO (CLIVAR and Carbon Hydrographic Data Office), at Scripps, formerly WHPO (WOCE Hydrographic Programme Office)
- COPEPOD, the global plankton database
- Are these the way forward?
12 Why advocate specialist data centres?
- To gather in data from individuals to central archives
- To have the ability to quality check a particular kind of data
- To have specialist experience to help an individual PI
Useless without long term funding
13 Improving communication
- A major goal must be to get scientists and data managers talking to each other.
- Carrot, not stick. How can data specialists HELP scientists?
- Back up PIs' data at an early stage - security
- Help with calibration
- Help with validation
- Long term archive
- Answer requests for data
14 Summary - points for discussion
- Raise profile of data management
- Improve communication (meeting, blogs)
- Carrots (funding, recognition, support)
- Specialist data centres (cf WOCE)
- Metadata standards (CSR, DIF)
- Adequate manpower