Title: Data Management Considerations for the International Polar Year
1. Data Management Considerations for the International Polar Year
World Data Center for Glaciology, Boulder
Facilitating the international exchange of snow and ice data
- Mark A. Parsons, Ronald L. Weaver, Ruth Duerr, and Roger G. Barry
American Geophysical Union, San Francisco, California, 14 December 2004
3. What will IPY4 bring?
- Will you be able to find all the data relevant to your research and see relationships between data sets?
- Will you be able to retrieve IPY4 data in 2050?
- Will you be able to merge and integrate different data sets across experiments and disciplines?
- Will you be able to subset, visualize, and transform your data?
- etc.
4. Organization of IPY Data Management
- Data Policy and Management Subcommittee
- scientists
- data managers
- funding agencies
IPY Joint Committee
eGY
Programme Office
Data and Information Service
Users
Projects
Data Centers, Virtual Observatories, etc.
5. Systems and Innovation
[Chart: share of IT projects that succeeded, were challenged, or failed]
The Standish Group's CHAOS report: an assessment of 40,000 IT application projects
7. The People Part
"A striking proportion of project difficulties stem from people in both customer and supplier organisations failing to implement known best practice."
- Oxford University/Computer Weekly survey of public and private sector IT projects (emphasis added)
However, people are much more able to adapt to change, uncertainty, and messy systems.
8. The People Part: Science and Data Management
- Many have stated the need to involve scientists in data management, but
- It is also important to involve data managers in conducting science.
- Field Experiments
- 20% increase in data quality (Parsons et al., 2004)
- 70% of experiment cost is data collection (Longley et al., 2001)
- Observing systems
9. Preservation and Access: Two Peas in a Pod
- Scientific Data Stewardship
- "preservation and responsive supply of reliable and comprehensive data, products, and information for use in building new knowledge to ..." - USGCRP, 1998
- "the long-term preservation of the scientific integrity, monitoring and improving the quality, and the extraction of further knowledge from the data" - H. Diamond et al., NOAA/NESDIS, 2003
10. Access. What is it?
- Preservation requirements are well defined in the Open Archival Information System (OAIS) Reference Model, but
- No similar model exists for access requirements (eGY could help)
- Not even a common definition of access and what restricts it
- Unique access requirements for social science data and non-digital collections (physical samples, photographs, audio, etc.)
11. Documentation
- Use existing standards, e.g.
- ISO 19115 metadata standard
- OAIS Reference Model
- Describe uncertainty
- Challenge your assumptions
"We must not start from any and every accepted opinion, but only from those we have defined: those accepted by our judges or by those whose authority they recognize."
- Aristotle, c. 350 BC
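One way to picture the advice to describe uncertainty is a small metadata record that cannot stay silent about it. The sketch below is only an illustration: the class and field names (`DatasetRecord`, `VariableDescription`, `uncertainty_basis`) and all sample values are hypothetical, and they are not element names taken from ISO 19115 or the OAIS Reference Model.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class VariableDescription:
    name: str
    units: str
    uncertainty: Optional[float] = None   # e.g. one standard deviation, in `units`
    uncertainty_basis: str = ""           # how the figure was estimated


@dataclass
class DatasetRecord:
    title: str
    abstract: str
    start_date: str                       # ISO 8601 date string
    end_date: str
    variables: List[VariableDescription] = field(default_factory=list)


def undocumented_uncertainty(record: DatasetRecord) -> List[str]:
    """Return the names of variables whose uncertainty is missing or unexplained."""
    return [
        v.name
        for v in record.variables
        if v.uncertainty is None or not v.uncertainty_basis.strip()
    ]


# Hypothetical example values, invented for illustration only.
record = DatasetRecord(
    title="Example snow transect",
    abstract="Snow depth and density sampled along a hypothetical transect.",
    start_date="2007-03-01",
    end_date="2007-03-31",
    variables=[
        VariableDescription("snow_depth", "m", 0.02, "repeat probe measurements"),
        VariableDescription("snow_density", "kg m-3"),  # uncertainty not yet described
    ],
)

print(json.dumps(asdict(record), indent=2))
print("Needs an uncertainty description:", undocumented_uncertainty(record))
```

A check like `undocumented_uncertainty` could run before a data set is accepted into an archive, so that gaps in the documentation are caught while the people who collected the data can still fill them.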
12. The Data Itself
[Graphic: a block of binary digits]
- Formats
- Archives and users may have different needs
- Consider four themes (Raymond, 2004)
- Transparency
- Interoperability
- Extensibility
- Storage or transaction economy
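The tension between transparency and storage economy can be made concrete with a small sketch. It encodes the same readings twice: once as self-describing text and once as fixed-width binary. The station numbers, depths, column names, and the binary record layout are all assumptions made up for this illustration.

```python
import struct

# Hypothetical readings: (station id, snow depth in metres).
readings = [(1, 0.52), (2, 0.47), (3, 0.61)]

# Transparent form: plain text that carries its own column names.
text_form = "station_id,snow_depth_m\n" + "".join(
    f"{station},{depth:.2f}\n" for station, depth in readings
)
text_bytes = text_form.encode("ascii")

# Economical form: little-endian unsigned 16-bit id followed by a 32-bit float.
RECORD = struct.Struct("<Hf")
binary_bytes = b"".join(RECORD.pack(station, depth) for station, depth in readings)


def decode_binary(blob: bytes):
    """Decode the packed records; only possible if the layout is known."""
    return [
        RECORD.unpack_from(blob, offset)
        for offset in range(0, len(blob), RECORD.size)
    ]


print(f"text:   {len(text_bytes)} bytes")
print(f"binary: {len(binary_bytes)} bytes")
print(decode_binary(binary_bytes))
```

The binary form is smaller, but it cannot be read unless the layout (`<Hf`) is documented somewhere, whereas the text form explains itself. An archive and a user may reasonably weigh that trade-off differently.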
13. Data Management Considerations or Themes
- Manage technical innovation
- Systems need people
- Scientists and data managers working together
- Preservation and Access: Two peas in a pod
- The nature of the documentation
- The nature of the data
14. Data Management Principles (bumper stickers)
Preservation without access is pointless; access without preservation is impossible.
It's about DATA, not systems
Involve scientists in data management and data managers in science
Think about long-term archiving NOW!
Document uncertainty!
Keep things simple and flexible
Consider the needs of current, future, and unknown users
15. What's Next?
- The Data and Information Service should be created soon.
- The Data Sub-Committee needs to consider these themes and principles when developing the IPY data policy.