Title: Research Data Access and Preservation Summit
1. Research Data Access and Preservation Summit
Panel 2 - Promoting Re-Use of Scientific Collections
Some responses to the questions posed...
John Harrison, SHAMAN Project, University of Liverpool
john.harrison_at_liverpool.ac.uk
2. How do you handle organization of collections today?
- We created a highly structured hierarchy of directories within our storage system (currently iRODS)
- Allows logical separation, but also association, of:
  - Collection data
  - Supporting documentation (context, provenance)
  - System
  - Policies
  - Software code
  - Configurations, Workflows
  - Discovery mechanisms (indexes)
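As a rough illustration of the kind of hierarchy described above, the sketch below builds a comparable directory tree on a local filesystem. It is not the SHAMAN project's actual layout and does not use any iRODS API: the directory names, the grouping of items under "system", and the function build_collection are all assumptions made for the example.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: these directory names and their grouping are
# illustrative assumptions, not the actual structure used in iRODS.
LAYOUT = {
    "data": [],
    "documentation": ["context", "provenance"],
    "system": ["policies", "software", "configurations", "workflows"],
    "discovery": ["indexes"],
}


def build_collection(root: Path, name: str) -> list[Path]:
    """Create one collection directory tree and return the directories made."""
    created = []
    base = root / name
    for top, children in LAYOUT.items():
        top_dir = base / top
        top_dir.mkdir(parents=True, exist_ok=True)
        created.append(top_dir)
        for child in children:
            sub = top_dir / child
            sub.mkdir(exist_ok=True)
            created.append(sub)
    return created


if __name__ == "__main__":
    # Build the tree in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for path in build_collection(Path(tmp), "example_collection"):
            print(path.relative_to(tmp))
```

Keeping data, documentation, system material and indexes in sibling directories under one collection root keeps them logically separate while the shared parent records that they belong together.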
3. What are the biggest issues with building collections for new communities?
- Scalability: the quantity of data is increasing rapidly
- More important to select and prioritize data with the most potential to be useful to future generations.
- Mechanisms for identifying useful items in large reference collections become more important.
4. When new communities access existing data collections, what new access capabilities are required?
- It's difficult to generalize; it depends a great deal on the expectations of the community in question.
- Viewing the data will be essential for all communities
- One important aspect of our approach has been to develop a display technology, independent of the originating application
- Emulation, but with a layer of abstraction from the operating system (Java Virtual Machine)
- Provides a platform for development of new and unforeseen capabilities for interaction with legacy (potentially obsolete) file formats.
5. What level of description is required to meet the expectations of new communities?
- Impossible to say for certain. Expectations evolve as technology develops.
- Best we can do:
  - Rigidly adhere to the most stringent and well-documented standards of today.
  - Preserve the means for future generations to interpret these descriptions by preserving documentation on the standard:
    - Tag libraries, schemas for XML
    - Ontologies
6. Is long-term sustainability enabled through re-purposing of collections?
- Theoretically, yes; only time will tell for sure
- Best chance of achieving sustainability is by using open standards to describe:
  - Digital objects, their structure and associations
  - Metadata (digital objects and the archive as a whole)
  - Data management policies and processes
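To make the idea of describing a digital object, its structure and its associations more concrete, here is a minimal sketch that writes such a description as XML using Python's standard library. The element and attribute names (digitalObject, structure, file, associations, relatedTo) and the function describe_object are hypothetical and do not follow any particular standard; a real archive would use a published, documented schema.

```python
import xml.etree.ElementTree as ET


def describe_object(object_id: str, files: list[str], related: list[str]) -> str:
    """Return an XML description of one digital object.

    Element and attribute names are hypothetical, chosen for illustration
    only; they do not follow any published metadata standard.
    """
    root = ET.Element("digitalObject", {"id": object_id})

    # Structure: the files that make up the object.
    structure = ET.SubElement(root, "structure")
    for name in files:
        ET.SubElement(structure, "file", {"name": name})

    # Associations: references to other objects in the archive.
    associations = ET.SubElement(root, "associations")
    for target in related:
        ET.SubElement(associations, "relatedTo", {"ref": target})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(describe_object("obj-001", ["results.csv", "readme.txt"], ["obj-002"]))
```

Because the description is plain XML, it can be read back with any conforming parser, provided the documentation for the schema is preserved alongside it.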
7. Are there other driving purposes behind promoting re-use of collections?
- Data may provide insights into unforeseen areas.
  - e.g. results of drug trials might inform future drug development in the pharmaceutical industry. In such a highly regulated industry, the ability to get back to raw data to ensure authenticity is very important!
8. Which institutions can be approached for sustaining re-purposed collections?
- So far, it seems to be mainly memory institutions that are looking at issues of digital preservation (Libraries, Archives, Museums)
- Anyone with significant data should be thinking about issues surrounding preservation of their knowledge/information assets.
- In the future, funding bids should consider the costs of preserving the results of their research.
- I think inevitably many organizations will end up out-sourcing digital preservation.