The eXtensible Past - PowerPoint PPT Presentation

About This Presentation
Title:

The eXtensible Past

Description:

Exploring the possibilities of XML and OAI (Open Archives Initiative) for ... XML is platform independent. XML integrates Data, Metadata and Structure. XML a hype? ... – PowerPoint PPT presentation

Slides: 19
Provided by: Anne246

Transcript and Presenter's Notes

Title: The eXtensible Past


1
The eXtensible Past
  • XML As a Means for Easy Access to Historical
    Research Data and a Strategy for Digital
    Preservation

2
NHDA
  • Registered research
  • Deposited Datasets
  • Archiving, Dissemination & Publishing
  • No restrictions
  • Restricted access

3
Reasons for starting X-past
  • Problems with ASCII preservation format
  • HTML-datasets
  • Need for decentralized repositories
  • Making data searchable on the Web
  • Data-publishing increases demand for data.
  • Improve reusability of datasets

4
X-past Project principles
  • To investigate XML as a possible new strategy for
    the long-term preservation of research data
  • Exploring the possibilities of XML and OAI (Open
    Archives Initiative) for providing better access
    to and sharing of digital data collections by
    researchers

5
XML for long-term preservation
  • XML is human readable
  • XML is platform independent
  • XML integrates Data, Metadata and Structure
  • XML a hype?

6
ASCII fixed format representation and codebook.
  • DATA
  • 26502 Martensz Matheeus Kruidenier Antwerpen 19-05-1586 B 35
  • CODEBOOK
  • Name | Explanation | Sort | Position
  • PersID | Personal identification number | integer | 1-5
  • Family name | Surname of person | text | 6-25
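
As a minimal sketch of how such a codebook drives the reading of a fixed-format record, the following Python slices a line by column position. Only the two codebook entries shown on the slide are used; the field identifiers and the space padding of the sample line are assumptions, since the transcript does not preserve the original column alignment.

```python
# Codebook entries from the slide: (field name, first column, last column),
# with positions 1-based and inclusive. The identifiers are illustrative.
CODEBOOK = [
    ("persID", 1, 5),
    ("family_name", 6, 25),
]


def parse_fixed_width(line, codebook):
    """Slice one fixed-width record into a dict using codebook positions."""
    record = {}
    for name, start, end in codebook:
        # Codebook positions are 1-based inclusive; Python slices are
        # 0-based and end-exclusive, hence start - 1 and end.
        record[name] = line[start - 1:end].strip()
    return record


# Assumption: the surname is padded with spaces to fill columns 6-25.
sample_line = "26502" + "Martensz".ljust(20) + "Matheeus"
print(parse_fixed_width(sample_line, CODEBOOK))
```

Without the codebook the line cannot be split reliably, which is the dependency that the XML representation on the next slide removes.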

7
Record in XML-representation
  <record>
    <row>
      <persID>26502</persID>
      <family_name>Martensz</family_name>
      <first_name>Matheeus</first_name>
      <profession>Kruidenier</profession>
      <origin>Antwerpen</origin>
      <date_of_entry>19-05-1586</date_of_entry>
      <entry_number>B</entry_number>
      <entry_page>35</entry_page>
    </row>
  </record>
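
Because every value carries its own field name, a record like this one can be read with any standard XML parser and no separate codebook. A short sketch using Python's standard library, with the record from the slide embedded as a string:

```python
import xml.etree.ElementTree as ET

# The record from the slide, written out as well-formed XML.
XML_RECORD = """<record>
  <row>
    <persID>26502</persID>
    <family_name>Martensz</family_name>
    <first_name>Matheeus</first_name>
    <profession>Kruidenier</profession>
    <origin>Antwerpen</origin>
    <date_of_entry>19-05-1586</date_of_entry>
    <entry_number>B</entry_number>
    <entry_page>35</entry_page>
  </row>
</record>"""

root = ET.fromstring(XML_RECORD)
row = root.find("row")

# Each child element is one field: the tag is the field name, the text its value.
fields = {child.tag: child.text for child in row}
print(fields["family_name"], fields["date_of_entry"])
```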

8
Providing better access
  • Data searchable on the web
  • Customizable data presentation using XSLT
  • Downloads available in XML or original format
  • Integration with other Metadata repositories
    possible (OAI)
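
To illustrate the OAI integration point: the OAI Protocol for Metadata Harvesting (OAI-PMH) exposes a repository through plain HTTP requests with a `verb` parameter. The sketch below only builds such request URLs. The base URL is a hypothetical placeholder and not the X-past endpoint; `Identify`, `ListRecords` and the `oai_dc` metadata prefix are defined by the protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used for illustration only.
BASE_URL = "https://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask the repository to describe itself.
print(oai_request_url(BASE_URL, "Identify"))
# Harvest all records as unqualified Dublin Core.
print(oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would fetch these URLs and parse the XML responses; that step is left out here because it depends on the actual repository.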

9
OAI Overview
10
What have we built?
  • Repository Server
  • Management Application
  • Access Portal

[Diagram: CMS, Repository Server, Manager, Default Portal, Other Application]
11
System Architecture
  • Modular
  • Extensible

12
Automatic Conversion
  • Archivist uploads dataset
  • Intermediate format (CSV)
  • Original Format (DBASE, ASCII)
  • More formats to come
  • System automatically converts format to XML
  • Creates searchable index on material
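
The conversion step can be sketched as follows, assuming a CSV intermediate file with a header row whose column names are valid XML element names. The function names and the very simple value index are illustrative assumptions, not the X-past implementation.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def csv_to_xml(csv_text):
    """Convert CSV text with a header row into <record> with one <row> per line."""
    record = ET.Element("record")
    for values in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(record, "row")
        for field, value in values.items():
            # Assumption: each header is usable as an XML element name.
            ET.SubElement(row, field).text = value
    return record


def build_index(record):
    """Map each lower-cased field value to the positions of rows containing it."""
    index = defaultdict(set)
    for position, row in enumerate(record):
        for field in row:
            if field.text:
                index[field.text.lower()].add(position)
    return index


CSV_TEXT = "persID,family_name,origin\n26502,Martensz,Antwerpen\n"
record = csv_to_xml(CSV_TEXT)
index = build_index(record)

print(ET.tostring(record, encoding="unicode"))
print(sorted(index["antwerpen"]))
```

A production index would tokenize text and handle large files incrementally; the dictionary here only shows the idea of making converted values searchable.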

13
Current Issues
  • Carried out prototype validation with different
    user groups
  • Difference between metadata search and dataset
    search is not clear
  • Downloading XML files can take a while
  • Navigation of search results must be improved

14
XML-issues
  • Only applicable for text-oriented datasets
  • Limited human-readability for binary content
    (images)
  • Problems arise with more complex datasets
  • Loss of functionality
  • Loss of context

15
Loss of context
  • Data is converted from original format to XML
  • Information can be available in other parts of
    the dataset than just the raw data
  • Field names / labels
  • Queries generating reports

16
Conclusions
  • XML is human readable (text, databases,
    spreadsheets)
  • An XML schema like DDI makes integration of Data,
    Metadata & Codebooks possible
  • XSLT makes rapid Customized Data-publishing
    possible.
  • Reusability improves with XML.

17
Conclusions II
  • OAI makes distributed Repository Access possible
  • X-past has a flexible modular Architecture

18
Follow-up Project XARA
  • Should address the following issues
  • Loss of functionality
  • Related tables: XLink?
  • Stored procedures
  • Loss of context
  • Developing customized websites of datasets.
  • Using DDI-schema for integrated metadata-data
    presentation.