Title: Metadata requirements for archiving structured data
1. Metadata requirements for archiving structured data
- Alice Born, Statistics Canada
- Joint UNECE/Eurostat/OECD Work Session on Statistical Metadata (9-11 April 2008)
2. IMDB business model: archive focus
Survey
Applications/Software
Datafiles in archive repository
Survey instance
Frame and sample
Datasets
Questionnaire
Products (COR)
Data elements
Survey design
Value domains
3. Outline
- Overview of the integrated metadatabase (IMDB) in the survey life cycle
- Archive and disposal processes
- Metadata requirements for archiving
- Registration
4. IMDB in the survey life cycle
Data Warehouses
Operations Management
Quality Assurance
Analysis
Dissemination
IMDB
IMDB
Metadata
Collect
Edit
Estimate
Tabulate
Publish
Design
Archive disposal
Operational Data
Registers
Survey Data
Administrative Data
Operational Data Stores
5. Processes for archiving and disposal
- 8. Archiving and disposal
- 8.1 Manage archive repository
- Define archival format, format data, load data, record event dates, links to metadata registry (IMDB)
- 8.2 Preserve data and associated metadata
- Identify data/metadata, record archive request, transfer data/metadata, etc.
- 8.3 Dispose data and associated metadata
- Identify data to dispose, remove from archive, etc.
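The sub-processes 8.1 to 8.3 can be pictured as dated events recorded against an archived datafile. The following is a minimal sketch only, assuming a hypothetical in-memory `ArchiveRepository`; the class names, method names, identifiers, and the order of the events are illustrative assumptions and do not describe the actual IMDB or archive repository.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchivedFile:
    """One datafile tracked by a hypothetical archive repository."""
    datafile_id: str                 # unique datafile identifier (illustrative)
    imdb_record: str                 # link to the metadata registry record (illustrative)
    archival_format: str = ""        # set when the data is formatted and loaded
    events: list = field(default_factory=list)  # (event name, event date) pairs


class ArchiveRepository:
    """Toy repository that records event dates for steps 8.1 to 8.3."""

    def __init__(self):
        self._files = {}

    def request_archive(self, datafile_id, imdb_record, when):
        # 8.2: identify the data/metadata and record the archive request
        entry = ArchivedFile(datafile_id, imdb_record)
        entry.events.append(("archive_requested", when))
        self._files[datafile_id] = entry
        return entry

    def load(self, datafile_id, archival_format, when):
        # 8.1: format and load the data, and record the event date
        entry = self._files[datafile_id]
        entry.archival_format = archival_format
        entry.events.append(("loaded", when))

    def dispose(self, datafile_id, when):
        # 8.3: remove the data from the archive, returning its event history
        entry = self._files.pop(datafile_id)
        entry.events.append(("disposed", when))
        return entry

    def __contains__(self, datafile_id):
        return datafile_id in self._files


repo = ArchiveRepository()
repo.request_archive("DF-001", "IMDB-EXAMPLE", date(2008, 4, 9))
repo.load("DF-001", "CSV", date(2008, 4, 10))
disposed = repo.dispose("DF-001", date(2008, 4, 11))
event_names = [name for name, _ in disposed.events]
```

Keeping the event history on the returned entry after disposal reflects the idea that metadata about a disposal can outlive the data itself; whether the real repository works this way is not stated in the slides.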
6. Types of metadata for archiving structured data (1)
- 1. Administrative metadata
- Maintaining and keeping track of the archived datafile
- Link to an IMDB record: SDDS or unique datafile identifier
- 2. Structural metadata
- Name of the datafile
- File format, record layout
- Software
- Storage media, location
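As an illustration of how the administrative and structural items listed on this slide might be held together for one archived datafile, here is a minimal sketch. The field names, the example values, and the rule that either an IMDB record link or a datafile identifier is enough are assumptions made for the example (one reading of the slide), not a description of the IMDB schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AdministrativeMetadata:
    # Either identifier may serve as the link (assumed reading of the slide)
    imdb_record_id: Optional[str] = None
    datafile_id: Optional[str] = None


@dataclass
class StructuralMetadata:
    datafile_name: str = ""
    file_format: str = ""
    record_layout: str = ""
    software: str = ""
    storage_media: str = ""
    location: str = ""


def missing_items(admin, structural):
    """Return the names of metadata items that are still empty."""
    missing = []
    if not (admin.imdb_record_id or admin.datafile_id):
        missing.append("imdb_record_id or datafile_id")
    # Structural fields are checked in declaration order
    for f in fields(structural):
        if not getattr(structural, f.name):
            missing.append(f.name)
    return missing


admin = AdministrativeMetadata(datafile_id="DF-001")
structural = StructuralMetadata(
    datafile_name="survey_2008.dat",
    file_format="fixed-width text",
    record_layout="layout_v1.txt",
    storage_media="disk",
)
gaps = missing_items(admin, structural)
```

A check like `missing_items` could be run before a file is accepted into an archive, so that a datafile is never stored without the information needed to read it later.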
7. Types of metadata for archiving structured data (2)
- 3. Survey and definitional metadata
- Already in the IMDB
- Survey description
- Questionnaires, questions, response choices
- Methodology
- Measures of data quality
- Variables (ISO 11179)
- Additional documentation: interviewer guide, coding tool
8. Administrative layer
Statistical Activity
Organization
Survey
Contact
Stewardship
Universe
Documentation
Frame
Identification
Survey instance
Identification
Time Frame
Instrument
Keyword
Question
Classification
Theme
Data file
Methodology
Data Element
Instrument design, Sampling, Data source, Error detection, Imputation, Estimation, Quality evaluation, Disclosure control, Revisions and seasonal adjustment, Data accuracy
Data Element Concept
Administered items
Object Class
Property
Formula
Conceptual Domain
Value Domain
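The Data Element, Data Element Concept, Object Class, Property, Conceptual Domain, and Value Domain boxes in this diagram follow the ISO/IEC 11179 way of describing a variable. A minimal sketch of how those pieces fit together is given below; the class layout, the `is_consistent` and `accepts` helpers, and the example values are assumptions for illustration, not the IMDB data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualDomain:
    name: str


@dataclass(frozen=True)
class ValueDomain:
    name: str
    conceptual_domain: ConceptualDomain
    permitted_values: tuple  # (code, meaning) pairs


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str        # the thing being described, e.g. "Person"
    property_name: str       # the characteristic, e.g. "sex"
    conceptual_domain: ConceptualDomain


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    value_domain: ValueDomain

    def is_consistent(self):
        # The concept and its representation should share one conceptual domain
        return self.concept.conceptual_domain == self.value_domain.conceptual_domain

    def accepts(self, code):
        # True when the code is one of the permitted values
        return any(code == c for c, _ in self.value_domain.permitted_values)


sex = ConceptualDomain("Sex")
sex_codes = ValueDomain("Sex codes", sex, (("1", "Male"), ("2", "Female")))
person_sex = DataElementConcept("Person", "sex", sex)
element = DataElement(person_sex, sex_codes)
```

Separating the concept from the value domain means the same concept can be archived with different codings across survey instances without being redefined.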
10. Registration status
Disposed
Registration Authority
Preferred standard
(Completeness, accuracy, adherence to quality and
terminological description standards)
Archived
Standard
Retired
Standards Division Registrar
Qualified
Regular Registrar
Recorded
Responsible Owner (Content)
Candidate
Historical
Submitter
Steward
Incomplete
11. Corporate Memory: Data and metadata for archive and disposal phases
Operational data
Registers
Survey Data
Administrative Data
Metadata for archive and disposal
Public Use Microdata Files
Archive repository
Clean Master Files
Disposal