Metadata models to support the statistical cycle: IMDB

Learn more at: http://www.unece.org

1
Metadata models to support the statistical cycle
IMDB
  • Alice Born, Statistics Canada
  • UNECE Workshop on Statistical Metadata
  • July 4 to 6, 2007

2
Outline
  • Survey life cycle and the IMDB
  • IMDB model
  • Data dimension model
  • Business dimension model
  • Questionnaire model
  • Registration
  • Classification of administered items
  • Use of metadata in the statistical system

3
Role of the IMDB
  • Information management: interpretability of
    Statistics Canada's 590 current surveys
  • Assist in coherence of the data
  • Promote knowledge sharing across STC and with
    external users
  • Preserve corporate memory
  • Promote reuse of our metadata assets

4
IMDB in the survey life cycle
[Diagram: the IMDB and its metadata shown with the
survey life cycle phases Design, Collect, Edit,
Estimate, Tabulate, Publish and Archive. Other
labels: Data Warehouses, Operations Management,
Quality Assurance, Analysis, Dissemination,
Operational Data, Registers, Survey Data,
Administrative Data, Operational Data Stores.]

5
IMDB metadata model
  • Corporate Metadata Repository (CMR), which is an
    extension of ISO/IEC 11179 Metadata Registries
  • Statistical surveys
  • Sample
  • Questionnaire
  • Data sets
  • Products
  • Systems
  • IMDB: data dimension, business dimension,
    questionnaire model, administration and documents
    model

6
Data dimension model ISO/IEC 11179
[Diagram: ISO/IEC 11179 model with the boxes Object
Class, Property, Data Element Concept, Conceptual
Domain, Value Domain and Data Element (survey
variable).]

7
Data dimension model
  • Currently in the IMDB:
  • 85 object classes (statistical units)
  • 290 properties
  • 506 data element concepts (object class +
    property)
  • 202 conceptual domains (representation class +
    property)
  • 1509 value domains (classifications)
  • 1034 data elements (representation class +
    property + object class = variables)
  • Example: Type of revenues of establishments
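
The composition of a data element can be sketched in code. The sketch below is a minimal illustration, assuming the slide's example "Type of revenues of establishments" reads as representation class, then property, then object class. The class names, field names and the naming rule in `name()` are hypothetical and are not taken from the IMDB itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """An object class paired with a property (ISO/IEC 11179 terms)."""
    object_class: str    # the statistical unit, e.g. "establishments"
    property_name: str   # the characteristic measured, e.g. "revenues"


@dataclass(frozen=True)
class DataElement:
    """A data element concept plus a representation class (a survey variable)."""
    concept: DataElementConcept
    representation_class: str  # e.g. "Type"

    def name(self) -> str:
        # Hypothetical naming rule inferred from the slide's single example.
        return (f"{self.representation_class} of "
                f"{self.concept.property_name} of "
                f"{self.concept.object_class}")


concept = DataElementConcept(object_class="establishments",
                             property_name="revenues")
element = DataElement(concept=concept, representation_class="Type")
print(element.name())  # Type of revenues of establishments
```

Keeping the concept separate from the representation lets one data element concept be reused by several data elements that differ only in how the value is represented.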

8
Business dimension model in the IMDB

[Diagram labels: Survey, Survey instance, Survey
design, Frame and sample, Questionnaire, Data
elements, Value domains, Datasets, Products (COR),
Applications/Software.]

9
Administrative layer
[Diagram labels: Administered items, Statistical
Activity, Organization, Survey, Contact,
Stewardship, Universe, Documentation, Frame,
Identification, Survey instance, Time Frame,
Instrument, Keyword, Question, Classification,
Theme, Data file, Methodology, Data Element, Data
Element Concept, Object Class, Property, Formula,
Conceptual Domain, Value Domain. Methodology
topics: Instrument design, Sampling, Data source,
Error detection, Imputation, Estimation, Quality
evaluation, Disclosure control, Revisions and
seasonal adjustment, Data accuracy.]

10
Information management - Administered items
  • Any item that is managed, tracked, organized and
    registered in a registry
  • Administered items have
  • their own set of characteristics specific to the
    administered item
  • and shared administrative characteristics which
    are common to all administered items (the
    administrative layer)

11
Information management - Administrative Layer
  • Shared administrative characteristics:
  • Terminological Designation (Names)
  • Terminological Description
  • Time Frame
  • Organization/Contact
  • Reference Document [1]
  • Version Management
  • Stewardship/Registration
  • Classification
  • [1] A reference document is an administered item
    with all the administrative layer characteristics.
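
One way to picture the split between item-specific characteristics and the shared administrative layer is composition: every administered item carries the same administrative record and adds its own fields. The sketch below assumes this reading. All class and field names are illustrative, not the IMDB's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AdministrativeLayer:
    """Characteristics shared by every administered item (names are illustrative)."""
    name: str                                   # terminological designation
    description: str = ""                       # terminological description
    time_frame: Optional[str] = None
    organization_contact: Optional[str] = None
    reference_documents: list = field(default_factory=list)
    version: int = 1
    steward: Optional[str] = None
    classifications: list = field(default_factory=list)


@dataclass
class AdministeredItem:
    """Base type: any registered item carries an administrative layer."""
    admin: AdministrativeLayer


@dataclass
class Survey(AdministeredItem):
    survey_type: str = "direct"   # item-specific characteristic


@dataclass
class ReferenceDocument(AdministeredItem):
    title: str = ""               # item-specific characteristic


# A reference document is itself an administered item, so it can be
# attached to another item's administrative layer.
doc = ReferenceDocument(admin=AdministrativeLayer(name="Survey guide"),
                        title="Survey guide")
survey = Survey(
    admin=AdministrativeLayer(name="Example survey", reference_documents=[doc]),
    survey_type="administrative",
)
print(survey.admin.reference_documents[0].admin.name)
```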

12
IMDB Administrative Layer - Version Management
  • A snapshot of the information recorded for the
    administered item.
  • Rules for creation of a version are established
    for each type of administered item.
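
Version management as described here can be sketched as a snapshot taken whenever a change meets the rule for that item type. The rule table, item types and attribute names below are made up for illustration. The slides do not say what the actual IMDB versioning rules are.

```python
import copy
from dataclasses import dataclass, field

# Hypothetical rules: which attribute changes create a new version,
# defined separately for each type of administered item.
VERSION_TRIGGERS = {
    "survey": {"target_population"},
    "data_element": {"definition", "value_domain"},
}


@dataclass
class VersionedItem:
    item_type: str
    attributes: dict
    versions: list = field(default_factory=list)

    def snapshot(self) -> int:
        """Store a deep copy of the current attributes; return the version count."""
        self.versions.append(copy.deepcopy(self.attributes))
        return len(self.versions)


def update(item: VersionedItem, key: str, value) -> None:
    """Apply a change, snapshotting the prior state first when the rule requires it."""
    triggers = VERSION_TRIGGERS.get(item.item_type, set())
    if key in triggers and item.attributes.get(key) != value:
        item.snapshot()
    item.attributes[key] = value


item = VersionedItem("data_element", {"definition": "a", "label": "x"})
update(item, "label", "y")        # not a trigger: no snapshot
update(item, "definition", "b")   # trigger: prior state is preserved
print(item.versions)
```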

13
Information Management - IMDB Administrative
Layer
  • The administrative layer is used to manage
    administrative information for all IMDB
    administered items.
  • Administered items are managed in a consistent
    manner.

14
Surveys
  • Metadata in the IMDB is organized around the
    survey administered item
  • Refers to collection, compilation and publication
    of data measuring characteristics of a population
  • Three types of surveys are recognized
  • Direct
  • Administrative
  • Derived

15
Statistical Activities
  • A group of surveys that share common features and
    common explanatory text
  • E.g., System of National Accounts, Unified
    Enterprise Statistics, Health Statistics

16
Common metadata set
  • Statistical activity
  • Survey (direct, administrative, derived)
  • Target population (population, statistical
    unit)
  • Survey instance (each survey process)
  • Collection instrument (questionnaire)
  • Methodology
  • Data accuracy
  • Documentation
  • Data file
  • (Data elements, value domains)

17
Common metadata set for survey life cycle
  • Methodology
  • Instrument design
  • Sampling
  • Collection method
  • Error detection
  • Imputation
  • Estimation
  • Quality evaluation
  • Disclosure control
  • Revisions and seasonal adjustment

18
Questionnaire model
  • Questionnaire: Item_ID, etc.
  • Question block: Item_ID, Block_type, etc.
  • Question: Item_ID, DE_item_ID, etc.
  • Response choice: Question_item_ID, Response
    choice, etc.
  • Data element: Item_ID, Representation_class, etc.
  • Value domain: Item_ID, VD_type, etc.
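
The entities and key columns named on this slide can be sketched as relational tables. The sketch below uses Python's built-in `sqlite3` module and only the columns the slide lists. The column types, the foreign keys and the sample rows are assumptions, and the "etc." columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE questionnaire   (item_id INTEGER PRIMARY KEY);
CREATE TABLE question_block  (item_id INTEGER PRIMARY KEY, block_type TEXT);
CREATE TABLE value_domain    (item_id INTEGER PRIMARY KEY, vd_type TEXT);
CREATE TABLE data_element    (item_id INTEGER PRIMARY KEY,
                              representation_class TEXT);
-- A question points at the data element it collects (DE_item_ID).
CREATE TABLE question        (item_id INTEGER PRIMARY KEY,
                              de_item_id INTEGER REFERENCES data_element(item_id));
-- Response choices hang off a question (Question_item_ID).
CREATE TABLE response_choice (question_item_id INTEGER REFERENCES question(item_id),
                              response_choice TEXT);
""")

# Made-up sample rows for illustration only.
conn.execute("INSERT INTO data_element VALUES (1, 'Code')")
conn.execute("INSERT INTO question VALUES (10, 1)")
conn.executemany("INSERT INTO response_choice VALUES (?, ?)",
                 [(10, "Married"), (10, "Single")])

rows = conn.execute("""
SELECT q.item_id, d.representation_class, r.response_choice
FROM question q
JOIN data_element d ON d.item_id = q.de_item_id
JOIN response_choice r ON r.question_item_id = q.item_id
ORDER BY r.response_choice
""").fetchall()
print(rows)
```

Linking each question to a data element is what allows a question to be traced to the output variable it feeds.
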
19
Questionnaire model in the IMDB
  • Metadata for survey planning and design phase
  • Does the concept or question already exist?
  • Metadata discovery - STCWiki
  • Align with output variables - definitions
  • Harmonized Content Modules Project
  • Content development of key socio-demographic data
    elements (e.g., marital status, age, ethnic
    origin) in IMDB for registration as an STC
    standard
  • Leading to development of standard question
    blocks and questions stored in the IMDB
  • Specifications (i.e., skip patterns, modes) /
    BLAISE and other code stored in Survey
    Specification Manager

20
Registration/Stewardship
  • Registration and stewardship information is
    managed for each administered item
  • Who is the owner of the item?
  • Who is responsible for the item's information?
  • Who is responsible for registration?
  • Verification for editorial, accuracy, bilingual
    conformance?
  • State: new, candidate, recorded, qualified,
    standard, preferred/prescribed standard, retired?
  • Degree of sharing/harmonization: divisional,
    branch, agency, provincial, national,
    international?
  • Dissemination: internal, public?
  • Versioning note

21
Registration Attributes in the IMDB
  • Three registration attributes
  • Registration status: identifies the quality or
    progression of quality
  • Registration level: level of conformance or
    harmonization
  • Administrative status: stage in the registration
    process

22
1. Registration status
[Diagram: registration statuses Incomplete,
Candidate, Recorded, Qualified, Standard and
Preferred standard, with Retired, Superseded,
Historical and Application also shown. Roles
shown: Submitter, Steward, Responsible Owner
(Content), Regular Registrar, Standards Division
Registrar, Registration Authority. Criteria:
completeness, accuracy, adherence to quality and
terminological description standards.]

23
2. Registration level
  • Level of conformance or harmonization
  • Levels shown: International, Departmental, U.S.,
    Recommended, Program-specific, Canadian, Survey,
    Provincial

24
3. Administrative status
  • Stages in registration process
  • Stages shown: De-registered, Registered, Reserved
    for edit, Not registered, New
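
The three registration attributes can be sketched as enumerations attached to each administered item. The value names come from the slides above. The ordering of the status ladder (Incomplete up to Preferred standard) and the `promote` helper are assumptions for illustration, not the IMDB's documented workflow.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class RegistrationStatus(IntEnum):
    """Quality progression; the ordering is an assumption."""
    INCOMPLETE = 0
    CANDIDATE = 1
    RECORDED = 2
    QUALIFIED = 3
    STANDARD = 4
    PREFERRED_STANDARD = 5


class RegistrationLevel(Enum):
    """Level of conformance or harmonization (no ordering implied)."""
    INTERNATIONAL = "International"
    DEPARTMENTAL = "Departmental"
    US = "U.S."
    RECOMMENDED = "Recommended"
    PROGRAM_SPECIFIC = "Program-specific"
    CANADIAN = "Canadian"
    SURVEY = "Survey"
    PROVINCIAL = "Provincial"


class AdministrativeStatus(Enum):
    """Stage in the registration process."""
    NEW = "New"
    NOT_REGISTERED = "Not registered"
    RESERVED_FOR_EDIT = "Reserved for edit"
    REGISTERED = "Registered"
    DE_REGISTERED = "De-registered"


@dataclass
class Registration:
    status: RegistrationStatus
    level: RegistrationLevel
    admin_status: AdministrativeStatus

    def promote(self) -> None:
        """Move one step up the assumed quality ladder, stopping at the top."""
        if self.status < RegistrationStatus.PREFERRED_STANDARD:
            self.status = RegistrationStatus(self.status + 1)


reg = Registration(RegistrationStatus.CANDIDATE,
                   RegistrationLevel.SURVEY,
                   AdministrativeStatus.NEW)
reg.promote()
print(reg.status.name)  # RECORDED
```
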
25
Classification of administered items
  • Organization and classification of the
    administered item
  • Keyword
  • STC taxonomy (28 themes, 200 sub-themes)
  • UNECE Classification of International Statistical
    Activities data elements
  • Program Activity Architecture for reporting to
    Treasury Board Secretariat and to Parliament
  • Organization of the item's administrative and
    item-specific information for different purposes
  • HTML, Wiki, SDMX, CWM, DDI, XBRL

26
Survey design and dissemination phases
[Diagram: survey life cycle phases Design, Collect,
Edit, Estimate, Tabulate, Publish. Metadata shown:
Survey, Universe, Frame, Instance, Collection
Instrument, Methodology, Data Files, Enterprise
Architecture; Concepts (Object Class, Property,
Data Element Concept), Data Elements, Questions,
Question Blocks, Classifications (Conceptual
Domain, Value Domain).]

27
Reuse of Information Assets in Applications
Development
  • Classification coding
  • Collection instrument development: Survey
    Specification Manager, Integrated Questionnaire
    and Metadata System
  • Publishing
  • Other applications: Software Register

28
Reuse of Information Assets: Integration with Data
  • Data Warehouses
  • CANSIM

29
Reuse of Information Assets in Dissemination and
information discovery
  • One metadata source
  • many uses for the information
  • many output formats
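
The idea of one metadata source feeding many output formats can be sketched as a single record passed through interchangeable renderers. The record contents, the function names and the choice of JSON and HTML as targets are illustrative assumptions. Formats such as SDMX or DDI would need their own renderers.

```python
import json
from html import escape

# Made-up metadata record standing in for an item read from the repository.
record = {
    "name": "Type of revenues of establishments",
    "registration_status": "Recorded",
}


def to_json(item: dict) -> str:
    """Serialize the record for machine consumption."""
    return json.dumps(item, sort_keys=True)


def to_html(item: dict) -> str:
    """Render the record as a simple two-column HTML table."""
    rows = "".join(
        f"<tr><th>{escape(str(k))}</th><td>{escape(str(v))}</td></tr>"
        for k, v in item.items()
    )
    return f"<table>{rows}</table>"


RENDERERS = {"json": to_json, "html": to_html}


def render(item: dict, fmt: str) -> str:
    """Look up the renderer for the requested format and apply it."""
    return RENDERERS[fmt](item)


print(render(record, "json"))
print(render(record, "html"))
```

Adding an output format then means registering one more function, with no change to the stored metadata.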



30
Corporate Memory Data Files: Dissemination and
archive phases
[Diagram labels: Operational Data, Registers,
Survey Data, Administrative Data, Operational Data
Stores, Clean Master File, Public Use Master File,
Archival information, Archived Data.]