Input Data Warehousing Canada - PowerPoint PPT Presentation

1
Input Data Warehousing: Canada's Experience with
Establishment Level Information
  • Presentation to the Third International
    Conference on Establishment Statistics
  • Montreal, QC
  • June 20, 2007

2
Overview
  • Introduction of data warehousing as a concept
  • Approaches to holding data
  • Introduction to Statistics Canada's Unified
    Enterprise Survey (UES) Program
  • Centralized warehousing of UES data
  • Example of the data warehouse at work

3
Subject-matter areas need or generate different
types of information
  • Data to support collection
      • Questionnaires and supporting metadata
      • Frame and sample information
      • Status of each respondent during collection
      • Survey data
      • Administrative data
  • Post-collection processing
      • Edits (metadata)
      • Imputation specifications
      • Allocation specifications
      • Generation of clean datasets
  • Tabulation of estimates/analysis of results
      • Value of estimate
      • Data quality indicators
      • Suppression patterns
      • Analysis of coherence

Input Data
4
Input Data Warehouse
  • A copy of statistical input data specifically
    structured for querying and reporting
  • Collection
  • Post-collection processing
  • Tabulation of estimates

5
Approaches to organizing information holdings
  • Decentralized
      • In a completely decentralized approach, each
        subject matter area maintains its own input data
  • Centralized
      • Centralized data warehouse contains all input
        data from all subject matter program areas
      • All program areas need to use common concepts and
        standards for classification; otherwise a
        concordance has to be established among their
        classification systems
  • These are extremes along a continuum
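
A concordance between classification systems can be pictured as a lookup table applied before data from different program areas are combined. The sketch below is illustrative only: the codes, field names, and the helper functions to_common_code and total_by_common_code are hypothetical and do not come from the presentation.

```python
# Hypothetical concordance: program-specific codes mapped to a common
# classification. All codes and values here are made up for illustration.
CONCORDANCE = {
    "A-101": "RETAIL",
    "A-102": "RETAIL",
    "B-200": "WHOLESALE",
}


def to_common_code(local_code: str) -> str:
    """Translate a program-specific code to the common classification."""
    try:
        return CONCORDANCE[local_code]
    except KeyError:
        # An unmapped code is reported rather than silently dropped.
        raise ValueError(f"no concordance entry for {local_code!r}") from None


def total_by_common_code(rows):
    """Sum revenue by common classification code."""
    totals = {}
    for row in rows:
        key = to_common_code(row["code"])
        totals[key] = totals.get(key, 0.0) + row["revenue"]
    return totals


records = [
    {"id": 1, "code": "A-101", "revenue": 120.0},
    {"id": 2, "code": "A-102", "revenue": 80.0},
    {"id": 3, "code": "B-200", "revenue": 50.0},
]

totals = total_by_common_code(records)
```

Raising an error on an unmapped code is a design choice in this sketch; it makes gaps in the concordance visible before any totals are produced.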

6
Centralized approach
  • Advantages
      • Economies of scale should lead to reduced overall
        development and maintenance costs
      • Some human resource issues are eased (knowledge
        and skills retention and transfer)
      • Eases integration of data to support data
        analysis, coherence analysis, etc.
      • Allows subject-matter divisions to specialize in
        data analysis rather than data management

7
Decentralized approach
  • Advantages
      • Specialized subject matter expertise readily
        available
      • Subject matter areas are not dependent on a
        central authority to make changes, so
        flexibility is increased
      • Care and control of the data is clearly
        established

8
Questions to address in moving to a more
centralized environment
  • What purpose does it serve?
  • What must be done to the statistical model to
    ensure compatibility with other data sources?
  • What mechanisms need to be in place to ensure a
    productive client-service relationship?
  • Who is custodian of the data?
  • Do the benefits in moving to a more centralized
    environment truly outweigh the costs?

9
Statistics Canada and the Unified Enterprise
Survey Program
  • In the late 1990s, Statistics Canada undertook a
    major program to improve the quality of the
    provincial economic accounts released by the
    Agency and of the annual business surveys that
    feed into those accounts
  • These surveys were integrated to increase the
    quality of the data they produce in terms of
      • Consistency
      • Coherence
      • Breadth
      • Depth

10
Features of the UES
  • Improved frame (business register)
  • Sampling made consistent across surveys, with
    improved coverage
  • Harmonized content and common collection
    applications
  • Administrative data are to be used instead of
    survey data if possible and if the data are of
    good quality
  • Common post-collection processing systems
  • Common storage of data
  • Central contact management system
  • Improvements in outputs
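
The rule that administrative data replace survey data when available and of good quality can be sketched as a simple selection function. This is a minimal illustration under stated assumptions: the function name choose_value and the boolean quality flag are hypothetical, and the presentation does not say how quality is actually assessed.

```python
from typing import Optional


def choose_value(survey_value: Optional[float],
                 admin_value: Optional[float],
                 admin_quality_ok: bool) -> Optional[float]:
    """Prefer the administrative value when it exists and passes the
    (hypothetical) quality check; otherwise fall back to the survey value."""
    if admin_value is not None and admin_quality_ok:
        return admin_value
    return survey_value


# Usage: administrative value present and of good quality.
preferred = choose_value(survey_value=100.0, admin_value=95.0,
                         admin_quality_ok=True)

# Usage: administrative value fails the quality check, so the survey
# value is kept.
fallback = choose_value(survey_value=100.0, admin_value=95.0,
                        admin_quality_ok=False)
```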

11
Moving to a more centralized environment
  • What is the purpose?
      • The UES data warehouse forms a repository of all
        the files created through the processing phases
        of UES and accompanying metadata.
      • This supports the work of analysts and survey
        managers in subject matter divisions, collection
        managers, statistical methodologists and users in
        the System of National Accounts

12
Moving to a more centralized environment
  • What must be done to the statistical model to
    ensure compatibility with other data sources?
      • The statistical model for UES surveys forced the
        harmonization of concepts, definitions and
        classifications across surveys
      • Integration of survey and administrative data
        required the mapping of tax data to survey data
        (harmonized conceptually as well as
        characteristically)

13
Moving to a more centralized environment
  • What mechanisms need to be in place to ensure a
    productive client-service relationship?
      • Project management structure for the UES that
        crosses functional boundaries
      • Change management function to ensure seamless
        integration of surveys into UES

14
Moving to a more centralized environment
  • Who is custodian of the data?
      • Enterprise Statistics Division (ESD) controls
        access to all common systems.
      • Subject matter divisions are exclusively
        responsible for dissemination, including the
        determination of aggregations and data
        suppressions (due to quality and confidentiality)

15
Moving to a more centralized environment
  • Do the benefits in moving to a more centralized
    environment truly outweigh the costs?
      • Reduction in development costs
      • Development of best practices that can be shared
        across the bureau
      • Single point of access for input data improves
        security of all UES related data
      • Rationalization of hardware to minimize the
        number of servers

16
The UES Data Warehouse
  • UES Warehouse is centrally managed within
    Enterprise Statistics Division
  • Major components of the data warehouse include
      • Metadata repository
      • Processing metadata
      • Central data store (CDS)
      • External data
          • Data that originate outside UES but have been
            integrated in the UES framework

17
The UES Data Warehouse
  • Systems interfacing with the data warehouse
      • Unified Tracking and Retrieval Tool (USTART)
      • Integrated Questionnaire Metadata System (IQMS)
      • UES Processing Interface
      • Working Estimation Environment (WEE) interface
      • Macro-data Adjustment Facility

18
Operational applications
  • Operational monitoring
  • Coherence analysis
  • Baseline information for operational research
  • Quality measures (e.g. response rate analysis)
  • Integrated data analysis

19
Response rates in collection
22
Final response rates
23
The centralized system in action
  • Outcomes
      • The centralized input data warehouse provides a
        single tool that allows users to track
        performance on a consistent basis
          • Same method
          • Same source data
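
The idea of "same method, same source data" can be illustrated by computing response rates for every survey with one function over one shared status table. The sketch below makes several assumptions: the status table, its status labels, and the function response_rates are hypothetical, and the rate definition used here (unweighted, responding units divided by in-scope units) is one common choice, not necessarily the one used in the UES.

```python
from collections import defaultdict

# Hypothetical collection-status records drawn from one shared store:
# (survey name, final status of the unit). Values are made up.
STATUS = [
    ("survey_a", "responded"),
    ("survey_a", "responded"),
    ("survey_a", "nonresponse"),
    ("survey_a", "out_of_scope"),
    ("survey_b", "responded"),
    ("survey_b", "nonresponse"),
]


def response_rates(rows):
    """Unweighted response rate per survey: responded / in-scope units."""
    responded = defaultdict(int)
    in_scope = defaultdict(int)
    for survey, status in rows:
        if status == "out_of_scope":
            continue  # out-of-scope units are excluded from the denominator
        in_scope[survey] += 1
        if status == "responded":
            responded[survey] += 1
    return {survey: responded[survey] / in_scope[survey]
            for survey in in_scope}


rates = response_rates(STATUS)
```

Because every survey passes through the same function and the same table, differences in the resulting rates reflect collection outcomes rather than differences in how each program area computes them.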

24
Conclusion
  • The centralized data warehouse offers benefits to
    statistical programs
  • There are a number of conditions that must be
    fulfilled for success
      • Purpose
      • Data compatibility
      • Client-service relationship
      • Custodian of data
      • Cost-benefit