Handling inconsistencies in integrated business data

About This Presentation
Description:

Author: Carina de Boer. Last modified by: Jeffrey Hoogland. Created: 10/17/2005 12:52:07 PM. Presentation format: slide show.

Slides: 10
Provided by: Cari74
Learn more at: https://unece.org

Transcript and Presenter's Notes


1
Handling inconsistencies in integrated business data
  • Bonn
  • 25-27 September 2006
  • Jeffrey Hoogland
  • Ilona Verburg

2
ESD Integration
  • Goals
  • Improvement of transparency and quality of
    business data sources
  • Integration of business data for enterprises for
    FATS, National Accounts, SBS, CEREM (external
    users such as CPB)
  • Improvement of consistency of data sources
  • Improvement of usability of business registers to
    determine reliable aggregates

3
ESD Integration phase 1
  • 5 business registers and 3 annual business
    surveys for 2001-2004
  • 6 key variables
  • Enterprises with fewer than 100 employees
  • Goal: integrated data, consistent at the aggregated
    level (publication cell × size class group) for
    2004
  • Development of methodology
  • methods for filtering outliers in registers
  • methods for weighting of incomplete registers
  • methods for detecting influential inconsistencies
    at micro level
  • List of causes, consequences and solutions for
    inconsistencies
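
The slides do not say which outlier filter was applied to the registers. As a hedged illustration only, a robust filter based on the median and the median absolute deviation (MAD) might look like the sketch below. The function name `filter_outliers`, the 3.5 cut-off and the turnover values are all assumptions, not taken from the presentation.

```python
from statistics import median

def filter_outliers(values, threshold=3.5):
    """Split values into (kept, outliers) using a modified z-score.

    The modified z-score is 0.6745 * |x - median| / MAD. The 3.5 cut-off
    is a common rule of thumb, not a value from the slides.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return list(values), []
    kept, outliers = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (outliers if score > threshold else kept).append(v)
    return kept, outliers

# Hypothetical register turnover values (thousands of euros).
turnover = [120, 135, 128, 140, 131, 125000]
kept, outliers = filter_outliers(turnover)
```

A median-based rule is used here because a single extreme register value would distort a mean and standard deviation, which is exactly the situation the filter is meant to catch.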

4
Annual business data sources
[Diagram: the annual sources GBR, VAT, CT, TS, JSSD, SBS, GFCF, SEE, ICT, PC and RD, grouped into surveys and registers]
5
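Table 1. Available annual sources on enterprise level for six key variables.
Sources: GBR, VAT, CT, TS, SSD, SBS, SEE, PC
  • Number of employed persons: available in 4 sources
  • Gross wages and salaries: available in 5 sources
  • Total labour costs: available in 2 sources
  • Net turnover: available in 4 sources
  • Purchase value: available in 3 sources
  • Profit: available in 2 sources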
6
Table 2. Causes for differences between sources at publication and/or micro level.
Causes for differences at publication level (only):
  • Difference in target population
  • Difference in weights
  • Classification error
Causes for differences at publication and micro level:
  • Matching error
  • Difference in variable definition
  • Difference in measurement time (period)
  • Measurement errors in variables
  • Processing errors in variables, e.g. due to wrong unit transformations
  • Difference in editing strategy
  • Observed versus imputed value
  • Difference in imputation method
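
One processing error named in Table 2, a wrong unit transformation, typically shows up as a value that is roughly 1000 times larger in one source than in another (euros reported where thousands of euros were expected). The sketch below shows one way such a check could be written. The function name and the 20% tolerance are assumptions made for illustration.

```python
def looks_like_thousand_error(value_a, value_b, tolerance=0.2):
    """Return True if two positive values differ by roughly a factor 1000.

    The tolerance is relative to 1000, so 0.2 accepts ratios from 800 to 1200.
    """
    if value_a <= 0 or value_b <= 0:
        # The ratio test is only meaningful for positive values.
        return False
    ratio = max(value_a, value_b) / min(value_a, value_b)
    return abs(ratio - 1000) / 1000 <= tolerance

# Hypothetical net turnover for one enterprise in two sources.
suspect = looks_like_thousand_error(2_450_000, 2_500)
plausible = looks_like_thousand_error(2_600, 2_500)
```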
7
Steps in integration process I
  • Tune target populations
  • Synchronize classifications (NACE, size class)
  • Harmonize variables and units
  • Match data on enterprise level
  • Correct obvious mistakes
  • Filter and weight incomplete registers
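
Two of these steps, matching on enterprise level and weighting an incomplete register, can be sketched in a few lines. This is a simplified illustration and not the method used in the project: the function names, the stratum labels and the simple weight (population count divided by register count per stratum) are assumptions.

```python
from collections import Counter

def match_on_enterprise(source_a, source_b):
    """Join two sources on enterprise id and report unmatched ids."""
    matched = {eid: (source_a[eid], source_b[eid])
               for eid in source_a.keys() & source_b.keys()}
    only_a = sorted(source_a.keys() - source_b.keys())
    only_b = sorted(source_b.keys() - source_a.keys())
    return matched, only_a, only_b

def stratum_weights(population_strata, register_strata):
    """Weight per stratum = population count / register count."""
    pop = Counter(population_strata.values())
    reg = Counter(register_strata.values())
    return {s: pop[s] / reg[s] for s in reg}

# Hypothetical turnover by enterprise id in two sources.
vat = {1: 100, 2: 250, 5: 900}
sbs = {2: 240, 5: 910, 7: 50}
matched, only_vat, only_sbs = match_on_enterprise(vat, sbs)

# Hypothetical target population (id -> size class) and the part of it
# that the incomplete register covers.
population = {1: "small", 2: "small", 3: "small", 4: "small",
              5: "medium", 6: "medium", 7: "medium"}
register = {1: "small", 2: "small", 5: "medium"}
weights = stratum_weights(population, register)
```

Ids that appear in only one source are returned separately because they are the natural starting point for investigating matching errors.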

8
Steps in integration process II
  • Filter and weight incomplete registers
  • Compute temporary aggregates
  • Indicate inconsistent aggregates
  • Detect influential inconsistent records
  • Solve matching errors, edit influential errors, and adapt weights
  • Compute consistent aggregates
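
The middle steps (compute temporary aggregates, indicate inconsistent aggregates, detect influential inconsistent records) could be sketched as follows. The record layout, the 5% relative-difference threshold and the influence measure (weight times absolute micro-level difference) are assumptions chosen for illustration. The slides do not specify them.

```python
from collections import defaultdict

def aggregate(records, source):
    """Weighted total per publication cell for one source."""
    totals = defaultdict(float)
    for r in records:
        totals[r["cell"]] += r["weight"] * r[source]
    return dict(totals)

def inconsistent_cells(records, a, b, max_rel_diff=0.05):
    """Cells whose weighted totals from sources a and b differ too much."""
    tot_a, tot_b = aggregate(records, a), aggregate(records, b)
    flagged = []
    for cell in tot_a:
        denom = max(abs(tot_a[cell]), abs(tot_b[cell]))
        if denom and abs(tot_a[cell] - tot_b[cell]) / denom > max_rel_diff:
            flagged.append(cell)
    return sorted(flagged)

def influential_records(records, cell, a, b, top=3):
    """Ids in a cell, ranked by weighted absolute micro-level difference."""
    in_cell = [r for r in records if r["cell"] == cell]
    in_cell.sort(key=lambda r: r["weight"] * abs(r[a] - r[b]), reverse=True)
    return [r["id"] for r in in_cell[:top]]

# Hypothetical matched records with net turnover from two sources.
records = [
    {"id": 1, "cell": "A", "weight": 1.0, "vat": 100, "sbs": 102},
    {"id": 2, "cell": "A", "weight": 2.0, "vat": 200, "sbs": 198},
    {"id": 3, "cell": "B", "weight": 1.0, "vat": 500, "sbs": 50},
    {"id": 4, "cell": "B", "weight": 3.0, "vat": 80, "sbs": 85},
]
flagged = inconsistent_cells(records, "vat", "sbs")
top_b = influential_records(records, "B", "vat", "sbs", top=1)
```

Ranking by weighted difference means editing effort goes first to the records that move the aggregate most, rather than to every record that disagrees between sources.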

9
Long-term challenges
  • Use the Fellegi-Holt principle to obtain
    consistent integrated micro-data
  • Use repeated weighting techniques to obtain
    consistent aggregates
  • Develop a general editing system for business
    registers and surveys
  • Minimize the burden for respondents by using a
    maximum number of registers and a minimum number
    of surveys