Title: Units in OECD.Stat Bo Sundgren
1Units in OECD.Stat
Bo Sundgren, Lars Thygesen
2Standardising decentralised databases
- Often based on paper publications
- Squeeze in as much as possible
- Mix apples and pears
- Generalised format: the hypercube
- Obey certain rules
- Clean dimensions
- One (or more) parameter(s) (measure)
- Time
- Country
3Example
- Residents in OECD countries 1990-2005 by Country, Time, Sex, and AgeGroup: Count, SumOfIncome, and AverageOfIncome
- Dimension 1: Country (Member countries of the OECD)
- Dimension 2: Time (1990-2005)
- Dimension 3: Sex
- Dimension 4: AgeGroup
- Dimension 5: Parameters (Count, Sum(Income), Average(Income))
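
The example cube can be sketched as a mapping from one member per dimension to a cell value. This is only an illustration: the member lists are shortened placeholders, the stored value is made up, and the names `DIMENSIONS`, `empty_cube` and `is_regular` are hypothetical, not part of OECD.Stat.

```python
from itertools import product

# The five dimensions of the example; member lists are shortened placeholders.
DIMENSIONS = {
    "Country": ["AUS", "FRA"],
    "Time": [1990, 2005],
    "Sex": ["F", "M"],
    "AgeGroup": ["15-64", "65+"],
    "Parameter": ["Count", "SumOfIncome", "AverageOfIncome"],
}


def empty_cube(dimensions):
    """Return one (initially empty) cell per combination of dimension members."""
    return {key: None for key in product(*dimensions.values())}


def is_regular(cube, dimensions):
    """A cube is regular when its keys are exactly the full cross product."""
    return set(cube) == set(product(*dimensions.values()))


cube = empty_cube(DIMENSIONS)
# Made-up value, only to show how a cell is addressed.
cube[("FRA", 2005, "F", "65+", "Count")] = 1000
```

Every cell is addressed by exactly one member from each dimension, which is what "clean dimensions" amounts to in this sketch.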
4Organisation of OECD.Stat
- A number of relatively independent datasets, multi-dimensional tables
- common dimensions (e.g. country, time, frequency, age) provide links between the datasets
- compatible with Statistical Data and Metadata Exchange standards (SDMX)
- hierarchical thematic structure
5Irregular statistical hypercubes
- Several classification variables concatenated in one dimension
- Different parameters for different dimension members
- or worse
6As a result, the unit can be
- common to a dataset
- in a dimension
- different for different members of one dimension in a dataset
- different for different combinations of members of more than one dimension in a dataset
- not mentioned at all
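
The five situations above can be told apart mechanically. The sketch below is one possible check, assuming each observation is available as a pair of (dimension members, unit text) and that a unit dimension, if present, is literally named "Unit". The function name and the sample observations are hypothetical.

```python
def unit_location(observations, dimensions):
    """Say where the unit of a dataset lives.

    observations: list of (key, unit) pairs, where key maps dimension -> member
    and unit is the unit text, or None when no unit is given.
    dimensions: list of dimension names of the dataset.
    """
    if "Unit" in dimensions:  # assumption: the unit dimension is named "Unit"
        return "in a dimension"
    units = {unit for _, unit in observations}
    if not observations or units == {None}:
        return "not mentioned"
    if len(units) == 1:
        return "common to the dataset"
    for dim in dimensions:
        seen = {}  # member of this dimension -> first unit seen for it
        if all(seen.setdefault(key[dim], unit) == unit for key, unit in observations):
            return "depends on dimension " + dim
    return "depends on a combination of dimensions"


# Illustrative observations: the unit follows the dimension Variable only.
sample = [
    ({"Country": "AUS", "Variable": "GDP"}, "Million USD"),
    ({"Country": "FRA", "Variable": "GDP"}, "Million USD"),
    ({"Country": "AUS", "Variable": "Wages"}, "Million national currency"),
    ({"Country": "FRA", "Variable": "Wages"}, "Million national currency"),
]
location = unit_location(sample, ["Country", "Variable"])
```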
7Possible solutions
- The radical: Restructure 500 datasets to become regular
- The pragmatic: Keep datasets and repair
8Proposal for a solution
- Introduce common Unit code
- Attached to any observation in any dataset
- Unit multiplier code
- Unit position code
- Dataset with only one unit
- Dataset with unit in a separate dimension
- Dataset where unit depends on only one dimension
- Dataset where unit differs for different combinations of members of more than one dimension
- Map unit texts to codes
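
A minimal sketch of how a unit code and a unit multiplier could be resolved for any observation, covering the dataset cases listed above. Several things here are assumptions rather than the actual OECD.Stat design: the multiplier is taken to be a power of ten (as SDMX does for its unit multiplier concept), the code "NATCUR" is invented for illustration, the unit position code is not modelled, and a unit held in a separate dimension is treated as the one-dimension case keyed on that dimension.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Unit:
    code: str            # hypothetical common unit code, e.g. "USD"
    multiplier: int = 0  # assumed power of ten, e.g. 6 for millions


@dataclass
class UnitSpec:
    """Unit information for one dataset, from most general to most specific."""
    default: Optional[Unit] = None                      # dataset with only one unit
    by_member: dict = field(default_factory=dict)       # {(dimension, member): Unit}
    by_combination: dict = field(default_factory=dict)  # {frozenset of (dimension, member): Unit}


def resolve_unit(key, spec):
    """Return the Unit for an observation key (dimension -> member), or None."""
    items = set(key.items())
    # Most specific first: combinations of members of several dimensions.
    for combination, unit in spec.by_combination.items():
        if combination <= items:
            return unit
    # Then units that depend on a member of a single dimension.
    for pair in key.items():
        if pair in spec.by_member:
            return spec.by_member[pair]
    # Finally the dataset-wide unit, if any.
    return spec.default


def to_base_value(value, unit):
    """Apply the multiplier, e.g. 2.5 in millions becomes 2500000.0."""
    return value * 10 ** unit.multiplier


spec = UnitSpec(
    default=Unit("USD", 6),
    by_member={("Variable", "Wages"): Unit("NATCUR", 6)},
)
gdp_unit = resolve_unit({"Country": "FRA", "Variable": "GDP"}, spec)
wages_unit = resolve_unit({"Country": "FRA", "Variable": "Wages"}, spec)
```

Resolving from the most specific rule to the most general lets existing datasets stay as they are, in line with the pragmatic option, while every observation still ends up with a unit code.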
9Translation table
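
The table itself is not reproduced in this transcript. As a hedged illustration of what mapping unit texts to codes might involve, the sketch below uses an invented table: apart from "Million USD" and "Million national currency", which appear as unit texts elsewhere in the slides, all entries, codes and function names are hypothetical.

```python
# Invented translation table: normalised unit text -> (unit code, power-of-ten multiplier).
TRANSLATION_TABLE = {
    "million usd": ("USD", 6),
    "millions of us dollars": ("USD", 6),
    "million national currency": ("NATCUR", 6),
    "persons": ("PERS", 0),
}


def normalise(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.casefold().split())


def translate_unit(text):
    """Return (code, multiplier) for a unit text, or None if it is not mapped yet."""
    return TRANSLATION_TABLE.get(normalise(text))


mapped = translate_unit("  Million   USD ")
unmapped = translate_unit("furlongs")
```

Returning None for unknown texts, rather than guessing, leaves the unmapped cases visible so that dataset owners can review them.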
11- (i) Unit of measurement: Million USD
12- (i) Unit of measurement: Million USD
13- (i) Unit of measurement: Million USD
- Unit of measurement depends on the dimension Country
14- (i) Unit of measurement: Million USD
- (i) Unit of measurement: Million national currency
15- (i) Unit of measurement: Million USD
- (i) Unit of measurement: Million national currency
- Unit of measurement depends on the dimension Variables
16- (i) Unit of measurement: Million USD
- (i) Unit of measurement: Million national currency
- Unit of measurement depends on the dimension Variables
17- Units of measurement: Click (i) for member of dimension Subject to see unit
- (i) Unit of measurement: Million USD
- (i) Unit of measurement: Million national currency
- Unit of measurement depends on the dimension Variables
18- Units of measurement: Click (i) for member of dimension Subject to see unit
- (i) Unit of measurement: Million USD
- (i) Unit of measurement: Million national currency
- Unit of measurement depends on the dimension Variables
19- Units of measurement: Please see explanation in metadata (i)
- Units of measurement: Click (i) for member of dimension Subject to see unit
- (i) Unit of measurement: Million USD
- (i) Unit of measurement: Million national currency
- Unit of measurement depends on the dimension Variables
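
The three kinds of header text shown on these slides could be produced by a small helper. This is a sketch based on one reading of the screenshots: the message wording is taken from the slides, while the function name, its parameters and the rule for choosing a message are assumptions.

```python
def unit_caption(common_unit=None, dimension=None):
    """Return the unit text to display above a table.

    common_unit: unit text when the whole dataset shares one unit.
    dimension: name of the single dimension that the unit depends on.
    With neither given, the reader is referred to the metadata.
    """
    if common_unit is not None:
        return f"Unit of measurement: {common_unit}"
    if dimension is not None:
        return f"Units of measurement: Click i for member of dimension {dimension} to see unit"
    return "Units of measurement: Please see explanation in metadata"


single = unit_caption(common_unit="Million USD")
per_member = unit_caption(dimension="Subject")
fallback = unit_caption()
```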
20Problems
- Keep Unit clean, i.e. separate from measure
- Agree on unit codes (adapt to SDMX)
- Dataset owners think that even pragmatic
solutions are too radical
21Tentative time line
22Questions
- Are these problems recognizable?
- Are similar solutions applied?
23- "There is nothing more practical than a good theory." (D. Hilbert)