Metadata use in the Statistical Value Chain

Learn more at: https://unece.org

1
Metadata use in the Statistical Value Chain
  • UNECE-Eurostat-OECD Meeting
  • on
  • Management of Statistical Information Systems
  • MSIS 2008
  • Luxembourg, 7-9 April 2008
  • Georges Pongas
  • Adam Wronski

2
Content
  1. Introduction
  2. Operational Characteristics of Metadata
  3. Technical Characteristics of the Metadata
  4. Metadata types needed in the various steps of the
    SVC (statistical value chain)
  5. Conclusion

3
Seven SVC steps
  1. Expression of the need
  2. Data collection design
  3. Specification and development of the tools needed
    for the data collection
  4. Data collection
  5. Data editing and imputation
  6. Data processing
  7. Data dissemination

4
Basics
  • Keep the statistical notions out of the
    technical (implementation-oriented)
    characteristics of the metadata.
  • Design the technical characteristics of metadata so that the
    same metadata structures can cover both
    statistical and non-statistical requirements.

5
Operational Characteristics of Metadata
  • Static nature
  • Long production process
  • Located in various places (resources)
  • Critical link with statistical data
    (metadata depend on changes in the statistical data)
  • Strong coupling of structural metadata with the
    statistical data
  • Large number of metadata entities needed in SVC

6
Technical Characteristics of Metadata
  • Terminology often complex
  • Technical characteristics and statistical notions
    frequently mixed

7
Statistical Notions and Metadata
  • Examples of items that share the same technical structure
    • A classification, a keyword list and the set of
      information related to the SDDS standard
    • A correspondence table between two classifications and
      a table containing the links (access rights)
      between the user names and the statistical
      datasets of a database
  • Within each group the only difference is the context, i.e., the
    user interface
  • Thus develop separately
    • a common set of functionalities and
    • the interface layer for each application
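
A minimal sketch of the separation described above, under stated assumptions: the class `LinkTable` and the functions `map_code` and `can_read` are hypothetical names, not part of the presentation, and the codes and dataset ids are illustrative. One generic structure stores links between ids; two thin, context-specific functions present it either as a correspondence table or as access rights.

```python
from collections import defaultdict


class LinkTable:
    """Common functionality: a generic store of (left id, right id) links."""

    def __init__(self):
        self._links = defaultdict(set)

    def add(self, left, right):
        self._links[left].add(right)

    def targets(self, left):
        # Return a sorted list so that results are deterministic.
        return sorted(self._links.get(left, ()))


# Interface layer 1: correspondence between two classifications.
def map_code(table, source_code):
    return table.targets(source_code)


# Interface layer 2: access rights of users on datasets.
def can_read(table, user, dataset):
    return dataset in table.targets(user)


correspondence = LinkTable()
correspondence.add("A1", "X9")       # hypothetical classification codes
rights = LinkTable()
rights.add("gpongas", "dataset_1")   # hypothetical dataset id
```

Both contexts reuse the same storage and lookup code; only the thin functions on top differ.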

8
Metadata Technical Structure Categories
  • Three categories proposed
  • Simple Metadata Entities (SME)
  • Binary Relationships (BR)
  • Clustered Metadata Entities (CME)

9
Simple Metadata Entities (SME)
  • Simple key
  • Variable number of attributes, suited to
    vertical (one row per attribute) storage
  • Examples
                        Example 1        Example 2
      Entity            NACE             user name
      Entity element    2122             gpongas
      Attribute name    English label    phone no
      Attribute value   Mining           430139
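
As an illustration of vertical storage, the two examples above can be held as rows of (entity, element, attribute name, attribute value). This is only a sketch: the function `attributes` and the extra "note" attribute are hypothetical and not taken from the presentation.

```python
# Each row is (entity, element key, attribute name, attribute value).
rows = [
    ("NACE", "2122", "English label", "Mining"),
    ("user name", "gpongas", "phone no", "430139"),
]


def attributes(rows, entity, element):
    """Collect all attributes of one entity element into a dict."""
    return {name: value
            for ent, elem, name, value in rows
            if ent == entity and elem == element}


# Adding an attribute means adding a row, not changing a table layout.
# The "note" attribute below is a hypothetical illustration.
rows.append(("NACE", "2122", "note", "hypothetical extra attribute"))
```

Because every attribute is a separate row, entities with different numbers of attributes fit the same structure.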

10
Examples of SMEs
  • SDDS documents
  • Dublin Core
  • Classifications
  • Keywords
  • Administrative entities
  • Programs
  • Publications

11
Binary Relationships (BR)
  • Two types
    • Between two different entities:
      correspondence tables, access rights definitions
    • Inside the same entity:
      thesauri, classification hierarchies, links
      between regulations, statistical documents
  • Example
    • Relationship id: UN thesaurus
    • First entity id: EUROPE
    • First entity role: Parent
    • Second entity id: FR
    • Second entity role: Child
    • Reason of link: Broader term
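
The example record above could be represented as follows. This is a sketch only: the class `BinaryRelationship`, its field names and the helper `children` are assumptions made for illustration, not structures defined in the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryRelationship:
    relationship_id: str
    first_id: str
    first_role: str
    second_id: str
    second_role: str
    reason: str


links = [
    BinaryRelationship("UN thesaurus", "EUROPE", "Parent",
                       "FR", "Child", "Broader term"),
]


def children(links, relationship_id, parent_id):
    """Return ids linked as Child to parent_id within one relationship."""
    return [link.second_id for link in links
            if link.relationship_id == relationship_id
            and link.first_id == parent_id
            and link.first_role == "Parent"
            and link.second_role == "Child"]
```

The same record shape can hold a link between two different entities (for example a user and a dataset) by changing only the roles and the relationship id.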

12
Clustered Metadata Entities (CME)
  • Complex entities characterised by variable key
    cardinality and references to other entities of
    type CME, SME and BR
  • Description technique
    • XML Schema is appropriate
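
To make the idea of variable key cardinality and references concrete, the sketch below parses a small XML instance of a CME. The element and attribute names (`cme`, `key`, `ref`, `part`, `type`, `target`) are hypothetical and do not come from SDMX, GESMES or any schema mentioned in the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML instance of a CME; all names are illustrative only.
DOC = """
<cme id="dataset_definition_1">
  <key part="country"/>
  <key part="year"/>
  <ref type="SME" target="NACE"/>
  <ref type="BR" target="UN thesaurus"/>
</cme>
""".strip()

root = ET.fromstring(DOC)

# The number of <key> elements may differ from one CME to another.
key_parts = [k.get("part") for k in root.findall("key")]

# References point to other entities of type SME, BR or CME.
refs = [(r.get("type"), r.get("target")) for r in root.findall("ref")]
```

A repeating element for key parts and another for references is one way an XML description can absorb the variability that a fixed-column table cannot.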

13
Examples of CMEs
  • SDMX and GESMES definitions
  • Dataset definitions
  • Annotations to dataset cells
  • Confidentiality definitions linked to datasets

14
Metadata in the various steps of the SVC

15
Collection Metadata
  • Mostly of type BR and SME
  • Among other things, they contain
    • source agencies
    • data file descriptions
    • codelists
    • validation rules linked to initial data checks
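
As a hedged illustration of how a codelist can drive an initial data check, the sketch below flags fields whose value is not in the corresponding codelist. The function `check_codes`, the field name and the codes are hypothetical and are not taken from the presentation.

```python
# Hypothetical codelist; field name and codes are illustrative only.
codelists = {"country": {"FR", "LU"}}


def check_codes(record, codelists):
    """Return the fields whose value is missing from the field's codelist."""
    return [field for field, allowed in codelists.items()
            if record.get(field) not in allowed]
```

For instance, `check_codes({"country": "XX"}, codelists)` reports the `country` field, while a record with `"FR"` passes.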

16
Editing, Imputation and Processing Metadata
  • More complex than the collection metadata (more
    CME entities needed)
  • Among other things, they contain
    • dataset definitions
    • formulas, programs, scripts
    • conditional and ordinary annotations
    • dissemination feeding information

17
Dissemination Metadata
  • The most complex metadata types are located here.
  • They contain almost all the previously described
    metadata plus their own
  • Reasons for this complexity
    • Dissemination covers all the statistical
      domains
    • It must serve all user types
    • It has tight delivery deadlines
    • It must offer user-friendly navigation, presentation and
      extraction facilities

18
Among other things, dissemination metadata contain
  • Sitemap description
  • Release calendars
  • Dataset links to publication tables
  • Questionnaire definitions linked to datasets
  • Units of measurement
  • Ready-made queries

19
Conclusion
  • Separating the statistical notions (context) from the structure
    (functionality) of metadata
    • minimises the number of structural metadata types
    • and consequently makes it easier to build and implement a
      complex statistical (metadata and data) system