Completeness - PowerPoint PPT Presentation

About This Presentation
Title:

Completeness


Slides: 12
Provided by: sungsoo
Learn more at: https://gis.depaul.edu

Transcript and Presenter's Notes


1
Completeness
  • February 27, 2006
  • Geog 458 Map Sources and Errors

2
Outline
  • Completeness
  • Testing completeness
  • Documenting completeness in the metadata
  • Data quality

3
Completeness
  • A data set is called complete if what is
    defined/needed is encoded in the DB
  • Spatial completeness: the degree to which all
    features are captured according to the data
    capture specifications
  • Attribute completeness: the degree to which the
    relevant attributes of a feature are available
    according to the given capture specifications
  • Data quality component that describes whether the
    entity objects represent all entity instances of
    the corresponding abstract universe
  • Relationship between the objects represented in
    the data set and the abstract universe of all
    such objects

4
Abstract universe
  • Can be thought of as a reference frame
  • Data set: a digital representation of a subset of
    (perceived) reality
  • Abstract universe, also called terrain nominal,
    abstract view of the universe, universe of
    discourse, or miniworld: a subset of perceived
    reality (it involves a selection and abstraction
    process)
  • Data set is intended to represent the abstract
    universe
  • Since completeness means the relationship between
    data set and abstract universe, a useful
    characterization of completeness relies on a
    comprehensive definition of the abstract universe

5
Data completeness vs. Model completeness
  • It is possible to classify completeness into two
    categories depending on how the abstract universe
    is defined or specified
  • Data completeness: the abstract universe is
    defined for generic uses of the data
    (application-independent)
  • Model completeness: the abstract universe is
    defined for specific uses of the data
    (application-dependent)
  • So which would be more flexible? Which would have
    multiple versions of completeness on the same
    data?

6
Spatial completeness
  • Let's say the abstract universe "lake" is defined
    as a water body with an area of more than 1
    square mile
  • Count the entities in the abstract universe;
    call this number A
  • Count the entities encoded in the DB (the lake
    data set); call this number B
  • Completeness would be B/A
  • The definition of a lake varies by application,
    and so does A
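The B/A ratio above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lake names and areas are invented, the helper names `spatial_completeness` and `universe` are hypothetical, and B is counted only over encoded lakes that also belong to the abstract universe (the slide does not say how extra entities should be treated).

```python
def spatial_completeness(universe_ids, encoded_ids):
    """Return B/A: the share of abstract-universe entities found in the data set."""
    universe_set = set(universe_ids)
    if not universe_set:
        raise ValueError("abstract universe is empty")
    # Assumption: only encoded entities that belong to the universe count toward B.
    captured = universe_set & set(encoded_ids)
    return len(captured) / len(universe_set)


# Hypothetical reference inventory: lake name -> area in square miles.
reference_lakes = {"Alder": 4.2, "Birch": 0.6, "Cedar": 1.8, "Dune": 12.0}

# Hypothetical lakes actually encoded in the DB.
encoded = ["Alder", "Dune"]


def universe(min_area):
    """Abstract universe 'lake': water bodies with area greater than min_area."""
    return [name for name, area in reference_lakes.items() if area > min_area]


ratio_1 = spatial_completeness(universe(1.0), encoded)     # A = 3, B = 2
ratio_half = spatial_completeness(universe(0.5), encoded)  # A = 4, B = 2
```

Lowering the area threshold enlarges A while the data set stays the same, so the reported completeness drops. This is the sense in which completeness depends on how the abstract universe is defined.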

7
Attribute completeness
  • Subordinate to spatial completeness
  • Define what the relevant attributes will be
  • A lake will have area, depth, type (e.g.
    freshwater), and so on
  • Check whether attribute values are missing for
    the entity at hand
  • The geometric description might be incomplete
    (e.g. area)
  • For each attribute, report the number of missing
    values out of the total number of features
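A minimal sketch of the per-attribute report described above, assuming features are held as Python dictionaries and that a missing value is either an absent key or `None`. The sample lakes and the function name `missing_value_report` are hypothetical.

```python
def missing_value_report(features, attributes):
    """Map each attribute to (missing_count, total_features)."""
    total = len(features)
    report = {}
    for attr in attributes:
        # An absent key and an explicit None are both treated as missing.
        missing = sum(1 for feature in features if feature.get(attr) is None)
        report[attr] = (missing, total)
    return report


# Hypothetical lake features; the relevant attributes follow the slide.
lakes = [
    {"name": "Alder", "area": 4.2, "depth": 30.0, "type": "freshwater"},
    {"name": "Cedar", "area": None, "depth": None, "type": "freshwater"},
    {"name": "Dune", "area": 12.0, "depth": None},
]

report = missing_value_report(lakes, ["area", "depth", "type"])
for attr, (missing, total) in report.items():
    print(f"{attr}: {missing} missing out of {total} features")
```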

8
Relation to other data quality components
  • Completeness may affect the logical consistency
    of a data set
  • Missing arc or node → connectivity, polygon
    closure
  • Missing attribute (left and right node) →
    connectivity
  • Missing attribute in PK → key constraint
  • Missing attribute in FK → referential constraint
  • So where do I document this: in completeness or
    in logical consistency?
  • If incompleteness causes logical inconsistency,
    describe it in the logical consistency section
  • Otherwise, include it in the completeness section
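As a rough illustration of how missing key values surface as constraint violations, the sketch below checks two in-memory tables. The table layout (`lakes` with primary key `lake_id`, `gauges` referencing it) and the function name `key_problems` are assumptions made for this example, not part of the lecture.

```python
def key_problems(parents, children, pk, fk):
    """Find missing primary keys, missing foreign keys, and dangling references."""
    null_pk = [row for row in parents if row.get(pk) is None]
    known_keys = {row[pk] for row in parents if row.get(pk) is not None}
    null_fk = [row for row in children if row.get(fk) is None]
    dangling_fk = [
        row for row in children
        if row.get(fk) is not None and row[fk] not in known_keys
    ]
    return {"null_pk": null_pk, "null_fk": null_fk, "dangling_fk": dangling_fk}


# Hypothetical parent table: one lake lacks its primary key value.
lakes = [
    {"lake_id": 1, "name": "Alder"},
    {"lake_id": None, "name": "Cedar"},
    {"lake_id": 3, "name": "Dune"},
]

# Hypothetical child table: one gauge has no FK, one points to an unknown lake.
gauges = [
    {"gauge_id": 10, "lake_id": 1},
    {"gauge_id": 11, "lake_id": None},
    {"gauge_id": 12, "lake_id": 2},
]

problems = key_problems(lakes, gauges, pk="lake_id", fk="lake_id")
```

Following the rule on the slide, findings like these would be described in the logical consistency section, since the missing values break key and referential constraints.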

9
Data quality vs. fitness of use
  • Data quality
  • The totality of features and characteristics of a
    data set that bear on its ability to satisfy a
    stated set of requirements
    (application-independent)
  • Fitness of use
  • The totality of features and characteristics of a
    data set that bear on its ability to satisfy a
    set of requirements given by the application
    (application-dependent)

10
Data quality vs. fitness of use
  • Data quality information is usually provided by
    the producer of a data set
  • Fitness of use is assessed by users when they
    evaluate a data set for their own use → this
    principle is referred to as truth in labelling
    (in effect, users are responsible for quality
    control)
  • See different approaches to quality control in
    the lecture note on spatial data quality

11
Data quality report
  • What you report in the data quality section of
    the metadata should be application-independent,
    so that it can be reused for any potential use of
    the data
  • Reporting data quality can be thought of as the
    process of evaluating the ability of the data set
    to meet the requirements
  • That is: how close the values are to ground
    truth (attribute/positional accuracy), whether
    the data set is free of contradictions (logical
    consistency), and whether what is relevant is
    encoded in the DB (completeness)