Data capture and data storage - PowerPoint PPT Presentation

Provided by: Aleks72

1
Data capture and data storage
  • 2008

2
Introduction
  • Because of political, legal and administrative
    developments, both the amount and the quality of
    environmental data being collected have
    increased considerably over the past 35 years.
    This development was influenced by improvements
    in the collection, management and utilisation of
    environmental information.
  • Sensor networks have been installed and upgraded
    to monitor the quality of water, air, and the
    soil. Satellite data and remote sensing data are
    used increasingly to obtain environmental
    information.
  • Environmental data sets are large and complex.
    Their administration requires powerful processors
    and efficient storage technologies. The key
    question is how to handle and process these
    large, unstructured data sets in order to obtain
    efficient decision support.

3
Object taxonomies
  • The term data capture denotes the process of
    deriving environmental data objects from
    environmental objects, where any real world
    object can be regarded as an environmental
    object. Living and non-living environmental
    objects will be grouped into a number of classes
    with typical attributes (e.g. taxonomy of
    species).
  • A simpler taxonomy of the environment is given
    by the media soil, water and air. This taxonomy
    is commonly used by governmental environmental
    agencies. When the environment is understood as
    an integrated and complex system, this taxonomy
    leads to interdisciplinary tasks.
  • Therefore, interdisciplinary task forces and
    environmental network organisations are
    becoming increasingly common.

4
General examples of object taxonomies
  • atmosphere: includes all objects above the
    surface of the earth
  • hydrosphere: contains all water-related objects
  • lithosphere: relates to soil, sediments and rocks
  • biosphere: comprises all living matter
  • technosphere: denotes man-made objects
  • sociosphere: denotes social and economic
    interrelationships within human society

5
Object taxonomy of ecology
  • The object taxonomy of ecology is given by:
  • Autecology: interrelationships between species
    and their interrelations with the abiotic
    environment. Ecological processes take place at
    the ecosystem scale.
  • Synecology: interrelationships between
    communities and their environments, and between
    populations within a community. Ecological
    processes take place at the community scale.
  • Demecology (population ecology):
    interrelationships of individuals within a
    population, and interrelationships of populations
    with the biotic and abiotic environment. The
    processes take place at the population scale.

6
Mapping the environment
  • The main question is which environmental objects
    should be monitored, and what data should be
    collected on them.
  • There are many ways to obtain environmental data
    objects from environmental objects.
  • The results are obtained as time series of
    measurements.
  • The incoming raw data has to be subjected to some
    domain-specific and device-specific processing.
  • Depending on the source of the data, this may
    include manipulations such as optical
    rectification, noise suppression, filtering, or
    contrast enhancement.
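
As a rough illustration of the noise suppression step, the sketch below applies a sliding median filter to a short time series. The filter choice, the window width and the sample readings are assumptions made for this example; the slides do not prescribe a particular method.

```python
from statistics import median


def median_filter(series, window=3):
    """Return a smoothed copy of `series` using a sliding median.

    `window` must be a positive odd integer. At the edges of the
    series the window is clipped, so fewer values are used there.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(median(series[lo:hi]))
    return smoothed


# Hypothetical pH readings with one isolated spike (12.9).
readings = [7.1, 7.2, 7.0, 12.9, 7.1, 7.3, 7.2]
filtered = median_filter(readings, window=3)
print(filtered)
```

A median is used here instead of a mean because an isolated spike does not pull the median towards it; a moving average would be an equally valid choice for other kinds of noise.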

7
Raw data processing
  • Raw data processing procedures include
    complex analytical techniques (laboratory
    methods), which may be used to survey toxicants
    in the environment.
  • Aerial and satellite imagery is increasingly being
    used to monitor remote areas and to recognise
    long-term environmental loads. The raw imagery is
    usually processed and represented as a thematic
    map in order to visualise the load distribution.
  • For forest and wildlife management, manual counts
    of animals and plants are often the most reliable
    source of data.
  • For objects from the technosphere or the
    sociosphere, it is useful to study printed
    documentation in order to extract and condense
    the required environmental data objects.

8
Data validation procedures
  • Data validation procedures are given by:
  • Temporal validation: Recent measurements are
    compared to previous measurements and to some
    reference data obtained under similar conditions.
  • Geographic validation: Data that do not fit the
    usual patterns are subjected to cross-validation
    with measurements from other equipment in the
    same area that measures the same parameter.
  • Space-time validation: Data are compared with
    previous measurements from the same equipment.
  • Parameter validation: Data that do not fit the
    norm are forwarded to cross-validation with
    equipment that measures different parameters.
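
The idea of comparing a reading with earlier measurements and, if it does not fit, cross-validating it against nearby equipment can be sketched as follows. This is only a minimal sketch: the three-sigma rule, the tolerance and all function names are assumptions introduced for illustration and are not taken from the slides.

```python
from statistics import mean, stdev


def fits_usual_pattern(value, history, max_sigma=3.0):
    """Compare a new reading with previous measurements.

    Assumption: "fits" means lying within `max_sigma` sample standard
    deviations of the mean of `history`.
    """
    if len(history) < 2:
        return True  # not enough data to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value == mu
    return abs(value - mu) <= max_sigma * sigma


def cross_validate(value, neighbour_values, tolerance):
    """True if at least one reading from other equipment in the same
    area, measuring the same parameter, lies within `tolerance`."""
    return any(abs(value - n) <= tolerance for n in neighbour_values)


def validate(value, history, neighbour_values, tolerance=0.5):
    if fits_usual_pattern(value, history):
        return "accepted"
    if cross_validate(value, neighbour_values, tolerance):
        return "accepted after cross-validation"
    return "rejected"


# Hypothetical readings from one sensor and its neighbours.
history = [7.0, 7.1, 7.2, 7.1, 7.0]
print(validate(7.1, history, [7.0, 7.2]))
print(validate(9.5, history, [9.3, 9.6]))
print(validate(9.5, history, [7.1, 7.0]))
```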

9
Advanced techniques
  • For the processing and initial evaluation of raw
    environmental data objects, knowledge-based
    systems have considerable potential. With
    regard to knowledge representation the
    requirements of environmental applications can be
    met by standard database and artificial
    intelligence techniques.
  • Knowledge representation
  • Data merging
  • Bayesian probability theory and uncertain
    information
  • Data storage and data security
  • Data base management systems
  • Databases
  • Geographic Information Systems

10
Knowledge representation
  • Static knowledge is stored in specialised file
    systems or in relational or object-oriented
    databases.
  • Object-oriented databases give the users the
    ability to group similar objects into classes and
    to connect those classes in an inheritance
    hierarchy.
  • The objects in each class all share a set of
    attributes and possibly a number of methods,
    i.e., special procedures that take one or more
    objects of the class as arguments.
  • The notion of inheritance is that attributes and
    methods that are defined for some class C higher
    up in the inheritance hierarchy are also valid for
    all classes in the sub-tree below C.
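
The inheritance idea can be illustrated with a small class hierarchy. The sketch below uses Python classes in place of an object-oriented database; the class names, attributes and values are hypothetical and chosen only for this example.

```python
class EnvironmentalObject:
    """Root class: attributes and methods defined here are valid
    for every class in the sub-tree below it."""

    def __init__(self, name, location):
        self.name = name
        self.location = location

    def describe(self):
        return f"{self.name} at {self.location}"


class WaterBody(EnvironmentalObject):
    """Adds an attribute and a method specific to water bodies."""

    def __init__(self, name, location, ph):
        super().__init__(name, location)
        self.ph = ph

    def is_acidic(self):
        return self.ph < 7.0


class Lake(WaterBody):
    """Inherits from both WaterBody and EnvironmentalObject."""

    def __init__(self, name, location, ph, area_km2):
        super().__init__(name, location, ph)
        self.area_km2 = area_km2


lake = Lake("Example Lake", (47.6, 9.4), ph=6.4, area_km2=536.0)
print(lake.describe())   # method defined two levels up
print(lake.is_acidic())  # method defined one level up
```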

11
Dynamic knowledge
  • Dynamic knowledge in environmental information
    systems (EIS) is represented by IF-THEN rules.
    The concept of a rule-based knowledge system is
    to encode the available information on
    environmental objects by a possibly large number
    of relatively simple rules rather than by a
    complex procedural program.
  • Each rule consists of an IF part and a THEN part.
    Starting from some initial state, the system
    checks which of the rules are currently
    applicable.
  • If more than one rule can be applied, the system
    picks one of the rules according to a given
    priority scheme.
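
A rule-based system of this kind can be sketched in a few lines. The example below is a simplified illustration, not a description of any particular EIS: the rules, the thresholds, the numeric priorities and the restriction that each rule fires at most once are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    priority: int
    condition: Callable[[dict], bool]  # the IF part
    action: Callable[[dict], None]     # the THEN part


def run(rules, state, max_steps=100):
    """Repeatedly fire the applicable rule with the highest priority.

    Assumption: each rule fires at most once, which guarantees
    termination in this small example.
    """
    fired = []
    for _ in range(max_steps):
        applicable = [r for r in rules
                      if r.name not in fired and r.condition(state)]
        if not applicable:
            break
        rule = max(applicable, key=lambda r: r.priority)
        rule.action(state)
        fired.append(rule.name)
    return fired


# Hypothetical rules for a water-quality monitoring station.
rules = [
    Rule("low_oxygen", 2,
         lambda s: s["oxygen_mg_l"] < 4.0,
         lambda s: s.update(alarm=True)),
    Rule("warm_water", 1,
         lambda s: s["temperature_c"] > 25.0,
         lambda s: s.update(note="warm water")),
    Rule("notify", 3,
         lambda s: s.get("alarm", False),
         lambda s: s.update(action="notify agency")),
]

state = {"oxygen_mg_l": 3.2, "temperature_c": 27.0}
fired = run(rules, state)
print(fired)
print(state)
```

In the first step two rules are applicable and the priority scheme selects `low_oxygen`; its THEN part changes the state so that `notify` becomes applicable in the next step.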

12
Data merging
  • Environmental data capture can be performed with
    techniques that are standard in the areas of
    statistical classification, database management,
    and artificial intelligence. When raw data are
    aggregated and evaluated, the input data are only
    one part of the information.
  • Other circumstantial information is also taken
    into account in order to extract those
    environmental data objects that the user is
    interested in. Human experts always take such
    information into account when evaluating a
    sample.
  • A promising strategy is to form a working
    hypothesis, and to support this hypothesis based
    on the information available. This has to allow
    for the possibility that pieces of input
    information may partly contradict each other.

13
Bayesian probability theory and uncertain
information
  • Environmental data are often inaccurate and
    uncertain. From such data, only statements of
    probability can be derived.
  • Bayesian statistics requires that events are
    independent of each other.
  • This assumption is rarely true in environmental
    contexts.
  • Uncertainties within raw data sets and data bases
    can be evaluated by the Dempster-Shafer approach.
    It is used widely for environmental data capture.

14
Dempster-Shafer approach
  • The key idea is that one should logically
    separate the arguments for and against a given
    hypothesis H. This separation is managed by
    distinguishing between belief B(H) and
    plausibility Pl(H).
  • Both concepts are represented by a number between
    zero and one.
  • The belief represents the weight of the facts
    which support the working hypothesis.
  • In contrast, plausibility is one minus the
    weight of the facts speaking against H.

15
Degree of uncertainty
  • Therefore Pl(H) = 1 - B(¬H), where ¬H denotes
    the hypothesis that H is false.
  • The belief of the counterhypothesis B(¬H) is
    sometimes referred to as doubt D(H) with respect
    to the working hypothesis H.
  • Therefore Pl(H) = 1 - D(H).
  • In Bayesian probability theory belief and
    plausibility coincide: p(H) = B(H) = Pl(H) =
    1 - p(¬H).
  • For Dempster-Shafer theory B(H) ≤ Pl(H).
  • The difference between B(H) and Pl(H) represents
    the degree of uncertainty U(H) about the
    hypothesis: U(H) = Pl(H) - B(H).
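
The relations between belief, plausibility, doubt and uncertainty can be checked numerically. The sketch below assumes the standard Dempster-Shafer formalism, in which evidence is given as a mass function over subsets of a frame of discernment; the mass function is not introduced in the slides, and the frame and the numbers are hypothetical.

```python
def belief(mass, hypothesis):
    """B(H): total mass of the focal sets contained in H."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass, hypothesis):
    """Pl(H): total mass of the focal sets that intersect H."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


# Hypothetical frame: a water sample is either polluted or clean.
frame = frozenset({"polluted", "clean"})
H = frozenset({"polluted"})
not_H = frame - H

# Assumed evidence: 0.5 supports H, 0.2 speaks against H, and 0.3
# is uncommitted (assigned to the whole frame). Masses sum to one.
mass = {H: 0.5, not_H: 0.2, frame: 0.3}

b = belief(mass, H)
pl = plausibility(mass, H)
doubt = belief(mass, not_H)   # D(H) = B(not H)
uncertainty = pl - b          # U(H) = Pl(H) - B(H)

print(round(b, 3), round(pl, 3), round(doubt, 3), round(uncertainty, 3))
```

With these numbers the belief is lower than the plausibility, and the gap equals the mass left uncommitted. If no mass were assigned to the whole frame, belief and plausibility would coincide, as in the Bayesian case.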

16
Data storage and data security
  • In former years, most environmentally relevant
    data were available only in analogue form. This
    concerns historical data records but also a large
    number of more recent thematic maps, images, and
    documents.
  • Those historical data sets that are of relevance
    in current and future applications are rapidly
    being digitised. This process is supported by the
    continuous progress in scanning technologies.
  • New data is almost always captured in some
    digital format, and it is mainly a question of
    logistics to make the data available.
  • There are essentially two options for storing a
    given digital data set.

17
Data storage and data security (2)
  • A data base management system (DBMS) with a
    well-defined data model, typically relational,
    object-relational, or object-oriented
  • An application-specific file system, as is
    still used by many geographic information systems
    (GIS).
  • Environmental data place special demands on
    databases and data storage. In most cases,
    environmental data consist of three parts:
    matter- or substance-based information, time
    information, and space information.
  • An environmental data base is characterised by
    the type of data stored in the data base, by the
    management system used for data storage and by
    the type of information available from the data
    base.

18
Data storage and data security (3)
  • Operations between applications and inquiries are
    organised by interfaces. While in the past data
    storage and data processing were tightly coupled,
    more recent systems make a clear distinction
    between those tasks.
  • This trend results from the general tendency
    towards open systems. As users demand
    convenient interfaces between different hardware
    and software tools across heterogeneous computer
    platforms, vendors have been forced to decompose
    their products along the lines of more narrowly
    defined functionalities.
  • GIS in particular are used for data storage, data
    querying and data visualisation of geographic
    information in a tightly integrated manner.

19
Data base management systems
  • A DBMS provides a complete set of data
    languages, consisting of:
  • data definition language (DDL),
  • query language (QL),
  • data manipulation language (DML).
  • Links to high-level programming languages are
    provided. The Structured Query Language (SQL) is
    mainly used.
  • All data operations within the data base are
    performed by transactions, which should allow
    multi-user operations. Most commercial DBMS
    are the result of application-oriented
    developments.
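
The three language parts and the role of transactions can be illustrated with SQL issued from a programming language. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table layout (substance, time and space columns) and the sample values are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute("""
    CREATE TABLE measurement (
        id          INTEGER PRIMARY KEY,
        substance   TEXT NOT NULL,   -- matter or substance information
        measured_at TEXT NOT NULL,   -- time information (ISO 8601)
        latitude    REAL NOT NULL,   -- space information
        longitude   REAL NOT NULL,
        value       REAL NOT NULL
    )
""")

samples = [
    ("nitrate", "2008-05-01T08:00:00", 50.45, 30.52, 12.0),
    ("nitrate", "2008-05-02T08:00:00", 50.45, 30.52, 14.0),
    ("ozone",   "2008-05-01T08:00:00", 50.45, 30.52, 80.0),
]

# DML inside a transaction: the `with` block commits if all inserts
# succeed and rolls back if any of them raises an exception.
with conn:
    conn.executemany(
        "INSERT INTO measurement "
        "(substance, measured_at, latitude, longitude, value) "
        "VALUES (?, ?, ?, ?, ?)",
        samples,
    )

# Query: average value per substance.
averages = conn.execute(
    "SELECT substance, AVG(value) FROM measurement "
    "GROUP BY substance ORDER BY substance"
).fetchall()
print(averages)
```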

20
Geographic Information Systems
  • Geographic information systems (GIS) are
    essential environmental informatics tools for the
    management of the environment, including decision
    support and visualisation of large amounts of
    environmental data. The original idea for GIS was
    to computerise the metaphor of a thematic map.
  • In general, GIS are computer-based tools to
    capture, manipulate, process, and display spatial
    or georeferenced data. The spatial data is still
    mostly held in proprietary file systems.
    Therefore, most of the underlying data models are
    layer-based. The information is encoded in a
    number of thematic maps, such as vegetation maps,
    soil maps, or topographic maps.
  • With regard to geometry, each map corresponds to
    a partition of the universe into disjoint
    polygons. Each polygon represents a region that
    is sufficiently homogeneous with respect to the
    theme of the map. Maps may be enhanced by lines
    and points to represent specific features, such
    as roads or cities.
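
A layer-based data model can be sketched as a set of thematic maps, each a list of disjoint polygons carrying a theme value. The example below is a minimal illustration under stated assumptions: the two maps, their coordinates and the labels are hypothetical, and the point-in-polygon test is the common ray-casting method, which the slides do not mention.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how often a horizontal ray from `point`
    towards +x crosses the polygon boundary (odd means inside)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Each thematic map partitions the same 10 x 10 area into polygons.
soil_map = [
    ([(0, 0), (5, 0), (5, 10), (0, 10)], "clay"),
    ([(5, 0), (10, 0), (10, 10), (5, 10)], "sand"),
]
vegetation_map = [
    ([(0, 0), (10, 0), (10, 4), (0, 4)], "meadow"),
    ([(0, 4), (10, 4), (10, 10), (0, 10)], "forest"),
]
layers = {"soil": soil_map, "vegetation": vegetation_map}


def query(layers, point):
    """Look up the theme value of every layer at `point`."""
    result = {}
    for theme, regions in layers.items():
        result[theme] = next(
            (label for polygon, label in regions
             if point_in_polygon(point, polygon)),
            None,
        )
    return result


print(query(layers, (7, 2)))
print(query(layers, (2, 8)))
```

Overlaying the layers at one location in this way is what turns a stack of separate thematic maps into combined information about a site.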

21
Questions?