Title: Folie 1
1Welcome to the 1st GLOWA-Volta Database Workshop
2Agenda
- Aims of the workshop
- Deficits relating to the data stocks and data management of the GVP
- Data management
- Lifecycle of data
- Conclusions for the GVP
- Need for integration of the data users into database development
- Role of disciplines in data management
- Steps forward to an optimized data management
3Aims of the workshop
- Initiation of a dialogue with the GVP members about their requirements for efficient data management
- This dialogue is a process in which the following items should be discussed
- data
- use and access
- database structure
- metadatabase
- web presence
- database team and division of work
4Aims of the workshop
- These points should be discussed within the working groups as far as possible. In this workshop we are focusing on the items
- data (data flow)
- data use and access
- setup of a database team and division of work
Technical implementation, structure and type of the databases, including ways of access, should be developed in a team by members of the departments as well as computer scientists and project leaders!
5Current deficits relating to the data stocks and data management of the GVP
6The current situation
- Data server
- data stock is incomplete
- searching for data by criteria is not possible
- arrangement of data is unclear
- relation to the project is unclear
- there are no rules for data uploading (location, topic etc.)
7The current situation
- Data media
- what is their content?
- to which project/thesis do they belong?
8The current situation
- Metadatabase
- data stock representation is incomplete
9The current situation
- Metadatabase
- if you are looking for data, you have to ask your colleagues inside and outside of ZEF!
- the contact person may not be available
10The current situation
11The current situation
12The current situation
- Datasets
- lack of data description
- which methodological background?
- are the values correct?
13Data management
- To avoid such problems, data management is necessary
- Definition (by the Data Management Association)
- Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise
- Normally the processes of data management should be implemented within a project when it starts!
14Lifecycle of Data and Aspects of its Management
Structuring (data modeling): categorisation, sorting, description and storing
15Lifecycle of Data: Procurement of Data
- can come from
- own investigations
- other institutions
- other (sub-)projects within the main project
- serves
- for providing the operating processes with input data
- needs
- certain data sources and formats
- quality
- application interfaces (import)
16Lifecycle of Data: Structuring and Storing
- means
- sorting of data according to a classification schema
- by themes
- by projects/subprojects
- by formats
- by applications
- by spatial research area
- .....
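The effect of choosing a classification schema can be sketched in a few lines of Python. This is only an illustration: the class `DatasetClassification`, the function `storage_path` and all example values are hypothetical and not part of the GVP setup. The sketch shows that the same dataset ends up in a different place depending on the order in which the criteria are applied, which is why the order has to be agreed on.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Sequence


@dataclass(frozen=True)
class DatasetClassification:
    """Classification criteria named on the slide; the values below are made up."""
    project: str
    theme: str
    data_format: str
    region: str


def storage_path(c: DatasetClassification, order: Sequence[str]) -> PurePosixPath:
    """Build a folder path by applying the criteria in the agreed order."""
    return PurePosixPath(*(getattr(c, name) for name in order))


# Hypothetical dataset, classified once and sorted in two different ways
example = DatasetClassification(
    project="E3", theme="land_use", data_format="spss", region="region_1"
)
by_project = storage_path(example, ("project", "theme", "data_format"))
by_format = storage_path(example, ("data_format", "project"))

print(by_project)  # E3/land_use/spss
print(by_format)   # spss/E3
```

Keeping the criteria as attributes of the dataset, instead of only as folder names, allows several sort orders to exist side by side.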
17Lifecycle of Data: Structuring and Storing
- and/or
- by a conceptual data model
- it contains the data entities and their relationships within the scope of a system
- the entities have properties (attributes)
- it is independent of the storage in a database and other technical requirements
- it can be designed in different forms (relational, network, hierarchical)
- the target system for data storing can be a relational database as well as a file system
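A conceptual data model of this kind can be written down without any reference to a database. The following Python sketch is one possible way to do so; the classes `Entity`, `Relationship` and `ConceptualModel` and the example entities "Project" and "Dataset" are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    source: str
    label: str
    target: str


@dataclass
class ConceptualModel:
    """Entities and relationships only; nothing here says how data is stored."""
    entities: Dict[str, Entity] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_entity(self, name: str, attributes: List[str]) -> None:
        self.entities[name] = Entity(name, list(attributes))

    def relate(self, source: str, label: str, target: str) -> None:
        # A relationship may only connect entities that exist in the model
        for end in (source, target):
            if end not in self.entities:
                raise KeyError(f"unknown entity: {end}")
        self.relationships.append(Relationship(source, label, target))


# Hypothetical example model
model = ConceptualModel()
model.add_entity("Project", ["code", "title"])
model.add_entity("Dataset", ["name", "format", "theme"])
model.relate("Project", "produces", "Dataset")

for r in model.relationships:
    print(r.source, r.label, r.target)  # Project produces Dataset
```

The same model could later be mapped to tables of a relational database or to a folder structure in a file system.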
18Lifecycle of Data: Structuring and Storing
- serves for
- easy searching, finding and use of data
- needs
- consensus among data producers and users within an organization about
- conceptual data model
- data needed and not needed
- rules about data updating and archival storage
- standards for metadata content
- control of compliance with structure criteria
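Once a standard for metadata content has been agreed on, compliance can be checked automatically. The sketch below assumes a purely hypothetical list of required fields (`REQUIRED_FIELDS`); the actual list would have to come out of the consensus described above.

```python
from typing import Dict, List

# Hypothetical metadata standard; the real field list must be agreed on by the project
REQUIRED_FIELDS = ("title", "project", "contact", "method", "format")


def missing_metadata(record: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Made-up record with an empty contact and no method description
record = {"title": "Household survey", "project": "E4", "contact": "", "format": "SPSS"}
print(missing_metadata(record))  # ['contact', 'method']
```

A check like this could run whenever data is delivered, so that datasets without a description do not enter the data stock.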
19Lifecycle of Data: Structuring and Storing
- means
- the physical storing of data
- needs
- storage places for the databases (central/distributed)
- physical data model
- derived from the conceptual/logical data model
- takes into account the facilities and constraints of a given database management system
- database management system with
- interfaces for applications
- query and search services
- backup and security functions
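As an illustration of a physical data model, the following sketch uses SQLite through Python's standard `sqlite3` module. The choice of SQLite, the table and column names and the example rows are assumptions for demonstration; the slides do not name a database management system.

```python
import sqlite3

# In-memory database for demonstration; a real setup would use a central server
conn = sqlite3.connect(":memory:")

# Physical model derived from a simple conceptual model "Project produces Dataset"
conn.executescript(
    """
    CREATE TABLE project (
        code  TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE dataset (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        format       TEXT NOT NULL,
        project_code TEXT NOT NULL REFERENCES project(code)
    );
    """
)

# Made-up example content
conn.execute("INSERT INTO project VALUES (?, ?)", ("E3", "GVP LUDAS"))
conn.executemany(
    "INSERT INTO dataset (name, format, project_code) VALUES (?, ?, ?)",
    [("land cover 2000", "GeoTIFF", "E3"), ("household survey", "SPSS", "E3")],
)

# Query service: search by criteria (project and format)
rows = conn.execute(
    "SELECT name FROM dataset WHERE project_code = ? AND format = ? ORDER BY name",
    ("E3", "SPSS"),
).fetchall()
print(rows)  # [('household survey',)]
conn.close()
```

The query at the end is the kind of search by criteria that is not possible on the current data server.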
20Lifecycle of Data: Administration
- means
- on a technical basis
- installation and maintenance of the database system (database and database management system)
- user access constraints (rights)
- backup and archiving tasks
- security
- performance
- on a content basis
- integrity: verifying or helping to verify
- control of data delivery
- control of data input
- metadata
21Lifecycle of Data: Administration
- needs
- cooperation between data producers/users and administrators for
- maintenance and upgrading of the database (schema)
- definition of the authorization concept for database access (read only, read/write only, database schema modification etc.)
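An authorization concept with the levels named above can be sketched as an ordered set of access rights. The role names in `GRANTS` and the function `is_allowed` are hypothetical; which roles exist and what they may do is exactly what producers, users and administrators have to define together.

```python
from enum import IntEnum


class Access(IntEnum):
    """Access levels from the slide, ordered from least to most powerful."""
    NONE = 0
    READ_ONLY = 1
    READ_WRITE = 2
    SCHEMA_MODIFICATION = 3


# Hypothetical assignment of roles to access levels
GRANTS = {
    "guest": Access.READ_ONLY,
    "researcher": Access.READ_WRITE,
    "administrator": Access.SCHEMA_MODIFICATION,
}


def is_allowed(role: str, required: Access) -> bool:
    """A role may perform an action if its level is at least the required one."""
    return GRANTS.get(role, Access.NONE) >= required


print(is_allowed("researcher", Access.READ_WRITE))  # True
print(is_allowed("guest", Access.READ_WRITE))       # False
```

Unknown roles fall back to `Access.NONE`, so nobody gains access by accident.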
22Lifecycle of Data: Use and Processing
- means
- use of data for analysis
- processing of data inside and outside of models
- production of new or modified (output-)data
- control of data accuracy
- preparation of data for other processes/projects
23Lifecycle of Data: Distribution
- means
- delivery of data
- inside an organization/project
- by storing in a database (access by the transfer counterpart)
- by transfer on portable media
- by publishing the metadata
- outside an institution/project
- by direct access to a database
- Web-Services
- publishing the metadata
- data extract service from a database
- data downloads
- Map Services (geodata)
24Lifecycle of Data: Distribution
- serves
- inside an organization/project
- for providing work processes with adjusted data
- outside an organization/project
- for providing work processes with adjusted data
- for providing data for public information about the projects
- needs
- knowledge about the requirements on the demand side concerning
- further use of data
- formats
- clients
- ...
25Lifecycle of Data: Disposal
- means
- updating the data
- selection and deletion or archiving of data that is
- out of date
- no longer in use
- serves
- to prevent data overflow in the databases
- to maintain the quality of data
- needs
- cooperation between the data producers/users and the database administrator
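The selection step of disposal can be sketched as a simple rule over dates. Everything in this example is an assumption: the record fields, the thresholds of five and two years, and the dataset names. The real limits are a matter for the cooperation between data producers/users and the administrator.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class DatasetRecord:
    name: str
    last_updated: date
    last_used: date


def disposal_candidates(
    records: List[DatasetRecord],
    today: date,
    max_age_days: int = 5 * 365,   # hypothetical limit for "out of date"
    max_idle_days: int = 2 * 365,  # hypothetical limit for "no longer in use"
) -> List[str]:
    """Return names of datasets that should be reviewed for deletion or archiving."""
    candidates = []
    for r in records:
        out_of_date = (today - r.last_updated).days > max_age_days
        in_disuse = (today - r.last_used).days > max_idle_days
        if out_of_date or in_disuse:
            candidates.append(r.name)
    return candidates


# Made-up records
today = date(2007, 1, 1)
records = [
    DatasetRecord("rainfall_series", date(2006, 5, 1), date(2006, 12, 1)),
    DatasetRecord("old_land_cover", date(2000, 1, 1), date(2006, 10, 1)),
    DatasetRecord("pilot_survey", date(2004, 2, 1), date(2004, 3, 1)),
]
print(disposal_candidates(records, today))  # ['old_land_cover', 'pilot_survey']
```

The function only proposes candidates; the decision to delete or archive stays with the people responsible for the data.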
26Conclusions for the GVP
- Conditions
- GVP is divided into a range of projects and subprojects
- e.g. in Phase II Land Use with subprojects L1, L2 etc.
- e.g. in Phase III Analysis of Long-Term Environmental Change with the subprojects E1, E2 etc.
- with their own processing, models, input and output data (formats), data flows and data storages
- with specific integrations and dependencies among each other and within use case frameworks
- Projects and their models are also provided with data from different scientific disciplines like Hydrology, Pedology, Social Economy, Ecology etc.
27Conclusions for the GVP
- in Phase III the main objective is the integration of Phase I and II research results, knowledge, data and tools
- in Phase III the DSS will be realized as the GVP's primary output
The subprojects are connected by data flow (transfer)
The data flow should be adjusted to the GVP and DSS requirements. This means there must be transparent management, which is centralized and standardized
GVP Phase III Proposal, p. 8
28Need for integration of the data users during development and setup of a GVP data management
- Each researcher (or, on a higher level, each project) is a kind of data manager in his own work space. He has
- his own (local) database
- his own input and output data and data procurement requirements
- his own usage and processing
- his own distribution of data (to other users/projects)
- and therefore his own (short) lifecycle of data
- and is integrated into the data flow between the projects and also into their lifecycle of data
29Diagram: Project 1, Project 2, Project 3, Project 4
30Role of disciplines in developing concepts of data management
Project members ...
- have to decide, together with other project members and the database developers, which data should be stored centrally in order to share them, and which can be stored locally or at other places
- have to decide which structure of data storage is most convenient for optimized use
- have to give information about their data (create metadata)
- and
- they are responsible for the data management in their own work area before the data are coordinated across disciplines by the database administrator
31Role of disciplines in developing concepts of data management
Developers of a database ...
- have the responsibility to consult the project members about the requirements of data management
- have to organize the data flow concerning the (technical) way of data storing and access. The activities must be adjusted to the operating processes/projects and their interfaces
- have to develop the data management standards together with the project members
32Steps forward to an optimized data management
(within this workshop)
My request to you
Step 1: analyze the data stock (data dictionary)
Step 2: analyze the data flows
Step 3: develop the logical data model for data storing
33Basis for working groups
Data flow model combined with data dictionary
Notation
Terminator: data producers (data source) or users (data sink) outside the system (external partners, public)
Process: transformation of input data into output data, e.g. by algorithms
Data storage: unit serving as data pool (not local). Building time differs from using time. A → dictionary
Data flow: direction for dataset a; a → dictionary
Data flow: relay in two directions (processes)
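The notation can also be captured in a small data structure, which makes it possible to check a drawn model for gaps. The sketch below is a minimal illustration: the class `Flow`, the function `check_model`, the dictionary descriptions and the connections between the nodes are assumptions, only the node names are borrowed from the workshop diagrams.

```python
from dataclasses import dataclass
from typing import Dict, List

NODE_KINDS = {"terminator", "process", "storage"}


@dataclass(frozen=True)
class Flow:
    dataset: str  # key into the data dictionary
    source: str
    target: str


# Nodes of a hypothetical model; the connections between them are assumed
nodes = {
    "Vendor of remote sensing data": "terminator",
    "GVP LUDAS (E3)": "process",
    "A": "storage",
}
data_dictionary = {
    "a": "remote sensing scenes (hypothetical description)",
    "b": "land use classification (hypothetical description)",
}
flows = [
    Flow("a", "Vendor of remote sensing data", "GVP LUDAS (E3)"),
    Flow("b", "GVP LUDAS (E3)", "A"),
]


def check_model(
    nodes: Dict[str, str], data_dictionary: Dict[str, str], flows: List[Flow]
) -> List[str]:
    """List flows that use undescribed datasets or unknown nodes."""
    problems = []
    for name, kind in nodes.items():
        if kind not in NODE_KINDS:
            problems.append(f"node {name!r} has unknown kind {kind!r}")
    for f in flows:
        if f.dataset not in data_dictionary:
            problems.append(f"dataset {f.dataset!r} is missing in the data dictionary")
        for end in (f.source, f.target):
            if end not in nodes:
                problems.append(f"flow {f.dataset!r} refers to unknown node {end!r}")
    return problems


print(check_model(nodes, data_dictionary, flows))  # []
```

An empty list means that every data flow has a dictionary entry and connects known terminators, processes or storages.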
34Basis for working groups
Context diagram: GLOWA-Volta
35Basis for working groups
Diagram 1: GLOWA-Volta
DSS
36Basis for working groups
Diagram 2: Analysis of Long-Term Environmental Change
Vendor of remote sensing data
GVP LUDAS (E3)
S 1
37Basis for working groups
Diagram 3: GVP-LUDAS
working group natural scientists
working group social economists
Elicitation Ghana
Evaluation of Elicitation Results (Household Survey)
E 4
Diagram labels: A, a, b, c, d, e, f, g
38To Do
- Please try to draw a general overview of the data flows and stocks
- And relate data management options to the respective data flows or storages
In the afternoon I would like to discuss the
requirements of a data management system from
your point of view.
Take it all as a form of brainstorming!!
Thank you!!
40How to organize (sort) the data in the database?
Central Database
- Project 1
- theme 1
- format 1
- format 2
- theme 2
- ....
- Formats
- SPSS
- project 1
- project 2
- remote sensing
- ....
- Region 1
- Project
- subproject
- theme
- format
41Basis for working groups
Data flow model combined with data dictionary
Notation II
Data flow: relay of dataset a in two directions (processes)
Data flow: division of dataset a into datasets b and c
Data flow: dataset a originates from datasets b and c
Data flow: updating of dataset a in a storage