Title: Attribute data handling in a GIS environment
1Attribute data handling in a GIS environment.
Chapter 10 Creating and maintaining geographic
databases
B. Klinkenberg Geog 376 07
2Outline
- Linking data to place
- Definitions
- Characteristics of DBMS
- Types of database
- Relational model
- SQL
- Database design
3Linking data to place
- Having defined how the geography can be modeled
within a GIS, we now need to consider how the
characteristics (or attributes) of the geographic
features are associated with that geography.
4GIS Data
Attribute Linkages
Spatial Data
Attribute Data
5Storing attribute data
- attribute data are stored separately from the
coordinate data
- feature identifier points to an attribute table
- point attribute table
- line or arc attribute table
- polygon attribute table
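The linkage described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular GIS stores its data; the names `polygon_coords`, `polygon_attributes` and `describe`, and all sample values, are hypothetical.

```python
# Coordinate data: each polygon is stored under its feature identifier (FID).
# All identifiers, coordinates and attribute values here are made up.
polygon_coords = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],
    2: [(4, 0), (8, 0), (8, 3), (4, 3)],
    3: [(0, 3), (8, 3), (8, 6), (0, 6)],
}

# Polygon attribute table: one row per feature, keyed by the same FID.
polygon_attributes = {
    1: {"land_use": "forest", "owner": "Crown"},
    2: {"land_use": "urban", "owner": "City"},
    3: {"land_use": "farm", "owner": "Private"},
}


def describe(fid):
    """Follow the FID from the geometry to its row in the attribute table."""
    return {
        "fid": fid,
        "vertices": len(polygon_coords[fid]),
        **polygon_attributes[fid],
    }


record = describe(2)
print(record)
```

Because the two structures share only the identifier, the attribute table can grow to hundreds of columns without touching the coordinate data.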
6Storing attribute data
[Figure: polygon attribute table for polygons 1, 2 and 3]
Similarly, we can define point or line attribute
tables if the spatial features are, for example,
villages and roads.
7Storing attribute data
- good organization of the attribute data is very
important
- in socioeconomic GIS applications, the attribute
data component is often much larger than the
spatial data component (e.g., few provinces, but
hundreds of variables)
8[Figure: features labelled with identifiers 101 to 107]
9Outline
- Linking data to place
- Definitions
- Characteristics of DBMS
- Types of database
- Relational model
- SQL
- Database design
10Definitions
- Database: an integrated set of data on a
particular subject
- Geographic (spatial) database: database
containing geographic data of a particular
subject for a particular area
- Database Management System (DBMS): software to
create, maintain and access databases
11Storing data
- There are two fundamental ways to store data
- As a simple file (e.g., a text file)
- In a database
12Simple file structures
13Moving from files to databases
http://cs1.mcm.edu/rob/class/dbms/notes/Chapt01/index.html
14Advantages of Databases over Files
- Avoids redundancy and duplication
- Reduces data maintenance costs
- Applications are separated from the data
- Applications persist over time
- Support multiple concurrent applications
- Better data sharing
- Security and standards can be defined and
enforced
15Disadvantages of Databases over Files
- Expense
- Complexity
- Performance: especially with complex data types
- Integration with other systems can be difficult
16Characteristics of DBMS (1)
- Data model: support for multiple data types
- MS Access: Text, Memo, Number, Date/Time,
Currency, AutoNumber, Yes/No, BLOBs, OLE Object,
Hyperlink, Lookup Wizard (other DBMSs are
similar)
- Load data from files, databases and other
applications
- Index for rapid retrieval
17Characteristics of DBMS (2)
- Query language: SQL (also QBE, ...)
- Security: controlled access to data
- Multi-level groups
- Controlled update using a transaction manager
- Backup and recovery
- DBA tools
- Configuration, tuning
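A controlled update through a transaction can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under assumptions: the `parcel` table, the `update_area` helper and the positive-area rule are all hypothetical, and SQLite stands in for whatever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcel (fid INTEGER PRIMARY KEY, area_ha REAL NOT NULL)")
conn.execute("INSERT INTO parcel VALUES (101, 12.5)")
conn.commit()


def update_area(connection, fid, new_area):
    """Apply an update inside a transaction; roll back if the check fails."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE parcel SET area_ha = ? WHERE fid = ?", (new_area, fid)
            )
            if new_area <= 0:
                # Hypothetical business rule used to trigger a rollback.
                raise ValueError("area must be positive")
        return True
    except ValueError:
        return False


ok = update_area(conn, 101, 14.0)         # committed
rejected = update_area(conn, 101, -1.0)   # rolled back
area = conn.execute("SELECT area_ha FROM parcel WHERE fid = 101").fetchone()[0]
print(ok, rejected, area)
```

The second call changes the row and then fails its check, so the transaction is rolled back and the stored value remains the one from the first, committed update.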
18Characteristics of DBMS (3)
- Applications
- CASE tools (Computer-aided software engineering)
- Forms builder
- Report writer
- Internet Application Server
- Programmable API (Application Programming
Interface)
19Outline
- Linking data to place
- Definitions
- Characteristics of DBMS
- Types of databases
- Relational model
- SQL
20Role of DBMS
System: Geographic Information System
Tasks:
- Data load
- Editing
- Visualization
- Mapping
- Analysis
System: Database Management System
Tasks:
- Storage
- Indexing
- Security
- Query
Underlying both: Data
21Types of DBMS Models
- Hierarchical
- Network
- Relational - RDBMS
- Object-oriented - OODBMS
- Object-relational - ORDBMS
22Hierarchical DBMS
23Hierarchical and Network
Hierarchical
Network
24Relational Tables Topological data model
Keys
Foreign keys
25Object-oriented DBMS
Inheritance, encapsulation
26Overview
- Network
- essentially a programmer's database model
- efficient but inflexible and hard to understand
- Relational
- its only complex data type is the relation
- it is the only complete data model
- aimed at users instead of programmers
- relational query languages are easier to use than
full-blown programming languages
- rich underlying theory
- separation of implementation and design
- Object-Oriented
- an extension of object-oriented programming
- no generally agreed upon formal data model
- great freedom regarding complex data structures
- inheritance
- user-defined types
- encapsulation
27Outline
- Linking data to place
- Definitions
- Characteristics of DBMS
- Types of databases
- Relational model
- SQL
- Database design
28Relational DBMS (1)
- Data stored as tuples (pronounced "tup-el"), conceptualized as
tables
- Table: data about a class of objects
- Two-dimensional list (array)
- Rows: objects
- Columns: object states (properties, attributes)
29Table
Column = property
Table = object class
Row = object
30Table = file = relation
Column = field = attribute; number of columns = degree
FID = primary key / index
Row = record = tuple; number of rows = cardinality
(Text: Table 4.1)
31Relational DBMS (2)
- Most popular type of DBMS
- Over 95% of data in a DBMS is in a RDBMS
- Commercial systems
- IBM DB2
- Informix
- Microsoft Access
- Microsoft SQL Server
- Oracle
- Sybase
32Relation Rules (Codd, 1970)
- Only one value in each cell (intersection of row
and column)
- All values in a column are about the same subject
- Each row is unique
- No significance in column sequence
- No significance in row sequence
33Normalization
- Process of converting tables to conform to Codd's
relational rules
- Split tables into new tables that can be joined
at query time
- The relational join
- Several levels of normalization
- Forms: 1NF, 2NF, 3NF, etc.
- Normalization can create many expensive joins
- De-normalization is OK for performance
optimization
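The idea of splitting a table and joining it back at query time can be sketched in plain Python. This is an informal illustration rather than a formal derivation of any normal form; the village and province data are invented.

```python
# A flat, un-normalized table: province details repeat on every village row.
# All names and values are invented for illustration.
flat = [
    {"village": "Alder", "province": "North", "capital": "Pine City"},
    {"village": "Birch", "province": "North", "capital": "Pine City"},
    {"village": "Cedar", "province": "South", "capital": "Oak Town"},
]

# Split into two tables: provinces (one row each) and villages, which keep
# only the province name as a foreign key.
provinces = {}
villages = []
for row in flat:
    provinces[row["province"]] = {"capital": row["capital"]}
    villages.append({"village": row["village"], "province": row["province"]})

# Join the tables back together at query time using the common key.
rejoined = [
    {**v, "capital": provinces[v["province"]]["capital"]} for v in villages
]
print(len(provinces), rejoined == flat)
```

After the split, each capital is stored once, so a correction is made in one place; the cost is the join needed to reassemble the original rows.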
34Relational Join
- Fundamental query operation
- Occurs because of:
- Normalization
- Data created/maintained by different users, but
integration needed for queries
- Table joins use common keys (column values, i.e.
foreign keys)
- Table (attribute) join concept has been extended
to geographic case
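A relational join on a common key might look like the following sketch, which uses Python's built-in `sqlite3` module. The `polygon` and `census` tables, their columns and their values are hypothetical, chosen only to show two separately maintained tables being combined through a shared feature identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE polygon (fid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE census (fid INTEGER REFERENCES polygon(fid), population INTEGER);
    INSERT INTO polygon VALUES (101, 'Zone A'), (102, 'Zone B'), (103, 'Zone C');
    INSERT INTO census VALUES (101, 5200), (102, 870);
""")

# Inner join: keep only the rows whose fid appears in both tables.
rows = conn.execute("""
    SELECT p.fid, p.name, c.population
    FROM polygon AS p
    JOIN census AS c ON c.fid = p.fid
    ORDER BY p.fid
""").fetchall()
print(rows)
```

Zone C has no matching census row, so an inner join leaves it out; a `LEFT JOIN` would keep it with a missing population value.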
35Outline
- Linking data to place
- Definitions
- Characteristics of DBMS
- Types of database
- Relational model
- SQL
- Database design
36SQL
- Structured (or Standard) Query Language
(pronounced SEQUEL)
- Developed by IBM in the 1970s
- Now de facto and de jure standard for accessing
relational databases
- Three types of usage
- Stand alone queries
- High level programming
- Embedded in other applications
37Types of SQL Statements
- Data Definition Language (DDL)
- Create, alter and delete data structures (tables, indexes)
- CREATE TABLE, CREATE INDEX
- Data Manipulation Language (DML)
- Retrieve and manipulate data
- SELECT, UPDATE, DELETE, INSERT
- Data Control Language (DCL)
- Control security of data
- GRANT, CREATE USER, DROP USER
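The DDL and DML statement types listed above can be tried out with Python's built-in `sqlite3` module, as in this minimal sketch. The `road` table and its contents are hypothetical. SQLite has no user accounts, so DCL statements such as GRANT are not shown; they would need a server DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute("CREATE TABLE road (fid INTEGER PRIMARY KEY, name TEXT, lanes INTEGER)")
conn.execute("CREATE INDEX idx_road_name ON road (name)")

# DML: insert, update, delete and retrieve rows.
conn.executemany(
    "INSERT INTO road (fid, name, lanes) VALUES (?, ?, ?)",
    [(1, "Main St", 2), (2, "Highway 1", 4), (3, "Mill Lane", 1)],
)
conn.execute("UPDATE road SET lanes = 3 WHERE fid = 1")
conn.execute("DELETE FROM road WHERE fid = 3")
remaining = conn.execute("SELECT name, lanes FROM road ORDER BY fid").fetchall()
print(remaining)
```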
38Outline
- Linking data to place
- Definitions
- Characteristics of DBMS
- Types of database
- Relational model
- SQL
- Database design
39Steps involved in database creation
- Data investigation: consider the type, quantity
and qualities of data to be included in the
database; the nature of the entities and
attributes is decided (inventory of data, needs
analysis).
- Data modeling: form a conceptual model of data by
examining the relationships between entities and
the characteristics of entities and attributes
(logical design--infological model).
40Steps involved in database creation
- Database design: creation of a practical design
for the database. This step depends upon and is
constrained by the software being used. Field
names, specific attribute types and structures
(e.g., tables) are decided (physical
design--datalogical model).
- Database implementation: populating the database
with attribute data. This is followed by
monitoring and upkeep, fine tuning, modification
and updating.
41Database design perspectives
- Infological problems deal with how to define the
information to be provided by the system to
satisfy the needs of its users.
- Datalogical problems are about how to design the
structure and operation of the system and how to take
full advantage of the information technology
currently available.
- Essentially, infological work refers to system
analysis and conceptual modeling, and datalogical
work to technical design and physical
implementation of the system.
42ERM
- Entity Relationship Modelling
- The identification of entities
- The identification of relations between entities
- The identification of attributes of entities
(infological steps)
- The derivation of tables from this (datalogical
steps)
43ERM
44ERM Relations
Mapping an ER Model into a table.
Example of a 1:1 relation
Example of a 1:M relation
Example of an M:N relation
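One common way to map an M:N relation into tables is a junction table that holds one row per pairing. The sketch below uses Python's built-in `sqlite3` module; the `owner`, `parcel` and `ownership` tables are hypothetical and are not taken from the slides' figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, address TEXT);
    -- Junction table: one row per (owner, parcel) pairing.
    CREATE TABLE ownership (
        owner_id INTEGER REFERENCES owner(owner_id),
        parcel_id INTEGER REFERENCES parcel(parcel_id),
        PRIMARY KEY (owner_id, parcel_id)
    );
    INSERT INTO owner VALUES (1, 'Ng'), (2, 'Silva');
    INSERT INTO parcel VALUES (10, '1 Elm Rd'), (11, '2 Elm Rd');
    INSERT INTO ownership VALUES (1, 10), (2, 10), (2, 11);
""")

# One parcel can have many owners ...
owners_of_10 = [r[0] for r in conn.execute("""
    SELECT o.name FROM owner AS o
    JOIN ownership AS os ON os.owner_id = o.owner_id
    WHERE os.parcel_id = 10
    ORDER BY o.name
""")]

# ... and one owner can hold many parcels.
parcels_of_silva = [r[0] for r in conn.execute("""
    SELECT parcel_id FROM ownership WHERE owner_id = 2 ORDER BY parcel_id
""")]
print(owners_of_10, parcels_of_silva)
```

A 1:M relation needs no junction table: a foreign key column on the "many" side is enough.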
45Database design perspectives
- Prof. Börje Langefors recognized the importance
of three contexts in the infological approach.
They are the organizational context, wherein
organized collections of people/individuals are
perceived; the language context, wherein
organized collections of symbols and linguistic
behaviors are perceived; and technical context,
wherein organized collections of technical
artifacts (computers, telecommunication
technologies, software) are perceived (Iivari &
Lyytinen, 1998, p. 170).
This quote also perfectly describes the situation
of GIS within an organization.
http://isworld.student.cwru.edu/tiki/tiki-index.php?page=Langefors_Review
46Summary
- Database: an integrated set of data on a
particular subject
- Databases offer many advantages over files
- Relational databases dominate
- Database design issues