11. Building Information Systems - PowerPoint PPT Presentation

1 / 42
About This Presentation
Title:

11. Building Information Systems

Description:

... concerns two or more tables that are related. ... Concepts related to Normalization: KEYS and FUNCTIONAL DEPENDENCE. Ex: ... Poorly Designed. Non-unique ... – PowerPoint PPT presentation

Number of Views:30
Avg rating:3.0/5.0
Slides: 43
Provided by: www58
Category:

less

Transcript and Presenter's Notes

Title: 11. Building Information Systems


1
DATA RESOURCE MANAGEMENT
2
Data Hierarchy in a Computer System
3
Entitities and Attributes
4
Problems with the Traditional File Environment
  • Data redundancy
  • Program-Data dependence
  • Lack of flexibility
  • Poor security
  • Lack of data-sharing and availability

5
Traditional File Processing
6
Database Management System (DBMS)
  • Creates and maintains databases
  • Eliminates requirement for data definition
    statements
  • Acts as interface between application programs
    and physical data files
  • Separates logical and physical views of data

7
The Contemporary Database Environment
8
Components of DBMS
  • Data definition language Specifies content and
    structure of database and defines each data
    element
  • Data manipulation language
  • Manipulates data in a database
  • Data dictionary Stores definitions of data
    elements, and data characteristics

9
Sample Data Dictionary Report
10
Relational Data Model
11
Three Basic Operations in a Relational Database
  • Select Creates subset of rows that meet specific
    criteria
  • Join Combines relational tables to provide users
    with information
  • Project Enables users to create new tables
    containing only relevant information

12
Three Basic Operations in a Relational Database
13
FLAT FILE NOT NORMALIZED
14
A Normalized Relation of ORDER
15
Ensuring Database Integrity
  • Database integrity involves the maintenance of
    the logical and business rules of the database.
  • There are two kinds of DB Integrity that must
    be addressed
  • Entity Integrity
  • Referential Integrity

16
Entity Integrity
  • Entity integrity deals with within-entity rules.
  • These rules deal with ranges and the permission
    of null values in attributes or possibly between
    records

17
Examples of Entity Integrity
  • Data Type Integrity very common and most basic.
    Checks only for data type compatibility with DB
    Schema, such as numeric, character, logical,
    date format, etc.
  • Commonly referred to in GIS manuals as
  • Range and List domains
  • Ranges - acceptable Numeric ranges for input
  • List - acceptable text entries or drop-down lists.

18
Enforcing Integrity
  • Not a trivial task!
  • Not all database management systems or GIS
    software enable users to enforce data integrity
    during attribute entry or edit sessions.
  • Therefore, the programmer or the Database
    Administrator must enforce and/or check for
    Integrity.

19
Referential Integrity
  • Referential integrity concerns two or more tables
    that are related.
  • Example IF table A contains a foreign key that
    matches the primary key of table B THEN
    values of this foreign key either match the value
    of the primary key for a row in table B or must
    be null.
  • Necessary to avoid Update anomaly, Delete
    anomaly.

20
Querying Databases Elements of SQL
  • Basic SQL Commands
  • SELECT Specifies columns
  • FROM Identifies tables or views
  • WHERE Specifies conditions

21
Using SQL- Structured Query Language
  • SQL is a standard database protocol, adopted by
    most relational databases
  • Provides syntax for data
  • Definition
  • Retrieval
  • Functions (COUNT, SUM, MIN, MAX, etc)
  • Updates and Deletes

22
SQL Examples
  • CREATE TABLE SALESREP
  • Item definition expression(s)
  • item, type, (width)
  • DELETE table
  • WHERE expression

23
Data Retrieval
  • SELECT list FROM table WHERE condition
  • list - a list of items or for all items
  • WHERE - a logical expression limiting the number
    of records selected
  • can be combined with Boolean logic AND, OR, NOT
  • ORDER may be used to format results

24
UPDATE tables
  • SET item expression
  • WHERE expression
  • INSERT INTO table
  • VALUES ..

25
Database Normalization
  • Normalization The process of structuring data to
    minimize duplication and inconsistencies.
  • The process usually involves breaking down a
    single Table into two or more tables and defining
    relationships between those tables.
  • Normalization is usually done in stages, with
    each stage applying more rigorous rules to the
    types of information which can be stored in a
    table.

26
Normalization
  • Normalization a process for analyzing the design
    of a relational database
  • Database Design - Arrangement of attributes into
    entities
  • It permits the identification of potential
    problems in your database design
  • Concepts related to Normalization
  • KEYS and FUNCTIONAL DEPENDENCE

27
Ex Database Normalization (1)
  • Sample Student Activities DB Table
  • Poorly Designed
  • Non-unique records
  • John Smith
  • Test the Design by developing sample reports and
    queries

28
Ex Database Normalization (2)
  • Created a unique ID for each Record in the
    Activities Table
  • Required the creation of an ID look-up table
    for reporting (Students Table)
  • Converted the Flat-File into a Relational
    Database

29
Ex Database Normalization (3)
  • Wasted Space
  • Redundant data entry
  • What about taking a 3rd Activity?
  • Query Difficulties - trying to find all swimmers
  • Data Inconsistencies - conflicting prices

30
Ex Database Normalization (4)
  • Students table is fine
  • Elimination of two columns and an Activities
    Table restructuring, Simplifies the Table
  • BUT, we still have Redundant data (activity fees)
    and data insertion anomalies.

Problem If student 219 transfers we lose all
references to Golf and its price.
31
Ex Database Normalization (5)
  • Modify the Design to ensure that every non-key
    field is dependent on the whole key
  • Creation of the Participants Table, corrects our
    problems and forms a union between 2 tables.

This is a Better Design!
32
The Normal Forms
  • A series of logical steps to take to normalize
    data tables
  • First Normal Form
  • Second
  • Third
  • Boyce Codd
  • Theres more, but beyond scope of this

33
First Normal Form (1NF)
  • All columns (fields) must be atomic
  • Means no repeating items in columns

Solution make a separate table for each set of
attributes with a primary key (parser, append
query)
Customers CustomerID Name
Orders OrderID Item CustomerID OrderDate
34
Second Normal Form (2NF)
  • In 1NF and every non-key column is fully
    dependent on the (entire) primary key
  • Means Do(es) the key field(s) imply the rest of
    the fields? Do we need to know both OrderID and
    Item to know the Customer and Date? Clue
    repeating fields

Solution Remove to a separate table (Make Table)
OrderDetails OrderID Item
Orders OrderID CustomerID OrderDate
35
Third Normal Form (3NF)
  • In 2NF and every non-key column is mutually
    independent
  • means Calculations
  • Solution Put calculations in queries and forms

OrderDetails OrderID Item Quantity Price
Put expression in text control or in
query Quantity Price
36
Data Warehousing and Datamining
  • Data warehouse
  • Supports reporting and query tools
  • Stores current and historical data
  • Consolidates data for management analysis and
    decision making

37
What is a Data Warehouse?
  • "A warehouse is a subject-oriented, integrated,
    time-variant and non-volatile collection of data
    in support of management's decision making
    process".
  • Bill Inmon (1990)
  • "A Data Warehouse is a repository of integrated
    information, available for queries and analysis.
    Data and information are extracted from
    heterogeneous sources as they are generated.
  • Anonymous

38
Components of a Data Warehouse
39
Data Mining
  • ON-LINE ANALYTICAL PROCESSING (OLAP) ability to
    manipulate, analyze large volumes of data from
    multiple perspectives
  • MINING Seeking relationships that are not known
    in advance. A function of the software and data
    organization.

40
DW Characteristics
  • Subject OrientedData that gives information
    about a particular subject instead of about a
    company's ongoing operations.
  • Integrated Data that is gathered into the data
    warehouse from a variety of sources and merged
    into a coherent whole.
  • Time Variant All data in the data warehouse is
    identified with a particular time period.

41
Data Acquisition
  • The process of moving company data from the
    source systems into the warehouse.
  • Often the most time-consuming and costly effort.
  • Performed with software products known as ETL
    (Extract/Transform/Load) tools.
  • Over 50 ETL tools on market.

42
Data Cleansing
  • Typically performed in conjunction with data
    acquisition.
  • A complicated process that validates and, if
    necessary, corrects the data before it is
    inserted.
  • AKA "data scrubbing" or "data quality assurance".
Write a Comment
User Comments (0)
About PowerShow.com