DATA WAREHOUSING - PowerPoint PPT Presentation

Slides: 16
Provided by: perdanaFs
Transcript and Presenter's Notes

1
DATA WAREHOUSING
2
Legacy System
  • Systems that were developed in the early years of
    business processing
  • Rich source of historical data, but the data is
    difficult to retrieve because of non-standard
    features
  • This is why a data warehouse is needed

3
Problems with Legacy System
  • Accessing data in a legacy system may be difficult
    for several reasons. The system may:
    • have been developed for a different hardware or
      software platform
    • use a different data model
    • use a different DBMS
    • use different data definitions
    • use a different data format
  • All of these make integrating and sharing data
    difficult

4
Data Definitions Problems
  • Homonyms: the same field name is used to store
    different data in different databases
  • Synonyms: different field names are used to store
    the same data in different databases
  • Domain integrity: the domain for the same field
    may differ between databases
  • Business rules may differ between databases
  • Referential integrity: there may be problems
    linking related records from different databases
  • Concurrency control: problems arise when multiple
    users access a database that was designed for a
    single user
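
A minimal sketch of how the field-name problems above might be handled when two legacy sources are merged: one mapping per source renames fields to shared warehouse names, so that different names for the same data converge and an identical name with different meanings is split apart. All field names, source names and values here are hypothetical and are not taken from the slides.

```python
# Hypothetical per-source mappings from legacy field names to warehouse names.
# "cust_no" and "client_id" are different names for the same data.
# "status" is the same name used for different data in each source.
SALES_FIELD_MAP = {
    "cust_no": "customer_id",
    "amt": "amount",
    "status": "order_status",
}
BILLING_FIELD_MAP = {
    "client_id": "customer_id",
    "total": "amount",
    "status": "account_status",
}


def to_warehouse_record(record, field_map):
    """Rename source fields to warehouse names; unmapped fields pass through."""
    return {field_map.get(name, name): value for name, value in record.items()}


sales_row = {"cust_no": 17, "amt": 250.0, "status": "shipped"}
billing_row = {"client_id": 17, "total": 99.5, "status": "overdue"}

unified = [
    to_warehouse_record(sales_row, SALES_FIELD_MAP),
    to_warehouse_record(billing_row, BILLING_FIELD_MAP),
]
print(unified)
```

Keeping one mapping per source means a new legacy system can be added without touching the mappings of the others.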

5
Data Warehouse Concepts
  • Technique of extracting and filtering data from
    diverse databases and using this data to build a
    new database
  • Stores information extracted from historical,
    operational and external databases
  • The primary purpose is to provide information for
    management decision making

6
Database vs data warehouse
7
Data Warehouse Architecture
  • Operational database / external database layer
  • Information access layer
  • Data access layer
  • Metadata layer
  • Process management layer
  • Application messaging layer
  • Physical layer
  • Data staging layer

8
Data Warehouse Implementation
  • Data: includes operational, historical and
    external data
  • Extraction and transformation: extract and
    transform data from different tables
  • Data warehouse storage: store the extracted and
    transformed data in different tables
  • Historical data is used for forecasting purposes
  • Reports, statistics, data analysis and
    presentation: output from the data warehouse
    used to make decisions
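
The implementation steps above can be sketched with Python's built-in sqlite3 module: rows are extracted from a stand-in operational source, transformed into a consistent format, stored in a warehouse table, and summarised in a report. The table name, columns and sample rows are assumptions made for illustration, and an in-memory database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical operational rows: (order_id, customer, amount_text, order_date)
operational_rows = [
    (1, " alice ", "120.50", "2001-03-14"),
    (2, "BOB", "80", "2001-03-15"),
    (3, "alice", "19.50", "2001-04-02"),
]


def transform(row):
    """Normalise the customer name, parse the amount, keep only year-month."""
    order_id, customer, amount_text, order_date = row
    return (order_id, customer.strip().lower(), float(amount_text), order_date[:7])


# Storage: load the transformed rows into a warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, month TEXT)"
)
warehouse.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [transform(row) for row in operational_rows],
)
warehouse.commit()

# Reporting: monthly totals for decision makers.
report = warehouse.execute(
    "SELECT month, SUM(amount) FROM sales_fact GROUP BY month ORDER BY month"
).fetchall()
print(report)
```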

9
Data Warehouse Benefits and Risks
  • Benefits
    • Reduces reporting cost
    • Reduces data consolidation and integration cost
    • Increases efficiency and decision-making
      capabilities
  • Risks
    • May house the wrong data
    • Expensive to build and maintain
    • Requires organizational changes

10
Online Analytical Processing
  • Supports data modeling and multidimensional data
    analysis
  • OLAP systems share these characteristics
    • Provide a user-friendly interface
    • Use multidimensional data analysis techniques
    • Provide advanced database support
    • Support client/server architecture

11
Online Analytical Processing
  • OLAP systems can be classified as
    • Relational Online Analytical Processing (ROLAP):
      uses an RDBMS
    • Multidimensional Online Analytical Processing
      (MOLAP): uses a multidimensional DBMS (MDBMS)
      rather than an RDBMS
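
To make multidimensional data analysis concrete, here is a small sketch of a roll-up: sales facts described by three dimensions are summed over whichever dimensions the analyst chooses to keep. The dimensions, the sample facts and the `roll_up` helper are hypothetical; a real OLAP tool would run this kind of aggregation inside the database engine.

```python
from collections import defaultdict

# Hypothetical facts: (region, product, quarter, sales)
DIMENSIONS = ("region", "product", "quarter")
facts = [
    ("North", "Widget", "Q1", 100),
    ("North", "Widget", "Q2", 150),
    ("North", "Gadget", "Q1", 70),
    ("South", "Widget", "Q1", 90),
    ("South", "Gadget", "Q2", 60),
]


def roll_up(facts, keep):
    """Sum sales over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[i] for i in positions)
        totals[key] += fact[-1]  # the last element is the sales measure
    return dict(totals)


by_region = roll_up(facts, ["region"])
by_region_product = roll_up(facts, ["region", "product"])
print(by_region)
print(by_region_product)
```

Moving from `by_region` to `by_region_product` corresponds to drilling down one level, and the reverse direction is a roll-up.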

12
Data Mining
  • Data mining is a decision support tool that
    enables a user to directly access a large amount
    of data and analyze it
  • Data mining is the set of activities used to find
    new, hidden, or unexpected patterns in data

13
Data Mining Technique
  • The data mining process has four phases
    • Data preparation: the main data sets to be used
      are identified and cleaned
    • Data analysis and classification: identify
      common data characteristics or patterns
    • Knowledge acquisition: develop a model that
      resembles the target data
    • Prediction: the model is used to predict future
      behaviour and forecast business outcomes
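
The four phases can be illustrated with a deliberately tiny sketch. The "model" here is just the average spend per customer segment, which stands in for a real data mining algorithm; the records, field names and the `predict_spend` helper are all hypothetical.

```python
from statistics import mean

# Hypothetical raw records; one has a missing value.
raw = [
    {"segment": "student", "spend": 20.0},
    {"segment": "student", "spend": 30.0},
    {"segment": "retiree", "spend": 55.0},
    {"segment": "retiree", "spend": None},
    {"segment": "retiree", "spend": 45.0},
]

# Phase 1, data preparation: drop records with missing values.
clean = [record for record in raw if record["spend"] is not None]

# Phase 2, data analysis and classification: group records by segment.
groups = {}
for record in clean:
    groups.setdefault(record["segment"], []).append(record["spend"])

# Phase 3, knowledge acquisition: build a simple model of each group.
model = {segment: mean(values) for segment, values in groups.items()}


# Phase 4, prediction: use the model to estimate a new customer's spend.
def predict_spend(segment):
    return model[segment]


print(model)
print(predict_spend("student"))
```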

14
Data Mining Tools
  • Data mining tools today have the following
    characteristics
    • Data preparation facilities
    • Selection of data mining operations
    • Product scalability and performance
    • Facilities for visualization of results

15
  • END