Title: Introduction to Data Warehousing
1 Introduction to Data Warehousing
- Data Warehousing Technologies
-
- Dr. Ilieva I. Ageenko
2History of Information Processing
- Most organizations began information processing
on a small scale, automating one application at a
time. - Systems tend to grow independently to support
defined functional areas. - Each functional area tended to plan and develop
systems in isolation from other areas.
3Stages of Information Processing
- File Transaction Processing
4File Transaction Processing
- The Traditional approach to file processing
encouraged each functional area to develop and
maintain specialized applications. - Individual applications ran on unique master
files.
5Problems with traditional file processing
- Data Redundancy
- Lack of Data Integrity
- Program-Data Dependency
- Lack of Flexibility
- Poor Security
- Lack of Data Sharing and Availability
6Traditional File Processing
Data Redundancy and Inconsistency across all
files
Registration Application
Library Application
Financial aids Application
Credit Records Application
File A Student ID Name Address ZIP Code phone
number
File B Student ID First Name Last Name Address
ZIP phone number
File C Student ID First Last Name Address ZIP
Code phone number
File D Social Security Name Address ZIP
Code phone number
7Stages of Information Processing
- File Transaction Processing
- Data Based Management System (DBMS)
8Data Base Management Systema single source for
all processing
Registration IS application
Library IS application
Credit Records IS application
Financial aids application
Common Data Dictionary
DBMS- database management system
Data Definition Language
Data Manipulation Language
INTEGRATED STUDENTS DATABASE
Students name address
Credit Records number of credits classes
Books Book a Book b
9Reasons for Extract Programs
- Accessibility
- move data out of online processing systems
- Performance
- perform analytical functions separate from online
processing functions - Control
- shift in control of the data
- the end-user ends up owing it
10Problems with naturally evolution of data
extraction
- Credibility of data
- Low Productivity
- Inability to transform data into information
11Stages of Information Processing
- File Transaction Processing
- Data Based Management System (DBMS)
- Extract Processing
- Decision Support Systems (DSS)
12Decision Support Systems
- Computer system at the management level of an
organization that combines data, sophisticated
analytical models, and user-friendly software to
support semi-structured and unstructured decision
making. - DSS often tend to be stand-alone systems,
developed by end-user groups not under central IS
control
13Components of DSS
- DSS database
- A collection of current or historical data from a
number of applications or groups - Model base
- A collection of analytical (math , statistic)
models that can easily be made accessible to the
DSS user. - DSS software system
- The DSS component that permits easy interaction
between the users of the system and the DSS
database model base.
14Extract Processing
Report
DSS
Back Office Intensive Manual Work - DSS
Ad-hoc report A
Ad-hoc report B
Ad-hoc report C
Ad-hoc report D
Registration System
Library System
Financial aids System
Credit Records System
Student ID Name Address ZIP Code phone number
Student ID First Name Last Name Address ZIP
Student ID First Last Name Address ZIP
Code phone number
Social Security Name Address ZIP Code phone
number
15Dilemma- Most of the Business Analysts time is
not spent in true data analysis
- These logistics factors can negatively impact and
slow down efficiency and effectiveness of
business analysis - Growing Volume of Data
- Data stored in many different systems and formats
- The criticality of quick decision making
- Introduction to new products and Market dynamics
- Change in organizational strategies
16Stages of Information Processing
- File Transaction Processing
- Data Based Management System (DBMS)
- Extract Processing
- Decision Support Systems (DSS)
- Data Warehouses
17DATA WAREHOUSE
- Multidimensional database with reporting and
query tools, that stores current and historical
data extracted from various operational systems
and consolidated for management reporting and
analysis. - Addresses the problem of integrating key
operational data from around the company in a
form that is consistent , reliable, and easily
available for reporting.
18Data Warehouse Enterprise Architecture
Transaction Processing Systems (Legacy)
DATA MARTS
Customer data
Marketing
Deposits
DATA WAREHOUSE
Savings
DATA EXTRACTION
TRANSFORMATION
Credit Card
CLEANING and CONDITIONING
Credit Cards
Collections
Small Business
SAS
BUSINESS OBJECTS
SQL
19Stages of Information Processing
- File Transaction Processing
- Data Based Management System (DBMS)
- Extract Processing
- Decision Support Systems (DSS)
- Data Warehouses
- OLAP
20Data Warehouse Architecture and OLAP
DATA MARTS
Transaction Processing Systems (Legacy)
OLAP
OLAP
Customer data
Marketing
Deposits
OLAP
DATA WAREHOUSE
Savings
DATA EXTRACTION
TRANSFORMATION
Credit Card
CLEANING and CONDITIONING
Credit Cards
OLAP
Collections
Small Business
SAS
BUSINESS OBJECTS
SQL
21OLTP vs. OLAP
- OLTP database applications are developed to meet
the day-to-day and operational data retrieval
needs of end-users - Provides read-write capability
- Data Warehouses along with OLAP tools are being
developed to meet information exploration and
historical trend analysis management needs - Provides read-only capability
22Stages of Information Processing
- File Transaction Processing
- Data Based Management System (DBMS)
- Extract Processing
- Decision Support Systems (DSS)
- Data Warehouses
- OLAP
- Data Mining
23Data Mining
- The exploration and analysis, by automatic or
semiautomatic means, of large quantities of data
in order to discover valid , meaningful patterns
and rules to assist with business decisions.
24Data Warehouse Architecture and OLAP
DATA MARTS
Transaction Processing Systems (Legacy)
OLAP
OLAP
Customer data
Marketing
Deposits
OLAP
DATA WAREHOUSE
Savings
DATA EXTRACTION
TRANSFORMATION
Credit Card
CLEANING and CONDITIONING
Credit Cards
OLAP
Collections
Small Business
DATA MINING
25Warehousing data outside the operational systems
- The primary concept of data warehousing is that
the data stored for business analysis can most
effectively be accessed by separating it from the
data in the operational systems. - Fundamental differences between operational and
informational (DW) environment - Nature of the data
- Development Cycle
- Supporting technology
- User community
- Processing characteristics