Title: Introduction to Data Warehousing
1Introduction to Data Warehousing
2Topics
- Operational Data Stores
- Why Warehousing
- Data Warehouse defined
- Data Warehouse basic elements
- Data Warehouse concepts
- Further info
3Operational Data Store
- subject-oriented, integrated, volatile,
current-valued data store containing only
corporate detailed data1
Bank
Manufacturing
Telecomm
ATM Transactions
Work in Process Transactions
Cellular Phone Calls
1. Building the Operational Data Store (Wiley
1996), Bill Inmon, et. all
4Operational Data Stores Characteristics
- High Performance
- Quick Response Time (acceptable)
- Multi-User
- Pre-defined transactions
- Data is archived
- Most are Relational Databases
- OLTP (On-Line Transaction Processing)
5(No Transcript)
6(No Transcript)
7Data Warehouse Definition
- A structured, extensible environment designed for
the analysis of non-volatile data, logically and
physically transformed from multiple source
applications to align with business structure,
updated and maintained for a long period of time,
expressed in simple business terms, and
summarized for quick analysis.
8Basic Elements of a Data Warehouse
DATA STAGING AREA
Source Systems
End User Data Access
Data Warehouse
Ad Hoc
Reporting
End User
Data Marts
General Data Flow
9Source Systems
- Legacy Applications (Mainframe)
- Flat Files
- Spreadsheets
- OLTP databases (Oracle, Informix, SQL Server)
- Enterprise Systems
10Data Staging Area
- Temporary Storage for cleaning, pruning,
- combining, remove duplicates, standardize.
- export
- ETL (Extract, Transform and Load) Tools are
tools used to perform this task - Example Informatica, Data Stage, AbInitio
11(No Transcript)
12(No Transcript)
13(No Transcript)
14Data Marts
- Subject-Area Data Warehouses
Production
Planning
Finance
15End User Applications
- Graphical Representation of Multidimensional data
(2D and 3D) - Excel (2D)
- Web Applications
- Ad-hoc Reporting Applications
16(No Transcript)
17Graphical Representation of Multidimensional
Structure
18Pivot Tables
19Pivot Charts
20Why build a Warehouse?
21What does the business gain?
22Data Warehouse Definition
- A structured, extensible environment designed for
the analysis of Non-Volatile data, logically and
physically transformed from multiple source
applications to align with business structure,
updated and maintained for a long period of time,
expressed in simple business terms, and
summarized for quick analysis.
23Typical Architecture
24Data WareHouse Concepts
- Fact Table
- Surrogate Keys
- Metric Facts
- Dimensions
- Star Schema
- Snowflake Schema
- OLAP
- Data Cube
- MetaData
- Data Mining
25What is a Fact Table?
Primary Keys
Metric Facts
26What is a Dimension?
- Surrogate Key
- Should not encode any information about the
contents of records - Automatically increasing integers make good
surrogate keys. - Maintain data warehouse information when
dimensions change
I can query ANY of these data items
Other Concepts Slow or Fast Changing Dimensions
27(No Transcript)
28(No Transcript)
29OLAP
- On-Line Analytical Processing
- Fast Analysis of Shared Multidimensional
Information
30What is a Data Cube?
- Non-relational data structure that stores a
multidimensional data set - Some vendors implement this with tables in a
relational database (ROLAP)
31What is Meta Data?
32Star Schema Real Examples
33Data Mining
- Is a technology that applies sophisticated and
complex algorithms to analyze data and expose
interesting information for analysis by decision
makers - Used in Marketing Research, Fraud Prevention
34Further Information
- www.olapreport.com
- www.datawarehousingonline.com
- www.datawarehousing.com
- www.tdwi.org
- www.datawarehousing.org
- May search for
- OLAP
- Business Intelligence
- Decision Support Systems
- Data Mining
- Vendors Ascential Software, Hyperion, Cognos,
ProClarity, Sagent Technology, and Business
Objects