Title: BizTalk
1Atlanta Microsoft Database Forum
Introduction to Data Warehousing Concepts
Presented by
Brian Thomas
Solution Builders, Inc.
March 8, 2004
Brian.Thomas_at_SolutionBuilders.com
2What is a Data Warehouse?
Data collected from one or many systems that
exist within and outside the organization. The
Data is structured in such a way as to reduce the
amount of time that it takes to produce reliable
information.
3Why Build a Data Warehouse?
- To Provide a Consistent Common Source for
Corporate Information - To Store Large Volumes of Historical Detail Data
from Mission Critical Applications - Improve the Ability to Access, Report Against,
and Analyze Information - To Solve or Improve Upon Business Processes
4Turning Data into Information
Functional Data Warehouse
Sales System
5Turning Data into Information
Functional Data Warehouse
Sales System
6Turning Data into Information
Cross Organizational Functional Data Warehouse
Division A
Sales System
Division B
Sales System
Division C
Sales System
7Turning Data into Information
Cross Functional Data Warehouse
Marketing System
Sales System
Production Systems
8Turning Data into Information
Cross Functional Data Warehouse
Marketing System
Sales System
Production Systems
9Turning Data into Information
Cross Organizational Cross Functional Data
Warehouse
Division A
Division B
Division C
10Data Warehouse Architecture
Management Systems
Access Methods
Source Systems
Data Warehouse Components
Planning Forecasting
Corporate Level
Portal / Web Interface
Division A
Analytics Modeling
Desktop Applications
Division B
Business Group Level
Performance Management
DW / DM
DW / DM
Data Access Query Management Services
Extraction Transformation Load (ETL)
DW / DM
Printed Reports
Division C
Scorecards Dashboards
Email
Divisional Level
DM
DM
DM
DM
DM
DM
Query Reporting
External Data
Mobile Devices
11Data Warehouse Architecture
Source Systems
Data Staging Area
Data Warehouse Repository
Division A
Division B
Extract, Transformation and Load (ETL)
Division C
External Data
12Data Warehouse Architecture
Data Staging Area
- Subject Area Oriented
- Data Structure more closely mirrors Operational
System Data Layouts - Supports Identification of Changed Data
- Acts as a Working Area to Support the
Transformation Process
13Data Warehouse Architecture
Extraction, Transformation Load (ETL)
- Perform Attribute Standardization and Cleansing
- Apply Business Rules and Calculations
- Consolidate using Matching and Merge / Purge
Logic - Ensure Proper Linking and Tracking of History
Extract, Transformation and Load (ETL)
14Data Warehouse Architecture
Extraction, Transformation Load (ETL)
App. A Male , Female App. B 1 , 0 App. C x ,
y App. D m , f
Male, Female
Lookup Function
App. A pipeline (cm) App. B pipeline
(inches) App. C pipeline (mcf) App. D pipeline
(yds)
pipeline (cm)
Conversion Function
App. A Date (julian) App. B Date
(yyyymmdd) App. C Date (mm/dd/yyyy) App. D Date
(absolute)
Date (julian)
Formatting Function
App. A Description App. B Description App. C
Description App. D Description
Description
Merging Function
App. A balance on hand App. B current
balance App. C cash in house App. D balance
Balance
Mapping Function
15Data Warehouse Architecture
Data Warehouse Repository
- Organized around Conformed Dimensions and Facts
- Promotes Usability and Intuitiveness
- Consolidated and Cross-Functional
- Historical and Atomic Representation of Data
-
- Insulated from Source System Modifications and
Additions
16Data Warehouse Repository
Star Schema Concepts
Fact Table
This table is the core of the Star Schema
Structure and contains the Facts or Measures
available through the Data Warehouse. These
Facts answer the questions of What, How Much,
or How Many. Some Examples
Sales Dollars, Units Sold, Gross Profit, Expense
Amount, Net Income, Unit Cost, Number of
Employees, Turnover, Salary, Tenure, etc.
17Data Warehouse Repository
Star Schema Concepts
Dimension Tables
These tables describe the Facts or Measures.
These tables contain the Attributes and may also
be Hierarchical. These Dimensions answer the
questions of Who, What, When, or
Where. Some Examples
- Day, Week, Month, Quarter, Year
- Sales Person, Sales Manager, VP of Sales
- Product, Product Category, Product Line
- Cost Center, Unit, Segment, Business, Company
18Data Warehouse Repository
Star Schema Concepts
19Data Warehouse Repository
Cube Concepts
Atlanta
Chicago
Markets Dimension
Denver
Grapes
Cherries
Dallas
Melons
Product Dimension
Apples
Q4
Q1
Q2
Q3
Time Dimension
20Data Warehouse Repository
Cube Concepts
Atlanta
Chicago
Markets Dimension
Denver
Grapes
Cherries
Dallas
Melons
Product Dimension
Apples
Q4
Q1
Q2
Q3
Time Dimension
21Data Warehouse Repository
Storage Concepts
- Relational On-Line Analytical Processing
(ROLAP) The information that is stored in the
Data Warehouse is held in a relational structure.
Aggregations are performed on the fly either by
the database or in the analysis tool. - Multidimensional On-Line Analytical Processing
(MOLAP) This information is aggregated in a
predefined manner based on the characteristics of
the Measures and the defined hierarchy of the
Dimensions. Since the data is pre-aggregated,
navigating through the hierarchies is
instantaneous. The user is simply navigating to
a point within the Multidimensional Cube and not
performing any on the fly aggregations. - Hybrid On-Line Analytical Processing (HOLAP)
This is a combination of MOLAP and ROLAP. A
portion of the data is predefined and aggregated.
This would typically be the set of information
that is accessed most frequently. Additional
detail can be held in a ROLAP structure and allow
a user to drill through the MOLAP structure into
the ROLAP structure.
22Data Warehouse Repository
Cube Concepts
23Microsoft Office, Reporting Services and .NET
Framework
Where does Microsoft fit in?
SQL Server DTS
SQL Server Relational Database and Analysis
Services
Management Systems
Access Methods
Source Systems
Data Warehouse Components
Planning Forecasting
Corporate Level
Portal / Web Interface
Division A
Analytics Modeling
Desktop Applications
Division B
Business Group Level
SharePoint Portal, Exchange, and .NET Framework
Performance Management
DW / DM
DW / DM
Data Access Query Management Services
Extraction Transformation Load (ETL)
DW / DM
Printed Reports
Division C
Scorecards Dashboards
Email
Divisional Level
DM
DM
DM
DM
DM
DM
Query Reporting
External Data
Mobile Devices
SQL Stored Procedures, SQL Views, MDX, and .NET
Web Services
24Q A