Title: DW-1: Introduction to Data Warehousing
1DW-1 Introduction to Data Warehousing
2Overview
- What is Database
- What Is Data Warehousing
- Data Marts and Data Warehouses
- The Data Warehousing Process
- Data in a Data Warehouse
3 What Is Database
- Before
- Program Algorithm Data Structure
- Now
- Application (Weblication) Visual I/F SQL
Query Database - Database is Integrated Data
- from multiple file system data for OLTP
- Data Base (From Air Base?), DB, ??????, ????(??)
4Database and Data Model
- Computer Representation of Data for efficient
understanding and processing - Data Model based on Relationship modeling
- Relationship between record
- one-to-one(11), one-to-many(1N),
many-to-many(NM) - Hierarhical Model Hierarchical Relationship, 1N
- Network model Network like relationship, NM
- Relational Model Use relation (table) for
Relationship - Object-Oriented data model Complex object
modeling - SET type, Reference, List
5 What Is Data Warehousing
- Defining Data Warehousing
- Operational Systems A Transactional Solution
- Analytical Systems A Data Warehousing Solution
- Comparing Transactional and Data Warehousing
Solutions
6Defining Data Warehousing
- Business Intelligence
- Database Marketing Personalized Product
- Especially S/W, Cocoon business etc.
- Electronic Commerce
- Data Warehouse ?? ??
- for OLAP, Data Mining, DSS
- Knowledge Management
- Data Warehousing Process to build Data Warehouse
7Defining Data Warehousing
- A Data Warehouse Is a Database That Contains
- Enterprise data
- Integrated sets of historical data
- Subject-oriented, consolidated, consistent data
- Data structured for distribution and querying
- A Data Warehousing Solution Is a Process That
- Retrieves and transforms data
- Manages the database
- Uses tools for building and managing the data
warehouse
8Operational Systems A Transactional Solution
- Track Individual Events
- Used for Real-time Data Entry and Editing
- Examples
- Order-tracking applications
- Customer service applications
- Point-of-sale applications
- Service-based sales applications
- Banking functions
9Analytical Systems A Data Warehousing Solution
- Assist with Strategic Decision Support
- Provide Different Levels of Analysis
- Allow Users to Navigate to Different Levels of
Data - Allow System Searches to Find New Relationships
- Examples
- Spreadsheet-based applications
- Sales forecasting applications
10Comparing Transactional and Data Warehousing
Solutions
11 Data Marts and Data Warehouses
- What Is a Data Mart
- Moving Data from a Data Warehouse to Data Marts
- Moving Data from Data Marts to a Data Warehouse
12What Is a Data Mart
- What Is a Data Mart
- A subset of a data warehouse
- Used in an enterprise
- Specific to a particular subject or business
activity - Why Build Data Marts
- Faster queries and fewer users
- Faster deployment time
- Integrated Data Marts
- Ensure consistent data
- Require advance planning
13Moving Data From a Data Warehouse to Data Marts
- Advantages
- Shared fields
- Common source
- Distributed processing
- Disadvantages
- Longer time to develop
14Moving Data from Data Marts to a Data Warehouse
- Advantages
- Simpler and faster to implement
- Department-specific data
- Smaller hardware requirements
- Disadvantages
- Data duplication
- Incompatible data marts
15The Data Warehousing Process
- Basic Elements of the Process
- Tools to Manage the Process
16Basic Elements of the Process
1
3
4
5
2
17Tools to Manage the Process
- SQL Server
- Data Transformation Services
- SQL Server OLAP Services
- Microsoft Repository
- Microsoft English Query
- PivotTable Service
18ETL process
- Extraction, Transformation, Loading
- Extraction ??
- Data retrieval from existing data source such as
File, Table etc. - Transformation ??
- Data modification, sorting, calculation etc
- Loading ??
- Bulk, incremental loading from operational DB
- Time consuming process may use special H/W
19Data in a Data Warehouse
- Data Characteristics
- Example of Organizing Data
20Data Characteristics
21Example of Organizing Data
Monthly Southeast Regional Sales Report - May
1999
City Miami Tampa Atlanta Savannah Columbia
State FL FL FL Totals GA GA GA Totals SC SC Totals
Units Sold 2,500 2,750 5,250 3,200 1,725 4,925 1,9
00 1,900 12,075
Sales 12,850 14,135 26,985 16,800
9,143 25,943 9,595 9,595 62,473
Southeast Region Total
22Data Warehouse Schema Example Star schema
23A Example of Cube Browsing
- 1 Fact with 4 Dimension Table
- -- Sales_Fact, Product, Store, Time, Customer
24Drilling Down
- Drilling Down to products
25Drilling Down
- Drilling Down to the lowest level of Customer
Dimension
26Rolling up
27Review
- What Is Data Warehousing
- Data Marts and Data Warehouses
- The Data Warehousing Process
- Data in a Data Warehouse
- Data Warehouse will be more popular than DB?