Title: Data Warehousing
1Data Warehousing
2Databases support
- Transaction Processing Systems
- operational level decision
- recording of transactions
- Decision Support Systems
- tactical and strategic decision making
- analysis of historical records
3Can one database support both?
RDBMS
DSS
TPS
4Can one database support both?
RDBMS
DSS
TPS
- low concurrency
- large reads
- significant aggregation
- high concurrency
- small transactions
- limited aggregation
Yes but at a cost in performance.
5The Solution
TPS
DSS
Production Database (OLTP)
Data Warehouse
Extract, Transport Transformation Load
6OLTP vs DW Characteristics
OLTP Database
Data Warehouse
High Read/Write Concurrency
Primarily Read Only
Highly Normalized
Highly Denormalized
Limited Transaction History
Massive Transaction History
Very Detailed Data
Detailed and Summarized Data
Limited External Data
Significant External Data
7Data Marts (3-tier approach)
Data Mart A
External Data Sources
DSS
Data Warehouse
Production Database (OLTP)
Data Mart B
DSS
ETL
Data Mart C
DSS
Transformation Limitation
8Data Marts (bottom-up approach)
External Data Sources
Data Mart A
DSS
ETL
Production Database (OLTP)
External Data Sources
Data Mart B
ETL
DSS
ETL
Data Mart C
DSS
External Data Sources
9Multi-dimensional (Sales) Data
80
110
60
25
California
40
90
50
30
Utah
70
55
60
35
March 3
Arizona
March 2
March 1
Diet Soda
Lime Soda
Soda
Orange Soda
10Cube Operations
- Cube (group by option)
- Slice (implement in Oracle with where clause)
- Dice (implement in Oracle with where clause)
- Drill Down (implemented in report writers)
- Roll-up (group by option)
- Pivot (not implemented by Oracle (but by Access))
11Cube Data Example
- Create table sales (
- Item varchar2(20),
- State varchar2(20),
- Amount number(6),
- Day date)
- Insert into Sales
- values('Soda','California',80,'01-Mar-2004')
- Insert into Sales
- values('Diet Soda','California',110,'01-Mar-2004
')
12Examine these queries
- Select from sales
- Select Item, State, sum(amount)
- from sales
- group by Item, State
- Select Item, State, sum(amount)
- from sales
- group by Rollup(Item, State)
- Select State, Item, sum(amount)
- from sales
- group by Rollup(State, Item)
- Select State, Item, sum(amount)
- from sales
- group by Cube(State, Item)
13Materialized Views
- Materialized views are schema objects that can be
used to summarize, precompute, replicate, and
distribute data. They are suitable in various
computing environments such as data warehousing,
decision support, and distributed or mobile
computing - In data warehouses, materialized views are used
to precompute and store aggregated data such as
sums and averages. Materialized views in these
environments are typically referred to as
summaries because they store summarized data. - Cost-based optimization can use materialized
views to improve query performance by
automatically recognizing when a materialized
view can and should be used to satisfy a request.
The optimizer transparently rewrites the request
to use the materialized view. Queries are then
directed to the materialized view and not to the
underlying detail tables or views. - In distributed environments, materialized views
are used to replicate data at distributed sites
and synchronize updates done at several sites
with conflict resolution methods. The
materialized views as replicas provide local
access to data that otherwise has to be accessed
from remote sites. - In mobile computing environments, materialized
views are used to download a subset of data from
central servers to mobile clients, with periodic
refreshes from the central servers and
propagation of updates by clients back to the
central servers.
14Create Materialized View (partial syntax)
15Materialized View refresh_clause
16MV Example
- Create Materialized View MVcustomer
- REFRESH start with sysdate Next sysdate(1/24)
- AS
- Select customerID,lastname,firstname, phone
- from customers
17RDBMS Star Schema
Item
Store
ItemID
StoreID
Name
Manager
Sales
UnitPrice
Street
SalesNO
Brand
City
SalesUnits
Category
Zip
SalesDollars
SalesCost
ItemID
Customer
Day
CustID
CustID
DayID
StoreID
Name
DayOfMonth
DayID
Phone
Month
Street
Year
City
DayOfWeek