Title: Data Warehouse- Procurement
1Data Warehouse- Procurement
2Overview
- Value chain reinforcement
- Blended versus separate transaction schema
- Slowly changing dimension techniques
3Procurement Case Study
- Procurement involves a wide range of activities
- Negotiation of contracts
- Issuing of purchase requisitions POs
- Tracking receipts
- Authorizing payments
- Common analytic requirements
- Which materials/products are purchased most
frequently? Who supplies them? At what prices? - Across the enterprise, are there opportunities to
negotiate contracts by consolidating suppliers,
single sourcing? - Are we purchasing from the preferred vendors?
- How are vendors performing?
4Procurement Transactions
- Business process to model Procurement
- Transactions purchase requisitions, purchase
orders, shipping notifications, receipts, and
payments - Granularity 1 row per procurement transaction
- Key dimensions
- Transaction date
- Product
- Vendor (1 row for each vendor)
- Contract terms (1 row for each set of terms
negotiated with a vendor) - Procurement transaction type
- Transaction date and product are conformed
dimensions - Measured facts
- Procured units
- Transaction amount
5Multiple- vs. Single-Transaction Fact Tables
Issues
- Business users view purchase orders, shipping
notices, warehouse receipts, and vendor payments
as separate unique processes - Several procurements transactions come from
different source systems - Purchasing system purchase requisitions and
purchase orders - Warehousing system shipping notices and
warehouse receipts - Accounts payable system vendor payments
- Several transaction types have different
dimensionality - E.g., discounts applicable to vendor payments but
not to other types of transactions - Control numbers such as PO generated during
procurement process are candidates for degenerate
dimensions - Design decision
- Build separate fact tables for each transaction
type, or - Build a blended transaction fact table with a
transaction type dimension - No simple answer.
6Multiple- vs. Single-Transaction Fact Tables -
Cont.
- Questions to ask
- What are the users analytic requirements?
- How do users usually analyze data?
- Do they analyze multiple transaction types
together or do they usually look at a single
transaction type? - Are there multiple unique business processes?
Yes ? leaning towards separate tables - Are multiple source systems involved? Yes ?
leaning towards separate tables - What is the dimensionality of the facts? Do some
dimensions apply only to some transaction types?
Yes ? leaning towards separate tables - Solution multiple transaction fact tables
7 Fact Table with Multiple Transaction Types
Product Dimension
Purchase Requisition Fact Product Key
(FK) Vendor Key (FK) Contract Terms Key (FK)
Date Dimension
Contract Terms Dimension
Vendor Dimension
Procurement Transaction Type Dimension
8Multiple Fact Table
Date Dimension
Product Dimension
Purchase Requisition Fact
Contract Terms Dimension
Vendor Dimension
Purchase Order Fact
Received Condition Dimension
Employee Dimension
Shipping Notices Fact
Discount Taken Dimension
Ware House Receipt Fact
Vendor Payment Fact
9Multiple Fact Table Details
- How it can be used
- How much time it took for material to come to
store after the Purchase Order has been released - Analysis can be done for the Movement of
materials - How much time the shipping takes
- What is the trend of Payments. Are the Suppliers
happy. - Vendor rating on various scales can also be done
10Multiple Fact Tables
- Advantages
- Richer, more descriptive dimensions and
attributes - Simplified staging activities, since operational
data exist in separate source systems - Loading data into separate fact tables will be
less complex than attempting to integrate from
multiple sources - Disadvantages
- More time to manage and administer more tables
to load, index, and aggregate
11Slowly Changing Dimensions
- We have assumed dimensions to be independent of
time, but some dimensions (other than natural
keys) may change slowly with time - Need to track change, without full-blown
normalized structure without making every
dimension time-dependent - For each attribute in our dimension tables, we
must specify a strategy to handle change
12Techniques for Dealing with Dimension Change
Type 1 Overwrite the Value
- Overwrite the old attribute value in the
dimension row, replacing it with the current
value. ? the attribute always reflects the most
recent assignment - The type 1 response is easy to implement, but it
does not maintain any history of prior attribute
values - Question Is there a business need for retaining
the old attribute value?
13Original product dimension
14Type 1 response
15Techniques for Dealing with Dimension Change
Type 2 Add a Dimension Row
- Create a new dimension row reflecting the new
attribute - Two separate surrogate keys one for old row and
one for new - Product Key is used as the primary key instead of
the SKU number, which is the natural key and is
the same for both rows - Could also use a most recent row indicator to
tell us which of the two rows is the current - Fact table is again untouched
- Adding a dimension row is the primary technique
for accurately tracking SCD attributes - Advantage
- New dimension row automatically partitions
history in fact table pre-change fact rows use
the pre-change surrogate key - No need to revisit preexisting aggregation tables
- Disadvantage
- Accelerated dimension table growth
- Does not allow us to associate the new attribute
value with the old fact file or vice versa
16Type 2 response
17Techniques for Dealing with Dimension Change
Type 3 Add a Dimension Column
- Add a new dimension column containing the old
attribute value (E.g., Prior Department) - Overwrite the old value with the new
- More appropriate when there is a need to
associate new attribute values with old fact
history - E.g., business need to track both old and new
values of department attribute both forward and
backward - Management can use either value for analysis
- Allows for observing new and historical fact data
by either the new or prior attribute values - Used less frequently
- Inappropriate to track numerous intermediate
attribute values
18Type 3 response
19Type 6 Response
20Type 6 Response (123)
21Hybrid Slowly Changing Dimension Techniques
- Two approaches that combine the basic Slowly
Changing Dimension techniques - Predictable changes with multiple version
overlays - Unpredictable changes with single- version
overlay - These approaches provide more flexibility at the
cost of greater complexity
22Predictable Changes with Multiple Version Overlays
- Used in cases of sales organization realignments
- Example Over a 5-year period the sales
organization is reorganized five times. - At first sight, candidate for Type 2 approach
(add dimension row), but more complex business
requirements. E.g., - Report each years sales using the district map
for that year - Report each years sales using the district map
from an arbitrary different year - Report an arbitrary span of years sales using a
single district map from a chosen year. - Type 3 is also inappropriate because gt2 district
maps - Because changes are predictable, an extension of
Type 3 is possible ? Multiple District columns - Current District District 2001 District 2002
23Unpredictable Changes with Single-Version Overlay
- Preserve historical accuracy surrounding
unpredictable attribute changes while supporting
the ability to report historical data according
to the current values - Issue a new dimension row (type 2) to capture the
change and add a new dimension column to track
the historical value (type 3). Also, overwrite
Current Department value (Type 1).
24More Rapidly Changing Dimensions
- Break off the rapidly changing attributes into
one or more separate dimensions - Two foreign keys in fact table
- 1. Primary dimension table
- 2. Rapidly changing attribute('s)
25(No Transcript)