Title: The Data Warehouse Toolkit
1The Data Warehouse Toolkit
- Chapter 6 CRM
- ????? ???(Evan Lin)
- MIS Dept., Yuan Ze Univ., Taiwan
- http//evan.scuec.net
- s917725_at_mail.yzu.edu.tw
2What is CRM
- Customer relationship management was the
buzzword,design and develop customer-centric
dimensional models to better understand their
customers behavior. - CRM has emerged as mission-critical business
strategy that many as essential to company
survival.
3Discuss following Concept
- CRM overview, including its operational and
analytic roles. - Customer name and address parsing,along with
international consideration - Common customer dimension attribute, such as
dates segmentation attribute,and aggregation
facts - Dimension outrigger for large cluster of
low-cardinality attribute.
4Discuss following Concept..
- Minidimensions for attribute browsing and change
tracking in large dimensions, as well as
variable-width attribute sets. - Implications of using type 2 slowing change
dimension technique on dimension counts - Behavior study groups to track a set of customers
that exhibit common characteristics or behaviors - Commercial customer hierarchy consideration,
including both fixed and variable depth.. - Combine customer to multiple data sources
- Analyzing customer across multiple business
processes
5CRM Overview
- The goal CRM is maximum relationships with your
customers over their lifetime. - CRM promises significant return for organization
that embrace it in term it in term of both
increased revenue and greater operational
efficiencies - Switch to a customer-driven perspective can lead
to increased sales effectiveness and closure
rates, revenue growth, enhanced sales
productivity at reduced cost,improved customer
profitability margins, higher customer
satisfaction , and increased customer retention.
6CRM Overview..
- View customer variable depend on
- Product line,business unit, business function ,or
geographic location. - In many cases.the existing business processes for
customer interactions have evolved over time as
operational or organization work-around. - The resulting patchwork set of customer-related
processes is often clumsy at best. - Since it is human nature to resist change, it
comes as no surprise that people-related issues
often challenge
7Operational and Analytic CRM
- It could be said that CRM suffers from a split
personality syndrome because it address both
Operational and Analytic requirement. - Effective CRM relies on the collection of data at
every interaction we have with a customer and
then the leveraging if that breadth of data
through analysis.
8Operational CRM
- CRM call for synchronization of customer-facing
processes. - Data is created on operational side of the CRM
equation, we obviously need to store and analyze
the historical metrics resulting from our
customer interaction and transaction systems.
9Analytic CRM
- Analytic CRM is enabled via accurate, integrated,
and accessible customer data in warehouse. - We are able to measure the effectiveness of
decisions made in the past in order to optimize
future interactions. - Customer data can be leveraged to better identify
up-sell and cross-sell opportunities, pinpoint
inefficiencies, generate demand, and improve
retention.
10- The model output can translate into specific
proactive or reactive tactics recommended for the
next point of customer contact, such as the
appropriate next product offer or anti-attrition
response. - The model results also are retained in the data
warehouse for subsequent analysis.
11- Information must feed back to the operational Web
site or call center systems on a more real-time
basis. Like ODS(Operational Data Store) - The ODS generally does not require the breadth or
depth of customer information available in the DW
12Packaged CRM
- In response to urgent need of business for CRM,
project teams may be wrestling with BUY versus
BUILD decision. - In the long term BUILD is better,but it take
longer require more resources, potentially high
cost. - Buying a packaged application will deliver a
practically ready-to-go solution,but it may not
focus on the integration and interface issues
needed for it to function in the large IT
content.
13Packaged CRM ..
- Buying packaged, regardless of its application
breadth, does not give us an excuse to dodge the
challenge of creating conformed dimension,
including customer dimension. - Any organization needs is another data stovepipe,
so be certain to integrate any packaged solution
properly.
14Customer Dimension
- The conformed customer dimension is a critical
element for effective CRM. - A well-maintained, well-deployed conforming
customer dimension is the cornerstone of sound
customer-centric analysis.
15Customer Dimension ..
- Customer dimension is typically the most change
dimension in data warehouse. - In large organization customer dimension can be
- Extremely deep(with million rows)
- Extremely wide(with dozens or hundreds of
attributes) - Subject to rather rapid change
- To further complicate matters, the customer
dimension often represents an amalgamation(??) of
data from multiple internal and external source
system.
16Name and Address Parsing
- Regardless of whether we are dealing with
individual human being or commercial entities, we
typically capture - Many designer use a liberal design for names and
address.. Likely Name-1, Name-2.. - Unfortunately, these catchall columns are
virtually worthless when it comes to better
understanding and segmenting the customer base.
17Name and Address Parsing..
- About NAME column is too far to limit..
- Hard to handling salutation(??),title ,suffix
- Can not identify what person first name is or how
she should be address in a personalized greeting.
- We would find multiple customers listed in single
name field. - We also additional descriptive information in the
name field..(UGMA?Trustee)
18Address Problem
- May contain enough room for any address, but
there is no discipline(?????) imposed by the
columns that will guarantee conformance with
postal regulation(????), address matching or
latitude/longitude identification. - Attribute has not standardized..
- RD ? Road
- Ste ? State..
- Commercial customer typically have multiple
address, such as physical and shipping
addresseach of these would follow much the same
logic as the address structure we just develop..
19International Name and Address consideration
- Customer geographic attributes become more
complicated if we are dealing with customers from
multiple countries. - In addition to name and address parsing we
discussed earlier - Universal representation
- Should be consistent from country to country.
- Cultural correctness
- Appropriate salutation(???????)in
e-mail,telephone, letter - Differences in addresses
- Idiosyncrasies(??) such as presenting the
destination city.
20Other Common Attribute
- Date
- We often find date in customer dimension, such as
date of first purchase,date of last purchase..
21Other Common Attribute..
- Aggregated facts as attribute
- User often are interested in constraining the
customer dimension based on aggregated
performance metrics.. - Filter on all customers who spent over a certain
dollar amount during last year, - While there are query usability and performance
advance to storing these attribute but Staging
processes to ensure that attributes accurate(??),
up-to-date , and consistent with actual fact
row.
22Dimension Outriggers for a Low-Cardinality
Attribute Set
- Snowflake the low-cardinality columns in the
dimension have been removed to separate
normalized tables that then link back into
original dimension table. - Not recommend more complex and low performance.
- Sometime build a dimension outriggers that has
appearance of a snowflake table.
23- Several factor to bend our non-snowflake rule
- Demographic data(??) is available at
significantly (????) difference grain than
primary dimension.. - The data is administered and loaded at difference
times than the rest(???) of the data in the
customer dimension.. - Dimension outriggers are permissible(????),but
they should be the exception rather than the
rule. A red warning flag should go up if your
design is riddled(???) with outriggers you may
have succumbed(??) to the temptation(???) to
overly(???) normalize design.
24Large Changing Customer Dimension
- In Chapter 4 the Type 2 cant do with large
changing dimension.. - Huge customer dimensions are even more likely to
change than moderately sized dimensions. We call
it rapidly changing monster dimension - To rescue(??) to address both browsing-performance
and change-tracking challenge is to break off
frequently analyzed or frequently changing
attribute into a separate dimension, referred to
as a minidimension.
25Minidimension..
- We include two foreign keys related to customer
the regular customer dimension key and
minidimension demographic key.. - This design delivers browsing and constrain
performance benefits by providing a smaller point
of entry to the fact.. - Queries can avoid the huge customer dimension
table altogether(??)unless attribute from that
table are constrained - When the demographic key participates as foreign
key in the fact table, another benefit is that
fact table serves to capture the demographic
profile changes.. - Example in table 6.3 in page 155..(like Type 2
response in SCDs)
26- Customer dimension are unique in that customer
attributes frequently are queries independently
of the fact table.. - Exhow many female customers live in Dade Country
by age bracket(?????).. - Use demographic data to link through the fact
table,the most recently value of demographic key
also can exist as foreign key on customer
dimension table. We refer to demographic table as
a customer dimension outrigger - The minidimension terminology refers to when the
demographics key is part of the fact table
composite key if the demographic key is a
foreign key in customer dimension, we refer to it
as an outrigger..
27- Type 1 if you embed most recent demographic key
in customer dimension.. - Type 2 if you want to track all change, but
avoid to rapidly changing monster dimension
problem.. - Use demographic data,through either the
minidimension or outrigger, can deliver more
functionality and complexity than some users can
handle..
28- The demographic dimension itself cannot be
allowed to grow too large. - We would build a second demographics dimension
- Minidimension growth while also clustering like
attributes together for a more intuitive user
presentation - They are potentially sourced from two different
data providers with difference update
frequencies..
29- The best approach for efficiently browsing and
tracking changes of key attributes in really huge
dimension is to break off one or more
minidimension from the dimension table, each
consisting of small clumps(?) of attributes that
have been administered to have a limited number
of values.
30VariableWidth Attribute Set
- Minidimension can be create to handle a variable
number of customer attributes.. - When using external prospect(??) lists, we often
are permitted only a one-time use of the list and
dont have the legal right to store the prospect
information internally. - We could store the prospect and customers
together in a single contact dimension,however,
there is a significant(???) disparity(???)
between the numbers of attributes for prospective
versus customer contacts.
31- We may know only a handful of identification and
location attributes about our prospects. - Database platform such as Oracle support
variable-width attribute set..
32Implications of Type 2 Customer Dimension Changes
- Type 2 remain the predominant technique for
tracking change in customer dimension with less
than 100,000 rows - Evan we have large customer dimension we need
still use type 2 response to handle very slowly
changing attribute left behind in the customer
dimension .. - Must avoid over-counting use COUNT DISTINCT
- Most recent row indicator in the customer
dimension is also helpful to do count based on
the most up-to-date descriptive values for
customers
33Customer Behavior Study Groups
- With customer analysis, simple queries,such as
how much have we sold to customers in this
geographic area in the past year,rapidly evolve
to more complex inquiries. - The secret to building complex behavioral study
group queries is to capture the keys of customers
or products whose behavior you tracking. You then
use the captured keys to constrain other fact
tables without having to return the original
behavior analysis..
34Commercial Customer Hierarchies
- One of most challenging aspects of dealing with
commercial customers is modeling their internal
organizational hierarchies. - Nested
- May change frequently
- We will talk about two approaches to handling
customer hierarchies - Fixed-Depth Hierarchies
- Variable-Depth Hierarchies
35Fixed-Depth Hierarchies
- It is uncommonly that customer dimension is
highly predictable with fixed number of levels - Easy to compute , easy to sum to the sum..
36Variable-Depth Hierarchies
- Representing an arbitrary (???),ragged (???)
organization hierarchies difficult task in a
relational environment
37Variable-Depth Hierarchies..
- In computer science approach to handling this
unpredictable hierarchies using recursive parent
customer key pointer on each customer dimension
row.. - GROUP BY can not follow the recursive tree point
downward an summarized and additive fact in a
companion fact table such as revenue in an
organization. - Oracles CONNECT BY can navigate a recursive
point, but cannot be used in the same SQL
statement as a join..
38- Instead of using a recursive pointer we insert a
bridge table between the customer dimension and
fact tables.. - The bridge table has been called a helper or
associative table in the past. - The number of rows in the bridge table typically
is several times large than the number of rows in
customer dimension..
39Combining Multiple Source Of Customer Data
- Conform customer dimension is a distillation of
data from several operational system and possibly
outside sources. - Unique customer has multiple identifiers in
multiple operational touch-point system. - There is no secret weapon for tracking this data
consolidation.
40Analysis Customer Data from Multiple Business
Processes
- Be very careful when simultaneously joining a
single dimension table to two fact tables of
different cardinality in many cases,relational
systems will return the wrong answer. A similar
problem arises when joining two fact tables of
different granularity together directly. - If use are frequently combining data from
multiple business processes,then an additional
fact table can be constructed that combines the
data once into a second-level,consolidated fact
table rather than relying on users to combine the
data consistently and accurately on their own..
41Summary
- Overview of CRM basics
- Design issue..
- Name ,address parsing..
- Dimension outrigger
- Low-cardinality attribute
- Customer dimension has million of rows
- Minidimension with variable-width attributes
- Customer behavioral study group dimension
- Customer hierarchies
- Fixed and variable depth hierarchies
- Potential downfalls of querying across the fact
tables .