Title: Data Hygiene
"Information is a source of learning. But unless it is organized, processed, and available to the right people in a format for decision making, it is a burden, not a benefit."
C. William Pollard, The Soul of the Firm
INTRODUCTION
- Customer Relationships
WHY DIRTY DATA
- Anomaly Nightmare
CLEANING STEPS
- Parsing
- Correcting
- Standardizing
- USPS Services
- Matching
- Consolidating
CONCLUSION
Talking to your customers
- Customer Communication
  - Face-to-Face
  - Telephone
  - Direct Mail
  - E-Mail
- Variety of Purposes
  - Sales
  - Marketing
  - Billing
  - Customer Service
Why Dirty Data?
- Multiple data sources
  - Compiled data from marketing, accounting, customer service, online, etc.
  - Survey Data
  - Mailing Lists
  - Transactional Data
  - Registration Data
  - Complaints
Why Dirty Data?
- Lack of Standard Business Rules
  - Multiple formats
  - Multiple names within one field
  - One name in two fields
  - Name and address in same field
  - Different addresses for the same customer
  - Different spellings (or misspellings) for the same customer
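Two of the anomalies listed above can be flagged with simple pattern checks. The following is a minimal sketch, not a production rule set: the function name flag_name_field and both patterns are assumptions made for illustration, and real data would need many more rules.

```python
import re


def flag_name_field(value):
    """Return a list of likely problems found in a single name field."""
    issues = []
    # "John & Mary Smith" or "John and Mary Smith": two people in one field.
    if re.search(r"\s(and|&)\s", value, flags=re.IGNORECASE):
        issues.append("multiple names within one field")
    # Digits in a name field usually mean an address was typed into it.
    if re.search(r"\d", value):
        issues.append("name and address in same field")
    return issues


print(flag_name_field("John & Mary Smith"))
print(flag_name_field("Bill St. John 101 S. Main"))
print(flag_name_field("William St. John"))
```

Flagged records can then be routed to parsing or manual review instead of being mailed as they are.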
Anomaly Nightmare
Cleaning Steps
Parsing
Correcting
Standardizing
USPS Services
- Address Element Correction (AEC)
  - Corrects and standardizes address elements
    - Misspellings
    - Directionals (e.g., NW, South)
    - Suffixes (e.g., Street, Avenue, Road)
    - Nonstandard abbreviations
    - Missing information (e.g., apt or suite)
  - Provides the reason why an address is incorrect
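The kinds of corrections listed above can be illustrated with a small sketch. This is not the USPS AEC service: the lookup tables and the function name standardize_street are assumptions made for illustration, and a real service relies on far larger postal reference data.

```python
# Tiny lookup tables standing in for full postal reference data (illustrative only).
DIRECTIONALS = {"n": "N", "north": "N", "s": "S", "south": "S",
                "e": "E", "east": "E", "w": "W", "west": "W",
                "nw": "NW", "northwest": "NW", "ne": "NE", "northeast": "NE",
                "sw": "SW", "southwest": "SW", "se": "SE", "southeast": "SE"}
SUFFIXES = {"st": "ST", "street": "ST", "ave": "AVE", "avenue": "AVE",
            "rd": "RD", "road": "RD"}
MISSPELLINGS = {"strete": "street", "avenu": "avenue"}


def standardize_street(line):
    """Return (standardized_line, reasons) for one street line."""
    tokens, reasons = [], []
    for raw in line.replace(".", " ").replace(",", " ").split():
        key = raw.lower()
        if key in MISSPELLINGS:
            # Record why the element was considered incorrect.
            reasons.append(f"misspelling: {raw}")
            key = MISSPELLINGS[key]
        if key in DIRECTIONALS:
            tokens.append(DIRECTIONALS[key])
        elif key in SUFFIXES:
            tokens.append(SUFFIXES[key])
        else:
            tokens.append(key.upper())
    return " ".join(tokens), reasons


print(standardize_street("101 S. Main Strete"))
```

The token-by-token lookup is deliberately naive: it cannot tell a street named "North" from a directional, which is one reason real correction services work from complete address reference files.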
USPS Services
- Delivery Sequence File (DSF)
  - Address Validation
  - Eliminates undeliverable addresses
  - ZIP+4 Coding
  - Carrier Route
  - Delivery Sequence
  - Enhances Mail Delivery and List Quality
  - Business vs. Residential Indicator
  - Location Occupancy and forwarding addresses
USPS Services
- Locatable Address Conversion System (LACS)
  - Assists 911 Emergency Services
  - Replaces Rural Routes and/or Box Numbers with Physical Locations
  - New Land Development Addresses
USPS Services
- National Change of Address (NCOA)
  - 40 million Americans change addresses annually (17% of individuals, 22% of businesses)
  - 3-year rolling file of business, individual, and household moves, updated weekly
  - Reduces undeliverable and duplicate mail pieces
Parsing, Correcting, Standardizing
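[Figure: the input record below is parsed into name, street, and geography fields, then corrected and standardized.]
Input: Mr. Bill St. John III 101 S. Main Strete Sant. Louis, MO 63181
NAME LINE: TITLE, FIRST, CONC., LAST, GENER.
STREET LINE: HSNO, ST-NM, ST-TYPE, ST-DIR
GEOG. LINE: CITY, STATE, POST
Values surviving from the figure: William, St., St., 63118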
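The name portion of this example can be parsed with a short sketch. It uses the slide's field labels TITLE, FIRST, LAST, and GENER.; the lookup tables, the function name parse_name_line, and the rule that everything after the first name belongs to the last name are assumptions made for illustration.

```python
# Small illustrative tables; a real parser would use much larger ones.
TITLES = {"mr", "mrs", "ms", "dr"}
GENERATIONS = {"jr", "sr", "ii", "iii", "iv"}
NICKNAMES = {"bill": "William", "bob": "Robert"}


def parse_name_line(name_line):
    """Split a name line into fields and standardize a known nickname."""
    tokens = name_line.split()
    parsed = {"TITLE": "", "FIRST": "", "LAST": "", "GENER.": ""}
    # A leading token such as "Mr." is treated as the title.
    if tokens and tokens[0].rstrip(".").lower() in TITLES:
        parsed["TITLE"] = tokens.pop(0)
    # A trailing token such as "III" or "Jr." is treated as the generation.
    if tokens and tokens[-1].rstrip(".").lower() in GENERATIONS:
        parsed["GENER."] = tokens.pop()
    if tokens:
        first = tokens.pop(0)
        parsed["FIRST"] = NICKNAMES.get(first.lower(), first)
    # Assumption: whatever remains is the last name (middle names are not handled).
    parsed["LAST"] = " ".join(tokens)
    return parsed


print(parse_name_line("Mr. Bill St. John III"))
```

Peeling the title and generation off the ends first is what keeps "St. John" together as a last name instead of being mistaken for a street type.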
Matching
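Matching decides whether two cleaned records probably describe the same customer. The sketch below is one simple way to do it, assuming a plain string-similarity score from Python's difflib; the function names and the 0.85 threshold are illustrative assumptions, not the method used by any particular product.

```python
from difflib import SequenceMatcher


def normalize(record):
    """Lowercase the record, drop periods and commas, and collapse spaces."""
    return " ".join(record.lower().replace(".", " ").replace(",", " ").split())


def is_probable_match(a, b, threshold=0.85):
    """Return True when two records are similar enough to review as duplicates."""
    score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return score >= threshold


print(is_probable_match("William St. John, 101 S Main St",
                        "Wiliam St John, 101 S. Main St"))
print(is_probable_match("William St. John, 101 S Main St",
                        "Mary Jones, 45 Oak Ave"))
```

A lower threshold catches more spelling variants but also produces more false matches, so borderline pairs are usually sent to manual review.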
Consolidating
Consolidating
Why is data hygiene important?
- Reduces Costs
  - More automation, less manual review
  - Maximizes postal savings
  - Eliminates duplication costs (paper, printing, postage, storage, data management)
- Enhances Customer Relationships
  - Consolidated Customer View
  - Facilitates Data Mining and CRM
  - Increased data accuracy
  - Communications reach your customers
  - Improves Response Rates
The Reality: ONE Customer
Account No. 83451234
Policy No. ME309451-2
Transaction B498/97
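Consolidation ties identifiers like these to a single customer. The sketch below groups records by a crude match key; the names and postal codes attached to each identifier are hypothetical (the slide shows only the identifiers), and match_key and consolidate are illustrative names.

```python
from collections import defaultdict


def match_key(record):
    """Build a crude match key from a cleaned name and postal code."""
    name = " ".join(record["name"].upper().replace(".", " ").split())
    return (name, record["post"])


def consolidate(records):
    """Group identifiers from different systems under one customer key."""
    customers = defaultdict(list)
    for record in records:
        customers[match_key(record)].append((record["source"], record["id"]))
    return dict(customers)


# Hypothetical records: only the identifiers come from the slide.
records = [
    {"source": "Account No.", "id": "83451234", "name": "William St. John", "post": "63118"},
    {"source": "Policy No.", "id": "ME309451-2", "name": "WILLIAM ST JOHN", "post": "63118"},
    {"source": "Transaction", "id": "B498/97", "name": "William St John", "post": "63118"},
]
view = consolidate(records)
print(view)
```

Grouping on an exact key only works once records have been parsed, corrected, and standardized; otherwise the same customer ends up under several keys.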
Software Companies
- First Logic
- Group One
- Ascential
- Trillium Software