Title: Data Warehousing
1Data Warehousing
Virtual University of Pakistan
- Lecture-21
- Introduction to Data Quality Management (DQM)
Ahsan Abdullah
Assoc. Prof. & Head, Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computer & Emerging Sciences, Islamabad
Email: ahsan101_at_yahoo.com
2Introduction to Data Quality Management (DQM)
3What is Quality? Informally
- Some things are better than others, i.e. they are of higher quality. How much better is better?
- Is the right item the best item to purchase? How about after the purchase?
- What is quality of service? The bank example.
4What is Quality? Formally
"Quality is conformance to requirements" (P. Crosby, "Quality is Free", 1979)
"Degree of excellence" (Webster's Third New International Dictionary)
5What is Quality? Examples from Auto Industry
Quality means meeting customers' needs, not necessarily exceeding them.
Quality means improving things customers care about, because that makes their lives easier and more comfortable.
Why an example from the auto industry?
6What is Data Quality?
What is Data?
Muhammad Khan
Emp_ID: 440
Height: 5'8"
Weight: 160 lbs
Gender: Male
Age: 35 yrs
All data is an abstraction of something real
7What is Data Quality?
Intrinsic Data Quality: Electronic reproduction of reality.
Realistic Data Quality: Degree of utility or value of data to business.
8Data Quality Organizations
Intelligent Learning Organization: High-quality data is an open, shared resource with value-adding processes.
Dysfunctional Learning Organization: Low-quality data is a proprietary resource with cost-adding processes.
9Orr's Laws of Data Quality
Law 1 - Data that is not used cannot be correct!
Law 2 - Data quality is a function of its use, not its collection!
Law 3 - Data will be no better than its most stringent use!
Law 4 - Data quality problems increase with the age of the system!
Law 5 - The less likely something is to occur, the more traumatic it will be when it happens!
10Total Quality Management (TQM)
- Philosophy of involving all for systematic and continuous improvement.
- It is customer oriented. Why?
- TQM incorporates the concepts of product quality, process control, quality assurance, and quality improvement.
- Quality assurance is NOT quality improvement.
11Cost of fixing data quality
[Figure: cost of achieving quality plotted against quality level, from lowest quality to highest quality]
- Defect minimization is economical.
- Defect elimination is very, very expensive.
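The contrast between minimizing and eliminating defects can be illustrated with a toy cost curve. The sketch below is only an assumption for illustration: the lecture gives no formula, and the hypothetical `relative_cost` function simply supposes that cost is inversely proportional to the remaining defect rate.

```python
def relative_cost(quality: float) -> float:
    """Illustrative cost of reaching a quality level in the range [0, 1).

    Assumed model (not from the lecture): cost is inversely proportional
    to the fraction of defects that remain, i.e. 1 / (1 - quality).
    """
    if not 0 <= quality < 1:
        # In this model, zero defects (quality == 1) has unbounded cost.
        raise ValueError("quality must be in the range [0, 1)")
    return 1.0 / (1.0 - quality)


# Each extra "nine" of quality costs roughly ten times more in this model.
for q in (0.90, 0.99, 0.999):
    print(f"quality {q:.1%}: relative cost about {relative_cost(q):,.0f}")
```

Under this assumed curve, moving from low to moderate quality is cheap, while the last fraction of a percent dominates the total cost, which is the shape the two bullets describe.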
12Cost of Data Quality Defects
- Controllable Costs
  - Recurring costs for analyzing, correcting, and preventing data errors.
- Resultant Costs
  - Internal and external failure costs of business opportunities missed.
- Equipment & Training Costs
13Where is data quality critical?
- Almost everywhere. Some examples:
  - Marketing communications.
  - Customer matching.
  - Retail house-holding.
  - Combining MIS systems after acquisition.
14Characteristics or Dimensions of Data Quality
Data Quality Characteristic: Definition
Accuracy: Qualitatively assessing lack of error, high accuracy corresponding to small error.
Completeness: The degree to which values are present in the attributes that require them.
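The two definitions above can be turned into simple ratios. This is a minimal sketch under stated assumptions: the records, the `reference` lookup of trusted values, and the function names are hypothetical, and exact-match comparison against a reference is just one possible quantitative proxy for the qualitative notion of accuracy.

```python
# Hypothetical employee records; None marks a missing value.
records = [
    {"emp_id": 440, "gender": "Male", "age": 35},
    {"emp_id": 441, "gender": None, "age": 29},
    {"emp_id": 442, "gender": "Female", "age": 41},
    {"emp_id": 443, "gender": "Female", "age": None},
]

# Hypothetical trusted ages for the same employees, used to judge accuracy.
reference = {440: 35, 441: 29, 442: 40, 443: 52}


def completeness(rows, attribute):
    """Fraction of rows in which the required attribute has a value."""
    if not rows:
        raise ValueError("no rows to measure")
    present = sum(1 for row in rows if row.get(attribute) is not None)
    return present / len(rows)


def accuracy(rows, attribute, truth):
    """Fraction of non-missing values that match the trusted reference."""
    checked = [row for row in rows if row.get(attribute) is not None]
    if not checked:
        raise ValueError("no values to check")
    correct = sum(1 for row in checked if row[attribute] == truth[row["emp_id"]])
    return correct / len(checked)


print("age completeness:", completeness(records, "age"))
print("age accuracy:", accuracy(records, "age", reference))
```

Here one age is missing (which lowers completeness) and one recorded age disagrees with the reference (which lowers accuracy), so the two measures move independently.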
15Completeness Vs Accuracy
95% accurate and 100% complete, OR 100% accurate and 95% complete: which is better?
It depends on (i) the data quality tolerances, (ii) the corresponding application, and (iii) the cost of achieving that data quality vs. (iv) the business value.
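The dependence on tolerances and application can be sketched as a simple threshold check. Everything named here is an assumption for illustration: the `QualityProfile` type, the `meets_tolerances` helper, and the two application profiles with their thresholds are hypothetical, and the cost and business-value factors are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class QualityProfile:
    accuracy: float      # fraction of values that are correct
    completeness: float  # fraction of required values that are present


def meets_tolerances(measured: QualityProfile, required: QualityProfile) -> bool:
    """True if the measured data is at least as good as the application requires."""
    return (measured.accuracy >= required.accuracy
            and measured.completeness >= required.completeness)


# The two options posed in the question above.
option_a = QualityProfile(accuracy=0.95, completeness=1.00)
option_b = QualityProfile(accuracy=1.00, completeness=0.95)

# Hypothetical application tolerances (not from the lecture).
applications = {
    "mailing_campaign": QualityProfile(accuracy=0.90, completeness=0.99),
    "billing": QualityProfile(accuracy=0.99, completeness=0.90),
}

for name, required in applications.items():
    print(name,
          "option A ok:", meets_tolerances(option_a, required),
          "option B ok:", meets_tolerances(option_b, required))
```

With these assumed thresholds, the completeness-sensitive application accepts only the first option and the accuracy-sensitive one accepts only the second, so neither option is better in general.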
16Characteristics or Dimensions of Data Quality
Data Quality Characteristic: Definition
Consistency: A measure of the degree to which a set of data satisfies a set of constraints.
Timeliness: A measure of how current or up to date the data is.
Uniqueness: The state of being only one of its kind or being without an equal or parallel.
Interpretability: The extent to which data is in appropriate languages, symbols, and units, and the definitions are clear.
Accessibility: The extent to which data is available, or easily and quickly retrievable.
Objectivity: The extent to which data is unbiased, unprejudiced, and impartial.
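Three of these dimensions (consistency, timeliness, and uniqueness) lend themselves to mechanical checks. The following is a rough sketch, assuming hypothetical customer rows, a single made-up age constraint, and a 30-day freshness window; none of these specifics come from the lecture.

```python
from datetime import date

# Hypothetical customer rows; cust_id 2 appears twice and one age is invalid.
rows = [
    {"cust_id": 1, "age": 35, "updated": date(2024, 1, 10)},
    {"cust_id": 2, "age": -4, "updated": date(2023, 6, 1)},
    {"cust_id": 2, "age": 51, "updated": date(2024, 1, 12)},
]

# Hypothetical constraint set: each constraint is a predicate over one row.
constraints = [lambda row: 0 <= row["age"] <= 120]


def consistency(rows, constraints):
    """Fraction of rows that satisfy every constraint."""
    ok = sum(1 for row in rows if all(check(row) for check in constraints))
    return ok / len(rows)


def timeliness(rows, as_of, max_age_days):
    """Fraction of rows updated within max_age_days of the as_of date."""
    fresh = sum(1 for row in rows if (as_of - row["updated"]).days <= max_age_days)
    return fresh / len(rows)


def duplicate_keys(rows, key):
    """Key values that occur more than once, i.e. uniqueness violations."""
    seen, dupes = set(), set()
    for row in rows:
        if row[key] in seen:
            dupes.add(row[key])
        seen.add(row[key])
    return dupes


print("consistency:", consistency(rows, constraints))
print("timeliness:", timeliness(rows, as_of=date(2024, 1, 31), max_age_days=30))
print("duplicate cust_id values:", duplicate_keys(rows, "cust_id"))
```

Interpretability, accessibility, and objectivity are harder to score automatically and usually need human judgment, so they are not covered by this sketch.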