Quality Data: An Improbable Dream?

Author: rrichter. Last modified by: ckeller. Created: 2/7/2001 2:29:40 AM. Presentation format: On-screen Show.

Slides: 50
Provided by: rri83
Learn more at: https://www.educause.edu
Transcript and Presenter's Notes


1
Quality Data: An Improbable Dream?
  • Elizabeth Vannan
  • Centre for Education Information
  • Victoria, BC, Canada

2
  • Information quality is a journey, not a
    destination
  • - Larry P. English

3
Agenda
  • Data Definitions and Standards Project
  • What is Quality Data?
  • The Cost of Poor-Quality Data
  • Improving Data Quality: Our Process
  • Questions?

4
BC Higher Education
  • Canada's westernmost province
  • Population: 4.023 million
  • Land area: 366,795 sq miles
  • Publicly Funded Post-Secondary System
  • 22 Colleges
  • 6 Universities

5
CEISS
  • The Centre for Education Information is an
    independent organization that provides research
    and technology services to improve the
    performance of the BC education system

6
CEISS
  • Implement and manage administrative systems
  • Perform custom surveys, research and analysis
  • Facilitate development and implementation of data
    standards
  • Negotiate and manage province-wide software
    contracts (Oracle, SCT Banner, Datatel)

7
DDEF Project
  • The Problem
  • Better data about the BC higher education sector
    needed for decision-making
  • No infrastructure in place to facilitate the
    collection of data electronically

Data Definitions and Standards Project Initiated
in 1995
8
DDEF Project
  • The Solution
  • Create data standards for all higher education
    information (Student, HR, Finance)
  • Develop a data warehouse based on standards for
    reporting
  • Implement a common technical infrastructure at
    all higher education institutions

9
DDEF Project
  • Project Goals
  • Improve the quantity and QUALITY of data
    available
  • Reduce the number of data and reporting requests
  • Develop business information system to support
    the management and evaluation of the BC
    Post-Secondary system

10
How Are We Doing?
  • 16 institutions implemented/implementing
  • Institutions using data warehouses for internal
    reporting
  • Data requests reduced
  • Ministry using data

11
Why Focus on Data Quality?
  • Poor data quality in our data warehouse impacts:
  • Confidence
  • Decision making
  • Funding

12
Quality Data Are
  • The Four Attributes of Data Quality

13
Quality Data Are
  • Accurate
  • Free from errors
  • Representative

14
Quality Data Are
  • Complete
  • All values are present

15
Quality Data Are
  • Timely
  • Recorded immediately
  • Available when required

16
Quality Data Are
  • Flexible
  • Data definitions understood
  • Can be used for multiple purposes

17
Quality Data
  • Don't have to be perfect
  • Good enough to fill the business need at a price
    you're willing to pay

Our Challenge: Defining Quality Criteria
for Higher Education Data
18
Cost of Poor-Quality Data
  • Business Process Costs

  • Incorrect Registrations
  • Inaccurate Tuition Billings
  • Payroll Errors
19
Cost of Poor-Quality Data
  • Rework

  • Re-collect Data
  • Correct Errors
  • Data Verification
20
Cost of Poor-Quality Data
  • Missed Opportunities

  • Substandard Customer Service
  • Poor Decision Making
  • Loss of Reputation
21
Improving Data Quality
Improved Data Quality
Business Process Review
22
Business Process Review
  • When, where, how is data collected?
  • Where is data stored?
  • Who creates data?
  • Who uses data?
  • What outputs are required?
  • What quality checks already exist?

23
Business Process Review
  • Involve all stakeholders!
  • For student data we involve:
  • Executive
  • Registrar's office
  • IT Department
  • Institutional Research

24
Business Process Review
  • Results
  • Understanding of business practices
  • Identification of data creators, custodians,
    users
  • Preliminary quality metrics
  • Problem business practices

25
Data Quality Assessment
  • Establish Metrics
  • Apply metrics to data
  • Review results

26
Establish Metrics
  • For each element determine quality criteria
  • Acceptable range of values
  • Acceptable syntax
  • Comparison to known values
  • Business rules
  • Thresholds
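
As a minimal sketch of how per-element criteria like these could be written down, the Python below checks one value against a syntax rule and a range rule. The element name `birth_year` and its limits are hypothetical illustrations, not values from the presentation.

```python
import re

# Hypothetical criteria for a single element. The element name and the
# limits are illustrative assumptions, not taken from the presentation.
CRITERIA = {
    "birth_year": {
        "syntax": re.compile(r"[0-9]{4}"),  # acceptable syntax: four digits
        "range": (1900, 2001),              # acceptable range of values
    },
}


def check_value(element, raw):
    """Return the names of the criteria that `raw` fails for `element`."""
    rules = CRITERIA[element]
    if not rules["syntax"].fullmatch(raw):
        # A malformed value cannot be range-checked, so stop here.
        return ["syntax"]
    low, high = rules["range"]
    if not low <= int(raw) <= high:
        return ["range"]
    return []
```

Keeping the criteria in a table separate from the checking function means new elements can be added without changing the logic.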

27
Quality Metrics
28
Applying Metrics
  • Collect known information for comparison
  • Develop queries to test each of your validation
    criteria
  • We use Oracle Discoverer, but other tools exist
    (MS Access, SQL)

29
Applying Metrics
Test 1: PEN must be 9 digits long. No non-numeric
characters, no shorter values acceptable
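
The presentation ran this test as a query in Oracle Discoverer. As a rough equivalent, here is a minimal Python sketch that assumes each student record is a dictionary with a `pen` field (a hypothetical layout) and returns the records that fail the rule.

```python
import re

# Exactly nine ASCII digits, nothing else.
PEN_PATTERN = re.compile(r"[0-9]{9}")


def is_valid_pen(pen):
    """True only for a string of exactly nine digits."""
    return isinstance(pen, str) and PEN_PATTERN.fullmatch(pen) is not None


def invalid_pen_records(records):
    """Return the records whose 'pen' field fails the nine-digit rule.

    The record layout (a dict with a 'pen' key) is an assumption made
    for this example.
    """
    return [r for r in records if not is_valid_pen(r.get("pen"))]
```

Returning the failing records, rather than only a count, lets specific students be identified for data cleansing.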
30
Test 1 Results
Two Student Records Contain Invalid PEN Numbers
31
Test 1 Results
Invalid PENs: Data Entry Error?
Can Identify specific students for data cleansing
32
Applying Metrics
Test 2: At least 80% of student records must have
a valid PEN number
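
A threshold test of this kind reduces to computing the share of valid values and comparing it with the target. The Python sketch below assumes the PENs arrive as a simple list and reuses the nine-digit rule from Test 1. The function names are hypothetical.

```python
def valid_pen_share(pens):
    """Fraction of entries that are exactly nine ASCII digits."""
    if not pens:
        return 0.0  # an empty extract cannot meet any threshold
    valid = sum(
        1
        for p in pens
        if isinstance(p, str) and len(p) == 9 and p.isascii() and p.isdigit()
    )
    return valid / len(pens)


def meets_threshold(pens, threshold=0.80):
    """True when the share of valid PENs is at or above the threshold."""
    return valid_pen_share(pens) >= threshold
```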
33
Test 2 Results
This Institution Meets the Quality Threshold
34
Applying Metrics
Test 3: No Duplicate PENs
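
A duplicate check amounts to counting how often each PEN occurs and reporting those that appear more than once. This is a minimal Python sketch, assuming the PENs are available as a list in which missing values are empty or `None`.

```python
from collections import Counter


def duplicate_pens(pens):
    """Map each PEN that occurs more than once to its occurrence count.

    Missing values (None or empty strings) are ignored, since they are a
    completeness problem rather than a duplication problem.
    """
    counts = Counter(p for p in pens if p)
    return {pen: n for pen, n in counts.items() if n > 1}
```

The per-PEN counts give a starting point for the detailed follow-up once the test shows a problem.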
35
Test 3 Results
This institution has a BIG problem! Can we see
more details?
36
Test 3 Results
Additional information reveals data loading problems
37
Reviewing Results
  • Systematic approach needed
  • Develop strategy for data cleaning
  • Identify source of data problems

Deal with Disparate Data Shock!
38
Reviewing Results
  • Insert a quality review checklist

39
Reviewing Results
40
Data Cleansing
  • Location
  • Administrative System?
  • Staging Area?
  • Who
  • Scope

41
Typical Data Cleansing
  • Correcting data entry errors
  • Removing or correcting nonsensical dates
  • Deleting garbage records
  • Combining or deleting duplicates
  • Updating and applying code sets
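
As one illustration of these tasks, the Python sketch below screens out nonsensical dates. It assumes dates are stored as ISO `YYYY-MM-DD` strings and that a plausible window can be agreed for the element. The window shown is a hypothetical example, not a rule from the presentation.

```python
from datetime import date


def clean_date(raw, earliest=date(1900, 1, 1), latest=date(2001, 12, 31)):
    """Return a date if `raw` parses and falls in the window, else None.

    The default window is an illustrative assumption. In practice it
    would come from the quality criteria set for the element.
    """
    try:
        parsed = date.fromisoformat(raw)
    except (TypeError, ValueError):
        return None  # not a string, or not a real calendar date
    if not earliest <= parsed <= latest:
        return None  # a real date, but implausible for this element
    return parsed
```

Values that come back as `None` can then be routed to whoever is responsible for correcting or removing them.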

42
Business Practice Change
  • Two components
  • Implementing changes to improve data quality
  • Adopting ongoing data quality review process

Changing Business Practices is a Challenge: Get
Stakeholder Support
43
Business Practice Change
  • Education
  • Centralizing responsibility for codes
  • Consolidating data collection
  • Implementing validation routines
  • Change business processes

44
Quality Review Process
  • Review data regularly
  • Make someone responsible
  • Establish procedures for correcting data problems
  • Communicate quality improvements

45
Some Changes in BC
  • Creation of Data Manager position, responsible
    for code sets, data quality
  • Regular education for registration clerks and
    other data creators
  • Established relationships between data creators
    and users
  • Re-engineered administrative systems

46
Improvements to BC Data
  • Improved data quality and quantity
  • Nonsensical dates almost eliminated
  • Completeness of key elements improved (from 50%
    to 80-90%)
  • Data now being collected for CE in standard format

47
Final Thoughts
  • Quality Data are Probable if you are willing to:
  • Take a critical look at your existing data
  • Implement changes to how you collect and manage
    data
  • Invest the time to educate and communicate with
    data users and creators
  • Make data quality improvement an on-going process

48
Recommended Reading
  • Brackett, Michael H., Data Resource Quality:
    Turning Bad Habits into Good Practices (New
    York: Addison-Wesley, 2000)
  • English, Larry P., Improving Data Warehouse and
    Business Information Quality (New York: John
    Wiley and Sons, 1999)
  • Redman, Thomas C., Data Quality for the
    Information Age (Boston: Artech House, Inc., 1996)

49
Thank You!
Presentation available at www.ceiss.org or from
evannan_at_ceiss.org