Data Quality: Prerequisite for Data Sharing

Slides: 20
Provided by: bonnie71
Learn more at: https://dama-ncr.org
Transcript and Presenter's Notes

1
Data Quality: Prerequisite for Data Sharing
  • Bonnie K. O'Neil
  • Sr. Principal Data Architect
  • PPC

  • Yazmin Rowe
  • Data Architect
  • Technology in Motion, Inc.
2
Agenda
  • Case Study Background
  • Data Quality Framework
  • Proof of Concept
  • Incremental Approach
  • Take-aways

3
Case Study
  • Federal Bureau
  • Legal/Catching Bad Guys
  • Data, data everywhere
  • President's directives to share data
  • Especially around terrorism
  • Data quality an issue (where is it NOT an
    issue?!)
  • Initiative: Implement Data Quality Program to
    enable Data Sharing

4
State of Affairs
  • No enterprise data model
  • No enterprise data dictionary
  • They DO have Data Management in place
  • DM Reviews
  • DM Handbook
  • Data Sharing Challenges
  • No gold copy or source of record for data
    elements
  • Different systems, supposedly same data element,
    different values
  • How many systems is this data element in?
  • Impact analysis
  • Data sharing simply not possible!
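
The "same data element, different values" problem can be made concrete with a small comparison across systems. The following is a minimal sketch, not the Bureau's method: the system names, record keys, and the birth_date element are hypothetical, and it assumes records in different systems can already be matched on a shared key.

```python
from collections import defaultdict


def find_conflicts(systems, element):
    """Return records whose value for `element` differs between systems.

    systems: {system_name: {record_key: {element_name: value}}}
    Result:  {record_key: {system_name: value}} for conflicting records only.
    """
    seen = defaultdict(dict)  # record_key -> {system_name: value}
    for system_name, records in systems.items():
        for record_key, record in records.items():
            if element in record:
                seen[record_key][system_name] = record[element]
    return {
        key: by_system
        for key, by_system in seen.items()
        if len(set(by_system.values())) > 1
    }


def systems_with_element(systems, element):
    """Answer 'how many systems is this data element in?' by listing them."""
    return sorted(
        name
        for name, records in systems.items()
        if any(element in record for record in records.values())
    )


# Hypothetical sample data: two systems holding the same people.
systems = {
    "system_a": {
        "P1": {"birth_date": "1975-03-02"},
        "P2": {"birth_date": "1980-11-30"},
    },
    "system_b": {
        "P1": {"birth_date": "1975-02-03"},
        "P2": {"birth_date": "1980-11-30"},
    },
}

conflicts = find_conflicts(systems, "birth_date")
holders = systems_with_element(systems, "birth_date")
```

Without a gold copy, a report like this can only show that the systems disagree; it cannot say which value is right.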

5
Framework Fundamentals: Dictionary is the Heart
of Data Quality
  • How can you tell what bad data is if you don't
    know what it is supposed to be in the first
    place?
  • Dictionary tells you this
  • Good definitions spell out the expectation for
    the data
  • You know it is a good definition when you are
    able to tell if the data conforms to expectations
    or not
  • Should be able to compare results of profiling
    with the definition for the field
  • Does the data conform to the definition?
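
One way to picture "does the data conform to the definition?" is to write the dictionary's expectations as rules and test values against them. This is a minimal sketch under stated assumptions: the field names, the rule keys (required, pattern, allowed), and the sample rules are all hypothetical and are not taken from the case study's dictionary.

```python
import re

# Hypothetical dictionary entries; field names and rules are illustrative only.
DATA_DICTIONARY = {
    "birth_date": {"required": True, "pattern": r"\d{4}-\d{2}-\d{2}"},
    "sex_code": {"required": False, "allowed": {"M", "F", "U"}},
}


def violations(field, values):
    """Compare values with the dictionary rule for `field`.

    Returns a list of (row_index, value, reason) for every value that
    does not meet the expectation spelled out in the definition.
    """
    rule = DATA_DICTIONARY[field]
    problems = []
    for index, value in enumerate(values):
        if value is None or value == "":
            if rule.get("required"):
                problems.append((index, value, "missing"))
            continue
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            problems.append((index, value, "bad format"))
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append((index, value, "not an allowed value"))
    return problems


date_problems = violations("birth_date", ["1975-03-02", "03/02/1975", None])
code_problems = violations("sex_code", ["M", "X", ""])
```

If a definition cannot be turned into a check of this kind, that is a hint it does not yet spell out the expectation for the data.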

6
Framework Fundamentals: Top Down and Bottom Up
Data Management
  • Profiling to see what's really in the fields
  • Examine data element definitions (where they
    exist) to determine what the business thinks the
    fields contain
  • Enabled business person (Data Steward) to be key
    player

7
Study in 2007
  • First Six Months 2007
  • Enterprise-Wide Data Quality Standard/Procedures
  • Standards are part of the infrastructure
    necessary to share data externally
  • Enterprise Data Quality Framework
  • Strategic and tactical approach
  • Successful Proof-of-Concept Project
  • An enterprise data model
  • A data dictionary
  • In 2007
  • Software Selection for Data Quality Framework
  • Kick-off First Business Unit Data Quality Project

8
The Proof of Concept
  • All kinds of politics with getting new projects
    approved
  • Especially with Production Data
  • We performed a Proof of Concept (POC)
  • Isolated environment testing, using PCs
  • Agreed to get rid of production data after we
    have profiled it
  • Language is Critical
  • Can't call it Data Profiling
  • Instead, called it Data Demographic Analysis

9
Proof of Concept, Cont'd
  • Found a Sponsor
  • Good friends with the DM Team Lead
  • Recent DQ issue
  • Has motivation to look into this
  • Formed an interdisciplinary project team (IPT)
  • Involved many people from different areas of the
    business

10
Shoestring Principle
  • Bonnie's Law
  • Use Whatever is Lying Around
  • You will be surprised at what you find when you
    look for whatever is lying around
  • Already purchased software
  • Software/hardware scrapped from a failed project
  • Under-utilized systems

11
Using Bonnie's Law
  • Repository products were too expensive
  • Had Oracle Warehouse Builder (OWB) lying around
  • New OWB has a data profiling option
  • Good News: Saved us from having to buy a separate
    profiling tool
  • Bad News: OWB was an option (meaning money)
  • Still cheaper than having to buy separate
    profiling tool
  • HAD TO HAVE Profiling!!
  • Using it NOT for ETL!

12
Benefits
  • Statistics on their data
  • Profiling: Min/Max, NULL, Distinct,
    format/pattern, etc.
  • Cannot manage what you cannot measure!
  • Immediately pinpoint data quality issues
  • Traceability to data concepts (EDM)
  • Show multiple occurrences of same type of data
  • Setting the infrastructure in place for a super
    query
  • Provided a straw man data quality methodology
    (Framework)
  • In draft
  • Solicited comments from everybody
  • Helps get buy-in BIG TIME instead of shoving it
    down their throats
  • Users felt included instead of alienated
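
The profiling statistics listed above (min/max, NULL, distinct, format/pattern) can be illustrated in a few lines. This is a minimal sketch and not how Oracle Warehouse Builder computes them: the function names and sample values are hypothetical, and empty strings are treated as NULL by assumption.

```python
from collections import Counter


def pattern_of(value):
    """Reduce a value to its format: digits become 9, letters become A."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep punctuation and spaces as-is
    return "".join(out)


def profile_column(values):
    """Return basic profiling statistics for one column of string values."""
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "patterns": Counter(pattern_of(v) for v in non_null),
    }


# Hypothetical sample column: a date field with mixed formats and gaps.
sample = ["1975-03-02", "1980-11-30", None, "03/02/1975", "1975-03-02", ""]
stats = profile_column(sample)
```

Here the pattern counts surface the mixed date formats immediately, which is the kind of issue profiling is meant to pinpoint before anyone tries to share the field.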

13
Scoping
  • Divide & conquer
  • Pick a subject area
  • Less complex semantics
  • PEOPLE
  • Limit systems for the POC
  • Three, but ended up with two
  • Not overly complex but not simple either
  • Kept refining the scope

14
Incremental: Growing the Project
  • If the user likes it, this project can graduate
    to a real project
  • More complex subject areas
  • More systems
  • This is actually what happened!
  • Sell to other groups in the bureau
  • EDM will grow incrementally
  • Successfully established a Data Quality Program
    at the Bureau

15
Take-Aways
  • You must ALWAYS do data profiling!! Essential!!
    For anything. Period.
  • Try to use what you have instead of buying
    something new & expensive
  • You'd be surprised what you find lying around
  • Involve the users and other groups within the
    business
  • Especially in creating a methodology
  • Lets them feel a part of the creation
  • Language is very important to people
  • Sometimes I have seen the term Data Warehouse
    disliked
  • Get your project funded by tagging it along with
    a business goal
  • Funding the EDM by way of data quality
  • Find business hot button and propose to solve it

16
Future Plans
  • Data Governance
  • Data Inventory
  • Master Data Management (MDM)
  • Formal integration of data quality measurements
    into SLC
  • Linking the EDM to application data
  • Suck in the data from the source systems
  • Suck in the EDM from data modeling tool
  • Map the two
  • Virtual mapping
  • No data movement taking place
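
A virtual mapping of this kind can be thought of as a lookup from each EDM concept to the application columns that hold it, with the data itself staying in the source systems. The sketch below is only an illustration: the concept names, system names, tables, and columns are hypothetical, and a real repository would load them from the data modeling tool and the source system catalogs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    """A pointer to a column in a source system; no data is copied."""
    system: str
    table: str
    column: str


# Hypothetical mapping from EDM concepts to application columns.
EDM_MAPPING = {
    "Person.BirthDate": [
        ColumnRef("CaseSystem", "SUBJECT", "DOB"),
        ColumnRef("BookingSystem", "PERSON", "BIRTH_DT"),
    ],
    "Person.LastName": [
        ColumnRef("CaseSystem", "SUBJECT", "LNAME"),
    ],
}


def systems_holding(concept):
    """List the systems that store a given EDM concept (impact analysis)."""
    return sorted({ref.system for ref in EDM_MAPPING.get(concept, [])})


def concepts_in(system):
    """List the EDM concepts that a given system stores."""
    return sorted(
        concept
        for concept, refs in EDM_MAPPING.items()
        if any(ref.system == system for ref in refs)
    )


birth_date_systems = systems_holding("Person.BirthDate")
booking_concepts = concepts_in("BookingSystem")
```

Because only references are stored, the mapping can answer where a concept lives without moving any data.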

17
Conclusion
  • In order to achieve data sharing, you must clean
    up the data first
  • You can get data quality projects funded if you:
  • Start small
  • Solve an important business problem
  • Establish a framework
  • Get a sponsor who sees the business value in what
    you are providing
  • Be politically savvy about word usage: don't use
    their charged words
  • Get business people involved and participating
  • Limit expenditures at first, until you have
    proven the business benefit
  • Do a POC to test drive your approach (Try it,
    you'll like it)
  • Isolate it from production applications

18
Thanks!
  • Bonnie O'Neil
  • Project Performance Corporation
  • 24771 Westridge Rd.
  • Golden, CO 80403
  • Office 303-642-3534
  • Cell 303-725-1737
  • boneil_at_ppc.com
  • PPC is based in the Washington, DC area
  • and performs both Government and Commercial work
  • IT Consulting, Project Management

  • Yazmin Rowe
  • Technology in Motion, Inc.
  • Office 703-278-0792
  • Cell 301-915-4471
  • Yazmin.Rowe_at_techinmotioninc.com

19
Reference
  • Newly released book
  • Business Metadata
  • Authors
  • Bill Inmon
  • Bonnie O'Neil
  • Lowell Fryman
  • Making metadata useful to the business
  • Does metadata need to be translated into
    business speak?
  • Where does business metadata live?
  • What do you have to set in place to implement it?
  • How do you do a Vulcan Mind Meld to get it out of
    people's heads?