Title: Information Management Framework Data Quality
1Information Management Framework Data Quality
2What is quality
- Quality is a dynamic concept that continuously
changes in response to changing customer
requirements
- Defined in 3 ways
- Conformance to specifications (DQA)
- Fitness for use (Surveys)
3Quality issues
- Problems can result from
- Human error
- Machine error
- Process error
4Purpose
5Conformance to specifications Quality Plan
Data Quality Assessments
6Data Quality Assessments
7Information lifecycle phases
(Diagram labels: Collection, Storage, Access Use, Archive/Disposal; Data Collection, Data Store, Data Access, Historic data)
8Recording quality - ANZLIC
9Business rules
- Each business rule should have an expected
outcome (benchmark)
- Business rules need to align with the ANZLIC quality
elements
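A business rule with a benchmark can be pictured as a small record that pairs a check with its expected pass rate. The sketch below is illustrative only: the `BusinessRule` class, the `site_id` field, the "completeness" element label and the 95% benchmark are all assumptions made for the example, not part of the framework described in these slides.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class BusinessRule:
    """A hypothetical business rule: a check plus its expected outcome."""
    name: str
    quality_element: str           # quality element the rule aligns to (label assumed)
    check: Callable[[dict], bool]  # returns True when a record passes the rule
    benchmark: float               # expected pass rate between 0.0 and 1.0 (assumed form)

    def assess(self, records: Iterable[dict]) -> dict:
        """Compare the observed pass rate with the benchmark."""
        records = list(records)
        passed = sum(1 for r in records if self.check(r))
        rate = passed / len(records) if records else 0.0
        return {
            "rule": self.name,
            "element": self.quality_element,
            "pass_rate": rate,
            "meets_benchmark": rate >= self.benchmark,
        }


# Illustrative rule: every record must carry a non-empty "site_id" (hypothetical field).
rule = BusinessRule(
    name="site_id present",
    quality_element="completeness",
    check=lambda r: bool(r.get("site_id")),
    benchmark=0.95,
)
result = rule.assess(
    [{"site_id": "A1"}, {"site_id": ""}, {"site_id": "B7"}, {"site_id": "C3"}]
)
print(result)
```

Keeping the benchmark next to the check means an assessment can report not only how many records failed but whether that failure rate was expected.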
10Findings - DQ Processes
- The processes and guidelines are good!
- The Data Management Plan is important
- Needs to be completed for all data sets prior to
assessment
- Benchmarks for quality established with Data
Managers before DQA
11Soil Profile
- Very large and varied data set (millions of soil
properties)
- Where data exists, it is mostly good
- Many missing values
- Data Transformation Errors
- Data on forms different to values in database
- Missing values set to default values in load
program.
12Data Analysis Soil Properties
- Examples of problems
- Location Accuracy - Invalid grid references for a
grid zone
- Mandatory Fields missing data
- Nature of Exposure - 1269 records missing value
- Logical Inconsistencies
- If Horizon Code begins with 'B' and ACS Order
is 'SO' (Sodosol), then pH > 5.5 - 238 records in
error.
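The mandatory-field and logical-consistency examples above could be expressed as record-level checks along the following lines. This is a minimal sketch, not the assessment's actual implementation: the field names (`nature_of_exposure`, `horizon_code`, `acs_order`, `ph`) are hypothetical, and the rule is read as "pH must be greater than 5.5".

```python
from typing import List


def check_soil_record(rec: dict) -> List[str]:
    """Return a list of rule violations for one soil profile record."""
    errors = []

    # Mandatory field: Nature of Exposure must be present.
    if not rec.get("nature_of_exposure"):
        errors.append("Nature of Exposure missing")

    # Logical consistency: a 'B' horizon in a Sodosol ('SO') should have pH > 5.5.
    horizon = rec.get("horizon_code") or ""
    ph = rec.get("ph")
    if horizon.startswith("B") and rec.get("acs_order") == "SO":
        # A missing pH is a completeness problem, so the logical rule is skipped here.
        if ph is not None and not ph > 5.5:
            errors.append(f"Sodosol B horizon with pH {ph} (expected > 5.5)")

    return errors


sample = [
    {"nature_of_exposure": "Pit", "horizon_code": "B2", "acs_order": "SO", "ph": 4.9},
    {"nature_of_exposure": "", "horizon_code": "A1", "acs_order": "SO", "ph": 5.0},
    {"nature_of_exposure": "Core", "horizon_code": "B21", "acs_order": "SO", "ph": 7.2},
]
for rec in sample:
    print(check_soil_record(rec))
```

Counting the records that return a non-empty list for each rule gives figures of the kind quoted on the slide.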
13Data Analysis Ground Water
- Minimal spatial data (point locations only)
- Data where present is mostly good
- Many missing values
14Examples of problems
- Invalid Key fields
- Work Number of non standard format
- Location Accuracy
- Invalid grid references for a grid zone
- Logical Inconsistencies
- Jobs completed before they started
- Hole depth of 36km
- Mandatory Fields missing data
- Work Type Code - 1503 records missing value.
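Checks of the kind listed above might be sketched as follows. Everything specific here is an assumption for illustration: the slides do not give the standard Work Number format, so the `GW` plus six digits pattern is invented, as are the field names and the 3000 m plausibility limit for hole depth.

```python
import re
from datetime import date
from typing import List

# Hypothetical "standard" work number format; the real format is not given in the slides.
WORK_NUMBER_PATTERN = re.compile(r"GW\d{6}")
# Hypothetical plausibility limit for hole depth, in metres.
MAX_PLAUSIBLE_DEPTH_M = 3000.0


def check_groundwater_record(rec: dict) -> List[str]:
    """Return a list of rule violations for one ground water work record."""
    errors = []

    # Invalid key field: work number not in the standard format.
    if not WORK_NUMBER_PATTERN.fullmatch(rec.get("work_number") or ""):
        errors.append("Work Number of non standard format")

    # Logical inconsistency: job completed before it started.
    start, end = rec.get("start_date"), rec.get("completion_date")
    if start is not None and end is not None and end < start:
        errors.append("Job completed before it started")

    # Logical inconsistency: implausible hole depth (for example 36 km).
    depth = rec.get("hole_depth_m")
    if depth is not None and depth > MAX_PLAUSIBLE_DEPTH_M:
        errors.append(f"Implausible hole depth: {depth} m")

    # Mandatory field: Work Type Code must be present.
    if not rec.get("work_type_code"):
        errors.append("Work Type Code missing")

    return errors


bad = {
    "work_number": "12-ABC",
    "start_date": date(1998, 5, 10),
    "completion_date": date(1998, 5, 1),
    "hole_depth_m": 36000.0,
    "work_type_code": None,
}
good = {
    "work_number": "GW012345",
    "start_date": date(1998, 5, 1),
    "completion_date": date(1998, 5, 10),
    "hole_depth_m": 42.5,
    "work_type_code": "BH",
}
print(check_groundwater_record(bad))
print(check_groundwater_record(good))
```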
15Data Analysis Ground Water
- Database Issues
- No Load or creation date in database (only update
date)
- Impossible to apply date based business rules
- GW licenses mandatory from 2001 onwards.
- Logical Inconsistencies
- License Form A received and no GDS record
(1000s)
- Needs investigation
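The dependence of date-based rules on a creation date can be made explicit by giving the rule a third outcome for records that cannot be assessed. The sketch below is an assumption-laden illustration: the `Outcome` enum, the function name and the 1 January 2001 cut-off day are hypothetical (the slides say only "from 2001 onwards").

```python
from datetime import date
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_ASSESSABLE = "not assessable"


# Assumed cut-off day; the source states only that licenses are mandatory from 2001.
LICENSE_MANDATORY_FROM = date(2001, 1, 1)


def check_license_rule(created: Optional[date], has_license: bool) -> Outcome:
    """Apply the 'license mandatory from 2001' rule to one record."""
    if created is None:
        # Without a load or creation date the rule cannot be applied at all.
        return Outcome.NOT_ASSESSABLE
    if created < LICENSE_MANDATORY_FROM:
        # Records created before the cut-off are outside the rule's scope.
        return Outcome.PASS
    return Outcome.PASS if has_license else Outcome.FAIL


print(check_license_rule(None, False))
print(check_license_rule(date(1999, 3, 1), False))
print(check_license_rule(date(2003, 7, 15), False))
```

Reporting the "not assessable" count separately keeps a missing creation date from being mistaken for either a pass or a failure.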
16Data Analysis
- Action Lists
- Generated for each data set
- Scope of Remedies
- Improving data quality goes beyond
identifying, measuring and fixing the data in the
IT systems.
- Improve data capture
- Train entry staff
- Replace entry processes
- Provide meaningful feedback
- Change motivations to encourage quality
- Add defensive checkers, periodic DQ assessments,
data cleansing
17Data Quality Reporting
- Data Quality Portal
- General DQ information
- Statistical Reporting and Monitoring
- Data Quality Exception Reporting
- Management of Data Quality issues
18Fitness for use - User needs covered later in day
19Improving quality
20Ways of improving quality
- Tackle quality at source, not downstream in the
lifecycle
- Train data collectors in the importance of getting
it right
- Continual improvement with quality method
21Links among Process Groups in a Phase
Initiating process
Planning process
Executing process (do)
Controlling process (check)
Closing process
(Arrows represent flow of information)
(PMBOK 2000 Fig 3-1 p31)