Data Quality and Ensuring Usability of routinely collected PC data



1
Data Quality and Ensuring Usability of
routinely collected PC data
  • Presented to
  • Integrating Clinical and Genetic Datasets:
    Nirvana or Pandora's Box?
  • Presented by
  • Simon de Lusignan
  • slusigna_at_sgul.ac.uk

2
About me
  • GP in Guildford
  • 11,500 patient practice
  • 6.5 whole-time-equivalent GPs
  • Computerised since 1988
  • Senior Lecturer, St George's
  • Primary Care Informatics (PCI) research group
  • Using routinely collected data for quality
    improvement research
  • Electronic libraries
  • Computer in the consultation
  • Telemonitoring
  • Chair PCI WG of EFMI
  • Developing a BSc in BMI

3
Overview
  • Introduction
  • Benefits from linking clinical and genetic data
  • Growing volumes of accessible primary care data
    increasingly used for quality improvement and
    research
  • Objective
  • Is it possible to define the features of a
    routinely collected dataset which can be
    integrated with genetic data?
  • Method
  • Literature review and 10 years of experiential
    learning working with data
  • Features of quality data
  • What is data quality?
  • Unique identifiers and denominators
  • What needs to be defined about data processing
    and storage
  • Discussion

4
Introduction
  • Given: benefits from linking clinical and
    genetic data
  • Routinely collected clinical data is used
    increasingly for
  • Quality improvement
  • Clinical Audit
  • Health Service Planning
  • Research

References
1. de Lusignan S, van Weel C. The use of routinely
collected computer data for research in primary
care: opportunities and challenges. Fam Pract.
2006 Apr;23(2):253-63.
2. de Lusignan S, Hague N, van Vlymen J,
Kumarapeli P. Routinely collected general practice
data are complex, but with systematic processing
can be used for quality improvement and research.
Accepted for publication, Informatics in Primary
Care.
5
Objective
  • To define the features of clinical data which
    make them fit for integration with genetic data

6
Features of quality data
  • Defining Data Quality
  • Unique identifiers
  • Defined process of data extraction and storage

7
Defining data quality
  • Evolving definitions
  • Completeness and accuracy (Pringle et al., BJGP
    1995)
  • Currency (Williams, Methods 2003)
  • Sensitivity and positive predictive value (Thiru
    et al., BMJ 2003)
  • Data Quality Probe (Brown and Warmington, IPC
    2003)
  • Fit for purpose (PCI WG EFMI, 2005)
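
The sensitivity and positive predictive value measures listed above can be sketched as follows. This is a minimal illustration, assuming a reference standard exists for each patient; the function name and the patient IDs are hypothetical and are not taken from the cited papers.

```python
def sensitivity_and_ppv(recorded, reference):
    """Compare recorded diagnoses with a reference standard.

    recorded:  set of patient IDs that carry the diagnosis code
    reference: set of patient IDs that truly have the condition
    Returns (sensitivity, ppv); each is None if its denominator is empty.
    """
    true_positives = len(recorded & reference)
    sensitivity = true_positives / len(reference) if reference else None
    ppv = true_positives / len(recorded) if recorded else None
    return sensitivity, ppv


# Hypothetical example data, for illustration only.
coded = {"p1", "p2", "p3", "p4"}
truth = {"p2", "p3", "p4", "p5", "p6"}
sens, ppv = sensitivity_and_ppv(coded, truth)
print(f"sensitivity={sens:.2f}, ppv={ppv:.2f}")
```

Sensitivity reflects completeness of recording (how many true cases are coded), while PPV reflects accuracy (how many coded cases are true).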

8
Unique IDs
  • Linkage of data
  • Interoperability of systems
  • Follow-up / traceability of individuals
  • Population denominator and ghosts
  • England and Wales - NHS number
  • Scotland - CHI number
  • Our system
  • MIQUEST unique ID for one practice
  • compound with study number
  • unique ID for practice
  • Convert to non-case sensitive ASCII format
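
One way to read the compound-ID steps above is sketched below. The slide does not say how the conversion to a non-case-sensitive format is done; hex-encoding the ASCII bytes is an assumption made here because it keeps IDs that differ only in letter case distinct once case is ignored. The function names and the "|" separator are hypothetical.

```python
def make_study_id(study_number: str, practice_id: str, patient_id: str) -> str:
    """Build one study-wide ID from a study number, a practice ID and a
    within-practice patient ID, encoded so that it is safe to compare
    in tools that ignore letter case."""
    parts = (study_number, practice_id, patient_id)
    for part in parts:
        if not part or "|" in part:
            raise ValueError(f"invalid ID component: {part!r}")
    raw = "|".join(parts)
    # encode("ascii") raises UnicodeEncodeError for non-ASCII input.
    return raw.encode("ascii").hex().upper()


def decode_study_id(study_id: str):
    """Recover the original (study, practice, patient) components."""
    return tuple(bytes.fromhex(study_id).decode("ascii").split("|"))


# Hypothetical IDs that differ only in case stay distinct after encoding.
a = make_study_id("S1", "P7", "aB3")
b = make_study_id("S1", "P7", "Ab3")
print(a != b, decode_study_id(a))
```

Simply upper-casing the IDs would be shorter, but it could merge two patients whose IDs differ only in case.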

9
Processing data
  • Appreciation of data entry issues and the
    contemporary perspective of system users
  • Defined stages of data processing, applications
    used at each stage, and quality controls
  • Archive coding systems and the look-up tables
    used to infer meaning or rubrics
  • The queries used to extract the data
  • A metadata system to ensure traceability of each
    cell of data
  • The ethical constraints that apply to the
    dataset.

10
(1) Data entry issues and contemporary
perspective of users
  • COPD and Bronchitis codes are easily confused
  • Recoding half of the practice's asthmatics from
    a diagnosis code to a "history of" code

Ref: Faulconer ER, de Lusignan S. An eight-step
method for assessing diagnostic data quality:
COPD as an exemplar. Inform Prim Care.
2004;12(4):243-54.
11
(2) Defined stages of data processing
  • We have defined eight discrete steps in data
    processing
  • (1) Design of queries, piloting,
  • (2) Data entry, (already dealt with)
  • (3) Extraction,
  • (4) Migration, unique IDs essential
  • (5) Integration,
  • (6) Cleaning,
  • (7) Processing, and
  • (8) Analysis

Ref: van Vlymen J, de Lusignan S, Hague N, Chan
T, Dzregah B. Ensuring the Quality of Aggregated
General Practice Data: Lessons from the
Primary Care Data Quality Programme (PCDQ). Stud
Health Technol Inform. 2005;116:1010-5.
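
The eight steps above could be tracked with a simple processing log, sketched here. The stage names come from the slide; the `ProcessingLog` class, the application names and the row counts are hypothetical, and using row counts as a quality control is an assumption rather than the method described in the cited paper.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The eight processing steps named on the slide."""
    QUERY_DESIGN = 1
    DATA_ENTRY = 2
    EXTRACTION = 3
    MIGRATION = 4
    INTEGRATION = 5
    CLEANING = 6
    PROCESSING = 7
    ANALYSIS = 8


class ProcessingLog:
    """Records which application handled each stage and how many rows it output."""

    def __init__(self):
        self.entries = []

    def record(self, stage, application, rows_out):
        # Stages must be logged in order so the trail can be audited later.
        if self.entries and stage <= self.entries[-1][0]:
            raise ValueError(f"{stage.name} logged out of order")
        self.entries.append((stage, application, rows_out))

    def row_changes(self):
        """Return (stage name, change in row count) for each stage after the first."""
        return [
            (later[0].name, later[2] - earlier[2])
            for earlier, later in zip(self.entries, self.entries[1:])
        ]


log = ProcessingLog()
log.record(Stage.EXTRACTION, "extraction tool", 1000)  # hypothetical values
log.record(Stage.MIGRATION, "import script", 1000)
log.record(Stage.CLEANING, "cleaning script", 990)
print(log.row_changes())
```

An unexpected change in row count between two stages is a prompt to inspect that stage before moving on.
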
12
(3) Archive coding systems.
  • Coding systems are constantly evolving
  • In general, coding systems are becoming larger
    and more complex
  • You can go from many to few but not from few to
    many
  • We archive the clinical codes and the look-up
    engine used
  • e.g. NHS Triset Browser
  • Each relevant version
  • e.g. 4-Byte and 5-Byte Read Codes, Drug
    Dictionary, proprietary codes
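
The "many to few but not few to many" point can be illustrated with a small mapping table. The codes below are hypothetical placeholders, not real Read codes, and the function names are invented for this sketch.

```python
from collections import defaultdict

# Hypothetical codes for illustration only; these are NOT real Read codes.
DETAILED_TO_BROAD = {
    "D001a": "B01",
    "D001b": "B01",
    "D002a": "B02",
}


def to_broad(detailed_code):
    """Many-to-few: every detailed code has exactly one broad code."""
    return DETAILED_TO_BROAD[detailed_code]


def detailed_candidates(broad_code):
    """Few-to-many: a broad code can only be mapped back to a list of candidates."""
    reverse = defaultdict(list)
    for detailed, broad in DETAILED_TO_BROAD.items():
        reverse[broad].append(detailed)
    return sorted(reverse[broad_code])


print(to_broad("D001a"))
print(detailed_candidates("B01"))
```

Because the reverse lookup returns more than one candidate, the original detail cannot be recovered once data are stored only in the broader scheme, which is one reason to archive each version of the coding system alongside the data.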

13
Example of look-up engine
14
(4) The query library
  • Re-issued by date
  • Query set for each clinical programme
  • e.g. C1, C2, C3 Cardiac programme
  • Query set for each extraction type
  • e.g. E4, E5, G4, G5 (E for EMIS, G for Generic)
  • Defined look-up tables and rubrics for queries

15
The query library
16
The C2 queries
17
The C2 EMIS 5-Byte set
18
(5) Metadata system
  • Follows data from query set to analysis
  • Preserves original data
  • Derived variables clearly identified
  • Associated dates and numerics labelled
  • Rules for units used
  • Look-up table used to define variable names

Ref: van Vlymen J, de Lusignan S. A system of
metadata to control the process of query,
aggregating, cleaning and analysing large datasets
of primary care data. Inform Prim Care.
2005;13(4):281-91.
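
The metadata principles above (original data preserved, derived variables clearly identified, units recorded) can be sketched with a small record type. The `Cell` structure, its field names, the query label and the unit conversion are hypothetical and are not the scheme described in the cited paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Cell:
    """One value plus the metadata needed to trace where it came from."""
    variable: str
    value: float
    source_query: str                      # query set that produced the value
    units: Optional[str] = None
    derived: bool = False                  # True for anything computed later
    derived_from: Tuple["Cell", ...] = ()  # original cells, kept unchanged


def cm_to_m(cell: Cell) -> Cell:
    """Return a new derived cell in metres; the original cell is not modified."""
    if cell.units != "cm":
        raise ValueError(f"expected units 'cm', got {cell.units!r}")
    return Cell(
        variable=cell.variable + "_m",
        value=cell.value / 100,
        source_query=cell.source_query,
        units="m",
        derived=True,
        derived_from=(cell,),
    )


# Hypothetical value and query label, for illustration only.
height = Cell("height", 180.0, "C2-E5", units="cm")
height_m = cm_to_m(height)
print(height_m.value, height_m.derived, height_m.derived_from[0].value)
```

Making the record immutable means a cleaning step cannot overwrite an extracted value; it can only add a new derived cell that points back to it.
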
19
Source data metadata structure
20
Linking elements: Query library, Query, Core
Clinical Concept, Read code
21
Core clinical concept (CCC)
22
Automation
23
(6) Ethics
  • The ethical constraints on any dataset are
    indexed in the query library

24
Summary
25
Summary
  • Data quality is best defined in terms of
    fitness for purpose: what purpose, and when?
  • Transparent methods of data processing allow
    audit of results
  • Understanding data entry issues / context is
    essential
  • Metadata can help control processing
  • Careful curation of data may allow its use
    beyond the timescale of the original study

26
  • Thanks for listening
  • Simon de Lusignan
  • Tel 020 8725 5661
  • Fax 020 8767 7697
  • Email slusigna_at_sgul.ac.uk
  • Web www.gpinformatics.org
  • www.sgul.ac.uk/informatics/