Title: EPSRC
1EPSRC Pathways to Impact Grant. Training and
education for the developers of databases in
research and clinical practice.
- Rick Jones, Yorkshire Centre for Health
Informatics, University of Leeds
2Context
- The UK can significantly enhance its clinical
research capability by using, strictly within the
bounds of patient confidentiality, the electronic
patient data that the UK's National Programmes
for IT in the NHS have the potential to allow.
This will have enormous benefits for all types of
clinical, public health and health services
research and for many aspects of patient care.
3Context: Simulations
- 4.1 Surveillance (Pharmacovigilance)
- The vision for an ideal surveillance system is of
a nationwide active system for tracking
patients' responses to medical interventions
(POMs, immunisations and OTC drugs as well as
devices) and of disease and other incidents
requiring reporting.
- 4.2 Clinical Trials
- Within the range of activities involved in
running a successful clinical trial in the
future, there will be a need to access and
process data from electronic records at a number
of stages before, during and after the trial.
- 4.3 Prospective tracking of a known cohort
- This simulation team concluded that, in order for
UK Biobank (which is a resource for prospective
studies) to be able to provide benefits, access to
data will have to be at patient level
(identifiable), both coded and textual.
- 4.4 Observational epidemiology
- The construction of retrospective observational
epidemiological studies based on routine data
sources requires access to data from a very wide
range of electronic records, both within and
without the health services.
4Context: The NHS Research Capability Programme
- Six work streams
- Technical Architecture
- Functional Scope
- Data Quality, Standards and Linkage
- Information Governance and Threat Assessment
- Infrastructure
- Communications and Stakeholder Engagement
5Harvesting research outcomes from clinical
databases - demonstrating the potential of TPP
SystmOne.
- Richard Gillott, Cardiovascular Database
Developer, LTHT
- Rick Jones, Yorkshire Centre for Health
Informatics, University of Leeds
6Aims & objectives
- Based on the work of the RCP, could a practical
trial be carried out as proof of concept to:
- Prove the feasibility of extracting identifiable
patient data from GP systems for use in research
- Construct an architecture to enable the rapid,
repeatable, and secure querying and collection of data
7Pilot trial
- A small trial was planned with the aid of 2
research groups based in Leeds
- The pilot aimed to demonstrate the value of the
information contained in the patient record, and
to establish whether the data were sufficient in
their coverage of the population and their completeness
8Methods
9Results: Yields of Records
- 81 Cardiovascular
- 66 Oncology
10For linked records
- We need
- Granular consent: all or part of record
- A model to allow selective searching by
- Clinical relevance
- Administrative components
- Therapeutics
- Diagnostics
- An understanding of when and how frequently to
conduct searches
- A business model to reimburse GP system
suppliers / data guardians for their search time
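One way to picture granular consent combined with selective searching is the minimal Python sketch below. It is an illustration only, not the pilot's implementation: the category names are borrowed from the list above, and `Consent`, `Entry`, and `search` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical category names, taken from the search dimensions listed above.
CATEGORIES = {"clinical", "administrative", "therapeutics", "diagnostics"}

@dataclass
class Consent:
    patient_id: str
    # Categories the patient agreed to share; the full set means "all of record".
    allowed: set = field(default_factory=set)

@dataclass
class Entry:
    patient_id: str
    category: str
    text: str

def search(entries, consents, wanted):
    """Return entries in the wanted categories that the patient consented to share."""
    unknown = set(wanted) - CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    allowed_by_patient = {c.patient_id: c.allowed for c in consents}
    return [
        e for e in entries
        if e.category in wanted
        # Patients with no recorded consent contribute nothing.
        and e.category in allowed_by_patient.get(e.patient_id, set())
    ]

# Illustrative data: p1 shares everything, p2 shares diagnostics only, p3 has no consent.
consents = [Consent("p1", set(CATEGORIES)), Consent("p2", {"diagnostics"})]
entries = [
    Entry("p1", "therapeutics", "statin prescribed"),
    Entry("p2", "therapeutics", "statin prescribed"),
    Entry("p2", "diagnostics", "ECG recorded"),
    Entry("p3", "diagnostics", "ECG recorded"),
]
results = search(entries, consents, {"therapeutics", "diagnostics"})
```

Treating absent consent as an empty set keeps the default restrictive, which seems the safer reading of "all or part of record".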
11Workshop 1. Headlines
- Specificity of Consent Statements
- Baseline Knowledge of Electronic Record Systems
- Videos of Consent Process
- Consent to approach relatives
- Trial Designs for Long-term outcomes
- Knowledge of Accountable Parties
- Specimen Consent Forms and Information Sheets
12Today: Architecture
13Background
- Building upon the work undertaken by the RCP
- Proposed a number of high level architectures
- Centralized vs federated data
- Concentrates on Secondary Care
- Honest broker
- The RCP's work and its database of documents were
reviewed.
- The aim was not to provide a conclusive model for
implementation, but to discuss some of the
important considerations that need to be made.
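The honest broker option listed above can be sketched as a trusted intermediary that links records on an identifier and hands researchers only pseudonymised output. The Python below is a toy illustration under stated assumptions: `HonestBroker`, the `nhs_number` field, and the keyed-hash pseudonym scheme are all hypothetical choices for this example, not anything the RCP documents specify.

```python
import hashlib
import hmac

class HonestBroker:
    """Toy intermediary: sees identifiers, releases only pseudonymised linked data."""

    def __init__(self, secret_key: bytes):
        # The key stays with the broker; researchers never see it.
        self._key = secret_key

    def pseudonym(self, nhs_number: str) -> str:
        # Keyed hash: the same patient always maps to the same pseudonym.
        digest = hmac.new(self._key, nhs_number.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:16]

    def link(self, *sources):
        """Merge records from several sources under one pseudonym per patient."""
        linked = {}
        for source in sources:
            for record in source:
                pid = self.pseudonym(record["nhs_number"])
                # Strip the identifier before the data leaves the broker.
                data = {k: v for k, v in record.items() if k != "nhs_number"}
                linked.setdefault(pid, {}).update(data)
        return linked

# Illustrative records only; the identifier and values are made up.
broker = HonestBroker(b"demo-key-not-for-real-use")
gp_records = [{"nhs_number": "0000000001", "bp": "140/90"}]
hospital_records = [{"nhs_number": "0000000001", "admission": "2010-01-05"}]
linked = broker.link(gp_records, hospital_records)
```

A real deployment would also need key management, audit, and re-identification controls, none of which this sketch attempts.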
14About the model
- Uses the Service Orientated Architecture (SOA)
paradigm, in a format that ties in with the work
being conducted by the Research Capability
Programme to link data from Primary, Secondary and
Tertiary sites across the NHS.
- Adopts the hybrid model of centralized and
federated data storage and processing
- Recognises that the mechanisms necessary for the
querying of data are already available in
some systems, such as SystmOne, thanks to the
GPES initiative. However, this scheme will need
to be extended to support both the push and the
pull methods required by a scalable solution.
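The difference between the pull and push methods mentioned above can be illustrated with a small Python sketch. Everything here is hypothetical (the `Site` and `CentralIndex` classes, the record fields, the site names) and it does not represent the GPES or SystmOne interfaces; it only shows the two directions of data flow in a hybrid arrangement.

```python
class Site:
    """A federated data holder (for example a GP system) that keeps its own records."""

    def __init__(self, name, records):
        self.name = name
        self.records = list(records)
        self.subscribers = []

    def query(self, predicate):
        # Pull: the central service asks, and the site answers from local data.
        return [r for r in self.records if predicate(r)]

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def add_record(self, record):
        # Push: the site notifies subscribers as soon as new data arrives.
        self.records.append(record)
        for callback in self.subscribers:
            callback(self.name, record)


class CentralIndex:
    """Central part of the hybrid model: stores only what sites push to it."""

    def __init__(self):
        self.seen = []

    def receive(self, site_name, record):
        self.seen.append((site_name, record["id"]))

    def pull(self, sites, predicate):
        # Fan the same query out to every site and collect results per site.
        return {s.name: s.query(predicate) for s in sites}


gp = Site("gp_practice", [{"id": 1, "code": "MI"}])
hospital = Site("hospital", [{"id": 2, "code": "asthma"}])
index = CentralIndex()
gp.subscribe(index.receive)

pulled = index.pull([gp, hospital], lambda r: r["code"] == "MI")  # pull
gp.add_record({"id": 3, "code": "MI"})                            # push
```

Pull suits ad hoc research queries, while push suits ongoing surveillance, where the centre should learn of new events without polling every site.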
15Centralized vs distributed
16Overview diagram
17Deployment diagram
18In this workshop
- Can we explore and determine an architecture?
- Can we consider the metrics of use of a linked
search service?
- Numbers of patients
- Frequency of searches
- At what scale should this be developed?
- Local pilot
- Employ National infrastructure
19Exercise 1
- What are the main use-cases of a system?
- Use simple UML notation
- High level view
- Identify as many actors as possible
20UML Use Cases
- Keep it simple
- Identify actors
- Patient
- Researcher
- Clinician
- Sys Admin
- Key tasks
- Don't worry about syntax
21Intermission: Food for thought
- Tony Solomonides et al.,
- Privacy compliance and enforcement on
European healthgrids: an approach through ontology
- Phil. Trans. R. Soc. A 2010 368, 4057-4072
22Exercise 2
- Can we identify the components needed to meet the
requirement?
- Systems
- Processes
- Connections
- We'll sort out the UML later!
23It doesn't need to be pretty
24Next steps
- Workshop output will be synthesised and
circulated for validation
- Overview of work to date will be prepared for
31st March Workshop
- Concept is that the overview will be presented
and critiqued by participants and prepared for a
final white paper report
- We are expecting a number of patients at the
final session
25At 1pm..
- Free Text in GP Patient Records: How much extra
information is there and how can we extract it?
Rob Koeling, University of Sussex.
- The UK General Practice Research Database
provides a valuable source of information for
health services research. Coded data is
supplemented by free text (physicians' notes or
letters). However, due to the difficulty of
extracting information, and the cost of
anonymisation, free text is seldom utilised in
research. One of the goals of the PREP (Patient
Record Enhancement Project) at the
University of Sussex and the Brighton and Sussex
Medical School is to explore how much extra
information (i.e. on top of the coded data) is
concealed in the free text fields of GP patient
records. I will present some results of an
annotation effort in which a corpus containing
text records of 344 women in the year prior to a
diagnosis of ovarian cancer was marked up with 5
commonly presenting symptoms. I will also report
on a simple method we developed for automatically
extracting these symptoms using string matching.
I will talk about what this means for the
estimates of the incidence of these symptoms and
what we can do to extract this information. How
far can simple, readily available methods take
us, and what is the scope for more complex
information extraction techniques?
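To give a flavour of what string matching over free text can look like, here is a minimal Python sketch. It is not the PREP project's method: the abstract does not name the five symptoms or the terms used, so the lexicon, the sample note, and the function names below are all assumptions made for illustration.

```python
import re

# Hypothetical lexicon: symptom label -> surface forms that might appear in notes.
SYMPTOM_TERMS = {
    "abdominal pain": ["abdominal pain", "abdo pain", "stomach ache"],
    "bloating": ["bloating", "bloated", "abdominal distension"],
}

def compile_patterns(lexicon):
    """Build one case-insensitive, whole-word pattern per symptom."""
    patterns = {}
    for symptom, terms in lexicon.items():
        # Longest terms first so longer phrases are tried before shorter ones.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(t) for t in ordered)
        patterns[symptom] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns

def extract_symptoms(text, patterns):
    """Return the set of symptom labels whose terms occur in the text."""
    return {symptom for symptom, pattern in patterns.items() if pattern.search(text)}

patterns = compile_patterns(SYMPTOM_TERMS)
note = "Pt c/o abdo pain for 2/52, feels bloated after meals."  # made-up example note
found = extract_symptoms(note, patterns)
```

A matcher this simple ignores negation ("no bloating"), misspellings, and unlisted abbreviations, which is where the more complex information extraction techniques raised in the abstract would come in.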
26Yorkshire Centre for Health Informatics
- Director: Dr Susan Clamp
- +44 (0)113 343 4960
- s.clamp_at_leeds.ac.uk
www.ychi.leeds.ac.uk