Title: Dr Alan Hogg
Slide 1: The Complexity of Informatics
- Dr Alan Hogg
- Section Head - Platform Development
- NCRI Informatics Coordination Unit
Slide 2: Aim
- What are researchers trying to do?
- How can informatics support that aim?
- What are (some of) the complexities that
challenge us?
Looking at informatics as a whole, not just
specifically ONIX or the ICU
Slide 3: Research Scenario
Drug X is an effective treatment for 10 women
with breast cancer. What are the molecular
signatures associated with efficacy?
Prof Herbie Newell, former Director of
Translational Research at CR-UK
Slide 4: Research Steps
- Online access to original patient tissue samples
from tissue repository
- Microarray analysis to sub-type tumours
- Identify molecular pathways associated with
candidate genes
- Qualify biomarkers with more samples and in
prospective clinical trials
Slide 5: Research Outcome
- Biomarkers that can identify women with breast cancer who are most likely to respond to Drug X
- Better understanding of metabolic pathways involved
- New potential drug targets for future studies
Slide 6: The Complexity of Informatics?
- Cuts across scientific domains
  - Clinical trials, genomics and systems biology
  - Each is a discipline with its own terminology
- Different forms of information (data types)
  - Clinical records, images, genomics, literature
- Different processes
  - Extracting existing data (clinical information)
  - Generating and analysing new data (genomics)
  - Searching literature (metabolic pathways)
Without informatics this is NOT possible;
however, it is not easy either.
Slide 7: 1. Clinical Data Set
- Willingness to share data
- Data sharing policies in place with sponsors of research
- Privacy policy where sensitive data exists
- Public confidence in data privacy
- Informatics needs to support those privacy policies
- Support authorisation and authentication
- Secure transfers between networks and institutions (not a CD in the post!)
Slide 8: 2. Online access to original patient tissue
samples from tissue repository
- Data Linkage
- Ability to link between data sets without compromising privacy
- HIRU in Wales have developed the ALF system
- Prevent identification through access to certain combinations of data
- GIMI project at Oxford Computing Laboratories
- Anonymisation vs Pseudo-anonymisation
- Support for Honest Brokers
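One way to picture linkage without exposing identities is a keyed hash held by an honest broker. The sketch below is only an illustration under stated assumptions: it is not a description of the ALF system or the GIMI project, and the function names, field names (`patient_id`, `response`, `sample`), key and records are all hypothetical. Each raw identifier is replaced by an HMAC-SHA256 pseudonym, and the two data sets are joined on that pseudonym.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256, hex) of a normalised identifier."""
    normalised = patient_id.strip().upper()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(clinical, tissue, secret_key):
    """Join two lists of dicts on a pseudonym and drop the raw identifier."""
    tissue_by_pseudonym = {
        pseudonymise(row["patient_id"], secret_key): row for row in tissue
    }
    linked = []
    for row in clinical:
        pseudonym = pseudonymise(row["patient_id"], secret_key)
        match = tissue_by_pseudonym.get(pseudonym)
        if match is None:
            continue  # no tissue sample for this patient
        merged = {"pseudonym": pseudonym}
        merged.update({k: v for k, v in row.items() if k != "patient_id"})
        merged.update({k: v for k, v in match.items() if k != "patient_id"})
        linked.append(merged)
    return linked


# Hypothetical key and records, for illustration only.
HONEST_BROKER_KEY = b"example-key-held-only-by-the-honest-broker"
clinical = [
    {"patient_id": "p001", "response": "responder"},
    {"patient_id": "P002", "response": "non-responder"},
]
tissue = [{"patient_id": "P001 ", "sample": "T-17"}]

linked = link_records(clinical, tissue, HONEST_BROKER_KEY)
```

Because the same key always yields the same pseudonym, records can still be matched, but only whoever holds the key can regenerate the mapping. Note that this does nothing by itself about re-identification through combinations of the remaining fields, which needs separate controls.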
Slide 9: 2. Online access to original patient tissue
samples from tissue repository
- Data Standards
- Clear meta-data (data about data)
- For example defining the scale of the images, date image taken etc
- Pathology image annotations need to have a clear and consistent meaning
- RADLEX: new pathology annotation standard being developed in the US
- Informatics needs to support the creation of, and easy access to, such standard vocabularies
- Development and sharing of tools to support annotation
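As a rough illustration of what "clear meta-data" and a standard vocabulary buy you, the sketch below attaches a scale and acquisition date to an image record and checks its annotations against a controlled term list. Everything here is an assumption made for the example: the `ImageMetadata` class, the choice of microns per pixel as the unit, the date, and the three vocabulary terms, which are not taken from RadLex or any other real standard.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical controlled vocabulary; not drawn from any real standard.
CONTROLLED_VOCABULARY = {"invasive ductal carcinoma", "necrosis", "stroma"}


@dataclass
class ImageMetadata:
    image_id: str
    microns_per_pixel: float  # the scale, with its unit stated explicitly
    date_taken: date
    annotations: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable issues; empty if the record is clean."""
        issues = []
        if self.microns_per_pixel <= 0:
            issues.append("scale must be a positive number of microns per pixel")
        for term in self.annotations:
            if term.lower() not in CONTROLLED_VOCABULARY:
                issues.append(f"annotation not in vocabulary: {term}")
        return issues


# Hypothetical record with one free-text annotation that is not a standard term.
record = ImageMetadata(
    image_id="IMG-0001",
    microns_per_pixel=0.25,
    date_taken=date(2009, 3, 1),
    annotations=["Necrosis", "tumour-ish region"],
)
print(record.problems())
```

The point of the check is that a free-text annotation is flagged at the moment of entry, rather than being discovered later when data sets from different sites are combined.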
Slide 10: 3. Microarray analysis to sub-type tumours
- Automation: to collect data output and pass it through a suitable analysis tool
- Requires physical interoperability
- Issues with interfaces (access points) not existing
- Consistent meaning of information is vital
- Again promotion of standards
- Initiatives such as NCI's caBIG, EMBL-EBI EMBRACE, and ONIX trying to address this
Slide 11: 3. Microarray analysis to sub-type tumours
- International Repository to facilitate data sharing by putting all the data together
- Variation in data definitions and formats a hindrance
- Need professional data managers to curate it
- Physical infrastructure would be large
- Current funding models in UK not designed to support long-term repository programmes
Slide 12: 4. Identify molecular pathways associated with candidate genes
- Knowing what exists
- What stuff exists that might be useful to you
- Resource catalogues: simply knowing who has what and where.
- ICU working with NHS Information Centre, Research Capability Programme, NCIN and caBIG to share info.
- Ability to search literature and data repositories
- Biologically aware searching across multiple repositories is the Holy Grail
- ICU collaborating with UCL, EBI and caBIG
Slide 13: 5. Qualify Biomarkers
- Collaboration between teams
- Physical and political ability to operate across project teams, institutions and national boundaries
- Publication of results
- British Library (publications) leading a project with EMBL-EBI (raw data) and Nature (metabolic pathways): how do we maintain these links?
Slide 14: Informatics Challenge - Summary
- Technical
  - Physical ability to share data between networks, individual systems and research teams
  - Data Definitions are clear and meaningful
  - Meta-data: its units of measure etc
  - Data Standards to allow clear, consistent annotation and classification of research findings
  - Data Relationships: defined linkages between data sets
- Hearts and minds
  - Willingness to share
  - Informatics systems must support the policies under which the work is done
Slide 15: Thank You
Slide 16: Backup slides
Slide 17: The Landscape
- Scientific Domains
- Revealed a diverse set of both practical/experimentalist domains
- An equally diverse landscape of
- Data capture and analysis
- Storage
- Maintenance and curation
- Formats and standards
- IT infrastructures