Title: Break out sessions
Aims of the meeting
1. What standards are being used to describe genetic variation data?
2. What infrastructure resources are available, and what is needed for data integration?
3. How should research interact with the health service?
4. What funding model is appropriate?
5. What should the NCRI be doing in this area?
Workshop Format
- Invited speakers representing the breadth of the NCRI portfolio
- Breakout sessions in 3 groups with questions
- We have drafted some already
- We are happy to take these from the floor too
- Report back
Breakout groups
- Group 1: Anthony Brookes, Tina Boussard, Norman Freshney, Jane Cope, Chris Mattocks, Audrey Petitjean, Richard Begent, Margus Lukk, Liz Pittendreigh, Daniel Zicha, Max Wilkinson
- Group 2: Rakesh Nagarajan, Richard Wooster, Aengus Stewart, Andrew Devereau, Gillian Heap, Tim French, Angela Cox, Stephan Feller, Jay Kola, Peter Kerr
- Group 3: Paul Lewis, Crispin Miller, Wendy Russell, Adrian Moody, Cecilia Lai, Heike Grabsch, Andy Bush, Dawn Smith, Paul Flicek, Helen Parkinson
Group 1 Q1 Standards
- Discussion
- Standards and their role
- Reporting semantics
- Role of Journals
- Community buy-in and uptake
- Conclusions
- Essential to drive Standards development
- Semantics fundamental
- Best standards evolve naturally (flexible)
- Role of Journals and Stakeholders
- Potentially biotech/industry
Group 1 Q2 Resources
- Discussion
- Ontologies
- Individualisation
- Harmonisation
- Changing concept of databases
- Funding
- Conclusions
- Domain is of great importance
- Ability to annotate either locally/centrally
- Community driven consortia
- Novel database models (self-sustaining)
- High dimensional data challenges
- Cancer provides ideal model
Group 1 Q5 What can NCRI do?
- Discussion/Conclusions
- Forums and facilitation
- Communication to funders/stakeholders
- International relationships
- Consensus building
- Expanding workshop portfolio (series)
Infrastructure
- caBIG is strongly semantically typed
- PML is not semantically typed; will PML-2 support an ontology?
- Should NCRI mandate full caBIG compatibility re standards?
- Much existing data doesn't meet caBIG standards; NCRI funders should be sympathetic to this
- Other caBIG infrastructure: caCORE, caDSR
- caGRID have demonstrated joining across different data types by
- UK myGRID: workbench for researchers for using bioinformatics resources (ongoing work)
- HL7 SIG genotype/pedigree family object modelling project
- Choosing the right tool for the job, e.g. NGRL are developing a website that lists resources/tools
- Which ones are better than others, how well are they curated, quality of data, etc.
- Is it important to standardise algorithms? Perhaps not, as long as raw data is available
- Any tool is potentially useful (unless it's out of date)
- Tricky: researchers/end users often want a yes/no answer, but quite often the answer is an interpretation
- Some metrics might be useful about updates, usage, etc.
Infrastructure (2)
- Should funders insist that tools have some quality metric, e.g. a developer carries out a control test on the tool they have developed?
- Generalise to other areas
- Balance between directed research (e.g. NCRI-compatible) and solely cutting-edge
- Moving from research to development means funding dries up
- Also a cultural issue, in that service provision is different from research; researchers don't necessarily want to provide it
- OPEN-SOURCE v PROPRIETARY
- Both are necessary; which is the best model/balance? Open-source is okay if support is in place
- caBIG is welcoming companies to buy in to standards and possibly to provide support
- Sanger has a mixture, e.g. heavy use of Oracle but Ensembl runs off MySQL
- Industry: AZ must run off a certain platform, which has advantages within the company but makes sharing data outside tricky
- People might be initially reticent about uniformity of platform but often end up happy with the benefit
Infrastructure (3)
- Mass negotiation by NCRI on behalf of the research community for commercial software could be very economically beneficial
- Agreement that if funders are paying for the development then software should be open-source, but support could be proprietary
- Infrastructure for SHARING data, as opposed to ANALYSING data, should always be open-source
- OMG standard: no-one using it
CFH
- No family ID because patient-centred; how do we exchange pedigree data? Confidentiality issues about sharing this type of data
- Is SNOMED-CT good enough to handle genetic data?
- Is HL7 good enough for exchanging genetic data?
- Can slow things down because these are very big standards organisations
- Do Once and Share project looking at Clinical Genetics (Andrew Devereau involved)
- Difficult to ask people for consent when being diagnosed
- Secondary uses services can export pseudonymised data for research
- Research phenotype data is much more diverse than CFH phenotype data, so an exchange standard is far away
- (Are there established standards for ethnicity?)
What NCRI can do
- Maybe working groups can be established to address issues
- Involvement with other initiatives; lobbying
- EU holding a workshop in March (standards and funding)
- HGVS (re standards)
- Tighter alignment with caBIG workspaces (Integrated Cancer Research workspace most suitable)
- NCI/NIH funding?
- Industry would benefit and should be approached (e.g. like SNP consortium)
- NHS Informatics (funding?)
- Patient advocacy groups are very powerful; also important to give feedback to patients (important to have a good public face)
- Important to be able to sell the benefits to organisations like CFH and industry
Group 1
- Discussion/Conclusions
- Raw data is needed for genetic variation (HTP)
- Ontologies are a problem
- Phenotype covers everything incl. histology slides
- Clinical definitions complex
- Mapping to genome is less of a problem than the phenotype
- Mapping only 70% of the data is OK; the rest may not be good enough
- Protocols vary; also need standardisation of the processing as well as the reporting
- This costs money; clinicians don't want to do this as standard
- Funders have the power
- Training sets might help, e.g. for badly sampled tissues vs useful ones
- Rare diseases find standardisation easier (data sharing/data integration)
- Meta-analysis is a way forward: identify what's been reported most and work from there
Pt 2.
- Bioinformaticians suffer as part of MDTs; non-bioinformaticians managing informatics projects can be problematic
- Transatlantic cooperation is desirable; interest in the caBIG process
- Training is key