Title: Genetic Variation and Cancer: an Industry Perspective
1Genetic Variation and Cancer: an Industry Perspective
- Adrian Moody
- R&D Genetics
- AstraZeneca
NCRI Genetic Variation and Cancer workshop, Friday 3rd February 2006
2Overview
- Genetics in AstraZeneca
- Value of genetics data for cancer drug discovery
- Challenges of interpreting data
- Future perspective
4Drug Discovery and Development Process
5Value of Genetics Data to Cancer Drug Discovery
- Molecular characterisation and improved understanding of disease processes
- Opportunity to classify tumours by molecular signature rather than histology
- Target identification through disease linkage (somatic & germline)
- Personalised medicine: opportunities to stratify patients for drug treatment with respect to response, adverse events, pharmacokinetics etc.
- Significant logistical / experimental advantages over other molecular measurements
6Automated technology has shifted bottlenecks
AZ's GLP-standard fully automated DNA archive and reformatting system holds over 400,000 genetic samples in carefully regulated conditions, stores & reformats >5000 per day
High throughput genotyping facilities can achieve >50,000 reactions per day
With millions of datapoints generated within weeks, data management and analysis has become the new bottleneck in pharmacogenetic research
Automated DNA sequencers resequence >1000 samples/day
Thornton et al, Drug Discovery Today, 2005
7Abundant public data
- Basic genetics data is increasingly plentiful
- Human Genome
- dbSNP
- HapMap
- SeattleSNPs
- COSMIC
- Major new initiatives planned, e.g. Cancer Genome Atlas Project at NCI
Location, Location, Location
8An example - Gene Catalogue
- Allows genetic data to be referenced against the genome.
- Starting point for additional in silico analysis.
- Can be used to store additional relevant data, e.g. population frequency.
- Limited use for providing broader context.
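A catalogue of this kind can be pictured as variant records keyed by genome coordinates, with extra annotation such as population frequency attached. The sketch below is a hypothetical data model for illustration only, not the Gene Catalogue described on the slide: the class, field names, coordinates and frequencies are all invented placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VariantRecord:
    """One variant referenced against the genome (hypothetical schema)."""
    chromosome: str
    position: int          # 1-based coordinate on the reference genome
    ref_allele: str
    alt_allele: str
    # Optional annotation: alternative-allele frequency per population label.
    population_frequency: Dict[str, float] = field(default_factory=dict)


def variants_in_region(catalogue: List[VariantRecord], chromosome: str,
                       start: int, end: int) -> List[VariantRecord]:
    """Return records on `chromosome` with start <= position <= end."""
    return [v for v in catalogue
            if v.chromosome == chromosome and start <= v.position <= end]


# Placeholder data, not real variants.
catalogue = [
    VariantRecord("7", 1000, "C", "T", {"pop_A": 0.02}),
    VariantRecord("7", 5000, "G", "A"),
    VariantRecord("12", 1000, "A", "G", {"pop_A": 0.15, "pop_B": 0.30}),
]
hits = variants_in_region(catalogue, "7", 500, 2000)
```

A coordinate lookup like this is the "starting point" role; it says nothing about the biological context of a hit, which is the limitation the last bullet points to.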
9Context
TATAGCTTGCATGGATG/TGACTC
10Challenges for interpreting genetic data
- Distinguishing somatic mutations from germline polymorphisms in the absence of a paired normal sample
- How many germline samples do you sequence to gain confidence in mutation data?
- What's in a name?
- T790M EGFR mutation
- Integration of somatic and non-somatic genetics
- Somatic genetics and polymorphisms largely treated in isolation.
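One way to frame the question of how many germline samples to sequence is as a detection-probability calculation. The sketch below is an illustration rather than a method from the talk: it assumes each chromosome carries a germline variant independently with population allele frequency `freq`, and the function names are hypothetical.

```python
import math


def prob_variant_seen(freq: float, n_individuals: int) -> float:
    """Probability that a germline variant with allele frequency `freq`
    appears at least once among 2 * n_individuals sampled chromosomes."""
    return 1.0 - (1.0 - freq) ** (2 * n_individuals)


def individuals_needed(freq: float, confidence: float = 0.95) -> int:
    """Smallest number of diploid individuals to sequence so that a variant
    of frequency `freq` is seen at least once with the given confidence."""
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - freq) ** chromosomes <= 1 - confidence for chromosomes.
    chromosomes = math.log(1.0 - confidence) / math.log(1.0 - freq)
    return math.ceil(chromosomes / 2)
```

Under these assumptions, `individuals_needed(0.01)` gives 150 individuals for a 1% allele frequency at 95% confidence. A candidate mutation absent from that many normal samples is unlikely to be a polymorphism at that frequency, but rarer germline variants cannot be excluded this way.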
11Challenges for interpreting genetic data
- Understanding functional consequence of genetic variation
- Overlay sequence variation against 3D structures if available, or primary sequence models which highlight key functional residues
- Copy number changes: need to understand basis of amplification, amplicon boundaries and gene content
- Additional factors in understanding the importance of a mutation
- Causal, modifying or treatment selected?
- What tumour type, tumour stage, treatment regime?
- Additional genetic alterations
12Challenges for interpreting genetic data
- Identifying relevant in vitro and in vivo models that recapitulate key clinical genetic variations
- Requires extensive characterisation of cancer cell lines
- Experimental validation of functional consequence required
- Cross-community consistency in cell line and data standards
- Is your cell line the same as mine?
- Is my cell line the same now as it was 2 years ago?
- Genetics crosses disciplines and these data need to be integrated with biological, clinical and pharmacological phenotype.
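The two cell line identity questions can be made concrete by comparing genotype profiles across a shared marker panel. This is a minimal sketch, assuming each profile maps a marker ID to an unordered pair of allele calls; the marker names, genotypes and function name are hypothetical, and the talk does not specify a method or a threshold for calling a match.

```python
from typing import Dict, Optional, Tuple

Genotype = Tuple[str, str]  # unordered pair of allele calls at one marker


def genotype_concordance(a: Dict[str, Genotype],
                         b: Dict[str, Genotype]) -> Optional[float]:
    """Fraction of markers typed in both profiles with identical genotypes.

    Returns None when the two profiles share no markers."""
    shared = a.keys() & b.keys()
    if not shared:
        return None
    # Sort each call so that ("A", "G") and ("G", "A") count as the same.
    matches = sum(1 for m in shared if sorted(a[m]) == sorted(b[m]))
    return matches / len(shared)


# Placeholder profiles for the same nominal cell line at two time points.
stock_then = {"m1": ("A", "G"), "m2": ("C", "C"), "m3": ("T", "T")}
stock_now = {"m1": ("G", "A"), "m2": ("C", "T"), "m3": ("T", "T"),
             "m4": ("A", "A")}
score = genotype_concordance(stock_then, stock_now)
```

Here two of the three shared markers agree, so the score falls below 1. Whether that reflects drift, contamination or genotyping error is exactly the kind of judgement that needs agreed community standards.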
13Challenges for interpreting genetic data
- Statistical considerations for population-based analysis.
- Prospective vs retrospective studies
- Is the phenotypic assessment consistent within the study and with other studies?
- Is the study suitably powered?
- Penetrance
- Ethnicity
- What to analyse?
- So much data, there must be something significant in here...
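The closing remark on this slide is a warning about multiple testing. As a rough illustration (not taken from the talk), the sketch below estimates how many tests would pass an uncorrected threshold by chance alone, then applies a Bonferroni correction, which is one common and conservative choice among several. The function names and example p-values are hypothetical.

```python
from typing import List


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests passing `alpha` by chance when every
    null hypothesis is true."""
    return n_tests * alpha


def bonferroni_significant(p_values: List[float],
                           alpha: float = 0.05) -> List[bool]:
    """Flag p-values that survive a Bonferroni correction, i.e. those at or
    below alpha divided by the number of tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Placeholder p-values for four markers; corrected threshold is 0.0125.
flags = bonferroni_significant([0.0001, 0.02, 0.04, 0.5])
```

With 50,000 independent tests at an uncorrected 0.05 threshold, about 2,500 "significant" results are expected even if no marker has any real effect. In the four-marker example only the first p-value survives correction.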
14Future perspective
- Data analysis and integration recognised as a significant issue
- In-house initiatives
- Ensure value-added genetic data is accessible to all within the organisation.
- Integration of genetics with expression data and pharmacology
- Collaboration examples
- BBSRC industry interchange programme proposal with Imperial: feasibility of multivariate methods for data integration
- Univariate/multivariate statistical methods to integrate different sources of biological data such as SNPs, gene expression, metabolic data and proteomics.
- Oxford: high-dimensional data, i.e. GWA
15- Acknowledgements
- Tim French
- Thank you