Title: The COSMIC
1 The COSMIC database and web site
Catalogue Of Somatic Mutations In Cancer
2 Known cancer genes in the human genome
www.sanger.ac.uk/genetics/CGP/Census
352 Cancer Genes
Germline mutation 66
Missense mutation 83
Nonsense mutation 60
Somatic mutation 317
Frameshift mutation 61
Amplification 7
Translocation 263
Large deletion 26
The literature contains small intragenic somatic
mutation data on over 200,000 tumours
3 Example of the response to gefitinib in a patient
with refractory non-small-cell lung cancer and
a somatic mutation in EGFR.
Mutation data: EGFR status
Other data: sex, age, pathological type, no. of prior
regimens, smoking status, duration of therapy,
overall survival, ethnicity
Lynch et al, N. Engl. J. Med., 2004
4 The ERK MAP kinase pathway
MEK inhibitor (CI-1040) IC50 values as a
function of BRAF and NRAS mutational status
Solit et al, Nature Jan 2006
5 Updates of 163 DNA variation web sites / databases: quarter when last updated
Link was dead: 50
Last updated before 2002: 16
Date not obvious: 27
Last updated after 2002: 70
[Chart: number of sites by quarter of last update, 2002 to 2006]
Links taken from HGVS web site
6 Why develop COSMIC?
- To preserve somatic mutation data
- To share somatic mutation data
- To standardise genotype and phenotype information
- To integrate published data with the output of the Cancer Genome Project
7 Data types captured in COSMIC
Individual
Tumour
Gene
Mutations
Sequence change
Age, sex, Environmental variables
Classification
Reference sequence
Source Features
Links to other resources, Ensembl, OMIM
gtATGCCGATAGGAGCTAGGCTTAGCTTGACGGATGGCATGGCATTAGC
STTGGACTTTAGCSTGACAAGGACTTTAGCATAGGAT CAGGATTAGG
ATTGTAG
c.1799T>A p.V600E
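The mutation syntax on this slide pairs a coding-sequence change (c.1799T>A) with its protein consequence (p.V600E). A minimal sketch of how such strings could be split into fields is given below. It assumes simple single-base and single-residue substitutions only; the names `Substitution`, `parse_cds`, `parse_protein` and `codon_number` are hypothetical and are not part of COSMIC.

```python
import re
from dataclasses import dataclass

# Patterns for simple substitutions only (an assumption for this sketch);
# the full mutation nomenclature covers many more cases (deletions, insertions, ...).
CDS_SUB = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")
AA_SUB = re.compile(r"^p\.([A-Z*])(\d+)([A-Z*])$")


@dataclass
class Substitution:
    position: int  # 1-based position (nucleotide or residue)
    ref: str       # reference base or amino acid
    alt: str       # mutant base or amino acid


def parse_cds(text: str) -> Substitution:
    """Parse a coding-sequence substitution such as 'c.1799T>A'."""
    m = CDS_SUB.match(text)
    if m is None:
        raise ValueError(f"not a simple CDS substitution: {text!r}")
    return Substitution(int(m.group(1)), m.group(2), m.group(3))


def parse_protein(text: str) -> Substitution:
    """Parse a protein substitution such as 'p.V600E'."""
    m = AA_SUB.match(text)
    if m is None:
        raise ValueError(f"not a simple protein substitution: {text!r}")
    return Substitution(int(m.group(2)), m.group(1), m.group(3))


def codon_number(cds_position: int) -> int:
    """Return the 1-based codon that contains a 1-based CDS position."""
    return (cds_position - 1) // 3 + 1


cds = parse_cds("c.1799T>A")
protein = parse_protein("p.V600E")
# The nucleotide position and the residue number should agree
consistent = codon_number(cds.position) == protein.position
```

The final check illustrates one way a curated entry could be sanity-checked: nucleotide 1799 falls in codon 600, which matches the residue number in p.V600E.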
8 Data flow in COSMIC
Scientific literature
Manual data curation
Web site
COSMIC
COSMIC
Data export
Cancer Genome Project Lab work
Lab DB (SNP)
Web site
External
Internal
9 COSMIC web site
http://www.sanger.ac.uk/cosmic/
10 Tissue summary
11 Gene summary: Rb1
12 Gene histogram: Rb1
13 Gene / Tissue details: Rb1
Tissue
Samples
Mutated
Links
14 Individual sample details
15 Data export and integration
- The data on the web site is available in multiple formats
- Export function from the web site for individual genes (HTML, csv, Excel)
- ftp.sanger.ac.uk: csv and Excel for individual genes
- Oracle export of the whole database
- Integration with Ensembl
- COSMIC DAS track for Ensembl
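As an illustration of working with a per-gene csv export, the sketch below counts mutated samples per tissue. The column names (`sample_id`, `tissue`, `cds_mutation`) and the three rows are made up for the example and are assumptions, not the actual layout of a COSMIC export file.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for a per-gene csv export.
# Column names are assumptions, not the real COSMIC export format.
EXAMPLE_EXPORT = """sample_id,tissue,cds_mutation
S1,skin,c.1799T>A
S2,skin,c.1799T>A
S3,large_intestine,c.1799T>A
"""


def mutations_per_tissue(handle):
    """Count rows (mutated samples) per tissue in a csv export."""
    reader = csv.DictReader(handle)
    return Counter(row["tissue"] for row in reader)


counts = mutations_per_tissue(io.StringIO(EXAMPLE_EXPORT))
```

With a real export, the `io.StringIO` object would be replaced by an open file handle, and the column names adjusted to match the file.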
16 How to start and sustain a database/web site
- Scientific utility
- Attached to a project that is generating or using the data
- Contact with the users
  - cosmic_at_sanger.ac.uk
  - COSMIC-announce_at_sanger.ac.uk
- Publication
- Continued development
- COSMIC funding
  - 2000 to 2011: Wellcome Trust, 4 dedicated staff, others, infrastructure
  - 2006 to 2009: GlaxoSmithKline, 2 dedicated staff
17 What is available in COSMIC?
Statistics for Feb 2006 release
Experiments: 228,669
Tumours: 142,569
Mutant samples: 25,176
Mutations: 26,194
Papers curated: 3,013
Genes: 1,035
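The release figures can be turned into a few summary ratios. The sketch below uses only the numbers from this slide. Reading "mutant samples / tumours" as the fraction of tumours with a recorded mutation is an assumption about how the counts relate, and the dictionary keys are hypothetical names.

```python
# Counts from the Feb 2006 release slide
release = {
    "experiments": 228_669,
    "tumours": 142_569,
    "mutant_samples": 25_176,
    "mutations": 26_194,
    "papers_curated": 3_013,
    "genes": 1_035,
}

# Assumed interpretation: share of tumours with at least one recorded mutation
mutant_fraction = release["mutant_samples"] / release["tumours"]

# Average number of experiments (gene screens) recorded per tumour
experiments_per_tumour = release["experiments"] / release["tumours"]

print(f"Mutant samples: {mutant_fraction:.1%} of tumours")
print(f"Experiments per tumour: {experiments_per_tumour:.2f}")
```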
18 Weekly statistics for COSMIC sample content and web site hits
[Charts, 2004 to 2006: sample count (labelled values 73,767 and 140,212) and web site hits (labelled values 27,690 and 196,571)]
Based on a 4 week average
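The 4 week averaging mentioned above can be sketched as a trailing moving average over weekly counts. The function name `trailing_average` and the input values are hypothetical; the slide does not say exactly how the average was computed, so a plain trailing window is assumed here.

```python
def trailing_average(values, window=4):
    """Average each value with the (window - 1) values before it.

    Returns one average per position that has a full window behind it,
    so the result has len(values) - window + 1 entries.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical weekly hit counts, for illustration only
weekly_hits = [1, 2, 3, 4, 5]
smoothed = trailing_average(weekly_hits)
```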
19 Cancer Cell Line Project
800 cancer cell lines, 14 known cancer genes, 978 mutations
20 Future plans for COSMIC
- Curate more cancer genes, update existing genes
- Share data with other databases: IARC TP53 database
- Integrate other data from the CGP: copy number data
- Further use of DAS and Ensembl
- Expand the number of queries and views on the web site
- Expand the export functions
- Display more unpublished mutation data from the Cancer Genome Project
- Display data from the NCI?