Title: Publication and credit for variants
1. Publication and credit for variants
Myles Axton Editor, Nature Genetics
Melbourne 14th March 2008
2. Broad-sense human variome project
- Study phase
- 1) Collect rare clinical mutations from rare hereditary diseases into 700 locus-specific databases.
- 2) Associate common variants with common diseases via the HapMap and whole-genome association studies.
- 3) Re-sequence genes for rare variants associated with common diseases.
Needs: Harmonize software, ontologies, formats and
criteria. Add to public databases. Legacy data
framework. Collect, publish and credit. Track
repository tsunami. Replicate associations and
find causal variants. Develop minimal report and
public deposition criteria. Candidate approach
delusion.
3. Time for a divorce of human and machine
Yeager et al. Nature Genetics 39, 645-649 (2007)
4. The failure of Fort Lauderdale
Kuroki et al. Nature Genetics 38, 158-167 (2006)
Accession codes. GenBank/DDBJ/EMBL: AY692036
<http://www.ncbi.nlm.nih.gov/sites/entrez?term=AY692036&cmd=Search&db=nuccore&QueryKey=1>,
AY692037, BS000531-BS000572, BS000574-BS000581,
BS000583, BS000585, BS000587-BS000682. See
Supplementary Table 1 for a full list of the
clone names and accession numbers. In addition,
the following as yet unpublished BACs, listed in
Supplementary Table 5, were generated and
deposited in GenBank by R.K. Wilson (Washington
University School of Medicine) and D.C. Page
(Whitehead Institute): AC142320
<http://www.ncbi.nlm.nih.gov/sites/entrez?term=AC142320&cmd=Search&db=nuccore&QueryKey=4>,
AC144378, AC145769, AC145782, AC146011, AC146175,
AC146245, AC146268, AC147117, AC147148,
AC147343, AC147657, AC147670 and AC151848.
URLs. Manually annotated HSAY genes,
http://vega.sanger.ac.uk/Homo_sapiens/; NCBI,
http://www.ncbi.nlm.nih.gov/; NCBI BLAST,
http://www.ncbi.nlm.nih.gov/BLAST/; RIKEN-Genomic
Sciences Center, http://stt.gsc.riken.jp/.
Note: Supplementary information is available on the
Nature Genetics website.
5. Microattribution
- Journals provide citation links from paper to paper. Quantitative citation provides a rough measure of immediate usefulness.
- Database accessions should be traceable to their original source. Citations to database accessions in peer-reviewed papers and database entries should be counted and the cumulative count made prominently public.
- Microcitation provides incentive for collaborative genome annotation and a more accurate picture of individual and group research activity.
- Microattribution can be extended to web traffic analysis (not here).
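The counting step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the accession pattern (two capital letters plus six digits) and the function name count_accession_citations are not from the talk, and real accession formats are more varied.

```python
import re
from collections import Counter

# Assumed pattern: two uppercase letters followed by six digits, which matches
# accessions such as AY692036 or AC142320. Real accession formats vary.
ACCESSION_RE = re.compile(r"\b[A-Z]{2}\d{6}\b")


def count_accession_citations(documents):
    """Return a Counter mapping accession -> number of documents citing it.

    Each document contributes at most one citation per accession, so a paper
    that mentions the same accession several times still counts once.
    """
    counts = Counter()
    for text in documents:
        counts.update(set(ACCESSION_RE.findall(text)))
    return counts


if __name__ == "__main__":
    papers = [
        "Accession codes. GenBank: AY692036, AY692037.",
        "We reanalysed AY692036 and BAC AC142320; AY692036 was re-sequenced.",
    ]
    for accession, n in count_accession_citations(papers).most_common():
        print(accession, n)
```

Counting once per citing document, rather than once per mention, keeps the cumulative count comparable to an ordinary paper-to-paper citation count.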
6. ??
It is clear that accessions mandated by the
journal predominate, but ideally, all publicly
deposited resources should be fully cited by
authors, and properly logged and counted by the
journals.
http://blogs.nature.com/ng/freeassociation/2007/02/duke_of_url.html
7. NCBI and EBI have adapted to the needs of HVP
http://www.ncbi.nlm.nih.gov/sites/varvu?gene=7249
8. Peer review for variants by locus
- Editor commissions annotation of a locus between set genome coordinates. Submitters of annotations to NCBI/EBI qualify as authors.
- Author group led by locus-specific database (LSDB) expert evaluates published and database evidence for a phenotypic effect of each variant in the locus.
- Editor coordinates comments of user community and assigned expert peer referees. Web editor formats author annotations as a table of parameter lists.
- Publication with digital object identifier (DOI) of the definitive set of annotations and links.
- Publication in new human variome journal or participating journals (HuGENet model) of a synopsis of the locus (PubMed ID).
9. Elements of a microattribution browser
- Variant reports submitted to NCBI/EBI
  - Retain links to LSDB and data producer with unique handle.
  - Provide an automatically updating index to genome via ssIDs.
  - Link out to background and citing literature via PubMed/DOI.
  - Link out to commercial assays and clinical tests.
  - Detailed phenotype in dbGaP or G2P, but locally or in LSDB if IRB stipulates (parameters indicate additional local information).
- Custom UCSC/Ensembl comment browser track
  - Displays peer-reviewed variants with count of citing references.
  - Displays current count of wiki annotations on each variant.
10. NCBI and EBI have adapted to the needs of HVP
11. Essential parameters describe a variant
- Locus: TSC2
- Variant type: SNP
- Genomic: g.10119C>G
- Reference transcript: c.465C>G
- Predicted protein: p.Y155
- Consensus ID: rs45444196
- Map to Genome: Build 36.2
- Publications Cited: 2
- PMID of publications: 10205261, 17304050
- OMIM: MIM191100
- MIM Allele variation: MIM191100.000n
- Individual phenotype: phs000025.v2.p1
- Population frequency: rs45444196 Diversity
- >gnl|dbSNP|ss71651017|allelePos=101|len=201|taxid=9606|alleles='C/G'|mol=Genomic
- TCTTCTTTAA GGTCATCAAG GATTACCCTT CCAACGAAGA CCTTCACGAA AGGCTGGAGG
- TTTTCAAGGC CCTCACAGAC AATGGGAGAC ACATCACCTA
- S
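As a rough illustration of how the genomic and transcript parameters could be made machine-readable, the sketch below parses a simple substitution written in the g.10119C>G form (the slide's genomic description with the ">" restored) and derives the IUPAC ambiguity code that ends the flanking sequence. The class and function names are hypothetical, and only single-base substitutions are handled.

```python
import re
from dataclasses import dataclass

# Assumed pattern for simple substitutions only: prefix (g. or c.), position,
# reference base, '>', alternate base. Other variant types are not handled.
SUBSTITUTION_RE = re.compile(r"^([gc])\.(\d+)([ACGT])>([ACGT])$")

# IUPAC ambiguity codes for two-base alternatives; 'S' is the C/G code.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


@dataclass(frozen=True)
class Substitution:
    coordinate_system: str  # "g" (genomic) or "c" (coding transcript)
    position: int
    ref: str
    alt: str


def parse_substitution(description):
    """Parse a description such as 'g.10119C>G' into a Substitution."""
    match = SUBSTITUTION_RE.match(description.strip())
    if match is None:
        raise ValueError(f"not a simple substitution: {description!r}")
    system, position, ref, alt = match.groups()
    if ref == alt:
        raise ValueError(f"reference and alternate are identical: {description!r}")
    return Substitution(system, int(position), ref, alt)


def iupac_code(variant):
    """Return the IUPAC ambiguity code covering both alleles."""
    return IUPAC_TWO_BASE[frozenset((variant.ref, variant.alt))]


if __name__ == "__main__":
    variant = parse_substitution("g.10119C>G")
    print(variant, iupac_code(variant))
```

Rejecting anything that is not a plain substitution keeps the sketch honest about its scope; a production parser would need the full nomenclature.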
12. Supplementary table 1 of a variome review
- seq=ss71651017, allele=G, pheno=phs25.v2.p1, sub=TSC2DB.S_Povey, freq=1
- seq=ss1234, allele=A, pheno=0, sub=Illumina, probeset=Hap550, freq=1500
13. Supplementary table 1 of a variome review
- seq=ss71651017, allele=G, pheno=phs25.v2.p1, sub=TSC2DB.S_Povey, freq=1
- seq=ss1234, allele=A, pheno=0, sub=Illumina, probeset=Hap550, freq=1500
- URL/URI points via indexing parameter (ssID) to NCBI or EBI database.
- Commissioning journal and participating database can use it to display quantitative microcitation of each ssID and submitter.
- Parameter passing (OpenURL/CrossRef) allows transmission to third-party site or microattribution proxy site.
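A minimal sketch of the parameter passing described above, assuming the table rows are comma-separated key=value pairs with seq as the ssID and sub as the submitter. The base URL is a placeholder and not a real NCBI, EBI or CrossRef endpoint; the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; not a real NCBI or EBI URL.
BASE_URL = "https://microattribution.example.org/variant"


def parse_parameters(row):
    """Split 'key=value, key=value' into a dict (assumes '=' and ',' delimiters)."""
    params = {}
    for field in row.split(","):
        key, _, value = field.strip().partition("=")
        if key:
            params[key] = value
    return params


def variant_url(params):
    """Build a query URL that passes every parameter, led by the ssID."""
    return f"{BASE_URL}?{urlencode(params)}"


def tally_microcitations(rows):
    """Count one citation per row for each ssID and each submitter."""
    by_ssid, by_submitter = Counter(), Counter()
    for row in rows:
        params = parse_parameters(row)
        by_ssid[params["seq"]] += 1
        by_submitter[params["sub"]] += 1
    return by_ssid, by_submitter


if __name__ == "__main__":
    rows = [
        "seq=ss71651017, allele=G, pheno=phs25.v2.p1, sub=TSC2DB.S_Povey, freq=1",
        "seq=ss1234, allele=A, pheno=0, sub=Illumina, probeset=Hap550, freq=1500",
    ]
    print(variant_url(parse_parameters(rows[0])))
    print(tally_microcitations(rows))
```

Keeping the row as a flat parameter list means the same string can be stored in a supplementary table, passed in a URL and tallied by either the journal or the database.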
15. Browser showing microattribution to reviewed
variants and a count of comments and annotations
Variant ssID links to annotations on
microattribution server
References on microattribution server are color
coded, sparklined or stacked to show numbers.
10,000 reviewed references in black, 10,001 wiki
comments in gray.
16. Links from comment browser to curated (NCBI/EBI)
annotations and wiki comments
17. A sample post-publication comment
Ordinary human-readable text. Requires only that
variants are cited by ssID.
Edit mode
Variant ssIDs appear as clickable links
The system automatically detects links to ssIDs
and keeps a count of these.
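The automatic detection and counting of ssIDs in free-text comments might look like the sketch below. It assumes an ssID is the letters "ss" followed by digits, and it emits wiki-style external link markup pointing at a placeholder URL; neither the URL nor the function names come from the talk.

```python
import re
from collections import Counter

# Assumed form of an ssID: 'ss' followed by digits, as in ss71651017.
SSID_RE = re.compile(r"\bss\d+\b")

# Hypothetical link target, for illustration only.
LINK_TEMPLATE = "https://microattribution.example.org/variant/{ssid}"


def link_ssids(comment):
    """Return the comment with each ssID wrapped in '[URL label]' link markup."""
    def to_link(match):
        ssid = match.group(0)
        return f"[{LINK_TEMPLATE.format(ssid=ssid)} {ssid}]"
    return SSID_RE.sub(to_link, comment)


def count_ssids(comments):
    """Count how many comments mention each ssID (once per comment)."""
    counts = Counter()
    for comment in comments:
        counts.update(set(SSID_RE.findall(comment)))
    return counts


if __name__ == "__main__":
    comments = [
        "We also see ss71651017 in two families; compare ss1234 and ss71651017.",
        "ss1234 is on the Hap550 array.",
    ]
    print(link_ssids(comments[0]))
    print(count_ssids(comments))
```

Because the only requirement on commenters is that they cite variants by ssID, the comment itself can stay ordinary human-readable text.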
18. A sample variant page (cf. SNPedia)
Overview of variant, maintained manually by
author (editor or user community)
Content generated automatically
NCBI links for the variant
Citations in reviewed articles
Citations in wiki and other databases
19. Software
- Comment server
  - Could use off-the-shelf MediaWiki or Wiki Professional.
  - Need to adapt page rendering so that automatic content is generated.
  - Modify search interface so that all objects in a particular portion of the genome can be listed.
  - Modify system to help with the review process.
- Comment miner
  - Needs to be written from scratch.
  - MediaWiki uses MySQL in the back end, so this may be fairly easy.
- Genome browser server
  - UCSC will adapt custom browser wiki to link to local wiki database.
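The search requirement for the comment server, listing all objects in a particular portion of the genome, amounts to a range query over positions. The sketch below shows one in-memory way to do it with sorted lists; it is not a description of MediaWiki or of the UCSC browser, the RegionIndex name is hypothetical, and the coordinates in the usage example are made up.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class RegionIndex:
    """Index annotated objects by chromosome and position for range listing."""

    def __init__(self):
        self._positions = defaultdict(list)  # chromosome -> sorted positions
        self._objects = defaultdict(list)    # chromosome -> ids, same order

    def add(self, chromosome, position, object_id):
        """Insert an object, keeping each chromosome's lists sorted by position."""
        positions = self._positions[chromosome]
        i = bisect_right(positions, position)
        positions.insert(i, position)
        self._objects[chromosome].insert(i, object_id)

    def list_region(self, chromosome, start, end):
        """Return ids of objects with start <= position <= end."""
        positions = self._positions.get(chromosome, [])
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        return self._objects.get(chromosome, [])[lo:hi]


if __name__ == "__main__":
    index = RegionIndex()
    # Made-up coordinates, for illustration only.
    index.add("chr16", 900, "ss3")
    index.add("chr16", 100, "ss1")
    index.add("chr16", 250, "ss2")
    print(index.list_region("chr16", 100, 250))
```

In a database-backed wiki the same query would more likely be an indexed range condition on a position column; the sorted-list version only shows the logic.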
20. Links to commercial assays and clinical tests
21. Sustainable open access publishing
- Role of publisher is to broker between vendors and authors.
- Authors are prepared to enrol in order to annotate, receive citations, uniquely identify themselves (despite changes of family name and affiliation) and referee.
- Authors may endorse vendors.
- Publisher keeps count of microendorsements and guides vendors' spend.
- Vendors may sponsor particular loci, tests or authors.
- Unregistered readers read for free, but cannot participate.
24. Challenges for the variome community
- Parameter list and definition
- Relative value of data producers, consortia and data annotators
- Sorting and searching parameter-rich strings (vs. platform-generated)
- Somatic mutations (cancer)
- Pipeline and credit for submitting single variant reports post-review
- Comparing target and citing parameter lists (CrossRef)
- Funding body and vendor IDs (professional ethics)
- Public references to locally controlled data (IRB/access)
- Text mining during the review process
- Physician and public (Genetic Alliance) interfaces
- Who else will use microcitation credit, and how?
- Local microattribution or a central author ID site?
25. Thanks for advice
- Gisli Masson
- Muin Khoury
- Mike Feolo
- Donna Maglott
- Lon Phan
- Irina Nudelman
- Tim Hubbard
- Lincoln Stein
- Paul Flicek
- Belinda Giardine
- Samir Brahmachari
- Cover Art by Eve Stockton and Alex Beard.
- Jim Kent
- Jim Ostell
- Ewan Birney
- Mike Cariaso
- Greg Lennon
- Steven Brenner
- Stephen Chanock
- Matt Day
- Timo Hannay
- Edison Tak-Bun Liu
26. No compulsion without incentive